Recently in linux Category

Hard Disk replacement: Linux Shared Hosting Hardware Node PEMVZMPS16

We have been alerted to a failed hard disk on one of our Linux shared hosting nodes, PEMVZMPS16. To return the RAID to a fully optimal state as soon as possible, we are going to replace this disk tonight at 9PM.

What's affected?
Two linux shared hosting web servers are located on this node. They are:
  • 81.17.254.74    pemlinweb15.blacknight.com
  • 81.17.254.75    pemlinweb16.blacknight.com 
What sort of downtime is to be expected?
The plan is to bring the server offline, replace the disk, boot it back up and then let the RAID resync itself.

The downtime should be around 15 minutes, but we are scheduling a window of one hour.

When is this happening?
The downtime will occur tonight 26th of August 2010 at 21:00 hours.

Once the maintenance is complete we'll update this blog post.
 
UPDATE 21:08 - The disk change is complete and the server is now back online.

Issues with PEMLINWEB32/33 Linux Shared Hosting

We are currently experiencing issues with two linux shared hosting nodes:

pemlinweb32.blacknight.com - 81.17.254.44
pemlinweb33.blacknight.com - 81.17.254.48

Our engineers are working to resolve this issue as soon as possible and will keep this blog post updated.

UPDATE 12:20AM - This issue has been resolved.

Shared Hosting Server Ragnell Compromised

Our shared hosting server Ragnell has been compromised, and the majority of the index.php files have been replaced with a hacked version. We have already disabled all copies of the compromised index files.

We are currently making sure the hole that was used is fixed before re-enabling Apache. As part of this, PHP is being upgraded to PHP5.

We are also going to look at restoring the disabled index files, however this is going to take longer. The backup system we use is geared towards full system backups, so restoring individual files is likely to take a while. If you have an up-to-date copy of your index file, it will probably be faster to upload it yourself. This can be done even while Apache is down.

Update 1430: The upgrade of PHP / Apache is almost complete. Once it's finished we will be able to start restoring index files from backups.

Update 1515: Apache is back up and running.  We are currently restoring the index files from backup. This is going to take a long time.

UPDATE 1615: If your site's index file has been restored or if you've restored it yourself let us know if there are any issues.

UPDATE 16:52: As restoring individual index files is proving to be far too unwieldy, we are currently restoring the whole partition to another box. This will allow us to script the restore of any index files which are still showing as compromised. 

UPDATE 1910: The restoration of the index files is progressing, but it's slow, as we are checking each index file to see if it has been compromised or simply replaced from a customer's own backup. If you have a backup / replacement index file and are having issues uploading it, you may need to CHMOD 644 the current index.php.

UPDATE 09:30 Friday Aug 20th

After 3 failed attempts at a full restore to a machine in our offices, we have successfully done a full restore to a machine in the data centre. This morning around 9am we restored any files which had a checksum that matched that of the defaced files that were placed there during the compromise on Saturday last.
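
The checksum-based restore described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the script that was actually run: the function names, the directory layout (a live tree and a restored backup tree with the same relative paths) and the choice of SHA-256 as the checksum are all hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_defaced(live_root: Path, backup_root: Path,
                    defaced_digests: set[str]) -> list[Path]:
    """Overwrite live files whose checksum matches a known defaced file.

    Files that do not match are left alone, since a customer may already
    have uploaded their own replacement. Paths and layout are assumptions.
    """
    restored = []
    for live_file in sorted(p for p in live_root.rglob("*") if p.is_file()):
        if sha256_of(live_file) not in defaced_digests:
            continue  # not a defaced copy; leave it untouched
        backup_file = backup_root / live_file.relative_to(live_root)
        if backup_file.is_file():
            shutil.copy2(backup_file, live_file)  # put the backup copy back
            restored.append(live_file)
    return restored
```

Matching on the checksum rather than on the file name is what makes it safe to run across a whole partition: a file that a customer has already replaced no longer matches and is skipped.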

Anyone who requires other files to be restored for any reason should contact us ASAP so we can restore them for you.

CP.blacknight.com MAJOR UPGRADE


When: Monday August 16th from 02:00 until 08:00

What: Control panel software, the provisioning system, and agents on all hardware nodes, MySQL nodes and web servers will all be upgraded from version 2.9.4 to version 5.0. During this window access to the control panel will be restricted. However, e-mail, hosting and other services should not be affected by this upgrade.

Making changes to webspaces, adding or modifying e-mail accounts, creating new databases and so on during this window would be ill-advised.

Services Affected:

cp.blacknight.com - Control Panel only.

Windows Shared hosting servers may be restarted, but only if the upgrade of the management application proves to be problematic.

Linux Shared hosting servers may be restarted, but only if the upgrade of the management application proves to be problematic.

Exchange Servers may be restarted, but due to the design of the Hosted Exchange no downtime should be noticed.

Qmail servers will have their imap/pop3/smtp services restarted as new versions of the software get put in place. Also changes may be made to LDAP so there could be intermittent authentication issues.

Sitebuilder servers should not be affected.

VPS nodes will have some software upgraded, but end users should experience no downtime.

Domain registrations and modifications will not be affected by this upgrade.

Update 07:40 Monday 16th:

The upgrade is still being performed at the moment, but it's in its final stages. We'll post an "all clear" notice when it's complete.

Update 08:45 Monday 16th:

During the upgrade Parallels appear to have broken the provisioning system somehow. As this drives the Control Panel the CP is still down. They're working to resolve this issue but we've got no ETA as of yet.

Update 1045
The maintenance and upgrade have been completed. If anyone has any specific issues post-upgrade, please contact our helpdesk.

Shared Direct Admin server - Ector down

Our engineers are currently seeing issues with one of our older Direct Admin servers:

Ector.blacknight.ie - 81.17.252.50

The server was rebooted not long ago and there are some lingering issues that may affect the receiving of email and possibly cause downtime on your websites.  We are working on this issue at the moment and our highest priority is to get this server back up and running as soon as possible.

We also hope to implement some changes on this server over the upcoming week to resolve the intermittent issues this server has been having over the last couple of weeks.


Update 12:50 - The server is back up now and our engineers are checking all services to ensure they are up and running.  They are still working on the server to restore everything to a normal level of service.

Update 15:10 - I'm afraid the server has gone down again and our engineers are working on restoring service once more.  We are still investigating the root cause of these issues today.

Update 15:26 - The server is still a little slow but services are back now.  We will be contacting some of the busier sites on this server in order to alleviate the load issues.

Update Aug 5th 11:22 - This server has had some config changes. We've moved a few of the busier WordPress blog sites off it, and we've also found a problem in the AntiSpam system that was causing lookups on dead realtime blacklists. All of these changes appear to have resolved the problems we were seeing on this server for the past couple of weeks. We'll be monitoring it closely for the next week, and if there are no issues in that timeframe we'll consider this issue closed once and for all.
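
For anyone curious why dead realtime blacklists matter: a DNS-based blacklist is checked by reversing the octets of the sending IP address, appending the blacklist's zone, and doing an ordinary DNS lookup on the result. If the zone is dead, each lookup can sit waiting for a reply that never comes. The following is a minimal sketch of that convention only. It is not the configuration of our AntiSpam system; the function names and the zone `dnsbl.example.org` are made up for illustration.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to check an IPv4 address against a blacklist zone."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on a bad address
    reversed_octets = ".".join(reversed(str(address).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the blacklist zone returns an answer for this address.

    `resolve` is injectable so the lookup can be replaced in tests. A failed
    lookup (no such name, or a dead zone) is treated as "not listed".
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True
```

For example, `dnsbl_query_name("81.17.252.50", "dnsbl.example.org")` gives `50.252.17.81.dnsbl.example.org`. Removing zones that no longer answer means the mail server stops waiting on them for every incoming message.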

Shared Hosting Linux - Gorlois

We are currently experiencing issues with our shared hosting Linux node Gorlois. Our engineers are working on the issue and hope to have it resolved shortly.

We will keep this post updated with progress.

UPDATE 09:46 - This issue has been resolved.

Emergency Reboot - PEMVZMPS13

We are scheduling an emergency reboot of the hardware node PEMVZMPS13. The kernel on the server is throwing some strange errors. We need to reboot it to unload some modules and fix them.

What's affected?
Two pemlinweb linux shared hosting nodes:
81.17.254.89    pemlinweb08.blacknight.com     
81.17.254.91    pemlinweb09.blacknight.com

How long?
About 10 minutes from now, so it will be fully back online by 10:05AM.

We'll update this blog post once completed.

UPDATE 10:16 - The server has not come back online after the reboot. We have engineers en route to the server now to diagnose. ETA 15 minutes until they arrive.

UPDATE 10:27 - The issue has now been resolved.

Shared Hosting Linux - Ector

We are currently experiencing issues with our shared hosting server Ector. Our engineers are working on resolving this as soon as possible.

3:00PM - This issue is now resolved.

FTP Issues to windows and linux hosting packages (Resolved)

Summary: After last night's maintenance window some IP ranges (not all) were not allowing FTP to work properly. This was due to a missing policy-map on the firewall that instructs it to inspect all FTP traffic and track stateful and passive connections.

This morning this policy-map was put back in place around 08:40 and since then FTP has been checking out from all NAT'd connections. Non NAT'd connections wouldn't have experienced any connection problems with FTP.
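
As background on why the firewall needs to inspect FTP at all: in passive mode the server tells the client, over the control connection, which address and port to open the data connection to. A firewall that tracks FTP reads that reply so it can allow the data connection and, where NAT is involved, rewrite the address. The sketch below only decodes a standard `227` passive-mode reply to show what is in it. It is not our firewall configuration, and the function name is made up for illustration.

```python
import re

# A 227 reply carries six numbers: four address octets and two port bytes.
_PASV_PATTERN = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract the data-connection host and port from a 227 passive-mode reply."""
    match = _PASV_PATTERN.search(reply)
    if match is None:
        raise ValueError(f"not a passive-mode reply: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    if any(number > 255 for number in numbers):
        raise ValueError(f"value out of range in reply: {reply!r}")
    host = ".".join(str(number) for number in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte, then low byte
    return host, port
```

For example, `parse_pasv_reply("227 Entering Passive Mode (192,168,1,10,195,80)")` returns `("192.168.1.10", 50000)`. Because the address in the reply is whatever the server believes its own address to be, a NAT'd connection can be handed an address it cannot reach unless the firewall inspects and rewrites the reply, which fits with only NAT'd connections having been affected.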

This issue is resolved.


Old Linux Shared Hosting Node - Arthur - Network Issues

One of our linux shared hosting nodes located in the UK is having networking issues. It's dropping packets intermittently.

Our engineers are liaising with the data center techs where the server is located to resolve this issue asap.