March 2011 Archives

pemlinweb10 site defacements

Summary: This evening around 21:31 GMT all websites on pemlinweb10 were defaced. In most cases additional files called index.htm, index.html and index.php were put into document roots for websites.

The action we've taken thus far:

1) disabled all the files
2) removed all the files (still underway) so that you can upload your site files if you have backups.
3) kicked off a bare metal restore of the server's file system to one of our dedicated restore servers.
4) found the entry point and the hole, and plugged them.

What's left:
The bare metal restore will take approx. 9 hours to complete. It's currently one hour in, so it should be finished by morning, when our engineering shift starts at 8am. We will then begin a mass restore of sites from it. The latest backup for this server started at 15:00 this evening and ended at 18:31, so the data should be quite recent. In fact, since only index files were involved, it shouldn't take a lot of effort to restore them once the full restore has taken place.

Caveats: Anyone who uploads new index files this evening should expect them to be overwritten by our restore tomorrow morning. If the restore fails for some reason we'll have to kick it off again. We do have other options available to us, but a bare metal restore is typically the fastest method.

Update 08:31: The restore is still ongoing. The ETA is around 3 hours and 30 minutes, which brings us up to midday or thereabouts. We'll post further updates when we have the restore completed.

FYI we just rebooted the hardware node that pemlinweb10 resides on and it should be back in a few minutes.

Update 12:10: At the moment any file whose name starts with index is being restored to the state it was in at 15:00 yesterday. We don't have an ETA on how long this part will take, but it should hopefully be finished before lunch.
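
For anyone restoring index files from their own backups, the idea can be sketched as follows. This is only an illustration and not the tooling our engineers use: the function name `restore_index_files` and the two directory arguments are hypothetical, and it assumes the backup copy mirrors the layout of the live document root.

```python
import shutil
from pathlib import Path


def restore_index_files(restore_root: Path, live_root: Path) -> list[Path]:
    """Copy every file whose name starts with "index" from restore_root
    to the same relative path under live_root.

    Returns the relative paths that were restored, sorted.
    Both directory arguments are hypothetical placeholders.
    """
    restored = []
    for src in restore_root.rglob("index*"):
        if not src.is_file():
            continue  # skip directories that happen to be named index*
        rel = src.relative_to(restore_root)
        dest = live_root / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # overwrites the live copy, keeps timestamps
        restored.append(rel)
    return sorted(restored)
```

Files whose names do not start with index are left untouched, so anything else uploaded in the meantime is not overwritten by this sketch.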

Update 16:30: The restore has taken longer than expected, mainly because we vastly underestimated just how many "index" files there are. For example, Joomla puts an index.html in every folder as a measure against directory listings. The vast majority of files are restored now, with only a few thousand left. The process should be finished within the next 15 mins.

Pemlinweb21 and 22 issues

I'm afraid that two of our Linux shared hosting servers are currently experiencing technical issues due to abnormally high load.  They are:

Pemlinweb21 - 81.17.254.58
Pemlinweb22 - 81.17.254.59

Our engineers are working on stopping the processes using up all the resources on the servers now and we hope to have service restored as soon as possible.

Update 14:56 - The node itself has been restarted now and this has restored service fully.  Our engineers will continue to monitor the servers closely, but what we believe to be the primary cause of this has been stopped fully now.

Email delay on cp.blacknight.com mailservers

We are currently experiencing a larger than average number of emails in the mailqueues of the shared mailservers for any shared hosting package using email through cp.blacknight.com.

This may cause a delay in your emails getting delivered to you.  Our engineers are working on this and it is their highest priority right now.  We hope to get the situation resolved as quickly as possible.


Update 10:54 - I'm afraid the mail queues are still quite high, though our engineers have blocked one possible contributor to the high backlog.  We have escalated this to the software developers also.

Update 12:40 - Apologies for the delay in an update.  The mailqueues started to go down quickly shortly after the change made at 10:54 but our engineers could not see the exact cause.  They are still closely monitoring the mailservers and working with the Software Developers but email delivery has been down to normal levels for the last half hour or so now.

Linux VPS Hardware Node: PEMVZLIN08 Reboot

We are currently seeing some errors coming from Virtuozzo on PEMVZLIN08. In order to patch these we are going to need to reboot the node. The downtime will be in the region of 20-25 minutes and the following VEs will be affected:

78.153.208.112  vps-1005939-526.cp.blacknight.com
78.153.209.57   vps-1031932-1090.cp.blacknight.com
78.153.209.83   vps-1031975-1092.cp.blacknight.com
78.153.210.146  vps-1000752-1098.cp.blacknight.com
78.153.210.151  vps-1032538-1103.cp.blacknight.com
78.153.209.165  bethrifty.ie                    
78.153.210.158  vps-1032781-1108.cp.blacknight.com
78.153.210.31   vps-1033905-1119.cp.blacknight.com
78.153.210.174  webthings.biz                   
78.153.210.152  vps-1035379-1138.cp.blacknight.com
78.153.210.181  bone.buckley.ie                 
78.153.208.125  vps-1036716-1157.cp.blacknight.com
78.153.209.248  vps-163-1181.cp.blacknight.com  
78.153.210.36   vps-1037460-1183.cp.blacknight.com
78.153.210.179  vps-1037652-1185.cp.blacknight.com
78.153.210.196  vps-1037851-1188.cp.blacknight.com
78.153.210.199  vps-1038038-1191.cp.blacknight.com
78.153.210.215  vps-1010637-1206.blacknight.com 
78.153.210.217  vps-1039772-1208.cp.blacknight.com
78.153.210.221  vps-1040070-1212.cp.blacknight.com
78.153.210.223  vps-1040383-1215.cp.blacknight.com
78.153.210.235  vps-1041676-1226.cp.blacknight.com
78.153.210.175  vps-1042287-1234.cp.blacknight.com

We will update this post once completed.

UPDATE 09:54 - The server needs a disk check, which it is currently running. This will take approx. another 10-15 mins.

UPDATE 10:24 - This issue is now fully resolved.

Shared DirectAdmin And Helm Maintenance

Next Wednesday we will be moving our Shared DirectAdmin and Helm hosts behind a dedicated pair of firewalls. This will involve a certain amount of inevitable downtime, but we will try to keep it to a minimum. Total downtime should be less than 30 minutes.

As part of this, we'll also be doing a reboot of all our DirectAdmin machines to ensure that they're on the latest kernel.

If you log into your hosting via http://cp.blacknight.com you will not be affected.

UPDATE 23/03/2011 23:35: This work has been completed and all services seem to be back up and running.

pemlinweb23 and pemlinweb24 experiencing issues

Summary: Pemlinweb23 and 24 are currently experiencing some issues. We're working on the issue at the moment and we hope to have them back shortly.

Services affected: Websites hosted on pemlinweb23 and pemlinweb24, including FTP services.

Update 23:45: Both of these servers are back and working 100%.

Issues On Igraine

Igraine has stopped responding. An engineer is currently on site getting it back up and running.

UPDATE 16:50: The server is back up and running and we're investigating why it went down in the first place. 

Linux Shared Hosting - PEMLINWEB17/18

We are currently experiencing issues with a hardware node that houses the following:

pemlinweb17.blacknight.com - 81.17.254.77
pemlinweb18.blacknight.com - 81.17.254.78

Our engineers are working to resolve the issues asap.

UPDATE 17:10: Both servers are back up and running.


Shared Hosting Outage

We're just after having a serious network issue and things should be back up and running now.

The root cause was a DOS attempt from within our network which overloaded the firewalls.

We're currently making sure that all services are back up.

cp.blacknight.com - Control Panel login issues

I'm afraid we are currently experiencing technical difficulties on the newer hosting package control panel system.  That is the control panel at:

http://cp.blacknight.com

Our developers are working on this currently and we hope to have service restored fully as soon as possible.


14:20 - Service should be fully restored now, though our developers are still closely monitoring the service just in case.

mail.blacknight.com - Mail Delay

We are experiencing high volumes of emails on our mail cluster at the moment which is causing delays in mail delivery.

Our engineers are working to alleviate the problem. Your mail will be delivered, though it might be a little bit late.

We will keep this post updated of course.

16:19 - Due to extremely high CPU usage on one of the spam nodes within the cluster, we are going to add an additional quad core CPU to it. Downtime is expected to be within 15 minutes.

PEMVZMPS19 Issues

We are currently experiencing issues with our Linux shared hosting node: PEMVZMPS19.

This node affects:

pemlinweb21.blacknight.com (81.17.254.58)
pemlinweb22.blacknight.com (81.17.254.59)

Our engineers are working to resolve this issue at the moment.

UPDATE 07:24 This issue is now resolved.

PEMVZMPS52 (MySQL HW Node Reboot)

We are seeing memory leaks happening on a hardware node that serves a couple of our MySQL servers. Specifically:

mysql637.cp.blacknight.com - 78.153.214.46
mysql640.cp.blacknight.com - 78.153.214.47

We are going to reboot this node now. We will update this post with details once completed.

UPDATE 11:31AM - This issue is now fully resolved.

pemlinweb47 / 48 file system inconsistencies

Summary: Following on from http://blacknig.ht/29j we've noticed some file system issues with these two nodes. We're taking them down immediately and we're going to fix this now.

Domains on the following IPs are affected:

78.153.214.38
78.153.214.39

UPDATE 13:45: Both webservers are back up and the file systems are looking a lot healthier.

Update 14:20: We noticed that while the servers themselves started OK, Apache didn't for some reason. We've rectified this now and we're working on a permanent solution. Anyone still experiencing problems, please contact support ASAP.

PEMLINWEB47/48

PEMLINWEB47 and PEMLINWEB48 are currently unresponsive. We have an engineer on site investigating and we'll have them up again as soon as possible.

10:47 - This issue is resolved.

12:45 - There are some issues with this node. We've opened a new post for it: http://blacknig.ht/29k

pemvzmps39 having file system issues

Summary: Two servers that are on pemvzmps39 (pemlinweb31 and mysql452) are having problems this morning. The server is going to have to be taken down and work performed on it.

Customers using mysql452int.cp.blacknight.com (172.16.3.64) for their databases and pemlinweb31 for their webspaces will experience issues.

On pemlinweb31 the following IPs are affected:

81.17.254.38 81.17.255.45 81.17.255.46 81.17.254.193 81.17.255.100 81.17.255.10 - sites may work, but customers may be unable to upload new files and FTP will be problematic.
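
If you are unsure whether your site is on one of these addresses, the check can be sketched as below. This is a minimal illustration only: the names `AFFECTED_IPS` and `is_affected` are hypothetical, and the list is simply copied from the addresses above.

```python
import ipaddress

# Addresses on pemlinweb31 listed in this post.
AFFECTED_IPS = frozenset(
    ipaddress.ip_address(ip)
    for ip in (
        "81.17.254.38", "81.17.255.45", "81.17.255.46",
        "81.17.254.193", "81.17.255.100", "81.17.255.10",
    )
)


def is_affected(address: str) -> bool:
    """Return True if the given IPv4 address is one of the affected IPs."""
    return ipaddress.ip_address(address.strip()) in AFFECTED_IPS
```

The address your domain points to can be looked up first, for example with Python's `socket.gethostbyname`, and the result passed to `is_affected`.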

We're working to resolve this now.

12:01PM - An engineer is now on site to resolve the issue.

12:14PM - The file system was showing lots of broken inodes, which have now been fixed. I'm working on bringing the server back online now.

12:19PM - The server is successfully back online. I'm just waiting on the final vzquota checks to complete before everything is operable again.

12:25PM - This issue is fully resolved.

Windows VPS Hardware Node Reboot

Due to a software error we are seeing with Virtuozzo on the Windows VPS hardware node, PEMVZWIN02, we are going to need to reboot it ASAP. We're sorry for the downtime this causes but it's urgent enough to warrant an immediate reboot.

The affected VPSs are:

78.153.208.116  VPS-238
78.153.208.123  VPS-258
78.153.208.127  VPS-282
78.153.208.28   VPS-287
78.153.209.211  VPS-288
78.153.208.168  VPS-331
78.153.208.149  VPS-337
78.153.208.172  VPS-342
78.153.208.169  VPS
78.153.208.171  VPS-344
78.153.209.151  VPS-346
78.153.208.15   VPS-347
78.153.208.62   VPS-353
78.153.208.176  VPS-354
78.153.208.179  VPS-357
78.153.208.184  VPS-362
78.153.210.75   VPS-387
78.153.208.205  VPS-390
78.153.208.222  VPS-409
78.153.208.223  VPS-410
78.153.209.118  VPS-620
78.153.209.160  VPS-664
78.153.209.164  VPS-667
78.153.209.107  VPS-710

The downtime will be in the region of 15 minutes and we'll keep this post updated with progress as always.

Inbound DOS morning of Thursday March 3rd

Summary: A DOS (1) directed towards one of our BGP transit customers was impairing one of our routers. As this router carries much of our traffic via Global Crossing, traffic across it wasn't working as normal. The routes that were being attacked have been withdrawn from our network and it is now stable again.

Services affected: Cp, E-mail, Websites, VPS servers and Dedicated servers were all affected by this DOS attack.

(1) Denial of Service attack http://en.wikipedia.org/wiki/Denial-of-service_attack