August 2008 Archives

At approx. 20:35 the Level(3) full route set disappeared from one of our GigEs. Customers may have experienced slight delays while BGP decided which carrier to use for all those routes.

We've opened a ticket with Packet Exchange and we're awaiting an answer from them. We're currently using Tiscali and Global Crossing (out of DEG and IX), with Cogent as backup if either fails.

We'll post an update once we hear more from them on this issue.

Update: Saturday Aug 30th 12:10am

We've had no update from Packet Exchange as yet; however, the issue appears to have been fixed already. When we get the RFO from them, we'll post a summary.
Bors is currently experiencing some issues. We're working to resolve this as soon as possible.

Background:

Over the last few days we've noticed the load on this server being above normal. We're investigating why, and we think the issue is being caused by comment spammers hammering some customers' blogs. This is loading the MySQL server on bors and causing the load to go up very quickly.

We hope to have a resolution to this issue today.
Three websites on three different shared servers were compromised by a hacker through weak FTP passwords.  The hacker uploaded a trojan to these hosting packages and so these three servers were placed on anti-spam blacklists.

All three website owners have now been contacted and their FTP passwords reset.  The offending files have been removed and the servers should be fully out of the blacklists soon.  In the meantime, users of the following servers might see some of the emails they send bounce back as undeliverable:

Galahad - 81.17.248.4
Gorlois - 81.17.252.85
Rivalin - 81.17.252.145

As a note to all users, please ensure all of your passwords are relatively secure.  Some secure password tips would be:

# Don't use a dictionary word
# Don't use part of the username
# Keep the password at least 7 characters long
# Have a combination of at least three of:
- lowercase characters (a, b, c)
- uppercase characters (A, B, C)
- numbers (1, 2, 3)
- non-alphanumeric characters (!, %, *, {, £)
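
As a rough illustration, the tips above could be checked automatically along these lines. This is only a sketch: the function names, the tiny `SAMPLE_WORDS` set (a stand-in for a real dictionary word list) and the choice to treat any 3-character run of the username as "part of the username" are assumptions made for the example, not something our systems are known to enforce.

```python
# Hypothetical stand-in for a real dictionary word list.
SAMPLE_WORDS = {"password", "letmein", "welcome", "dragon"}

MIN_LENGTH = 7      # "at least 7 characters long"
MIN_CLASSES = 3     # "at least three of" the four character classes
MIN_FRAGMENT = 3    # assumption: a 3-character run counts as "part of the username"


def character_classes(password):
    """Count how many of the four character classes appear in the password."""
    checks = [
        any(c.islower() for c in password),      # lowercase characters
        any(c.isupper() for c in password),      # uppercase characters
        any(c.isdigit() for c in password),      # numbers
        any(not c.isalnum() for c in password),  # non-alphanumeric characters
    ]
    return sum(checks)


def contains_username_part(password, username):
    """True if any MIN_FRAGMENT-long run of the username appears in the password."""
    pw = password.lower()
    user = username.lower()
    for start in range(len(user) - MIN_FRAGMENT + 1):
        if user[start:start + MIN_FRAGMENT] in pw:
            return True
    return False


def password_problems(password, username, words=SAMPLE_WORDS):
    """Return a list of the tips the password breaks (empty list if none)."""
    problems = []
    if password.lower() in words:
        problems.append("is a dictionary word")
    if contains_username_part(password, username):
        problems.append("contains part of the username")
    if len(password) < MIN_LENGTH:
        problems.append("is shorter than 7 characters")
    if character_classes(password) < MIN_CLASSES:
        problems.append("uses fewer than three character classes")
    return problems
```

Calling `password_problems("Tr4in!Stop", "alice")` returns an empty list, while a password such as `"password"` is reported both as a dictionary word and as using too few character classes.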


Update (12.00pm):  The three servers were removed from the blacklist about 90-120 minutes ago and most, if not all, mailservers around the world should have updated their blacklists to no longer include these three IP addresses.  The IP addresses are fully removed from the blacklist itself.
Our software vendors will be doing maintenance on the control panel tomorrow morning from around 7am to 8am.

During this period the control panel at cp.blacknight.com and the online shop will not be available.


If you are using the popular CMS Joomla, please make sure that you are running the latest version.

Older versions of Joomla are affected by a serious security issue which can lead to your site(s) being compromised and possibly defaced.

If you installed Joomla using the auto-installer (Installatron) available to users on our DirectAdmin-powered servers, you should be able to upgrade via the control panel.

Even the Joomla developers were affected by this security issue.


This server went down at 13:09 and we're currently working to resolve the issue. Hopefully it'll be back promptly.

Update: 13:54

There appears to be a disk issue. This means that fsck is currently running on the machine, which could take a long time as it's running on the /home partition, which has circa 3 million files on it. I will post another update when I think I have an ETA for it to come back up.

Update: 19:30

We're still working on this server. We do apologise for this extended downtime, and we assure you that, once the server has recovered, this won't happen again, as we're putting measures in place to mitigate such issues. To address concerns regarding e-mail: most sending mail servers will queue e-mail for up to 5 days, so you should all get your e-mail once the server is back up.

Update: 21:45

All services should be restored. If anyone is still experiencing issues, please contact our support desk.
We have been informed by eNom that they will be conducting maintenance on their backend next Sunday, August 17th 2008.

The maintenance window will not have any impact on existing registrations; however, we will be unable to process any new registration requests from around 9am to 11am Irish time.

This only affects the following TLDs:
  • com
  • net
  • org
  • info
  • biz
  • mobi
This does not affect .eu, .ie or .co.uk, as we are directly accredited for those extensions.
Tomorrow morning at 9am Parallels are upgrading our billing software to version 4.3.3 including Hotfix 01. This will result in our Store and CCP being down for a period of time. Services won't be affected during this time.

We'll be on hand beforehand to make sure we're prepared, and we'll test everything extensively to make sure there are no bugs after the install.

This will affect only our new platform, i.e. https://store.blacknight.com and https://cp.blacknight.com

Update: 10:00 am August 14th

This hasn't happened yet as we had some complications on the test environment which we've been working on since 7am. I believe we've fixed this issue and we'll hopefully be upgrading the live environment shortly.

Expect the panel and the store to go offline for periods of time up to 11:30am.

Update: 11:50 am August 14th

We've tested the upgrades and all appears well. This upgrade has fixed a bug with co.uk registrations and a few other bugs we'd been faced with.

Store and CP should be up and running completely now.
Due to unprecedented demand on our new hosting platform we have a backlog of .ie domain names. This is because many of the domains submitted require further manual validation before we can submit them to the registry.

We're working on the store so that it gathers this information correctly. It should have done so already, but because of some JavaScript problems in Internet Explorer 6 and 7, a lot of registrations got through without the correct details.

We hope to get the new registrations that are pending fixed as soon as we can. Our development and engineering teams are working on the issue as a priority.
We're applying Windows and Virtuozzo updates this morning to some of our Windows VPS hardware nodes.

It'll take approx. 20 minutes for each one to complete its reboot and come back up.
Our engineers have had to bring down the web service (IIS) on the shared Windows server Palamedes (81.17.248.55).  This will cause your website to go down; however, all email services will be unaffected.

This service has been brought down because it looks like many sites on the server were compromised last night and had their index pages defaced.  In order to make the server secure and see how the hacker(s) compromised so many sites on the one server, we have had to disable the web service temporarily.

We hope to have this service enabled again as soon as possible, and in the meantime we apologise for any inconvenience this may cause.

Update: 12:15

Currently we're working to delete all the files that contain the string of text that was put in place. This will cause many sites to show blank pages, but it'll also re-enable many sites on this server.

The restore is going to take some time to run as we can't filter out the index files. We don't have an estimated time to fix for everyone, but many of your .NET apps will be back very shortly.

Update: 21:00 Sat August 9th

After a full day's investigation we've found the hole that allowed this attack. A high-profile site had an upload feature which allowed attackers to upload arbitrary code. This code was an ASP.NET "shell": a basic web page which gave the attackers access to customers' folders. We're not yet sure whether there's any real protection against an attack vector like this in shared Windows hosting; it's unlikely without restricting .NET apps and causing functionality issues.

The site in question has been shut down and the owner contacted. We've also crawled through millions of files on the server to find any and all traces of the offending index.html files placed on customers' domains. We've also found some other copies of the ASP.NET shell that the attackers left in case we found their primary entry point.
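
A crawl like the one described above could be approximated along these lines. This is only a sketch and not the tooling we actually used: the function name `files_containing`, the marker text and the example path are all hypothetical, and it only lists matching files rather than deleting or restoring anything.

```python
import os


def files_containing(root, marker, names=None):
    """Yield paths under `root` whose contents include the `marker` text.

    `names` optionally restricts the search to specific file names
    (compared in lowercase), e.g. {"index.html"}.
    """
    marker_bytes = marker.encode("utf-8")
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if names is not None and filename.lower() not in names:
                continue
            path = os.path.join(dirpath, filename)
            try:
                # Whole-file read: acceptable for small index pages,
                # not for very large files.
                with open(path, "rb") as handle:
                    if marker_bytes in handle.read():
                        yield path
            except OSError:
                # Unreadable file (permissions, locks): skip it and carry on.
                continue
```

For example, `list(files_containing(r"D:\hypothetical\vhosts", "defacement text", names={"index.html"}))` would produce a list of index pages to review before anything is removed.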

There will be one further update to this blog post in the coming week with further analysis of this attack vector and our solution to preventing it from happening again.

Final Update: Friday August 15 09:30

During our investigation of this intrusion we noticed a few security implications. We've now taken measures to ensure that the default application pools for .NET 1.1 and 2.0 do not run as the Network Service account. They now run as their own unprivileged users. We've also ensured that customers with their own application pools now run as their own web user.

We've also taken measures to mitigate attacks via other languages such as Perl and PHP, to further strengthen the shared hosting platform.
If you would like to see what's going on with the server your site is hosted on you now can.

We've set up a simple server / service status page which lists most of our shared hosting servers.

The various services are polled remotely from a server outside our core network once every 60 seconds.

Green - the service is working
Red - the service isn't working
Grey - the service isn't monitored (usually because it's either not available on that server or not used, e.g. a mail server wouldn't have a MySQL database, or if it did you wouldn't be connecting to it)
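
To give an idea of how a page like this can work, here is a minimal sketch of one polling pass that maps each service to one of the three colours above. It assumes a plain TCP connection test as the check, which may not be what our monitoring actually does; the host name, service names and function names are hypothetical, and a port of `None` is used here to mean "not monitored".

```python
import socket

POLL_INTERVAL_SECONDS = 60  # "once every 60 seconds"


def tcp_check(host, port, timeout=5.0):
    """Assumed check: the service counts as working if a TCP connection opens."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def service_colour(host, port, check=tcp_check):
    """Map one service to green, red or grey."""
    if port is None:
        return "grey"   # not monitored on this server
    return "green" if check(host, port) else "red"


def poll_once(services, check=tcp_check):
    """Run one polling pass over {host: {service_name: port_or_None}}."""
    return {
        (host, name): service_colour(host, port, check)
        for host, ports in services.items()
        for name, port in ports.items()
    }
```

A monitoring host outside the network would then call `poll_once` in a loop, sleeping for `POLL_INTERVAL_SECONDS` between passes, and render the resulting colours on the page.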

If you have any feedback on the status page please do let us know!
As this weekend is a bank holiday in Ireland, our offices will be closed on Monday.

Technical support will be available via the helpdesk as normal.

If you have a dedicated server, transit or colocation services, please use the "out of hours" contact details.