Summary: Shared hosting server morgana is currently experiencing issues and our engineering team is working on it.
We'll post updates when we have them.
Update: 17:00
Sorry about the delay in getting back to you all. This machine is dead, and we're currently building a replacement. We hope to have it ready in the coming hours, at which point it will take some time to migrate the data off the disks. We estimate around 8 hours for the complete restore. We apologise for the delay; these machines hold upward of 10 million files, which take some time to restore.
Update: 19:12
The new hardware is online and the technical team are currently setting up software and services and transferring data onto it.
Update: 23:10
The new server is up. All MySQL data and all configuration have been restored. The main data store is currently 65% restored, with around 52GB completed. Restores can take some time as there's a lot of data involved.
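As a rough sanity check on the figures above, the total size of the data store and the amount still to restore can be estimated from the percentage and the completed volume. This is only a sketch: the function name is hypothetical, and it assumes the 65% figure is measured by data volume rather than by file count.

```python
def estimate_restore(done_gb: float, fraction_done: float) -> tuple[float, float]:
    """Return (estimated total GB, estimated remaining GB).

    Assumes progress is proportional to data volume, which may not
    hold if the percentage is counted in files instead.
    """
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    total_gb = done_gb / fraction_done
    return total_gb, total_gb - done_gb


# Figures from the update: 52GB completed at 65%.
total, remaining = estimate_restore(done_gb=52, fraction_done=0.65)
print(f"total ~ {total:.0f} GB, remaining ~ {remaining:.0f} GB")
```

On those numbers the store works out to roughly 80GB, with about 28GB left to copy.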
We've tested several websites and they all appear to be working fine.
On the upgrade: it's a more modern server than the old morgana and has 8GB of RAM, so expect to see some performance gains.
Update: 01:00
This machine is now back up. All sites that we've checked appear to be working normally, e-mail is flowing, etc. If you have any issues please let our helpdesk know.
The server known as Ragnell (81.17.252.110) is currently down. We're working to resolve this issue at the moment.
I hope it'll be back in service by 15:15 at the latest.
Update: 15:20
This machine is now back in service; in the end we had to physically reboot it. It looks like a small DoS against the Apache service caused a kernel panic when it consumed all the memory and swap.
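One common way to keep a web server from exhausting memory and swap under this kind of load is to cap the number of worker processes so that their combined footprint fits in RAM. The sketch below shows only the arithmetic. All the figures in it are hypothetical, not measurements from this server, and the result would be applied through Apache's worker limit directive (MaxClients in Apache 2.2, MaxRequestWorkers in 2.4).

```python
def max_workers(total_ram_mb: int, reserved_mb: int, per_worker_mb: int) -> int:
    """Largest number of workers whose combined memory fits in RAM.

    reserved_mb is memory set aside for the OS and other services
    such as the database; per_worker_mb is the typical resident size
    of one web server worker. Both are assumptions to be measured.
    """
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    return max(1, (total_ram_mb - reserved_mb) // per_worker_mb)


# Hypothetical figures: 2GB of RAM, 512MB reserved, 30MB per worker.
limit = max_workers(total_ram_mb=2048, reserved_mb=512, per_worker_mb=30)
print(limit)
```

With a cap like this, excess requests queue or are refused instead of pushing the machine into swap.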
Our Linux shared hosting box known as "Bors" is currently experiencing difficulties. One of our engineers is looking at it right now and we hope to have a quick resolution.
We'll post further updates once we have more information.
Update: 17:26
This machine has kernel panicked. That's two in a week, so we suspect the memory has become faulty. We'll follow up with a maintenance window next week to replace the memory.
Morgana has been having intermittent service issues this evening. An engineer is currently working on it and we'll update this ticket when we have anything to report.
Update: 21:00
The server has been back up since 20:45. We're monitoring the situation for the next few hours and this ticket will remain open until we're satisfied that the server has stabilised. We believe this issue was caused by a surge of traffic to a popular blog that is hosted on this machine.
The shared server 'bors' is currently having issues. We're working to get this back now. It's one of the older shared servers that we have and it's time for it to be replaced. We'll begin contacting customers soon to move them over to the new system.
Update: Nov 11, 10:31
The server came back fully at 15:39 on the 10th of Nov. Sorry for not updating the post.
The above named server is currently experiencing issues. We're working to resolve them at the moment and it should be back working shortly. Please address support queries to our support team via e-mail support (at) blacknight.com or via the web https://support.blacknight.ie
Update: 12:50
This server is back now. Similar to the issue with 'iseult' last week, we can attribute this downtime to an attack on a customer's website.
We're monitoring the situation closely.
We're currently working on an issue with this machine. It appears that the load shot up and the machine has become unresponsive. We believe this is due to one customer site being effectively DoS'd. More information will be posted when we have it.
Update: 10:12
This machine is back up now; same issue as Monday. We're contacting the customer on the receiving end to see what we can do to mitigate this issue in future.
Bors is currently experiencing some issues. We're working to resolve this as soon as possible.
Background:
Over the last few days we've noticed the load on this server being above normal. We're investigating why, and we think the issue is being caused by comment spammers hammering some customers' blogs. This is loading the MySQL server on bors and causing the load to go up very quickly.
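To illustrate how this kind of comment spam can be spotted, the sketch below counts comment POST requests per client address in web server access log lines. It is an illustration under assumptions, not the tooling used on bors: the log format is assumed to be Apache "combined" style, the sample lines and addresses are made up, and the comment endpoint shown is WordPress's, which may not match the blog software involved.

```python
from collections import Counter

# Made-up sample lines in Apache "combined"-style format (documentation IP ranges).
SAMPLE_LOG = """\
203.0.113.5 - - [01/Jan/2000:13:00:01 +0000] "POST /wp-comments-post.php HTTP/1.1" 200 512
203.0.113.5 - - [01/Jan/2000:13:00:02 +0000] "POST /wp-comments-post.php HTTP/1.1" 200 512
198.51.100.7 - - [01/Jan/2000:13:00:03 +0000] "GET /index.php HTTP/1.1" 200 2048
203.0.113.5 - - [01/Jan/2000:13:00:04 +0000] "POST /wp-comments-post.php HTTP/1.1" 200 512
198.51.100.7 - - [01/Jan/2000:13:00:05 +0000] "POST /wp-comments-post.php HTTP/1.1" 200 512
"""


def count_comment_posts(lines, path="/wp-comments-post.php"):
    """Count POST requests to the comment endpoint, keyed by client IP."""
    counts = Counter()
    for line in lines:
        parts = line.split('"')
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        request = parts[1].split()  # e.g. ["POST", "/path", "HTTP/1.1"]
        if len(request) >= 2 and request[0] == "POST" and request[1].startswith(path):
            counts[line.split()[0]] += 1  # first field is the client address
    return counts


counts = count_comment_posts(SAMPLE_LOG.splitlines())
for ip, n in counts.most_common():
    print(ip, n)
```

Addresses that dominate the count are candidates for blocking or rate limiting before their requests reach the database.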
We hope to have a resolution to this issue today.
This server went down at 13:09 and we're currently working to resolve the issue. Hopefully it'll be back promptly.
Update: 13:54
There appears to be a disk issue. fsck is currently running on the machine, which could take a long time as it's checking the /home partition, which has around 3 million files on it. I will post another update when I think I have an ETA for it to come back up.
Update: 19:30
We're still working on this server. We apologise for the extended downtime and assure you that once the server has recovered, this won't happen again, as we're putting measures in place to mitigate such issues. To address concerns regarding e-mail: most sending mail servers will queue e-mail for up to 5 days, so you should all get your e-mail once the server is back up.
Update: 21:45
All services should be restored. If anyone is still experiencing issues please contact our support desk.
The server "ragnell" is currently experiencing issues.
From what we have been able to determine one site on the server has caused issues.
Our engineering team are currently working on restoring services as quickly as possible to affected clients.
Update: The server has been restored to full service.
We're investigating an issue with the server "arthur" which is currently not reachable. An engineer is on their way to look at it. We hope to have it back shortly.
There is already a spare server built for "arthur" and if the drives have gone bang wallop we'll restore the latest backups to it on the new hardware.
We'll update this blog post when more information is available.
Update: 15:55
The server is back. It required manual intervention in the form of an fsck on one of the partitions, which took some time to complete. We fear the disk(s) are having issues and we're going to schedule a maintenance window in the coming few days to complete the migration. We're already syncing data to the new hardware in case it falls over again.