February 2011 Archives

Windows VPS Server reboot


One of our Windows VPS servers requires a reboot to resolve a technical issue. We propose to carry this out at 22:00 GMT this evening. The following VPSs will be affected:

 

78.153.196.200
78.153.196.201
78.153.196.203
78.153.196.204

Update 23:03 - this maintenance window is complete and now closed.

Linux Shared Servers - 81.17.254.44 & 81.17.254.48


The following Linux Shared Hosting Servers are experiencing difficulties and we have dispatched someone to reboot them.

81.17.254.44 & 81.17.254.48

Update 17:00 - servers have been rebooted, and have now returned to normal.

MySQL Server mysql452 & Shared Linux server linweb31 unavailable


The above MySQL server is currently experiencing problems. We're endeavouring to bring the node back online as soon as possible.

Update 14:42: We have had to reboot the hardware node hosting this server. Unfortunately, this has also brought down shared Linux server pemlinweb31 (81.17.254.48). We hope to have them both back ASAP.

Update 16:42: The servers are coming back online after a file-system check. We apologise for any inconvenience caused by this outage.

MySQL Server issue

We are currently experiencing load problems with two MySQL nodes: mysql870 and mysql873.

UPDATE 15:50: It seems there was a memory leak on the hardware node. We've got that under control and both MySQL servers are behaving as expected now. We have a ticket open with our vendors to see if this can be fixed.

Issues With PEMLINWEB32 and PEMLINWEB33

We have had to temporarily shut down pemlinweb32 and pemlinweb33 as the hardware node they're on is having issues. The RAID is currently rebuilding, and while the pemlinwebs were running, the rebuild was taking far too long and significantly affecting the performance of the web servers. By stopping the servers, we're able to rebuild the RAID a lot faster and get the web servers working properly again.

UPDATE 00:30: This has been completed.

Issues with PEMLINWEB45/46

We are currently experiencing issues with two of our Linux shared hosting web servers. Our engineers are working to resolve this.

pemlinweb45.blacknight.com 78.153.214.36
pemlinweb46.blacknight.com 78.153.214.37

UPDATE 22:08: This is now resolved.

PEMLINWEB21/22 Issues

We are currently experiencing issues with the nodes pemlinweb21 and pemlinweb22. Our engineers are working on resolving this ASAP:

pemlinweb21.blacknight.com 81.17.254.58
pemlinweb22.blacknight.com 81.17.254.59

UPDATE 06:28: This issue is now resolved.


pemlinweb19 & pemlinweb20 unresponsive


The above servers became unresponsive and have been rebooted. They are currently running a quota check, which may take some time to complete. Once it completes, the servers will return to normal.

pemlinweb19 - 81.17.254.79

pemlinweb20 - 81.17.254.57

 

Update 14:25: These servers have now returned to normal.

Org Registry Scheduled Maintenance

PIR, the .org registry, is conducting scheduled maintenance on 19 February 2011 between 15:00 and 19:00 UTC.

During this period WHOIS, updates and new registrations will not be available.

Existing .org domain names will continue to resolve as normal.

pemlinweb21 & pemlinweb22 unresponsive

The above servers became unresponsive and have been rebooted. They are currently running a quota check, which may take some time to complete. Once it completes, the servers will return to normal.

smtp1r.cp.blacknight.com / mail.blacknight.com issue

Summary: This morning we had a brief outage on our LDAP servers which caused authentication issues and some issues for people trying to send e-mail.

It occurred at 10:35:08 and ran until 10:42:10. This issue happens from time to time under high load, when LDAP's indexes become corrupt. Typically it only lasts a couple of minutes. Newer versions of OpenLDAP fix the issue; however, the upgrade isn't currently supported by our software vendor. We've been discussing this with them for some time.

Helpdesk Maintenance

From 19:00 to 20:00 this evening our Helpdesk will be down while we upgrade the backend software. As part of this, all inbound mails will be queued for the duration of the maintenance period.

Out-of-hours support for dedicated and colo customers will still be available on the on-call number.

MySQL Server Issue

The MySQL servers at 81.17.254.34/172.16.4.247 and 81.17.254.35/172.16.4.248 are currently down. The load got so high that no access was possible and we were forced to reboot the hardware node. The server is on the way back up, but we're waiting for a file system check to complete.

ETA is currently about 15 to 20 mins.

Update 18:05: Both servers are back up and running.

mail.blacknight.com / smtp1r.cp.blacknight.com emergency maintenance

Summary: This evening we're going to go ahead and do the work we said we'd do in http://blacknig.ht/18d.

We'll start at 19:00 and it shouldn't take longer than an hour or so.

Update 20:00: The mail is syncing between the old and new storage. It's about 50% done. We suspect it won't take more than another hour or so.

Update 20:19: FYI this affects all Qmail services, so POP/IMAP and SMTP, including pop33r.cp.blacknight.com etc.

Update 20:45: This isn't quite complete yet. It might run past 21:00 GMT and if it does we'll roll back the changes and try again later tonight. We'll leave the sync running while mail continues to flow and put pop3/imap services back live.

Update 21:00: We've backed out of the move for now. We thought that the sync of 24 hours of data would only take 45-60 minutes; unfortunately it's still going. We'll go again in a few hours when fewer people are waiting on e-mail.

Update Saturday 5th @ 00:05: The data sync has been ongoing for the past few hours and is almost complete. At this stage we're going to block inbound SMTP e-mail and leave pop3/imap alive until we're ready for the final move to the new platform. By disabling inbound SMTP we won't have to re-sync all the newly arriving e-mail from scratch again.

Update 02:20: The data sync is complete. People may find the odd e-mail from earlier this evening marked as unread, or it may download again via pop3. This was unavoidable. The good news is that we've moved over to the new storage platform and so far it's performing far better, although this is a quiet time for the cluster.

Webmail/pop/imap/smtp access has now been fully restored.
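
The reasoning in the 00:05 update (block inbound SMTP so the final sync has a fixed target) can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the tooling we actually used: the names `sync_pass` and `simulate`, the in-memory dictionaries standing in for the old and new storage, and the message counts are all hypothetical.

```python
from typing import Dict, List


def sync_pass(old: Dict[str, bytes], new: Dict[str, bytes]) -> int:
    """Copy every message missing from `new`; return how many were copied."""
    copied = 0
    for msg_id, body in old.items():
        if msg_id not in new:
            new[msg_id] = body
            copied += 1
    return copied


def simulate(initial: int, arrivals_per_pass: int,
             block_inbound_after: int, max_passes: int = 50) -> List[int]:
    """Return the number of messages copied in each sync pass.

    Hypothetical model: while inbound mail is open, `arrivals_per_pass`
    new messages land on the old storage during every pass, so the two
    stores never quite agree. Once inbound is blocked, one more pass
    is enough to bring the new storage level with the old one.
    """
    old = {f"msg{i}": b"" for i in range(initial)}
    new: Dict[str, bytes] = {}
    next_id = initial
    history: List[int] = []
    for pass_no in range(1, max_passes + 1):
        history.append(sync_pass(old, new))
        if pass_no <= block_inbound_after:
            # Inbound still open: mail keeps arriving on the old storage.
            for _ in range(arrivals_per_pass):
                old[f"msg{next_id}"] = b""
                next_id += 1
        elif old.keys() == new.keys():
            # Inbound blocked and both stores agree: safe to cut over.
            break
    return history


if __name__ == "__main__":
    print(simulate(initial=100, arrivals_per_pass=5, block_inbound_after=3))
```

With inbound left open for three passes, the simulation copies the bulk of the mail first, then a small delta on each later pass, and stops on the first pass after inbound is blocked. pop3/imap reads are not modelled here, which is one reason a real migration can still leave the odd message flagged unread.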

mail.blacknight.com / smtp1r.cp.blacknight.com emergency maintenance

Summary: In order to provide a better quality of service to our customers we've decided to step up the installation of the new mail storage system. We're going to go ahead and do it this evening.

What and When: At 22:30 we'll shut down inbound e-mail, imap and pop3 access to the qmail cluster. We'll spend around 15-30 minutes confirming the configuration is synchronized between the two machines and finally we'll run one last sync of the data. We kicked off the restore of last night's backup onto the new platform, which has taken 10 hours or thereabouts to complete. We're hoping this will bring huge improvements to e-mail delivery, imap and pop access and especially webmail access.

We've an upgrade of the webmail planned for the near future to a newer version of Atmail which has better caching support for folders and e-mail.

Update 23:30: Due to delays in the restore completing on time we'll scrub this until tomorrow night around the same time. We'll post a fresh maintenance window tomorrow morning for it.

smtp1r.cp.blacknight.com / mail.blacknight.com slow mail delivery

Summary: Due to a large volume of inbound e-mail from certain sources there's around a 15-20 minute delay on inbound e-mail. Outbound is unaffected at this time.

This should clear itself up by around 13:30-14:00. In the meantime, services continue to function, just a little slower than normal.