mail.blacknight.com issues


Notification Type

Emergency Maintenance

Service Affecting

Yes

Message

We are currently experiencing intermittent issues with our mail cluster, located at mail.blacknight.com.

Our engineers are working to resolve this issue as soon as possible.

We will update this blog post once operations are back to normal.

UPDATE 16:16: The faulty server has been removed from the mail cluster completely. This means all services are back to normal, but the cluster is running with reduced capacity. We are working to get the server back online.

UPDATE 16:44: The server has now been brought online and introduced back into the cluster.

UPDATE 17:06: Due to inconsistencies with LDAP, we are trying to rebuild the index. This is currently causing some issues, which we are working on now. Please bear with us.

UPDATE 17:10: Mail is being routed correctly now. We are monitoring it closely.

UPDATE Nov 25th 11:30: There are still some lingering issues with the mail cluster which are causing POP timeouts. We're currently investigating and working with our vendors to narrow down exactly where the problem is occurring.

UPDATE Nov 25th 12:40: This issue is ongoing. There appears to be a locking issue on the NFS server, caused by a huge volume of POP3 connections coming into the servers.

UPDATE Nov 25th 15:00: This issue is still ongoing. We're aware that thousands of POP3 and IMAP users cannot currently access e-mail. The problem is still being worked on and we hope to have services restored today. We may need to put an emergency maintenance window in place in order to fully diagnose the issue. Right now the engineering team is tracing the high load on the mail servers back to the NFS file server and the storage it uses.

The storage is provided by an EMC Clariion CX300 Fibre Channel SAN. This should support hundreds of thousands of concurrent users; however, it seems that at extremely busy times it can't cope. The main reason appears to be that the POP3 process immediately reads the Maildir folder of the mailbox that is logging in, without waiting for the client application to send the LIST command. That means every single auth command creates an I/O-intensive job that the mail servers have to deal with. We're doing everything within our power to resolve this and we've put a new storage array into the data centre. We hope to bring this online tonight, during which time we'll have to take mail completely offline.
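To illustrate why a scan on every login is expensive, here is a minimal Python sketch that counts Maildir directory scans for two hypothetical session styles: one that reads the mailbox at authentication time, and one that waits until the client asks for the message list. The class and function names (`CountingMaildir`, `EagerSession`, `LazySession`, `make_demo_maildir`, `count_scans`) are invented for this illustration and are not taken from the mail cluster's software; the scan counter is only a stand-in for real storage I/O.

```python
import os
import tempfile


class CountingMaildir:
    """Wraps a Maildir path and counts directory scans (a stand-in for storage I/O)."""

    def __init__(self, path):
        self.path = path
        self.scans = 0

    def list_messages(self):
        # Each call reads new/ and cur/; on NFS this means network round trips.
        self.scans += 1
        names = []
        for sub in ("new", "cur"):
            names.extend(sorted(os.listdir(os.path.join(self.path, sub))))
        return names


class EagerSession:
    """Hypothetical session that scans the mailbox as soon as the user authenticates."""

    def __init__(self, maildir):
        self.maildir = maildir
        self.messages = maildir.list_messages()

    def list(self):
        return self.messages


class LazySession:
    """Hypothetical session that defers the scan until the list is first requested."""

    def __init__(self, maildir):
        self.maildir = maildir
        self.messages = None

    def list(self):
        if self.messages is None:
            self.messages = self.maildir.list_messages()
        return self.messages


def make_demo_maildir(root, message_count):
    """Create a throwaway Maildir layout containing empty message files."""
    for sub in ("new", "cur", "tmp"):
        os.makedirs(os.path.join(root, sub))
    for i in range(message_count):
        open(os.path.join(root, "new", f"msg{i}"), "w").close()
    return root


def count_scans(session_cls, logins, root):
    """Simulate `logins` clients that authenticate and disconnect without listing."""
    maildir = CountingMaildir(root)
    for _ in range(logins):
        session_cls(maildir)
    return maildir.scans


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = make_demo_maildir(os.path.join(tmp, "Maildir"), 5)
        print("eager scans:", count_scans(EagerSession, 100, root))
        print("lazy scans:", count_scans(LazySession, 100, root))
```

This is only a model of the cost described above, not a description of any fix applied to the cluster. Most POP3 clients go on to issue STAT or LIST straight after logging in, so deferring the scan would mainly help sessions that disconnect or fail early.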

Points of Interest:

1) Both POP3 and IMAP are affected by this issue.
2) Delivery of inbound e-mail to your inbox is unaffected.
3) If we do decide on a maintenance window, we'll store all e-mail off the cluster and forward it on once the qmail cluster has returned to good health.
4) During the move to the new system, your old e-mail may disappear; however, it'll arrive back in your IMAP folder within 24-48 hours as we sync data back from the old storage server.
5) We're working with Parallels to devise a far better storage arrangement and configuration of the qmail cluster. This has been ongoing for a couple of months but we believe there is an end in sight.
6) Our support team are trying to deal with all your calls, e-mails and live chats as quickly as possible. Please bear in mind that the problem is out of their control and that the most up-to-date information on the issue will be posted here on the status blog.
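The store-and-forward approach in point 3 can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the actual qmail configuration: the `HoldingQueue` class and the `deliver` and `cluster_healthy` callables are hypothetical names invented for this example.

```python
from collections import deque


class HoldingQueue:
    """Holds inbound messages while the cluster is unhealthy, then forwards them in order."""

    def __init__(self, deliver, cluster_healthy):
        self.deliver = deliver                  # hypothetical callable: deliver(message)
        self.cluster_healthy = cluster_healthy  # hypothetical callable returning True/False
        self.held = deque()

    def accept(self, message):
        # Deliver directly only when the cluster is healthy and nothing is waiting,
        # so new mail never overtakes mail that was held earlier.
        if self.cluster_healthy() and not self.held:
            self.deliver(message)
        else:
            self.held.append(message)

    def flush(self):
        """Forward held messages oldest first; returns how many were forwarded."""
        forwarded = 0
        while self.held and self.cluster_healthy():
            self.deliver(self.held.popleft())
            forwarded += 1
        return forwarded


if __name__ == "__main__":
    delivered = []
    state = {"healthy": False}
    queue = HoldingQueue(delivered.append, lambda: state["healthy"])

    queue.accept("msg-1")    # held: the cluster is down
    queue.accept("msg-2")    # held
    state["healthy"] = True  # the cluster has returned to good health
    queue.flush()            # forwards msg-1, then msg-2
    queue.accept("msg-3")    # delivered directly
    print(delivered)
```

The queue preserves arrival order, which is why a message that arrives while older mail is still held is queued behind it rather than delivered immediately.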

Services currently affected: Webmail, IMAP and POP mail. SMTP should continue as normal.

UPDATE Nov 26th 09:04: Services have resumed but are currently running at a degraded rate. Our engineers are executing a plan of action to get the system back up and running at full speed again. Please bear in mind the sheer volume of data we are working with; this is the primary reason this issue is taking some time to resolve.

15 Comments

I'm getting a mis-match on the security certificates for the mail server now.

Can you check this, please?

I have been having terrible email problems for the past day or two with IMAP, not POP3, and especially with SMTP. JH

Dave Alshire on November 25, 2009 2:20 PM

Hello,

I was just wondering if there were any further updates on this issue or if there was an expected time for a resolution?

Regards,
Dave

Karl Toomey on November 25, 2009 2:38 PM

Can't log in to webmail and all accounts are offline in Outlook. Using IMAP.

James Pelow on November 25, 2009 2:50 PM

Mismatch on SSL certs and "attempt to read data from server failed".

The technical team are working on this issue.

As soon as we have an update we will update this blog post.

Thanks

Michele

Dave Alshire on November 26, 2009 8:56 AM

Hello,

I was just wondering if there was an update on this issue for this morning?

Regards,
Bill

hi blacknight,

my clients are really annoyed about this - if it happens again, I and other clients will be changing providers. Businesses really rely on email these days... not too impressed with the web-mail interface either....

Wondering when we'll get all the missed emails from yesterday.

Helene Hugel on November 27, 2009 10:26 AM

Could you please tell me if there is any update this morning? Perhaps you know when we might be able to access our mail through POP3? Thank you.


Hi,

Has this issue been fully resolved now? Can we expect a reliable and stable email hosting service going forward?

Also, it would be helpful if you could ask the engineers to update the blog with the current status, i.e. what corrective actions were taken and why it took so long to fix the problem.

Regards,

Michael R

Some clients are reporting messages and responses sent to me are bouncing. This matter seems to be taking a long time to resolve. Is anyone else experiencing intermittent send/receive issues and bounce reports from clients?

Could you please contact support with details?
Thanks
Michele

Webmail and IMAP down yet again.

Chris

Please see the most recent status post: http://www.blacknightstatus.com/2009/12/mail-authentication-issues.html

The current issue is being worked on and we are currently running the cluster in a degraded state.

Michele
