Notification Type: Technical Information
Service Affecting: No
Message:
Summary: We're well aware of the problems that have been occurring on the Qmail cluster of late. It's been particularly bad for around 4 weeks, since a software update from our software vendor. We've been back and forth with them about this for a long time, and the bottom line is that we have to throw hardware at it because software fixes take too long to come online.
What are we doing?
Good question. We've been analyzing the Qmail cluster with the help of Parallels to identify its pinch points. It has also become a lot clearer to us that we can add new nodes to the cluster relatively easily. This is only possible because of the number of complaints we've taken to Parallels, which has left us with a very clear understanding of their structure and procedures for adding new servers.
1) Today we added two more multipurpose (SMTP/POP/IMAP) servers to the cluster.
These were tested throughout the day and added to the load balancer at around 17:00 this evening. The load is now being split across all the multipurpose nodes in the cluster.
We expect this to improve the stability of everyone's service in the short term.
2) We've identified one more pinch point that we believe needs to be rectified: the anti-spam server, which limits e-mail delivery speed. While the control panel has supported multiple SpamAssassin (SA) servers for a while, the documentation is incomplete. We've put questions to Parallels; some have been answered and we're still waiting on others. We hope to have more SA servers in place in the coming weeks and to upgrade the hardware for the existing node.
3) The only single point of failure is LDAP. Until recently it wasn't possible to add more than one LDAP server. The LDAP server provides authentication, quota, mailbox location and other specific information to the mail servers. When it has issues, people can't log in, can't send e-mail, and e-mail basically doesn't work at all. We intend to have at least one LDAP replica in place in the coming week. It's a bit tricky, as it means we have to shut the entire cluster down. We're still evaluating the best way to do this and we'll announce it as soon as possible.
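To illustrate why a replica removes that single point of failure, here is a minimal sketch in Python. It is not how the Parallels software or the mail servers actually query LDAP; the function names, the record fields and the example values are all hypothetical and exist only to show the idea of falling back to a second directory server when the first one is unreachable.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical record shape, loosely based on the data the notice says
# LDAP provides (quota, mailbox location). Values are illustrative only.
Record = Dict[str, str]
Lookup = Callable[[str], Record]


class DirectoryUnavailable(Exception):
    """Raised when no directory server could answer the lookup."""


def lookup_with_failover(user: str, servers: List[Lookup]) -> Record:
    """Try each directory server in order and return the first answer."""
    last_error: Optional[Exception] = None
    for server in servers:
        try:
            return server(user)
        except ConnectionError as exc:
            # This server is unreachable; fall through to the next one.
            last_error = exc
    raise DirectoryUnavailable(
        f"no directory server answered for {user}"
    ) from last_error


def broken_primary(user: str) -> Record:
    # Simulates the primary directory server being down.
    raise ConnectionError("primary is down")


def healthy_replica(user: str) -> Record:
    # Simulates a replica holding a copy of the same data (made-up values).
    return {"user": user, "quota": "1GB", "mailbox": "/var/mail/" + user}
```

With only `broken_primary` in the list, every lookup fails, which is the situation described above. With `[broken_primary, healthy_replica]`, the lookup still succeeds because the replica answers.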
We know that the issues are frustrating. I can assure you that it's just as frustrating for us. As an example, when the authentication (LDAP) issues occurred on Friday 27th of May, we had 1400 phone calls in 55 minutes. This is unprecedented. A few days before that, we had already decided to get more hardware specifically for the Qmail cluster and have it installed as soon as possible. To close, we hope to put these issues to bed so that we can offer a much more stable e-mail platform for everyone.