Notification Type
Emergency Maintenance
Date
April 12, 2011 9:50 AM
Service Affecting
Yes
Message
Summary: The above server is having intermittent backnet problems. The backnet is used for Active Directory authentication and communication with MySQL and MSSQL servers. We didn't post about this earlier because, until now, we had not been able to catch the problem while it was occurring.
We're working now to find the problem and put a fix in place. We'll post further updates as we have more information.
Update 13:15 - We will be taking this server offline for the next while to try and fix the problem once and for all. We apologise for this unexpected downtime.
Update 15:10 - This node was up and down for about an hour while we worked on it, and it has been down completely for the last 40 minutes or so. There appears to be corruption in the registry and/or system files. We're now backing out of our debug work and will restore the machine to a known good state. This will be a complete server restore, and the ETA is unknown at this time. Once the restore is under way and giving us a realistic completion time, we'll post an ETA on this status site.
Update 19:00 - The restore is still under way and is approximately 80% complete. When the server returns, all data will be as it was when we took the server offline.
Update 23:00 - The server restore has completed successfully. Because we had to roll back, the initial problem may still exist, so we will keep monitoring to see whether it recurs.
Update 16:00 - It appears the initial problem is indeed still occurring. We want to take the server offline for approximately 10 minutes at 18:00 to apply a fix.