November 2009 Archives

VPS Windows Hardware Node: PEMVZWIN04

We are currently experiencing issues with a VPS hardware node: PEMVZWIN04. Our engineers are working on resolving the issue at present.

A full description of the problem will be posted once resolved.

ETA for resolution: 20 minutes.

Update 11:26: The hardware node is coming back online now. It is taking approximately 2-3 minutes per VPS to boot.

Update 11:58: All VPSs are up and running successfully.

mail.blacknight.com issues

We are currently experiencing some intermittent issues with our mail cluster located at mail.blacknight.com

Our engineers are currently working to resolve this issue asap.

We will update this blog post once operations are back to normal.

UPDATE 16:16: The faulty server has been removed from the mail cluster completely. This means all services are back to normal, but the cluster is running with reduced capacity. We are working to get the server back online.

UPDATE 16:44: The server has now been brought online and introduced back into the cluster.

UPDATE 17:06: Due to inconsistencies with LDAP, we are trying to rebuild the index. This is currently causing some issues, which we are working on now. Please bear with us.

UPDATE 17:10: Mail is being routed correctly now. We are monitoring it closely.

UPDATE Nov 25th 11:30: There are still some lingering issues with the mail cluster, which are causing POP timeouts. We're currently investigating and working with our vendors to narrow down exactly where the problem is occurring.

UPDATE Nov 25th 12:40: This issue is ongoing. There appears to be a locking issue on the NFS server, which is being caused by a huge volume of pop3 connections coming into the servers.

UPDATE Nov 25th 15:00: This issue is still ongoing. We're aware that thousands of pop3 and imap users cannot currently access e-mail. The problem is still being worked on and we hope to have services restored today. We may need to put an emergency maintenance window in place in order to fully diagnose the issue. Right now the engineering team is tracing the high load on the mail servers back to the NFS file server and the storage it uses.

The storage is provided by an EMC Clariion CX300 Fibre Channel SAN. This should support hundreds of thousands of concurrent users; however, it seems that at extremely busy times it can't cope. The main reason appears to be that the pop3 process immediately reads the Maildir folder of the mailbox that is logging in, without waiting for the client application to send the list command. That means every single auth command creates an IO-intensive job that the mail servers have to deal with. We're doing everything within our power to resolve this and we've put a new storage array into the data centre. We hope to bring this online tonight, during which time we'll have to take mail completely offline.
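To illustrate the behaviour described above, here is a minimal Python sketch of the difference between scanning a mailbox at login and scanning it only when the client asks for a listing. It is a toy model, not the code of the pop3 daemon on the cluster: the `Mailbox` class, the `eager` flag, the `scans` counter and the `make_maildir` helper are all hypothetical names made up for this example, and only the `new` subfolder of the Maildir is read.

```python
import os
import tempfile


def make_maildir(root, count):
    """Create a throwaway Maildir-style folder holding `count` small messages."""
    for sub in ("cur", "new", "tmp"):
        os.makedirs(os.path.join(root, sub))
    for i in range(count):
        with open(os.path.join(root, "new", f"msg{i}"), "w") as handle:
            handle.write("x")
    return root


class Mailbox:
    """Toy model of a POP3 session backed by a Maildir folder (hypothetical)."""

    def __init__(self, maildir, eager):
        self.maildir = maildir
        self.scans = 0          # how many times the folder has been read
        self._messages = None
        if eager:
            # Behaviour described in the post: read the folder at login,
            # whether or not the client ever asks for a listing.
            self._scan()

    def _scan(self):
        self.scans += 1
        self._messages = sorted(os.listdir(os.path.join(self.maildir, "new")))

    def list(self):
        # Lazy alternative: only touch storage when a listing is requested.
        if self._messages is None:
            self._scan()
        return self._messages


if __name__ == "__main__":
    root = make_maildir(tempfile.mkdtemp(), 3)
    eager = Mailbox(root, eager=True)
    lazy = Mailbox(root, eager=False)
    print("scans at login (eager):", eager.scans)
    print("scans at login (lazy):", lazy.scans)
```

In the eager case every login costs one directory read on the shared storage, even for clients that connect and disconnect without listing anything. In the lazy case that read is deferred until `list()` is called.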

Points of Interest:

1) Both pop3 and imap are affected by this issue.
2) Delivery of inbound e-mail to your inbox is unaffected.
3) If we do decide on a maintenance window, we'll store all e-mail off the cluster and forward it on once the qmail cluster has returned to good health.
4) During the move to the new system, your old e-mail may disappear; however, it'll arrive back in your imap folder within 24-48 hours as we sync data back from the old storage server.
5) We're working with Parallels to devise a far better storage arrangement and configuration of the qmail cluster. This has been ongoing for a couple of months but we believe there is an end in sight.
6) Our support team are trying to deal with all your calls, e-mails and live chats as quickly as possible. Please bear in mind that the problem is out of their control and that the most up-to-date information on the issue will be posted here on the status blog.

Services affected currently: Webmail, imap and pop mail - smtp should continue as normal.

UPDATE Nov 26th 09:04: Services have resumed but at a degraded rate currently. Our engineers are executing a plan of action to get the system back up and running at full speed again. Please bear in mind the sheer volume of data we are working with; this is the primary reason this issue is taking some time to resolve.

Scheduled Maintenance - PEMLINWEB11

We are scheduling downtime this evening on one of our shared Linux hosting nodes. Due to the increased number of webspaces on this node, we will be moving it to a new hardware node.

The node itself will be down for 30 minutes during the transfer.

The node will begin to transfer at 22:00 this evening 24/11/2009.

Affected node is:

pemlinweb11.blacknight.com (81.17.254.93)

This will ensure that these nodes have plenty of breathing room for the future.

Update: We originally planned to migrate two nodes, but to ensure there are no issues we are going to migrate one at a time.

cp.blacknight.com upgrade on Thursday 26th of November

Summary: On Thursday November 26th we will be having POA 2.9 HF03 installed on our control panel. This has some feature improvements and also some bug fixes in relation to DNS propagation delays. There are also some other minor bugs that will be fixed.

When: 4am Thursday November 26th

What: Major software update of our control panel and provisioning software. This long awaited upgrade fixes a number of bugs we've encountered since the last upgrade on July 21st this year.

This means that from around 4am Irish time until circa 8am Irish time our control panel cp.blacknight.com will be offline. Your hosting, e-mail and other ancillary services shouldn't be affected, bar a restart of IIS and Apache post-upgrade. These restarts should all be staged and shouldn't cause more than a minute or two of downtime for your websites.

If after 9am on Thursday you're experiencing any difficulty, please let us know.

Shared Server Ragnell experiencing issues

The shared DirectAdmin server "Ragnell" is currently experiencing issues.

Our technical team are working on a resolution.

This notification only affects sites on that server.

UPDATE 15:30 - This issue has now been resolved and services are back to normal.

Shared Windows Database Hosting - Lancelot

We are currently experiencing issues with our shared Windows Microsoft SQL Server hosting node, Lancelot.blacknight.ie.

Our engineers are working to resolve this issue currently and will update this blog post once resolved.

UPDATE 13:33 - This issue has now been resolved and services are back to normal.

.Be Registry Scheduled Maintenance

DNS.be, the .be registry operator, have informed us of scheduled maintenance next week.

When?

Tuesday 24 November 2009

What time?

1730 - 2200 CET (Central European Time)

What will be affected?

All registration services and whois.

Existing .be domain names will not be impacted

Issues with Shared Hosting Qmail cluster

Our Shared Hosting Qmail cluster is currently experiencing some problems. This has resulted in some people experiencing timeouts collecting their emails.

More on this as we find out more.

Telnic (.tel) Maintenance - Sunday 22 November

Telnic have scheduled maintenance of the Community TelHosting Services on Sunday 22nd November 2009.

The maintenance window is scheduled from 0830 to 1200 (UTC)

During this timeframe the following services will be unavailable:
CTH Web based UI and API - Not available during the maintenance window

All other Telnic services will be available

In simple terms - you won't be able to add or update data on your .tel domains, but any existing domains should not be affected. New registrations etc., should be available.

Shared Hosting Server Priamus

Apache on the shared hosting server Priamus was experiencing issues earlier this morning due to a higher than normal level of activity.

Once our technical team became aware of the issue, they took action to remedy the situation.


Com / Net Scheduled Maintenance

Both .com and .net will be undergoing maintenance on Sunday December 13th 2009 from 0100 to 0145

During this time period new registrations and updates will not be possible.

Existing domain names will not be affected

Issues with Shared Linux Hosting Server - Morgana

Our shared hosting server, Morgana, was compromised at 14:50. The attackers have removed data from both the DirectAdmin configuration and the users' home directories.

The most recent backup we have of the server is from the 27th of August.

We are working on restoring what we can from this backup now.

Once all of the data has been synced from the backup to the server, we will then recreate any user accounts that were not within the backup.

The ETA on the initial backup restore will be within the next few hours, all going well. Once complete we will move on to creating the accounts we could not restore.

We will keep this blog post as up to date as possible as we progress.

Update 17:06 - We have restored the user data from the backup - If anyone is still experiencing any issues please let support know.

Update 1137 - Our technical team have been working around the clock in relation to this matter.
If you are still experiencing any issues, please contact our support team directly via email (support@blacknight.com) and we will make every attempt to rectify them. Please include the domain name and as much detail as possible of the issue you are experiencing.

Update 1418:

Our senior technical staff have now completed all work on restoring data in relation to this matter.

Any backups of data that we could retrieve have been retrieved or are in the final stages of being retrieved i.e. the restore is still ongoing.

Please note that if your website or its files etc., are no longer visible at this juncture it is most likely that they are not retrievable from our servers and we would, therefore, ask you to restore them from your own backups.

Our technical support team is doing their utmost to reply to queries as quickly as possible.

In relation to the illegal breach of this server, we have contacted the relevant authorities and intend to update on this as soon as their investigations are complete.

Update 1519 - Some email users may have been experiencing errors, e.g. 550 or similar errors. This issue has now been resolved.

Emergency hard disk replacement: PEMVZMPS21

Due to a hard drive failure on our hardware node, pemvzmps21, we are about to take this server offline. The disk needs to be taken out of the RAID array, replaced and then rebuilt.

The estimated downtime is 15 minutes, which is the time it will take to physically replace the disk itself.

Once it is replaced and the RAID array restored, the status blog will be updated.

The services this will affect are any services hosted on:

pemlinweb23.blacknight.com (81.17.254.62)
pemlinweb24.blacknight.com (81.17.254.63)

Thank you for your co-operation


UPDATE: New disk is in, system booted. All customer services should be restored.
UPDATE: New disk is now resyncing its data from its friends.
UPDATE: Disks have resynced and the RAID array has been rebuilt. All services are back to normal. If you are experiencing any issues please contact support asap.

Thanks again.

Domain Service Unscheduled Maintenance

Enom, whom we still use for a large portion of our domains, are currently experiencing issues.

This means that existing domains will continue to resolve, but there may be issues or delays with modifications, updates and registrations

UPDATE 1332: This matter appears to have been resolved

Telnic (.tel) Maintenance

Telnic (.tel) are conducting maintenance on their backend on November 7th 2009

From 1400 to 1600 we will not be able to process new registrations or updates

Existing domains will not be affected and will continue to resolve

UPDATE: This maintenance has completed

Info Registry Maintenance

The .info registry has scheduled maintenance on 14 November 2009 from 1500 to 1900

During this period no new registrations or updates will be possible

Existing .info domains will continue to resolve as normal

UPDATE: We have been informed that this maintenance was completed without incident

Co.uk Registry (Nominet) Maintenance

The co.uk domain registry (Nominet) will be carrying out maintenance on Sunday November 8th 2009 from 7.30 am to 12.30 pm

During this period no new registrations or updates will be processed.

Existing co.uk domains will not be affected

UPDATE: This maintenance has now completed

Dedicated / Co-Location Network Switch Upgrade

We are planning an upgrade of a switch that is utilized by some of our dedicated and co-location customers. The plan includes upgrading the 24 port switch to a new 48 port switch.

The upgrade will take place on Wednesday the 11th of November at 22:00 hours.

All dedicated and co-location clients with equipment in the cabinet shall be notified separately via email to ensure they are fully up to speed on the matter.

The new switch will be mounted and connected, and customers will then simply be migrated over one by one to the new switch. There should be a blip of no more than 5-10 seconds per port.

This upgrade will allow for expansion in the future.

Once the upgrade is complete I will post a status update.

Thank you for your patience and understanding.

Update 10:16PM - The switch migration was completed. The downtime per port was approximately 10 seconds.