Recently in network Category

Packet Latency on Dedicated and VPS

We're currently experiencing some packet latency on our Dedicated and VPS network. Our engineers are working on this issue.

The issue is now resolved and network connectivity is returning to normal.

Reminder: Network Upgrade 20/12/11 23:00 - 21/12/11 01:00

As part of our core network upgrade, we're going to be upgrading part of the core network in Interxion. This will consist of the following tasks:

  • Upgrading one of the switch blades in each of our core routers in Interxion
  • Moving the connection to INEX LAN2 onto a new router

There should be no downtime caused by this upgrade. However, it is possible that there will be slight blips in connectivity as we take the core routers on and off the network.

This is a reminder status post. The original post was issued on 14/12/11: http://www.blacknightstatus.com/2011/12/network-upgrade-201211-2300---211211-0100.html

Update 01:10: This window is now complete. There were a number of brief (<10 minute) outages during the window as we moved things around.

Packet Latency

We're currently experiencing some packet latency on our shared dedicated and colocation network, 78.153.200.0/23. Our engineers are working to resolve this ASAP.

UPDATE 18:28 - This is now fully resolved.
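
For anyone wanting to confirm whether a particular server address fell inside the affected 78.153.200.0/23 range quoted above, a minimal sketch using Python's standard `ipaddress` module might look like the following. The function name `is_affected` and the sample addresses are illustrative assumptions, not part of the original notice.

```python
import ipaddress

# Affected range as quoted in the notice above.
AFFECTED_NETWORK = ipaddress.ip_network("78.153.200.0/23")


def is_affected(address: str) -> bool:
    """Return True if the given IPv4 address lies inside the affected /23."""
    return ipaddress.ip_address(address) in AFFECTED_NETWORK


if __name__ == "__main__":
    # Sample addresses are hypothetical, chosen only to show both outcomes.
    for sample in ("78.153.200.10", "78.153.201.250", "78.153.202.1"):
        print(sample, is_affected(sample))
```

A /23 spans two adjacent /24 blocks, so addresses in both 78.153.200.x and 78.153.201.x count as affected, while 78.153.202.x does not.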

Network Upgrade 20/12/11 23:00 - 21/12/11 01:00

As part of our core network upgrade, we're going to be upgrading part of the core network in Interxion. This will consist of the following tasks:

  • Upgrading one of the switch blades in each of our core routers in Interxion
  • Moving the connection to INEX LAN2 onto a new router

There should be no downtime caused by this upgrade. However, it is possible that there will be slight blips in connectivity as we take the core routers on and off the network.

Network Upgrade 1/12/11 23:00 - 02/12/11 01:00

Introduction
As our network continues to grow, we need to ensure there is ample hardware to support it. We've had a few DDoS attacks over the past year, and we want to eliminate their impact in future. Finally, we need to ensure that there is enough room for growth within our core network to support current demands.

What's going to happen?
Between the hours of 23:00 and 01:00 on the 1st of December 2011 we'll be making some infrastructure changes. The maintenance window of 2 hours does not mean services will be down for this length of time. You'll only see some network latency for a couple of minutes here and there as our engineering team make the necessary changes.

We've planned out exactly how the upgrade is going to run. Our core network edge and transit routers will be upgraded. The new edge routers will allow far more packets per second through our multiple transit links. This will improve our resilience against DDoS attacks.

Each edge router is in an HA pair. As we move our transit links from their old homes and on to the new edge routers, the routes to the servers will need to be recalculated by the new edge routers. This can take a few minutes.

We'll also be changing how we do some interior routing to maximise stability.

When is it going to take place?
The maintenance is scheduled between 23:00 and 01:00 on 01/12/11.

As always, we'll keep this status blog post up to date with how the upgrade is going.

Thank you for your patience.

Update 01:00: The upgrade is taking longer than expected. We now expect to complete it by 03:00.

Update 03:00: We had an outage during this upgrade. Connectivity is now restored, and we apologise for any inconvenience.

We are continuing with the upgrade and will update this post again once it has completed.

Network Latency

Summary: We're currently experiencing some network latency on our shared hosting networks. Our engineers are working to resolve this issue ASAP.

Update: 20:00: Everything bar pemlinweb32 is back up and running. During tonight's outage we've moved cp.blacknight.com and ns1.blacknight.com to different network infrastructure.

Pemlinweb32 is still down because the attack that caused this outage is targeted at this node.

The following IPs on this node are currently not reachable.

81.17.254.44
81.17.255.52
81.17.255.91
81.17.255.103

Update: Friday 18th @ 11:00: This issue was fully resolved last night at 21:00.




Extreme amounts of incoming traffic

We're currently experiencing high amounts of traffic flooding into our network. Our engineers are currently working to resolve this.

Your services may experience latency as a result.

10:53 - This issue is still ongoing as our engineers continue to track down the source.

Update 11:25: The network has been stable since 11:15. We found the destination of the attack and blocked it within our network. Normally this would be much quicker to do, and we're going to investigate ways to find these types of attacks more quickly so that they are less troublesome.

Currently the shared IP address of pemlinweb32, 81.17.254.44, is blackholed, so no traffic can reach it. If your site is still down, this is why. We hope to have a resolution shortly and get this back up and running as well. Thank you for your patience.

Update 15:40
Access to pemlinweb32 81.17.254.44 is now restored. Thank you for your patience.
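
While the shared IP was blackholed, a site owner could have checked whether their hostname pointed at that address. The sketch below is one hedged way to do so in Python. The function name `site_was_blackholed` and the injectable `resolve` parameter are assumptions made for illustration, and only the IP address comes from the post above.

```python
import socket
from typing import Callable

# Shared IP of pemlinweb32, as quoted in the post above.
BLACKHOLED_IP = "81.17.254.44"


def site_was_blackholed(
    hostname: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """Return True if the hostname resolves to the blackholed shared IP.

    `resolve` is injectable (hypothetical design choice) so the check can be
    exercised without live DNS.
    """
    try:
        return resolve(hostname) == BLACKHOLED_IP
    except OSError:
        # Resolution failed, so we cannot say the site sits on that IP.
        return False
```

Calling `site_was_blackholed("example.com")` performs a real DNS lookup. Passing a stub such as `resolve=lambda _: "81.17.254.44"` lets the logic be checked offline.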

Network Issue

We're currently experiencing a network issue which appears to be affecting large parts of our shared / VPS network.

Our technical team are aware of the issue and are working on it.

We will update with more information as soon as we have it.

UPDATE 2300 - The issue appears to be a DDoS against our DNS servers, which is affecting any service that uses them.

UPDATE 0005 - Our network engineers are working on the issue. Due to the nature of the attack, it is impacting sites and services on our shared network, i.e. more than just DNS.

UPDATE 0017 - Our main site, control panel and most client sites *should* be accessible now. We're confirming any outstanding issues.

UPDATE 1135 - The network issue was fully resolved last night. One or two clients reported issues this morning that were unrelated.

With respect to the issues last night, I am still waiting on a more detailed explanation from our technical team, but in simple terms what happened was as follows:

Something / someone launched a large attack against us, sending a very large number of DNS queries / packets of data. The huge spike in queries basically maxed out the firewalls that protect that segment of our shared hosting network.

Once our technical team were able to identify the network block that was being hit they were able to take actions to remedy the situation.

All shared hosting services were functional again from around midnight last night. We are aware of one Windows webserver that had an issue last night, but it is, as far as I know, unrelated.

If we can get a more detailed explanation of last night's issue, we'll post it here / edit the post.

network issue - Monday 20th @ 12:50

Summary: We're currently experiencing a network problem. We're working to diagnose it and get everything that is down back up.

Initial findings: We've found some MAC-related log entries in some of our core network switches. These highlight a layer 2 loop within the network. These began @ 12:48. The knock-on effect was layer 2 port flaps followed by BGP and OSPF flaps localised to that data centre. Any traffic passing through that data centre via Cogent or INEX LAN2, in or out, would have been dead in the water.

At 12:56 the flaps stopped and the routers began to stabilise. It took a few minutes after this for everything to calm down. We're still investigating this right now and we'll post more information as it's available.

RFO: This outage was caused by an Ethernet loop within our network in the InterXion datacentre. A new piece of equipment was introduced to the network earlier this week, fully configured, and as such it caused no issues. While this equipment was being worked on, its configuration was wiped, and upon reboot it appeared on the network and caused a network loop. This caused a spanning tree event in our core switching fabric in InterXion, which resulted in the network outage that was observed.

Typically events like this can't occur, as we put strict provisions in place to prevent them. However, in this instance a 3rd party piece of equipment caused the issue. It wasn't immediately evident that this would occur, as the device in question had been tested in our lab for 4 weeks prior to its deployment within the data centre. We take every precaution when working on our network, but events like this can still occur. It should not happen in future, and we're working on ways to prevent it happening again. At the very least, we hope to contain issues to individual racks rather than the entire Ethernet fabric in that data centre.

Shared Hosting Network DDoS

We are currently experiencing a network DDoS on our shared hosting networks. Our engineers are working to mitigate the attack ASAP.

More information will be posted as soon as it becomes available.

03:32 - This issue has been fully resolved.