At approx. 20:35 the Level(3) full route set disappeared on one of our GigEs. Customers may have experienced slight delays as BGP decided which carrier to use for all those routes.
We've opened a ticket with Packet Exchange and we're awaiting an answer from them. We're currently using Tiscali and Global Crossing (out of DEG and IX) with Cogent as backup if either fails.
We'll post an update once we hear more from them on this issue.
Update: Saturday Aug 30th 12:10am
We've had no update from Packet Exchange as yet; however, the issue appears to have been fixed already. When we get the RFO from them, we'll post a summary.
Summary: On Sunday morning next @ 02:00 am one of our fibre providers (Aurora) is carrying out maintenance on the backup ring that they provide to us. This will be a total cut of the fibre while they replace several joins to make them more flexible. During this 8-hour window our traffic will be on our primary link, which does not go near the affected physical area and routes directly to DEG.
If we do experience an outage on our primary ring during this window, we have a tertiary path available, and traffic should fail over to it automatically.
Please address any questions to our technical team at https://support.blacknight.ie/
Update: Sunday June 22 @ 10:10am
This upgrade was successful. Aurora report that they completed the work on the fibre that our CWDM ring runs over at circa 04:30. We got the all-clear at 10am confirming that the fibre was back in the manhole and that the window was completed successfully with no snags.
The shared hosting server Morgana is currently not responding. It is being rebooted, and we are waiting for it to come back up.
This post will be updated when we find out what caused the issue.
Update: The server is back up and we are now investigating to see what caused the issue.
The "DEG Mesh", i.e. the network that Data Electronics provide to customers, has been experiencing intermittent issues during the day today. We've escalated this to them and we're awaiting a response.
Currently it's not affecting DNS, but DNS responses may take a few hundred milliseconds longer than normal.
Update: 9:45
The issue appears to be resolved. We've had no formal notification of a fix as yet. I'll post an update once we've received a formal RFO.
We weren't aware of the maintenance window that they had scheduled for last night, and it looks like it has overrun by several hours.
Currently we're running on Tiscali primarily but we've Cogent and Global Crossing to fall back onto if we have to. You'll probably see traffic going over Global Crossing as we added a GigE from them yesterday.
There are no issues currently that we can see as a result of Level(3) being down for us. We've more than adequate capacity elsewhere in the network to cover us.
Information:
Currently Blacknight have 4 x GigE connections to the outside world delivered in 2 diverse locations in Dublin. We've multiple wavelengths between both of these locations with secondary backup links where needed. Our network core is spread between these 2 data centres, comprising 12 Juniper routers using iBGP, eBGP, route reflector technology and OSPF in our core to provide the ultimate experience in network traffic flow. Additionally, all our inter-datacentre layer 3 links run over resilient layer 2 paths where we use RPVST for redundancy. We also use VRRP for all our customer-facing traffic where there is no firewalling, and where there is firewalling we use HA pairs of Cisco ASAs.
Lastly, in addition to the above, we've 1 x GigE connection to INEX and 1 x 100Meg connection to INEX, which gives our Irish users the experience of us being on their ISP's network. We are one of the few HSPs in Ireland who are directly peered with all Irish ISPs.
This morning our backup CWDM link from DEG <-> InterXion experienced an outage of 5 seconds approx. We got a call immediately from our fibre provider informing us of a possible low light situation with this link.
We've a ticket open regarding this and we're monitoring the situation at the moment. Due to the design and resilience built into our network this outage did not affect production traffic.
Update: 12:30
We got a call back regarding this. The fibre strand carrying our wavelengths was snagged while other fibre was being pulled into the HBA facility in North Dublin. This resulted in a 25 dB loss in signal strength. The snag was found and the fibre was straightened, so the problem should be fully resolved. No further blips occurred during the fix.
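To put that loss figure in perspective, here is a small illustrative calculation (not something from our monitoring systems). It uses the standard decibel definition for power, ratio = 10^(dB/10); the function name `db_to_power_ratio` is just a name chosen for this sketch.

```python
def db_to_power_ratio(loss_db: float) -> float:
    """Convert a loss in decibels to a linear power ratio (normal / received)."""
    return 10 ** (loss_db / 10)


# The loss figure quoted in the update above.
ratio = db_to_power_ratio(25)

# Express the received optical power as a fraction and a percentage of normal.
print(f"25 dB loss: received power is about 1/{ratio:.0f} of normal "
      f"({100 / ratio:.2f}%)")
```

In other words, a 25 dB loss leaves roughly 1/316 of the usual light level at the receiver, which is consistent with the "low light" warning from the fibre provider.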
We're upgrading our Level(3) link tomorrow morning at 8:30am local time from 100Meg to 1000Meg. This will bounce our Level(3) and Packet Exchange eXpress peerings for a brief period and everything should return to normal within 60 seconds or so.
This should only cause momentary lag while some routes that currently go over Level(3) and PE eXpress move to another carrier and then back again once the peerings come back up.
We don't expect any network downtime during this window.
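For anyone curious how routes move to another carrier and back, the following is a heavily simplified toy model, not our router configuration. It assumes only two of BGP's tie-breakers (highest local preference, then shortest AS path); the `Route` class, the `best_route` function, the preference values and the path lengths are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    carrier: str       # upstream the route was learned from
    local_pref: int    # hypothetical local preference (higher wins)
    as_path_len: int   # hypothetical AS path length (shorter wins)


def best_route(routes, peers_up):
    """Pick the preferred route among carriers whose peering is up."""
    candidates = [r for r in routes if r.carrier in peers_up]
    if not candidates:
        return None
    # Highest local preference first, then shortest AS path.
    return max(candidates, key=lambda r: (r.local_pref, -r.as_path_len))


# Illustrative values only; real preferences are not published here.
routes = [
    Route("Level(3)", local_pref=200, as_path_len=3),
    Route("Tiscali", local_pref=150, as_path_len=3),
    Route("Cogent", local_pref=100, as_path_len=4),
]
all_up = {"Level(3)", "Tiscali", "Cogent"}

before = best_route(routes, all_up)                 # peering up
during = best_route(routes, all_up - {"Level(3)"})  # peering bounced
after = best_route(routes, all_up)                  # peering restored

print(before.carrier, "->", during.carrier, "->", after.carrier)
```

With these made-up values the prefix moves from Level(3) to Tiscali while the peering is down and returns once it is re-established. Real BGP applies more tie-breakers and takes some time to converge, which is where the momentary lag comes from.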
Hi there,
One shared server is down and several customer machines are currently off air due to a switch fault in DEG.
The switch's interfaces are up, but it has stopped passing frames. We'll post updates as we get them.
Update: 13:15
The switch has been rebooted and access has returned for all customers. We'll examine our syslogs etc. and see what we can find out. We've replacement hardware onsite already, so if this recurs we'll replace this switch in the coming 24 hours.