UK – Sheffield – Network Issue – 23/10/2009

26/10/2009 14:30 – Further investigation into this network issue has been completed and the findings are included below.

Closed.

16:15 – We are aware of a network issue affecting our Sheffield POP (point of presence), which is being investigated by our engineers.

16:25 – The engineers have investigated the issue and it should now be resolved. If you are still having any issues, please contact support via the usual method.

16:35 – It would seem there is still an issue with some parts of the network, which the engineers are currently investigating.

17:10 – The source of the remaining issue with the network has been identified and resolved. If you still notice any issues with your service, please contact support.

24/10/2009 14:30 – The network is operating normally and no further outages, packet loss or high latency have occurred since. This blog post will be updated with further information once it has been compiled from the necessary parties.

26/10/2009 14:30 – Further investigation into the network issue has been completed. The issue was traced to one of our upstream providers performing unscheduled maintenance on other devices located in the facility's main communications cabinets. The initial packet loss was caused by a (presumably) already damaged patch cable being ‘nudged’ within the main communications cabinet that provides some services between the core BGP routers and the distribution switches (between ‘the internet’ and ‘the cabinets’). Once the cable was identified as the source of the issue, it was swapped out almost immediately using on-site spares. However, the packet loss caused by the initial cable fault had already caused one of the core BGP routers to exceed its available memory and lock its sessions. When the BGP sessions were forcibly dropped from the router, it crashed, which had knock-on effects for the rest of the network and left it unable to fail over. Whilst this was occurring, it was decided to bring forward the planned replacement of the ageing Cisco core BGP router with a new, more redundant and stable Foundry device, on which the dropped sessions were quickly restored. This new device is now operating very smoothly and, along with other improvements being made (including replacing all distribution cables and reworking cable management), should prevent such an issue from occurring in the future. We apologise to all customers for this outage and disruption, and would like to assure you that it is our top priority to eliminate risks in our network architecture and prevent a recurrence of these faults.

