Major routing issue
STATUS
Closed
CREATED
Sep 22, 08:50 AM (13½ years ago)
AFFECTED
General
STARTED
Sep 22, 08:37 AM (13½ years ago)
CLOSED
Sep 22, 09:30 AM (13½ years ago)
REFERENCE
561 / AA561
INFORMATION
  • INITIAL
    13½ years ago by Adrian

    Investigating now - if this is the same as we had at the weekend we should be able to sort it quite quickly.

  • UPDATE
    13½ years ago

    We hope to have this sorted in a few minutes.

  • UPDATE
    13½ years ago

    This is impacting some VoIP services but not all.

  • UPDATE
    13½ years ago

    There will be a slight blip on broadband while we sort this.

  • UPDATE
    13½ years ago

    This looks like some issue with routing through LINX. We may take down the route collector peering until we are happy we have identified the cause.

  • UPDATE
    13½ years ago

    Still seeing some issues.

  • UPDATE
    13½ years ago

    An equipment reboot worked briefly and then the problem recurred. It seems clear this is a routing issue with a peer that is causing a black hole. We do not yet understand exactly where or how, and this is being investigated.

  • UPDATE
    13½ years ago

    We have taken down LINX route server peering and things are looking a lot better - checking things now.

  • UPDATE
    13½ years ago

    It may be worth explaining this a little. We have dual redundant equipment to allow for failures. If something fails completely, or can be turned off, then the systems re-route to use other equipment. Depending on where the fault is, this can mean no outage, or an outage of a few seconds or a few minutes.

    However, if there is a partial failure, such as a single black-hole route for the link to Maidenhead, then this is not an equipment failure. The other routers get that route and expect it to be valid. This can create complex problems that are hard to diagnose, and it also means we have to use various alternative means to access systems, which causes delays.
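
As a rough illustration of the difference described above, here is a minimal sketch of why redundancy copes with a complete failure but not with a black-hole route. It is a toy model, not our actual router configuration: the Route fields, the path names "primary" and "backup", and the preference values are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Route:
    via: str          # hypothetical name for the path
    preference: int   # higher wins (loosely like a BGP local preference)
    advertised: bool  # is the route still being announced to us?
    forwards: bool    # does traffic sent this way actually arrive?


def best_route(routes):
    """Pick the most preferred route that is still advertised."""
    live = [r for r in routes if r.advertised]
    return max(live, key=lambda r: r.preference, default=None)


def deliver(routes):
    """Describe what happens to a packet given the routes on offer."""
    chosen = best_route(routes)
    if chosen is None:
        return "no route"
    if chosen.forwards:
        return f"delivered via {chosen.via}"
    # The router cannot see this: the route looks valid, so it keeps using it.
    return f"black-holed via {chosen.via}"


def scenario(primary_advertised, primary_forwards):
    """Two paths: a preferred primary and a working backup."""
    return deliver([
        Route("primary", 200, primary_advertised, primary_forwards),
        Route("backup", 100, True, True),
    ])


healthy = scenario(True, True)            # normal operation
complete_failure = scenario(False, False) # route withdrawn, backup takes over
partial_failure = scenario(True, False)   # route still advertised, traffic lost

print(healthy)
print(complete_failure)
print(partial_failure)
```

In the complete-failure case the route disappears and the backup is selected automatically. In the partial-failure case nothing is withdrawn, so the preferred route stays selected and traffic is dropped until someone removes that route by hand.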

  • RESOLUTION
    13½ years ago by Adrian

    I would stress that just because taking down the LINX route server peering seems to have addressed the issue, that does not mean there is an issue with LINX. This could be something odd with our routers, or with the LINX route server, or a peer via that route server feeding something odd to us as a route. We're trying to identify what has happened, but for now we'll leave the route server peering shut down until we know.

  • Closed