28 Feb 2015 14:11:01
[VoIP and SIMs] VoIP Call Quality Problems - Closed
Posted: 27 Feb 2015 12:30:57
We're investigating VoIP audio problems, such as calls breaking up. This looks to be packet loss somewhere between us and our carriers. We'll update this post again shortly.
27 Feb 2015 13:04:21
We have identified the cause of this packet loss and are looking into fixing it.
27 Feb 2015 14:55:30
We're working closely with a 3rd party that is involved in a BGP traffic problem between us and them. This is taking longer to get to the bottom of than we first thought.
27 Feb 2015 15:22:14
As we and the other BGP peer have not been able to get to the root cause of the problem, we have put in a temporary fix. This has brought traffic levels back down to normal.
27 Feb 2015 16:26:46
Surprisingly, the problem has come back even though peering has been disabled! Needless to say, we are investigating again!
27 Feb 2015 17:00:05
The problem has gone away again whilst it was being looked into.
27 Feb 2015 17:04:04

It's worth us explaining the problem. We have a peer at LINX that is sending us lots of traffic. This traffic is not for us, but for someone completely different, in a different country even. Even though we have stopped the peering to this 3rd party, the traffic is still being sent, intermittently. This is filling our links, and hence causing packet loss.

We have been in direct contact with the 3rd party all afternoon, and we and they are confused as to how this is happening. At this point in time, we suspect some kind of router memory corruption which is causing the router to send traffic to the wrong peer. This type of problem is difficult to prove, and so it is taking time to get to the bottom of it.

We are still in contact with the 3rd party and will work to resolve this with them.

Resolution We were able to stop the floods of traffic yesterday afternoon as a temporary measure, but the underlying problem remained until 10am Saturday, when the LINX-facing card at the peer was reset after the issue was reported by other LINX members. It is a shame that this was not done yesterday. This does confirm that it was not just AAISP that was affected. We will be working on contingency plans to allow us to react more efficiently to something like this in future. Thank you all for your understanding.
Started 27 Feb 2015 12:15:00
Closed 28 Feb 2015 14:11:01