27 Mar 09:30:00
[Broadband] - CLOSED Slow throughput issue on TT back-haul lines at peak times - Closed
Details
19 Feb 18:35:15
We have seen some cases with degraded performance on some TT lines, and we are investigating. Not a lot to go on yet, but be assured we are working on this and engaging the engineers within TT to address this.
Update
21 Feb 10:13:20

We have completed further tests and we are seeing congestion manifesting itself as slow throughput at peak times (evenings and weekends) on VDSL (FTTC) lines that connect to us through a certain TalkTalk LAC.

This has been reported to senior TalkTalk staff.

To explain further: VDSL circuits are routed from TalkTalk to us via two LACs. We are seeing slow throughput at peak times on one LAC and not the other.

Update
27 Feb 11:08:58
Very often with congestion it is easy to find the network port or system that is overloaded, but so far, sadly, we've not found the cause. A&A staff, customers and TalkTalk network engineers have done a lot of checks and tests on various parts of the backhaul network, but we are finding it difficult to locate the cause of the slow throughput. We are all still working on this and will update again tomorrow.
Update
27 Feb 13:31:39
We've been in discussions with other TalkTalk wholesalers who have also reported the same problem to TalkTalk. There does seem to be more of a general problem within the TalkTalk network.
Update
27 Feb 13:32:12
We have had an update from TalkTalk saying that, based on multiple reports from ISPs, they are investigating further.
Update
27 Feb 23:21:21
Further tests this evening by A&A staff show that the slow throughput is not related to a specific LAC, but that something in TalkTalk appears to be limiting single TCP sessions to 7-9M max during peak times. Running a single iperf test results in 7-9M, but running ten at the same time can fill a 70M circuit. We've passed these findings on to TalkTalk.
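The reasoning behind that finding can be sketched in a few lines of Python. This is an illustration only, not the tooling used for the tests: the function name, the 50% and 80% thresholds and the sample figures are assumptions chosen to match the numbers quoted above.

```python
def looks_per_flow_limited(single_mbps, parallel_total_mbps, line_mbps,
                           single_frac=0.5, fill_frac=0.8):
    """Return True when one TCP stream is far below line rate
    but several streams together nearly fill the line.

    single_frac and fill_frac are illustrative thresholds, not measured values.
    """
    single_is_slow = single_mbps < single_frac * line_mbps
    parallel_fills_line = parallel_total_mbps >= fill_frac * line_mbps
    return single_is_slow and parallel_fills_line


# Roughly the case described: ~8M on one stream, ten streams filling a 70M line.
print(looks_per_flow_limited(8, 68, 70))
# Contrast: the whole line is slow, which would point at ordinary congestion or a line fault.
print(looks_per_flow_limited(8, 9, 70))
```

A result of True for the first case and False for the second is what separates a per-session limit from a line that is simply slow.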
Update
28 Feb 09:29:56
As expected the same iperf throughput tests are working fine this morning. TT are shaping at peak times. We are pursuing this with senior TalkTalk staff.
Update
28 Feb 11:27:45
TalkTalk are investigating. They have stated that circuits should not be rate limited and that they are not intentionally rate limiting. They are still investigating the cause.
Update
28 Feb 13:14:52
Update from TalkTalk: Investigations are currently underway with our NOC team who are liaising with Juniper to determine the root cause of this incident.
Update
1 Mar 16:38:54
TalkTalk are able to reproduce the throughput problem and investigations are still ongoing.
Update
2 Mar 16:51:12
Some customers did see better throughput on Wednesday evening, but not everyone. We've done some further testing with TalkTalk today and they continue to work on this.
Update
2 Mar 22:42:27
We've been in touch with the TalkTalk Network team this evening and have been performing further tests (see https://aastatus.net/2363 ). Investigations are still ongoing, but the work this evening has given a slight clue.
Update
3 Mar 14:24:48
During tests yesterday evening we saw slow throughput when using the Telehouse interconnect and fast (normal) throughput over the Harbour Exchange interconnect. Therefore, this morning, we disabled our Telehouse North interconnect. We will carry on running tests over the weekend and we welcome customers to do the same. We are expecting throughput to be fast for everyone. We will then liaise with TalkTalk engineers regarding this on Monday.
Update
6 Mar 15:39:33

Tests over the weekend suggest that speeds are good when we only use our Harbour Exchange interconnect.

TalkTalk are moving the interconnect we have at Telehouse to a different port at their side so as to rule out a possible hardware fault.

Update
6 Mar 16:38:28
TalkTalk have moved our THN port and we will be re-testing this evening. This may cause some TalkTalk customers to experience slow (single thread) downloads this evening. See: https://aastatus.net/2364 for the planned work notice.
Update
6 Mar 21:39:55
The testing has been completed, and sadly we still see slow speeds when using the THN interconnect. We are now back to using the Harbour Exchange interconnect where we are seeing fast speeds as usual.
Update
8 Mar 12:30:25
Further testing is happening today (Thursday evening): https://aastatus.net/2366 This is to try and help narrow down where the problem is occurring.
Update
9 Mar 23:23:13
We've been testing this evening, this time with some more customers, so thank you to those who have been assisting. (We'd welcome more customers to be involved - you just need to run an iperf server on IPv4 or IPv6 and let one of our IPs through your firewall - contact Andrew if you're interested). We'll be passing the results on to TalkTalk, and the investigation continues.
Update
10 Mar 15:13:43
Last night we saw some lines slow and some lines fast, so having extra lines to test against should help in figuring out why this is the case. Quite a few customers have set up iperf servers for us and we are now testing 20+ lines. (Still happy to add more). Speed tests are being run three times an hour; we'll collate the results after the weekend and report the findings back to TalkTalk.
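As a rough idea of what collating such results could look like, here is a minimal Python sketch that splits each line's samples into peak and off-peak and takes the median of each. The line names, the sample values and the peak window (weekday evenings 18:00-23:00 plus all of Saturday and Sunday) are assumptions for illustration, not the actual data or script.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def is_peak(ts, start_hour=18, end_hour=23):
    """Treat weekday evenings and all of Saturday/Sunday as peak (assumed hours)."""
    return ts.weekday() >= 5 or start_hour <= ts.hour < end_hour


def collate(samples):
    """samples: iterable of (line_id, datetime, mbps).

    Returns {line_id: {"peak": median_mbps, "off_peak": median_mbps}};
    a bucket with no samples is reported as None.
    """
    buckets = defaultdict(lambda: {"peak": [], "off_peak": []})
    for line_id, ts, mbps in samples:
        key = "peak" if is_peak(ts) else "off_peak"
        buckets[line_id][key].append(mbps)
    return {
        line_id: {k: (median(v) if v else None) for k, v in b.items()}
        for line_id, b in buckets.items()
    }


# Made-up samples for two hypothetical lines on Friday 10 March 2017.
samples = [
    ("line-a", datetime(2017, 3, 10, 10, 0), 62.0),
    ("line-a", datetime(2017, 3, 10, 20, 0), 8.5),
    ("line-a", datetime(2017, 3, 10, 20, 20), 7.9),
    ("line-b", datetime(2017, 3, 10, 10, 0), 35.0),
    ("line-b", datetime(2017, 3, 10, 20, 0), 34.1),
]
summary = collate(samples)
for line_id, row in sorted(summary.items()):
    print(line_id, row)
```

A line whose peak median is far below its off-peak median (line-a here) would count as affected; one where the two are close (line-b) would not.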
Update
11 Mar 20:10:21
Update
13 Mar 15:22:43

We now have samples of lines which are affected by the slow throughput and those that are not.

Since 9pm Sunday we are using the Harbour Exchange interconnect in to TalkTalk and so all customers should be seeing fast speeds.

This is still being investigated by us and TalkTalk staff. We may do some more testing in the evenings this week and we are continuing to run iperf tests against the customers who have contacted us.
Update
14 Mar 15:59:18

TalkTalk are doing some work this evening and will be reporting back to us tomorrow. We are also going to be carrying out some tests ourselves this evening too.

Our tests will require us to move traffic over to the Telehouse interconnect, which may mean some customers will see slow (single thread) download speeds at times. This will be between 9pm and 11pm.

Update
14 Mar 16:45:49
This is from the weekend:

Update
17 Mar 10:42:28
We've stopped the iperf testing for the time being. We will start it back up again once we or TalkTalk have made changes that require testing to see if things are better or not, but at the moment there is no need for the testing as all customers should be seeing fast speeds due to the Telehouse interconnect not being in use. Customers who would like quota top-ups, please do email in.
Update
17 Mar 18:10:41
To help with the investigations, we're also asking for customers with BT connected FTTC/VDSL lines to run iperf so we can test against them too - details on https://support.aa.net.uk/TTiperf Thank you!
Update
20 Mar 12:54:02
Thanks to those who have set up iperf for us to test against. We ran some tests over the weekend whilst swapping back to the Telehouse interconnect, and tested BT and TT circuits for comparison. Results are that around half the TT lines slowed down but the BT circuits were unaffected.

TalkTalk are arranging some further tests to be done with us which will happen Monday or Tuesday evening this week.

Update
22 Mar 09:37:30
We have scheduled testing of our Telehouse interlink with TalkTalk staff for this Thursday evening. This will not affect customers in any way.
Update
22 Mar 09:44:09
In addition to the interconnect testing on Thursday mentioned above, TalkTalk have also asked us to retest DSL circuits to see if they are still slow. We will perform these tests tonight (Wednesday evening).

TT have confirmed that they have made a configuration change on the switch at their end in Telehouse - this is the reason for the speed testing this evening.

Update
22 Mar 12:06:50
We'll be running iperf3 tests against our TT and BT volunteers this evening, every 15 minutes from 4pm through to midnight.
Update
22 Mar 17:40:20
We'll be changing over to the Telehouse interconnect between 8pm and 9pm this evening for testing.
Update
23 Mar 10:36:06

Here are the results from last night:

And BT Circuits:

Some of the results are rather up and down, which we would expect as these lines are in use by customers, but it's clear that a number of lines are unaffected and a number are affected.

Here's the interesting part. Since this problem started we have rolled out some extra logging on to our LNSs. This has taken some time as we only update one a day. However, we are now logging the IP address used at our side of L2TP tunnels from TalkTalk. We have eight live LNSs and each one has 16 IP addresses that are used. With this logging we've identified that circuits connecting over tunnels on 'odd' IPs are fast, whilst those on tunnels on 'even' IPs are slow. This points to a LAG issue within TalkTalk, which is what we have suspected from the start, but this data should hopefully help TalkTalk with their investigations.
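The odd/even comparison can be illustrated with a short Python sketch that groups throughput samples by the parity of the tunnel IP. The addresses and speeds below are made up (192.0.2.0/24 is a documentation range, not one of our LNS addresses), and the function names are hypothetical. One plausible reason parity would matter, offered here only as an assumption, is that a LAG typically picks a member link from a hash of packet header fields, so with two members the low bit of an address can end up deciding which link a tunnel uses.

```python
import ipaddress
from collections import defaultdict
from statistics import median


def parity(ip):
    """Return 'odd' or 'even' from the numeric value of an IPv4/IPv6 address."""
    return "odd" if int(ipaddress.ip_address(ip)) % 2 else "even"


def median_by_parity(samples):
    """samples: iterable of (tunnel_ip, mbps). Returns {parity: median_mbps}."""
    groups = defaultdict(list)
    for ip, mbps in samples:
        groups[parity(ip)].append(mbps)
    return {p: median(v) for p, v in groups.items()}


# Made-up samples, not real measurements.
samples = [
    ("192.0.2.1", 61.0), ("192.0.2.3", 58.0),
    ("192.0.2.2", 8.0), ("192.0.2.4", 9.0),
]
print(median_by_parity(samples))
```

A large gap between the two medians, as in this invented data, is the kind of pattern the extra logging revealed.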

Update
23 Mar 16:27:28
As mentioned above, we have scheduled testing of our Telehouse interlink with TalkTalk staff for this evening. This will not affect customers in any way.
Update
23 Mar 22:28:53

We have been testing the Telehouse interconnect this evening with TalkTalk engineers. This involved a ~80 minute conference call and setting up a very simple test of a server our side plugged in to the switch which is connected to our 10G interconnect, and running iperf3 tests against a laptop on the TalkTalk side.

The test has highlighted a problem at the TalkTalk end with the connection between two of their switches. When their laptop was plugged in to the second switch we got about 300Mbit/s, but when it was in the switch directly connected to our interconnect we got near full speed, around 900Mbit/s.

This has hopefully given them a big clue and they will now involve the switch vendor for further investigations.

Update
23 Mar 23:02:34
TalkTalk have just called us back and have asked us to retest speeds on broadband circuits. We're moving traffic over to the Telehouse interconnect and will test....
Update
23 Mar 23:07:31
Initial reports show that speeds are back to normal! Hooray! We've asked TalkTalk for more details and if this is a temporary or permanent fix.
Update
24 Mar 09:22:13

Results from last night when we changed over to test the Telehouse interlink:

This shows that, unlike the previous times, speeds did not drop when we changed over to use the Telehouse interconnect at 11pm.

We will perform hourly iperf tests over the weekend to be sure that this has been fixed.

We're still awaiting details from TalkTalk as to what the fix was and if it is a temporary or permanent fix.

Update
24 Mar 16:40:24
We are running on the Telehouse interconnect and are running hourly iperf3 tests against a number of our customers over the weekend. This will tell us if the speed issues are fixed.
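For anyone wanting to record similar results themselves, the sketch below pulls the receiver-side throughput out of the JSON report that iperf3 produces with its `-J` option. It reflects our understanding of that report layout for a TCP test (`end.sum_received.bits_per_second`, plus a top-level `error` field on failure); the function name and the trimmed sample are hypothetical, and this is not the script used for the tests above.

```python
import json


def received_mbps(iperf3_json):
    """Return receiver-side throughput in Mbit/s from `iperf3 -J` output (TCP test)."""
    report = json.loads(iperf3_json)
    if "error" in report:
        # iperf3 reports failures (e.g. server unreachable) in an "error" field.
        raise RuntimeError(report["error"])
    return report["end"]["sum_received"]["bits_per_second"] / 1e6


# Hand-written, heavily trimmed sample; a real report carries many more fields.
sample = '{"end": {"sum_received": {"bits_per_second": 8400000.0}}}'
print(round(received_mbps(sample), 1))
```

Logging one such figure per line per hour, with a timestamp, is enough to show whether speeds still drop at peak times.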
Update
27 Mar 09:37:12

Speed tests against customers over the weekend do not show the peak-time slowdowns. This confirms that what TalkTalk did on Thursday night has fixed the problem. We are still awaiting the report from TalkTalk regarding this incident.

The graph above shows iperf3 speed test results taken once an hour over the weekend against nearly 30 customers. Although some are a bit spiky, we are no longer seeing the drastic reduction in speeds at peak time. The spikiness is due to the lines being used as normal by the customers and so is expected.

Update
28 Mar 10:52:25
We're expecting the report from TalkTalk at the end of this week or early next week (w/b 2017-04-03).
Update
10 Apr 16:43:03
We've not yet had the report from TalkTalk, but we do expect it soon...
Resolution This has been fixed; we're awaiting the full report from TalkTalk.
Started 18 Feb
Closed 27 Mar 09:30:00
Cause TT