
24 Feb 13:00:05
Details
09 Dec 2014 11:20:04
Some lines on the LOWER HOLLOWAY exchange are experiencing peak time packet loss. We have reported this to BT and they are investigating the issue.
Update
11 Dec 2014 10:46:42
BT have passed this to TSO for investigation. We are waiting for a further update.
Update
12 Dec 2014 14:23:56
BT's TSO are currently investigating the issue.
Update
16 Dec 2014 12:07:31
Other ISPs are seeing the same problem. The BT Capacity team are now looking in to this.
Update
17 Dec 2014 16:21:04
No update to report yet, we're still chasing BT...
Update
18 Dec 2014 11:09:46
The latest update from this morning is: "The BT capacity team have investigated and confirmed that the port is not being over utilized, tech services have been engaged and are currently investigating from their side."
Update
19 Dec 2014 15:47:47
BT are looking to move our affected circuits on to other ports.
Update
13 Jan 10:28:52
This is being escalated further with BT now, update to follow
Update
19 Jan 12:04:34
This has been raised as a new reference as the old one was closed. Update due by tomorrow AM
Update
20 Jan 12:07:53
BT will be checking this further this evening so we should have more of an update by tomorrow morning
Update
22 Jan 09:44:47
An update is due by the end of the day
Update
22 Jan 16:02:24
This has been escalated further with BT, update probably tomorrow now
Update
23 Jan 09:31:23
We are still waiting for a PEW to be relayed to us. BT will be chasing this for us later in the day.
Update
26 Jan 09:46:03
BT are doing a 'test move' this evening where they will be moving a line onto another VLAN to see if that helps with the load. If that works, they will move the other affected lines onto this VLAN, probably Wednesday night.
Update
26 Jan 10:37:45
There will be an SVLAN migration to resolve this issue on Wednesday 28th Jan.
Update
30 Jan 09:33:57
Network rearrangement is happening on Sunday so we will check again on Monday
Update
2 Feb 14:23:12
Network rearrangement was done at 2AM this morning, we will check for packet loss and report back tomorrow.
Update
3 Feb 09:46:49
We are still seeing loss on a few lines - I am not at all happy that BT have not yet resolved this. A further escalation has been raised with BT and an update will follow shortly.
Update
4 Feb 10:39:03
Escalated further with an update due at lunch time
Update
11 Feb 14:14:58
We are getting extremely irritated with BT on this one; it should not take this long to add extra capacity in the affected area. Rocket on its way to them now...
Update
24 Feb 12:59:54
Escalated further with BT, update due by the end of the day.
Update was expected 24 Feb 17:59:57
Previously expected 1 Feb 09:34:04 (Last Estimated Resolution Time from AAISP)

24 Feb 12:59:32
Details
20 Jan 12:53:37
We are seeing low level packet loss on some BT circuits connected to the EUSTON exchange, this has been raised with BT and as soon as we have an update we will post an update here.
Update
20 Jan 12:57:32
Here is an example graph:
Update
22 Jan 09:02:48
We are due an update on this one later this PM
Update
23 Jan 09:36:21
BT are chasing this and we are due an update at around 1:30PM.
Update
26 Jan 09:41:39
Work was done over night on the BT side to move load onto other parts of the network, we will check this again this evening and report back.
Update
27 Jan 10:33:05
We are still seeing lines with evening packet loss, but BT don't appear to understand this; after spending the morning arguing with them they have agreed to investigate further. Update to follow.
Update
28 Jan 09:35:28
Update from BT due this PM
Update
29 Jan 10:33:57
BT are again working on this but no further updates will be given until tomorrow morning
Update
3 Feb 16:19:06
This one has also been escalated further with BT
Update
4 Feb 10:18:11
BT have identified a fault within their network and we have been advised that an update will be given after lunch today
Update
11 Feb 14:16:56
Yet another rocket on its way to BT
Update
24 Feb 12:59:20
Escalated further with BT, update due by the end of the day.
Broadband Users Affected 0.07%
Started 10 Jan 12:51:26 by AAISP automated checking
Update was expected 24 Feb 17:59:23
Previously expected 21 Jan 16:51:26

24 Feb 12:59:06
Details
2 Feb 10:10:46
We are seeing low level packet loss on BT lines connected to the Wapping exchange - approx 6pm to 11pm every night. Reported to BT...
Update
2 Feb 10:13:57
Here is an example graph:
Update
3 Feb 15:55:40
This has been escalated further with BT
Update
4 Feb 10:27:37
Escalated further with BT, update due after lunch
Update
11 Feb 14:18:00
Still not fixed, we are arming yet another rocket to fire at BT
Update
24 Feb 12:58:51
Escalated further with BT, update due by the end of the day.
Broadband Users Affected 0.09%
Started 2 Feb 10:09:12 by AAISP automated checking
Update was expected 24 Feb 17:58:54

24 Feb 13:00:22
Details
2 Feb 11:21:40

Below is a list of exchanges that BT plan to WBC enable between April - June this year.

This is for information only and we will attempt to bulk migrate customers to the 21CN network as and when they are enabled.

ABERCHIRDER
ANCRUM
ANGLE
ANSTEY MILLS
ASHBURY
AVEBURY
BARLASTON
BEAL
BERRIEW
BILLESDON
BIXTER
BOBBINGTON
BODFARI
BRENT KNOLL
BRETTON
BROMESBERROW
BUCKLAND NEWTON
BULWICK
BURGH ON BAIN
BURRELTON
CAPUTH
CHOLESBURY
CHURCHSTANTON
CLANFIELD
COBERLEY
COLINSBURGH
CRANFORD
CREATON
CRONDALL
CRUCORNEY
CYNWYL ELFED
DALE
DINAS CROSS
DINAS MAWDDWY
DITTON PRIORS
DOLWEN
DUNECHT
DUNRAGIT
DURLEY
EARLDOMS
EAST HADDON
EAST MEON
EDDLESTON
FAYGATE
GAMLINGAY
GAYTON
GLAMIS
GLENLUCE
GREAT CHATWELL
GREENLAW
HAMNAVOE
HUXLEY
ILMINGTON
INNERWICK
KELSHALL
KINLET
KIRKCOLM
LANGHOLM
LANGTREE
LITTLE STEEPING
LLANDDAROG
LLANFAIRTALHAIARN
LLANFYLLIN
LLANNEFYDD
LYDFORD
LYONSHALL
MANORBIER
MICHELDEVER
MIDDLETON SCRIVEN
MIDDLETON STONEY
MILLAND
MILTON ABBOT
MUNSLOW
MUTHILL
NANTGLYN
NEWNHAM BRIDGE
NORTH CADBURY
NORTH CRAWLEY
NORTH MOLTON
NORTHWATERBRIDGE
OFFLEY
PARWICH
PENTREFOELAS
PUNCHESTON
SHEERING
SHEPHALL
STEBBING
STOKE GOLDINGTON
STOW
SUTTON VENY
TALYBONT ON USK
TEALBY
TWINSTEAD
UFFINGTON
WATERHOUSES
WITHERIDGE
WIVELSFIELD GREEN
WOBURN
LEWDOWN
ROMSLEY


24 Feb 12:57:31
Details
11 Feb 10:17:36
We are seeing evening congestion on the Wrexham exchange; two other BRASs off that exchange are also affected: 21CN-BRAS-RED6-SF and 21CN-BRAS-RED7-SF. Customers can check which BRAS/exchange they are connected to from our control pages
Update
11 Feb 10:27:08
Here is an example graph:
Update
13 Feb 11:39:14
We are chasing BT for an update and as soon as we have further news we will update this post.
Update
16 Feb 10:20:43
It looks like the peak time latency just went away Thursday evening with no report from BT that they actually changed something. We will continue monitoring for the next few days to ensure it really has gone away.
Broadband Users Affected 0.05%
Started 11 Feb 10:12:08 by AA Staff
Closed 24 Feb 12:57:31

13 Feb 15:31:16
Details
13 Feb 14:29:06
We currently supply the Technicolor TG582 for most ADSL services, but we are considering switching to a new router, the ZyXEL VMG1312-B

It is very comprehensive and does both ADSL and VDSL as well as bridging and wifi. It means we can have one router for all service types. As some of you may know, BT will be changing FTTC to be "wires only" next year, and so a VDSL router will be needed.

We have a small number available now for people to trial - we want to test the routers, our "standard config" and the provisioning process.

Please contact trial@aa.net.uk or #trial on the irc server for more information.

P.S. Obviously it does Internet Protocol: the current one, IPv6, and the old one, IPv4

Obviously this initial trial is limited to a small number of routers which we are sending out at no charge to try different scenarios. However, we expect to be shipping these as standard later in the month, and they will be available to purchase on the web site.

Update
13 Feb 15:49:08
Thanks for all the emails and IRC messages about trialling the new routers. We will contact customers back on Monday to arrange shipping of the units.
Update
16 Feb 10:43:54
We now have enough trialists for the new router, we will contact a selection of customers today to arrange delivery of the routers. Thanks
Started 13 Feb 14:25:34

15 Feb 12:24:53
Details
12 Feb 18:00:36
The Technicolor routers we supply have a factory default config which connects to us and operates a default setup. This is applied if someone uses the RESET button on the routers.

We have identified that a few dozen of these routers are in this state, which is not correct.

As part of work we are doing for some new routers we plan to start shipping soon, we have made it that any router logging in using the factory default will automatically be updated to have the correct config.

This will likely cause the graph to reset, and the LNS in use depends on the login; it will also set the WiFi SSID and other parameters correctly. So users may see a change.

If you have any issues, do contact support.

Resolution The change in process has meant a number of routers have been auto-provisioned as expected, the remainder will when they next connect. Any issues, do contact support.
Started 12 Feb 17:00:00
Closed 15 Feb 12:24:53
Previously expected 14 Feb

5 Feb 13:07:52
Details
8 Jan 15:44:04
We are seeing some levels of congestion in the evening on the following exchanges: BT COWBRIDGE, BT MORRISTON, BT WEST (Bristol area), BT CARDIFF EMPIRE, BT THORNBURY, BT EASTON, BT WINTERBOURNE, BT FISHPONDS, BT LLANTWIT MAJOR. These have been reported to BT and they are currently investigating.
Update
8 Jan 15:56:59
Here is an example graph:
Update
9 Jan 15:21:53
BT have been chased further on this as they have not provided an update as promised.
Update
9 Jan 16:19:48
We did not see any congestion over night on the affected circuits but we will continue monitoring all affected lines and post another update on Monday.
Update
12 Jan 10:37:32
We are still seeing congestion on the exchanges listed above between 20:00 and 22:30. We have updated BT and are awaiting their reply.
Update
20 Jan 12:52:05
We are now seeing congestion starting from 19:30 to 22:30 on these exchanges. We are awaiting an update from BT.
Update
21 Jan 11:13:44
BT have passed this to the TSO team and we are awaiting their investigation results. We will provide another update as soon as we have a reply.
Update
22 Jan 09:06:14
An update is expected on this tomorrow
Update
23 Jan 09:33:48
This one is still being investigated at the moment, and may need a card or fibre cable fitting. We will chase this for an update later in the day.
Broadband Users Affected 0.30%
Started 8 Jan 15:40:15 by AA Staff
Closed 5 Feb 13:07:52

29 Jan 10:07:29
Details
27 Jan 11:48:41
We are currently seeing congestion in the evening between the hours of 8PM and 11PM on the following BRASs: BRAS 21CN-BRAS-RED4-CF-C, BRAS 21CN-ACC-ALN12-CF-C, BRAS 21CN-BRAS-RED8-CF-C. We have raised this into BT and their Estimated completion date is: 29-01-2015 11:23 We will update you as soon as we have some more information.
Update
27 Jan 12:02:35
Here is an example graph:
Resolution Nothing back from BT, but we suspect they have increased capacity across the links; if we get any further news we will update this post.
Started 27 Jan 11:44:56
Closed 29 Jan 10:07:29

24 Jan 08:17:21
Details
23 Jan 08:35:26
In addition to all of the BT issues we have ongoing (and affecting all ISPs), we have seen some signs of congestion in the evening last night - this is due to planned switch upgrade work this morning. Normally we aim not to be the bottleneck, as you know, but we have moved customers on to half of our infrastructure to facilitate the switch change, and this puts us right on the limit for capacity at peak times. Over the next few nights we will be redistributing customers back on to the normal arrangement of three LNSs with one hot spare, and this will address the issue. Hopefully we have enough capacity freed up to avoid the issue tonight. Sorry for any inconvenience. Longer term we have more LNSs planned as we expand anyway.
Update
24 Jan 07:30:14
The congestion was worse last night, and the first stage of moving customers back to correct LNSs was done over night. We are completing this now (Saturday morning) to ensure no problems this evening.
Resolution Lines all moved to correct LNS so there should be no issues tonight.
Started 22 Jan
Closed 24 Jan 08:17:21
Previously expected 24 Jan 08:30:00

28 Jan 09:38:34
Details
4 Jan 09:45:22
We are seeing evening congestion on the Bristol North exchange, incident has been raised with BT and they are investigating.
Update
19 Jan 09:51:48
Here is an example graph:
Update
22 Jan 08:58:26
The fault has been escalated further and we are expecting an update on this tomorrow.
Update
23 Jan 09:37:14
No IRAMS/PEW has been issued yet, and there are no further updates this morning. We are chasing BT; an update is expected around 1:30PM today.
Update
26 Jan 09:36:18
BT are due to update us on this after 3PM today
Update
26 Jan 13:24:05
BT are looking to change the SFP port on the BRAS, we are chasing time scales on this now.
Update
26 Jan 14:16:43
This work will take place between 02:00 and 06:00 tomorrow morning
Update
27 Jan 09:23:21
Chasing BT to confirm the work was done over night, update to follow
Update
27 Jan 11:25:18
Nope, the work was postponed to this evening so we won't know whether they have fixed it until Wednesday evening. We will see.....
Update
28 Jan 09:38:34
Wow. Another BT congested link has been fixed over night.
Resolution BT changed the SFP port on the BRAS
Broadband Users Affected 0.01%
Started 4 Jan 09:45:22
Closed 28 Jan 09:38:34
Previously expected 29 Jan 13:23:24

28 Jan 09:25:23
Details
21 Jan 09:44:42
Our monitoring has picked up further congestion within the BT network causing high latency between 6pm-11pm every night on the following BRASs. This is affecting BT lines only, in the Bristol and South/South West Wales areas: 21CN-BRAS-RED3-CF-C 21CN-BRAS-RED6-CF-C An incident has been raised with BT and we will update this post as and when we have updates.
Update
21 Jan 09:47:51
Here is an example graph:
Update
22 Jan 08:46:12
We are expecting a resolution on this tomorrow - 2015-01-23
Update
23 Jan 09:35:26
This one is still with the Adhara NOC team. They are trying to solve the congestion problems. Target resolution is today 23/1/15, we have no specific time frame so we will update you as soon as we have more information from BT.
Update
26 Jan 10:03:08
We are expecting an update on this later this afternoon
Update
26 Jan 16:23:32
BT are seeing some errors on slot 7 on one of the 7750s; they are looking to swap it over this evening and will then monitor it. We will update you once we get any further news.
Update
27 Jan 09:20:48
We are checking with BT whether or not a change was made over night.
Update
28 Jan 09:25:17
BT have actually cleared the congestion. We will monitor this very closely though.
Broadband Users Affected 0.03%
Started 4 Jan 18:00:00 by AA Staff
Closed 28 Jan 09:25:23
Previously expected 28 Jan 09:20:53

27 Jan 15:45:12
Details
27 Jan 13:45:04
There appears to be a problem with one of BT's BRASs (21CN-BRAS-RED3-BM-TH) where customers are unable to connect. We are speaking to BT about this now and will update this post ASAP.
Update
27 Jan 13:52:57
BT 'tech services' are aware and dealing as we speak ....
Update
27 Jan 13:54:12
There are engineers on site already!
Update
27 Jan 15:45:36
WOW. BT have fixed their BRAS fault in record time.
Broadband Users Affected 0.01%
Started 27 Jan 12:19:21
Closed 27 Jan 15:45:12
Previously expected 27 Jan 17:42:21

23 Jan 06:47:57
Details
19 Jan 09:55:58
We are replacing one of our core switches in London on Friday from 6AM. This should not be service affecting, but should be considered an 'at risk' period. We'll update this post as this work is carried out.
Update
21 Jan 15:01:16
This has been rescheduled for Friday 6AM.
Update
23 Jan 06:23:05
Work on this is about to start.
Update
23 Jan 06:48:22
This work has been completed.
Started 23 Jan 06:00:00
Closed 23 Jan 06:47:57

22 Jan 09:48:14
Details
13 Jan 12:17:05
We are seeing low level packet loss on the Hunslet exchange (BT tails); this has been reported to BT. All of our BT tails connected to the Hunslet exchange are affected.
Update
13 Jan 12:27:11
Here is an example graph:
Update
15 Jan 11:50:15
Having chased BT up they have promised us an update by the end of play today.
Update
16 Jan 09:07:51
BT have identified a card fault within their network. We are just waiting for confirmation as to when it will be fixed.
Update
19 Jan 09:31:11
It appears this is now resolved - well BT have added extra capacity on the link: "To alleviate congestion on acc-aln2.ls-bas -10/1/1 the OSPF cost on the backhauls in area 8.7.92.17 to acc-aln1.bok and acc-aln1.hma have been temporarily adjusted to 4000 from 3000. This has brought traffic down by about 10 to 15 % - and should hopefully avoid the over utilisation during peak"
Resolution Work has been completed on the BT network to alleviate traffic
Broadband Users Affected 0.01%
Started 11 Jan 12:14:28 by AAISP Pro Active Monitoring Systems
Closed 22 Jan 09:48:14

19 Jan 12:01:39
Details
14 Jan 11:23:00
We have reported congestion affecting TT BERMONDSEY in the evenings starting from the 8th of Jan. Updates will follow when we have them. Thanks for your patience.
Update
14 Jan 11:28:46
Here is an example graph
Update
15 Jan 10:31:10
Talk Talk have now fixed the congestion issue and we are no longer seeing packet loss.
Started 8 Jan 11:20:05

19 Jan 11:28:57
Details
19 Dec 2014 09:44:48
Today the CVE-2014-9222 router vulnerability AKA 'misfortune cookie' has been announced at http://mis.fortunecook.ie/ This is reported to affect many broadband routers all over the world. The web page has further details.
We are contacting our suppliers for their take on this, we'll post follow-ups to this status post shortly.
It is also worth noting that at the time of writing CVE-2014-9222 is still 'reserved': http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2014-9222
Update
19 Dec 2014 09:52:28
Technicolor Routers:- These routers are not (yet?) on the list, we are awaiting a response from Technicolor regarding this.
Update: Technicolor say "We don't use that webserver, so not impacted"
Update
19 Dec 2014 09:59:46
ZyXEL P-660R-D1: This router is on the list. We are awaiting a response from ZyXEL though. We do already have this page regarding the web interface on ZyXELs: http://wiki.aa.org.uk/Router_-_ZyXEL_P660R-D1#Closing_WAN_HTTP and closing the Web server from the WAN may help with this vulnerability.
Update: The version of RomPager (the web server) on ZyXELs that we have been shipping for some time is 4.51. Allegedly versions before 4.34 are vulnerable, so they may not be vulnerable. You can tell the version with either:
wget -S IP.of.ZyXEL
or
curl --head IP.of.ZyXEL
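For anyone who prefers to script the check, something like this Python sketch (hypothetical, assuming the ZyXEL's web interface answers on port 80 from the LAN) reports the same Server header as the commands above:

#!/usr/bin/env python3
# Hypothetical helper: print a router's HTTP Server banner so the RomPager
# version can be checked (equivalent to "wget -S" or "curl --head").
import sys
import urllib.request
import urllib.error

def server_banner(ip, timeout=5):
    try:
        with urllib.request.urlopen("http://" + ip + "/", timeout=timeout) as resp:
            return resp.headers.get("Server")
    except urllib.error.HTTPError as e:
        # Even a 401 auth challenge normally carries the Server header.
        return e.headers.get("Server")

if __name__ == "__main__":
    print(server_banner(sys.argv[1]))

A banner mentioning RomPager/4.51 would suggest the newer web server described above.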
Update 2015-01-07: P-660R-D1 Not affected: http://www.zyxel.com/support/ZyXEL_Helps_You_Guard_against_misfortune_cookie_vulnerability.shtml
Update
19 Dec 2014 10:00:57
Dlink 320B: We supply these in Bridge mode and therefore are not vulnerable.
Update
19 Dec 2014 10:02:38
FireBrick: Firebricks are not vulnerable.
Started 19 Dec 2014 09:00:00

16 Jan 09:16:49
Details
15 Jan 14:26:38
Just before 2PM today a number of TalkTalk circuits dropped out; it looks like they have all now come back online. We are investigating this with TalkTalk
Update
15 Jan 14:54:39
TT are investigating further, but their initial response was: "nothing is presenting me with an obvious answer other than what appears to have been a connectivity problem to HEX". Update to follow
Update
16 Jan 09:17:35
Update from TT. Summary: Network monitoring has identified that Wholesale Business customers across the country may have experienced a brief loss of service between 14:00 and 14:05. Network Monitoring completed by our NOC has identified that there was a drop of traffic at these times between a Network Core Router at the Brentford Data Centre (NCR002.BRE) and a Redback router (LTS001.HEX). This subsequently caused Wholesale Business customers who route through Harbour exchange to experience a loss of service.
Resolution Technical / Suspected Root Cause Investigations by our Network Support have identified that this issue occurred due to planned LTS work under SR8753. This work caused the queues on the LTS to fill up and caused a CPU spike, which in turn caused the tunnels to drop. There are plans to increase circuit capacity going forward to ease bandwidth levels and prevent a repeat of this issue.
Broadband Users Affected 1%
Started 15 Jan 14:24:54
Closed 16 Jan 09:16:49
Cause Carrier
Previously expected 15 Jan 18:24:54

13 Jan 11:46:40
Details
18 Dec 2014 20:00:52
The working days between Christmas and New Year are "Christmas" rate, this means that any usage on 29th, 30th, and 31st December is not counted towards your Units allowance. As usual, bank holidays are treated as 'Weekend' rate.
(This doesn't apply to Home::1 or Office::1 customers.)
We wish all our customers a Merry Christmas!
Started 18 Dec 2014 19:00:00

11 Jan 11:36:06
Details
11 Jan 11:14:41
We may have to restart one or two of our LNSs in the early hours of Monday morning as there seems to be an issue with them. This is not affecting service at present but does mean we do not have any of the normal graphs which are essential to diagnostics and fault handling. Lines should reconnect promptly and ideally only be off for a few seconds, but, as usual, this very much depends on the routers and some lines may take a few minutes to reconnect. There may be a second controlled PPP restart for some lines that end up on the wrong LNS once lines have stabilised.
Resolution False alarm, work cancelled.
Broadband Users Affected 66%
Started 12 Jan
Closed 11 Jan 11:36:06
Previously expected 12 Jan 07:00:00

10 Jan 20:00:00
Details
10 Jan 19:44:03
Since 19:20 we have seen issues on all TalkTalk backhaul lines. Investigating
Update
10 Jan 20:08:08
Looks to be recovering
Update
10 Jan 21:32:01
Most lines are up as of 8pm. We'll investigate the cause of this.
Started 10 Jan 19:20:00
Closed 10 Jan 20:00:00

10 Jan 15:35:00
Details
10 Jan 10:13:09
We are investigating a problem affecting some wholesale customers. This seems to be a data centre connectivity fault. Updates to follow.
Update
10 Jan 10:23:40
This is related to a fault with Datahop, outside of our network. They are aware and investigating.
Update
10 Jan 15:37:12
Our Datahop ports are now back online. (Ironically, they came on line whilst waiting on hold for their NOC!). Wholesales lines are starting to reconnect. We suspect a faulty switch is to blame within Datahop.
Update
10 Jan 21:09:52
Datahop confirm this was caused by a faulty switch.
Started 10 Jan 09:20:00
Closed 10 Jan 15:35:00

12 Dec 2014 11:00:40
Details
11 Dec 2014 10:42:15
We are seeing some TT connected lines with packetloss starting at 9AM yesterday and today. The loss lasts until 10AM, and then a low level of loss continues. We have reported this to TalkTalk
Update
11 Dec 2014 10:46:34
This is the pattern of loss we are seeing:
Update
12 Dec 2014 12:00:04
No loss has been seen on these lines today. We're still chasing TT for any update though.
Resolution The problem went away... TT were unable to find the cause.
Broadband Users Affected 7%
Started 11 Dec 2014 09:00:00
Closed 12 Dec 2014 11:00:40

11 Dec 2014 14:15:00
Details
11 Dec 2014 14:13:58
BT issue affecting SOHO AKA GERRARD STREET 21CN-ACC-ALN1-L-GER. We have reported this to BT and they are now investigating.
Update
11 Dec 2014 14:19:33
BT are investigating, however the circuits are mostly back online.
Started 11 Dec 2014 13:42:11 by AAISP Pro Active Monitoring Systems
Closed 11 Dec 2014 14:15:00
Previously expected 11 Dec 2014 18:13:11 (Last Estimated Resolution Time from AAISP)

02 Dec 2014 09:05:00
Details
01 Dec 2014 21:54:24
All FTTP circuits on Bradwell Abbey have packetloss. This started at about 23:45 on 30th November. This is affecting other ISPs too. BT did have an Incident open, but this has been closed. They restarted a line card last night, but it seems the problem has been present since the card was restarted. We are chasing BT.
Example graph:
Update
01 Dec 2014 22:38:39
It has been a struggle to get the front line support and the Incident Desk at BT to accept that this is a problem. We have passed this on to our Account Manager and other contacts within BT in the hope of a speedy fix.
Update
02 Dec 2014 07:28:40
BT have tried doing something overnight, but the packetloss still exists at 7am 2nd December. Our monitoring shows:
  • The packet loss stops at 00:30
  • The lines go off between 04:20 and 06:00
  • The packet loss starts again at 6:00 when they come back online.
We've passed this on to BT.
Update
02 Dec 2014 09:04:56
Since 7AM today, the lines have been OK... we will continue to monitor.
Started 30 Nov 2014 23:45:00
Closed 02 Dec 2014 09:05:00

03 Dec 2014 09:44:00
Details
27 Nov 2014 16:31:03
We are seeing what looks like congestion on the Walworth exchange. Customers will be experiencing high latency, packetloss and slow throughput in the evenings and weekends. We have reported this to TalkTalk.
Update
02 Dec 2014 09:39:27
Talk Talk are still investigating this issue.
Update
02 Dec 2014 12:22:04
The congestion issue has been discovered on Walworth Exchange and Talk Talk are in the process of traffic balancing.
Update
03 Dec 2014 10:30:14
Capacity has been increased and the exchange is looking much better now.
Started 27 Nov 2014 16:28:35
Closed 03 Dec 2014 09:44:00

03 Dec 2014 18:20:00
Details
03 Dec 2014 10:45:55
We are seeing MTU issues on some 21CN lines this morning where lines are unable to pass more than 1462 byte IP packets. It isn't affecting all lines, and the common factor appears to be that they are on ACC-ALN1 BRASs in London, but not all lines on those BRASs are affected. We have reported the issue to BT Wholesale and they are investigating the issue.
Update
03 Dec 2014 16:18:23
BT have now raised an incident and are investigating.
Update
03 Dec 2014 18:57:49
This has been fixed. We've been speaking to some network guys at BT this evening and helping them find the fault.
Resolution BT fixed this last night. We believe BT had equipment in the network that was misconfigured.
Started 03 Dec 2014 09:00:00
Closed 03 Dec 2014 18:20:00

19 Nov 2014 16:20:46
Details
19 Nov 2014 15:11:12
Lonap (one of the main Internet peering points in the UK) has a problem. We have stopped passing traffic over Lonap. Customers may have seen packetloss for a short while, but routing should be OK now. We are monitoring the traffic and will bring back Lonap when all is well.
Update
19 Nov 2014 16:21:29
The Lonap problem has been fixed, and we've re-enabled our peering.
Started 19 Nov 2014 15:00:00
Closed 19 Nov 2014 16:20:46

21 Nov 2014 00:18:00
Details
21 Nov 2014 10:58:09
We have a number of TT lines down all on the same RAS: HOST-62-24-203-36-AS13285-NET. We are chasing this with TalkTalk.
Update
21 Nov 2014 11:01:29
Most lines are now back. We have informed TalkTalk.
Update
21 Nov 2014 12:18:22
TT have come back to us. They were aware of the problem, it was caused by a software problem on an LTS.
Started 21 Nov 2014 10:45:00
Closed 21 Nov 2014 00:18:00

25 Nov 2014 10:43:46
Details
21 Oct 2014 14:10:19
We're seeing congestion from 10am up to 11:30pm across the BT Rose Street, PIMLICO and High Wycombe exchanges. A fault has been raised with BT and we will post updates as soon as we can. Thanks for your patience.
Update
28 Oct 2014 11:23:44
Rose Street and High Wycombe are now clear. Still investigating Pimlico
Update
03 Nov 2014 14:41:45
Pimlico has now been passed to BT's capacity team to deal with. Further capacity is needed and will be added asap. We will provide updates as soon as they are available.
Update
05 Nov 2014 10:12:30
We have just been informed by the BT capacity team that end users will be moved to a different VLAN on Friday morning. We will post further updates when we have them.
Update
11 Nov 2014 10:23:59
Most of the Pimlico exchange is now fixed. Sorry for the delay.
Update
19 Nov 2014 11:01:57
There is further planned work on the Pimlico exchange for the 20th November. This should resolve the congestion on the Exchange.
Update
25 Nov 2014 10:44:43
Pimlico lines are now running as expected. Thanks for your patience.
Started 21 Oct 2014 13:31:50
Closed 25 Nov 2014 10:43:46

04 Nov 2014 16:47:11
Details
04 Nov 2014 09:42:18
Several graphs have been missing in recent weeks, on some days and on some LNSs. This is something we are working on. Unfortunately, today, one of the LNSs is not showing live graphs again, and so these will not be logged over night. We hope to have a fix for this in the next few days. Sorry for any inconvenience.
Resolution The underlying cause has been identified and a fix will be deployed over the next few days.
Started 01 Oct 2014
Closed 04 Nov 2014 16:47:11
Previously expected 10 Nov 2014

05 Nov 2014 02:27:31
Details
04 Nov 2014 09:50:36
Once again we expect to reset one of the LNSs early in the morning. This will not be the usual LNS switch, with the preferred time of night, but all lines on the LNS at once. The exact time depends on staff availability, sorry. This means a clear of PPP which can immediately reconnect. This may be followed by a second PPP reset a few minutes later. We do hope to have a proper solution to this issue in a couple of days.
Resolution Reset completed. We will do a normal rolling update of LNSs over next three nights. This should address the cause of the problem. If we have issues with graphs before that is complete, we may have to do a reset like this again.
Broadband Users Affected 33%
Started 05 Nov 2014
Closed 05 Nov 2014 02:27:31
Previously expected 05 Nov 2014 07:00:00

01 Nov 2014 11:35:11
[Broadband] - Blip - Closed
Details
01 Nov 2014 11:55:38
There appears to be something of a small DoS attack which resulted in a blip around 11:29:16 today, and caused some issues with broadband lines and other services. We're looking in to this at present and graphs are not currently visible on one of the LNSs for customers.
Update
01 Nov 2014 13:09:44
We expect graphs on a.gormless to be back tomorrow morning after some planned work.
Resolution Being investigated further.
Started 01 Nov 2014 11:29:16
Closed 01 Nov 2014 11:35:11

02 Nov 2014 04:08:38
Details
01 Nov 2014 13:07:11
We normally do LNS switch overs without a specific planned notice - the process is routine for maintenance and means clearing the PPP session to reconnect immediately on another LNS. We do one line at a time, slowly. We even have a control on the clueless so you can state preferred time of night.

However, tomorrow morning, we will be moving lines off a.gormless (one third of customers) using a different process. It should be much the same, but all lines will be at one time of night, and this may mean some are slower to reconnect.

The plan is to do this early morning - the exact time depends on when staff are available. Sorry for any inconvenience.

Resolution Completed as planned. Graphs back from now on a.gormless.
Broadband Users Affected 33%
Started 01 Nov 2014 03:00:00
Closed 02 Nov 2014 04:08:38
Previously expected 01 Nov 2014 07:00:00

07 Oct 2014 06:17:13
Details
03 Oct 2014 16:25:24
As we advised, we have had to make some radical changes to our billing to fix database load issues. These have gone quite well overall, but there have been a few snags. We think we have them all now, but this month we had to revert some usage charging giving some free usage.

We have identified that quarterly billed customers on units tariffs were not charged, so these are being applied shortly as a new invoice. Anyone with excess usage as a result, please do ask accounts for a credit.

We have also identified that call charges have not been billed - these can be billed to date if anyone asks, or if you leave it then they should finally catch up on next month's bill.

Sorry for any inconvenience.

Started 01 Oct 2014
Previously expected 01 Nov 2014

15 Oct 2014 17:14:18
Details
06 Oct 2014 14:22:50
For the next week or so we're considering 5am-7am to be a PEW window for some very low disruption work (a few seconds of "blip"). We're still trying very hard to improve our network configuration and router code to create a much more stable network. It seems, from recent experience, that this sort of window will be least disruptive to customers. It is a time when issues can be resolved by staff if needed (which is harder at times like 3am) and we get more feedback from end users. As before, we expect this work to have no impact in most cases, and maybe a couple of seconds of routing issues if it is not quite to plan. Sadly, all of our efforts to create the same test scenarios "on the bench" have not worked well. At this stage we are reviewing code to understand Sunday morning's work better, and this may take some time before we start. We'll update here and on irc before work is done. Thank you for your patience.
Update
07 Oct 2014 09:06:41
We did do work around 6:15 to 6:30 today - I thought I had posted an update here before I started but somehow it did not show. If we do any more, I'll try and make it a little earlier.
Update
08 Oct 2014 05:43:11
Doing work a little earlier today. We don't believe we caused any blips with today's testing.
Update
09 Oct 2014 05:47:53
Another early start and went very well.
Update
10 Oct 2014 08:22:53
We updated the remaining core routers this morning, and it seemed to go very well. Indeed, pings we ran had zero loss when upgrading routers in Telecity. However, we did lose TalkTalk broadband lines in the process. These all reconnected straight away, but we are now reviewing how this happens to try and avoid it in future.
Resolution Closing this PEW from last week. We may need to do more work at some point, but we are getting quite good at this now.
Started 07 Oct 2014 06:00:00
Closed 15 Oct 2014 17:14:18
Previously expected 14 Oct 2014 07:00:00

05 Oct 2014 07:26:50
Details
03 Oct 2014 10:41:59
We do plan to upgrade routers again over the weekend, probably early Saturday morning (before 9am). I'll post on irc at the time and update this notice.

The work this week means we expect this to be totally seamless, but the only way to actually be sure is to try it.

If we still see any issues we'll do more on Sunday.

Update
04 Oct 2014 06:54:19
Upgrades starting shortly.
Update
04 Oct 2014 07:24:47
Almost perfect!

We loaded four routers, each at different points in the network. We ran a ping that went through all four routers whilst doing this. For three of them we did see ping drop a packet. The fourth we did not see a drop at all.

This may sound good, but it should be better - we should not lose a single packet doing this. We're looking at the logs to work out why, and may try again Sunday morning.

Thank you for your patience.

Update
04 Oct 2014 07:53:52
Plan for tomorrow is to pick one of the routers that did drop a ping, and shut it down and hold it without restarting - at that point we can investigate what is still routing via it and why. This should help us explain the dropped ping. Assuming that provides the clues we need we may load or reconfigure routers later on Sunday to fix it.
Update
05 Oct 2014 06:57:39
We are starting work shortly.
Update
05 Oct 2014 07:11:00
We are doing the upgrades as planned, but not able to do the level of additional diagnostics we wanted. We may look in to that next weekend.
Resolution Only 3 routers were upgraded, the 3rd having several seconds of issues. We will investigate the logs and do another planned work. It seems early morning like this is less disruptive to customers.
Started 04 Oct 2014
Closed 05 Oct 2014 07:26:50
Previously expected 06 Oct 2014

02 Oct 2014 19:05:55
Details
02 Oct 2014 19:05:15
We'd like to thank customers for patience this week. The tests we have been doing in the evenings have been invaluable. The issues seen have mostly related to links to Maidenhead (so voice calls rather than broadband connections).

The work we are doing has involved a lot of testing "on the bench" and even in our offices (to the annoyance of staff) but ultimately testing on the live customer services is the final test. The results have been informative and we are very close to our goal now.

The goal is to allow router maintenance with zero packet loss. We finally have the last piece in the jigsaw for this, and so should have this in place soon. Even so, there may be some further work to achieve this.

Apart from a "Nice to have" goal, this also relates to failures of hardware, power cuts, and software crashes. The work is making the network configuration more robust and should allow for key component failures with outages as short as 300ms in some cases. LNS issues tend to take longer for PPP to reconnect, but we want to try and be as robust as possible.

So, once again, thank you all for your patience while we work on this. There may be some more planned works which really should now be invisible to customers.

Started 02 Oct 2014 19:00:41

01 Oct 2014 17:49:32
Details
30 Sep 2014 18:04:06
Having been very successful with the router upgrade tonight, we are looking to move to the next router on Wednesday. Signs so far are that this should be equally seamless. We are, however, taking this slowly, one step at a time, to be sure.
Resolution We loaded 4 routers in all, and some were almost seamless, and some had a few seconds of outage, it was not perfect but way better than previously. We are now going to look in to the logs in detail and try to understand what we do next.

Our goal here is zero packet loss for maintenance.

I'd like to thank all those on irc for their useful feedback during these tests.

Started 01 Oct 2014 17:00:00
Closed 01 Oct 2014 17:49:32
Previously expected 01 Oct 2014 18:00:00

30 Sep 2014 18:02:25
Details
29 Sep 2014 21:57:11
We are going to spend much of tomorrow trying to track down why things did not go smoothly tonight, and hope to have a solution by tomorrow (Tuesday) evening.

This time I hope to make a test load before the peak period at 6pm, so between 5pm and 6pm when things are a bit of a lull between business and home use.

If all goes to plan there will be NO impact at all, and that is what we hope. If so we will update three routers with increasing risk of impact, and abort if there are any issues.

Please follow things on irc tomorrow.

If this works as planned we will finally have all routers under "seamless upgrade" processes.

Update
30 Sep 2014 08:29:42
Tests on our internal systems this morning confirm we understand what went wrong last night, and as such the upgrade tonight should be seamless.

For the technically minded, we had an issue with VRRP becoming master too soon, i.e. before all routes are installed. The routing logic is now linked to VRRP to avoid this scenario, regardless of how long routing takes.

Resolution The upgrade went very nearly perfectly on the first router - we believe the only noticeable impact was the link to our office, which we think we understand now. However, we did only do the one router this time.
Started 30 Sep 2014 17:00:00
Closed 30 Sep 2014 18:02:25
Previously expected 30 Sep 2014 18:00:00

29 Sep 2014 22:37:36
Details
21 Aug 2014 12:50:32
Over the past week or so we have been missing data on some monitoring graphs, this is shown as purple for the first hour in the morning. This is being caused by delays in collecting the data. This is being looked in to.
Resolution We believe this has been fixed now. We have been monitoring it for a fortnight after making an initial fix, and it looks to have been successful.
Closed 29 Sep 2014 22:37:36

29 Sep 2014 19:29:19
Details
29 Sep 2014 14:06:12
We expect to reload a router this evening, which is likely to cause a few seconds of routing issues. This is part of trying to address the blips caused by router upgrades, which are meant to be seamless.
Update
29 Sep 2014 18:48:37
The reload is expected shortly, and will be on two boxes at least. We are monitoring the effect of the changes we have made. They should be a big improvement.
Resolution Upgrade was tested only on one router (Maidenhead) and caused some nasty impact on routing to call servers and control systems - general DSL was unaffected. Changes are backed out now, and back to drawing board. Further PEW will be announced as necessary.
Started 29 Sep 2014 17:00:00
Closed 29 Sep 2014 19:29:19
Previously expected 29 Sep 2014 23:00:00

29 Sep 2014 13:17:50
Details
29 Sep 2014 08:48:37
Some updates to the billing system have caused a problem for units billed customers resulting in their usage for next month starting early, i.e. usage is now being logged for October.

Because of the way usage carries forward, this is unlikely to have much impact on customers in terms of additional charges. However, any customers that think they have lost out, please let us know and we'll make a manual adjustment.

The problem has been corrected for next month.

Update
29 Sep 2014 08:57:00
It looks like customers won't get billed top-up and may not get billed units either, so we are working on un-doing this issue so that billing is done normally. Please bear with us.
Update
29 Sep 2014 09:23:40
We are working on this now and should have usage billing back to normal later this morning.
Resolution Usage billing has been restored to around 1am Saturday, giving customers 2.5 days of unmetered usage.
Started 29 Sep 2014 08:45:12
Closed 29 Sep 2014 13:17:50

28 Sep 2014 19:20:54
Details
28 Sep 2014 18:52:50
We are experiencing a network problem affecting our broadband customers. Staff are investigating.
Update
28 Sep 2014 19:08:28
This is looking like some sort of Denial of Service attack. We're looking at mitigating this.
Update
28 Sep 2014 19:16:36
The traffic has died down, things are starting to look better.
Update
28 Sep 2014 19:21:46
Traffic is now back to normal.
Started 28 Sep 2014 18:30:00
Closed 28 Sep 2014 19:20:54

20 Sep 2014 07:09:09
Details
20 Sep 2014 11:59:13
RADIUS accounting is behind at the moment. This is causing the usage data to appear as missing from customer lines. The accounting is behind, but it's not broken, and is catching up. The usage data doesn't appear to be lost, and should appear later in the day.
Update
21 Sep 2014 08:12:52
Records have now caught up.
Closed 20 Sep 2014 07:09:09
Previously expected 20 Sep 2014 15:57:11

26 Aug 2014 09:15:00
Details
26 Aug 2014 09:02:02
Yesterday's and today's line graphs are not being shown at the moment. We are working on restoring this.
Update
26 Aug 2014 09:42:18
Today's graphs are back; yesterday's are lost though.
Started 26 Aug 2014 08:00:00
Closed 26 Aug 2014 09:15:00

29 Sep 2014 16:57:23
Details
02 Sep 2014 17:15:50
We had a blip on one of the LNSs yesterday, so we are looking to roll out some updates over this week which should help address this, and some of the other issues last month. As usual LNS upgrades would be over night. We'll be rolling out to some of the other routers first, which may mean a few seconds of routing changes.
Update
07 Sep 2014 09:43:40
Upgrades are going well, but we are taking this slowly, and have not touched the LNSs yet. Addressing stability issues is always tricky as it can be weeks or months before we know we have actually fixed the problems. So far we have managed to identify some specific issues that we have been able to fix. We obviously have to be very careful to ensure these "fixes" do not impact normal service in any way. As such I have extended this PEW another week.
Update
13 Sep 2014 11:07:13
We are making significant progress on this. Two upgrades are expected today (Saturday 13th) which should not have any impact. We are also working on ways to make upgrades properly seamless (which is often the case, but not always).
Update
14 Sep 2014 17:21:35
Over the weekend we have done a number of tests, and we have managed to identify specific issues and put fixes in place on some of the routers on the network to see how they go.

This did lead to some blips (around 9am and 5pm on Sunday for example). We think we have a clearer idea on what happened with these too, and so we expect that we will load some new code early tomorrow or late tonight which may mean another brief blip. This should allow us to be much more seamless in future.

Later in the week we expect to roll out code to more routers.

Update
16 Sep 2014 16:57:07
We really think we have this sussed now - including reloads that have near zero impact on customers. We have a couple more loads to do this week (including one at 5pm today), and some over night rolling LNS updates.
Update
17 Sep 2014 12:23:59
The new release is now out, and we are planning upgrades this evening (from 5pm) and one of the LNSs over night. This should be pretty seamless now. At the end of the month we'll upgrade the second half of the core routers, assuming all goes well. Thank you for your patience.
Update
18 Sep 2014 17:15:27
FYI, there were a couple of issues with core routers today, at least one of which would have impacted internet routing for some destinations for several seconds. These issues were on the routers which have not yet been upgraded, which is rather encouraging. We are, of course, monitoring the situation carefully. The plan is still to upgrade the second half of the routers at the end of the month.
Update
19 Sep 2014 12:12:42
One of our LNS's (d.gormless) did restart unexpectedly this morning - this router is scheduled to be upgraded tonight.
Update
28 Sep 2014 13:25:10
The new release has been very stable for the last week and is being upgraded on remaining routers during Sunday.
Resolution Stable releases loaded at weekend
Started 02 Sep 2014 18:00:00
Closed 29 Sep 2014 16:57:23
Previously expected 19 Sep 2014

02 Sep 2014 17:08:13
Details
02 Sep 2014 15:38:09
Some people use the test LNS (doubtless) for various reasons, and it is also used some of the time for our NAT64 gateway.

We normally do re-loads on doubtless to test things with no notice, but we expect there may be quite a few this afternoon/evening as we are trying to track down an issue with new code that is not showing on the bench test systems.

As usual this is a PPP reset and reconnect and if it crashes may be a few seconds extra outage. With any luck this will not take many resets to find the issue.

Resolution Testing went well.
Started 02 Sep 2014 15:40:00
Closed 02 Sep 2014 17:08:13
Previously expected 03 Sep 2014

01 Sep 2014 19:42:08
Details
01 Sep 2014 19:42:56
c.gormless rebooted, lines moved to other LNS automatically. We are investigating.
Broadband Users Affected 33%
Started 01 Sep 2014 19:39:19
Closed 01 Sep 2014 19:42:08

23 Apr 2014 10:21:03
Details
01 Nov 2013 15:05:00
We have identified an issue that appears to be affecting some customers with FTTC modems. The issue is stupidly complex, and we are still trying to pin down the exact details. The symptoms appear to be that some packets are not passing correctly, some of the time.

Unfortunately one of the types of packet that refuses to pass correctly are FireBrick FB105 tunnel packets. This means customers relying on FB105 tunnels over FTTC are seeing issues.

The work around is to remove the ethernet lead to the modem and then reconnect it. This seems to fix the issue, at least until the next PPP restart. If you have remote access to a FireBrick, e.g. via WAN IP, and need to do this you can change the Ethernet port settings to force it to re-negotiate, and this has the same effect - this only works if directly connected to the FTTC modem as the fix does need the modem Ethernet to restart.

We are asking BT about this, and we are currently assuming this is a firmware issue on the BT FTTC modems.

We have confirmed that modems re-flashed with non-BT firmware do not have the same problem, though we don't usually recommend doing this as it is a BT modem and part of the service.

Update
04 Nov 2013 16:52:49
We have been working on getting more specific information regarding this, we hope to post an update tomorrow.
Update
05 Nov 2013 09:34:14
We have reproduced this problem by sending UDP packets using 'Scapy'. We are doing further testing today, and hope to write up a more detailed report about what we are seeing and what we have tested.
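For illustration only, a Scapy probe sender along these lines (a sketch, not our actual test script; the target address, ports and counts are placeholders) generates UDP packets across a range of source ports so that behaviour before and after a PPP restart can be compared:

#!/usr/bin/env python3
# Illustrative sketch, not the actual test script: send UDP probes from a
# range of source ports towards a remote endpoint beyond the LNS, so that
# runs before and after a PPP restart can be compared.
from scapy.all import IP, UDP, Raw, send

TARGET = "192.0.2.1"     # placeholder address of a host logging the probes
DPORT = 50000            # placeholder destination port
BASE_SPORT = 40000
COUNT = 500              # more than the ~254 combinations that get stuck

for i in range(COUNT):
    pkt = (IP(dst=TARGET) /
           UDP(sport=BASE_SPORT + i, dport=DPORT) /
           Raw(load=("probe %d" % i).encode()))
    send(pkt, verbose=False)

Varying the destination port as well exercises more src+dst combinations, as described in the later updates.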
Update
05 Nov 2013 14:27:26
We have some quite good demonstrations of the problem now, and it looks like it will mess up most VPNs based on UDP. We can show how a whole range of UDP ports can be blacklisted by the modem somehow on the next PPP restart. It is crazy. We hope to post a little video of our testing shortly.
Update
05 Nov 2013 15:08:16
Here is an update/overview of the situation. (from http://revk.www.me.uk/2013/11/bt-huawei-fttc-modem-bug-breaking-vpns.html )

We have confirmed that the latest code in the BT FTTC modems appears to have a serious bug that is affecting almost anyone running any sort of VPN over FTTC.

Existing modems seem to be upgrading, presumably due to a roll out of new code in BT. An older modem that has not been online for a while is fine. A re-flashed modem with non-BT firmware is fine. A working modem on the line for a while suddenly stopped working, presumably upgraded.

The bug appears to be that the modem manages to "blacklist" some UDP packets after a PPP restart.

If we send a number of UDP packets, using various UDP ports, then cause PPP to drop and reconnect, we then find that around 254 combinations of UDP IP/ports are now blacklisted. I.e. they no longer get sent on the line. Other packets are fine.

If we send 500 different packets, around 254 of them will not work again after the PPP restart. It is not necessarily the first or last 254 packets (the affected ones can be anywhere in the batch), but it does seem to be 254 combinations. They work as much as you like before the PPP restart, and then never work after it.

We can send a batch of packets, wait 5 minutes, PPP restart, and still find that packets are now blacklisted. We have tried a wide range of ports, high and low, different src and dst ports, and so on - they are all affected.

The only way to "fix" it, is to disconnect the Ethernet port on the modem and reconnect. This does not even have to be long enough to drop PPP. Then it is fine until the next PPP restart. And yes, we have been running a load of scripts to systematically test this and reproduce the fault.
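As a rough idea of what such a script might look like (a hypothetical sketch, not the harness we actually ran; the port is a placeholder matching the sender sketch earlier), a listener at the far end can record which probes arrive, and diffing a capture taken before the PPP restart against one taken after shows which combinations the modem has blacklisted:

#!/usr/bin/env python3
# Hypothetical far-end listener: record which UDP probes arrive so a run
# taken before a PPP restart can be diffed against one taken after it.
import socket

PORT = 50000          # matches the destination port used by the probe sender
seen = set()

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", PORT))
sock.settimeout(30)   # stop once the probes dry up

try:
    while True:
        data, (src_ip, src_port) = sock.recvfrom(2048)
        seen.add((src_port, data.decode(errors="replace")))
except socket.timeout:
    pass

# One line per probe received; diff two of these outputs to see which
# source ports no longer make it through after the PPP restart.
for src_port, payload in sorted(seen):
    print(src_port, payload)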

The problem is that a lot of VPNs use UDP and use the same set of ports for all of the packets, so if that combination is blacklisted by the modem the VPN stops after a PPP restart. The only way to fix it is manual intervention.

The modem is meant to be an Ethernet bridge. It should not know anything about PPP restarting or UDP packets and ports. It makes no sense that it would do this. We have tested swapping working and broken modems back and forth. We have tested with a variety of different equipment doing PPPoE and IP behind the modem.

BT are working on this, but it is a serious concern that this is being rolled out.
Update
12 Nov 2013 10:20:18
Work on this is still ongoing... We have tested this on a standard BT retail FTTC 'Infinity' line, and the problem cannot be reproduced. We suspect this is because when the PPP re-establishes a different IP address is allocated each time, and whatever is doing the session tracking does not match the new connection.
Update
12 Nov 2013 11:08:17

Here is an update with a more specific explanation of the problem we are seeing:

On WBC FTTC, we can send a UDP packet inside the PPP and then drop the PPP a few seconds later. After the PPP re-establishes, UDP packets with the same source and destination IP and ports won't pass; they do not reach the LNS at the ISP.

Further to that, it's not just one src+dst IP and port tuple which is affected. We can send 254 UDP packets using different src+dst ports before we drop the PPP. After it comes back up, all 254 port combinations will fail. It is worth noting here that this cannot be reproduced on an FTTC service which allocates a dynamic IP that changes each time PPP re-establishes.

If we send more than 254 packets, only 254 will be broken and the others will work. It's not always the first 254 or last 254; the broken ones move around between tests.

So it sounds like the modem (or, less likely, something in the cab or exchange) is creating state table entries for packets it is passing which tie them to a particular PPP session, and then failing to flush the table when the PPP goes down.

This is a little crazy in the first place. It's a modem. It shouldn't even be aware that it's passing PPPoE frames, let alone looking inside them to see that they are UDP.

This only happens when using an Openreach Huawei HG612 modem that we suspect has been recently remotely and automatically upgraded by Openreach in the past couple of months. Further - a HG612 modem with the 'unlocked' firmware does not have this problem. A HG612 modem that has probably not been automatically/remotely upgraded does not have this problem.

Side note: One theory is that the brokenness is actually happening in the street cab and not the modem. And that the new firmware in the modem which is triggering it has enabled 'link-state forwarding' on the modem's Ethernet interface.

Update
27 Nov 2013 10:09:42
This post has been a little quiet, but we are still working with BT/Openreach regarding this issue. We hope to have some more information to post in the next day or two.
Update
27 Nov 2013 10:10:13
We have also had reports from someone outside of AAISP reproducing this problem.
Update
27 Nov 2013 14:19:19
We have spent the morning with some nice chaps from Openreach and Huawei. We have demonstrated the problem and they were able to do traffic captures at various points on their side. Huawei HQ can now reproduce the problem and will investigate further.
Update
28 Nov 2013 10:39:36
Adrian has posted about this on his blog: http://revk.www.me.uk/2013/11/bt-huawei-working-with-us.html
Update
13 Jan 2014 14:09:08
We are still chasing this with BT.
Update
03 Apr 2014 15:47:59
We have seen this affect SIP registrations (which use 5060 as the source and target)... Customers can contact us and we'll arrange a modem swap.
Update
23 Apr 2014 10:21:03
BT are in the process of testing an updated firmware for the modems with customers. Any customers affected by this can contact us and we can arrange a new modem to be sent out.
Resolution BT are testing a fix in the lab and will deploy in due course, but this could take months. However, if any customers are adversely affected by this bug, please let us know and we can arrange for BT to send a replacement ECI modem instead of the Huawei modem. Thank you all for your patience.

--Update--
BT do have a new firmware that they are rolling out to the modems. So far it does seem to have fixed the fault and we have not heard of any other issues as of yet. If you do still have the issue, please reboot your modem, if the problem remains, please contact support@aa.net.uk and we will try and get the firmware rolled out to you.
Started 25 Oct 2013
Closed 23 Apr 2014 10:21:03