Happening Monday and Tuesday early mornings:
Most of our LNSs have extra diagnostic and debugging hardware installed which has been used to help us track down the problems we were having with them in early 2024. We are now in a position where we can remove this. We have planned to do this work over two nights: the early hours of February 17th and 18th.
This work will involve us moving customers off the LNSs as we do the work. In practice this means customer connections will experience a drop and reconnect at around 1AM and 4AM on the 17th and 18th February.
BT are performing "Simultaneous Upgrades" which will affect all BT ADSL, VDSL and FTTP services during the early hours of February 12th. It is disappointing that this work is a multi-site activity as we have multiple links to BT in multiple datacentres so as to prevent outages when BT have work to do.
They say:
Start: 2025-02-12 00:01. We're simultaneously upgrading some 21C network software across multiple sites. The PW window is from 00:01 until 07:00, and we will start at 00:01 with non-disruptive pre-checks. From 01:00 we will start to re-boot the devices to the new version of code. This will cause an outage to all customers of up to 15 minutes whilst the device restarts. However, if roll-back is required, the outage could exceed 1hr. All work will be complete by 07:00.
In practice, we expect customers with BT-provided connections to drop and reconnect a couple of times in the early hours of February 12th.
Here are our office opening times over Christmas and new year:
The work outlined below will start from Saturday 23rd November.
Background: Our FireBrick team has been working on the 'hang' problem that we faced with the LNSs earlier in the year.
The nature of the problem has made investigating it very time consuming, as it is extremely difficult to reproduce. However, we do believe that a plausible cause has been identified, and code changes have been made to mitigate it.
We have been testing this new code, both in our test lab and on a few select A&A routers, for over two months. During this time the new code has not caused the hardware to hang, where older versions of the code did.
Our next step is to run the new code on our LNSs, the ones our customers connect to for their broadband connections.
We plan to do this slowly, out of hours and in a couple of phases.
We believe the cause of the hang is related to how memory is initially allocated for the tasks the FireBrick will be performing. This means that if the hardware is going to hang, it will most likely do so within the first couple of days (or first couple of hours).
Stage one: (Completed)
We plan to upgrade only one of our LNSs at first. We will move broadband connections on to it in the early hours of the morning and then move them back off a few hours later. This means that during the day, customers will be on the normal set of LNSs.
Then, each night, over the course of two weeks, the LNS will be power cycled and we will move an increasing number of connections over, until it is taking twice the number of connections that we'd normally run on an LNS. (We normally run LNSs at around 40% capacity, so twice the number of connections is not a problem.)
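As a rough illustration of the capacity figures above: at around 40% normal utilisation, twice the usual number of connections works out to about 80% of capacity, which still leaves headroom. The sketch below only restates that arithmetic; the function names and the intermediate load multiples are illustrative assumptions and not the actual steps used during the rollout.

```python
# Illustrative only: the 40% figure comes from the notice above;
# the function names and the load multiples are hypothetical.
NORMAL_UTILISATION = 0.40  # LNSs normally run at around 40% capacity


def utilisation(load_multiple: float, normal: float = NORMAL_UTILISATION) -> float:
    """Fraction of LNS capacity used when carrying `load_multiple` times the normal load."""
    return load_multiple * normal


def headroom(load_multiple: float, normal: float = NORMAL_UTILISATION) -> float:
    """Fraction of LNS capacity still unused at the given load multiple."""
    return 1.0 - utilisation(load_multiple, normal)


if __name__ == "__main__":
    # Hypothetical ramp from half the normal load up to twice the normal load.
    for multiple in (0.5, 1.0, 1.5, 2.0):
        print(
            f"{multiple:.1f}x normal load: "
            f"{utilisation(multiple):.0%} used, {headroom(multiple):.0%} spare"
        )
```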
Stage two: (Completed)
Once we have confirmed that the hang is not happening, the second phase will be to run customer connections on the upgraded LNS for a few days at a time.
We will go through a cycle of: move connections off, reboot the LNS, move connections on, wait a few days, then repeat. We will do this with an increasing number of connections until the LNS is taking a normal number of connections.
Stage three:
As of the end of 2024, half our LNSs have been running the new software without any problems. From January 7th we will be doing overnight upgrades of the remaining LNSs.
More information:
So as to minimise impact to customers, the work of moving connections off and on will happen overnight between 1AM and 5AM.