LNS Update and Planned Upgrades
MAINTENANCE Open Broadband
STATUS
Open
CREATED
Nov 19, 09:46 AM (1¼ months ago)
AFFECTING
Broadband
STARTED
Nov 19, 09:30 AM (1¼ months ago)
REFERENCE
42728 / AA42728
INFORMATION
  • INITIAL
    1¼ months ago by Andrew

    The work outlined below will start from Saturday 23rd November.

    Background:

    Our FireBrick team has been working on the 'hang' problem that we faced with the LNSs earlier in the year.

    The nature of the problem has made investigating it very time consuming, as it is extremely difficult to reproduce. However, we do believe that a plausible cause has been identified, and code changes have been made to mitigate it.

    We have been testing this new code, both in our test lab and on a few select A&A routers, for over two months. During this time the new code has not caused the hardware to hang, where older versions of the code did.

    Our next step is to run the new code on our LNSs, the ones our customers connect to for their broadband connections.

    We plan to do this slowly, out of hours and in a couple of phases.

    We believe the cause of the hang is related to how memory is initially allocated for the tasks the FireBrick will be performing. This means that if the hardware is going to hang, it will most likely do so within the first couple of days (or first couple of hours).

    Stage one:

    We plan to upgrade only one of our LNSs at first. We will move broadband connections on to it in the early hours of the morning and then move them back off a few hours later. This means that during the day, customers will be on the normal set of LNSs.

    Then, each night, over the course of two weeks, the LNS will be power cycled and we will move an increasing number of connections over, until it is taking twice the number of connections that we'd normally run on an LNS. (We normally run LNSs at around 40% capacity, so twice the number of connections is not a problem.)
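
    As a rough check of the capacity figures above, here is a minimal Python sketch. It assumes load scales linearly with the number of connections, which is a simplification for illustration and not a statement about how a FireBrick LNS actually behaves. The function names are hypothetical.

```python
# Figure taken from the notice: an LNS normally runs at around 40% capacity.
NORMAL_LOAD_PERCENT = 40


def load_percent(multiple_of_normal, normal_load_percent=NORMAL_LOAD_PERCENT):
    """Estimated load when carrying `multiple_of_normal` times the usual connections.

    Assumes load grows linearly with connection count (a simplification).
    """
    return multiple_of_normal * normal_load_percent


def headroom_percent(multiple_of_normal, normal_load_percent=NORMAL_LOAD_PERCENT):
    """Capacity left over at the given multiple of the usual connections."""
    return 100 - load_percent(multiple_of_normal, normal_load_percent)


if __name__ == "__main__":
    # Twice the usual connections: estimated load and remaining headroom.
    print(load_percent(2), headroom_percent(2))
```

    Under that assumption, doubling the connections takes the LNS to roughly 80% of capacity, leaving about 20% spare.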

    Stage two:

    Once we have confirmed that the hang is not happening, the second phase would be to run customer connections on the upgraded LNS for a few days at a time.

    We will go through a cycle of: move connections off, reboot the LNS, move connections on, wait a few days. Repeat. We will do this with an increasing number of connections until it is taking a normal number of connections.
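
    The cycle described above can be sketched as a simple plan generator. This is only an illustration: the target fractions passed in are hypothetical (the notice does not state the actual step sizes), and the code just lists the steps without controlling any equipment.

```python
# The four steps of each cycle, in the order given in the notice.
CYCLE_STEPS = (
    "move connections off",
    "reboot the LNS",
    "move connections on",
    "wait a few days",
)


def stage_two_plan(targets):
    """Return a list of (target, step) pairs, one full cycle per target.

    `targets` are fractions of the normal connection count to run on the
    upgraded LNS; the values used below are hypothetical examples.
    """
    plan = []
    for target in targets:
        for step in CYCLE_STEPS:
            plan.append((target, step))
    return plan


if __name__ == "__main__":
    # Hypothetical ramp: a quarter, a half, then the normal number of connections.
    for target, step in stage_two_plan([0.25, 0.5, 1.0]):
        print(f"{target:.0%} of normal: {step}")
```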

    More information and to opt out:

    So as to minimise impact to customers, the work of moving connections off and on will happen overnight between 1AM and 5AM.

    As mentioned, this phase of the upgrade involves only one LNS. This will be the one named 'i.gormless'. The connections that will be moved on to 'i.gormless' will be those currently on the LNS named 'h.gormless'. If you are currently on 'h.gormless' (as shown at the top left of your line quality graph) and want to opt out, then please email support.

    Once this phase has been completed, we will review and plan the next stages.

  • UPDATE
    1 month ago by Andrew

    Overnight tests over the weekend have so far been successful. We'll continue to test during this week.

  • UPDATE
    1 month ago by Andrew

    No problems thus far. We are now at phase 2, where we are running customers on the upgraded LNS for longer periods of time. We currently have eight LNSs in use, and one (i.gormless) is running the new software. This stage will involve lines on g.gormless, h.gormless and i.gormless.

  • UPDATE
    23¾ days ago by Andrew

    Still no problems. We will upgrade two more LNSs (g.gormless and h.gormless), and will move some customers on to these over the weekend. This will happen overnight.

  • UPDATE
    18¼ days ago by Andrew

    All LNSs remain stable. Over this weekend (14-15th December) we will upgrade further LNSs and move customers between LNSs: E, F, G, H, I. This will be our last set of LNS upgrades until the new year.

  • UPDATE
    16½ days ago by Andrew

    These LNSs are now running newer software: F, G, H, I. We'll continue this work in the new year.

  • NEXT UPDATE...

    Due in 4¾ days