The cause of this disruption was BT planned work to carry out 'invasive testing' on our links. They have confirmed that the work has been completed.
They failed to inform us of this. We already have a formal complaint open regarding the previous lack of notifications, and BT have since been sending us notifications of works manually (e.g. the one for 27th March). This is being followed up with our account manager.
We do apologise to our customers who were affected by this.
We're furious.
We have had further information from BT about their work. The work was on a transmission link between two datacentres, and as part of that, all ports on devices that use the link are also disabled and re-enabled. As a result we saw one port on each pair of our host links go down and up around 15 times each, all at the same time. As the ports were not cleanly shut down by BT, this caused traffic to break and customers to drop and reconnect multiple times between midnight and 3:30AM.
The X.Witless LNS hung and restarted, which caused customers to disconnect and reconnect.
This incident is related to https://aastatus.net/42608. X.Witless had been running without incident for 104 days. However, it is not fitted with an NVMe drive and was running software that pre-dates our NVMe drive fixes. We suspect the hang was caused by these two factors.
Further work on our LNSs is being planned and updates will be posted to the status page in due course.