We are going to spend much of tomorrow trying to track down why things did not go smoothly tonight, and hope to have a solution by tomorrow (Tuesday) evening. This time I hope to do a test load before the peak period at 6pm, so between 5pm and 6pm, when there is a bit of a lull between business and home use. If all goes to plan there will be NO impact at all, and that is what we hope for. If so, we will update three routers in order of increasing risk of impact, and abort if there are any issues. Please follow things on IRC tomorrow. If this works as planned we will finally have all routers under the "seamless upgrade" process.
Tests on our internal systems this morning confirm we understand what went wrong last night, and as such the upgrade tonight should be seamless. For the technically minded, we had an issue with VRRP becoming master too soon, i.e. before all routes were installed. The routing logic is now linked to VRRP to avoid this scenario, regardless of how long routing takes.
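For anyone who wants to picture the fix, here is a rough sketch in Python of the idea of holding back VRRP mastership until routing is complete. This is not our router code: the Router class, its fields and the vrrp_tick method are all hypothetical names made up for illustration, and real VRRP (priorities, advertisement timers, preemption) is a good deal more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Hypothetical model of one router in a VRRP pair (illustration only)."""
    expected_routes: set                       # routes we must hold before taking traffic
    installed_routes: set = field(default_factory=set)
    state: str = "BACKUP"                      # every router starts as backup after a reboot

    def routing_ready(self) -> bool:
        # Ready only when every expected route has been installed.
        return self.expected_routes <= self.installed_routes

    def install_route(self, prefix: str) -> None:
        self.installed_routes.add(prefix)

    def vrrp_tick(self, master_seen: bool) -> str:
        # The gate: become master only if no master is advertising AND
        # routing is complete, however long the routes take to arrive.
        if self.state == "BACKUP" and not master_seen and self.routing_ready():
            self.state = "MASTER"
        return self.state
```

In this sketch a freshly rebooted router stays in BACKUP on every tick until its last expected route is installed, so it cannot start attracting traffic it is not yet able to forward.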
The upgrade went very nearly perfectly on the first router - we believe the only noticeable impact was the link to our office, which we think we understand now. However, we did only do the one router this time.