A disk server has failed. This is affecting email and all web sites we host. Engineers are working on this now.
There is a major issue with one of the disk servers. We are planning to switch to a backup, but that is likely to require an engineer visit to the data centre.
Engineer is on his way to the data centre now.
This is looking more complex than expected - we have switched to the secondary controller, but there are issues with one of the disk arrays as well. Engineer still on site.
Disk array is rebuilding now. We should have email working shortly and then web pages once the disk array rebuilds.
Web space up, and mail servers being reconnected to disk array now.
Issues with web pages again, investigating.
The secondary disk server is now showing problems too. We are working on it.
This is proving to be quite a serious issue - we appear to have problems with two separate disk controllers, with some of the RAID disks, and with the file system on one of the disks. This is a very odd multiple failure, especially given that all of this is monitored constantly and was not showing any issues yesterday. We do have daily backups, so if all else fails we can restore service from backups, with some loss of recent emails or changes. At this stage we are working to repair the failed file systems before considering that move.
It looks like we have the mail file store repaired, and mail should be back online shortly.
Web pages back.
Incoming email should now be working again.
We are checking all mail and web servers now to confirm all is well again.
Obviously this sort of multiple failure is somewhat unexpected. We do have plans for new disk servers anyway, and this type of failure will be considered as part of that system design.