Post Incident Report
On Wednesday 6th March, it was identified that the server responsible for sending mail for all Cloud Instances was offline. Any email sent from Schoolbox Cloud Instances while the server was offline would have been lost. Users would also not have received any indication, when attempting to send, that their emails would not be delivered. This outage also affected automatically sent email, such as New Digests and other notification emails.
Root Cause
The root cause was the server losing its IP address after it failed to renew its DHCP lease. The lease expiry coincided with an automated server configuration update and verification process that increased memory usage, leaving insufficient memory for the DHCP lease renewal to complete.
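To illustrate the timing coincidence described above, the following is a minimal sketch, not the tooling Schoolbox uses. It assumes the default DHCP client timers from RFC 2131 (renewal attempts begin at 50% of the lease duration, rebinding at 87.5%) and checks whether a scheduled job overlaps the period between the first renewal attempt and lease expiry. The function names, lease duration and job times are hypothetical.

```python
from datetime import datetime, timedelta


def renewal_window(lease_start, lease_seconds):
    """Return (t1, t2, expiry) for a lease, assuming RFC 2131 default timers."""
    t1 = lease_start + timedelta(seconds=lease_seconds * 0.5)    # renewal begins
    t2 = lease_start + timedelta(seconds=lease_seconds * 0.875)  # rebinding begins
    expiry = lease_start + timedelta(seconds=lease_seconds)
    return t1, t2, expiry


def overlaps_renewal(lease_start, lease_seconds, job_start, job_end):
    """True if the job runs at any point between first renewal attempt and expiry."""
    t1, _t2, expiry = renewal_window(lease_start, lease_seconds)
    return job_start < expiry and job_end > t1


# Hypothetical values: a one-hour lease obtained at midnight.
lease_start = datetime(2024, 1, 1, 0, 0, 0)
risky = overlaps_renewal(
    lease_start, 3600,
    job_start=datetime(2024, 1, 1, 0, 40, 0),
    job_end=datetime(2024, 1, 1, 1, 10, 0),
)
safe = overlaps_renewal(
    lease_start, 3600,
    job_start=datetime(2024, 1, 1, 0, 0, 0),
    job_end=datetime(2024, 1, 1, 0, 20, 0),
)
```

An overlap on its own does not cause a renewal failure. It only flags a period in which a memory-intensive job could starve the DHCP client, as happened in this incident.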
Timeline
Short Term Remediation
Long Term Remediation
Learnings/Further Actions
I want to take this opportunity to apologise on behalf of Schoolbox for any inconvenience caused. Multiple teams took part in this PIR process; we are all aware of the learnings and will continue to make ongoing improvements across our systems.