Summary and Impact to Customers
On Monday 20th June, from 11:01 to 14:30, SYNAQ Securemail experienced an inbound mail delivery incident.
As a result, inbound mail was delayed by approximately 15 minutes.
Root cause and Solution
The root cause of this event was an incompatibility between the database server kernel and newly installed server firmware. The new firmware triggered a kernel bug that rebooted the database server, making the server temporarily inaccessible. Consequently, a backlog built up in the mail queues, delaying inbound mail by approximately 15 minutes.
To resolve this issue, our engineers failed over the affected mail query load to the slave database server. Mail queries were then processed efficiently again, allowing the backlog of mail to be delivered. In addition, we identified that the kernel bug was triggered by an incompatibility between the new firmware and the existing kernel version, and we therefore upgraded the kernel to the latest version.
To prevent a recurrence, a control will be implemented to ensure that all database server components are reviewed for firmware compatibility before any new upgrades are conducted.