Summary and Impact to Customers
On Tuesday 10th November 2020 from 15:40 to 16:30, SYNAQ Cloud Mail experienced a mail authentication incident.
As a result, users authenticating over IMAP, POP3 and SMTP could not access the platform.
Root cause and Solution
The root cause of this event was an error in the deployment of a new SSL certificate into the environment. The change was tested in staging and pre-prod and passed all tests. However, staging and pre-prod use only one authentication server, and through human error the pre-prod configuration was applied to production. As a result, all production auth requests were directed to a single auth server, which could not handle the volume of requests. This deployment process has been performed many times in the past without issue.
To resolve this issue, the production configuration was reapplied, and authentication was again served by all auth servers in the cluster.
• The staging and pre-prod configuration has been updated to match production and now includes all auth servers.
• New monitoring alerts have been built and deployed to detect this scenario if it ever recurs.
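As an illustration of the kind of check described above, the following is a minimal sketch, not the actual SYNAQ tooling. It compares a configuration's list of auth servers against an expected inventory and reports a problem if servers are missing or too few are configured. The host names, function names and the minimum of two servers are all assumptions made for the example.

```python
# Hypothetical pre-deployment / monitoring check. All names and hosts below
# are illustrative assumptions, not taken from the real environment.

def missing_auth_servers(expected, configured):
    """Return the expected auth servers that are absent from the configuration."""
    configured_set = set(configured)
    return [host for host in expected if host not in configured_set]


def check_auth_config(expected, configured, min_servers=2):
    """Return a list of problems found; an empty list means the check passed."""
    problems = []
    distinct = len(set(configured))
    if distinct < min_servers:
        problems.append(
            f"only {distinct} auth server(s) configured, expected at least {min_servers}"
        )
    missing = missing_auth_servers(expected, configured)
    if missing:
        problems.append("missing auth servers: " + ", ".join(missing))
    return problems


# Hypothetical inventory of production auth servers.
PRODUCTION_AUTH_SERVERS = [
    "auth1.example.internal",
    "auth2.example.internal",
    "auth3.example.internal",
]

# A pre-prod style configuration that lists a single auth server.
single_server_config = ["auth1.example.internal"]

for problem in check_auth_config(PRODUCTION_AUTH_SERVERS, single_server_config):
    print("ALERT:", problem)
```

Run against the single-server configuration, the check reports both that too few servers are configured and which servers are missing. Run against the full production list, it reports nothing.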