Summary and Impact to Customers
On Tuesday 2nd January 2017 from 09:30 to 11:20, SYNAQ Cloud Mail experienced an LDAP replication incident.
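The resultant impact of the event was delayed inbound and outbound email, and the Admin and Webmail Consoles and Mail Clients becoming inaccessible.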
Root cause and Solution
The root cause of this event was a bug that we discovered in both of the Master LDAP servers. The bug left the slave LDAP servers unable to synchronise with the Masters, which in turn prevented them from processing the queries that the mail transport servers need in order to deliver email. As a result, inbound and outbound emails were delayed.
In addition, the Admin and Webmail Consoles and Mail Clients were unable to query the LDAP servers for authentication and account information, which made them inaccessible.
To resolve the issue, we remotely restarted the affected Master LDAP servers, which caused the slave LDAP servers to resynchronise with the Masters. Once synchronisation completed, mail queries were processed and the mail backlog was cleared. The Admin and Webmail Consoles and Mail Clients were also once again able to query the LDAP servers for authentication.
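As an illustration of how a master/slave synchronisation gap like the one described above might be detected, here is a minimal sketch. It assumes the LDAP servers expose an OpenLDAP-style contextCSN value, whose first field is a UTC timestamp. The function names, the 60-second threshold and the sample values are hypothetical and are not taken from the incident itself.

```python
from datetime import datetime, timezone


def csn_timestamp(csn: str) -> datetime:
    """Return the UTC timestamp encoded in an OpenLDAP-style contextCSN.

    Assumed format: YYYYmmddHHMMSS.ffffffZ#count#sid#mod
    """
    stamp = csn.split("#", 1)[0]  # keep only the timestamp field
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def replication_lag_seconds(master_csn: str, replica_csn: str) -> float:
    """Seconds by which the replica's last applied change trails the master's."""
    delta = csn_timestamp(master_csn) - csn_timestamp(replica_csn)
    return delta.total_seconds()


def is_in_sync(master_csn: str, replica_csn: str, max_lag_seconds: float = 60.0) -> bool:
    """True when the replica is within the (hypothetical) allowed lag."""
    return replication_lag_seconds(master_csn, replica_csn) <= max_lag_seconds


# Hypothetical sample values, for illustration only.
master = "20170102112000.000000Z#000000#001#000000"
replica = "20170102093000.000000Z#000000#001#000000"
print(replication_lag_seconds(master, replica))
print(is_in_sync(master, replica))
```

A check of this kind, run periodically against each slave, could raise an alert as soon as a replica falls behind, rather than when mail delivery or authentication starts to fail.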
We logged an incident ticket with our Zimbra upstream provider to investigate the root cause of the issue. They confirmed that it was caused by a known bug in LDAP and provided us with the fix.