Summary and Impact to Customers
On Wednesday 13 March 2019, from 07:40 to 20:04, SYNAQ Securemail experienced a major
incident.
Clients were initially unable to send or receive mail and thereafter experienced mail delays.
Root Cause and Solution
The root cause of this event was the exhaustion of connections on our DNS servers. This occurred
because two authoritative name servers were not responding to DNS queries for the domains for
which they are authoritative. DNS is important to the operation of the Securemail service because
it is used to perform multiple security checks, to resolve MX records and to allow connectivity to
our platform via the relevant host names.
Because the affected servers did not respond to our queries, connections from our DNS servers to
the two authoritative name servers remained open. The SYNAQ DNS servers therefore reached their
connection limit, which prevented all further DNS queries from taking place. As a result,
no mail could be sent or received.
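The failure mechanism described above can be illustrated with a small model. This is a simplified sketch only, not a representation of the actual SYNAQ DNS software: the Resolver class, its connection_limit of 3 and the upstream_responds flag are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resolver:
    """Toy model of a resolver with a fixed number of outbound connection slots."""

    connection_limit: int  # hypothetical value, not the real limit
    open_connections: int = 0

    def query(self, upstream_responds: bool) -> str:
        if self.open_connections >= self.connection_limit:
            return "refused"  # no free slot, so the query cannot even be attempted
        self.open_connections += 1  # slot is held while waiting for a reply
        if upstream_responds:
            self.open_connections -= 1  # reply arrived, slot is released
            return "answered"
        return "hanging"  # no reply and no timeout: the slot stays occupied


resolver = Resolver(connection_limit=3)
# Three queries to an unresponsive authoritative server occupy every slot.
hung = [resolver.query(upstream_responds=False) for _ in range(3)]
# A query for a healthy, unrelated domain is now refused as well.
healthy = resolver.query(upstream_responds=True)
```

In this model a small number of unanswered queries is enough to block queries for every other domain, which matches the behaviour seen during the incident.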
To restore service while we waited for the name server provider to resolve their incident, we
implemented a temporary workaround: we redirected our DNS queries away from the root servers to
other DNS servers that still had cached results and responded with the required domain
information. This allowed mail to start flowing again and the existing backlog of mail to be
worked through.
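The idea behind the workaround, answering from an alternative server when the usual path does not respond, can be sketched as follows. This is an assumption-laden illustration rather than the change that was actually made: resolve_with_fallback, fake_query, the upstream names and the example.test record are all hypothetical, and the query function is injected so that the sketch needs no network access.

```python
from typing import Callable, Optional, Sequence


def resolve_with_fallback(
    name: str,
    upstreams: Sequence[str],
    query: Callable[[str, str], Optional[str]],
) -> Optional[str]:
    """Try each upstream in order and return the first answer received."""
    for upstream in upstreams:
        try:
            answer = query(upstream, name)
        except TimeoutError:
            continue  # this upstream did not respond in time, try the next one
        if answer is not None:
            return answer
    return None  # no upstream could answer


def fake_query(upstream: str, name: str) -> Optional[str]:
    """Stand-in for a real DNS query (hypothetical data, no network access)."""
    if upstream == "unresponsive-authoritative":
        raise TimeoutError
    cached = {"example.test": "192.0.2.10"}  # documentation-range address
    return cached.get(name)


answer = resolve_with_fallback(
    "example.test",
    ["unresponsive-authoritative", "caching-forwarder"],
    fake_query,
)
```

The sketch relies on each query being bounded by a timeout, so that an unresponsive server delays a lookup instead of blocking it indefinitely.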
At approximately 15:40, the name server provider resolved their incident and we rerouted our
DNS queries back to the root servers. Mail delivery then resumed at a normal rate, and
the backlog of mail was cleared at 20:04.
Remediation Actions
We are investigating new methods to prevent our connection limits from being reached
should a third-party DNS name server provider experience an incident.
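This report does not specify which methods are under investigation. Purely as an illustration of one common approach, the sketch below caps the number of in-flight queries per upstream server, so that a single unresponsive provider cannot consume the whole connection limit. The BoundedResolver class, its limits and the server names are hypothetical.

```python
from collections import Counter


class BoundedResolver:
    """Toy model: a global connection limit plus a cap per upstream server."""

    def __init__(self, total_limit: int, per_upstream_limit: int) -> None:
        self.total_limit = total_limit  # hypothetical values
        self.per_upstream_limit = per_upstream_limit
        self.in_flight: Counter = Counter()

    def try_open(self, upstream: str) -> bool:
        """Reserve a connection slot for the upstream, or refuse the request."""
        if sum(self.in_flight.values()) >= self.total_limit:
            return False  # global limit reached
        if self.in_flight[upstream] >= self.per_upstream_limit:
            return False  # this upstream already holds its full share
        self.in_flight[upstream] += 1
        return True

    def close(self, upstream: str) -> None:
        """Release a slot once a reply arrives or the query times out."""
        if self.in_flight[upstream] > 0:
            self.in_flight[upstream] -= 1


resolver = BoundedResolver(total_limit=10, per_upstream_limit=2)
# Five queries to an unresponsive server: only two are allowed to hold slots.
stuck = [resolver.try_open("ns1.unresponsive.test") for _ in range(5)]
# Queries to other servers can still obtain a slot.
other = resolver.try_open("ns1.healthy.test")
```

With a per-upstream cap, queries for domains served by the failing provider are still refused, but lookups for all other domains continue to be served.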