Hey's app was unavailable due to AWS returning incorrect DNS records. Mail was unaffected.
Pingdom, Nagios
app.hey.com was unavailable from 21:26 to 21:39 UTC. Some users reported lingering DNS issues for about 10 minutes after most throughput was restored. Mail was unaffected.
We use failover records to serve our friendly error pages from IP addresses in the datacenter when the Hey app is down. At the beginning of this incident, AWS was erroneously returning these failover records even though the app was up. We removed the failover record and manually set DNS to use an A record with the proper IP addresses for the app. However, for a brief period between those two changes there was no record at all, which caused AWS to return an empty reply for app.hey.com. Resolvers should have cached that empty reply only until the negative TTL expired. The negative TTL is set to 60 seconds, but some users saw empty responses for longer than that.
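To make the negative TTL concrete, here is a minimal sketch of how a resolver decides how long to remember an empty answer. Under RFC 2308, the negative caching time is the smaller of the SOA record's own TTL and its MINIMUM field. The names `SoaRecord`, `negative_ttl`, and `NegativeCache` are hypothetical, the 900-second SOA TTL is an assumed value, and only the 60-second figure comes from this report. It illustrates the expected behavior; it does not explain why some users saw empty responses for longer.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SoaRecord:
    ttl: int      # TTL of the SOA record itself, in seconds (assumed value below)
    minimum: int  # SOA MINIMUM field, in seconds


def negative_ttl(soa: SoaRecord) -> int:
    """Negative caching time per RFC 2308: min(SOA TTL, SOA MINIMUM)."""
    return min(soa.ttl, soa.minimum)


@dataclass
class NegativeCache:
    """Toy model of a resolver remembering that a name had no record."""
    _expiry: Dict[str, int] = field(default_factory=dict)

    def store_empty(self, name: str, soa: SoaRecord, now: int) -> None:
        # Remember the empty answer until the negative TTL runs out.
        self._expiry[name] = now + negative_ttl(soa)

    def is_cached_empty(self, name: str, now: int) -> bool:
        expiry = self._expiry.get(name)
        return expiry is not None and now < expiry


# Hypothetical zone settings: 900 s SOA TTL, 60 s MINIMUM (the negative TTL above).
soa = SoaRecord(ttl=900, minimum=60)
cache = NegativeCache()
cache.store_empty("app.hey.com", soa, now=0)
```

With these settings, `cache.is_cached_empty("app.hey.com", now=59)` is true and the same call with `now=60` is false, so a well-behaved resolver would query again one minute after it first received the empty reply.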
Once again, sorry for the brief hiccup! We're 99.99% on uptime for the last three months, but even a few minutes is a big deal when it comes to email.