Chat outage
Incident Report for Ada
Resolved
CHATTER EXPERIENCE SUMMARY:

- From 7:20 a.m. to 9:14 a.m. EDT and from 10:40 a.m. to 11:10 a.m. EDT, the bot was unresponsive.

DASHBOARD EXPERIENCE SUMMARY:

- The bot builder dashboard was disabled from 9:50 a.m. to 11:15 a.m. EDT.
- Your analytics may show increased volumes due to duplication of some conversations.
- Transcripts in the Conversations View may not be in chronological order between 9:14 a.m. and 11:10 a.m. EDT.

EVENT SUMMARY:

Starting at around 7:20 a.m. EDT on April 30, 2021, chatters interacting with our chat service began experiencing errors and significant latency.

We identified that a vendor-managed database update was under-resourced for the capacity needs of our service. The vendor's support team joined our incident response at 7:38 a.m. EDT to begin increasing database resources. The vendor reported that, due to a system limitation on their side, the resources could not be added directly to the existing database. Instead, the vendor would need to perform a time-consuming database copy.

In the interest of bringing chat back online as soon as possible, we decided to begin using a backup database from the previous day. Chat service was restored using our backup database at 9:14 a.m. EDT.

Between 9:50 a.m. and 11:15 a.m. EDT we intentionally limited access to the Ada Dashboard to minimize lost work.

At 10:40 a.m. EDT we paused chat services again to switch back to our primary database.

Chat service was fully functional at 11:10 a.m. EDT.
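
For reference, the outage windows reported above can be totalled with a few lines of arithmetic. The following is a minimal Python sketch, not part of our tooling: the times are taken from this report (the 7:20 a.m. start is approximate), and the helper name `window` and the variable names are illustrative only.

```python
from datetime import datetime, timedelta

# All times are EDT on the day of the incident, as stated in this report.
DAY = "2021-04-30"
FMT = "%Y-%m-%d %H:%M"


def window(start: str, end: str) -> timedelta:
    """Length of a same-day window given 24-hour 'HH:MM' start and end times."""
    return (datetime.strptime(f"{DAY} {end}", FMT)
            - datetime.strptime(f"{DAY} {start}", FMT))


# The two periods during which the bot was unresponsive.
chat_outages = [("07:20", "09:14"), ("10:40", "11:10")]
chat_downtime = sum((window(s, e) for s, e in chat_outages), timedelta())

# The period during which the dashboard was intentionally limited.
dashboard_downtime = window("09:50", "11:15")

print(chat_downtime)       # 2:24:00
print(dashboard_downtime)  # 1:25:00
```

Under these figures, chat was unresponsive for roughly 2 hours 24 minutes in total, and the dashboard was limited for 1 hour 25 minutes.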

We are currently working to restore to our production database the full conversation data for chats that occurred while we were running on our backup database.

A visual timeline of the incident is available here: https://static.ada.support/outage-timeline-2021-04-30.png

A full post-event summary including actions taken will be provided next week through your ACX Manager.
Posted Apr 30, 2021 - 18:19 EDT
Update
We continue to monitor chat services and perform our data sync.

An initial post-event summary will be shared by end of day Eastern Time.
Posted Apr 30, 2021 - 17:14 EDT
Update
Chat services remain fully functional.

We have reduced the data gap in chat logs to less than two hours. We are continuing efforts to sync this data from our backup database to our primary.

We will post another update by 5:00 PM EDT.
Posted Apr 30, 2021 - 13:32 EDT
Update
Chat service is fully functional.

We continue to sync data from the past 24 hours. We will post an update within two hours.
Posted Apr 30, 2021 - 11:20 EDT
Monitoring
Chat service is restored. The database changeover has been completed successfully.

Bot builders may notice that chat logs more recent than 5:00 PM EDT on April 29, 2021, are currently not available in their dashboard. We are now working to copy this data from our backup database to our main database.
Posted Apr 30, 2021 - 09:21 EDT
Update
The database reprovisioning has been completed and we are performing testing before restoring service. We expect service to be restored shortly.

Another update will be posted within 30 minutes.
Posted Apr 30, 2021 - 09:04 EDT
Update
We are implementing a database reprovisioning and are in the process of cloning data volumes.

Another update will be posted within 30 minutes.
Posted Apr 30, 2021 - 08:28 EDT
Identified
The issue has been identified and a fix is being implemented.
Posted Apr 30, 2021 - 07:54 EDT
Investigating
We are currently investigating this issue.
Posted Apr 30, 2021 - 07:34 EDT
This incident affected: Chat Client.