The series details page on the learning site was inaccessible
Incident Report for MindTickle, Inc.
Postmortem

Impact: The series details page on the learning site was inaccessible for 22 minutes, from 23:24 PT to 23:46 PT on 28th November, 2023.

Why it happened:
Two applications were deployed serially. While the deployment was only partially complete, their states did not match, which caused GraphQL queries to fail.

Incident timeline (PT):

  • 28th November, 2023 - 23:09 - Deployment triggered
  • 28th November, 2023 - 23:23 - Deployment partially completed
  • 28th November, 2023 - 23:24 - Spike in errors observed
  • 28th November, 2023 - 23:40 - By this time, the build had completed. We waited for the application to deploy.
  • 28th November, 2023 - 23:43 - Deployment completed
  • 28th November, 2023 - 23:46 - HTTP 400 error rate dropped

What we did to fix it:
Corrective steps: The state mismatch was resolved once the deployments were completed, and no further corrective action was necessary.

Preventive steps:
We defined the order of operations in deployment scripts to prevent a recurrence of this issue.
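
As an illustration only, one way an explicit deployment order might be enforced is sketched below. This is not the actual deployment script: the application names, the `deploy_in_order` helper, and the `deploy` and `is_healthy` callables are all hypothetical, since the report does not name the two applications or describe the tooling. The idea is that each application is deployed and verified before the next one starts, and the run halts if a step is not healthy.

```python
from typing import Callable, List

# Hypothetical application names; the report does not name the two applications.
DEPLOY_ORDER: List[str] = ["backend-api", "learning-site"]


def deploy_in_order(
    order: List[str],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Deploy each application in sequence, halting at the first unhealthy one.

    Returns the applications that were deployed and verified, in order.
    """
    completed: List[str] = []
    for app in order:
        deploy(app)
        if not is_healthy(app):
            # Applications after the failed one are never started.
            remaining = order[len(completed) + 1:]
            raise RuntimeError(
                f"{app} is not healthy; halting before deploying: {remaining}"
            )
        completed.append(app)
    return completed


if __name__ == "__main__":
    # Stand-in callables so the sketch runs without real infrastructure.
    log: List[str] = []
    done = deploy_in_order(DEPLOY_ORDER, deploy=log.append, is_healthy=lambda app: True)
    print(done)
```

A fixed order on its own does not remove the window in which only one application has been updated, so a script like this would normally be paired with changes that stay compatible with the version still running.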

Posted Dec 28, 2023 - 18:37 PST

Resolved
The series details page on the learning site was inaccessible. End users could not access the Series to complete their programs.
Posted Nov 28, 2023 - 09:54 PST