Service outage
Incident Report for Bloomerang
Postmortem

Overview and Timeline

On January 18th at 4:17pm EST, Bloomerang teams received alerts that customers were unable to log in to the CRM application. Our incident management team began triage by 4:21pm EST, and additional teams were assembled to review their areas and help identify the cause. The investigation determined that a production database change implemented at 4:10pm EST needed to be rolled back. Once the rollback plan was identified, teams restored services by 5:05pm EST.

The database change was preparatory work for a larger change planned for later that evening. Neither the change nor the rollback procedure caused any integrity issues with customer data.

Per process, our teams kept the triage open after services were restored to confirm restoration and perform any additional cleanup. During the triage, our incident management team provided updates via the external status page; the incident and triage were marked as resolved by 5:59pm EST.

Root Cause(s)

A production database change implemented at 4:10pm EST disrupted the login process.

The resolution was to roll back that database change.

Action Item(s)

Action Items | Tentative Completion Date
Internal teams responsible for the production change will perform a retrospective and identify a safe path forward to implement this particular product enhancement. | Friday, 1/19/2024
Identify additional checks for this type of change in the CRM staging environment. | Monday, 2/5/2024
The time taken to roll back could be decreased; additional rollback procedures will be incorporated for pre-deployment changes. | Monday, 2/5/2024
Posted Jan 19, 2024 - 17:28 EST

Resolved
This incident has been resolved.
Posted Jan 18, 2024 - 17:59 EST
Monitoring
A fix has been implemented and we are actively monitoring the results.
Posted Jan 18, 2024 - 17:12 EST
Update
Our team is continuing to work to restore services. We will provide additional updates as information is available.
Posted Jan 18, 2024 - 17:01 EST
Identified
We have identified the root cause of the current outage and are working to resolve it now. Further updates will be provided as soon as they are available.
Posted Jan 18, 2024 - 16:42 EST
Investigating
We’re experiencing a service outage with one or more of our listed components. Our team is currently working to restore the service. We apologize for any inconvenience. Users may be affected.
Posted Jan 18, 2024 - 16:33 EST
This incident affected: CRM.