RFO 23.02.2024
Incident Summary
On the 23rd of February 2024 between 03:12:33 UTC and 06:06:33 UTC the CSIS Threat Intelligence Portal and Threat Intelligence APIs were unavailable. This was due to an exhaustion of available database connections. The issue was detected both by our automatic monitoring and CSIS MDR-A staff, who escalated the incident to the development team. The responding development team identified the root cause at 06:01 UTC and canceled the offending database transactions that were blocking other transactions from completing. This resulted in all affected services becoming available again at 06:06:33 UTC. In total the affected services were unavailable for 2 hours and 54 minutes. This brought our Threat Intelligence API availability in February to 99.48%.
Root cause details
A long-running schema change (DDL) transaction on the database storing compromised credentials coincided with a surge in compromised credentials being detected. The DDL transaction blocked the processing of the incoming compromised credentials (DML transactions). Each blocked DML transaction kept holding its database connection while it waited, which eventually exhausted all available database connections, including those needed by other services.
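The failure mode described above can be illustrated with a small, deterministic simulation. This is only a sketch under stated assumptions, not a model of the actual CSIS systems: the pool size, the arrival rate, the tick-based clock, and the names `Pool` and `simulate` are all hypothetical. It shows how writers that wait on a lock while holding a connection can drain a shared pool, so that any further request, from any service, is refused.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pool:
    """A bounded connection pool (hypothetical stand-in for the real one)."""
    size: int
    in_use: int = 0

    def acquire(self) -> bool:
        # Refuse the request when every connection is already taken.
        if self.in_use >= self.size:
            return False
        self.in_use += 1
        return True

    def release(self) -> None:
        self.in_use -= 1


def simulate(pool_size: int, ddl_ticks: int, arrivals_per_tick: int,
             ticks: int) -> Optional[int]:
    """Return the first tick at which a connection request is refused.

    Returns None if the pool never runs out within `ticks` ticks.
    """
    pool = Pool(pool_size)
    pool.acquire()  # the DDL transaction holds one connection and the lock
    blocked = 0     # DML transactions waiting on the lock, each holding a connection

    for tick in range(ticks):
        if tick == ddl_ticks:
            # The DDL finishes: its connection and all blocked writers are released.
            pool.release()
            for _ in range(blocked):
                pool.release()
            blocked = 0

        ddl_running = tick < ddl_ticks
        for _ in range(arrivals_per_tick):
            if not pool.acquire():
                return tick  # pool exhausted: this request cannot be served
            if ddl_running:
                blocked += 1    # waits on the lock and keeps its connection
            else:
                pool.release()  # completes within the tick


    return None


if __name__ == "__main__":
    # A long DDL plus a surge of writers exhausts a 10-connection pool quickly.
    print(simulate(pool_size=10, ddl_ticks=100, arrivals_per_tick=2, ticks=50))
    # A short DDL releases the lock before the pool runs out.
    print(simulate(pool_size=10, ddl_ticks=2, arrivals_per_tick=2, ticks=50))
```

In this toy model the time to exhaustion depends only on the pool size and the arrival rate while the lock is held, which is why a surge in incoming data makes a long-running schema change far more dangerous than it would be under normal load.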
Timeline:
Corrective Action Items
During the post-incident analysis we identified the following action items, which we are implementing to avoid similar incidents and to decrease our time-to-recovery in future incidents: