Resolved -
This incident has been resolved.
Mar 10, 17:00 CET
Monitoring -
A fix has been implemented and we are monitoring the results.
Mar 10, 16:16 CET
Identified -
We've been experiencing an elevated rate of API errors throughout Saturday and Sunday. The issue crashed one of our virtual machines in the US1 cluster once every couple of hours, causing a burst of HTTP 500 errors that affected a portion of incoming traffic until the machine self-healed and stabilized a few minutes later. The root cause has now been identified. A temporary fix is already in place in production, and the permanent fix is being implemented and is expected to be deployed within the next few hours.
Mar 10, 10:51 CET