Degraded performance across US cluster

Incident Report for Hevo

Resolved

Thank you for your patience while we worked on this issue. We encountered an incident in which tasks for all entities (Models, Pipeline Ingestion tasks, and Destination Load tasks) were not executing as expected. The incident was caused by a recent outage on the US cluster.

We have now resolved the issue. As a result of the incident, you may have noticed delays in ingestion or loading tasks, or missed model run triggers. Going forward, all scheduled tasks will run as expected, and affected tasks will recover automatically at their next scheduled run.

Our development team has identified and implemented a fix for this issue. We appreciate your understanding, and we sincerely apologise for any inconvenience this may have caused.

Please note: if any of your pipelines or models have a long schedule interval (about 3 hours or more) and you need the data urgently, trigger them manually to make sure they run as expected.
Posted Mar 07, 2025 - 16:51 UTC

Monitoring

The US cluster has been moved out of read-only mode, and we are monitoring the changes.
Posted Mar 07, 2025 - 15:24 UTC

Identified

The issue has been identified, and a fix is currently being implemented. During this process, the US cluster will remain in read-only mode for a period of time.
Posted Mar 07, 2025 - 14:57 UTC

Investigating

We are currently investigating the issue.
Posted Mar 07, 2025 - 14:35 UTC
This incident affected: US Cluster (Data Pipelines).