Customer Monitoring Service Degradation

MAJOR | RESOLVED

The PROBE_SECRET environment variable was not configured in the API worker, causing all probe regions to receive 401 Unauthorized responses when fetching monitors. Customer monitors were not being checked during this period. Platform monitoring was unaffected.

Started

Sat, Dec 28, 2024, 8:43 AM UTC

Resolved

Sun, Dec 29, 2024, 3:50 AM UTC

Duration

19h 7m

Affected Services
Customer Monitoring, Probes API

Incident Timeline

Update: Sun, Dec 29, 2024, 3:50 AM UTC

This incident has been resolved. All customer monitors are now being checked normally. We apologize for the approximately 19-hour monitoring gap.

Update: Sun, Dec 29, 2024, 3:46 AM UTC

Fix deployed: Added PROBE_SECRET to the API worker. Monitoring has resumed across all 10 probe regions.

Update: Sun, Dec 29, 2024, 3:40 AM UTC

Root cause identified: The PROBE_SECRET environment variable was missing from the API worker configuration. Lambda probes were receiving 401 Unauthorized responses when fetching the monitor list.

Update: Sat, Dec 28, 2024, 8:43 AM UTC

We are investigating reports of customer monitors not updating. Platform monitoring appears to be functioning normally.
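
Illustrative sketch (not the actual service code): the root cause was a missing PROBE_SECRET variable that only showed up as 401 responses to probes. One way to surface this class of misconfiguration earlier is to validate the variable when the worker starts, so that a missing secret fails the deploy rather than silently rejecting every probe. The Python below is a minimal sketch under stated assumptions: the function names `load_probe_secret` and `authorize_probe`, the `MissingConfigError` type, and the shared-token comparison are hypothetical and do not describe how the API worker is really implemented.

```python
import hmac
import os


class MissingConfigError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


def load_probe_secret(env=None):
    """Return PROBE_SECRET, or raise at startup if it is not configured.

    `env` defaults to os.environ; passing a dict makes the check testable.
    """
    env = os.environ if env is None else env
    secret = env.get("PROBE_SECRET", "").strip()
    if not secret:
        # Failing here stops the worker from starting with a broken config.
        raise MissingConfigError("PROBE_SECRET is not configured")
    return secret


def authorize_probe(presented_token, expected_secret):
    """Return 200 if the probe's token matches the secret, otherwise 401.

    Assumes probes send the shared secret as a token (hypothetical scheme).
    hmac.compare_digest performs a constant-time comparison.
    """
    if hmac.compare_digest(presented_token.encode(), expected_secret.encode()):
        return 200
    return 401
```

Calling `load_probe_secret()` once during worker startup would turn a missing variable into an immediate, visible failure instead of a stream of 401 responses across probe regions.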