Production: Healthy, Uptime 99.998%
Preview: Degraded, Latency +342ms
Local Dev: Stable, Active Nodes: 14 Instances
System Health
Live
API: 24ms
Database: 92%
Redis Cache: 1.2ms
Webhooks: 88%
Active Incidents
1
2
Critical (2m ago): Auth-Service: High Latency in us-east-1
+2
Warning (14m ago): CDN: Cache hit rate dropped below 85%
No owner assigned
Resolved (45m ago): Database replica lag synchronized
2023-11-24 14:02:11 [INFO] GET /api/v1/auth/session - 200 OK (14ms)
2023-11-24 14:02:14 [INFO] Worker node-2: Heartbeat received
2023-11-24 14:02:45 [ERROR] Failed to connect to redis-primary (Timeout after 5000ms)
2023-11-24 14:03:01 [WARN] Circuit breaker for 'Payment-Service' opened (15% error rate)
2023-11-24 14:03:02 [INFO] Auto-scaling group: Triggering node-3 provisioning...
2023-11-24 14:03:05 [INFO] GET /health - 200 OK
2023-11-24 14:03:08 [INFO] Request ID: req_99axz2 completed
2023-11-24 14:03:12 [ERROR] Auth-Service: Token validation failed - Cluster unreachable
2023-11-24 14:03:15 [INFO] Cleaning up ghost sessions...
Listening for incoming events...
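For triage, a log stream like this one can be summarized by severity. The following is a minimal sketch, assuming each entry is available as a single line of the form "timestamp [LEVEL] message"; the names LINE_RE, LogEntry, and parse_line are hypothetical helpers and not part of any dashboard API. Lines that do not match, such as the "Listening for incoming events..." status line, are skipped.

```python
import re
from collections import Counter
from datetime import datetime
from typing import NamedTuple, Optional

# Assumed single-line layout: "YYYY-MM-DD HH:MM:SS [LEVEL] message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(INFO|WARN|ERROR)\] (.*)$"
)


class LogEntry(NamedTuple):
    timestamp: datetime
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry for a well-formed line, or None for anything else."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    raw_ts, level, message = match.groups()
    return LogEntry(datetime.strptime(raw_ts, "%Y-%m-%d %H:%M:%S"), level, message)


# Sample lines taken from the log stream, plus one non-entry status line.
sample = [
    "2023-11-24 14:02:45 [ERROR] Failed to connect to redis-primary (Timeout after 5000ms)",
    "2023-11-24 14:03:01 [WARN] Circuit breaker for 'Payment-Service' opened (15% error rate)",
    "2023-11-24 14:03:05 [INFO] GET /health - 200 OK",
    "Listening for incoming events...",
]

entries = [entry for entry in map(parse_line, sample) if entry is not None]
counts = Counter(entry.level for entry in entries)  # entries per severity level
errors = [entry.message for entry in entries if entry.level == "ERROR"]
```

Here counts holds the number of entries per level and errors holds only the ERROR messages, which is usually the first thing to read during an incident.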
Active Mitigation
Context: Auth-Service Latency
1. Isolate US-East-1 Cluster (completed): Traffic rerouted to EU-Central-1 at 14:05
2. Rollback Auth-Service to v2.4.1: 65% Replicas Ready (4 of 6 nodes)
3. Clear Redis Authentication Cache: Pending step 2 completion
4. Post-Mortem: Notify Stakeholders