Revision 122
Articles and updates:
Is Your Kubernetes Infrastructure Resilient? Test It with a Chaos Day (link)
Hot Take: I Want Execs Closer to Incidents, Not Farther (link)
Three Guiding Lights on Sustaining Resilience (link)
Going beyond MTTx and measuring “good” incident management (link)
OpenTelemetry: Solving Modern Application Monitoring Challenges (link)
What LLMs can do for SREs in Cloud Native Infrastructure (link)
Kelsey Hightower on Nix vs. Docker: Is There a Different Way? (link)
Kagent: Bringing Agentic AI to Cloud Native (link)
Scale Microservices Testing Without Duplicating Environments (link)