<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>o11yparadise: observability done right</title><description>True stories of observability done right: incidents caught before customers noticed, alerts that fired exactly when they should, dashboards that told the truth, and the practices that made it all calm. Read the stories and share your own.</description><link>https://o11yparadise.com/</link><language>en-us</language><item><title>The Blameless Postmortem That Changed Everything</title><link>https://o11yparadise.com/stories/the-blameless-postmortem/</link><guid isPermaLink="true">https://o11yparadise.com/stories/the-blameless-postmortem/</guid><description>An outage could have ended with someone blamed and nothing learned. Instead a blameless postmortem turned it into three concrete fixes and a team that started surfacing risks early.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate><category>postmortem</category><category>culture</category><category>incident-response</category></item><item><title>The Canary That Caught It</title><link>https://o11yparadise.com/stories/the-canary-that-caught-it/</link><guid isPermaLink="true">https://o11yparadise.com/stories/the-canary-that-caught-it/</guid><description>A bad deploy raised the error rate the instant it touched real traffic. 
The canary saw it at 1 percent, rolled itself back automatically, and 99 percent of users never noticed.</description><pubDate>Sat, 13 Jun 2026 00:00:00 GMT</pubDate><category>rollout</category><category>canary</category><category>automation</category></item><item><title>The Runbook That Worked</title><link>https://o11yparadise.com/stories/the-runbook-that-worked/</link><guid isPermaLink="true">https://o11yparadise.com/stories/the-runbook-that-worked/</guid><description>A new hire&apos;s first night page, resolved in ten calm minutes, because the alert linked straight to a runbook that named the dashboard, the likely cause, and the exact command to run.</description><pubDate>Sat, 06 Jun 2026 00:00:00 GMT</pubDate><category>runbooks</category><category>on-call</category><category>incident-response</category></item><item><title>The Label We Did Not Add</title><link>https://o11yparadise.com/stories/the-label-we-did-not-add/</link><guid isPermaLink="true">https://o11yparadise.com/stories/the-label-we-did-not-add/</guid><description>A pull request tried to add a user ID to a metric. A linter flagged it, a reviewer explained why, and an eight-million-series cardinality explosion quietly never happened.</description><pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate><category>metrics</category><category>cardinality</category><category>culture</category></item><item><title>Tail Sampling Caught the One in Two Thousand</title><link>https://o11yparadise.com/stories/tail-sampling-caught-the-rare-one/</link><guid isPermaLink="true">https://o11yparadise.com/stories/tail-sampling-caught-the-rare-one/</guid><description>A bug that corrupted one order in two thousand would have been invisible under head sampling. 
Tail-based sampling kept every failing trace, and the root cause was found in minutes.</description><pubDate>Sat, 23 May 2026 00:00:00 GMT</pubDate><category>tracing</category><category>sampling</category><category>latency</category></item><item><title>The Page That Never Came</title><link>https://o11yparadise.com/stories/the-page-that-never-came/</link><guid isPermaLink="true">https://o11yparadise.com/stories/the-page-that-never-came/</guid><description>A week on call with zero pages. Not because nothing happened, but because error budgets, symptom-based alerts, and ruthless noise pruning meant only the things worth waking for ever woke anyone.</description><pubDate>Sat, 16 May 2026 00:00:00 GMT</pubDate><category>on-call</category><category>alerting</category><category>error-budget</category></item><item><title>The Dashboard That Told the Truth</title><link>https://o11yparadise.com/stories/the-dashboard-that-told-the-truth/</link><guid isPermaLink="true">https://o11yparadise.com/stories/the-dashboard-that-told-the-truth/</guid><description>When latency crept up, one dashboard showed the real story in seconds: a data freshness badge, SLO burn lines, and the one graph that mattered. The fix took longer to deploy than to find.</description><pubDate>Sat, 09 May 2026 00:00:00 GMT</pubDate><category>dashboards</category><category>golden-signals</category><category>slo</category></item><item><title>The Alert That Saved the Weekend</title><link>https://o11yparadise.com/stories/the-alert-that-saved-the-weekend/</link><guid isPermaLink="true">https://o11yparadise.com/stories/the-alert-that-saved-the-weekend/</guid><description>A single well-tuned alert on a slow memory climb fired on Friday afternoon, days before it would have become a Saturday-night outage. Nobody lost a weekend.</description><pubDate>Sat, 02 May 2026 00:00:00 GMT</pubDate><category>alerting</category><category>slo</category><category>capacity</category></item></channel></rss>