Technical writing, reflections, and lessons learned
Featured
Lessons learned from years of operating distributed systems at scale, covering failure modes, resilience patterns, and the mindset required to build systems that survive reality.
A practical guide to implementing Site Reliability Engineering practices in rapidly scaling teams, from on-call rotations to error budgets.
Archive