Philly ETE 2020 – Jamie Allen – What is Site Reliability Engineering (SRE)?


Check out our YouTube playlist to watch all the talks from Emerging Technologies for the Enterprise 2020.

Abstract

In the past several years, many of the hot tech firms have adopted Google’s approach of Site Reliability Engineering to improve their customers’ experience with their applications and services. What is SRE, and how does it differ from the concept of DevOps?

In this talk, I will outline how SRE builds on the concept of engineers owning their code in production, and describe the core principles and activities of the discipline itself. We will discuss the 4 Golden Signals, the difference between SLIs, SLOs, and SLAs, how to define your Error Budget, observability, and more.
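
As a rough illustration of how an SLI, an SLO, and an Error Budget relate, here is a minimal sketch. It is not taken from the talk: the function names, the request-based availability SLI, and the 99.9% target are all assumptions chosen for the example.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: the measured fraction of requests that succeeded."""
    return good_requests / total_requests


def error_budget(slo_target: float, total_requests: int) -> int:
    """Error budget: how many requests may fail in the window
    while still meeting the SLO target (hypothetical helper)."""
    return round((1.0 - slo_target) * total_requests)


def budget_remaining(slo_target: float, total_requests: int,
                     failed_requests: int) -> int:
    """Budget left after subtracting the failures observed so far."""
    return error_budget(slo_target, total_requests) - failed_requests


# Assumed numbers: a 99.9% availability SLO over one million requests,
# of which 600 have failed so far.
SLO_TARGET = 0.999
TOTAL = 1_000_000
FAILED = 600

sli = availability_sli(TOTAL - FAILED, TOTAL)
remaining = budget_remaining(SLO_TARGET, TOTAL, FAILED)
print(f"SLI={sli:.4f}, budget={error_budget(SLO_TARGET, TOTAL)}, remaining={remaining}")
```

In this sketch the SLO is an internal target on the SLI, and the Error Budget is simply the allowed shortfall from that target. An SLA would be a separate, contractual commitment, typically set looser than the SLO.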

About Jamie Allen

Jamie Allen is the Senior Director of Delivery Management at EPAM Systems, a global systems integrator based in Newtown, PA. In previous lives, Jamie has been a consultant at Chariot Solutions, the global head of consulting and training for Typesafe/Lightbend (creators of Scala and Akka), and Director of Engineering at Starbucks, where he was responsible for reimplementing the Starbucks Rewards and Mobile Order/Pay backend systems from scratch in the cloud with microservices.

He has also been a Production Engineering leader at Facebook in the Core Systems group, helping manage SRE for the shared platform used by all Facebook services. Jamie is the author of Effective Akka and co-author of Reactive Design Patterns.