As software grows more complex, understanding how it performs in production becomes an important part of software development. If you want to better understand your applications, this class is for you.
In this workshop, we'll cover the basics of monitoring, what to monitor, how to visualize it, and how to alert when things go wrong. We'll also look at how to combine metrics with logs to troubleshoot production issues.
During the day, we'll talk about:
You'll also learn how to use Grafana, an open source platform for monitoring and observability, to set up your monitoring solution.
After this workshop:
You'll have an understanding of how to monitor your software
You'll be able to use Grafana to create dashboards and alerts
You'll be able to use metrics and logs together to debug running software
Prerequisites: Basic experience with web services.
Marcus Olsson is a Developer Advocate at Grafana Labs, the company behind Grafana. Before that, he consulted teams on cloud native practices and infrastructure. He's also known to draw gophers and organize Go meetups.
Observability is one of the key properties of modern applications. Typically, when we talk about it, we mean logs, metrics, and traces. By definition, however, observability is a measure of our ability to understand the current state of a system or any of its components. In this workshop, we will see how applications themselves can contribute to the observability of the whole system.
We will discuss the importance of observability and the role of the application in building observable systems. During the workshop, we will focus on practical examples and cases. We will start with an application that doesn’t provide any observability and improve it step by step. By the end of the workshop, our application will have a significantly better level of observability. The ideas and techniques of this workshop apply to different tools, but for simplicity we will use Datadog to analyze the data. If you would like to work with your own code, feel free to bring it to the workshop. Otherwise, you will be provided with a typical web service.
Elena is an Engineering Manager at GetYourGuide and a co-host of the GolangShow podcast. With 12+ years of overall experience in IT, she values DevOps culture and is passionate about automation, software architecture, and site reliability engineering.
Reliability is the most important feature of a service. In a perfect world, our services would have 100% reliability and software bugs would never exist. Sadly, we have to account for failures, bugs, and other barriers when running a service. This workshop will introduce you to measuring site reliability through examples and hands-on experience. It is meant for both technical and non-technical attendees interested in taking a deep dive into the SRE workflow.
Over the course of this workshop, we will define reliability from the perspective of your users, introduce error budgets, and take a hands-on approach to defining and developing SLIs and SLOs. We will work through an example and gradually refine our SLOs. This is an interactive workshop, and questions are welcome throughout.
All workshop materials will be provided; just bring your note-taking tools.
Ishuah Kariuki is a Senior Software Engineer and SRE lead at Hover. He's based in Nairobi, Kenya. Besides writing code, he's a budding triathlete, adrenaline junkie and an avid hiker.