Manning Publications, 2023. — 266 p.
Don’t fly blind. Observability gives you actionable insights into your cloud-native systems — from pinpointing errors, to increasing developer productivity, to tracking compliance.
Observability is the difference between an error message and an error explanation with a recipe for how to resolve the error! You know exactly which service is affected, who’s responsible for its repair, and even how it can be optimized in the future. Cloud Observability in Action teaches you how to set up an observability system that learns from a cloud application’s signals, logging, and monitoring, all using free and open-source tools.
In Cloud Observability in Action you will learn how to:
Apply observability in cloud-native systems.
Understand observability signals, including their costs and benefits.
Apply good practices around instrumentation and signal collection.
Deliver dashboarding, alerting, and SLOs/SLIs at scale.
Choose the correct signal types for given roles or tasks.
Pick the right observability tool for any given function.
Communicate the benefits of observability to management.
A well-designed observability system provides insight into bugs and performance issues in cloud-native applications. It helps your development team understand the impact of code changes, measure optimizations, and track user experience. Best of all, observability can even automate your error handling so that machine users apply the fixes, with no more 3 AM calls for emergency outages.
About the technology: Cloud-native systems are made up of hundreds of moving parts. When something goes wrong, it’s not enough to know there is a problem: you need to know where it is, what it is, and how to fix it. This book takes you beyond traditional monitoring, explaining observability systems that turn application telemetry into actionable insights.
In cloud-native environments, whether a public cloud offering such as AWS or on-premises infrastructure such as a Kubernetes cluster, you typically deal with many moving parts. These range from the infrastructure layer, including compute (such as VMs or containers) and databases, to the application code that you own. Depending on your role and the environment, you may be responsible for any number of the pieces in the puzzle. Let’s have a look at a concrete example: consider a serverless Kubernetes environment in a cloud provider. In this case, both the Kubernetes control plane and the data plane (the worker nodes) are managed for you, which means that, operationally, you can focus on your application code.
About the book: Cloud Observability in Action gives you the background and techniques you need to successfully introduce observability into cloud-based serverless and Kubernetes environments. In it, you’ll learn to use open standards and tools like OpenTelemetry, Prometheus, and Grafana to build your observability system and end reliance on proprietary software. You’ll discover insights from different telemetry signals, including logs, metrics, traces, and profiles. Plus, the book’s rigorous cost-benefit analysis ensures you’re getting a real return on your observability investment.
What's inside:
Observability in and of cloud-native systems.
Dashboarding, alerting, and SLOs/SLIs at scale.
Signal types for any role or task.
State-of-the-art open-source observability tools.
Who should read this book: The book focuses primarily on developers and DevOps/site reliability engineers (SREs) who work with cloud-native applications. It is meant for anyone interested in running cloud-native applications, whether in Kubernetes or on function-as-a-service offerings such as AWS Lambda.