2nd Edition. — O’Reilly Media, Inc., 2025. — 390 p. — ISBN: 978-1-098-15635-0.
The views expressed in this work are those of the author and do not represent the publisher’s views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your risk. If any code samples or other technology this work contains or describes is subject to open-source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
This book is organized into five parts as follows:
Part I, “Foundational Concepts”
Chapters 1 and 2 introduce distributed systems as well as some fundamental concepts which are essential to understanding the distributed system designs described in Part II, “Single-Node Patterns”.
Part II, “Single-Node Patterns”
Chapters 3 through 5 discuss reusable patterns and components that occur on individual nodes within a distributed system. They cover the sidecar, adapter, and ambassador single-node patterns.
Part III, “Serving Patterns”
Chapters 6 through 8 cover multinode distributed patterns for long-running serving systems like web applications. Patterns for many different types of serving systems, including basic replication, sharding, and work sharing, are discussed. Additionally, Chapters 9 and 10 discuss essential distributed concepts like functions, event-driven programming, and leader election.
Part IV, “Batch Computational Patterns”
Chapters 11 through 13 cover distributed system patterns for large-scale batch data processing, including work queues, event-based processing, and coordinated workflows.
Part V, “Universal Concepts”
The book concludes with several topics that are universal to all distributed systems. Chapter 14 covers logging, monitoring, and alerting for your application; Chapter 15 provides a survey of AI infrastructure; and Chapter 16 describes many common failures and design errors that occur over and over again as we build distributed systems.