Architecting for Scale

This book is simple and well organised. It covers the key topics you need to address if you want to build, deploy and operate large-scale applications. I am not inventing anything here; these are the five sections of the book:

  • Availability: learn techniques for building highly available applications, and for tracking and improving availability going forward
  • Risk management: identify, mitigate, and manage risks in your application, test your recovery/disaster plans, and build out systems that contain fewer risks
  • Services and microservices: understand the value of services for building complicated applications that need to operate at higher scale
  • Scaling applications: assign services to specific teams, label the criticalness of each service, and devise failure scenarios and recovery plans
  • Cloud services: understand the structure of cloud-based services, resource allocation, and service distribution

Lee Atchison is very good at giving an overview of all these subjects. Without going into detail, he covers the essential points. The book is therefore a very good introduction for newcomers to the field, but it also works as a reference because it provides simple, fairly well-crafted definitions that most people can agree on. Beyond the definitions, I have used it several times as a toolbox, for example to build a risk analysis or a map of services with their dependencies and their criticality. In these cases it gives the whole structure to follow and the key points to address (for example, the classification into service tiers); all that remains is to follow the guide. The book is simple and clear, and it is built for understanding because it progresses in sequential steps. In short, it is a very good first read that will become a handbook and that leads on to further reading such as Release It! [1], a must, or Site Reliability Engineering [2].

Lee Atchison, Architecting for Scale (O’Reilly, 2016)

  1. Michael T. Nygard, Release It! (Pragmatic Bookshelf, 2007)
  2. Collective work, Site Reliability Engineering (O’Reilly, 2016)