Distributed systems are defined by the legendary Leslie Lamport as:
A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.
Facetious remarks aside, when talking about distributed systems we mean systems in which multiple machines take part in solving a problem. The field is complex and somewhat abstruse, making it difficult for people new to it to see the connection between theory and practice. Most college courses, in particular, focus on the theoretical aspects and place little emphasis on how the concepts learned can be leveraged to build practical systems.
Yet understanding, if not mastering, distributed systems is more important today than ever before. Big tech companies (Google, Facebook, Twitter, Amazon, etc.) operate systems that handle vast amounts of data. But even startups and small companies often need to design systems that can scale seamlessly as the number of users increases.
The aim of this post is to provide a (relatively) beginner-friendly collection of resources, mostly original papers, each with about a paragraph of commentary on the high-level ideas it presents. The papers are arranged in what seems to me an intuitive way, starting with the foundations, and progressing all the way to real-world systems.
Other reading lists of this sort have surfaced over the years, such as the ones from Cloudera’s Henry Robinson, and AWS’s Swami Sivasubramanian and Marc Brooker, with which this list has many recommendations in common. The reader is strongly encouraged to consult the posts of these distinguished industry practitioners.
Needless to say, the post makes no pretense of being comprehensive. Many outstanding papers and resources, and even entire topics such as Byzantine failures, have been left out, since including them would have made the post unwieldy. Furthermore, no blog post can be an adequate substitute for exposure to distributed systems in industry (or possibly academia). I hope, however, that this post, with its focus on concepts and their applications, will provide a good place to start for someone new to the field.