Francesco Cesarini

Workshop: Architecting Reactive Systems for Scalability and Availability

O'Reilly Author & Founder of Erlang Solutions

You need to implement a fault-tolerant, scalable, soft real-time system with requirements for high availability. It has to be event driven and react to external stimuli, load, and failure. It must always be responsive. You have heard many success stories that suggest Erlang is the right tool for the job. And indeed it is, but while Erlang is a powerful programming language, the language on its own is not enough to bring these features together and build complex reactive systems. To get the job done correctly, quickly, and efficiently, you also need middleware, reusable libraries, tools, design principles, and a programming model that tells you how to architect and distribute your system.

In this tutorial, we will look at the steps needed to design scalable and resilient systems. The lessons learnt apply to Erlang, but are in fact technology agnostic and could be applied to most stacks, including Scala/Akka, Elixir/OTP, and others. We will focus on:

  • Distribution: This section covers how to break up your system into manageable microservices. How do you collect these microservices into nodes, which together form distributed architectural patterns, giving you your end-to-end system? What network connectivity do you use to let them communicate with each other?
  • Interfaces and state: This section covers how you define your service interfaces. What data and state do you distribute across your nodes, clusters, and data centers? And if requests fail across nodes, what is your recovery strategy?
  • Availability: You need at least two computers to make a fault-tolerant system. When dealing with fault tolerance, you have to make decisions about resilience and reliability. This section covers techniques needed to make sure your system never fails and the trade-offs you need to make in your design.
  • Scalability: When you picked your distributed pattern, decided how to distribute your data, and made choices on fault tolerance, resilience, and reliability, you also made trade-offs on scalability. This section covers the decisions you have to make and how they affect scalability, as well as how to deal with capacity planning, load regulation, and back pressure.
  • Visibility: This section covers the importance of visibility on both a business level and a system level. To achieve five-nines availability, you need preemptive support and automation. To trigger automation, you need to know the state of your system and be able to react to it as quickly as possible. This includes metrics, alarms, and notifications.

About Francesco

Francesco Cesarini is the founder of Erlang Solutions Ltd. He has used Erlang on a daily basis since 1995, starting as an intern at Ericsson’s computer science laboratory, the birthplace of Erlang. He moved on to Ericsson’s Erlang training and consulting arm, working on the first release of OTP and applying it to turnkey solutions and flagship telecom applications. In 1999, soon after Erlang was released as open source, he founded Erlang Solutions, who have become the world leaders in Erlang-based consulting, contracting, training, and systems development. Francesco has worked on major Erlang-based projects both within and outside Ericsson and, as Technical Director, has led the development and consulting teams at Erlang Solutions. He is also the co-author of 'Erlang Programming' and 'Designing for Scalability with Erlang/OTP', both published by O'Reilly, and lectures at Oxford University.

Twitter: @FrancescoC
