Sean T Allen

How did I get here? Building Confidence in a Distributed Stream Processor

VP of Engineering @ Sendence

When we build a distributed application, how do we gain confidence that our results are correct? We can test our business logic over and over, but if the engine executing it isn't trustworthy, we can't trust our results.

How can we build trust in our execution engines? We need to test them. It's hard enough to test a stream processor that runs on a single machine; it gets even more complicated when that stream processor is distributed. As Kyle Kingsbury's Jepsen series has shown, we have a long way to go in creating tests that can provide confidence that our systems are trustworthy.

At Sendence, we're building a distributed streaming data analytics engine that we want to prove is trustworthy. This talk will focus on the various means we have come up with to create repeatable tests that let us start trusting that our system gives correct results. You’ll learn how to combine repeatable programmatic fault injection, message tracing, and auditing to create a trustworthy system. Together, we’ll move through the design process, repeatedly answering the questions “What do we have to do to trust this result?” and “If we get the wrong result, how can we determine what went wrong so we can fix it?” Hopefully you’ll leave this talk inspired to apply a similar process to your own projects.
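To give a flavor of how those pieces fit together, here is a minimal sketch in Python (illustrative only, not Sendence's actual tooling or engine code): a seeded random source makes the fault injection repeatable, every message carries an id so it can be traced, and an auditing step compares what the sink received against what the source sent.

```python
# Minimal sketch: repeatable fault injection + message tracing + auditing.
# All names here are hypothetical; the real system under test would be a
# distributed stream processor, not this toy single-process pipeline.
import random

def run_pipeline(messages, inject_faults, seed=42):
    """Push (msg_id, value) pairs through a toy 'double each value' stage,
    dropping or duplicating messages in a repeatable (seeded) way."""
    rng = random.Random(seed)          # same seed -> same faults on every run
    delivered = []
    for msg_id, value in messages:
        if inject_faults and rng.random() < 0.10:
            continue                   # simulate a dropped message
        delivered.append((msg_id, value * 2))
        if inject_faults and rng.random() < 0.05:
            delivered.append((msg_id, value * 2))  # simulate a duplicate
    return delivered

def audit(sent, received):
    """Use the trace ids to report which messages were lost or duplicated."""
    sent_ids = {msg_id for msg_id, _ in sent}
    seen = {}
    for msg_id, _ in received:
        seen[msg_id] = seen.get(msg_id, 0) + 1
    lost = sorted(sent_ids - seen.keys())
    dupes = sorted(m for m, n in seen.items() if n > 1)
    return lost, dupes

if __name__ == "__main__":
    source = [(i, i) for i in range(1, 101)]        # 100 traced messages
    out = run_pipeline(source, inject_faults=True)  # repeatable faulty run
    lost, dupes = audit(source, out)
    print(f"lost: {lost}")
    print(f"duplicated: {dupes}")
```

Because the faults are driven by a fixed seed, any failure the auditor reports can be reproduced exactly, which is what makes the test results debuggable rather than merely alarming.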

Talk objectives:

- Understand the need to verify distributed systems.
- Learn approaches and techniques for verifying distributed systems.
- Understand some of the particular challenges, and solutions, involved in verifying stream processing systems.

Target audience:

- Developers and architects interested in practical approaches to verifying correctness in distributed systems.