At Booking.com, we have a constant flow of events coming from various applications and internal subsystems. This critical data needs to be stored for real-time, medium-term, and long-term analysis. Events are schema-less, which makes it difficult to use standard analysis tools. This presentation will explain how we built a storage and analysis solution based on Riak. The talk will cover data aggregation and serialization, Riak configuration, solutions for lowering network usage, and finally, how Riak's advanced features are used to perform real-time data crunching on the cluster nodes.
- present a real use case of Riak, used here in an unusual situation
- explain the architecture choices, from the problem description to making the solution future-proof
- give a list of steps to find the right distributed database
- make the case that turn-key solutions are not enough, because we always need hackability
- show some code in fun languages: Erlang and Perl (and maybe Elixir)
- intermediate software engineers who understand the challenges of big data, real-time analytics, and distributed systems.
Developers will be happy to see some (only a bit of easy) Erlang and Perl code. Architects will understand the challenge we were facing, why we went with our solution, what it can bring to them, and what the alternatives are. Sales and marketing people will be bored and won't understand (that's on purpose).
Damien Krotkine is a software engineer at Booking.com (the world’s leading online hotel and accommodation reservations company). He currently works on the events subsystem, where he helps gather, store, manage, and analyze billions of events each day in real time. Previously, he worked in various fields, including Linux distributions, e-commerce, and online real-time advertising. He's an active member of the Perl community, maintaining some NoSQL-related modules (Redis driver, Riak client, Bloomd client...). He likes distributed systems, Erlang, Elixir, Perl, and everything related to big data.