Keynote | Technical
End-to-End “Exactly Once” with Heron & Pulsar
Thursday 16th | 17:20 - 17:50 | Theatre 25
Heron is an open-source streaming engine used by Twitter, Microsoft and Google to process billions of events every day. Events are processed the moment they are generated to provide results immediately. Streamlio has recently collaborated with Twitter to add exactly-once processing to Heron. However, to truly provide exactly-once, the solution must be end-to-end. Streamlio solves this piece of the puzzle with Apache Pulsar (incubating), an enterprise-grade messaging system from Yahoo!.
In this talk, Ivan will explain what exactly-once is, how it is implemented in Heron, and how Pulsar enables the Streamlio platform to provide exactly-once from end to end.
Faster event processing means your business can react to changes more quickly. However, this speed often comes at the cost of accuracy, with some events lost and some processed multiple times. There are use cases where it is absolutely essential that an event is processed exactly once. Streamlio has recently collaborated with Twitter to add exactly-once processing to Heron.
However, event processing is only one piece of the puzzle. To provide end-to-end exactly-once, event production must be idempotent. Furthermore, it must be possible to replay events from an arbitrary point to handle failure cases. Pulsar solves exactly this problem. Pulsar is an enterprise-grade messaging system, originally created, and heavily used in production, by Yahoo!. Pulsar is now an Apache incubating project.
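The two requirements above, idempotent production and replay from an arbitrary point, can be illustrated with a toy in-memory log. This is a minimal sketch, not Pulsar's or Heron's API: the class DedupLog, its methods, and the function recover_total are hypothetical names invented for illustration, and deduplication by per-producer sequence number is assumed here as one common way to make production idempotent.

```python
class DedupLog:
    """Toy in-memory message log (hypothetical; not Pulsar's API)."""

    def __init__(self):
        self.entries = []    # payloads in the order they were accepted
        self.last_seq = {}   # producer name -> highest sequence id accepted

    def append(self, producer, seq, payload):
        """Accept a message once; a retry with an already-seen seq is dropped."""
        if seq <= self.last_seq.get(producer, -1):
            return False     # duplicate send, ignored
        self.last_seq[producer] = seq
        self.entries.append(payload)
        return True

    def replay(self, offset):
        """Return every payload from the given offset onward."""
        return list(self.entries[offset:])


def recover_total(log, checkpoint_offset, checkpoint_total):
    """Rebuild a running sum after a failure.

    Assumes the consumer checkpointed its state (the total) together with
    the offset of the next unread entry, so replaying from that offset
    counts each event exactly once.
    """
    total = checkpoint_total
    for value in log.replay(checkpoint_offset):
        total += value
    return total


log = DedupLog()
log.append("producer-a", 0, 10)
log.append("producer-a", 1, 20)
log.append("producer-a", 1, 20)   # retry after a lost acknowledgement
log.append("producer-a", 2, 30)

# The consumer had processed one entry (total 10) before it crashed.
recovered = recover_total(log, checkpoint_offset=1, checkpoint_total=10)
```

In this sketch the retried send does not create a second entry, and restarting from the checkpointed offset neither skips nor double-counts an event. A real system also has to persist the log, the sequence numbers, and the checkpoint durably, which the sketch leaves out.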
In this talk, Ivan will discuss use cases and tradeoffs of exactly-once, and how it differs from the other semantics in Heron. He will describe the architecture of Pulsar and the guarantees it provides. Finally, he will show how Streamlio has combined Heron and Pulsar to create an end-to-end solution for exactly-once stream processing.