
Keynote | Technical

The battle of stream processing frameworks

Thursday 16th | 16:35 - 17:05 | Theatre 20

One-liner summary:

Data processing is one of the key components of any data-driven system. The right information at the right time can affect a business to such an extent that it either grows or gets left behind in the dust of its competitors. In this presentation we will fight the battle of numbers between Apache Spark, Apache Kafka and Apache Flink and crown the king of stream processing in multiple categories. By the end, you will be able to pick the right one to solve the problem at hand.

Keywords defining the session:

- Spark

- Flink

- Kafka


Stream processing is a hot topic at the moment, and since there are a bunch of technologies providing stream processing (some more, some less), it has become a hot mess. Some of the current frameworks have stream processing as an added feature, some are a natural fit, and some are simply streaming by design. One of the most heavily used data processing frameworks of the last few years is Apache Spark; Spark Streaming was added far down the road as micro-batching functionality. Apache Kafka was initially built as a distributed, event-based data ingestion framework but slowly became a stream processing platform because of the high demand for stream processing capabilities. Apache Flink is basically the new kid on the block but is a true streaming-first framework.
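
To make the difference between micro-batching and streaming-first processing concrete, here is a minimal sketch in plain Python. It does not use the Spark, Kafka or Flink APIs; the event shape, the function names `micro_batches` and `per_event`, and the one-second interval are all assumptions chosen for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical event shape: (timestamp in seconds, value).
Event = Tuple[float, int]


def micro_batches(events: Iterable[Event], interval: float) -> Iterator[List[Event]]:
    """Group time-ordered events into fixed-interval batches (micro-batching)."""
    batch: List[Event] = []
    window_end = None
    for ts, value in events:
        if window_end is None:
            # The first event opens the first interval.
            window_end = ts + interval
        while ts >= window_end:
            # The event belongs to a later interval: emit what we have so far.
            if batch:
                yield batch
                batch = []
            window_end += interval
        batch.append((ts, value))
    if batch:
        yield batch


def per_event(events: Iterable[Event]) -> Iterator[int]:
    """Emit an updated running sum after every single event (streaming-first)."""
    total = 0
    for _, value in events:
        total += value
        yield total


events = [(0.0, 1), (0.4, 2), (1.1, 3), (1.9, 4), (2.5, 5)]

# One result per interval: results only appear when a batch closes.
batch_sums = [sum(v for _, v in b) for b in micro_batches(events, interval=1.0)]

# One result per event: results appear as soon as each event arrives.
running_sums = list(per_event(events))
```

In this toy model the micro-batch version produces one result per interval, so its latency is bounded below by the interval length, while the per-event version updates its result on every record. Real frameworks add state management, fault tolerance and event-time handling on top of either model.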