SCHEDULE - TALK DETAIL



Keynote | Technical

Flink-Kudu connector: an open source contribution to develop Kappa architectures

Friday 17th | 13:20 - 13:50 | Theatre 25


Keywords defining the session:

- Apache Flink

- Apache Kudu

- Kappa

Takeaway points of the session:

- How to implement Kappa architectures

- How to integrate Apache Flink and Kudu

Kappa Architecture is a software architecture pattern built around an immutable, append-only log. All event processing is performed on the input streams, and the results are persisted as real-time views. Apache Flink is very well suited to be the processing engine because it supports event-time semantics and stateful exactly-once processing, and it achieves high throughput and low latency at the same time. Apache Kudu is a storage system that is good both at ingesting streaming data and at analysing it, whether through ad-hoc queries (e.g. interactive SQL) or full-scan processes (e.g. Spark/Flink). This makes Kudu a good fit for storing the real-time views in a Kappa Architecture. We have developed and open-sourced a connector that integrates Apache Kudu and Apache Flink. It allows reading data from and writing data to Kudu using Flink's DataSet and DataStream APIs. The connector has been submitted to the Apache Bahir project and is already available from the Maven Central repository.
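
To make the pattern concrete, here is a minimal sketch of the Kappa idea in plain Python. It does not use Flink, Kudu or the connector's API. The names `Event`, `AppendOnlyLog` and `build_view` are hypothetical, the in-memory list stands in for the log, and the dictionary stands in for a Kudu table holding the real-time view. The sketch assumes events carry their own event time and that the view is a per-key sum over fixed time windows.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical immutable event; event_time is in seconds."""
    key: str
    value: int
    event_time: int


class AppendOnlyLog:
    """Stand-in for the immutable log: events are appended and replayed, never modified."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> int:
        self._events.append(event)
        return len(self._events) - 1  # offset of the event just appended

    def replay(self, from_offset: int = 0) -> Tuple[Event, ...]:
        # Return a read-only copy so callers cannot alter the log.
        return tuple(self._events[from_offset:])


def build_view(log: AppendOnlyLog, window_seconds: int = 60) -> Dict[Tuple[str, int], int]:
    """Fold the whole log into a view keyed by (key, window start).

    The dictionary plays the role of the stored real-time view;
    each update is an upsert on the row key.
    """
    view: Dict[Tuple[str, int], int] = {}
    for event in log.replay():
        # Assign the event to a window by its event time, not its arrival order.
        window_start = event.event_time - event.event_time % window_seconds
        row_key = (event.key, window_start)
        view[row_key] = view.get(row_key, 0) + event.value
    return view


log = AppendOnlyLog()
log.append(Event("sensor-a", 3, event_time=5))
log.append(Event("sensor-b", 1, event_time=10))
log.append(Event("sensor-a", 4, event_time=65))
log.append(Event("sensor-a", 2, event_time=50))  # arrives late, still counted in window 0

view = build_view(log)

# Changing the logic means replaying the same log with new code,
# rather than maintaining a separate batch layer.
view_two_minutes = build_view(log, window_seconds=120)
```

The point of the sketch is the last step: because the log is never modified, a new version of the view is produced by replaying it from the beginning with the updated logic. In a real deployment, a stream processor and a storage engine would take the place of the loop and the dictionary.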