Big Data Spain

17th ~ 18th NOV 2016 MADRID, SPAIN #BDS16

Efficient Big Data Exploration with SQL and Apache Drill

Thursday 17th

from 15:40 to 16:20

Theatre 20



We are continuously confronted with increasing volumes of data coming from various sources. It may arrive in text formats such as JSON or CSV, or as server logs. Sometimes it is compressed or spread across a tree of subdirectories. It also resides in relational and non-relational data stores. Is it possible to quickly explore all that variety of data directly with SQL, without involving expensive and complex infrastructure?

Apache Drill is a low-latency, distributed, schema-free SQL query engine for large-scale datasets, including structured and semi-structured/nested data. Drill is designed to scale to several thousand nodes and query petabytes of data at the speeds that BI/Analytics environments require. Drill also provides a high-performance Java API that can be used to develop custom functions.
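To give a flavour of what exploring data with SQL through Drill can look like, here is a minimal sketch that submits a query to Drill's REST endpoint from Python. It assumes a Drillbit running locally with the web interface on the default port 8047 and uses the sample `employee.json` file exposed by Drill's classpath (`cp`) storage plugin. The helper names `build_query_request` and `run_query` are hypothetical and chosen only for this illustration.

```python
import json
import urllib.request

# Assumption: a local Drillbit with the REST interface on the default port.
DRILL_URL = "http://localhost:8047/query.json"


def build_query_request(sql, url=DRILL_URL):
    """Build (but do not send) the HTTP POST request for a SQL query."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL):
    """Send the query to Drill and return the list of result rows."""
    with urllib.request.urlopen(build_query_request(sql, url)) as response:
        return json.load(response).get("rows", [])


if __name__ == "__main__":
    # Query a JSON file directly, with no schema definition up front.
    for row in run_query("SELECT * FROM cp.`employee.json` LIMIT 3"):
        print(row)
```

The same pattern should apply to files on a local or distributed file system by replacing the `cp` table reference with a path under a configured storage plugin, for example the `dfs` plugin.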

This session will give a practical introduction to Apache Drill. During live demos we will dive into its querying capabilities across various data formats and sources, as well as the configuration of the tool. By the end we will be fully prepared to start exploring our own Big Data with SQL and Apache Drill.


Jonatan Kazmierczak

Atos Consulting Switzerland, Senior Consultant