from 15:40 to 16:20
We are continuously confronted with increasing volumes of data coming from various sources. We may get it in text formats such as JSON or CSV, or as server logs. Sometimes it is compressed or spread across a tree of subdirectories. We also have it in relational and non-relational data stores. Is it possible to quickly explore all that variety of data directly with SQL, without involving expensive and complex infrastructure?
Apache Drill is a low-latency, distributed, schema-free SQL query engine for large-scale datasets, including structured and semi-structured or nested data. Drill is designed to scale to several thousand nodes and query petabytes of data at the speeds that BI and analytics environments require. Drill also provides a high-performance Java API that can be used to develop custom functions.
This session will give a practical introduction to Apache Drill. In live demos we will dive into its querying capabilities across various data formats and sources, as well as the configuration of the tool. By the end we will be fully prepared to start exploring our own Big Data with SQL and Apache Drill.
Atos Consulting Switzerland, Senior consultant