Mapping Of Water Bodies Using Algorithms Of Big Data Analytics

Thursday 15th

16:55 | 17:35

Theatre 19


Keywords defining the session:

- BigData Space

- Satellite Image Processing

Takeaway points of the session:

- Capacity to process large volumes of satellite imagery.

- Capacity to extract meaningful information from big data using analytics tools.


The main objective of the project is to create a flexible platform capable of efficiently processing large volumes of satellite-borne data to produce land-related Earth Observation (EO) information. The system architecture is based on two components: the processing subsystem and the analytics subsystem.

The Processing Subsystem is in charge of retrieving and downloading satellite imagery, ingesting and cataloguing the data, orchestrating the processors that generate the EO parameters, and disseminating the generated products. The platform is designed to be deployed on regular cloud infrastructure, whose scalability lets it process vast amounts of data (including continuous updates of the final output on demand). It is modular: it can access data from different constellations, and the Earth Observation processors can be swapped so that it produces different thematic outputs simply by using different algorithms.

The Analytics Subsystem is in charge of retrieving ancillary information from multiple sources, ingesting the intermediate EO product, orchestrating the processors that apply the analytics algorithms, disseminating the generated products, and feeding results back to the Processing Subsystem to improve its algorithms. The analytics capabilities of the platform add value to the traditional Earth Observation product. Information hidden in the satellite data, relations between satellite data and other scientific sources and, finally, the innovative use of IoT and Location-Based Social Networks allow the platform to extract useful information from apparently undecipherable datasets.

We focus our work on the identification of water bodies in satellite images of the Earth. The location and persistence of surface water are affected by both climate and human activity and, in turn, affect climate, biological diversity and human wellbeing. We analyze the satellite images of a region in northeast Spain and record where and when water was present, where occurrence changed, and what form the changes took in terms of persistence and frequency.
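
As a rough illustration of the kind of per-pixel record keeping described above, the following minimal Python sketch computes three quantities for each pixel from a short time series of water/no-water flags. The definitions are assumptions made for this example, not the project's own: frequency is taken as the fraction of acquisitions in which water was detected, persistence as the longest run of consecutive acquisitions with water, and change as the difference in occurrence between the second and first half of the series. The observation tuples and the function name `pixel_statistics` are hypothetical.

```python
from collections import defaultdict

# Hypothetical observations: (pixel_id, acquisition_index, is_water).
observations = [
    ("p1", 0, True), ("p1", 1, True), ("p1", 2, False), ("p1", 3, True),
    ("p2", 0, False), ("p2", 1, False), ("p2", 2, True), ("p2", 3, True),
    ("p3", 0, False), ("p3", 1, False), ("p3", 2, False), ("p3", 3, False),
]


def pixel_statistics(obs):
    """Return per-pixel frequency, persistence and change (assumed definitions)."""
    series = defaultdict(list)
    for pixel_id, t, is_water in obs:
        series[pixel_id].append((t, is_water))

    stats = {}
    for pixel_id, samples in series.items():
        # Order the flags by acquisition index.
        flags = [is_water for _, is_water in sorted(samples)]

        # Frequency: share of acquisitions in which water was present.
        frequency = sum(flags) / len(flags)

        # Persistence: longest run of consecutive acquisitions with water.
        longest = current = 0
        for is_water in flags:
            current = current + 1 if is_water else 0
            longest = max(longest, current)

        # Change: occurrence in the later half minus the earlier half.
        half = len(flags) // 2
        if half == 0:
            change = 0.0
        else:
            early = sum(flags[:half]) / half
            late = sum(flags[half:]) / (len(flags) - half)
            change = late - early

        stats[pixel_id] = {
            "frequency": frequency,
            "persistence": longest,
            "change": change,
        }
    return stats


stats = pixel_statistics(observations)
for pixel_id in sorted(stats):
    print(pixel_id, stats[pixel_id])
```

A positive change value marks a pixel where water became more common over the period, a negative one where it receded. At the scale described in the session, the same grouping would be expressed as a distributed aggregation rather than an in-memory loop.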

Each of the images we work with is composed of one hundred and twenty million pixels. Each image is converted into a sparse matrix, which is stored in a Hive table. Each row of the table holds the information related to one pixel of the image. Using Spark, we carry out two levels of analysis: pixel level and water-body level. At pixel level, we analyze water persistence and frequency; at water-body level, we study temporal evolution and measure oscillations in surface water volume.
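
The step from pixel-level rows to water-body-level analysis can be sketched as follows. This is a small in-memory Python illustration, not the Spark and Hive implementation used in the project. It assumes a hypothetical sparse layout in which only water pixels are stored, one row per pixel as `(date, row, col)`, groups 4-connected pixels into water bodies with a breadth-first flood fill, and reports the area of each body in pixels per date. Pixel area is used here only as a simple stand-in for the surface water measurements mentioned above; the names `water_rows`, `label_water_bodies` and `areas_by_date` are made up for the example.

```python
from collections import defaultdict, deque

# Hypothetical sparse rows: one (date, row, col) entry per water pixel.
water_rows = [
    ("2017-01", 0, 0), ("2017-01", 0, 1), ("2017-01", 1, 1), ("2017-01", 5, 5),
    ("2017-07", 0, 0), ("2017-07", 5, 5),
]


def label_water_bodies(pixels):
    """Group (row, col) water pixels into 4-connected components."""
    remaining = set(pixels)
    bodies = []
    while remaining:
        seed = remaining.pop()
        body = {seed}
        queue = deque([seed])
        while queue:
            r, c = queue.popleft()
            for neighbour in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    body.add(neighbour)
                    queue.append(neighbour)
        bodies.append(body)
    return bodies


def areas_by_date(rows):
    """Return, for each date, the water-body areas in pixels, largest first."""
    by_date = defaultdict(set)
    for date, r, c in rows:
        by_date[date].add((r, c))
    return {
        date: sorted((len(body) for body in label_water_bodies(pixels)), reverse=True)
        for date, pixels in by_date.items()
    }


areas = areas_by_date(water_rows)
for date in sorted(areas):
    print(date, areas[date], "total:", sum(areas[date]))
```

Comparing the per-date areas gives a simple view of temporal evolution: in this toy data the larger body shrinks from three pixels to one between the two dates, while the smaller one is unchanged.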