from 17:00 to 17:45
In this talk, Alex will introduce Frontera, a new open source framework. Frontera is a crawl frontier framework that tells your web crawler what to crawl and when.
It is essentially the brain of your web crawler. Frontera lets you build real-time, large-scale, distributed web crawlers.
Along with a description of the framework, Alex will share the technical problems he faced while developing it and demonstrate how to build a distributed crawler using Scrapy, Apache Kafka and HBase. The talk is organized as a fun and exciting story.