Paco Nathan

Derwen, Inc.

Daniel Vila Suero

Get Started with NLP and AI in Python


After last year's successful course “Get Started with NLP in Python”, we now offer a second opportunity to attend an updated version of this training.

Python provides a number of excellent packages for natural language processing (NLP), along with great ways to leverage the results. This course starts with spaCy, datasketch, word2vec, and other popular NLP libraries, then moves on to PyTorch and related libraries for deep learning.

If you are new to NLP, this course will give you initial hands-on experience and the confidence to explore much further: deep learning with text, natural language generation, chatbots, etc.

First, however, we’ll show you how to prepare text for parsing, how to extract key phrases, prepare text for indexing in search, calculate similarity between documents, etc. That provides a great starting point for developing custom search, content recommenders, applications of AI, etc.
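
As a rough illustration of what "similarity between documents" can mean, here is a minimal sketch in plain Python. It is not taken from the course material: the helper names (`tokenize`, `shingles`, `jaccard`) and the sample sentences are made up for this example. It computes the exact Jaccard similarity between sets of word pairs, which is the quantity that MinHash-based tools such as datasketch are designed to approximate on larger collections.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def shingles(tokens, size=2):
    """Return the set of overlapping word n-grams ("shingles")."""
    return {tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical sample documents, for illustration only.
doc_a = "Python provides excellent packages for natural language processing."
doc_b = "Python provides great packages for natural language processing."
doc_c = "The weather in Madrid is sunny today."

set_a = shingles(tokenize(doc_a))
set_b = shingles(tokenize(doc_b))
set_c = shingles(tokenize(doc_c))

sim_ab = jaccard(set_a, set_b)  # near-duplicates share most word pairs
sim_ac = jaccard(set_a, set_c)  # unrelated sentences share none

print("a vs b: {:.3f}".format(sim_ab))
print("a vs c: {:.3f}".format(sim_ac))
```

In practice a parser such as spaCy would replace the regular-expression tokenizer, and an approximate method would replace the exact set comparison once the number of documents grows.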


·Some programming experience in Python (we’ll use Python 3)
·Basic understanding of HTML and the DOM structure for web pages
·Access to a computer with a browser, where you can install Python packages and develop code at a command line; we recommend using `virtualenv`
·Knowledge of how to install Python libraries using `pip`, etc.
·Basic familiarity with `git` and use of GitHub

Downloads required in advance of the course:

·Install Python 3.5 (or later), git, virtualenv
·Install Jupyter
·Install BeautifulSoup4, spaCy, datasketch, gensim, networkx, PyTorch
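
One way to confirm that the installs above worked is a short check script like the following. This is only a sketch, not part of the official setup instructions: the `REQUIRED` table and the `check_packages` helper are names invented here. The import names are the ones these packages normally use, which can differ from the install names (BeautifulSoup4 is imported as `bs4`, PyTorch as `torch`).

```python
import sys
from importlib.util import find_spec

# Label -> import name for the libraries listed above.
# (Hypothetical table for this sketch; extend it as needed.)
REQUIRED = {
    "BeautifulSoup4": "bs4",
    "spaCy": "spacy",
    "datasketch": "datasketch",
    "gensim": "gensim",
    "networkx": "networkx",
    "PyTorch": "torch",
}


def check_packages(modules=REQUIRED):
    """Map each label to True if its module can be located, else False."""
    return {label: find_spec(name) is not None for label, name in modules.items()}


# The course asks for Python 3.5 or later.
python_ok = sys.version_info >= (3, 5)
print("Python {}.{}: {}".format(
    sys.version_info[0], sys.version_info[1], "ok" if python_ok else "too old"))

status = check_packages()
for label, found in status.items():
    print("{:15} {}".format(label, "found" if found else "MISSING"))
```

Run it inside the activated `virtualenv`, so that it reports on the environment you will actually use during the course.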

We will provide a GitHub link to everyone who registers for this course, including detailed instructions for setup, plus Jupyter notebooks for each of the course exercises and a Docker container with the required libraries and datasets pre-loaded.


Big Data Spain will issue a certificate for this course as proof of subject matter competency.


·You are a Python programmer and need to learn how to use available packages for NLP and deep learning
·You are a data scientist with some Python experience and need to leverage NLP, text mining, and deep learning
·You are interested in deep learning, chatbots, knowledge graphs, and related AI work, and want to understand the basics for preparing text data for those kinds of use cases

Bio of the instructor - Paco Nathan

Paco Nathan is known as a "player/coach", with core expertise in data science, natural language processing, machine learning, and cloud computing; 35+ years of tech industry experience, ranging from Bell Labs to early-stage start-ups. Co-chair of JupyterCon, host of Executive Briefings at The AI Conf and Strata Data. Evangelist for Computable. Advisor for Amplify Partners, Deep Learning Analytics, Recognai. Recent roles: Director, Learning Group @ O'Reilly Media; Director, Community Evangelism @ Databricks and Apache Spark. Cited in 2015 as one of the Top 30 People in Big Data and Analytics by Innovation Enterprise.

Bio of the instructor - Daniel Vila Suero

Daniel Vila is co-founder of, a Madrid-based startup and spin-off from the Technical University of Madrid, building next-generation solutions for text analytics and content management using the latest AI techniques. Daniel holds a PhD in Artificial Intelligence from the Technical University of Madrid (2016) and has built one of the largest knowledge graphs in Spain, combining NLP and semantic technologies to power the data service of the National Library of Spain. He also received the Fujitsu Laboratories of Europe Innovation Award in 2014 and is a regular contributor to spaCy, one of the most advanced industrial libraries for NLP in Python.