SCHEDULE - TALK DETAIL



Keynote | Technical

Training Deep Learning Models on Multiple GPUs in the Cloud

Friday 17th | 13:20 - 13:50 | Theatre 19


One-liner summary:

Training Deep Learning Models on Multiple GPUs in the Cloud

Keywords defining the session:

- Deep Learning

- GPUs

- Scalability

Description:

GPUs in the cloud, offered as Infrastructure as a Service (IaaS), may seem like a commodity. However, distributing deep learning workloads efficiently across several GPUs remains challenging. Even though some frameworks offer ways to benefit from data parallelism, the devil is in the details. Deep learning practitioners focus on learning rates and batch sizes, but engineering details can ruin the scalability or efficiency of their training. Beyond software implementation issues, communication latency, GPUDirect support, driver configuration, and even cloud provider pricing can have a big impact on training times and costs. Results on this topic will be shared.
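
The session does not prescribe a specific framework. As a rough illustration of what data parallelism across several GPUs looks like in practice, below is a minimal sketch using PyTorch DistributedDataParallel on a single multi-GPU node; the toy model, synthetic data, and learning-rate scaling heuristic are placeholder assumptions, not material from the talk.

```python
# Hedged sketch: single-node, multi-GPU data-parallel training with PyTorch DDP.
# Model, dataset, and hyperparameters are placeholders, not from the session.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler


def train(rank: int, world_size: int):
    # One process per GPU; NCCL performs the inter-GPU gradient all-reduce.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    # Toy model and synthetic data; replace with a real network and dataset.
    model = torch.nn.Linear(128, 10).cuda(rank)
    model = DDP(model, device_ids=[rank])

    dataset = TensorDataset(torch.randn(4096, 128), torch.randint(0, 10, (4096,)))
    # DistributedSampler gives each process a disjoint shard of the data.
    sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)

    # Common heuristic (an assumption, not the speaker's recommendation):
    # scale the learning rate with the number of GPUs, since the effective
    # batch size grows with world_size.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * world_size)
    loss_fn = torch.nn.CrossEntropyLoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(rank), y.cuda(rank)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()   # gradients are averaged across GPUs here
            optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    world_size = torch.cuda.device_count()
    mp.spawn(train, args=(world_size,), nprocs=world_size)
```

Even with such a setup, the factors mentioned above (interconnect latency, GPUDirect availability, driver versions) determine whether adding GPUs actually shortens training time in a cost-effective way.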