July 17, 2019
Member of Technical Staff | OpenAI
Over the past year, OpenAI has trained a team of five neural networks able to defeat the reigning world champion esports team, achieved state-of-the-art results on a variety of domain-specific language modeling tasks, released a public demonstration of combining multiple musical styles using unsupervised learning, and more.
To deliver on these results, OpenAI operates a wide range of complex infrastructure, including some of the largest Kubernetes clusters in the world. Many of the workloads running on this infrastructure don’t adhere to commonly accepted practices or ways of operating software.
This talk will cover the infrastructure patterns and techniques we've used to successfully scale our research. Audience members will take away ideas and practices they can apply within their own ML/AI teams, in both development and deployment, from research through to production.