Distributed and Declarative Deep Learning Systems

(T) Developing deep learning models is hard, and debugging them is even harder. Few companies have data pipelines that support training and inference of large-scale deep learning models for mission-critical applications. So any tool that helps on that journey feels like resting in an oasis in the middle of the desert.

Here are two tools to explore if you want to start that journey: Horovod and Ludwig. Both were developed at Uber and are open source.

Horovod, led by Travis Addair, enables distributed training on GPUs of deep learning models developed in TensorFlow and PyTorch. The beauty of Horovod is that it runs on Apache Spark, so features produced by ETL jobs running on Spark can be reused for training. Databricks integrated it into Spark 3.0.
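
To make the idea of distributed training more concrete, here is a minimal sketch of data-parallel training, the pattern Horovod is built around: each worker computes a gradient on its own shard of the data, the gradients are averaged across workers, and every worker applies the same update. This is a single-process toy simulation in plain Python, not Horovod code. The function names (`local_gradient`, `allreduce_average`, `train`), the one-parameter model, and the data are all assumptions made for illustration.

```python
# Toy, single-process simulation of data-parallel training.
# NOT Horovod code: the "workers" are just list entries, and
# allreduce_average stands in for a real cross-worker allreduce.

def local_gradient(w, shard):
    """Gradient of the mean squared error of y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_average(values):
    """Every worker contributes one value and receives the mean of all values."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

def train(shards, w=0.0, lr=0.01, steps=200):
    """Run synchronous SGD; all workers keep an identical copy of w."""
    for _ in range(steps):
        grads = [local_gradient(w, shard) for shard in shards]  # one per worker
        averaged = allreduce_average(grads)
        w -= lr * averaged[0]  # every worker applies the same averaged gradient
    return w

# Hypothetical data generated from y = 3 * x, split evenly across two workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[0:4], data[4:8]]
learned_w = train(shards)
```

With equally sized shards, the averaged gradient equals the gradient computed on the full data set, which is why the workers together behave like one large-batch trainer.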

Following is a talk Travis gave at the Stanford MLSys Seminar Series:

Ludwig, led by Piero Molino, lets you develop state-of-the-art NLP and computer vision models for any data set and automates the deployment of those models for inference. Ludwig makes this possible through its declarative approach to building machine learning pipelines. Instead of writing code for training, evaluation, and hyperparameter optimization, you only need to declare the schema of your data in a simple YAML configuration.
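
As a rough illustration of the declarative approach, the sketch below writes a Ludwig-style configuration as a Python dictionary, with the equivalent YAML shown in a comment. The column names (`review_text`, `sentiment`) and the helper functions are hypothetical. The training helper assumes Ludwig is installed and is never called here, so the configuration can be inspected on its own.

```python
# Equivalent YAML (hypothetical column names):
#
#   input_features:
#     - name: review_text
#       type: text
#   output_features:
#     - name: sentiment
#       type: category

config = {
    "input_features": [{"name": "review_text", "type": "text"}],
    "output_features": [{"name": "sentiment", "type": "category"}],
}

def declared_columns(config):
    """Hypothetical helper: list the data columns the config declares."""
    return {
        "inputs": [f["name"] for f in config["input_features"]],
        "outputs": [f["name"] for f in config["output_features"]],
    }

def train_with_ludwig(config, dataset_path):
    """Train from the declared schema; assumes the ludwig package is installed."""
    from ludwig.api import LudwigModel  # imported lazily, only when training
    model = LudwigModel(config)
    return model.train(dataset=dataset_path)

columns = declared_columns(config)
```

Nothing in the configuration says how to train. It only names the columns and their types, and the framework chooses the preprocessing, model, and training loop from that.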

Ludwig's architecture is based on a general, modular deep learning system with three components: an encoder, a combiner, and a decoder. These components can be instantiated in different ways to implement a variety of machine learning pipelines.
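
The encoder, combiner, decoder pattern can be sketched in a few lines of plain Python. This is a toy illustration of the data flow only, not Ludwig's implementation: real encoders and decoders are neural networks, and every function and field name below is an assumption made for the example.

```python
# Toy encoder -> combiner -> decoder pipeline (illustrative only).

def encode_text(text):
    """Toy text encoder: word count and character count as a small vector."""
    return [float(len(text.split())), float(len(text))]

def encode_number(value):
    """Toy numeric encoder: wrap the number in a one-element vector."""
    return [float(value)]

def concat_combiner(encoded_features):
    """Combine the per-feature vectors by concatenating them."""
    return [x for vector in encoded_features for x in vector]

def decode_length_label(vector):
    """Toy decoder: map the combined vector to a category label."""
    return "long" if vector[0] >= 3 else "short"

def predict(example, encoders, combiner, decoder):
    """Encode each input feature, combine the results, then decode."""
    encoded = [encoders[name](example[name]) for name in encoders]
    return decoder(combiner(encoded))

# Hypothetical input features and example record.
encoders = {"title": encode_text, "year": encode_number}
example = {"title": "deep learning systems", "year": 2021}

combined = concat_combiner([encoders[n](example[n]) for n in encoders])
label = predict(example, encoders, concat_combiner, decode_length_label)
```

Swapping one encoder or decoder for another changes the kind of data the pipeline handles without touching the rest, which is the point of the modular design.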

Following are the release notes for Ludwig version 0.4, which includes integrations with Ray, Dask, TabNet, and MLflow, and an early talk from Piero about Ludwig:

And a more recent talk given at the Stanford MLSys Seminar Series:

Note: The picture above shows a péniche (a barge).

Copyright © 2005-2021 by Serge-Paul Carrasco. All rights reserved.
Contact Us: asvinsider at gmail dot com