Horovod, an LF AI Foundation Incubation Project, has released version 0.19.0 and we’re thrilled to see the results of their hard work. Horovod is a distributed deep learning framework that improves the speed, scale, and resource utilization of deep learning training.
In version 0.19.0, Horovod adds tighter integration with Apache Spark, including a new high-level Horovod Spark Estimator framework and support for accelerator-aware task-level scheduling in the upcoming Spark 3.0 release. With Horovod Spark Estimators, you can train your deep neural network directly on your existing Spark DataFrame, leveraging Horovod’s ability to scale to hundreds of workers in parallel without any specialized code for distributed training. This lets deep learning training run alongside existing ETL jobs, which streamlines production pipelines and speeds up iteration between feature engineering and model training.
This release also contains experimental new features including a join operation for PyTorch and the ability to launch Horovod jobs programmatically from environments like notebooks using a new interactive run mode.
With the new join operation, users no longer need to worry about how evenly their dataset divides when training. Just add a join step at the end of each epoch, and Horovod will train on any extra batches without causing the waiting workers to deadlock.
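As a rough illustration of where the join step goes, here is a minimal sketch of a per-worker epoch loop. The function `train_one_epoch` and its `train_step` callable are hypothetical names made up for this example; the only Horovod call used is `join()` from the PyTorch module, which is passed in as the `hvd` argument (in a real job, `import horovod.torch as hvd` after `hvd.init()`).

```python
def train_one_epoch(batches, train_step, hvd):
    """Run one epoch on this worker's batches, then join.

    batches    -- this worker's share of the data (may be uneven across workers)
    train_step -- hypothetical callable: forward, backward, optimizer step
    hvd        -- the Horovod PyTorch module, e.g. `import horovod.torch as hvd`
    """
    steps = 0
    for batch in batches:
        train_step(batch)
        steps += 1
    # Signal that this worker has run out of data. join() blocks until
    # every worker has joined; workers that still have batches keep
    # training in the meantime instead of hanging on this one.
    hvd.join()
    return steps
```

Each worker calls the function once per epoch with its own shard, so a worker that holds one batch fewer than its peers simply reaches `hvd.join()` earlier and waits there.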
Using Horovod’s new interactive mode, users can launch distributed training jobs in a single line of Python. Define the distributed training function, execute it with multiple parallel processes, then return the results as a Python list of objects. This new API mirrors horovod.spark, but can run on any nodes you would normally use with horovodrun.
Full release notes for Horovod version 0.19.0 are available here. Curious about how Horovod can make your model training faster and more scalable? Check out these new updates and try out the framework. And be sure to subscribe to the Horovod Announce and Horovod Technical-Discuss mailing lists to join the community and stay connected on the latest updates.
Congratulations to the Horovod team and we look forward to continued growth and success as part of the LF AI Foundation! To learn about hosting an open source project with us, visit the LF AI Foundation website here.
Horovod Key Links
- Mail Lists