LF AI & Data Blog

ForestFlow Joins LF AI as New Incubation Project

April 23, 2020

The LF AI Foundation (LF AI), the organization building an ecosystem to sustain open source innovation in artificial intelligence (AI), machine learning (ML), and deep learning (DL), is today announcing ForestFlow as its latest Incubation Project. ForestFlow is a scalable, policy-based, cloud-native machine learning model server. It strives to balance the flexibility it offers data scientists with the adoption of standards, while reducing friction between Data Science, Engineering and Operations teams. ForestFlow was released and open sourced by DreamWorks Animation.

“We are very pleased to welcome ForestFlow to LF AI. ForestFlow provides an easy way to deploy ML models to production and realize business value on an open source platform that can scale as the user’s projects and requirements scale,” said Dr. Ibrahim Haddad, Executive Director of LF AI. “We look forward to supporting this project and helping it to thrive under a neutral, vendor-free, and open governance.” LF AI supports projects through a wide range of benefits, and the first step is joining as an Incubation Project.

Ahmad Alkilani, Principal Architect and developer of ForestFlow at DreamWorks Animation, said, “We developed ForestFlow in response to our need to move ML models into production that affected the scheduling and placement of rendering jobs and the throughput of our rendering pipeline which has a material impact to our bottom line. Our focus was on maintaining our own teams’ agility and keeping ML models fresh in response to changes in data, features, or simply the production tools that historical data was associated with. Another pillar for developing ForestFlow was the openness of the solution we chose. We were looking to minimize vendor lock-in having a solution that was amenable to on-premise and cloud deployments all the same while offloading deployment complexities from the job description of a Data Scientist. We want our team to focus on extracting the most value they can out of the data we have and not have to worry about operational concerns. We also needed a hands-off approach to quickly iterate and promote or demote models based on observed metrics of staleness and performance. With these goals in mind, we also realize the value of open source software and the value the Linux Foundation brings to any project and specifically LF AI in this space. DreamWorks Animation is pleased that LF AI will manage the neutral open governance for ForestFlow to help foster the growth of the project.”

Continuous deployment and lifecycle management of Machine Learning/Deep Learning models is currently widely accepted as a primary bottleneck for gaining value out of ML projects. Hear from ForestFlow about why they set out to create this project: 

  • We wanted to reduce friction between our data science, engineering and operations teams
  • We wanted to give data scientists the flexibility to use the tools they wanted (H2O, TensorFlow, Spark export to PFA, etc.)
  • We wanted to automate certain lifecycle management aspects of model deployments like automatic performance or time-based routing and retirement of stale models
  • We wanted a model server that allows easy A/B testing, Shadow (listen only) deployments and Canary deployments. This allows our Data Scientists to experiment with real production data without impacting production, using the same tooling they would use when deploying to production.
  • We wanted something that was easy to deploy and scale for different deployment scenarios (on-prem local data center single instance, cluster of instances, Kubernetes managed, Cloud native, etc.)
  • We wanted the ability to treat inference requests as a stream and log predictions as a stream. This allows us to test new models against a stream of older inference requests.
  • We wanted to avoid the “super-hero” data scientist that knows how to dockerize an application, apply the science, build an API and deploy to production. This does not scale well and is difficult to support and maintain.
  • Most of all, we wanted repeatability. We didn’t want to reinvent the wheel once we had support for a specific framework.
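To make the Canary and Shadow bullets above more concrete, here is a minimal Python sketch of how a model server might split traffic. It is an illustration only and does not use ForestFlow's actual API: the `Variant` class, the `route` function, the weights, and the toy lambda "models" are all hypothetical names chosen for this example.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Variant:
    """A deployed model variant (hypothetical structure, not ForestFlow's)."""
    name: str
    predict: Callable[[float], float]  # stand-in for a real model
    weight: float = 0.0                # share of live traffic; unused for shadows
    shadow: bool = False               # shadow variants see requests but never answer


def route(variants: List[Variant], x: float,
          rng: random.Random) -> Tuple[str, float, Dict[str, float]]:
    """Serve one request: pick a live variant by weight, and also run every
    shadow variant so its prediction can be logged without reaching the caller."""
    live = [v for v in variants if not v.shadow and v.weight > 0]
    chosen = rng.choices(live, weights=[v.weight for v in live], k=1)[0]
    response = chosen.predict(x)
    shadow_log = {v.name: v.predict(x) for v in variants if v.shadow}
    return chosen.name, response, shadow_log


variants = [
    Variant("stable", lambda x: 2 * x, weight=0.9),      # current production model
    Variant("canary", lambda x: 2 * x + 1, weight=0.1),  # receives about 10% of traffic
    Variant("shadow", lambda x: 3 * x, shadow=True),     # listen only
]

served_by, response, shadow_log = route(variants, 1.0, random.Random(0))
```

In this sketch the caller only ever receives `response` from a live variant, while `shadow_log` captures what the shadow model would have answered for the same request, which is the "listen only" behavior described above.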

ForestFlow is policy-based to support the automation of machine learning and deep learning operations, which is critical to scaling human resources. ForestFlow lends itself well to workflows based on automatic retraining, version control, A/B testing, Canary model deployments, Shadow testing, automatic time- or performance-based model deprecation, and time- or performance-based model routing in real time. The aim of ForestFlow is to give data scientists a simple means to deploy models to a production system with minimal friction, shortening the path from development to production value. Check out the quickstart guide for an overview of setting up ForestFlow and an inference example.
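As a rough illustration of what time- or performance-based deprecation and routing could look like, the Python sketch below filters out stale or underperforming models and routes to the best remaining one. This is an assumption-laden toy, not ForestFlow's policy engine: `DeployedModel`, `still_valid`, `select_model`, and the thresholds are hypothetical names and values invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DeployedModel:
    """Bookkeeping for one deployed model (hypothetical structure)."""
    name: str
    deployed_at: float  # deployment time, in seconds on some shared clock
    accuracy: float     # most recent observed performance metric


def still_valid(model: DeployedModel, now: float,
                max_age_s: float, min_accuracy: float) -> bool:
    """A model stays in rotation only if it is both fresh and accurate enough."""
    fresh = (now - model.deployed_at) <= max_age_s  # time-based policy
    good = model.accuracy >= min_accuracy           # performance-based policy
    return fresh and good


def select_model(models: List[DeployedModel], now: float,
                 max_age_s: float, min_accuracy: float) -> Optional[DeployedModel]:
    """Route to the most accurate model that passes both policies, if any."""
    candidates = [m for m in models
                  if still_valid(m, now, max_age_s, min_accuracy)]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m.accuracy)


models = [
    DeployedModel("old_accurate", deployed_at=0.0, accuracy=0.95),  # too old
    DeployedModel("fresh_weak", deployed_at=900.0, accuracy=0.60),  # too inaccurate
    DeployedModel("fresh_good", deployed_at=800.0, accuracy=0.90),
]

chosen = select_model(models, now=1000.0, max_age_s=500.0, min_accuracy=0.8)
```

Here `old_accurate` is retired by the age limit and `fresh_weak` by the accuracy floor, so requests would go to `fresh_good` with no manual intervention.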

Learn more about ForestFlow here and be sure to join the ForestFlow-Announce and ForestFlow-Technical-Discuss mailing lists to join the community and stay connected on the latest updates.

A warm welcome to ForestFlow and we look forward to the project’s continued growth and success as part of the LF AI Foundation. To learn about how to host an open source project with us, visit the LF AI website.

Author

  • Andrew Bringaze

    Andrew Bringaze is the Senior Developer for the Linux Foundation. With over 15 years of experience, his focus is on open source code, WordPress, React, site security, APIs, and much more.