sparklyr, an LF AI Foundation Incubation Project, has released version 1.2.0 and we’re excited to see a great release with contributions from several members of the community. sparklyr is an R package that lets you analyze data in Apache Spark, the well-known engine for big data processing, while using familiar tools in R. R is widely used by data scientists and statisticians around the world and is known for its advanced features in statistical computing and graphics.
In version 1.2.0, sparklyr adds a variety of improvements, including:
A number of interop issues with Spark 3.0.0-preview were fixed
The `registerDoSpark` method was implemented to allow Spark to be used as a `foreach` parallel backend in sparklyr (see registerDoSpark.Rd)
And more… A complete list of changes can be found in the sparklyr 1.2.0 section of the NEWS.md file.
The power of open source projects is the aggregate contributions originating from different community members and organizations that collectively help drive the advancement of the projects and their roadmaps. The sparklyr community is a great example of this process and was instrumental in producing this release. A special THANK YOU goes out to the following community members for their contributions of commits and pull request reviews!
Congratulations to the sparklyr team and we look forward to continued growth and success as part of the LF AI Foundation! To learn about hosting an open source project with us, visit the LF AI Foundation website.
The LF AI Foundation (LF AI), the organization building an ecosystem to sustain open source innovation in artificial intelligence (AI), machine learning (ML), and deep learning (DL), today is announcing ForestFlow as its latest Incubation Project. ForestFlow is a scalable, policy-based, cloud-native machine learning model server. ForestFlow strives to strike a balance between the flexibility it offers data scientists and the adoption of standards, while reducing friction between data science, engineering, and operations teams. ForestFlow was released and open sourced by DreamWorks Animation.
“We are very pleased to welcome ForestFlow to LF AI. ForestFlow provides an easy way to deploy ML models to production and realize business value on an open source platform that can scale as the user’s projects and requirements scale,” said Dr. Ibrahim Haddad, Executive Director of LF AI. “We look forward to supporting this project and helping it to thrive under a neutral, vendor-free, and open governance.” LF AI supports projects via a wide range of benefits; and the first step is joining as an Incubation Project.
Ahmad Alkilani, Principal Architect and developer of ForestFlow at DreamWorks Animation, said, “We developed ForestFlow in response to our need to move ML models into production that affected the scheduling and placement of rendering jobs and the throughput of our rendering pipeline which has a material impact to our bottom line. Our focus was on maintaining our own teams’ agility and keeping ML models fresh in response to changes in data, features, or simply the production tools that historical data was associated with. Another pillar for developing ForestFlow was the openness of the solution we chose. We were looking to minimize vendor lock-in having a solution that was amenable to on-premise and cloud deployments all the same while offloading deployment complexities from the job description of a Data Scientist. We want our team to focus on extracting the most value they can out of the data we have and not have to worry about operational concerns. We also needed a hands-off approach to quickly iterate and promote or demote models based on observed metrics of staleness and performance. With these goals in mind, we also realize the value of open source software and the value the Linux Foundation brings to any project and specifically LF AI in this space. DreamWorks Animation is pleased that LF AI will manage the neutral open governance for ForestFlow to help foster the growth of the project.”
Continuous deployment and lifecycle management of Machine Learning/Deep Learning models is currently widely accepted as a primary bottleneck for gaining value out of ML projects. Hear from ForestFlow about why they set out to create this project:
We wanted to reduce friction between our data science, engineering and operations teams
We wanted to give data scientists the flexibility to use the tools they wanted (H2O, TensorFlow, Spark export to PFA, etc.)
We wanted to automate certain lifecycle management aspects of model deployments like automatic performance or time-based routing and retirement of stale models
We wanted a model server that allows easy A/B testing, Shadow (listen-only) deployments, and Canary deployments. This allows our Data Scientists to experiment with real production data without impacting production, using the same tooling they would use when deploying to production.
We wanted something that was easy to deploy and scale for different deployment scenarios (on-prem local data center single instance, cluster of instances, Kubernetes-managed, cloud-native, etc.)
We wanted the ability to treat inference requests as a stream and log predictions as a stream. This allows us to test new models against a stream of older inference requests.
We wanted to avoid the “super-hero” data scientist that knows how to dockerize an application, apply the science, build an API and deploy to production. This does not scale well and is difficult to support and maintain.
Most of all, we wanted repeatability. We didn’t want to reinvent the wheel once we had support for a specific framework.
ForestFlow is policy-based to support the automation of Machine Learning/Deep Learning operations, which is critical to scaling human resources. ForestFlow lends itself well to workflows based on automatic retraining, version control, A/B testing, Canary model deployments, Shadow testing, automatic time- or performance-based model deprecation, and time- or performance-based model routing in real time. The aim for ForestFlow is to provide data scientists a simple means to deploy models to a production system with minimal friction, accelerating the path from development to production. Check out the quickstart guide to get an overview of setting up ForestFlow and an example of inference.
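ForestFlow's actual policy engine is far richer, but the core idea of combining performance-based routing with time-based retirement can be sketched in a few lines of illustrative Python. All names below are hypothetical and are not ForestFlow's API:

```python
import time

class ModelRouter:
    """Toy sketch of policy-based model routing: send traffic to the
    best-performing model and retire models that have gone stale.
    Hypothetical names throughout; not ForestFlow's actual API."""

    def __init__(self, max_age_seconds=3600.0):
        self.max_age_seconds = max_age_seconds
        self.models = {}  # name -> {"score": float, "last_updated": float}

    def register(self, name, score, now=None):
        # Record (or refresh) a model along with its latest observed metric.
        self.models[name] = {
            "score": score,
            "last_updated": time.time() if now is None else now,
        }

    def retire_stale(self, now=None):
        # Time-based policy: drop any model not refreshed recently enough.
        now = time.time() if now is None else now
        stale = [n for n, m in self.models.items()
                 if now - m["last_updated"] > self.max_age_seconds]
        for n in stale:
            del self.models[n]
        return stale

    def route(self):
        # Performance-based policy: pick the highest-scoring active model.
        if not self.models:
            raise RuntimeError("no active models")
        return max(self.models, key=lambda n: self.models[n]["score"])

router = ModelRouter(max_age_seconds=3600.0)
router.register("model-a", score=0.91, now=0.0)
router.register("model-b", score=0.87, now=3000.0)
print(router.route())                   # model-a wins on score
print(router.retire_stale(now=4000.0))  # model-a has gone stale
print(router.route())                   # traffic falls back to model-b
```

In a real system the scores would come from observed metrics (drift, accuracy against delayed labels) and routing could shade traffic gradually rather than picking a single winner, but the policy-driven shape stays the same.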
A warm welcome to ForestFlow and we look forward to the project’s continued growth and success as part of the LF AI Foundation. To learn about how to host an open source project with us, visit the LF AI website.
Guest Author(s): LF AI Graduated Projects, Angel and Acumos
The goal of the LF AI Foundation (LF AI) is to accelerate and sustain the growth of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) open source projects. Backed by many of the world’s largest technology leaders, LF AI is a neutral space for harmonization and ecosystem engagement to advance AI, ML, and DL innovation. Projects are hosted in either of two stages: graduation and incubation. At the time of publishing this blog post, LF AI hosts three graduation level projects (Acumos, Angel, and ONNX) and eight incubation level projects (Adlik, Elastic Deep Learning, Horovod, Marquez, Milvus, NNStreamer, Pyro and sparklyr).
The incubation stage is designated for new or early-stage projects that are aligned with the LF AI mission and require help to foster adoption and contribution in order to sustain and grow the project. Incubation projects may receive mentorship from the LF AI Technical Advisory Council (TAC) and are expected to actively develop their community of contributors, governance, project documentation, and other variables that factor into broad success and adoption.
Incubation projects are eligible to graduate when they meet a certain number of criteria demonstrating significant growth of contributors and adopters, commitment to open governance, achieving and maintaining a CII best practices badge, and establishing collaboration with other LF AI hosted projects. Getting to this stage requires work, perseverance, and tangible signs of progress.
The graduation stage, on the other hand, is for projects that have achieved significant growth of contributors and adopters, are important to the ecosystem, and are eligible for foundational financial support.
Angel joined LF AI as an incubation project in August 2018. It is a high-performance distributed machine learning platform based on the philosophy of Parameter Server. It is tuned for high performance and has a wide range of applicability and stability, demonstrating increasing advantage in handling higher dimension models. The Angel Project has been proactively collaborating with the Acumos Project community, resulting in positive outcomes to both communities.
In its effort to move to graduation, the Angel Project community looked at the full range of LF AI hosted projects and chose Acumos for integration.
Within the AI open source community, cross-project collaboration is essential. The Angel platform focuses on training machine learning models but does not host a public model marketplace. Acumos, on the other hand, supports an AI marketplace that empowers data scientists to publish adaptive AI models, while shielding them from the need to custom develop fully integrated solutions.
This makes Angel and Acumos a perfect match: after integration the two work like a factory and a distributor, creating synergy. The Angel team believed that integration with Acumos would encourage and facilitate algorithm sharing by Angel users and therefore benefit the overall community.
In the following sections, we will explore some of the challenges the projects faced during the process and how integration was achieved.
Challenge A: lack of a reference for onboarding Java-based models to an Acumos marketplace dominated by Python models. This challenge was solved with the assistance of Acumos experts from AT&T, Tech Mahindra, and Orange, who provided clear guidance and instructions covering jar package access, configuration, and Java model preparation.
Challenge B: finding a deployed, internet-accessible environment. Huawei generously offered access to Acumos environments running on its public cloud in Hong Kong. The uploading process wasn’t all smooth sailing, however, as several attempts failed because artifacts were not generated successfully. The problem was later solved with help from AT&T and Huawei by restarting Nexus and cleaning the disk to address an insufficient-storage issue.
What Was Achieved?
A successful integration of Angel and Acumos demonstrated that Angel’s Java-based models could be onboarded to a marketplace dominated by Python projects.
At the same time, connecting Angel and Acumos in both API invocation and production deployment allows more developers to use the Angel framework to train domain-specific algorithms and share their work with people around the world. Acumos also becomes a stronger platform as it adds more frameworks and users.
Cross-project collaboration played a key role in Angel’s graduation, as it proved that the project was an open system that could be connected with other projects. Only by demonstrating the capability to link upstream and downstream components in a production data pipeline can a project be deemed a member of the global machine learning community, rather than an isolated system.
The collaboration between Angel and Acumos sets an example for other incubation-level projects hosted by LF AI. The foundation hopes that more projects will follow in the footsteps of Angel and Acumos, and that, with collective effort, sustainable development of a harmonized community can be achieved.
To encourage further collaboration, Angel plans to invite a diverse set of global users to publish their models onto Acumos. In parallel, Angel will also look at opportunities to integrate with other components, such as the MLflow framework, a web portal and monitoring system, support for more model file formats, and so on.
To learn more about these two LF AI hosted projects, and to view all projects, visit the LF AI Projects page. If you would like to learn more about hosting a project in LF AI and the benefits, click here.
The LF AI Foundation (LF AI), the organization building an ecosystem to sustain open source innovation in artificial intelligence (AI), machine learning (ML), and deep learning (DL), today is announcing NNStreamer as its latest Incubation Project. NNStreamer is a set of GStreamer plugins that make it easy and efficient for GStreamer developers to adopt neural network models and for neural network developers to manage neural network pipelines and their filters. NNStreamer was released and open sourced by Samsung.
“We are very pleased to welcome NNStreamer to LF AI. Machine Learning applications often process online stream input data in real-time, which can create a complex system. NNStreamer can be used to easily represent and efficiently execute against these challenges,” said Dr. Ibrahim Haddad, Executive Director of LF AI. “We look forward to supporting this project and helping it to thrive under a neutral, vendor-free, and open governance.” LF AI supports projects via a wide range of benefits; and the first step is joining as an Incubation Project. Full details on why you should host your open source project with LF AI are available here.
NNStreamer promotes easier and more efficient development of on-device AI systems by allowing the description of general systems with various inputs, outputs, processors, and neural networks using the pipe-and-filter architecture. It provides easy-to-use APIs with corresponding SDKs as well: C-APIs (all platforms), Tizen.NET (C#), and Android (Java), along with a wide range of neural network frameworks and software platforms (Ubuntu, macOS, OpenEmbedded). NNStreamer became an open source project in 2018 and is under active development with the Tizen project and a wide range of consumer electronics devices.
Learn more about NNStreamer via their GitHub. You can also check out the NNStreamer 2018 GStreamer Conference presentation recording here, as well as their presentation at the Samsung Developer Conference in 2019 here. And be sure to join the NNStreamer-Announce and NNStreamer-Technical-Discuss mailing lists to join the community and stay connected on the latest updates.
A warm welcome to NNStreamer and we look forward to the project’s continued growth and success as part of the LF AI Foundation. To learn about how to host an open source project with us, visit the LF AI website.
We are excited to welcome three new members to the LF AI Foundation: Xenonstack as a General member, and AI for People and Ambianic in our Associate category. Learn a bit more about these organizations:
Xenonstack joins LF AI as a General member. In their own words: A Product Engineering, Technology Services and Consulting company building Intelligent Distributed Systems at Scale, AI, and Data-driven Decision Platforms and Solutions. We are enabling enterprises for Digital transformation with Cloud, Data and AI Strategy and Enterprise Agility.
AI for People
AI for People joins LF AI as an Associate member. In their own words: Our mission is to learn, pose questions and take initiative on how Artificial Intelligent technology can be used for the social good.
Our strategy is to conduct impact analysis, projects, and democratic policies that act at the intersection of Artificial Intelligence and society. We are a diverse team of motivated individuals dedicated to bringing AI policy to the people, in order to create positive change in society with technology, through and for the public.
Ambianic joins LF AI as an Associate member. In their own words: Ambianic’s mission is to make our homes and workspaces a little cozier. Ambianic is an Open Source Ambient Intelligence platform that puts local control and privacy first. It enables users to train and share custom AI models without compromising privacy.
We look forward to partnering with these new LF AI Foundation members to help support open source innovation and projects within the artificial intelligence (AI), machine learning (ML), and deep learning (DL) space. Welcome to our new members!
Interested in joining the LF AI community as a member? Learn more here.
Guest Author: Zhipeng Huang, Principal Engineer, Huawei Technologies; Huawei’s Representative on the LF AI Foundation Technical Advisory Council
Meet MindSpore: Huawei’s Open Source AI Computing Framework
We are very excited to announce that Huawei is open sourcing MindSpore, an AI computing framework. MindSpore was developed by Huawei with the goal of implementing on-demand collaboration across the cloud-edge-device. It provides unified APIs and end-to-end AI capabilities for model development, execution, and deployment in all scenarios.
Using a distributed architecture (Figure 1), MindSpore leverages a native automatic-differentiation programming paradigm and new AI-native execution modes to achieve better resource efficiency, security, and trustworthiness. Meanwhile, MindSpore makes full use of the computing power of Ascend AI processors and lowers the entry barrier to industry AI development, bringing inclusive AI to reality faster.
MindSpore is designed to provide a friendly development experience and efficient execution for data scientists and algorithm engineers, native support for Ascend AI processors, and software-hardware co-optimization.
Our goal with open sourcing MindSpore is to provide the global community of AI open source with a computing framework that will further advance the development and enrichment of the AI software/hardware application ecosystem.
Building an AI Native Programming Ecosystem with an Emphasis on Interoperability
With the recent development of the Pyro project (an incubation project of the LF AI Foundation), Julia, and MindSpore, it has become evident that AI native programming is the next trend in deep learning framework development. Gone are the days when mathematical libraries were simply added to existing engineering toolsets; data scientists will increasingly use their familiar toolset with more engineering capability added. AI developers should be able to write models in their mathematical form without a steep software engineering learning curve.
In order to build the new AI native programming ecosystem, interoperability is a critical issue to solve. At the northbound (Figure 2, red blocks), beyond IR, interoperability for things like cryptography, type systems, and metadata also needs to be addressed. At the southbound (Figure 2, purple blocks), in addition to supporting heterogeneous computing hardware, storage interoperability should also be considered.
The MindSpore community will work with the LF AI Foundation community, and more specifically the Technical Advisory Council through its ML Workflow effort, to address interoperability issues. We also plan to engage with the ONNX community (ONNX is a graduated project in the LF AI Foundation) to ensure that, by exporting ONNX models, developers can use MindSpore in more scenarios.
Working with Kubeflow
MindSpore also leverages the cloud native ecosystem for deployment and management. With the recent Kubeflow 1.0 and upcoming Kubernetes 1.18 release, we can experiment with the latest cloud native computing technology for agile MLOps.
In order to take advantage of the prowess of Kubeflow and Kubernetes, the first thing we did was write an operator for MindSpore (called ms-operator) and define a MindSpore CRD (Custom Resource Definition). The current version of ms-operator is based on early versions of the PyTorch Operator and the TF Operator.
The implementation of ms-operator contains the specification and implementation of the MSJob custom resource definition. We will walk through building the ms-operator image and creating a simple MSJob on Kubernetes with the MindSpore `0.1.0-alpha` image. The MindSpore community is still working on implementing distributed training on different backends so that, in the near future, users can create and manage MSJobs like other built-in resources on Kubernetes.
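Because MSJob is a Kubernetes custom resource, creating one amounts to submitting a manifest to the API server. As a rough sketch, here is what such a manifest might look like, expressed as the Python dict you would pass to the official Kubernetes client; the API group, version, and field names below are illustrative assumptions, not the actual MSJob spec from ms-operator:

```python
# Hypothetical MSJob manifest as a Python dict. Field names and the
# API group/version are illustrative only; consult ms-operator for
# the real MSJob specification.
msjob = {
    "apiVersion": "mindspore.example.com/v1",  # hypothetical group/version
    "kind": "MSJob",
    "metadata": {"name": "msjob-mnist", "namespace": "default"},
    "spec": {
        "replicas": 1,  # single-instance training for this sketch
        "template": {
            "spec": {
                "containers": [{
                    "name": "mindspore",
                    # Hypothetical image tag matching the 0.1.0-alpha release
                    "image": "mindspore/mindspore:0.1.0-alpha",
                    # Hypothetical training entry point
                    "command": ["python", "/model_zoo/lenet.py"],
                }],
                "restartPolicy": "OnFailure",
            }
        },
    },
}
print(msjob["kind"], msjob["metadata"]["name"])  # MSJob msjob-mnist
```

With the CRD installed in a cluster, a dict like this could be submitted via the Kubernetes Python client’s `CustomObjectsApi().create_namespaced_custom_object(...)`, after which the operator reconciles it into running pods.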
The MindSpore community is driving collaboration with the Kubeflow community, as well as making ms-operator more complete, well organized, and up to date. All these components make it easy for machine learning engineers and data scientists to leverage cloud assets (public or on-premises) for machine learning workloads.
Milvus, an open source vector similarity search engine, was accepted by the LF AI Foundation (LF AI) as its latest incubation project following a vote by the Technical Advisory Council (TAC). LF AI is the organization building an ecosystem to sustain open source innovation in artificial intelligence (AI), machine learning (ML), and deep learning (DL).
Adopted by over 100 organizations and institutions worldwide, Milvus empowers applications in a variety of fields, including image processing, computer vision, natural language processing, voice recognition, recommender systems, drug discovery, etc. Milvus was originally developed by Zilliz, a Shanghai-based startup company, and open sourced in October 2019.
Zilliz, with the vision of “Reinvent data science”, develops open source data science software for the era of AI and 5G/IoT. “We are pushing forward a globalization strategy that fully incorporates global open source communities. We believe open development leads to greater implementation and greater good for all.” said Starlord, the founder & CEO of Zilliz. “We believe Milvus will help to accelerate the AI adoption for more organizations after joining LF AI.”
“We are very pleased to welcome Milvus to LF AI. Vector similarity search engine is an important component for processing rapidly growing unstructured data. Many AI domains, such as image processing, computer vision, NLP, recommendation systems, and more, could benefit from the capability of Milvus vector similarity search engine. Milvus can help to build up AI applications with open source AI technology,” said Dr. Ibrahim Haddad, Executive Director of LF AI. “We look forward to supporting this project and helping it to thrive under a neutral, vendor-free, and open governance.”
Milvus is easy-to-use, highly reliable, scalable, robust, and blazing fast, along with a rich list of features.
Comprehensive Similarity Metrics – Milvus offers frequently used similarity metrics, including Euclidean distance, inner product, Hamming distance, Jaccard distance, etc., allowing you to explore vector similarity in the most effective and efficient way possible.
Leading-Edge Performance – Milvus is built on top of multiple optimized Approximate Nearest Neighbor Search (ANNS) indexing libraries, including faiss, annoy, hnswlib, etc., thus ensuring that you always get the best performance across various scenarios.
Cost-Efficient – Milvus harnesses the parallelism of modern processors and enables billion-scale similarity searches in milliseconds on a single off-the-shelf server.
Highly Scalable and Robust – You can deploy Milvus in a distributed environment. To increase the capacity and reliability of a Milvus cluster, you can simply add more nodes.
Cloud Native – Milvus is designed to run on public cloud, private cloud, or hybrid cloud.
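For intuition, the similarity metrics listed above can be written down in a few lines of plain Python. These are reference definitions over small float and binary vectors, not Milvus code; Milvus computes these metrics inside its optimized indexes:

```python
import math

def euclidean(a, b):
    # Straight-line distance between two float vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    # Dot product; larger values mean more similar (for normalized vectors).
    return sum(x * y for x, y in zip(a, b))

def hamming(a, b):
    # For binary vectors: number of positions where the bits differ.
    return sum(x != y for x, y in zip(a, b))

def jaccard(a, b):
    # For binary vectors: 1 - |intersection| / |union| of the set bits.
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return 1 - inter / union if union else 0.0

print(euclidean([1.0, 0.0], [0.0, 0.0]))      # 1.0
print(inner_product([1.0, 2.0], [3.0, 4.0]))  # 11.0
print(hamming([1, 0, 1], [1, 1, 0]))          # 2
print(jaccard([1, 0, 1], [1, 1, 0]))          # 1 - 1/3
```

Which metric is "most effective" depends on the data: Euclidean distance and inner product suit dense float embeddings, while Hamming and Jaccard distances suit binary fingerprints.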
A warm welcome to Milvus and we look forward to the project’s continued growth and success as part of the LF AI Foundation. LF AI supports projects via a wide range of benefits; and the first step is joining as an Incubation Project. Full details on why you should host your open source project with LF AI are available here.
Break out those party hats as we are thrilled to have celebrated the 2nd birthday of the LF AI Foundation! It’s hard to believe that two years have passed since the Foundation launched in March 2018. We invite you to go down memory lane with us and see the key milestones accomplished.
Members – We launched with 10 members and now have 23 across our Premier, General, and Associate levels. We’ve seen a diverse group of companies getting involved across various industries and we welcome those interested in contributing to the support of open source projects within the artificial intelligence (AI), machine learning (ML), and deep learning (DL) space.
Technical Projects – Our technical project portfolio grew from the single Acumos project to 10 projects, with a mix of 3 graduated and 7 incubation projects. Two of these projects were approved by the LF AI Technical Advisory Council (TAC), are undergoing onboarding, and will be announced soon; stay tuned! The TAC is continually working to bring in new projects, and we look forward to sharing those with you in the next few months. We’re always looking for new open source projects to host within LF AI; please email us at firstname.lastname@example.org if you’d like to discuss joining us.
Interactive Landscape – The launch of the LF AI Interactive Landscape has become a great tool to gain insights into how LF AI projects, among many others, fit into the space of open source AI, ML, and DL. Explore the landscape and please reach out to help us expand it with your own open source project or others that should be included.
Initiatives – Through the great contributions across the LF AI community, we launched two very important initiatives: the ML Workflow Working Group, with the goal of defining an ML workflow and promoting cross-project integration, and the Trusted AI Committee, with the goal of creating policies, guidelines, tooling, and use cases by industry in this very important space. Both initiatives are open for participation, and we encourage anyone interested to join the conversation by subscribing to the mailing lists or attending an upcoming meeting; check out their wiki pages for more information.
Events – There have been 11+ event opportunities to connect the LF AI Community face to face across the globe; including LF AI Days which are regional, one-day events hosted and organized by local members with support from LF AI and its projects. Visit our LF AI Events page for more details.
Community – In 2019, we rebranded the foundation from LF Deep Learning to LF AI Foundation and continued efforts to increase communication and collaboration within the LF AI Community. If you haven’t connected with us across the various channels please do so!
Lots of accomplishments and growth over the past two years! In the coming years there will be more exciting developments in the space of AI, ML, and DL, and we invite you to be a part of it through the LF AI Community. Check out our How to Get Involved Guide or email us at email@example.com for any questions.
The LF AI Foundation (LF AI) is excited to be part of the ITU AI/ML in 5G Challenge as a promotion partner. The challenge is focused on finding solutions to relevant problems in 5G through the use of Artificial Intelligence (AI) and Machine Learning (ML). The global challenge theme is “How to apply ITU’s ML architecture in 5G networks”.
A few words from ITU on the challenge, “ITU invites you to participate in the ITU Artificial Intelligence/Machine Learning in 5G Challenge, a competition which is scheduled to run from now until the end of the year. Participation in the Challenge is free of charge and open to all interested parties from countries that are a member of ITU”. LF AI looks forward to following the challenge and encourages your participation. Learn more by visiting the ITU AI/ML in 5G Challenge website, and submit interest from now until April 30th.
IBM, ONNX, and the LF AI Foundation are pleased to sponsor the upcoming LF AI Day* – ONNX Community Virtual Meetup – Silicon Valley 2020, to be held via Zoom on April 9th from 9am-12pm PT.
ONNX, an LF AI Foundation Graduated Project, is an open format to represent deep learning models. With ONNX, AI developers can more easily move models between state-of-the-art tools and choose the combination that is best for them.
The virtual meetup will cover ONNX Community updates, partner/end-user stories, and SIG/WG updates. Check out the full agenda here. If you are using ONNX in your services and applications, building software or hardware that supports ONNX, or contributing to ONNX, you should attend! This is a great opportunity to connect with and hear from people working with ONNX across many companies.
Registration is now open and the event is free to attend. Capacity will be 500 attendees. For up-to-date information on this virtual meetup, please visit the event website.
Note: In order to ensure the safety of our event participants and staff due to the Novel Coronavirus situation (COVID-19) the ONNX Steering Committee decided to make this a virtual-only event via Zoom.
*LF AI Day is a regional, one-day event hosted and organized by local members with support from LF AI and its Projects. Learn more about the LF AI Foundation here.