Machine Learning eXchange (MLX): One stop shop for Trusted Data and AI artifacts

By: Animesh Singh, Christian Kadner, Tommy Chaoping Li

Three pillars of AI lifecycle: Datasets, Models and Pipelines

In the AI lifecycle, we use data to build models for decision automation. Datasets, Models, and Pipelines (which take us from raw Datasets to deployed Models) are the three most critical pillars of the AI lifecycle. Because the Data and AI lifecycle involves a large number of steps, the work of building a model is often split across teams, and significant duplication can arise when creating similar Datasets, Features, Models, Pipelines, Pipeline tasks, and so on. This fragmentation also poses a serious challenge for traceability, governance, risk management, lineage tracking, and metadata collection.

Announcing Machine Learning eXchange (MLX)

To solve these problems, we need a central repository where the different asset types (Datasets, Models, and Pipelines) are stored to be shared and reused across organizational boundaries. Having opinionated, tested Datasets, Models, and Pipelines with high-quality checks, proper licenses, and lineage tracking tremendously increases the speed and efficiency of the AI lifecycle.

To address these challenges, IBM and the Linux Foundation AI and Data (LF AI and Data) are partnering to announce Machine Learning eXchange (MLX), a Data and AI Asset Catalog and Execution Engine, in open source and open governance.

Machine Learning eXchange (MLX) allows upload, registration, execution, and deployment of:

  • AI pipelines and pipeline components
  • Models
  • Datasets
  • Notebooks

MLX Architecture

MLX provides:

  • Automated sample pipeline code generation to execute registered models, datasets, and notebooks
  • Pipelines engine powered by Kubeflow Pipelines on Tekton, the core of Watson Studio Pipelines
  • Registry for Kubeflow Pipeline Components
  • Dataset management by Datashim
  • Serving engine by KFServing

MLX Katalog Assets


Pipelines

In machine learning, it is common to run a sequence of tasks to process and learn from data, all of which can be packaged into a pipeline.

ML Pipelines are:

  • A consistent way for collaborating on data science projects across team and organization boundaries
  • A collection of coarse-grained tasks encapsulated as pipeline components that can be snapped together like Lego bricks
  • A one-stop shop for people interested in training, validating, deploying, and monitoring AI models
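To make the "snap tasks together" idea concrete, here is a minimal conceptual sketch in plain Python. It is not the MLX or Kubeflow Pipelines API; the task functions and the tiny runner are hypothetical stand-ins for coarse-grained pipeline components:

```python
# Conceptual sketch only: MLX pipelines are built from Kubeflow Pipelines
# components, but the core idea of chaining coarse-grained tasks can be
# shown with plain functions.

def load_data():
    # Stand-in for a data-acquisition task.
    return [3, 1, 2]

def preprocess(data):
    # Stand-in for a data-preparation task.
    return sorted(data)

def train(data):
    # Stand-in for a training task; returns a trivial "model" (the mean).
    return sum(data) / len(data)

def run_pipeline(tasks, value=None):
    # Run the tasks in sequence, feeding each output into the next task.
    for task in tasks:
        value = task(value) if value is not None else task()
    return value

model = run_pipeline([load_data, preprocess, train])
print(model)  # 2.0
```

In a real MLX pipeline, each of these functions would be a registered, containerized component, and the execution order would be declared in a pipeline definition rather than a Python list.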

Some sample Pipelines included in the MLX catalog:

Pipeline Components

A pipeline component is a self-contained set of code that performs one step in the ML workflow (pipeline), such as data acquisition, data preprocessing, data transformation, model training, and so on. A component is a block of code that performs an atomic task; it can be written in any programming language and use any framework.
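As an illustration, an atomic task is just a well-defined function. The function below is a hypothetical example, not an MLX asset; the `create_component_from_func` helper mentioned in the comment is from the Kubeflow Pipelines v1 SDK and is shown as a comment so the snippet stays dependency-free:

```python
def normalize_column(values: list) -> list:
    """Atomic transformation step: scale values into the [0, 1] range."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero for constant columns
    return [(v - lo) / span for v in values]

# In Kubeflow Pipelines (the engine MLX builds on), a function like this
# can be turned into a reusable pipeline component, e.g. with the KFP SDK:
#
#   import kfp.components
#   normalize_op = kfp.components.create_component_from_func(normalize_column)
#
# and then snapped into a pipeline alongside other components.

print(normalize_column([10, 20, 30]))  # [0.0, 0.5, 1.0]
```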

Some sample pipeline components included in the MLX catalog:


Models

MLX provides a collection of free, open source, state-of-the-art deep learning models for common application domains. The curated list includes deployable models that can be run as a microservice on Kubernetes or OpenShift and trainable models where users can provide their own data to train the models.
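A deployed model microservice is typically queried over REST. The sketch below builds such a request with only the standard library; the `/model/predict` path and the payload shape follow the common Model Asset eXchange convention and are assumptions here, not a documented MLX API:

```python
import json
import urllib.request

def build_predict_request(host, payload):
    # Assemble a JSON POST request against a model-serving microservice.
    # The "/model/predict" path follows the MAX convention (an assumption).
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}/model/predict",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_predict_request("localhost:5000", {"text": ["hello world"]})
print(req.full_url)  # http://localhost:5000/model/predict
# urllib.request.urlopen(req) would send the request once the model is deployed.
```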

Some sample models included in the MLX catalog:


Datasets

The MLX catalog contains reusable datasets and leverages Datashim to make the datasets available to other MLX assets like notebooks, models, and pipelines in the form of Kubernetes volumes.
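For illustration, Datashim exposes datasets through a `Dataset` custom resource; once applied to the cluster, pods can mount it like a volume. The manifest below is built as a plain Python dict; the field names follow the Datashim examples and should be treated as assumptions rather than an MLX-specific API:

```python
import json

def cos_dataset_manifest(name, bucket, endpoint):
    # A Datashim "Dataset" custom resource; once applied to the cluster,
    # pods can reference it as a persistent volume claim with the same name.
    return {
        "apiVersion": "com.ie.ibm.hpsys/v1alpha1",  # per Datashim examples
        "kind": "Dataset",
        "metadata": {"name": name},
        "spec": {
            "local": {
                "type": "COS",
                "bucket": bucket,
                "endpoint": endpoint,
                # credentials (accessKeyID / secretAccessKey) omitted here
            }
        },
    }

manifest = cos_dataset_manifest("mlx-sample-dataset", "my-bucket",
                                "https://s3.example.com")
print(json.dumps(manifest, indent=2))
```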

Sample datasets contained in the MLX catalog include:


Notebooks

A Jupyter notebook is an open-source web application that allows data scientists to create and share documents that contain runnable code, equations, visualizations, and narrative text. MLX can run Jupyter notebooks as self-contained pipeline components by leveraging the Elyra-AI project.

Sample notebooks contained in the MLX catalog include:

Join us to build cloud-native AI Marketplace on Kubernetes

The Machine Learning eXchange provides a marketplace and platform for data scientists to share, run, and collaborate on their assets. You can now use it to host and collaborate on Data and AI assets within your team and across teams. Please join us on the Machine Learning eXchange GitHub repo, try it out, give feedback, and raise issues. Additionally, you can connect with us via the following:

  • To contribute and build end-to-end Machine Learning Pipelines on OpenShift and Kubernetes, please join the Kubeflow Pipelines on Tekton project and reach out with any questions, comments, and feedback!
  • To deploy Machine Learning Models in production, check out the KFServing project.

MLX Key Links

Thank You

Thanks to the many contributors of Machine Learning eXchange, in particular:

  • Andrew Butler
  • Animesh Singh
  • Christian Kadner
  • Ibrahim Haddad
  • Karthik Muthuraman
  • Patrick Titzler
  • Romeo Kienzler
  • Saishruthi Swaminathan
  • Srishti Pithadia
  • Tommy Chaoping Li
  • Yihong Wang

LF AI & Data Resources

LF AI & Data Foundation Announces New Project Lifecycle Document

Today, LF AI & Data Foundation is releasing an updated version of its project lifecycle document that defines the project stages, requirements for a project to be accepted in each stage, transitioning between stages, and the benefits associated with each stage. Since the document’s last revision over two years ago, we have gained experience in onboarding over a dozen new projects and have received feedback that has allowed us to move forward with these improvements. 

Revisions to the new document include:

  • Introducing the Sandbox stage. This new stage is specific to projects that intend to join LF AI & Data in the Incubation stage in the future. The Sandbox stage provides time to lay the foundations for Incubation and is designated for very new projects: for example, projects designed to extend one or more LF AI & Data projects with functionality or interoperability libraries, or independent projects that fit the LF AI & Data mission and offer a potentially novel approach to existing functional areas.
  • Improving the requirements to incubate projects. These improvements more specifically require that submitted projects for the Incubation stage have at least two organizations actively contributing to the project, have a defined Technical Steering Committee (TSC), have a sponsor who is an existing LF AI & Data member, have earned at least 300 stars on GitHub, and have achieved and maintained a Core Infrastructure Initiative Best Practices Silver Badge.
  • Improving the requirements to graduate projects. These improvements more specifically require projects to have a healthy number of code contributions coming from at least five organizations, have reached a minimum of 1,000 stars on GitHub, and have achieved and maintained a Core Infrastructure Initiative Best Practices Gold Badge.
  • Adding specific language to clarify the benefits for projects hosted in every stage
  • Elaborating on the Archive Stage projects to eliminate ambiguities 
  • Adding information on the Annual Review of projects. This annual review will include an assessment of whether projects in the Sandbox and Incubation stages are making adequate progress towards the Graduation stage, and whether projects in the Graduation stage are maintaining positive growth and adoption.
  • General edits for the purpose of clarity

Dr. Ibrahim Haddad, Executive Director of LF AI & Data, said: “It’s been great to witness and experience the growth in LF AI & Data’s hosted projects as we’ve added 15 new projects in 2020. With this intensive experience came a lot of learned lessons. The LF AI & Data community took these learnings and used them to update our Project Lifecycle Document introducing a new project stage – Sandbox – and raising the bar for the admission into other stages. I am looking forward to welcoming new projects in our updated stages and continuing the growth of our community.”

Dr. Jim Spohrer, Chair of the Technical Advisory Council in LF AI & Data said: “I am very excited about the sandbox stage that will help us engage with and provide visibility for early-stage open source AI and Data community projects. Having such a stage supported in LF AI & Data is a true best practice for steadily growing foundations.” 

New projects joining the LF AI & Data Foundation will be required to follow the process and requirements outlined in the updated project lifecycle document.

If you are interested in hosting your open source AI or Data project with the LF AI & Data Foundation, please review the project lifecycle document and reach out to us. We're eager to help and to discuss such possibilities with you.

For further reading, please visit these pages:

All-In Open Source: Why I Quit Tech Giant and Found My OSS Startup

Author: Han Xiao, Founder & CEO of Jina AI. Former board member of the LF AI Foundation.

In Feb. 2020, I left Tencent AI and founded my startup Jina AI. Jina AI is a neural search company that provides cloud-native neural search solutions powered by AI and deep learning. On April 28th, 2020, we released our core product "Jina" in open source. You can use Jina for searching anything: image-to-image, video-to-video, tweet-to-tweet, audio-to-audio, code-to-code, etc. To explain our ambition at Jina AI, I often describe Jina with two analogies.

  • A “TensorFlow” for search. TensorFlow, PyTorch, MXNet, and MindSpore are universal frameworks for deep learning. You can use them to recognize cats vs. dogs, or to play Go and DOTA. They are powerful and versatile but not optimized for a specific domain. At Jina, we focus on one domain only: search. We build on top of universal deep learning frameworks to provide infrastructure for any AI-powered search application.
  • A design pattern. There are design patterns for every era: from functional programming to object-oriented programming. The same goes for search systems. Thirty years ago, it all started with a simple textbox. Many design patterns have been proposed for implementing the search system behind this textbox, some of which have been incredibly successful commercially. In the era of neural search, a query can go beyond a few keywords: an image, a video, a code snippet, or an audio file. Because traditional symbolic search systems cannot effectively handle these data formats, people need a new design pattern for building neural search systems. That's what Jina is: a new design pattern for this new era.

Who set me on this path?

I’ve been working in the field of AI, especially in open-source AI, for some time. You may have heard of or used my previous work on Fashion-MNIST and bert-as-service. From 2018 to early 2020, I was an Engineering Lead at Tencent AI Lab, where I led a team building the search infrastructure of China’s everyday app: WeChat.

In 2019, I represented Tencent as a board member of the LF AI Foundation. It was during this year that I learned how a professional open-source initiative works. Besides reviewing proposals of high-quality open source projects, I actively engaged in meetings of the Governing Board, Technical Advisory Council, Outreach Committee, and Trusted AI Committee, providing input to this global community. I co-organized multiple offline events, including LF AI Day Shanghai and a Christmas gathering. I helped foster an open tech culture and expand LF AI's influence within the company. By the end of 2019, Tencent had a seat in each subcommittee and was among the most engaged corporate members of the Foundation.

Two things I learned during my work at the LF AI Foundation:

  • Open source = Open source code + Open governance. Community is the key.
  • Open source AI infrastructure is the future, and I need to act now.

I'm sure many share the same vision I do. But my belief was so strong that it drove me to jump out of the tech giant and build Jina AI as a startup from scratch. Challenging as it is, this is an opportunity I cannot miss, and this is the future I believe in. Everyone on my team shares this belief as strongly as I do. At Jina AI, we only do what we believe in. I always tell my team: the people who actually make change are the ones who believe change is possible.

Challenges of an OSS company

Running an open-source software (OSS) company takes courage, an open mindset, and a strong belief.

As an OSS company, when you first show the codebase to the world, you need courage. The code quality is now a symbol of the company. Are you following best practices? Are you accruing tech debt here and there? Open source is an excellent touchstone to help you understand and improve the quality of your software engineering and development procedures.

Embracing the community is vital for an OSS company, and it requires an open mindset. Doing open source is not the same as doing a press release or a spotlight speech: it is not one-way communication. You need to walk into the community, talk to people, solve their issues, answer their questions, and accept their criticisms. You need to manage your ego and do trivial things such as maintenance and housekeeping.

Some people may think that big tech companies hold a better position when committing to open source because they can leverage better resources. That is not true. No matter how big the company is, each has its comfort zone built over the years. For many tech companies, open source is a new game: the value it brings is often not quantifiable through short-term KPIs/OKRs, and the rules of play are not familiar to everyone. Not every decision-maker in the company believes in it. It's like a person who has been playing Go for years, with a high rank, and enjoys it. One day you just show up and tell this person: hey, let's play mahjong, mahjong is fun! And you expect them to say "sure"? Regardless of the company's size, it is always important to make everyone inside the company believe in the value of open source. After all, it is always individuals who get things done.

Best time for AI engineering

For engineers who want to do open source in AI, this is the best time. Thanks to deep learning frameworks and off-the-shelf pre-trained models, there are many opportunities in the end-to-end application market for individuals to make significant contributions. Ask your colleagues or friends which AI package they use for daily tasks such as machine translation, image enhancement, data compression, or code completion, and you will get different answers from person to person. That is often an indicator that the market is still uncontested, with ample opportunity for growth and for building a community around it.

One thing I like to remind AI open-source developers about is the sustainability of the project. With new AI algorithms popping up every day, how do you keep up the pace? What is the scope of your project? How do you maintain the project when facing community requests? When I was developing bert-as-service, I received many requests to extend it to ALBERT, DistilBERT, BioBERT, etc. I prioritized those that fit my roadmap. Sometimes this means hard feelings for some people. But let's be frank: you can't solve every issue, not by yourself. That is not how open source works, and certainly not how you should work. The biggest risk to open-source software is that the core developers behind it burn out. The best open source project may not be the shiniest, but it is the one that lives the longest. So keep your enthusiasm and stay for the long run!

Doing open-source is doing a startup

In the end, doing an open source project is like doing a startup; technical advantage is only part of the story.

Uploading the code to GitHub is just the starting point; there are also tasks such as operations, branding, and community management to consider. Like entrepreneurship, you need to paint a "pie" that encapsulates the passions and dreams of the community. You need determination and a precise target so that you don't get sidetracked by community issues.

As someone with a Machine Learning Ph.D., I've never believed that some black-magic algorithm would be the competitive advantage of an open-source project. Instead, I'm convinced that sound engineering, attention to detail, a slick user experience, and a community-driven governance model ultimately determine user retention.

The most important thing is often your understanding and belief in open source. If you are an idealist, then you will inspire those idealists to march with you. If you’re a detail-oriented person, every little feature in your project will be worshipped by those who care about the details. If you are a warm-hearted person, then the community you build up will appreciate your selfless giving.

Whichever kind of person you are, what you believe about open source is what makes open source what it is.

Jina AI Key Links

LF AI Hosted Projects Cross Collaboration: Angel and Acumos

Guest Author(s): LF AI Graduated Projects, Angel and Acumos

The goal of the LF AI Foundation (LF AI) is to accelerate and sustain the growth of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) open source projects. Backed by many of the world's largest technology leaders, LF AI is a neutral space for harmonization and ecosystem engagement to advance AI, ML, and DL innovation. Projects are hosted in one of two stages: graduation and incubation. At the time of publishing this blog post, LF AI hosts three graduation-level projects (Acumos, Angel, and ONNX) and eight incubation-level projects (Adlik, Elastic Deep Learning, Horovod, Marquez, Milvus, NNStreamer, Pyro, and sparklyr).

The incubation stage is designated for new or early-stage projects that are aligned with the LF AI mission and require help to foster adoption and contribution in order to sustain and grow the project. Incubation projects may receive mentorship from the LF AI Technical Advisory Council (TAC) and are expected to actively develop their community of contributors, governance, project documentation, and other variables that factor into broad success and adoption.

Incubation projects are eligible to graduate when they meet a certain number of criteria demonstrating significant growth of contributors and adopters, commitment to open governance, achieving and maintaining a CII best practices badge, and establishing collaboration with other LF AI hosted projects. Getting to this stage requires work, perseverance, and tangible signs of progress.

The graduation stage, on the other hand, signals projects that have achieved significant growth of contributors and adopters, are important to the ecosystem, and are eligible for foundational financial support.

Angel Project

Angel joined LF AI as an incubation project in August 2018. It is a high-performance distributed machine learning platform based on the Parameter Server philosophy. It is tuned for high performance, offers a wide range of applicability and stability, and demonstrates an increasing advantage in handling higher-dimensional models. The Angel Project has been proactively collaborating with the Acumos Project community, producing positive outcomes for both communities.

In its effort to move to graduation, the Angel Project community looked at the full range of LF AI hosted projects and chose Acumos for integration.

Why Acumos?

Within the AI open source community, cross-project collaboration is essential. The Angel platform focuses on training models with machine learning algorithms, but it doesn't host a public model marketplace. Acumos, on the other hand, supports an AI marketplace that empowers data scientists to publish adaptive AI models while shielding them from the need to custom-develop fully integrated solutions.

This makes Angel and Acumos a perfect match: after integration, the two work like a factory and a distributor, creating a synergy effect. The Angel team believed that integration with Acumos would encourage and facilitate algorithm sharing by Angel users and therefore benefit the overall community.

In the following sections, we will explore some of the challenges the projects faced during the process and how integration was achieved.

Integration Challenges

Challenge A: There was no reference for on-boarding a Java-based model to an Acumos marketplace dominated by Python models. This challenge was solved with the assistance of Acumos experts from AT&T, Tech Mahindra, and Orange, who provided clear guidance and instructions covering jar package access, configuration, and Java model preparation.

Challenge B: Finding a deployed, internet-accessible environment. Huawei kindly offered access to Acumos environments set up on its public cloud in Hong Kong. However, the uploading process wasn't all smooth sailing, as several attempts failed due to unsuccessful generation of artifacts. The problem was later solved with help from AT&T and Huawei by restarting Nexus and cleaning the disk to address an insufficient-storage issue.

What Was Achieved?

The successful integration of Angel and Acumos demonstrated that Angel's Java-based models could be on-boarded to a marketplace dominated by Python projects.

At the same time, connecting Angel and Acumos in both API invocation and production deployment allows more developers to use the Angel framework to train domain-specific algorithms and share their work with people around the world. Acumos also became a stronger platform by adding more frameworks and users.

Cross-project collaboration played a key role in Angel's graduation, as it proved that the project was an open system that could be connected with other projects. Only by demonstrating the capability of linking both upstream and downstream components in a production data pipeline can a project be deemed a member of the global machine learning community, rather than an isolated system.

The collaboration between Angel and Acumos sets an example for other incubation-level projects hosted by LF AI. The foundation hopes that more projects will follow in the footsteps of Angel and Acumos so that, with collective effort, sustainable development of a harmonized community can be achieved soon.

Next Steps

To encourage further collaboration, Angel plans to invite a globally diverse set of users to publish their models to Acumos. In parallel, Angel will look at opportunities to integrate with other components, such as the MLflow framework, a web portal and monitoring system, support for more model file formats, etc.

To learn more about these two LF AI hosted projects, and to view all projects, visit the LF AI Projects page. If you would like to learn more about hosting a project in LF AI and the benefits, click here.

LF AI Resources

Enabling the Open Source AI Native Ecosystem with an AI Specific Computing Framework

Guest Author: Zhipeng Huang, Principal Engineer, Huawei Technologies, and Huawei's Representative on the LF AI Foundation Technical Advisory Council

Meet MindSpore: Huawei’s Open Source AI Computing Framework 

We are very excited to announce that Huawei is open sourcing MindSpore, an AI computing framework. MindSpore was developed by Huawei with the goal of implementing on-demand collaboration across the cloud-edge-device. It provides unified APIs and end-to-end AI capabilities for model development, execution, and deployment in all scenarios.

Using a distributed architecture (Figure 1), MindSpore leverages a native automatically differentiable programming paradigm and new AI native execution modes to achieve better resource efficiency, security, and trustworthiness. Meanwhile, MindSpore makes full use of the computing power of Ascend AI processors and lowers the entry requirements of industry AI development, bringing inclusive AI faster to reality.

Figure 1: MindSpore High Level Architecture

MindSpore is designed to provide a friendly development experience and efficient execution for data scientists and algorithm engineers, native support for Ascend AI processors, and software-hardware co-optimization.

Our goal in open sourcing MindSpore is to provide the global open source AI community with a computing framework that will further advance the development and enrichment of the AI software/hardware application ecosystem.

Building an AI Native Programming Ecosystem with an Emphasis on Interoperability

With the recent development of the Pyro project (an incubation project of the LF AI Foundation), Julia, and MindSpore, it has become evident that AI-native programming is the next trend in deep learning framework development. Gone are the days when mathematical libraries were simply added to existing engineering toolsets; data scientists will increasingly use their familiar toolset with more engineering capability added. AI developers should be able to write models in mathematical form without a steep software engineering learning curve.

To build the new AI-native programming ecosystem, interoperability is a critical issue to solve. At the northbound layer (Figure 2, red blocks), beyond IR, interoperability for things like crypto, the type system, and metadata also needs to be addressed. At the southbound layer (Figure 2, purple blocks), in addition to supporting heterogeneous computing hardware, storage interoperability should also be considered.

Figure 2: Interoperability Proposal to be discussed in LF AI’s Technical Advisory Council

The MindSpore community will work with the LF AI Foundation community, and more specifically with the Technical Advisory Council through its ML Workflow effort, to address interoperability issues. We also plan to engage with the ONNX community (ONNX is a graduated project in the LF AI Foundation) to make sure that, by exporting ONNX models, developers can use MindSpore in more scenarios.

Working with Kubeflow

MindSpore also utilizes the cloud native ecosystem for deployment and management. With the recent Kubeflow 1.0 release and the upcoming Kubernetes 1.18 release, we can experiment with the latest cloud native computing technology for agile MLOps.

Figure 3: MindSpore and the Cloud Native Ecosystem

To take advantage of the prowess of Kubeflow and Kubernetes, the first thing we did was write an operator for MindSpore (called ms-operator) and define a MindSpore CRD (Custom Resource Definition). The current version of ms-operator is based on early versions of the PyTorch Operator and TF Operator.

The implementation of ms-operator contains the specification and implementation of the MSJob custom resource definition. We will demonstrate a walkthrough of building the ms-operator image and creating a simple MSJob on Kubernetes with the MindSpore `0.1.0-alpha` image. The MindSpore community is still working on implementing distributed training on different backends so that, in the near future, users can create and manage MSJobs like other built-in resources on Kubernetes.
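As a rough sketch, an MSJob is described by a Kubernetes-style manifest. The structure below is a hypothetical illustration: the `apiVersion` group/version string and the spec layout are placeholders modeled on typical training-job operators, not the documented ms-operator schema:

```python
import json

def msjob_manifest(name, image, replicas=1):
    # Hypothetical MSJob custom resource; the group/version and spec fields
    # are placeholders, since the early ms-operator schema may differ.
    return {
        "apiVersion": "mindspore.io/v1",  # placeholder group/version
        "kind": "MSJob",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "template": {
                "spec": {
                    "containers": [
                        {"name": "mindspore", "image": image}
                    ],
                    "restartPolicy": "OnFailure",
                }
            },
        },
    }

job = msjob_manifest("msjob-demo", "mindspore/mindspore:0.1.0-alpha")
print(json.dumps(job, indent=2))
# With the official Kubernetes Python client, such a resource would be created
# via kubernetes.client.CustomObjectsApi().create_namespaced_custom_object(...).
```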

The MindSpore community is striving to collaborate with the Kubeflow community and to make ms-operator more complete, well-organized, and up-to-date. All these components make it easy for machine learning engineers and data scientists to leverage cloud assets (public or on-premises) for machine learning workloads.

Horovod Version 0.19.0 Now Available!

Horovod, an LF AI Foundation Incubation Project, has released version 0.19.0 and we’re thrilled to see the results of their hard work. Horovod is a distributed deep learning framework that improves the speed, scale, and resource utilization of deep learning training.

In version 0.19.0, Horovod adds tighter integration with Apache Spark, including a new high-level Horovod Spark Estimator framework and support for accelerator-aware task-level scheduling in the upcoming Spark 3.0 release. With Horovod Spark Estimators, you can train your deep neural network directly on your existing Spark DataFrame, leveraging Horovod’s ability to scale to hundreds of workers in parallel without any specialized code for distributed training. This enables deep learning frameworks to integrate seamlessly with ETL jobs, allowing for more streamlined production jobs, with faster iteration between feature engineering and model training. 

This release also contains experimental new features, including a join operation for PyTorch and the ability to launch Horovod jobs programmatically from environments like notebooks using a new interactive run mode.

With the new join operation, users no longer need to worry about how evenly their dataset divides when training. Just add a join step at the end of each epoch, and Horovod will train on any extra batches without causing the waiting workers to deadlock.

Using Horovod's new interactive mode, users can launch distributed training jobs in a single line of Python: define the distributed training function, execute it with multiple parallel processes, then get the results back as a Python list of objects. This new API mirrors horovod.spark but can run on any nodes you would normally use with horovodrun.

Full release notes for Horovod version 0.19.0 are available here. Curious about how Horovod can make your model training faster and more scalable? Check out these new updates and try out the framework. And be sure to join the Horovod Announce and Horovod Technical-Discuss mailing lists to join the community and stay connected on the latest updates.

Congratulations to the Horovod team and we look forward to continued growth and success as part of the LF AI Foundation! To learn about hosting an open source project with us, visit the LF AI Foundation website here.

Horovod Key Links

LF AI Resources

LF AI Foundation Announces Graduation of Angel Project

Distributed machine learning platform has evolved into a full stack machine learning platform, ready for large scale deployment

SAN FRANCISCO – December 19, 2019 – The LF AI Foundation, the organization building an ecosystem to sustain open source innovation in artificial intelligence (AI), machine learning (ML) and deep learning (DL), is announcing today that hosted project Angel is moving from an Incubation to a Graduation Level Project. This graduation is the result of Angel demonstrating thriving adoption, an ongoing flow of contributions from multiple organizations, and a documented and structured open governance process. Angel has achieved a Core Infrastructure Initiative Best Practices Badge, and demonstrated a strong commitment to community.

Angel is a distributed machine learning platform based on the parameter server architecture. It was open sourced by Tencent, the project founder, in July 2017 and joined LF AI as an Incubation Project in August 2018. The initial focus of Angel was sparse data and big model training. However, Angel now includes feature engineering, model training, hyper-parameter tuning, and model serving, and has evolved into a full-stack machine learning platform.

“With Angel, we’ve seen impressive speed in adding new features and rollout in large corporations at scale. With the 3.0 release of Angel, we have witnessed excellent progress in features, adoption and contributions in a short period of time,” said Dr. Ibrahim Haddad, Executive Director of the LF AI Foundation. “This is a big step forward signaling to the market a maturing open source technology ready for large scale deployment. Congratulations, Angel!”

More than 100 companies or institutions use Angel in products or inside the firewall. The extensive list of implementations includes well-known names like Weibo, Huawei, Xiaomi, Baidu, DiDi, and many more. 

“We are excited to move from Incubation to Graduate Level Project in LF AI, and we see that as just another important milestone in the process, not the end goal. We need to continue to push both technically and with community outreach, to increase momentum, adoption and encourage additional contributions. We will continue to aim for lofty goals,” said Fitz Wang, Senior Researcher at Tencent, Angel Technical Project Lead. “We will be deeply involved in LF AI events in 2020 and present at several events under the LF AI booth. If you’d like to contribute to Angel, please reach out to us via our mailing lists and visit the LF AI booth at any of the LF events.”

Feature Roadmap for 2020

  • Version 3.2 – Graph Computing, adding more algorithms
    • Traditional graph algorithms: Closeness, HyperANF, more
    • Graph Embedding algorithms: Node2Vec, DeepWalk
    • Graph neural network: GraphSAGE
  • Version 3.3 – Federated Learning

Angel Project Resources

LF AI Resources

About LF AI Foundation

The LF AI Foundation, a Linux Foundation project, accelerates and sustains the growth of Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) open source projects. Backed by many of the world’s largest technology leaders, LF AI is a neutral space for harmonization and ecosystem engagement to advance AI, ML and DL innovation. To get involved with the LF AI Foundation, please visit

About Linux Foundation 

Founded in 2000, the Linux Foundation is supported by more than 1,000 members and is the world’s leading home for collaboration on open source software, open standards, open data, and open hardware. Linux Foundation projects like Linux, Kubernetes, Node.js and more are considered critical to the development of the world’s most important infrastructure. Its development methodology leverages established best practices and addresses the needs of contributors, users and solution providers to create sustainable models for open collaboration. For more information, please visit us at

# # #

LF AI Foundation Welcomes ZILLIZ as Premier Member

By Blog

LF AI continues fast pace of membership and project portfolio growth

GPU hardware-accelerated Analytics Platform for Massive-Scale Geospatial and Temporal Data

SAN FRANCISCO – December 17, 2019 – The LF AI Foundation, the organization building an ecosystem to sustain open source innovation in artificial intelligence (AI), machine learning (ML) and deep learning (DL), is announcing today that ZILLIZ has joined the Foundation as a Premier member.

ZILLIZ was founded in 2016 and is headquartered in Shanghai. With the vision of “Reinvent Data Science”, ZILLIZ focuses on developing open source data science software that leverages new-generation heterogeneous computing technologies. Milvus, a high-performance vector search engine for deep learning applications recently open sourced by ZILLIZ, is gathering momentum in the open source AI community.
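To illustrate the core operation a vector search engine performs (a hypothetical brute-force sketch, not the Milvus API): given a set of embedding vectors produced by a deep learning model, find the entries most similar to a query vector. Engines like Milvus accelerate exactly this with approximate-nearest-neighbor indexes and heterogeneous hardware.

```python
import numpy as np

def top_k_similar(index, query, k=3):
    """Brute-force cosine-similarity search over an embedding matrix;
    returns the k most similar row ids and their scores."""
    index_norm = index / np.linalg.norm(index, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = index_norm @ q            # cosine similarity to every row
    top = np.argsort(scores)[::-1][:k] # highest scores first
    return top, scores[top]

rng = np.random.default_rng(42)
embeddings = rng.normal(size=(1000, 128))            # e.g. image/text embeddings
query = embeddings[7] + 0.01 * rng.normal(size=128)  # near-duplicate of item 7

ids, scores = top_k_similar(embeddings, query)
print(ids[0])  # item 7 is the closest match
```

Brute force is O(n·d) per query; at the scales Milvus targets, the same lookup is served from a prebuilt index so queries stay fast as the collection grows.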

“We are pushing forward a globalization strategy that fully incorporates global open source communities. We believe open development leads to greater implementation and greater good for all,” said ZILLIZ Founder and CEO Charles Xie. “We think the most critical data challenges today are processing unstructured data which are explosively growing. And even for structured data, we also need new approaches while 5G/IoT applications are gaining dominance in the next decade. We believe open source and open collaboration will foster more innovations to address these challenges.”

“As a pioneer of data science software embracing heterogeneous hardware, ZILLIZ is enabling enterprises to transform unstructured data from digital content into data assets, which is essential in building high quality AI systems and services,” said Dr. Ibrahim Haddad, Executive Director of the LF AI Foundation. “We are pleased to welcome ZILLIZ as a Premier member of LF AI and excited to support their contributions to the open source AI community, including their Milvus project.”

LF AI Project Portfolio Growth

2019 has been a growth year for LF AI, with the foundation quickly adding to its portfolio of hosted projects. LF AI currently hosts the following projects: Acumos, Angel, Elastic Deep Learning, Horovod, Pyro, Adlik, and ONNX. Two more projects will join in December and will be announced at a later date.

To learn more about hosting a project in LF AI and the benefits, please visit and explore the “Projects” main menu item.

A full list of the LF AI hosted projects is available here:

LF AI Resources

About LF AI Foundation

The LF AI Foundation, a Linux Foundation project, accelerates and sustains the growth of Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) open source projects. Backed by many of the world’s largest technology leaders, LF AI is a neutral space for harmonization and ecosystem engagement to advance AI, ML and DL innovation. To get involved with the LF AI Foundation, please visit

About Linux Foundation 

Founded in 2000, the Linux Foundation is supported by more than 1,000 members and is the world’s leading home for collaboration on open source software, open standards, open data, and open hardware. Linux Foundation projects like Linux, Kubernetes, Node.js and more are considered critical to the development of the world’s most important infrastructure. Its development methodology leverages established best practices and addresses the needs of contributors, users and solution providers to create sustainable models for open collaboration. For more information, please visit us at

# # #

Thank you! LF AI Day Shanghai Summary

By Blog

Organizer: LF AI Foundation
Co-organizers: Huawei, Tencent, Baidu, Alibaba, DiDi, WeBank, Tesra
Sponsors: Huawei, Tencent

From Jessica Kim, LF AI Outreach Committee Chairperson: “With China’s first commercial deployment of 5G, the real era of intelligence has arrived, but many technical issues still need to be explored and solved in practice, and people from different industries and different technical fields need to work together.

On September 17th, 2019, at the first LF AI Day in China, held at the Huawei Research Institute in Shanghai, senior technical experts from Huawei, Tencent, Baidu, Alibaba Cloud, DiDi, Tesra and WeBank gathered to share AI applications and practices. Online live-streaming viewership exceeded 1,500 viewers. For those who missed the full-day live event, Huawei’s editorial team prepared this summary so readers can revisit the highlights of the day!”

LF AI Receives Best Contribution Award from Chinese Association for Artificial Intelligence (CAAI)

By Blog

LF AI is pleased to receive the Best Contribution Award from the Chinese Association for Artificial Intelligence (CAAI). Communities are built on contribution, which gives this award special significance.

Thank you!

CAAI is devoted to academic activities in science and technology in the People’s Republic of China. It has 40 branches covering the fields of science and smart technology, and it is the only state-level science and technology organization in the field of artificial intelligence under the Ministry of Civil Affairs.

LF AI co-organized three successful national AI conferences with CAAI in 2019. We look forward to deeper involvement in 2020, both in projects with CAAI members and in collaboration on events.

Pictures from GAITC 2019, a CAAI event