Projects
The LF AI & Data Foundation supports open source projects within artificial intelligence and the data space.

1chipML
1chipML is an open source library for basic numerical crunching and machine learning for microcontrollers.
Acumos AI
Acumos AI is a platform and open source framework that makes it easy to build, share, and deploy AI apps.
Adlik
Adlik is a toolkit for accelerating deep learning inference. Its goal is to accelerate the deep learning inference process in both cloud and embedded environments.
Adversarial Robustness Toolbox
Adversarial Robustness Toolbox (ART) provides tools that enable developers and researchers to evaluate, defend, certify and verify machine learning models and applications against adversarial threats.
AI Explainability 360
AI Explainability 360 is an open source toolkit that can help users better understand the ways that machine learning models predict labels using a wide variety of techniques throughout the AI application lifecycle.
AI Fairness 360
AI Fairness 360 is an extensible open source toolkit that can help users understand and mitigate bias in machine learning models throughout the AI application lifecycle.
Amundsen
Amundsen is a data discovery and metadata engine for improving the productivity of data analysts, data scientists and engineers when interacting with data.
Angel ML
The Angel Project is a high-performance distributed machine learning platform based on Parameter Server, running on YARN and Apache Spark.
BeyondML
BeyondML is a framework for developing sparse neural networks that can perform multiple tasks across multiple data domains.
BI & AI
The goal of this committee is to integrate the power of AI and BI into CI (Cognitive Intelligence) by combining the speed that machines provide (AI) with the direction intuited by human insight (BI).
CLAIMED
CLAIMED (Component Library for AI, Machine Learning, ETL and Data Science) is a runtime and programming language agnostic Data & AI component framework.
DataOps Committee
The DataOps Committee in LF AI & Data is a global group that consists of participants from various geographies focused on DataOps.
DataPractices
DataPractices is a “Manifesto for Data Practices,” comprised of values and principles to illustrate the most effective, modern, and ethical approach to data teamwork.
Datashim
Datashim enables and accelerates data access for Kubernetes/OpenShift workloads in a transparent and declarative way.
DeepRec
DeepRec is a high-performance recommendation deep learning framework based on TensorFlow 1.15, Intel-TensorFlow and NVIDIA-TensorFlow.
Delta
DELTA is a deep learning based end-to-end natural language and speech processing platform.
Egeria
Egeria is the world’s first open source metadata standard. It provides open APIs, event formats, types and integration logic so organizations can share data management and governance across the entire enterprise without reformatting or restricting the data to a single format, platform, or vendor product.
Egeria Conformance
To ensure both consistency and alignment with the standards driven by Egeria, the Egeria Conformance program is available for vendors to showcase how they are shipping Egeria as part of their offering.
Elastic Deep Learning
EDL is an Elastic Deep Learning framework designed to help deep learning cloud service providers build cluster cloud services using deep learning frameworks such as PaddlePaddle and TensorFlow.
Elyra
Elyra is an open-source low code / no code framework for creating reproducible, scalable and component based data science pipelines.
FATE
FATE (Federated AI Technology Enabler) is the world's first industrial grade federated learning open source framework to enable enterprises and institutions to collaborate on data while protecting data security and privacy.
Feast
Feast is an open source feature store for machine learning. It was developed as a collaboration between Gojek and Google in 2018.
FlagAI
FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models.
Flyte
Flyte is a production-grade, declarative, structured and highly scalable cloud-native workflow orchestration platform.
ForestFlow
ForestFlow is a scalable policy-based cloud-native machine learning model server.
Horovod
Horovod makes it easy to take a single-GPU TensorFlow program and successfully train it on many GPUs faster. Horovod also achieved significantly improved GPU resource usage figures.
JanusGraph
JanusGraph is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster.
Kedro
Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code.
Kompute
Kompute is a general purpose GPU compute framework for cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). It is blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing use cases.
KServe
KServe provides a Kubernetes Custom Resource Definition for serving machine learning (ML) models on arbitrary frameworks.
LakeSoul
LakeSoul is a cloud-native Lakehouse framework developed by the DMetaSoul team. It supports scalable metadata management, ACID transactions, efficient and flexible upsert operations, schema evolution, and unified streaming and batch processing.
Ludwig
Ludwig is an open-source, declarative machine learning framework that makes it easy to define deep learning pipelines with a simple and flexible data-driven configuration system.
Machine Learning eXchange
Machine Learning eXchange (MLX) is a Data and AI Assets Catalog and Execution Engine.
Marquez
Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem’s metadata.
Milvus
Milvus is an open-source vector database that is highly flexible, reliable, and blazing fast.
ML Security Committee
The ML Security committee is a global group that advances, showcases and explores challenges and solutions concerning the security of machine learning tooling, systems and use-cases.
MLOps Committee
The LF AI & Data Foundation MLOps Committee helps related projects gain more recognition and adoption through cooperation by a passionate community of members.
NNStreamer
NNStreamer is a set of GStreamer plugins that make it easier and more efficient for GStreamer developers to adopt neural network models and for neural network developers to manage neural network pipelines and their filters.
OpenBytes
OpenBytes aims to facilitate wider sharing of, and collaboration with, data in the AI community through the promotion of data standards and formats and enabling contributions of data.
OpenFL
OpenFL is a Python 3 library for federated learning that enables organizations to collaboratively train a model without sharing sensitive information.
Pyro
Pyro is a universal probabilistic programming language (PPL) written in Python and supported by PyTorch on the backend.
RosaeNLG
RosaeNLG is an open source, template-based Natural Language Generation (NLG) project that automates the production of relatively repetitive texts from structured input data and textual templates, run by an NLG engine.
Sparklyr
Sparklyr is an open-source and modern interface to scale data science and machine learning workflows using Apache Spark™, R, and a rich extension ecosystem.
Substra
Substra is a framework offering distributed orchestration of machine learning tasks among partners while guaranteeing secure and trustless traceability of all operations.

Trusted AI
The LF AI & Data Trusted AI Committee is a global group working on policies, guidelines, tools and industry use cases to ensure that trustworthy AI systems, and the processes used to develop them, continue to improve over time.
Xtreme1
Xtreme1 is the next generation open source platform for multi-sensory training data.