Projects

1chipML

1chipML is an open source library for basic numerical crunching and machine learning for microcontrollers.

Learn More

Acumos AI

Acumos AI is a platform and open source framework that makes it easy to build, share, and deploy AI apps.

Learn More

Adlik

Adlik is a toolkit for accelerating deep learning inference. Its goal is to speed up the deep learning inference process in both cloud and embedded environments.

Learn More

Adversarial Robustness Toolbox

Adversarial Robustness Toolbox (ART) provides tools that enable developers and researchers to evaluate, defend, certify and verify machine learning models and applications against adversarial threats.

Learn More

AI Explainability 360

AI Explainability 360 is an open source toolkit that can help users better understand the ways that machine learning models predict labels using a wide variety of techniques throughout the AI application lifecycle.

Learn More

AI Fairness 360

AI Fairness 360 is an extensible open source toolkit that can help users understand and mitigate bias in machine learning models throughout the AI application lifecycle.

Learn More

Amundsen

Amundsen is a data discovery and metadata engine for improving the productivity of data analysts, data scientists and engineers when interacting with data.

Learn More

Angel ML

The Angel Project is a high-performance distributed machine learning platform based on Parameter Server, running on YARN and Apache Spark.

Learn More

Artigraph

Artigraph is a tool to improve the authorship, management, and quality of data.

Learn More

BeyondML

BeyondML is a framework for developing sparse neural networks that can perform multiple tasks across multiple data domains.

Learn More

BI & AI

The goal of this committee is to integrate the power of AI and BI into CI (Cognitive Intelligence) by combining the speed that machines provide (AI) with the direction intuited by human insight (BI).

Learn More

CLAIMED

CLAIMED (Component Library for AI, Machine Learning, ETL and Data Science) is a runtime and programming language agnostic Data & AI component framework.

Learn More

DataOps Committee

The DataOps Committee in LF AI & Data is a global group of participants from various geographies focused on DataOps.

Learn More

DataPractices

DataPractices is a “Manifesto for Data Practices,” composed of values and principles that illustrate the most effective, modern, and ethical approach to data teamwork.

Learn More

Datashim

Datashim enables and accelerates data access for Kubernetes/OpenShift workloads in a transparent and declarative way.

Learn More

DeepRec

DeepRec is a high-performance recommendation deep learning framework based on TensorFlow 1.15, Intel-TensorFlow and NVIDIA-TensorFlow.

Learn More

Delta

DELTA is a deep learning based end-to-end natural language and speech processing platform.

Learn More

DocArray

DocArray is a library for nested, unstructured, multimodal data in transit.

Learn More

Egeria

Egeria is the world’s first open source metadata standard. It provides open APIs, event formats, types and integration logic so organizations can share data management and governance across the entire enterprise without reformatting or restricting the data to a single format, platform, or vendor product.

Learn More

Egeria Conformance

To ensure both consistency and alignment with the standards driven by Egeria, the Egeria Conformance program is available for vendors to showcase how they are shipping Egeria as part of their offering.

Learn More

Elastic Deep Learning

EDL is an Elastic Deep Learning framework designed to help deep learning cloud service providers build cluster cloud services using deep learning frameworks such as PaddlePaddle and TensorFlow.

Learn More

Elyra

Elyra is an open-source low code / no code framework for creating reproducible, scalable and component based data science pipelines.

Learn More

FATE

FATE (Federated AI Technology Enabler) is the world's first industrial grade federated learning open source framework to enable enterprises and institutions to collaborate on data while protecting data security and privacy.

Learn More

Feast

Feast is an open source feature store for machine learning. It was developed as a collaboration between Gojek and Google in 2018.

Learn More

Feathr

Feathr is an enterprise-grade, high-performance feature store.

Learn More

FlagAI

FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models.

Learn More

Flyte

Flyte is a production-grade, declarative, structured and highly scalable cloud-native workflow orchestration platform.

Learn More

ForestFlow

ForestFlow is a scalable policy-based cloud-native machine learning model server.

Learn More

Horovod

Horovod makes it easy to take a single-GPU TensorFlow program and train it faster on many GPUs. Horovod also achieves significantly improved GPU resource usage.

Learn More

JanusGraph

JanusGraph is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster.

Learn More

Kedro

Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code.

Learn More

Kompute

Kompute is a general purpose GPU compute framework for cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). It is blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing use cases.

Learn More

KServe

KServe provides a Kubernetes Custom Resource Definition for serving machine learning (ML) models on arbitrary frameworks.

Learn More

LakeSoul

LakeSoul is a cloud-native Lakehouse framework developed by the DMetaSoul team. It supports scalable metadata management, ACID transactions, efficient and flexible upsert operations, schema evolution, and unified streaming & batch processing.

Learn More

Ludwig

Ludwig is an open-source, declarative machine learning framework that makes it easy to define deep learning pipelines with a simple and flexible data-driven configuration system.

Learn More

Machine Learning eXchange

Machine Learning eXchange (MLX) is a Data and AI Assets Catalog and Execution Engine.

Learn More

Marquez

Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem’s metadata.

Learn More

Milvus

Milvus is an open-source vector database that is highly flexible, reliable, and blazing fast.

Learn More

ML Security Committee

The ML Security committee is a global group that advances, showcases and explores challenges and solutions concerning the security of machine learning tooling, systems and use-cases.

Learn More

MLOps Committee

The LF AI & Data Foundation MLOps Committee helps related projects gain more recognition and adoption through cooperation within a passionate community of members.

Learn More

NNStreamer

NNStreamer is a set of GStreamer plugins that make it easier and more efficient for GStreamer developers to adopt neural network models and for neural network developers to manage neural network pipelines and their filters.

Learn More

ONNX

ONNX is an open format to represent deep learning models.

Learn More

OpenBytes

OpenBytes aims to facilitate wider sharing of, and collaboration with, data in the AI community through the promotion of data standards and formats and enabling contributions of data.

Learn More

OpenDataology

OpenDataology is an open source dataset license compliance analysis project.

Learn More

OpenDS4All

ODPi’s OpenDS4All enables the creation of educational Data Science programs.

Learn More

OpenFL

OpenFL is a Python 3 library for federated learning that enables organizations to collaboratively train a model without sharing sensitive information.

Learn More

OpenLineage

OpenLineage proposes an open standard and API for lineage collection.

Learn More

Pyro

Pyro is a universal probabilistic programming language (PPL) written in Python and supported by PyTorch on the backend.

Learn More

RosaeNLG

RosaeNLG is an open source, template-based Natural Language Generation (NLG) project that automates the production of relatively repetitive texts from structured input data and textual templates, run by an NLG engine.

Learn More

SOAJS

SOAJS is an open source microservices and API management platform.

Learn More

Sparklyr

Sparklyr is an open-source and modern interface to scale data science and machine learning workflows using Apache Spark™, R, and a rich extension ecosystem.

Learn More

Substra

Substra is a framework offering distributed orchestration of machine learning tasks among partners while guaranteeing secure and trustless traceability of all operations.

Learn More

TonY

TonY is a framework to natively run deep learning jobs on Apache Hadoop.

Learn More

Trusted AI

The LF AI & Data Trusted AI Committee is a global group working on policies, guidelines, tools and industry use cases to ensure that the development of trustworthy AI systems, and the processes used to develop them, continue to improve over time.

Learn More

Xtreme1

Xtreme1 is the next generation open source platform for multi-sensory training data.

Learn More
