OpenLineage Joins LF AI & Data as New Sandbox Project

July 22, 2021

LF AI & Data Foundation, the organization building an ecosystem to sustain open source innovation in artificial intelligence (AI) and data, is today announcing OpenLineage as its latest Sandbox Project.

Released and open sourced by Datakin, OpenLineage is an open standard for metadata and lineage collection designed to instrument jobs as they are running. It defines a generic model of run, job, and dataset entities identified using consistent naming strategies. The core lineage model is extensible by defining specific facets to enrich those entities.
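
To make the run, job, and dataset model more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the OpenLineage specification or client library: the class names, field names, the `build_example_event` helper, and the sample namespaces and facet contents are all hypothetical. It only shows the idea described above, where entities are identified by consistent names and enriched through facets.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List

# Hypothetical, simplified types for illustration only.
# They are NOT the OpenLineage spec or the official client API.

# A facet is a named bundle of extra metadata attached to an entity.
Facets = Dict[str, Dict[str, Any]]


@dataclass
class Dataset:
    namespace: str  # where the dataset lives, e.g. a database or bucket
    name: str       # consistent name of the dataset within that namespace
    facets: Facets = field(default_factory=dict)


@dataclass
class Job:
    namespace: str  # e.g. the scheduler or team that owns the job
    name: str       # consistent name of the job within that namespace
    facets: Facets = field(default_factory=dict)


@dataclass
class Run:
    run_id: str     # unique identifier for one execution of a job
    facets: Facets = field(default_factory=dict)


@dataclass
class LineageEvent:
    event_type: str  # e.g. "START" or "COMPLETE" (assumed labels)
    run: Run
    job: Job
    inputs: List[Dataset] = field(default_factory=list)
    outputs: List[Dataset] = field(default_factory=list)


def build_example_event() -> LineageEvent:
    """Build one event describing a job run that reads and writes a dataset."""
    orders = Dataset(
        namespace="analytics_db",
        name="public.orders",
        # An illustrative facet that enriches the dataset with its schema.
        facets={"schema": {"fields": [{"name": "order_id", "type": "BIGINT"}]}},
    )
    rollup = Dataset(namespace="analytics_db", name="public.daily_orders")
    return LineageEvent(
        event_type="COMPLETE",
        run=Run(run_id=str(uuid.uuid4())),
        job=Job(namespace="example_scheduler", name="daily_orders_rollup"),
        inputs=[orders],
        outputs=[rollup],
    )


if __name__ == "__main__":
    # Serialize the event so it could be sent to a metadata backend.
    print(json.dumps(asdict(build_example_event()), indent=2))
```

In this sketch the core model stays small, and anything specific to a tool or use case goes into a facet, so new metadata can be added without changing the core types.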

OpenLineage is a cross-industry effort involving contributors from major open source data projects, including the LF AI & Data projects Marquez, Amundsen, and Egeria. Without OpenLineage, projects have to instrument all jobs themselves, and integrations are external, which means they can break with new versions. With OpenLineage, the effort of integration is shared and the integration can be pushed into each project, meaning the user will not need to play catch-up.

[Figure: Before OpenLineage]

[Figure: With OpenLineage]

Dr. Ibrahim Haddad, Executive Director of LF AI & Data, said: “We are excited to welcome the OpenLineage project in LF AI & Data. The project addresses a critical component in governing AI and data projects and further expands the robustness of our portfolio of hosted technical projects. We look forward to working with the OpenLineage project to grow the project’s footprint in the ecosystem, expand its community of adopters and contributors, and to foster the creation of collaboration opportunities with our members and other related projects.”

Julien Le Dem, founder of OpenLineage, said: “Data lineage is a complicated and multidimensional problem; the best solution is to directly observe the movement of data through heterogeneous pipelines. That requires the kind of broad industry coordination that the Linux Foundation has become known for. We are proud for OpenLineage to become a LF AI & Data project, and look forward to an ongoing collaboration.”

LF AI & Data supports projects via a wide range of services, and the first step is joining as a Sandbox Project. Learn more about OpenLineage on its GitHub, and be sure to subscribe to the OpenLineage-Announce and OpenLineage-Technical-Discuss mailing lists to join the community and stay connected on the latest updates.

A warm welcome to OpenLineage! We look forward to the project’s continued growth and success as part of the LF AI & Data Foundation. To learn how to host an open source project with us, visit the LF AI & Data website.
