Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem’s metadata.

Marquez maintains the provenance of how datasets are consumed and produced, provides global visibility into job runtimes and the frequency of dataset access, centralizes dataset lifecycle management, and much more. It enables highly flexible data lineage queries across all datasets while reliably and efficiently associating upstream and downstream dependencies between jobs and the datasets they produce and consume. Marquez is a modular system, designed as a highly scalable, highly extensible, platform-agnostic solution for metadata management. Its data model emphasizes immutability and timely processing of datasets.
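
To make the idea of upstream and downstream lineage concrete, the Python sketch below models jobs and datasets as a small in-memory graph and walks it in both directions. It is only an illustration of the concept, not Marquez's implementation or API: the `LineageGraph` class, its `record_job`, `upstream`, and `downstream` methods, and the job and dataset names are all hypothetical.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Hypothetical in-memory lineage store (not part of Marquez)."""

    def __init__(self):
        self._inputs = defaultdict(set)     # job -> datasets it consumes
        self._outputs = defaultdict(set)    # job -> datasets it produces
        self._producers = defaultdict(set)  # dataset -> jobs that produce it
        self._consumers = defaultdict(set)  # dataset -> jobs that consume it

    def record_job(self, job, inputs, outputs):
        """Register which datasets a job reads and which it writes."""
        for dataset in inputs:
            self._inputs[job].add(dataset)
            self._consumers[dataset].add(job)
        for dataset in outputs:
            self._outputs[job].add(dataset)
            self._producers[dataset].add(job)

    def _walk(self, start, jobs_of, datasets_of):
        """Breadth-first walk: dataset -> jobs -> datasets, repeated."""
        seen = set()
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for job in jobs_of.get(current, ()):
                for neighbor in datasets_of.get(job, ()):
                    if neighbor != start and neighbor not in seen:
                        seen.add(neighbor)
                        queue.append(neighbor)
        return seen

    def upstream(self, dataset):
        """All datasets that `dataset` transitively depends on."""
        return self._walk(dataset, self._producers, self._inputs)

    def downstream(self, dataset):
        """All datasets transitively derived from `dataset`."""
        return self._walk(dataset, self._consumers, self._outputs)


# Hypothetical jobs and datasets, for illustration only.
graph = LineageGraph()
graph.record_job("clean_orders", inputs=["raw.orders"], outputs=["staging.orders"])
graph.record_job(
    "daily_revenue",
    inputs=["staging.orders", "staging.customers"],
    outputs=["reports.revenue"],
)

print(sorted(graph.upstream("reports.revenue")))
print(sorted(graph.downstream("raw.orders")))
```

Keeping both directions indexed (producers and consumers per dataset) is one simple way to answer either kind of query without scanning every job; a real metadata service would persist and version this information rather than hold it in memory.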

Marquez is an incubation-stage project of the LF AI & Data Foundation.

Contributed by: WeWork in December 2019