Apache Gravitino: Powerful Open Data Catalog for Federated Metadata Lake

Apache Gravitino is an open data catalog for building a high-performance, geo-distributed, federated metadata lake. It aims to provide a unified view and management layer over diverse data sources.

Added on June 15, 2025

Stars: 1,632

Forks: 495

Language: Java

Project Introduction

Summary

This project is an open data catalog designed to build a high-performance, geo-distributed, and federated metadata lake. It simplifies metadata management and discovery across diverse, distributed data systems.

Problem Solved

Managing metadata across many distributed, heterogeneous data sources is complex and leads to data silos, poor discoverability, and fragmented governance. The project addresses this with a unified, performant, and scalable metadata lake.

Core Features

Federated Metadata Catalog

Allows integration and cataloging of metadata from various data systems (databases, data lakes, etc.) into a single federated view.

High Performance & Geo-Distribution

Designed for high throughput and low latency access to metadata, supporting large-scale deployments across geographical locations.

Data Discovery and Governance

Provides robust capabilities for data discovery, search, and governance across the federated metadata lake.
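
To make the federated view and discovery features above more concrete, here is a minimal in-memory sketch in Python. It is not Gravitino's API: the class names (TableMeta, FederatedCatalog), the catalog names (hive_prod, pg_ops), and the search behavior are all hypothetical and only illustrate the idea of registering metadata from several systems under one namespace and searching across it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TableMeta:
    """Metadata for one table, independent of the system that stores it."""
    catalog: str              # hypothetical source catalog, e.g. "hive_prod"
    schema: str
    table: str
    columns: tuple[str, ...]

    @property
    def full_name(self) -> str:
        # One naming scheme for every source: catalog.schema.table
        return f"{self.catalog}.{self.schema}.{self.table}"


@dataclass
class FederatedCatalog:
    """Illustrative stand-in for a federated metadata view (assumption)."""
    _tables: dict[str, TableMeta] = field(default_factory=dict)

    def register(self, meta: TableMeta) -> None:
        # Add or overwrite the entry keyed by its fully qualified name.
        self._tables[meta.full_name] = meta

    def list_catalogs(self) -> list[str]:
        # Distinct source catalogs currently visible through the view.
        return sorted({m.catalog for m in self._tables.values()})

    def search(self, keyword: str) -> list[str]:
        # Case-insensitive substring match on table and column names.
        kw = keyword.lower()
        return sorted(
            name
            for name, m in self._tables.items()
            if kw in m.table.lower() or any(kw in c.lower() for c in m.columns)
        )


lake = FederatedCatalog()
lake.register(TableMeta("hive_prod", "sales", "orders", ("order_id", "customer_id")))
lake.register(TableMeta("pg_ops", "public", "customers", ("customer_id", "email")))

print(lake.list_catalogs())
print(lake.search("customer_id"))
```

In this sketch a single search call returns matches from both the data-lake catalog and the operational database, which is the kind of cross-source lookup a federated catalog is meant to enable. A real deployment would fetch this metadata from the underlying systems rather than hold it in a dictionary.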

Tech Stack

Java

Apache Spark

RESTful APIs

Distributed Systems Principles

Use Cases

The project's capabilities in building a federated metadata lake are applicable across various industries and scenarios requiring unified data visibility and management.

Unified Data Discovery and Access

Details

Create a single pane of glass for all data assets scattered across cloud object storage, data warehouses, and operational databases, improving data discoverability for analytics teams.

User Value

Reduces time spent searching for data, enables cross-source analytics, and breaks down data silos.

Enterprise Data Governance and Compliance

Details

Implement consistent data access policies, track data lineage, and manage data quality rules centrally across federated data sources to meet regulatory requirements.

User Value

Ensures compliance with data regulations (like GDPR, CCPA) and improves data trust through centralized governance.
