A high-performance, native Rust implementation of the Delta Lake protocol, offering robust data lake capabilities and convenient Python bindings for data engineers and analysts.
delta-rs is the official native Rust library for interacting with Delta Lake tables. It provides a high-performance, low-level implementation of the Delta Lake protocol, along with idiomatic Python bindings, making it ideal for performance-critical data tasks and integration into native applications.
Traditional Delta Lake implementations often rely on the JVM, which can introduce overhead and complexity. That makes it hard to access Delta Lake from environments that do not run the JVM or Spark, and hard to meet strict performance requirements. delta-rs provides a fast, native alternative that opens Delta Lake to more languages.
Interact with Delta Lake tables directly from high-performance Rust code, leveraging native execution speeds.
Seamlessly read and write Delta tables within Python data workflows, integrating with libraries like pandas, polars, and pyarrow.
Implements the Delta Lake protocol specification, so tables remain compatible with other Delta Lake implementations.
Efficiently read data from Delta tables, with support for filter predicates and column projection so that only the needed rows and columns are loaded.
Support for appending, overwriting, and other modifications to Delta tables.
delta-rs is suitable for various scenarios where high-performance, language-native interaction with Delta Lake is required, particularly outside of traditional Spark/JVM-based ecosystems.
Build high-speed data ingestion, transformation, or loading processes using Rust, leveraging its performance characteristics to read from and write to Delta tables efficiently.
Can reduce processing time and infrastructure costs for heavy data operations compared with pipelines written in interpreted languages.
Read Delta tables directly into Python data structures (like pandas DataFrames or Polars DataFrames) for local data cleaning, analysis, or machine learning model training without needing a distributed cluster.
Simplifies local development, testing, and analysis workflows by providing direct, fast access to Delta Lake data in Python.
Develop standalone applications, command-line tools, or microservices in Rust that need to read or write data to Delta Lake tables as part of their core functionality.
Enables the creation of new types of performant applications that interact directly with Delta Lake, expanding its use beyond traditional big data frameworks.