Unstract: No-code LLM Platform for Document Structuring

Unstract is a no-code platform enabling users to leverage Large Language Models (LLMs) for extracting and structuring data from unstructured documents. Easily build APIs and ETL pipelines without writing code.

Added on June 26, 2025

Stars: 5,393

Forks: 506

Language: Python

Project Introduction

Summary

This project is a no-code platform that uses Large Language Models to extract and structure information from unstructured documents. Users build data pipelines visually, then either deploy them as APIs or integrate them into existing ETL processes.

Problem Solved

Extracting structured data from unstructured documents such as PDFs, images, and text files has traditionally required extensive programming and custom parsers. This project replaces that work with a visual, no-code way to apply LLMs to data extraction.

Core Features

No-code Workflow Builder

Design and build complex document processing workflows using an intuitive drag-and-drop interface.

Flexible LLM Integration

Integrate seamlessly with various state-of-the-art Large Language Models to power your extraction tasks.

Automated API Generation

Instantly publish your configured extraction logic as a production-ready REST API endpoint.

ETL Pipeline Compatibility

Export extracted structured data directly into formats compatible with popular ETL pipelines.
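To illustrate what an ETL-compatible export could look like, here is a minimal Python sketch that serializes extracted records to JSON Lines and CSV, two formats most ETL tools can ingest. The function names and record fields are hypothetical and chosen for illustration; this is not Unstract's own export code.

```python
import csv
import io
import json


def to_json_lines(records: list[dict]) -> str:
    """Serialize each record as one JSON object per line (JSON Lines)."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)


def to_csv(records: list[dict], columns: list[str]) -> str:
    """Serialize records as CSV with a fixed column order.

    Keys not listed in `columns` are ignored; missing keys become empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical extracted records, for illustration only.
records = [
    {"invoice_number": "INV-1", "amount": "25.50"},
    {"invoice_number": "INV-2", "amount": "7.00", "note": "rush order"},
]
csv_text = to_csv(records, ["invoice_number", "amount"])
jsonl_text = to_json_lines(records)
```

Fixing the column list up front keeps the CSV schema stable even when individual documents yield extra or missing fields.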

Tech Stack

Python

FastAPI

React

PostgreSQL

Docker

Kubernetes

Use Cases

The flexibility of Unstract allows it to be applied across numerous industries and functions where extracting structured data from unstructured documents is a critical task.

Scenario 1: Finance - Automated Invoice Processing

Details

Automatically extract key fields such as vendor name, invoice number, date, amount, and line items from diverse invoice formats (PDFs, scans).
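The fields listed above can be sketched as a typed record. The following Python example is a minimal sketch, assuming the extraction step returns a plain dictionary; the class names, key names, and the consistency check are all hypothetical and do not describe Unstract's actual output schema.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: Decimal

    @property
    def total(self) -> Decimal:
        return self.quantity * self.unit_price


@dataclass
class Invoice:
    vendor_name: str
    invoice_number: str
    date: str  # kept as a string, e.g. "2025-06-26"
    amount: Decimal
    line_items: list[LineItem] = field(default_factory=list)

    def line_items_match_amount(self) -> bool:
        """Check that the line-item totals add up to the invoice amount."""
        items_total = sum((item.total for item in self.line_items), Decimal("0"))
        return items_total == self.amount


def invoice_from_dict(raw: dict) -> Invoice:
    """Build an Invoice from a raw dictionary (key names are assumptions)."""
    items = [
        LineItem(
            description=item["description"],
            quantity=int(item["quantity"]),
            unit_price=Decimal(str(item["unit_price"])),
        )
        for item in raw.get("line_items", [])
    ]
    return Invoice(
        vendor_name=raw["vendor_name"],
        invoice_number=raw["invoice_number"],
        date=raw["date"],
        amount=Decimal(str(raw["amount"])),
        line_items=items,
    )


# Hypothetical extraction result, for illustration only.
raw = {
    "vendor_name": "Acme",
    "invoice_number": "INV-1",
    "date": "2025-06-26",
    "amount": "25.50",
    "line_items": [
        {"description": "Widget", "quantity": 2, "unit_price": "10.00"},
        {"description": "Shipping", "quantity": 1, "unit_price": "5.50"},
    ],
}
invoice = invoice_from_dict(raw)
```

Using `Decimal` rather than `float` avoids rounding surprises when comparing monetary totals, and a check like `line_items_match_amount` gives a cheap way to flag extractions that may need manual review.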

User Value

Significantly reduces manual data entry, accelerates accounts payable cycles, and improves data accuracy.

Scenario 2: Human Resources - Resume Parsing and Screening

Details

Parse resumes and CVs to extract candidate information including contact details, work experience, education, and skills into a structured format for applicant tracking systems.

User Value

Streamlines the recruitment process, enabling faster candidate screening and database building.

Scenario 3: Legal/Compliance - Contract Data Extraction

Details

Extract relevant clauses, dates, parties, and terms from legal contracts, agreements, or compliance documents for analysis and management.

User Value

Improves efficiency in contract review, facilitates compliance audits, and enables better contract lifecycle management.
