mineru-document-extractor
MinerU document extraction CLI that converts PDFs, images, and web pages into Markdown, HTML, LaTeX, or DOCX via the MinerU API. Supports token-free flash extraction for a quick start, precision extraction with table and formula recognition, web crawling, batch processing, and piped workflows.
Source: GitHub https://github.com/opendatalab/MinerU-Ecosystem