opendataloader-project / opendataloader-pdf
OpenDataLoader PDF is a high-performance, open-source parser that converts PDF documents into structured formats such as Markdown, JSON, and HTML for AI and RAG pipelines. Its hybrid processing mode combines deterministic local parsing with AI-driven analysis to achieve high extraction accuracy on complex tables, formulas, and scanned documents. The project also provides automated accessibility tooling, including end-to-end generation of Tagged PDFs that comply with international accessibility standards.