Extract PDF content as LlamaIndex-compatible JSON documents. Perfect for RAG pipelines, LangChain, and other LLM frameworks.
Click to select one or more PDF files, or drag and drop them.
Your files never leave your device.
Output Format:
Each PDF will be extracted as a JSON file containing an array of LlamaIndex Document objects with:
text - Extracted text content per page
metadata - Page number, headings, and document info
extra_info - Additional context for RAG systems
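
As a rough illustration of consuming this output, the sketch below parses one extracted JSON file into simple Python objects using only the standard library. It assumes each array element is an object with the keys text, metadata, and extra_info as listed above. The ExtractedPage class, the parse_documents helper, and the sample keys page_number and source are hypothetical names chosen for this example, not part of the tool's documented schema.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ExtractedPage:
    """Hypothetical container mirroring one object in the exported array."""
    text: str
    metadata: dict[str, Any] = field(default_factory=dict)
    extra_info: dict[str, Any] = field(default_factory=dict)


def parse_documents(raw_json: str) -> list[ExtractedPage]:
    """Parse a JSON array of document objects into ExtractedPage items."""
    records = json.loads(raw_json)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of document objects")
    pages = []
    for record in records:
        pages.append(
            ExtractedPage(
                text=record.get("text", ""),
                # Missing or null fields fall back to empty dicts.
                metadata=record.get("metadata") or {},
                extra_info=record.get("extra_info") or {},
            )
        )
    return pages


# Sample data invented for this example; real exports may use other keys.
sample = json.dumps([
    {"text": "Introduction", "metadata": {"page_number": 1}, "extra_info": {"source": "report.pdf"}},
    {"text": "Methods", "metadata": {"page_number": 2}},
])

pages = parse_documents(sample)
for page in pages:
    print(page.metadata.get("page_number"), page.text)
```

If LlamaIndex is installed, each parsed item could then be passed to its Document class, though the exact constructor arguments depend on the installed version and should be checked against its documentation.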