Add documentation for all PDF tools

alam00000
2026-03-20 21:48:48 +05:30
parent 1d986aedb1
commit b4a2c98497
125 changed files with 13507 additions and 5422 deletions

docs/tools/pdf-to-json.md Normal file

@@ -0,0 +1,45 @@
---
title: PDF to JSON
description: Convert PDF files to structured JSON format using CPDF. Exports PDF internal structure as machine-readable JSON data.
---
# PDF to JSON
Converts one or more PDF files into JSON format. The tool uses CPDF (Coherent PDF) to serialize the internal structure of each PDF into a machine-readable JSON representation. The output is always delivered as a ZIP archive.
## How It Works
1. Select one or more PDF files using the file picker.
2. Click **Convert** to start processing. The conversion runs in a background web worker.
3. A ZIP archive downloads containing one JSON file for each input PDF.

The status bar shows progress through the conversion and ZIP creation stages.
## Options
This tool has no configurable options. Each PDF is converted to its full JSON representation.
## Output Format
- A `pdfs-to-json.zip` file containing `filename.json` for each input PDF.

The JSON output represents the PDF's internal object structure, including page trees, font references, metadata, and content streams.
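
Once the archive is downloaded, its entries can be read without unpacking it to disk. The following is a minimal Python sketch, not part of the tool itself. The helper name `load_json_entries` is hypothetical, and the demo builds a small stand-in archive in memory with placeholder data rather than real CPDF output. In practice you would pass the path to your downloaded `pdfs-to-json.zip`.

```python
import io
import json
import zipfile


def load_json_entries(archive):
    """Return {entry name: parsed JSON} for every .json member of a ZIP.

    `archive` may be a filesystem path or a binary file-like object.
    """
    results = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".json"):
                with zf.open(name) as fh:
                    results[name] = json.load(fh)
    return results


# Self-contained demo: build a tiny stand-in archive in memory.
# The payload is placeholder data, not real output from the tool.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("report.json", json.dumps({"placeholder": True}))
buffer.seek(0)

entries = load_json_entries(buffer)
print(sorted(entries))
```

Note that `json.load` parses each entry fully into memory, which may be slow for very large files.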
## Use Cases
- Inspecting the internal structure of a PDF for debugging or validation.
- Extracting structured metadata and page properties programmatically.
- Feeding PDF structure data into custom processing pipelines.
- Auditing PDF documents for compliance by analyzing their object trees.
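
For the programmatic use cases above, a layout-agnostic walk over the parsed JSON is often enough to get started. The sketch below is an illustration under assumptions: the helper names `count_keys` and `find_values` are hypothetical, and the sample data only imitates what a PDF object structure might look like (for example, keys such as `/Type`). Check a real output file before relying on any particular key or nesting.

```python
from collections import Counter


def count_keys(node, counts=None):
    """Recursively count how often each dictionary key appears in parsed JSON."""
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        for key, value in node.items():
            counts[key] += 1
            count_keys(value, counts)
    elif isinstance(node, list):
        for item in node:
            count_keys(item, counts)
    return counts


def find_values(node, wanted_key):
    """Yield every value stored under `wanted_key`, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == wanted_key:
                yield value
            yield from find_values(value, wanted_key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, wanted_key)


# Hypothetical sample; real output from the tool will look different.
sample = [
    [1, {"/Type": "/Catalog", "/Pages": 2}],
    [2, {"/Type": "/Pages", "/Kids": [3], "/Count": 1}],
    [3, {"/Type": "/Page", "/Parent": 2}],
]

print(count_keys(sample)["/Type"])
print(list(find_values(sample, "/Type")))
```

Because neither helper assumes a fixed schema, the same code can be pointed at any entry loaded from the archive.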
## Tips
- This tool outputs the raw PDF object structure, not extracted text content. If you need page text in JSON format, use [Prepare PDF for AI](./prepare-pdf-for-ai) instead.
- For plain text extraction, [PDF to Text](./pdf-to-text) is more straightforward.
- Large PDFs with complex structures can produce very large JSON files.
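
Since the JSON files can be large, it may help to check their uncompressed sizes before extracting or parsing anything. This is a small Python sketch using the standard `zipfile` module. The helper name `list_entry_sizes` is hypothetical, and the demo archive is generated in memory with placeholder content.

```python
import io
import zipfile


def list_entry_sizes(archive):
    """Return (name, uncompressed size in bytes) pairs, largest first."""
    with zipfile.ZipFile(archive) as zf:
        sizes = [(info.filename, info.file_size) for info in zf.infolist()]
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)


# Self-contained demo with placeholder entries of different sizes.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("small.json", "{}")
    zf.writestr("large.json", "[" + ", ".join(["0"] * 1000) + "]")
buffer.seek(0)

for name, size in list_entry_sizes(buffer):
    print(f"{name}: {size} bytes")
```

The sizes come from the ZIP central directory, so no entry is decompressed during the check.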
## Related Tools
- [PDF to Text](./pdf-to-text)
- [Prepare PDF for AI](./prepare-pdf-for-ai)
- [Extract Tables](./extract-tables)