bentopdf/docs/getting-started.md
alam00000 77da6d7a7d feat: integrate Tesseract.js with improved language availability and font handling
- Refactored OCR page recognition to utilize a configured Tesseract worker.
- Added functions to manage font URLs and asset filenames based on language.
- Implemented language availability checks and error handling for unsupported languages.
- Enhanced PDF workflow to display available OCR languages and handle user selections.
- Introduced utility functions for resolving Tesseract asset configurations.
- Added tests for OCR functionality, font loading, and Tesseract runtime behavior.
- Updated global types to include environment variables for Tesseract and font configurations.
2026-03-14 15:50:30 +05:30


# Getting Started
Welcome to BentoPDF! This guide will help you get up and running quickly.
## What is BentoPDF?
BentoPDF is a free, open-source, privacy-first PDF toolkit that runs **entirely in your browser**. Your files never leave your device—all processing happens locally using WebAssembly (WASM) technology.
## Quick Start
### Option 1: Use the Hosted Version
Visit [bentopdf.com](https://bentopdf.com) to use BentoPDF instantly—no installation required.
### Option 2: Self-Host with Docker
> [!IMPORTANT]
> Office file conversion requires `SharedArrayBuffer`, which needs all of the following:
>
> - `Cross-Origin-Opener-Policy: same-origin`
> - `Cross-Origin-Embedder-Policy: require-corp`
> - a secure context
>
> `http://localhost` works for local testing because browsers treat loopback addresses as trustworthy. Plain-HTTP LAN addresses such as `http://192.168.x.x` are generally not treated as secure contexts, so Word/Excel/PowerPoint conversion requires HTTPS when BentoPDF is accessed from other devices on your network.
```bash
# Pull and run the Docker image
docker run -d -p 3000:8080 ghcr.io/alam00000/bentopdf:latest
# Or use Docker Compose
curl -O https://raw.githubusercontent.com/alam00000/bentopdf/main/docker-compose.yml
docker compose up -d
```
Then open `http://localhost:3000` in your browser.
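
If Office conversion fails on a self-hosted instance, one quick thing to check is whether the server is sending the two cross-origin isolation headers listed above. The following is a minimal sketch, not part of BentoPDF: `check_isolation_headers` is a hypothetical helper that reads HTTP response headers on standard input and reports which of the two headers is missing. Whether your deployment sends them depends on its web server or reverse proxy configuration.

```bash
# Hypothetical helper (not shipped with BentoPDF): reads HTTP response headers
# on stdin and checks for the two cross-origin isolation headers.
check_isolation_headers() {
  # Strip CRs from HTTP line endings and lowercase everything so that
  # matching is case-insensitive.
  headers=$(tr -d '\r' | tr '[:upper:]' '[:lower:]')
  status=0
  if ! printf '%s\n' "$headers" | grep -Eq '^cross-origin-opener-policy:[[:space:]]*same-origin[[:space:]]*$'; then
    echo "missing: Cross-Origin-Opener-Policy: same-origin"
    status=1
  fi
  if ! printf '%s\n' "$headers" | grep -Eq '^cross-origin-embedder-policy:[[:space:]]*require-corp[[:space:]]*$'; then
    echo "missing: Cross-Origin-Embedder-Policy: require-corp"
    status=1
  fi
  if [ "$status" -eq 0 ]; then
    echo "cross-origin isolation headers present"
  fi
  return "$status"
}
```

Assuming the container is published on port 3000 as in the `docker run` command above, you could pipe the response headers into it with `curl -sI http://localhost:3000 | check_isolation_headers`. This only inspects headers. The secure-context requirement (HTTPS or loopback) still has to be met separately.
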
> [!NOTE]
> If you are preparing an air-gapped OCR deployment, you must host the OCR text-layer fonts internally in addition to the Tesseract worker, core runtime, and traineddata files. The full setup is documented in [Self-Hosting](/self-hosting/), including `VITE_OCR_FONT_BASE_URL` and the bundled `ocr-fonts/` directory.
### Option 3: Build from Source
```bash
# Clone the repository
git clone https://github.com/alam00000/bentopdf.git
cd bentopdf
# Install dependencies
npm install
# Start development server
npm run dev
```
## Features at a Glance
| Category | Tools |
| -------------------- | --------------------------------------------------------------- |
| **Convert to PDF** | Word, Excel, PowerPoint, Images, Markdown, EPUB, MOBI, and more |
| **Convert from PDF** | JPG, PNG, Text, Excel, SVG, and more |
| **Edit & Annotate** | Sign, Highlight, Redact, Fill Forms, Add Stamps |
| **Organize** | Merge, Split, Rotate, Delete Pages, Reorder |
| **Optimize** | Compress, Repair, Flatten, OCR |
| **Security** | Encrypt, Decrypt, Remove Restrictions |
## Browser Support
BentoPDF works best on modern browsers:
- ✅ Chrome/Edge 90+
- ✅ Firefox 90+
- ✅ Safari 15+
## Next Steps
- [Explore all tools](/tools/)
- [Self-host BentoPDF](/self-hosting/)
- [Contribute to the project](/contributing)