From 58c78b09d2c85b2ff2225f8e6bd1efbb059084a3 Mon Sep 17 00:00:00 2001 From: alam00000 Date: Fri, 13 Mar 2026 23:32:52 +0530 Subject: [PATCH] docs: update self-hosting documentation with secure context requirements for Office file conversion --- README.md | 3 +++ docs/getting-started.md | 25 +++++++++++++++++-------- docs/self-hosting/apache.md | 2 ++ docs/self-hosting/docker.md | 4 +++- docs/self-hosting/index.md | 5 +++++ docs/self-hosting/nginx.md | 2 +- tasks/lessons.md | 13 +++++++++++++ 7 files changed, 44 insertions(+), 10 deletions(-) create mode 100644 tasks/lessons.md diff --git a/README.md b/README.md index 2948dda..0928f93 100644 --- a/README.md +++ b/README.md @@ -356,6 +356,9 @@ It is very straightforward to host your own instance of BentoPDF using a static Since BentoPDF is fully client-side, all processing happens in the user's browser and no server-side processing is required. This means you can host BentoPDF as simple static files on any web server or hosting platform. +> [!IMPORTANT] +> Office file conversion uses LibreOffice WASM, which requires `SharedArrayBuffer`. That means the app must be both cross-origin isolated and served from a secure context. `http://localhost` works for local testing, but `http://192.168.x.x` or other LAN IPs usually require HTTPS even if the server already sends the correct COOP/COEP headers. + **Download from Releases (Recommended):** The easiest way to self-host is to download the pre-built distribution file from our [GitHub releases](https://github.com/alam00000/bentopdf/releases). Each release includes a `dist-{version}.zip` file that contains all necessary files for self-hosting. 
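The secure-context rule in the README note above (plain HTTP is fine on loopback, while LAN IPs usually need HTTPS) can be sketched as a small check. This is only an approximation of how browsers classify origins, not BentoPDF code: `isLikelySecureOrigin` is a hypothetical helper, it handles only `http`/`https` URLs without userinfo, and it parses with a regex to stay dependency-free (in a browser, `new URL(...)` would be the more robust choice).

```typescript
// Hypothetical helper, not part of BentoPDF. It approximates whether a page
// served from `origin` would count as a secure context in a browser.
function isLikelySecureOrigin(origin: string): boolean {
  // Capture the scheme and the host (bracketed IPv6 literal or plain hostname).
  const match = /^(https?):\/\/(\[[^\]]+\]|[^\/:?#]+)/i.exec(origin.trim());
  if (match === null) return false; // not an http(s) URL

  const scheme = (match[1] ?? "").toLowerCase();
  const host = (match[2] ?? "").toLowerCase();

  // Assumption: a top-level page served over HTTPS is treated as secure.
  if (scheme === "https") return true;

  // Plain HTTP only qualifies on loopback hosts.
  return (
    host === "localhost" ||
    host.endsWith(".localhost") ||
    host === "[::1]" ||
    /^127(\.\d{1,3}){3}$/.test(host)
  );
}
```

For example, `isLikelySecureOrigin("http://localhost:3000")` returns `true`, while `isLikelySecureOrigin("http://192.168.1.20:3000")` returns `false`. The authoritative check is still the browser itself: `window.isSecureContext` and `window.crossOriginIsolated` on the loaded page.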
diff --git a/docs/getting-started.md b/docs/getting-started.md index 3235ee9..1ac33fc 100644 --- a/docs/getting-started.md +++ b/docs/getting-started.md @@ -14,6 +14,15 @@ Visit [bentopdf.com](https://bentopdf.com) to use BentoPDF instantly—no instal ### Option 2: Self-Host with Docker +> [!IMPORTANT] +> Office file conversion requires `SharedArrayBuffer`, which needs all of the following: +> +> - `Cross-Origin-Opener-Policy: same-origin` +> - `Cross-Origin-Embedder-Policy: require-corp` +> - a secure context +> +> `http://localhost` works for local testing because browsers treat loopback as trustworthy. Browsers usually do not extend that trust to `http://192.168.x.x` or other LAN IPs, so Word/Excel/PowerPoint conversions will require HTTPS when accessed from other devices on your network. + ```bash # Pull and run the Docker image docker run -d -p 3000:8080 ghcr.io/alam00000/bentopdf:latest @@ -41,14 +50,14 @@ npm run dev ## Features at a Glance -| Category | Tools | -|----------|-------| -| **Convert to PDF** | Word, Excel, PowerPoint, Images, Markdown, EPUB, MOBI, and more | -| **Convert from PDF** | JPG, PNG, Text, Excel, SVG, and more | -| **Edit & Annotate** | Sign, Highlight, Redact, Fill Forms, Add Stamps | -| **Organize** | Merge, Split, Rotate, Delete Pages, Reorder | -| **Optimize** | Compress, Repair, Flatten, OCR | -| **Security** | Encrypt, Decrypt, Remove Restrictions | +| Category | Tools | +| -------------------- | --------------------------------------------------------------- | +| **Convert to PDF** | Word, Excel, PowerPoint, Images, Markdown, EPUB, MOBI, and more | +| **Convert from PDF** | JPG, PNG, Text, Excel, SVG, and more | +| **Edit & Annotate** | Sign, Highlight, Redact, Fill Forms, Add Stamps | +| **Organize** | Merge, Split, Rotate, Delete Pages, Reorder | +| **Optimize** | Compress, Repair, Flatten, OCR | +| **Security** | Encrypt, Decrypt, Remove Restrictions | ## Browser Support diff --git a/docs/self-hosting/apache.md b/docs/self-hosting/apache.md index bc9f000..129c0ac 100644 
--- a/docs/self-hosting/apache.md +++ b/docs/self-hosting/apache.md @@ -224,6 +224,8 @@ Header always set Cross-Origin-Embedder-Policy "require-corp" Header always set Cross-Origin-Opener-Policy "same-origin" ``` +It also needs a secure context. `http://localhost` works for local testing, but `http://192.168.x.x` or other LAN IPs usually require HTTPS. If the headers are present but `window.crossOriginIsolated` is still `false`, check whether the page is being opened over plain HTTP on a non-loopback origin. + The pre-compressed `.wasm.gz` and `.data.gz` files also need correct `Content-Encoding`: ```apache diff --git a/docs/self-hosting/docker.md b/docs/self-hosting/docker.md index 4e9c1ac..97123d7 100644 --- a/docs/self-hosting/docker.md +++ b/docs/self-hosting/docker.md @@ -10,7 +10,9 @@ The easiest way to self-host BentoPDF in a production environment. > - `Cross-Origin-Opener-Policy: same-origin` > - `Cross-Origin-Embedder-Policy: require-corp` > -> The official container images include these headers. If using a reverse proxy (Traefik, Caddy, etc.), ensure these headers are preserved or added. +> The page must also be served from a secure context. `http://localhost` works for local testing, but `http://192.168.x.x` or other LAN IPs usually do not qualify, so Office conversion over plain HTTP will fail even if the headers are present. +> +> The official container images include these headers. If using a reverse proxy (Traefik, Caddy, etc.), ensure these headers are preserved or added, and use HTTPS for non-loopback access. > [!TIP] > **Podman Users:** All `docker` commands work with Podman by replacing `docker` with `podman` and `docker-compose` with `podman-compose`. diff --git a/docs/self-hosting/index.md b/docs/self-hosting/index.md index d1ac849..4149905 100644 --- a/docs/self-hosting/index.md +++ b/docs/self-hosting/index.md @@ -6,6 +6,11 @@ BentoPDF can be self-hosted on your own infrastructure. 
This guide covers variou The fastest way to self-host BentoPDF: +> [!IMPORTANT] +> Office file conversion requires `SharedArrayBuffer`, which means the app must be both cross-origin isolated and served from a secure context. The official image already sends the required COOP/COEP headers, but browsers still disable `SharedArrayBuffer` on plain HTTP local-network origins such as `http://192.168.x.x`. +> +> Use `http://localhost` only for same-device testing. If users access BentoPDF through a LAN IP or hostname, serve it over HTTPS. + ```bash # Docker docker run -d -p 3000:8080 ghcr.io/alam00000/bentopdf:latest diff --git a/docs/self-hosting/nginx.md b/docs/self-hosting/nginx.md index 79c666b..a0382b9 100644 --- a/docs/self-hosting/nginx.md +++ b/docs/self-hosting/nginx.md @@ -176,7 +176,7 @@ types { ### Word/ODT/Excel to PDF Not Working -LibreOffice WASM requires `SharedArrayBuffer`, which needs `Cross-Origin-Embedder-Policy` and `Cross-Origin-Opener-Policy` headers. Note that nginx `add_header` directives in a `location` block **override** server-level `add_header` directives — they don't merge. Every `location` block with its own `add_header` must include the COEP/COOP headers. +LibreOffice WASM requires `SharedArrayBuffer`, which needs `Cross-Origin-Embedder-Policy` and `Cross-Origin-Opener-Policy` headers. It also needs a secure context, so `http://localhost` works for local testing but `http://192.168.x.x` or other LAN IPs usually require HTTPS. Note that nginx `add_header` directives in a `location` block **override** server-level `add_header` directives — they don't merge. Every `location` block with its own `add_header` must include the COEP/COOP headers. 
Verify with: diff --git a/tasks/lessons.md b/tasks/lessons.md new file mode 100644 index 0000000..2fd09f5 --- /dev/null +++ b/tasks/lessons.md @@ -0,0 +1,13 @@ +- Compare tool overlay regressions: check shared CSS before changing page logic; global canvas positioning rules can hide rendered PDF content while leaving highlight layers visible. +- When hardening code after a type-safety follow-up, never leave empty `catch {}` blocks in the touched path. Either guard the risky call up front or catch the error into a variable and handle it intentionally with a safe fallback, warning, or typed default. +- When adding or refining form creator models, keep reusable types and interfaces in `src/js/types` instead of defining them inline in logic files or tests. Logic modules should import shared types rather than owning them. +- Shared app types must be exported from `src/js/types/index.ts` and imported through the `@/types` alias. Do not import shared types from individual type files or relative `types/index` paths in feature code or tests. +- After a user correction about type safety, capture the lesson immediately: never leave `any` in a bug fix if the surrounding library types can be modeled or narrowed. Prefer extracting the logic into a typed helper and adding regression tests for the corrected path. +- pdf.js assigns document-scoped font name prefixes (`g_d0_`, `g_d1_`, ...) per loaded document. Always normalize these before comparing font names across documents to avoid false positive style changes. +- LibreOffice PDF conversion support can depend on explicit import filters. Before concluding `soffice` or LibreOffice WASM cannot convert `pdf -> docx/pptx`, test the exact filtered path such as `--infilter=writer_pdf_import` or `--infilter=impress_pdf_import` and inspect wrapper code for hardcoded capability gates. +- LibreOfficeKit `documentLoadWithOptions()` is not the same as CLI `--infilter`. 
In the current WASM/LOK build, the options string is forwarded as `FilterOptions`, not media-descriptor `FilterName`, so passing `FilterName=writer_pdf_import` from JS does not force PDF to load as Writer or Impress. +- When a consolidated patch contains unrelated features (e.g. abort API + PDF filter fix), a compile failure in one breaks the whole build. Never bundle unrelated features in one patch — split by concern. If already bundled, surgically remove the broken feature from the single patch rather than layering a second "undo" patch on top. +- Never add new C++ APIs (enums, functions) to a WASM build patch without confirming the header declarations are visible at compile time in all translation units that use them. The `OperationType` enum was declared in `lok.hxx` but the `.cxx` failed because of include ordering or missing header propagation. +- When removing hunks from a unified diff patch: (1) update hunk line counts in `@@` headers, (2) remove entire file sections if no +/- lines remain, (3) verify the old-side line numbers still match the actual source (removing earlier hunks shifts offsets), (4) verify the `-` lines match the actual source text character-for-character (e.g. `xInteraction` vs `uno::Reference(pInteraction)`). +- ALWAYS check the existing API before rebuilding WASM or forking libraries. The matbee libreoffice-converter already had `inputFilter` support in ConversionOptions, browser.worker.ts, and buildLoadOptions(). The entire WASM rebuild was unnecessary and replacing browser.worker.global.js with a fork build caused a DeploymentException that broke all LibreOffice conversions. +- Never replace compiled vendor assets (browser.worker.global.js, soffice.wasm.gz, etc.) unless absolutely necessary. These are tightly coupled and a mismatched worker JS + WASM binary causes initialization failures.
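The pdf.js font-prefix lesson in the list above could be applied with a helper along these lines. This is a minimal sketch: `normalizeFontName` and `sameFont` are hypothetical names, and the only detail taken from the note is the `g_d0_`, `g_d1_`, ... prefix shape. It strips the prefix and nothing else; whether the remaining suffix identifies the same font across documents depends on how the names were produced.

```typescript
// Hypothetical helpers, not existing project code.
// Matches the document-scoped prefix described in the lesson: g_d0_, g_d1_, ...
const DOC_SCOPED_FONT_PREFIX = /^g_d\d+_/;

// Remove the per-document prefix so names from different documents are comparable.
function normalizeFontName(name: string): string {
  return name.replace(DOC_SCOPED_FONT_PREFIX, "");
}

// Compare two font names while ignoring which loaded document they came from.
function sameFont(a: string, b: string): boolean {
  return normalizeFontName(a) === normalizeFontName(b);
}
```

With this, `sameFont("g_d0_f1", "g_d1_f1")` is `true`, so a comparison no longer reports a style change just because the two names came from different loaded documents. Names without the prefix pass through unchanged.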