# Node.js Docker Server for Heavy Processing
The Node.js server offloads CPU/memory-intensive PDF operations (conversion, export, remediation) from the Cloudflare Worker to a Docker container running native Puppeteer. This eliminates CF Browser Rendering rate limits and provides more headroom for long-running jobs.
## Architecture
Browser → CF Worker → `nodeProxyMiddleware` → (heavy POST + healthy) → CF Tunnel → `10.1.1.3:8790` → Node server (same Hono routes)

- `storage.objects` → R2 via S3 API
- `storage.kv` → CF KV via REST API
- `storage.browser` → Native Puppeteer

The response streams back through the CF Worker.

Fallback: If the Node server is unhealthy or unreachable, the CF Worker runs the route itself using CF Browser Rendering. The circuit breaker in `nodeProxyMiddleware` opens after 2 consecutive failures, cooling down for 30 seconds before retrying.
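A minimal sketch of that circuit-breaker behavior, assuming a simple consecutive-failure counter (names are illustrative; the actual implementation lives in `workers/api/src/services/node-proxy.ts`):

```ts
// Illustrative circuit breaker: opens after 2 consecutive failures,
// cools down for 30 s, then lets one request through as a probe.
const FAILURE_THRESHOLD = 2;
const COOLDOWN_MS = 30_000;

let consecutiveFailures = 0;
let openedAt = 0;

export function isOpen(): boolean {
  if (consecutiveFailures < FAILURE_THRESHOLD) return false;
  if (Date.now() - openedAt >= COOLDOWN_MS) {
    consecutiveFailures = 0; // cooldown elapsed: probe the Node server again
    return false;
  }
  return true;
}

export function recordFailure(): void {
  consecutiveFailures += 1;
  if (consecutiveFailures === FAILURE_THRESHOLD) openedAt = Date.now();
}

export function recordSuccess(): void {
  consecutiveFailures = 0;
}
```

While the breaker is open, `nodeProxyMiddleware` skips the proxy and the route runs on the CF Worker directly.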
## Proxied Routes
Only heavy POST requests are forwarded. Everything else stays on the CF Worker.
| Pattern | Description |
|---|---|
| `POST /api/convert/:fileId` | PDF-to-HTML conversion |
| `POST /api/export/:fileId/pdf` | HTML-to-PDF export |
| `POST /api/remediate/html` | Single HTML remediation |
| `POST /api/remediate/batch` | Batch remediation |
| `POST /api/remediate/url` | URL scan + remediation |
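In middleware terms this is a method-plus-pattern check before anything is forwarded. A hypothetical predicate mirroring the table (the real patterns live in `workers/api/src/middleware/node-proxy.ts`):

```ts
// Hypothetical route predicate mirroring the table above.
const HEAVY_ROUTES: RegExp[] = [
  /^\/api\/convert\/[^/]+$/,
  /^\/api\/export\/[^/]+\/pdf$/,
  /^\/api\/remediate\/(html|batch|url)$/,
];

export function isHeavyRequest(method: string, path: string): boolean {
  return method === "POST" && HEAVY_ROUTES.some((re) => re.test(path));
}
```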
## Storage Strategy
The Node server accesses the same data as the CF Worker (same R2 bucket, same KV namespace) through different APIs.
| Need | CF Worker | Node Server |
|---|---|---|
| Object storage (files, ZIPs) | R2 binding | R2 via S3-compatible API |
| Key-value (metadata, status) | KV binding | CF KV via REST API |
| Browser (PDF gen, screenshots) | CF Browser Rendering | Native Puppeteer |
### Why this works
- R2 natively supports the S3 protocol. The `R2ObjectStorage` class uses `@aws-sdk/client-s3` pointed at `https://{accountId}.r2.cloudflarestorage.com` (sketched below).
- KV is accessible via Cloudflare's REST API. The `CloudflareKvRestStorage` class calls `https://api.cloudflare.com/client/v4/accounts/{id}/storage/kv/namespaces/{ns}/values/{key}`.
- Puppeteer runs natively in the Docker container (the base image ships Chromium). It reuses the existing `AwsBrowserProvider` from `providers/aws.ts`.
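As a sketch, pointing the stock AWS SDK at R2 looks roughly like this (constructor arguments are illustrative; the real wiring is in `R2ObjectStorage`):

```ts
import { GetObjectCommand, S3Client } from "@aws-sdk/client-s3";

// R2 speaks the S3 protocol, so the standard client works once it is
// pointed at the account-scoped R2 endpoint.
const s3 = new S3Client({
  region: "auto", // R2 accepts "auto"; the SDK just requires a region
  endpoint: `https://${process.env.R2_ACCOUNT_ID}.r2.cloudflarestorage.com`,
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

// Fetch an object from the same bucket the CF Worker reaches via its R2 binding.
async function getObjectText(key: string): Promise<string | undefined> {
  const res = await s3.send(
    new GetObjectCommand({ Bucket: process.env.R2_BUCKET_NAME!, Key: key })
  );
  return res.Body?.transformToString();
}
```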
## Key Files
| File | Purpose |
|---|---|
| `workers/api/src/server.ts` | Node.js HTTP entry point (Hono + `@hono/node-server`) |
| `workers/api/src/providers/node-server.ts` | `R2ObjectStorage`, `CloudflareKvRestStorage`, KV namespace shim, factory |
| `workers/api/src/middleware/node-proxy.ts` | CF Worker middleware that forwards requests |
| `workers/api/src/services/node-proxy.ts` | Proxy logic, health checker, circuit breaker |
| `workers/api/Dockerfile` | Docker image based on `ghcr.io/puppeteer/puppeteer:latest` |
| `docker-compose.yml` | Single `api-node` service on port 8790 |
| `.env.node-server.example` | Template for all required environment variables |
| `workers/api/tsconfig.node.json` | Node-specific TypeScript config (IDE only; `tsx` ignores it) |
## How Auth Works
The proxy forwards the `Authorization` header as-is. The Node server's `requireAuth` middleware validates the JWT using `SUPABASE_JWT_SECRET`. Both CF and Node independently verify the token; double validation is safe.
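The validation itself is standard HS256 verification against the shared secret. A sketch using `jose` (the actual `requireAuth` middleware may be implemented differently):

```ts
import { jwtVerify } from "jose";

// Supabase signs access tokens with an HS256 shared secret.
const secret = new TextEncoder().encode(process.env.SUPABASE_JWT_SECRET!);

async function verifyToken(authorization: string | undefined) {
  const token = authorization?.replace(/^Bearer\s+/i, "");
  if (!token) throw new Error("Missing bearer token");
  const { payload } = await jwtVerify(token, secret); // throws on bad signature or expiry
  return payload; // sub, role, etc.
}
```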
## KV Namespace Shim
Some routes pass `c.env` directly and access `env.KV_SESSIONS` (e.g., `extractTablesWithVision` in `convert.ts`). The `createKvNamespaceShim()` function returns an object matching the CF `KVNamespace` `get`/`put`/`delete` shape, backed by the REST API. The server's storage middleware injects this shim into `c.env.KV_SESSIONS`.
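Conceptually the shim is a plain object with the same method names, each backed by a REST call. A sketch (error handling and metadata support are simplified):

```ts
// Simplified shape of createKvNamespaceShim(): the subset of the CF
// KVNamespace interface the routes call, backed by the Cloudflare REST API.
function createKvNamespaceShim(accountId: string, namespaceId: string, apiToken: string) {
  const base = `https://api.cloudflare.com/client/v4/accounts/${accountId}/storage/kv/namespaces/${namespaceId}/values`;
  const headers = { Authorization: `Bearer ${apiToken}` };

  return {
    async get(key: string): Promise<string | null> {
      const res = await fetch(`${base}/${encodeURIComponent(key)}`, { headers });
      return res.ok ? res.text() : null; // the API returns 404 for missing keys
    },
    async put(key: string, value: string): Promise<void> {
      await fetch(`${base}/${encodeURIComponent(key)}`, { method: "PUT", headers, body: value });
    },
    async delete(key: string): Promise<void> {
      await fetch(`${base}/${encodeURIComponent(key)}`, { method: "DELETE", headers });
    },
  };
}
```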
## Local Development
Run the Node server locally (requires env vars set):

```bash
cd workers/api
PORT=8790 tsx src/server.ts
```

Or with file watching:

```bash
cd workers/api
PORT=8790 npm run server:dev
```

To test the full proxy flow locally, set `NODE_API_URL=http://localhost:8790` in the `wrangler.toml` `[vars]` section and run `wrangler dev` alongside the Node server.
## Deployment
### Prerequisites
- Docker and Docker Compose on `10.1.1.3`
- SSH access: `ssh -i ~/.ssh/nightly-audit [email protected]`
- Cloudflare Tunnel configured on the server
### Step 1: Create R2 S3 API Token
- Go to Cloudflare Dashboard > R2 > Manage R2 API Tokens
- Click Create API token
- Permissions: Object Read & Write
- Specify bucket: `accessible-pdf-files`
- Save the Access Key ID and Secret Access Key
### Step 2: Create CF API Token for KV
- Go to Cloudflare Dashboard > My Profile > API Tokens
- Click Create Token
- Use Custom token template
- Permissions: Account > Workers KV Storage > Edit
- Save the token
### Step 3: Configure Environment
On `10.1.1.3`, in the project directory:

```bash
cp .env.node-server.example .env.node-server
```

Fill in all values:
| Variable | Source |
|---|---|
| `R2_ACCOUNT_ID` | Cloudflare dashboard > Account ID (sidebar) |
| `R2_ACCESS_KEY_ID` | From Step 1 |
| `R2_SECRET_ACCESS_KEY` | From Step 1 |
| `R2_BUCKET_NAME` | `accessible-pdf-files` |
| `CF_ACCOUNT_ID` | Same as `R2_ACCOUNT_ID` |
| `CF_API_TOKEN` | From Step 2 |
| `KV_SESSIONS_NAMESPACE_ID` | `9d39d6e609b945848f5082cea23306b0` (from `wrangler.toml`) |
| `SUPABASE_URL` | Supabase project settings |
| `SUPABASE_SERVICE_ROLE_KEY` | Supabase project settings > API |
| `SUPABASE_JWT_SECRET` | Supabase project settings > API > JWT Secret |
| `ANTHROPIC_API_KEY` | Anthropic console |
| Other API keys | As needed for enabled parsers |
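On boot the server fails fast on missing configuration; this is the source of the `Missing required env var: ...` error mentioned under Troubleshooting. A sketch of that check (the authoritative list lives in the server code; this one mirrors the table above):

```ts
// Fail fast at startup if required configuration is absent.
const REQUIRED_VARS = [
  "R2_ACCOUNT_ID", "R2_ACCESS_KEY_ID", "R2_SECRET_ACCESS_KEY", "R2_BUCKET_NAME",
  "CF_ACCOUNT_ID", "CF_API_TOKEN", "KV_SESSIONS_NAMESPACE_ID",
  "SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY", "SUPABASE_JWT_SECRET",
];

for (const name of REQUIRED_VARS) {
  if (!process.env[name]) {
    throw new Error(`Missing required env var: ${name}`);
  }
}
```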
### Step 4: Build and Start
```bash
docker compose up -d --build
```

Verify it's running:

```bash
docker compose ps
curl http://localhost:8790/health
```

Expected response:

```json
{ "success": true, "data": { "status": "healthy", "platform": "node-server", "uptime": 5.123 } }
```

### Step 5: Configure Cloudflare Tunnel
Add a public hostname rule in the Cloudflare Zero Trust dashboard:
| Field | Value |
|---|---|
| Subdomain | `node-pdf` (or similar) |
| Domain | `anglin.com` |
| Service | `http://localhost:8790` |

This gives the Node server a public URL like `https://node-pdf.anglin.com`.
### Step 6: Set `NODE_API_URL` on CF Worker
```bash
wrangler secret put NODE_API_URL --env production
# Enter: https://node-pdf.anglin.com
```

The CF Worker will now proxy heavy requests to the Node server.
### Step 7: Verify End-to-End
- Proxy active: Trigger a conversion on `pdf.anglin.com`. Check CF Worker logs for `[node-proxy] Proxying POST /api/convert/...` and the Node server logs for the incoming request.
- Fallback works: `docker compose stop`, then trigger another conversion. CF Worker logs should show `[node-proxy] Node server unhealthy → falling back to CF`.
- Recovery: `docker compose start`, wait ~30 s for the circuit breaker cooldown, then trigger again. The request should route through the Node server.
## Updating
To deploy a new version after code changes:
```bash
cd /path/to/accessible-pdf-converter
git pull
docker compose up -d --build
```

## Monitoring
- Health check: The Dockerfile includes a `HEALTHCHECK` that hits `/health` (sketched below) every 30 s. Docker marks the container unhealthy after 3 consecutive failures.
- Logs: `docker compose logs -f api-node`
- Uptime Kuma: Add `https://node-pdf.anglin.com/health` as a monitored endpoint.
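For reference, the `/health` endpoint these checks hit can be as small as the following sketch (the actual route lives in `workers/api/src/server.ts`):

```ts
import { Hono } from "hono";

const app = new Hono();

// Matches the expected response shown under Step 4.
app.get("/health", (c) =>
  c.json({
    success: true,
    data: { status: "healthy", platform: "node-server", uptime: process.uptime() },
  })
);
```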
## Troubleshooting
| Symptom | Likely Cause | Fix |
|---|---|---|
| Container starts then exits | Missing required env var | Check `docker compose logs api-node` for `Missing required env var: ...` |
| Health check passes but proxy fails | CF Tunnel not routing | Verify the tunnel config in the Zero Trust dashboard |
| KV operations fail with 403 | API token lacks permissions | Recreate the token with Workers KV Storage: Edit |
| R2 operations fail with 403 | S3 token has wrong bucket scope | Recreate the R2 API token scoped to `accessible-pdf-files` |
| Puppeteer crashes | Out of memory | Increase the container memory limit in `docker-compose.yml` |
| Circuit breaker stuck open | Node server was down > 30 s | It auto-recovers after the 30 s cooldown; force a reset by redeploying the CF Worker. |