Anthropic Batch API Integration
Why Batch Mode Exists
Large PDF conversions (textbooks, reports, legal filings) can run 50–200+ pages. Each page is sent to Claude as an individual vision request. At Claude Sonnet pricing (~$3/$15 per MTok in/out), a 100-page document costs roughly $15 in single-pass real-time mode.
The Anthropic Message Batches API processes requests at 50% of the standard token cost in exchange for asynchronous delivery: most batches finish within 1 hour, and any requests still unprocessed after 24 hours expire. Batch requests also run in a separate queue that does not count against normal Messages API rate limits.
For a 100-page document, batch mode reduces the conversion cost from ~$15 to ~$7.50.
How It Works
Conversion Flow

    User uploads PDF (45 pages)
            │
            ▼
    POST /api/convert/:fileId  { realTime: false }
            │
            ├─ pageCount > 10 && !realTime ?
            │       │
            │       ▼ YES
            │   Split PDF into 45 single-page PDFs
            │   Build 45 batch requests (custom_id = "page-1" … "page-45")
            │   Submit to anthropic.messages.batches.create()
            │   Store KV record: batch:{batchId} → { fileId, userId, userEmail, … }
            │   Set file status → batch_pending
            │   Return { batchMode: true, estimatedReadyAt }
            │
            ▼ NO (≤10 pages or realTime: true)
    Normal real-time processConversion()

Cron Poller (every 5 minutes)

    Cloudflare Cron Trigger ── */5 * * * *
            │
            ▼
    List all KV keys with prefix "batch:"
            │
    For each batch record:
            │
            ▼
    Retrieve batch status via anthropic.messages.batches.retrieve(batchId)
            │
            ├─ processing_status !== 'ended' → skip, check again in 5 min
            │
            ▼ 'ended'
    Stream JSONL results via anthropic.messages.batches.results(batchId)
    Sort by custom_id (page-1, page-2, …)
    Assemble per-page HTML into full document
            │
            ▼
    Run post-processing pipeline:
      1. structurePages()        → headers, footers, page sections
      2. optimizeDeterministic() → table headers, br→semantic, SVG aria
      3. enhanceAccessibility()  → title, lang, ARIA
      4. validateAndFix()        → WCAG AA auto-fix
            │
            ▼
    Save HTML to R2
    Update file metadata → status: completed
    Send email notification (Resend)
    Send Web Push notification (if subscribed)
    Delete batch KV record

Single-Pass Constraint
Batch mode is single-pass only. The normal agentic-vision pipeline uses an iterative screenshot-comparison loop: render the HTML, screenshot it, compare to the original PDF, and refine. This loop requires browser rendering between Claude calls, which is impossible in batch mode since all requests run independently.
Batch requests are submitted with maxIterations = 0 (no refinement). The initial conversion prompt is the same high-quality prompt used in the iterative pipeline, so single-pass quality is still good for most documents. Users who need maximum fidelity should use real-time mode.
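The routing rule and the per-page request shape from the conversion flow can be sketched as below. This is a minimal illustration, not the project's actual code: `BATCH_PAGE_THRESHOLD`, `shouldUseBatch`, and `buildBatchRequests` are hypothetical names, the model ID and token limit are left as parameters, and sending each page as a base64 `document` block is an assumption about how the single-page PDFs are attached.

```typescript
// Hypothetical names; only custom_id = "page-N" and the 10-page rule come from the text.
const BATCH_PAGE_THRESHOLD = 10;

type ContentBlock =
  | { type: "document"; source: { type: "base64"; media_type: "application/pdf"; data: string } }
  | { type: "text"; text: string };

interface BatchRequest {
  custom_id: string;
  params: {
    model: string;
    max_tokens: number;
    messages: Array<{ role: "user"; content: ContentBlock[] }>;
  };
}

/** Batch mode applies only above the page threshold and when real-time was not requested. */
function shouldUseBatch(pageCount: number, realTime: boolean): boolean {
  return pageCount > BATCH_PAGE_THRESHOLD && !realTime;
}

/** Builds one single-pass request per page; custom_id carries the 1-based page number. */
function buildBatchRequests(
  pagesBase64: string[],
  prompt: string,
  model: string,
  maxTokens: number,
): BatchRequest[] {
  return pagesBase64.map((data, index): BatchRequest => ({
    custom_id: `page-${index + 1}`,
    params: {
      model,
      max_tokens: maxTokens,
      messages: [
        {
          role: "user",
          content: [
            { type: "document", source: { type: "base64", media_type: "application/pdf", data } },
            { type: "text", text: prompt },
          ],
        },
      ],
    },
  }));
}
```

The resulting array is the shape that `anthropic.messages.batches.create({ requests })` accepts. Because each request is self-contained, there is no place for a render-and-compare step between calls, which is why the batch path stays single-pass.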
Key Files
| File | Purpose |
|---|---|
| workers/api/src/services/batch-vision-converter.ts | Splits PDF, builds batch requests, submits to Anthropic |
| workers/api/src/services/batch-result-assembler.ts | Streams JSONL results, sorts by page, assembles HTML |
| workers/api/src/routes/batch-cron.ts | Cron handler: polls batches, runs post-processing, notifies user |
| workers/api/src/routes/convert.ts | Routing logic: batch vs real-time based on page count and realTime flag |
| workers/api/src/services/email.ts | sendBatchCompletionEmail(): notifies user when batch finishes |
| workers/api/src/services/web-push.ts | VAPID-based push notifications for instant browser alerts |
| workers/api/src/routes/push.ts | Push subscription endpoints |
| apps/web/src/app/dashboard/page.tsx | Batch toggle UI, batch_pending status badge, adaptive polling |
KV Data Schema
Batch Job Record
Key: batch:{anthropicBatchId}

    {
      "fileId": "abc123",
      "userId": "user456",
      "filename": "annual-report.pdf",
      "batchId": "msgbatch_01JxYz...",
      "pageCount": 45,
      "options": { "highFidelity": false },
      "createdAt": "2026-02-27T12:00:00Z"
    }

This record is created when the batch is submitted and deleted after the cron successfully processes the results.
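As a typed view of this record, the following sketch mirrors the fields of the JSON example. The names `BatchJobRecord`, `batchKey`, and `parseBatchJobRecord` are illustrative assumptions, and the validation is deliberately shallow.

```typescript
// Field names follow the JSON example; the type and helper names are hypothetical.
interface BatchJobRecord {
  fileId: string;
  userId: string;
  filename: string;
  batchId: string;
  pageCount: number;
  options: { highFidelity: boolean };
  createdAt: string; // ISO 8601 timestamp
}

const BATCH_KEY_PREFIX = "batch:";

/** Builds the KV key for a given Anthropic batch ID. */
function batchKey(batchId: string): string {
  return `${BATCH_KEY_PREFIX}${batchId}`;
}

/** Parses a stored KV value; returns null for malformed JSON or missing fields. */
function parseBatchJobRecord(raw: string): BatchJobRecord | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;

  const record = value as Record<string, unknown>;
  const createdAt = record["createdAt"];
  const options = record["options"];
  const valid =
    typeof record["fileId"] === "string" &&
    typeof record["userId"] === "string" &&
    typeof record["filename"] === "string" &&
    typeof record["batchId"] === "string" &&
    typeof record["pageCount"] === "number" &&
    typeof options === "object" &&
    options !== null &&
    typeof createdAt === "string" &&
    !Number.isNaN(Date.parse(createdAt));

  return valid ? (value as BatchJobRecord) : null;
}
```

Validating on read means a corrupted record is skipped by the poller instead of throwing on every cron run.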
User Experience
Dashboard Behavior
| Condition | What the User Sees |
|---|---|
| File ≤ 10 pages | Normal real-time conversion (no batch option applies) |
| File > 10 pages, batch mode on (default) | Convert → "Batch Queued" badge with clock icon, "we'll email you when it's ready" |
| File > 10 pages, batch mode off | Convert → normal real-time processing with progress bar |
| Batch completes | Status changes to "Completed" on next poll (30s interval), email arrives, push notification if subscribed |
Polling Intervals
- Real-time processing files: poll every 3 seconds (fast feedback)
- Batch-pending files only: poll every 30 seconds (polling faster gains nothing, because the server checks every 5 minutes)
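The adaptive polling rule above can be expressed as a small selector. This is a sketch only: `pickPollIntervalMs` is a hypothetical name, and the `processing` and `failed` status strings are assumptions (only `batch_pending` and `completed` appear in this document).

```typescript
// "processing" and "failed" are assumed status names for illustration.
type FileStatus = "processing" | "batch_pending" | "completed" | "failed";

const REALTIME_POLL_MS = 3000;
const BATCH_POLL_MS = 30000;

/**
 * Picks the dashboard polling interval from the statuses of all visible files.
 * Returns null when nothing is in flight and polling can stop.
 */
function pickPollIntervalMs(statuses: FileStatus[]): number | null {
  // Any real-time conversion in progress wins: the user is watching a progress bar.
  if (statuses.includes("processing")) return REALTIME_POLL_MS;
  // Only batch jobs outstanding: the server polls every 5 minutes, so 30 s is enough.
  if (statuses.includes("batch_pending")) return BATCH_POLL_MS;
  return null;
}
```

A dashboard would typically re-evaluate this after every poll response, so the interval drops back to 30 seconds once the last real-time file finishes.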
Cost Comparison
| Mode | Per-Page Cost (Sonnet) | 100-Page Document | Delivery |
|---|---|---|---|
| Real-time (iterative, 4 passes) | ~$0.60 | ~$60 | Minutes |
| Real-time (single-pass) | ~$0.15 | ~$15 | Minutes |
| Batch (single-pass, 50% off) | ~$0.075 | ~$7.50 | ~1 hour |
Batch mode is the default for documents over 10 pages because the cost savings are significant and most users do not need instant results for large documents.
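The table's arithmetic can be reproduced with a short estimator. The ~$0.15 per-page figure, the 4-pass iterative case, and the 50% batch discount come from the table; the function and constant names are hypothetical, and real costs depend on actual token counts per page.

```typescript
// Work in cents to avoid floating-point drift in the intermediate products.
const SINGLE_PASS_CENTS_PER_PAGE = 15; // ~$0.15 per page (Sonnet), from the table
const BATCH_DISCOUNT = 0.5;            // batch requests are billed at 50%
const ITERATIVE_PASSES = 4;            // iterative real-time row in the table

type ConversionMode = "realtime-iterative" | "realtime-single" | "batch";

/** Returns the approximate conversion cost in US dollars for a document. */
function estimateCostUsd(pageCount: number, mode: ConversionMode): number {
  let centsPerPage: number;
  if (mode === "realtime-iterative") {
    centsPerPage = SINGLE_PASS_CENTS_PER_PAGE * ITERATIVE_PASSES;
  } else if (mode === "batch") {
    centsPerPage = SINGLE_PASS_CENTS_PER_PAGE * BATCH_DISCOUNT;
  } else {
    centsPerPage = SINGLE_PASS_CENTS_PER_PAGE;
  }
  return (centsPerPage * pageCount) / 100;
}
```

For a 100-page document this gives 60, 15, and 7.5 dollars for the three modes, matching the rows above.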
Deployment
VAPID Keys (for Web Push)
Generate a key pair:

    npx web-push generate-vapid-keys

Set as Cloudflare Worker secrets:

    echo "YOUR_PUBLIC_KEY" | npx wrangler secret put VAPID_PUBLIC_KEY --env production
    echo "YOUR_PRIVATE_KEY" | npx wrangler secret put VAPID_PRIVATE_KEY --env production

Cron Trigger
The cron trigger is defined in wrangler.toml and auto-registers on deploy:

    [triggers]
    crons = ["*/5 * * * *"]

To test locally:

    npx wrangler dev --test-scheduled

Failure Handling
- Batch submission fails: Falls through to normal real-time processing (graceful degradation).
- Individual page fails in batch: A placeholder is inserted (βPage N could not be convertedβ) and the rest of the document is assembled normally.
- Cron processing fails: The batch KV record is not deleted, so it will be retried on the next cron run (5 minutes later).
- Batch never completes: Anthropic ends every batch within 24 hours, expiring any requests it has not processed by then. If a batch record persists beyond 24 hours, it should be investigated manually.
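The per-page failure case can be illustrated with a small assembler. This is a sketch under assumptions: `PageResult`, `pageNumberOf`, and `assemblePages` are hypothetical names, and the `<section data-page>` wrapper is invented for the example. It orders pages by the numeric part of `custom_id`, since a plain string sort would place `page-10` before `page-2`, and batch results are not guaranteed to arrive in request order.

```typescript
// html is null when the page's batch request errored or expired.
interface PageResult {
  customId: string;
  html: string | null;
}

/** Extracts the 1-based page number from a custom_id such as "page-12". */
function pageNumberOf(customId: string): number {
  const match = /^page-(\d+)$/.exec(customId);
  return match ? Number(match[1]) : Number.NaN;
}

/** Assembles pages in numeric order, inserting a placeholder for any missing or failed page. */
function assemblePages(results: PageResult[], pageCount: number): string {
  const htmlByPage = new Map<number, string>();
  for (const result of results) {
    const pageNumber = pageNumberOf(result.customId);
    if (Number.isInteger(pageNumber) && result.html !== null) {
      htmlByPage.set(pageNumber, result.html);
    }
  }

  const sections: string[] = [];
  for (let page = 1; page <= pageCount; page++) {
    const html = htmlByPage.get(page) ?? `<p>Page ${page} could not be converted</p>`;
    sections.push(`<section data-page="${page}">${html}</section>`);
  }
  return sections.join("\n");
}
```

Iterating from 1 to `pageCount` instead of over the results means a page that is absent from the result stream still gets its placeholder.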
Limitations
- No iterative refinement: batch mode is single-pass. Complex layouts (multi-column, heavy math) may have lower fidelity than iterative real-time mode.
- No real-time progress: the user sees "Batch Queued" with no percentage updates until the batch completes.
- Up to ~5 minutes of added latency: even if Anthropic returns results instantly, the cron only runs every 5 minutes.
- No webhook: the Anthropic Batch API does not support webhooks, so polling is the only option.
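Since polling is the only option, one cron pass might look like the sketch below. All dependencies are injected so the control flow can be read and tested without Cloudflare or Anthropic bindings; `PollDeps`, `pollBatches`, and the summary shape are hypothetical, while the `processing_status` values (`in_progress`, `canceling`, `ended`) are the ones the Message Batches API reports.

```typescript
type ProcessingStatus = "in_progress" | "canceling" | "ended";

interface StoredBatch {
  batchId: string;
  createdAt: string; // ISO 8601 timestamp
}

// Hypothetical dependency bundle standing in for KV, the Anthropic client, and the pipeline.
interface PollDeps {
  listBatches(): Promise<StoredBatch[]>;
  retrieveStatus(batchId: string): Promise<ProcessingStatus>;
  processResults(batchId: string): Promise<void>;
  deleteBatch(batchId: string): Promise<void>;
  now(): number;
}

interface PollSummary {
  completed: string[];
  pending: string[];
  failed: string[];
  stale: string[];
}

const STALE_AFTER_MS = 24 * 60 * 60 * 1000;

/** Runs one polling pass over every stored batch record. */
async function pollBatches(deps: PollDeps): Promise<PollSummary> {
  const summary: PollSummary = { completed: [], pending: [], failed: [], stale: [] };

  for (const record of await deps.listBatches()) {
    try {
      const status = await deps.retrieveStatus(record.batchId);
      if (status !== "ended") {
        summary.pending.push(record.batchId);
        // Records older than 24 hours are flagged for manual investigation.
        if (deps.now() - Date.parse(record.createdAt) > STALE_AFTER_MS) {
          summary.stale.push(record.batchId);
        }
        continue;
      }
      await deps.processResults(record.batchId);
      // Delete only after processing succeeds, so a failure is retried on the next run.
      await deps.deleteBatch(record.batchId);
      summary.completed.push(record.batchId);
    } catch {
      // Record is left in place; the next cron run picks it up again.
      summary.failed.push(record.batchId);
    }
  }
  return summary;
}
```

In a Worker, a function like this would be called from the `scheduled` handler that the cron trigger invokes. Catching errors per record keeps one failing batch from blocking the others in the same pass.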