WCAG 2.2 AA Coverage Enhancement Plan

1. Architecture Map of the Current Pipeline

Entry Points

| Flow | Entry File | Endpoint |
| --- | --- | --- |
| PDF upload | workers/api/src/routes/convert.ts | POST /api/convert/:fileId |
| URL fetch | workers/api/src/routes/gateway.ts | POST /api/gateway/convert |
| Bulk URLs | workers/api/src/routes/gateway.ts | POST /api/gateway/bulk |
| HTML remediation | workers/api/src/routes/remediate.ts | POST /api/remediate/html |
| V2 remediation | workers/api/src/routes/remediate-v2.ts | POST /remediate/remediate |
| PPTX remediation | workers/api/src/routes/remediate-pptx.ts | Dedicated PPTX flow |
| DOCX remediation | workers/api/src/routes/remediate-docx.ts | Dedicated DOCX flow |

Page Chunking & Routing

PDF upload
pdf-complexity-detector.ts ← Zero-cost binary inspection (fonts, XObjects, vector paths)
classifies each page → text | math | image | mixed | table | dense-table
chunk-boundary-detector.ts ← Auto-split at ~30-50 pages using PDF outline/bookmarks
↓ Hard-split at MAX_CHUNK_SIZE_PAGES if no natural breaks
chunk-scheduler.ts ← Polls Supabase `chunk_jobs` table
optimistic locking, sequential chunk-N+1 visibility
chunk-processor.ts ← Per-chunk: extract page range → SmartCascade → store to R2
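
The hard-split fallback can be sketched as below. This is a minimal illustration; the `hardSplit` helper and `PageRange` shape are assumptions, not the actual exports of chunk-boundary-detector.ts:

```typescript
// Hypothetical sketch of the hard-split fallback: when no natural outline
// breaks exist, cut the page range into chunks of at most maxPages.
interface PageRange {
  start: number; // 1-based, inclusive
  end: number;   // inclusive
}

function hardSplit(totalPages: number, maxPages: number): PageRange[] {
  const chunks: PageRange[] = [];
  for (let start = 1; start <= totalPages; start += maxPages) {
    chunks.push({ start, end: Math.min(start + maxPages - 1, totalPages) });
  }
  return chunks;
}
```

A 75-page PDF with `maxPages = 30` would split into three chunks: pages 1-30, 31-60, and 61-75.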

Model Routing per Page Type (Smart Cascade)

File: workers/api/src/services/smart-cascade-converter.ts

| Page Type | Tier 1 (cheap) | Tier 2 (escalation) | Escalation Trigger |
| --- | --- | --- | --- |
| Text-only | Marker API ($0.001/pg) | Gemini 2.5 Flash ($0.005/pg) | Quality score < 80 |
| Math | Marker + Temml MathML | Gemini 2.5 Flash | LaTeX rendering failures |
| Image/visual | Gemini 2.5 Flash ($0.005/pg) | Claude Sonnet 4.6 agentic ($0.15/pg) | Score stall after 2 passes |
| Mixed | Claude Sonnet 4.6 agentic (direct) | — | — |
| Table | Gemini 2.5 Flash | Vision table extractor (Claude Sonnet) | Complex table detection |
| Dense table | Vision table extractor (Claude Sonnet) | — | — |
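
The routing above amounts to a simple dispatch per page type. A sketch (the `PageType` and `Route` names are illustrative, not the real exports of smart-cascade-converter.ts):

```typescript
// Illustrative dispatch for the smart cascade: tier1 is always tried first;
// tier2, when present, is the escalation target.
type PageType = "text" | "math" | "image" | "mixed" | "table" | "dense-table";

interface Route {
  tier1: string;
  tier2?: string; // escalation target, if any
}

function routeForPage(pageType: PageType): Route {
  switch (pageType) {
    case "text":        return { tier1: "marker", tier2: "gemini-2.5-flash" };
    case "math":        return { tier1: "marker+temml", tier2: "gemini-2.5-flash" };
    case "image":       return { tier1: "gemini-2.5-flash", tier2: "claude-sonnet-agentic" };
    case "mixed":       return { tier1: "claude-sonnet-agentic" };
    case "table":       return { tier1: "gemini-2.5-flash", tier2: "vision-table-extractor" };
    case "dense-table": return { tier1: "vision-table-extractor" };
  }
}
```

Mixed and dense-table pages go straight to their most capable handler, so they have no tier-2 escalation.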

Post-Processing Pipeline

File: workers/api/src/services/post-processing-pipeline.ts

After all chunks are assembled (chunk-assembler.ts), output runs through these sequential steps:

  1. structurePages (page-structurer.ts) — page breaks, headers/footers, page numbers
  2. optimizeDeterministic (ux-optimizer.ts) — CSS injection, MathML normalization, long-description aria-describedby
  3. runValidators (validators.ts) — MathML, table, heading validators with structured reporting
  4. addMathReadingAnnotations (latex-math-renderer.ts) — LaTeX → screen-reader “reads as” hints
  5. polishVisuals (visual-polisher.ts) — LLM-powered CSS enhancement (Gemini Flash, CSS-only)
  6. enhanceAccessibility (wcag-validator.ts) — ARIA roles, landmarks, lang, skip links, viewport, <b>→<strong> conversion, sr-only class
  7. validateAndFix (wcag-validator.ts) — Deterministic WCAG AA validation + auto-fix (no LLM)
  8. wrapInDocument (utils/html.ts) — Full HTML document + XHTML IR generation
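
The steps above share a common sequential contract, which is what makes new passes easy to splice in. A sketch of that contract (the `Pass` signature is an assumption about the pipeline's internal shape):

```typescript
// Each post-processing step takes HTML and returns HTML; the pipeline is a
// left fold over the ordered list of passes.
type Pass = (html: string) => string;

function runPipeline(html: string, passes: Pass[]): string {
  return passes.reduce((doc, pass) => pass(doc), html);
}
```

Inserting a new pass at any seam is then just an array splice, with no change to the surrounding steps.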

Axe-Core Fix Loop (Optional)

File: workers/api/src/services/axe-fixer.ts

After the deterministic pipeline, an optional browser-based axe-core audit runs up to 3 iterations:

  • runAxeAudit() → identifies violations → applyFixes() → re-audit
  • Handles ~30 axe rule IDs deterministically (contrast, headings, lists, ARIA, tables, etc.)
  • Reverts if a fix introduces regression
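
The bounded loop with regression revert might look like this. `audit` and `applyFixes` are assumed interfaces standing in for the real axe-fixer.ts API:

```typescript
// Sketch of the bounded fix loop: audit() returns a violation count,
// applyFixes() returns candidate HTML. A fix that increases the violation
// count is treated as a regression and discarded.
function fixLoop(
  html: string,
  audit: (h: string) => number,
  applyFixes: (h: string) => string,
  maxIterations = 3,
): string {
  let current = html;
  for (let i = 0; i < maxIterations; i++) {
    const before = audit(current);
    if (before === 0) break; // clean audit, nothing to fix
    const candidate = applyFixes(current);
    if (audit(candidate) > before) break; // revert: fix made things worse
    current = candidate;
  }
  return current;
}
```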

Seams Where Post-Processing Could Be Inserted

The pipeline in post-processing-pipeline.ts is sequential and well-factored. New passes can be inserted:

  1. Between step 6 (enhanceAccessibility) and step 7 (validateAndFix) — the afterEnhance callback already exists here (line 55). Ideal for AI-powered passes that need full document context.
  2. After step 7 (validateAndFix) — for passes that should run on the “final” HTML before document wrapping.
  3. After the axe-fixer loop — for passes that need the fully-fixed HTML (e.g., screen-reader simulation).
  4. In chunk-processor.ts — for per-page passes before assembly (e.g., reading-order verification per page).
  5. In image-enhancer.ts — for per-image passes (e.g., alt-text self-critique, long-description generation).

2. Current WCAG 2.2 AA Coverage Audit

Criteria Catalogue

The codebase maintains a full WCAG 2.1 A+AA catalogue in workers/api/src/services/wcag-criteria-map.ts (50 success criteria). Many criteria are correctly marked typicallyNA: true for static converted documents (audio/video, keyboard traps, timing, etc.).

Criteria We Handle Well (Automated + Tested)

| SC | Name | Implementation | Files |
| --- | --- | --- | --- |
| 1.1.1 | Non-text Content | Alt text generation (Gemini Flash), quality gate with retry, isAltTextAcceptable() blocklist | image-enhancer.ts, image-description-pipeline.ts, wcag-validator.ts |
| 1.3.1 | Info and Relationships | Heading hierarchy fix, table header/scope injection, <b>→<strong> conversion, list structure fixes | wcag-validator.ts, axe-fixer.ts, ux-optimizer.ts |
| 1.4.3 | Contrast (Minimum) | Inline style detection + ratio calculation (4.5:1 / 3:1 large text). Axe-fixer can fix contrast. | wcag-validator.ts:248-272, axe-fixer.ts |
| 2.4.1 | Bypass Blocks | Skip-to-main-content link auto-injected | wcag-validator.ts:112-120 |
| 2.4.2 | Page Titled | <title> injected from filename | wcag-validator.ts:89-101 |
| 2.4.4 | Link Purpose | Poor-link-text detector (regex patterns), LLM-assisted replacement | workers/remediate/src/wcag-impl.ts:211-341 |
| 2.4.6 | Headings and Labels | Heading hierarchy normalization (h1→h3 becomes h1→h2) | wcag-validator.ts, axe-fixer.ts |
| 3.1.1 | Language of Page | lang attribute injected on <html> | wcag-validator.ts:84-86 |
| 4.1.1 | Parsing | Duplicate ID removal, valid ARIA attributes, structural validation | axe-fixer.ts |

Criteria We Handle Weakly or Partially

| SC | Name | Current State | Gap |
| --- | --- | --- | --- |
| 1.1.1 | Non-text Content | Alt text generated for all images | No long descriptions for complex images (charts, infographics, data visualizations). Short alt text misses data content. isAltTextAcceptable() is a heuristic blocklist check, not a semantic quality assessment. |
| 1.3.1 | Info and Relationships | Tables get <th> + scope | No <caption> generation. Definition lists (<dl>) validated but not generated. Form label association is in the remediate worker but not the convert pipeline. |
| 1.3.2 | Meaningful Sequence | MCID-based reading order checked in PDF scorer | No reading-order verification for HTML output. Multi-column layouts from vision models may produce wrong reading order. The prompt says “maintain reading order” but there’s no post-hoc verification. |
| 1.4.3 | Contrast | Inline styles checked | CSS class-based colors and inherited styles not checked. Only the axe-fixer catches these (requires browser rendering). |
| 1.4.5 | Images of Text | Listed in criteria map | No detection or remediation. Scanned PDFs with text-as-image are OCR’d but there’s no check that the OCR fully replaces the image. |
| 2.4.6 | Headings and Labels | Hierarchy validated | No check for heading descriptiveness — empty headings caught, but “Chapter 1” or “Section A” pass. |
| 3.1.2 | Language of Parts | Listed in criteria map with custom rule ai-lang-of-parts | Not implemented in the convert pipeline. No per-element lang attribute detection for multilingual documents. |
| 4.1.2 | Name, Role, Value | Button names, link names, input labels | No validation for custom ARIA roles/states beyond what axe catches. |

Criteria We Skip Entirely

| SC | Name | Status | Notes |
| --- | --- | --- | --- |
| 1.4.4 | Resize Text | Not tested | Would require viewport rendering; intrinsically met by responsive CSS injected in step 6 |
| 1.4.10 | Reflow | Not tested | Same — CSS ensures reflow, but not verified |
| 1.4.11 | Non-text Contrast | Not tested | UI component borders/icons — partially N/A for converted docs |
| 1.4.12 | Text Spacing | Not tested | CSS allows text-spacing overrides, but not verified |
| 2.4.7 | Focus Visible | Not verified | :focus-visible CSS injected (line 188 wcag-validator.ts) but never validated |
| 4.1.3 | Status Messages | Not addressed | N/A for static converted documents |

Summary: Coverage Estimate

  • 50 WCAG 2.1 A+AA success criteria in the catalogue
  • ~18 are N/A for static converted documents (audio, video, keyboard traps, timing, etc.)
  • ~32 applicable criteria remain
  • ~20 well-handled with automated detection + fix
  • ~8 partially handled (detection exists, fix is incomplete or shallow)
  • ~4 listed but untested (CSS-based criteria assumed-passing)

Estimated coverage: roughly 62-70% of applicable criteria are fully passing, consistent with the ~70% figure the user cited.


3. Cost + Model Inventory

Per-Page Model Usage

| Service | Model | Provider | Input $/M tok | Output $/M tok | Approx Cost/Page | When Used |
| --- | --- | --- | --- | --- | --- | --- |
| Marker API | — | Datalab | — | — | $0.001 | Text-only pages |
| Smart cascade T1 | Gemini 2.5 Flash | Google | $0.15 | $2.50 | $0.005 | Vision pages (first try) |
| Smart cascade T2 | Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | $0.15 | Vision pages (escalation) |
| Image enhancer | Gemini 2.5 Flash | Google | $0.15 | $2.50 | $0.0003/image | Alt text generation |
| Mathpix equations | Mathpix API | Datalab | — | — | $0.002/image | Equation images |
| Visual polish | Gemini 2.5 Flash | Google | $0.15 | $2.50 | ~$0.002 | CSS-only polish (optional) |
| Struct metadata | Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | Cached/amortized | PDF structure extraction |

Prompt Caching

Currently used in:

  • agentic-vision-converter.ts — cache_control: { type: 'ephemeral' } on PDF documents and system prompts
  • struct-table-extractor.ts — ephemeral cache for table extraction
  • score-metadata-extractor.ts — ephemeral cache for metadata

Tracked fields: cacheCreationInputTokens, cacheReadInputTokens in token usage.

Cost Tracking

Fully implemented. The cost-ledger.ts service records every AI call to a cost_ledger Supabase table with:

  • user_id, file_id, product, operation_type, model, backend
  • input_tokens, output_tokens, estimated_cost_usd, metadata
  • Structured JSON logging to stdout for Loki/Grafana

Additionally:

  • budget-estimator.ts — pre-conversion cost estimate based on page complexity
  • dollar-budget.ts — hard cost cap per conversion with graceful stop
  • credits.ts — user credit system (3 credits/page worst case)
  • llm-cost.ts — centralized per-model pricing table

4. Enhancement Opportunities (Ranked by Coverage-Gain-per-Dollar)

Enhancement 1: Cross-Page Heading Hierarchy + Coherence Check

WCAG SC(s): 1.3.1 (Info and Relationships), 2.4.6 (Headings and Labels), 2.4.10 (Section Headings)

Problem: Each chunk is processed independently. Heading levels may be inconsistent across chunk boundaries (chunk 1 ends at h2, chunk 2 restarts at h1). The existing analyzeHeadings() in wcag-validator.ts only validates within a single HTML fragment.

Approach: After chunk assembly, run a single cheap-model pass over the entire heading tree (extracted as a flat list of {level, text, position}). The model normalizes levels for cross-chunk coherence and flags non-descriptive headings.

  • Estimated incremental cost: ~$0.001-0.003 per document (Gemini 2.5 Flash, <2K tokens total — headings only, not full HTML)
  • Files to touch: chunk-assembler.ts (extract heading tree post-assembly), new heading-coherence-checker.ts, post-processing-pipeline.ts (insert step)
  • Risk/Complexity: S — deterministic heading extraction, small LLM call, easy to validate
  • Coverage gain: Improves 1.3.1, 2.4.6 from partial to full for multi-chunk documents
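
The deterministic core of such a pass (clamping cross-chunk level jumps) might look like this. The `Heading` shape and `normalizeLevels` name are illustrative, not the real API:

```typescript
// Sketch: no heading may sit more than one level deeper than the heading
// before it, and the first heading is promoted to h1. This repairs chunks
// that restart at h1 or skip levels across a chunk boundary.
interface Heading {
  level: number; // 1-6
  text: string;
}

function normalizeLevels(headings: Heading[]): Heading[] {
  let prev = 0;
  return headings.map(h => {
    const level = prev === 0 ? 1 : Math.min(h.level, prev + 1);
    prev = level;
    return { ...h, level };
  });
}
```

The LLM pass would then only need to handle the judgment calls this rule cannot: whether a heading's level matches its semantic depth, and whether its text is descriptive.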

Enhancement 2: Alt-Text Self-Critique (Cheap Model)

WCAG SC(s): 1.1.1 (Non-text Content)

Problem: Current alt text quality gate (isAltTextAcceptable()) uses heuristic blocklist checks (too short, generic phrases like “image of”). It doesn’t assess whether alt text is semantically adequate for the image content. Complex images like charts get short alt text that misses the data story.

Approach: After initial alt text generation, run a cheap self-critique pass that checks:

  1. Does the alt text describe the image’s purpose (not just appearance)?
  2. For charts/data: does it convey the key data point or trend?
  3. Is it too verbose (>150 chars) or too terse (<15 chars)?
  4. Does it avoid redundancy with surrounding text?

Use Gemini 2.0 Flash Lite ($0.075/$0.30 per M tokens) — cheapest option that can compare text against an image.

  • Estimated incremental cost: ~$0.0002 per image ($0.002 for a 10-image document)
  • Files to touch: image-enhancer.ts (add critique step after generation), image-description-pipeline.ts (wire critique into pipeline)
  • Risk/Complexity: S — single cheap call per image, clear pass/fail output
  • Coverage gain: Upgrades 1.1.1 from “generates alt text” to “generates quality-assured alt text”
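
The deterministic half of the critique (checks 3 and 4) could be sketched as follows; the thresholds mirror the checklist above, and the function name and blocklist are hypothetical:

```typescript
// Sketch of the deterministic alt-text checks; the semantic checks
// (purpose, data story) would go to the cheap vision model.
function altTextIssues(alt: string, surroundingText: string): string[] {
  const issues: string[] = [];
  const trimmed = alt.trim();
  if (trimmed.length < 15) issues.push("too-terse");
  if (trimmed.length > 150) issues.push("too-verbose");
  if (/^(image|picture|photo|graphic)\s+of\b/i.test(trimmed)) issues.push("generic-opener");
  // Redundancy: alt text duplicated verbatim in nearby prose adds nothing.
  if (trimmed.length >= 15 && surroundingText.toLowerCase().includes(trimmed.toLowerCase())) {
    issues.push("redundant-with-context");
  }
  return issues;
}
```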

Enhancement 3: Deep-Vision Long Description for Complex Images

WCAG SC(s): 1.1.1 (Non-text Content — long description for complex images)

Problem: Charts, infographics, data visualizations, and complex diagrams get a short alt text but no long description. WCAG 1.1.1 requires “an alternative that serves an equivalent purpose” — for a bar chart, that means conveying the data, not just “bar chart showing quarterly revenue.”

Approach: Gate with the existing diagramType classifier from image-enhancer.ts:

  • If diagramType is chart, diagram, or illustration AND the image is >100KB (not a simple icon):

    • Send to a capable vision model (Gemini 2.5 Flash) with a structured prompt asking for data extraction
    • Generate a long description (<500 words) with data table if applicable
    • Inject as aria-describedby linked <details> element (pattern already exists in ux-optimizer.ts:1242)
  • Estimated incremental cost: ~$0.005 per complex image. Typical doc has 0-3 complex images → $0.00-$0.015/doc

  • Files to touch: image-enhancer.ts (add long-description generation), ux-optimizer.ts (inject <details> element), image-description-pipeline.ts (wire in)

  • Risk/Complexity: M — needs prompt engineering for data extraction; <details> injection pattern exists but needs expansion

  • Coverage gain: Addresses the biggest remaining gap in 1.1.1 — complex image descriptions

Legal note: Under ADA Title II (DOJ rule effective April 2027/2028 per CLAUDE.md compliance deadlines), complex images in government documents must have equivalent text alternatives. This is a high-priority SC for the target market.

Enhancement 4: Reading-Order Verification for Multi-Column Layouts

WCAG SC(s): 1.3.2 (Meaningful Sequence)

Problem: Vision models sometimes output multi-column content in wrong reading order (interleaving columns, or reading across instead of down). The convert pipeline has no post-hoc verification — it trusts the model output.

Approach: After conversion (per-page, in chunk-processor.ts), compare the HTML’s text sequence against the PDF’s text extraction order (from unpdf extractText). If >20% of text segments are reordered, flag for re-processing or route to a reading-order jury:

  1. Extract text blocks from HTML (split on block elements)
  2. Extract text from PDF (already available from complexity detector)
  3. Compute Kendall tau correlation between the two orderings
  4. If τ < 0.7: send the page image + current HTML to a vision model asking specifically about reading order
  • Estimated incremental cost: $0 for the deterministic check (most pages pass). ~$0.005 per flagged page for vision jury. Estimated 5-10% of pages flag → $0.0025-0.005/page average.
  • Files to touch: New reading-order-verifier.ts, chunk-processor.ts (call after conversion), post-processing-pipeline.ts (optional doc-level pass)
  • Risk/Complexity: M — text extraction ordering comparison is non-trivial for complex layouts; false positive rate needs tuning
  • Coverage gain: Moves 1.3.2 from “not verified” to “verified with automated check”
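
Step 3 can be implemented with a pairwise Kendall tau, sketched here assuming each input array holds the rank of a matched text segment in one of the two orderings:

```typescript
// Kendall tau rank correlation between the HTML block order and the PDF
// extraction order: +1 = identical order, -1 = fully reversed.
function kendallTau(a: number[], b: number[]): number {
  const n = a.length;
  if (n < 2 || b.length !== n) return 1; // nothing to compare
  let concordant = 0;
  let discordant = 0;
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      const sign = (a[i] - a[j]) * (b[i] - b[j]);
      if (sign > 0) concordant++;
      else if (sign < 0) discordant++;
    }
  }
  return (concordant - discordant) / ((n * (n - 1)) / 2);
}
```

Pages scoring τ ≥ 0.7 pass silently; only the minority below the threshold incur a vision-model call.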

Enhancement 5: Form-Field Label-Association Jury

WCAG SC(s): 1.3.1 (Info and Relationships), 1.3.5 (Identify Input Purpose), 3.3.2 (Labels or Instructions)

Problem: The premium-form-converter.ts generates form HTML but label↔input association relies on the initial vision model output. The remediate worker (wcag-impl.ts:220-246) detects unlabeled inputs but only in the remediation flow — not the convert flow. No jury pass verifies the association is correct (label text matches the field’s purpose).

Approach: After form conversion, run a cheap model pass that:

  1. Extracts all <label>/<input> pairs
  2. Checks each for/id pairing is correct
  3. Verifies label text semantically matches field purpose
  4. Adds autocomplete attributes per 1.3.5 (currently missing entirely)
  • Estimated incremental cost: ~$0.002 per form page. Most docs have 0 form pages → $0 for typical docs.
  • Files to touch: premium-form-converter.ts (add post-conversion jury), new form-label-jury.ts
  • Risk/Complexity: S — small scope (form pages only), clear validation criteria
  • Coverage gain: Moves 1.3.5 from “partially” to “full” for form documents. Also addresses 3.3.2.
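
Step 2 (the for/id pairing check) might be sketched as below; a production pass would use a DOM parser rather than regexes, and the function name is hypothetical:

```typescript
// Sketch: collect input ids, then report any <label for="..."> whose target
// id does not exist. Regex-based for illustration only.
function unpairedLabelTargets(html: string): string[] {
  const inputIds = new Set<string>();
  for (const m of html.matchAll(/<input\b[^>]*\bid="([^"]+)"/gi)) inputIds.add(m[1]);
  const unpaired: string[] = [];
  for (const m of html.matchAll(/<label\b[^>]*\bfor="([^"]+)"/gi)) {
    if (!inputIds.has(m[1])) unpaired.push(m[1]);
  }
  return unpaired;
}
```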

Legal note: Forms in government PDFs are high-scrutiny items under ADA Title II. Incorrect label association is one of the most commonly cited WCAG violations in DOJ settlement agreements.

Enhancement 6: Language-of-Parts Detection

WCAG SC(s): 3.1.2 (Language of Parts)

Problem: The criteria map lists ai-lang-of-parts as a custom rule but it’s not implemented. Multilingual documents (common in government and academic PDFs) have no per-element lang attribute annotation.

Approach: After document assembly, scan text blocks for language switches using a lightweight language-detection library (e.g., cld3 or Gemini Flash with a simple prompt). Insert lang attributes on elements containing non-primary-language text.

  • Estimated incremental cost: ~$0.001 per document (Gemini 2.0 Flash Lite, text-only, <1K tokens for language classification). For the library approach: $0 (CPU-only).
  • Files to touch: New language-of-parts-detector.ts, post-processing-pipeline.ts (new step after enhanceAccessibility)
  • Risk/Complexity: S — well-defined problem, small scope
  • Coverage gain: Moves 3.1.2 from “not implemented” to “automated”
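
The zero-cost variant could rely on Unicode script properties alone. A sketch (the script→lang mapping here is illustrative and deliberately incomplete; script alone cannot distinguish, say, Russian from Bulgarian):

```typescript
// If a text block is dominated (>50% of letters) by a script that implies a
// different language than the document default, return a candidate lang tag.
function detectScriptLang(text: string): string | null {
  const counts: Record<string, number> = {
    el: (text.match(/\p{Script=Greek}/gu) ?? []).length,
    ru: (text.match(/\p{Script=Cyrillic}/gu) ?? []).length,
    ar: (text.match(/\p{Script=Arabic}/gu) ?? []).length,
    he: (text.match(/\p{Script=Hebrew}/gu) ?? []).length,
  };
  const letters = (text.match(/\p{L}/gu) ?? []).length;
  if (letters === 0) return null;
  for (const [lang, count] of Object.entries(counts)) {
    if (count / letters > 0.5) return lang; // majority script wins
  }
  return null; // Latin or mixed: keep the document default
}
```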

Enhancement 7: Confidence Scoring + Human Review Queue

WCAG SC(s): All — meta-enhancement

Problem: The pipeline ships everything with equal confidence. A perfect text-only page and a mangled multi-column layout with complex images get the same treatment. There’s no way for downstream consumers to know which items need human review.

Approach: Add a per-element confidence score to the output:

  1. Leverage the existing QualityScore (8-dimension breakdown in quality-scorer.ts)
  2. Extend with per-element granularity: each <img>, <table>, heading gets a confidence tag
  3. Elements with confidence < threshold populate a requiresHumanReview array in the output
  4. The array includes: element type, WCAG SC at risk, reason, location in document

This is the scaffolding that all other enhancements feed into.

  • Estimated incremental cost: $0 (deterministic aggregation of existing scores)
  • Files to touch: quality-scorer.ts (add per-element scoring), shared types in packages/shared/src/types.ts, output formatting in convert.ts
  • Risk/Complexity: M — needs type changes across shared package + API output, but no new AI calls
  • Coverage gain: Enables human-in-the-loop for the ~10% of elements that automated passes can’t confidently handle
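
The aggregation in steps 3-4 might look like this; the element and output shapes are assumptions about the shared types, not the actual definitions in packages/shared:

```typescript
// Sketch: elements below the confidence threshold populate the
// requiresHumanReview array with type, SC at risk, reason, and location.
interface ScoredElement {
  selector: string;      // location in document
  kind: "img" | "table" | "heading";
  wcagSc: string;        // SC at risk, e.g. "1.1.1"
  confidence: number;    // 0-100
}

interface ReviewItem {
  selector: string;
  kind: string;
  wcagSc: string;
  reason: string;
}

function collectReviewQueue(elements: ScoredElement[], threshold = 60): ReviewItem[] {
  return elements
    .filter(el => el.confidence < threshold)
    .map(el => ({
      selector: el.selector,
      kind: el.kind,
      wcagSc: el.wcagSc,
      reason: `confidence ${el.confidence} below threshold ${threshold}`,
    }));
}
```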

Enhancement 8: Screen-Reader Simulation Read-Back

WCAG SC(s): 1.3.1, 1.3.2, 2.4.6, 4.1.2 — structural coherence validation

Problem: Even with all the above passes, the final output may have structural issues that are only apparent when “read” linearly as a screen reader would — e.g., a table caption that’s been orphaned from its table, or a heading that doesn’t match the content that follows it.

Approach: Serialize the final HTML to a linear text stream (strip tags, preserve element boundaries with markers). Send to a cheap model asking: “Does this read coherently as a document? Flag any points where the reading order breaks, content seems out of place, or a heading doesn’t match what follows.”

  • Estimated incremental cost: ~$0.005-0.01 per document (Gemini 2.5 Flash, full-document text, ~5K-10K tokens)
  • Files to touch: New screen-reader-simulator.ts, post-processing-pipeline.ts (final step before wrapInDocument)
  • Risk/Complexity: L — LLM judgment on coherence is subjective; needs careful prompt engineering to avoid false positives. Results should route to requiresHumanReview, not auto-fix.
  • Coverage gain: Catches structural issues that slip through rule-based checks. Primary value is quality assurance, not direct SC coverage.
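
The linearization step could be sketched as below; the marker format is an assumption, chosen so the model can reference specific elements in its findings:

```typescript
// Strip tags but preserve block boundaries as [tag] markers, giving the
// model anchors to point at when flagging incoherent spots.
function linearize(html: string): string {
  return html
    .replace(/<(h[1-6]|p|li|table|caption|figcaption)\b[^>]*>/gi, "\n[$1] ")
    .replace(/<[^>]+>/g, " ")   // drop all remaining tags
    .replace(/[ \t]+/g, " ")    // collapse runs of spaces, keep line breaks
    .trim();
}
```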

Enhancement 9: Opus-Tier Jury on Low-Confidence Items

WCAG SC(s): All — quality escalation for the hardest items

Problem: Some elements are genuinely hard — complex data visualizations, unusual table structures, ambiguous reading order. Cheap models produce low-confidence results. Currently these ship as-is.

Approach: After confidence scoring (Enhancement 7), items with confidence < 40% and high WCAG impact (images, tables, forms) are sent to Claude Opus for a single review pass. Opus output replaces the original only if it scores higher.

  • Estimated incremental cost: ~$0.05-0.10 per low-confidence item. With <10% of elements flagged and ~2-5 per doc: $0.10-0.50/doc (only for complex documents).
  • Files to touch: New opus-jury.ts, post-processing-pipeline.ts (conditional step gated on confidence scores)
  • Risk/Complexity: M — cost management is critical; must have a hard budget cap. Feature flag essential.
  • Coverage gain: Targeted improvement on the hardest 5-10% of elements where cheaper models fail.
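
The hard budget cap might be enforced as below; the item shape and function name are hypothetical:

```typescript
// Sketch: greedily admit low-confidence items to the Opus jury until the
// next item would push per-document spend past the cap. Never overspend.
interface JuryItem {
  id: string;
  estimatedCostUsd: number;
}

function selectWithinBudget(items: JuryItem[], maxOpusCostUsd = 0.5): JuryItem[] {
  const selected: JuryItem[] = [];
  let spent = 0;
  for (const item of items) {
    if (spent + item.estimatedCostUsd > maxOpusCostUsd) continue; // skip, cap is hard
    spent += item.estimatedCostUsd;
    selected.push(item);
  }
  return selected;
}
```

In practice items would be sorted by WCAG impact first, so the budget goes to the highest-risk elements.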

Summary: Enhancement Priority Matrix

| # | Enhancement | WCAG SCs | Cost/Doc | Complexity | Coverage Lift | Priority |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Heading coherence check | 1.3.1, 2.4.6 | $0.002 | S | Medium | P1 |
| 2 | Alt-text self-critique | 1.1.1 | $0.002 | S | Medium | P1 |
| 3 | Long desc for complex images | 1.1.1 | $0.015 | M | High | P1 |
| 4 | Reading-order verification | 1.3.2 | $0.003 | M | High | P1 |
| 5 | Form label-association jury | 1.3.1, 1.3.5, 3.3.2 | $0.002 | S | Medium (form docs) | P2 |
| 6 | Language-of-parts detection | 3.1.2 | $0.001 | S | Medium (multilingual) | P2 |
| 7 | Confidence scoring + human queue | All (meta) | $0 | M | Foundational | P1 |
| 8 | Screen-reader simulation | 1.3.1, 1.3.2, 2.4.6 | $0.008 | L | Low-Medium | P3 |
| 9 | Opus jury on low-confidence | All | $0.10-0.50 | M | High (targeted) | P3 |

Estimated Total Cost Impact

For a typical 20-page document (mostly text, 5 images, 1 chart, 0 forms):

  • Current pipeline cost: ~$0.05-0.20
  • All P1 enhancements: +$0.02-0.03 (+10-60%)
  • All P1+P2: +$0.02-0.04
  • All P1+P2+P3: +$0.05-0.55 (Opus jury dominates when triggered)

Expected Coverage Lift

  • Current: ~70% of applicable WCAG 2.2 AA criteria fully passing
  • After P1 enhancements: ~80-82% (heading coherence, alt-text quality, reading order, long descriptions)
  • After P1+P2: ~83-85% (forms + language-of-parts)
  • After P1+P2+P3: ~85-88% (screen-reader simulation catches edge cases, Opus jury handles hard items)

The remaining ~12-15% are criteria that require browser-based interactive testing (reflow, text spacing, focus visible) or are genuinely N/A but marked as applicable for conservatism in VPAT reporting.

Regulatory Mapping

  • ADA Title II (DOJ rule): Deadlines April 26, 2027 (pop ≥50K) and April 26, 2028 (pop <50K) per the compliance schedule in CLAUDE.md. All enhancements addressing SC 1.1.1, 1.3.1, and 1.3.2 are directly relevant to government PDF compliance.
  • Section 508 (US Federal): Requires WCAG 2.0 AA conformance; all enhancements apply.
  • EN 301 549 (EU): Maps to WCAG 2.1 AA; language-of-parts (Enhancement 6) is specifically flagged in EU accessibility audits of multilingual documents.
  • AODA (Ontario): Requires WCAG 2.0 AA; same SC applicability.

Important: Automated remediation cannot serve as a substitute for human attestation of WCAG conformance. The output should clearly indicate which criteria were automatically verified vs. which require human review (Enhancement 7). VPAT reports generated from this pipeline already use conservative “partially supports” / “not-verified” language (per wcag-criteria-map.ts honesty rules), and this should continue.


Implementation Status (P1 Enhancements)

All P1 enhancements have been implemented on branch feature/wcag-coverage-enhancements.

Files Created

| Enhancement | File | Tests |
| --- | --- | --- |
| Confidence scoring (#7) | workers/api/src/services/confidence-scorer.ts | __tests__/services/confidence-scorer.test.ts (16 tests) |
| Heading coherence (#1) | workers/api/src/services/heading-coherence-checker.ts | __tests__/services/heading-coherence-checker.test.ts (11 tests) |
| Alt-text critique (#2) | workers/api/src/services/alt-text-critique.ts | __tests__/services/alt-text-critique.test.ts (6 tests) |
| Long descriptions (#3) | workers/api/src/services/long-description-generator.ts | __tests__/services/long-description-generator.test.ts (14 tests) |
| Reading-order verifier (#4) | workers/api/src/services/reading-order-verifier.ts | __tests__/services/reading-order-verifier.test.ts (12 tests) |

Files Modified

| File | Changes |
| --- | --- |
| workers/api/src/services/post-processing-pipeline.ts | Added 3 new pipeline steps (heading coherence, reading-order check, confidence scoring) with feature flags |
| workers/api/src/services/image-enhancer.ts | Added enableAltTextCritique flag and critique call after retry logic |
| workers/api/src/services/image-description-pipeline.ts | Added long description generation for qualifying complex images |

Feature Flags

All enhancements are disabled by default. Enable via PostProcessOptions or EnhancerConfig:

| Flag | Where | Purpose |
| --- | --- | --- |
| enableConfidenceScoring | PostProcessOptions | Per-element confidence scores + requiresHumanReview output |
| confidenceThreshold | PostProcessOptions | Confidence threshold (default: 60) |
| enableHeadingCoherence | PostProcessOptions | Cross-chunk heading hierarchy normalization |
| enableReadingOrderCheck | PostProcessOptions | PDF ↔ HTML reading-order comparison |
| pdfTextPages | PostProcessOptions | PDF text per page (required for reading-order check) |
| enableAltTextCritique | EnhancerConfig | Semantic alt-text quality verification |

Cost Tracking

  • Alt-text critique: Uses estimateLlmCost() with model gemini-2.0-flash-lite. Token usage tracked in CritiqueResult and flows through enhanceImagesInHtml token totals.
  • Long descriptions: Uses estimateLlmCost() with model gemini-2.5-flash. Token usage tracked in LongDescriptionResult and logged in image pipeline.
  • Heading coherence: No LLM calls — deterministic, zero cost.
  • Confidence scoring: No LLM calls — deterministic, zero cost.
  • Reading-order verifier: No LLM calls — deterministic, zero cost.
  • Form label jury: No LLM calls — deterministic, zero cost.
  • Language-of-parts: No LLM calls — deterministic Unicode analysis, zero cost.

Implementation Status (P2 Enhancements)

P2 enhancements implemented on branch feature/wcag-coverage-enhancements.

Files Created

| Enhancement | File | Tests |
| --- | --- | --- |
| Form label jury (#5) | workers/api/src/services/form-label-jury.ts | __tests__/services/form-label-jury.test.ts (26 tests) |
| Language-of-parts (#6) | workers/api/src/services/language-of-parts-detector.ts | __tests__/services/language-of-parts-detector.test.ts (19 tests) |

Feature Flags (P2)

| Flag | Where | Purpose |
| --- | --- | --- |
| enableFormLabelJury | PostProcessOptions | Label↔input validation + autocomplete injection |
| enableLanguageOfParts | PostProcessOptions | Unicode-based per-element lang attribute detection |

Implementation Status (P3 Enhancements)

P3 enhancements implemented on branch feature/wcag-coverage-enhancements.

Files Created

| Enhancement | File | Tests |
| --- | --- | --- |
| Screen-reader sim (#8) | workers/api/src/services/screen-reader-simulator.ts | __tests__/services/screen-reader-simulator.test.ts (13 tests) |
| Opus jury (#9) | workers/api/src/services/opus-jury.ts | __tests__/services/opus-jury.test.ts (10 tests) |

Feature Flags (P3)

| Flag | Where | Purpose |
| --- | --- | --- |
| enableScreenReaderSim | PostProcessOptions | Coherence validation via linearized read-back |
| enableOpusJury | PostProcessOptions | Claude Opus review of low-confidence items |
| maxOpusCostUsd | PostProcessOptions | Hard budget cap for Opus jury (default: $0.50/doc) |

Cost Tracking (P3)

  • Screen-reader sim: Gemini 2.5 Flash, ~$0.005-0.01/doc. Token usage tracked in ScreenReaderSimResult.
  • Opus jury: Claude Opus 4.6, ~$0.05-0.10/item. Token usage tracked per-verdict. Hard budget cap prevents runaway costs.