Customer-Dedicated AWS Deployment

Overview

This document assesses the feasibility of deploying a dedicated instance of The Accessible on a customer’s own AWS account, including AWS GovCloud. The goal is to understand what work is required, what’s already portable, and what the major obstacles are.

Current Architecture

The application runs across two platforms:

  • Cloudflare: static frontend (Pages), API (Workers), object storage (R2), KV (sessions/rate limiting), browser rendering
  • AWS: Lambda API (failover), EC2 worker fleet (ASG), SQS queues, DynamoDB, SES email intake, CloudWatch monitoring
  • Supabase: PostgreSQL database, authentication (magic link + Google OAuth)
  • Third-party SaaS: AI/OCR services (Claude, Gemini, OpenAI, Mistral, Mathpix, Marker), Stripe payments, Resend email

Portability by Layer

Already Portable (Low Effort)

| Layer | Current | Customer AWS | Notes |
|---|---|---|---|
| Frontend | Cloudflare Pages | S3 + CloudFront | Static Next.js export, deploy anywhere |
| API | CF Workers + Lambda | Lambda + API Gateway | CDK stacks already exist in infra/cdk/ |
| Object Storage | Cloudflare R2 | AWS S3 | Uses S3Client with configurable endpoint (workers/api/src/providers/node-server.ts) |
| KV Store | Cloudflare KV | DynamoDB | Already built in CDK; KV abstraction exists in providers |
| Queue | CF cron + SQS | SQS | Already in CDK |
| PDF Processing | WeasyPrint + Audiveris (Docker) | Same containers on EC2/ECS | Dockerfiles exist, ECR push in CI |
| Browser Rendering | CF Browser Rendering | Puppeteer on EC2 | Node.js server mode already uses native Puppeteer |
| Monitoring | Grafana + Loki | Same stack or CloudWatch | Docker Compose available |
| Email | Resend + SES | AWS SES | SES email intake already built in CDK email stack |
| Payments | Stripe | Customer's Stripe account | Just different API keys |
| AI Services | Claude, Gemini, OpenAI, etc. | Same APIs | All configured via env vars/API keys |

Requires Significant Work

Database: Supabase → RDS PostgreSQL (Medium)

The 54 SQL migrations in supabase/migrations/ are standard PostgreSQL and will run on RDS. However:

  • Some migrations reference Supabase-specific schemas (auth.uid(), auth.jwt() in RLS policies, auth.users table joins)
  • The API uses @supabase/supabase-js for all database queries across roughly 30 route files
  • Query patterns are mostly .from('table').select().eq() which translate straightforwardly to any query builder

Work needed:

  • Strip or replace auth.* references in RLS policies
  • Move authorization to the API layer (simpler than reimplementing RLS with Cognito)
  • Replace @supabase/supabase-js DB calls with a direct Postgres client (e.g., pg, drizzle-orm, or kysely)
  • Replace auth.users foreign keys with a standalone users table
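
To illustrate the query-translation point, here is a minimal sketch of how the .from().select().eq() pattern maps onto parameterized SQL suitable for a direct Postgres client such as pg. Table and column names are made up for illustration; this is not the app's real query layer.

```typescript
// Tiny shim mimicking the supabase-js chain and emitting
// parameterized SQL that pg/kysely/drizzle could execute.
type Query = {
  eq(column: string, value: unknown): Query;
  toSQL(): { text: string; values: unknown[] };
};

function from(table: string) {
  return {
    select(columns = "*"): Query {
      const filters: [string, unknown][] = [];
      const builder: Query = {
        eq(column, value) {
          filters.push([column, value]);
          return builder;
        },
        toSQL() {
          const where = filters
            .map(([c], i) => `${c} = $${i + 1}`)
            .join(" AND ");
          return {
            text:
              `SELECT ${columns} FROM ${table}` +
              (where ? ` WHERE ${where}` : ""),
            values: filters.map(([, v]) => v),
          };
        },
      };
      return builder;
    },
  };
}

// from("documents").select("id, title").eq("user_id", "u1").toSQL()
// → { text: "SELECT id, title FROM documents WHERE user_id = $1",
//     values: ["u1"] }
```

Because the existing query patterns are this mechanical, most of the 5-8 day estimate is in the long tail of joins, inserts, and error handling rather than in simple lookups like this.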

Authentication: Supabase Auth → AWS Cognito (High)

This is the single largest piece of work. Supabase Auth is deeply integrated:

  • Frontend: supabase.auth.signInWithOtp(), signInWithOAuth(), getSession(), and onAuthStateChange() in auth-context for the web, music, and forms apps
  • API middleware (workers/api/src/middleware/auth.ts): JWT verification via the Supabase JWKS endpoint
  • Per-user DB clients (workers/api/src/utils/supabase.ts): creates Supabase clients with the user's JWT for RLS

Work needed:

  • Replace Supabase auth calls with Cognito SDK or Amplify Auth in 3+ frontend apps
  • Replace magic link flow with Cognito custom auth challenge or hosted UI
  • Replace Google OAuth with Cognito identity provider federation
  • Rewrite JWT verification middleware for Cognito tokens
  • Map Cognito sub to the app’s user ID system
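
The rewritten middleware (fourth bullet) could take roughly this shape. This is a sketch, not the app's real code: the token-verification function is injected so the flow is shown standalone; in production it would come from the aws-jwt-verify library's CognitoJwtVerifier rather than a stub.

```typescript
// Sketch of Cognito-token middleware replacing the Supabase JWKS check.
// `verify` resolves to the token's claims or throws on an invalid token.
type Verify = (token: string) => Promise<{ sub: string }>;

function makeAuthMiddleware(verify: Verify) {
  return async (authHeader: string | undefined) => {
    if (!authHeader?.startsWith("Bearer ")) {
      return { status: 401 as const, userId: null };
    }
    try {
      const claims = await verify(authHeader.slice("Bearer ".length));
      // This is where the Cognito `sub` claim gets mapped onto the
      // app's own user ID system (last bullet above).
      return { status: 200 as const, userId: claims.sub };
    } catch {
      return { status: 401 as const, userId: null };
    }
  };
}
```

With aws-jwt-verify, `verify` would be something like `CognitoJwtVerifier.create({ userPoolId, clientId, tokenUse: "access" }).verify`, which validates signature, expiry, and audience against the user pool's JWKS.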

Third-Party AI/OCR Service Concerns

Security-conscious or government customers may object to sending document data to third-party SaaS. This is relevant for:

Mathpix & Marker: Most Likely Objections

  • Mathpix sends documents to api.mathpix.com for math/equation OCR. There is no self-hosted option, but it is best-in-class for LaTeX extraction from scanned PDFs.
  • Marker API sends documents to Datalab’s servers for document parsing.
  • Both are already optional. The converter cascade skips them if no API key is configured, and the system degrades gracefully; equation-heavy documents just won't convert as well.
  • Fallback: Google Cloud Vision OCR (via Vertex AI) or self-hosted Tesseract. Lower quality for math content.
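
The skip-if-unconfigured behavior can be sketched as follows. Converter names and the env-var convention here are illustrative, not the app's actual configuration keys.

```typescript
// Converters that need an API key drop out of the cascade when the
// key is absent; the pipeline falls through to the next option.
interface Converter {
  name: string;
  requiredKeyEnv?: string; // env var holding the API key, if any
}

function availableConverters(
  cascade: Converter[],
  env: Record<string, string | undefined>
): string[] {
  return cascade
    .filter((c) => !c.requiredKeyEnv || env[c.requiredKeyEnv])
    .map((c) => c.name);
}

// With no MATHPIX_API_KEY set, Mathpix drops out and the fallback runs:
// availableConverters(
//   [{ name: "mathpix", requiredKeyEnv: "MATHPIX_API_KEY" },
//    { name: "tesseract" }],
//   {}
// ) → ["tesseract"]
```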

AI LLM APIs: Possible Objections

The core conversion pipeline requires at least one vision LLM. Options by data sensitivity:

  1. Gemini via Vertex AI (most government-friendly): runs in the customer's own GCP project, so data stays within their boundary
  2. Claude / OpenAI / Mistral: commercial SaaS; documents leave the customer's infrastructure
  3. Self-hosted open-source models (e.g., LLaVA, Qwen-VL on EC2 GPU): data stays in the VPC, but quality drops significantly and cost increases

Minimum viable for restricted environments: Gemini on Vertex AI only.
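
A restricted deployment could enforce that posture at startup. This is a hypothetical sketch: the env var names (VERTEX_PROJECT_ID and the commercial-provider keys) are stand-ins, not the app's real configuration keys.

```typescript
// Validate a "restricted environment" configuration: Vertex AI must
// be configured, and keys for commercial SaaS LLMs must be absent.
function checkRestrictedConfig(
  env: Record<string, string | undefined>
): string[] {
  const errors: string[] = [];
  if (!env.VERTEX_PROJECT_ID) {
    errors.push("VERTEX_PROJECT_ID is required");
  }
  for (const banned of [
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "MISTRAL_API_KEY",
  ]) {
    if (env[banned]) {
      errors.push(`${banned} must be unset in restricted mode`);
    }
  }
  return errors;
}
```

Failing fast on a misconfigured boundary is cheaper than discovering after deployment that documents were leaving the customer's infrastructure.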

GovCloud-Specific Considerations

  1. AI API access: GovCloud VPCs may restrict outbound traffic to commercial AI APIs. This could be a blocker if even Google Vertex AI is disallowed; FedRAMP-authorized alternatives may be required.
  2. AWS partition: GovCloud uses the aws-us-gov partition. CDK handles this via Stack.of(this).partition.
  3. Cognito: available in GovCloud (us-gov-west-1).
  4. SES: available in GovCloud with restrictions.
  5. Stripe: may need a government-approved payment processor instead.
  6. Docker images: ECR works in GovCloud, but pulling base images from Docker Hub may be restricted. Pre-build and push to the customer's ECR.
  7. Data residency: self-hosted services (WeasyPrint, Audiveris) run in the customer's VPC, so no concern there.
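
The partition point (item 2) matters anywhere an ARN is constructed by hand. A minimal sketch, with the partition passed as a parameter so it runs standalone; in CDK code it would come from Stack.of(this).partition instead:

```typescript
// Build a partition-aware ARN so the same code works in commercial
// AWS (arn:aws:...) and GovCloud (arn:aws-us-gov:...).
function queueArn(
  partition: string,
  region: string,
  account: string,
  name: string
): string {
  return `arn:${partition}:sqs:${region}:${account}:${name}`;
}

// queueArn("aws-us-gov", "us-gov-west-1", "123456789012", "jobs")
// → "arn:aws-us-gov:sqs:us-gov-west-1:123456789012:jobs"
```

Any hardcoded `arn:aws:` string in IAM policies or resource references is a latent GovCloud bug, so these are worth auditing alongside the hardcoded-URL cleanup below.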

Effort Estimate

| Work Item | Effort | Notes |
|---|---|---|
| CDK env config for customer | 1-2 days | New entry in infra/cdk/lib/env-config.ts |
| Deploy CDK stacks | 1 day | cdk deploy --all with customer credentials |
| RDS setup + migrations | 1-2 days | Strip auth.* references, test schema |
| Auth: Cognito (API side) | 3-5 days | New middleware, JWT verification, user management |
| Auth: Cognito (frontend) | 3-5 days | Replace Supabase auth in 3+ apps |
| Replace Supabase DB client | 5-8 days | ~30 route files, query translation, connection pooling |
| Frontend deploy (S3 + CloudFront) | 1 day | Build, upload, configure distribution |
| SES domain setup | 0.5 days | DKIM/SPF verification |
| Hardcoded URL cleanup | 1-2 days | Parameterize theaccessible.org, CORS config |
| AI key configuration (SSM) | 0.5 days | Provision parameters in customer account |
| Testing & validation | 3-5 days | End-to-end testing of all flows |
| Total | ~20-30 days | One developer familiar with the codebase |

Option A: Provider Abstraction (if more than one customer)

Build provider interfaces that let the same codebase run on either Supabase or pure AWS:

  1. Database provider: a DatabaseClient interface with SupabaseClient and PostgresClient implementations, selected by a DB_PROVIDER env var (~1 week)
  2. Auth provider: an AuthProvider interface with SupabaseAuth and CognitoAuth implementations, selected by an AUTH_PROVIDER env var (~1 week)
  3. Environment template: a customer-aws config in CDK, a deployment runbook, and an SSM setup script (~2-3 days)

Total: ~3-4 weeks, then ~2-3 days per new customer instance.
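
The selection mechanism for both providers can be the same small factory. A sketch under the doc's naming (DB_PROVIDER); the interface shape and method names are illustrative:

```typescript
// Pick a DatabaseClient implementation by env var, failing loudly on
// an unknown value rather than silently falling back.
interface DatabaseClient {
  query(sql: string, params?: unknown[]): Promise<unknown[]>;
}

function createDatabaseClient(
  provider: string,
  impls: Record<string, () => DatabaseClient>
): DatabaseClient {
  const make = impls[provider];
  if (!make) throw new Error(`Unknown DB_PROVIDER: ${provider}`);
  return make();
}

// At startup:
// const db = createDatabaseClient(process.env.DB_PROVIDER ?? "supabase", {
//   supabase: () => supabaseClientImpl,
//   postgres: () => postgresClientImpl,
// });
```

The same pattern with an AuthProvider interface covers the AUTH_PROVIDER switch, which is what keeps the existing Supabase deployment and customer AWS instances on one codebase.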

Option B: One-Off Fork (if only one customer)

Fork the codebase and replace Supabase directly with Cognito + RDS.

Total: ~2-3 weeks, but each customer gets a divergent codebase.

Key Files

| Purpose | Path |
|---|---|
| CDK stack orchestration | infra/cdk/bin/app.ts |
| Environment config | infra/cdk/lib/env-config.ts |
| CDK stacks | infra/cdk/lib/stacks/*.ts |
| AWS CI/CD | .github/workflows/deploy-aws.yml |
| S3/storage abstraction | workers/api/src/providers/node-server.ts |
| Auth middleware | workers/api/src/middleware/auth.ts |
| Supabase client | workers/api/src/utils/supabase.ts |
| Frontend auth | apps/web/src/lib/auth-context.tsx |
| Frontend API layer | apps/web/src/lib/api.ts |
| Database migrations | supabase/migrations/ |
| Docker Compose | docker-compose.yml |
| Env var templates | .env.local.example, .env.node-server.example |