Bring Your Own LLM Keys - Implementation Plan

Goal

Allow a tenant to provide its own AI provider keys so document and image analysis runs against customer-controlled model accounts instead of our shared hosted accounts.

Minimum Product Promise

If a tenant provides at least one supported vision-capable model provider, the core conversion path keeps working without touching our hosted AI credentials.

Current State

The stack already supports multiple AI providers by environment variable. The missing work is per-tenant configuration, routing, validation, fallback policy, and observability.

Supported Provider Model

Phase 1 should focus on providers that already exist in the codebase and have stable API usage patterns:

  • Google Gemini / Vertex AI
  • Anthropic
  • OpenAI, where it is already used in active code paths

Later phases can add:

  • Mistral
  • Azure OpenAI
  • self-hosted OpenAI-compatible endpoints

Product Design

Settings Surface

Add a new Settings section: AI Providers.

For each provider:

  • enabled toggle
  • provider type
  • API key or service-account credentials
  • endpoint override if applicable
  • model selection
  • vision capability indicator
  • optional rate-limit cap

Tenant-level controls:

  • default provider order
  • hosted fallback policy (allow or disable)

Status panel:

  • last validation result
  • tested models
  • vision-capable providers available
  • current effective routing order

Functional Requirements

  • A tenant can store multiple providers.
  • Routing chooses the first healthy provider that satisfies the task.
  • Tasks requiring vision must only use providers marked vision-capable.
  • If hosted fallback is disabled and no healthy provider exists, the job fails immediately with a clear tenant-facing error.

Technical Plan

Phase 1 - Tenant Provider Schema

Add tenant-scoped AI provider records with:

  • tenant_id
  • provider
  • label
  • credentials_encrypted
  • endpoint
  • default_model
  • supports_vision
  • is_enabled
  • priority
  • validation_status
  • validation_error
  • validated_at
  • created_by
  • updated_by

Also add tenant-level policy:

  • allow_hosted_ai_fallback
  • require_customer_managed_ai
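
As a minimal sketch, the two record types above could be expressed as dataclasses. The names mirror the field lists; the types, defaults, and class names (`TenantAIProvider`, `TenantAIPolicy`) are illustrative assumptions, not a committed schema:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical in-memory shape of a tenant-scoped provider record.
# Field names mirror the schema list above; types are illustrative.
@dataclass
class TenantAIProvider:
    tenant_id: str
    provider: str                      # e.g. "gemini", "anthropic", "openai"
    label: str
    credentials_encrypted: bytes
    endpoint: Optional[str] = None     # only for overridable providers
    default_model: Optional[str] = None
    supports_vision: bool = False
    is_enabled: bool = True
    priority: int = 100                # lower value = tried first
    validation_status: str = "unvalidated"
    validation_error: Optional[str] = None
    validated_at: Optional[datetime] = None
    created_by: Optional[str] = None
    updated_by: Optional[str] = None

# Tenant-level policy flags from the list above.
@dataclass
class TenantAIPolicy:
    tenant_id: str
    allow_hosted_ai_fallback: bool = True
    require_customer_managed_ai: bool = False
```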

Phase 2 - Secret Handling

  • Encrypt provider credentials at rest.
  • Support one-time input and redaction after save.
  • Support key rotation without breaking unrelated providers.
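
A sketch of the secret-handling surface, assuming per-provider credential slots. The cipher here is a base64 placeholder so the example runs; a real deployment would substitute KMS-backed AES-GCM (or Fernet from the `cryptography` package) behind the same two callables. `CredentialStore` and its method names are hypothetical:

```python
import base64
from typing import Callable

# Placeholder cipher for illustration only -- NOT encryption. Swap in a
# KMS-backed AEAD in production; the storage interface stays the same.
encrypt: Callable[[str], bytes] = lambda s: base64.b64encode(s.encode())
decrypt: Callable[[bytes], str] = lambda b: base64.b64decode(b).decode()

class CredentialStore:
    """One slot per provider, so rotating one key leaves the others intact."""
    def __init__(self) -> None:
        self._secrets: dict[str, bytes] = {}   # provider_id -> ciphertext

    def save(self, provider_id: str, plaintext: str) -> str:
        self._secrets[provider_id] = encrypt(plaintext)
        # One-time input: the caller only ever gets a redacted echo back.
        return "•" * 8 + plaintext[-4:]

    def rotate(self, provider_id: str, new_plaintext: str) -> None:
        # Overwrites a single provider's slot; unrelated providers untouched.
        self._secrets[provider_id] = encrypt(new_plaintext)

    def reveal_for_runtime(self, provider_id: str) -> str:
        # Only the conversion pipeline calls this; never surfaced in the UI.
        return decrypt(self._secrets[provider_id])
```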

Phase 3 - Validation Service

Each configured provider needs a health test that verifies:

  • credential validity
  • requested model availability
  • vision support if the tenant expects it
  • quota / permission sanity

The validation path should not send customer documents. It should use a synthetic health-check prompt or provider metadata endpoint.

Phase 4 - Runtime Provider Resolution

Build a tenant-aware provider selection layer:

  1. load tenant provider policy
  2. determine task requirements:
    • text only
    • vision required
    • high-context / long-output path
  3. choose the highest-priority healthy provider
  4. record which provider actually handled the request

Persist provider choice on the job record for supportability.

Phase 5 - Failure and Fallback Behavior

Explicitly define:

  • whether retries stay on the same provider or fail over
  • when hosted fallback is allowed
  • what user-facing error appears when no compliant provider is available

Recommended default:

  • retry once on same provider for transient failures
  • fail over to next healthy tenant provider
  • never use hosted fallback unless tenant policy permits it
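
The recommended default can be sketched as a single loop: one retry on the same provider for transient failures, then fail over down the tenant's provider list, and hosted credentials only when policy permits. `call` and `hosted_call` are stand-ins for the real provider invocations:

```python
class TransientError(Exception):
    """Retryable failure: timeout, 429, transient 5xx."""

def run_with_failover(provider_order, call, allow_hosted_fallback, hosted_call=None):
    for provider in provider_order:
        for _attempt in range(2):         # original try + one retry
            try:
                return call(provider)
            except TransientError:
                continue                  # retry once on the same provider
            except Exception:
                break                     # hard failure: fail over immediately
    if allow_hosted_fallback and hosted_call is not None:
        return hosted_call()
    # Tenant-facing error for the "no compliant provider" case.
    raise RuntimeError("No compliant AI provider available for this tenant")
```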

Phase 6 - Billing and Analytics

If customers bring their own keys, we need a billing policy decision:

  • Do we still charge for conversion only?
  • Do we discount hosted AI costs?
  • Do we expose provider usage counters in the UI?

Even if pricing does not change immediately, we need internal analytics for:

  • jobs by provider
  • failures by provider
  • average latency by provider
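
These counters fall out of the per-job provider attribution recorded in Phase 4. A sketch of the aggregation, assuming each job row carries the provider, an outcome flag, and a latency:

```python
from collections import defaultdict

def aggregate(jobs):
    """jobs: iterable of (provider, succeeded: bool, latency_ms: float)."""
    stats = defaultdict(lambda: {"jobs": 0, "failures": 0, "latency_sum": 0.0})
    for provider, ok, latency_ms in jobs:
        s = stats[provider]
        s["jobs"] += 1
        s["failures"] += 0 if ok else 1
        s["latency_sum"] += latency_ms
    # Emit jobs, failures, and mean latency keyed by provider.
    return {
        p: {"jobs": s["jobs"], "failures": s["failures"],
            "avg_latency_ms": s["latency_sum"] / s["jobs"]}
        for p, s in stats.items()
    }
```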

Backend Work Items

  • Add tenant AI provider schema and migrations
  • Add encrypted credential storage
  • Add provider validation service
  • Add provider selection abstraction
  • Refactor current env-based provider resolution to allow tenant override
  • Add job-level provider attribution
  • Add audit events for provider config changes

Frontend Work Items

  • Add AI Providers settings UI
  • Add priority ordering UI
  • Add validation, enable/disable, and rotation flows
  • Add provider-health status

Documentation Work Items

  • Customer guide for supported providers and required permissions
  • Support runbook for common validation errors
  • Security note describing how provider credentials are used and what is (and is not) retained

Dependencies

  • secret encryption/storage
  • tenant settings framework
  • provider abstraction cleanup in the conversion pipeline

Risks

  • Per-tenant provider routing increases operational complexity.
  • Provider-specific edge cases can leak into product behavior.
  • Some providers may have inconsistent model naming or permissions APIs.

Acceptance Criteria

  • A tenant can add and validate at least one vision-capable provider from Settings.
  • A conversion job for that tenant uses the tenant’s provider instead of hosted credentials.
  • Hosted fallback behavior is enforced exactly as configured.
  • Logs and admin tools show which provider handled each job.

Estimated Effort

  • Backend and provider abstraction: 5-7 days
  • Frontend settings UI: 2-3 days
  • Validation, testing, runbooks: 2-3 days
  • Total: 9-13 days