
Secret Rotation System

Automated secret rotation across Cloudflare Workers, AWS SSM, and on-prem Docker hosts with zero-downtime rolling deploys.


1. System Overview

The rotation system has three components:

| Component | File | Purpose |
| --- | --- | --- |
| Rotation workflow | .github/workflows/rotate-secrets.yml | Full end-to-end rotation via GitHub Actions |
| Reminder workflow | .github/workflows/rotation-reminder.yml | Opens a quarterly GitHub issue with a rotation checklist |
| Inventory & runbook | docs/admin/secrets-inventory-and-rotation.md | Complete secrets inventory, manual procedures, rotation log |

Supporting scripts:

| Script | Purpose |
| --- | --- |
| scripts/update-env-secret.sh | Updates a single key in .env.node-server with automatic backup |
| scripts/smoke-test.sh | Health check: asserts HTTP 200 and valid response body |
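
To make the "single key with automatic backup" behaviour concrete, here is a minimal sketch of how such an update could work. It is not the contents of scripts/update-env-secret.sh: the function name update_env_secret and its argument order are hypothetical, and it assumes one KEY=VALUE assignment per line.

```shell
#!/usr/bin/env bash
# Sketch only: set KEY=VALUE in an env file and keep the previous file as <file>.bak.
# update_env_secret is a hypothetical name, not the real script's interface.
update_env_secret() {
  local file key value tmp
  file=$1
  key=$2
  value=$3
  [ -f "$file" ] || { echo "missing env file: $file" >&2; return 1; }

  # Back up first, so a rollback source exists before anything changes.
  cp -p "$file" "$file.bak"

  # mktemp creates the file with mode 0600, so the new secret is never world-readable.
  tmp=$(mktemp "$file.XXXXXX") || return 1

  # Copy every line except an existing assignment of this key, then append the new one.
  # grep exits 1 when no lines are left to print; that is not an error here.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"

  # A rename within one filesystem is atomic: readers see the old or the new file, never half.
  mv "$tmp" "$file"
}
```

One side effect of this approach is that the updated key moves to the end of the file; that is harmless for env files but worth knowing when diffing against the backup.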

How a rotation flows

You generate a new key in the provider dashboard
↓
gh workflow run rotate-secrets.yml -f key_name=new_value
↓
┌─ Phase 1: Cloud secrets ─────────────────────────────────┐
│ Cloudflare Worker secrets (wrangler secret put)          │
│ AWS SSM Parameter Store (aws ssm put-parameter)          │
│ (runs in parallel)                                       │
└──────────────────────────────────────────────────────────┘
↓
┌─ Phase 2: Test server (10.1.1.17) ───────────────────────┐
│ SSH in → update .env.node-server → restart container     │
│ Wait for /health → smoke test                            │
│ ❌ FAILS HERE? → workflow stops, production untouched    │
└──────────────────────────────────────────────────────────┘
↓
┌─ Phase 3: Production (10.1.1.4) ─────────────────────────┐
│ SSH in → backup .env.node-server → update secrets        │
│ SIGTERM api-node-1 → drain → restart → wait healthy      │
│ SIGTERM api-node-2 → drain → restart → wait healthy      │
│ Restart sidecars → final public health check             │
│ ❌ FAILS HERE? → rollback from .bak, restart container   │
└──────────────────────────────────────────────────────────┘
↓
You revoke the old key in the provider dashboard
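
The Phase 3 loop (restart one node, wait until it is healthy, only then move on) can be sketched roughly as follows. This is an illustration, not the workflow's actual code: RESTART_CMD, HEALTH_CMD, WAIT_ATTEMPTS and WAIT_DELAY are hypothetical hooks, and the default health hook assumes the container images define a Docker HEALTHCHECK, which the source does not confirm.

```shell
#!/usr/bin/env bash
# Sketch of a rolling restart: one node at a time, stop at the first unhealthy node.
# All variable and function names here are illustrative.

# Default health hook. Assumes the image defines a HEALTHCHECK.
docker_health() {
  [ "$(docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null)" = "healthy" ]
}

# docker restart sends the stop signal (SIGTERM by default), waits, then starts the container.
RESTART_CMD=${RESTART_CMD:-"docker restart"}
HEALTH_CMD=${HEALTH_CMD:-docker_health}

# Poll the health hook until it succeeds or the attempts run out.
wait_healthy() {
  local node attempts delay i
  node=$1
  attempts=${WAIT_ATTEMPTS:-30}
  delay=${WAIT_DELAY:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if $HEALTH_CMD "$node"; then return 0; fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Restart each named container in order; return 1 at the first failure.
rolling_restart() {
  local node
  for node in "$@"; do
    $RESTART_CMD "$node" >/dev/null || return 1
    if ! wait_healthy "$node"; then
      echo "unhealthy: $node" >&2
      return 1   # the caller would restore .env.node-server.bak and restart this node
    fi
    echo "healthy: $node"
  done
}
```

Calling rolling_restart api-node-1 api-node-2 never touches the second node if the first fails its health wait, which is what keeps at least one node serving traffic throughout.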

2. One-Time Setup

2.1 Add the SSH key to GitHub Actions

The workflow SSHes into 10.1.1.4 and 10.1.1.17. It needs the private key that both servers already trust.

  1. Copy the private key content:

    cat ~/.ssh/nightly-audit
  2. Go to the GitHub repo → Settings → Secrets and variables → Actions

  3. Click New repository secret:

    • Name: SERVER_SSH_KEY
    • Value: paste the entire private key (including -----BEGIN and -----END lines)
  4. Click Add secret

2.2 Verify existing GitHub secrets

These should already exist from the deploy workflows. Confirm in Settings → Secrets:

| Secret | Used for |
| --- | --- |
| CLOUDFLARE_API_TOKEN | wrangler secret put to update Cloudflare Worker secrets |
| AWS_ACCESS_KEY_ID | aws ssm put-parameter to update Lambda config |
| AWS_SECRET_ACCESS_KEY | Same as above |
| PACKAGES_TOKEN | Not used by rotation, but verify it exists for deploys |

2.3 Verify server access

From your Mac, confirm SSH works to both servers:

ssh -i ~/.ssh/nightly-audit [email protected] "hostname && docker ps --format '{{.Names}}' | head -5"
ssh -i ~/.ssh/nightly-audit [email protected] "hostname && docker ps --format '{{.Names}}'"

2.4 Verify scripts are on both servers

The workflow copies scripts via scp on each run, but for manual use:

# Copy to both servers
scp -i ~/.ssh/nightly-audit scripts/update-env-secret.sh scripts/smoke-test.sh \
[email protected]:~/accessible/scripts/
scp -i ~/.ssh/nightly-audit scripts/update-env-secret.sh \
[email protected]:~/accessible/scripts/

3. Usage

3.1 Rotate a single key

  1. Generate a new key in the provider’s dashboard (do NOT revoke the old one yet)

  2. Trigger the workflow:

    gh workflow run rotate-secrets.yml -f anthropic_api_key=sk-ant-api03-newkeyhere
  3. Watch the run:

    gh run watch
  4. Once the run succeeds, revoke the old key in the provider dashboard

  5. Log the rotation in docs/admin/secrets-inventory-and-rotation.md Section 5

3.2 Rotate multiple keys at once

Pass multiple -f flags. All keys are written in one run and take effect in the same rolling restart; nodes pick up the new values one at a time, not simultaneously:

gh workflow run rotate-secrets.yml \
-f anthropic_api_key=sk-ant-api03-newkey \
-f gemini_api_key=AIzaSy-newkey \
-f openai_api_key=sk-svcacct-newkey

3.3 Rotate high-impact keys

Some keys require extra care:

JWT Secret (invalidates all active sessions)

# Generate a strong random secret
NEW_JWT=$(openssl rand -base64 48)
# Rotate during low-traffic hours
gh workflow run rotate-secrets.yml -f jwt_secret="$NEW_JWT"

Users will need to re-authenticate after this rotation.

Supabase Service Role Key

This key is generated by Supabase; you cannot choose the value.

  1. Go to https://supabase.com/dashboard/project/vuvwmfxssjosfphzpzim/settings/api
  2. Click Generate new keys (this regenerates anon key, service role key, and JWT secret simultaneously)
  3. Copy the new service_role key and the new JWT secret
  4. Rotate both in one run (the anon key is handled in step 5):
    gh workflow run rotate-secrets.yml \
    -f supabase_service_role_key=eyJhbG... \
    -f jwt_secret=new-jwt-secret-from-supabase
  5. Update the SUPABASE_ANON_KEY in any frontend .env files manually (this key is public but should stay current)

Stripe Keys

Stripe supports rolling keys: both old and new work during the overlap.

  1. Go to https://dashboard.stripe.com/apikeys
  2. Click Roll key on the secret key
  3. Copy the new key (old key stays valid for 24h by default)
  4. Rotate:
    gh workflow run rotate-secrets.yml -f stripe_secret_key=sk_live_newkey
  5. For webhook secrets, create a new webhook endpoint in Stripe, test it, then delete the old one

3.4 Trigger from the GitHub UI

  1. Go to the repo on GitHub → Actions tab
  2. Select Rotate Secrets from the left sidebar
  3. Click Run workflow
  4. Fill in only the keys you want to rotate (leave the rest blank)
  5. Click Run workflow

3.5 What to do if the workflow fails

Phase 1 (cloud secrets) fails:

  • Cloudflare or AWS credentials may be expired
  • Check: gh run view --log-failed
  • Fix credentials in GitHub Secrets, re-run

Phase 2 (test server) fails:

  • The new key may be invalid, or the test server is down
  • Production was NOT touched, so it is safe to investigate
  • SSH in and check: ssh -i ~/.ssh/nightly-audit [email protected] "docker logs accessible-pdf-pptx-remediate --tail 30"
  • The .env.node-server.bak on the test server has the previous values

Phase 3 (production) fails:

  • The workflow auto-rolled back: .env.node-server.bak was restored and the failed container restarted
  • Check which node failed: gh run view --log-failed
  • SSH in and verify: ssh -i ~/.ssh/nightly-audit [email protected] "docker ps"
  • The old key should still work since you haven’t revoked it yet
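
When investigating a failed phase, it can help to reproduce the health check by hand. The sketch below shows one way a check for "HTTP 200 and valid response body" might look. It is not scripts/smoke-test.sh: the function names are hypothetical, and treating a body as valid when it contains "status": "ok" is an assumption, since the real criterion is not documented here.

```shell
#!/usr/bin/env bash
# Sketch of a smoke test: pass only on HTTP 200 with a body that reports status "ok".
# The body criterion is an assumption; the real script may check something else.

# Pure check, kept separate from the network call so it can be run without a server.
check_response() {
  local status body
  status=$1
  body=$2
  if [ "$status" != "200" ]; then
    echo "FAIL: HTTP $status" >&2
    return 1
  fi
  if ! printf '%s' "$body" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'; then
    echo "FAIL: body does not report status ok" >&2
    return 1
  fi
  echo "PASS"
}

# Fetch a URL and run the check. curl appends the status code as the last output line.
smoke_test() {
  local url out status body
  url=$1
  out=$(curl -sS --max-time 10 -w '\n%{http_code}' "$url") || return 1
  status=$(printf '%s\n' "$out" | tail -n 1)
  body=$(printf '%s\n' "$out" | sed '$d')
  check_response "$status" "$body"
}
```

Run on the server itself, smoke_test would be pointed at the container's /health URL; the host and port depend on the deployment and are not specified in this document.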

4. Available Secrets

All inputs are optional; only provide the ones you’re rotating:

| Workflow Input | Env Variable | Deployed To |
| --- | --- | --- |
| anthropic_api_key | ANTHROPIC_API_KEY | Cloudflare, Docker |
| gemini_api_key | GEMINI_API_KEY | Cloudflare, Docker |
| openai_api_key | OPENAI_API_KEY | Cloudflare, Docker |
| mistral_api_key | MISTRAL_API_KEY | Cloudflare, Docker |
| marker_api_key | MARKER_API_KEY | Cloudflare, Docker |
| mathpix_app_key | MATHPIX_APP_KEY | Cloudflare, Docker |
| stripe_secret_key | STRIPE_SECRET_KEY | Cloudflare, Docker |
| stripe_webhook_secret | STRIPE_WEBHOOK_SECRET | Cloudflare, Docker |
| resend_api_key | RESEND_API_KEY | Cloudflare, AWS SSM, Docker |
| jwt_secret | JWT_SECRET | Cloudflare, AWS SSM, Docker |
| supabase_service_role_key | SUPABASE_SERVICE_ROLE_KEY | Cloudflare, AWS SSM, Docker |
| ses_webhook_secret | SES_WEBHOOK_SECRET | Cloudflare, Docker |
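
In the table above, every env variable is simply the workflow input name in uppercase, and inputs left blank are skipped. A minimal sketch of that mapping, assuming the inputs arrive as name=value arguments; input_to_env and collect_rotations are hypothetical helpers, not functions from the workflow:

```shell
#!/usr/bin/env bash
# Sketch: turn workflow inputs into ENV_NAME=value lines, skipping blank inputs.
# Both function names are illustrative.

# anthropic_api_key -> ANTHROPIC_API_KEY
input_to_env() {
  printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# Accepts name=value arguments and prints one ENV_NAME=value line per non-empty value.
collect_rotations() {
  local pair name value
  for pair in "$@"; do
    name=${pair%%=*}    # text before the first '='
    value=${pair#*=}    # text after the first '=', so base64 padding survives
    [ -n "$value" ] || continue
    printf '%s=%s\n' "$(input_to_env "$name")" "$value"
  done
}
```

Splitting on the first = only matters for values such as the output of openssl rand -base64, which can end in = padding.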

5. Quarterly Rotation Schedule

The rotation-reminder.yml workflow runs automatically on the 1st of January, April, July, and October. It creates a GitHub issue with a prioritized checklist.

Rotation priority tiers:

| Tier | Frequency | Keys |
| --- | --- | --- |
| High | Every 90 days | AWS_ACCESS_KEY_ID/SECRET, CLOUDFLARE_API_TOKEN, PACKAGES_TOKEN, JWT_SECRET |
| Standard | Every 6 months | ANTHROPIC_API_KEY, GEMINI_API_KEY, OPENAI_API_KEY, STRIPE_SECRET_KEY, RESEND_API_KEY |
| Low | Annually or on suspicion | MISTRAL_API_KEY, MARKER_API_KEY, MATHPIX_APP_KEY, VAPID_PRIVATE_KEY, TELEGRAM_BOT_TOKEN |

6. Next Steps: Automation Roadmap

6.1 Eliminate plaintext .env files on Docker hosts (Priority: HIGH)

The .env.node-server files on 10.1.1.4 and 10.1.1.17 are the weakest link. Two approaches to eliminate them:

Option A: Pull secrets from AWS SSM at container startup

Add an entrypoint wrapper that fetches secrets from SSM before starting the app:

entrypoint-with-secrets.sh
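#!/bin/bash
# Fetch each secret from SSM and export it before handing off to the app.
for PARAM in ANTHROPIC_API_KEY GEMINI_API_KEY STRIPE_SECRET_KEY JWT_SECRET; do
  VALUE=$(aws ssm get-parameter \
    --name "/accessible-pdf/production/$PARAM" \
    --with-decryption \
    --query Parameter.Value \
    --output text 2>/dev/null)
  if [[ -n "$VALUE" ]]; then
    export "$PARAM=$VALUE"
  else
    # Missing or unreadable parameter: warn instead of skipping silently.
    echo "warning: could not load $PARAM from SSM" >&2
  fi
done
# Replace the shell with the container's real command so it receives signals directly.
exec "$@"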

Effort: Modify each Dockerfile to use the wrapper entrypoint. Add IAM credentials to the Docker host (instance profile or env var). Move all secrets into SSM.

Benefit: Secrets never touch disk. Rotation becomes: update SSM → restart container. No .env file management.

Option B: Docker Swarm secrets

Convert from docker compose to Docker Swarm mode. Secrets are encrypted at rest and only mounted in-memory at /run/secrets/.

Effort: Higher. Requires Swarm init, service definitions, and code changes to read from /run/secrets/ instead of env vars.
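
For Option B, one way to defer the application code changes is a small entrypoint shim that exports each mounted secret file as an env variable before the app starts. This is a sketch of that bridging idea, not something the current setup contains: load_secrets_dir is a hypothetical helper, and it assumes each secret file is named after its env variable in lowercase with underscores (for example jwt_secret).

```shell
#!/usr/bin/env bash
# Sketch: export every file in a secrets directory as an env variable named after the file.
# load_secrets_dir is an illustrative name; file naming is an assumption.
load_secrets_dir() {
  local dir f name
  dir=${1:-/run/secrets}
  [ -d "$dir" ] || return 0          # nothing mounted: leave the environment untouched
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    # jwt_secret -> JWT_SECRET
    name=$(basename "$f" | tr '[:lower:]' '[:upper:]')
    # Command substitution strips the trailing newline that most secret files end with.
    export "$name=$(cat "$f")"
  done
}
```

A wrapper entrypoint would call load_secrets_dir and then exec "$@", mirroring the structure of the Option A wrapper. Reading the files directly in application code remains the stricter choice, since it keeps the values out of the process environment.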

Recommendation: Option A (SSM at startup). It’s the smallest change and aligns with the existing AWS infrastructure.

6.2 Fully automated key rotation for supported providers (Priority: MEDIUM)

Some providers support API-driven key rotation. A scheduled GitHub Action could rotate these without human involvement:

| Provider | API Support | Automation Path |
| --- | --- | --- |
| AWS IAM | Full | AWS Secrets Manager auto-rotation with a Lambda rotator function |
| Stripe | Full | POST /v1/api_keys/roll; Stripe rolls the key and both work during overlap |
| GitHub PAT | Full | Create fine-grained tokens with expiry via GitHub API, revoke old ones |
| Cloudflare | Full | Create/revoke API tokens via Cloudflare API |
| Anthropic | None | Manual, dashboard only |
| OpenAI | None | Manual, dashboard only |
| Resend | None | Manual, dashboard only |
| Supabase | None | Manual, regenerates all keys simultaneously |

Implementation plan:

  1. Start with AWS IAM: highest risk (long-lived credentials), best tooling (Secrets Manager has built-in rotation)
  2. Add Stripe: the rolling key API makes this straightforward
  3. Add GitHub PAT: use fine-grained tokens with 90-day expiry, auto-create replacements
  4. Add Cloudflare API token: rotate via API, update the GitHub Actions secret via gh secret set

For providers without rotation APIs, the quarterly reminder issue is the automation ceiling.

6.3 Secret scanning and leak detection (Priority: MEDIUM)

Add automated detection for accidental secret exposure:

  1. Enable GitHub secret scanning on the repo (Settings → Code security → Secret scanning)

    • GitHub natively detects leaked API keys for Stripe, AWS, Anthropic, OpenAI, and others
    • Sends alerts and can auto-revoke with partner providers
  2. Add gitleaks to CI:

    # Add to .github/workflows/test.yml
    - name: Scan for secrets
      uses: gitleaks/gitleaks-action@v2
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

    This fails the check on PRs that accidentally include secrets. To actually block merging, mark the check as required in the branch protection rules.

  3. Pre-commit hook (local):

    .pre-commit-config.yaml
    repos:
      - repo: https://github.com/gitleaks/gitleaks
        rev: v8.18.0
        hooks:
          - id: gitleaks

6.4 Centralized secrets dashboard (Priority: LOW)

Build a simple internal page that shows:

  • Last rotation date for each key (read from docs/admin/secrets-inventory-and-rotation.md Section 5)
  • Days until next rotation due
  • Color-coded status: green (current), yellow (due soon), red (overdue)
  • Direct links to provider dashboards for manual rotation

This could be a static page generated by a GitHub Action that reads the rotation log and publishes to an internal URL.
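
The color-coded status could be computed along these lines. This is a sketch under stated assumptions: rotation_status and WARN_DAYS are hypothetical names, the 14-day "due soon" window is an arbitrary choice, and date -d requires GNU date (available on GitHub-hosted Ubuntu runners, not on stock macOS).

```shell
#!/usr/bin/env bash
# Sketch: print green, yellow, or red for a key from its last rotation date.
# Usage: rotation_status LAST_ROTATED(YYYY-MM-DD) INTERVAL_DAYS [TODAY(YYYY-MM-DD)]
rotation_status() {
  local last interval today warn age remaining
  last=$1
  interval=$2
  today=${3:-$(date -u +%F)}
  warn=${WARN_DAYS:-14}              # hypothetical "due soon" window, in days

  # Work in UTC so daylight-saving shifts cannot distort the day count.
  age=$(( ($(date -u -d "$today" +%s) - $(date -u -d "$last" +%s)) / 86400 ))
  remaining=$((interval - age))

  if [ "$remaining" -lt 0 ]; then
    echo red        # overdue
  elif [ "$remaining" -le "$warn" ]; then
    echo yellow     # due soon
  else
    echo green      # current
  fi
}
```

For a High-tier key the interval would be 90; the Standard and Low tiers would use roughly 180 and 365.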

6.5 Migrate to HashiCorp Vault or AWS Secrets Manager (Priority: LOW, long-term)

The current system works well for the current scale. If the number of services or secrets grows significantly, consider a dedicated secrets manager:

  • AWS Secrets Manager: Native integration with Lambda, supports automatic rotation, audit trail via CloudTrail
  • HashiCorp Vault: More flexible, supports dynamic secrets (short-lived credentials generated on demand), but requires running and maintaining Vault infrastructure

When to consider: When you exceed ~30 secrets, add more servers, or need audit compliance (SOC 2, HIPAA) that requires formal secret access logging.


7. Security Notes

  • Never revoke the old key before the new one is deployed and verified. The workflow enforces this by testing before production.
  • Workflow inputs are masked with ::add-mask:: so values don’t appear in GitHub Actions logs.
  • The SERVER_SSH_KEY secret grants access to production servers. Rotate it periodically and limit who can trigger workflows.
  • Concurrency lock: The workflow uses concurrency: secret-rotation to prevent two rotations running simultaneously.
  • Rollback is automatic on production: if a container fails health checks, the .env.node-server.bak backup is restored.
  • Cloud secrets are NOT rolled back on failure. This is safe because the old key remains valid until you manually revoke it in the provider dashboard.
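
For reference, masking works by printing an ::add-mask:: workflow command for each value before it is used in a later step. A minimal sketch of such a step body follows; mask_values is a hypothetical helper and may not match how the workflow is actually written.

```shell
#!/usr/bin/env bash
# Sketch: emit an ::add-mask:: workflow command for every non-empty value.
# GitHub Actions then replaces those values with *** in subsequent log output.
mask_values() {
  local v
  for v in "$@"; do
    if [ -n "$v" ]; then
      printf '::add-mask::%s\n' "$v"
    fi
  done
}
```

Masking applies only to output produced after the command is printed, so the call belongs at the very start of the job. Multi-line values need one ::add-mask:: line per line of the value.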