SpeakTrue

Web app: Vultr VPS, optional home server, CI/CD, and failover

Last updated: 2026-04-23

This document describes how to run web/python-web-app (Flask + Gunicorn) on a VPS, optionally mirror it on a home server, deploy both from one pipeline, and route traffic with automatic failover. The marketing site on Cloudflare Pages and Supabase (DB, auth, Edge Functions) are unchanged; this document covers only the origin for the Flask app.

Repo artifacts:

web/python-web-app/deploy/vultr/gunicorn.service.example — systemd unit
web/python-web-app/deploy/vultr/nginx-site.conf.example — Nginx reverse proxy

Health check path for monitors / load balancers: GET /health (see web/python-web-app/src/routes/health.py).

1. Architecture options

1.1 Single origin (simplest)

Cloudflare (DNS + optional proxy)
    └── app.example.com → A record → Vultr (Nginx:443 → Gunicorn:127.0.0.1:8000)
            └── Supabase (hosted): DB, auth, storage, Edge Functions

Use SSL/TLS → Full (strict) when Nginx has a valid Let’s Encrypt certificate.
Do not expose Gunicorn to the public internet; only Nginx on 80/443.

1.2 Two origins + automatic failover (Vultr + home)

Users → one hostname (e.g. app.example.com)
     → Cloudflare Load Balancing (health monitors on /health)
         ├── Pool member 1: Vultr (origin)
         └── Pool member 2: Home (origin, or Cloudflare Tunnel endpoint)
     └── Unhealthy members removed; traffic uses healthy origin

Do not rely on “two A records in DNS” as the failover mechanism: resolvers and clients cache answers, so failover is slow and inconsistent.
Use a load balancer with health checks (Cloudflare Load Balancing, or another product with the same idea).
“Seamless” = new requests go to the surviving origin quickly. In-flight requests to a dead server still fail; that is expected.
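
The health-check behavior described above can be modeled in a few lines. This is a simplified sketch of the idea, not Cloudflare's actual algorithm: the Origin and Pool names, the strict "first healthy member wins" ordering, and both thresholds are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Origin:
    """One pool member plus the monitor's current view of it."""
    name: str
    healthy: bool = True
    consecutive_failures: int = 0
    consecutive_successes: int = 0


class Pool:
    """Ordered pool: the first healthy origin receives new requests."""

    def __init__(self, origins, fail_threshold=2, recover_threshold=2):
        self.origins = list(origins)
        self.fail_threshold = fail_threshold        # assumed value
        self.recover_threshold = recover_threshold  # assumed value

    def record_probe(self, name, ok):
        """Feed one /health probe result into the named origin's state."""
        origin = next(o for o in self.origins if o.name == name)
        if ok:
            origin.consecutive_successes += 1
            origin.consecutive_failures = 0
            if origin.consecutive_successes >= self.recover_threshold:
                origin.healthy = True
        else:
            origin.consecutive_failures += 1
            origin.consecutive_successes = 0
            if origin.consecutive_failures >= self.fail_threshold:
                origin.healthy = False

    def pick(self):
        """Return the origin name for a new request, or None if all are down."""
        for origin in self.origins:
            if origin.healthy:
                return origin.name
        return None


pool = Pool([Origin("vultr"), Origin("home")])
pool.record_probe("vultr", ok=False)
print(pool.pick())  # vultr (one failure is still below the threshold)
pool.record_probe("vultr", ok=False)
print(pool.pick())  # home
```

The gap between the first failed probe and the switch is why new requests can still hit a dead origin for a short time; shorter probe intervals and lower thresholds shrink that window at the cost of more false alarms.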

1.3 Marketing + API-style split (matches older plan)

Apex or www → Cloudflare Pages (marketing).
app or api subdomain → Vultr (or Load Balancer in front of Vultr + home).

2. Instance sizing (Vultr / similar)

1 vCPU, 2 GB RAM, ~55 GB disk, ~$10/mo + ~$2 backup is a reasonable starting size for Nginx + a few Gunicorn workers and low–moderate traffic, if heavy work stays on Supabase/Edge and you are not running many parallel FFmpeg jobs on the box.
FFmpeg can spike CPU/RAM; if you see OOM or long queues, move to 2 vCPU and/or 4 GB RAM.
Disk is enough for OS, venv, code, and logs; large blobs should stay in GCS / Supabase Storage as in your stack.
Backups (provider snapshots) help recover a broken Nginx, systemd, or bad deploy; they do not replace your Supabase backup story for data.
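
To make "a few Gunicorn workers" concrete, the sketch below bounds the worker count by both CPU and memory. The 2 x cores + 1 figure is the starting point suggested in Gunicorn's documentation; the per-worker and reserved memory figures are assumptions and should be replaced with measurements from the real host.

```python
def suggested_workers(vcpus, ram_mb, per_worker_mb=250, reserved_mb=768):
    """Return a worker count bounded by both CPU and memory.

    per_worker_mb and reserved_mb (OS, Nginx, FFmpeg headroom) are
    assumptions; measure real usage before relying on them.
    """
    cpu_bound = 2 * vcpus + 1  # starting point from the Gunicorn docs
    memory_bound = (ram_mb - reserved_mb) // per_worker_mb
    return max(1, min(cpu_bound, memory_bound))


print(suggested_workers(vcpus=1, ram_mb=2048))  # 3
print(suggested_workers(vcpus=2, ram_mb=4096))  # 5
```

On both sizes the CPU bound is the limiting factor under these assumptions; a heavier per-worker footprint or concurrent FFmpeg jobs would shift the limit to memory.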

3. First-time server setup (each Linux host: Vultr and/or home)

Applies to both machines if you run two origins; use the same deploy layout on both if you want identical deploy scripts.

OS: Ubuntu 22.04/24.04 LTS.
Firewall (ufw): allow 22 (SSH), 80, 443; deny everything else from the outside. Do not open the Gunicorn port publicly.
Packages (example): git, nginx, certbot, python3-certbot-nginx, ufw, python3-venv, python3-pip, libpq-dev (if you use psycopg2 against a direct DATABASE_URL), ffmpeg (app uses it; see FFMPEG_TIMEOUT_SECONDS in the codebase).
Deploy user (recommended): non-root deploy with sudo for service reloads, SSH key in authorized_keys. Disable password SSH and root login if policy allows.

Code layout (example, adjust paths in systemd/Nginx/CI to match):

/var/www/speaktrue/                    # git clone root
└── web/python-web-app/
    ├── venv/                          # python3 -m venv venv
    ├── .env                          # secrets, chmod 600, not in git
    ├── gb_tts_app.py
    └── ...

Application environment: set variables the app reads (at minimum whatever you use in production), for example:
- DATABASE_URL (if the Flask stack talks to Postgres directly)
- SUPABASE_URL, SUPABASE_ANON_KEY, and if needed server-side SUPABASE_SERVICE_ROLE_KEY (see web/python-web-app/src/services/supabase_edge_client.py and related)
- GCS and ElevenLabs keys if those code paths are enabled
- FFMPEG_TIMEOUT_SECONDS=180 (aligns with web/python-web-app/Procfile)
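
A small pre-start check can catch a missing variable before Gunicorn boots. This is a sketch, not code from the repository: the function names are hypothetical, and which variables are truly required depends on the code paths you enable.

```python
import os

# Names come from the list above. Treating exactly these two as required
# is an assumption; adjust to the code paths you actually run.
REQUIRED = ("SUPABASE_URL", "SUPABASE_ANON_KEY")


def missing_env(environ=None, required=REQUIRED):
    """Return the required variable names that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


def ffmpeg_timeout(environ=None, default=180):
    """Parse FFMPEG_TIMEOUT_SECONDS, falling back to the documented 180."""
    env = os.environ if environ is None else environ
    raw = env.get("FFMPEG_TIMEOUT_SECONDS", "").strip()
    return int(raw) if raw.isdigit() else default


example = {"SUPABASE_URL": "https://example.supabase.co", "SUPABASE_ANON_KEY": ""}
print(missing_env(example))     # ['SUPABASE_ANON_KEY']
print(ffmpeg_timeout(example))  # 180
```

Calling missing_env() with no arguments inspects the real process environment, so it could run once at application startup or as a step in the deploy script.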

Install deps:

cd /var/www/speaktrue/web/python-web-app
source venv/bin/activate
pip install -r requirements.txt

Gunicorn (production entry): Procfile uses
FFMPEG_TIMEOUT_SECONDS=180 gunicorn gb_tts_app:app
Bind 127.0.0.1:8000 behind Nginx. Copy deploy/vultr/gunicorn.service.example to /etc/systemd/system/ (e.g. as speaktrue-web.service), edit it, then run sudo systemctl daemon-reload followed by sudo systemctl enable --now speaktrue-web. The archived Vultr workflow in .github/workflows/deploy-web-app.yml.disabled used SERVICE_NAME=speaktrue-web before CI deployment was disabled.
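Passwordless restart for the deploy user (for CI over SSH), e.g. a drop-in file under /etc/sudoers.d/, edited with visudo -f so that a syntax error cannot lock you out of sudo:
deploy ALL=(ALL) NOPASSWD: /bin/systemctl restart speaktrue-web, /bin/systemctl status speaktrue-web
(adjust user and unit to match your setup; confirm the systemctl path with command -v systemctl).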
Nginx: copy deploy/vultr/nginx-site.conf.example into your Nginx site configuration, set server_name, run certbot --nginx for TLS, validate with nginx -t, then reload Nginx. Optionally add a location /static/ alias to web/python-web-app/static/ as in earlier snippets.
One-off / release task: web/python-web-app/Procfile has release: python import_glossary.py — run when needed after deploy, same venv, with DATABASE_URL set:
cd .../web/python-web-app && source venv/bin/activate && python import_glossary.py
Cloudflare → origin: A record to this server’s public IP, Proxy on or off. With Proxy on, set SSL mode to Full (strict) once origin has a real cert.

4. Home server: making it a second origin

Approach	Notes
Port forward + dynamic DNS	Router forwards 443 to the box; DDNS if residential IP changes. You get a public hostname or IP for the LB.
Cloudflare Tunnel (`cloudflared`)	Outbound-only from home; no inbound port open. Cloudflare can send traffic to the tunnel; configure your LB / DNS per Cloudflare’s docs for your plan.
VPN only (Tailscale, etc.)	Good for SSH and CI from a runner inside the network, not usually for public users unless you add another layer.

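Home connections are usually asymmetric: upload bandwidth is much lower than download, and the link may be slower or less reliable than a datacenter. Treat the home server as a backup / secondary origin, or give it a lower LB weight (if your product supports weights).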
If CI cannot reach home’s SSH from the internet, use a self-hosted GitHub runner on the LAN, or deploy to home only via manual / scheduled job when the path is up.

5. CI/CD: one pipeline, two deploy targets

Goal: same commit on main (or your release branch) is deployed to Vultr and home with the same steps.

Typical steps on each host (run over SSH as deploy):

cd /var/www/speaktrue && git fetch --tags origin && git checkout <ref> (for a branch, follow with git pull --ff-only; a tag or commit SHA leaves a detached HEAD and needs no pull).
cd web/python-web-app && source venv/bin/activate && pip install -r requirements.txt
Run python import_glossary.py if your release process requires it.
sudo systemctl restart speaktrue-web (or whatever you named the unit).
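
The steps above can be sketched as one script sent to each host over SSH. This is an illustration under stated assumptions, not the archived workflow: the function names and example hostnames are hypothetical, ref is assumed to be a tag or full commit SHA, and venv/bin/pip is used in place of activating the venv (the two are equivalent).

```python
import shlex
import subprocess


def remote_script(ref, unit="speaktrue-web", repo="/var/www/speaktrue",
                  run_glossary=False):
    """Build the bash script run on each host; it stops at the first failure."""
    app_dir = f"{repo}/web/python-web-app"
    steps = [
        "set -euo pipefail",
        f"cd {shlex.quote(repo)}",
        "git fetch --tags origin",
        f"git checkout --detach {shlex.quote(ref)}",
        f"cd {shlex.quote(app_dir)}",
        "venv/bin/pip install -r requirements.txt",
    ]
    if run_glossary:
        # Needs DATABASE_URL available in the remote environment.
        steps.append("venv/bin/python import_glossary.py")
    steps.append(f"sudo systemctl restart {shlex.quote(unit)}")
    return "\n".join(steps) + "\n"


def deploy(host, ref, user="deploy", dry_run=False):
    """Run the script on one host over SSH and return the ssh argv used."""
    argv = ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", "bash", "-s"]
    if not dry_run:
        # The script is passed on stdin; check=True raises if any step fails.
        subprocess.run(argv, input=remote_script(ref), text=True, check=True)
    return argv


# Hypothetical hosts; dry_run=True only shows what would be executed.
for host in ("vultr.example.com", "home.example.com"):
    print(deploy(host, "v1.2.3", dry_run=True))
```

Because both hosts receive the same script for the same ref, a failure on one host surfaces as an exception instead of leaving the two origins silently on different commits.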

GitHub repository secrets (names are suggestions):

Secret	Use
`DEPLOY_VULTR_HOST`	Vultr public IPv4/hostname for SSH
`DEPLOY_VULTR_SSH_KEY`	Private key for `deploy@` that host
`DEPLOY_HOME_HOST`	Home DDNS/hostname (or jump host) for SSH, when used
`DEPLOY_HOME_SSH_KEY`	Private key for home, when used

The Vultr GitHub Actions workflow is currently disabled at .github/workflows/deploy-web-app.yml.disabled because the active production path is the homelab Docker deploy and the Vultr SSH secrets are not configured. To restore Vultr CI later, rename it back to .github/workflows/deploy-web-app.yml, configure the DEPLOY_VULTR_* secrets, and re-check the trigger policy before enabling push deploys.

Idempotence: run migrations / glossary import in a way that is safe to repeat on every deploy, or gate them behind flags.
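
One common way to make an import safe to repeat is an upsert keyed on a natural identifier. The sketch below uses an in-memory SQLite database purely for illustration; the glossary table, its columns, and import_terms are hypothetical and do not describe what import_glossary.py actually does. PostgreSQL accepts the same ON CONFLICT ... DO UPDATE form.

```python
import sqlite3


def import_terms(conn, terms):
    """Upsert (term, definition) pairs; re-running leaves one row per term."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS glossary ("
        "term TEXT PRIMARY KEY, definition TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO glossary (term, definition) VALUES (?, ?) "
        "ON CONFLICT(term) DO UPDATE SET definition = excluded.definition",
        terms,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
terms = [("origin", "Server that answers requests for the app hostname")]
import_terms(conn, terms)
import_terms(conn, terms)  # a second run changes nothing
print(conn.execute("SELECT COUNT(*) FROM glossary").fetchone()[0])  # 1
```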

6. Cloudflare Load Balancing (failover)

In Cloudflare, enable Load Balancing (paid add-on; confirm current pricing in your account).
Create a pool with two origins (Vultr and home, or tunnel hostnames, each serving HTTPS for the app hostname you configure).
Health monitor: HTTPS (or HTTP if you only use 80 for checks — prefer HTTPS to match production), path /health, interval and thresholds per Cloudflare’s guidance.
Attach a load balancer with hostname app.example.com (or your chosen name) to that pool, with failover (or least load) policy.
SSL/TLS: use Full (strict); each origin must present a valid cert for the hostname the monitor hits (or the tunnel’s configuration).

Without Load Balancing: you can still have two deployed copies (backup + manual DNS switch, or DDNS swap) — that is not automatic “seamless” failover for end users.

7. Limitations to plan for

Topic	Impact
In-flight requests	If origin A dies, active TCP connections to A fail. New traffic should go to B after the LB marks A down.
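Flask sessions	Flask’s default session is a signed cookie held by the client, so it survives a move from A to B only if both origins use the same SECRET_KEY. Server-side session state kept only in A’s memory is lost on failover, and the user may need to sign in again. Mitigation: a shared session store (e.g. Redis) that both origins use, or stateless JWTs for API-style auth.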
WebSockets / long polling	Clients should reconnect; LB will send new connections to a healthy member.
Shared filesystem	If you ever store uploads only on one disk, the other won’t have them — keep user files in Supabase Storage / GCS / DB.
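
For the signed-cookie case in the table above, a cookie issued by one origin verifies on the other only when both hold the same signing key. The sketch below shows the principle with the standard library's HMAC. It is not Flask's actual cookie format, and the sign and verify helpers are hypothetical.

```python
import hashlib
import hmac


def sign(value, secret_key):
    """Return 'value.signature' using HMAC-SHA256."""
    mac = hmac.new(secret_key, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}.{mac}"


def verify(cookie, secret_key):
    """Return the value if the signature matches, else None."""
    value, _, mac = cookie.rpartition(".")
    expected = hmac.new(secret_key, value.encode(), hashlib.sha256).hexdigest()
    return value if hmac.compare_digest(mac, expected) else None


cookie = sign("user=42", b"key-on-vultr")
print(verify(cookie, b"key-on-vultr"))  # user=42
print(verify(cookie, b"key-on-home"))   # None
```

In practice this means the same secret value belongs in the .env file on both hosts, handled like any other production secret.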

8. Quick verification

curl -fsS https://app.example.com/health
curl -I https://app.example.com
sudo journalctl -u speaktrue-web -f
sudo tail -f /var/log/nginx/access.log
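
The same /health check can be scripted, for example to probe each origin directly before adding it to a pool. This is a minimal sketch using only the standard library; is_healthy is a hypothetical helper, and treating any 2xx status as healthy is an assumption about what the route returns.

```python
import http.client
import urllib.request


def is_healthy(base_url, path="/health", timeout=5.0):
    """Return True if GET base_url + path answers with a 2xx status."""
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Covers DNS failures, refused connections, timeouts, TLS errors,
        # and 4xx/5xx responses (urllib raises HTTPError for those).
        return False
```

Calling is_healthy("https://app.example.com") mirrors the first curl command above: it returns True only when the request completes with a success status within the timeout.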

9. Cost reference (illustrative)

Item	Notes
Vultr (example)	~$10/mo + ~$2 backup in your case
Cloudflare	Plan-dependent; Load Balancing is a paid add-on — check dashboard
Supabase / other backends	Unchanged from your project billing

Related documents:

docs/ops/CLOUDFLARE_PAGES_DEPLOY.md — marketing on Pages
docs/ops/SECRETS_HYGIENE.md — handling secrets
docs/ops/SUPABASE_AUTH_REDIRECTS.md — auth URLs when hostnames change