Operator handbook

Daily playbook for teams running Velocity in staging and production.

Last updated October 8, 2025View on GitHub

Run Velocity with confidence

This handbook is written for the people who babysit Velocity day and night: platform engineers, SREs, and operations teams. Use it as a living checklist—keep it close to your runbooks and automation.

Readiness scan

ItemTarget stateVerification tip
ToolingRust ≥ 1.81, Node ≥ 20, velocity-cli installedrustc --version && node --version && velocity --version
NetworkUDP/4433 permitted end-to-end; fallback TCP path definednmap -sU <host> -p 4433 and a one-shot HTTP/3 probe
CertificatesHybrid Dilithium + ECDSA chain available, stored securelyvelocity-cli cert inspect certs/<name>
HostsService account owns /etc/velocity (read) and /var/lib/velocity (write)sudo -u velocity touch /var/lib/velocity/.probe

Bake these checks into CI or pre-flight jobs so every environment starts from the same baseline.

First hour runbook

  1. Clone the pilot config: copy the starter YAML, rename it to match the environment, and commit it to your infrastructure repo.
  2. Generate or hybridise certificates: use velocity-cli cert hybridize against real chains; store outputs in your secret manager.
  3. Launch the service: velocity-cli serve --config /etc/velocity/<env>.yaml from your automation or systemd unit.
  4. Smoke-test clients: run the Velocity client and a plain HTTP/3 request to verify both the PQ and fallback paths.
  5. Snapshot telemetry: capture metrics, logs, and the admin diagnostic bundle as your gold baseline.

Configuration layers in practice

  • Base file (serve.<env>.yaml) holds the long-lived settings: listeners, profiles, ticket rotation, and proxy targets.
  • Environment variables (VELOCITY_*) override secrets or unique host data—perfect for templated deployments.
  • Edge runtime file (edge.yaml) gives you programmable routing, auth, and templated responses without rebuilding binaries.
  • CLI flags have the last word. Use them sparingly for emergency overrides so the source of truth stays declarative.

Keep version-controlled samples for each layer and run velocity-cli config lint before committing changes.

Observe and react

  • Scrape the Prometheus endpoint and watch handshake latency (velocity_handshake_duration_seconds) plus downgrade counters.
  • Stream structured logs to your SIEM. Tag by profile, workload, and downgrade_reason to accelerate incident triage.
  • Schedule velocity-cli admin diagnostics --dump as a cron job so you always have recent snapshots for forensics.

Troubleshooting playbook

SignalLikely causeFirst move
CERT_PQ_VALIDATION_FAILEDMissing Dilithium signature or expired chainRe-issue the hybrid cert, confirm host clock, reload Velocity.
Spike in downgradesClients negotiating HTTP/3 fallback or policy mismatchInspect downgrade_reason, adjust profile permits, notify client owners.
Tickets failing to rotateSecret not writable or cron misconfiguredCheck permissions on /var/lib/velocity, re-run rotation job manually.
Metrics gapScrape target moved or auth token expiredReconcile Prometheus config, validate bearer tokens, retry scrape.
UDP dropsLoad balancer unaware of QUICSwitch to UDP pass-through mode or deploy a supported proxy.

Keep everyone aligned

  • Publish the readiness and troubleshooting tables above in your internal wiki.
  • Automate ticket rotation (24h max age, 12h rotation) using your secrets platform.
  • Review the Security playbooks quarterly and practise the exploit hardening drills.
  • Surface lessons learned in the next release retro and feed updates back into this handbook via pull requests.