Here’s a summary of the hardware and software that power Healthchecks.io.
Since 2017, Healthchecks.io has been running on dedicated servers at Hetzner. The current lineup is:
- HAProxy servers: 4x AX41-NVMe servers (Ryzen 3600, 6 cores)
- Web servers: 3x AX41-NVMe servers (Ryzen 3600, 6 cores)
- PostgreSQL servers: 2x AX101 servers (Ryzen 5950X, 16 cores)
All servers are located in the Falkenstein data center park, scattered across the FSN-DCx data centers so they are not all behind the same core switch. The monthly Hetzner bill is €484.
- Ubuntu 20.04 on all machines.
- Systemd manages services that need to run continuously (haproxy, nginx, postgresql, etc.)
- WireGuard for private networking between the servers. Tiered topology: HAProxy servers cannot talk to PostgreSQL servers.
- Netdata agent for monitoring the machines and the services running on them. Connected to Netdata Cloud for an easy overview of all servers.
- HAProxy 2.2 for terminating TLS connections, and load balancing between app servers. Enables easy rolling updates of application servers.
- PostgreSQL 13, streaming replication from primary to standby. No automatic failover: I can trigger failover with a single command, but the decision is manual.
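The tiered topology can be pictured as a small allow-list of which tier may open connections to which. The following is only a sketch: the tier names and the `ALLOWED_PEERS` table are hypothetical, and the one rule taken from the text is that HAProxy servers cannot reach PostgreSQL servers. The other entries are assumptions about how traffic would plausibly flow.

```python
# Hypothetical model of the tiered private network.
# Only "haproxy cannot reach postgresql" comes from the text;
# every other entry is an assumption made for illustration.
ALLOWED_PEERS = {
    "haproxy": {"web"},
    "web": {"haproxy", "postgresql"},
    # Assumption: primary and standby must reach each other for replication.
    "postgresql": {"web", "postgresql"},
}


def can_talk(src_tier: str, dst_tier: str) -> bool:
    """Return True if src_tier may open connections to dst_tier."""
    return dst_tier in ALLOWED_PEERS.get(src_tier, set())


if __name__ == "__main__":
    for src, dst in [("haproxy", "web"), ("haproxy", "postgresql"), ("web", "postgresql")]:
        print(f"{src} -> {dst}: {can_talk(src, dst)}")
```

In a real WireGuard setup, a restriction like this would typically be expressed through each peer's configuration rather than application code; the table above only makes the intent explicit.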
On app servers:
- uWSGI runs the Healthchecks Python application (web frontend, management API).
- hchk, a small application written in Go, handles the ping API (hc-ping.com) and inbound email.
- NGINX handles rate limiting, static file serving, and reverse proxying to uWSGI and hchk.
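To illustrate how a load balancer in front of the app servers makes rolling updates possible, here is a minimal sketch that builds the sequence of steps for updating one server at a time. The backend name, server names, and deploy command are hypothetical, and this is not the actual deployment script. The `set server <backend>/<server> state ...` lines follow the HAProxy runtime API syntax; the sketch only prints a plan and does not talk to HAProxy.

```python
def rolling_update_plan(backend, servers, deploy_cmd):
    """Build an ordered list of steps that updates one app server at a time.

    Each server is drained (stops receiving new connections), updated,
    and put back into rotation before the next server is touched.
    """
    steps = []
    for server in servers:
        # With several HAProxy machines, the runtime command would need
        # to be issued on each of them (assumption).
        steps.append(f"haproxy: set server {backend}/{server} state drain")
        steps.append(f"{server}: {deploy_cmd}")
        steps.append(f"haproxy: set server {backend}/{server} state ready")
    return steps


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for step in rolling_update_plan("web_backend", ["web1", "web2", "web3"], "./deploy.sh"):
        print(step)
```

Because only one server is out of rotation at any moment, the remaining servers keep serving traffic throughout the update.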
- AWS S3 for storing encrypted database backups.
- Braintree for accepting payments and managing subscriptions.
- Cloudflare for hosting DNS records.
- Elastic Email for sending transactional email.
- Fastmail for sending and receiving support email.
- GitHub for version control and tracking issues, and GitHub Actions for running tests on every commit.
- HardyPress for blog.healthchecks.io (static WordPress blog as-a-service).
- HetrixTools for uptime monitoring.
- IcoMoon for authoring icon fonts.
- pgDash for monitoring PostgreSQL servers. Here’s a blog post about setting it up.
- PingPong for powering status.healthchecks.io (service status, incidents, planned downtimes, performance metrics).
- SSLMate for provisioning certificates from the command line.
- Syften for getting notifications when Healthchecks is mentioned on HN, Twitter, Reddit and elsewhere.
- Twilio for sending SMS, WhatsApp and phone call notifications.
Healthchecks.io, the cron job monitoring service, uses cron jobs itself for the following periodic tasks:
- Once a day, make a full database backup, encrypt it with gpg, and upload it to AWS S3.
- Once a day, send “Your account is inactive and is about to be deleted” notifications to inactive users.
- Once a day, send “Your subscription will renew on …” notifications for annual subscriptions that are due for renewal in 1 month.
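The daily backup job (dump, encrypt, upload) could be sketched roughly as follows. This is a minimal illustration, not the actual script: the database name, GPG recipient, bucket name, and `/tmp` path are hypothetical, and it assumes public-key encryption to a recipient, which the text does not specify. The function only builds the command lines; running them (for example with `subprocess.run(cmd, check=True)`) is left to the caller.

```python
from datetime import date


def backup_commands(db_name, gpg_recipient, bucket, day):
    """Return the three commands for one daily backup, in execution order."""
    dump_path = f"/tmp/{db_name}-{day.isoformat()}.dump"
    encrypted_path = dump_path + ".gpg"
    object_key = encrypted_path.rsplit("/", 1)[-1]
    return [
        # 1. Full database dump in PostgreSQL's custom archive format.
        ["pg_dump", "--format=custom", f"--file={dump_path}", db_name],
        # 2. Encrypt the dump with gpg (assumption: public-key encryption).
        ["gpg", "--batch", "--yes", "--encrypt", "--recipient", gpg_recipient,
         "--output", encrypted_path, dump_path],
        # 3. Upload the encrypted file to S3 with the AWS CLI.
        ["aws", "s3", "cp", encrypted_path, f"s3://{bucket}/{object_key}"],
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for cmd in backup_commands("hc", "backups@example.org", "example-backups", date.today()):
        print(" ".join(cmd))
```

Putting the date in the file name keeps each day's backup as a separate object, so an upload never overwrites an earlier backup.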
Bonus – Development and Deployment Setup
- My main dev machine is a desktop PC with a single 27″ 1440p display.
- Ubuntu 20.04, GNOME Shell.
- Sublime Text for editing source code. A combination of meld, Sublime Merge and command-line git for working with git.
- YubiKeys for signing git commits and logging into servers.
- Fabric scripts for deploying code and running maintenance tasks on servers.
- sops for storing secrets.
- A dedicated laptop inside a dedicated backpack, for dealing with emergencies while away from the main PC.