First, a TL;DR on how much money I’m making: Healthchecks.io has around 90 paying customers, and the monthly revenue is a little above $700. The bulk of that goes back into running costs.
I’m Pēteris Caune, a 34-year-old guy from Latvia. I’m married and the father of a baby daughter. I ride and race mountain bikes. In my day job, I work remotely for a small Irish company. I do Python web applications and Android mobile applications mostly, but it varies a lot.
Here’s the elevator pitch (assume a tall building).
Let’s say you just finished setting up a cron job that makes database backups and uploads them to S3. You just ran it by hand and made sure a new .sql.gz file appeared in S3. All is well! Now, if it stops working one day six months from now, would you notice? The backup job can fail in many ways; here are just a few possibilities:
- A well-meaning DBA changes the database password, but forgets to update the backup script
- Slowly, over time, the machine doing backups runs out of disk space
- Somebody “cleans up” AWS IAM policies and the script cannot upload to S3 anymore
- Everybody has forgotten which machine and which user account is doing the database backups, and the machine gets decommissioned
- The machine gets rebooted, and the backup script now fails because reboots were not tested
Here’s what you can do: edit the backup script to send an HTTP GET request to Healthchecks as the very last step. Healthchecks will treat these requests as “I’m still alive!” messages and will keep track of them. As soon as your service is silent for too long, Healthchecks will send you an alert (configurable: email, SMS, Slack, etc.). And since Healthchecks runs on a separate host in a separate datacenter, you will get an alert even if your entire DC goes down.
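As a rough sketch of that “very last step”, here is how a Python wrapper around a backup job might look. The function name `run_then_ping`, the `opener` parameter and the placeholder URL are my own illustrative assumptions, not part of Healthchecks; the real ping URL is whatever the service shows for your check.

```python
import urllib.request


def run_then_ping(job, ping_url, timeout=10, opener=urllib.request.urlopen):
    """Run job(); if it returns without raising, send one HTTP GET to ping_url.

    Returns True if the ping was delivered and False if the ping itself failed.
    Exceptions raised by job() propagate, so a failed run never sends a ping.
    """
    job()  # e.g. dump the database and upload the archive to S3
    try:
        # The ping is the very last step, so it only happens after success.
        with opener(ping_url, timeout=timeout):
            pass
        return True
    except OSError:
        # urllib's URLError is an OSError. A monitoring hiccup should not
        # turn a good backup into a failed one, so report it and move on.
        return False
```

You would call it as `run_then_ping(make_backup, "https://example.invalid/your-check-url")`, where `make_backup` is a hypothetical function that does the actual work. Because the ping is skipped whenever the job raises, silence is what signals the failure.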
A quick word of caution: in this specific database backup example, you still want to test the backups by restoring them regularly. There are failure modes where the backup seemingly completes successfully, but the generated database dump is invalid or incomplete.
What other things can you monitor? Here are examples that would benefit from Healthchecks-style monitoring:
- A job that runs weekly and sends out newsletters or weekly reports
- A job that synchronizes business data between separate systems, for example one that fetches an RSS feed and updates entries in a local database
- A job that checks database replication status every minute
- A job that updates DNS entries when the IP address changes
- A job that renews a Let’s Encrypt certificate. Alternatively, one that monitors the expiry status of a certificate
- A machine that sends pings unconditionally every minute. You receive an alert when the machine loses the network connection or is powered off
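All of these rely on the same idea: decide whether a check has been “silent for too long”. Here is a minimal sketch of that decision, assuming each check has an expected period between pings plus some extra grace time. The function name, the parameters and the status labels are my assumptions for illustration, not a description of the actual Healthchecks code.

```python
from datetime import datetime, timedelta


def check_status(last_ping, period, grace, now):
    """Classify a check by how long it has been silent.

    last_ping: datetime of the most recent ping, or None if never pinged.
    period:    timedelta, the expected time between pings.
    grace:     timedelta, extra time allowed before alerting.
    now:       datetime, the current time (passed in to keep this testable).
    """
    if last_ping is None:
        return "new"  # nothing received yet, so nothing to judge
    silent_for = now - last_ping
    if silent_for <= period:
        return "up"
    if silent_for <= period + grace:
        return "late"  # overdue, but still within the grace time
    return "down"  # silent for too long: time to send an alert
```

For a daily backup you might use a period of one day and a grace time of an hour, so a job that finishes a few minutes later than usual does not trigger an alert.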
Why I started Healthchecks
I started work on Healthchecks three years ago, in summer 2015. I was looking for a service like this myself. Dead Man’s Snitch and Cronitor, higher-priced Healthchecks competitors, already existed. However, they were too expensive for the relatively unimportant things I wanted to monitor. A little arrogantly, I thought I could build something that is cheaper and better. I was also looking for an excuse to work on something fun. Compared to some of my work assignments, here I would be in complete control of the product features, the design, the technical nitty-gritty, the pricing strategy and everything else. I mulled over the idea for some time. Still undecided, I started hacking on a blank Django project in June 2015. A month later, I registered the healthchecks.io domain name, and at that point, the game was on!
Timeline of Notable Events
2015–06–11 First commit.
2015–07–18 Registered the healthchecks.io domain name
2015–07–29 The website goes live, running on a single $5 DigitalOcean droplet
2015–09–30 Added Slack and HipChat integrations
2015–10–21 Published “Deploying a Django App with No Downtime”, HN: 184 points, 93 comments
2015–12–10 Braintree payments setup complete.
2016–03–31 First paying customer! $5 MRR
2016–05–10 Implemented Team Access
2016–06–07 100M processed pings
2016–08–20 While I am road tripping and camping in the wilderness, hchk.io goes down for 24 hours.
Side note: After this incident I bought a used Thinkpad X240 and set up a development environment on it. It now travels with me when I leave home for more than a few hours. I have been poking around the servers while sitting in a parking lot before a cross-country MTB race. The laptop is set up with full disk encryption in case it gets lost or stolen. My GPG/SSH key sits on a Yubikey.
2016–09–24 200M processed pings
2016–10–31 $100 MRR
2016–12–27 Implemented Cron expression support
2017–05–04 Migration to Google Cloud Platform
2017–07–31 Finished off and published Cron Syntax Cheatsheet
2017–08–20 1 billion pings processed
2017–10–29 Migration to Hetzner. Bare metal servers.
2018–08–24 Processing around 100 pings per second. $700 MRR, still a hobby project.
Healthchecks.io gets a dozen or so new signups per day. Most are just checking out the service. But there are also people who register and set up ten checks and ping them right away.
Currently, Healthchecks.io receives 8 million pings per day. There is rate-limiting for checks that get pinged very often. Of the daily 8 million, about 4 million get written to the database.
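To give a feel for how such rate-limiting can work, here is a small sketch that skips the database write when the same check was already written very recently. This is purely an illustration under my own assumptions (the class name, the per-check minimum interval, the in-memory dictionary); it is not the actual mechanism Healthchecks.io uses.

```python
class PingWriteLimiter:
    """Allow at most one stored ping per check within min_interval_seconds."""

    def __init__(self, min_interval_seconds):
        self.min_interval = min_interval_seconds
        self._last_write = {}  # check id -> timestamp of the last stored ping

    def should_write(self, check_id, now):
        """Return True if this ping should be written to the database."""
        last = self._last_write.get(check_id)
        if last is not None and now - last < self.min_interval:
            return False  # pinged again too soon: accept it, but skip the write
        self._last_write[check_id] = now
        return True
```

With a 60-second interval, a check that pings every few seconds would produce roughly one stored row per minute, while a daily cron job would be unaffected.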
Most active accounts have 2–20 checks. There are quite a few heavy users too: one account has 900+ checks, another has 400+, another has 300+ checks. There are 17 accounts with over 100 checks.
The most popular notification method is email, followed by webhooks and Slack.
Profit-wise, Healthchecks is still firmly a side project for me. After bills and taxes, there is little profit left. I could cut costs by migrating to a couple of cheap VPSes, and by getting rid of the load balancer. I could severely limit the free plan, and force people to upgrade to paid plans. But by doing that, I would give up my initial goals: a good-quality service that is free for individuals and fairly priced for companies. Healthchecks would turn from “a project I love hacking on and am proud of” to “a project I do solely for money while hating myself”. So I’m not doing that.
I have no big announcements to write about here. I will keep making small iterative improvements to the service. I will try and keep the code and the design as simple (think KISS) as I can. When it becomes financially viable, I will look at expanding the team-of-one, to improve the bus factor.
With that, thanks for reading! If you haven’t already, check out Healthchecks.io. The project is also open source: you can grab the code from GitHub, change and improve it, and host your own instance.