Intro

I needed a tool to alert me when my cron jobs silently fail. There is already a number of existing services for this, but it seemed like a fun thing to build myself. So I present to you: healthchecks.io

I am using this myself and it has already been useful for me a couple times. Say, a seemingly benign code change in one service causes my batch job to fail 12 hours later, in the middle of night. Without any monitoring I might be blissfully unaware for days or months, until I need those backups or whatever, but now I get an email alert and can get it sorted in minutes. Sweet!

I licensed this under BSD licence, hoping it might be useful for other people too. It’s such a simple service it feels wrong to charge big bucks for it. You can grab the code from GitHub, run it, extend it, add unicorns or raptors, and so on. Or you can use the hosted service which is free. I cannot make guarantees that I’ll keep the hosted service around for ten years, though. The running costs for me currently are: $5/mo DigitalOcean box, two domain names and SSL certificates, a bit of space on S3 for daily backups, and my own time for maintaining this.

On implementation side, it’s a pretty straightforward Django app with nothing particularly clever going on, which is a good thing. It does make use of a database trigger (which works with both PostgreSQL and MySQL), and it has neat JS horizontal slider widgets for setting duration parameters.

Now, about future plans. After about two months of spare time hacking, I feel healthchecks.io is well into MVP stage. It is already useful for me as-is. There’s a list of features I’m considering, but I also want to keep the code base simple, with few dependencies, and easy to deploy. I may add bits for reliability, integrations with other services, but probably no big new features like active checks (think Pingdom), status pages (statuspage.io), installable monitoring agents (NewRelic and many others), and so on. Keep it simple!