PagerDuty is a well-known incident management system. It provides alerting, on-call scheduling, escalation policies and incident tracking. If you use or plan on using PagerDuty, you can can integrate it with your healthchecks.io account in few simple steps!
Step 1: Add a new service to your PagerDuty account
Log into your PagerDuty account, go to Configuration > Services, and click on “Add New Service”:
Give it a descriptive name, and select “Use our API directly”. Click on “Add Service” and take note of its API key:
Step 2: Add a notification channel to your Healthchecks account
Log into your Healthchecks account, and go to Channels. In “Add Notification Channel” section, select “PagerDuty” and paste the API key from previous step:
… and done! From now on, when a check goes down, Healthchecks will open a new incident in your PagerDuty account. When the check goes back up again, Healthchecks will resolve the incident. Simple and easy!
I needed a tool to alert me when my cron jobs silently fail. There is already a number of existing services for this, but it seemed like a fun thing to build myself. So I present to you: healthchecks.io
I am using this myself and it has already been useful for me a couple times. Say, a seemingly benign code change in one service causes my batch job to fail 12 hours later, in the middle of night. Without any monitoring I might be blissfully unaware for days or months, until I need those backups or whatever, but now I get an email alert and can get it sorted in minutes. Sweet!
I licensed this under BSD licence, hoping it might be useful for other people too. It’s such a simple service it feels wrong to charge big bucks for it. You can grab the code from GitHub, run it, extend it, add unicorns or raptors, and so on. Or you can use the hosted service which is free. I cannot make guarantees that I’ll keep the hosted service around for ten years, though. The running costs for me currently are: $5/mo DigitalOcean box, two domain names and SSL certificates, a bit of space on S3 for daily backups, and my own time for maintaining this.
On implementation side, it’s a pretty straightforward Django app with nothing particularly clever going on, which is a good thing. It does make use of a database trigger (which works with both PostgreSQL and MySQL), and it has neat JS horizontal slider widgets for setting duration parameters.
Now, about future plans. After about two months of spare time hacking, I feel healthchecks.io is well into MVP stage. It is already useful for me as-is. There’s a list of features I’m considering, but I also want to keep the code base simple, with few dependencies, and easy to deploy. I may add bits for reliability, integrations with other services, but probably no big new features like active checks (think Pingdom), status pages (statuspage.io), installable monitoring agents (NewRelic and many others), and so on. Keep it simple!