mirror of
https://github.com/healthchecks/healthchecks.git
synced 2026-04-25 15:05:49 +03:00
[GH-ISSUE #232] Cron monitoring via healthchecks on 5000+ servers #167
Originally created by @gganeshan on GitHub (Mar 20, 2019).
Original GitHub issue: https://github.com/healthchecks/healthchecks/issues/232
@cuu508 thanks a lot for this amazing tool.
We have a use case where we want to run GOSS checks on 5000+ servers on a nightly basis via cron. The server list is not static and can change frequently when servers are decommissioned or provisioned.
I had a few questions for you to evaluate if healthchecks is the right option for us:
Do any of your existing customers have a similar use case??
It would be interesting to know how they solve these issues.
@abitlegacy commented on GitHub (Mar 21, 2019):
I'm not the developer - but I do host my own instance.
I have a similar use case - but my company is largely Windows based. I have two primary projects that most checks belong to - one is for data pulls and pushes. The other is for Windows DSC - whenever a server updates its configuration it sends a POST with information regarding status and any changes it made.
@gganeshan commented on GitHub (Mar 21, 2019):
@djreynolds922 thanks a lot for your response.
I have a few follow-up questions 😄.
I see documentation on using MySQL or PostgreSQL as the backend but how do you integrate healthchecks with a webserver of your choice?? I would love to use caddy for my self-hosted instance.
The self-registration happens as a separate cron job right??
If it is part of the same step as the healthcheck itself then if the cron never runs then your server registration also never happens so the purpose is defeated.
By CheckID do you mean the unique ping address? Do you use puppet (or something similar) to set these environment variables??
if I tag the checks with servername then I can retrieve server specific unique ping address using the API.
Cool. Multiple emails / DLs can only be added by sending an invite right?? I wish we could add trusted email addresses somehow via the backend esp on self-hosted instances.
I didn't mean SMTP config, I meant configuration of trusted email/DL recipients for a specific project (similar to my point above). I don't like the current workflow of adding recipients only via invitation.
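A minimal sketch of the tag lookup mentioned above, assuming the Management API's "list checks" endpoint with its `tag` query parameter and an `X-Api-Key` header. The function names are hypothetical, `API_URL` points at the hosted service and would need changing for a self-hosted instance, and the `urlopen` argument is only there so the network call can be swapped out.

```python
import json
import urllib.parse
import urllib.request

# Assumption: hosted service URL; replace with your own instance if self-hosting.
API_URL = "https://healthchecks.io/api/v1/checks/"


def checks_url_for_tag(tag, api_url=API_URL):
    """Build the "list checks" URL filtered by a single tag (e.g. the server name)."""
    return api_url + "?" + urllib.parse.urlencode({"tag": tag})


def ping_urls_by_name(api_response):
    """Map check name -> ping URL from a decoded "list checks" response.

    Checks without a "ping_url" field are skipped (as far as I know,
    read-only API keys do not return it).
    """
    return {
        check["name"]: check["ping_url"]
        for check in api_response["checks"]
        if "ping_url" in check
    }


def fetch_ping_urls(api_key, tag, urlopen=urllib.request.urlopen):
    """Fetch all checks carrying `tag` and return their ping URLs by name."""
    request = urllib.request.Request(
        checks_url_for_tag(tag), headers={"X-Api-Key": api_key}
    )
    with urlopen(request) as response:
        return ping_urls_by_name(json.load(response))
```

With one tag per server, `fetch_ping_urls(api_key, "web-01")` would return at most one entry, which a provisioning script could then store locally.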
@gganeshan commented on GitHub (Mar 21, 2019):
Maybe I'll just run healthchecks with uwsgi and have Caddy proxy to the uwsgi endpoint.
Reference: https://github.com/mholt/caddy/issues/176
@abitlegacy commented on GitHub (Mar 22, 2019):
Healthchecks is just a Django application - for Caddy it seems like the community favorite deployment is Caddy as a Reverse Proxy => Gunicorn. Here's an example on the Caddy github repo: https://github.com/caddyserver/examples/tree/master/django - it took a bit for me to get everything working with IIS - but getting a *nix server in my environment would've been more difficult.
Self-Registration happens on initial provisioning - not within the scheduled jobs I'm monitoring. If I had a build server - I probably would have the build server register the expected servers.
CheckID is the unique ping address. Sorry - internally I call it a CheckID in my provisioning scripts.
That's true - but I would recommend looking at the UI - the checks page lists all the tags as filterable buttons. If you have 5k+ in there - I can't imagine it'll look good. You could always modify the templates yourself to work around it though.
For adding E-mail addresses - the only way I know how to do it is through the invite - although you could probably add them directly into the database as well. I just set up two distribution groups in our Exchange server and my E-Mails go to those (one for dev, one for prod) - people can add and remove themselves from the distribution group. I also modified the E-Mail template to ensure no one accidentally clicks "No longer receive these".
As far as adding users to a project - it's definitely a pain. I'm pretty sure you can use any django backend you'd like though - so if you want to do ldap or similar I'm sure you could. It'd just require modifying some files yourself.
@cuu508 commented on GitHub (Mar 26, 2019):
On the hosted service, healthchecks.io, I'm currently using nginx as the reverse proxy (+ TLS termination, serving static files, rate limiting, etc.) and uwsgi for running the Django application.
I have used Caddy instead of nginx in the past as well and it worked well.
gunicorn would work as well, but I personally like uwsgi for its "Swiss army knife of web serving" aspect. It has chaotic documentation but tons of configuration options and features.
For provisioning, I recommend the same approach as @djreynolds922 is suggesting: create the check during provisioning, and probably ping it as well to kick off the timer. On each server, cache the ping URL in an environment variable (or a configuration file, or wherever is convenient). On the first run, when the ping URL is not yet set, retrieve it using the "Create a Check" API call and specifying the unique parameter. When using the unique parameter, that operation effectively becomes "get or create".
Monitoring 5000 servers (checks) should be no issue. One thing to watch out for is if all the cron jobs run at exactly the same time (and have nice, synchronized clocks). When the monitoring host gets 5000 requests at the exact same millisecond, depending on your kernel and web server configuration you could see delays because of dropped and retried packets. If the pings are spaced out even a little bit then this would not be a worry.
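The two points above (get-or-create via the unique parameter, and spacing out pings) can be sketched roughly as follows. This is an illustration under assumptions, not the project's own tooling: the function names, the tag string, the timeout and grace values, and the 10-minute spreading window are all made up for the example, and `API_URL` should point at your own instance when self-hosting. The per-host delay is derived from a hash of the hostname, which is one simple way to spread pings out without coordination.

```python
import hashlib
import json
import urllib.request

# Assumption: hosted service URL; replace with your own instance if self-hosting.
API_URL = "https://healthchecks.io/api/v1/checks/"


def build_check_payload(hostname, timeout=86400, grace=3600):
    """Request body for "Create a Check".

    "unique": ["name"] asks the API to return the existing check with this
    name instead of creating a duplicate, i.e. "get or create".
    The tags, timeout and grace values are illustrative only.
    """
    return {
        "name": hostname,
        "tags": "goss nightly",
        "timeout": timeout,
        "grace": grace,
        "unique": ["name"],
    }


def get_or_create_ping_url(api_key, hostname, urlopen=urllib.request.urlopen):
    """POST the payload and return the check's ping URL from the response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_check_payload(hostname)).encode("utf-8"),
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        return json.load(response)["ping_url"]


def ping_delay_seconds(hostname, window=600):
    """Stable per-host delay in [0, window) so hosts do not all ping at once."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % window
```

A cron wrapper could call `get_or_create_ping_url` only when no cached ping URL exists, sleep for `ping_delay_seconds(hostname)` after the nightly job finishes, and then request the ping URL.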
On adding email addresses without the verification step: I guess in a self-hosted setting the verification has not much use and only gets in the way. I'll look into adding a USE_EMAIL_VERIFICATION=True/False configuration setting.
@gganeshan commented on GitHub (Apr 5, 2019):
thank you @cuu508 and @djreynolds922 for your responses.
Really appreciate it.