mirror of
https://github.com/healthchecks/healthchecks.git
synced 2026-04-25 06:55:53 +03:00
[GH-ISSUE #626] Implement check auto-provisioning when pinging with a slug that does not exist #456
Originally created by @mike503 on GitHub (Mar 29, 2022).
Original GitHub issue: https://github.com/healthchecks/healthchecks/issues/626
I am able to confirm that pre-existing slugs work. However, non-existent ones do not.
Following your example at https://blog.healthchecks.io/2021/09/monitoring-postgresql-with-pgmetrics-and-pgdash/ with runitor (and also without runitor, using plain curl), I get a 404 when supplying a ping key and a slug. It needs a pre-existing ping key/slug URL. I have a ping key set in the project. I've tried every combination and it doesn't work.
According to your blog post (and runitor) it seems like this should auto-provision the slug "doesnt-exist" for me, which is desired. It doesn't though.
@cuu508 commented on GitHub (Apr 4, 2022):
Sorry for the confusion, I should have made it explicit in the blog post that there is no auto-provisioning. You have to create checks either via the web UI or using the management API before pinging a slug URL.
I think auto-provisioning would be a neat feature to have.
@mike503 commented on GitHub (Apr 4, 2022):
Yes it’s a huge desire. It’s actually one I require in a service, and don’t want to have 80+ instances with a wrapper script to issue management API calls before a quick cron hit (one of them fires every minute) - that’s a lot of extra chatter. Cronitor supports this and is really the only reason I’m having to use them over you at this point.
@cuu508 commented on GitHub (Apr 4, 2022):
Thanks, I'm noting the interest.
@cuu508 commented on GitHub (Aug 8, 2022):
I'm looking into this and would like to discuss a few implementation choices.
Security. If we implement auto-provisioning, the Ping Key gains the power to create new checks. Let's say Alice has shared a slug-based ping URL with Bob. Currently Bob can only ping that one check of Alice's (and maybe her other checks too, if he can guess their slugs). After implementing auto-provisioning, Bob can also create new checks in Alice's account, which Alice may not expect. Is this a valid concern? Should auto-provisioning perhaps need to be explicitly turned on in the "Project Settings" page?
My current preference: keep it simple, auto-provisioning is enabled for all projects that have a Ping Key, no explicit toggle in "Project Settings".
What to do when user tries to use a slug with invalid syntax? A few examples of invalid slugs:
- "foo--bar": the slug should not contain repeated dashes or underscores
- "-foo": the slug should not have leading or trailing dashes or underscores
- "Foo": the slug should not use uppercase letters
Now, let's say the user tries to auto-provision a check with the "foo--bar" slug. What are our choices?
- […] "foo--bar" to "foo-bar".
- […] "foo--bar" to be usable as-is. I would also be annoyed if slugs with uppercase letters cannot be used. From my point of view, the service is being nitpicky for no good reason.
My current preference: reject invalid slugs with HTTP 400.
How to handle account limits when provisioning new checks. Let's say my account is already at the check limit, and I try to auto-provision another check. What should happen?
My current preference: return HTTP 403, and in the web UI show a warning message that can be dismissed.
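The preferences stated so far (HTTP 400 for an invalid slug, HTTP 403 when the account is at its check limit) can be sketched as a small decision function. This is an illustration only, not Healthchecks code: the function name, the regular expression, the decision to allow digits, and the 201 status for a newly created check are all assumptions.

```python
import re

# Assumed encoding of the proposed rules: lowercase letters and digits,
# joined by single dashes or underscores, with none leading or trailing.
STRICT_SLUG = re.compile(r"[a-z0-9]+(?:[-_][a-z0-9]+)*")


def ping_status(slug: str, existing: set[str], limit: int) -> int:
    """Return the HTTP status a ping to `slug` would get under the proposal."""
    if STRICT_SLUG.fullmatch(slug) is None:
        return 400  # invalid slug syntax is rejected
    if slug in existing:
        return 200  # ordinary ping to an existing check
    if len(existing) >= limit:
        return 403  # account is at its check limit, nothing is created
    existing.add(slug)
    return 201  # assumption: a newly provisioned check is reported as 201
```

With a project holding one check and a limit of two, pinging "foo--bar" would give 400, a new valid slug would be created, and the next new slug would give 403.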
Footguns:
My current preference: at some point in the future implement batch actions so garbage checks can be cleaned up quickly. When auto-provisioning fails due to account limits, show a dismiss-able warning in the web UI.
@mike503 commented on GitHub (Aug 8, 2022):
Very thorough thoughts, but I think most of that should be the customer's responsibility. If they have ugly slugs, that's on them. If they auto-provision too many, same deal. Maybe just have some sort of threshold that triggers an email (a courtesy notice that the number of checks is growing fast) or an account-level threshold, and send them a reminder when they're getting close to it, things like that. Reject it if they are at the maximum, and maybe send an email saying "we just got a request for a new monitor, but you're at your limit" with a link to bump it up.
I'd rather have the responsibility to manage that stuff myself than limited by the platform.
@cuu508 commented on GitHub (Aug 10, 2022):
The issue with allowing ugly slugs ("-foo", "foo--bar", "Foo") is unintuitive behaviour when the user edits schedule:
- […] "-foo", a check with that slug gets created
- […] "foo"
- […] "-foo" again, another check gets created. The user now has two checks: "foo" and "-foo".
I'm leaning towards rejecting invalid slugs (return HTTP 400 when pinging them). Document the slug syntax rules, and if the user pings an invalid slug and gets an error response, that's on them. Some other options would be:
@mike503 commented on GitHub (Aug 12, 2022):
that'd be fine, ultimately. simply getting the ability to have slugs created on demand is the biggest thing.
@cuu508 commented on GitHub (Oct 14, 2022):
After doing some more thinking and prototyping, I've decided to pass on this feature, at least for now. It is tempting to have, but also a ton of work and a ton of extra complexity to implement properly. And I'm not willing to bear with a half-assed implementation :-)
A workaround is to use a wrapper script, in a nutshell:
A quick PoC in python:
and as a shell script:
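The snippets themselves are not preserved in this copy of the thread. As a rough sketch of the workaround (ping the slug URL; on a 404, create the check through the management API, then ping again), something like the following Python could work. It is not the author's PoC. The function names, the hosted-service URLs, the v1 API path, and the expectation that a check created with its name set to the slug ends up with that slug are all assumptions to verify against your own instance.

```python
import json
import urllib.error
import urllib.request

# Assumptions: hosted-service endpoints; self-hosted instances use their own.
PING_ENDPOINT = "https://hc-ping.com"
API_ENDPOINT = "https://healthchecks.io/api/v1/checks/"


def http_request(url, data=None, headers=None):
    """Send a GET (or a POST when `data` is given) and return the status code."""
    req = urllib.request.Request(url, data=data, headers=headers or {})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code  # non-2xx responses arrive as exceptions


def ping_with_provisioning(ping_key, api_key, slug, request=http_request):
    """Ping a slug URL; if the check is missing, create it and ping again."""
    ping_url = f"{PING_ENDPOINT}/{ping_key}/{slug}"
    status = request(ping_url)
    if status != 404:
        return status  # the check exists, or the failure is something else

    # Assumption: the new check's slug is derived from its name.
    body = json.dumps({"name": slug}).encode()
    headers = {"X-Api-Key": api_key, "Content-Type": "application/json"}
    created = request(API_ENDPOINT, data=body, headers=headers)
    if created not in (200, 201):
        return created  # creation failed, report that status instead
    return request(ping_url)
```

The `request` parameter exists only so the logic can be exercised without network access. In a cron job the default is enough, for example `ping_with_provisioning(PING_KEY, API_KEY, "nightly-backup")`, where the API key must be a read-write key for the project.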
@frutik commented on GitHub (Nov 16, 2022):
Thank you for your great work!
Regarding the access management issues preventing auto-provisioning implementation: a big part of the usage is probably a simple single-tenant setup, at least for self-hosted installations. In this setup it would be nice to have not just auto-provisioning on ping, but also ping requests that can modify the check settings, for example change the timeout or schedule. That way there is a single place to define and control the check's basic settings.
I am using Django management commands to implement cronjob actions, and this way (with auto-provisioning with ping and settings management), such a command becomes a single point of control for the monitoring.
Something like (huge simplification)
Of course, with your proposed approach, I can handle provisioning by reacting to the 404. But I cannot easily change my checks from @hourly to @daily in my code.
I understand this description only covers my specific scenario, but maybe it can bring some new ideas and give the idea of implementing that feature a chance.
@cuu508 commented on GitHub (Jun 22, 2023):
I've now deployed an initial version of check auto-provisioning functionality.
What works:
What does not work yet:
Slug validation rules: the slug must contain only lowercase letters, digits, hyphens and underscores. But you can use them in any combination, for example "foo--bar" and "--foo" are valid slugs.
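As a minimal sketch, the rule described above can be written as a single regular expression. The function name is hypothetical, and requiring at least one character is an assumption; this is not the server's actual validation code.

```python
import re

# Lowercase letters, digits, hyphens and underscores, in any combination.
# Assumption: an empty slug is not valid.
SLUG = re.compile(r"[a-z0-9_-]+")


def is_valid_slug(slug: str) -> bool:
    """Return True if `slug` uses only the allowed characters."""
    return SLUG.fullmatch(slug) is not None
```

Under this rule `is_valid_slug("foo--bar")` and `is_valid_slug("--foo")` both hold, while a slug with uppercase letters such as "Foo" does not pass.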
Limits:
@bdd commented on GitHub (Jun 26, 2023):
Following Pēteris's contribution, I made a prerelease of runitor that supports the 201 response code, so it doesn't think the ping failed. If you use runitor, I'd appreciate testing reports for v1.3.0-beta.1. If no issues are found I intend to cut v1.3.0 on Wednesday (PDT morning, UTC early evening).
@cuu508 commented on GitHub (Jul 6, 2023):
Check auto-provisioning is now announced on the blog: https://blog.healthchecks.io/2023/07/new-feature-check-auto-provisioning/