mirror of
https://github.com/healthchecks/healthchecks.git
synced 2026-04-25 15:05:49 +03:00
[GH-ISSUE #510] Keep alerting when a check is down #373
Originally created by @kerenskybr on GitHub (May 7, 2021).
Original GitHub issue: https://github.com/healthchecks/healthchecks/issues/510
Hi. How would it be possible to implement a way to keep sending alerts for down checks? I tried changing some things in sendalerts.py but was not able to add this feature. Any tips?
@cuu508 commented on GitHub (May 8, 2021):
Hi @kerenskybr, do you know about the hourly / daily reminders in Account Settings > Email Reports?
If you select "Remind me hourly", you will get an email message every hour as long as any check is down. A couple notes:
@kerenskybr commented on GitHub (May 10, 2021):
Yes, for Slack/webhook alerts. I think it's a cool feature to have. I tried to implement it myself (I've been using it to control backups, servers and so on), with no luck.
My idea is to keep receiving the pings when a check is down (e.g. every 5 minutes).
@cuu508 commented on GitHub (Aug 10, 2022):
Here's a UI mockup of the Account Settings – Email Reports screen:
It adds checkboxes that let the user select which configured notification channels to use for ongoing reminders. It also adds an "Every 10 minutes" option.
A couple notes:
@rwjack, @kerenskybr what do you think, would this make sense to you?
@rwjack commented on GitHub (Aug 10, 2022):
Awesome, that looks like the thing we're looking for!
PS. I don't want to sound like a dick, but I'm just curious. Is there a reason we couldn't set the reminder timer to X minutes, instead of it being hardcoded 10/60/1440?
I hope you see that most of the people, dare I say everyone, who use healthchecks.io aren't end users and don't require simplicity. I personally wouldn't be taking the Apple route of simplicity here.
I think HC.io requires customizability and interoperability, and should get raw dog dirty with the possible configuration options. I also understand that isn't as easy to implement. Though from the perspective of my basic dev knowledge, I'd say if you've built ALL of this, passing a text box variable to a timer function presents a piece of cake for you. It's just a matter of what direction YOU want to lead the project.
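To make the "X minutes" idea concrete, here is a minimal Python sketch of the check a reminder loop might run. It is not taken from the Healthchecks codebase: the function name `reminder_due` and its parameters are hypothetical, and it assumes the caller already tracks when the check went down and when the last notification was sent.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def reminder_due(
    down_since: datetime,
    last_notified: Optional[datetime],
    interval_minutes: int,
    now: datetime,
) -> bool:
    """Return True if a repeat notification should be sent now.

    Hypothetical helper: interval_minutes is the user-configurable value
    (the "text box variable"), not a fixed 10/60/1440 choice.
    """
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    # Measure from the most recent reminder, or from the initial
    # "down" alert if no reminder has been sent yet.
    reference = last_notified or down_since
    return now - reference >= timedelta(minutes=interval_minutes)


if __name__ == "__main__":
    down = datetime(2022, 8, 10, 12, 0, tzinfo=timezone.utc)
    print(reminder_due(down, None, 5, down + timedelta(minutes=5)))
```

A free-form interval like this is easy to compute; the trade-off raised in the thread is about UI and support cost rather than the timer logic itself.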
@rwjack commented on GitHub (Aug 10, 2022):
Small remark: I use Matrix, though in my original feature request I mentioned Slack as just an example. Do you have any plans on implementing recurring notifications for Matrix?
@cuu508 commented on GitHub (Aug 10, 2022):
Thanks for the feedback @rwjack.
Oops, I screwed up the mockup: the "Email Reports" screen is in the account settings, not project settings. So it would be confusing to list notification methods from individual projects here. Back to the drawing board :-)
@cuu508 commented on GitHub (Aug 10, 2022):
If we can pick 2-4 choices that would satisfy 99% of users, then the fixed choices have the following advantages:
I don't have concrete plans about anything here yet, just brainstorming, but in theory, yes. Different integration types would need different notification templates – we cannot send a blob of formatted HTML and CSS in a Matrix message like we can in email, for example.
@rwjack commented on GitHub (Aug 10, 2022):
I don't think there needs to be a different template other than the current one. The text doesn't need to change on every repeated notification, it can be the same as the original notification. Maybe the easiest implementation would be:
Or if you want to make it pretty and complex, global (preferably individual) checks could even have a limit of e.g. 3 notifications, and after the first notification, the second one could have appended text along the lines of "(2/3)", where 2 is the 2nd try and 3 is the total number of notification retries. This can further be expanded into:
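Going back to the "(2/3)" suffix described in this comment, a minimal Python sketch of that one piece could look like the following. The function name `with_retry_suffix` and its arguments are hypothetical and not part of Healthchecks; it only assumes that the first notification is sent unchanged and later ones get the counter appended.

```python
def with_retry_suffix(message: str, attempt: int, max_attempts: int) -> str:
    """Append "(attempt/max_attempts)" to repeated notifications.

    Hypothetical helper: attempt is 1 for the original notification,
    which is returned unchanged.
    """
    if not 1 <= attempt <= max_attempts:
        raise ValueError("attempt must be between 1 and max_attempts")
    if attempt == 1:
        return message
    return f"{message} ({attempt}/{max_attempts})"


if __name__ == "__main__":
    for n in range(1, 4):
        print(with_retry_suffix("backup is DOWN", n, 3))
```

This reuses the original notification text, as suggested above, so no separate template is needed for repeats.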
@cuu508 commented on GitHub (Jan 12, 2023):
Repeat notifications and escalation policies (if 3/3 do some special action) are best handled with dedicated incident management systems (OpsGenie, PagerDuty, Splunk On-Call, others). I'd like to keep Healthchecks focused on one task: notify when a client does not check in at the expected time. The Healthchecks notification can feed in a different system which then handles repeats, acknowledgements, snoozes, escalations, on-call schedules, issue resolutions etc.