mirror of
https://github.com/healthchecks/healthchecks.git
synced 2026-04-26 07:25:51 +03:00
[GH-ISSUE #549] log entries: keep last n failure entries #398
Originally created by @lukastribus on GitHub (Aug 6, 2021).
Original GitHub issue: https://github.com/healthchecks/healthchecks/issues/549
Hello,
when the log entries hit the maximum, old messages are removed.
Especially with higher-frequency intervals, keeping a few of those "failure" events (which may contain important debug information in the body) would be useful, as opposed to removing log entries solely based on the timestamp. Positive log entries are often only useful for their timestamp.
It so happens that I could have 100 positive log entries but lack the last 2 - 3 negative log entries with debug information in the body, and I'm really interested in the failures.
I'm not sure how this could be structured clearly without over-complicating the UI, maybe always keep the last 3 negative entries in the log?
@cuu508 commented on GitHub (Aug 6, 2021):
Thanks for the suggestion!
I remember Mandrill (transactional email service, now part of Mailchimp) doing something like that. They were keeping a log of last 100 successful API calls, and a separate log of last 100 failed API calls. If there are lots of API calls, the successful log may only cover a short time period, but the last 100 failures were still available in the other log. I may have the details wrong, but that was the general idea.
It's possible to do something similar in Healthchecks, but it would complicate bookkeeping and would be a nontrivial change. It could also as much as double the database size. For operational simplicity, I want to keep the database size as low as possible.
If you use Healthchecks.io, you can upgrade to a paid plan for 1000 log entry limit. If you run a self-hosted instance, you can set any log entry limit.
@lukastribus commented on GitHub (Aug 9, 2021):
Hello,
the goal would certainly be to keep the total database size the same, for example keeping 95 log entries regardless of whether the type was OK or fail, and another 5 entries that failed.
But yeah, I agree that this could make things more complicated.
@Wouter0100 commented on GitHub (Feb 4, 2022):
I was looking to open a feature request for this as well. We have a cron job that runs every minute, making it very likely that we'll only start looking at the errors after 100 minutes have passed. Splitting up OK/fail entries like that (90 OK, 10 fail) would work.
Wouldn't it be possible to store it in the same table, but have different cleanup rules?
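A minimal sketch of the 95 + 5 split suggested above, written as a single cleanup rule over one list of entries. This is not how Healthchecks prunes its log; the `Ping` class, the `prune` function, and the `reserved_failures` parameter are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ping:
    n: int      # sequence number; higher means newer (hypothetical field)
    kind: str   # "success" or "fail" (hypothetical field)


def prune(pings, limit=100, reserved_failures=5):
    """Return the entries to keep, newest first.

    Keeps the newest (limit - reserved_failures) entries of any kind,
    plus up to reserved_failures of the most recent older failures.
    The total never exceeds limit.
    """
    ordered = sorted(pings, key=lambda p: p.n, reverse=True)
    cutoff = limit - reserved_failures
    recent = ordered[:cutoff]
    older_failures = [p for p in ordered[cutoff:] if p.kind == "fail"]
    return recent + older_failures[:reserved_failures]


# Usage: 200 entries where every 10th one is a failure.
pings = [Ping(n, "fail" if n % 10 == 0 else "success") for n in range(1, 201)]
kept = prune(pings)
```

With this rule the total stays at the same limit, and the oldest slots are the ones given up to hold failures that would otherwise have been dropped.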
@lukastribus commented on GitHub (Feb 16, 2022):
Currently, my cronjob runs every 10 minutes. I added a check so that the OK is sent to Healthchecks only once an hour, not on every cronjob run.
Now I have the problem that when the job fails (sends fail to healthchecks), I don't get an OK on the next successful run, because it only sends an OK every hour.
The cron script would have to keep track of previous failures to be able to handle this correctly. To handle this the right way on the script side, lots of complexity is needed.
@cuu508 commented on GitHub (Feb 16, 2022):
@lukastribus are you using healthchecks.io or self-hosting?
If self-hosting, you can raise the limit of how many log entries are kept (see "Ping log limit" in Django admin → Accounts → Profiles).
On healthchecks.io paid plans the limit is 1000 log entries. If the job runs every 10 minutes, that covers almost a full week (or half that if you also send /start events). If #609 works out, I will look into lifting the 1000 entry limit for paid plans higher.
@lukastribus commented on GitHub (Feb 16, 2022):
I use healthchecks.io for now. I ended up maintaining state locally in case of errors; this adds complexity, but it works.
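For readers wondering what that local state might involve, here is a rough sketch of one way to decide which signal to send on each run. It is not the script used in this thread; `next_signal`, the `state` dictionary, and its keys are hypothetical, and a real cron script would need to persist `state` between runs (for example in a small file).

```python
def next_signal(job_ok, now, state, ok_interval=3600):
    """Decide what to report for this run and update state in place.

    Returns "fail", "success", or None (send nothing).
    - Every failure is reported.
    - A success is reported if the previous run failed (recovery),
      or if at least ok_interval seconds passed since the last reported OK.
    """
    if not job_ok:
        state["last_failed"] = True
        return "fail"

    recovering = state.get("last_failed", False)
    last_ok = state.get("last_ok_sent", float("-inf"))
    due = now - last_ok >= ok_interval

    state["last_failed"] = False
    if recovering or due:
        state["last_ok_sent"] = now
        return "success"
    return None


# Usage: runs every 10 minutes (600 s), with one failure at t=1200.
state = {}
signals = [
    next_signal(True, 0, state),
    next_signal(True, 600, state),
    next_signal(False, 1200, state),
    next_signal(True, 1800, state),
    next_signal(True, 2400, state),
]
```

The extra `last_failed` flag is what lets the first successful run after a failure report an OK immediately instead of waiting for the hourly slot.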