mirror of
https://github.com/healthchecks/healthchecks.git
synced 2026-04-25 06:55:53 +03:00
[GH-ISSUE #1118] OnCalendar grace time does not apply to the first expected check #776
Originally created by @agross on GitHub (Jan 24, 2025).
Original GitHub issue: https://github.com/healthchecks/healthchecks/issues/1118
Hello,
I've configured my first `OnCalendar` check yesterday.
The expression is `Mon..Fri *-*-* 05..18:0/10:00`. The server sending notifications has its timezone set to `UTC` and the grace time is 1 hour.
My expectation is that the first notification should arrive at 05:00 but can be up to 1 hour late before the check goes down. Unfortunately, this does not seem to be the case. Today, I was notified because the first `OK` arrived at 2025-01-24T05:00:08.117457+00:00, which is not too late given the 1-hour grace time.
The times shown in the table above are UTC.
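The expectation described above can be written out as a small calculation. This is a minimal sketch, assuming a hand-written matcher for this one expression only (hours 5 to 18, every 10 minutes, Monday to Friday, UTC); the names `matches_schedule` and `first_expected_on` are hypothetical and this is not how Healthchecks itself parses or evaluates `OnCalendar` schedules.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

GRACE = timedelta(hours=1)  # grace time from the report


def matches_schedule(dt: datetime) -> bool:
    """Hand-written stand-in for `Mon..Fri *-*-* 05..18:0/10:00` (assumed UTC)."""
    return (
        dt.weekday() <= 4          # Monday (0) through Friday (4)
        and 5 <= dt.hour <= 18     # hours 05..18
        and dt.minute % 10 == 0    # minutes 0/10: 0, 10, ..., 50
        and dt.second == 0
        and dt.microsecond == 0
    )


def first_expected_on(day: datetime) -> Optional[datetime]:
    """Scan one day minute by minute and return the first scheduled time, if any."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    for minute in range(24 * 60):
        candidate = start + timedelta(minutes=minute)
        if matches_schedule(candidate):
            return candidate
    return None


# The ping timestamp quoted in the report.
ping = datetime(2025, 1, 24, 5, 0, 8, 117457, tzinfo=timezone.utc)
expected = first_expected_on(ping)
deadline = expected + GRACE
within_grace = expected <= ping <= deadline
print(expected.isoformat(), deadline.isoformat(), within_grace)
```

Under these assumptions the first expected ping on 2025-01-24 is 05:00 UTC and the check should not go down before 06:00 UTC, so a ping at 05:00:08 falls inside the grace window.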
@cuu508 commented on GitHub (Jan 24, 2025):
If this was on healthchecks.io, can you please send me the check's UUID to contact at healthchecks.io?
The ping logs table has timezone selector buttons at the top right corner. Is the display timezone set to UTC (and not the browser's time zone)?
@agross commented on GitHub (Jan 24, 2025):
I host healthchecks locally, running the latest docker image.
The times shown in the table above are UTC.
Screenshot of the notification causing the `OK`:
@cuu508 commented on GitHub (Jan 24, 2025):
OK, in that case, that does look wrong, and either I'm missing something or there's a bug somewhere.
I haven't been able to reproduce this. Can you try and produce reliable reproduction steps for me? Does this also happen if you set the schedule so the next expected ping is only a little bit in the future? For example, if the time is 10:10 UTC currently, can you trigger the issue if you set the schedule to `10:15:00`, send one ping and then wait 5 minutes?
@agross commented on GitHub (Jan 24, 2025):
In the meantime, I've recreated the ping on healthchecks.io; the ID is `5b13c8f6-527c-4dba-b55e-a44476e37b29`. With the slight difference in the Mon-Sat OnCalendar expression, we'll see it fail tomorrow morning.
@agross commented on GitHub (Jan 27, 2025):
Unfortunately, it didn’t reproduce on healthchecks.io. I've now reconfigured to use our self-hosted instance and I'll post updates here if it starts to fail again.
@agross commented on GitHub (Jan 28, 2025):
Well, it happened again today on the self-hosted instance. In the meantime I changed the grace time to 45 minutes. It seems that the OnCalendar expression is considered `late` starting 4:00 UTC (invisible in the log), turns `down` at 4:45 UTC and the server running the action at 5:00 UTC makes the check `up` again.
The expression is (and was): `Mon..Fri *-*-* 05..18:0/10:00` with Server's Time Zone == UTC.
Am I right to assume the "Server" above is the machine executing pings? Or is it the server running the healthchecks instance, i.e. evaluating the `OnCalendar` expression?
@cuu508 commented on GitHub (Jan 28, 2025):
Yes, it's the timezone of the machine sending pings.
Would you be able to produce reliable reproduction steps for me? Or give me access to a machine (say, a throwaway VM in one of the public clouds) where you've reproduced the issue?
@agross commented on GitHub (Jan 28, 2025):
It rather looks like the machine running healthchecks given the 1-hour offset. Why would the healthcheck turn `down` at 4:45 UTC when the OnCalendar expression specifies `5-18` (and UTC as the time zone)?
The server sending pings is configured for UTC (`timedatectl`) and the server receiving pings runs in the Europe/Berlin time zone. Regardless, the timestamps in the receiving server's log make sense, e.g. 5:45 in Europe/Berlin is 4:45 UTC as shown below:
Would you like SSH access or access to the healthchecks instance?
@agross commented on GitHub (Jan 28, 2025):
Interesting, the preview timestamps are equal if I select the server's time zone.
If I select another time zone like Riga, the times start to differ.
@cuu508 commented on GitHub (Jan 28, 2025):
You previously mentioned you are running Healthchecks using docker.
You're on to something here, this is definitely not right.
@cuu508 commented on GitHub (Jan 28, 2025):
I can reproduce the incorrect "Expected Ping Dates" preview if I set `TIME_ZONE = "Europe/Berlin"` in `local_settings.py`:
Have you by any chance changed Django's `TIME_ZONE` setting?
@agross commented on GitHub (Jan 28, 2025):
I've mounted `/etc/localtime:/etc/localtime`, so both the host and the container share the same time zone.
@agross commented on GitHub (Jan 28, 2025):
After removing the mount the preview feature works as expected. Is healthchecks required to run in UTC?
@cuu508 commented on GitHub (Jan 28, 2025):
The timezone of the OS (or the container) should not matter. For example, on my local development machine, the timezone is `Europe/Riga`, and on the server serving healthchecks.io it is set to `Europe/Berlin` (because that happens to be Hetzner's default).
What matters is that Django's TIME_ZONE setting is "UTC". The code that handles datetime arithmetic is written with the assumption that internally we will be using UTC timestamps.
What puzzles me is how in your case setting the container's timezone influenced the time zone that the Django code uses.
Could you do the following experiment:
@cuu508 commented on GitHub (Jan 28, 2025):
I mounted `/etc/localtime` like you did and now see the same problem. I'll play with it.
@agross commented on GitHub (Jan 28, 2025):
I tried both /etc/localtime mount situations and in both cases Django's time zone is UTC.
@cuu508 commented on GitHub (Jan 28, 2025):
After some time diving through Django sources, I narrowed down the problem to the following snippet:
Inside container with a mounted `/etc/localtime` this produces:
On the host system the same snippet produces:
So, inside the container, `zoneinfo.ZoneInfo("UTC")` is somehow messed up. Reading zoneinfo docs, the zoneinfo module uses system's time zone database. It is usually at `/usr/share/zoneinfo`.
If we look at `ls -la /etc/localtime` we see it is a symlink to `/usr/share/zoneinfo/Etc/UTC`.
I guess what happens is, when you mount your host system's timezone file over `/etc/localtime` inside the container, it also affects `/usr/share/zoneinfo/Etc/UTC` inside the container.
Which leads to messed up `ZoneInfo("UTC")` instances. Which then leads to incorrect date formatting in Django templates.
I think the straightforward fix would be not to mount `/etc/localtime` inside the container, and let it default to UTC.
@agross commented on GitHub (Jan 28, 2025):
Yes, I think so too and this is also what I changed in my setup. Hopefully, the check will stay up, we'll know tomorrow ;-)
Perhaps a bit of documentation in the compose file to make that explicit can help as well.
Thank you for your support regarding this issue and for creating healthchecks in the first place!
@cuu508 commented on GitHub (Jan 30, 2025):
I added a system check which inspects `settings.TIME_ZONE` and prints an error message if it does not have the expected "UTC" value. This should guard against users setting `TIME_ZONE` in `local_settings.py`.
I'm not sure about adding a note about the `/etc/localtime` pitfall. This is not Healthchecks-specific. If you mount non-UTC timezone file over `/etc/localtime` you will have system-wide issues. For example, `TZ=UTC date` command will print a non-UTC time. Users can break their systems in all kinds of creative ways, it seems futile and out of scope for Healthchecks to warn against them.
FWIW I added a comment in a StackOverflow answer here: https://stackoverflow.com/a/67054850/5821
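For anyone who wants to probe their own container for the symptom discussed in this thread (a "UTC" zone that no longer has a zero offset), something like the following may help. It is an illustrative sketch only, not the snippet referenced earlier in the thread and not part of Healthchecks; the helper name `has_zero_utc_offset` is hypothetical.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def has_zero_utc_offset(tz: tzinfo) -> bool:
    """Return True if the given tzinfo reports a zero UTC offset for a sample datetime."""
    probe = datetime(2025, 1, 28, 4, 45, tzinfo=tz)
    return probe.utcoffset() == timedelta(0)


if __name__ == "__main__":
    from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

    try:
        # On a healthy system the zone loaded from the tz database should
        # behave like datetime.timezone.utc; a False here would match the
        # symptom described in this thread.
        print("ZoneInfo('UTC') has zero offset:", has_zero_utc_offset(ZoneInfo("UTC")))
    except ZoneInfoNotFoundError:
        print("No time zone database found for 'UTC'")
```

The helper accepts any `tzinfo`, so it can be compared against `datetime.timezone.utc`, which does not depend on the system's time zone files.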