[GH-ISSUE #348] Notification emails: include more details about the check #266

Closed
opened 2026-02-25 23:41:49 +03:00 by kerem · 6 comments
Originally created by @cuu508 on GitHub (Mar 25, 2020).
Original GitHub issue: https://github.com/healthchecks/healthchecks/issues/348

Consider including:

  • check's tags
  • check's schedule
  • last ping, total pings
  • log of received pings
  • request body of the last `/fail` request (#308)

Consider changing the summary table to a table of totals:

  • 3 checks are down
  • 17 checks are up

This would make the notification emails more search-friendly, and there would be fewer items competing for the recipient's attention.
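
A minimal sketch of the "table of totals" idea, assuming each check is reduced to a hypothetical `(name, status)` pair. The `summarize` helper and the status strings are illustrative only and are not taken from the Healthchecks codebase.

```python
from collections import Counter


def summarize(checks):
    """Return one line per status, e.g. '3 checks are down'."""
    totals = Counter(status for _name, status in checks)
    lines = []
    # Report "down" first, since that is what the recipient needs to see.
    for status in ("down", "up"):
        count = totals.get(status, 0)
        if count:
            noun = "check is" if count == 1 else "checks are"
            lines.append(f"{count} {noun} {status}")
    return lines


# Hypothetical sample data.
checks = [("backup", "down"), ("db-dump", "up"), ("certbot", "up")]
print("\n".join(summarize(checks)))
```

With the sample data this prints "1 check is down" followed by "2 checks are up"; statuses with a zero count are omitted to keep the summary short.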

kerem closed this issue 2026-02-25 23:41:49 +03:00

@cuu508 commented on GitHub (Dec 23, 2020):

I'm now mocking this up, and thinking about moving away from the heavily styled HTML emails.

Current:

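![image](https://user-images.githubusercontent.com/661859/102996377-b6efa100-452b-11eb-880b-1d8a5ac88404.png)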

Mockup:

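![image](https://user-images.githubusercontent.com/661859/102997187-24e89800-452d-11eb-836c-53d5c2f2524c.png)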

I like the minimal styling better because it puts function over form, and avoids various rendering issues in different email clients. It would use the same font face and size as plain text emails. In most cases the information density would be higher.

What do you think? Any and all feedback welcome!


@r4lv commented on GitHub (Feb 9, 2021):

Dear cuu508,
I know I am a bit late for the discussion, but I really preferred the previous alert emails 😋

While this is definitely a matter of taste, let me explain my thoughts. I agree with the need for more information, but I strongly disagree with putting "function over form": ideally, we would have both, which is already the case for the rest of healthchecks! On the healthchecks website, I really appreciate that I can quickly find everything I need, and that it is also pleasant to the eye. When I first saw the new alert (a few minutes ago), I did not associate any of it with the healthchecks I know. It looked broken to me, so I went through the source and found out it was on purpose 😋

Some ideas:

  • Add a configuration option to switch between the old and the new layout. That should be quite easy to do, but I agree that it's a step "backward".
  • Add some minimal styling to the new template, to visually integrate it with the website: font size and colour, centering, coloured buttons, header and footer. That would be my preferred option.
  • I could also imagine an even more minimal template — it depends on the use case, but I usually don't care about the period, the total pings etc. A big red "DOWN" button, which takes me to the website where I can see everything in detail could be enough.
  • To continue with this idea, I could imagine a "verbosity"-level for the alert emails, so the user could choose between terse (the current, new template), long (the old template) and short (the old template without the summary, header, but with a bigger UP/DOWN button).
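
A rough sketch of how such a "verbosity" setting could select a template. The template paths and the `pick_template` helper are hypothetical and only illustrate the idea; nothing here reflects how Healthchecks actually organizes its templates.

```python
# Hypothetical template paths, one per verbosity level from the list above.
TEMPLATES = {
    "terse": "emails/alert-terse.html",  # the current, new template
    "long": "emails/alert-long.html",    # the old template
    "short": "emails/alert-short.html",  # old template minus summary and header
}


def pick_template(verbosity="terse"):
    """Map a verbosity level to a template path, falling back to 'terse'."""
    return TEMPLATES.get(verbosity, TEMPLATES["terse"])


print(pick_template("long"))
print(pick_template("unknown"))
```

Falling back to a single default keeps an unrecognized setting from breaking alert delivery, at the cost of silently ignoring typos in the setting.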

Let me know what you think, and if you would like some help.

And thank you for your amazing work!


@cuu508 commented on GitHub (Mar 8, 2021):

Hello @r4lv, thanks for the feedback – I appreciate it!

I agree that the function and form should be the goal. If/when they are in conflict, there's probably a happy balance somewhere in there.

For some context, here's how the current template came to be: I wanted to add more information in the emails. HTML emails are a PITA – there are relatively few features and techniques that work and look consistent across email clients. Inlining, tables, it's like going back to IE6. To test it in various email clients, I sometimes used Litmus. Making any changes is also a minefield of Gmail potentially deciding that our emails now look too much like some spam pattern and should be marked as spam or suspicious (https://blog.healthchecks.io/2018/10/investigating-gmails-this-message-seems-dangerous/).

I was planning to make extensive changes to the alert template but couldn't afford to spend days if not weeks in the designing - testing - compromising loop. So I went with the practical approach of using only the very basic formatting options.

Two things I like about using email client's default style:

  • more information density
  • it fits in, looks natural in every email client. The email body uses a similar font face as the rest of the UI. I know this is subjective, people sometimes use a non-default font specifically to stand out, to have a unique identity

I do want to do more work on this. At the very least, experiment some more with the template to see if there are any "easy wins" to make it look subjectively nicer, while remaining as simple as it is currently. Having multiple, selectable templates is also an interesting idea. One downside with that is the extra maintenance going forward, keeping multiple templates up to date and tested.


@lukastribus commented on GitHub (Mar 8, 2021):

Another minor detail:

Please avoid relative time in emails ("two hours ago", "3 days and 40 minutes ago"). In troubleshooting this is almost always harder to interpret and in emails especially, since the email could already be 3 hours old by the time someone takes a look at it.

The reader needs to focus on troubleshooting the real issue, not engage in arithmetic exercises just to understand what time the events actually occurred.
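
To illustrate the point, here is a small sketch comparing the two styles. The helper names and the sample times are made up for the example, and the UTC label is just one possible way to make the timestamp unambiguous.

```python
from datetime import datetime, timedelta, timezone


def format_absolute(dt):
    """Format an aware datetime as an unambiguous UTC timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")


def format_relative(dt, now):
    """Format the age of dt relative to now, in whole hours."""
    hours = int((now - dt).total_seconds() // 3600)
    return f"{hours} hours ago"


# Hypothetical timeline: the alert is generated 2 hours after the last ping
# and is read 3 hours after that.
last_ping = datetime(2021, 3, 8, 9, 0, tzinfo=timezone.utc)
sent_at = last_ping + timedelta(hours=2)
read_at = sent_at + timedelta(hours=3)

print(format_relative(last_ping, sent_at))  # text frozen into the email body
print(format_relative(last_ping, read_at))  # the actual age when it is read
print(format_absolute(last_ping))           # the same whenever it is read
```

The relative text in the email says "2 hours ago" while the real age at reading time is 5 hours; the absolute timestamp does not drift.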


@cuu508 commented on GitHub (Mar 8, 2021):

The relative times work OK if you read the email soon after receiving it. Let's say the period is 1 day, and it says the last ping was 1 day, 1 hour ago. So you go and think – "OK, the 1 day period passed, the 1 hour grace time passed, and this is why I'm now getting the notification".
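
The arithmetic behind that example, as a quick sketch with a made-up last-ping time (only the 1 day period and 1 hour grace time come from the example above):

```python
from datetime import datetime, timedelta, timezone

period = timedelta(days=1)
grace = timedelta(hours=1)
# Hypothetical time of the last received ping.
last_ping = datetime(2021, 3, 7, 12, 0, tzinfo=timezone.utc)

# The notification goes out once both the period and the grace time
# have passed without a new ping.
down_at = last_ping + period + grace

print(down_at.isoformat())   # 2021-03-08T13:00:00+00:00
print(down_at - last_ping)   # 1 day, 1:00:00
```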

In the web interface, in the ping log, you can switch between UTC, browser's timezone and server's timezone (if known).

In email messages it is not obvious what timezone to use. Whenever I receive monitoring alerts from systems, I first have to figure out what timezone the sender has probably assumed, and then do the mental math anyway...

There's #365 about letting users specify their preferred timezone – that would help but is not implemented yet.


@lukastribus commented on GitHub (Mar 8, 2021):

I was just thinking about how the timestamp in a down message I received the other day could be off by 4 - 5 hours (hourly cronjob), until I realized that the email itself was delayed for 4 - 5 hours, and Gmail doesn't show the Date header of the email in its interface (you'd have to go into "show original" to find the big disparity between when the email was sent and when it was actually received).

Now in this case the root cause was an outage at my SMTP provider, so the email was delayed because of that. However, there is also greylisting, which could delay the email for some time.

Gmail tries very hard to hide the actual Date header (not sure why, not sure about other MUAs). I'd argue that relative time formatting requires the email to arrive at the destination in quasi-realtime to be able to make sense of it, and that is not universally true for SMTP.

> The relative times work OK if you read the email soon after receiving it. Let's say the period is 1 day, and it says the last ping was 1 day, 1 hour ago. So you go and think – "OK, the 1 day period passed, the 1 hour grace time passed, and this is why I'm now getting the notification".

The assumption being that the end-user is not sure a) what interval/grace period is actually configured or b) whether healthchecks actually works correctly or not.

I think it's more likely that the user is concerned about the actual production service that is not running at this point, and when it actually did run the last time. Relative time formatting makes this harder, in my opinion.

I agree the longer the interval, the less critical this gets. But for jobs with intervals of 60 minutes or less, with just a few minutes of grace period, I think it's more important.

> There's #365 about letting users specify their preferred timezone – that would help but is not implemented yet.

Thanks, I subscribed and commented there.
