[GH-ISSUE #468] Substitution of \n to spaces in Snippet webhook field makes some preformatted emails look ugly. #305

Closed
opened 2026-03-15 13:47:24 +03:00 by kerem · 5 comments

Originally created by @baiomys on GitHub (Mar 16, 2025).
Original GitHub issue: https://github.com/axllent/mailpit/issues/468

Hi.

Forwarded message previews look especially bad.

It should be like this:

From: sender user@proton.me
To: user@google.com

Sent with Proton Mail secure email. 
------- Forwarded Message ------- 
From: user <user@proton.me> 
Date: On Sunday, March 16th, 2025 at 11:39 AM 
Subject: Hi 
To: suser@google.com <any...

Maybe you could use some UTF-8 character that looks like a space (\u00A0 NBSP, for example) instead of \n, to make backwards conversion possible?

kerem closed this issue 2026-03-15 13:47:29 +03:00

@axllent commented on GitHub (Mar 17, 2025):

Hi.

Simply put, this is not possible. The snippet is intentionally written as a single line of text for two main reasons:

  1. It is a snippet, not a preview, so the data is as concise (small) as possible, but more importantly...
  2. The snippet is generated from the HTML copy if the HTML exists, and if not then the plain text version is used (which is often not the same as the HTML in newsletters). The HTML version is stripped of all HTML code, which is replaced with spaces, as there is literally no control over how the message is actually formatted (e.g. <div>From: sender user@proton.me</div><div>To: user@google.com</div> does not contain new lines at all, or the HTML could contain new lines which are not visible in a browser).

What you are asking for would require that the snippet generator replicate the browser logic in order to understand how the HTML data is visually structured (at least in terms of new lines), which is extremely difficult and falls completely outside of the purpose of a snippet.
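The flattening described above can be sketched roughly as follows. This is an illustration in Python, not Mailpit's actual code: `make_snippet`, `_TextCollector`, and the 250-character `limit` are hypothetical names and values chosen for the example.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes and puts a space wherever a tag appeared."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        self.parts.append(" ")

    def handle_endtag(self, tag):
        self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def make_snippet(html, limit=250):
    """Strip tags, collapse all whitespace to single spaces, truncate."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    # str.split() with no arguments also swallows newlines in the source.
    text = " ".join("".join(collector.parts).split())
    return text[:limit]


print(make_snippet("<div>From: sender user@proton.me</div><div>To: user@google.com</div>"))
```

The two `<div>` elements carry no newline in the source, so after stripping there is nothing left to tell the two visual lines apart.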

To give you a better example, here's the exact HTML from Gmail in a forwarded message (all on one line):

<div dir="ltr"><div class="gmail_default" style="font-family:arial,sans-serif"></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">---------- Forwarded message ---------<br>From: <b class="gmail_sendername" dir="auto">deliberist</b> <span dir="auto">&lt;<a href="mailto:notifications@github.com">notifications@github.com</a>&gt;</span><br>Date: Mon, 17 Mar 2025 at 01:52<br>Subject: Re: [axllent/mailpit] First attempt at building linux/s390x Docker images. (PR #423)<br>To: axllent/mailpit &lt;<a href="mailto:mailpit@noreply.github.com">mailpit@noreply.github.com</a>&gt;<br>Cc: Ralph Slooten &lt;<a .....

Here's the syntax from a forwarded message from Thunderbird:

    <div class="moz-forward-container"><br>
        <br>
        -------- Forwarded Message --------</font>
      <table cellpadding="0" cellspacing="0" border="0"
        class="moz-email-headers-table">
        <tbody>
          <tr>
            <th valign="BASELINE" align="RIGHT" nowrap="nowrap"><font
                color="#000000">Subject: </font></th>
            <td><font color="#000000">Your personal access tokens will
                expire in 60 days or less</font></td>
          </tr>
          <tr>
            <th valign="BASELINE" align="RIGHT" nowrap="nowrap"><font
                color="#000000">Date: </font></th>
            <td><font color="#000000">Sat, 15 Mar 2025 12:00:06 +0000</font></td>
          </tr>
          <tr>
            <th valign="BASELINE" align="RIGHT" nowrap="nowrap"><font
                color="#000000">From: </font></th>
            <td><font color="#000000">Git Server
                <a class="moz-txt-link-rfc2396E" href="mailto:git@example.com">&lt;git@example.com&gt;</a></font></td>
          </tr>
          <tr>
            <th valign="BASELINE" align="RIGHT" nowrap="nowrap"><font
                color="#000000">Reply-To: </font></th>
            <td><font color="#000000">Git Server
                <a class="moz-txt-link-rfc2396E" href="mailto:user@example.com">&lt;user@example.com&gt;</a></font></td>
          </tr>
          <tr>
            <th valign="BASELINE" align="RIGHT" nowrap="nowrap"><font
                color="#000000">To: </font></th>
            <td><font color="#000000"><a class="moz-txt-link-abbreviated" href="mailto:user@example.com">user@example.com</a></font></td>
          </tr>
        </tbody>
      </table>
</div>

I hope you see my point as to why it would not work? Even if you were to separately query the message API to get the full HTML to try to display it yourself (which is the only way), you would have a very difficult task trying to show only a part of that as a multi-line snippet due to the very mixed structure of so many emails.
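To give a feel for the guesswork involved, here is a minimal client-side heuristic in Python. It is only a sketch under stated assumptions: `html_to_lines` and the `BLOCK_TAGS` set are hypothetical, and a real browser decides line breaks from CSS layout, which this ignores entirely.

```python
import re
from html.parser import HTMLParser

# Assumption: closing one of these tags ends a visual line.
BLOCK_TAGS = {"div", "p", "tr", "li"}


class _LineCollector(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Newlines in the HTML source are not visual line breaks.
        self.parts.append(re.sub(r"\s+", " ", data))


def html_to_lines(html):
    """Guess the visible lines of an HTML fragment."""
    collector = _LineCollector()
    collector.feed(html)
    collector.close()
    lines = [" ".join(line.split()) for line in "".join(collector.parts).split("\n")]
    return [line for line in lines if line]


gmail_style = "<div>---------- Forwarded message ---------<br>From: <b>deliberist</b></div>"
thunderbird_style = "<table><tr><th>Subject: </th>\n<td>Your tokens will\n    expire</td></tr></table>"
print(html_to_lines(gmail_style))
print(html_to_lines(thunderbird_style))
```

Even these two clients need different rules (`<br>` versus table rows), and the tag list would keep growing with every new mail client, which is why this logic belongs in the consuming application.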


@baiomys commented on GitHub (Mar 17, 2025):

Thanks for the detailed answer.
To be honest, TXT parts exist in emails more often than HTML, so maybe you should try to generate the snippet from TXT first and then from HTML? Or at least make the priority user-definable?
As a "side effect", using TXT for snippets will be dramatically faster, as you only need one regex for \n -> NBSP,
without stripping HTML tags, which is slow and inefficient.


@axllent commented on GitHub (Mar 17, 2025):

> Thanks for the detailed answer. To be honest, TXT parts exist in emails more often than HTML, so maybe you should try to generate the snippet from TXT first and then from HTML? Or at least make the priority user-definable? As a "side effect", using TXT for snippets will be dramatically faster, as you only need one regex for \n -> NBSP, without stripping HTML tags, which is slow and inefficient.

I don't think anything you said there is particularly accurate.

I'm not actually sure if populated HTML occurs less often than populated text content - it is heavily dependent on the type of emails. From experience, however, the HTML content (if used) is always the "accurate one". In newsletters the plain text is regularly used just to say "you need an HTML viewer to read this message", or "click here to view the newsletter online". In addition to this, many text emails are really badly formatted too, as they are often auto-generated.

Snippets are created and stored only once, during ingestion, which takes approximately 0.0004s per email on GitHub Actions (benchmarking) - they are not generated "on the fly", so performance ("slow and inefficient") is not a valid argument here.

Lastly, I do not think that injecting hidden UTF-8 characters is a good idea. I do understand what you are trying to achieve, but it will result in inconsistent and inaccurate output. I'll give it a bit more thought tonight though.
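One way to see the inconsistency concern is a small round-trip sketch in Python. The helper names `encode_newlines` and `decode_newlines` are hypothetical and only model the proposed substitution; they are not part of Mailpit.

```python
NBSP = "\u00a0"


def encode_newlines(text):
    """Proposed scheme: hide each line break as a non-breaking space."""
    return text.replace("\r\n", "\n").replace("\n", NBSP)


def decode_newlines(snippet):
    """Reverse step on the consumer side: every NBSP becomes a newline."""
    return snippet.replace(NBSP, "\n")


# A body that already contains a genuine NBSP (common where HTML uses &nbsp;).
original = "Price:\u00a010 EUR\nThanks"
restored = decode_newlines(encode_newlines(original))
print(restored == original)
print(repr(restored))
```

Because the decoder cannot tell an injected NBSP from one that was already in the message, the restored text gains a line break that the sender never wrote.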


@baiomys commented on GitHub (Mar 17, 2025):

Thanks, sounds promising. =)


@axllent commented on GitHub (Mar 20, 2025):

After further review I can say this feature is out-of-scope.

  1. It cannot be implemented without creating inconsistency in the expected output
  2. It cannot be implemented without complicating the functionality used to generate the snippet, especially when trying to handle the first point above
  3. It is of no benefit whatsoever to Mailpit for its intended purpose
  4. The information you require can already be obtained via the existing API

To implement a truncated & formatted preview you will need to query the API (/api/v1/message/<id>), which returns both the text & HTML parts; you can then decide for yourself how you want to handle, parse & display that.
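A client-side version of that approach might look like the following Python sketch. Several things here are assumptions rather than confirmed details: that the JSON response exposes the plain-text body under a `Text` key, the helper names `fetch_message` and `text_preview`, and the `max_lines` / `max_width` defaults.

```python
import json
import urllib.request


def fetch_message(base_url, message_id):
    """Fetch one message as a dict from /api/v1/message/<id>."""
    url = f"{base_url.rstrip('/')}/api/v1/message/{message_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def text_preview(message, max_lines=5, max_width=80):
    """Build a short multi-line preview from the plain-text part, if any."""
    # Assumption: "Text" is the key holding the plain-text body.
    text = message.get("Text") or ""
    lines = [line.strip() for line in text.splitlines()]
    lines = [line for line in lines if line][:max_lines]
    return "\n".join(line[:max_width] for line in lines)
```

Usage would be along the lines of `text_preview(fetch_message("http://localhost:8025", message_id))`, where the base URL is wherever your Mailpit instance is reachable and `message_id` is the ID your application already holds. Falling back to the HTML part when the text part is empty is left to the application, since that is exactly the parsing decision discussed above.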

As I mentioned earlier, your application should be designed to handle these types of features. Mailpit serves as an email and SMTP testing tool, offering an API for developers to access information through third-party tools. While your project approach is impressive, it doesn't align with Mailpit's intended use. Many of your feature requests seem to be trying to introduce your application logic into Mailpit, which is neither practical nor appropriate.

I would appreciate it if you could take a moment to carefully review points 3 and 4 regarding feature requests. The same is true for #469. My time is limited, but I will always do my best to address bugs and consider new features that genuinely benefit Mailpit. My goal is to avoid unnecessarily expanding the functionality for purposes that were not intended.
