[GH-ISSUE #25] [Bug] International characters garbled in some messages #23

Closed
opened 2026-03-03 01:19:06 +03:00 by kerem · 20 comments
Owner

Originally created by @Kabouik on GitHub (Jan 30, 2020).
Original GitHub issue: https://github.com/d99kris/nmail/issues/25

Originally assigned to: @d99kris on GitHub.

Example:

Screenshot from 2020-01-30 10:24:12

In some other messages on the same account (so I guess it is dependent on the sender and the client they use, not on the main.conf file), characters with accents are correctly displayed:

Screenshot from 2020-01-30 10:25:35

kerem closed this issue 2026-03-03 01:19:06 +03:00
Author
Owner

@d99kris commented on GitHub (Jan 30, 2020):

Hi, could you please forward one such email to me (d99kris at gmail dot com), assuming you have one without personal/private information? Preferably use a different email client (where the text is shown correctly) to forward the email. In either case please let me know if you forwarded it using nmail or another client. Thanks!


@Kabouik commented on GitHub (Jan 30, 2020):

I just forwarded to you the two emails I took as examples above. They were not forwarded using nmail. Hope you can find the cause.


@d99kris commented on GitHub (Jan 31, 2020):

Thanks! Hmm. For the email where you indicated there would be an issue, I see (what I think is) correct output, both in html and plain mode (using T to toggle). Attaching screenshots.
french_html
french_plain

I guess it's possible the email client used to forward the email re-encoded it using a different encoding that nmail could handle.

Hm, I might need to add some capability in nmail to export the raw email message so you can save the message to a file and share, in order for me to troubleshoot.


@Kabouik commented on GitHub (Feb 1, 2020):

Yes, it looked OK for me when viewed in another client, but nmail had issues with it. Since I forwarded the messages to you using another client, maybe the issue did not show up if that client somehow re-encoded it.

I sent you another example from nmail this time. Can you confirm you received it? The draft was deleted after sending but I got a "Move message failed" error, and no message in sent mails.


@d99kris commented on GitHub (Feb 1, 2020):

Thanks, I can confirm I received it. However I'm afraid I will need the original email/encoding to efficiently determine what is going wrong. I'll look at adding the save/export message functionality.


@Kabouik commented on GitHub (Feb 1, 2020):

Good to know still. The email also appeared in the sent folder, eventually. I'll wait for the export functionality. It would be an interesting feature anyway, not just for debugging, but also to have header details and information about smtp servers and so on.


@d99kris commented on GitHub (Feb 4, 2020):

I've implemented support for exporting raw email message in #30 - just press E when viewing an email.

Please export the email message that displayed incorrectly for you and send the file to me (via email, or attach it here); it will enable me to reproduce and debug. Thanks!


@d99kris commented on GitHub (Mar 29, 2020):

Hi @Kabouik - I know you're busy and also facing more serious bugs using nmail, but I just wanted to send a reminder to help export an email and share with me, whenever you find time. Thanks :)


@Kabouik commented on GitHub (Mar 30, 2020):

I'm sorry for opening issues and then being a bit light on troubleshooting. I postponed this for too long.

I'm sending you a couple exported emails (those in the screenshots above). Also, more recently, I realized some emails show international characters correctly as plain text, but garble them again when viewed as html in browser or in nmail using lynx (sending that too).


@d99kris commented on GitHub (May 13, 2020):

Hi @Kabouik - Thanks a lot for sharing the email examples.

I have analyzed a couple of them, more specifically 5_22121.eml sent to me on 2020-04-01 and 20944.eml sent to me on 2020-03-30.

These two emails both only have html-parts (no plain text part).

The html part for 5_22121.eml indicates charset=Windows-1252 in its html-header meta-tag. However interpreting it as Windows-1252 gives several "weird" characters (both using lynx and also when opening the file in a proper web-browser Google Chrome). If I however change the header to say utf-8 the file can be correctly converted to text by lynx (and viewed in Google Chrome).

The html part for 20944.eml indicates charset=iso-8859-1 in its html-header meta-tag. However interpreting it as iso-8859-1 gives several "weird" characters (both in lynx and Google Chrome). However if I manually modify the html header to say utf-8 it can be correctly viewed in Chrome, and correctly converted using lynx.

Based on the above investigation I believe these emails are wrongly encoded (alt. has wrong header), and thus I don't think there is anything that can be fixed.

But if you know of any email client that is able to display these emails the way you expect, please let me know. My feeling is that the best a client can do is to abide by the encoding info provided in the file header.
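
A minimal sketch of the mismatch described above, assuming Python: UTF-8 bytes read as Windows-1252 produce the "weird" characters, while reading them as UTF-8 gives the expected text. The helper `decode_with_fallback` is hypothetical and only illustrates one tolerant strategy a client could use; it is not taken from nmail, and the thread does not say how nmail eventually handled this.

```python
# UTF-8 text as a sender might emit it while labelling the part Windows-1252.
raw = "Baisse de biodiversité".encode("utf-8")


def decode_with_fallback(data: bytes, declared: str) -> str:
    """Hypothetical helper: try strict UTF-8 first, then the declared charset."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so trust the label from the header instead.
        return data.decode(declared, errors="replace")


# Trusting the (wrong) label turns each multi-byte character into two symbols.
as_declared = raw.decode("cp1252")
# Trying UTF-8 first recovers the intended text.
recovered = decode_with_fallback(raw, "cp1252")
# Genuine Windows-1252 bytes are not valid UTF-8 here, so the label is used.
latin = decode_with_fallback("déjà".encode("cp1252"), "cp1252")

print(as_declared)
print(recovered)
print(latin)
```

The UTF-8-first order works because text in a single-byte charset that contains accented letters is rarely valid UTF-8, so the strict attempt fails and the declared charset is used. It is a heuristic, not a guarantee.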


@Kabouik commented on GitHub (May 13, 2020):

Hi @d99kris, thanks for looking into this so thoroughly.

I am pretty sure those emails show correct characters when viewed directly in the Outlook web app (but meh, right). I just checked in Mailspring, and they also display correctly. In both cases, however, the default view is HTML. In Mailspring, there is a hidden button that allows showing the raw email (similar to the .eml from nmail). I checked the raw message in Mailspring for the email that corresponds to 5_22121.eml, and I can confirm the international characters are correctly displayed.

Here is the header, just in case it differs from what is in the .eml (but I suppose it should not):

Received: from [redacted] with Microsoft SMTP Server (TLS)
 id 15.0.1395.4 via Mailbox Transport; Wed, 1 Apr 2020 15:14:59 +0200
Received: from [redacted] with Microsoft SMTP Server (TLS)
 id 15.0.1395.4; Wed, 1 Apr 2020 15:14:59 +0200
Received: from [redacted] with Microsoft SMTP Server (TLS) id 15.0.1395.4; Wed, 1 Apr
 2020 15:14:59 +0200
Subject: =?UTF-8?Q?Fwd=3a_Baisse_de_biodiversit=c3=a9_et_coronavirus_=28inte?=
 =?UTF-8?Q?rvention_de_ [redacted]
References: [redacted]
To: [redacted]
From: [redacted]
X-Forwarded-Message-Id: [redacted]
Message-ID: [redacted]
Date: Wed, 1 Apr 2020 15:14:56 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
 Thunderbird/68.6.0
In-Reply-To: [redacted]
Content-Type: multipart/mixed;
	boundary="------------27BF9F6938AB8F6A2A4F795E"
Content-Language: fr
Return-Path: [redacted]
X-MS-Exchange-Organization-AuthSource: [redacted]
X-MS-Exchange-Organization-AuthAs: Internal
X-MS-Exchange-Organization-AuthMechanism: 00
X-Originating-IP: [138.102.156.18]
X-ClientProxiedBy: [redacted]
X-MS-Exchange-Organization-Network-Message-Id: 0e923246-6f93-4314-a724-08d7d63eadfc
X-MS-Exchange-Organization-AVStamp-Enterprise: 1.0
MIME-Version: 1.0

--------------27BF9F6938AB8F6A2A4F795E
Content-Type: multipart/alternative;
	boundary="------------019299122F2C78E1692E6228"

--------------019299122F2C78E1692E6228
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 8bit

I redacted a lot of information containing names or email addresses, sorry about that (please let me know if I forgot something), but it does show charset="windows-1252" as you mentioned, yet the content is right. The subject, however, is wrong, as you can see above. The subject is OK when viewing the message directly in Mailspring instead of the raw file:

ss-2020-05-13_161706

This is just a guess, but I expect the characters to appear correctly in Thunderbird and Outlook too, since the colleagues involved in this email discussion use either of these clients.
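
For reference, the Subject line in the raw header above uses the RFC 2047 "encoded-word" form (`=?charset?Q?...?=`), which is how non-ASCII header text is normally carried on the wire, so the raw view is expected to look like that. A small sketch, assuming Python's standard `email.header` module, that decodes the first encoded-word quoted above (the second one was redacted, so it is left out):

```python
from email.header import decode_header, make_header

# First encoded-word of the Subject header quoted above; the rest was redacted.
encoded = "=?UTF-8?Q?Fwd=3a_Baisse_de_biodiversit=c3=a9_et_coronavirus_=28inte?="

# decode_header() splits the value into (bytes, charset) pairs;
# make_header() joins them back into a readable string.
subject = str(make_header(decode_header(encoded)))
print(subject)
```

In Q-encoding, `_` stands for a space and `=c3=a9` is the UTF-8 byte pair for `é`.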


@d99kris commented on GitHub (May 13, 2020):

Thanks for the quick reply! Okay, I'll check out Mailspring and maybe Thunderbird on my side.


@Kabouik commented on GitHub (May 13, 2020):

The colleagues in the discussion might have been using OWA too.


@Kabouik commented on GitHub (Jun 19, 2020):

I received other emails with charset="iso-8859-1" since then (probably charset="Windows-1252" too) so I tested OWA too. I can confirm they display correctly in Mailspring and OWA; I have not tested Thunderbird but I suspect it would work too or I would have heard colleagues complaining about it. This seems to happen mostly (only?) with emails that contain only html parts.

I tried exporting one email with iso-8859-1 and reimporting it after replacing the charset with utf-8. It replaced all weird characters by � in nmail with lynx and in Firefox as external html viewer. Interestingly, the same � characters now show up in OWA as well for the imported email, while the original email with iso-8859-1 shows correct characters.


@Kabouik commented on GitHub (Jun 19, 2020):

> This seems to happen mostly (only?) with emails that contain only html parts.

No, it does not happen only with emails that contain only html parts. I received another email just now with both plain text and html parts, encoded as iso-8859-1, and it shows wrong characters when viewed with lynx, but correct characters when viewed in plain text.

Note that when I open those exported emails in $EDITOR (even outside nmail), special characters are also wrongly decoded, i.e., "Cette motion a d=E9j=E0 =E9t=E9 adopt=E9e" instead of "Cette motion a déjà été adoptée", but this is probably caused by the export from nmail in the first place. Perhaps that's why characters are converted to � instead of the expected characters when I change the charset to utf-8 and reimport, as if the information is already lost.
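
The `=E9` / `=E0` sequences look like the quoted-printable transfer encoding, in which each non-ASCII byte is written as `=` plus two hex digits; a raw export would normally preserve that, so the information may not be lost. A sketch, assuming Python's standard `quopri` module and the sample sentence quoted above, showing that the bytes decode correctly as ISO-8859-1 and turn into `�` only when relabelled as UTF-8:

```python
import quopri

# The sentence as it appears in the exported file (quoted-printable).
exported = b"Cette motion a d=E9j=E0 =E9t=E9 adopt=E9e"

# Undo the transfer encoding to get the original bytes back.
raw = quopri.decodestring(exported)

# Read with the charset the message declared: the accents are intact.
as_latin1 = raw.decode("iso-8859-1")

# Read as UTF-8 instead: each lone 0xE9 / 0xE0 byte is invalid UTF-8
# and becomes the replacement character U+FFFD.
as_utf8 = raw.decode("utf-8", errors="replace")

print(as_latin1)
print(as_utf8)
```

This would be consistent with the reimport result described above: changing only the charset label to utf-8 does not change the bytes, so an ISO-8859-1 body stops being decodable.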


@Kabouik commented on GitHub (Jun 22, 2020):

I tried sending an email to myself using the Outlook web app (OWA), and accents do not display correctly in nmail (they do in the webmail). Surprisingly, nmail displayed the subject correctly though, including the accents.

The issue might not exist for emails sent from Thunderbird (at least I know that some of my contacts use Thunderbird and accents are properly displayed in nmail, not sure if that is the case for all emails sent with Thunderbird).


@d99kris commented on GitHub (Jul 7, 2020):

Hi @Kabouik - thanks for reporting the bug and sharing the email examples with me.

I've finally figured out what the issues were, and I believe they should be addressed now.

Please let me know if you still encounter garbled characters in any emails. Thanks!


@Kabouik commented on GitHub (Jul 7, 2020):

Awesome, that's well worth a git pull!

I think I can confirm you did solve the issue. I am not 100% sure the emails I checked were all causing the issue with the previous version, but I'll come back here if I ever see it occur again. Thanks a lot, that's a big leap forward for me.


@Kabouik commented on GitHub (Sep 30, 2020):

This is a note for myself that I need to send you some other examples: I have come across a situation with a recipient in Ukraine not seeing my replies properly when sent with nmail, but no problem when sent using the Disroot webmail. He was talking to me in English, so I suspect the issue was with the Cyrillic characters in his name forcing a change in the encoding format. I'll send you the messages but just need to take a few minutes to review them and redact the personal information before sharing.
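
As an illustration of the guess above, a non-ASCII display name does not by itself have to change how the body is encoded: the name can be carried as an RFC 2047 encoded-word in the header while an English body stays plain ASCII. The sketch below assumes Python's standard `email` package; the name and addresses are made up, and it says nothing about how nmail or the Disroot webmail actually build outgoing messages.

```python
from email.headerregistry import Address
from email.message import EmailMessage

msg = EmailMessage()
# Hypothetical sender and recipient; only the display name is non-ASCII.
msg["From"] = Address("Sender", "sender", "example.org")
msg["To"] = Address("Тарас", "recipient", "example.com")
msg["Subject"] = "Re: sampling plan"
msg.set_content("Hello, here is my reply in plain English.\n")

# Serialised form: the Cyrillic name is emitted as an encoded-word,
# and the whole message is ASCII-only on the wire.
wire = msg.as_string()
print(wire)
```

Comparing the raw source of a message sent from each client in this way would show whether the headers or the body differ.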


@d99kris commented on GitHub (Oct 3, 2020):

Sure! Feel free to perhaps log a new issue if it's for sent/outgoing messages, as opposed to received messages. Thanks!

On 2020-09-30 23:11 Kabouik notifications@github.com wrote:

This is a note for myself that I need to send you some other examples,
I have come across a situation with a recipient in Ukraine not seeing
my messages properly when sent with nmail, but visible when sent using
the Disroot webmail. He was talking to me in English, so I suspect the
issue was with the Cyrillic characters in his name forcing a change in
the encoding format. I'll send you the messages but just need to take a
few minutes to review them and redact the personal information before
sharing.

--
You are receiving this because you modified the open/close state.
Reply to this email directly or view it on GitHub:
https://github.com/d99kris/nmail/issues/25#issuecomment-701454509
