[GH-ISSUE #2895] Formatted ProtoErrorKind::Msg contains non-text #1079

Open
opened 2026-03-16 01:33:58 +03:00 by kerem · 4 comments
Owner

Originally created by @lilydjwg on GitHub (Mar 31, 2025).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2895

I sent a request over DoH and got 503 as response. I logged it since it wasn't successful, but systemd didn't like the log:

��
[1B blob data]
[12B blob data]
A
[1B blob data]
[5B blob data]
[1B blob data]
[1B blob data]
[1B blob data]
[2B blob data]
[3B blob data]

and console too:

http unsuccessful code: 503 Service Unavailable, message: ��isituporg)�'
��̧��|!g�u�cc�b��o�

I tried with {:?}, {:#} or {:#?}.

This code seems to write the string directly without any escaping:

github.com/hickory-dns/hickory-dns@ba92a2a02a/crates/proto/src/error.rs (L176-L177)

Originally created by @lilydjwg on GitHub (Mar 31, 2025). Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2895 I sent a request over DoH and got 503 as response. I logged it since it wasn't successful, but systemd didn't like the log: ``` �� [1B blob data] [12B blob data] A [1B blob data] [5B blob data] [1B blob data] [1B blob data] [1B blob data] [2B blob data] [3B blob data] ``` and console too: ``` http unsuccessful code: 503 Service Unavailable, message: ��isituporg)�' ��̧��|!g�u�cc�b��o� ``` I tried with `{:?}`, `{:#}` or `{:#?}`. This code seems to write the string directly without any escaping: https://github.com/hickory-dns/hickory-dns/blob/ba92a2a02afa63b720c46cc4eeca67d01e3b4224/crates/proto/src/error.rs#L176-L177
Author
Owner

@djc commented on GitHub (Mar 31, 2025):

So this is from https://github.com/hickory-dns/hickory-dns/blob/main/crates/proto/src/h2/h2_client_stream.rs#L156.

That does use String::from_utf8_lossy() which makes sure we end up with valid Unicode, but may insert mojibake in the case of binary content that's not actually textual content. I would expect that's pretty rare. I'm not sure this is really a problem, and even if it is, I'm not sure there's an obvious improvement.

(I suppose we could yield an array of bytes as a structured logging element instead?)

<!-- gh-comment-id:2766564998 --> @djc commented on GitHub (Mar 31, 2025): So this is from https://github.com/hickory-dns/hickory-dns/blob/main/crates/proto/src/h2/h2_client_stream.rs#L156. That does use `String::from_utf8_lossy()` which makes sure we end up with valid Unicode, but may insert mojibake in the case of binary content that's not actually textual content. I would expect that's pretty rare. I'm not sure this is really a problem, and even if it is, I'm not sure there's an obvious improvement. (I suppose we could yield an array of bytes as a structured logging element instead?)
Author
Owner

@lilydjwg commented on GitHub (Apr 1, 2025):

Not only the replacement chars (which is fine), but also control chars (e.g. 0x7), which causes systemd to think is binary and breaks to multiple records (on each write?). There was a bell character in my case and it worked as a visual highlight. (Could someone intentionally inject some interesting ANSI escape sequences?)

I prefer bytes (e.g. as a list of u8). Or bstr if you can use it. Or at least do something to the control characters.

<!-- gh-comment-id:2767886133 --> @lilydjwg commented on GitHub (Apr 1, 2025): Not only the replacement chars (which is fine), but also control chars (e.g. 0x7), which causes systemd to think is binary and breaks to multiple records (on each write?). There was a bell character in my case and it worked as a visual highlight. (Could someone intentionally inject some interesting ANSI escape sequences?) I prefer bytes (e.g. as a list of u8). Or bstr if you can use it. Or at least do something to the control characters.
Author
Owner

@marcus0x62 commented on GitHub (Apr 1, 2025):

Converting the error with String::from_utf8 with a fallback to a byte array would improve the error string by preserving the original bytes without always printing as a byte array (which would make the log substantially harder to read):

    let error_string = String::from_utf8(response_bytes.as_ref().into())
        .unwrap_or(format!("{response_bytes:?}"));

Then we could escape the resulting string to cover control characters (while preserving the byte values in case those are needed for troubleshooting:

   // TODO: make explicit error type
   return Err(ProtoError::from(format!(
       "http unsuccessful code: {}, message: {}",
       response_stream.status(),
       error_string.escape_default(),
   )));

But, escaping should probably be done system-wide if we're going to do it at all - most of the logging call sites include user- or network-provided request or response data as part of the log message.

<!-- gh-comment-id:2769053034 --> @marcus0x62 commented on GitHub (Apr 1, 2025): Converting the error with String::from_utf8 with a fallback to a byte array would improve the error string by preserving the original bytes without *always* printing as a byte array (which would make the log substantially harder to read): ``` let error_string = String::from_utf8(response_bytes.as_ref().into()) .unwrap_or(format!("{response_bytes:?}")); ``` Then we could escape the resulting string to cover control characters (while preserving the byte values in case those are needed for troubleshooting: ``` // TODO: make explicit error type return Err(ProtoError::from(format!( "http unsuccessful code: {}, message: {}", response_stream.status(), error_string.escape_default(), ))); ``` But, escaping should probably be done system-wide if we're going to do it at all - most of the logging call sites include user- or network-provided request or response data as part of the log message.
Author
Owner

@lilydjwg commented on GitHub (Apr 3, 2025):

But, escaping should probably be done system-wide if we're going to do it at all - most of the logging call sites include user- or network-provided request or response data as part of the log message.

Sounds reasonable, and I've opened https://github.com/tokio-rs/tracing/issues/3250.

<!-- gh-comment-id:2774464380 --> @lilydjwg commented on GitHub (Apr 3, 2025): > But, escaping should probably be done system-wide if we're going to do it at all - most of the logging call sites include user- or network-provided request or response data as part of the log message. Sounds reasonable, and I've opened https://github.com/tokio-rs/tracing/issues/3250.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/hickory-dns#1079
No description provided.