[GH-ISSUE #1561] Allow resolver to switch to TCP for large messages without having to include TCP in name server pool #700

Closed
opened 2026-03-15 23:53:24 +03:00 by kerem · 8 comments

Originally created by @peterthejohnston on GitHub (Oct 6, 2021).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/1561

Is your feature request related to a problem? Please describe.
We encountered an issue testing a device running Fuchsia where DNS requests sent over TCP to the local router always timed out. (It's possible that it was just dropping TCP packets to port 53.) It responded to DNS over UDP fine. This ended up significantly slowing down IP lookup requests, because by default we include both (https://cs.opensource.google/fuchsia/fuchsia/+/main:src/connectivity/network/dns/src/main.rs;l=225;drc=9a56fdf03671a06a5f0e9eeb2d5117f640a0c402) a UDP "version" and a TCP version of each name server in the name server pool, to allow for TCP connections to be used when needed. However, once retries come into play, a retried DNS query can end up being sent to every name server in the pool, including the TCP versions.

Describe the solution you'd like
As RFC 5966, section 1 describes:

Most DNS [RFC1034] transactions take place over UDP [RFC0768]. TCP
[RFC0793] is always used for zone transfers and is often used for
messages whose sizes exceed the DNS protocol's original 512-byte
limit.

I don't think we want to be making any DNS queries over TCP by default; it seems to me that we should only do that when the response doesn't fit into a single UDP message. It looks to me like trust-dns already does this: if we get a response indicating that the message was truncated, trust-dns attempts to switch (https://github.com/bluejekyll/trust-dns/blob/main/crates/resolver/src/name_server/name_server_pool.rs#L241) to a TCP connection to make the query. However, it relies on there being TCP connections in the name server pool. Ideally, I'd like to allow for this use case (using TCP for oversize messages) without having to include TCP name servers alongside UDP ones in the regular pool.
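
For readers unfamiliar with how truncation is signalled: a server that cannot fit its answer into a UDP response sets the TC bit in the DNS header (RFC 1035, section 4.1.1), and that bit is what a resolver checks before retrying over TCP. The following is a minimal sketch of that check on a raw message buffer. The function and constant names are made up for illustration and are not taken from trust-dns.

```rust
/// Size of the fixed DNS header in bytes (RFC 1035, section 4.1.1).
const HEADER_LEN: usize = 12;
/// Offset of the first flags byte, which follows the 2-byte message ID.
const FLAGS_HI: usize = 2;
/// Mask of the TC (truncation) bit within the first flags byte.
const TC_MASK: u8 = 0x02;

/// Returns true if `msg` holds a complete DNS header with the TC bit set.
/// Hypothetical helper: buffers shorter than a header count as "not truncated".
pub fn is_truncated(msg: &[u8]) -> bool {
    msg.len() >= HEADER_LEN && (msg[FLAGS_HI] & TC_MASK) != 0
}
```

A resolver would typically apply a check like this to each UDP response and reissue the same query over TCP only when it returns true.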

Maybe we could optionally have a separate name server pool for TCP to allow for this fallback to occur without having TCP name servers in the regular pool. Do you have any thoughts about the approach here?

Describe alternatives you've considered
We can always just not include any TCP versions in the name server pool, but this would prevent the resolver from falling back to TCP when the message size is too large, which is undesirable.

kerem closed this issue 2026-03-15 23:53:29 +03:00

@bluejekyll commented on GitHub (Oct 6, 2021):

I'm thinking of options here, and definitely open to ideas. I think the truncation flag should be set in the case that you're talking about. In that case, you want to drop through to the TCP logic. As a simple approach, we could add an option that would only promote to TCP on truncation and not on any other errors. That would stop us from attempting TCP after timeouts...
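
The proposed behaviour can be sketched as a small decision function. This is only an illustration of the idea in the comment above, not the code that landed in trust-dns: the `UdpOutcome` enum, the `truncation_only` flag, and the function name are all hypothetical.

```rust
/// Hypothetical summary of what happened to a query sent over UDP.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UdpOutcome {
    /// A complete response arrived.
    Answered,
    /// A response arrived with the truncation flag set.
    Truncated,
    /// No response arrived in time.
    TimedOut,
    /// Any other failure (I/O error, malformed response, ...).
    Failed,
}

/// Decides whether the query should be retried over TCP.
/// `truncation_only` stands in for the proposed option: when true, only a
/// truncated response promotes the query to TCP; when false, any failure does.
pub fn should_retry_over_tcp(outcome: UdpOutcome, truncation_only: bool) -> bool {
    match outcome {
        UdpOutcome::Answered => false,
        UdpOutcome::Truncated => true,
        UdpOutcome::TimedOut | UdpOutcome::Failed => !truncation_only,
    }
}
```

With the flag enabled, a router that silently drops TCP traffic to port 53 no longer costs an extra TCP timeout after every UDP timeout, while oversize answers can still be fetched over TCP.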


@peterthejohnston commented on GitHub (Oct 6, 2021):

I've uploaded a PR (https://github.com/bluejekyll/trust-dns/pull/1562) that adds an option to only promote to TCP on truncation, and not on other errors. Let me know what you think.


@peterthejohnston commented on GitHub (Oct 13, 2021):

Hi, sorry for the frequent requests—would it be possible to tag a v0.21.0-alpha.4 release so we could pull in this change? We'd really appreciate it.


@bluejekyll commented on GitHub (Oct 13, 2021):

yeah, let me get that process going.


@peterthejohnston commented on GitHub (Oct 13, 2021):

Thanks! We could also depend on a specific git revision rather than a publicly released crates.io version in order to pull in the change, if that would make things easier for you.


@bluejekyll commented on GitHub (Oct 13, 2021):

That’s completely up to you. I know that you have specific vendoring rules, so I’d leave that in your court to decide.


@peterthejohnston commented on GitHub (Oct 13, 2021):

Sounds good, I'll see if that's feasible for us then and follow up with you if we do need a new release.


@bluejekyll commented on GitHub (Oct 13, 2021):

@peterthejohnston: https://github.com/bluejekyll/trust-dns/releases/tag/v0.21.0-alpha.4, crates are up on crates.io as well.
