[GH-ISSUE #2738] Resolver: Deduplicate connections to name servers #1056

Closed
opened 2026-03-16 01:28:08 +03:00 by kerem · 7 comments
Owner

Originally created by @divergentdave on GitHub (Jan 24, 2025).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2738

Is your feature request related to a problem? Please describe.
The NameServerCache inside the recursor may hold multiple connections to the same name server, via the same protocol, in separate RecursorPools for different zones. When shared DNS servers are used, many zones could have overlapping sets of name server IPs. This is undesirable, as it may result in many TCP connections to the same name server, wasting overhead and potentially running afoul of connection limits.

Describe the solution you'd like
If we removed the RecursorPool encapsulation, we could separately store connections in one cache, indexed by IP and protocol, and zone name server pool information in another cache, indexed by zone name, containing a list of IPs, and possibly a list of weak references to connections. When building name server pools during recursion, instead of always constructing new connections to every name server and bundling them into a pool, we'd store the list of name server IPs and reuse or create connections to individual servers as needed.

Currently, RecursorPool also has a map in a mutex used to deduplicate in-flight requests to name servers of a single zone. We'd probably want to reimplement that deduplication at the zone level.

Describe alternatives you've considered
Alternately, we could leave RecursorPool mostly intact, but share connection objects to common servers across RecursorPools, lazily opening connections to new name servers on demand.

Additional context
This occurred to me while considering possible solutions to #2574. Adding a layer of indirection in this manner would also let us expire the metadata about a zone's pool without tearing down the connections. Once refreshed referrals are received, we could continue using the existing connections, assuming the same IP addresses are returned.

Originally created by @divergentdave on GitHub (Jan 24, 2025). Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2738 **Is your feature request related to a problem? Please describe.** The `NameServerCache` inside the recursor may hold multiple connections to the same name server, via the same protocol, in separate `RecursorPool`s for different zones. When shared DNS servers are used, many zones could have overlapping sets of name server IPs. This is undesirable, as it may result in many TCP connections to the same name server, wasting overhead and potentially running afoul of connection limits. **Describe the solution you'd like** If we removed the `RecursorPool` encapsulation, we could separately store connections in one cache, indexed by IP and protocol, and zone name server pool information in another cache, indexed by zone name, containing a list of IPs, and possibly a list of weak references to connections. When building name server pools during recursion, instead of always constructing new connections to every name server and bundling them into a pool, we'd store the list of name server IPs and reuse or create connections to individual servers as needed. Currently, `RecursorPool` also has a map in a mutex used to deduplicate in-flight requests to name servers of a single zone. We'd probably want to reimplement that deduplication at the zone level. **Describe alternatives you've considered** Alternately, we could leave `RecursorPool` mostly intact, but share connection objects to common servers across `RecursorPool`s, lazily opening connections to new name servers on demand. **Additional context** This occurred to me while considering possible solutions to #2574. Adding a layer of indirection in this manner would also let us expire the metadata about a zone's pool without tearing down the connections. Once refreshed referrals are received, we could continue using the existing connections, assuming the same IP addresses are returned.
kerem 2026-03-16 01:28:08 +03:00
Author
Owner

@djc commented on GitHub (Jan 24, 2025):

Agreed that the current pooling setup doesn't really make sense.

(Should this be part of #2725?)

<!-- gh-comment-id:2612707422 --> @djc commented on GitHub (Jan 24, 2025): Agreed that the current pooling setup doesn't really make sense. (Should this be part of #2725?)
Author
Owner

@divergentdave commented on GitHub (Jan 29, 2025):

I think adding it to the tracking issue makes sense. I can see having many TCP connections open to common providers leading to operational problems, because of rate limiting on the authoritative side. Done.

<!-- gh-comment-id:2623189779 --> @divergentdave commented on GitHub (Jan 29, 2025): I think adding it to the tracking issue makes sense. I can see having many TCP connections open to common providers leading to operational problems, because of rate limiting on the authoritative side. Done.
Author
Owner

@cpu commented on GitHub (Aug 21, 2025):

If we removed the RecursorPool encapsulation, we could separately store connections in one cache, indexed by IP and protocol, and zone name server pool information in another cache, indexed by zone name, containing a list of IPs, and possibly a list of weak references to connections

+1 to this idea. The authoritative server encrypted transport connection state metadata described by RFC 9539 also wants to be indexed by nameserver IP without consideration of zone name. I think this change would be helpful/necessary for the probing policy it describes to be implemented faithfully.

<!-- gh-comment-id:3211859822 --> @cpu commented on GitHub (Aug 21, 2025): > If we removed the RecursorPool encapsulation, we could separately store connections in one cache, indexed by IP and protocol, and zone name server pool information in another cache, indexed by zone name, containing a list of IPs, and possibly a list of weak references to connections +1 to this idea. The authoritative server encrypted transport connection state metadata described by RFC 9539 also wants to be indexed by nameserver IP without consideration of zone name. I think this change would be helpful/necessary for the probing policy it describes to be implemented faithfully.
Author
Owner

@djc commented on GitHub (Aug 22, 2025):

If we removed the RecursorPool encapsulation, we could separately store connections in one cache, indexed by IP and protocol, and zone name server pool information in another cache, indexed by zone name, containing a list of IPs, and possibly a list of weak references to connections

+1 to this idea. The authoritative server encrypted transport connection state metadata described by RFC 9539 also wants to be indexed by nameserver IP without consideration of zone name. I think this change would be helpful/necessary for the probing policy it describes to be implemented faithfully.

Will this still reuse some of the abstractions in the resolver? I think we'd want to stick with a structure like in #3210 where connections to a single server are kept together; so I don't think the recursor should split up connections to the same server again (referring to "separately store connections [..] indexed by IP and protocol"). I think the resolver's NameServer could be that abstraction, though I wonder if the recursor won't need any of the NameServerPool infrastructure. I guess maybe it could build its own PoolState values?

<!-- gh-comment-id:3213960339 --> @djc commented on GitHub (Aug 22, 2025): > > If we removed the RecursorPool encapsulation, we could separately store connections in one cache, indexed by IP and protocol, and zone name server pool information in another cache, indexed by zone name, containing a list of IPs, and possibly a list of weak references to connections > > +1 to this idea. The authoritative server encrypted transport connection state metadata described by RFC 9539 also wants to be indexed by nameserver IP without consideration of zone name. I think this change would be helpful/necessary for the probing policy it describes to be implemented faithfully. Will this still reuse some of the abstractions in the resolver? I think we'd want to stick with a structure like in #3210 where connections to a single server are kept together; so I don't think the recursor should split up connections to the same server again (referring to "separately store connections [..] indexed by IP and protocol"). I think the resolver's `NameServer` could be that abstraction, though I wonder if the recursor won't need any of the `NameServerPool` infrastructure. I guess maybe it could build its own `PoolState` values?
Author
Owner

@cpu commented on GitHub (Aug 27, 2025):

@divergentdave Do you have thoughts on this & #3210 as the OP of this issue?

<!-- gh-comment-id:3228102030 --> @cpu commented on GitHub (Aug 27, 2025): @divergentdave Do you have thoughts on this & #3210 as the OP of this issue?
Author
Owner

@divergentdave commented on GitHub (Aug 27, 2025):

Storing connections to the same server via different protocols together seems fine to me, the protocol doesn't necessarily have to be part of the map key. The resolver and the recursor will have significant overlap in the fields they store per-server, so I think it may make sense to share data types.

<!-- gh-comment-id:3228822373 --> @divergentdave commented on GitHub (Aug 27, 2025): Storing connections to the same server via different protocols together seems fine to me, the protocol doesn't necessarily have to be part of the map key. The resolver and the recursor will have significant overlap in the fields they store per-server, so I think it may make sense to share data types.
Author
Owner

@marcus0x62 commented on GitHub (Oct 29, 2025):

Addressed in #3328

<!-- gh-comment-id:3463689870 --> @marcus0x62 commented on GitHub (Oct 29, 2025): Addressed in #3328
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/hickory-dns#1056
No description provided.