mirror of
https://github.com/hickory-dns/hickory-dns.git
synced 2026-04-25 03:05:51 +03:00
[GH-ISSUE #1043] Resolver: HTTPS client never closes connections #591
Originally created by @balboah on GitHub (Mar 17, 2020).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/1043
Describe the bug
When configuring with NameServerConfigGroup::from_ips_https(), the resulting client leaves connections in the ESTABLISHED state, leaking memory by opening a new connection for each lookup. My loop of 200 lookups produced 200 connections in the ESTABLISHED state.
To Reproduce
Running this example prints 196-200:
example main.rs
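The attached main.rs is not preserved in this mirror. As a stand-alone way to observe the same symptom, the sketch below counts ESTABLISHED IPv4 connections to remote port 443 by parsing Linux's /proc/net/tcp; this is an editorial illustration (Linux-only, IPv4-only), not part of the original reproduction:

```rust
use std::fs;

/// Count ESTABLISHED IPv4 TCP connections to a given remote port by
/// parsing /proc/net/tcp (Linux only; returns 0 on other platforms).
fn established_to_port(port: u16) -> usize {
    let table = fs::read_to_string("/proc/net/tcp").unwrap_or_default();
    table
        .lines()
        .skip(1) // column-header row
        .filter(|line| {
            let cols: Vec<&str> = line.split_whitespace().collect();
            // cols[2] is "rem_address:port" in hex; cols[3] is the socket
            // state, where "01" means ESTABLISHED.
            match (cols.get(2), cols.get(3)) {
                (Some(remote), Some(&"01")) => {
                    remote
                        .rsplit(':')
                        .next()
                        .and_then(|p| u16::from_str_radix(p, 16).ok())
                        == Some(port)
                }
                _ => false,
            }
        })
        .count()
}

fn main() {
    // After the report's loop of 200 DoH lookups, this count sat at 196-200.
    println!("ESTABLISHED to :443 -> {}", established_to_port(443));
}
```

Running this alongside the lookup loop makes the leak visible without any external tools.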
Expected behavior
Connections should be closed after a lookup has resolved, which works with the TLS (a.k.a. DoT) version.
System:
Version: trust-dns git rev 77fd933d
Additional context
Using the rustls TLS-only edition works fine.
Tokio runtime is created with:
@bluejekyll commented on GitHub (Mar 19, 2020):
This is surprising. The design isn't to drop connections between uses; in fact, TCP and TLS should maintain connections, so the behavior there seems unexpected as well. I think there are tests that cover this, but we should do a better review here.
The bigger concern is that you're seeing a memory leak, and we should determine why that is.
@balboah commented on GitHub (Mar 19, 2020):
Yes, the optimum for my HTTPS case would be to have one keep-alive connection, especially since it can multiplex queries over the h2 streams. I'm not sure how that works with TLS only.
But the critical part is that it leaks, which triggers a crash in my resource-limited environment.
@balboah commented on GitHub (Mar 19, 2020):
I haven't dug into the code yet, but I remember there is a "max idle connections per host" setting in reqwest, which defaults to the maximum integer size. Maybe this is a similar issue: rapid queries aren't limited in the number of idle connections they may spawn, and each takes a fair amount of memory.
@balboah commented on GitHub (Mar 24, 2020):
Currently suspecting that the bug is around NameServerPool::parallel_conn_loop. There will be a new HttpsClientStream for every query that was not cached, so it wouldn't be possible for it to re-use its connection. I'm having trouble following the Futures calls and all the different types that collectively get the job done, so I'll leave this for now.
@bluejekyll commented on GitHub (Mar 24, 2020):
Yes. Some of these are older futures. All the code hasn’t been 100% cleaned up, so I can understand it being hard to follow.
I'll take a look at that. I spent a lot of time last release cleaning up that area, so it's fresh in my mind.
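The suspicion above can be illustrated with a toy model (an editorial sketch; `Conn` merely stands in for an HttpsClientStream-like handle and none of these are real hickory-dns/trust-dns types), comparing a fresh connection per query against one pooled keep-alive connection per upstream:

```rust
use std::collections::HashMap;

/// Stand-in for a connection handle (e.g. an HttpsClientStream).
struct Conn;

/// Suspected buggy path: every uncached query opens a fresh connection
/// that is never closed, so the open count grows with the query count.
fn run_without_reuse(queries: usize) -> usize {
    let mut open: Vec<Conn> = Vec::new();
    for _ in 0..queries {
        open.push(Conn); // new connection per query, never reused
    }
    open.len()
}

/// Intended path: one keep-alive connection per upstream, reused for
/// every query (and, with h2, multiplexing them over its streams).
fn run_with_reuse(queries: usize, upstream: &str) -> usize {
    let mut pool: HashMap<String, Conn> = HashMap::new();
    for _ in 0..queries {
        pool.entry(upstream.to_string()).or_insert(Conn); // reuse if present
    }
    pool.len()
}

fn main() {
    // Mirrors the 200-lookup loop from the report.
    println!("without reuse: {}", run_without_reuse(200)); // 200
    println!("with reuse:    {}", run_with_reuse(200, "1.1.1.1:443")); // 1
}
```

The pool-keyed-by-upstream shape is why the reporter's 200 lookups should have produced one long-lived connection rather than 200.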
@balboah commented on GitHub (Apr 1, 2020):
@bluejekyll hey, did you find some time to squash this bug? No stress, but it would save me a lot of time :)
@bluejekyll commented on GitHub (Apr 1, 2020):
Not yet. I was hoping to find some time this week.
@bluejekyll commented on GitHub (Apr 5, 2020):
I've been looking at this today. I think what happened is that through all the refactoring between 0.18 and 0.19 to support async/await, this area of the code got screwed up. I have a test I'm building to try to detect this, so that we don't lose this functionality in the future. Then I'll work on a patch to fix it.