mirror of
https://github.com/hickory-dns/hickory-dns.git
synced 2026-04-25 03:05:51 +03:00
[GH-ISSUE #472] resolver lookup_ip blocks forever when using a different tokio runtime. #497
Originally created by @cssivision on GitHub (May 18, 2018).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/472
`runtime.block_on(resolve("www.google.com", resolver)).unwrap()` will block forever, until I replace `runtime` with `io_loop`, or call `drop(io_loop)` before this call. Test code below.

@cssivision commented on GitHub (May 18, 2018):
This issue may be related to tokio, but I think I should also post it here.

@bluejekyll commented on GitHub (May 18, 2018):
Yes. I’m not sure what the solution is here. The Resolver spawns the network handlers onto the first Runtime. If the lookup futures it returns are then run on a different runtime, both runtimes must be running, or a single runtime must be used for everything.
For a multi-runtime solution, this example is good: https://github.com/bluejekyll/trust-dns/blob/master/resolver/examples/global_resolver.rs
@bluejekyll commented on GitHub (May 18, 2018):
I've been thinking for a little bit that we should restructure the timeout, or perhaps add another timer option on the lookup itself. If we do that, we can time out on the client side. It doesn't fix the problem, but it would at least always time out.
@cssivision commented on GitHub (May 18, 2018):
The global resolver in the example works because the runtime (https://github.com/bluejekyll/trust-dns/blob/master/resolver/examples/global_resolver.rs#L38) is dropped after the global resolver is initialized.

@bluejekyll commented on GitHub (May 18, 2018):
What's going on there is that `Runtime::run` doesn't return until all futures associated with that Runtime complete. There are implicit futures, associated with the NameServers registered in the NameServerPool, that are spawned into the Runtime. The `ResolverFuture` is then returned, and basically contains only channels pointing back to the NameServer futures. That global Runtime will exist and run in that thread as long as the ResolverFuture exists, and since that's a static reference, it will exist for the life of the program.

See #464, which discusses what I'd like to do to change this interface to make those interior Futures more explicit. I think it would help avoid the issue you've run into, though it doesn't remove the fact that the Future for the connections is separate from the Future for the lookup, which is potentially waiting in another thread.
I'm open to other ideas here, but every time I think through how to make this more ergonomic, I end up stuck with the problem that the NameServerPool owns the connection, and if you want a ResolverFuture that supports multiple threads you need a model like this... This is partly why I want to hide all this and create a GlobalResolver, at the expense of a single resolver thread running the background Runtime.
@bluejekyll commented on GitHub (May 21, 2018):
Also, refer to #479 for a standard multi-threaded example.
@carllerche commented on GitHub (May 24, 2018):
@bluejekyll Generally speaking, the approach for background "worker" tasks is to have some strategy for shutting down automatically when the task knows it has no further work to do.
The basic example would be a task that pulls work off of an mpsc channel. In this case, when all of the `Sender` handles have been dropped, the worker task should shut down, because it can never receive any more work.

The `h2` library has a similar setup as well. There is a `Connection` task that manages the connection and a `SendRequest` handle that can be cloned to issue client requests. Once all `SendRequest` handles have been dropped, the `Connection` task starts a graceful shutdown process and then terminates.

@bluejekyll commented on GitHub (May 24, 2018):
Yes. The trust-dns libraries are built this way as well. If all channels are closed, the tasks spawned by the ResolverFuture will also shut down. Currently this will also shut down any TCP connections and close UDP sockets as well.
I think the conversation above is more about how to guarantee that the background tasks are being run before a query is performed.