[GH-ISSUE #472] resolver lookup_ip blocks forever when using a different tokio runtime. #201

Closed
opened 2026-03-07 22:45:32 +03:00 by kerem · 8 comments
Owner

Originally created by @cssivision on GitHub (May 18, 2018).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/472

```rust
extern crate futures;
extern crate tokio;
extern crate trust_dns_resolver;

use std::io;
use std::net::IpAddr;

use futures::{future, Future};
use tokio::runtime::current_thread::Runtime;
use trust_dns_resolver::config::{ResolverConfig, ResolverOpts};
use trust_dns_resolver::ResolverFuture;

pub fn other(desc: &'static str) -> io::Error {
    io::Error::new(io::ErrorKind::Other, desc)
}

pub fn resolve(
    host: &str,
    resolver: ResolverFuture,
) -> Box<Future<Item = IpAddr, Error = io::Error> + Send> {
    let res = resolver.lookup_ip(host).then(move |res| match res {
        Ok(r) => if let Some(addr) = r.iter().next() {
            future::ok(addr)
        } else {
            future::err(other("no ip return"))
        },
        Err(_) => future::err(other("resolve fail")),
    });

    Box::new(res)
}

fn main() {
    let mut io_loop = Runtime::new().unwrap();
    let resolver = ResolverFuture::new(ResolverConfig::cloudflare(), ResolverOpts::default());
    let resolver = io_loop.block_on(resolver).unwrap();
    let mut runtime =
        tokio::runtime::current_thread::Runtime::new().expect("failed to launch Runtime");

    println!(
        "{}",
        runtime
            .block_on(resolve("www.google.com", resolver))
            .unwrap()
    );
}
```

`runtime.block_on(resolve("www.google.com", resolver)).unwrap()` will block forever, unless I replace `runtime` with `io_loop`, or call `drop(io_loop)` before this call; test code below.

```rust
println!(
    "{}",
    io_loop
        .block_on(resolve("www.google.com", resolver))
        .unwrap()
);

// --------------------------------------------------
drop(io_loop);
println!(
    "{}",
    runtime
        .block_on(resolve("www.google.com", resolver))
        .unwrap()
);
```
kerem 2026-03-07 22:45:32 +03:00
Author
Owner

@cssivision commented on GitHub (May 18, 2018):

This issue may be related to `tokio`, but I think I should also post it here.


@bluejekyll commented on GitHub (May 18, 2018):

Yes. I’m not sure what the solution is here. The Resolver spawns the network handlers onto the first Runtime, but the lookup futures it returns are then run on a different runtime; either both runtimes must be running, or a single runtime must be used for both.

For a multi-runtime solution, this example is good: https://github.com/bluejekyll/trust-dns/blob/master/resolver/examples/global_resolver.rs


@bluejekyll commented on GitHub (May 18, 2018):

I've been thinking for a little bit that we should restructure the timeout, or perhaps add another timer option on the lookup itself. If we do that, we can time out on the client side. It doesn't fix the underlying problem, but the lookup would at least always time out.
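The client-side timeout proposed here can be illustrated with a std-only analogue (not the trust-dns API): if the lookup reply arrives over a channel, `recv_timeout` on the receiving end turns "blocks forever" into a bounded wait, even when the background task is never driven.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

fn main() {
    let (reply_tx, reply_rx) = mpsc::channel::<&str>();

    // Simulate a lookup whose background task never answers in time
    // (e.g. because its Runtime is not being driven).
    thread::spawn(move || {
        thread::sleep(Duration::from_secs(60));
        let _ = reply_tx.send("93.184.216.34");
    });

    // A client-side timer bounds the wait instead of hanging forever.
    match reply_rx.recv_timeout(Duration::from_millis(100)) {
        Ok(ip) => println!("resolved: {}", ip),
        Err(mpsc::RecvTimeoutError::Timeout) => println!("lookup timed out"),
        Err(mpsc::RecvTimeoutError::Disconnected) => println!("resolver gone"),
    }
}
```

This doesn't fix the misconfigured-runtime case, but, as the comment above notes, it converts an indefinite hang into a reportable timeout error.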


@cssivision commented on GitHub (May 18, 2018):

The global resolver in the example works because the `runtime` at https://github.com/bluejekyll/trust-dns/blob/master/resolver/examples/global_resolver.rs#L38 is dropped after the global resolver is initialized.


@bluejekyll commented on GitHub (May 18, 2018):

What's going on there is that `Runtime::run` doesn't return until all futures associated with that Runtime complete. There are implicit futures, associated with the NameServers registered in the NameServerPool, that are spawned into the Runtime. The `ResolverFuture` that is returned basically contains only channels pointing back to the NameServer futures. That global Runtime will exist and run in that thread as long as the ResolverFuture exists, and since that's a static reference, it will exist for the life of the program.

See #464, that discusses what I'd like to do to change this interface to make those interior Futures more explicit. I think it would help avoid the issue you've run into, though it doesn't remove the fact that the Future for the connections is separate from the future for the lookup, that is waiting potentially in another thread.

I'm open to other ideas here, but every time I think through how to make this more ergonomic, I end up stuck with the problem that the NameServerPool owns the connection, and if you want a ResolverFuture that supports multiple threads you need a model like this... This is partly why I want to hide all this and create a GlobalResolver, at the expense of a single resolver thread running the background Runtime.
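The shape described above (a background worker owning the connections, with the resolver handle holding only channels back to it) can be sketched with plain threads and std channels. The names `ResolverHandle` and `spawn_resolver`, and the hard-coded lookup table, are purely illustrative and not part of the trust-dns API:

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

// Hypothetical handle: like ResolverFuture, it owns only a channel
// back to the background worker that owns the "connections".
struct ResolverHandle {
    tx: mpsc::Sender<(String, mpsc::Sender<Option<String>>)>,
}

impl ResolverHandle {
    fn lookup(&self, host: &str) -> Option<String> {
        let (reply_tx, reply_rx) = mpsc::channel();
        self.tx.send((host.to_string(), reply_tx)).ok()?;
        reply_rx.recv().ok()?
    }
}

// Stand-in for the background thread running the resolver's Runtime.
fn spawn_resolver() -> ResolverHandle {
    let (tx, rx) = mpsc::channel::<(String, mpsc::Sender<Option<String>>)>();
    thread::spawn(move || {
        // Fake "name server": a static table instead of real network I/O.
        let table: HashMap<&str, &str> =
            [("www.google.com", "172.217.0.4")].into_iter().collect();
        // The loop ends once the last ResolverHandle (Sender) is dropped.
        while let Ok((host, reply)) = rx.recv() {
            let _ = reply.send(table.get(host.as_str()).map(|ip| ip.to_string()));
        }
    });
    ResolverHandle { tx }
}

fn main() {
    let resolver = spawn_resolver();
    println!("{:?}", resolver.lookup("www.google.com"));
}
```

The key property is that a lookup only completes if the background thread is actually running; if nothing drives that worker, the caller blocks on `reply_rx.recv()`, which is the thread-based analogue of the hang reported in this issue.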


@bluejekyll commented on GitHub (May 21, 2018):

Also, refer to #479 for a standard multi-threaded example.


@carllerche commented on GitHub (May 24, 2018):

@bluejekyll Generally speaking, the strategy for background "worker" tasks is to have them shut down automatically once they know they have no further work to do.

The basic example would be a task that pulls work off of an mpsc channel: when all the `Sender` handles have been dropped, the worker task should shut down, because it can never receive any more work.

The `h2` library has a similar setup. There is a `Connection` task that manages the connection and a `SendRequest` handle that can be cloned to issue client requests. Once all `SendRequest` handles have been dropped, the `Connection` task starts a graceful shutdown process and then terminates.
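The shutdown mechanics described here can be demonstrated with std threads instead of futures (an async version would use a futures mpsc channel, but the principle is identical): once every `Sender` is dropped, the receiving worker sees the channel close and exits on its own, with no explicit shutdown signal.

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel::<u32>();

    // Background worker: pulls jobs until recv() errors, which happens
    // exactly when every Sender handle has been dropped.
    let worker = thread::spawn(move || {
        let mut total = 0;
        while let Ok(job) = rx.recv() {
            total += job;
        }
        total // reached only after the last Sender is gone
    });

    let tx2 = tx.clone();
    tx.send(1).unwrap();
    tx2.send(2).unwrap();

    // Dropping the last Sender is the implicit shutdown signal.
    drop(tx);
    drop(tx2);

    let total = worker.join().unwrap();
    println!("worker exited after handling jobs summing to {}", total);
}
```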


@bluejekyll commented on GitHub (May 24, 2018):

Yes, the trust-dns libraries are built this way as well. If all channels are closed, the tasks spawned by the ResolverFuture will also shut down. Currently this also closes any TCP connections and UDP sockets.

The conversation above is, I think, more about how to guarantee the background tasks are being run before a query is performed.
