[GH-ISSUE #2273] Resolver takes a long time to resolve NXDOMAIN #945

Open
opened 2026-03-16 01:03:09 +03:00 by kerem · 8 comments
Owner

Originally created by @mhils on GitHub (Jul 2, 2024).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2273

Describe the bug

When hickory-resolver is configured with more than two nameservers and one of the first two is not reachable, DNS resolution takes five seconds for NXDOMAIN responses.

Edit: the description above turned out to be inaccurate (it is struck through in the original issue); see https://github.com/hickory-dns/hickory-dns/issues/2273#issuecomment-2205877684

To Reproduce

```rust
#!/usr/bin/env cargo +nightly -Zscript
---
[dependencies]
hickory-resolver = { git = "https://github.com/hickory-dns/hickory-dns.git" }
---


use std::net::*;
use std::str::FromStr;
use hickory_resolver::Resolver;
use hickory_resolver::config::*;

fn main() {
    dbg!(hickory_resolver::version());
    let mut config = ResolverConfig::new();
    config.add_name_server(NameServerConfig::new(SocketAddr::from_str("1.1.1.1:53").unwrap(), Protocol::Udp));
    config.add_name_server(NameServerConfig::new(SocketAddr::from_str("8.8.8.8:54").unwrap(), Protocol::Udp));  // invalid port
    config.add_name_server(NameServerConfig::new(SocketAddr::from_str("9.9.9.9:55").unwrap(), Protocol::Udp));  // invalid port

    let resolver = Resolver::new(config, ResolverOpts::default()).unwrap();

    let start = std::time::Instant::now();
    dbg!(resolver.lookup_ip("nxdomain.example.com").unwrap_err());
    let stop = std::time::Instant::now();
    dbg!(stop - start);
}
```

Output:

```
> ./repro.rs
[/Users/mhils/.cargo/target/a4/4553f572871dfe/repro.rs:14:5] hickory_resolver::version() = "0.25.0-alpha.1"
[/Users/mhils/.cargo/target/a4/4553f572871dfe/repro.rs:23:5] resolver.lookup_ip("nxdomain.example.com").unwrap_err() = ResolveError {
    kind: Proto(
        ProtoError {
            kind: NoRecordsFound { ... },
    ),
}
[/Users/mhils/.cargo/target/a4/4553f572871dfe/repro.rs:25:5] stop - start = 5.019177416s
```

Removing the third nameserver reduces the resolution time to normal levels (13ms on my system).

Expected behavior
Faster resolution. :)

System:

  • OS: macOS
  • Architecture: arm64
  • Version: 14.5
  • rustc version: 1.79

Version:
Crate: resolver
Version: v0.24.1 and HEAD

Additional context
Thank you for your fantastic work on hickory! 🍰 ❤️


@bluejekyll commented on GitHub (Jul 3, 2024):

It sounds like the third nameserver is not responding. Is that right?


@mhils commented on GitHub (Jul 3, 2024):

That is correct.

Scenario A: Two nameservers configured, only the first one is responding: Immediate response.
Scenario B: Three nameservers configured, only the first one is responding: 5s timeout.

Here's what Wireshark shows for Scenario B. As in the code above, 8.8.8.8 and 9.9.9.9 are unreachable (wrong port):
[image: https://github.com/hickory-dns/hickory-dns/assets/1019198/abf91c48-cfdd-4668-af5e-75278f0e6b48]

Attachment: dns.pcapng.zip (https://github.com/user-attachments/files/16080924/dns.pcapng.zip)

@djc commented on GitHub (Jul 3, 2024):

What is your expected behavior here? If you configure a number of nameservers, wouldn't you expect all of them to be tried? Is there a threshold particular to two nameservers? What happens when you have your "third" (failing to respond) nameserver second (omitting the second nameserver)?


@mhils commented on GitHub (Jul 3, 2024):

Thank you two - I bamboozled myself when turning this into an MRE. 🙈 My non-minified code only had a single non-responsive DNS server. But I was also using read_system_conf(), and read_system_conf() sets trust_negative_responses = false, which in turn makes the resolver wait...

Is there a reason why trust_negative_responses is set to false for system configurations? Is that working around some known bugs? The 5s timeout was certainly unexpected for me seeing an immediate NXDOMAIN response in Wireshark.
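
To make the effect described above concrete, here is a minimal toy model of it. This is not hickory's implementation: the `Server` and `Reply` types and the `time_to_nxdomain` function are hypothetical, the servers are queried strictly one after another (the real resolver may also issue requests concurrently), and the 5 s timeout is an assumption taken from the delay observed in this issue. Only the flag name mirrors `trust_negative_responses`.

```rust
use std::time::Duration;

/// What a simulated name server does with a query.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Reply {
    /// Answers NXDOMAIN after the given latency.
    NxDomain(Duration),
    /// Never answers; the client gives up after its timeout.
    Silent,
}

/// Hypothetical stand-in for a name server entry. Only the flag name
/// mirrors the `trust_negative_responses` setting discussed above.
#[derive(Clone, Copy, Debug)]
struct Server {
    reply: Reply,
    trust_negative_responses: bool,
}

/// Walks the servers in order and returns the simulated time until the
/// caller learns that the name does not exist.
fn time_to_nxdomain(servers: &[Server], timeout: Duration) -> Duration {
    let mut elapsed = Duration::ZERO;
    for server in servers {
        match server.reply {
            Reply::NxDomain(latency) => {
                elapsed += latency;
                if server.trust_negative_responses {
                    // A trusted negative answer ends the lookup immediately.
                    return elapsed;
                }
                // Untrusted negative answer: keep going and ask the next server.
            }
            // An unreachable server costs the full timeout.
            Reply::Silent => elapsed += timeout,
        }
    }
    elapsed
}

fn main() {
    let timeout = Duration::from_secs(5); // assumed per-request timeout
    let fast = Reply::NxDomain(Duration::from_millis(13));
    for trust in [true, false] {
        let servers = [
            Server { reply: fast, trust_negative_responses: trust },
            Server { reply: Reply::Silent, trust_negative_responses: trust },
        ];
        println!("trust = {trust}: {:?}", time_to_nxdomain(&servers, timeout));
    }
}
```

In this model the trusted case finishes as soon as the first server answers, while the untrusted case additionally pays the full timeout of the unreachable second server, which matches the shape of the delay reported here.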


@djc commented on GitHub (Jul 3, 2024):

See #1861 for discussion -- I forget the deep context, maybe @bluejekyll remembers.


@mhils commented on GitHub (Jul 3, 2024):

Thanks @djc!

Doing some archeology here, #1212 moved the trust bit from ResolverOpts to NameServerConfig. This is how trust_nx_responses: false was introduced to system_conf. #1861 then renamed it. I can't find any artifacts describing why NXDOMAIN is treated as untrustworthy by default though. :)


@djc commented on GitHub (Jul 3, 2024):

Yeah, I think that's deep lore -- I think @bluejekyll must have explained at some point in the past, but not sure where...

There was also https://github.com/hickory-dns/hickory-dns/pull/1556.


@bluejekyll commented on GitHub (Aug 11, 2024):

For reference, the untrusted NXDOMAIN response is due to misconfigured DNS at various companies where the internal DNS ends up responding as authoritative for domains it technically does not manage. I had originally wanted to just chalk that up to "your company has a badly configured DNS", but that meant that in those situations users didn't have an option to make the resolver work in that context.
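
The trade-off described in this comment can be sketched with a small simulation. Everything here is hypothetical and only illustrates the idea: the `Server` struct, the `lookup` and `servers` functions, the name `www.example.com`, and the documentation address 192.0.2.1 are made up for the example, and the internal server is modelled simply as one that has no record for the name and therefore answers negatively.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

/// A simulated name server: the names it knows, plus whether the client
/// accepts its "no such name" answers as final. All hypothetical.
struct Server {
    records: HashMap<&'static str, Ipv4Addr>,
    trust_negative_responses: bool,
}

/// Returns the first address found. A negative answer is final only when
/// it comes from a trusted server.
fn lookup(servers: &[Server], name: &str) -> Option<Ipv4Addr> {
    for server in servers {
        match server.records.get(name) {
            Some(addr) => return Some(*addr),
            None if server.trust_negative_responses => return None,
            None => continue, // untrusted negative answer: try the next server
        }
    }
    None
}

/// Builds the two-server setup: an internal server that wrongly claims the
/// name does not exist, followed by a server that has the record.
fn servers(trust: bool) -> Vec<Server> {
    let internal = Server {
        records: HashMap::new(),
        trust_negative_responses: trust,
    };
    let public = Server {
        records: HashMap::from([("www.example.com", Ipv4Addr::new(192, 0, 2, 1))]),
        trust_negative_responses: trust,
    };
    vec![internal, public]
}

fn main() {
    println!("trusted:   {:?}", lookup(&servers(true), "www.example.com"));
    println!("untrusted: {:?}", lookup(&servers(false), "www.example.com"));
}
```

With trusted negative responses the lookup stops at the internal server and fails, even though the second server could have answered. With untrusted negative responses the lookup falls through and succeeds, at the cost of extra round trips (or timeouts) for names that really do not exist.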
