mirror of
https://github.com/hickory-dns/hickory-dns.git
synced 2026-04-25 19:25:56 +03:00
[GH-ISSUE #1085] Can't resolve internal host name via IPv4 #602
Originally created by @svenstaro on GitHub (Apr 27, 2020).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/1085
Describe the bug
Carried over from https://github.com/hatoo/oha/issues/56. Basically I've got an internally resolvable host called `internal.host` (IPv4-only) that I need to use an internal DNS server for. `nslookup` and `host` work just fine and can resolve the name. `trust-dns` can't; I get an error instead.
To Reproduce
https://github.com/hatoo/oha-56 was made by the author of oha to illustrate the problem.
Expected behavior
The host should resolve just fine, like it does with `nslookup` or `host`.
System:
Version:
Crate: trust-dns-resolver
Version: 0.19.3
Additional context
See https://github.com/hatoo/oha-56 and https://github.com/hatoo/oha/issues/56
@bluejekyll commented on GitHub (Apr 27, 2020):
Can you share your /etc/resolv.conf (feel free to remove the specific IPs and/or names, but we need to understand the upstream resolver configuration, internal vs. external for example)? I'm guessing you have an internal DNS service.
I think what's possibly going on is that trust-dns is recognizing an upstream response from a public name service as being authoritative for `.host`, which is a real domain. This represents a bad name choice for internal names, combined with trust-dns-resolver's dynamic usage of any nameservers in any order (which differs from glibc, which always queries resolvers in order).
I've been considering loosening the current usage of this configuration option: https://docs.rs/trust-dns-resolver/0.19.4/trust_dns_resolver/config/struct.ResolverOpts.html#structfield.distrust_nx_responses, to include distrusting all NX and NoError responses as well.
@svenstaro commented on GitHub (Apr 27, 2020):
Alright I'll try to share as much as I can. So:
My computer's network is managed by systemd-networkd and as the first resolver I use systemd-resolved. However, I also use dnsmasq after systemd-resolved to do some weirder stuff where I choose specific upstream DNS servers depending on the host. I'll illustrate this using actual config.
10.13.37.1 is my local network's router and my primary gateway. In this scenario, I'll try to reach
`enterprise.wtf`, with `.wtf` being a real TLD (though obviously this is not the real host :P). `enterprise.wtf` resolves fine using `nslookup`, `host`, etc. but not using trust-dns.
So yes, the internal host I'm trying to reach does use an actual TLD, just as you suspected. And yes, this kind of sucks on the enterprise's part. However, this is nothing I have control over, and I'd like my Rust stuff to work even in this kind of situation. Obviously I can't expect any global authority to validate my internal host's validity, but that's fine for my case.
@bluejekyll commented on GitHub (Apr 27, 2020):
For reference, the only reserved DNS names are these: https://tools.ietf.org/html/rfc6761
See this thread for good understanding of why using non-reserved DNS names is dangerous: https://serverfault.com/questions/17255/top-level-domain-domain-suffix-for-private-network
I'd highly recommend encouraging your enterprise to stop using a non-registered domain. This is a security vulnerability waiting to happen.
As to fixing this: if you're interested in testing, I think it could be done by expanding this check to include NXDomain and NoError responses:
github.com/bluejekyll/trust-dns@76a3776d88/crates/resolver/src/name_server/name_server.rs (L141-L147)
If that works, I was already considering expanding this for reasons like this.
@svenstaro commented on GitHub (Apr 27, 2020):
Yeah... I believe I'll need a few more years of convincing before making progress on that front. The implications are bad. :)
I did some hacking but I don't really have any overview of this code base and barely have any idea what I'm doing. I did this:
But I'm not sure I get the result of this:
Probably not how you wanted me to hack this. :)
@bluejekyll commented on GitHub (Apr 27, 2020):
That's the change I was expecting :)
That's probably correct: it will now treat these as "failures" rather than (accurately) as "NxDomain", which is technically a successful response. That will trigger the logic to continue trying to resolve the name.
I see that you're still getting an error though, is it not finding the name you're looking for?
@svenstaro commented on GitHub (Apr 27, 2020):
Indeed, it still doesn't appear to resolve it. The code is the same as linked above: https://github.com/hatoo/oha-56
I put in `google.com` just for shits and that outputs the IP just fine.
drill is perfectly happy with my host:
@bluejekyll commented on GitHub (Apr 27, 2020):
Oh, sorry, I realize I've misled you on the "NoError" state... That has to incorporate one more check:
`ResponseCode::NoError if response.answers().is_empty() => ...`
I bet if you run `cargo test` on your current changes, a lot would be broken :)
@svenstaro commented on GitHub (Apr 27, 2020):
True, without your changes, a lot of tests are broken and they are happy if I add that condition. Sadly, it doesn't change anything in my particular case.
Just to confirm, my code is now
and the output is the same as in https://github.com/bluejekyll/trust-dns/issues/1085#issuecomment-620160050.
Judging by the output, my case never hits that match arm anyway.
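Pulling together the checks discussed above (NxDomain, plus NoError with an empty answer section), the decision can be modeled in isolation. This is a stdlib-only sketch with stand-in types; the real code in name_server.rs is shaped differently:

```rust
// Stand-in types for illustration only, not the actual trust-dns definitions.
enum ResponseCode {
    NoError,
    NxDomain,
    ServFail,
}

struct Response {
    code: ResponseCode,
    answers: Vec<String>,
}

// The expansion being discussed: besides hard failures like ServFail, also
// treat NxDomain, and NoError with an empty answer section, as "try the
// next nameserver" instead of as a final result.
fn should_try_next_nameserver(response: &Response) -> bool {
    match response.code {
        ResponseCode::ServFail => true,
        ResponseCode::NxDomain => true,
        ResponseCode::NoError if response.answers.is_empty() => true,
        _ => false,
    }
}

fn main() {
    let nx = Response { code: ResponseCode::NxDomain, answers: vec![] };
    let empty = Response { code: ResponseCode::NoError, answers: vec![] };
    let answered = Response {
        code: ResponseCode::NoError,
        answers: vec!["10.13.37.2".into()], // illustrative address
    };
    assert!(should_try_next_nameserver(&nx));
    assert!(should_try_next_nameserver(&empty));
    assert!(!should_try_next_nameserver(&answered));
    println!("ok");
}
```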
@bluejekyll commented on GitHub (Apr 27, 2020):
Hm, that would imply we're never hitting your internal DNS server. Can you try configuring https://docs.rs/trust-dns-resolver/0.19.4/trust_dns_resolver/config/struct.ResolverOpts.html#structfield.num_concurrent_reqs to 1? I'm wondering if you're running into another existing bug: #933
@svenstaro commented on GitHub (Apr 27, 2020):
I now have
and that didn't seem to change the behavior at all.
@bluejekyll commented on GitHub (Apr 27, 2020):
We need to clean up these APIs... ugh.
So the issue with your new configuration is that the default ResolverConfig only uses the public Google DNS resolvers. You'll want to use this function to get your system's config: https://docs.rs/trust-dns-resolver/0.19.4/trust_dns_resolver/system_conf/fn.read_system_conf.html
That should be easier to find. Then you can change any of the ResolverOpts as necessary, but it will start with the opts as read from the resolv.conf. At that point you should get your internal DNS resolvers included in the lookup list.
@svenstaro commented on GitHub (Apr 27, 2020):
Oh gee, it turns out that we simply used that wrong then. Now when using `read_system_conf()`, everything just works without code changes in `trust_dns_resolver`!
However, I do run into #933 if I don't limit the concurrent requests to 1.
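The working setup described here can be sketched roughly as follows against the 0.19-era API. This is a hypothetical usage fragment, not code from the thread; the host name is the reporter's placeholder and the error handling is simplified:

```rust
use trust_dns_resolver::system_conf::read_system_conf;
use trust_dns_resolver::Resolver;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Read nameservers and options from /etc/resolv.conf instead of using
    // ResolverConfig::default(), which only knows the public Google resolvers.
    let (config, mut opts) = read_system_conf()?;

    // Workaround for #933: query one nameserver at a time.
    opts.num_concurrent_reqs = 1;

    let resolver = Resolver::new(config, opts)?;
    let response = resolver.lookup_ip("enterprise.wtf")?;
    for addr in response.iter() {
        println!("{}", addr);
    }
    Ok(())
}
```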
@svenstaro commented on GitHub (Apr 27, 2020):
All in all, I'm happy we quickly found the problem thanks to your help. It resulted in https://github.com/hatoo/oha/pull/59 so at least we made the world a slightly better place.
Keep rocking.
@bluejekyll commented on GitHub (Apr 27, 2020):
That's great news. Is the concurrent lookup issue a bug with the code that you were experimenting with before? (the changes to error detection you made to name_server.rs)
@svenstaro commented on GitHub (Apr 27, 2020):
I reverted my changes and then found out that I hit #933 and then I added the workaround and the spurious problems went away. Not sure what else I'm supposed to be checking.
@bluejekyll commented on GitHub (Apr 27, 2020):
I was wondering if you could help debug #933, if you have time. I'm wondering if this:
fixes that. If you have some time to check, it would help debug that other issue, with the concurrency set to something more than 1...