mirror of
https://github.com/hickory-dns/hickory-dns.git
synced 2026-04-25 11:15:54 +03:00
[GH-ISSUE #2788] recursor fails when cname is returned from an intermediate NS request #1062
Originally created by @DirectXMan12 on GitHub (Feb 19, 2025).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2788
Describe the bug
a CNAME response to an NS query seems to throw the recursor for a loop -- it (correctly?) skips the CNAME, but then goes "oh, i got an empty list of servers, oops idk what to do" instead of trying a longer prefix. this seems to be... backwards? from what a couple of other recursors do.
for instance, suppose we're trying to resolve `docs.redhat.com.edgekey.net` (skipping root querying):

this seems to be what hickory does now:

1. ask `net` for `edgekey.net`, get back a list of responses
2. ask `$ONE_OF_THOSE` for `com.edgekey.net`, get back 0 nameservers and 1 CNAME
3. fail with `no connections available` (afaict? i didn't dig too deep into the code, so this is my guess as to why this error message is being shown)

other resolvers [1] seem to do:

1. ask `net` for `docs.redhat.com.edgekey.net`, get back an authority section pointing to `edgekey.net`
2. ask `$ONE_OF_THOSE` for `docs.redhat.com.edgekey.net`, get back the actual response.

To Reproduce
configure hickory as a recursor:
then try querying for `docs.redhat.com` (or `docs.redhat.com.edgekey.net` to skip the intervening CNAME).
Expected behavior
We get back a valid CNAME (in the above case, pointing to somewhere on akamai's network).
System:
Version:
Crate: recursor (via the binary)
Version: 0.25.0-alpha.5
Additional context
it kinda? seems like maybe edgekey shouldn't be returning those CNAMEs on the NS record responses if i had to guess, but also it kinda seems like hickory's deviation from other resolvers is also causing issues.
[1] i tried `dig +trace`, then to confirm i found a random dns tracing tool online at https://simpledns.plus/lookup-dg and also "normal" public resolvers seem to handle this fine (e.g. `8.8.8.8`).
@djc commented on GitHub (Feb 19, 2025):
@divergentdave this should probably be part of #2725?
@divergentdave commented on GitHub (Feb 19, 2025):
The reason Hickory DNS is querying for only `com.edgekey.net` is that it is doing "QNAME minimization". This has better privacy properties, but may require more queries when zone cuts are more than one label apart. Other resolvers also support QNAME minimization, though it is often controlled by a configuration parameter.

The correct behavior here is described in step 6c of the algorithm in RFC 9156. When we get the NOERROR response with a CNAME, since the iterative query was not for the final QNAME, we need to add another label and make another query to the same server.

The fact that we can get both `docs.redhat.com.edgekey.net. IN CNAME e21727.dsca.akamaiedge.net.` and `com.edgekey.net. IN CNAME e19.b.akamaiedge.net.` from the same authoritative server seems weird to me, but oh well.

Agreed, will add.
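For readers unfamiliar with RFC 9156, the label-by-label walk described above can be sketched roughly as follows. This is a simplified illustration, not Hickory DNS code: the `Step` enum, the `minimize` function, and the `query` callback are all hypothetical names, caching and DNAME handling are left out, and a referral is assumed to place the zone cut exactly at the name that was asked about.

```rust
/// Simplified outcome of one minimized query (hypothetical type, not a Hickory DNS API).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Step {
    /// The server returned a referral; assumed here to mean a zone cut at the queried name.
    Referral,
    /// Any other NOERROR answer, including NODATA and CNAME (RFC 9156 step 6c).
    NoErrorOther,
    /// The queried name does not exist.
    NxDomain,
}

/// Number of labels in a domain name; the root "." has zero.
fn label_count(name: &str) -> usize {
    name.trim_end_matches('.').split('.').filter(|l| !l.is_empty()).count()
}

/// The last `n` labels of `qname`, written as a fully qualified name.
fn suffix(qname: &str, n: usize) -> String {
    let labels: Vec<&str> = qname
        .trim_end_matches('.')
        .split('.')
        .filter(|l| !l.is_empty())
        .collect();
    let start = labels.len().saturating_sub(n);
    format!("{}.", labels[start..].join("."))
}

/// Walks from `ancestor` toward `qname`, adding one label per query.
/// `query(zone, child)` is a hypothetical callback that asks the name servers
/// of `zone` about `child` and reports what kind of response came back.
/// Returns every (zone asked, name asked) pair; the last pair is the full query.
fn minimize<F>(qname: &str, ancestor: &str, mut query: F) -> Result<Vec<(String, String)>, String>
where
    F: FnMut(&str, &str) -> Step,
{
    let total = label_count(qname);
    let mut n = label_count(ancestor);
    let mut zone = suffix(qname, n);
    let mut sent = Vec::new();
    // Keep prepending labels until the next name would be the full QNAME.
    while n + 1 < total {
        n += 1;
        let child = suffix(qname, n);
        sent.push((zone.clone(), child.clone()));
        match query(&zone, &child) {
            // Referral: descend into the delegated zone.
            Step::Referral => zone = child,
            // Step 6c: whatever the answer type (even a CNAME), stay on the
            // same servers and try again with one more label.
            Step::NoErrorOther => {}
            Step::NxDomain => return Err(format!("{} does not exist", child)),
        }
    }
    // The final query for the full QNAME goes to the deepest zone found.
    sent.push((zone, suffix(qname, total)));
    Ok(sent)
}
```

The important arm is `Step::NoErrorOther`: a CNAME at an intermediate name does not change which servers are asked next, it only means no zone cut was found at that name.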
@DirectXMan12 commented on GitHub (Feb 26, 2025):
aah, cool, i did not know about qname minimization, will definitely add rfc 9156 to my backlog of reading material, that's neat :-D.
@bluejekyll commented on GitHub (Mar 2, 2025):
@divergentdave what's your opinion on this? offer an option to disable QNAME minimization?
@divergentdave commented on GitHub (Mar 2, 2025):
That's an option, but first and foremost we should fix the recursor so that it follows RFC 9156's algorithm.
@divergentdave commented on GitHub (Apr 8, 2025):
I wrote up a test to reproduce this, and I was surprised to see that resolution succeeds if there is no CNAME at the name in between zone cuts. However, this seems like it only happens by accident, and it is fragile in how it depends on authoritative name server responses.
Here's what happens when looking up `www.b.a.testing. IN A` in the test. There are separate name servers for the `.`, `testing.`, and `b.a.testing.` zones.

Zone cuts are two labels apart, no CNAME record in the middle

1. The name servers for `testing.` are located
2. A query is sent to the `testing.` zone for `a.testing. IN NS`
3. The response is turned into `ProtoErrorKind::NoRecordsFound`, because its answer section is empty
4. `RecursorDnsHandle::lookup()` logs the error and propagates it up
5. `RecursorDnsHandle::ns_pool_for_zone()` logs "ns for a.testing forwarded to testing. via SOA record"
6. The name servers for the `testing.` zone are used as the name servers for `a.testing.` as well, though `a.testing.` is not really its own zone

This effectively achieves QNAME minimization with a different algorithm, but it's misleading that we refer to `a.testing.` as being a zone in the process. This only works if the authoritative server includes an SOA record in its response, which could introduce another compatibility issue. There is also an extra entry in `name_server_cache` under `a.testing.`, and extra connections to the authoritative servers.

Zone cuts are two labels apart, with CNAME record in the middle

1. The name servers for `testing.` are located
2. A query is sent to the `testing.` zone for `a.testing. IN NS`
3. `ProtoError::from_response()` calls `DnsResponse::contains_answer()`, which, for this query type, only requires that the answers section be non-empty. Thus, `ProtoError::from_response()` returns the response inside `Ok(...)`.
4. `RecursorDnsHandle::lookup()` caches the response and returns it
5. `RecursorDnsHandle::ns_pool_for_zone()` logs "response is not NS ...; skipping" as it ignores the CNAME record
6. The query for `b.a.testing. IN NS` fails with "no connections available" (as discussed in previous comments)
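One way to make the two cases above behave the same would be to decide the next hop only from whether the response contains NS records owned by the queried name, rather than from the presence of an SOA record or an empty answer section. The sketch below illustrates that idea with made-up types (`Kind`, `Record`, `Response`, `NextHop`, `next_hop`); none of these are Hickory DNS APIs, and real responses carry more information than is modeled here.

```rust
/// Record types relevant to this decision (hypothetical, heavily simplified).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Ns,
    Soa,
    Cname,
}

/// A record reduced to its owner name and type.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    owner: String,
    kind: Kind,
}

/// A NOERROR response to an intermediate `<name> IN NS` query.
#[derive(Debug, Default)]
struct Response {
    answers: Vec<Record>,
    authorities: Vec<Record>,
}

/// Where the next, longer query should be sent.
#[derive(Debug, PartialEq)]
enum NextHop {
    /// The queried name has its own NS records: use those name servers.
    Delegated,
    /// No zone cut at this name: keep using the current zone's name servers.
    SameServers,
}

/// Convenience constructor for the examples.
fn rec(owner: &str, kind: Kind) -> Record {
    Record { owner: owner.to_string(), kind }
}

/// Chooses the next hop from NS records owned by `qname` only.
/// A NODATA response (with or without SOA) and a CNAME answer both
/// fall through to `SameServers`.
fn next_hop(qname: &str, resp: &Response) -> NextHop {
    let delegated = resp
        .answers
        .iter()
        .chain(resp.authorities.iter())
        .any(|r| r.kind == Kind::Ns && r.owner.eq_ignore_ascii_case(qname));
    if delegated {
        NextHop::Delegated
    } else {
        NextHop::SameServers
    }
}
```

Under this rule `a.testing.` is never recorded as a zone, and the outcome does not depend on whether the authoritative server includes an SOA record in its response.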