[GH-ISSUE #2889] DNSSEC validation failure on legacy.research.icann.org #1081

Closed
opened 2026-03-16 01:33:58 +03:00 by kerem · 2 comments

Originally created by @divergentdave on GitHub (Mar 27, 2025).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2889

I tried looking up SOA legacy.research.icann.org. with a validating recursive resolver, and got a SERVFAIL response. This zone appears to be properly signed according to dnsviz (https://dnsviz.net/d/legacy.research.icann.org/dnssec/). All the zone cuts are one label apart, and the DNSKEY and DS records use typical algorithms. I did notice that the legacy.research.icann.org. zone only has one DNSKEY, rather than a separate KSK and ZSK. Otherwise, the zones look pretty bog-standard to me. I can get DS legacy.research.icann.org. and SOA research.icann.org. successfully. After some testing, it looks like this is a generic problem with DNSSEC validation this many labels deep. I suspect that the recursive call near the end of find_ds_records() may be part of the problem.

kerem closed this issue 2026-03-16 01:34:03 +03:00

@divergentdave commented on GitHub (Apr 15, 2025):

The find_ds_records() function is used for two different purposes: to fetch a DS RRset in order to validate DNSKEY RRs, and to decide whether a response with no RRSIGs is insecure or just bogus.

In the first situation, we already have some RRSIGs, which include a signer name. This ought to tell us where the zone cut is (though we should first check that the signer name is an ancestor of the RRSIG's owner name).
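
The signer-name check mentioned in parentheses can be sketched as a label-wise suffix comparison. This is only an illustration on plain strings: `labels` and `signer_covers_owner` are hypothetical helpers, not hickory-dns functions, and escaped dots in labels are not handled. The sketch also accepts a signer name equal to the owner name, since records at a zone apex are signed by that same zone.

```rust
/// Split a presentation-format DNS name into lowercase labels,
/// ignoring the trailing root dot. Simplified: no escape handling.
fn labels(name: &str) -> Vec<String> {
    name.trim_end_matches('.')
        .split('.')
        .filter(|label| !label.is_empty())
        .map(|label| label.to_ascii_lowercase())
        .collect()
}

/// Returns true if `signer` is `owner` itself or an ancestor of `owner`,
/// i.e. the signer's labels are a suffix of the owner's labels.
/// (Hypothetical helper for illustration only.)
fn signer_covers_owner(signer: &str, owner: &str) -> bool {
    let signer = labels(signer);
    let owner = labels(owner);
    signer.len() <= owner.len() && owner[owner.len() - signer.len()..] == signer[..]
}
```

With this sketch, `signer_covers_owner("research.icann.org.", "legacy.research.icann.org.")` is true, while swapping the two arguments gives false.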

In the latter situation, we need to find one or more zone cuts, and get either an authenticated DS RRset or a proof of nonexistence of DS RRs (though we discard the DS RRset if we get one, as the callsite only makes use of the error). We do so by requesting the DS RRset of each successive ancestor name, using recursive calls.
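
For comparison, walking the ancestor names does not strictly require recursion. The sketch below does it with a plain loop over names; `self_and_ancestors` and `nearest_name_with_ds` are hypothetical names, and the `has_ds` closure stands in for whatever DS lookup and proof handling the real code performs, which is not modeled here.

```rust
/// The name itself followed by each successive ancestor, ending with the root.
/// Simplified: works on presentation-format strings without escape handling.
fn self_and_ancestors(name: &str) -> Vec<String> {
    let labels: Vec<&str> = name
        .trim_end_matches('.')
        .split('.')
        .filter(|label| !label.is_empty())
        .collect();
    let mut names: Vec<String> = (0..labels.len())
        .map(|i| format!("{}.", labels[i..].join(".")))
        .collect();
    names.push(".".to_string());
    names
}

/// Walks from `name` toward the root with a loop (no recursion) and returns
/// the first name for which the caller-supplied `has_ds` check succeeds.
/// `has_ds` is a stand-in for a real DS query; it is an assumption of this sketch.
fn nearest_name_with_ds<F>(name: &str, mut has_ds: F) -> Option<String>
where
    F: FnMut(&str) -> bool,
{
    self_and_ancestors(name)
        .into_iter()
        .find(|candidate| has_ds(candidate.as_str()))
}
```

Because the loop visits one name per label, the amount of work is bounded by the number of labels in the query name rather than by a call-depth counter.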

If we split this function up, we could eliminate the recursion in the first case, which would make this easier to reason about. This might help with #2812 as well.


@divergentdave commented on GitHub (Apr 17, 2025):

I added some logging, and the initial error is due to a depth limit in DnssecDnsHandle.

  2025-04-17T20:56:53.477605Z  INFO hickory_proto::dnssec::dnssec_dns_handle: got response to DS query, ds_message: Err(ProtoError { kind: Message("exceeded max validation depth") })
    at crates/proto/src/dnssec/dnssec_dns_handle/mod.rs:818
    in hickory_proto::dnssec::dnssec_dns_handle::fetch_ds_records with zone: Name("testing.")
...

Here's where the request depth gets incremented:

  • 1: after start of initial request
  • 2: inside verify_response(), then verify_rrsets()
  • 3: inside verify_rrset()
  • 4: inside verify_default_rrset()
  • 5: inside send() for foo.bar.hickory-dns.testing. DNSKEY query
  • 6: inside verify_response() and verify_rrsets() again, but this time for the DNSKEY query
  • 7: inside verify_rrset()
  • 8: inside send() for foo.bar.hickory-dns.testing. DS query
  • 9: verify_response() and verify_rrsets() for the DS query
  • 10: verify_rrset()
  • 11: verify_default_rrset()
  • 12: send() for bar.hickory-dns.testing. DNSKEY
  • 13: verify_response() and verify_rrsets()
  • 14: verify_rrset()
  • 15: send() for bar.hickory-dns.testing. DS
  • 16: verify_response() and verify_rrsets()
  • 17: verify_rrset()
  • 18: verify_default_rrset()
  • 19: send() for hickory-dns.testing. DNSKEY
  • 20: verify_response() and verify_rrsets()
  • 21: verify_rrset()
  • 22: send() for hickory-dns.testing. DS
  • 23: verify_response() and verify_rrsets()
  • 24: verify_rrset()
  • 25: verify_default_rrset()
  • 26: send() for testing. DNSKEY
  • 27: verify_response() and verify_rrsets()
  • 28: verify_rrset()

After this, send() gets called again with the query testing. DS, and the maximum depth check returns an error. The default maximum depth is 26.

We can increase this default limit and/or reduce the number of places where we bump the depth counter. I think we only really need to increment the depth when making a request, as a safety measure to cap the number of outgoing requests. In other places, we can do a plain clone of the handle, or move it.
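
A minimal sketch of that idea, assuming a simplified handle type: `Handle`, `for_request`, and `depth_after_chain` are hypothetical and do not reflect the actual `DnssecDnsHandle` API. Only the request path bumps the counter, while validation steps use a plain clone. The model assumes two requests per zone level (DNSKEY, then DS), as in the trace above.

```rust
/// Simplified stand-in for a validating handle that carries a depth counter.
#[derive(Clone, Debug)]
struct Handle {
    depth: usize,
    max_depth: usize,
}

impl Handle {
    fn new(max_depth: usize) -> Self {
        Self { depth: 0, max_depth }
    }

    /// Derive a handle for an outgoing request. In this sketch this is the
    /// only place where the depth is incremented and checked.
    fn for_request(&self) -> Result<Self, String> {
        if self.depth >= self.max_depth {
            return Err("exceeded max validation depth".to_string());
        }
        Ok(Self { depth: self.depth + 1, max_depth: self.max_depth })
    }
}

/// Model a validation chain that issues one DNSKEY and one DS request per
/// zone level, and return the depth reached at the end.
fn depth_after_chain(zone_levels: usize, max_depth: usize) -> Result<usize, String> {
    let mut handle = Handle::new(max_depth);
    for _ in 0..zone_levels {
        // Validation steps take a plain clone: no depth increment.
        let verifying = handle.clone();
        // DNSKEY request for this zone level.
        handle = verifying.for_request()?;
        // DS request for this zone level.
        handle = handle.clone().for_request()?;
    }
    Ok(handle.depth)
}
```

In this model a four-level name such as foo.bar.hickory-dns.testing. reaches a depth of 8, since only the eight requests count, so the counter tracks outgoing requests rather than internal call nesting.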
