[GH-ISSUE #2503] [resolver] (0.25.0-alpha.2) DnsSec validate enabled should also use ProtoErrorKind::NoRecordsFound and skip validation #1006

Closed
opened 2026-03-16 01:14:39 +03:00 by kerem · 6 comments

Originally created by @kolbma on GitHub (Oct 9, 2024).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2503

When looking up record types for which no RR exists, the resolver should return `ProtoErrorKind::NoRecordsFound`.
This is the case if `ResolverOpts::validate == false`.

But if `ResolverOpts::validate == true`, the result is currently `ProtoErrorKind::Message` with "could not validate negative response missing SOA".
Why does it try to validate anything if there is no existing RR to validate?

So, for example, if you look up NS for a nonexistent domain, you get the error "could not validate negative response missing SOA".
Does this make any sense?
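
To make the reported difference concrete, here is a minimal sketch of why it matters to calling code. `LookupErrorKind` and `is_negative_answer` are hypothetical stand-ins written for this illustration; they are not the real `ProtoErrorKind` type, which has more variants and carries additional data. Only the two variant names and the message string come from the report above.

```rust
// Stand-in for the two error kinds named in this issue.
// NOT the real ProtoErrorKind from hickory; purely illustrative.
#[derive(Debug, PartialEq)]
enum LookupErrorKind {
    NoRecordsFound,
    Message(&'static str),
}

/// Hypothetical helper: true when the error plainly means
/// "the name or record type does not exist".
fn is_negative_answer(kind: &LookupErrorKind) -> bool {
    matches!(kind, LookupErrorKind::NoRecordsFound)
}

fn main() {
    // Reported outcome with validate == false.
    let without_validation = LookupErrorKind::NoRecordsFound;
    // Reported outcome with validate == true on 0.25.0-alpha.2.
    let with_validation =
        LookupErrorKind::Message("could not validate negative response missing SOA");

    println!("{}", is_negative_answer(&without_validation));
    println!("{}", is_negative_answer(&with_validation));
}
```

A caller that treats only the first kind as "no such record" would handle the second as a generic failure, which is the inconsistency being reported.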

kerem closed this issue 2026-03-16 01:14:45 +03:00

@marcus0x62 commented on GitHub (Oct 10, 2024):

This overall problem is complicated, and something I'm working on in #2502, but the short answer is we can't just skip validation because of a missing record in a zone. If Hickory is configured as a validating name server and the parent zone for a given query has DS records for the sub zone being queried, then things like missing SOA records in the sub zone will (and should) trigger validation failures for some queries.

There is a bug where SOA (and associated DNSSEC records) are not being propagated correctly during error type conversion, which is more than likely what you are seeing - Hickory probably is getting an SOA record for the query you are attempting, but we aren't passing it correctly through the entire resolution process. We'd need a packet capture (or debug logs) to be sure.
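
The kind of bug described here can be sketched roughly as follows. All types and function names (`NegativeResponse`, `InnerError`, `OuterError`, `convert_lossy`, `convert_preserving`, `validate_negative`) are hypothetical and invented for this illustration; they are not Hickory's actual error types or conversion code. The sketch only shows how a conversion between error types can drop the SOA and proof records that a later validation step needs.

```rust
// Hypothetical model of a negative response carried inside an error value.
#[derive(Debug, Clone, PartialEq)]
struct NegativeResponse {
    soa: Option<String>, // SOA from the authority section
    proofs: Vec<String>, // NSEC/NSEC3/RRSIG records
}

#[derive(Debug, PartialEq)]
enum InnerError {
    NoRecords(NegativeResponse),
}

#[derive(Debug, PartialEq)]
enum OuterError {
    NoRecords(NegativeResponse),
}

/// Lossy conversion: keeps the error kind but discards the records.
fn convert_lossy(err: InnerError) -> OuterError {
    match err {
        InnerError::NoRecords(_) => OuterError::NoRecords(NegativeResponse {
            soa: None,
            proofs: Vec::new(),
        }),
    }
}

/// Preserving conversion: passes the records through unchanged.
fn convert_preserving(err: InnerError) -> OuterError {
    match err {
        InnerError::NoRecords(resp) => OuterError::NoRecords(resp),
    }
}

/// Toy validation step that only checks whether the SOA survived.
fn validate_negative(err: &OuterError) -> Result<(), &'static str> {
    match err {
        OuterError::NoRecords(resp) if resp.soa.is_some() => Ok(()),
        OuterError::NoRecords(_) => Err("could not validate negative response missing SOA"),
    }
}

fn sample() -> InnerError {
    InnerError::NoRecords(NegativeResponse {
        soa: Some("com. SOA ...".to_string()),
        proofs: vec!["NSEC3 ...".to_string(), "RRSIG ...".to_string()],
    })
}

fn main() {
    println!("{:?}", validate_negative(&convert_lossy(sample())));
    println!("{:?}", validate_negative(&convert_preserving(sample())));
}
```

In this model the upstream server did send an SOA, but the lossy path reports it as missing, which matches the symptom in the original report.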


@kolbma commented on GitHub (Oct 10, 2024):

Well, I've tested it with an unregistered domain (NXDOMAIN), and then you get this crude "could not validate negative response missing SOA" error.
The SOA is from the NIC, and the NIC's RRs validate as secure.
But what is there to validate for the unregistered domain itself?

delv returns with

;; resolution failed: ncache nxdomain
; negative response, fully validated

@marcus0x62 commented on GitHub (Oct 10, 2024):

In the case of a query against an unregistered domain with a parent zone that is signed and Hickory as a validating recursor, the process should go, approximately, like this:

  • Hickory receives a query for 'unregistered.com'
  • Hickory looks up the com nameservers based on the configured roots.
  • Hickory asks for the nameservers for unregistered.com from the com nameservers.
  • Com nameservers return NXDomain with the SOA for com. and several NSEC/NSEC3 and RRSIG records to prove unregistered.com does not exist.
  • Hickory validates the returned records against the DNSKEYs for com.
  • Hickory requests the DS records for com from the root servers and verifies that those match the DNSKEY from the previous request.
  • Hickory verifies the root servers' keys against the trust anchor.

More generally, for any negative response as a validating resolver, we need to validate either that the non-existence proofs we have are valid, or that the lack of non-existence proofs is valid before returning a response to the client. We can't skip validation in these scenarios.
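
The chain-of-trust part of the steps above can be sketched as a toy model. Everything here is an assumption made for illustration: the `Zone` struct, the `Proof` enum, `validate_chain`, and the use of plain integers as stand-ins for DNSKEY and DS records. It does not model signature verification or the NSEC/NSEC3 coverage check that proves the name is absent, and it is not how Hickory implements validation.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Proof {
    Secure,
    Bogus(String),
}

// Toy zone: integers stand in for DNSKEY and DS records.
struct Zone {
    parent: Option<&'static str>,
    dnskey: u32,
    ds_in_parent: Option<u32>, // DS the parent publishes for this zone
}

/// Walks from the zone that signed the negative response up to the root.
fn validate_chain(
    zones: &HashMap<&'static str, Zone>,
    signer: &'static str,
    signing_key: u32,
    trust_anchor: u32,
) -> Proof {
    // The records must be signed by the signer zone's DNSKEY.
    match zones.get(signer) {
        Some(zone) if zone.dnskey == signing_key => {}
        _ => return Proof::Bogus(format!("signature does not match DNSKEY of {}", signer)),
    }
    let mut name = signer;
    loop {
        let zone = match zones.get(name) {
            Some(zone) => zone,
            None => return Proof::Bogus(format!("unknown zone {}", name)),
        };
        match zone.parent {
            // Non-root zone: the parent's DS must match this zone's DNSKEY.
            Some(parent) => {
                if zone.ds_in_parent != Some(zone.dnskey) {
                    return Proof::Bogus(format!("DS mismatch for {}", name));
                }
                name = parent;
            }
            // Root zone: its key must match the configured trust anchor.
            None => {
                return if zone.dnskey == trust_anchor {
                    Proof::Secure
                } else {
                    Proof::Bogus("root key does not match trust anchor".to_string())
                };
            }
        }
    }
}

fn example_zones() -> HashMap<&'static str, Zone> {
    let mut zones = HashMap::new();
    zones.insert(".", Zone { parent: None, dnskey: 1, ds_in_parent: None });
    zones.insert("com.", Zone { parent: Some("."), dnskey: 2, ds_in_parent: Some(2) });
    zones
}

fn main() {
    let zones = example_zones();
    // NXDomain for unregistered.com, signed by com. with key 2, anchor 1.
    println!("{:?}", validate_chain(&zones, "com.", 2, 1));
    // Same response with a trust anchor that does not match the root key.
    println!("{:?}", validate_chain(&zones, "com.", 2, 99));
}
```

The point of the model is that the chain being validated belongs to the signed parent zone (com. here), so there is something to validate even when the queried name itself does not exist.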

Note: while I disagree with the suggested fix, I'm leaving this issue open because the current recursor behavior in this case is wrong. As of Oct-11-2024 it is looking like there are three separate issues that need to be addressed to provide a comprehensive fix:

1 - Error conversions causing loss of NXDomain/NoRecord response info, including non-existence records. Fixed in #2502
2 - DNSSEC validation not performed on error responses at all. Fixed in #2502
3 - NSEC3 validation of error responses appears to be broken. This appears to be caused by the NSEC3 validation implementation not supporting opt-out. cc @pvdrz. Fixed in #2546


@marcus0x62 commented on GitHub (Nov 20, 2024):

@kolbma this should be fixed now. Can you retest and let me know if this is working for you now?


@kolbma commented on GitHub (Nov 22, 2024):

Sorry, but I can't check this at the moment. alpha.3 now has too many API incompatibilities (generics and async) compared to earlier versions, so it makes no sense to try supporting both 0.24 and 0.25 in my code base behind feature cfgs.


@marcus0x62 commented on GitHub (Nov 26, 2024):

Thanks @kolbma. I'm going to close this out for now; feel free to reopen it if you run into this problem in the future.

