[GH-ISSUE #3011] Store one Proof per RRset, instead of per record #1112

Open
opened 2026-03-16 01:37:50 +03:00 by kerem · 6 comments

Originally created by @divergentdave on GitHub (May 23, 2025).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/3011

Proof enums are currently attached to records or an error variant. This is not a good way to represent DNSSEC validation results, because validation is done one RRset at a time. We could remove the proof field from the Record struct and ProtoErrorKind::Nsec and instead store a HashMap<RrKey, Proof> alongside a response. The proof in ProtoErrorKind::Nsec represents the verification state of the queried RRset in name error or no data responses. Being able to track the verification state of other RRsets, different from the queried RRset, and without any matching records, would be useful for properly validating wildcard expansion, see #2882.
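A minimal sketch of the proposed shape, assuming simplified stand-ins for the real types: the `Proof`, `RrKey`, and `ValidatedResponse` definitions below are hypothetical and much smaller than anything in hickory-dns. The point it illustrates is that a per-RRset map can hold a proof for a key that has no matching records in the response, which is the wildcard case mentioned above.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the DNSSEC proof state of one RRset.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Proof {
    Secure,
    Insecure,
    Bogus,
    Indeterminate,
}

// Hypothetical stand-in for an RRset key: owner name plus record type code.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct RrKey {
    name: String,
    record_type: u16,
}

// Hypothetical response type: records carry no proof of their own,
// and proofs are stored once per RRset alongside them.
#[allow(dead_code)]
struct ValidatedResponse {
    records: Vec<(RrKey, String)>,
    proofs: HashMap<RrKey, Proof>,
}

impl ValidatedResponse {
    // RRsets that were never validated are reported as indeterminate.
    fn proof_for(&self, key: &RrKey) -> Proof {
        self.proofs.get(key).copied().unwrap_or(Proof::Indeterminate)
    }
}
```

Because the map is keyed by RRset rather than by record, an entry such as `("*.example.com.", A)` can be inserted even when the Answer section contains no record with that owner name.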


@divergentdave commented on GitHub (Jul 10, 2025):

After reading through RFC 4035 again and thinking through this some more, I think the solution will need to be a bit different than described above.

There are two different kinds of validation results we can produce. We can determine that a particular RRset is secure, bogus, or insecure, and we can separately determine whether an entire response to a query is valid/authenticated/secure, bogus, or insecure/not authenticated. The per-record Proofs in positive responses reflect the former, while the Proof inside ProtoErrorKind::Nsec for negative responses reflects the latter. The DnssecSummary produced from the per-record Proofs is supposed to reflect overall response validity as well. We need to validate the RRSIG over an RRset before we can determine if the overall response that contains the RRset in its Answer section is valid, but validating the overall response may require validating an authenticated denial of existence as well.

As a result, I don't think it would make sense to mix Proof enums representing both validated RRsets and validated nonexistence of RRsets in one HashMap. If anything, it might be useful to memoize validation of RRsets. For example, we may need to refer to the same validated NSEC or NSEC3 RR multiple times if it covers multiple names we're interested in. Even so, that doesn't need to live in the Message struct, as it doesn't need to live past validation.
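To illustrate the memoization idea, here is a rough sketch of a validation-scoped cache. `ValidationMemo`, `RrKey`, and `Proof` are hypothetical stand-ins, and the `verify` closure stands for whatever RRSIG check the validator actually performs. The memo is a local value that is dropped when validation finishes, so nothing has to be added to `Message`.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the result of validating one RRset.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Proof {
    Secure,
    Insecure,
    Bogus,
}

// Hypothetical stand-in for an RRset key: owner name plus record type code.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct RrKey {
    name: String,
    record_type: u16,
}

// Cache that lives only for the duration of one validation pass.
#[derive(Default)]
struct ValidationMemo {
    cache: HashMap<RrKey, Proof>,
}

impl ValidationMemo {
    // Return the cached proof if this RRset was already validated,
    // otherwise run `verify` once and remember its result.
    fn proof_for<F>(&mut self, key: &RrKey, verify: F) -> Proof
    where
        F: FnOnce(&RrKey) -> Proof,
    {
        if let Some(proof) = self.cache.get(key) {
            return *proof;
        }
        let proof = verify(key);
        self.cache.insert(key.clone(), proof);
        proof
    }
}
```

With this shape, an NSEC RRset that covers several names of interest would be verified on the first lookup and served from the map afterwards.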

RFC 4035 section 5.3.4 says that, whenever an RRset is the result of expanding wildcard records, "Once the validator has verified the signature, as described in section 5.3, it must take additional steps to verify the non-existence of an exact match or closer wildcard match for the query." This means that the wildcard-related validation rules apply both to validation of a single RRset, such as in positive responses, and to overall response validation, such as when authenticating a wildcard no data response. This also means that validation of individual RRsets cannot be done independently from each other, or even one section at a time, as DnssecDnsHandle::verify_response() does. We may need to validate an NSEC/NSEC3 record in order to finish validating some other RRset.
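A sketch of that dependency, under simplifying assumptions. The function names are hypothetical, names are plain strings without escaped dots, and the denial-of-existence check is reduced to a boolean that the caller must already have computed from NSEC/NSEC3 records. Wildcard expansion is detected in the usual way, by comparing the RRSIG Labels field with the number of labels in the owner name.

```rust
// Hypothetical two-state result; the real `Proof` enum has more variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Proof {
    Secure,
    Bogus,
}

// Count labels in a textual name, ignoring the root label and a leading
// "*" label, mirroring how the RRSIG Labels field is defined.
fn label_count(name: &str) -> usize {
    name.trim_end_matches('.')
        .split('.')
        .enumerate()
        .filter(|(i, l)| !l.is_empty() && !(*i == 0 && *l == "*"))
        .count()
}

// An RRset was synthesized from a wildcard if its owner name has more
// labels than the RRSIG Labels field records.
fn is_wildcard_expansion(owner: &str, rrsig_labels: u8) -> bool {
    label_count(owner) > usize::from(rrsig_labels)
}

// A wildcard-expanded RRset is only secure if, in addition to a valid
// signature, the non-existence of a closer match has been proven.
fn rrset_proof(
    owner: &str,
    rrsig_labels: u8,
    signature_valid: bool,
    closer_match_denied: bool,
) -> Proof {
    if !signature_valid {
        return Proof::Bogus;
    }
    if is_wildcard_expansion(owner, rrsig_labels) && !closer_match_denied {
        return Proof::Bogus;
    }
    Proof::Secure
}
```

The `closer_match_denied` argument is the part that cannot be computed from the RRset alone, which is why per-RRset validation cannot run independently of the NSEC/NSEC3 records elsewhere in the response. Treating a missing denial as bogus is an assumption of this sketch.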

We may need to pass something like a DnssecSummary out of the validation code to the Catalog in the future, especially if we stop representing NXDOMAIN/no data as a Result::Err. To do so, we'll have to add it to the DnsResponse struct, change the DnsHandle trait to add some extensibility, or replace the use of DnsHandle to interface with DnssecDnsHandle.


@djc commented on GitHub (Jul 10, 2025):

FYI, I started hacking on this a little bit today.


@djc commented on GitHub (Jul 15, 2025):

> replace the use of DnsHandle to interface with DnssecDnsHandle

This feels like the right solution to me at least in terms of the library API. I think the question I'm struggling with a little is how we want to present DNSSEC verification capabilities to the user.

  • In the resolver library API, we expose validate: true to enable verification and this results in Record::proof potentially being set, but this is otherwise being ignored (although it is available in low-level APIs to callers).
  • In the recursor, when running in Validating mode (enabled via DnssecPolicy::ValidateWithStaticKey), we will use proof.is_indeterminate() to determine whether we can yield cache results. It doesn't seem like proof values actually influence the data that we return?

From what I can tell it doesn't seem like DnssecDnsHandle::verify_rrsets() actually changes the Record data that it gets, it only seems to augment it with Proof data.

For the resolver library API, perhaps it makes more sense to offer a separate DnssecResolver API that wraps each record in its proof status?
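Roughly what I have in mind, as a sketch only: `WithProof` and `require_secure` are hypothetical names invented for illustration, and `Proof` here is a simplified stand-in.

```rust
// Hypothetical stand-in for the proof state attached to a lookup result.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Proof {
    Secure,
    Insecure,
    Bogus,
    Indeterminate,
}

// Hypothetical wrapper a DNSSEC-aware resolver API could return, so the
// caller cannot reach the value without looking at the proof.
#[derive(Debug, PartialEq, Eq)]
struct WithProof<T> {
    proof: Proof,
    value: T,
}

impl<T> WithProof<T> {
    fn new(proof: Proof, value: T) -> Self {
        Self { proof, value }
    }

    // Hand out the value only when it validated as secure; otherwise
    // return the proof state so the caller can decide what to do.
    fn require_secure(self) -> Result<T, Proof> {
        if self.proof == Proof::Secure {
            Ok(self.value)
        } else {
            Err(self.proof)
        }
    }
}
```

The plain resolver API would keep returning bare records, and only callers who opt into validation would deal with the wrapper.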

For the recursor, I'm not completely sure what the intended behavior for a validating recursor is (nor does this seem to be clearly documented in the RecursorMode::Validating or DnssecPolicy::ValidateWithStaticKey). Should it only return secure records?


@djc commented on GitHub (Jul 15, 2025):

A quick web search turns up this [documentation](https://doc.powerdns.com/recursor/dnssec.html) from PowerDNS. Describing its default process mode:

> When [dnssec.validation](https://doc.powerdns.com/recursor/yamlsettings.html#setting-yaml-dnssec-validation) is set to process the behaviour is similar to [process-no-validate](https://doc.powerdns.com/recursor/dnssec.html#process-no-validate). However, the recursor will try to validate the data if at least one of the DO or AD bits is set in the query; in that case, it will set the AD-bit in the response when the data is validated successfully, or send SERVFAIL when the validation comes up bogus.

[RFC 2535, section 6.1](https://www.rfc-editor.org/rfc/rfc2535#section-6.1):

> Security aware servers MUST NOT return Bad data. For non-security aware resolvers or security aware resolvers requesting service by having the CD bit clear, security aware servers MUST return only Authenticated or Insecure data in the answer and authority sections with the AD bit set in the response. Security aware servers SHOULD return Pending data, with the AD bit clear in the response, to security aware resolvers requesting this service by asserting the CD bit in their request. The AD bit MUST NOT be set on a response unless all of the RRs in the answer and authority sections of the response are either Authenticated or Insecure. The AD bit does not cover the additional information section.


@djc commented on GitHub (Jul 15, 2025):

I suppose RFC 2535 has been declared obsolete in favor of RFC 4035.

[Section 3.1.6](https://www.rfc-editor.org/rfc/rfc4035#section-3.1):

> A security-aware name server MUST NOT set the AD bit in a response unless the name server considers all RRsets in the Answer and Authority sections of the response to be authentic. A security-aware name server's local policy MAY consider data from an authoritative zone to be authentic without further validation. However, the name server MUST NOT do so unless the name server obtained the authoritative zone via secure means (such as a secure zone transfer mechanism) and MUST NOT do so unless this behavior has been configured explicitly.

And [section 3.2.3](https://www.rfc-editor.org/rfc/rfc4035#section-3.2.3):

> The name server side of a security-aware recursive name server MUST NOT set the AD bit in a response unless the name server considers all RRsets in the Answer and Authority sections of the response to be authentic. The name server side SHOULD set the AD bit if and only if the resolver side considers all RRsets in the Answer section and any relevant negative response RRs in the Authority section to be authentic. The resolver side MUST follow the procedure described in [Section 5](https://www.rfc-editor.org/rfc/rfc4035#section-5) to determine whether the RRs in question are authentic. However, for backward compatibility, a recursive name server MAY set the AD bit when a response includes unsigned CNAME RRs if those CNAME RRs demonstrably could have been synthesized from an authentic DNAME RR that is also included in the response according to the synthesis rules described in [[RFC2672](https://www.rfc-editor.org/rfc/rfc2672)].
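Put together, one possible reading of these rules for a validating recursor looks like the sketch below. `Outcome` and `response_outcome` are hypothetical names, `Proof` is a simplified stand-in, and the SERVFAIL-on-bogus branch follows the PowerDNS behaviour quoted earlier rather than anything hickory-dns is confirmed to do. The sketch applies the stricter MUST NOT condition (every RRset in both sections is secure) and ignores the CD bit and the DNAME/CNAME exception.

```rust
// Hypothetical stand-in for a per-RRset validation result.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Proof {
    Secure,
    Insecure,
    Bogus,
}

// Hypothetical summary of what the server side would send back.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    AnswerWithAd,
    AnswerWithoutAd,
    ServFail,
}

// Decide the response from the proofs of all RRsets in the Answer and
// Authority sections.
fn response_outcome(answer: &[Proof], authority: &[Proof]) -> Outcome {
    let rrsets: Vec<Proof> = answer.iter().chain(authority).copied().collect();
    if rrsets.contains(&Proof::Bogus) {
        // Never hand bogus data to the client.
        Outcome::ServFail
    } else if !rrsets.is_empty() && rrsets.iter().all(|p| *p == Proof::Secure) {
        // AD is only set when every RRset in both sections is authentic.
        Outcome::AnswerWithAd
    } else {
        // Insecure data, or nothing to vouch for: answer without AD.
        Outcome::AnswerWithoutAd
    }
}
```

An empty response is answered without AD here to avoid the vacuous "all RRsets are secure" case; that choice is a conservative assumption of the sketch, not something the RFC text spells out.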


@divergentdave commented on GitHub (Jul 15, 2025):

> • In the recursor, when running in Validating mode (enabled via DnssecPolicy::ValidateWithStaticKey), we will use proof.is_indeterminate() to determine whether we can yield cache results. It doesn't seem like proof values actually influence the data that we return?

FWIW this check is needed because the recursor can insert cache entries both before and after validation. This check prevents using a pre-validation response without going through the validation process again.
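As a simplified sketch of that check (the `usable_from_cache` function and the `Proof` type below are hypothetical stand-ins, not the recursor's actual code): a cached response whose records are still indeterminate was inserted before validation, so a validating recursor treats it as a cache miss and runs the lookup through validation again.

```rust
// Hypothetical stand-in for the proof state stored with cached records.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Proof {
    Secure,
    Insecure,
    Bogus,
    Indeterminate,
}

impl Proof {
    // True for records that have not been through validation yet.
    fn is_indeterminate(self) -> bool {
        matches!(self, Proof::Indeterminate)
    }
}

// A cache entry can be returned directly unless the recursor is validating
// and the entry still contains pre-validation records.
fn usable_from_cache(proofs: &[Proof], validating: bool) -> bool {
    !validating || proofs.iter().all(|p| !p.is_indeterminate())
}
```

Note that the check only gates cache reuse; it does not by itself change which records end up in the response.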

The big place where the rubber meets the road (when running the recursor or forwarder in a server) is in build_forwarded_response(), which uses DnssecSummary::from_records() to consume proofs. I discussed this a bit in https://github.com/hickory-dns/hickory-dns/issues/3041#issuecomment-3032905729.
