[GH-ISSUE #2934] BinEncoder's `canonical_names` flag is overloaded #1096

New issue

Closed

opened 2026-03-16 01:35:53 +03:00 by kerem · 1 comment

kerem commented

2026-03-16 01:35:53 +03:00

Owner

Originally created by @divergentdave on GitHub (Apr 15, 2025).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2934

The canonical_names flag in the BinEncoder struct is overloaded, as it is used in different places both to decide whether to change encoded names to lower-case form, and whether to enable name compression. The flag can be set in multiple places, both before calling emit() on a message or record, and within <RData as BinEncodable>::emit(), which uses with_canonical_names() for only some record types when encoding the RDATA. The set of record types that do or don't always set canonical_names = true for RDATA is inconsistent with both the list of well-known record types from RFC 1035 (which may use compression inside RDATA) and the list of record types from RFC 4034 section 6.2 (which must use lowercase names inside RDATA for DNSSEC canonical form). To get this right, I think we first need to split up this flag into two or three separate flags.

The encoder struct and encodable trait need to serve three different purposes: writing to wire protocol messages, constructing to-be-signed data for DNSSEC signing and verification, and any use of the encoder that does not encode an entire DNS message. The server's ResponseHandle leaves canonical_names set to false, while the encoders used by TBS always have canonical_names set to true. For the wire protocol, we should generally keep the case of names as-is, and we need to preserve the case for names that are in RDATA of record types not listed in RFC 4034 section 6.2 (for instance, CAA and SVCB). We may use name compression in RDATA for well-known record types, but we must not use name compression in the RDATA of any record type introduced after RFC 1035. For DNSSEC canonical form, we must use lowercase names in RDATA for types listed in RFC 4034 section 6.2, and we must preserve the case of names in RDATA for types not listed there. We must not use name compression anywhere. For uses where we encode something other than an entire message into a buffer, for example when using temporary buffers in some routines, we can't use name compression, because the temporary buffer will get copied somewhere else in the final message, invalidating offsets in compression pointers.

From code inspection, it looks like we have incorrect behavior in multiple directions. For example:

ANAME RDATA gets encoded inside with_canonical_names(), and <ANAME as BinEncodable>::emit() passes the canonical names flag as the lowercase argument to Name::emit_with_lowercase(). This results in an incorrect DNSSEC canonical encoding, since ANAME is not on the list in RFC4034 for lowercasing.
If a message contains an ANAME record that includes a capital letter in its RDATA, plus an RRSIG over it, then when Hickory DNS encodes it for wire protocol use, it would lowercase the RDATA, invalidating the signature from the perspective of downstream validating clients.
SVCB RDATA encoded for wire protocol messages would encode the target name with canonical_names set to false. As a result, name compression could be used inside the RDATA. This would cause problems when interacting with other servers that don't have support for the SVCB record type, because they would have to obliviously copy the RDATA into a message when forwarding, which could invalidate the meaning of a compression pointer inside the name.

Originally created by @divergentdave on GitHub (Apr 15, 2025). Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2934 The `canonical_names` flag in the `BinEncoder` struct is overloaded, as it is used in different places both to decide whether to change encoded names to lower-case form, and whether to enable name compression. The flag can be set in multiple places, both before calling `emit()` on a message or record, and within `<RData as BinEncodable>::emit()`, which uses `with_canonical_names()` for only some record types when encoding the RDATA. The set of record types that do or don't always set `canonical_names = true` for RDATA is inconsistent with both the list of well-known record types from RFC 1035 (which may use compression inside RDATA) and the list of record types from RFC 4034 section 6.2 (which must use lowercase names inside RDATA for DNSSEC canonical form). To get this right, I think we first need to split up this flag into two or three separate flags. The encoder struct and encodable trait need to serve three different purposes: writing to wire protocol messages, constructing to-be-signed data for DNSSEC signing and verification, and any use of the encoder that does not encode an entire DNS message. The server's `ResponseHandle` leaves `canonical_names` set to false, while the encoders used by `TBS` always have `canonical_names` set to true. For the wire protocol, we should generally keep the case of names as-is, and we need to preserve the case for names that are in RDATA of record types not listed in RFC 4034 section 6.2 (for instance, CAA and SVCB). We may use name compression in RDATA for well-known record types, but we must not use name compression in the RDATA of any record type introduced after RFC 1035. For DNSSEC canonical form, we must use lowercase names in RDATA for types listed in RFC 4034 section 6.2, and we must preserve the case of names in RDATA for types not listed there. We must not use name compression anywhere. For uses where we encode something other than an entire message into a buffer, for example when using temporary buffers in some routines, we can't use name compression, because the temporary buffer will get copied somewhere else in the final message, invalidating offsets in compression pointers. From code inspection, it looks like we have incorrect behavior in multiple directions. For example: - ANAME RDATA gets encoded inside `with_canonical_names()`, and `<ANAME as BinEncodable>::emit()` passes the canonical names flag as the `lowercase` argument to `Name::emit_with_lowercase()`. This results in an incorrect DNSSEC canonical encoding, since `ANAME` is not on the list in RFC4034 for lowercasing. - If a message contains an ANAME record that includes a capital letter in its RDATA, plus an RRSIG over it, then when Hickory DNS encodes it for wire protocol use, it would lowercase the RDATA, invalidating the signature from the perspective of downstream validating clients. - SVCB RDATA encoded for wire protocol messages would encode the target name with `canonical_names` set to false. As a result, name compression could be used inside the RDATA. This would cause problems when interacting with other servers that don't have support for the SVCB record type, because they would have to obliviously copy the RDATA into a message when forwarding, which could invalidate the meaning of a compression pointer inside the name.

kerem closed this issue

2026-03-16 01:35:59 +03:00

kerem commented

2026-03-16 01:36:04 +03:00

Author

Owner

@divergentdave commented on GitHub (Apr 15, 2025):

RFC 6840 has some updates that override RFC 4034 section 6.2 for NSEC and RRSIG record RDATA, based on common practice.

@divergentdave commented on GitHub (Apr 15, 2025): [RFC 6840](https://datatracker.ietf.org/doc/html/rfc6840#section-5.1) has some updates that override RFC 4034 section 6.2 for NSEC and RRSIG record RDATA, based on common practice.

kerem referenced this issue

2026-03-16 04:19:07 +03:00

[PR #1096] [MERGED] Increased the UDP buffer size to 4096 as recommended by RFC6891 when … #1969