[GH-ISSUE #2816] proto-error trying to send dns updates with 0.25 #1068

Closed
opened 2026-03-16 01:30:40 +03:00 by kerem · 15 comments
Owner

Originally created by @jcgruenhage on GitHub (Mar 1, 2025).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2816

Describe the bug
I'm trying to send DNS update requests as part of a dyndns daemon. This used to work with 0.24, but fails on 0.25. On 0.25, it technically "works" as well, as in, sending the updates works, but parsing the responses fails with "proto error".

I can provide TCP dumps, and the ~70 lines of relevant code calling hickory-dns functions in my code are in git.jcg.re/jcgruenhage/dyndnsd@f6608da7f4/src/dns.rs (L123-L187)

To Reproduce
Call ClientHandle::delete_rrset. Assuming the permissions etc are correct, I'd assume to get a successful response back. The pcap looks like this is the case. hickory fails to come to the same conclusion, and instead returns proto error.

Expected behavior
I'd expect the response to be parsed in a way that comes to the conclusion that the update was successful. The change from the update message is applied, I can't see in the tcpdump why it wouldn't be considered that either.

System:

  • OS: Chimera Linux
  • Architecture: x86_64
  • rustc version: 1.85

Version:
Crate: client / proto
Version: c141df9468

Additional context
The DNS server on the other side is knotd, Knot DNS 3.4.0. The full source code of the dyndns daemon I'm building is available in git.jcg.re/jcgruenhage/dyndnsd@f6608da7f4. When I modify the code to ignore the responses I get back from the server, it just works.

I've tried debugging this myself, but haven't gotten very far yet. If this is user error, I'd happily be told what I'm doing wrong.

Originally created by @jcgruenhage on GitHub (Mar 1, 2025). Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2816 **Describe the bug** I'm trying to send DNS update requests as part of a dyndns daemon. This used to work with 0.24, but fails on 0.25. On 0.25, it technically "works" as well, as in, sending the updates works, but parsing the responses fails with "proto error". I can provide TCP dumps, and the ~70 lines of relevant code calling hickory-dns functions in my code are in https://git.jcg.re/jcgruenhage/dyndnsd/src/commit/f6608da7f48fd54fbcffbe95ee2a3054da5726fd/src/dns.rs#L123-L187 **To Reproduce** Call `ClientHandle::delete_rrset`. Assuming the permissions etc are correct, I'd assume to get a successful response back. The pcap looks like this is the case. `hickory` fails to come to the same conclusion, and instead returns `proto error`. **Expected behavior** I'd expect the response to be parsed in a way that comes to the conclusion that the update was successful. The change from the update message is applied, I can't see in the tcpdump why it wouldn't be considered that either. **System:** - OS: Chimera Linux - Architecture: x86_64 - rustc version: 1.85 **Version:** Crate: client / proto Version: c141df9468b4039976ff1fcb298b6e9c84798520 **Additional context** The DNS server on the other side is `knotd, Knot DNS 3.4.0`. The full source code of the dyndns daemon I'm building is available in https://git.jcg.re/jcgruenhage/dyndnsd/src/commit/f6608da7f48fd54fbcffbe95ee2a3054da5726fd. When I modify the code to ignore the responses I get back from the server, it just works. I've tried debugging this myself, but haven't gotten very far yet. If this is user error, I'd happily be told what I'm doing wrong.
kerem 2026-03-16 01:30:40 +03:00
Author
Owner

@djc commented on GitHub (Mar 1, 2025):

Thanks for testing and for the detailed bug report!

Would be cool if you could just extract the response bytes you get from the server into a unit test as a PR.

<!-- gh-comment-id:2692440516 --> @djc commented on GitHub (Mar 1, 2025): Thanks for testing and for the detailed bug report! Would be cool if you could just extract the response bytes you get from the server into a unit test as a PR.
Author
Owner

@jcgruenhage commented on GitHub (Mar 1, 2025):

The response:

0000   00 e0 4c a0 64 8f fc ec da 4d 17 b2 86 dd 60 07   ..L.d....M....`.
0010   2e bc 00 c2 06 35 2a 01 04 f8 1c 1b b7 b6 00 00   .....5*.........
0020   00 00 00 00 00 01 20 01 16 b8 cb b7 d8 00 3f ea   ...... .......?.
0030   fd a3 fc a3 5e 4b 00 35 b2 44 0e e6 f4 2d ce a9   ....^K.5.D...-..
0040   6a fd 80 18 01 fd 5e 9f 00 00 01 01 08 0a c7 b3   j.....^.........
0050   e4 07 69 8a 94 b1 00 a0 a9 f6 a8 00 00 01 00 00   ..i.............
0060   00 00 00 01 06 64 79 6e 64 6e 73 03 6a 63 67 02   .....dyndns.jcg.
0070   72 65 00 00 06 00 01 0a 6a 6f 6c 6c 65 2d 6d 61   re......jolle-ma
0080   74 65 06 64 79 6e 64 6e 73 03 6a 63 67 02 72 65   te.dyndns.jcg.re
0090   00 00 fa 00 ff 00 00 00 00 00 5d 0b 68 6d 61 63   ..........].hmac
00a0   2d 73 68 61 35 31 32 00 00 00 67 c3 7b 13 01 2c   -sha512...g.{..,
00b0   00 40 d8 31 0a 3c a9 47 5d 0c 6d 69 4e ea c4 58   .@.1.<.G].miN..X
00c0   0a 70 7b bc be c1 af 44 7f 18 a6 46 29 9e fa 9f   .p{....D...F)...
00d0   3e 75 c5 6b c0 48 2f 54 70 89 9c 02 00 9e ef 1f   >u.k.H/Tp.......
00e0   29 da d2 48 54 85 ce c7 9a 2b 01 28 e6 10 fe 74   )..HT....+.(...t
00f0   77 8f a9 f6 00 00 00 00                           w.......

Does this help?

<!-- gh-comment-id:2692442311 --> @jcgruenhage commented on GitHub (Mar 1, 2025): The response: ``` 0000 00 e0 4c a0 64 8f fc ec da 4d 17 b2 86 dd 60 07 ..L.d....M....`. 0010 2e bc 00 c2 06 35 2a 01 04 f8 1c 1b b7 b6 00 00 .....5*......... 0020 00 00 00 00 00 01 20 01 16 b8 cb b7 d8 00 3f ea ...... .......?. 0030 fd a3 fc a3 5e 4b 00 35 b2 44 0e e6 f4 2d ce a9 ....^K.5.D...-.. 0040 6a fd 80 18 01 fd 5e 9f 00 00 01 01 08 0a c7 b3 j.....^......... 0050 e4 07 69 8a 94 b1 00 a0 a9 f6 a8 00 00 01 00 00 ..i............. 0060 00 00 00 01 06 64 79 6e 64 6e 73 03 6a 63 67 02 .....dyndns.jcg. 0070 72 65 00 00 06 00 01 0a 6a 6f 6c 6c 65 2d 6d 61 re......jolle-ma 0080 74 65 06 64 79 6e 64 6e 73 03 6a 63 67 02 72 65 te.dyndns.jcg.re 0090 00 00 fa 00 ff 00 00 00 00 00 5d 0b 68 6d 61 63 ..........].hmac 00a0 2d 73 68 61 35 31 32 00 00 00 67 c3 7b 13 01 2c -sha512...g.{.., 00b0 00 40 d8 31 0a 3c a9 47 5d 0c 6d 69 4e ea c4 58 .@.1.<.G].miN..X 00c0 0a 70 7b bc be c1 af 44 7f 18 a6 46 29 9e fa 9f .p{....D...F)... 00d0 3e 75 c5 6b c0 48 2f 54 70 89 9c 02 00 9e ef 1f >u.k.H/Tp....... 00e0 29 da d2 48 54 85 ce c7 9a 2b 01 28 e6 10 fe 74 )..HT....+.(...t 00f0 77 8f a9 f6 00 00 00 00 w....... ``` Does this help?
Author
Owner

@djc commented on GitHub (Mar 1, 2025):

Yes, though it would make my life easier if you can stuff this into a PR as a unit test with a Rust array. Maybe you can parse it as a message to see how it fails?

<!-- gh-comment-id:2692445019 --> @djc commented on GitHub (Mar 1, 2025): Yes, though it would make my life easier if you can stuff this into a PR as a unit test with a Rust array. Maybe you can parse it as a message to see how it fails?
Author
Owner

@jcgruenhage commented on GitHub (Mar 1, 2025):

Just down in this module somewhere?

github.com/hickory-dns/hickory-dns@cda2c38fd4/crates/proto/src/op/message.rs (L1124)

<!-- gh-comment-id:2692446448 --> @jcgruenhage commented on GitHub (Mar 1, 2025): Just down in this module somewhere? https://github.com/hickory-dns/hickory-dns/blob/cda2c38fd439090e76644c3726a2ffa2aa14e370/crates/proto/src/op/message.rs#L1124
Author
Owner

@djc commented on GitHub (Mar 1, 2025):

Yup, let's start with that!

<!-- gh-comment-id:2692446659 --> @djc commented on GitHub (Mar 1, 2025): Yup, let's start with that!
Author
Owner

@jcgruenhage commented on GitHub (Mar 1, 2025):

failures:

---- op::message::tests::dns_update_response_serialization stdout ----

thread 'op::message::tests::dns_update_response_serialization' panicked at crates/proto/src/op/message.rs:1331:45:
failed to parse message: ProtoError { kind: UnrecognizedLabelCode(134) }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    op::message::tests::dns_update_response_serialization

test result: FAILED. 188 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 1.23s

Will push a PR with the message.

<!-- gh-comment-id:2692448976 --> @jcgruenhage commented on GitHub (Mar 1, 2025): ``` failures: ---- op::message::tests::dns_update_response_serialization stdout ---- thread 'op::message::tests::dns_update_response_serialization' panicked at crates/proto/src/op/message.rs:1331:45: failed to parse message: ProtoError { kind: UnrecognizedLabelCode(134) } note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace failures: op::message::tests::dns_update_response_serialization test result: FAILED. 188 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 1.23s ``` Will push a PR with the message.
Author
Owner

@jcgruenhage commented on GitHub (Mar 2, 2025):

I've bisected to figure out which commit introduced the breakage, and it's c8691c8c2d, hope that helps. I've looked at the change a bit but don't see where the bug comes from.

<!-- gh-comment-id:2692685398 --> @jcgruenhage commented on GitHub (Mar 2, 2025): I've bisected to figure out which commit introduced the breakage, and it's c8691c8c2d5d332ec2154c19c7d9b759f4bc2f13, hope that helps. I've looked at the change a bit but don't see where the bug comes from.
Author
Owner

@jcgruenhage commented on GitHub (Mar 2, 2025):

I've asked ChatGPT about the changes in that commit and my testcase, and it told me that I have to remove the TCP/IP headers, and the first two bytes from the DNS data that indicates the length, because they're not part of the message. With both of these removed from my test case, it's parsed as a message successfully. Doesn't change that my dyndns daemon is still exhibiting the same problem where I'm getting back a proto error though.

<!-- gh-comment-id:2692697836 --> @jcgruenhage commented on GitHub (Mar 2, 2025): I've asked ChatGPT about the changes in that commit and my testcase, and it told me that I have to remove the TCP/IP headers, and the first two bytes from the DNS data that indicates the length, because they're not part of the message. With both of these removed from my test case, it's parsed as a message successfully. Doesn't change that my dyndns daemon is still exhibiting the same problem where I'm getting back a proto error though.
Author
Owner

@djc commented on GitHub (Mar 2, 2025):

I've asked ChatGPT about the changes in that commit and my testcase, and it told me that I have to remove the TCP/IP headers, and the first two bytes from the DNS data that indicates the length, because they're not part of the message. With both of these removed from my test case, it's parsed as a message successfully. Doesn't change that my dyndns daemon is still exhibiting the same problem where I'm getting back a proto error though.

Ahh, that's good to know, so I guess it means the problem might not be in the message parsing itself, but at a different level.

From your original application context, do you have the exact ProtoError contents?

<!-- gh-comment-id:2692728028 --> @djc commented on GitHub (Mar 2, 2025): > I've asked ChatGPT about the changes in that commit and my testcase, and it told me that I have to remove the TCP/IP headers, and the first two bytes from the DNS data that indicates the length, because they're not part of the message. With both of these removed from my test case, it's parsed as a message successfully. Doesn't change that my dyndns daemon is still exhibiting the same problem where I'm getting back a proto error though. Ahh, that's good to know, so I guess it means the problem might not be in the message parsing itself, but at a different level. From your original application context, do you have the exact `ProtoError` contents?
Author
Owner

@jcgruenhage commented on GitHub (Mar 2, 2025):

I'll restructure my error handling to get more details out of it. so far anyhow is giving me the context, but not the details of the error itself:

[2025-03-02T11:40:13Z ERROR dyndnsd] Failed to update record: Failed to replace A record
    
    Caused by:
        0: Failed to delete old record
        1: proto error
<!-- gh-comment-id:2692734654 --> @jcgruenhage commented on GitHub (Mar 2, 2025): I'll restructure my error handling to get more details out of it. so far anyhow is giving me the context, but not the details of the error itself: ``` [2025-03-02T11:40:13Z ERROR dyndnsd] Failed to update record: Failed to replace A record Caused by: 0: Failed to delete old record 1: proto error ```
Author
Owner

@djc commented on GitHub (Mar 2, 2025):

Maybe also try if you can get the backtrace feature enabled and see what extra data that gives you.

<!-- gh-comment-id:2692736038 --> @djc commented on GitHub (Mar 2, 2025): Maybe also try if you can get the backtrace feature enabled and see what extra data that gives you.
Author
Owner

@jcgruenhage commented on GitHub (Mar 2, 2025):

diff --git a/src/main.rs b/src/main.rs
index 78b39db..6f941b7 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -74,7 +74,7 @@ async fn main() -> Result<()> {
         .context("Failed to initiate DNS updater")?;
     loop {
         if let Err(error) = update(&config, &mut cache, &cache_path, &mut updater).await {
-            log::error!("Failed to update record: {:?}", error);
+            log::error!("Failed to update record: {:#?}", error);
         }
         interval.tick().await;
     }

gives me

[2025-03-02T13:31:31Z ERROR dyndnsd] Failed to update record: Error {
        context: "Failed to replace A record",
        source: Error {
            context: "Failed to delete old record",
            source: Error {
                kind: Proto(
                    ProtoError {
                        kind: Msg(
                            "Tsig key wrong key error",
                        ),
                    },
                ),
            },
        },
    }

which is probably already enough context, right?

If any additional context is required (the whole request/response exchange, tsig keys, server configuration, anything), I'm happy to provide that.

<!-- gh-comment-id:2692737199 --> @jcgruenhage commented on GitHub (Mar 2, 2025): ```diff diff --git a/src/main.rs b/src/main.rs index 78b39db..6f941b7 100644 --- a/src/main.rs +++ b/src/main.rs @@ -74,7 +74,7 @@ async fn main() -> Result<()> { .context("Failed to initiate DNS updater")?; loop { if let Err(error) = update(&config, &mut cache, &cache_path, &mut updater).await { - log::error!("Failed to update record: {:?}", error); + log::error!("Failed to update record: {:#?}", error); } interval.tick().await; } ``` gives me ``` [2025-03-02T13:31:31Z ERROR dyndnsd] Failed to update record: Error { context: "Failed to replace A record", source: Error { context: "Failed to delete old record", source: Error { kind: Proto( ProtoError { kind: Msg( "Tsig key wrong key error", ), }, ), }, }, } ``` which is probably already enough context, right? If any additional context is required (the whole request/response exchange, tsig keys, server configuration, anything), I'm happy to provide that.
Author
Owner

@djc commented on GitHub (Mar 2, 2025):

Okay, so that's coming from here:

https://github.com/hickory-dns/hickory-dns/blob/main/crates/proto/src/dnssec/tsig.rs#L142

I'm guessing the record.name() to self.0.signer_name comparison has been thrown off by the FQDN adjustments from #2560. Can you have a look at what these values are in your test case?

<!-- gh-comment-id:2692761684 --> @djc commented on GitHub (Mar 2, 2025): Okay, so that's coming from here: https://github.com/hickory-dns/hickory-dns/blob/main/crates/proto/src/dnssec/tsig.rs#L142 I'm guessing the `record.name()` to `self.0.signer_name` comparison has been thrown off by the FQDN adjustments from #2560. Can you have a look at what these values are in your test case?
Author
Owner

@jcgruenhage commented on GitHub (Mar 2, 2025):

diff --git a/crates/proto/src/dnssec/tsig.rs b/crates/proto/src/dnssec/tsig.rs
index 97ea4ccf..80bec9b8 100644
--- a/crates/proto/src/dnssec/tsig.rs
+++ b/crates/proto/src/dnssec/tsig.rs
@@ -18,6 +18,7 @@ use alloc::string::ToString;
 use alloc::sync::Arc;
 use alloc::vec::Vec;
 use core::ops::Range;
+use std::dbg;
 
 use tracing::debug;
 
@@ -139,7 +140,8 @@ impl TSigner {
 
         // https://tools.ietf.org/html/rfc8945#section-5.2
         // 1.  Check key
-        if record.name() != &self.0.signer_name || tsig.algorithm() != &self.0.algorithm {
+        if dbg!(record.name()) != dbg!(&self.0.signer_name) || tsig.algorithm() != &self.0.algorithm
+        {
             return Err(DnsSecErrorKind::TsigWrongKey.into());
         }
 
[/home/jcgruenhage/dev/github.com/hickory-dns/hickory-dns/crates/proto/src/dnssec/tsig.rs:143:12] record.name() = Name("jolle-mate.dyndns.jcg.re.")
[/home/jcgruenhage/dev/github.com/hickory-dns/hickory-dns/crates/proto/src/dnssec/tsig.rs:143:35] &self.0.signer_name = Name("jolle-mate.dyndns.jcg.re")
[2025-03-02T17:05:15Z ERROR dyndnsd] Failed to update record: Error {
        context: "Failed to replace A record",
        source: Error {
            context: "Failed to delete old record",
            source: Error {
                kind: Proto(
                    ProtoError {
                        kind: Msg(
                            "Tsig key wrong key error",
                        ),
                    },
                ),
            },
        },
    }

Seems you're right in your suspicion.

<!-- gh-comment-id:2692818795 --> @jcgruenhage commented on GitHub (Mar 2, 2025): ``` diff --git a/crates/proto/src/dnssec/tsig.rs b/crates/proto/src/dnssec/tsig.rs index 97ea4ccf..80bec9b8 100644 --- a/crates/proto/src/dnssec/tsig.rs +++ b/crates/proto/src/dnssec/tsig.rs @@ -18,6 +18,7 @@ use alloc::string::ToString; use alloc::sync::Arc; use alloc::vec::Vec; use core::ops::Range; +use std::dbg; use tracing::debug; @@ -139,7 +140,8 @@ impl TSigner { // https://tools.ietf.org/html/rfc8945#section-5.2 // 1. Check key - if record.name() != &self.0.signer_name || tsig.algorithm() != &self.0.algorithm { + if dbg!(record.name()) != dbg!(&self.0.signer_name) || tsig.algorithm() != &self.0.algorithm + { return Err(DnsSecErrorKind::TsigWrongKey.into()); } ``` ``` [/home/jcgruenhage/dev/github.com/hickory-dns/hickory-dns/crates/proto/src/dnssec/tsig.rs:143:12] record.name() = Name("jolle-mate.dyndns.jcg.re.") [/home/jcgruenhage/dev/github.com/hickory-dns/hickory-dns/crates/proto/src/dnssec/tsig.rs:143:35] &self.0.signer_name = Name("jolle-mate.dyndns.jcg.re") [2025-03-02T17:05:15Z ERROR dyndnsd] Failed to update record: Error { context: "Failed to replace A record", source: Error { context: "Failed to delete old record", source: Error { kind: Proto( ProtoError { kind: Msg( "Tsig key wrong key error", ), }, ), }, }, } ``` Seems you're right in your suspicion.
Author
Owner

@jcgruenhage commented on GitHub (Mar 2, 2025):

Ftr: If I add a trailing dot in my config file for the key name, then it starts to just work as expected.

<!-- gh-comment-id:2692899144 --> @jcgruenhage commented on GitHub (Mar 2, 2025): Ftr: If I add a trailing dot in my config file for the key name, then it starts to just work as expected.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/hickory-dns#1068
No description provided.