[GH-ISSUE #393] dns_sd::tests::test_list_services panics on my machine #178

Closed
opened 2026-03-07 22:40:41 +03:00 by kerem · 8 comments

Originally created by @luser on GitHub (Apr 10, 2018).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/393

I saw your users.rlo post and thought I'd try it out locally, but the test fails for me:

$ cargo test -v --features mdns test_list_services -- --nocapture --ignored
<...>
running 1 test
thread 'dns_sd::tests::test_list_services' panicked at 'There is more than one message in the response, this code path needs to deal with that', proto/src/xfer/dns_response.rs:45:9
note: Run with `RUST_BACKTRACE=1` for a backtrace.
test dns_sd::tests::test_list_services ... FAILED

I have a lot of devices on my home network, so I'm not sure what exactly is causing the problem. If there's any useful info I can provide to help you narrow this down I'm happy to do so.
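
A minimal sketch of the failure mode, using hypothetical types rather than the actual trust-dns code: a response wrapper that assumes exactly one DNS message per request will panic as soon as a second mDNS responder answers the same multicast query.

```rust
// Hypothetical sketch of the failure mode (not the actual trust-dns code):
// a wrapper that assumes exactly one DNS message per request panics when a
// second mDNS responder answers the same multicast query.

#[derive(Debug)]
struct Message {
    answers: Vec<String>, // stand-in for real resource records
}

struct DnsResponse {
    messages: Vec<Message>,
}

impl DnsResponse {
    /// Single-message assumption: fine for unicast DNS, wrong for mDNS.
    fn message(&self) -> &Message {
        assert!(
            self.messages.len() == 1,
            "There is more than one message in the response, this code path needs to deal with that"
        );
        &self.messages[0]
    }

    /// mDNS-friendly accessor: hand back every message that was received.
    fn messages(&self) -> &[Message] {
        &self.messages
    }
}

fn main() {
    // Two devices answering the same multicast query.
    let response = DnsResponse {
        messages: vec![
            Message { answers: vec!["lockbox2._http._tcp.local".into()] },
            Message { answers: vec!["GCDWebServer._http._tcp.local".into()] },
        ],
    };
    for m in response.messages() {
        println!("{:?}", m);
    }
    // Calling response.message() here would panic, mirroring the test failure.
}
```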

@luser commented on GitHub (Apr 10, 2018):

I ran the test again with Wireshark running. There were three devices responding to the query. Here's the dissected view of their responses:

Multicast Domain Name System (response)
    Transaction ID: 0xbc14
    Flags: 0x8400 Standard query response, No error
    Questions: 1
    Answer RRs: 5
    Authority RRs: 0
    Additional RRs: 0
    Queries
        _http._tcp.local: type PTR, class IN, "QM" question
    Answers
        _http._tcp.local: type PTR, class IN, lockbox2._http._tcp.local
        lockbox2._http._tcp.local: type TXT, class IN
        lockbox2._http._tcp.local: type SRV, class IN, priority 0, weight 0, port 5000, target lockbox2.local
        lockbox2.local: type AAAA, class IN, addr fe80::211:32ff:fe6d:aacf
        lockbox2.local: type A, class IN, addr 192.168.1.31

Multicast Domain Name System (response)
    Transaction ID: 0xbc14
    Flags: 0x8400 Standard query response, No error
    Questions: 1
    Answer RRs: 1
    Authority RRs: 0
    Additional RRs: 4
    Queries
        _http._tcp.local: type PTR, class IN, "QM" question
    Answers
        _http._tcp.local: type PTR, class IN, GCDWebServer._http._tcp.local
    Additional records
        GCDWebServer._http._tcp.local: type SRV, class IN, priority 0, weight 0, port 80, target iPhone-4.local
        GCDWebServer._http._tcp.local: type TXT, class IN
        iPhone-4.local: type AAAA, class IN, addr fe80::14e6:cd36:9b2d:cc84
        iPhone-4.local: type A, class IN, addr 192.168.1.180

Multicast Domain Name System (response)
    Transaction ID: 0xbc14
    Flags: 0x8400 Standard query response, No error
    Questions: 1
    Answer RRs: 5
    Authority RRs: 0
    Additional RRs: 0
    Queries
        _http._tcp.local: type PTR, class IN, "QM" question
    Answers
        _http._tcp.local: type PTR, class IN, OctoPrint instance on octopi._http._tcp.local
        OctoPrint instance on octopi._http._tcp.local: type TXT, class IN
        OctoPrint instance on octopi._http._tcp.local: type SRV, class IN, priority 0, weight 0, port 80, target octopi.local
        octopi.local: type AAAA, class IN, addr fe80::ba27:ebff:fee1:22e7
        octopi.local: type A, class IN, addr 192.168.1.194

Those are my Synology NAS, an iPhone, and a Raspberry Pi 3 running OctoPi. If it'd be helpful I'm happy to send you the full packet capture via email.
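
The captures above follow the usual DNS-SD shape: a PTR record names each service instance, SRV and TXT records describe it, and A/AAAA records give addresses for the SRV target. A rough sketch of how a browse result can be assembled from those record types (hypothetical structs, not the trust-dns API):

```rust
use std::collections::HashMap;
use std::net::IpAddr;

// One DNS-SD record, reduced to the fields used here (hypothetical types;
// TXT records are omitted for brevity).
enum Record {
    Ptr { instance: String },
    Srv { instance: String, target: String, port: u16 },
    Addr { host: String, addr: IpAddr },
}

#[derive(Debug, Default)]
struct ServiceInstance {
    target: Option<String>,
    port: Option<u16>,
    addrs: Vec<IpAddr>,
}

fn assemble(records: Vec<Record>) -> HashMap<String, ServiceInstance> {
    let mut instances: HashMap<String, ServiceInstance> = HashMap::new();
    let mut addrs: HashMap<String, Vec<IpAddr>> = HashMap::new();
    for r in records {
        match r {
            Record::Ptr { instance } => {
                instances.entry(instance).or_default();
            }
            Record::Srv { instance, target, port } => {
                let e = instances.entry(instance).or_default();
                e.target = Some(target);
                e.port = Some(port);
            }
            Record::Addr { host, addr } => addrs.entry(host).or_default().push(addr),
        }
    }
    // Join A/AAAA addresses onto instances via the SRV target host name.
    for svc in instances.values_mut() {
        if let Some(t) = &svc.target {
            if let Some(a) = addrs.get(t) {
                svc.addrs = a.clone();
            }
        }
    }
    instances
}

fn main() {
    // Mirrors the first response in the capture above.
    let records = vec![
        Record::Ptr { instance: "lockbox2._http._tcp.local".into() },
        Record::Srv {
            instance: "lockbox2._http._tcp.local".into(),
            target: "lockbox2.local".into(),
            port: 5000,
        },
        Record::Addr { host: "lockbox2.local".into(), addr: "192.168.1.31".parse().unwrap() },
    ];
    println!("{:#?}", assemble(records));
}
```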

@bluejekyll commented on GitHub (Apr 10, 2018):

You hit a spot where I was trying to keep the fact that mDNS returns more than one Message per request from dirtying all the other lookups with multiple Messages. I’m not sure why it panicked, as I thought I had that path handled cleanly, but I’ll take a look when I have a minute. I’ll get some more services responding on my network and see if I can reproduce.

Thanks for the report!
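
One way that separation could look, sketched with hypothetical types (not how trust-dns actually models it): ordinary unicast lookups keep seeing exactly one Message, and only mDNS callers ever see more than one.

```rust
// Sketch of keeping multi-message mDNS out of the unicast code paths
// (hypothetical types, not the trust-dns design).

struct Message; // stand-in for a parsed DNS message

enum Response {
    // Unicast DNS: a request maps to exactly one response message.
    Single(Message),
    // mDNS: every responder on the link may answer the same query.
    Multi(Vec<Message>),
}

impl Response {
    fn messages(&self) -> &[Message] {
        match self {
            Response::Single(m) => std::slice::from_ref(m),
            Response::Multi(ms) => ms.as_slice(),
        }
    }
}

fn main() {
    let mdns = Response::Multi(vec![Message, Message]);
    println!("{} message(s)", mdns.messages().len());
}
```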

@bluejekyll commented on GitHub (Apr 10, 2018):

If you rerun with RUST_BACKTRACE=full, that should give us the backtrace to the actual panic path.

@luser commented on GitHub (Apr 10, 2018):

Here's the output with RUST_BACKTRACE=full: https://gist.github.com/luser/da0cc48090d04010bfc8a72832fa5f4e

@luser commented on GitHub (Apr 11, 2018):

This no longer panics on my machine, thanks! (I don't know if it's accurately presenting all of the results.)

running 1 test
service: lockbox2._http._tcp.local.
service: SRV {
    priority: 0,
    weight: 0,
    port: 5000,
    target: Name {
        is_fqdn: true,
        labels: [
            lockbox2,
            local
        ]
    }
}
info: {
    "serial": Some(
        "1750PDN017500"
    ),
    "version_major": Some(
        "6"
    ),
    "version_build": Some(
        "15266"
    ),
    "version_minor": Some(
        "1"
    ),
    "admin_port": Some(
        "5000"
    ),
    "mac_address": Some(
        "00:11:32:6d:aa:cf|00:11:32:6d:aa:d0"
    ),
    "vendor": Some(
        "Synology"
    ),
    "secure_admin_port": Some(
        "5001"
    ),
    "model": Some(
        "DS918+"
    )
}
ip: fe80::211:32ff:fe6d:aacf
ip: 192.168.1.31
service: OctoPrint\040instance\040on\040octopi._http._tcp.local.
service: SRV {
    priority: 0,
    weight: 0,
    port: 80,
    target: Name {
        is_fqdn: true,
        labels: [
            octopi,
            local
        ]
    }
}
info: {
    "path": Some(
        "/"
    )
}
ip: fe80::ba27:ebff:fee1:22e7
ip: 192.168.1.194
test dns_sd::tests::test_list_services ... ok

@bluejekyll commented on GitHub (Apr 11, 2018):

It’s definitely possible. There are a lot of things that could be going on. If there is another mDNS responder running on the node, we might not be getting everything to the socket where trust-dns is running.

This impl is definitely experimental, so if you want to help debug why we might be dropping information, that would be great.

For the mDNS responder issue we can set a flag in the message to respond directly to the UDP port. That’s described in the RFC and might make responses more reliable at the potential cost of more network traffic.
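
The flag being referred to here is most likely the "unicast-response" (QU) bit from RFC 6762 §18.12: setting the top bit of the qclass field in an mDNS question asks responders to reply directly to the querier's source address and port rather than multicasting to port 5353. A small illustration of the wire encoding (raw helper for illustration only, not the trust-dns API):

```rust
// The "unicast-response" (QU) bit, RFC 6762 §18.12: in an mDNS question the
// top bit of qclass asks responders to reply unicast to the querier's source
// address/port instead of multicasting the answer.

const CLASS_IN: u16 = 0x0001;
const UNICAST_RESPONSE: u16 = 0x8000; // top bit of qclass in an mDNS question

fn question_class(prefer_unicast: bool) -> u16 {
    if prefer_unicast {
        CLASS_IN | UNICAST_RESPONSE
    } else {
        CLASS_IN
    }
}

fn main() {
    // Prints 0x8001: class IN with the QU bit set.
    println!("{:#06x}", question_class(true));
}
```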

@luser commented on GitHub (Apr 11, 2018):

I don't know much about mDNS but I'm handy with Wireshark and have a working knowledge of network and system programming, so I'm happy to provide any info or do any testing that would be helpful!

@bluejekyll commented on GitHub (Apr 11, 2018):

I didn't know much about mDNS when I started this, which also probably means that I got some things wrong :)

In fact, I just noticed a bug while rereading the RFC: https://tools.ietf.org/html/rfc6762#section-19

The RFC says Multicast DNS "ignores the Query ID field (except for generating legacy responses)", but I definitely still check the query_id on all responses.

I wonder if the query-id is not being set in some responses: #395

And this is the bit that might need to be set if there is another active mDNS responder on the same system: https://tools.ietf.org/html/rfc6762#section-18.12

There are lots of fun corner cases here :(
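
On the Query ID point, RFC 6762 §18.1 says the ID should be zero in multicast queries and must be ignored when receiving multicast responses, so matching responses purely on ID can silently drop legitimate mDNS answers. A sketch of the distinction (hypothetical types, not the trust-dns matching code):

```rust
// ID handling per RFC 6762 §18.1 (hypothetical types): multicast responses
// must be matched without looking at the Query ID, since responders are
// required to set it to zero and receivers are required to ignore it.

enum Transport {
    Udp,
    Mdns,
}

fn response_matches(transport: Transport, query_id: u16, response_id: u16) -> bool {
    match transport {
        // Classic unicast DNS: the ID is how responses are paired with queries.
        Transport::Udp => query_id == response_id,
        // mDNS: ignore the ID field entirely.
        Transport::Mdns => true,
    }
}

fn main() {
    assert!(!response_matches(Transport::Udp, 0xbc14, 0x0000));
    assert!(response_matches(Transport::Mdns, 0xbc14, 0x0000));
    println!("ok");
}
```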
