[GH-ISSUE #2277] dns-test: spurious test failure tshark::tests::nameserver #949

Closed
opened 2026-03-16 01:03:56 +03:00 by kerem · 4 comments
Owner

Originally created by @japaric on GitHub (Jul 3, 2024).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2277

Describe the bug
what the title says

To Reproduce

  1. cd conformance && cargo t -p dns-test
  2. repeat (1) until you see the failure

Expected behavior
the test should always work

System:

  • OS: Ubuntu
  • Architecture: x86_64
  • Version 22.04
  • rustc version: 1.79

Version:
Crate: dns-test
Version: c340da8e09

Additional context
This assertion failed.

Most likely, tshark printed "new packets 1" and then immediately printed "new packets 2"; wait_for_capture exits on the first line match so it didn't see the second stdout line.

Originally created by @japaric on GitHub (Jul 3, 2024). Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2277 **Describe the bug** what the title says **To Reproduce** 1. `cd conformance && cargo t -p dns-test` 2. repeat (1) until you see the failure **Expected behavior** the test should always work **System:** - OS: Ubuntu - Architecture: x86_64 - Version 22.04 - rustc version: 1.79 **Version:** Crate: dns-test Version: c340da8e09482b849a00d8a59f6eb5b2c63b1171 **Additional context** [This assertion](https://github.com/hickory-dns/hickory-dns/blob/c340da8e09482b849a00d8a59f6eb5b2c63b1171/conformance/packages/dns-test/src/tshark.rs#L289) failed. Most likely, `tshark` printed "new packets 1" and then immediately printed "new packets 2"; wait_for_capture exits on the first line match so it didn't see the second stdout line.
kerem 2026-03-16 01:03:56 +03:00
  • closed this issue
  • added the
    bug:tests
    label
Author
Owner

@japaric commented on GitHub (Jul 3, 2024):

@justahero I think you looked into this before and I think you had the highest "success rate" in seeing the failure. I very rarely see it myself.

maybe we can change the assertion to assert_ne!(0, capture)? I think that even when wait_for_capture eagerly reports 1 instead of 2, terminate always returns 2 packets because it sends a signal to tshark and then waits until tshark closes its stdout / stderr so that gives it chance to flush all the captured records. IOW, wait_for_capture's return value should be treated as a "hint" that indicates that there's at least N packets to be read.

want to try that? @justahero

<!-- gh-comment-id:2205938964 --> @japaric commented on GitHub (Jul 3, 2024): @justahero I think you looked into this before and I think you had the highest "success rate" in seeing the failure. I very rarely see it myself. maybe we can change the assertion to `assert_ne!(0, capture)`? I think that even when `wait_for_capture` eagerly reports `1` instead of `2`, `terminate` always returns 2 packets because it sends a signal to `tshark` and then waits until `tshark` closes its stdout / stderr so that gives it chance to flush all the captured records. IOW, `wait_for_capture`'s return value should be treated as a "hint" that indicates that there's at least N packets to be read. want to try that? @justahero
Author
Owner

@justahero commented on GitHub (Jul 3, 2024):

Using the assert_ne!(0, capture); assertion should work. On my Macbook I run into the test failure quite frequently, ~about 20-25% of the time. Indeed tshark reports either .. new packets 2 or split that into two separate lines with .. new packets 1. But it's also the only test that fails for me.

I tried to find a valid solution for this issue earlier this week by detecting all lines & summing up the found packets, but it turned out to be a more complex issue. For one, there is no good way to detect all these new packets lines correctly in the Tshark::wait_for_capture method. The method returns at the first expected occurrence of : new packets. The output of stdout is consumed, checking for more lines here means the method blocks indefinitely.

The other approach was to merge both wait_for_capture & terminate methods into a single one, to have a single location to get lines from self.stdout, but this did not work either. It kind of worked for the failing test, but introduced other issues, that caused other tests to fail. In particular waiting for tshark to log the captured messages became indeterministic. Even when increasing the sleep value to 5 in the kill command it did not fix the issue, tshark would log an incomplete set of captures to the capture file in eavesdrop, therefore causing some tests to fail.

<!-- gh-comment-id:2205968120 --> @justahero commented on GitHub (Jul 3, 2024): Using the `assert_ne!(0, capture);` assertion should work. On my Macbook I run into the test failure quite frequently, ~about 20-25% of the time. Indeed `tshark` reports either `.. new packets 2` or split that into two separate lines with `.. new packets 1`. But it's also the only test that fails for me. I tried to find a valid solution for this issue earlier this week by detecting all lines & summing up the found packets, but it turned out to be a more complex issue. For one, there is no good way to detect all these `new packets` lines correctly in the [`Tshark::wait_for_capture`](https://github.com/hickory-dns/hickory-dns/blob/main/conformance/packages/dns-test/src/tshark.rs#L76-L88) method. The method returns at the first expected occurrence of `: new packets`. The output of `stdout` is consumed, checking for more lines here means the method blocks indefinitely. The other approach was to merge both `wait_for_capture` & `terminate` methods into a single one, to have a single location to get lines from `self.stdout`, but this did not work either. It kind of worked for the failing test, but introduced other issues, that caused other tests to fail. In particular waiting for `tshark` to log the captured messages became indeterministic. Even when increasing the sleep value to 5 in the [`kill` command](https://github.com/hickory-dns/hickory-dns/blob/main/conformance/packages/dns-test/src/tshark.rs#L92) it did not fix the issue, `tshark` would log an incomplete set of captures to the capture file in [`eavesdrop`](https://github.com/hickory-dns/hickory-dns/blob/main/conformance/packages/dns-test/src/tshark.rs#L30-L34), therefore causing some tests to fail.
Author
Owner

@japaric commented on GitHub (Jul 3, 2024):

Indeed tshark reports either .. new packets 2 or split that into two separate lines with .. new packets 1.

another thing to try could be to replace the assertion for something like

let mut captured = 0;
while captured <= 2 {
    captured += tshark.wait_for_capture()?;
}
assert_eq!(2, captured); // not sure if still necessary

the loop should exit in both the one "new packets 2" case and the two "new packets 1" case, no?


we have seen the failure in this repo CI for the first time so it would be good to fix it.

https://github.com/hickory-dns/hickory-dns/actions/runs/9777369448/job/26991905906#step:5:2050

<!-- gh-comment-id:2205990001 --> @japaric commented on GitHub (Jul 3, 2024): > Indeed tshark reports either .. new packets 2 or split that into two separate lines with .. new packets 1. another thing to try could be to replace the assertion for something like ``` rust let mut captured = 0; while captured <= 2 { captured += tshark.wait_for_capture()?; } assert_eq!(2, captured); // not sure if still necessary ``` the loop should exit in both the one "new packets 2" case and the two "new packets 1" case, no? --- we have seen the failure in this repo CI for the first time so it would be good to fix it. https://github.com/hickory-dns/hickory-dns/actions/runs/9777369448/job/26991905906#step:5:2050
Author
Owner

@justahero commented on GitHub (Jul 3, 2024):

I think the while loop is a better solution as this still requires the number of captures to match an expected value. Maybe a new method added to Tshark that expects X number of packets is useful, e.g. tshark.wait_for_num_packets.

<!-- gh-comment-id:2206015535 --> @justahero commented on GitHub (Jul 3, 2024): I think the `while` loop is a better solution as this still requires the number of captures to match an expected value. Maybe a new method added to `Tshark` that expects `X` number of packets is useful, e.g. `tshark.wait_for_num_packets`.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/hickory-dns#949
No description provided.