mirror of
https://github.com/hickory-dns/hickory-dns.git
synced 2026-04-25 03:05:51 +03:00
[GH-ISSUE #351] server panics when responding to AXFR on zone with absurd number of records #458
Originally created by @iliana on GitHub (Feb 26, 2018).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/351
On 826f2c4195 I am able to panic named by loading a zone with ~10000 AAAA records and requesting an AXFR on that zone.
The panic is from the first assert here: github.com/bluejekyll/trust-dns@826f2c4195/proto/src/serialize/binary/encoder.rs (L145-L152)
The current release version does not panic, but fails to serialize any records past a certain point (~650 AAAA records in size, it appears).
Log output and backtrace
@bluejekyll commented on GitHub (Feb 26, 2018):
Thank you for the report. Out of curiosity, is this a stress test or is this a use case you want to support?
We definitely need some limits here, and need to break the axfr into multiple responses.
@iliana commented on GitHub (Feb 26, 2018):
Stress test, although I'm primarily playing with internals (I have some code that serializes an entire zone using BinEncoder and decided to test it with a very large number of records). I figured if I'm seeing a panic there, I'd see a panic in the AXFR handler...
@bluejekyll commented on GitHub (Feb 26, 2018):
This doesn’t surprise me. I’ve been intending to revisit AXFR for a little bit now, mainly to figure out a good auth option for it.
What you’ve uncovered is a naive implementation where the entire zone is crammed into a single response. What needs to happen is that the records need to be broken up into multiple responses. This may require a bit of refactoring work to be efficient.
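A minimal sketch of what "broken up into multiple responses" could look like, assuming records are already serialized to bytes. `EncodedRecord`, `split_into_messages`, and the constants are hypothetical names for illustration, not part of the crate, and the sketch ignores the question section, name compression, and the AXFR rule that the transfer starts and ends with the SOA record.

```rust
/// Hypothetical stand-in for one already-serialized resource record.
#[derive(Debug, Clone, PartialEq)]
pub struct EncodedRecord(pub Vec<u8>);

/// Largest message that fits behind the two-byte TCP length prefix.
pub const MAX_TCP_MESSAGE_LEN: usize = u16::MAX as usize;
/// Size of the fixed DNS message header in bytes.
pub const HEADER_LEN: usize = 12;

/// Group records into batches so that each batch, plus one header,
/// stays within `max_len` bytes. Each batch would become one response.
pub fn split_into_messages(
    records: &[EncodedRecord],
    max_len: usize,
) -> Result<Vec<Vec<EncodedRecord>>, String> {
    let mut batches = Vec::new();
    let mut current = Vec::new();
    let mut used = HEADER_LEN;

    for rec in records {
        let len = rec.0.len();
        // A record that cannot fit even in an empty message is an error,
        // not something to silently drop.
        if HEADER_LEN + len > max_len {
            return Err(format!(
                "record of {} bytes cannot fit in a {}-byte message",
                len, max_len
            ));
        }
        // Start a new message when the current one would overflow.
        if used + len > max_len {
            batches.push(std::mem::take(&mut current));
            used = HEADER_LEN;
        }
        used += len;
        current.push(rec.clone());
    }
    if !current.is_empty() {
        batches.push(current);
    }
    Ok(batches)
}
```

With 10000 records of 40 bytes each (an assumed size, chosen only for the arithmetic), this yields 7 messages instead of one oversized one.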
Also, with zones of that size there are some other issues. Currently the entire zone is cached in memory. This will pose an issue at some point. I’ve been thinking of playing around with memory mapping the zone files and putting an MRU read-through cache in front of the file. I think I have some issues filed for these.
Getting back to your specific issue: Messages have a bounded length of u16::max_value; this limit comes from the DNS over TCP spec, where each message is preceded by a two-byte length field. There should be an earlier error, or we should convert these to errors, so that additional records can’t be serialized into the message.
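To illustrate the length bound, here is a small sketch of TCP framing that returns an error for an oversized message instead of truncating it. `write_framed` is a hypothetical helper written for this example, not the crate's `TcpStream` code; only the two-byte big-endian length prefix comes from the DNS over TCP format.

```rust
use std::io::{self, Write};

/// Write `message` preceded by its length as a big-endian u16.
/// Nothing is written if the message is too long for the prefix.
pub fn write_framed<W: Write>(out: &mut W, message: &[u8]) -> io::Result<()> {
    if message.len() > u16::MAX as usize {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "DNS message exceeds 65535 bytes",
        ));
    }
    // The cast is lossless because of the check above.
    let len = message.len() as u16;
    out.write_all(&len.to_be_bytes())?;
    out.write_all(message)
}
```

A caller that gets `InvalidInput` back knows the message has to be split or truncated deliberately, rather than discovering a cut-off response on the wire.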
@bluejekyll commented on GitHub (Feb 26, 2018):
Grr... I thought I had some logic to enforce the size of the serialized stream, but it doesn't look that way. There are a few places that should be checking the size that currently aren't. The assertion that's currently panicking was written with the assumption that something else was enforcing the maximum length of the buffer.
TcpStream will blindly truncate the message: https://github.com/bluejekyll/trust-dns/blob/master/proto/src/tcp/tcp_stream.rs#L311-L314
In the encoder, none of the emit functions are currently guarded to enforce a size no larger than u16::max_value:
https://github.com/bluejekyll/trust-dns/blob/master/proto/src/serialize/binary/encoder.rs#L122
When fixing the encoder, we should make its enforcement variable, based on, say, EDNS max length options. Each emit method (perhaps a macro for this?) should attempt to write, and on failure, revert the write, and return an Error (something cheap and recoverable).
This can be used to also better truncate response records, rather than the very aggressive method available now.
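The idea of a size-guarded emit with rollback can be sketched roughly as follows. `BoundedEncoder`, `MaxSizeExceeded`, and `emit_all_or_nothing` are hypothetical names invented for this example and do not reflect the actual `BinEncoder` API; the limit is passed in by the caller, so it could come from an EDNS payload size or from the TCP bound.

```rust
/// Cheap, recoverable error: no allocation, just the two sizes.
#[derive(Debug, PartialEq)]
pub struct MaxSizeExceeded {
    pub max_size: usize,
    pub attempted: usize,
}

/// Hypothetical encoder whose buffer never grows past `max_size`.
pub struct BoundedEncoder {
    buf: Vec<u8>,
    max_size: usize,
}

impl BoundedEncoder {
    pub fn new(max_size: usize) -> Self {
        BoundedEncoder { buf: Vec::new(), max_size }
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.buf
    }

    /// Append raw bytes, refusing the write if it would exceed the limit.
    pub fn emit(&mut self, bytes: &[u8]) -> Result<(), MaxSizeExceeded> {
        let attempted = self.buf.len() + bytes.len();
        if attempted > self.max_size {
            return Err(MaxSizeExceeded { max_size: self.max_size, attempted });
        }
        self.buf.extend_from_slice(bytes);
        Ok(())
    }

    /// Append a u16 in network byte order.
    pub fn emit_u16(&mut self, value: u16) -> Result<(), MaxSizeExceeded> {
        self.emit(&value.to_be_bytes())
    }

    /// Run a multi-step write; if any step fails, truncate the buffer
    /// back to where it was so no partial record is left behind.
    pub fn emit_all_or_nothing<F>(&mut self, f: F) -> Result<(), MaxSizeExceeded>
    where
        F: FnOnce(&mut Self) -> Result<(), MaxSizeExceeded>,
    {
        let start = self.buf.len();
        let result = f(self);
        if result.is_err() {
            self.buf.truncate(start);
        }
        result
    }
}
```

Because a failed record leaves the buffer exactly as it was, a response builder could keep adding records until the first error and then either set the truncation flag or start the next message.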