mirror of
https://github.com/hickory-dns/hickory-dns.git
synced 2026-04-25 03:05:51 +03:00
[GH-ISSUE #528] Problems with max_ttl on embedded systems #520
Labels
No labels
blocked
breaking-change
bug
bug:critical
bug:tests
cleanup
compliance
compliance
compliance
crate:all
crate:client
crate:native-tls
crate:proto
crate:recursor
crate:resolver
crate:resolver
crate:rustls
crate:server
crate:util
dependencies
docs
duplicate
easy
easy
enhance
enhance
enhance
feature:dns-over-https
feature:dns-over-quic
feature:dns-over-tls
feature:dnsssec
feature:global_lb
feature:mdns
feature:tsig
features:edns
has workaround
ops
perf
platform:WASM
platform:android
platform:fuchsia
platform:linux
platform:macos
platform:windows
pull-request
question
test
tools
tools
trust
unclear
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
starred/hickory-dns#520
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @kingoflolz on GitHub (Jul 12, 2018).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/528
Hi, I'm using trust-dns-resolver on some openwrt systems and found that version 0.9 crashes because it overflows time structs in
new_with_max_ttl, as the time type defines seconds as an i32.I've worked around this by making MAX_TTL lower (to one year), and from reading the RFC 2181 it seems that it is standard conforming behavior " Implementations are always free to place an upper bound on any TTL received, and treat any larger values as if they were that upper bound".
However, I'm not familiar enough with the codebase to know what happens if trust-dns-resolver stays running for over a year, and if this will break any assumptions the rest of the code has in the codebase. If there are no problems with using a shorter max TTL, I suggest it be the default as the one we have right now makes it unusable on devices with a shorter time type
@bluejekyll commented on GitHub (Jul 12, 2018):
Thanks for the report. I'd love to hear more sometime about your use case.
This seems to be the relevant section of the RFC: https://tools.ietf.org/html/rfc2181#section-8
You probably want to address this section of code: https://github.com/bluejekyll/trust-dns/blob/master/proto/src/xfer/dns_multiplexer.rs#L333-L334
I'm also ok with reducing the TTL to a year, that shouldn't be a problem. Could you make sure to put a link to the RFC and a note about why the TTL is being set to what it will be? I'd love a PR on this, thanks!
I did a quick scan of the code, the only other spot (this is for the server/client which it looks like you aren't using so you don't need to address it) would be: https://github.com/bluejekyll/trust-dns/blob/master/client/src/serialize/txt/master.rs#L274
@azdlowry commented on GitHub (Jul 31, 2018):
Seem to be having this problem on 32bit Android as well. It's hard to narrow down the exact area of code that has the issue as backtraces don't work on Android. But we get a panic with any of the resolvers:
thread 'service_location::test::run_trust_dns_example' panicked at 'overflow when adding duration to time', libcore/option.rs:960:5
@azdlowry commented on GitHub (Jul 31, 2018):
I've reduced MAX_TTL to 2^28-1 and it no longer panics. Though I don't think this is the best solution.
@bluejekyll commented on GitHub (Jul 31, 2018):
We could create a new TTL type that expresses maximum bounds with conversions from i32 as necessary, validating and storing a maximum based on platform.
@azdlowry commented on GitHub (Jul 31, 2018):
Only glanced through the code, so this might not be correct, but it seems the issue is with the use Instant type as this wraps a libc time_t structure, which is i32 on 32bit linux and i64 on 64bit linux. Should trust-dns be using a different time structure that is consistent across platforms?
@bluejekyll commented on GitHub (Aug 2, 2018):
So one issue I have is access to a testing platform to verify a fix for this. The library relies on 'std::time::Duration' for all TTL and time evaluation, I don’t really want to change that, but we can add restrictions on top of that.
I’m fine with forcing these values to be safe for 32bit platforms. I don’t see any need for longer TTLs than that.
I propose a new TTL type that would enforce these limits and make it typesafe, but I’m open to other ideas.
@azdlowry commented on GitHub (Aug 2, 2018):
I'm happy to make the changes, and I can look into adding 32bit linux to CI as well.
My only concern with limiting the TTL is in picking a duration that is sensible. We can pick 19 years today but we'll be back here with the same problem in January 2019. If we say 10 years we'll be good till year 2028. If 10 years is good enough on 32 bit systems why is it not good enough on 64 bit systems as well? I think I don't understand the reason for the long TTL enough to be able to pick sensible ones.
I'll also have a look into replacing the Instant type with something that is consistent across platforms, then we don't need to change the TTL at all.
@bluejekyll commented on GitHub (Aug 4, 2018):
If it sounded like I had a differing opinion on this, I apologize. I agree that it should be the same limit on 64bit and 32bit. TTL is really just a caching hint, too short is better than too long. I'm happy with anything over a day. The only reason the current values are there is that they seemed to be in accordance with the RFCs. The issue here is that I've incorrectly added the MAX_TTL value, to an existing time. We should be safe to just drop the MAX_TTL to 86400.
Instant and Duration don't appear to have a saturating_add, which would really be the correct addition to use here...
@bluejekyll commented on GitHub (Aug 5, 2018):
#542 and #543 are partial fixes for this issue. I would like us to come up with a type-safe response, but they will at least fix the current issue.
@bluejekyll commented on GitHub (Aug 5, 2018):
This
std::timeissue appears related: https://github.com/rust-lang/rust/issues/48980@bluejekyll commented on GitHub (Aug 5, 2018):
I'm going to close this, we can open a different issue we want to harden the TTL max with a type, but I think we might want to get saturating add into the std::time impls.
@azdlowry commented on GitHub (Aug 6, 2018):
No apologies needed, I didn't want to dive into any changes without understanding all the details first. Saturating add does seem to be the best solution. Thanks.