mirror of
https://github.com/hickory-dns/hickory-dns.git
synced 2026-04-25 03:05:51 +03:00
[GH-ISSUE #2066] No default root certs when using rustls #868
Labels
No labels
blocked
breaking-change
bug
bug:critical
bug:tests
cleanup
compliance
compliance
compliance
crate:all
crate:client
crate:native-tls
crate:proto
crate:recursor
crate:resolver
crate:resolver
crate:rustls
crate:server
crate:util
dependencies
docs
duplicate
easy
easy
enhance
enhance
enhance
feature:dns-over-https
feature:dns-over-quic
feature:dns-over-tls
feature:dnsssec
feature:global_lb
feature:mdns
feature:tsig
features:edns
has workaround
ops
perf
platform:WASM
platform:android
platform:fuchsia
platform:linux
platform:macos
platform:windows
pull-request
question
test
tools
tools
trust
unclear
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
starred/hickory-dns#868
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @simonsmd on GitHub (Oct 15, 2023).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2066
Is your feature request related to a problem? Please describe.
The
dns-over-rustlsanddns-over-https-rustlsfeatures of the hickory-resolver crate do not enable thenative-certsorwebpki-rootsfeature by default. If the user does not enable one of these features manually, no root certificates are added to the default rustls configuration and connecting to a DoT or DoH server fails because the server certificate cannot be validated.Describe the solution you'd like
Ideally the
webpki-rootsfeature would be enabled by default ifdns-over-rustlsordns-over-https-rustlsis enabled.However, as far as i can tell the
webpki-rootsfeature overwrites thenative-certsfeature which means it would not be possible to usenative-certsanymore. This can be solved either by removing thewebpki-rootsfeature and always enabling the code that adds the webpki-roots certificates unlessnative-certsis enabled or making thenative-certsfeature overwrite thewebpki-rootsfeature.Describe alternatives you've considered
If the above solution is not possible/desired, the requirement to enable one of the two features when using DoT or DoH should at least be documented.
Additional context
The missing root certificates lead to one of the following two errors (in my testing):
ResolveError { kind: Proto(ProtoError { kind: Io(Kind(ConnectionRefused)) }) }ResolveError { kind: Proto(ProtoError { kind: Io(Os { code: 101, kind: NetworkUnreachable, message: "Network is unreachable" }) }) }These errors do not mention the real cause of the issue which makes debugging this problem challenging. The failed server certificate validation is logged if debug logging is enabled but I think it would be better if the error would mention the failed certificate validation. I am not familiar with the code base though so I do not know how hard it is to implement this.
@bluejekyll commented on GitHub (Oct 15, 2023):
@daxpedda, do you have any thoughts on this?
@bluejekyll commented on GitHub (Oct 15, 2023):
btw, I think the
NetworkUnreachablemessage is an IPv6 issue, and not necessarily related.@daxpedda commented on GitHub (Oct 16, 2023):
The problem with adding either
webpki-rootsornative-certstodns-over-rustlsis that it forces the dependency on users which might not want to use it.The alternative of course is to add it to the default feature set, but that is weird because the default feature set doesn't even include
dns-over-rustls/quic/http/h3to begin with. This might be the best way forward though.Whatever the decision, I believe this issue will be common among consumers, so it should be documented appropriately.
I think it was mentioned before that the goal is to have native root certs by default.
@djc commented on GitHub (Oct 16, 2023):
IIRC most libraries pick a default via the default feature, allowing downstream users to work with the defaults while also allowing them to disable the defaults and enable the other option. This is friendlier especially for inexperienced users.
@bluejekyll commented on GitHub (Oct 16, 2023):
I think we need to expose the choice as part of the API (and the feature only adds/removes the dependency). This would make the choice explicit in the code and not change behavior unexpectedly as feature enablement is additive.
@cpu commented on GitHub (Nov 3, 2023):
I bumped into this updating some code from v0.23.2 to latest.
On 0.23.2 this code works without panic when run with
--features dns-over-https-rustlsalone:However, switching to 0.24+ (and updating the
trust_dns_xxximports tohickory_) and running with just--features dns-over-https-rustlsproduces an unclear error fromlookupAs described in this issue enabling the
webpki-rootsfeature and running with--features dns-over-https-rustls,webpki-rootsresolves the problem.I think the main point of confusion for me is why the error is being reported as a invalid data instead of something about being unable to verify the presented certificate chain to a known trust anchor. If the error indicated that I think it would be a lot easier to debug. Unlike the OP I don't see a connection refused or IPv6 unreachable error.
@XOR-op commented on GitHub (Nov 17, 2023):
Struggled with this bug for a few hours today. I suggest we should add a bug label for this issue and/or add notes to the changelog, since this behavior is a breaking change (requiring users to manually enable features if they use rustls before) and not well documented.
@cpu commented on GitHub (Dec 11, 2023):
I think there's two issues here:
webpki-rootsdefault being removed)Resolving 1 makes sense and would fix the breakage users are seeing, but I suspect 2 will cause other issues down the road (like if the DNS server's certificate happens to be invalid). I started trying to figure out where the error was being swallowed/remapped ~last week, but didn't arrive at a conclusion before I ran out of time.
As a quick fix it seems like it might be worthwhile to have the lazy
CLIENT_CONFIGconstructor inresolver/src/tls/dns_over_rustls.rserror ifroot_store.is_empty()after evaluating the feature flags fornative-certsandwebpki-rootsby surfacing an error suggesting using one of the two feature flags. That will probably cut down on some confusion in the meantime.I hope I can find time to look deeper but it won't be for a little while.
@cpu commented on GitHub (Apr 13, 2024):
Here's a PR that attempts to fix this aspect. This has come up for a few people now (both in Discord and in other issues) so it seems worth fixing ASAP: https://github.com/hickory-dns/hickory-dns/pull/2179
I went down this rabbit hole again today and haven't found a conclusive answer. In my testing upstream we get a nice
io::Errorlike err:Custom { kind: InvalidData, error: InvalidCertificate(UnknownIssuer) }frrom a tokio rustls connection made with a config without roots to a remote server as expected. I can't figure out where or why, but something in the Hickory DNS stack is reducing this down to just a simple io error with type InvalidData - no nested source error or message.@djc commented on GitHub (Apr 13, 2024):
I remember digging into this, IIRC there's an issue where hickory tries to reconstruct an io::Error and there's no way to extract the existing custom error kind or something like that. Probably in one of the error conversions in one of the error.rs modules.
@cpu commented on GitHub (Apr 13, 2024):
Good pointer, thanks! It's the
Cloneimpl forProtoErrorKind, specifically theelseclause here:github.com/hickory-dns/hickory-dns@6c2a1e2c23/crates/proto/src/error.rs (L644-L648)That creates a "simple representation of error kind" and so boils our upstream error into just
InvalidData.@cpu commented on GitHub (Apr 13, 2024):
By my read we have the original error with full info as
efrom theIo(ref e)binding, but it isn'tCloneso some contortions are needed. We can't useeor the inner error fromget_ref()withio::Error::new(e.kind(), xxx)to make our own new error because borrowed data would escape the method.I think the usual fix would be to have
Io(io::Error)beIo(Rc<io::Error>)orIo(Arc<io::Error>)instead right? 🤔 That's a breaking change, but I'm not sure I spot another fix and it seems fairly important to keep the broader error context.@djc commented on GitHub (Apr 13, 2024):
Breaking change is fine, we're going to release a breaking change anyway once Quinn 0.11 is published.
@cpu commented on GitHub (Apr 13, 2024):
Sounds good. Here's a fix: https://github.com/hickory-dns/hickory-dns/pull/2181
I think it might be sensible to consider landing it & https://github.com/hickory-dns/hickory-dns/pull/2179 both, but with the
InvalidCertificate(UnknownIssuer)error coming through correctly #2179 seems less pressing. It's a lot easier to arrive at the right fix when you know the underlying error.