mirror of
https://github.com/hickory-dns/hickory-dns.git
synced 2026-04-25 11:15:54 +03:00
[GH-ISSUE #265] RFC: Refactor parser to minimize exposure to parsing bugs for unwanted record types #129
Labels
No labels
blocked
breaking-change
bug
bug:critical
bug:tests
cleanup
compliance
compliance
compliance
crate:all
crate:client
crate:native-tls
crate:proto
crate:recursor
crate:resolver
crate:resolver
crate:rustls
crate:server
crate:util
dependencies
docs
duplicate
easy
easy
enhance
enhance
enhance
feature:dns-over-https
feature:dns-over-quic
feature:dns-over-tls
feature:dnsssec
feature:global_lb
feature:mdns
feature:tsig
features:edns
has workaround
ops
perf
platform:WASM
platform:android
platform:fuchsia
platform:linux
platform:macos
platform:windows
pull-request
question
test
tools
tools
trust
unclear
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
starred/hickory-dns#129
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @briansmith on GitHub (Oct 25, 2017).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/265
Consider the implementation of
RData::read()given below.There are two, related, issues with this code:
There is no way to minimize the code size by stripping out unneeded code for unused features. For example, consider a user of -resolver that doesn't enable DNSSEC. The compiler won't be able to drop the DNSSEC-specific parsing (and other processing, but in this issue we're concerned mainly with parsing) logic.
More seriously, let's say there is an exploitable security vulnerability in the MX parsing code, and my application only uses -resolver and only resolves A/AAA/SRV records. For this application, ideally the MX parsing code would not even be reachable since MX records are not relevant to A/AAAA/SRV queries, and so ideally my application wouldn't need a security update. However, with the current design, my application will still parse MX records if they are (maliciously) present in the input, even though they're irrelevant to A/AAAA/SRV lookups.
I propose that we solve both issues in the following way:
RData::read()that takes a structure that looks something like this:Factor out each case of
RData::read()intoAParser,AAAAParser, etc.Change the boolean option that determines whether or not DNSSEC validation is done to instead be a reference to a
Validatorthat either does nothing (NoDNSSEC) or that does DNSSEC validation (DNSSECValidator). The validator will return aRDataParserthat handles the relevant DNSSEC record types (all none forNoDNSSEC, and the full set of parsers, forDNSSECValidator).Change the resolver's
ipv4_lookup()so that it passes something like this to RData::read():That is,
ipv4_lookup()would parseCNAMEandArecords and maybe DNSSEC records depending on whether the user chose to enable validation.ipv6_lookup(), etc. similarly.The effects would be:
If one uses only
NoDNSSEC, then none of the DNSSEC record parsing code (and, later, none of the DNSSEC processing code in general) would get linked into the application, saving code size, and removing any exposure to DNSSEC-specific vulnerabilities.If one uses only
ipv4_lookup()andipv6_lookup()andservice_lookup()then none of the parsing logic for record types irrelevant to `A/AAAA/SRV queries (and, later, none of the processing code for those types) would get linked in, and the application wouldn't be exposed to any vulnerabilities in that dropped code.For reference, here's the
RData::read()code as it is today:@bluejekyll commented on GitHub (Oct 25, 2017):
This is an interesting idea. I worry that the indirection might be confusing, to new people coming to the project, but enough docs should make that ok.
I'm wondering if there is a way to skip the use of Optional, and instead just have different impls of the read method associated to the Parsers. Also, I tend to reserve parser for text based input, and deserializer/reader for binary (not sure what best practice is here in Rust, but I think this would be).
I get the value of reducing the potential attack vectors based on usage, but I worry that it's a little excessive. Should we just use the
--features=dnssec-*as the method for enabling/disabling dnssec support? As to each lookup variant having it's own set of parsers, one challenge will be the generic lookup will potentially be hard to get the set of parsers correct.Question, is it better to harden all the parsers against possible exploit, or should we actual limit the exposure of each request to these individualized sets? This may incur a cost of needing to test more variations of possible results to each request type. As a side note, many DNS servers/resolvers will return additional records in the nameservers and additionals section of the response Message. My basic concern is that the set of parsers would end up only varying by 1 or 2 supported record types.
I'm not against this change, I just worry that it might be costly to guarantee correctness (I may be overly concerned about this, but I'm not sure).
@briansmith commented on GitHub (Oct 26, 2017):
First, I should have mentioned that this idea was motivated by
CVE-2017-15650.
I expect that it won't affect the public interface of -resolver at all, and it probably won't affect the exposed interface of anything other than -proto either.
I don't think the use of
Optionis going to cause any problem. I don't quite understand what you're suggesting, but I'm guessing you mean that we'd have a trait and then different implementations of that trait for each kind of query. I'm not against that but I doubt it would be less source code or less generated code and also I don't see a runtime performance benefit to it, but I'm likely overlooking something.Noted. I'll try to use your terminology.
(Quoted out of order):
I just did that in #267. That idea is not mutually-exclusive with this idea, though I think a good implementation of this idea will the need to do
#[cfg(feature = "dnssec")]mostly unnecessary in most of the code.One of the challenges to maintaining a security-critical library is that people who are hardcore about minimalism and security are always at odds with people who want more features. The idea here is to placate the people who value extreme security and minimalism while still making it easy to maintain and add non-core features.
Also, as is usually the case in security engineering, we take a layered approach, because the whole idea of security engineering is that you don't know where the (security) bugs in your code are, so you have to do things that generally mitigate vulnerabilities.
The generic lookup function could be changed to delegate to the more specific lookup functions, so that it would do the minimally-risky thing for free. And/or, we could just warn people that the general lookup function has maximum exposure to risks related to processing hostile inputs, and recommend they use the more specific functions whenever practical. In any case, we don't have to optimize for the security of the general lookup function first; we can do
lookup_ipv4(),lookup_ipv6(),lookup_service(), etc. first. (More generally, in priority order according to contributors' priorities and resources available.)In the long term, it is better to harden all the parsers because that benefits the most people. However, these two things are not mutually exclusive. Even with hardened parsers, people want minimal exposure to risk.
That's a good point, and that might make this idea moot. My thinking was specifically that I'd like to not think about MX or SOA or other domain-management records when doing A/AAA/SRV lookups.
When I have time, I'll try to make a minimal proof-of-concept PR that does this for
lookup_ipv4()to show the advantages and costs.@bluejekyll commented on GitHub (Oct 26, 2017):
I really do like all the reasoning here. In terms of results coming back, though:
It's definitely possible to remove SRV and SOA from A or AAAA queries, though lookup_ip is dual-stack, and that actually will return A and AAAA, though in multiple requests. One thing that will happen is CNAMEs will potentially (very likely) be returned. Additionally, NS records and SOA records are both useful as references to the actual zone authorities. SOA specifically has default caching information for the zone, which should be useful in the long run.
Basically I think what we'll end up with is a core set of RData types present in all requests, SOA, NS, CNAME, etc... And then the variations would be on the terminal record types, like A, AAAA, SRV, etc. Does that make sense?
@bluejekyll commented on GitHub (Nov 3, 2017):
Ok, as I started working on #280 I started thinking about this more.
This would produce an enum like RData today:
Implement the read() for these variants in the macro as well, which would do the correct match on RecordType etc (need to think about this one more).
pass this into the lookup() methods to restrict the valid set of RDatas parsed during a query as a TypeParameter and then call it through the read method, etc.