mirror of
https://github.com/hickory-dns/hickory-dns.git
synced 2026-04-25 11:15:54 +03:00
[GH-ISSUE #2261] Concurrent access to the DNS cache for better performance #942
Originally created by @telbizov on GitHub (Jun 26, 2024).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2261
Hello everyone,
I am building a high-performance asynchronous forward proxy with tokio. I started using hickory resolver for the DNS resolution part and so far I like it a lot. For example, I really appreciate the fact that I can use my own RuntimeProvider to allocate TCP and UDP sockets because of an edge case I have.
I run multiple `current_thread` tokio runtimes (one per core) in dedicated worker threads that I launch myself. Then I have a lazy static for the `AsyncResolver`, which is shared across all threads. The default mechanics work really well in my case. Every thread that calls `lookup_ip()` gets to run the async future on its own local runtime, thus distributing the workload of the lookups themselves well across workers. The result of each lookup is then shared in the common `DnsLru` cache behind the global `AsyncResolver` variable, which is exactly what I am after -- a shared DNS cache.

The issue that I am facing is that caching is implemented via the `DnsLru` type, which uses `cache: Arc<Mutex<LruCache<Query, LruValue>>>` for the storage of the data. Furthermore, any access to any value within that data structure locks the mutex over the entire `LruCache`. This includes both writes and reads, which essentially serializes accesses and causes a significant performance drop -- around 35% in my experiments.

It seems to me that hickory's performance could benefit greatly if it used a structure that is designed for concurrent access. A quick search for a concurrent `HashMap` turned up DashMap and `scc::HashMap`. I tested their performance in a concurrent read-and-write scenario and compared them with an `Arc<Mutex<HashMap>>` that locks/unlocks on every read/write operation (similar to what `DnsLru` seems to be doing); with 16 threads the differences were significant: `scc::HashMap` is 18x faster in the write-only test and 30x faster in the read-only test.

What I was hoping for is one of two things:

1. Replacing the `Arc<Mutex<LruCache<...>>>` with a concurrent analog that provides better performance characteristics, or better yet ...
2. ... a `Trait` that allows users of the library to supply their own caching type. This would be the most generic way to solve the problem, and it might prove useful in other cases as well -- for example, using Redis as a backend to store the cache.

Any chance you would consider abstracting the caching in a trait?
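To make the second option concrete, a pluggable cache abstraction could look roughly like the sketch below. All names (`DnsCache`, `MutexCache`) are hypothetical, not hickory-dns API; the real `Query` and record types are stubbed with `String` and `Vec<IpAddr>` for brevity, and the default backend mirrors today's mutex-around-a-map shape:

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::sync::Mutex;
use std::time::Instant;

// Hypothetical trait a resolver could accept; any backend (concurrent
// map, Redis, etc.) would implement it. Interior mutability lets callers
// share it behind an Arc without an outer lock.
pub trait DnsCache: Send + Sync {
    fn get(&self, query: &str, now: Instant) -> Option<Vec<IpAddr>>;
    fn insert(&self, query: String, ips: Vec<IpAddr>, valid_until: Instant);
}

// Default backend mirroring the current Arc<Mutex<...>> design.
pub struct MutexCache {
    inner: Mutex<HashMap<String, (Vec<IpAddr>, Instant)>>,
}

impl MutexCache {
    pub fn new() -> Self {
        Self { inner: Mutex::new(HashMap::new()) }
    }
}

impl DnsCache for MutexCache {
    fn get(&self, query: &str, now: Instant) -> Option<Vec<IpAddr>> {
        let mut map = self.inner.lock().unwrap();
        match map.get(query) {
            Some((ips, valid_until)) if *valid_until > now => Some(ips.clone()),
            Some(_) => {
                // TTL expired: drop the stale entry and report a miss.
                map.remove(query);
                None
            }
            None => None,
        }
    }

    fn insert(&self, query: String, ips: Vec<IpAddr>, valid_until: Instant) {
        self.inner.lock().unwrap().insert(query, (ips, valid_until));
    }
}
```

A resolver generic over `C: DnsCache` (or holding a `dyn DnsCache`) would then let users swap in a concurrent or external backend without touching lookup logic.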
Thank you,
Rumen Telbizov
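To illustrate why the concurrent maps mentioned above beat a single global mutex, here is a toy, std-only sharded map (purely illustrative; DashMap and `scc::HashMap` use far more sophisticated fine-grained and lock-free designs). Threads whose keys hash to different shards never contend with each other:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

// Toy sharded map: N independent mutexes instead of one global lock.
pub struct ShardedMap<V> {
    shards: Vec<Mutex<HashMap<String, V>>>,
}

impl<V: Clone> ShardedMap<V> {
    pub fn new(num_shards: usize) -> Self {
        Self {
            shards: (0..num_shards).map(|_| Mutex::new(HashMap::new())).collect(),
        }
    }

    // Pick a shard by hashing the key, so a given key always maps
    // to the same mutex.
    fn shard(&self, key: &str) -> &Mutex<HashMap<String, V>> {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        &self.shards[(h.finish() as usize) % self.shards.len()]
    }

    pub fn insert(&self, key: String, value: V) {
        self.shard(&key).lock().unwrap().insert(key, value);
    }

    pub fn get(&self, key: &str) -> Option<V> {
        // Only one shard is locked per operation; readers and writers
        // on other shards proceed in parallel.
        self.shard(key).lock().unwrap().get(key).cloned()
    }
}
```

With 16 threads hammering a single `Mutex<HashMap>`, every operation serializes; with sharding, contention drops roughly in proportion to the shard count, which is the effect behind the 18x/30x gap reported above.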
@bdaehlie commented on GitHub (Jun 27, 2024):
Can both be done? Abstracting the caching in a trait and also improving the default to be something concurrent?
@telbizov commented on GitHub (Jun 27, 2024):
Sounds good to me!
@djc commented on GitHub (Jun 27, 2024):
Yup, both of these seem like reasonable ideas. I won't be able to implement these but happy to review a short design proposal and/or a PR implementing these.
@telbizov commented on GitHub (Jun 27, 2024):
Thanks for the feedback @djc. I'll consider spending some time on this next week and will let you know.
@telbizov commented on GitHub (Jul 1, 2024):
@djc et al. I'll try to take a shot at tackling this issue. I'll keep you posted in the coming days and weeks and will hopefully present a PR for your consideration. I might reach out to you on discourse to speed up the back-and-forth in the beginning.
@divergentdave commented on GitHub (Nov 13, 2024):
Currently, there are two sources of cache mutations when we read from the `DnsLru` cache. At the application level, we may delete an entry after attempting to read it if its TTL has expired. Inside `lru-cache`, the cache moves/refreshes entries in the `linked-hash-map` in order to maintain the LRU order. Since almost all `lru-cache` methods require mutable references, we unfortunately can't just swap our Tokio `Mutex` for an `RwLock` for a quick win.

It looks like the `moka` library would work well for the resolver cache. It can handle both eviction, using a more sophisticated policy than the current LRU policy, and per-element expiry times, pruning entries past their TTL before we attempt to look them up again. The biggest downsides seem to be code size/dependency count and the lack of WASM support, but we could polyfill with something like the existing implementation if need be. The cache returns clones of values, but that's fine because we were copying already in order to rewrite TTLs.
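The point that even reads are mutations in an LRU can be seen in a minimal sketch (hypothetical; not the `lru-cache` crate, which backs its order with a linked hash map rather than this naive `VecDeque`). A cache *hit* must move the key to the most-recently-used position, so `get` takes `&mut self` and a shared `RwLock` read guard cannot call it:

```rust
use std::collections::{HashMap, VecDeque};

pub struct TinyLru<V> {
    capacity: usize,
    map: HashMap<String, V>,
    order: VecDeque<String>, // front = least recently used
}

impl<V> TinyLru<V> {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    // Note the `&mut self`: a cache hit rewrites the recency order,
    // which is why a plain RwLock offers no read-side win here.
    pub fn get(&mut self, key: &str) -> Option<&V> {
        if self.map.contains_key(key) {
            // Move the key to the most-recently-used position.
            self.order.retain(|k| k != key);
            self.order.push_back(key.to_string());
            self.map.get(key)
        } else {
            None
        }
    }

    pub fn insert(&mut self, key: String, value: V) {
        if !self.map.contains_key(&key) && self.map.len() == self.capacity {
            // At capacity: evict the least recently used entry.
            if let Some(evicted) = self.order.pop_front() {
                self.map.remove(&evicted);
            }
        }
        self.order.retain(|k| k != &key);
        self.order.push_back(key.clone());
        self.map.insert(key, value);
    }
}
```

A concurrent cache like `moka` avoids exposing this mutation to callers by handling recency bookkeeping internally, which is what makes lock-free-looking reads possible from the outside.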