[PR #2332] [MERGED] Fix Issue #2306 / infinite recursion in ns_pool_for_zone #2963

Closed
opened 2026-03-16 11:17:41 +03:00 by kerem · 0 comments

📋 Pull Request Information

Original PR: https://github.com/hickory-dns/hickory-dns/pull/2332
Author: @marcus0x62
Created: 7/30/2024
Status: Merged
Merged: 8/27/2024
Merged by: @djc

Base: main ← Head: recursor_infinite_recursion


📝 Commits (3)

  • 2c76692 Add append_ips method to NameServerConfigGroup
  • 128859e Add append_ips_from_lookup function to reduce duplicated and inconsistent boilerplate in
  • 9347c47 Fix Issue #2306 / infinite recursion in ns_pool_for_zone

📊 Changes

4 files changed (+135 additions, -122 deletions)


📝 conformance/packages/conformance-tests/src/resolver/dnssec/regression.rs (+0 -1)
📝 crates/recursor/src/lib.rs (+2 -1)
📝 crates/recursor/src/recursor_dns_handle.rs (+118 -120)
📝 crates/resolver/src/config.rs (+15 -0)

📄 Description

As described in Issue #2306, it is possible to crash a Hickory server when all of the following conditions hold:

  1. There are no cached NS pools for a given zone with valid name servers

  2. Querying the parent DNS servers for a given zone returns only child NS records. For example,
    querying IN NS for example.com returns only records such as:

    example.com. IN NS ns1.example.com.
    example.com. IN NS ns2.example.com.

    and no non-child NS records such as:

    example.com. IN NS ns1.someotherdomain.net.

  3. No glue records are returned with the NS records.

This causes unbounded recursion: ns_pool_for_zone calls resolve to get A/AAAA records for the NS names returned by the parent server. Because there is no pool for example.com, resolve in turn calls ns_pool_for_zone again, which triggers another call to resolve, and so on until the stack is exhausted.
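
The call cycle described above can be modelled in a few lines. The sketch below is a toy stand-in, not Hickory's code: `zone_of`, `ToyError`, and `MAX_DEPTH` are hypothetical names, the two functions only mimic the shape of the real `resolve` and `ns_pool_for_zone`, and the depth guard exists only so the demonstration terminates instead of overflowing the stack.

```rust
/// Hypothetical guard so this toy model stops instead of exhausting the stack.
const MAX_DEPTH: usize = 16;

#[derive(Debug, PartialEq)]
enum ToyError {
    /// The call chain grew past MAX_DEPTH without making progress.
    DepthExceeded(usize),
}

/// Toy rule for the zone that owns `name`: everything after the first label.
fn zone_of(name: &str) -> &str {
    name.split_once('.').map(|(_, rest)| rest).unwrap_or(name)
}

/// Models `resolve`: answering a query for `name` first needs a pool for its zone.
fn resolve(name: &str, depth: usize) -> Result<&'static str, ToyError> {
    if depth > MAX_DEPTH {
        return Err(ToyError::DepthExceeded(depth));
    }
    ns_pool_for_zone(zone_of(name), depth + 1)?;
    Ok("192.0.2.1") // placeholder address, never reached in this scenario
}

/// Models the pre-patch `ns_pool_for_zone` when nothing is cached and the
/// parent returned only a child NS name with no glue.
fn ns_pool_for_zone(zone: &str, depth: usize) -> Result<(), ToyError> {
    if depth > MAX_DEPTH {
        return Err(ToyError::DepthExceeded(depth));
    }
    let child_ns = format!("ns1.{zone}");
    // Resolving the child NS name needs the very pool being built here.
    resolve(&child_ns, depth + 1)?;
    Ok(())
}
```

In this model, `ns_pool_for_zone("example.com.", 0)` ends in a `DepthExceeded` error, because each function keeps handing the same zone back to the other.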

This can be reproduced, as described in the issue, by configuring BIND to act as a dummy root server for a hosted zone. BIND returns glue records only if they are present in its cache, not from the hosted zone. So, if you send the BIND server an A query directly for the delegated nameserver before sending the recursive query through Hickory, the bug is not triggered. If you empty the caches on both servers and send a query for the hosted zone to Hickory first, the bug is triggered.

This patch changes ns_pool_for_zone so that it calls resolve only for non-child NS names. If, after querying for any non-child name servers, there are still no name servers to use for a zone, queries for the child NS names are sent to the parent zone of the zone being queried. These queries use NameServerPool.lookup, which avoids the possibility of infinite recursion. The initial check for glue records is unchanged, so the overall order of priority within ns_pool_for_zone for identifying name servers to use for a pool is:

1. Use glue records from the query to the parent zone.
2. Use resolve to query non-child nameservers returned without glue from the parent zone.
3. Use NameServerPool.lookup to try to directly resolve child nameserver records returned without glue from the parent zone nameserver.
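
The priority order above can be sketched as a three-tier selection. This is an illustrative model under stated assumptions, not the code in recursor_dns_handle.rs: `NsRecord`, `Source`, `is_child_of`, and `select_addresses` are hypothetical names, the two closures stand in for `resolve` and `NameServerPool.lookup`, and stopping at the first tier that yields an address is a simplification that the real implementation may not share.

```rust
use std::net::IpAddr;

/// An NS name returned by the parent zone, plus any glue that came with it.
struct NsRecord {
    name: String,
    glue: Vec<IpAddr>,
}

/// Which tier supplied the addresses (hypothetical, for illustration only).
#[derive(Debug, PartialEq)]
enum Source {
    Glue,
    RecursiveResolve,
    DirectParentLookup,
}

/// True if `ns_name` is at or below `zone`, i.e. a "child" NS name.
fn is_child_of(ns_name: &str, zone: &str) -> bool {
    let ns = ns_name.trim_end_matches('.').to_ascii_lowercase();
    let zone = zone.trim_end_matches('.').to_ascii_lowercase();
    if zone.is_empty() {
        return true; // every name is below the root
    }
    ns == zone || ns.ends_with(format!(".{zone}").as_str())
}

/// Picks name server addresses for `zone` using the three tiers in order.
fn select_addresses(
    zone: &str,
    records: &[NsRecord],
    resolve: impl Fn(&str) -> Vec<IpAddr>,       // stand-in for full recursion
    parent_lookup: impl Fn(&str) -> Vec<IpAddr>, // stand-in for a direct pool lookup
) -> Option<(Source, Vec<IpAddr>)> {
    // 1. Glue returned alongside the NS records.
    let glue: Vec<IpAddr> = records
        .iter()
        .flat_map(|r| r.glue.iter().copied())
        .collect();
    if !glue.is_empty() {
        return Some((Source::Glue, glue));
    }

    // 2. Non-child names are safe to resolve recursively.
    let resolved: Vec<IpAddr> = records
        .iter()
        .filter(|r| !is_child_of(&r.name, zone))
        .flat_map(|r| resolve(r.name.as_str()))
        .collect();
    if !resolved.is_empty() {
        return Some((Source::RecursiveResolve, resolved));
    }

    // 3. Child names are asked of the parent's servers directly, so the
    //    lookup cannot re-enter pool construction for `zone`.
    let direct: Vec<IpAddr> = records
        .iter()
        .filter(|r| is_child_of(&r.name, zone))
        .flat_map(|r| parent_lookup(r.name.as_str()))
        .collect();
    if direct.is_empty() {
        None
    } else {
        Some((Source::DirectParentLookup, direct))
    }
}
```

The point of the third tier is that the recursive closure is never invoked for a child name, which is what breaks the cycle between the two functions.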

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
