[GH-ISSUE #1910] Memory leak when responses timeout #812

Closed
opened 2026-03-16 00:20:35 +03:00 by kerem · 7 comments

Originally created by @hottea773 on GitHub (Mar 20, 2023).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/1910

Describe the bug

When doing DNS resolution (using the resolver) with responses that time out, trust-dns leaks memory.

To Reproduce
Steps to reproduce the behaviour:

With the following main.rs:

use std::net::SocketAddr;

use trust_dns_resolver::{
    config::{NameServerConfig, Protocol, ResolverConfig, ResolverOpts},
    Resolver,
};

fn main() {
    let mut resolver_config = ResolverConfig::new();

    resolver_config.add_name_server(NameServerConfig {
        // This is a documentation IP which isn't resolvable
        socket_addr: SocketAddr::new("192.0.2.100".parse().unwrap(), 53),
        protocol: Protocol::Udp,
        tls_dns_name: None,
        trust_negative_responses: true,
        bind_addr: None,
    });

    let mut resolver_opts = ResolverOpts::default();
    // This is stupidly short just to make it leak quicker
    resolver_opts.timeout = std::time::Duration::from_nanos(100);
    resolver_opts.ip_strategy = trust_dns_resolver::config::LookupIpStrategy::Ipv4Only;

    let resolver = Resolver::new(resolver_config.clone(), resolver_opts).unwrap();

    loop {
        let _response = resolver.lookup_ip("www.google.com");
    }
}

and the following Cargo.toml:

[package]
name = "dns-leak"
version = "0.1.0"
edition = "2021"

[dependencies]
trust-dns-resolver = "0.22.0"

[patch.crates-io]
trust-dns-resolver = { git = "https://github.com/bluejekyll/trust-dns.git", branch = "main" }

Run cargo run and watch the memory usage over time (0.8% -> 1.3% -> 2.1% -> 5.8%):

$ top -p 14107 -b -n 1
top - 16:44:39 up 14:46,  0 users,  load average: 0.79, 0.72, 0.74
Tasks:   1 total,   1 running,   0 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.1 us,  1.6 sy,  0.0 ni, 89.0 id,  0.0 wa,  0.0 hi,  6.3 si,  0.0 st
MiB Mem :   7949.0 total,    347.7 free,   5084.0 used,   2517.4 buff/cache
MiB Swap:   2048.0 total,   1317.5 free,    730.5 used.   2570.5 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
14107 develop+  20   0   70308  67488   5928 R  53.3   0.8   1:34.70 dns-leak
$ top -p 14107 -b -n 1
top - 16:46:41 up 14:48,  0 users,  load average: 0.55, 0.64, 0.70
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4.1 us,  0.8 sy,  0.0 ni, 93.4 id,  0.0 wa,  0.0 hi,  1.6 si,  0.0 st
MiB Mem :   7949.0 total,    313.9 free,   5116.8 used,   2518.3 buff/cache
MiB Swap:   2048.0 total,   1317.5 free,    730.5 used.   2537.7 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
14107 develop+  20   0  110388 107568   5928 S  46.7   1.3   2:35.09 dns-leak
$ top -p 14107 -b -n 1
top - 16:49:48 up 14:51,  0 users,  load average: 0.73, 0.71, 0.72
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.5 us,  0.8 sy,  0.0 ni, 95.0 id,  0.0 wa,  0.0 hi,  1.7 si,  0.0 st
MiB Mem :   7949.0 total,    299.0 free,   5166.1 used,   2484.0 buff/cache
MiB Swap:   2048.0 total,   1290.7 free,    757.3 used.   2488.4 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
14107 develop+  20   0  171436 168620   5928 S  46.7   2.1   4:07.33 dns-leak
$ top -p 14107 -b -n 1
top - 17:05:07 up 15:07,  0 users,  load average: 0.46, 0.56, 0.65
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.3 us,  0.0 sy,  0.0 ni, 91.8 id,  0.0 wa,  0.0 hi,  4.9 si,  0.0 st
MiB Mem :   7949.0 total,    127.9 free,   5444.8 used,   2376.3 buff/cache
MiB Swap:   2048.0 total,   1291.7 free,    756.3 used.   2209.7 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
14107 develop+  20   0  472208 469268   5928 S  53.3   5.8  11:38.98 dns-leak

Expected behaviour

trust-dns-resolver should not leak memory.

System:

  • OS: Ubuntu 20.04 on WSL
  • rustc version: rustc 1.66.0-dev (adb13e80e 2022-12-16)

Version:
Crate: resolver
Version: main at commit 72f2a07c901743f3a4cc7900cda22b75de60d569. This does not reproduce with 0.22.0

Additional context
The leak seems to be within the Resolver. When we create a new resolver for every query (moving the let resolver ... line inside the loop in the example), we don't seem to see a leak.

kerem closed this issue 2026-03-16 00:20:41 +03:00

@bluejekyll commented on GitHub (Mar 21, 2023):

Looking at the example posted, it looks more like the loop is unbounded and creating infinite requests until it crashes. Is there any indication that they leak if given the time for the requests to clean themselves up?


@djc commented on GitHub (Mar 21, 2023):

Version: main at commit 72f2a07c901743f3a4cc7900cda22b75de60d569. This does not reproduce with 0.22.0

Can you run a git bisect on this?


@hottea773 commented on GitHub (Mar 21, 2023):

Looking at the example posted, it looks more like the loop is unbounded and creating infinite requests until it crashes. Is there any indication that they leak if given the time for the requests to clean themselves up?

They're synchronous requests, so I wouldn't expect them to be doing any "clean-up" after they're done either way.
Anyway, I've modified my example to use a more realistic 50-millisecond timeout and given it time to clean up after itself; the memory remains in use after the loop has finished:

use chrono::Utc;
use std::net::SocketAddr;
use std::{thread, time};

use trust_dns_resolver::{
    config::{NameServerConfig, Protocol, ResolverConfig, ResolverOpts},
    Resolver,
};

fn main() {
    let mut resolver_config = ResolverConfig::new();

    resolver_config.add_name_server(NameServerConfig {
        socket_addr: SocketAddr::new("192.0.2.100".parse().unwrap(), 53),
        protocol: Protocol::Udp,
        tls_dns_name: None,
        trust_negative_responses: true,
        bind_addr: None,
    });

    let mut resolver_opts = ResolverOpts::default();
    resolver_opts.timeout = std::time::Duration::from_millis(50);
    resolver_opts.ip_strategy = trust_dns_resolver::config::LookupIpStrategy::Ipv4Only;

    let resolver = Resolver::new(resolver_config.clone(), resolver_opts).unwrap();

    println!("Started requests at {}", Utc::now());
    for ii in 0..10345 {
        let _response = resolver.lookup_ip(format!("www.google-{ii}.com"));
    }
    println!("Finished requests at {}", Utc::now());
    thread::sleep(time::Duration::from_secs(2000));
    println!("Finished sleep at {}", Utc::now());
}

Output:

$ cargo run
   Compiling dns-leak v0.1.0 (/home/developer/temp/dns-leak)
    Finished dev [unoptimized + debuginfo] target(s) in 1.01s
     Running `target/debug/dns-leak`
Started requests at 2023-03-21 09:58:06.411667172 UTC
Finished requests at 2023-03-21 10:25:04.456814688 UTC
Finished sleep at 2023-03-21 10:58:24.529743602 UTC

Top output summary:

top - 09:58:24 up 16:59,  0 users,  load average: 0.76, 0.30, 0.21
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s): 14.9 us,  0.8 sy,  0.0 ni, 84.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7949.0 total,    544.8 free,   4586.1 used,   2818.1 buff/cache
MiB Swap:   2048.0 total,    550.2 free,   1497.8 used.   3068.6 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
13034 develop+  20   0    9420   6868   6248 S   6.2   0.1   0:00.61 dns-leak


top - 10:08:15 up 17:09,  0 users,  load average: 0.22, 0.19, 0.18
Tasks:   1 total,   1 running,   0 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.7 us,  0.0 sy,  0.0 ni, 98.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7949.0 total,    554.4 free,   4573.0 used,   2821.6 buff/cache
MiB Swap:   2048.0 total,    559.4 free,   1488.6 used.   3081.7 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
13034 develop+  20   0   14884  12336   6248 R   0.0   0.2   0:17.02 dns-leak


top - 10:22:58 up 17:24,  0 users,  load average: 0.07, 0.13, 0.16
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7949.0 total,    551.9 free,   4573.5 used,   2823.6 buff/cache
MiB Swap:   2048.0 total,    560.4 free,   1487.6 used.   3081.1 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
13034 develop+  20   0   22908  20356   6248 S   6.2   0.3   0:41.88 dns-leak


top - 10:25:04 up 17:26,  0 users,  load average: 0.36, 0.20, 0.18
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 99.2 id,  0.0 wa,  0.0 hi,  0.8 si,  0.0 st
MiB Mem :   7949.0 total,    549.6 free,   4574.9 used,   2824.6 buff/cache
MiB Swap:   2048.0 total,    561.7 free,   1486.3 used.   3079.8 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
13034 develop+  20   0   23932  21380   6248 S   6.7   0.3   0:44.93 dns-leak


top - 10:58:24 up 17:59,  0 users,  load average: 0.04, 0.05, 0.08
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.8 us,  0.0 sy,  0.0 ni, 99.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7949.0 total,    517.3 free,   4579.8 used,   2851.9 buff/cache
MiB Swap:   2048.0 total,    571.4 free,   1476.6 used.   3074.9 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
13034 develop+  20   0   23932  21380   6248 S   0.0   0.3   0:45.01 dns-leak

@mokeyish commented on GitHub (Mar 26, 2023):

I also encountered this problem. But I don't know if it's the same reason.

Reproduce (Windows 11 only):

nslookup.exe -qtype=a google.com 127.0.0.1  # Forward resolution error: request timed out

Repeat the command above about 10 times and you will see: fatal runtime error: stack overflow


@mokeyish commented on GitHub (Mar 26, 2023):

I also encountered this problem. But I don't know if it's the same reason.

Reproduce (Windows 11 only):

nslookup.exe -qtype=a google.com 127.0.0.1  # Forward resolution error: request timed out

Repeat the command above about 10 times and you will see: fatal runtime error: stack overflow

Not same reason, I fixed it at #1912


@hottea773 commented on GitHub (Mar 30, 2023):

Ok, so I've done a git bisect and found that it very clearly started leaking at https://github.com/bluejekyll/trust-dns/commit/15423b86101bed097dcb973486c1cc3a464a7d08.

I've done a bit of testing and found that simply reverting the change at https://github.com/bluejekyll/trust-dns/commit/15423b86101bed097dcb973486c1cc3a464a7d08#diff-8c08c0c20017a021b2a1500e162204aba4c70c2f9dc44568b0b8601ab3e35fe5R395 fixes the leak. I'm no expert, but it seems to me that we're adding a handle for every request but never joining on them or otherwise tidying them up, hence leaking memory per request. I'd propose that we fix this by simply reverting that part of the change, but I'm happy to be told otherwise. @jeff-hiner probably has a better understanding.


@jeff-hiner commented on GitHub (Mar 30, 2023):

There's a reason the join sets were added.

For context here: the spawned futures typically hold cloned TCP or DoH sockets. If those futures are spawned in the background without retaining the JoinHandles, then disposing of the parent ConnectionProvider leaves those tasks running and their sockets open until they time out.

It looks like in this case nothing is reaping the self.join_set. The easiest way to fix that is to add reaping within spawn_bg while the join set is locked. This fix will retain finished tasks until either the next spawn_bg call or the parent ConnectionProvider is torn down.

Give me a moment, I'll have a PR up shortly.
