[GH-ISSUE #395] Hang up on nonexistent domains or timeouts #141

Closed
opened 2026-02-26 04:34:10 +03:00 by kerem · 5 comments
Owner

Originally created by @Sajito on GitHub (Mar 22, 2023).
Original GitHub issue: https://github.com/mageddo/dns-proxy-server/issues/395

Originally assigned to: @mageddo on GitHub.

What is Happening / What is expected

I'm not quite sure whether this relates only to nonexistent domains or happens whenever a query times out.
When a domain is queried, DPS tries every available solver, each with a timeout of 10 seconds, until it gets a response or every solver has failed.
If the domain does not exist, DPS needs multiple minutes to try each solver. While it is trying them, no other request is processed.
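
The "multiple minutes" follow from simple arithmetic if the solvers are tried one after another and each one runs into its timeout. A minimal sketch of that worst case, assuming strictly sequential attempts; the class name, method name, and solver counts are hypothetical, and only the 10 second timeout comes from this report:

```java
import java.time.Duration;

final class SolverChainLatency {

  // Worst case for one query: every solver in the chain waits out its full
  // timeout before the next one is tried (assumes sequential attempts).
  static Duration worstCase(int solverCount, Duration perSolverTimeout) {
    if (solverCount < 0) {
      throw new IllegalArgumentException("solverCount must not be negative");
    }
    return perSolverTimeout.multipliedBy(solverCount);
  }

  public static void main(String[] args) {
    Duration timeout = Duration.ofSeconds(10); // per-solver timeout from the report
    // Hypothetical solver counts, purely for illustration.
    for (int solvers : new int[] {1, 6, 12}) {
      System.out.println(solvers + " solver(s): up to "
          + worstCase(solvers, timeout).toSeconds() + "s");
    }
  }
}
```

If other requests have to wait behind such a query, they inherit the same delay, which would match the behavior described above.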

Steps to reproduce:

  • Start DPS
  • Query for a nonexistent domain, e.g. asd.asd
    • This will time out after a few seconds
  • Now query a real domain, e.g. google.com
    • This will now also time out after a few seconds
  • After some minutes, queries will work again

Specs

  • OS: EndeavourOS
  • Docker Version:
Client:
 Version:           23.0.1
 API version:       1.42
 Go version:        go1.20
 Git commit:        a5ee5b1dfc
 Built:             Sat Feb 11 13:58:04 2023
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          23.0.1
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.20
  Git commit:       bc3805a0a0
  Built:            Sat Feb 11 13:58:04 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.7.0
  GitCommit:        1fbd70374134b891f97ce19c70b6e50c7b9f4e0d.m
 runc:
  Version:          1.1.4
  GitCommit:        
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
  • DPS Version: Tried with 3.13.1 and 3.15.2-snapshot. Somewhere between these two versions something changed. In 3.13.1 only a timeout message is logged. In 3.15.2-snapshot a NullPointerException is logged after the timeout message.

Sorry for the missing log, I'm in a hurry right now, but I think this should be fairly easy to reproduce.

kerem 2026-02-26 04:34:10 +03:00
Author
Owner

@mageddo commented on GitHub (Mar 22, 2023):

Hey, I'm not able to reproduce when using DPS with default options

time nslookup asd.asd 
Server:		172.17.0.4
Address:	172.17.0.4#53

** server can't find asd.asd: NXDOMAIN


real	0m0.022s
user	0m0.003s
sys	0m0.009s
typer@typer-pc:~$ time nslookup google.com 
Server:		172.17.0.4
Address:	172.17.0.4#53

Non-authoritative answer:
Name:	google.com
Address: 142.251.132.46
Name:	google.com
Address: 2800:3f0:4001:828::200e


real	0m0.023s
user	0m0.006s
sys	0m0.007s

I suppose it's related to the remote server you're using; maybe it is taking more than 10 seconds to answer (the DPS timeout is set to 10s), so the lookup returns null. In that case the slowness comes from the proxied server.

I will run a test with a bad remote server to see if I get an NPE.


@mageddo commented on GitHub (Mar 22, 2023):

Okay... the NPE bug is confirmed when using a bad remote server that never responds (8.8.8.8:85). I will work on a fix soon, but I'm not sure it will improve your speed, as your remote server is taking too long to respond.

10:03:08.178 [Thread-9       ] INF c.m.dnsproxyserver.server.dns.solver.SolverRemote l=47   m=handle                          status=timedOut, req=google.com, msg=Timed out while trying to resolve google.com./A, id=34124 class=IOException
10:03:08.179 [Thread-9       ] DEB c.m.dnsproxyserver.server.dns.solver.SolverCache  l=42   m=lambda$handleRes$0              status=noAnswer, action=cantCache, k=A-google.com
10:03:08.179 [Thread-9       ] WAR c.m.d.server.dns.RequestHandlerDefault            l=82   m=solve0                          status=solverFailed, currentSolverTime=10006, totalTime=10115, solver=SolverCachedRemote, eClass=NullPointerException, msg=Cannot invoke "com.mageddo.dnsproxyserver.server.dns.solver.Response.toBuilder()" because "res" is null
java.lang.NullPointerException: Cannot invoke "com.mageddo.dnsproxyserver.server.dns.solver.Response.toBuilder()" because "res" is null
	at com.mageddo.dnsproxyserver.server.dns.solver.SolverCachedRemote.handle(SolverCachedRemote.java:35)
	at com.mageddo.dnsproxyserver.server.dns.RequestHandlerDefault.solve0(RequestHandlerDefault.java:68)
	at com.mageddo.dnsproxyserver.server.dns.solver.SolverCache.lambda$handleRes$0(SolverCache.java:40)
	at com.mageddo.commons.caching.LruTTLCache.computeIfAbsent0(LruTTLCache.java:82)
	at com.mageddo.dnsproxyserver.server.dns.solver.SolverCache.handleRes(SolverCache.java:38)
	at com.mageddo.dnsproxyserver.server.dns.solver.SolverCache.handle(SolverCache.java:33)
	at com.mageddo.dnsproxyserver.server.dns.RequestHandlerDefault.solve(RequestHandlerDefault.java:46)
	at com.mageddo.dnsproxyserver.server.dns.RequestHandlerDefault.handle(RequestHandlerDefault.java:38)
	at com.mageddo.dnsproxyserver.server.dns.UDPServer.handle(UDPServer.java:54)
	at com.mageddo.dnsproxyserver.server.dns.UDPServer.lambda$start0$0(UDPServer.java:42)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:577)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1589)
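
The trace above fails at `res.toBuilder()` because the wrapped solver handed back `null` after the timeout. The actual fix is not shown in this thread. The following is a minimal sketch of one way to guard against that case; the `Response` class and the `handle` helper here are hypothetical stand-ins, not DPS's real code:

```java
import java.util.Optional;
import java.util.function.Function;

final class NullSafeSolver {

  // Hypothetical stand-in for a solver response; not the real DPS type.
  static final class Response {
    final String host;
    final String address;

    Response(String host, String address) {
      this.host = host;
      this.address = address;
    }
  }

  // Wraps a delegate that may return null (for example after a timeout)
  // so that callers must handle the "no answer" case explicitly.
  static Optional<Response> handle(Function<String, Response> remote, String host) {
    return Optional.ofNullable(remote.apply(host));
  }

  public static void main(String[] args) {
    Function<String, Response> timedOut = host -> null; // simulates a timed-out remote
    Function<String, Response> healthy = host -> new Response(host, "142.251.132.46");

    // No NullPointerException here: the empty Optional falls through to the default.
    System.out.println(handle(timedOut, "google.com").map(r -> r.address).orElse("no answer"));
    System.out.println(handle(healthy, "google.com").map(r -> r.address).orElse("no answer"));
  }
}
```

With a guard like this, a timed-out solver yields an empty result that the caller can pass on to the next solver instead of throwing.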

@mageddo commented on GitHub (Mar 22, 2023):

Fixed the NPE, but I realized there is some unexpected behavior: I'm seeing logs saying the resolver took more than 40s to finish the process.

10:22:52.760 [Thread-34      ] DEB c.m.d.server.dns.RequestHandlerDefault            l=48   m=solve                           status=solved, kind=udp, time=45857, res=google.com

I will take a deeper look at it.


@mageddo commented on GitHub (Mar 23, 2023):

Hey, I've made an improvement to the cache parallelism, which I think was the main offender for performance. There is another improvement to make; let me know if this is enough for your use case.

Just released 3.15.3-snapshot (https://github.com/mageddo/dns-proxy-server/releases/tag/3.15.3-snapshot) with the improvement.
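
The thread does not say what the cache change was. The earlier stack trace does show the solver being invoked from inside the cache's `computeIfAbsent0`, so one plausible source of contention is a slow lookup for one key holding up lookups for other keys. The following is a sketch of a per-key loading pattern that avoids this, under that assumption. `PerKeyCache` is a hypothetical class, not DPS's `LruTTLCache`, and it omits TTL and eviction:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

final class PerKeyCache<K, V> {

  private final ConcurrentMap<K, CompletableFuture<V>> entries = new ConcurrentHashMap<>();

  // Loads the value for a key at most once. The loader runs outside any
  // map-wide lock, so a slow lookup for one key does not block other keys.
  V get(K key, Function<K, V> loader) {
    CompletableFuture<V> mine = new CompletableFuture<>();
    CompletableFuture<V> existing = entries.putIfAbsent(key, mine);
    if (existing != null) {
      // Someone else is loading (or has loaded) this key: wait only for that key.
      return existing.join();
    }
    try {
      V value = loader.apply(key);
      if (value == null) {
        entries.remove(key, mine); // do not cache a missing answer
      }
      mine.complete(value);
      return value;
    } catch (RuntimeException e) {
      entries.remove(key, mine); // allow a later retry
      mine.completeExceptionally(e);
      throw e;
    }
  }

  public static void main(String[] args) {
    PerKeyCache<String, String> cache = new PerKeyCache<>();
    System.out.println(cache.get("google.com", host -> "142.251.132.46"));
    // Second call is served from the cache; this loader is not invoked.
    System.out.println(cache.get("google.com", host -> "unused"));
  }
}
```

Concurrent callers asking for the same key still wait for the single in-flight load, which is usually the desired behavior for a DNS cache.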


@Sajito commented on GitHub (Mar 23, 2023):

Hey, just tried it and it looks really good. Thank you!

Reference
starred/dns-proxy-server-mageddo#141