[GH-ISSUE #2014] do not leak and requested domain to the log #849

Open
opened 2026-03-16 00:32:53 +03:00 by kerem · 21 comments
Owner

Originally created by @LuckyTurtleDev on GitHub (Sep 2, 2023).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2014

Is your feature request related to a problem? Please describe.
After updating trust dns server to 0.23.0 it has starting leaking the ip address an the requested domain to the logs.

INFO  trust_dns_server::server::server_future           > request:34218 src:UDP://::ffff:172.20.0.1#45729 QUERY:example.com.:A:IN qflags:RD,AD response:NoError rr:0/0/0 rflags:RD,RA

Because of privacy reasons this is problematic.

Describe the solution you'd like
Do not leak ip and requested domain

Describe alternatives you've considered

  • Setting log level to warn. But this would be suppress the most other log messages too, so the log would be become useless.
  • filtering log message, before printing. But regex for ipv6 are very complex and need probably some computation time.
Originally created by @LuckyTurtleDev on GitHub (Sep 2, 2023). Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/2014 **Is your feature request related to a problem? Please describe.** After updating trust dns server to 0.23.0 it has starting leaking the ip address an the requested domain to the logs. ``` INFO trust_dns_server::server::server_future > request:34218 src:UDP://::ffff:172.20.0.1#45729 QUERY:example.com.:A:IN qflags:RD,AD response:NoError rr:0/0/0 rflags:RD,RA ``` Because of privacy reasons this is problematic. **Describe the solution you'd like** Do not leak ip and requested domain **Describe alternatives you've considered** * Setting log level to warn. But this would be suppress the most other log messages too, so the log would be become useless. * filtering log message, before printing. But regex for ipv6 are very complex and need probably some computation time.
Author
Owner

@bluejekyll commented on GitHub (Sep 3, 2023):

Is there any guidance you want to provide on why this is a privacy issue? This is valuable debug information, so we could split this into two separate log lines? one with full context, one with only the request parameters and not the source IP? Other option would be to have a parameter to choose to log or not? or config for the log line itself?

I'm open to options.

<!-- gh-comment-id:1703973050 --> @bluejekyll commented on GitHub (Sep 3, 2023): Is there any guidance you want to provide on why this is a privacy issue? This is valuable debug information, so we could split this into two separate log lines? one with full context, one with only the request parameters and not the source IP? Other option would be to have a parameter to choose to log or not? or config for the log line itself? I'm open to options.
Author
Owner

@LuckyTurtleDev commented on GitHub (Sep 3, 2023):

Is there any guidance you want to provide on why this is a privacy issue?

I have write a pi-hole clone. It can also run at a public server so android can access it via dot.
But after the updating trust dns server to 0.23.0 it has start logging every ip which, connect to it and the requested domain.
Since this are very private data I think this is a big privacy issue.

I can increase the log level to warning, but then it does not even show the open ports/protocols and some other important inflammations anymore. And logging them with warn is also wrong because they are no warnings.

<!-- gh-comment-id:1704054281 --> @LuckyTurtleDev commented on GitHub (Sep 3, 2023): > Is there any guidance you want to provide on why this is a privacy issue? I have write a pi-hole clone. It can also run at a public server so android can access it via dot. But after the updating trust dns server to 0.23.0 it has start logging every ip which, connect to it and the requested domain. Since this are very private data I think this is a big privacy issue. I can increase the log level to warning, but then it does not even show the open ports/protocols and some other important inflammations anymore. And logging them with `warn` is also wrong because they are no warnings.
Author
Owner

@msrd0 commented on GitHub (Sep 5, 2023):

Setting log level to warn.

This is not sufficient, as I have seen IP addresses and domains queried in error messages as well.

<!-- gh-comment-id:1706198820 --> @msrd0 commented on GitHub (Sep 5, 2023): > Setting log level to warn. This is not sufficient, as I have seen IP addresses and domains queried in error messages as well.
Author
Owner

@XOR-op commented on GitHub (Sep 5, 2023):

What is your threat model? I think your program can always see the details of those requests. So in my assumption, you want to dump your logs into a file, but don't want other users/process under the same user read those logs. For this case, I would suggest you set a write-only permission for the log file (-w-------). Correct me if I make a wrong assumption.

<!-- gh-comment-id:1707290610 --> @XOR-op commented on GitHub (Sep 5, 2023): What is your threat model? I think your program can always see the details of those requests. So in my assumption, you want to dump your logs into a file, but don't want other users/process under the same user read those logs. For this case, I would suggest you set a write-only permission for the log file (-w-------). Correct me if I make a wrong assumption.
Author
Owner

@msrd0 commented on GitHub (Sep 5, 2023):

This is less of a security "thread" and more of a privacy issue. If I were hosting a DNS server using trust-dns software, I don't want any log of my user's ip addresses or requests. I will always have the opportunity to modify the code on my server to report those details, so a little bit of trust is always necessary.

That being said, here in the EU, IP addresses can be considered personal data under the GDPR. Given the nature of DNS queries, it is impractical to ask for a user's consent to store their IP address. This means it is only lawful to do so if storing the IP address is necessary under certain criteria. Unfortunately, I believe that logging IP addresses is unnecessary, so it will be hard to argue should someone complain.

The only option left currently is to manually implement the logger downstream and apply some regular expression that tries to remove sensitive data such as domain names or IP addresses from the log messages, given that even error messages can contain such details. It would, however, be much easier if trust-dns just removes those details from the log messages again.

<!-- gh-comment-id:1707303699 --> @msrd0 commented on GitHub (Sep 5, 2023): This is less of a security "thread" and more of a privacy issue. If I were hosting a DNS server using trust-dns software, I don't want any log of my user's ip addresses or requests. I will always have the opportunity to modify the code on my server to report those details, so a little bit of trust is always necessary. That being said, here in the EU, IP addresses [can be considered personal data](https://www.gdpreu.org/the-regulation/key-concepts/personal-data/#The_Definition_of_Personal_Data) under the GDPR. Given the nature of DNS queries, it is impractical to ask for a user's consent to store their IP address. This means it is only lawful to do so if storing the IP address is [_necessary_ under certain criteria](https://gdpr-info.eu/art-6-gdpr/). Unfortunately, I believe that logging IP addresses is unnecessary, so it will be hard to argue should someone complain. The only option left currently is to manually implement the logger downstream and apply some regular expression that tries to remove sensitive data such as domain names or IP addresses from the log messages, given that even error messages can contain such details. It would, however, be much easier if trust-dns just removes those details from the log messages again.
Author
Owner

@bluejekyll commented on GitHub (Sep 6, 2023):

I think this is a reasonable feature request, I just don't know off the top of my head how we should implement it. It would require a fairly extensive audit to identify all the potential areas where we log source IPs, and then have a solution in place to not log them.

I'm definitely open to us making this change. I think one thing we could do is say that all info logs and lower (debug, etc) can contain IPs but warnings or maybe error and up messages should be treated as restricted to not containing IPs. Then disabling info logs and only enabling warning or error would ensure that no IPs would be logged. That seems like the simplest thing to do, would that be sufficient?

<!-- gh-comment-id:1707495874 --> @bluejekyll commented on GitHub (Sep 6, 2023): I think this is a reasonable feature request, I just don't know off the top of my head how we should implement it. It would require a fairly extensive audit to identify all the potential areas where we log source IPs, and then have a solution in place to not log them. I'm definitely open to us making this change. I think one thing we could do is say that all info logs and lower (debug, etc) can contain IPs but warnings or maybe error and up messages should be treated as restricted to not containing IPs. Then disabling info logs and only enabling warning or error would ensure that no IPs would be logged. That seems like the simplest thing to do, would that be sufficient?
Author
Owner

@djc commented on GitHub (Sep 6, 2023):

I would argue that info shouldn't contain PII either (as IIRC info is commonly enabled by default), so only debug and trace should. Also I think we can be explicit that this is only about source IPs? That is, it mainly concerns the server logging the IP of requests it receives -- which reduces the scope quite a bit. IMO IPs contained in DNS responses are unlikely to be PII, as are name server IPs.

<!-- gh-comment-id:1707863687 --> @djc commented on GitHub (Sep 6, 2023): I would argue that info shouldn't contain PII either (as IIRC info is commonly enabled by default), so only debug and trace should. Also I think we can be explicit that this is only about source IPs? That is, it mainly concerns the server logging the IP of requests it receives -- which reduces the scope quite a bit. IMO IPs contained in DNS responses are unlikely to be PII, as are name server IPs.
Author
Owner

@bluejekyll commented on GitHub (Sep 6, 2023):

I think the problem I have right now is that we have, imo, a conflict between operational issues and privacy. I think the IP information for requests coming in on a publicly accessible server is important in case there is some malicious behavior. This could be used to help identify if there are folks working to DOS or DDOS a system. Then there is the use case mentioned here in regards to making sure we're not disclosing the behavior of an individual user. To me, the trust-dns service operator will always have access to this information (if they want it) regardless of what we do in the trust-dns server code. That is, it's not hard to sniff traffic on a server, and collect all of these details. quic and https can make that harder, but still the potential exists.

So I see a couple of options for us. The trust-dns server currently supports these log levels out of the box: info is default, debug, and quiet, where quiet will only report warning or error messages. I'd like us to come up with a policy that balances all the needs that folks have. I think to answer what should appear in log messages at what time, we need to identify the various people involved. First, I'm going to assume we are all ok with any information being present in DEBUG log lines. This gives the power to an operator to collect a lot of information, but that is a certain amount of trust that is being put into the DNS resolver/server being used, so I think this is acceptable. So let's not discuss DEBUG log lines for now.

That leaves INFO, WARN, and ERROR. When the INFO log level is disabled, there is no output from the server at all on any request. Based on this conversation, what I'm gathering is that INFO is enabled by default in many cases (it is by the trust-dns binary as well). What I want to know is that given this, it can be disabled today, why is INFO being enabled at all? Is it reasonable for us to suggest to people that they do not enable INFO on trust-dns when privacy is important? It sounds like based on this conversation, people think that isn't putting us on a privacy first posture. I'm not sure I agree with that at the moment, but I can be convinced. So assuming we remove this information from INFO, then my question becomes what information do people want in the standard INFO log line? For reference this is the default INFO log line on a request from trust-dns:

1694015948:INFO:trust_dns_server::server::server_future:909:request:43742 src:UDP://127.0.0.1#60636 QUERY:www.example.com.:A:IN qflags:RD response:NoError rr:1/0/0 rflags:RD,AA

Let's split this out and discuss what things are ok, I'll have checkmarks for each item, unchecked things are what I assume the folks on this issue believe should not be included:

  • 1694015948 - timestamp
  • INFO:trust_dns_server::server::server_future:909:request - level, location, and type
  • 43742 - DNS message ID from the request packet
  • src:UDP - request is via UDP
  • //127.0.0.1#60636 - source IP address and port (disclosing PII)
  • www.example.com. - requested name (disclosing the name requested)
  • A - record type (is this safe?)
  • IN - DNS class
  • qflags:RD - query flags set
  • response:NoError - response code
  • rr:1/0/0 - response record counts, answer/authority/additional
  • rflags:RD,AA - response flags set

I realize I'm being a little pedantic here, but I'm trying to figure out the best path forward. Without the source IP and query name, what is valuable out of this message to keep? Or said another way, what information do people need to track on each request made to the server and for what reason?

In terms of error and warn messages, it will be problematic not to include the requested details of what query failed, and then if there is malicious behavior, the errors and warnings could help identify that (like badly configured zone or server, slow TCP connections, scanning operations trying to disclose and query the entire catalog or zone file, etc). This is why I'm struggling with a proper response to this issue. It feels like there are competing priorities depending on the needs people have for the logs.

<!-- gh-comment-id:1708767337 --> @bluejekyll commented on GitHub (Sep 6, 2023): I think the problem I have right now is that we have, imo, a conflict between operational issues and privacy. I think the IP information for requests coming in on a publicly accessible server is important in case there is some malicious behavior. This could be used to help identify if there are folks working to DOS or DDOS a system. Then there is the use case mentioned here in regards to making sure we're not disclosing the behavior of an individual user. To me, the trust-dns service operator will always have access to this information (if they want it) regardless of what we do in the trust-dns server code. That is, it's not hard to sniff traffic on a server, and collect all of these details. quic and https can make that harder, but still the potential exists. So I see a couple of options for us. The trust-dns server currently supports these log levels out of the box: `info` is default, `debug`, and `quiet`, where quiet will only report warning or error messages. I'd like us to come up with a policy that balances all the needs that folks have. I think to answer what should appear in log messages at what time, we need to identify the various people involved. First, I'm going to assume we are all ok with any information being present in DEBUG log lines. This gives the power to an operator to collect a lot of information, but that is a certain amount of trust that is being put into the DNS resolver/server being used, so I think this is acceptable. So let's not discuss DEBUG log lines for now. That leaves INFO, WARN, and ERROR. When the INFO log level is disabled, there is no output from the server at all on any request. Based on this conversation, what I'm gathering is that INFO is enabled by default in many cases (it is by the trust-dns binary as well). What I want to know is that given this, it can be disabled today, why is INFO being enabled at all? Is it reasonable for us to suggest to people that they do not enable INFO on trust-dns when privacy is important? It sounds like based on this conversation, people think that isn't putting us on a privacy first posture. I'm not sure I agree with that at the moment, but I can be convinced. So assuming we remove this information from INFO, then my question becomes what information do people want in the standard INFO log line? For reference this is the default INFO log line on a request from trust-dns: ```console 1694015948:INFO:trust_dns_server::server::server_future:909:request:43742 src:UDP://127.0.0.1#60636 QUERY:www.example.com.:A:IN qflags:RD response:NoError rr:1/0/0 rflags:RD,AA ``` Let's split this out and discuss what things are ok, I'll have checkmarks for each item, unchecked things are what I assume the folks on this issue believe should not be included: - [x] `1694015948` - timestamp - [x] `INFO:trust_dns_server::server::server_future:909:request` - level, location, and type - [x] `43742` - DNS message ID from the request packet - [x] `src:UDP` - request is via UDP - [ ] `//127.0.0.1#60636` - source IP address and port (disclosing PII) - [ ] `www.example.com.` - requested name (disclosing the name requested) - [ ] `A` - record type (is this safe?) - [x] `IN` - DNS class - [x] `qflags:RD` - query flags set - [x] `response:NoError` - response code - [x] `rr:1/0/0` - response record counts, answer/authority/additional - [x] `rflags:RD,AA` - response flags set I realize I'm being a little pedantic here, but I'm trying to figure out the best path forward. Without the source IP and query name, what is valuable out of this message to keep? Or said another way, what information do people need to track on each request made to the server and for what reason? In terms of error and warn messages, it will be problematic not to include the requested details of what query failed, and then if there is malicious behavior, the errors and warnings could help identify that (like badly configured zone or server, slow TCP connections, scanning operations trying to disclose and query the entire catalog or zone file, etc). This is why I'm struggling with a proper response to this issue. It feels like there are competing priorities depending on the needs people have for the logs.
Author
Owner

@msrd0 commented on GitHub (Sep 6, 2023):

I guess to prevent DoS attacks an similar, a rate limit per IP would be much more effective, at least for people like me that don't inspect what their server does 99% of the time. That being said, I agree that the log messages is kinda irrelevant, especially when you remove IP and Domain name from it, so why don't you just make those log messages optional? If you suspect your server is under attack, you could enable logging, and if that's not the case, you can just leave it disabled?

<!-- gh-comment-id:1708802298 --> @msrd0 commented on GitHub (Sep 6, 2023): I guess to prevent DoS attacks an similar, a rate limit per IP would be much more effective, at least for people like me that don't inspect what their server does 99% of the time. That being said, I agree that the log messages is kinda irrelevant, especially when you remove IP and Domain name from it, so why don't you just make those log messages optional? If you suspect your server is under attack, you could enable logging, and if that's not the case, you can just leave it disabled?
Author
Owner

@bluejekyll commented on GitHub (Sep 7, 2023):

so why don't you just make those log messages optional?

Currently I'd argue the INFO log lines are optional, you must enable them for them to be visible. In the trust-dns binary, we default to INFO being on but it's easily disabled with the quiet flag. I think we'd still need to go through and cleanse WARN and ERROR messages of any IP data, and then we'd set a policy in the project recognizing that... (maybe we could add some tests as well somehow)

<!-- gh-comment-id:1709325357 --> @bluejekyll commented on GitHub (Sep 7, 2023): > so why don't you just make those log messages optional? Currently I'd argue the INFO log lines are optional, you must enable them for them to be visible. In the trust-dns binary, we default to INFO being on but it's easily disabled with the quiet flag. I think we'd still need to go through and cleanse WARN and ERROR messages of any IP data, and then we'd set a policy in the project recognizing that... (maybe we could add some tests as well somehow)
Author
Owner

@djc commented on GitHub (Sep 7, 2023):

From the message contents you listed, I think it's really only the source IP that's problematic in terms of privacy. I do think a good default would be to avoid logging PII at levels above debug. It's fine that the operator still has ways to access this data, but at the same time it's good to avoid making it easy to leak by accident.

<!-- gh-comment-id:1709663221 --> @djc commented on GitHub (Sep 7, 2023): From the message contents you listed, I think it's really only the source IP that's problematic in terms of privacy. I do think a good default would be to avoid logging PII at levels above debug. It's fine that the operator still has ways to access this data, but at the same time it's good to avoid making it easy to leak by accident.
Author
Owner

@bluejekyll commented on GitHub (Sep 12, 2023):

Thanks, @djc, do you feel like it's ok to keep the requested name in the logs? Is that something we might want to leave in INFO, but remove from WARN and ERROR?

<!-- gh-comment-id:1714855455 --> @bluejekyll commented on GitHub (Sep 12, 2023): Thanks, @djc, do you feel like it's ok to keep the requested name in the logs? Is that something we might want to leave in INFO, but remove from WARN and ERROR?
Author
Owner

@LuckyTurtleDev commented on GitHub (Sep 12, 2023):

I think requested domains are very private data too, especially if the server is only used by a few users . I would like to avoid logging them at info/warn/error.

<!-- gh-comment-id:1715497983 --> @LuckyTurtleDev commented on GitHub (Sep 12, 2023): I think requested domains are very private data too, especially if the server is only used by a few users . I would like to avoid logging them at info/warn/error.
Author
Owner

@djc commented on GitHub (Sep 12, 2023):

That's fair, I guess the requested domain is also something that should not be logged at the default log levels (IMO there's no difference between INFO and WARN in terms of privacy). There's a reason a bunch of people are working on EncryptedClientHello for TLS, and it's mostly because they want to avoid leaking the server name.

<!-- gh-comment-id:1715576499 --> @djc commented on GitHub (Sep 12, 2023): That's fair, I guess the requested domain is also something that should not be logged at the default log levels (IMO there's no difference between INFO and WARN in terms of privacy). There's a reason a bunch of people are working on EncryptedClientHello for TLS, and it's mostly because they want to avoid leaking the server name.
Author
Owner

@bluejekyll commented on GitHub (Sep 12, 2023):

If we are agreeing to not log name/ip etc, how much value is there in the rest of the message? should we just drop it to a DEBUG line, and then do an audit on other messages to see if there's data we don't want there?

<!-- gh-comment-id:1716133095 --> @bluejekyll commented on GitHub (Sep 12, 2023): If we are agreeing to not log name/ip etc, how much value is there in the rest of the message? should we just drop it to a DEBUG line, and then do an audit on other messages to see if there's data we don't want there?
Author
Owner

@djc commented on GitHub (Sep 13, 2023):

I think there's still value in terms of figuring out how many messages are coming in, the message ID is useful to correlate with packet captures.

<!-- gh-comment-id:1717451112 --> @djc commented on GitHub (Sep 13, 2023): I think there's still value in terms of figuring out how many messages are coming in, the message ID is useful to correlate with packet captures.
Author
Owner

@jpds commented on GitHub (Sep 21, 2023):

I cannot think of any other DNS server software that implements this (or even a HTTP server that does this). Unbound, BIND, CoreDNS, and NSD do not do this.

As for GDPR and from a production systems and networks perspective; logging client IPs would fall squarely under https://gdpr-info.eu/recitals/no-49/ and I'm not going to wait for an attack to happen, or worse, "suspecting" that one has happened before I turn on logging. You'd have very little in terms of figuring out what actually happened to the compromised system (which, at this point, cannot be trusted anymore and you'd also have senior management breathing down your neck for answers).

still value in terms of figuring out how many messages are coming in

It's far better to do this at a metric level, I filed #2032 about this.

<!-- gh-comment-id:1730165933 --> @jpds commented on GitHub (Sep 21, 2023): I cannot think of any other DNS server software that implements this (or even a HTTP server that does this). Unbound, BIND, CoreDNS, and NSD do not do this. As for GDPR and from a production systems and networks perspective; logging client IPs would fall squarely under https://gdpr-info.eu/recitals/no-49/ and I'm not going to wait for an attack to happen, or worse, "suspecting" that one has happened before I turn on logging. You'd have very little in terms of figuring out what actually happened to the compromised system (which, at this point, cannot be trusted anymore and you'd also have senior management breathing down your neck for answers). > still value in terms of figuring out how many messages are coming in It's far better to do this at a metric level, I filed #2032 about this.
Author
Owner

@bluejekyll commented on GitHub (Sep 22, 2023):

any other DNS server software that implements this

to be clear, I think you mean, “logs the client IP address”, correct?

I think we’re all in agreement with this at this point. I’ll even happily make the change, the rest of the discussion was just around the value here.

<!-- gh-comment-id:1731793998 --> @bluejekyll commented on GitHub (Sep 22, 2023): > any other DNS server software that implements this to be clear, I think you mean, “logs the client IP address”, correct? I think we’re all in agreement with this at this point. I’ll even happily make the change, the rest of the discussion was just around the value here.
Author
Owner

@jpds commented on GitHub (Sep 22, 2023):

to be clear, I think you mean, “logs the client IP address”, correct?

The opposite, obfuscating a client IP address / the domain it requested:

$ sudo ./coredns
.:53
CoreDNS-1.11.1
linux/amd64, go1.20.7, ae2bbc2
[INFO] 127.0.0.1:46610 - 40617 "A IN bbc.com. udp 48 false 1232" NOERROR qr,aa,rd 79 0.000286699s
<!-- gh-comment-id:1731817719 --> @jpds commented on GitHub (Sep 22, 2023): > to be clear, I think you mean, “logs the client IP address”, correct? The opposite, obfuscating a client IP address / the domain it requested: ``` $ sudo ./coredns .:53 CoreDNS-1.11.1 linux/amd64, go1.20.7, ae2bbc2 [INFO] 127.0.0.1:46610 - 40617 "A IN bbc.com. udp 48 false 1232" NOERROR qr,aa,rd 79 0.000286699s ```
Author
Owner

@bluejekyll commented on GitHub (Sep 22, 2023):

Thank you for clarifying, I was able to interpret your comment both ways. This gets me back to some of my original questions about just making these guarantees in different logging levels, and saying INFO will have this, but WARN and ERROR will not?

<!-- gh-comment-id:1731878138 --> @bluejekyll commented on GitHub (Sep 22, 2023): Thank you for clarifying, I was able to interpret your comment both ways. This gets me back to some of my original questions about just making these guarantees in different logging levels, and saying INFO will have this, but WARN and ERROR will not?
Author
Owner

@LuckyTurtleDev commented on GitHub (Oct 15, 2023):

I still thinking adding the information at a log message manual by the user would be easier than filter the out by using a regex. Especial for domains this would be not easy, because they end at . like many other stuff and a regex would match to much or not all domains.

An other option would be a option to configure, the logging of this information independent of the log level.
For example by function or feature flag.

<!-- gh-comment-id:1763344878 --> @LuckyTurtleDev commented on GitHub (Oct 15, 2023): I still thinking adding the information at a log message manual by the user would be easier than filter the out by using a regex. Especial for domains this would be not easy, because they end at `.` like many other stuff and a regex would match to much or not all domains. An other option would be a option to configure, the logging of this information independent of the log level. For example by function or feature flag.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/hickory-dns#849
No description provided.