[GH-ISSUE #244] Replace LALRPOP-based resolve.conf parser with a simpler one #116

Closed
opened 2026-03-07 22:20:02 +03:00 by kerem · 9 comments
Owner

Originally created by @briansmith on GitHub (Oct 23, 2017).
Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/244

Some issues with LALRPOP I've noticed:

  • LALRPOP is complex. It is difficult to audit it and the code it generates for correctness, even if one has an extensive background in parsing and parser generators.

  • LALRPOP generates large code. This is generally a good trade-off as the code is specialized to parse things very fast. However, in the case of a typical use of Trust-DNS-Resolver, resolv.conf is parsed exactly once, and resolv.conf is small and simple, so optimizing for code size makes more sense than optimizing for maximum parsing performance.

  • It has a lot of dependencies: github.com/nikomatsakis/lalrpop@01ccd43789/lalrpop/Cargo.toml (L16-L29). Especially with #239 and my (currently-)forthcoming PRs on top of it, LALRPOP's dependencies dwarf trust_dns_resolver's and trust_dns_proto's dependencies.

  • LALRPOP itself is very slow to build: I'm guess LALRPOP is self-hosting and thus uses a LALRPOP-generated parser. This makes LALRPOP very slow to build. It seems to be the slowest part of building trust_dns_resolver from scratch, by far.

I believe we can replace the LALRPOP-based parser with a much simpler one that doesn't need any third-party parsing library or parser generators, because resolv.conf has a very simple file format.

Originally created by @briansmith on GitHub (Oct 23, 2017). Original GitHub issue: https://github.com/hickory-dns/hickory-dns/issues/244 Some issues with LALRPOP I've noticed: * LALRPOP is complex. It is difficult to audit it and the code it generates for correctness, even if one has an extensive background in parsing and parser generators. * LALRPOP generates large code. This is generally a good trade-off as the code is specialized to parse things very fast. However, in the case of a typical use of Trust-DNS-Resolver, resolv.conf is parsed exactly once, and resolv.conf is small and simple, so optimizing for code size makes more sense than optimizing for maximum parsing performance. * It has a lot of dependencies: https://github.com/nikomatsakis/lalrpop/blob/01ccd43789db6f4f873a37673516b839fb04f422/lalrpop/Cargo.toml#L16-L29. Especially with #239 and my (currently-)forthcoming PRs on top of it, LALRPOP's dependencies dwarf trust_dns_resolver's and trust_dns_proto's dependencies. * LALRPOP itself is very slow to build: I'm guess LALRPOP is self-hosting and thus uses a LALRPOP-generated parser. This makes LALRPOP very slow to build. It seems to be the slowest part of building trust_dns_resolver from scratch, by far. I believe we can replace the LALRPOP-based parser with a much simpler one that doesn't need any third-party parsing library or parser generators, because resolv.conf has a very simple file format.
kerem 2026-03-07 22:20:02 +03:00
Author
Owner

@briansmith commented on GitHub (Oct 23, 2017):

I see now that also the LALRPOP-based parser has trouble with comments. It's probably easier to make a hand-written parser that skips comments than to write a custom lexer for the LALRPOP-based parser that supports comments.

<!-- gh-comment-id:338524550 --> @briansmith commented on GitHub (Oct 23, 2017): I see now that also the LALRPOP-based parser has trouble with comments. It's probably easier to make a hand-written parser that skips comments than to write a custom lexer for the LALRPOP-based parser that supports comments.
Author
Owner

@bluejekyll commented on GitHub (Oct 23, 2017):

Yes. I agree with all of this. The LALRPOP based parser was an experiment to see if it would be worth to use for zone’s file parsing, rather than the custom handwritten parser in place now.

It would be very easier to make a cleaner version of the resolv.conf parser that is more flexible as well.

<!-- gh-comment-id:338672851 --> @bluejekyll commented on GitHub (Oct 23, 2017): Yes. I agree with all of this. The LALRPOP based parser was an experiment to see if it would be worth to use for zone’s file parsing, rather than the custom handwritten parser in place now. It would be very easier to make a cleaner version of the resolv.conf parser that is more flexible as well.
Author
Owner

@briansmith commented on GitHub (Oct 25, 2017):

LALRPOP generates a parser that has a dependency on the regex crate, so while it looks like the regex dependency is unused, it is actually needed for this reason. If the new parser isn't implemented using regex then the regex dependency can be removed when the new parser is committed.

Besides regex, the lalrpop-util dependency can also be removed when lalrpop is removed.

Also, I noticed that there are some commented-out tests that imply that comments are allowed anywhere at the end of a line. Actualy, according to the Linux resolv.conf man page, comments must start at the first character of a line; i.e. they cannot follow non-comment text or even whitespace on a line. This should make comment filtering easy to handle in the new parser, since we can just break the input into lines and then filter out all lines that start with "#" or ";" before processing them.

<!-- gh-comment-id:339202403 --> @briansmith commented on GitHub (Oct 25, 2017): LALRPOP generates a parser that has a dependency on the `regex` crate, so while it looks like the `regex` dependency is unused, it is actually needed for this reason. If the new parser isn't implemented using `regex` then the `regex` dependency can be removed when the new parser is committed. Besides `regex`, the `lalrpop-util` dependency can also be removed when `lalrpop` is removed. Also, I noticed that there are some commented-out tests that imply that comments are allowed anywhere at the end of a line. Actualy, according to the Linux resolv.conf man page, comments must start at the first character of a line; i.e. they cannot follow non-comment text or even whitespace on a line. This should make comment filtering easy to handle in the new parser, since we can just break the input into lines and then filter out all lines that start with "#" or ";" before processing them.
Author
Owner

@little-dude commented on GitHub (Nov 28, 2017):

What do you think of https://github.com/tailhook/resolv-conf ?
The readme says it depends on nom but it doesn't actually. The only dependency is https://github.com/tailhook/quick-error, which does not depend on anything else.

<!-- gh-comment-id:347409020 --> @little-dude commented on GitHub (Nov 28, 2017): What do you think of https://github.com/tailhook/resolv-conf ? The readme says it depends on `nom` but it doesn't actually. The only dependency is https://github.com/tailhook/quick-error, which does not depend on anything else.
Author
Owner

@bluejekyll commented on GitHub (Nov 28, 2017):

I already use error-chain, I think that uses quick-error internally, so that's not a huge issue.

Looking at that impl, it's pretty simple and straightforward. I think that would be ok. I can't say if it handles everything that we need, as I only looked at it briefly, but it looks reasonable.

<!-- gh-comment-id:347585340 --> @bluejekyll commented on GitHub (Nov 28, 2017): I already use `error-chain`, I think that uses `quick-error` internally, so that's not a huge issue. Looking at that impl, it's pretty simple and straightforward. I think that would be ok. I can't say if it handles everything that we need, as I only looked at it briefly, but it looks reasonable.
Author
Owner

@little-dude commented on GitHub (Nov 29, 2017):

I've started working on this. I think we could use this crate to parse the resolv.conf. This give us a resolv_conf::Config that we can turn into a trust_dns_resolver::config::ResolverConfig and trust_dns_resolver::config::ResolverOpts.

I started porting trust-dns tests to https://github.com/tailhook/resolv-conf/ (see: https://github.com/tailhook/resolv-conf/pull/2)

<!-- gh-comment-id:347936722 --> @little-dude commented on GitHub (Nov 29, 2017): I've started working on this. I think we could use this crate to parse the resolv.conf. This give us a [`resolv_conf::Config`](https://docs.rs/resolv-conf/0.4.0/resolv_conf/struct.Config.html) that we can turn into a [`trust_dns_resolver::config::ResolverConfig`](https://docs.rs/trust-dns-resolver/0.6.0/trust_dns_resolver/config/struct.ResolverConfig.html) and [`trust_dns_resolver::config::ResolverOpts`](https://docs.rs/trust-dns-resolver/0.6.0/trust_dns_resolver/config/struct.ResolverOpts.html). I started porting trust-dns tests to https://github.com/tailhook/resolv-conf/ (see: https://github.com/tailhook/resolv-conf/pull/2)
Author
Owner

@cssivision commented on GitHub (Feb 6, 2018):

LALRPOP itself is very slow to build, can we cut a new release for resolver? @bluejekyll , also this issue can be closed.

<!-- gh-comment-id:363292378 --> @cssivision commented on GitHub (Feb 6, 2018): LALRPOP itself is very slow to build, can we cut a new release for resolver? @bluejekyll , also this issue can be closed.
Author
Owner

@bluejekyll commented on GitHub (Feb 6, 2018):

Yes. I need to put together some release notes. And after learning some more about mDNS requirements, I want to review some of the new name and label methods before we release. I think I have some time to do this tonight.

<!-- gh-comment-id:363295597 --> @bluejekyll commented on GitHub (Feb 6, 2018): Yes. I need to put together some release notes. And after learning some more about mDNS requirements, I want to review some of the new name and label methods before we release. I think I have some time to do this tonight.
Author
Owner

@bluejekyll commented on GitHub (Feb 6, 2018):

Fixed in #335

<!-- gh-comment-id:363312893 --> @bluejekyll commented on GitHub (Feb 6, 2018): Fixed in #335
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/hickory-dns#116
No description provided.