[GH-ISSUE #187] MD047/single-trailing-newline bug? #162

Closed
opened 2026-03-03 01:24:16 +03:00 by kerem · 15 comments
Owner

Originally created by @mathiasbynens on GitHub (May 17, 2019).
Original GitHub issue: https://github.com/DavidAnson/markdownlint/issues/187

Reproduction steps:

  1. git clone git@github.com:v8/v8.dev.git && cd v8.dev
  2. npm install (note: this includes markdownlint-cli@0.15.0)
  3. npm install markdownlint-cli@0.16.0
  4. markdownlint src/**/*.md fails with the following error message:
src/blog/scanner.md: 95: MD047/single-trailing-newline Files should end with a single newline character

But, the last character in the file is definitely a U+000A LF character, as evidenced by hexdump:

$ hexdump -C src/blog/scanner.md | tail
00003900  73 73 69 62 6c 65 2e 20  49 64 65 61 6c 6c 79 2c  |ssible. Ideally,|
00003910  20 74 68 65 73 65 20 73  74 65 70 73 20 61 72 65  | these steps are|
00003920  20 61 75 74 6f 6d 61 74  65 64 20 61 73 20 70 61  | automated as pa|
00003930  72 74 20 6f 66 20 61 20  62 75 69 6c 64 20 70 72  |rt of a build pr|
00003940  6f 63 65 73 73 2c 20 69  6e 20 77 68 69 63 68 20  |ocess, in which |
00003950  63 61 73 65 20 79 6f 75  20 64 6f 6e e2 80 99 74  |case you don...t|
00003960  20 68 61 76 65 20 74 6f  20 77 6f 72 72 79 20 61  | have to worry a|
00003970  62 6f 75 74 20 69 74 20  77 68 65 6e 20 61 75 74  |bout it when aut|
00003980  68 6f 72 69 6e 67 20 63  6f 64 65 2e 0a           |horing code..|
0000398d
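
The same check can be scripted. Below is a minimal Python sketch that assumes only the final bytes shown in the hexdump above; the helper names are illustrative and are not part of markdownlint.

```python
def trailing_newline_count(data: bytes) -> int:
    """Count the consecutive LF bytes (0x0A) at the very end of data."""
    return len(data) - len(data.rstrip(b"\n"))


def contains_crlf(data: bytes) -> bool:
    """Report whether any CR LF (0x0D 0x0A) pair occurs in data."""
    return b"\r\n" in data


# Last 13 bytes of the file, copied from the final hexdump row ("horing code." + LF).
tail = bytes.fromhex("686f72696e6720636f64652e0a")

print(trailing_newline_count(tail))  # 1
print(contains_crlf(tail))           # False
```

To run it against the whole file rather than the tail, the bytes could be read with `pathlib.Path("src/blog/scanner.md").read_bytes()`.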

Ref.
https://github.com/DavidAnson/markdownlint/pull/176
https://github.com/igorshubovych/markdownlint-cli/issues/56


@DavidAnson commented on GitHub (May 17, 2019):

On a bus with a phone right now, but I’m suddenly afraid this is a CR/LF platform/encoding issue. I will have a look tonight and release a fix shortly if the rule is in error.


@DavidAnson commented on GitHub (May 17, 2019):

For the record, I am not a total buffoon - this rule behaves as expected on Windows, Mac, and Linux with default Git line ending settings: https://travis-ci.org/DavidAnson/markdownlint


@nschonni commented on GitHub (May 17, 2019):

I opened up the repo and it looks like the file is encoded with CRLF, except for line 94, which is CR.
Edit: nvm, missed in the block


@mathiasbynens commented on GitHub (May 17, 2019):

Ok, let’s focus on the scanner.md file for now, since the above discussion already indicates there’s something weird going on.

The file is here: https://github.com/v8/v8.dev/blob/master/src/blog/scanner.md
Its raw URL is https://raw.githubusercontent.com/v8/v8.dev/master/src/blog/scanner.md.

Just to make sure we’re all talking about the same exact thing, with no special Git config that normalizes line endings or anything like that, here’s the SHA256 checksum of the file:

$ sha256sum src/blog/scanner.md
e9e676725f4fd46d7f780ba1c768a6716212bf1d90ec6d06948960cfa3815f3d  src/blog/scanner.md

I can confirm just downloading the raw version gets you the exact same file:

$ curl https://raw.githubusercontent.com/v8/v8.dev/master/src/blog/scanner.md > x
$ sha256sum x
e9e676725f4fd46d7f780ba1c768a6716212bf1d90ec6d06948960cfa3815f3d  x

@nschonni Are you getting the same checksum? When you say “it looks like the file is encoded with CRLF, except for line 94, which is CR” that’s highly surprising, as I cannot confirm that at all.
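
A checksum comparison like this is sensitive to line endings, so two copies that look identical in an editor can still hash differently. Here is a small Python sketch of that effect, using made-up sample text rather than the real file; whether a line-ending conversion actually happened on anyone's machine is not confirmed in this thread.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of data, as sha256sum would print it."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical sample content, not taken from scanner.md.
lf_text = b"line one\nline two\n"
crlf_text = lf_text.replace(b"\n", b"\r\n")  # same text with CRLF line endings

# The digests differ even though only the line endings changed.
print(sha256_hex(lf_text) == sha256_hex(crlf_text))  # False
```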


@mathiasbynens commented on GitHub (May 17, 2019):

Here are simplified repro instructions:

curl https://raw.githubusercontent.com/v8/v8.dev/master/src/blog/scanner.md > x.md
npm i markdownlint-cli@0.16.0 -g
markdownlint x.md

Expected results

The output should not include any warnings related to MD047/single-trailing-newline.

Actual results

The output includes:

x: 95: MD047/single-trailing-newline Files should end with a single newline character

@nschonni commented on GitHub (May 17, 2019):

> When you say “it looks like the file is encoded with CRLF, except for line 94, which is CR” that’s highly surprising, as I cannot confirm that at all.

Sorry, I threw in a VS Code extension to show line endings, but it looks like it was bad. Opening in Notepad++ does show it as CRLF (unless I do that test curl, in which case it shows as LF, but the lint still fails).


@mathiasbynens commented on GitHub (May 17, 2019):

> Opening in Notepad++ does show it as CRLF

This is still surprising! The file is not supposed to contain any CRLF (i.e. U+000D U+000A) and doesn’t seem to, assuming hexdump is correct. 🤔 I wonder why you’re seeing something else than I am. Did you check if the checksum matches?


@nschonni commented on GitHub (May 17, 2019):

I'll check that next. I did find what was tripping it up: removing this line makes the linter happy.

[^1]: `<!--` is the start of an HTML comment, whereas `<!-` scans as “less than”, “not”, “minus”.

Since the rule does a quick check of line lengths, I think for whatever reason it's not counting this line, which is throwing off the calculation:
https://github.com/DavidAnson/markdownlint/blob/0b9b74ccfd8097340fa354fe394031093da4cfdd/lib/md047.js#L11-L17


@nschonni commented on GitHub (May 17, 2019):

$ sha256sum x.md
e9e676725f4fd46d7f780ba1c768a6716212bf1d90ec6d06948960cfa3815f3d *x.md
$ sha256sum src/blog/scanner.md
8a164ee0f4e653a196d9ac16873ddfe5d579228a07bba3d616781deb40d7ede8 *src/blog/scanner.md

@DavidAnson commented on GitHub (May 17, 2019):

That line may be tripping up the HTML comment detection code. If so, MD047 itself may be correct but may be seeing the second-to-last line instead of the last one.
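
To illustrate how a comment-handling step could throw off a trailing-newline check, here is a hypothetical Python sketch. It is not markdownlint's actual code: both regular expressions and both helper functions are assumptions made up for this example. The naive pattern lets an unterminated `<!--` swallow everything up to the end of the text, so the later check never sees the final newline.

```python
import re

# Hypothetical, deliberately naive stripper: an unterminated "<!--"
# consumes everything up to the end of the text.
NAIVE_COMMENT = re.compile(r"<!--.*?(?:-->|\Z)", re.DOTALL)

# Stricter variant: only strips comments that have a closing "-->".
CLOSED_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def ends_with_single_newline(text: str) -> bool:
    """True when text ends with exactly one LF."""
    return text.endswith("\n") and not text.endswith("\n\n")


def check_after_stripping(text: str, pattern: re.Pattern) -> bool:
    """Remove comments matched by pattern, then run the trailing-newline check."""
    return ends_with_single_newline(pattern.sub("", text))


# Markdown with "<!--" inside a code span and a correct trailing newline.
sample = "# Heading\n\n`<!--`\n"

print(check_after_stripping(sample, NAIVE_COMMENT))   # False (spurious failure)
print(check_after_stripping(sample, CLOSED_COMMENT))  # True
```

Neither variant understands code spans or fences, so this only shows one way such a false positive could arise under these assumptions.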


@DavidAnson commented on GitHub (May 17, 2019):

Yep, here’s a repro: https://dlaa.me/markdownlint/#%25m%23%20Heading%0A%0A%5B%5E1%5D%3A%20%60%3C!--%60%20is%2C%20whereas%20%60%3C!-%60%20scans%0A%0AText%0A

Sorry about that!


@DavidAnson commented on GitHub (May 17, 2019):

Slightly simpler: https://dlaa.me/markdownlint/#%25m%23%20Heading%0A%0A%60%3C!--%60%0A

I will debug later from a real computer. :)

Thanks for helping track this down, both of you!!


@DavidAnson commented on GitHub (May 17, 2019):

I've fixed this scenario, but realized there's a bigger issue with the handling of HTML comment bodies such that <!-- inside a code fence/span/block isn't correctly ignored. I'll address that in a future release. I'll release a patch with this commit in the next few days.


@DavidAnson commented on GitHub (May 18, 2019):

@mathiasbynens: Version 0.14.2 of markdownlint is now available. If you reinstall the CLI, it should pick up the new reference.


@mathiasbynens commented on GitHub (May 19, 2019):

Thanks for the quick fix!
