[GH-ISSUE #253] Warn against dead links within the document #217

Closed
opened 2026-03-03 01:24:48 +03:00 by kerem · 5 comments
Owner

Originally created by @ben-clayton on GitHub (Feb 12, 2020).
Original GitHub issue: https://github.com/DavidAnson/markdownlint/issues/253

It would be fantastic if bad anchor-style links were detected and warned about.
I wouldn't expect bad external URL links to be caught.


@koppor commented on GitHub (Apr 29, 2020):

Dead external links can be discovered with [markdown-link-check](https://github.com/tcort/markdown-link-check):

One file: `npx markdown-link-check README.md`.

Multiple files, writing the results to a text file:

```bash
find . -name \*.md -exec npx markdown-link-check -qq {} \; > bad-links.txt
```

Note that I used [npx](https://www.npmjs.com/package/npx) to avoid an explicit global installation of the markdown-link-check package.

@nichtich commented on GitHub (Jun 29, 2020):

Thanks for the pointer to markdown-link-check, but I'd prefer to have basic support as part of markdownlint instead of having to use yet another tool with different usage and a lot of functionality I don't need anyway. The dead-links use case in this issue is anchor-style links, so the implementation would be:

1. extract all heading titles and [create anchor ids](https://github.com/markedjs/marked/blob/743ec55fe16e0b829f68db0b6b3cde9c0b23e78a/src/Slugger.js#L12)
2. extract all link targets starting with `#` and check whether a matching id exists.

Note that this would not catch link targets within inline HTML, but this is [discouraged by MD033](https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md033) anyway.
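
The two steps above could be sketched roughly as follows. This is only a minimal Python illustration, not markdownlint code: the names `slugify` and `find_dead_anchor_links` and both regular expressions are made up for this example. It assumes ATX headings (`# Title`) without closing hashes and inline links of the form `[text](#id)`, and it does not skip fenced code blocks or disambiguate duplicate headings.

```python
import re

# Hypothetical patterns for this sketch: ATX headings and inline "#fragment" links.
HEADING_RE = re.compile(r"^#{1,6}\s+(.+?)\s*$")
LINK_RE = re.compile(r"\]\(#([^)\s]+)\)")


def slugify(title):
    """Simplified anchor id: lowercase, drop punctuation, spaces become dashes."""
    title = re.sub(r"[^\w\- ]", "", title.lower())
    return title.replace(" ", "-") or "section"


def find_dead_anchor_links(markdown):
    """Return (line_number, fragment) pairs whose fragment matches no heading id."""
    lines = markdown.splitlines()

    # Step 1: extract all heading titles and create anchor ids.
    ids = set()
    for line in lines:
        match = HEADING_RE.match(line)
        if match:
            ids.add(slugify(match.group(1)))

    # Step 2: extract link targets starting with "#" and look each one up.
    dead = []
    for number, line in enumerate(lines, start=1):
        for match in LINK_RE.finditer(line):
            if match.group(1) not in ids:
                dead.append((number, match.group(1)))
    return dead


doc = "# Getting Started\n\nSee [setup](#getting-started) and [faq](#faq).\n"
print(find_dead_anchor_links(doc))  # [(3, 'faq')]
```

A real rule would more likely work on the parser's token stream than on regular expressions, so that headings and links inside code blocks are ignored.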

@DavidAnson commented on GitHub (Jun 29, 2020):

@nichtich Is the anchor ID algorithm part of a specification anywhere, or are you suggesting making this specific to GitHub?


@nichtich commented on GitHub (Jun 29, 2020):

Anchor links from header titles [have been discussed in CommonMark](https://talk.commonmark.org/t/feature-request-automatically-generated-ids-for-headers/) but are not part of CommonMark. Popular implementations that create anchor ids from header titles [include Pandoc](https://pandoc.org/MANUAL.html#headings-and-sections) and GitHub. Pandoc also supports the GitHub algorithm (see feature `gfm_auto_identifiers`), so the GitHub algorithm seems the best choice:

* spaces are converted to dashes (`-`)
* uppercase characters are converted to lowercase characters
* punctuation characters other than `-` and `_` are removed
* if nothing is left, use the identifier `section`

For a more extensive comparison of algorithms see <https://babelmark.github.io/>
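
As a rough illustration of the four rules listed above, here is a minimal Python sketch. The function name `github_style_anchor` is made up for this example, and the code only approximates the rules as described in this comment; it is not GitHub's or Pandoc's actual implementation and does not handle duplicate headings.

```python
import re


def github_style_anchor(heading: str) -> str:
    """Approximate the anchor-id rules from the list above (illustrative only)."""
    # Uppercase characters are converted to lowercase.
    text = heading.strip().lower()
    # Punctuation other than "-" and "_" is removed; letters, digits and spaces stay.
    text = re.sub(r"[^\w\- ]", "", text)
    # Spaces are converted to dashes.
    text = text.replace(" ", "-")
    # If nothing is left, fall back to the identifier "section".
    return text or "section"


print(github_style_anchor("Hello, World!"))    # hello-world
print(github_style_anchor("snake_case Name"))  # snake_case-name
print(github_style_anchor("!!!"))              # section
```

Each space becomes its own dash, so under these rules a heading such as `A - B` would yield `a---b` rather than `a-b`.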

@DavidAnson commented on GitHub (Jun 29, 2020):

Great info, thank you!
