mirror of
https://github.com/DavidAnson/markdownlint.git
synced 2026-04-26 09:46:01 +03:00
[GH-ISSUE #513] Bulk processing of markdown files for validating local links #418
Originally created by @holamgadol on GitHub (Mar 31, 2022).
Original GitHub issue: https://github.com/DavidAnson/markdownlint/issues/513
I want to make a markdownlint extension for validating local links in Foliant projects, especially for those using MkDocs and CustomIDs.
remark-validate-links has similar capabilities, but it is tailored to projects hosted on GitHub/GitLab. MkDocs (the SSG backend for Foliant) differs enough that it would be unreasonable to add MkDocs support to remark-validate-links.
I've decided to make an effort with markdownlint. It has a clear structure and great VS Code compatibility. Unfortunately, markdownlint has one restriction: it parses only one MD file at a time, and I'm looking for a way to process MD files in bulk.
I want to apply the algorithm from remark-validate-links, where headings and anchors are parsed into a list of refs and links are parsed into a list of links. The two lists are then compared to find broken local links.
So is it possible to bulk-process MD files? I think it could be done with a custom runner script, not only by rules.
I've read the discussion on the PR "feat(rules): add valid-link-fragments". That PR tried to solve a similar problem, but in a different way.
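For illustration, the two-list comparison described in this comment could be sketched roughly as follows. This is a minimal sketch, not the actual code of remark-validate-links or markdownlint: the function names, the simplified slug rule, the `{#custom-id}` heading syntax, the link regex, and the sample file contents are all assumptions made for the example. It also assumes every file lives in the same directory and treats a link to a file outside the input as broken.

```javascript
// Sketch only: slug rules and the link regex are simplified assumptions,
// not the behavior of remark-validate-links, MkDocs, or markdownlint.

// Turn heading text into an anchor (hypothetical, GitHub-like slug).
function slugify(text) {
  return text
    .trim()
    .toLowerCase()
    .replace(/[^\w\s-]/g, "")
    .replace(/\s+/g, "-");
}

// Pass 1: collect every anchor ("ref") defined by each input file.
function collectRefs(files) {
  const refs = new Map();
  for (const [name, content] of Object.entries(files)) {
    const anchors = new Set();
    for (const line of content.split(/\r?\n/)) {
      const heading = /^#{1,6}\s+(.*?)(?:\s*\{#([\w-]+)\})?\s*$/.exec(line);
      if (heading) {
        // Prefer an explicit {#custom-id} when one is present.
        anchors.add(heading[2] || slugify(heading[1]));
      }
    }
    refs.set(name, anchors);
  }
  return refs;
}

// Pass 2: collect every local link as { from, lineNumber, file, anchor }.
function collectLinks(files) {
  const links = [];
  const linkPattern = /\[[^\]]*\]\(([^)#\s]*)(?:#([^)\s]*))?\)/g;
  for (const [name, content] of Object.entries(files)) {
    content.split(/\r?\n/).forEach((line, index) => {
      for (const match of line.matchAll(linkPattern)) {
        const target = match[1];
        // Skip external links such as https://... or mailto:...
        if (/^[a-z][a-z0-9+.-]*:/i.test(target)) continue;
        links.push({
          from: name,
          lineNumber: index + 1,
          file: target || name, // "#anchor" alone points into the same file
          anchor: match[2] || null
        });
      }
    });
  }
  return links;
}

// Pass 3: compare the two lists only after ALL refs are known.
function findBrokenLinks(files) {
  const refs = collectRefs(files);
  return collectLinks(files).filter((link) => {
    const anchors = refs.get(link.file);
    if (!anchors) return true; // target file is not part of the input
    return link.anchor !== null && !anchors.has(link.anchor);
  });
}

// Hypothetical input: file name -> file content.
const files = {
  "readme.md": "# Intro\n\n## Setup {#install}\n",
  "article.md": "See [setup](readme.md#install) and [usage](readme.md#usage).\n"
};
console.log(findBrokenLinks(files));
```

Because refs from every file are gathered before any link is checked, the order in which files are read does not matter. That is the property a per-file rule cannot provide on its own.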
@DavidAnson commented on GitHub (Mar 31, 2022):
That remark project looks very similar to the proposed MD051 you link to and which I hope to finish off soon. As far as custom rules are concerned, they analyze one file at a time and don't really know about how many other files are going to be looked at. That's a pretty fundamental part of the system and I don't think you'll be able to work around that very cleanly. However, I assume the remark tool works similarly and so I'd want to understand how the behavior you want has been implemented there. Maybe that can provide some guidance? (FYI, I don't look at code for similar projects to avoid the chance of duplicating their code.)
@holamgadol commented on GitHub (Mar 31, 2022):
Well, I'll try to describe the algorithm from remark-validate-links in simple words
Links and refs are collected only from the input files. So for a link into a file that is not in the input, we can't check the anchor, only the existence of the document.
@DavidAnson commented on GitHub (Apr 1, 2022):
Thanks! What part of this do you think would be hard to translate to markdownlint?
@holamgadol commented on GitHub (Apr 1, 2022):
Collect all refs before checking links.
For example, there are two adjacent files (they are both in the input):
If we try to check the link in article.md before collecting refs from readme.md, we won't have any information about anchors in readme.md. So we'll miss the mistake in the link and just check the existence of readme.md.
@DavidAnson commented on GitHub (Apr 1, 2022):
Agreed. But you could check README.md on demand right then, I think? (At the cost of reading/parsing it.)
Does remark make the content of all files available to rules at once? Or does it let them go back and report issues for files that have already been scanned?
@holamgadol commented on GitHub (Apr 1, 2022):
Does checking on demand mean we can read a file that isn't in the input?
In the case of remark-validate-links, all files are available to rules at once, I suppose. I should check more carefully.
@DavidAnson commented on GitHub (Apr 1, 2022):
Under Node.js, the fs APIs are available and can be used. Under VS Code, the situation is more awkward because the file system MAY be virtualized.
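As a rough illustration of the on-demand approach under Node.js, a custom rule could read and parse the link target itself and cache the result. The sketch below is an assumption-laden example, not a finished rule: the rule name, the helper names, the simplified slug rule, and the link regex are hypothetical, and it relies on direct `fs` access, so it would not cover a virtualized file system under VS Code. Only the general custom-rule shape (`names`, `description`, `tags`, `function(params, onError)`) comes from markdownlint.

```javascript
const fs = require("fs");
const path = require("path");

// Cache of anchors per absolute path, so each target file is parsed once.
const anchorCache = new Map();

// Hypothetical, simplified slug (not MkDocs' or GitHub's exact algorithm).
function headingToAnchor(text) {
  return text.trim().toLowerCase().replace(/[^\w\s-]/g, "").replace(/\s+/g, "-");
}

// Read a target file on demand; returns a Set of anchors, or null if unreadable.
function anchorsOf(filePath) {
  if (!anchorCache.has(filePath)) {
    let anchors = null;
    try {
      anchors = new Set();
      for (const line of fs.readFileSync(filePath, "utf8").split(/\r?\n/)) {
        const heading = /^#{1,6}\s+(.*?)\s*$/.exec(line);
        if (heading) anchors.add(headingToAnchor(heading[1]));
      }
    } catch {
      anchors = null; // missing or unreadable file
    }
    anchorCache.set(filePath, anchors);
  }
  return anchorCache.get(filePath);
}

// Custom rule object; it still sees one file at a time via params.
const localLinkFragments = {
  names: ["local-link-fragments"],
  description: "Local links should point to existing files and headings",
  tags: ["links"],
  function: (params, onError) => {
    const baseDir = path.dirname(params.name);
    const linkPattern = /\[[^\]]*\]\(([^)#\s]*)(?:#([^)\s]*))?\)/g;
    params.lines.forEach((line, index) => {
      for (const match of line.matchAll(linkPattern)) {
        const [, target, fragment] = match;
        // Same-file fragments and external URLs are skipped in this sketch.
        if (!target || /^[a-z][a-z0-9+.-]*:/i.test(target)) continue;
        const anchors = anchorsOf(path.resolve(baseDir, target));
        if (anchors === null) {
          onError({ lineNumber: index + 1, detail: `Missing file: ${target}` });
        } else if (fragment && !anchors.has(fragment)) {
          onError({ lineNumber: index + 1, detail: `Missing anchor: ${target}#${fragment}` });
        }
      }
    });
  }
};

module.exports = localLinkFragments;
```

The trade-off is the one mentioned in the thread: each linked file is read and parsed an extra time (once per run, thanks to the cache), in exchange for not needing to know the full set of input files up front.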