[GH-ISSUE #435] Add option to ignore HTML elements and content to MD044/proper-names #2210

Closed
opened 2026-03-07 20:05:34 +03:00 by kerem · 15 comments
Owner

Originally created by @okalachev on GitHub (Oct 5, 2021).
Original GitHub issue: https://github.com/DavidAnson/markdownlint/issues/435

Capital letters are rarely used in file names, for a variety of reasons. But this configuration and Markdown:

"MD044": {"names": ["JavaScript"]}
<img src="javascript.png">

Gives:

MD044/proper-names Proper names should have the correct capitalization [Expected: JavaScript; Actual: javascript]

This problem first appeared in v0.24.0.


@DavidAnson commented on GitHub (Oct 5, 2021):

There are a lot of places a file name might appear. The new implementation of this rule probably finds more instances, which is why you've filed this. But I'm not sure I want to add a bunch of exceptions.

What about adding "javascript.png" to your allowed word list, similar to how "GitHub"/"github.com" is handled here: https://github.com/DavidAnson/markdownlint/blob/main/test/proper-names-projects.json
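As a rough illustration of that suggestion, the configuration could list the lowercase file name as its own entry next to the proper name. This is only a sketch, written as a JavaScript object (the variable name `configWithFileName` is made up here); the same shape can be saved as JSON. Whether one entry per file name is practical depends on how many distinct file names a project has.

```javascript
// Sketch of the suggested workaround: list the lowercase file name as an
// additional allowed "name" next to the proper name, similar to how the
// linked test file is said to handle "GitHub" and "github.com".
const configWithFileName = {
  MD044: {
    names: ["JavaScript", "javascript.png"],
  },
};

// Serialize so the result can be pasted into a JSON configuration file.
console.log(JSON.stringify(configWithFileName, null, 2));
```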


@okalachev commented on GitHub (Oct 5, 2021):

But the previous version worked better: it didn't report violations in file names, but it did report them everywhere else.

We don't use capitalization in file names, and I believe that's a general rule. With the latest version I now have 159 errors in my documentation, and it doesn't make sense to add all of them as exceptions.

Isn't there a simple algorithm that can detect a file name, or at least the values of the HTML src and href attributes plus Markdown image addresses, or something like that?
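One way to picture the kind of detection being asked for is a small helper that locates src and href attribute values so a checker could skip those ranges. The sketch below is an assumption for illustration only: `findUrlAttributeRanges` is a hypothetical function, not part of markdownlint, and it handles only quoted attribute values.

```javascript
// Hypothetical helper (not part of markdownlint): find the character ranges
// of quoted src/href attribute values in a line so a checker could skip them.
function findUrlAttributeRanges(line) {
  const ranges = [];
  // Matches src="..." or href='...'; unquoted values are not handled here.
  const pattern = /\b(?:src|href)\s*=\s*(["'])(.*?)\1/gi;
  let match;
  while ((match = pattern.exec(line)) !== null) {
    // The value ends one character (the closing quote) before the match ends.
    const valueStart = match.index + match[0].length - match[2].length - 1;
    ranges.push({ start: valueStart, end: valueStart + match[2].length });
  }
  return ranges;
}

// The alt text is deliberately left out of the ranges, so it would still be
// checked for proper names.
const htmlLine = '<img src="javascript.png" alt="javascript logo">';
for (const { start, end } of findUrlAttributeRanges(htmlLine)) {
  console.log(htmlLine.slice(start, end)); // prints "javascript.png"
}
```

A range-based result keeps the decision separate from the detection: the caller can ignore any name match that falls inside one of the returned ranges.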


@DavidAnson commented on GitHub (Oct 5, 2021):

I think the previous implementation of this rule missed things like text in HTML alt and title attributes, so this is an improvement.

Are all 159 new issues in your case unique? If there are duplicates like I'm expecting, a few new entries in the word list will solve the problem. Can you give some other examples, please?


@okalachev commented on GitHub (Oct 6, 2021):

@DavidAnson, I examined the errors, and most of them are unique. They fall into the following groups (assuming JavaScript is configured as the proper spelling):

  • Unique image file names (in HTML). It doesn’t make sense to check HTML attributes like src and href (although I don't know whether HTML gets parsed here).

  • Directory names in paths. There is no nice way to fix detections in:

    <img src="img/javascript/1.png">

    Except by adding something odd like "javascript/" to the list.

  • HTML identifiers. Like:

    <script type="text/javascript">

    <a id="javascript-1">

    And even:

    <javascript/>

  • Identifiers in <code> tags:

    This is not an error:

    `javascript`

    But this is:

    <code>javascript</code>

    Even though code should obviously not be checked.


@DavidAnson commented on GitHub (Oct 7, 2021):

It looks like the request to ignore file names would only avoid some of these scenarios, so would be a partial solution for you at best.

This project is a linting tool for Markdown and it seems all of your examples use HTML. It's true that HTML can be used in Markdown, but many consider that to be an anti-pattern. Rule MD033/no-inline-html warns against doing so.

I'm not expecting to add a bunch of HTML handling to support the variety of examples you show, but I will leave the issue open for comment.


@okalachev commented on GitHub (Oct 7, 2021):

@DavidAnson, what about adding a parameter that disables this rule for embedded HTML entirely? The rule would then work the way it did before. HTML is not checked correctly anyway, and you say HTML is not recommended in Markdown.

That would work for me, I believe.


@groenroos commented on GitHub (Feb 19, 2022):

We've just run into this issue as well with mixed Markdown and HTML, with instances in img tag srcs being flagged under this rule.

If there was an option to ignore HTML with this rule, I'd turn it on as preferable to the current situation; but as it's been pointed out, it would miss out on alt and title attributes that could contain user-facing text content, which wouldn't be ideal.

I think being able to ignore situations where the triggering word is "filename-y" (kebab-case, not surrounded by whitespace, etc.) would be very useful, probably realistic to implement, and getting closer (even if not perfect) to eliminating false positives.

As it stands now, we unfortunately have to turn off this rule completely.
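The "filename-y" idea above could be sketched roughly as follows. This is a guess at what such a heuristic might look like, not anything markdownlint implements: `looksLikeFileName` is a hypothetical function, and the set of characters it treats as path or file-name punctuation is an assumption.

```javascript
// Hypothetical heuristic (not markdownlint's behavior): treat a match as
// "filename-y" when it is glued to path or file-name punctuation instead of
// being surrounded by whitespace.
function looksLikeFileName(text, start, end) {
  const before = start > 0 ? text[start - 1] : " ";
  const after = end < text.length ? text[end] : " ";
  const next = end + 1 < text.length ? text[end + 1] : " ";
  const joiner = /[-_\/\\]/;
  // A joiner directly before or after the word, e.g. "img/javascript/1.png".
  if (joiner.test(before) || joiner.test(after)) return true;
  // A dot followed by a letter or digit, e.g. "javascript.png", but not a
  // sentence-ending period as in "I like javascript. Really."
  return after === "." && /[A-Za-z0-9]/.test(next);
}

const sample =
  "See img/javascript/1.png and javascript.png, but fix javascript here.";
for (const m of sample.matchAll(/javascript/gi)) {
  console.log(m[0], looksLikeFileName(sample, m.index, m.index + m[0].length));
}
```

As the comment itself concedes, a heuristic like this is imperfect: a hyphenated phrase such as "javascript-based" would also be skipped.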


@chriswong commented on GitHub (Mar 30, 2022):

"MD044": {"names": ["JavaScript", "javascript"]}

I think both JavaScript and javascript should be accepted under this configuration, but JavaScript is not.


@DavidAnson commented on GitHub (Mar 30, 2022):

@chriswong I would have expected that as well. The code handles matching substrings by sorting in order of length, but that does not help here where two names differ only in case. I included an example of the problem below and have added a note to myself to look into this. Thank you!

https://dlaa.me/markdownlint/#%25m%23%20Issue%20%3F%3F%3F%0A%0AOkay%3A%20javascript.png%0A%0AOkay%3A%20JavaScript.png%0A%0ABad%3A%20JAVASCRIPT%0A%0A%3C!--%20markdownlint-configure-file%20%7B%0A%20%20%22MD044%22%3A%20%7B%22names%22%3A%20%5B%22JavaScript%22%2C%20%22javascript%22%5D%7D%0A%7D%20--%3E%0A
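To make the failure mode concrete, here is a minimal sketch of a length-sorted, case-insensitive matcher. It is not markdownlint's code: `findViolations`, `escapeRegExp`, and the `acceptAnyListedName` flag are invented for illustration, and the flag shows only one possible way to handle names that differ only in case.

```javascript
// Illustrative only: a simplified proper-name checker, not markdownlint's code.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function findViolations(text, names, acceptAnyListedName) {
  // Longest names first, so "github.com" is claimed before "GitHub".
  const sorted = [...names].sort((a, b) => b.length - a.length);
  const claimed = new Set(); // start offsets already handled by an earlier name
  const violations = [];
  for (const name of sorted) {
    const pattern = new RegExp(escapeRegExp(name), "gi");
    for (const m of text.matchAll(pattern)) {
      if (claimed.has(m.index)) continue;
      claimed.add(m.index);
      // Strict mode compares against the current name only; the alternative
      // accepts the matched text if it equals any configured name.
      const ok = acceptAnyListedName ? names.includes(m[0]) : m[0] === name;
      if (!ok) violations.push({ expected: name, actual: m[0] });
    }
  }
  return violations;
}

const names = ["JavaScript", "javascript"];
const text = "javascript.png JavaScript.png JAVASCRIPT";
// Equal-length names are not separated by the length sort, so in strict mode
// the first name claims every occurrence and "javascript" is reported.
console.log(findViolations(text, names, false));
// Accepting any listed spelling leaves only "JAVASCRIPT" reported.
console.log(findViolations(text, names, true));
```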


@DavidAnson commented on GitHub (Apr 26, 2022):

The first commit addresses the javascript/JavaScript issue. The second commit addresses the first three of the four examples in the detailed comment. It does not detect <code>...</code> and ignore the inner content because a) that starts to require parsing complex HTML and b) that is better written with Markdown backticks (unlike the other examples, which do not have exactly equivalent Markdown representations).


@okalachev commented on GitHub (May 5, 2022):

@DavidAnson, thanks, I will try your fixes!


@okalachev commented on GitHub (Jun 16, 2022):

@DavidAnson, it’s actually not so easy to try this out, as markdownlint-cli and markdownlint are different things, so I can’t easily make markdownlint-cli use the cloned version of markdownlint instead of the one downloaded from npm (which lacks this feature).

Is there an easy way to check?


@DavidAnson commented on GitHub (Jun 16, 2022):

The easiest thing is probably to clone this repository into the markdownlint folder under node_modules of the CLI. Or wait a little longer because I am getting close to doing another round of releases.


@okalachev commented on GitHub (Jun 16, 2022):

@DavidAnson, I was able to test it with a symlink, and html_elements: false fixed most of the false positives, that's great!

However, things like <code>javascript</code> are still incorrectly detected. Luckily, there are very few of them in my case.
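For reference, a configuration using the option mentioned above might look like the sketch below. It is written as a JavaScript object (the variable name `configIgnoringHtml` is made up here) and can be saved as JSON. Going by this thread, the setting stops the rule from checking HTML elements themselves, while text between tags such as <code>javascript</code> is still checked.

```javascript
// Sketch of an MD044 configuration that turns off checking of HTML elements.
// "names" and "html_elements" are the parameters discussed in this thread.
const configIgnoringHtml = {
  MD044: {
    names: ["JavaScript"],
    html_elements: false,
  },
};

// Serialize so the result can be pasted into a JSON configuration file.
console.log(JSON.stringify(configIgnoringHtml, null, 2));
```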


@okalachev commented on GitHub (Jun 16, 2022):

I read that this is a known issue. In my case, <code> is used because the text is inside an HTML table, and it's impossible to use Markdown inside an HTML table. Sometimes using an HTML table is necessary.
