[GH-ISSUE #579] MD033 flagging HTML tags in image alt text strings #2310

Closed
opened 2026-03-07 20:06:33 +03:00 by kerem · 9 comments

Originally created by @nschonni on GitHub (Sep 11, 2022).
Original GitHub issue: https://github.com/DavidAnson/markdownlint/issues/579

https://dlaa.me/markdownlint/#%25m!%5BThe%20default%2C%20focused%2C%20and%20disabled%20%3Ctextarea%3E%20element%20in%20Firefox%2071%20and%20Safari%2013%20on%20Mac%20OSX%20and%20Edge%2018%2C%20Yandex%2014%2C%20Firefox%20and%20Chrome%20on%20Windows%2010.%5D(textarea_basic.png)
Since the alt text doesn't get parsed as HTML, there shouldn't be a need to escape these tags.
I ran into this because I had been escaping the tags, but then running Prettier would strip the escaping because it was not needed.

A similar thing happens with link title strings, but that's probably a separate bug.
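
For readers who cannot open the demo link, the Markdown under test can be recovered by decoding the URL fragment. The following is a minimal Python sketch using only the standard library. The helper name `markdown_from_demo_url` is hypothetical, and treating the leading `%m` as a marker added by the demo page is an assumption based on the shape of the URL, not on documented behavior.

```python
from urllib.parse import unquote, urlsplit

# The demo URL from the issue, split across lines for readability.
DEMO_URL = (
    "https://dlaa.me/markdownlint/#%25m!%5BThe%20default%2C%20focused%2C%20and%20"
    "disabled%20%3Ctextarea%3E%20element%20in%20Firefox%2071%20and%20Safari%2013%20"
    "on%20Mac%20OSX%20and%20Edge%2018%2C%20Yandex%2014%2C%20Firefox%20and%20Chrome%20"
    "on%20Windows%2010.%5D(textarea_basic.png)"
)


def markdown_from_demo_url(url: str) -> str:
    """Percent-decode the URL fragment and return the Markdown it carries."""
    decoded = unquote(urlsplit(url).fragment)
    # Assumption: a leading "%m" is a marker used by the demo page, not content.
    return decoded[2:] if decoded.startswith("%m") else decoded


if __name__ == "__main__":
    print(markdown_from_demo_url(DEMO_URL))
```

The decoded text is a single image whose alt text contains an unescaped `<textarea>` tag, which is the construct MD033 reports.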

kerem closed this issue and added the question label on 2026-03-07 20:06:33 +03:00.

@DavidAnson commented on GitHub (Sep 11, 2022):

As your example shows, HTML content in the image alternate text region can be removed by the parser and so I think it is reasonable for markdownlint to warn about it.

Here is an example using markdown-it directly: http://markdown-it.github.io/#md3=%7B%22source%22%3A%22%23%20Issue%20579%5Cn%5Cn!%5Btext%20%3Ctextarea%3E%20text%5D%28image.png%29%5Cn%22%2C%22defaults%22%3A%7B%22html%22%3Atrue%2C%22xhtmlOut%22%3Afalse%2C%22breaks%22%3Afalse%2C%22langPrefix%22%3A%22language-%22%2C%22linkify%22%3Atrue%2C%22typographer%22%3Atrue%2C%22_highlight%22%3Atrue%2C%22_strict%22%3Afalse%2C%22_view%22%3A%22src%22%7D%7D


@nschonni commented on GitHub (Sep 11, 2022):

Hmm, I'm thinking it might be a Markdown-it bug then. If you run it through GitHub's parser or the remark parser that Prettier uses, it's not treated as a literal:

![The default, focused, and disabled <textarea> element in Firefox 71 and Safari 13 on Mac OSX and Edge 18, Yandex 14, Firefox and Chrome on Windows 10.](textarea_basic.png)

@DavidAnson commented on GitHub (Sep 11, 2022):

Toggling the "HTML" checkbox on that demo page opts into and out of this removal behavior.

Skimming the CommonMark specification, it's not clear to me that this scenario is directly addressed, so I think the parser is behaving consistently.


@nschonni commented on GitHub (Sep 11, 2022):

I filed something on Markdown-it, but looking at the spec (https://spec.commonmark.org/0.30/#images), it is light on detail. It does say:

> Though this spec is concerned with parsing, not rendering, it is recommended that in rendering to HTML, only the plain string content of the [image description](https://spec.commonmark.org/0.30/#image-description) be used. Note that in the above example, the alt attribute’s value is `foo bar`, not `foo [bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string content is rendered, without formatting.
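
The "plain string content" recommendation quoted above can be sketched in a few lines. This is an illustrative Python model, not code from any parser. The `Text`, `Emph`, and `HtmlInline` node classes and the `keep_raw_html` flag are hypothetical names. The flag only mimics the two outcomes reported in this thread: the tag kept in the CommonMark dingus, and the tag dropped by markdown-it with HTML enabled.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Text:
    literal: str


@dataclass
class HtmlInline:
    literal: str  # raw inline HTML such as "<textarea>"


@dataclass
class Emph:
    children: List["Node"] = field(default_factory=list)


Node = Union[Text, HtmlInline, Emph]


def alt_text(nodes: List[Node], keep_raw_html: bool) -> str:
    """Flatten an image description into the string used for the alt attribute."""
    parts = []
    for node in nodes:
        if isinstance(node, Text):
            parts.append(node.literal)
        elif isinstance(node, HtmlInline):
            # Hypothetical switch between the two behaviors seen in the thread.
            if keep_raw_html:
                parts.append(node.literal)
        else:
            # Formatting is dropped; only the text inside it survives.
            parts.append(alt_text(node.children, keep_raw_html))
    return "".join(parts)


emphasized = [Text("foo "), Emph([Text("bar")]), Text(" baz")]
with_tag = [Text("text "), HtmlInline("<textarea>"), Text(" text")]

if __name__ == "__main__":
    print(alt_text(emphasized, keep_raw_html=True))
    print(alt_text(with_tag, keep_raw_html=True))
    print(alt_text(with_tag, keep_raw_html=False))
```

Either way the emphasis markers disappear. The open question in this issue is only whether raw inline HTML counts as formatting to be dropped or as text to be kept.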

@rlidwka commented on GitHub (Sep 12, 2022):

Everything is parsed in alt, but only plain text is rendered. Consider `![foo *bar* baz]()`: it's going to lose the asterisks (in cmark and in GitHub's version too).

I believe a linter for CommonMark syntax should flag any non-text, non-escape content inside an image's alt text, because parsers will just ignore it. HTML is no exception there.
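
As a rough illustration of the kind of check being discussed, here is a simplified Python sketch that reports unescaped HTML-like tags inside image alt text. It is not markdownlint's MD033 implementation. The regular expressions and the `html_in_alt` helper are hypothetical, and they ignore nested brackets, reference-style images, and code spans.

```python
import re
from typing import List, Tuple

# Simplified inline image: ![alt](destination). Nested brackets are not handled.
IMAGE_RE = re.compile(r"!\[(?P<alt>[^\]]*)\]\((?P<dest>[^)]*)\)")
# An opening or closing tag whose "<" is not backslash-escaped.
TAG_RE = re.compile(r"(?<!\\)<(/?[A-Za-z][A-Za-z0-9-]*)[^<>]*>")


def html_in_alt(markdown: str) -> List[Tuple[int, str]]:
    """Return (line number, element name) for each tag found in image alt text."""
    findings = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for image in IMAGE_RE.finditer(line):
            for tag in TAG_RE.finditer(image.group("alt")):
                findings.append((lineno, tag.group(1).lstrip("/").lower()))
    return findings


if __name__ == "__main__":
    sample = "# Issue 579\n\n![text <textarea> text](image.png)\n"
    print(html_in_alt(sample))
```

This sketch only covers the HTML case. Flagging emphasis and other non-text content in alt text, as suggested above, would need a real inline parser rather than regular expressions.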


@nschonni commented on GitHub (Sep 12, 2022):

The CommonMark sample does remove asterisks, but doesn't remove tags: https://spec.commonmark.org/dingus/?text=%23%20Issue%20579%0A%0A!%5Btext%20*asterisks*%20text%5D(image.png)%0A%0A!%5Btext%20%3Ctextarea%3E%20text%5D(image.png)%0A%0A

@nschonni commented on GitHub (Sep 12, 2022):

Found a relevant discussion, https://github.com/commonmark/commonmark-spec/issues/716, but there is no resolution right now.


@DavidAnson commented on GitHub (Oct 18, 2022):

Closing this based on my Sept 10 example and lack of agreement in the comments about whether this is reasonable.


@nschonni commented on GitHub (Oct 18, 2022):

OK, I'll ping this issue if there is a resolution on the CommonMark or Markdown-it issues.
