[GH-ISSUE #83] Remove style tags when extracting page title from HTML #57

Closed
opened 2026-02-25 23:33:21 +03:00 by kerem · 1 comment
Owner

Originally created by @sorbits on GitHub (Apr 9, 2018).
Original GitHub issue: https://github.com/go-shiori/shiori/issues/83

Some sites use styling tags in their <title> element.

An example is https://developers.google.com/apps-script/guides/clasp which currently wraps clasp in <code> tags.

I suggest that all HTML tags be removed from the page’s title.
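A minimal sketch of what that suggestion could look like in Go, the language shiori is written in. This is not shiori's actual code: the function name `stripTitleMarkup`, the regular expression, and the sample title string are all hypothetical, and a regex is only a rough stand-in for a real HTML parser. It removes anything that looks like a tag or comment, decodes character entities, and collapses leftover whitespace.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// tagPattern matches anything shaped like an HTML tag or comment.
// Assumption: a regex is good enough for short title strings; it is
// not a substitute for a real HTML parser.
var tagPattern = regexp.MustCompile(`<[^>]*>`)

// stripTitleMarkup is a hypothetical helper (not part of shiori):
// it drops tag-like markup from a raw title, decodes character
// entities, and collapses runs of whitespace into single spaces.
func stripTitleMarkup(raw string) string {
	noTags := tagPattern.ReplaceAllString(raw, "")
	decoded := html.UnescapeString(noTags)
	return strings.Join(strings.Fields(decoded), " ")
}

func main() {
	// Hypothetical title text, modeled on the page mentioned above.
	fmt.Println(stripTitleMarkup("Command Line Interface using <code>clasp</code>"))
	// prints: Command Line Interface using clasp
}
```

Entities are decoded only after the tags are removed, so an escaped `&lt;b&gt;` in a title survives as the literal text `<b>` rather than being stripped.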

kerem closed this issue 2026-02-25 23:33:21 +03:00

@sorbits commented on GitHub (Apr 9, 2018):

I’m closing this, as I checked with the HTML 4.01 specification and it clearly says:

> Titles may contain character entities (for accented characters, special characters, etc.), but may not contain other markup (including comments).

I mistakenly assumed the page mentioned above was well-formed (before checking the specification) because it was from Google :)
