[GH-ISSUE #862] Parsing error and missing content on theregister.com #400

Open
opened 2026-02-25 23:34:08 +03:00 by kerem · 0 comments

Originally created by @lgrn on GitHub (Mar 19, 2024).
Original GitHub issue: https://github.com/go-shiori/shiori/issues/862

Data

  • Shiori version: 1.6.0 (build 595cb45a2ce7031926ead1f1e1324e673e36d676)
  • Database Engine: sqlite
  • Operating system: Debian 12
  • CLI/Web interface/Web Extension: None

Describe the bug / actual behavior

Shiori fails to parse quotes; they are not included in the saved content.

Expected behavior

The quotes are part of the article and should be included, preferably with some kind of UI indication that they are quotes. At the very least, their text should be present in the saved content.

To Reproduce

Steps to reproduce the behavior:

  1. Save the article https://www.theregister.com/2024/03/18/truenas_abandons_freebsd/
  2. Inspect the saved content
  3. Note that the paragraph beginning with "The creator of PC-BSD(...)" has been saved
  4. Note that the following quote beginning with "Right now the plan(...)" is missing

Notes

This is an HTML excerpt of the problematic section -- the <p> within the <div> is not included:

<p>The creator of PC-BSD(...)</p>
<div class="blockextract">
<p>Right now the plan(...)</p>
</div>
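
To make the report easier to verify, here is a minimal sketch of a check that lists which paragraphs from the source HTML are absent from the saved content. It is written in Python with only the standard library and does not use or reflect Shiori's own parsing code. The names `ParagraphCollector`, `extract_paragraphs` and `missing_paragraphs` are hypothetical, and the saved content is assumed to be available as an HTML string.

```python
from html.parser import HTMLParser

# The excerpt from the Notes section above, used as sample input.
EXCERPT = """
<p>The creator of PC-BSD(...)</p>
<div class="blockextract">
<p>Right now the plan(...)</p>
</div>
"""


class ParagraphCollector(HTMLParser):
    """Collect the text of every <p> and note whether it sits inside div.blockextract."""

    def __init__(self):
        super().__init__()
        self.div_stack = []   # one bool per open <div>: is it a blockextract?
        self.in_p = False
        self.buffer = []
        self.paragraphs = []  # list of (text, inside_blockextract)

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            self.div_stack.append("blockextract" in classes)
        elif tag == "p":
            self.in_p = True
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == "div" and self.div_stack:
            self.div_stack.pop()
        elif tag == "p" and self.in_p:
            text = "".join(self.buffer).strip()
            self.paragraphs.append((text, any(self.div_stack)))
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:
            self.buffer.append(data)


def extract_paragraphs(html_text):
    """Return (text, inside_blockextract) for each <p> in html_text."""
    parser = ParagraphCollector()
    parser.feed(html_text)
    parser.close()
    return parser.paragraphs


def missing_paragraphs(source_html, saved_html):
    """Return the source paragraphs whose text does not appear in the saved content."""
    saved = {text for text, _ in extract_paragraphs(saved_html)}
    return [(text, quoted)
            for text, quoted in extract_paragraphs(source_html)
            if text not in saved]
```

Calling `missing_paragraphs(EXCERPT, saved_html)` with the archived content as `saved_html` would report the "Right now the plan(...)" paragraph, flagged as a quote, if it was dropped as described in this issue.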