[GH-ISSUE #189] Long description meta tag causes invalid yaml to be produced #144

Closed
opened 2026-03-02 11:47:04 +03:00 by kerem · 1 comment

Originally created by @skirmess on GitHub (Jun 1, 2024).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/189

... or, at least I think so.

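I am using hoarder server version 0.14.0.

- Create a new bookmark of the following link: https://raw.githubusercontent.com/skirmess/hoarder-invalid-yaml-test/master/test1.html (I added the link in the web app)
- Dump the created link with the hoarder CLI

$ hoarder bookmarks list --list-id njk89ftpg0643vy5626wimxt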
[
  {
    id: 'ghn72hfezhm544wxhszobwji',
    createdAt: 2024-06-01T13:01:55.000Z,
    title: null,
    archived: false,
    favourited: false,
    taggingStatus: 'success',
    note: null,
    tags: [],
    content: {
      type: 'link',
      url: 'https://skirmess.github.io/hoarder-invalid-yaml-test/test1.html',
      title: null,
      description: 'This is a multiline description in the header: Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Additional info:\n' +
        'Another line: yes\n' +
        'and another : 17\n' +
        'last line : end',
      imageUrl: null,
      imageAssetId: null,
      screenshotAssetId: '2dd01621-b798-4bf0-8967-37b1d8adb30f',
      favicon: 'https://t3.gstatic.com/faviconV2?client=SOCIAL&type=FAVICON&fallback_opts=TYPE,SIZE,URL&url=https://github.io/hoarder-invalid-yaml-test/test1.html&size=128',
      htmlContent: '<div id="r... <CROPPED>',
      crawledAt: 2024-06-01T13:01:57.000Z
    }
  }
]

I cannot parse this YAML file with Perl:

$ perl -MYAML::PP -e 'YAML::PP->new->load_file(q{test1.yaml})'
Duplicate key '' at ~/perl5/lib/perl5/YAML/PP/Parser.pm line 61.
 at ~/perl5/lib/perl5/YAML/PP/Loader.pm line 94.

And www.yamllint.com also complains:

Unexpected scalar token at line 15, column 680
Nested mappings are not allowed in compact mappings at line 15, column 705
Implicit keys need to be on a single line at line 15, column 705
Nested mappings are not allowed in compact mappings at line 16, column 24
Implicit keys need to be on a single line at line 16, column 24
Block collections are not allowed within flow collections at line 15, column 20

The HTML file looks like this:

<html>
        <head>
                <meta property="og:description" content="This is a multiline description in the header: Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.



Additional info:
Another line: yes
and another : 17
last line : end" />
        </head>
        <body>
                Test 1
        </body>
</html>
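
To illustrate how the line breaks inside the `content` attribute survive into the stored description, here is a minimal sketch using only Python's standard `html.parser`. It is not hoarder's crawler code. The class `OgDescriptionParser`, the helper `extract_og_description`, and the shortened `SAMPLE_HTML` are hypothetical names and data introduced for this example.

```python
from html.parser import HTMLParser


class OgDescriptionParser(HTMLParser):
    """Collects the content of the first og:description meta tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; self-closing tags such as
        # <meta ... /> also end up here via handle_startendtag.
        attr_map = dict(attrs)
        if (
            tag == "meta"
            and attr_map.get("property") == "og:description"
            and self.description is None
        ):
            self.description = attr_map.get("content")


# Shortened, hypothetical stand-in for test1.html (assumption: same structure).
SAMPLE_HTML = """<html>
  <head>
    <meta property="og:description" content="Lorem ipsum dolor sit amet.

Additional info:
Another line: yes
and another : 17
last line : end" />
  </head>
  <body>Test 1</body>
</html>"""


def extract_og_description(html_text):
    """Return the og:description content, or None if the tag is absent."""
    parser = OgDescriptionParser()
    parser.feed(html_text)
    parser.close()
    return parser.description


description = extract_og_description(SAMPLE_HTML)
print(repr(description))
```

The extracted value keeps its embedded newlines and its `key: value` looking lines, which is the kind of text that a YAML parser is likely to misread when it appears unquoted or in a non-YAML quoting style.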
kerem closed this issue 2026-03-02 11:47:04 +03:00

@MohamedBassem commented on GitHub (Jun 9, 2024):

As mentioned in another issue, the output of the CLI is not meant to be YAML. @kamtschatka added JSON support in https://github.com/hoarder-app/hoarder/commit/cde97267a90802c6a367aa61ff157983506deead. If you want parsable output, use --json.
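
As a rough illustration of why JSON is the safer interchange format here, the sketch below round-trips a multi-line description through Python's standard `json` module. It does not invoke the hoarder CLI, and the `bookmark` record is a hypothetical, shortened stand-in shaped like the listing in this issue; the exact structure that `--json` emits is an assumption.

```python
import json

# Hypothetical bookmark record (assumption: field names copied from the
# CLI listing in this issue, description shortened).
bookmark = {
    "id": "ghn72hfezhm544wxhszobwji",
    "title": None,
    "archived": False,
    "tags": [],
    "content": {
        "type": "link",
        "url": "https://skirmess.github.io/hoarder-invalid-yaml-test/test1.html",
        "description": (
            "Lorem ipsum dolor sit amet. Additional info:\n"
            "Another line: yes\n"
            "and another : 17\n"
            "last line : end"
        ),
    },
}

# Serialize the way a JSON-emitting tool would: newlines inside strings
# are escaped as \n, so every string stays on a single line.
encoded = json.dumps([bookmark], indent=2)

# Any JSON parser can read the text back without ambiguity.
decoded = json.loads(encoded)

print(decoded[0]["content"]["description"])
```

Because JSON escapes the embedded newlines and quotes every string, the colons in lines such as `Another line: yes` cannot be mistaken for mapping keys.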
