[GH-ISSUE #668] Subscription-based content not appearing #432

Closed
opened 2026-03-02 11:49:47 +03:00 by kerem · 3 comments

Originally created by @drycounty on GitHub (Nov 17, 2024).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/668

Describe the Bug

Only the beginning of an article is pulled from subscription-based websites, regardless of browser.
I noticed this while logged into an Atlantic.com account on Firefox. I can read the full article, and I have enabled

  CRAWLER_FULL_PAGE_ARCHIVE: true

yet only the beginning and ending of the article are pulled. The link works fine, but even the archived copy doesn't contain everything. I would love to be able to read the saved content after my subscription ends.

Steps to Reproduce

  1. Log in to Atlantic.com or NYTimes.com with a subscription and open any article.
  2. Click the Hoarder plugin (Firefox and Chrome).
  3. Check the pulled content: only the first two paragraphs appear, even in the archive.

Expected Behaviour

The full-length article should be saved, or archived if specified.

Screenshots or Additional Context

No response

Device Details

Firefox 132.0.2 plugin and Chrome 131.0.6778.70 plugin on macOS Sonoma

Exact Hoarder Version

0.19.0

kerem closed this issue 2026-03-02 11:49:48 +03:00

@MohamedBassem commented on GitHub (Nov 17, 2024):

Hoarder currently crawls the website without any cookies, so the site serves the page as it would to a signed-out visitor. Capturing the website from the client side is tracked in #50. Let's track that there.
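
To illustrate why a cookie-less crawl only sees the teaser, here is a minimal sketch in Python. It is not Hoarder's crawler code: `serve_article`, `server_side_crawl`, `browser_view`, the `session` cookie name, and the token value are all hypothetical, and the "paywalled site" is simulated in-process rather than fetched over the network.

```python
# Hypothetical simulation of a paywalled site; not Hoarder's actual crawler.
FULL_ARTICLE = [
    "Paragraph 1",
    "Paragraph 2",
    "Paragraph 3",
    "Paragraph 4",
]

# Assumed cookie name and value, chosen only for this illustration.
SUBSCRIBER_COOKIE = {"session": "subscriber-token"}


def serve_article(cookies: dict) -> list:
    """Return the full article for a subscriber session, else a two-paragraph teaser."""
    if cookies.get("session") == SUBSCRIBER_COOKIE["session"]:
        return FULL_ARTICLE
    return FULL_ARTICLE[:2]


def server_side_crawl() -> list:
    """A crawler running on the server has no access to the browser's cookies."""
    return serve_article(cookies={})


def browser_view() -> list:
    """The logged-in browser sends its session cookie with the request."""
    return serve_article(cookies=SUBSCRIBER_COOKIE)


if __name__ == "__main__":
    print("crawler sees:", len(server_side_crawl()), "paragraphs")
    print("browser sees:", len(browser_view()), "paragraphs")
```

In this sketch the crawler and the browser request the same article but receive different content, which matches the symptom in the report: the link opens fine in the logged-in browser while the saved copy stops after the teaser.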


@MohamedBassem commented on GitHub (Nov 17, 2024):

Sorry, I meant #172.


@drycounty commented on GitHub (Nov 17, 2024):

All good! Thank you for the rapid reply!

Reference
starred/karakeep#432