[GH-ISSUE #215] Allow downloading more content from a webpage and index it #155

Closed
opened 2026-03-02 11:47:10 +03:00 by kerem · 5 comments
Owner

Originally created by @kamtschatka on GitHub (Jun 9, 2024).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/215

I regularly bookmark YouTube videos, Instagram videos, and other videos.
There is no guarantee that those videos will stay online forever, so I prefer to download the important ones (yes, I am a real hoarder).

It would be great if you could enable downloading videos and serving them from Hoarder for later viewing (file size does not matter to me, but I guess it matters for some).

It would also be great if the subtitles were downloaded and indexed, so that the video content becomes searchable too.
In the long run, it would also be cool to transcribe the video contents and make them searchable that way.


@OliverLippertVw commented on GitHub (Aug 1, 2024):

> There is no guarantee that those videos will stay online forever

True. I spent over half a year planning a long (one-month) trip, and even in that "short" period of time we had one or two reels that no longer exist.

Not to mention my collection of receipts (10+ years) -.-

So linking plus downloading and indexing would be a huge benefit, and I would love to see it as well.


@huyz commented on GitHub (Aug 1, 2024):

For media downloading, I currently use ArchiveBox.


@khronimo commented on GitHub (Oct 17, 2024):

I would love to see it index the transcripts of YouTube links. I often bookmark interesting videos and interviews, and I am always frustrated when I try to find a particular resource again later.


@ItsNoted commented on GitHub (Mar 7, 2025):

Are we able to use Hoarder to archive YouTube transcripts? I wasn't sure whether this had been added or not.


@dimitrieh commented on GitHub (Sep 24, 2025):

> It would also be great if the subtitles were downloaded and indexed, so that the video content becomes searchable too.
> In the long run, it would also be cool to transcribe the video contents and make them searchable that way.

> Are we able to use Hoarder to archive YouTube transcripts? I wasn't sure whether this had been added or not.

When using yt-dlp args, files are downloaded but not indexed: Karakeep downloads the .srt and/or metadata .json files but doesn't parse them into the bookmark's searchable text.

Follow-up issues:

- https://github.com/karakeep-app/karakeep/issues/1629
- https://github.com/karakeep-app/karakeep/issues/1442
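
To make the gap described in the last comment concrete (subtitle files get downloaded but are never turned into searchable text), here is a minimal sketch of the missing parsing step. It is not Karakeep's code: the function name `srt_to_text` is hypothetical, it assumes a plain `.srt` file as typically produced by yt-dlp's subtitle options, and it collapses consecutive repeated lines on the assumption that auto-generated captions often repeat text across cues.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"^\d{1,2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{1,2}:\d{2}:\d{2}[,.]\d{3}"
)
# Matches simple inline markup such as <i>...</i> or {\an8}.
MARKUP = re.compile(r"<[^>]+>|\{[^}]*\}")


def srt_to_text(srt: str) -> str:
    """Return the spoken text of an SRT document as one plain string.

    Hypothetical helper for illustration only; not part of Karakeep.
    """
    lines = []
    # Cues are separated by blank lines; normalize Windows line endings first.
    for block in re.split(r"\n\s*\n", srt.replace("\r\n", "\n").strip()):
        rows = block.split("\n")
        # Drop the numeric cue index, if present.
        if rows and rows[0].strip().isdigit():
            rows = rows[1:]
        # Drop the timing line, if present.
        if rows and TIMING.match(rows[0].strip()):
            rows = rows[1:]
        for row in rows:
            text = MARKUP.sub("", row).strip()
            # Skip empty rows and consecutive repeats of the same line.
            if text and (not lines or lines[-1] != text):
                lines.append(text)
    return " ".join(lines)
```

The returned string could then be appended to whatever text the search index already stores for the bookmark. Which field that would be in Karakeep is not stated in this thread, so that part is left open.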