[GH-ISSUE #1066] Unable to Download mp4, gifs from x.com while using single file extension #699

Open
opened 2026-03-02 11:52:00 +03:00 by kerem · 0 comments
Owner

Originally created by @madratzz on GitHub (Feb 24, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/1066

Describe the Bug

I suspect the video downloader (yt-dlp) is not working properly.

Here are the logs from the Docker container.

2025-02-24T04:15:42.975Z info: [webhook][900] Starting a webhook job for bookmark with id "btoc0qviu01a4qe3bb732jzc"
2025-02-24T04:15:42.975Z info: [webhook][900] Completed successfully
2025-02-24T04:15:43.557Z info: [Crawler][898] Will crawl "https://x.com/IvanBoyko438891/status/1893370203825705111" for link with id "btoc0qviu01a4qe3bb732jzc"
2025-02-24T04:15:43.557Z info: [Crawler][898] Attempting to determine the content-type for the url https://x.com/IvanBoyko438891/status/1893370203825705111
2025-02-24T04:15:43.631Z info: [search][899] Attempting to index bookmark with id btoc0qviu01a4qe3bb732jzc ...
2025-02-24T04:15:43.770Z info: [search][899] Completed successfully
2025-02-24T04:15:44.830Z info: [Crawler][898] Content-type for the url https://x.com/IvanBoyko438891/status/1893370203825705111 is "null"
2025-02-24T04:15:44.830Z info: [Crawler][898] The page has been precrawled. Will use the precrawled archive instead.
2025-02-24T04:15:44.833Z info: [Crawler][898] Will attempt to extract metadata from page ...
2025-02-24T04:15:47.219Z info: [Crawler][898] Will attempt to extract readable content ...
2025-02-24T04:15:49.222Z info: [Crawler][898] Done extracting readable content.
2025-02-24T04:15:49.223Z info: [Crawler][898] Skipping storing the screenshot as it's empty.
2025-02-24T04:15:49.228Z info: [Crawler][898] Done extracting metadata from the page.
2025-02-24T04:15:49.228Z info: [Crawler][898] Downloading image from "https://abs.twimg.com/responsive-web/client-web/icon-ios.77d25eba.png"
2025-02-24T04:15:50.711Z info: [Crawler][898] Downloaded image as assetId: 8d9c010f-d2b9-41f4-92e0-3ba0fa53a162
2025-02-24T04:15:50.895Z info: [Crawler][898] Completed successfully
2025-02-24T04:15:51.259Z info: [webhook][904] Starting a webhook job for bookmark with id "btoc0qviu01a4qe3bb732jzc"
2025-02-24T04:15:51.259Z info: [webhook][904] Completed successfully
2025-02-24T04:15:51.313Z info: [inference][901] Starting an inference job for bookmark with id "btoc0qviu01a4qe3bb732jzc"
2025-02-24T04:15:51.366Z info: [search][902] Attempting to index bookmark with id btoc0qviu01a4qe3bb732jzc ...
2025-02-24T04:15:51.369Z info: [VideoCrawler][903] Attempting to download a file from "https://x.com/IvanBoyko438891/status/1893370203825705111" to "/tmp/video_downloads/318514d5-d64c-4141-a557-a832cff71b2f" using the following arguments: "https://x.com/IvanBoyko438891/status/1893370203825705111,-f,best[filesize<50M],-o,/tmp/video_downloads/318514d5-d64c-4141-a557-a832cff71b2f,--no-playlist"
2025-02-24T04:15:51.719Z info: [search][902] Completed successfully
2025-02-24T04:15:53.444Z info: [inference][901] Inferring tag for bookmark "btoc0qviu01a4qe3bb732jzc" used 488 tokens and inferred: animation,game development,indie games,visual effects
2025-02-24T04:15:53.507Z info: [inference][901] Completed successfully
2025-02-24T04:15:53.787Z info: [search][905] Attempting to index bookmark with id btoc0qviu01a4qe3bb732jzc ...
2025-02-24T04:15:54.140Z info: [search][905] Completed successfully
2025-02-24T04:15:55.033Z error: [VideoCrawler][903] Failed to download a file from "https://x.com/IvanBoyko438891/status/1893370203825705111" to "/tmp/video_downloads/318514d5-d64c-4141-a557-a832cff71b2f"
2025-02-24T04:15:55.033Z info: [VideoCrawler][903] Video Download Completed successfully
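The last two log lines are worth noting: job 903 reports an error and then, at the same timestamp, reports that it completed successfully. A minimal sketch of how such "masked" failures could be picked out of a log dump follows. The regular expression is an assumption based only on the line layout visible above, and `jobs_with_masked_errors` is a hypothetical helper, not part of the application.

```python
import re
from collections import defaultdict

# Assumed line layout, inferred from the log excerpt above:
# <timestamp> <level>: [<worker>][<job id>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) (?P<level>\w+): "
    r"\[(?P<worker>[^\]]+)\]\[(?P<job>\d+)\] (?P<msg>.*)$"
)


def jobs_with_masked_errors(lines):
    """Return (worker, job) pairs that logged an error and afterwards
    logged a message containing 'Completed successfully'."""
    saw_error = defaultdict(bool)
    masked = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m:
            continue  # skip anything that does not look like a log line
        key = (m["worker"], m["job"])
        if m["level"] == "error":
            saw_error[key] = True
        elif "Completed successfully" in m["msg"] and saw_error[key]:
            masked.append(key)
    return masked


# Three lines copied verbatim from the log above.
sample = [
    '2025-02-24T04:15:50.895Z info: [Crawler][898] Completed successfully',
    '2025-02-24T04:15:55.033Z error: [VideoCrawler][903] Failed to download a file from "https://x.com/IvanBoyko438891/status/1893370203825705111" to "/tmp/video_downloads/318514d5-d64c-4141-a557-a832cff71b2f"',
    '2025-02-24T04:15:55.033Z info: [VideoCrawler][903] Video Download Completed successfully',
]

masked_jobs = jobs_with_masked_errors(sample)
print(masked_jobs)
```

On the sample, only the VideoCrawler job is flagged, which matches the observed behaviour: the bookmark is saved, but the video is silently missing.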

Steps to Reproduce

  1. Set up SingleFile as described in the docs.
  2. Try to hoard a tweet that contains mp4 content.
  3. The hoard succeeds, but the video is not downloaded.
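The log does not show why the download failed, so one way to narrow it down might be to rerun the same download by hand with verbose output. The sketch below only rebuilds the command line from the comma-separated argument string in the VideoCrawler log line. It assumes the downloader is yt-dlp and that it is available as `yt-dlp` inside the container; `build_command` is a hypothetical helper, and the added `--verbose` flag is not something the application passes.

```python
import shlex

# Argument string copied from the VideoCrawler log line in this report.
LOGGED_ARGS = (
    "https://x.com/IvanBoyko438891/status/1893370203825705111,"
    "-f,best[filesize<50M],"
    "-o,/tmp/video_downloads/318514d5-d64c-4141-a557-a832cff71b2f,"
    "--no-playlist"
)


def build_command(logged_args, binary="yt-dlp", extra=("--verbose",)):
    """Turn the logged, comma-joined arguments back into an argv list.

    Naive split: this would break if any argument contained a comma,
    which is not the case for the arguments in this report.
    """
    return [binary, *extra, *logged_args.split(",")]


command = build_command(LOGGED_ARGS)

# shlex.join quotes the format selector so that '<' and '[' are not
# interpreted by the shell when the line is pasted into a terminal.
print(shlex.join(command))
```

Running the printed command in the container should surface the downloader's own error message, which is missing from the application log.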

Expected Behaviour

The mp4 content should be downloaded correctly.

Screenshots or Additional Context

No response

Device Details

Firefox 135 on Windows 10

Exact Hoarder Version

v0.22.0

Have you checked the troubleshooting guide?

  • I have checked the troubleshooting guide and I haven't found a solution to my problem