[GH-ISSUE #1732] Error when downloading the full page archive #1080

Closed
opened 2026-03-02 11:54:50 +03:00 by kerem · 2 comments
Owner

Originally created by @ruzyo on GitHub (Jul 13, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/1732

Describe the Bug

2025-07-13T06:52:30.624Z info: Workers version: 0.25.0
2025-07-13T06:52:30.696Z info: [crawler] Loading adblocker ...
2025-07-13T06:52:48.026Z info: [Crawler] Browser connect on demand is enabled, won't proactively start the browser instance
2025-07-13T06:52:48.027Z info: Starting crawler worker ...
2025-07-13T06:52:48.027Z info: Starting inference worker ...
2025-07-13T06:52:48.028Z info: Starting search indexing worker ...
2025-07-13T06:52:48.028Z info: Starting tidy assets worker ...
2025-07-13T06:52:48.029Z info: Starting video worker ...
2025-07-13T06:52:48.029Z info: Starting feed worker ...
2025-07-13T06:52:48.030Z info: Starting asset preprocessing worker ...
2025-07-13T06:52:48.030Z info: Starting webhook worker ...
2025-07-13T06:52:48.031Z info: Starting rule engine worker ...
2025-07-13T06:55:45.527Z info: [Crawler][561] Will crawl "https://blog.5678989.xyz/learning/rclone-webdav" for link with id "uaw8rzn2mogmtz5t9w7a8181"
2025-07-13T06:55:45.528Z info: [Crawler][561] Attempting to determine the content-type for the url https://blog.5678989.xyz/learning/rclone-webdav
2025-07-13T06:55:47.769Z info: [Crawler][561] Content-type for the url https://blog.5678989.xyz/learning/rclone-webdav is "text/html; charset=utf-8"
2025-07-13T06:55:47.770Z info: [Crawler] Connecting to existing browser websocket address: ws://localhost:3600?token=
2025-07-13T06:55:48.002Z error: [Crawler][561] Crawling job failed: [object Object]
undefined
2025-07-13T06:55:48.290Z info: [Crawler][561] Will crawl "https://blog.5678989.xyz/learning/rclone-webdav" for link with id "uaw8rzn2mogmtz5t9w7a8181"
2025-07-13T06:55:48.291Z info: [Crawler][561] Attempting to determine the content-type for the url https://blog.5678989.xyz/learning/rclone-webdav
2025-07-13T06:55:48.731Z info: [Crawler][561] Content-type for the url https://blog.5678989.xyz/learning/rclone-webdav is "text/html; charset=utf-8"
2025-07-13T06:55:48.731Z info: [Crawler] Connecting to existing browser websocket address: ws://localhost:3600?token=
2025-07-13T06:55:48.735Z error: [Crawler][561] Crawling job failed: [object Object]
undefined
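
Note on the `[object Object]` / `undefined` pair in the log above: this is what JavaScript prints when a thrown value that is not an `Error` instance is interpolated into a string and its `stack` property is read. The sketch below reproduces that output and shows one way to log such values with more detail. It is only an illustration under assumptions: `describeFailure` and the sample payload are hypothetical, and this is not Karakeep's actual logging code.

```typescript
// Hypothetical helper (not part of Karakeep): turn any thrown value into readable text.
function describeFailure(e: unknown): string {
  if (e instanceof Error) {
    // Real Error objects carry a name, a message and usually a stack trace.
    return `${e.name}: ${e.message}\n${e.stack ?? ""}`;
  }
  try {
    // Plain objects are serialized so their fields stay visible.
    return JSON.stringify(e);
  } catch {
    // Fallback for values JSON cannot serialize (e.g. circular structures).
    return String(e);
  }
}

// Made-up example of a non-Error value being thrown or rejected.
const thrown: unknown = { code: "EXAMPLE", reason: "made-up payload" };

// Naive interpolation: the object becomes "[object Object]" and the
// missing stack property becomes "undefined".
const naive = `Crawling job failed: ${thrown}\n${(thrown as { stack?: string }).stack}`;

// Same value, logged through the helper.
const detailed = `Crawling job failed: ${describeFailure(thrown)}`;

console.log(naive);
console.log(detailed);
```

If the worker logged the failure along the lines of `detailed`, the underlying reason for the failed browser connection would likely be visible in the log.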

Steps to Reproduce

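I selected a link on the main page and clicked "Download Full Page Archive", after which the errors above appeared. What should I do? Any guidance would be appreciated.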

Expected Behaviour

How can this be resolved?

Screenshots or Additional Context

No response

Device Details

No response

Exact Karakeep Version

V0.25.0

Have you checked the troubleshooting guide?

  • I have checked the troubleshooting guide and I haven't found a solution to my problem
kerem 2026-03-02 11:54:50 +03:00
Author
Owner

@Eragos commented on GitHub (Jul 13, 2025):

Title

Error when downloading the full page archive

Steps to Reproduce

I selected a link on the main page and clicked "Download Full Page Archive", after which the errors above appeared. What should I do? Any guidance would be appreciated.

Expected Behaviour

How can this be resolved?

@ruzyo
It may be easier for everyone if we use English as the common language.

Author
Owner

@MohamedBassem commented on GitHub (Sep 7, 2025):

This seems related to https://github.com/karakeep-app/karakeep/issues/1758

Reference
starred/karakeep#1080