[GH-ISSUE #186] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Unable to capture screenshot #140

Closed
opened 2026-03-02 11:47:02 +03:00 by kerem · 3 comments

Originally created by @scubanarc on GitHub (May 28, 2024).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/186

I followed the upgrade instructions to get version 0.14 (release) and edited my .env file to include:

CRAWLER_FULL_PAGE_SCREENSHOT=true

Crawls now fail. Here are the Docker logs from hoarder-workers-1 during a single crawl event:

2024-05-28T17:22:27.107Z info: [Crawler][47] Will crawl "https://www.simplyrecipes.com/recipes/carrot_top_pesto/" for link with id "v5h6o0jb40ba6z1st7kldvzx"
2024-05-28T17:22:27.643Z info: [Crawler][47] Successfully navigated to "https://www.simplyrecipes.com/recipes/carrot_top_pesto/". Waiting for the page to load ...
2024-05-28T17:22:29.000Z info: [Crawler][47] Finished waiting for the page to load.
2024-05-28T17:22:29.604Z error: [Crawler][47] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Unable to capture screenshot
2024-05-28T17:22:30.656Z info: [Crawler][47] Will crawl "https://www.simplyrecipes.com/recipes/carrot_top_pesto/" for link with id "v5h6o0jb40ba6z1st7kldvzx"
2024-05-28T17:22:31.171Z info: [Crawler][47] Successfully navigated to "https://www.simplyrecipes.com/recipes/carrot_top_pesto/". Waiting for the page to load ...
2024-05-28T17:22:32.519Z info: [Crawler][47] Finished waiting for the page to load.
2024-05-28T17:22:33.116Z error: [Crawler][47] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Unable to capture screenshot
2024-05-28T17:22:35.166Z info: [Crawler][47] Will crawl "https://www.simplyrecipes.com/recipes/carrot_top_pesto/" for link with id "v5h6o0jb40ba6z1st7kldvzx"
2024-05-28T17:22:35.673Z info: [Crawler][47] Successfully navigated to "https://www.simplyrecipes.com/recipes/carrot_top_pesto/". Waiting for the page to load ...
2024-05-28T17:22:37.067Z info: [Crawler][47] Finished waiting for the page to load.
2024-05-28T17:22:37.615Z info: [Crawler] The puppeteer browser got disconnected. Will attempt to launch it again.
2024-05-28T17:22:37.615Z info: [Crawler] Connecting to existing browser instance: http://chrome:9222
2024-05-28T17:22:37.616Z info: [Crawler] Successfully resolved IP address, new address: http://172.26.0.5:9222/
2024-05-28T17:22:37.618Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
2024-05-28T17:22:42.617Z info: [Crawler] Connecting to existing browser instance: http://chrome:9222
2024-05-28T17:22:42.618Z info: [Crawler] Successfully resolved IP address, new address: http://172.26.0.5:9222/
2024-05-28T17:23:35.170Z error: [Crawler][47] Crawling job failed: Error: Timed-out after 60 secs
2024-05-28T17:23:39.199Z info: [Crawler][47] Will crawl "https://www.simplyrecipes.com/recipes/carrot_top_pesto/" for link with id "v5h6o0jb40ba6z1st7kldvzx"
2024-05-28T17:23:39.737Z info: [Crawler][47] Successfully navigated to "https://www.simplyrecipes.com/recipes/carrot_top_pesto/". Waiting for the page to load ...
2024-05-28T17:23:41.030Z info: [Crawler][47] Finished waiting for the page to load.
2024-05-28T17:23:41.600Z error: [Crawler][47] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Unable to capture screenshot
2024-05-28T17:23:49.621Z info: [Crawler][47] Will crawl "https://www.simplyrecipes.com/recipes/carrot_top_pesto/" for link with id "v5h6o0jb40ba6z1st7kldvzx"
2024-05-28T17:23:50.122Z info: [Crawler][47] Successfully navigated to "https://www.simplyrecipes.com/recipes/carrot_top_pesto/". Waiting for the page to load ...
2024-05-28T17:23:51.426Z info: [Crawler][47] Finished waiting for the page to load.
2024-05-28T17:23:51.959Z error: [Crawler][47] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Unable to capture screenshot
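
For anyone triaging a similar log, here is a minimal sketch that tallies the failure reasons per run. It assumes every line follows the shape seen above (timestamp, level, `[Crawler]`, optional job id, message); the names `LINE_RE`, `summarize_failures`, and `sample` are made up for this illustration and are not part of the project.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log excerpt above:
# "<ISO timestamp> <level>: [Crawler][<job id>] <message>" (job id is optional).
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>\w+):\s+\[Crawler\](?:\[(?P<job>\d+)\])?\s+(?P<msg>.*)$"
)

FAILURE_PREFIX = "Crawling job failed: "


def summarize_failures(lines):
    """Count the reasons given on 'Crawling job failed' error lines."""
    reasons = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group("level") != "error":
            continue  # skip info lines and anything that does not parse
        msg = match.group("msg")
        if msg.startswith(FAILURE_PREFIX):
            reasons[msg[len(FAILURE_PREFIX):]] += 1
    return reasons


# Three lines copied from the log above, used as sample input.
sample = [
    "2024-05-28T17:22:29.000Z info: [Crawler][47] Finished waiting for the page to load.",
    "2024-05-28T17:22:29.604Z error: [Crawler][47] Crawling job failed: "
    "ProtocolError: Protocol error (Page.captureScreenshot): Unable to capture screenshot",
    "2024-05-28T17:23:35.170Z error: [Crawler][47] Crawling job failed: "
    "Error: Timed-out after 60 secs",
]

for reason, count in summarize_failures(sample).items():
    print(f"{count}x {reason}")
```

Piping the full `docker logs` output through `summarize_failures` would show whether the screenshot error dominates or whether timeouts and browser disconnects are also in play.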

kerem closed this issue 2026-03-02 11:47:02 +03:00

@MohamedBassem commented on GitHub (May 28, 2024):

@scubanarc did you also add the Chrome flag that's mentioned in the release notes? In my experiments, adding it fixed this problem.


@scubanarc commented on GitHub (May 28, 2024):

Ah, I missed that one. I added it and my problems went away.

Thanks for the fast reply!


@MohamedBassem commented on GitHub (May 28, 2024):

perfect!
