[GH-ISSUE #2251] Unable to add itsfoss.com pages to self-hosted Karakeep #1373

Closed
opened 2026-03-02 11:56:51 +03:00 by kerem · 3 comments

Originally created by @cb3inco on GitHub (Dec 13, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/2251

Describe the Bug

Unable to crawl https://itsfoss.com to add pages to Karakeep. My server is running on a Hetzner VM.

I adjusted my crawl timeouts as suggested in https://github.com/karakeep-app/karakeep/issues/1904, but that didn't help.

Sorry if this should have been a discussion.

Steps to Reproduce

  1. Go to https://itsfoss.com/open-source-nas-os/
  2. Use the browser extension to add the page to Karakeep.
  3. The link is added, but the page never renders.

The same thing happens when adding the link directly in the Karakeep app.

Expected Behaviour

The site should be added and render properly.

Screenshots or Additional Context

Log of the failure:

2025-12-13T10:29:57.643Z error: [Crawler][3955] Crawling job failed: TimeoutError: page.goto: Timeout 100000ms exceeded.
Call log:

page.goto: Timeout 100000ms exceeded.
Call log:

  • navigating to "https://itsfoss.com/open-source-nas-os/", waiting until "domcontentloaded"

    at crawlPage (/app/apps/workers/dist/index.js:115689:45)
    at async crawlAndParseUrl (/app/apps/workers/dist/index.js:116019:18)
    at async Object.runCrawler (/app/apps/workers/dist/index.js:116171:25)
    at async Runner.runOnce (/app/apps/workers/node_modules/.pnpm/liteque@0.7.0_@opentelemetry+api@1.9.0_@types+better-sqlite3@7.6.13_@types+react@19.2.5_bette_j25tbpstiiqwo32nscmvntyxcu/node_modules/liteque/dist/index.js:261:19)
2025-12-13T10:29:57.662Z info: [Crawler][3955:1] Will crawl "https://itsfoss.com/open-source-nas-os/" for link with id "d0259nm2lxefu82nsglc06ss"
2025-12-13T10:29:57.662Z info: [Crawler][3955:1] Attempting to determine the content-type for the url https://itsfoss.com/open-source-nas-os/
2025-12-13T10:29:57.741Z info: [Crawler][3955:1] Content-type for the url https://itsfoss.com/open-source-nas-os/ is "text/html"
2025-12-13T10:29:57.988Z info: [Crawler][3955:1] Navigating to "https://itsfoss.com/open-source-nas-os/"

Image

Device Details

Ubuntu Server 24.04, Docker & Compose, Using zenika-hub/alpine-chrome:123 and getmeili/meilisearch:v1.13.3

Exact Karakeep Version

0.29.3

Have you checked the troubleshooting guide?

  • I have checked the troubleshooting guide and I haven't found a solution to my problem

@cb3inco commented on GitHub (Dec 13, 2025):

Update: I saw that the compose file in this repo has been updated to use the Chrome image 124. I have upgraded to it, but there is no change.


@myselfprincee commented on GitHub (Dec 22, 2025):

Working totally fine for me.

Image
Image

@cb3inco commented on GitHub (Dec 23, 2025):

Yeah, now it's working for me. Bizarre. I guess this can be closed.
