mirror of
https://github.com/karakeep-app/karakeep.git
synced 2026-04-25 07:56:05 +03:00
Open · opened 2026-03-02 11:50:38 +03:00 by kerem · 13 comments
Originally created by @GreenMonito on GitHub (Jan 3, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/815
Describe the Bug
Neither the tags nor the image are generated for any URL of https://www.elcorteingles.es/.
In the logs I see the following error:
2025-01-03T14:17:38.157Z error: [Crawler] [564] Failed to determine the content-type for the URL https://www.elcorteingles.es/: AbortError: The operation was aborted.
Steps to Reproduce
Add these URLs to Hoarder.
Expected Behaviour
The tags and the image are generated.
Screenshots or Additional Context
Device Details
Docker, Chrome
Exact Hoarder Version
v0.20.0
Have you checked the troubleshooting guide?
@DerekParks commented on GitHub (Jan 10, 2025):
I'm running into a very similar issue. Looking at the logs and matching them up with the code paths, I think my HTML crawl is succeeding but my banner image crawl is not (maybe because of doing 2 fetches too quickly?).
Edit: Setting CRAWLER_DOWNLOAD_BANNER_IMAGE="true" allowed the crawl to succeed in my case.
@MohamedBassem commented on GitHub (Jan 11, 2025):
@DerekParks CRAWLER_DOWNLOAD_BANNER_IMAGE is the default, did you explicitly disable it before?
@MohamedBassem commented on GitHub (Jan 11, 2025):
@GreenMonito Can you share the full log when this link is added?
@reinhardt-bit commented on GitHub (Jan 11, 2025):
I actually looked at the Docker logs to try to figure out what the problem could be. The main issue I found is this:
'''
TypeError: fetch failed
at node:internal/deps/undici/undici:12345:11
'''
This is a problem in the container itself, and I found a fix here.
It is an old Node.js dependency problem. Could you please look into this? Hope this helps.
I am running Hoarder in Docker and I have Ollama installed locally on my machine.
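Node's built-in fetch (undici) reports most network-level failures as a generic "TypeError: fetch failed" and attaches the underlying error as its cause, so the trace above does not say on its own what went wrong. A minimal sketch of a helper that walks that cause chain follows; describeErrorChain is a hypothetical name for illustration and is not part of Hoarder.

```typescript
// Hypothetical helper, not Hoarder code: flattens an error and its nested
// `cause` values into readable lines so the real failure becomes visible.
function describeErrorChain(err: unknown, maxDepth = 5): string[] {
  const lines: string[] = [];
  let current: unknown = err;
  for (let depth = 0; depth < maxDepth && current instanceof Error; depth++) {
    // Node system errors usually carry a string `code` such as ECONNREFUSED.
    const code = (current as { code?: unknown }).code;
    const suffix = typeof code === "string" ? ` [${code}]` : "";
    lines.push(`${current.name}: ${current.message}${suffix}`);
    // Follow the chain; stops when there is no further Error attached.
    current = (current as { cause?: unknown }).cause;
  }
  return lines;
}
```

Logging the result of describeErrorChain(err) in a catch block around the failing request would show whether the cause is, for example, a refused connection or a DNS failure, which helps decide whether the problem is in the container's networking rather than in Node.js itself.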
@GreenMonito commented on GitHub (Jan 12, 2025):
Sorry for the delay, these are the logs when adding the domain
2025-01-12T19:01:17.706Z info: [Crawler][685] Will crawl "https://www.elcorteingles.es/" for link with id "l02d6h4erym2632v78s44f4n"
2025-01-12T19:01:17.706Z info: [Crawler][685] Attempting to determine the content-type for the url https://www.elcorteingles.es/
2025-01-12T19:01:17.929Z info: [search][686] Attempting to index bookmark with id l02d6h4erym2632v78s44f4n ...
2025-01-12T19:01:18.088Z info: [search][686] Completed successfully
2025-01-12T19:01:22.708Z error: [Crawler][685] Failed to determine the content-type for the url https://www.elcorteingles.es/: AbortError: The operation was aborted.
2025-01-12T19:01:25.844Z info: [Crawler][685] Successfully navigated to "https://www.elcorteingles.es/". Waiting for the page to load ...
2025-01-12T19:01:26.844Z info: [Crawler][685] Finished waiting for the page to load.
2025-01-12T19:01:26.875Z info: [Crawler][685] Successfully fetched the page content.
2025-01-12T19:01:27.021Z info: [Crawler][685] Finished capturing page content and a screenshot. FullPageScreenshot: false
2025-01-12T19:01:27.032Z info: [Crawler][685] Will attempt to extract metadata from page ...
2025-01-12T19:01:28.481Z info: [Crawler][685] Will attempt to extract readable content ...
2025-01-12T19:01:30.049Z info: [Crawler][685] Done extracting readable content.
2025-01-12T19:01:30.102Z info: [Crawler][685] Stored the screenshot as assetId: 335395ff-5658-466d-a9e1-bf81ffecf658
2025-01-12T19:02:17.702Z error: [Crawler][685] Crawling job failed: Error: Timed-out after 60 secs
Error: Timed-out after 60 secs
    at Timeout._onTimeout (/app/apps/workers/utils.ts:2:1025)
    at listOnTimeout (node:internal/timers:594:17)
    at process.processTimers (node:internal/timers:529:7)
@hussion commented on GitHub (Jul 16, 2025):
same problem
@klausmcm commented on GitHub (Oct 1, 2025):
I also get the Failed to determine the content-type error when trying to add this link: https://www.cbc.ca/news/canada/british-columbia/strong-early-response-oxford-astrazeneca-rollout-metro-van-55-65s-1.5971514
I hope the logs below are useful.
@stelle007 commented on GitHub (Oct 17, 2025):
Same issue for me.
@brodieferguson commented on GitHub (Dec 1, 2025):
I also get the Failed to determine the content-type error when trying to add washingtonpost links:
2025-12-01T19:52:47.697Z info: [Crawler][12704:2] The page has been precrawled. Will use the precrawled archive instead.
2025-12-01T19:52:47.717Z info: [Crawler][12704:2] Will attempt to extract metadata from page ...
2025-12-01T19:52:59.751Z info: <-- GET /api/health
2025-12-01T19:52:59.752Z info: --> GET /api/health 200 1ms
2025-12-01T19:53:29.814Z info: <-- GET /api/health
2025-12-01T19:53:29.815Z info: --> GET /api/health 200 0ms
2025-12-01T19:53:42.694Z error: [Crawler][12704] Crawling job failed: Error: Timeout
Error: Timeout
    at Timeout._onTimeout (file:///app/apps/workers/node_modules/.pnpm/liteque@0.7.0_@opentelemetry+api@1.9.0_@types+better-sqlite3@7.6.13_@types+react@19.2.5_bette_j25tbpstiiqwo32nscmvntyxcu/node_modules/liteque/dist/index.js:263:28)
    at listOnTimeout (node:internal/timers:588:17)
    at process.processTimers (node:internal/timers:523:7)
2025-12-01T19:53:43.628Z info: [Crawler][12704:3] Will crawl "https://www.washingtonpost.com/national-security/2025/12/01/trump-habba-us-attorney-ruling/" for link with id "l6x93k0nrseoqc246vn75bba"
2025-12-01T19:53:43.628Z info: [Crawler][12704:3] Attempting to determine the content-type for the url https://www.washingtonpost.com/national-security/2025/12/01/trump-habba-us-attorney-ruling/
2025-12-01T19:53:48.628Z error: [Crawler][12704:3] Failed to determine the content-type for the url https://www.washingtonpost.com/national-security/2025/12/01/trump-habba-us-attorney-ruling/: AbortError: The operation was aborted.
@anuram2k commented on GitHub (Dec 18, 2025):
Same issue for me too.
@anuram2k commented on GitHub (Dec 22, 2025):
Here is the log I see:
[Crawler][55:0] Failed to determine the content-type for the url https://developer.mozilla.org/en-US/docs/Web/API/WebOTP_API: AbortError: The operation was aborted.
@anuram2k commented on GitHub (Jan 14, 2026):
Is there any plan to fix this issue?
@anuram2k commented on GitHub (Jan 20, 2026):
Here is the error for some other site:
[Crawler][168:2] Failed to determine the content-type for the url https://ntfy.sh/: AbortError: The operation was aborted.
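In the timestamped logs earlier in this thread, the "Failed to determine the content-type" error arrives 5 seconds after the matching "Attempting to determine the content-type" line (19:01:17.706 to 19:01:22.708, and 19:53:43.628 to 19:53:48.628). That pattern is consistent with a client-side timeout aborting a slow request, and the first log also shows the crawl continuing after the error. The sketch below illustrates that mechanism only. It is not Karakeep's crawler code: probeContentType, FetchLike, the HEAD method, and the timeout parameter are assumptions made for illustration, and the 5-second figure is inferred from the timestamps rather than read from the source.

```typescript
// Illustrative sketch, not Karakeep's implementation. All names here are
// hypothetical. The fetch function is injected so the behaviour can be
// exercised without real network access.
type FetchLike = (
  url: string,
  init: { method: string; signal: AbortSignal },
) => Promise<{ headers: { get(name: string): string | null } }>;

async function probeContentType(
  url: string,
  timeoutMs: number,
  fetchImpl: FetchLike,
): Promise<string | null> {
  const controller = new AbortController();
  // Abort the request if the server has not answered within timeoutMs.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, {
      method: "HEAD",
      signal: controller.signal,
    });
    return response.headers.get("content-type");
  } catch (err) {
    // A server that stalls or ignores the probe lands here with an abort
    // error. Returning null lets the caller carry on with the crawl.
    console.warn(`Failed to determine the content-type for ${url}: ${String(err)}`);
    return null;
  } finally {
    // Always clear the timer so a fast response does not leave it pending.
    clearTimeout(timer);
  }
}
```

Passing the global fetch as fetchImpl with a timeout of 5000 should approximate the timing seen in the logs. Sites that block or delay such probes would then produce the abort error even though a full browser navigation to the same URL succeeds.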