starred/karakeep

mirror of https://github.com/karakeep-app/karakeep.git synced 2026-04-25 07:56:05 +03:00

[GH-ISSUE #339] [Feature Request] Allow 3rd party crawling #218

Closed

opened 2026-03-02 11:47:43 +03:00 by kerem · 3 comments

kerem commented

2026-03-02 11:47:43 +03:00

Owner

Originally created by @aaroneden on GitHub (Jul 31, 2024).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/339

Allow connections to Zapier or systems like FireCrawl for more robust crawling

kerem closed this issue and added the feature request label (2026-03-02 11:47:43 +03:00)

kerem commented

2026-03-02 11:47:44 +03:00

Author

Owner

@kamtschatka commented on GitHub (Aug 3, 2024):

From what I can see, they only provide Markdown, whereas the link scraping in Hoarder uses HTML, so the result would be more like a text bookmark than a link bookmark.

From previous responses to issues like this, the intention is to keep Hoarder clean rather than add integrations for all kinds of (paid) services.
Couldn't you use the CLI to push the Markdown you scraped with those services to Hoarder?
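
The "scrape elsewhere, then push the Markdown in as a text bookmark" idea can be sketched roughly as follows. This is a minimal illustration, not the project's documented method: it assumes the server exposes a REST endpoint at `/api/v1/bookmarks` that accepts a JSON body of the form `{"type": "text", "text": ...}` with a bearer API key, and the helper name `build_text_bookmark_request`, the server address, and the key are all hypothetical. Check the API or CLI documentation of your installed version before relying on it.

```python
import json
import urllib.request


def build_text_bookmark_request(server_addr, api_key, markdown):
    """Build (but do not send) a request that stores scraped Markdown as a text bookmark.

    Assumption: the endpoint path and payload shape below match the server's API.
    """
    payload = {"type": "text", "text": markdown}
    return urllib.request.Request(
        url=server_addr.rstrip("/") + "/api/v1/bookmarks",  # assumed endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical values; the Markdown would come from the external scraping service.
    req = build_text_bookmark_request(
        "http://localhost:3000", "YOUR_API_KEY", "# Scraped page\n\nBody text."
    )
    print(req.get_method(), req.full_url)
    # To actually send it, uncomment the next two lines:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status)
```

Building the request separately from sending it keeps the payload easy to inspect before anything is written to the server.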

kerem commented

2026-03-02 11:47:44 +03:00

Author

Owner

@MohamedBassem commented on GitHub (Aug 26, 2024):

Hoarder currently supports browserless (via `BROWSER_WEBSOCKET_URL`), since that is the container used for Chrome on Unraid, and supporting it still keeps Hoarder agnostic to third-party providers.

We don't currently plan to support more 3rd party crawling unless there's strong demand from the community.
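
Since `BROWSER_WEBSOCKET_URL` is expected to point at a browser reachable over a WebSocket, a quick sanity check of the value before starting the container might look like the sketch below. The helper name `is_valid_browser_ws_url` and the example host `browserless:3000` are hypothetical, and the check only inspects the URL's shape; it does not connect to anything or reflect how Hoarder itself validates the setting.

```python
import os
from urllib.parse import urlparse


def is_valid_browser_ws_url(value):
    """Return True if value looks like a ws:// or wss:// URL with a host."""
    parsed = urlparse(value)
    return parsed.scheme in ("ws", "wss") and bool(parsed.hostname)


if __name__ == "__main__":
    # Example of a plausible value (hypothetical host): ws://browserless:3000
    url = os.environ.get("BROWSER_WEBSOCKET_URL", "")
    verdict = "ok" if is_valid_browser_ws_url(url) else "not a ws:// or wss:// URL"
    print(url or "<unset>", "->", verdict)
```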

kerem commented

2026-03-02 11:47:44 +03:00

Author

Owner

@MohamedBassem commented on GitHub (Sep 15, 2024):

Closing this as it's unlikely it'll get implemented.
