starred/karakeep


mirror of https://github.com/karakeep-app/karakeep.git synced 2026-04-25 16:06:04 +03:00

[GH-ISSUE #1306] Plugin to use archive.is automatically #834


Open

opened 2026-03-02 11:53:07 +03:00 by kerem · 5 comments

kerem commented

2026-03-02 11:53:07 +03:00

Owner

Originally created by @maelp on GitHub (Apr 24, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/1306

Describe the feature you'd like

Multiple websites have paywalls that can be circumvented by using "https://archive.is/<paste original URL here>".

This brings you to a page that either shows the last scrape or lets you trigger a new one.

This could probably be turned into an "add archive.is archive" option in Karakeep: when retrieving a URL, Karakeep would also launch a background job that asks archive.is to make a copy and then retrieves that copy from there.

For long background jobs like that, it could be useful to use Inngest / DBOS / Temporal.io.
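
A minimal sketch of how such a background job could be modelled, under stated assumptions. Only the `https://archive.is/<original URL>` address shape comes from the request above; the state names, `MAX_ATTEMPTS`, and the `advance` function are hypothetical and are not Karakeep code. How a lookup is actually performed against archive.is is deliberately left out.

```typescript
// Hypothetical states for an "add archive.is archive" background job.
type ArchiveJobState =
  | { kind: "pending"; url: string; attempts: number }
  | { kind: "waiting"; url: string; attempts: number }
  | { kind: "done"; url: string; snapshotUrl: string }
  | { kind: "failed"; url: string; reason: string };

// Outcome of one check against the archive (obtaining it is out of scope here).
type LookupResult =
  | { kind: "snapshot"; snapshotUrl: string }
  | { kind: "none" };

// Assumed retry budget; a real job runner would make this configurable.
const MAX_ATTEMPTS = 5;

// Builds the lookup address described in the issue: archive.is/<original URL>.
function buildArchiveLookupUrl(originalUrl: string): string {
  return `https://archive.is/${originalUrl}`;
}

// Pure transition: given the current state and the latest lookup result,
// decide whether the job is done, should keep waiting, or has failed.
function advance(state: ArchiveJobState, result: LookupResult): ArchiveJobState {
  if (state.kind === "done" || state.kind === "failed") {
    return state; // terminal states never change
  }
  if (result.kind === "snapshot") {
    return { kind: "done", url: state.url, snapshotUrl: result.snapshotUrl };
  }
  const attempts = state.attempts + 1;
  if (attempts >= MAX_ATTEMPTS) {
    return { kind: "failed", url: state.url, reason: "no snapshot after retries" };
  }
  return { kind: "waiting", url: state.url, attempts };
}
```

Keeping the transition pure means the same logic could sit behind whichever job runner is chosen (Inngest, DBOS, Temporal.io, or an existing queue), with the runner only responsible for scheduling the next lookup.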

Describe the benefits this would bring to existing Karakeep users

Add a full archive of the page, for websites where archive.is allows circumventing paywalls

Can the goal of this request already be achieved via other means?

Doing it manually

Have you searched for an existing open/closed issue?

I have searched for existing issues and none cover my fundamental request

Additional context

No response


kerem added the feature request and status/icebox labels 2026-03-02 11:53:07 +03:00

kerem commented

2026-03-02 11:53:08 +03:00

Author

Owner

@maelp commented on GitHub (Apr 24, 2025):

Check an outline made by ChatGPT here https://chatgpt.com/share/6809d209-be58-800b-a764-7885ed79ba2d


kerem commented

2026-03-02 11:53:08 +03:00

Author

Owner

@huchene commented on GitHub (Apr 25, 2025):

+1

kerem commented

2026-03-02 11:53:08 +03:00

Author

Owner

@Byrnesdigital commented on GitHub (May 13, 2025):

This would probably be best achieved by integrating with something like [Ladder](https://github.com/everywall/ladder) or [Marreta](https://github.com/manualdousuario/marreta), which are essentially self-hosted open source versions of archive.is. I haven't spent much time with either, but I'm wondering if some sort of automation could be set up to pass URLs/data back and forth? Something like: a URL sent to Karakeep gets automatically sent to Marreta, then the rendered page gets sent back to Karakeep and the bookmark is updated with the full text version.

kerem commented

2026-03-02 11:53:08 +03:00

Author

Owner

@maelp commented on GitHub (May 14, 2025):

I guess ideally we would have a common API for all of those, and the user
would choose his backend (archive.is or something self-hosted)
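
One way such a common API could look, as a rough sketch only. The `ArchiveBackend` interface, `selfHosted`, and `pickBackend` are hypothetical names, and the `<base>/<original URL>` path layout for self-hosted backends is an assumption that would need checking against each project's documentation.

```typescript
// Hypothetical common interface that every archive backend would implement.
interface ArchiveBackend {
  readonly name: string;
  // Returns the address at which this backend serves a copy of `originalUrl`.
  snapshotUrlFor(originalUrl: string): string;
}

// archive.is, using the address shape quoted in the original request.
const archiveIs: ArchiveBackend = {
  name: "archive.is",
  snapshotUrlFor: (originalUrl) => `https://archive.is/${originalUrl}`,
};

// A self-hosted backend. The "<base>/<original URL>" layout is an assumption.
function selfHosted(name: string, baseUrl: string): ArchiveBackend {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    name,
    snapshotUrlFor: (originalUrl) => `${base}/${originalUrl}`,
  };
}

// Picks the backend the user configured, failing loudly on unknown names.
function pickBackend(backends: ArchiveBackend[], configured: string): ArchiveBackend {
  const found = backends.find((b) => b.name === configured);
  if (!found) {
    throw new Error(`Unknown archive backend: ${configured}`);
  }
  return found;
}
```

With this shape, the user's choice reduces to a single configuration string, and adding another backend means adding one more object to the list.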


kerem commented

2026-03-02 11:53:08 +03:00

Author

Owner

@Byrnesdigital commented on GitHub (May 14, 2025):

> I guess ideally we would have a common API for all of those, and the user
> would choose his backend (archive.is or something self-hosted)

Upon further reading, it looks like archive.is supports [Memento](http://mementoweb.org/depot/native/archiveis/) for an API. The docs look pretty outdated, but it may be worth checking out rather than starting from scratch.
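
For context, Memento (RFC 7089) describes archived copies through HTTP `Link` headers whose entries carry `rel` values such as `original`, `timemap`, and `memento`, plus a `datetime` attribute. The sketch below parses such a header and picks the most recent memento. It assumes the header follows the RFC's format; whether archive.is currently returns exactly this shape is not confirmed here, and the function and type names are hypothetical.

```typescript
// One parsed entry of a Memento-style Link header.
interface MementoLink {
  url: string;
  rel: string[];
  datetime?: Date;
}

// Parses entries of the form: <url>; key="value"; key="value", <url>; ...
// A regex is used instead of splitting on commas because datetime values
// (e.g. "Tue, 20 Jun 2000 18:02:59 GMT") contain commas themselves.
function parseMementoLinks(header: string): MementoLink[] {
  const links: MementoLink[] = [];
  const entry = /<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)/g;
  let m: RegExpExecArray | null;
  while ((m = entry.exec(header)) !== null) {
    const params: Record<string, string> = {};
    const param = /([\w-]+)="([^"]*)"/g;
    let p: RegExpExecArray | null;
    while ((p = param.exec(m[2])) !== null) {
      params[p[1].toLowerCase()] = p[2];
    }
    // rel may hold several space-separated values, e.g. "first memento".
    const rel = (params["rel"] ?? "").split(/\s+/).filter((r) => r.length > 0);
    const link: MementoLink = { url: m[1], rel };
    if (params["datetime"] !== undefined) {
      link.datetime = new Date(params["datetime"]);
    }
    links.push(link);
  }
  return links;
}

// Returns the memento entry with the latest datetime, if any.
function latestMemento(links: MementoLink[]): MementoLink | undefined {
  let best: MementoLink | undefined;
  for (const link of links) {
    if (link.rel.indexOf("memento") === -1 || link.datetime === undefined) {
      continue;
    }
    if (
      best === undefined ||
      best.datetime === undefined ||
      link.datetime.getTime() > best.datetime.getTime()
    ) {
      best = link;
    }
  }
  return best;
}
```

A Karakeep job could feed the `Link` header from a TimeGate response into `parseMementoLinks` and store the URL returned by `latestMemento` alongside the bookmark.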

kerem referenced this issue

2026-03-02 11:53:47 +03:00

[GH-ISSUE #1460] FR: "see similar bookmarks" / "reading suggestion" using embeddings #925

kerem referenced this issue

2026-03-02 11:58:48 +03:00

[PR #834] [CLOSED] WIP: Bookmark embeddings #1707

Columns