[GH-ISSUE #703] Omnivore import improvement - archived #455

Closed
opened 2026-03-02 11:50:01 +03:00 by kerem · 3 comments

Originally created by @jakubsuchybio on GitHub (Nov 28, 2024).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/703

Describe the feature you'd like

Currently you import with this mapping:

  return parsed.data.map((bookmark) => {
    return {
      title: bookmark.title ?? "",
      content: { type: BookmarkTypes.LINK as const, url: bookmark.url },
      tags: bookmark.labels,
      addDate: bookmark.savedAt.getTime() / 1000,
    };
  });

But the Omnivore export format also has bookmark.State, which can be either Active or Archived.
I can see that your import DTO doesn't have an archived boolean:

export interface ParsedBookmark {
  title: string;
  content?:
    | { type: BookmarkTypes.LINK; url: string }
    | { type: BookmarkTypes.TEXT; text: string };
  tags: string[];
  addDate?: number;
  notes?: string;
}

but your bookmarks do have one.
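For illustration, here is a minimal sketch of what carrying the archive state through the mapping could look like. It is not the project's actual implementation: the `archived` field on `ParsedBookmark`, the `OmnivoreBookmark` shape, the lowercase `state` property name, and the `mapOmnivoreBookmarks` function are all assumptions made for this example. `BookmarkTypes` is replaced by a local const object so the snippet runs on its own.

```typescript
// Local stand-in for the app's BookmarkTypes (assumption, so the sketch is self-contained).
const BookmarkTypes = { LINK: "link", TEXT: "text" } as const;

// ParsedBookmark as quoted above, plus a hypothetical optional `archived` flag.
interface ParsedBookmark {
  title: string;
  content?:
    | { type: typeof BookmarkTypes.LINK; url: string }
    | { type: typeof BookmarkTypes.TEXT; text: string };
  tags: string[];
  addDate?: number;
  notes?: string;
  archived?: boolean; // assumption: not part of the DTO quoted in this issue
}

// Assumed shape of one parsed Omnivore export record.
interface OmnivoreBookmark {
  title?: string;
  url: string;
  labels: string[];
  savedAt: Date;
  state?: string; // reported in this issue as "Active" or "Archived"
}

// Same mapping as quoted above, with the archive state carried over.
function mapOmnivoreBookmarks(data: OmnivoreBookmark[]): ParsedBookmark[] {
  return data.map((bookmark) => ({
    title: bookmark.title ?? "",
    content: { type: BookmarkTypes.LINK, url: bookmark.url },
    tags: bookmark.labels,
    addDate: bookmark.savedAt.getTime() / 1000,
    // Compare case-insensitively, since the exact casing in the export is not confirmed here.
    archived: bookmark.state?.toLowerCase() === "archived",
  }));
}
```

A record without a `state` value maps to `archived: false`, so older exports would import as they do today.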

I'd code it myself, but I can't even read TypeScript, even though I am a C# dev.
This React stuff isn't something that I want to learn.

Describe the benefits this would bring to existing Hoarder users

It would bring a better experience when migrating from Omnivore, so that the work users put into archiving their articles wouldn't be wasted.

Can the goal of this request already be achieved via other means?

I guess manual bulk updates for bookmarks, but it is very difficult to tell which ones should be archived and which should not.
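To make the manual route less of a guessing game, the list of bookmarks to archive could be computed by matching URLs between the Omnivore export and the already imported bookmarks. The sketch below shows that idea only. The `ExportedItem` and `ExistingBookmark` shapes and both function names are hypothetical, and it does not call any Karakeep API; applying the result is left to whatever bulk-update mechanism is available.

```typescript
// Assumed minimal shapes; field names are illustrative, not a confirmed schema.
interface ExportedItem {
  url: string;
  state: string; // "Active" or "Archived" according to this issue
}

interface ExistingBookmark {
  id: string;
  url: string;
  archived: boolean;
}

// Normalize URLs so trivial differences (fragment, trailing slash) do not block a match.
function normalizeUrl(raw: string): string {
  try {
    const u = new URL(raw);
    u.hash = "";
    return u.toString().replace(/\/$/, "");
  } catch {
    // Not a parseable URL: fall back to the trimmed raw string.
    return raw.trim();
  }
}

// Return the ids of imported bookmarks that were archived in the export
// but are not archived yet.
function findBookmarksToArchive(
  exported: ExportedItem[],
  existing: ExistingBookmark[],
): string[] {
  const archivedUrls = new Set(
    exported
      .filter((item) => item.state.toLowerCase() === "archived")
      .map((item) => normalizeUrl(item.url)),
  );
  return existing
    .filter((b) => !b.archived && archivedUrls.has(normalizeUrl(b.url)))
    .map((b) => b.id);
}
```

Exact URL matching will miss bookmarks whose URL changed during import (redirects, tracking parameters), so the returned list is best treated as a starting point to review.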

Have you searched for an existing open/closed issue?

  • I have searched for existing issues and none cover my fundamental request

Additional context

No response


@thiswillbeyourgithub commented on GitHub (May 19, 2025):

I was not aware of this!

If, like me, you don't want to reimport your data, I made a quick and very dirty script using my karakeep-python-api client (https://github.com/thiswillbeyourgithub/karakeep_python_api/) that should handle most cases: https://github.com/thiswillbeyourgithub/karakeep_python_api/tree/main/community_scripts/omnivore2karakeep-archived

Edit: I'm also working on a script to import the highlights from Omnivore.


@youenchene commented on GitHub (May 24, 2025):

I got the same issue with Pocket: the archive status is not taken into account while importing.

Here is the script/PR: https://github.com/thiswillbeyourgithub/karakeep_python_api/pull/2


@thiswillbeyourgithub commented on GitHub (May 28, 2025):

FYI, I also wrote a script to import the highlights from Omnivore directly into Karakeep: https://github.com/thiswillbeyourgithub/karakeep_python_api/tree/main/community_scripts/omnivore2karakeep-highlights

It uses probabilistic matching and ignores PDFs, but it worked well for me.
