[GH-ISSUE #1939] Detect products on all sites #1200

Open
opened 2026-03-02 11:55:44 +03:00 by kerem · 2 comments
Owner

Originally created by @flexagoon on GitHub (Sep 12, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/1939

Describe the feature you'd like

github.com/bhardwajRahul/hoarder@c68e509979 introduced special handling for Amazon pages as part of issue #1344. However, as I also said in that issue, MyMind can detect products automatically on any website, while Karakeep currently hardcodes Amazon.

Describe the benefits this would bring to existing Karakeep users

Same as in #1344, this would help for a wishist usecase

Can the goal of this request already be achieved via other means?

No

Have you searched for an existing open/closed issue?

  • I have searched for existing issues and none cover my fundamental request

Additional context

No response

Originally created by @flexagoon on GitHub (Sep 12, 2025). Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/1939 ### Describe the feature you'd like https://github.com/bhardwajRahul/hoarder/commit/c68e5099797d5b49ed6441ce04d7c77105327f73 introduced special handling for Amazon pages as part of issue #1344. However, as I also said in that issue, MyMind can detect products automatically on any website, while Karakeep currently hardcodes Amazon. ### Describe the benefits this would bring to existing Karakeep users Same as in #1344, this would help for a wishist usecase ### Can the goal of this request already be achieved via other means? No ### Have you searched for an existing open/closed issue? - [x] I have searched for existing issues and none cover my fundamental request ### Additional context _No response_
Author
Owner

@shahimclt commented on GitHub (Feb 20, 2026):

For special rendering for bookmarks with og:type=product, the parser first needs to capture metadata and store it when crawling the URL. Once that is done, It's just a matter of having a special Renderer for products.

Unfortunately, it's a bit more than I can chew at this point. Maybe somebody can take this up.

<!-- gh-comment-id:3934018173 --> @shahimclt commented on GitHub (Feb 20, 2026): For special rendering for bookmarks with `og:type=product`, the parser first needs to capture metadata and store it when crawling the URL. Once that is done, It's just a matter of having a special Renderer for products. Unfortunately, it's a bit more than I can chew at this point. Maybe somebody can take this up.
Author
Owner

@shahimclt commented on GitHub (Feb 20, 2026):

I am not familiar with NextJS or React, so trying to solve this myself is proving to be a huge rabbit hole.

I am documenting what I tried here, so maybe someone can help me out:

1. Parsing shopping info

metascraper has an addon @samirrayani/metascraper-shopping that can extract price info. I managed to add it and log the parsed price details.

apps\workers\scripts\parseHtmlSubprocess.ts

import metascraperShopping from "@samirrayani/metascraper-shopping";
...
metascraperAmazon(),
  metascraperShopping(), // Added this
  metascraperYoutube({
...

apps\workers\workers\crawlerWorker.ts

//Line:1596
logger.info(
    `[Crawler][${jobId}] Metadata extracted from the page: ${JSON.stringify(meta)}`,
);

This does work.

2. Saving price to DB

packages\db\schema.ts

// add field to bookmarkLinks
productPrice: text("productPrice"),

Now I need help with these:

  • How do I properly add the metascraper-shopping. I understand that a simple npm install will not work as this project uses pnpm. I managed to test using a forced install but need the proper way.
  • How Do I add migrations to the DB?
  • My dev setup is running the docker compose command in Windows. I have to stop the container and run the up command again after I make changes, which takes a long time. Is there a faster way?
<!-- gh-comment-id:3935272757 --> @shahimclt commented on GitHub (Feb 20, 2026): I am not familiar with NextJS or React, so trying to solve this myself is proving to be a huge rabbit hole. I am documenting what I tried here, so maybe someone can help me out: ### 1. Parsing shopping info `metascraper` has an addon `@samirrayani/metascraper-shopping` that can extract price info. I managed to add it and log the parsed price details. `apps\workers\scripts\parseHtmlSubprocess.ts` ```js import metascraperShopping from "@samirrayani/metascraper-shopping"; ... metascraperAmazon(), metascraperShopping(), // Added this metascraperYoutube({ ... ``` `apps\workers\workers\crawlerWorker.ts` ````js //Line:1596 logger.info( `[Crawler][${jobId}] Metadata extracted from the page: ${JSON.stringify(meta)}`, ); ```` This does work. ### 2. Saving price to DB `packages\db\schema.ts` ````js // add field to bookmarkLinks productPrice: text("productPrice"), ```` Now I need help with these: - How do I properly add the `metascraper-shopping`. I understand that a simple npm install will not work as this project uses pnpm. I managed to test using a forced install but need the proper way. - How Do I add migrations to the DB? - My dev setup is running the docker compose command in Windows. I have to stop the container and run the `up` command again after I make changes, which takes a long time. Is there a faster way?
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/karakeep#1200
No description provided.