[GH-ISSUE #2304] Versions 0.90.2 and 0.90.3 have issues with SingleFile support. #1406

Open
opened 2026-03-02 11:57:05 +03:00 by kerem · 10 comments

Originally created by @mapleshadow on GitHub (Dec 25, 2025).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/2304

Describe the Bug

Web pages saved with SingleFile using these two versions are very likely to be unparseable and to fail to save. Version 0.90.1 does not have this problem.

Steps to Reproduce

This issue occurs when using SingleFile to capture web page data.

Expected Behaviour

none

Screenshots or Additional Context

No response

Device Details

No response

Exact Karakeep Version

0.90.2, 0.90.3

Have you checked the troubleshooting guide?

  • I have checked the troubleshooting guide and I haven't found a solution to my problem

@MohamedBassem commented on GitHub (Dec 25, 2025):

What kind of issues are you seeing? Do they fail to get uploaded, or what do you mean by unparseable? 0.2 and 0.3 contained only Next.js patches, so it's a bit surprising if they caused issues.

<!-- gh-comment-id:3691360710 -->

@mapleshadow commented on GitHub (Dec 26, 2025):

The issue is that when sending the HTML page with SingleFile to karakeep, karakeep receives it, but most pages remain stuck in the crawling stage. This only resolves if I restart the karakeep platform (I’m using a Docker environment).

<!-- gh-comment-id:3692526862 -->

@MohamedBassem commented on GitHub (Dec 26, 2025):

@mapleshadow can you share the logs of the container when you save the snapshot?

<!-- gh-comment-id:3692533309 -->

@mapleshadow commented on GitHub (Dec 27, 2025):

> @mapleshadow can you share the logs of the container when you save the snapshot?

I have reverted to version 0.90.1.

<!-- gh-comment-id:3693838569 -->

@mapleshadow commented on GitHub (Dec 27, 2025):

> @mapleshadow can you share the logs of the container when you save the snapshot?

I suggest you test it according to my description.

<!-- gh-comment-id:3693838929 -->

@Chaldron commented on GitHub (Dec 31, 2025):

I'm observing this issue as well - a capture uploaded using SingleFile results in the page appearing in Karakeep but stuck in the crawling stage. Here are some logs:

```
Dec 31 12:57:53 links karakeep-web[958]: 2025-12-31T18:57:53.925Z info: [Crawler][9192:0] Will crawl "https://order.ebay.com/ord/show?orderId=07-13095-37788&purchaseOrderId=07-1309-537787#/" for link with id "mkhr8ar1nyrd96g4qjaufab5"
Dec 31 12:57:53 links karakeep-web[958]: 2025-12-31T18:57:53.925Z info: [Crawler][9192:0] Attempting to determine the content-type for the url https://order.ebay.com/ord/show?orderId=07-13095-37788&purchaseOrderId=07-1309-537787#/
Dec 31 12:57:54 links karakeep-web[958]: 2025-12-31T18:57:54.385Z info: [Crawler][9192:0] Content-type for the url https://order.ebay.com/ord/show?orderId=07-13095-37788&purchaseOrderId=07-1309-537787#/ is "text/html"
Dec 31 12:57:54 links karakeep-web[958]: 2025-12-31T18:57:54.385Z info: [Crawler][9192:0] The page has been precrawled. Will use the precrawled archive instead.
Dec 31 12:57:58 links karakeep-web[958]: 2025-12-31T18:57:58.257Z info: [Crawler][9192:0] Will attempt to extract metadata from page ...
Dec 31 12:58:03 links karakeep-web[958]: 2025-12-31T18:58:03.515Z info: <-- GET /api/trpc/bookmarks.getBookmark?batch=1&input=%7B%220%22%3A%7B%22json%22%3A%7B%22bookmarkId%22%3A%22mkhr8ar1nyrd96g4qjaufab5%22%7D%7D%7D
Dec 31 12:58:03 links karakeep-web[958]: 2025-12-31T18:58:03.528Z info: --> GET /api/trpc/bookmarks.getBookmark?batch=1&input=%7B%220%22%3A%7B%22json%22%3A%7B%22bookmarkId%22%3A%22mkhr8ar1nyrd96g4qjaufab5%22%7D%7D%7D 200 12ms
Dec 31 12:59:14 links karakeep-web[958]:
Dec 31 12:59:14 links karakeep-web[958]: <--- Last few GCs --->
Dec 31 12:59:14 links karakeep-web[958]:
Dec 31 12:59:14 links karakeep-web[958]: [65:0x70d437481000]    87753 ms: Mark-Compact 8079.4 (8224.4) -> 8079.4 (8224.4) MB, pooled: 0 MB, 3696.98 / 0.00 ms  (average mu = 0.161, current mu = 0.033) allocation failure; scavenge might not succeed
Dec 31 12:59:14 links karakeep-web[958]: [65:0x70d437481000]    93563 ms: Mark-Compact 8095.1 (8224.4) -> 8095.1 (8256.4) MB, pooled: 0 MB, 5792.46 / 0.00 ms  (average mu = 0.083, current mu = 0.003) allocation failure; scavenge might not succeed
Dec 31 12:59:14 links karakeep-web[958]:
Dec 31 12:59:14 links karakeep-web[958]:
Dec 31 12:59:14 links karakeep-web[958]: <--- JS stacktrace --->
Dec 31 12:59:14 links karakeep-web[958]:
Dec 31 12:59:14 links karakeep-web[958]: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
Dec 31 12:59:14 links karakeep-web[958]: ----- Native stack trace -----
Dec 31 12:59:14 links karakeep-web[958]:
```

This log is from after I had already added `NODE_OPTIONS="--max-old-space-size=8192"` to my environment (that limit should be reflected in the log). Once the "crawl" starts, the memory usage of my container slowly climbs until the process gets killed by the OOM killer or the above error message is hit. I also tried `--max-old-space-size=16384`, which had the same issue; it just took slightly longer to reach 16 GB of memory used :)

Update: Downgrading to 0.29.1 did not solve the issue. Also, this does not happen with every SingleFile upload - many of them do work.

Please let me know if there is anything I can do to provide more debug information or otherwise help track down this issue - thank you so much for the amazing software!

<!-- gh-comment-id:3702731444 -->

@mapleshadow commented on GitHub (Jan 1, 2026):

> I'm observing this issue as well - a capture uploaded using SingleFile results in the page appearing in Karakeep but stuck in the crawling stage. [...]
>
> **Update:** Downgrading to 0.29.1 did not solve the issue. Also, this does not happen with every SingleFile upload - many of them do work.

Yes, that’s right—this is the issue. Downgrading to 0.29.1 would be better.

<!-- gh-comment-id:3703487105 -->

@foxdodo commented on GitHub (Jan 9, 2026):

> I'm observing this issue as well - a capture uploaded using SingleFile results in the page appearing in Karakeep but stuck in the crawling stage. [...]

Same here. I'm seeing this when scraping Discourse forum posts.

Image
<!-- gh-comment-id:3726708402 -->

@bonnebulle commented on GitHub (Jan 13, 2026):

Sorry if my comment is not pertinent, but are we talking about the SingleFile Firefox/Chrome extension here?
I am following this doc:
https://docs.karakeep.app/integrations/singlefile/

My Firefox is up to date, with SingleFile v1.22.96.

I have configured ( SingleFile > Preferences > Destination ):

Image

On multiple sites I get this error:

Image

```
Erreur SingleFile : {"success":false,"error":{"issues":[{"code":"invalid_union_discriminator","options":["link","text","asset"],"path":["type"],"message":"Invalid discriminator value. Expected 'link' | 'text' | 'asset'"}],"name":"ZodError"}} (RestFormApi)
```

Docker logs:

```
2026-01-13T12:48:03.195Z info: --> GET /api/health 200 1ms
2026-01-13T12:48:07.475Z info: <-- OPTIONS /api/trpc/bookmarks.searchBookmarks?batch=1&input=%7B%220%22%3A%7B%22json%22%3A%7B%22text%22%3A%22url%3Ahttps%3A%2F%2Fvincentbreton.fr%2Fle-fediverse-derange-et-est-sciemment-occulte%2F%22%7D%7D%7D
2026-01-13T12:48:07.477Z info: --> OPTIONS /api/trpc/bookmarks.searchBookmarks?batch=1&input=%7B%220%22%3A%7B%22json%22%3A%7B%22text%22%3A%22url%3Ahttps%3A%2F%2Fvincentbreton.fr%2Fle-fediverse-derange-et-est-sciemment-occulte%2F%22%7D%7D%7D 204 1ms
2026-01-13T12:48:07.502Z info: <-- GET /api/trpc/bookmarks.searchBookmarks?batch=1&input=%7B%220%22%3A%7B%22json%22%3A%7B%22text%22%3A%22url%3Ahttps%3A%2F%2Fvincentbreton.fr%2Fle-fediverse-derange-et-est-sciemment-occulte%2F%22%7D%7D%7D
2026-01-13T12:48:07.510Z info: --> GET /api/trpc/bookmarks.searchBookmarks?batch=1&input=%7B%220%22%3A%7B%22json%22%3A%7B%22text%22%3A%22url%3Ahttps%3A%2F%2Fvincentbreton.fr%2Fle-fediverse-derange-et-est-sciemment-occulte%2F%22%7D%7D%7D 500 7ms
```

Thanks
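
For reference, the ZodError above is what Zod produces when a discriminated union's discriminator field is missing or invalid. A dependency-free sketch of the same check follows; field names here are illustrative, not Karakeep's actual schema:

```javascript
// Minimal sketch of the validation behind the reported ZodError: the
// payload's "type" field must be one of the discriminator values
// "link" | "text" | "asset". (Illustrative only — not Karakeep's code.)
const ALLOWED_TYPES = ["link", "text", "asset"];

function validateBookmarkPayload(payload) {
  if (!ALLOWED_TYPES.includes(payload.type)) {
    return {
      success: false,
      error: {
        name: "ZodError",
        issues: [
          {
            code: "invalid_union_discriminator",
            options: ALLOWED_TYPES,
            path: ["type"],
            message: "Invalid discriminator value. Expected 'link' | 'text' | 'asset'",
          },
        ],
      },
    };
  }
  return { success: true };
}

// A body without a valid "type" fails exactly like the reported error:
console.log(validateBookmarkPayload({ url: "https://example.com" }).success); // false
console.log(validateBookmarkPayload({ type: "link", url: "https://example.com" }).success); // true
```

So this particular failure looks like a request-shape mismatch between the extension's upload and the API, distinct from the out-of-memory crash discussed earlier in the thread.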

<!-- gh-comment-id:3744236222 -->

@ahgraber commented on GitHub (Feb 6, 2026):

I'm having the same issue with 0.30.0, trying to load https://huggingface.co/spaces/nanotron/ultrascale-playbook?section=expert_parallelism from SingleFile (with JS enabled). I've attached my SingleFile settings below for reproduction purposes.

```
2026-02-06T00:55:23.832Z info: [Crawler][172970:1] Will attempt to extract metadata from page ...

<--- Last few GCs --->

[336:0x7fa19c2ee000]   116844 ms: Scavenge (interleaved) 2028.5 (2034.3) -> 2028.5 (2045.3) MB, pooled: 0 MB, 12.43 / 0.00 ms  (average mu = 0.153, current mu = 0.080) allocation failure; 
[336:0x7fa19c2ee000]   119378 ms: Mark-Compact (reduce) 2028.8 (2045.3) -> 2028.8 (2031.5) MB, pooled: 0 MB, 2369.68 / 0.00 ms  (+ 19.6 ms in 5 steps since start of marking, biggest step 5.0 ms, walltime since start of marking 2393 ms) (average mu = 0.116
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----
```

singlefile-settings-2026-02-06T00_56_53.623Z.json

<!-- gh-comment-id:3857207281 -->