[GH-ISSUE #528] [Perf] Move away from the APIs that return all tags in one request #342

Closed
opened 2026-03-02 11:49:01 +03:00 by kerem · 12 comments

Originally created by @MohamedBassem on GitHub (Oct 13, 2024).
Original GitHub issue: https://github.com/karakeep-app/karakeep/issues/528

People with thousands of tags are already feeling slowness in viewing their tags or searching through them. This issue is to:

  1. Change the AllTags page to be paginated (with a search box at the top).
  2. Change the tag editor to send async search requests to the backend instead of inlining all the tags.
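
The two steps above could share one paginated search primitive. The following is only a minimal sketch under stated assumptions: `Tag`, `TagPage`, and `searchTags` are hypothetical names, not the project's actual API, and the in-memory filter stands in for what a real backend would push down into the database query.

```typescript
// Hypothetical shapes for illustration; not the project's real types.
interface Tag {
  id: string;
  name: string;
}

interface TagPage {
  tags: Tag[];
  // Offset to request next, or null when there are no more matches.
  nextCursor: number | null;
}

// Returns one page of tags whose name contains `query` (case-insensitive).
// An empty query matches every tag, so the same call serves the AllTags page.
function searchTags(all: Tag[], query: string, cursor = 0, limit = 50): TagPage {
  const needle = query.trim().toLowerCase();
  const matches = all
    .filter((t) => t.name.toLowerCase().includes(needle))
    .sort((a, b) => a.name.localeCompare(b.name));
  const tags = matches.slice(cursor, cursor + limit);
  const next = cursor + limit;
  return { tags, nextCursor: next < matches.length ? next : null };
}
```

The tag editor would call this with the text typed so far, and the AllTags page would keep requesting `nextCursor` until it is `null`, so neither ever needs the full tag list in one response.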

@Arcturuss commented on GitHub (Oct 18, 2024):

I feel there is something wrong with the page layout if just a few thousand (not even millions of) tags are enough to slow the browser. The same goes for the API request. Maybe it would be better in the long run to fix the underlying cause instead of hiding it with pagination?

@petrm commented on GitHub (Nov 5, 2024):

I just started to notice a huge slowdown with 5000 or so tags. Looking at the performance of the page, the browser actually spends most of the time waiting for /dashboard/tags, in my case over 9 seconds.
It doesn't look to me like a problem in the browser, but in the backend. Is there any flag that could be enabled to spit out the SQL query used to generate the tags on the page? 5000 rows is nothing - this has to return results in milliseconds.

@petrm commented on GitHub (Nov 5, 2024):

Well, I went to the drizzle docs. It looks like there is a way to show the query, but as far as I can see it would need to be explicitly added to the hoarder codebase: https://orm.drizzle.team/docs/goodies#printing-sql-query. Also, there is https://github.com/drizzle-team/drizzle-orm/issues/2605.

@MohamedBassem commented on GitHub (Nov 5, 2024):

It's unlikely that the problem is in the query itself. The last time we saw this problem, it was caused by the rendering (and potentially the server-side rendering in this case).

@petrm commented on GitHub (Nov 5, 2024):

It may have been the rendering previously, but look at this:
![image](https://github.com/user-attachments/assets/f386fc59-3453-4800-bae9-5dd52c1d38fc)
Would you be able to share the queries?

@MohamedBassem commented on GitHub (Nov 5, 2024):

Most likely that's server side rendering taking the time. I can share the queries when I'm in front of the laptop.

@kamtschatka commented on GitHub (Nov 5, 2024):

```
{
  sql: 'select "id", "name", "createdAt", "userId", (select coalesce(json_group_array(json_array("attachedBy")), json_array()) as "data" from "tagsOnBookmarks" "bookmarkTags_tagsOnBookmarks" where "bookmarkTags_tagsOnBookmarks"."tagId" = "bookmarkTags"."id") as "tagsOnBookmarks" from "bookmarkTags" where "bookmarkTags"."userId" = ?',
  params: [ 't9uxin6b6q7fz1v72jm31nr6' ],
  typings: [ 'none' ]
}
```

@petrm commented on GitHub (Nov 5, 2024):

Thanks. I have been working with the assumption that the server just spits out json which is then used by the browser. I guess I will have to set up a dev copy and dig into typescript profiling ;)

@kamtschatka commented on GitHub (Nov 5, 2024):

So yes, the query does take surprisingly long (1.5 seconds with 5000 tags). Then we calculate the ai/human part, which takes a few ms.
The whole request takes 3.5 seconds on my machine, so about 2 seconds go to server-side rendering.
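
For readers wondering what the "ai/human part" might look like: the query shared earlier returns, per tag, a JSON array with one single-element `["attachedBy"]` array for each bookmark link. The following is a rough sketch of counting those entries, not the project's actual code. `TagRow`, `TagCounts`, and `countByAttachedBy` are hypothetical names, and treating the column as a raw JSON string with the values `"ai"` and `"human"` is an assumption.

```typescript
// Assumed values of the attachedBy column.
type AttachedBy = "ai" | "human";

// Hypothetical row shape: tagsOnBookmarks holds JSON text like '[["ai"],["human"]]'.
interface TagRow {
  id: string;
  name: string;
  tagsOnBookmarks: string;
}

interface TagCounts {
  id: string;
  name: string;
  ai: number;
  human: number;
  total: number;
}

// Parses the aggregated JSON once and tallies links by who attached them.
function countByAttachedBy(row: TagRow): TagCounts {
  const entries = JSON.parse(row.tagsOnBookmarks) as [AttachedBy | null][];
  let ai = 0;
  let human = 0;
  for (const [attachedBy] of entries) {
    if (attachedBy === "ai") ai++;
    else if (attachedBy === "human") human++;
  }
  return { id: row.id, name: row.name, ai, human, total: entries.length };
}
```

This is a single linear pass per tag, which is consistent with the step taking only a few milliseconds even for 5000 tags.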

@kamtschatka commented on GitHub (Nov 5, 2024):

I changed the way the query is done and it reduces the time to below 100ms on my machine.
Rendering and transferring the data from the server will still take 2 seconds or so.
I guess we should start testing with much larger numbers of bookmarks and tags to find similar bottlenecks and see what we can do.

@MohamedBassem commented on GitHub (Nov 8, 2024):

@kamtschatka Can you try now after the index fix?

@kamtschatka commented on GitHub (Nov 9, 2024):

Aaahh, I had checked the index, but was too blind to see the misconfiguration...
I have tried it and the query is now just as fast as with my solution (always sub 100ms, mostly closer to 50ms).
I am wondering, though, whether client-side rendering of the tags page would still be faster, since creating the HTML and transferring it to the client also contributes a lot to the load times.
A NAS is usually less powerful than a PC, and the upload also takes some time.
Unfortunately it was not as easy as adding "use client" to the files, so I was not able to try that...
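
On the index fix mentioned above: the thread does not say what the misconfiguration was, so the following is only an in-memory analogy, not the actual change. It assumes the per-tag subquery on `tagsOnBookmarks.tagId` had no usable index, which would mean one scan of the link table per tag. `TagLink`, `countsByScan`, and `countsByIndex` are hypothetical names.

```typescript
// Hypothetical stand-in for a row of the tagsOnBookmarks table.
interface TagLink {
  tagId: string;
  attachedBy: "ai" | "human";
}

// No usable index: every tag triggers a full pass over all link rows,
// so the work grows with (number of tags) x (number of links).
function countsByScan(tagIds: string[], links: TagLink[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const tagId of tagIds) {
    counts.set(tagId, links.filter((l) => l.tagId === tagId).length);
  }
  return counts;
}

// With an index on tagId: the links are grouped once, then each tag is a
// direct lookup, so the work grows with (tags + links).
function countsByIndex(tagIds: string[], links: TagLink[]): Map<string, number> {
  const index = new Map<string, number>();
  for (const link of links) {
    index.set(link.tagId, (index.get(link.tagId) ?? 0) + 1);
  }
  const counts = new Map<string, number>();
  for (const tagId of tagIds) {
    counts.set(tagId, index.get(tagId) ?? 0);
  }
  return counts;
}
```

Both functions return the same counts; only the amount of work differs, which is the kind of gap that could separate a 1.5 second query from a 50ms one at 5000 tags.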
