[GH-ISSUE #28] [BUG] High RAM usage #19

Closed
opened 2026-02-25 20:34:41 +03:00 by kerem · 3 comments

Originally created by @chpatton013 on GitHub (May 12, 2020).
Original GitHub issue: https://github.com/benbusby/whoogle-search/issues/28

First off, I'm really excited about this project! I've been looking for something more approachable than searx for quite a while, and was really happy to see you publish this. It seems like a great project, and I'm looking forward to using it. That said...

Describe the bug
Whoogle seems to use a lot of RAM, even before a single request has reached the application.

To Reproduce
Steps to reproduce the behavior:

  1. `docker run --name=whoogle-search --init --rm benbusby/whoogle-search`
  2. `docker stats --no-stream whoogle-search`
  3. Observe the `MEM USAGE / LIMIT` column

Expected behavior
Essentially, less RAM usage.

Desktop (please complete the following information):
I'm testing this in a VM on my laptop, using Vagrant with a VirtualBox provider.

  • OS: macOS 10.15.4
  • Vagrant: 2.2.9
  • VirtualBox: 6.0.14r133895

Additional context
I'm seeing 264MB used by whoogle in my local VM cluster. I tried to deploy this to a t3a.nano in EC2 (456MB RAM), but triggered the OOM killer (the docker daemon uses a lot of RAM on its own, which leaves that machine with about 250MB of RAM free). Unfortunately, this is the RAM usage at rest. I haven't even directed a request at this application yet, and it's already allocated over a quarter of a gigabyte.

For context, most PHP applications run with a default memory limit of 64MB, and many will operate just fine with a limit of 32MB (for example, bookstack, which I'm operating right now with 23MB of RAM usage). RAM is sparse in VMs (both locally and in the cloud), so anything you can do to lower the RAM requirements of the application will make this project more viable for a wider variety of users.
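The at-rest figures above can also be checked from inside the process itself, independent of `docker stats`. A minimal sketch using only the standard library (the MiB conversion assumes Linux, where `ru_maxrss` is reported in kilobytes; on macOS it is in bytes):

```python
import resource

def peak_rss_mib():
    """Return this process's peak resident set size in MiB.

    Note: ru_maxrss is in kilobytes on Linux (as inside the Docker
    image discussed here) but in bytes on macOS.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

print(f"peak RSS: {peak_rss_mib():.1f} MiB")
```

Running something like this at the end of the app's startup would show how much of the reported usage is the interpreter plus imports, before any request handling.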

kerem (2026-02-25 20:34:41 +03:00) closed this issue and added the bug label.

@benbusby commented on GitHub (May 12, 2020):

Yikes. I tracked it down, looks like the library I was using to make funny rhymes out of the User Agent was extremely inefficient with memory. Bypassing that knocked it down to about 30MB of RAM. All that memory for a stupid joke...tsk tsk. Thanks for pointing that out, will push a fix soon.


@benbusby commented on GitHub (May 12, 2020):

```
$ docker stats --no-stream whooglesearch
CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
35b5b8492a1d        whooglesearch       0.03%               25.42MiB / 1.943GiB   1.28%               2.41kB / 15.9kB     0B / 0B             2
```

Fixed in 445019d204, will document in upcoming release in the next few days. Thanks again!


@chpatton013 commented on GitHub (May 12, 2020):

Wow, that's pretty incredible. I wasn't expecting it to be too difficult to fix, but I didn't anticipate it all coming from one dependency. Thanks for the fast turn-around time! I'll keep my eye out for that next release.

I looked into `Phyme`, and it looks like all that memory goes to the dictionary storage (and the resulting lookup structures). Another good reminder that although Python is really convenient to work with, that convenience can carry serious hardware costs.
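The kind of cost described above is easy to observe with `tracemalloc`. Since `Phyme` itself may not be installed, this sketch simulates a dictionary-heavy import by building a large in-memory mapping and measuring the allocation delta:

```python
import tracemalloc

tracemalloc.start()

# A module that loads a pronunciation dictionary at import time would
# show up in this delta; we stand in for it with a 100k-entry mapping.
lookup = {i: str(i) * 4 for i in range(100_000)}

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"allocated ~{current / 1024 / 1024:.1f} MiB for the lookup table")
```

Python objects carry per-object overhead (headers, hash tables, interned pointers), so a dictionary that would be a few MB on disk can easily be several times larger resident in memory.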
