mirror of
https://github.com/cbeuw/Cloak.git
synced 2026-04-25 20:45:59 +03:00
[GH-ISSUE #137] Too high memory usage #113
Originally created by @notsure2 on GitHub (Dec 5, 2020).
Original GitHub issue: https://github.com/cbeuw/Cloak/issues/137
Stress testing ck-server with NumConn=0 and only around 70 Cloak sessions, it is already consuming 180 MB of RAM, and usage continues to climb. Memory does not seem to be released properly even after the connections are stopped. There seems to be a memory leak.
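For reference, heap profiles like the ones posted in this thread are typically captured with Go's net/http/pprof; a minimal, self-contained sketch (the address and port are illustrative, not something Cloak exposes by default):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	_ "net/http/pprof" // side effect: registers /debug/pprof/* on the default mux
	"time"
)

// startPprof serves the default mux (including the pprof handlers) on addr.
func startPprof(addr string) {
	go http.ListenAndServe(addr, nil)
	time.Sleep(100 * time.Millisecond) // give the listener a moment to bind
}

// heapProfileLen fetches the plain-text heap profile and returns its size.
func heapProfileLen(addr string) (int, error) {
	resp, err := http.Get("http://" + addr + "/debug/pprof/heap?debug=1")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return len(body), err
}

func main() {
	startPprof("localhost:6060")
	n, err := heapProfileLen("localhost:6060")
	fmt.Println("profile bytes:", n, "err:", err)
}
```

Against a live process you would then run `go tool pprof -sample_index=alloc_space http://localhost:6060/debug/pprof/heap` to get the alloc_space view mentioned later in this thread.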
@notsure2 commented on GitHub (Dec 5, 2020):
Info from pprof:
@notsure2 commented on GitHub (Dec 5, 2020):
Example of triggering event: Use NumConn=0, setup as plugin with shadowsocks, setup with qBittorrent, set large number of simultaneous connections in qBittorrent, download a torrent with lot of sources (eg: Linux Distro).
@notsure2 commented on GitHub (Dec 5, 2020):
I have a guess about the issue. I think that when a connection is closed, Cloak does not cancel the callback registered with timer.AfterFunc, so the Timer instances accumulate in memory (a leak). They eventually time out and get cleared, but if a program opens and closes many connections in a short time, memory grows without bound until the server kills the Cloak process.
@notsure2 commented on GitHub (Dec 5, 2020):
A possible workaround is to reduce StreamTimeout. But the correct solution is to track the timers created per connection, create no more than one timeout timer for each connection, and cancel and destroy that timer when the connection is closed.
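The workaround above would be a config change; a sketch, assuming a standard ckserver.json where StreamTimeout is expressed in seconds (all other required fields omitted for brevity):

```json
{
  "StreamTimeout": 30
}
```

A shorter timeout makes abandoned timers fire, and therefore become collectable, sooner; it does not remove the underlying leak.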
@notsure2 commented on GitHub (Dec 5, 2020):
Actually, maybe I'm misunderstanding the code; I'm not completely familiar with Go's rwCond usage, so the problem may simply be that the timeout was too long.
@cbeuw commented on GitHub (Dec 5, 2020):
So apparently timer.AfterFunc() creates a timer that does not get garbage collected until the timer fires: https://medium.com/@oboturov/golang-time-after-is-not-garbage-collected-4cbc94740082
Since it is called in a hot loop, a lot of timers were created. They would get garbage collected eventually, but this is still a memory leak. The new commit should fix it by keeping only one timer per connection.
@cbeuw commented on GitHub (Dec 5, 2020):
Hmmm this might have broken things. Will fix
@cbeuw commented on GitHub (Dec 5, 2020):
nvm it's my setup. Should actually be fine
@notsure2 commented on GitHub (Dec 5, 2020):
@cbeuw this issue is still not fixed, although its severity is reduced. You still need to clean up the timer when the connection is closed prematurely from the sender side, before the timeout elapses; otherwise the timer waits out the full timeout before it can be garbage collected, even long after the connection is closed.
@notsure2 commented on GitHub (Dec 5, 2020):
Memory usage is still climbing. Now the pprof is showing different results:
alloc_space mode:
@cbeuw commented on GitHub (Dec 5, 2020):
Thanks for your effort! Could you please check how many goroutines are running and whether that figure ever goes down when things become quiet? I tried the two-channel approach earlier, and the main issue was that there were hundreds of goroutines stuck alive that somehow never selected the closing channel.
Prior to github.com/cbeuw/Cloak@4baca256f7, there weren't any goroutine issues arising from the use of AfterFunc().
@notsure2 commented on GitHub (Dec 5, 2020):
You're right I'm getting the same issue.
@notsure2 commented on GitHub (Dec 5, 2020):
maybe there's a case where close is not getting called, or broadcastAfter is getting called after close is called....
@cbeuw commented on GitHub (Dec 5, 2020):
Because a memory leak is better than an excessive goroutine leak under normal load, I've reverted github.com/cbeuw/Cloak@4baca256f7. Additionally, I couldn't reproduce the memory leak without a stress test. I'll experiment with a stress test to see how serious it is and try to find a better solution.
@notsure2 commented on GitHub (Dec 5, 2020):
The thing is, it shouldn't have leaked unless broadcastAfter is being called after Close (or both are called together). I'll add some logging messages to investigate this. I think it's possible to solve without reverting.
@notsure2 commented on GitHub (Dec 5, 2020):
@cbeuw to reproduce this, just set up shadowsocks + Cloak, set the torrent client to use shadowsocks, then download a torrent with lots of seeds with the torrent client set to use ~100 connections per torrent. Set Cloak's NumConn=0.
@notsure2 commented on GitHub (Dec 5, 2020):
@cbeuw I think there's something weird with this part:
github.com/cbeuw/Cloak@c0040f20c3/internal/multiplex/streamBufferedPipe.go (L78-L93)
In case wTimeout is not 0 (i.e. StreamTimeout is set in the config), it schedules the next broadcast (allocating a timer here).
However, on the very next line, if the buffer is NOT empty, it writes out its contents and sends a broadcast by itself without waiting, making the previously scheduled timer callback redundant.
Shouldn't this if block be moved inside the following else block? You only need to schedule a broadcast if you are then going to wait. As it stands, this creates a new timer on every loop iteration whether or not there is data in the buffer.
@cbeuw commented on GitHub (Dec 6, 2020):
I think you are right. We should schedule a broadcast via AfterFunc only if we know we will wait for an indeterminate amount of time. In fact, this should apply to the bit above checking whether rDeadline is set as well.