[GH-ISSUE #34] How to reduce cpu load. #26

Open
opened 2026-02-26 12:33:45 +03:00 by kerem · 13 comments
Owner

Originally created by @alexkatanda on GitHub (Jun 23, 2019).
Original GitHub issue: https://github.com/cbeuw/Cloak/issues/34

Hello,
I installed ck-server with shadowsocks-libev 3.2.5 on Ubuntu 18.10.
ck-server's CPU load is much higher than ss-server's:
ck-server uses about 58% CPU, while ss-server uses about 18%.
Why is ck-server's CPU load so high?
How can I reduce it?


@malikshi commented on GitHub (Jun 24, 2019):

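@alexkatanda
Hi @cbeuw, I see the same Cloak issue: the CPU load is very high. I just tested it and found that when users have high upload traffic, the CPU load goes up.
![Screenshot_74](https://user-images.githubusercontent.com/9080737/60047069-62cbec80-96fb-11e9-89c6-bbd40d10c270.jpg)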


@cbeuw commented on GitHub (Jun 24, 2019):

Please provide some environment information, like the CPU architecture and the version of cloak you are using.

How many sessions are established when this happens? Is there any trigger or does it happen randomly or from the very beginning?


@malikshi commented on GitHub (Jun 24, 2019):

> Please provide some environment information, like the CPU architecture and the version of cloak you are using.
>
> How many sessions are established when this happens? Is there any trigger or does it happen randomly or from the very beginning?

![Screenshot_75](https://user-images.githubusercontent.com/9080737/60052282-6b2a2480-9707-11e9-9942-ed1c2f1a5e13.jpg)

Even when I just run a speed test, the CPU load looks like this.
Here are my specs:
CPU model : Virtual CPU 523cbcdd6ca4
Number of cores : 1
CPU frequency : 2399.996 MHz
Total size of Disk : 25.0 GB (5.7 GB Used)
Total amount of Mem : 985 MB (336 MB Used)
Total amount of Swap : 2047 MB (0 MB Used)
System uptime : 0 days, 1 hour 50 min
Load average : 0.07, 0.08, 0.03
OS : Ubuntu 18.04.2 LTS
Arch : x86_64 (64 Bit)
Kernel : 4.18.0-24-generic

I haven't shared this server with anyone.


@HirbodBehnam commented on GitHub (Jun 26, 2019):

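Hello
Things are better on my server:
![Speed result](https://user-images.githubusercontent.com/11520090/60198151-34165900-9856-11e9-9f35-e457ba67d7bb.png)
As you can see, ck-server and ss-server are using about 33% of CPU at 120 Mbit/s. (The numbers in htop are out of 200%.)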

Here is my CPU info on my server

processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 1
model name : AMD Ryzen 7 PRO 1700X Eight-Core Processor
stepping : 1
microcode : 0x8001137
cpu MHz : 3399.999
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx hypervisor lahf_lm extapic abm sse4a misalignsse 3dnowprefetch osvw cpb ssbd vmmcall arat
bugs : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips : 6799.99
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : AuthenticAMD
cpu family : 23
model : 1
model name : AMD Ryzen 7 PRO 1700X Eight-Core Processor
stepping : 1
microcode : 0x8001137
cpu MHz : 3399.999
cache size : 512 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx hypervisor lahf_lm extapic abm sse4a misalignsse 3dnowprefetch osvw cpb ssbd vmmcall arat
bugs : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips : 6799.99
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

One thing that may or may not matter is that my server's CPU has the `aes` flag. You can test for it with this command:

grep aes /proc/cpuinfo

If it prints anything, your server's CPU supports AES-NI.

Also, the encryption algorithm for shadowsocks is `aes-128-gcm`.
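
The `grep` check above can be wrapped in a small script that prints a yes/no answer. This is only a sketch: the function name `has_aes_flag` is made up for illustration, and it assumes a Linux system that exposes CPU flags in `/proc/cpuinfo`.

```shell
#!/bin/sh
# Return success if a cpuinfo-style file lists the "aes" CPU flag.
# has_aes_flag is a hypothetical helper name, not part of any tool.
has_aes_flag() {
    # -w matches "aes" only as a whole word, so a flag such as "vaes"
    # does not count; -q suppresses output and leaves just the exit status.
    grep -qw aes "${1:-/proc/cpuinfo}" 2>/dev/null
}

if has_aes_flag; then
    echo "AES-NI: reported by the CPU"
else
    echo "AES-NI: not reported"
fi
```

On a virtual server the result reflects what the hypervisor exposes to the guest, which may differ from what the physical CPU supports.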


@malikshi commented on GitHub (Jun 27, 2019):

Are you running it on a PC? That's a very good server.
Can you try running `ss-server -v -u -c <path to ss-server config>`?


@HirbodBehnam commented on GitHub (Jun 27, 2019):

No, both are servers. The client is from Eonix: KVM virtualization, 1 core @ 2100 MHz, 512 MB RAM, and no AES-NI support.
And what do UDP relay and verbose mode have to do with it? Speed tests are usually TCP.

BTW, this is my own computer: 20% CPU utilization at 42 Mbit/s.
![Speedtest Result](https://user-images.githubusercontent.com/11520090/60232789-e4b24600-98b2-11e9-84ba-c81998c9ac1f.png)


@malikshi commented on GitHub (Jun 27, 2019):

> And what's the matter with UDP relay and verbose mode? Speed tests are usually TCP.

Just try those on the server and see whether the CPU load and RAM usage change.
Usually RAM and CPU get loaded more when most users are pushing a lot of traffic.


@HirbodBehnam commented on GitHub (Jun 27, 2019):

Yes, enabling verbose mode will use more CPU because it writes every connection detail to the terminal or the service log. You can remove the `-v` flag.
I also tried with UDP relay:
![Speed test](https://user-images.githubusercontent.com/11520090/60248848-b9dae880-98d8-11e9-8e6f-f778a2a12eee.png)
About 20 Mbit/s and 10% CPU usage. Same as before.


@HirbodBehnam commented on GitHub (Jun 27, 2019):

But as you can see, the CPU usage comes from ss-server rather than ck-server, so you could change the encryption to `rc4-md5`, `chacha20`, or `salsa20`.
I'm not sure whether this helps, but I have also installed the `haveged` package.


@cbeuw commented on GitHub (Jun 27, 2019):

`rc4-md5` is not secure, so it's not recommended. ChaCha20 performs worse than AES on machines with AES-NI support. Regardless, I don't think the CPU usage is due to crypto.

Is it possible to run ck-server in standalone mode and see the CPU usage of ss-server and ck-server separately?


@HirbodBehnam commented on GitHub (Jun 27, 2019):

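Yeah, you are right, `rc4-md5` is a weak algorithm. @malikshi You can see [here](https://github.com/shadowsocks/libQtShadowsocks/wiki/Comparison-of-Encryption-Methods'-Speed) for a speed comparison. However, it is from 2 years ago and I'm not sure whether it is still valid!

  • I also tested shadowsocks with the `aes-128-gcm` cipher on my server that does not support AES-NI. Here is the result:
    ![image](https://user-images.githubusercontent.com/11520090/60261134-670d2b00-98f0-11e9-916d-25aee2d9df68.png)
    Download speed is about 20 Mbit/s.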
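
For a rough local comparison of AES-GCM and ChaCha20-Poly1305 on a given machine, one option is OpenSSL's built-in benchmark. This is a sketch under assumptions: `bench_cipher` is a made-up helper name, the `openssl` command-line tool must be installed, and the numbers describe OpenSSL's implementations, not necessarily the crypto libraries that ss-server or ck-server use, so treat them as an indication only.

```shell
#!/bin/sh
# Print the throughput summary line for one cipher.
# bench_cipher is a hypothetical helper name for this sketch.
bench_cipher() {
    # "openssl speed -evp NAME" benchmarks the named cipher; progress
    # messages go to stderr, and the last line of the report on stdout
    # lists throughput for each tested block size.
    openssl speed -evp "$1" 2>/dev/null | tail -n 1
}

# Usage (each call runs for several seconds):
#   bench_cipher aes-128-gcm
#   bench_cipher chacha20-poly1305
```

Running both calls on a machine with AES-NI and on one without should show whether the gap between the two ciphers matters on that hardware.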

And a noobish question for @cbeuw: why is standalone mode required to measure ck-server's CPU utilization? When running with the `--plugin` option, `ck-server` runs as a child of `ss-server`, and htop shows the utilization like this:
![image](https://user-images.githubusercontent.com/11520090/60261654-a12afc80-98f1-11e9-8a24-6139bc25841b.png)
Is there any problem with this? (Sorry for my noobish English too.)
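
As an alternative to reading htop, per-process CPU usage can also be listed with `ps` and summed per command name. This is a minimal sketch, assuming a Linux system with procps `ps` and that the processes are named `ss-server` and `ck-server`; `sum_cpu_by_comm` is a made-up helper name. Note that the `pcpu` value from `ps` is an average over each process's lifetime, not an instantaneous reading like htop's.

```shell
#!/bin/sh
# Sum the CPU column per command name.
# Reads "command-name cpu-percent" lines on stdin.
# sum_cpu_by_comm is a hypothetical helper name for this sketch.
sum_cpu_by_comm() {
    awk '{ cpu[$1] += $2 }
         END { for (c in cpu) printf "%s %.1f\n", c, cpu[c] }' | sort
}

# -C selects processes by command name; "comm=" and "pcpu=" print the
# name and CPU percentage columns without a header line.
ps -C ss-server,ck-server -o comm=,pcpu= 2>/dev/null | sum_cpu_by_comm
```

If neither process is running, the script prints nothing.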


@malikshi commented on GitHub (Jun 29, 2019):

> Is it possible to run ck-server in standalone mode and see the CPU usage of ss-server and ck-server separately?

![Screenshot_90](https://user-images.githubusercontent.com/9080737/60383844-273d7380-9aa9-11e9-9d05-0a84c615f512.jpg)
The picture I posted before wasn't set to F5 (tree view); I hope you understand.

> rc4-md5 is not secure, so it's not recommended. ChaCha20 performs worse than AES on machines with AES-NI support. Regardless, I don't think the CPU usage is due to crypto.

My config uses ChaCha by default in my own script (edited from a gist and HirbodBehnam's repo): `"method":"chacha20-ietf-poly1305"`.

> And what's the matter with UDP relay and verbose mode? Speed tests are usually TCP.

Yeah, I just want to use UDP relay if the Cloak plugin supports DNS over UDP (relay). Voice/video calls need UDP, right?

And I still think TCP Fast Open might be the main problem. I know it makes things faster, but it doesn't feel right when implemented in Cloak.


@alexkatanda commented on GitHub (Jul 28, 2019):

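Hi @cbeuw,
![image](https://user-images.githubusercontent.com/52138094/62011753-03a44080-b1af-11e9-974d-8a5f0b3b8933.png)

Here are my server specs: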
CPU model : Intel(R) Xeon(R) CPU E3-1240 v5 @ 3.50GHz
Number of cores : 8
Total amount of Mem : 32GB
OS : Ubuntu 18.04.2 LTS
Arch : x86_64 (64 Bit)
Kernel : 4.15.0-55-generic

Currently I am using this server with my colleagues.
Shadowsocks is using the chacha20 cipher.
I also think that the CPU usage comes from ss-server rather than ck-server. However, in an environment with many sessions, ck-server's CPU usage is higher than ss-server's.
Currently there are about 100~200 simultaneous sessions.
