[GH-ISSUE #1500] Understanding CPU and Network footprint #788

Closed
opened 2026-03-04 01:48:48 +03:00 by kerem · 2 comments

Originally created by @afritzler on GitHub (Dec 21, 2020).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1500

Setup

I am running s3fs-fuse as a sidecar of an Nginx container in a Kubernetes cluster to serve the bucket content (something like the setup described here [0]). The bucket in my scenario is gigantic, both in terms of number of files/directories and total size. From a purely functional perspective everything looks peachy - however, CPU utilisation is hovering around 25-30% (without any clients accessing data). What is even more concerning is the network throughput: the pod is generating ~1GB/h of traffic in idle mode. On the Nginx side I have enabled `autoindex` on the mounted directory [1]. From looking at the s3fs logs, it looks like there is a huge initial sync going on which causes the commotion - I set the log level to info and the stdout is on 🔥 🔥 🔥

So the question now is: is this normal behaviour? Or can I tweak it somehow? I tried turning off caching, without success. Is there some kind of `lazy loading` mode for s3fs? Performance/latency on my client side is not really a concern.
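As a sanity check on the ~1GB/h figure: if something (Nginx `autoindex`, `updatedb`, or a misbehaving mount) is repeatedly listing a very large bucket, the listing responses alone can plausibly account for that traffic. A rough, purely illustrative estimate (the per-key response size and object count below are assumptions, not measurements):

```python
# Back-of-the-envelope estimate of S3 listing traffic.
# Assumptions (illustrative only): each ListObjectsV2 response carries at most
# 1000 keys, and each key contributes roughly 512 bytes of XML to the response.

def listing_traffic_bytes(num_objects, bytes_per_key=512):
    """Approximate bytes transferred to list every object once."""
    return num_objects * bytes_per_key

def requests_per_pass(num_objects, keys_per_request=1000):
    """Number of ListObjectsV2 calls needed for one full listing pass."""
    return -(-num_objects // keys_per_request)  # ceiling division

# A bucket with 2 million objects, fully re-listed once per hour:
traffic = listing_traffic_bytes(2_000_000)  # ~1 GB per pass
requests = requests_per_pass(2_000_000)     # 2000 list calls per pass
```

So a single full listing pass per hour over a bucket of a couple million objects is already in the observed range, before any object data is fetched.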

Version of s3fs being used (s3fs --version)

1.87

Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse, dpkg -s fuse)

2.9.9-3

Kernel information (uname -r)

5.4.0-5-cloud-amd64

GNU/Linux Distribution, if applicable (cat /etc/os-release)

20.10 (Groovy Gorilla) (Ubuntu:rolling base image)

s3fs command line used, if applicable

```
s3fs my-bucket /mnt -f -o url=https://s3.eu-west-1.amazonaws.com,allow_other,max_stat_cache_size=1000,stat_cache_expire=900,retries=5,connect_timeout=10,uid=101,gid=101,dbglevel=info
```

[0] https://devops-bordeaux.net/blog/mounting-s3-bucket-in-kubernetes-pod/
[1] https://nginx.org/en/docs/http/ngx_http_autoindex_module.html
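For context, `autoindex` [1] is enabled per location; a minimal, hypothetical server block for serving the mount would look roughly like this. The point is that every request for a directory makes Nginx read and stat the directory's entries, which on a FUSE mount translates into S3 list/head requests:

```
server {
    listen 80;
    location / {
        root /var/www;   # the s3fs-backed mount shared into the container
        autoindex on;    # generate a directory listing per request
    }
}
```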

kerem closed this issue 2026-03-04 01:48:49 +03:00

@gaul commented on GitHub (Dec 22, 2020):

s3fs should not have any CPU or network usage at idle. I don't know about nginx autoindex but if this is indexing all your data then this would explain the behavior. See the related updatedb FAQ: https://github.com/s3fs-fuse/s3fs-fuse/wiki/FAQ#q-why-does-my-cpu-spike-at-certain-times-of-the-day .
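For the `updatedb` case the FAQ describes, the usual mitigation (assuming a distro that uses mlocate's `/etc/updatedb.conf`) is to prune the FUSE filesystem so the nightly index never crawls the mount:

```
# /etc/updatedb.conf -- keep the nightly locate index off the s3fs mount
# (append to the existing values; /mnt matches the mount point used above)
PRUNEFS="fuse.s3fs"
PRUNEPATHS="/mnt"
```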


@afritzler commented on GitHub (Dec 22, 2020):

Thanks for looking into this @gaul. After digging into the issue myself, I think I found the problem:

It looks like the combination of `Bidirectional` and `HostToContainer` mount propagation leads to the increased load:

```
S3FusePod:
          volumeMounts:
            - name: s3-shared
              mountPath: /mnt
              mountPropagation: Bidirectional
```

and

```
Nginx:
          volumeMounts:
            - name: s3-shared
              mountPath: /var/www/
              mountPropagation: HostToContainer
```

The solution that worked in my case was deploying the FUSE driver via a DaemonSet, doing a `hostPath` mount, and sharing that volume with the Nginx pod.

An example of this can be found here:
https://github.com/freegroup/kube-s3/blob/master/yaml/daemonset.yaml
https://github.com/freegroup/kube-s3/blob/master/yaml/example_pod.yaml
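A condensed sketch of that pattern (names, image, and paths are illustrative; see the linked kube-s3 manifests for a complete, working example):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: s3fs
spec:
  selector:
    matchLabels:
      app: s3fs
  template:
    metadata:
      labels:
        app: s3fs
    spec:
      containers:
        - name: s3fs
          image: my-s3fs-image        # hypothetical s3fs-fuse image
          securityContext:
            privileged: true          # needed to perform the FUSE mount
          volumeMounts:
            - name: mnt-s3
              mountPath: /mnt/s3      # s3fs mounts the bucket here
              mountPropagation: Bidirectional
      volumes:
        - name: mnt-s3
          hostPath:
            path: /mnt/s3             # exposed to other pods via hostPath
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx
      volumeMounts:
        - name: mnt-s3
          mountPath: /var/www
          mountPropagation: HostToContainer  # only receives the mount
  volumes:
    - name: mnt-s3
      hostPath:
        path: /mnt/s3
```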

Now s3fs only uses `eth0` when there is actually a request from the Nginx side.

Closing and thanks for your support!
