[GH-ISSUE #1483] Memory leak - s3fs using over 1GB memory #780

Open
opened 2026-03-04 01:48:42 +03:00 by kerem · 22 comments

Originally created by @francoisfreitag on GitHub (Nov 25, 2020).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1483

Thank you for the great product. It’s been very reliable and greatly simplifies the setup for our use case.

The issue was noticed because the OOM-killer selects the s3fs process when the machine runs out of memory. The machine only serves as an SFTP server, where uploaded files are saved to an s3fs mount. It should not run out of memory.

Clients upload several hundred thousand files nightly. Monitoring the machine every day, the s3fs process consumes more and more memory until it exhausts the machine's available RAM (about 2 GB).

Additional Information

Version of s3fs

Amazon Simple Storage Service File System V1.87 (commit:unknown) with OpenSSL

Version of fuse

2.9.2-11.el7

Kernel information (uname -r)

3.10.0-1160.2.2.el7.x86_64

GNU/Linux Distribution, if applicable (cat /etc/os-release)

CentOS 7

s3fs command line used, if applicable

s3fs <BUCKET_NAME> /mnt/s3fs -o _netdev,allow_other,noexec

Details about issue

Possibly related issues #340, #725.

Steps to reproduce

I created a Vagrantfile to precisely describe the setup and help reproduce the issue. The issue can be reproduced without it by following the steps in the config.vm.provision section.

  1. Generate many small files. Example script below.
  2. Copy the Vagrantfile to an empty directory
  3. Replace parameters between brackets with your info (<AWS_ACCESS_KEY_ID>, <AWS_SECRET_ACCESS_KEY>, <BUCKET_NAME>)
  4. vagrant up
  5. Note the virtual machine IP address at the end of the vagrant output
  6. sftp testsftp@<MACHINE_ADDRESS>; the password is password. A more capable SFTP client can be used for faster transfers.
  7. put -R data
  8. Disconnect from the SFTP server (to show that the memory is never reclaimed)

Monitor the s3fs process as files are transferred (e.g. ps -eo pid,comm,rss) and notice that it keeps growing without bound. At the end of the put -R data, the s3fs process uses several hundred MB.
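To quantify the growth over time, a minimal monitoring sketch (assuming GNU/Linux procps tools and a single s3fs process; the log file name is arbitrary) is:

```shell
# Sample the s3fs RSS (in KB) once a minute and append "timestamp rss_kb"
# lines to a log, so the growth rate can be plotted later.
while pid=$(pidof s3fs); do
    ps -o rss= -p "$pid" | awk -v ts="$(date +%s)" '{print ts, $1}' >> s3fs-rss.log
    sleep 60
done
```

The loop exits on its own once the s3fs process goes away, since pidof then returns non-zero.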

Script to generate many files

mkdir data
echo "Test file with content." > data/test.txt
for i in $(seq 100000); do cp data/test.txt "data/test_$i.txt"; done

Vagrantfile

Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"
  config.vm.network "private_network", type: "dhcp"
  config.vm.synced_folder ".", "/vagrant", disabled: true

  config.vm.provision "shell", inline: <<-SHELL
    yum install -y epel-release
    yum install -y s3fs-fuse
    setsebool use_fusefs_home_dirs on
    echo "<AWS_ACCESS_KEY_ID>:<AWS_SECRET_ACCESS_KEY>" > /etc/passwd-s3fs
    chmod 600 /etc/passwd-s3fs
    mkdir /mnt/s3fs
    useradd --password $(openssl passwd password) --home-dir /mnt/s3fs testsftp --no-create-home
    s3fs <BUCKET_NAME> /mnt/s3fs -o _netdev,allow_other,noexec
    sed -ri 's/^PasswordAuthentication no$/PasswordAuthentication yes/' /etc/ssh/sshd_config
    systemctl reload sshd
    echo "##### Virtual machine IP address #####"
    ip addr show dev eth1
  SHELL
end

@ma331 commented on GitHub (Mar 26, 2021):

I have the exact same problem, but with a twist.
The same bucket is mounted twice (with the same binary and the exact same options), and only one s3fs process leaks memory; the other doesn't. The one that leaks moves a lot of files within S3 via SFTP; on the other mount, files are only uploaded.

The difference: one process has 67 MB RSS, the other 4 GB.

The version used is v1.89, built from git on Mar 13.


@gaul commented on GitHub (Apr 24, 2021):

Could you share which SSL library you use, e.g., OpenSSL, GnuTLS? Apparently the latter has a memory leak: curl/curl#5102
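For anyone else checking this, the TLS backend can usually be identified from the libcurl build string and the libraries s3fs links against (a sketch; output formats vary by distribution):

```shell
# The libcurl build string names the TLS library (e.g. OpenSSL, GnuTLS).
curl --version | head -n1
# Also list the TLS-related libraries the s3fs binary is linked against.
ldd "$(command -v s3fs)" | grep -Ei 'ssl|tls|crypto'
```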


@francoisfreitag commented on GitHub (Apr 24, 2021):

I’m using OpenSSL.

Using the Vagrantfile above, the versions are:

openssl.x86_64                                          1:1.0.2k-19.el7                                 @anaconda
openssl-libs.x86_64                                     1:1.0.2k-19.el7                                 @anaconda

After running a system upgrade in the VM, they get upgraded to:

openssl.x86_64                                           1:1.0.2k-21.el7_9                                      @updates
openssl-libs.x86_64                                      1:1.0.2k-21.el7_9                                      @updates

I can double-check that the steps to reproduce are still valid, if that helps.


@gaul commented on GitHub (Apr 25, 2021):

It would be great if you can reproduce these symptoms. Since we cannot reproduce this locally, using a leak-checking tool like Valgrind's memcheck or massif would help.


@ma331 commented on GitHub (Apr 26, 2021):

I'm using GnuTLS here. I'll recompile with OpenSSL and check whether the memory usage changes.

libcurl-gnutls.so.4 => /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4 (0x00007f2f657a4000)
libgnutls.so.30 => /usr/lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007f2f6497a000)


@francoisfreitag commented on GitHub (Apr 28, 2021):

Here are the steps I took:

  1. Create the test data
  2. Create the VM from CentOS 7
  3. Run system upgrades and reboot
  4. setsebool use_fusefs_home_dirs on
  5. echo "<AWS_ACCESS_KEY_ID>:<AWS_SECRET_ACCESS_KEY>" > /etc/passwd-s3fs
  6. chmod 600 /etc/passwd-s3fs
  7. mkdir /mnt/s3fs
  8. useradd --password $(openssl passwd password) --home-dir /mnt/s3fs testsftp --no-create-home
  9. sed -ri 's/^PasswordAuthentication no$/PasswordAuthentication yes/' /etc/ssh/sshd_config
  10. systemctl reload sshd
  11. Clone, compile and install s3fs from GitHub master following the compilation instructions
  12. sudo -i
  13. valgrind --leak-check=full --show-leak-kinds=all /usr/local/bin/s3fs -f -s test-bucket /mnt/s3fs -o _netdev,allow_other,noexec

Measure the RSS with ps -eo pid,comm,rss: 107376 (KB).

  1. sftp vagrant@<VM_IP_ADDRESS>
  2. put -R data
  3. ... let it run for about 2 days ...
  4. disconnect from the SFTP server

Measure the RSS with ps -eo pid,comm,rss: 144072 (KB).

The memory keeps growing. It will keep growing until it fills the available RAM, but that filling is very slow.

Here’s the output from valgrind.
valgrind.txt.gz


@gaul commented on GitHub (Apr 29, 2021):

Thanks for testing this! Valgrind says that there is no leak from its perspective which helps narrow this down. s3fs must be holding onto some memory unintentionally. Could you run this again with valgrind --tool=massif which should show the major memory consumers at different points in time. Performance should be somewhat better than with the default memcheck. I expect that one of the caches is growing without bound.


@francoisfreitag commented on GitHub (Apr 29, 2021):

Thanks for getting back to me, good to know this is helpful!
I’ll run the experiment with massif and report after the 100k files are uploaded.


@francoisfreitag commented on GitHub (Apr 30, 2021):

Here’s the file generated by massif: massif.out.2053.gz


@gaul commented on GitHub (May 9, 2021):

$ zcat massif.out.2053.gz | grep mem_heap_B | cut -f2 -d= | sort -n | tail -1
5752949

This suggests that s3fs allocated at most about 5 MB of heap memory? Can you share the output from pmap $(pidof s3fs)? I wonder if there is some other leak, like pthreads or something.
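As a rough cross-check against ps, the per-mapping RSS reported by pmap can be summed (a sketch assuming the Linux procps pmap -x layout, with RSS in the third column; the "total" line is skipped to avoid double-counting):

```shell
# Sum the RSS column of all mappings and print the total in KB.
pmap -x "$(pidof s3fs)" | awk '$1 != "total" && $3 ~ /^[0-9]+$/ {sum += $3} END {print sum " KB"}'
```

Comparing this total with the ps RSS helps tell heap growth apart from growth in thread stacks or other mappings.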


@francoisfreitag commented on GitHub (May 14, 2021):

Reran the experiment, here’s the output from pmap $(pidof s3fs): pmap.out.gz

At the measurement time, the RSS size of s3fs is 20,920 (approx 20 MB).


@francoisfreitag commented on GitHub (May 18, 2021):

I keep uploading files, and the memory keeps growing slowly. The RSS size of s3fs is now 22,260. Here’s the pmap output: pmap.out.2.gz


@qianyi-sourse commented on GitHub (Sep 6, 2023):

Thank you for the great product. It’s been very reliable and greatly simplifies the setup for our use case.
I have met a similar problem. In the mount directory, I created three directories and uploaded 100,000 objects into each. When I open these directories, the memory grows very quickly; ultimately, it exceeds 1 GB.

s3fs command used:
s3fs test-list /nfs/test5 -o passwd_file=/root/.passwd-s3fs -o url=http://xxx -o use_path_request_style -o noxmlns -o dbglevel=error -o default_acl=public-read -o logfile=/var/log/s3fs.log -o allow_other -o multireq_max=500 -o nocopyapi -o use_cache=/es-SP0-0/nfs/ -o del_cache -o parallel_count=500 -o multipart_size=52


@gaul commented on GitHub (Sep 8, 2023):

-o parallel_count=500 is not a good idea -- this means s3fs will use 500 threads. Can you try a smaller value? Similarly -o multireq_max=500 seems excessively large.


@qianyi-sourse commented on GitHub (Sep 8, 2023):

I have tried -o parallel_count=100 and -o multireq_max=100, but it did not help. Memory usage has not decreased.


@gaul commented on GitHub (Sep 8, 2023):

These are still unusually large values -- why aren't you using the defaults? Each thread will use a few MB of memory.


@qianyi-sourse commented on GitHub (Sep 11, 2023):

@gaul Thanks for your reply. I have also tried -o parallel_count=5 and -o multireq_max=5, but it did not help. Afterwards, I also tried updating to the latest code, but that didn't help either.


@gaul commented on GitHub (Sep 12, 2023):

What does "didn't work" mean? Did you have the same memory leak?


@qianyi-sourse commented on GitHub (Sep 13, 2023):

Yes. Changing parallel_count and multireq_max did not reduce memory usage. Opening a directory that has 100,000 files eventually brings memory usage to around 400 MB. Opening another directory makes it increase further.
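If the growth tracks the metadata (stat) cache when listing large directories, capping it may be worth trying. The option names below are from the s3fs man page; the values are arbitrary examples for experimentation, not recommendations:

```shell
# Cap the stat cache at 20,000 entries and expire entries after 5 minutes,
# so listing very large directories cannot grow the cache without bound.
s3fs <BUCKET_NAME> /mnt/s3fs -o max_stat_cache_size=20000 -o stat_cache_expire=300
```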


@qianyi-sourse commented on GitHub (Sep 14, 2023):

I checked the content in the memory through GDB and found that most of the data stored in the memory is HTTP response data.


@morkeleb commented on GitHub (Nov 30, 2023):

We're experiencing the same issue. When we have folders with many files, the memory runs out.


@pianetarosso commented on GitHub (Jan 12, 2024):

Same error. We are mounting 5 different buckets, and everything works for the "smaller" ones, but when I try to list the files in the biggest one (54K files) we get this error:

Out of memory: Killed process 49060 (s3fs) total-vm:851644kB, anon-rss:187516kB, file-rss:5120kB, shmem-rss:0kB, UID:0 pgtables:572kB oom_score_adj:0
