[GH-ISSUE #980] Out of memory s3fs since yesterday on new server using latest version s3fs-fuse #545

Closed
opened 2026-03-04 01:46:35 +03:00 by kerem · 22 comments
Originally created by @nondualit on GitHub (Mar 12, 2019).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/980

Version of s3fs being used (s3fs --version)

s3fs --version
Amazon Simple Storage Service File System V1.85(commit:99ec09f) with OpenSSL
Copyright (C) 2010 Randy Rizun rrizun@gmail.com
License GPL2: GNU GPL version 2 https://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse, dpkg -s fuse)

example: 2.9.4

Kernel information (uname -r)

4.15.0-1033-aws (kernel)

GNU/Linux Distribution, if applicable (cat /etc/os-release)

DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.2 LTS"

s3fs command line used, if applicable

s3fs nxxxxx-sql-dxxxxx -o use_cache=/tmp -o allow_other -o uid=1001 -o mp_umask=002 -o multireq_max=5 /mnt/sxxxxx/

/etc/fstab entry, if applicable

no fstab

s3fs syslog messages (grep s3fs /var/log/syslog, journalctl | grep s3fs, or s3fs outputs)

If you run s3fs with the dbglevel and curldbg options, you can get detailed debug messages.

Details about issue

I'm seeing a memory leak. I just downloaded the newest version, although the previous one I had was only from last week. Is there a problem again, or a dependency issue with this new Ubuntu version?

Error:
Mar 12 09:51:09 nondualit_aws kernel: [249822.806272] Out of memory: Kill process 26871 (s3fs) score 284 or sacrifice child
Mar 12 09:51:09 nondualit_aws kernel: [249822.822296] Killed process 26871 (s3fs) total-vm:831084kB, anon-rss:294936kB, file-rss:0kB, shmem-rss:0kB
Mar 12 09:52:42 nondualit_aws kernel: [ 0.000000] Linux version 4.15.0-1033-aws (buildd@lcy01-amd64-019) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #35-Ubuntu SMP Wed Feb 6 13:29:46 UTC 2019 (Ubuntu 4.15.0-1033.35-aws 4.15.18)
Mar 12 09:52:42 nondualit_aws kernel: [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.15.0-1033-aws root=UUID=fc9a41df-6d71-4f4f-a487-e5999bd67182 ro console=tty1 console=ttyS0 nvme.io_timeout=4294967295
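
The "Killed process" line in the log above carries the numbers that matter when comparing OOM events: the PID, the process name, and the memory sizes in kB. A minimal sketch for pulling them out of the kernel log, assuming the 4.15-era message format shown above; the function name `oom_summary` is hypothetical and not part of s3fs:

```shell
# Hypothetical helper (not part of s3fs): summarize "Killed process" lines
# from the kernel log. Reads log text on stdin, prints one line per kill.
oom_summary() {
  sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)) total-vm:\([0-9]*\)kB, anon-rss:\([0-9]*\)kB.*/\1 \2 \3 \4/p' |
  while read -r pid name vm rss; do
    # Kernel reports kB; integer-divide to MiB for readability.
    echo "pid=$pid name=$name total_vm_mib=$((vm / 1024)) anon_rss_mib=$((rss / 1024))"
  done
}

# Usage, assuming the kernel log lands in /var/log/syslog as on Ubuntu:
#   grep 'Killed process' /var/log/syslog | oom_summary
```

For the kill reported above this prints `pid=26871 name=s3fs total_vm_mib=811 anon_rss_mib=288`.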


@gaul commented on GitHub (Mar 15, 2019):

Can you share the steps to reproduce these symptoms? We have had many reports of out-of-memory over the years but cannot track down the cause. Using Valgrind massif or similar might help determine the root cause.


@nondualit commented on GitHub (Mar 16, 2019):

Actually, I don't have to do anything, so I can't give reproduction steps. If I mount the S3 bucket, my server runs out of memory in around 6 hours. I had to disable the mount since I really can't have this issue on my server now. I will give Valgrind massif a try when I have the time. For now it is better left off; it is too unsafe to use this package.
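
One way to confirm the slow growth described here, before reaching for Valgrind, is to log the resident set size of the s3fs process at intervals. A minimal sketch, assuming a Linux /proc filesystem; the helper name `vmrss_kib` and the 5-minute interval are illustrative choices, not anything s3fs provides:

```shell
# Hypothetical helper: extract the VmRSS value (in kB) from the text of
# /proc/<pid>/status, read on stdin.
vmrss_kib() {
  awk '/^VmRSS:/ { print $2 }'
}

# Usage sketch: log a timestamped RSS sample for every s3fs process
# every 5 minutes (interval chosen arbitrarily).
#   while sleep 300; do
#     for pid in $(pidof s3fs); do
#       echo "$(date '+%F %T') pid=$pid rss_kib=$(vmrss_kib < "/proc/$pid/status")"
#     done
#   done
```

A steadily rising `rss_kib` on an otherwise idle mount would match the symptom reported in this comment.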


@junkert commented on GitHub (Mar 19, 2019):

We are seeing the same issues on our very active SFTP system. I'll look into getting some Massif outputs here once I get it set up. This is happening on a daily basis for us, so it shouldn't take too long.

@gaul which flags should I add to the collection using Valgrind Massif? I have it running now with no flags, but want to make sure I get you as much information as you need. Here is how I am currently snagging the massif analysis file:

sudo valgrind --tool=massif s3fs ***** /home/shared/s3/***/*** -o rw,allow_other,uid=***,gid=***,iam_role=***,use_sse,url=https://s3-us-west-2.amazonaws.com,dev,suid
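
For anyone repeating this collection, a minimal sketch of a wrapper around the same invocation. The `--massif-out-file` option and its `%p` (PID) specifier come from the Valgrind documentation, and `-f` is the FUSE foreground flag; neither was requested by the maintainers in this thread. The function names, bucket, mount point, and output directory are placeholders:

```shell
# Sketch only: bucket, mount point and output directory are placeholders.
# -f keeps s3fs in the foreground so the profiled process does not daemonize.
run_under_massif() {
  bucket=$1; mountpoint=$2; outdir=$3
  shift 3
  valgrind --tool=massif \
           --massif-out-file="$outdir/massif.out.%p" \
           s3fs "$bucket" "$mountpoint" -f "$@"
}

# Print the most recently modified massif.out.* file in a directory
# (prints nothing if there is none).
latest_massif_file() {
  ls -t "$1"/massif.out.* 2>/dev/null | head -n 1
}

# Massif writes its file when the process exits, so unmount first, then
# render the report with ms_print (shipped with Valgrind):
#   ms_print "$(latest_massif_file /tmp/massif)" | less
```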

@gaul commented on GitHub (Mar 20, 2019):

@junkert I believe this will suffice. It would also be helpful to know if you use a distro package or if/how you compile it yourself.


@junkert commented on GitHub (Mar 20, 2019):

We built from source on GitHub, from the master branch, which is currently at the v1.85 release.

:~/s3fs-fuse$ git branch
* master
:~/s3fs-fuse$

We built the project with the openssl flag

./configure --prefix=/usr --with-openssl

I have valgrind --tool=massif running now on one of the s3fs processes and should have more data for you by EOD.


@junkert commented on GitHub (Mar 21, 2019):

@gaul I have some good data for you, but I need a coworker to verify that all sensitive data has been obfuscated and everything looks good before posting the outputs here. I'll try to get something to you tomorrow AM PST some time.

From the looks of it, libtasn1.so.6.5.1 (we are on Ubuntu 16.04) accounts for around 96% of the total memory usage. However, it seems to plateau after an hour or so (we tested for 7 hours today). We have over 20 s3fs mounts on this host and it is happening on all of them, but the most active mounts seem to be the ones affected most. This causes the OOM killer to randomly start killing processes (we currently allocate about 8GB of RAM for our VM, with 32GB of swap as well, and reboot daily to reset memory). This is not sustainable for us, since it will eventually become a scaling issue.

Our s3fs cmd will be included in the outputs with sensitive pieces removed.


@junkert commented on GitHub (Mar 21, 2019):

@gaul here is the output from Massif https://gist.github.com/junkert/0fdb401eb3d7d77b5c84d936ec7632fb


@tisi1988 commented on GitHub (Apr 5, 2019):

Hi!

We are facing the same problem here. With v1.85 we see huge memory usage that makes our machine hang.

With v1.84 we didn't face this issue but the CPU usage was constantly at ~50% of one core.

With v1.85 the CPU usage seems solved but this memory issue makes this version unusable.


@ggtakec commented on GitHub (Apr 7, 2019):

@junkert Thank you for the memory leak data.
I looked at your file and noticed something.
You built s3fs with OpenSSL, but it seems that the libcurl that s3fs uses is the GnuTLS version.

curl --version

Please try this command. (With the OpenSSL version of libcurl, the output includes the OpenSSL version.)

You should try building s3fs with GnuTLS (--with-gnutls) or using the OpenSSL version of libcurl.
Thanks in advance for your assistance.
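
The check suggested in this comment can be scripted. A minimal sketch, assuming both `curl --version` and `s3fs --version` name their TLS library in the banner (as the `with OpenSSL` banner at the top of this issue does); the function name `tls_backend` is hypothetical, and only the two libraries discussed in this thread are recognized:

```shell
# Hypothetical helper: report which TLS library a version banner mentions.
# Reads text on stdin, e.g. the output of `curl --version` or `s3fs --version`.
tls_backend() {
  banner=$(cat)
  case $banner in
    *OpenSSL*) echo OpenSSL ;;
    *GnuTLS*)  echo GnuTLS ;;
    *)         echo unknown ;;
  esac
}

# Compare the two; a mismatch is the situation described in the comment above.
#   curl_tls=$(curl --version | tls_backend)
#   s3fs_tls=$(s3fs --version | tls_backend)
#   [ "$curl_tls" = "$s3fs_tls" ] || echo "mismatch: libcurl=$curl_tls s3fs=$s3fs_tls"
```

Note that the `curl` binary may link a different libcurl than s3fs does; `ldd "$(command -v s3fs)" | grep curl` shows which libcurl s3fs actually loads.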


@zhou-hongyu commented on GitHub (May 1, 2019):

We are facing the same problem here, can anyone help?


@zhou-hongyu commented on GitHub (May 1, 2019):

Or is there any chance that one of the older versions doesn't have the memory leak issue, like 1.8.0?


@nondualit commented on GitHub (May 1, 2019):

I stopped using this software, too buggy, and started using the AWS CLI to communicate with the buckets.


@zhou-hongyu commented on GitHub (May 1, 2019):

lmao @nondualit, hey man, would you mind providing any details on how you use the AWS CLI to replace it?


@junkert commented on GitHub (May 1, 2019):

@nevermore2014 We are currently moving to AWS Transfer for SFTP as our permanent solution, since we can rely on AWS to handle the scaling side. The login solution is quite cumbersome right now since they only support public-key-based authentication, but they have instructions on how to build an identity provider (IP). I am hoping to have a blog article soon on how to build an IP in Go, Lambda, and DynamoDB. Maybe it will include code, but we'll see.

@nondualit bugs happen, the only way to make open software better is to contribute and help where you can. Start with bug reports, and grabbing data for the developers of the open source projects. Your contributions will help everyone that uses the software as a whole.

@ggtakec I'll give compiling libcurl with openssl support a shot and grab some more Massif data.


@zhou-hongyu commented on GitHub (May 1, 2019):

@junkert Thanks so much for your reply. When do you expect this to be done?


@junkert commented on GitHub (May 1, 2019):

@nevermore2014 Hopefully end of May. Will post back here when complete.


@zhou-hongyu commented on GitHub (May 2, 2019):

![image](https://user-images.githubusercontent.com/6007569/57078046-79396700-6cbb-11e9-8e30-3d281d7231c9.png)

For those of you who have run into the same problem, I suggest using version 1.84 as a stopgap for now. v1.84 only consumes up to 6 GB of memory, so as long as your instance has more than 16 GB of memory it won't cause immediate downtime.


@ggtakec commented on GitHub (May 4, 2019):

@nevermore2014
Could you tell us about your environment (OS), your s3fs version (s3fs --version), and your libcurl version (curl --version)?

I want to know the results of @junkert's test, but first we want to know whether your environment is the same as his.
I'm interested in how this problem is related to GnuTLS and Ubuntu.
Since version 1.85 was modified to keep the SSL session, I wonder whether that change could be having a bad effect.

Thanks in advance for your assistance.


@zhou-hongyu commented on GitHub (May 6, 2019):

@ggtakec
sure. It's
Amazon Linux AMI
Amazon Simple Storage Service File System V1.84(commit:unknown) with OpenSSL
curl 7.61.1 (x86_64-redhat-linux-gnu) libcurl/7.61.1 OpenSSL/1.0.2k zlib/1.2.8 libidn2/0.16 libpsl/0.6.2 (+libicu/50.1.2) libssh2/1.4.2 nghttp2/1.21.1


@gaul commented on GitHub (Feb 3, 2020):

@nevermore2014 Could you test with the latest version 1.85?


@johnboker commented on GitHub (May 14, 2020):

Has there been any progress on this? I'm seeing this issue as well.


@gaul commented on GitHub (Jul 26, 2020):

Closing due to inactivity. Please retest with the latest 1.86 or master and reopen if symptoms persist.
