[GH-ISSUE #873] Can the disk cache be disabled? #508

Open
opened 2026-03-04 01:46:13 +03:00 by kerem · 7 comments

Originally created by @lnicola on GitHub (Dec 13, 2018).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/873

Additional Information

Version of s3fs being used (s3fs --version)

Amazon Simple Storage Service File System V1.84(commit:unknown) with OpenSSL

Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse, dpkg -s fuse)

Version : 2.9.2

Kernel information (uname -r)

3.10.0-862.11.6.el7.x86_64

GNU/Linux Distribution, if applicable (cat /etc/os-release)

CentOS 7

/etc/fstab entry, if applicable

s3fs#bucket /C02 fuse passwd_file=/etc/.c02_pass_s3fs,_netdev,allow_other,uid=0,umask=0000,mp_umask=0000,gid=0,url=http://obs.eu-de.otc.t-systems.com 0 0

s3fs syslog messages (grep s3fs /var/log/syslog, journalctl | grep s3fs, or s3fs outputs)

If you execute s3fs with the dbglevel and curldbg options, you can get detailed debug messages.

fdcache.cpp:Read(1622): could not reserve disk space for pre-fetch download

Details about issue

We have a workload that consists of reading around 100-300 large files (100-900 MB each) in parallel. The files themselves are split into smaller chunks, each read sequentially. The issue is that s3fs keeps filling the root partition (/tmp) with its fdcache. Note that the manual mentions:

       -o use_cache (default="" which means disabled)
              local folder to use for local file cache.

This is a bit misleading, since the default behaviour seems to be to save the files in /tmp. I took a quick look over the code and it looks like the cache is always used. I can't say for certain that s3fs caches more data than it should, but it seems like a possibility. I've certainly seen it use 200-300 GB and then start returning read errors because it ran out of disk space, so I'm hoping it could be made to store less data. With goofys we didn't have this particular issue, because it doesn't cache data to disk by default.
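Until the cache can be disabled outright, one workaround is to bound it rather than eliminate it. A sketch of an /etc/fstab entry (assumptions: the cache directory /var/cache/s3fs is illustrative; ensure_diskfree takes a value in MB and del_cache only clears stale cache files at mount/unmount, so this caps the on-disk cache rather than disabling it):

```
s3fs#bucket /C02 fuse _netdev,allow_other,passwd_file=/etc/.c02_pass_s3fs,use_cache=/var/cache/s3fs,ensure_diskfree=10240,del_cache,url=http://obs.eu-de.otc.t-systems.com 0 0
```

Pointing use_cache at a dedicated partition at least keeps the root filesystem from filling up, even if s3fs still consumes the whole cache partition.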


@ffeldhaus commented on GitHub (Feb 1, 2019):

As far as I understand it, this could finally be implemented using the new copy_file_range function in libfuse 3.4.1. For details see https://github.com/libfuse/libfuse/pull/259


@yunyuyuan commented on GitHub (Jun 9, 2023):

Same issue. I have tried many options: removing use_cache, setting use_cache="", setting max_stat_cache_size=0. None of them works; the cache still grows until my hard drive is full and the download program (qBittorrent) gets interrupted.

![image](https://github.com/s3fs-fuse/s3fs-fuse/assets/45785585/98a39d06-5025-43bb-a18a-061b90d4762f)


@ggtakec commented on GitHub (Jun 10, 2023):

@yunyuyuan Thanks for your comment.
It seems to be the same as issue #2156.
(Are you using CentOS7 as well?)
If the conditions are the same, I would like to consolidate to #2156.


@yunyuyuan commented on GitHub (Jun 10, 2023):

@ggtakec Thanks for the reply. I'm using Arch Linux.
The problem I encountered is likely the cache filling up my hard drive. My hard drive is 10 GB, and when I try to download a 12 GB file, the drive reaches 100% capacity during the download and I am unable to complete it.
This doesn't seem to be related to an OOM issue. My cache files are not stored in tmpfs, but rather in /var/s3fs_cache.
So I'm wondering if it's possible to bypass the cache and write files directly to S3. I'm not sure if that can be achieved.


@tanguofu commented on GitHub (Jun 11, 2023):

Could you try the ensure_diskfree option?


@lnicola commented on GitHub (Jun 12, 2023):

I didn't try ensure_diskfree yet, but in my case, s3fs was filling up the disk, then crashing because it couldn't write any more. So I suspect that if it respects the limit, it will crash faster now.

And yeah, this has no relation to #2156.


@alfiedotwtf commented on GitHub (Mar 21, 2025):

I just tried to mount on macOS, but it failed saying the cache is full (claiming it needs a whopping 40 GB of cache?!), even though I only wanted to copy over a few text files, so actually being able to disable the cache would be nice.
