[GH-ISSUE #2144] [Question] How to disable stat cache in a proper way #1093

Open
opened 2026-03-04 01:51:20 +03:00 by kerem · 1 comment

Originally created by @creeew on GitHub (Apr 1, 2023).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2144

Additional Information

Version of s3fs being used (s3fs --version)

V1.91

Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse or dpkg -s fuse)

2.9.2

Kernel information (uname -r)

4.19.113-300.el7.x86_64

GNU/Linux Distribution, if applicable (cat /etc/os-release)

CentOS7

How to run s3fs, if applicable

s3fs test-bucket /mnt/dir -o rw,allow_other,no_check_certificate,use_path_request_style,disable_noobj_cache,url=http://x.x.x.x:9000,passwd_file=/.passwd-s3fs,dev,suid

s3fs test /mnt/test -f -o passwd_file=/etc/passwd-s3fs -o use_path_request_style -o url=http://x.x.x.x:9000 -o endpoint= -o allow_other -o no_check_certificate -o disable_noobj_cache -o curldbg -d

Details about issue

In a distributed scenario, each service uses s3fs to operate on files in the same object storage. However, s3fs has a stat cache, which can lead to inconsistencies.

e.g.
ServiceA and ServiceB run on different nodes but use the same object storage.
ServiceA:
cat an existing link_file; the result is "a"
and then
ServiceB:
modifies the content of link_file to "b"
ServiceA:
cat link_file again; it still returns "a" because of the cached state.

s3fs has two options that control the stat cache, max_stat_cache_size and stat_cache_expire. We tried setting both to 0, but then some operations such as "ls" miss some items.
Consistency is more important than performance for us. Is there any way to ensure that s3fs is not affected by the stat cache and to guarantee consistency?


@ggtakec commented on GitHub (Apr 8, 2023):

@creeew
You can disable the stat cache by setting max_stat_cache_size or stat_cache_expire to 0. (With either one set to 0, no cache entry is ever treated as valid.)
If you use one of them, max_stat_cache_size is the better choice.
As you expected, setting these options to 0 forces s3fs to always contact the server when retrieving stat information.
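
For illustration, a minimal sketch of a mount line with the stat cache disabled. It assumes the same bucket name, mount point, endpoint, and password file as the debug command in the report; only max_stat_cache_size=0 is added, and the remaining options should be adjusted to your environment. The variable name S3FS_OPTS is just a placeholder for this example.

```shell
# Assumption: bucket "test", mount point /mnt/test, and the endpoint and
# password file are taken from the command in the original report.
# max_stat_cache_size=0 is the option discussed above for disabling the stat cache.
S3FS_OPTS="passwd_file=/etc/passwd-s3fs,use_path_request_style,url=http://x.x.x.x:9000,allow_other,max_stat_cache_size=0"

# Print the resulting command for review; drop the leading "echo" to actually mount.
echo s3fs test /mnt/test -o "$S3FS_OPTS"
```

Whether this alone gives the consistency you need between nodes should be checked in your environment, since it only covers the stat cache.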

> but some operation like "ls" would miss some items.

I still don't understand what this means.
s3fs should return the same values regardless of whether the stat information is cached.
I believe this is independent of the max_stat_cache_size and stat_cache_expire options.

If any stat information is missing, it may be a bug.
Could you report the problem in more detail?

Thanks in advance for your kindness.
