[GH-ISSUE #2507] If there are two S3 clients deployed on different machines, and client 1 writes a file, how can client 2 read the file written by client 1 in a timely manner? #1222

Open
opened 2026-03-04 01:52:22 +03:00 by kerem · 3 comments
Originally created by @loongyiyao on GitHub (Aug 1, 2024).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2507

Additional Information

- Version of s3fs being used (`s3fs --version`)
- Version of fuse being used (`pkg-config --modversion fuse`, `rpm -qi fuse` or `dpkg -s fuse`)
- Kernel information (`uname -r`)
- GNU/Linux Distribution, if applicable (`cat /etc/os-release`)
- How to run s3fs, if applicable:

```
/usr/bin/s3fs fangzhenyun /dev/mount1 -o url=https://video.ge.com:9000/ -o endpoint=cn-east-1 -o sigv2 -o passwd_file=/home/zxsrtn/.passwd-s3fs -o use_path_request_style -o allow_other -o umask=0 -o use_cache=/dev/cache1 -o del_cache -o ensure_diskfree=12288 -o enable_noobj_cache -o parallel_count=20 -o multipart_size=52 -o dbglevel=warn -o logfile=/home/output.log
```


@gaul commented on GitHub (Aug 6, 2024):

`-o stat_cache_expire` (default: 900 seconds) controls the metadata cache. @ggtakec does the data cache have any kind of staleness check?
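For readers hitting the same symptom on their own mounts, a mount invocation with a shorter metadata TTL might look like the sketch below. The bucket name, mount point, endpoint URL, and the 30-second value are illustrative placeholders, not values from this issue.

```shell
# Sketch: lower stat_cache_expire so metadata written by another
# client becomes visible within ~30 s instead of the 900 s default.
# Bucket, mount point, and URL below are placeholders.
s3fs mybucket /mnt/s3 \
  -o passwd_file=${HOME}/.passwd-s3fs \
  -o url=https://s3.example.com/ \
  -o use_path_request_style \
  -o stat_cache_expire=30
```

Note that a lower TTL trades request volume for freshness: s3fs will issue HEAD requests against S3 more often.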


@ggtakec commented on GitHub (Aug 11, 2024):

@loongyiyao I'm sorry for my late reply.

s3fs's stat cache holds meta information for files and directories (objects); in other words, it caches the stat information for the files themselves.
Separately from that, s3fs always re-checks the list of files and subdirectories in a directory.

So, if you create a new file, the file list in the directory is updated, and running a command that scans the directory (e.g. `ls <dir path>`) lets s3fs find the new file.
If you change (update) an existing file rather than creating a new one, the stat cache applies, so s3fs normally cannot notice the change until the cache entry times out.
(However, if you operate directly on the updated file, such as by opening it, the stat information is re-read, so s3fs can detect that it has been updated.)

Note that this behavior applies to files created through s3fs; the behavior differs slightly for files (objects) created with other clients such as the AWS CLI.
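As a concrete illustration of the two cases described above (a sketch only; the mount path and file names are hypothetical):

```shell
# Case 1: a file newly CREATED by another client.
# Listing the directory makes s3fs re-read the object list,
# so the new file shows up:
ls /mnt/s3/shared/

# Case 2: a file UPDATED by another client.
# Its cached stat may look stale until stat_cache_expire elapses,
# but operating on the file directly (e.g. opening/reading it)
# re-reads the stat information:
head -c 1 /mnt/s3/shared/updated-file.bin >/dev/null
```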


@impowerdevs commented on GitHub (Jun 11, 2025):

Hello @ggtakec ,

  1. We have many files grouped in folders stored in directories in S3 (around 200k in one directory).
  2. This means that an `ls` operation could be costly.
  3. We have some clients writing directly to S3.

Issue
After a successful PUT to AWS S3, we have to wait 20 s to 120 s until the file is available via s3fs using direct access (~`fs read (absolute/path)`).

Question
Is there a way to force s3fs to refresh access to a single file?
Or does accessing a file by its absolute path at the OS level imply scanning through the inode catalogues of all directories?
