[GH-ISSUE #1428] i have 10s of millions of small files and i've tried s3fs a few months ago, highly unstable. #751

Closed
opened 2026-03-04 01:48:28 +03:00 by kerem · 1 comment

Originally created by @gitmko0 on GitHub (Sep 29, 2020).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1428

I have tens of millions of small files, and I tried s3fs a few months ago. It was highly unstable: I was unable to recreate the mount, restarting s3fs failed, etc.

Is it possible to trust s3fs with a Docker installation now, or is riofs better? I would really like to see it work 100% as intended. Should I try it again now? It has at least been updated more frequently than riofs (last active in 2018). Also, is there a stable Docker image that we can rely on as a mission-critical filesystem?

kerem closed this issue 2026-03-04 01:48:29 +03:00

@gaul commented on GitHub (Sep 29, 2020):

If you have tens of millions of files in a single directory, then s3fs (or any filesystem) will not work well. If the files are spread across subdirectories, e.g., dirA/object1-object10000, dirB/object1-object10000, then s3fs may work for you. This is largely because s3fs issues a HEAD request for every object, although other practical limitations apply due to memory constraints.
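To illustrate the layout the comment recommends, here is a minimal sketch of hash-based directory sharding. This is not an s3fs feature; the `shard_key` helper and the shard count are hypothetical choices an application writing the objects could make so that no single directory holds millions of entries (and thus no single listing triggers millions of HEAD requests).

```python
import hashlib

def shard_key(filename: str, shards: int = 4096) -> str:
    """Map a filename to a stable subdirectory prefix.

    Hypothetical application-side layout: hash the name, take it
    modulo the shard count, and prepend a 'dirNNNN/' prefix so
    objects spread evenly across many small directories.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % shards
    return f"dir{bucket:04d}/{filename}"

# Example: 30 million files over 4096 shards is roughly 7,300
# objects per directory, so listing one directory costs thousands
# of per-object HEAD requests instead of millions.
print(shard_key("object12345.dat"))
```

The hash keeps the mapping deterministic, so a reader can recompute the same prefix to locate a file without any index.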

I cannot address your general comments about quality other than many people use s3fs and are happy with it. If you have a specific issue you can report it and we may be able to address it. Other S3 filesystems may work better for specific situations but are unlikely to handle 10s of millions of files.
