[GH-ISSUE #1351] S3FS 1.86 Concurrency Limits #723

Closed
opened 2026-03-04 01:48:12 +03:00 by kerem · 4 comments
Owner

Originally created by @Conrad-T-Pino on GitHub (Aug 4, 2020).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1351

Assuming the back-end S3 server supports unlimited bandwidth and connections, what are the maximum read concurrency and write concurrency levels S3FS 1.86 can support over Internet transit links between ISP data centers?

kerem closed this issue 2026-03-04 01:48:13 +03:00

@gaul commented on GitHub (Aug 5, 2020):

Generally we expect that s3fs should be limited by the S3 server, although you may need to tune `-o multipart_size` and `-o parallel_count`. goofys has done some [benchmarks](https://github.com/kahing/goofys#benchmark) but we have not investigated the differences in configuration.
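For reference, the tuning mentioned above goes on the mount command line. A minimal sketch; the bucket name, mount point, and option values here are placeholders for illustration, not recommendations (`multipart_size` is the part size in MB, `parallel_count` the number of parallel transfer threads):

```shell
# Illustrative s3fs mount with a larger multipart part size and more
# parallel transfer threads. "mybucket" and /mnt/s3 are placeholders.
s3fs mybucket /mnt/s3 -o multipart_size=64 -o parallel_count=20
```

Note that multipart settings only affect files large enough to trigger multipart uploads; small objects still go out as single PUT requests.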


@Conrad-T-Pino commented on GitHub (Aug 5, 2020):

Thank you for the prompt reply. I followed up on your suggestion by reviewing the manual page carefully.

My problem case is 2.5 million files averaging 40 KiB each; PUT response latency (storage processing time) dominates each PUT request. The first 40,000 files took ~11 hours, for an estimated 28+ day run time. Worse, the production data set is 48.8 times larger.

Real S3 services don't support unlimited bandwidth and connections, but let's assume mine does so that I can precisely define the S3FS concurrency boundary. I want to see parallel single-part PUT requests pushed beyond the breaking point so that the boundary becomes precisely known. Our live implementation can then operate within a well-defined safety margin.


@gaul commented on GitHub (Oct 10, 2020):

We would appreciate any real-world characterization of s3fs performance. It seems that you want to create many files rapidly -- does your application create them sequentially or in parallel? The latter will be much faster for a network filesystem like s3fs. You should also make sure that your application runs in the same region as the bucket to minimize round-trip times. It appears that `S3fsCurl::PutRequest` does not limit concurrent uploads.
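To illustrate the sequential-versus-parallel point above: an application can overlap per-request PUT latency by writing many small files from a thread pool instead of one at a time. A minimal sketch, not s3fs code itself; the file count, size, and worker count are arbitrary, and the demo writes to a local temporary directory so it runs anywhere (point `dest_dir` at the s3fs mount in practice):

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def write_files(dest_dir, count=100, size=40 * 1024, workers=16):
    """Write `count` files of `size` bytes each, in parallel.

    On an s3fs mount each file write becomes a PUT request; a thread
    pool overlaps the per-request latency instead of paying it serially.
    """
    payload = b"\0" * size

    def write_one(i):
        path = os.path.join(dest_dir, f"file-{i:06d}.bin")
        with open(path, "wb") as f:
            f.write(payload)
        return path

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(write_one, range(count)))

# Demo against a local temp dir; substitute the s3fs mount point.
with tempfile.TemporaryDirectory() as d:
    paths = write_files(d, count=20)
    print(len(paths))  # 20
```

If PUT latency dominates (as described above), throughput should scale roughly with the worker count until the server or link saturates.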


@gaul commented on GitHub (Dec 31, 2020):

Closing since there is nothing actionable here.
