mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 13:26:00 +03:00
[GH-ISSUE #1351] S3FS 1.86 Concurrency Limits #723
Originally created by @Conrad-T-Pino on GitHub (Aug 4, 2020).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1351
Assuming the back-end S3 server supports unlimited bandwidth and connections, what are the maximum read concurrency and write concurrency levels S3FS 1.86 can support over Internet transit links between ISP data centers?
@gaul commented on GitHub (Aug 5, 2020):
Generally we expect that s3fs should be limited by the S3 server, although you may need to tune -o multipart_size and -o parallel_count. goofys has done some benchmarks but we have not investigated the differences in configuration.
@Conrad-T-Pino commented on GitHub (Aug 5, 2020):
Thank you for the prompt reply. I followed up on your suggestion by reviewing the manual page carefully.
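For reference, the two options gaul mentions are passed at mount time. A minimal sketch; the bucket name, mountpoint, and values are illustrative assumptions, not from the thread:

```shell
# Sketch only: "mybucket" and /mnt/s3 are placeholders.
# multipart_size is in MB (default 10); parallel_count defaults to 5.
s3fs mybucket /mnt/s3 \
    -o multipart_size=64 \
    -o parallel_count=20
```

Note that for files under the multipart threshold (such as the 40 KiB files below), uploads go out as single-part PUTs, so multipart_size has little effect on that workload.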
My problem case is 2.5 million files averaging 40 KiB each; PUT response latency (storage processing time) dominates each PUT request. The first 40,000 files took ~11 hours, for an estimated run time of 28+ days. Worse, the production data set is 48.8 times larger.
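The arithmetic behind that estimate can be checked directly. Since per-file PUT latency dominates, a linear extrapolation from the observed rate is reasonable (file counts and timings are from the report above; the 48.8x factor extrapolates to production):

```python
files_done = 40_000          # files uploaded so far
hours_done = 11.0            # wall-clock time for those files
total_files = 2_500_000      # full test data set

# Linear extrapolation: per-file latency dominates, so the rate is ~constant.
rate = files_done / hours_done            # files per hour (~3636)
est_days = total_files / rate / 24
print(f"estimated run time: {est_days:.1f} days")    # -> 28.6 days

# The production data set is reported as 48.8x larger.
prod_days = est_days * 48.8
print(f"production estimate: {prod_days:.0f} days")  # -> 1398 days
```

This confirms the "28+ days" figure and shows why the reporter considers the serial throughput unworkable at production scale.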
Real S3 services don't support unlimited bandwidth and connections, but let's assume mine does so I can precisely define the S3FS constraint boundary. I want to see parallel single-part PUT requests pushed beyond the breaking point so that boundary becomes precisely known. Our live implementation can then operate within a well-defined safety margin.
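One way to probe that boundary is a small load generator that issues many small writes in parallel through the filesystem; each write of a sub-multipart-threshold file becomes a single-part PUT. A sketch; here a temp directory stands in for the s3fs mountpoint, and the worker count, file count, and payload size are all assumptions to vary while watching for errors or latency collapse:

```python
import concurrent.futures
import os
import tempfile
import time

# Stand-in target: in a real probe this would be the s3fs mountpoint
# (e.g. /mnt/s3); a temp dir keeps the sketch runnable anywhere.
target = tempfile.mkdtemp()

payload = os.urandom(40 * 1024)  # ~40 KiB, matching the workload above

def put_file(i):
    """Write one small file; over s3fs this maps to a single-part PUT."""
    path = os.path.join(target, f"file-{i:06d}")
    with open(path, "wb") as f:
        f.write(payload)
    return path

start = time.monotonic()
# Ramp max_workers upward across runs to find where throughput stops scaling.
with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
    results = list(pool.map(put_file, range(200)))
elapsed = time.monotonic() - start
print(f"wrote {len(results)} files in {elapsed:.2f}s "
      f"({len(results) / elapsed:.0f} files/s)")
```

Repeating this with increasing worker counts, and recording throughput and error rates at each step, would locate the concurrency boundary the reporter is asking about.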
@gaul commented on GitHub (Oct 10, 2020):
We would appreciate any real-world characterization of s3fs performance. It seems that you want to create many files rapidly -- does your application create them sequentially or in parallel? The latter will be much faster for a network filesystem like s3fs. You should also make sure that your application runs in the same region as the bucket to minimize round-trip times. It appears that S3fsCurl::PutRequest does not limit concurrent uploads.
@gaul commented on GitHub (Dec 31, 2020):
Closing since there is nothing actionable here.