mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 21:35:58 +03:00
[GH-ISSUE #941] automatically tune multipart sizes #532
Originally created by @gaul on GitHub (Jan 30, 2019).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/941
s3fs should automatically use larger multipart sizes when object sizes are large. For example, multipart_size defaults to 10 MB, which means s3fs can only write objects up to 100 GB with the maximum of 10,000 parts, instead of the 5 TB limit. Similarly, singlepart_copy_limit should start smaller to improve parallel uploads but increase as object size grows. Propose giving these -1 defaults to allow users to modify the behavior but otherwise letting s3fs choose the sizes. References #940.

@ffeldhaus commented on GitHub (Feb 1, 2019):
I would suggest dividing the file size by parallel_count, or a multiple of parallel_count, and determining multipart_size that way. It is also helpful if multipart_size is rounded down to the next power of 2 (e.g. 16 MB or 1 GB), in case someone wants to check the ETag of a downloaded file and needs to guess the part size used for the multipart upload.

@gaul commented on GitHub (Feb 2, 2019):
Using more parts than parallel_count helps since network errors do not retransmit as much data. The underlying curl reuses connections, so TCP window scaling only affects some connections. Note that there are many possible values for part sizes; let's scope this issue to simply allowing the full range of object sizes, and we can follow up with optimizations.
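A minimal sketch of the tuning discussed in this thread (hypothetical code, not part of s3fs; the helper names and the 16 MB floor are assumptions): pick the smallest power-of-two part size that fits the object within S3's 10,000-part limit, combining the automatic scaling from the original report with @ffeldhaus's power-of-2 suggestion.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch, not s3fs code: choose a multipart size large enough
// that the object fits within S3's 10,000-part limit, rounded up to a power
// of two so a downloader can guess the part size when verifying the ETag.
constexpr uint64_t MB = 1024ULL * 1024ULL;
constexpr uint64_t MAX_PARTS = 10000;        // S3 multipart upload part limit
constexpr uint64_t MIN_PART_SIZE = 16 * MB;  // power-of-2 floor near the 10 MB default

// Round v up to the next power of two (assumes v > 0).
uint64_t next_pow2(uint64_t v)
{
    uint64_t p = 1;
    while (p < v) {
        p <<= 1;
    }
    return p;
}

uint64_t choose_part_size(uint64_t object_size)
{
    // Smallest part size that covers the object in MAX_PARTS parts
    // (ceiling division), then round up to a power of two.
    uint64_t needed = (object_size + MAX_PARTS - 1) / MAX_PARTS;
    uint64_t part = next_pow2(needed);
    return part < MIN_PART_SIZE ? MIN_PART_SIZE : part;
}
```

Under these assumptions, a 1 GiB file keeps the 16 MB floor (64 parts, leaving plenty of parts for parallel_count-way parallelism and small retransmits), while a 5 TiB file gets 1 GiB parts (5,120 parts), staying under the 10,000-part cap.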