mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 21:35:58 +03:00
[GH-ISSUE #607] Improve random writes and appends #344
Labels
No labels
bug
bug
dataloss
duplicate
enhancement
feature request
help wanted
invalid
need info
performance
pull-request
question
question
testing
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
starred/s3fs-fuse#344
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @CAFxX on GitHub (May 24, 2017).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/607
In README.md, the performance of random writes and appends is listed as a limitation.
If this is still the case, has it ever been considered to use multipart upload, and specifically both Upload Part and Upload Part - Copy, to speed these operations up?
Specifically, an append operation would become (simplifying) a server-side Upload Part - Copy of the existing object, followed by an Upload Part carrying the appended data, finalized as a single multipart upload.
A random write operation (e.g. "helloworld" -> "hello World") would become a copy of the unchanged prefix, an upload of the modified range, and a copy of the unchanged suffix, again combined into one multipart upload.
This approach would probably be beneficial when the existing object is big and the amount of data appended or changed is small, so some heuristics may be needed. In addition, these heuristics should also consider that (as noted below by @sqlbot) the Upload Part API has restrictions on the minimum/maximum allowable part size ("Part size: 5 MB to 5 GB, last part can be < 5 MB").
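A minimal sketch of such a heuristic, in Python rather than s3fs-fuse's C++; the function name, return shape, and fallback behavior are hypothetical illustrations of the idea, not s3fs-fuse APIs. It plans the parts for an "append via multipart copy", keeping every non-final part within S3's 5 MiB minimum and 5 GiB maximum:

```python
# Hypothetical part-layout heuristic for appending to an existing S3 object
# using Upload Part - Copy for the old bytes and Upload Part for the new ones.

MIN_PART = 5 * 1024 * 1024          # 5 MiB minimum for every part but the last
MAX_PART = 5 * 1024 * 1024 * 1024   # 5 GiB maximum per part
MAX_PARTS = 10_000                  # S3 multipart upload part-count limit

def plan_append_parts(existing_size, append_size):
    """Return a list of (kind, start, end) descriptors: 'copy' parts
    reference byte ranges of the existing object (Upload Part - Copy),
    and the final 'upload' part carries the appended bytes (Upload Part)."""
    if existing_size < MIN_PART:
        # Existing object too small to copy as a non-final part:
        # fall back to rewriting the whole object in one upload.
        return [("upload", 0, existing_size + append_size)]
    parts = []
    offset = 0
    while offset < existing_size:
        size = min(MAX_PART, existing_size - offset)
        # Never leave a trailing copy fragment below the 5 MiB minimum.
        if 0 < existing_size - (offset + size) < MIN_PART:
            size = existing_size - offset - MIN_PART
        parts.append(("copy", offset, offset + size))
        offset += size
    # The appended data goes in the last part, which may be < 5 MiB.
    parts.append(("upload", existing_size, existing_size + append_size))
    assert len(parts) <= MAX_PARTS
    return parts
```

For a 10 MiB object, this yields one server-side copy of the existing bytes plus one small upload of the appended data, so only the new bytes cross the wire.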
@sqlbot commented on GitHub (May 25, 2017):
S3 itself would not allow such an operation at small scale. The Multipart Upload API, whether the part data is uploaded or copied from another object, requires that each part (except the last part) have a minimum size of 5 MB.
http://docs.aws.amazon.com/AmazonS3/latest/dev/qfacts.html
@CAFxX commented on GitHub (May 25, 2017):
@sqlbot So the append scenario would work without any problem (assuming the old object is at least 5 MB in size, which seems fairly common in append scenarios).
For random writes it could still be useful in certain limited scenarios (e.g. VM images), where it would provide substantial benefits.
This was already implied by the last sentence of my first post ("This approach would probably be beneficial when the existing object is big and the amount of data to append or changed is small."); I will update it now to clarify.
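The random-write variant can be sketched the same way; again a hypothetical Python illustration, not an s3fs-fuse API. It plans a prefix copy, an upload of only the modified range, and a suffix copy, and returns None when the 5 MiB part minimum makes the trick inapplicable (splitting copies larger than 5 GiB is omitted for brevity):

```python
# Hypothetical part layout for rewriting a byte range of an S3 object in
# place via multipart upload: copy unchanged prefix/suffix, upload the rest.

MIN_PART = 5 * 1024 * 1024  # 5 MiB minimum for every part but the last

def plan_overwrite_parts(object_size, write_off, write_len):
    """Return (kind, start, end) parts for rewriting
    [write_off, write_off + write_len), or None if the multipart-copy
    approach cannot satisfy the minimum part size."""
    suffix_off = write_off + write_len
    if 0 < write_off < MIN_PART:
        return None  # non-empty prefix copy would be below the 5 MiB minimum
    if suffix_off < object_size and write_len < MIN_PART:
        return None  # uploaded part would be non-final and below the minimum
    parts = []
    if write_off > 0:
        parts.append(("copy", 0, write_off))           # Upload Part - Copy
    parts.append(("upload", write_off, suffix_off))    # Upload Part
    if suffix_off < object_size:
        parts.append(("copy", suffix_off, object_size))
    return parts
```

Consistent with @sqlbot's point, the tiny "helloworld" example returns None (the 5-byte prefix cannot be a valid part), while a small write into a large VM image yields three parts where only the modified range is re-uploaded.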
@ggtakec commented on GitHub (May 2, 2019):
@CAFxX @sqlbot I'm sorry for the very late reply.
I implemented this feature in #1027.
If you can, please try the latest code on the master branch now that it has been merged.
@ggtakec commented on GitHub (Sep 29, 2019):
This issue is closed because #1098 has been merged.