mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 21:35:58 +03:00
[GH-ISSUE #1875] Do we have to flush on s3fs_truncate? #955
Originally created by @orozery on GitHub (Jan 27, 2022).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1875
We have an application that writes large files (~4GB) in 32MB chunks. Before each 32MB write, the application calls `ftruncate` to increase the file size by 32MB.
This maps to `s3fs_truncate`, which flushes the file.
Writing to the S3 backend becomes very inefficient, because the entire file is overwritten for every 32MB written.
I'm wondering whether the flush in `s3fs_truncate` can be avoided.
The man page for `ftruncate` does not mention that a flush is guaranteed.
@gaul, I would be happy to get your thoughts.
Thanks!
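To get a rough sense of the cost described above, here is a back-of-the-envelope sketch in C. It assumes that every `ftruncate` triggers a re-upload of the whole object at its new size, which is a simplification of what the issue reports. The helper names `total_reuploaded_bytes` and `print_estimate` are made up for illustration and are not part of s3fs.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: total bytes uploaded if every truncate re-uploads
 * the whole file at its new size (chunk, 2*chunk, ..., n*chunk).
 * This is an assumed cost model, not a measurement of s3fs. */
uint64_t total_reuploaded_bytes(uint64_t chunk_bytes, uint64_t num_chunks)
{
    /* chunk * (1 + 2 + ... + n) = chunk * n * (n + 1) / 2 */
    return chunk_bytes * (num_chunks * (num_chunks + 1) / 2);
}

/* Prints the estimate for a 4 GiB file written in 32 MiB chunks. */
void print_estimate(void)
{
    const uint64_t mib = 1024ULL * 1024ULL;
    const uint64_t chunk = 32 * mib;   /* 32 MiB per write */
    const uint64_t n = 128;            /* 4 GiB / 32 MiB */
    uint64_t total = total_reuploaded_bytes(chunk, n);

    printf("payload: %llu MiB, uploaded: %llu MiB\n",
           (unsigned long long)(chunk * n / mib),
           (unsigned long long)(total / mib));
}
```

Under that assumption, a 4 GiB payload (4096 MiB) would cost 264192 MiB of uploads, about 258 GiB, or roughly 64 times the data actually written.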
@gaul commented on GitHub (Jan 27, 2022):
I believe that s3fs has always called `fsync` on `ftruncate`, as far back as 4a30df1ff2. I do not believe that POSIX actually requires this; the slower implementation was only a reflection of s3fs' limited dirty data tracking at that time. Since `ftruncate` is an uncommon operation, we have not optimized it yet. Could you submit a PR for this?
@gaul commented on GitHub (Jan 28, 2022):
One thing to investigate is whether s3fs calls `ftruncate` on its temporary file so that it can return `ENOSPC` to the application if local storage is too small.
@ggtakec commented on GitHub (Jan 29, 2022):
I think we need to re-check the code that uses the file size after a truncate operation.
The file size is used in various places by operations that occur after truncate.
For example, when a cache file is used (that is, when the cache directory has capacity), s3fs obtains the file size from the local cache file instead.
However, if the cache is not used, the current s3fs will have a problem if the new size is not reflected in the object on the S3 server after the truncate operation.
That said, we may be able to turn off fsync while the file is open.
(But that doesn't seem to be a simple fix.)
@gaul commented on GitHub (Jan 29, 2022):
If we added performance counters, suggested in #1571, we could modify the integration tests to check the expected number of RPCs. This would ensure that we are not regressing performance on operations like truncate.
@ggtakec commented on GitHub (Jan 29, 2022):
As you say, performance evaluation for cache is necessary.
(Let's define what should be measured in #1571)
@orozery's first comment says that the application calls `ftruncate` before each 32MB write.
In other words, the application repeatedly calls a truncate->write pair.
s3fs (and FUSE) does not know that a write will follow the truncate.
That's why the current implementation flushes at every truncate.
If you want to avoid this, we would be able to change s3fs as follows:
only when close (flush/release) is called does s3fs output all dirty parts at once.
But I think that this workaround cannot be used when the local cache is not used (cannot be created).
Conversely, in situations where the local cache is available, it may be possible to prevent flushing until close.
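The idea of writing all dirty parts only at close can be illustrated with a toy model. This is only a sketch: `struct toy_file` and the `toy_*` functions are hypothetical and do not correspond to s3fs's real data structures. An "upload" here is just a counter increment.

```c
#include <stddef.h>

/* Toy model of an open file; NOT s3fs's real data structures. */
struct toy_file {
    size_t size;     /* current logical size */
    int    dirty;    /* 1 if local changes have not been uploaded */
    int    uploads;  /* number of simulated uploads to S3 */
};

/* Simulated upload: only does work if there is something dirty. */
static void toy_upload(struct toy_file *f)
{
    if (f->dirty) {
        f->uploads++;
        f->dirty = 0;
    }
}

/* Eager variant: flush on every truncate (the behaviour the issue describes). */
void toy_truncate_eager(struct toy_file *f, size_t new_size)
{
    f->size = new_size;
    f->dirty = 1;
    toy_upload(f);
}

/* Deferred variant: record the new size and leave the upload to close. */
void toy_truncate_deferred(struct toy_file *f, size_t new_size)
{
    f->size = new_size;
    f->dirty = 1;
}

void toy_write(struct toy_file *f, size_t offset, size_t len)
{
    if (offset + len > f->size)
        f->size = offset + len;
    f->dirty = 1;
}

void toy_close(struct toy_file *f)
{
    toy_upload(f);
}

/* Runs n truncate->write pairs, closes the file, returns the upload count. */
int toy_run(int deferred, int n, size_t chunk)
{
    struct toy_file f = {0, 0, 0};
    for (int i = 0; i < n; i++) {
        size_t off = f.size;           /* write starts at the old end */
        size_t new_size = off + chunk; /* truncate grows the file first */
        if (deferred)
            toy_truncate_deferred(&f, new_size);
        else
            toy_truncate_eager(&f, new_size);
        toy_write(&f, off, chunk);
    }
    toy_close(&f);
    return f.uploads;
}
```

In this model, 128 truncate->write pairs cost 129 uploads in the eager variant (one per truncate plus one at close) and a single upload in the deferred variant. The model ignores the no-cache case raised above, where deferring may not be possible.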
@orozery commented on GitHub (Jan 30, 2022):
I'm trying to re-think this whole idea.
If the user is using `ftruncate`, it makes sense to not flush, and wait for `flush` or `close`.
However, if the user is using `truncate`, which does not get a file descriptor, then I think he should expect the truncate to be immediate on the S3 server.
With the current `s3fs_truncate` we cannot tell if truncate was called on a file handle (and which).
Now I see that in FUSE3 they changed `int (*truncate) (const char *, off_t)` to `int (*truncate) (const char *, off_t, struct fuse_file_info *fi)`.
Specifically here:
https://github.com/libfuse/libfuse/issues/58
I guess s3fs is configured with FUSE2 which does not have this new API?
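As a sketch of what the extra argument would allow, the handler below has the same shape as the FUSE3 `truncate` hook quoted above and branches on whether file info was passed. To keep it compilable without libfuse headers, `struct demo_file_info` is a stand-in for `struct fuse_file_info`, and every `demo_*` / `DEMO_*` name is hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>

/* Stand-in for libfuse's struct fuse_file_info, reduced to a single field
 * so this sketch compiles on its own. Not the real definition. */
struct demo_file_info {
    unsigned long long fh;  /* file handle chosen by the open handler */
};

enum demo_truncate_kind {
    DEMO_BY_PATH,    /* no file info: treat as truncate(path, size) */
    DEMO_BY_HANDLE   /* file info present: treat as ftruncate(fd, size) */
};

/* Same parameter shape as the FUSE3 hook: path, new size, optional file info.
 * A real handler would resize the file; this one only classifies the call. */
enum demo_truncate_kind demo_truncate(const char *path, off_t size,
                                      const struct demo_file_info *fi)
{
    (void)path;
    (void)size;
    return fi != NULL ? DEMO_BY_HANDLE : DEMO_BY_PATH;
}
```

A handler could then defer the flush in the `DEMO_BY_HANDLE` case and flush immediately in the `DEMO_BY_PATH` case. One caveat, as far as I understand the libfuse documentation: a NULL `fi` does not prove that the file is closed, so this check alone may not be a reliable signal.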
@ggtakec commented on GitHub (Jan 31, 2022):
The details would need to be confirmed against the source code, but:
s3fs has been modified to use an internal temporary file descriptor (a number that is recognized only inside s3fs, not the fd issued by the system) when a file is opened.
So I think this temporary file descriptor can be used to determine whether the file is open.
@gaul commented on GitHub (Feb 6, 2022):
While s3fs should optimize `truncate`, should your application call `posix_fallocate` instead?
@orozery commented on GitHub (Feb 6, 2022):
Well, my application is closed-source (Symantec) Ghost, so I cannot control its POSIX calls.
But even so, s3fs currently does not implement `fuse_operations.fallocate`.
@ggtakec commented on GitHub (Feb 6, 2022):
I think it is possible to implement `fuse_operations.fallocate` in s3fs.
(Depending on the mode flags, the implementation can be difficult.)
Do you intend to solve this problem by setting (keeping) the file size with fallocate and then writing additional data?
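For context, the `posix_fallocate` call suggested earlier in the thread would look roughly like this on the application side. This is a minimal sketch for a local file: `extend_with_fallocate` is a hypothetical helper name, and, as noted above, s3fs did not implement the `fallocate` hook at the time, so this does not show behaviour on an s3fs mount.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reserve `len` more bytes past the current end of
 * the file. Returns 0 on success or an error number on failure. */
int extend_with_fallocate(int fd, off_t len)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return errno;

    /* posix_fallocate returns the error number directly and does not
     * set errno; on success the file size becomes st_size + len. */
    return posix_fallocate(fd, st.st_size, len);
}
```

An application would call this before each chunk write in place of `ftruncate`. Whether it succeeds depends on the underlying file system and C library.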
@orozery commented on GitHub (Feb 7, 2022):
As I said, I cannot control the application, which uses `ftruncate`.
The only option I see is to change s3fs from `libfuse-dev` to `libfuse3-dev`. Have we considered this?
@ggtakec commented on GitHub (Feb 13, 2022):
For this matter, I created PR #1887.
@orozery Please try to use it if you can.
I thought it was possible that s3fs does not need to flush when the file is resized; #1887 implements this.
The logic of the `s3fs_truncate` function hadn't been reviewed for a while, so it performed unnecessary downloads and flushes even if the size of the file (being modified) changed.
I reviewed these processes and made them simpler.
The operation of `fuse2` is as follows. (If I'm not misunderstanding.)
If the user calls `truncate` on an unopened file, `fuse` opens the file and then calls `s3fs_truncate`.
If the file is already opened, `s3fs_truncate` will be called as is.
(In `s3fs_truncate`, the difference between the above two cannot be determined.)
After returning from `s3fs_truncate`, the file will be flushed when it is closed.
(If the file is opened before calling truncate, the flush will occur when it is closed.)
So we can remove the direct call to Flush in `s3fs_truncate`, as in #1887.
In `fuse3` (I'm not completely familiar with it yet), `fuse_file_info*` is added to the `truncate` hook, but I think it's the same as `fuse2` for s3fs.
Internally, s3fs manages the open file descriptor of the target file, which has the same meaning as this structure in `fuse3`.
Therefore, I think that the implementation of `s3fs_truncate` does not differ between fuse2 and fuse3. (If it does, I think it's a small change.)
@gaul commented on GitHub (Feb 15, 2022):
@orozery Please test your application with the latest master and report if this addresses your symptoms.