mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 13:26:00 +03:00
[GH-ISSUE #1013] S3FS generating double S3 ObjectCreated event notifications #553
Originally created by @evh69 on GitHub (Apr 15, 2019).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1013
@ggtakec, in relation to https://github.com/s3fs-fuse/s3fs-fuse/issues/427#issuecomment-478217372 ...
We have an application architecture that needs to support zero-byte files, and we discovered this issue during testing. I am sure this has been considered, but I will ask anyway: could a configuration switch be added that creates the initial file with some sort of filename pattern? We could then use that pattern in an S3 event notification filter to ignore the initial file creation, similar to the event filter proposed earlier for handling zero-byte files.
Originally posted by @ggtakec in https://github.com/s3fs-fuse/s3fs-fuse/issues/427#issuecomment-478217372
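The filename-pattern idea above can be approximated today on the consuming side: if the proposed switch wrote initial files with a recognizable suffix, a notification consumer could drop those events. A minimal sketch, assuming a hypothetical `.s3fs-tmp` suffix and the standard S3 notification record shape (s3fs has no such switch today):

```python
# Sketch of a notification consumer that ignores ObjectCreated events for
# keys carrying a hypothetical "initial file" suffix such as ".s3fs-tmp".
# The suffix is an assumption for illustration, not actual s3fs behavior.

TEMP_SUFFIX = ".s3fs-tmp"  # hypothetical pattern the proposed switch would use

def interesting_records(event):
    """Yield only S3 records whose key does not carry the temp suffix."""
    for record in event.get("Records", []):
        key = record["s3"]["object"]["key"]
        if not key.endswith(TEMP_SUFFIX):
            yield record

# Example event with one temp create and one real create:
event = {
    "Records": [
        {"s3": {"object": {"key": "incoming/report.csv" + TEMP_SUFFIX, "size": 0}}},
        {"s3": {"object": {"key": "incoming/report.csv", "size": 1024}}},
    ]
}

kept = [r["s3"]["object"]["key"] for r in interesting_records(event)]
print(kept)  # only the real key survives
```

Note that S3's built-in notification filtering only supports key prefix/suffix rules, so a suffix pattern like this is the kind of thing it could match without any custom consumer code.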
@ggtakec commented on GitHub (Apr 16, 2019):
@evh69 Thanks for posting a new issue.
Currently, s3fs behaves as follows when creating a file:
1. Create a 0-byte object in S3 at file-creation time.
2. Re-upload (overwrite) the object after all writes are completed.
Because of this behavior, a 0-byte file is always created first.
This is true for commands as well as for system calls.
My understanding is that, for a non-existent file, there is a mechanism working in the background to obtain a file descriptor for the new file.
For this reason, the 0-byte file creation happens the same way whether you create a new file or rename one.
If s3fs were to support WORM, it would have to bypass that first 0-byte file creation.
Since s3fs creates a file descriptor using a cache file on the local file system, we may be able to change s3fs not to upload the first time.
However, this would also require investigating the impact on other operations (e.g. checking file permissions).
So I think it is not an easy change.
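The two-step behavior described above is exactly what produces two ObjectCreated events per file. A toy model of that flow, purely to illustrate the event pattern (the in-memory event list is an assumption, not s3fs internals):

```python
# Toy model of the create-then-flush flow described above: an initial 0-byte
# PUT at create time, then a second PUT of the full content at flush/close.
# Illustrates the double-notification pattern only; this is not s3fs code.

events = []  # each PUT here stands in for one ObjectCreated:Put notification

def put_object(key, body):
    events.append({"event": "ObjectCreated:Put", "key": key, "size": len(body)})

def create_file(key):
    put_object(key, b"")       # step 1: 0-byte object so the file exists in S3

def flush_file(key, data):
    put_object(key, data)      # step 2: re-upload (overwrite) with real content

create_file("dir/file.txt")
flush_file("dir/file.txt", b"hello")
print([e["size"] for e in events])  # two events for one logical file: [0, 5]
```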
@dkolli commented on GitHub (Apr 17, 2019):
Is the metadata on S3 any different when the file descriptor was first created with 0 bytes than when writing completes with 0 bytes (other than the object size, of course)? That would help us filter out the first PUT in the 0-byte-file scenario and allow the second PUT to go through.
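Even without a metadata difference, the size field in the notification record is enough to drop the first PUT, since S3 ObjectCreated records carry `s3.object.size`. A sketch of that filter (record shapes follow the S3 notification format; the keys and sizes are made up):

```python
# Drop ObjectCreated records for zero-byte objects, keeping the second PUT.
# Caveat: legitimately empty files would also be dropped, which is the
# zero-byte-file problem this thread is about.

def drop_zero_byte_puts(records):
    """Keep only ObjectCreated records whose object size is non-zero."""
    return [
        r for r in records
        if not (r["eventName"].startswith("ObjectCreated")
                and r["s3"]["object"]["size"] == 0)
    ]

records = [
    {"eventName": "ObjectCreated:Put", "s3": {"object": {"key": "a.txt", "size": 0}}},
    {"eventName": "ObjectCreated:Put", "s3": {"object": {"key": "a.txt", "size": 42}}},
]
print(drop_zero_byte_puts(records))  # only the 42-byte record remains
```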
@gaul commented on GitHub (Apr 17, 2019):
I briefly looked into this, modifying `s3fs_create` to avoid creating the object and instead populating the stat cache with an empty file. One common error is `s3fs_utimens`, which tries to copy the metadata from the non-existent object to the new object. I am certain that this can be fixed but it will require more effort...
@alperen66 commented on GitHub (Feb 18, 2021):
It's been 2 years; do we have a solution now?
@gaul commented on GitHub (Apr 30, 2021):
@evh69 @dbbyleo could you test that master resolves your symptoms?
@gaul commented on GitHub (May 1, 2021):
I benchmarked this change with `time for i in $(seq 100); do touch mnt/$i; done` from Japan using a bucket in us-east.
Before:
After:
@CESteinmetz commented on GitHub (May 26, 2021):
Thanks for addressing this. I updated to the tip of master, but I am still seeing two ObjectCreated:Put events in the S3 Access Logs. First PUT is with a zero byte file, second with the full file. Am I misunderstanding this fix?
@gaul commented on GitHub (May 27, 2021):
You can watch s3fs make HTTP requests by setting the `-f -o curldbg` flags. It seems like you are not using the s3fs version that you expect.
@CESteinmetz commented on GitHub (May 27, 2021):
I appreciate the prompt response. I enabled debug mode and checked my version:
The commit hash seems to match the tip of master
In the debug logs I still see the double call:
@gaul commented on GitHub (May 27, 2021):
Please `strace` your application to see what calls it is making -- if it calls `close` or `fsync` then s3fs will flush the file. If it is not, try to correlate the strace output with the `s3fs -f -o curldbg` logs and share them here.
@CESteinmetz commented on GitHub (May 27, 2021):
Thanks. I'm using sftp here, and yes I see the close: