Mirror of https://github.com/s3fs-fuse/s3fs-fuse.git (synced 2026-04-25 05:16:00 +03:00)
[GH-ISSUE #535] File in s3 ends up truncated to 0 bytes #306
Originally created by @alexeits on GitHub (Feb 16, 2017).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/535
A 75K file ended up in S3 truncated to 0 bytes. We analyzed the logs and couldn't find anything wrong.
It appears that s3fs first creates a file named `.in.filename` and then renames it to `filename`. According to the S3 object copy documentation (the emphasis is mine):
It doesn't appear that s3fs handles the case of an error in a 200 response, at least based on this code fragment.
Could you please confirm (or deny)?
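For context, the failure mode described above can be sketched as follows. This is an illustrative Python snippet (not s3fs code, which is C++) showing why checking only the HTTP status is insufficient: S3's CopyObject can return HTTP 200 with an `<Error>` document in the body, so a correct client must also inspect the body.

```python
import xml.etree.ElementTree as ET

def copy_response_ok(status_code, body):
    """Return True only if the copy really succeeded.

    S3 CopyObject may send HTTP 200 and then report a failure
    inside the response body, so a 200 status alone proves nothing.
    """
    if status_code != 200:
        return False
    root = ET.fromstring(body)
    # A failed copy embeds an <Error> element instead of <CopyObjectResult>.
    return not root.tag.endswith("Error")

# A genuine success: the body carries <CopyObjectResult>.
ok_body = "<CopyObjectResult><ETag>\"abc\"</ETag></CopyObjectResult>"
# A failure hidden behind HTTP 200: the body carries <Error>.
err_body = "<Error><Code>InternalError</Code><Message>We encountered an internal error.</Message></Error>"

print(copy_response_ok(200, ok_body))   # True
print(copy_response_ok(200, err_body))  # False
```

The `endswith` check also tolerates a namespaced root tag, since ElementTree renders it as `{namespace}Error`.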
Additional Information
The following information is very important in order for us to help you. Omitting these details may delay your support request or cause it to receive no attention at all.
Version of s3fs being used (s3fs --version)
Amazon Simple Storage Service File System V1.80 (commit:unknown) with OpenSSL
Version of fuse being used (pkg-config --modversion fuse)
2.9.2-4ubuntu4.15.04.1
System information (uname -a)
4.7.3-coreos-r2 #1 SMP Thu Feb 2 02:26:10 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Distro (cat /etc/issue)
Ubuntu 15.04
s3fs command line used (if applicable)
The original logs no longer exist, but here are some snippets from the log collector
Details about issue
@ggtakec commented on GitHub (Apr 2, 2017):
@alexeits I'm sorry for my late reply.
s3fs does not create objects named ".in.*" as part of its internal processing.
If you simply create a file under the s3fs mount directory, s3fs creates the object "filename" directly.
Please let us know which s3fs options you were using when this problem occurred.
Also, is there any application you are using on top of the s3fs file system?
(It seems that the application is creating an ".in.GLADLYDTA.CSV." file and renaming it to "GLADLYDTA.CSV".)
Lastly, I think the explanation of the PUT API is correct.
The part you pointed out is the response handling for a single s3fs request, and an error inside a 200/300 response is not handled as an error there.
Thanks in advance for your assistance.
@alexeits commented on GitHub (Apr 3, 2017):
@ggtakec thanks for replying
Here are answers to your questions.
Yes, it's the `proftpd` FTP server that comes with ubuntu:15.04. That's what is creating those `.in.*` files.
@ggtakec commented on GitHub (Apr 9, 2017):
@alexeits I understand that you are using proftpd with s3fs.
First of all, no error is found in the log attached to this issue.
In the log, s3fs only appears up to the rename of ".in.GLADLYDTA.CSV." to "GLADLYDTA.CSV".
The reason "GLADLYDTA.CSV" is 0 bytes is that either uploading the ".in.GLADLYDTA.CSV." file failed and left a 0 byte ".in.GLADLYDTA.CSV.", or the rename processing failed.
Is there any log output from after the point when the error occurred?
If the s3fs rename failed, it may be followed by an error message.
Otherwise, we need to check the elided ("...") part of the log from L788 to L833.
Thanks in advance for your help.
@alexeits commented on GitHub (Apr 13, 2017):
@ggtakec let me try to answer your questions one by one
I don't believe that either of these operations failed in a way that was detected by s3fs. Lines 787-788 of the log show that 77896 bytes of the ".in.GLADLYDTA.CSV." file were uploaded successfully:
As for the rename, I'm certain that NO errors were reported. As I stated in the issue, we did not find any errors in the log at all. My hypothesis is that S3 failed after the copy operation had started. In this case S3 returns 200 and, as we discussed earlier, s3fs does NOT treat a 200/300 response from the copy operation as an error, which I believe is why we ended up with 0 bytes in the destination file "GLADLYDTA.CSV".
It seems that the best course of action for s3fs would be to check the copy operation response for errors even in case of getting HTTP status 200. Please let me know if you disagree.
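One way a client could act on that check (a hedged sketch, not s3fs's actual implementation): since AWS documents that such embedded errors may be transient and the request can be retried, a caller could retry a bounded number of times before surfacing the failure. The `do_copy` callable here is a hypothetical stand-in for the real copy request.

```python
import xml.etree.ElementTree as ET

def copy_with_retry(do_copy, attempts=3):
    """Retry an S3 copy when a 200 response hides an <Error> body.

    do_copy is a hypothetical callable returning (status_code, body).
    Returns True on a verified success, False after exhausting retries.
    """
    for _ in range(attempts):
        status, body = do_copy()
        if status == 200 and not ET.fromstring(body).tag.endswith("Error"):
            return True
    return False

# Simulated copy that fails once inside a 200 response, then succeeds.
responses = iter([
    (200, "<Error><Code>InternalError</Code></Error>"),
    (200, "<CopyObjectResult><ETag>\"abc\"</ETag></CopyObjectResult>"),
])
print(copy_with_retry(lambda: next(responses)))  # True
```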
@ggtakec commented on GitHub (May 5, 2017):
@alexeits
As you say, uploading seems to have succeeded.
However, if the rename had failed, I do not think it would produce a 0 byte file.
This is because s3fs uses the copy API (PUT HEADER) for rename processing, and the copy of the file object is performed inside S3.
As far as the logs are concerned, s3fs also succeeded in this PUT processing.
(s3fs does handle errors from this PUT HEADER request, but there seems to be no error here.)
If you can reproduce it, could you get the log with the curldbg option?
It will be helpful if you get more detailed logs.
Thanks in advance for your assistance.
@alexeits commented on GitHub (May 6, 2017):
@ggtakec
It won't be possible to collect more detailed logs at this time. I will make sure to collect them if the issue occurs again, although it's so rare that we might not see it for months or even longer.
I'd like to challenge your assertion that a failed rename could not produce a 0 byte file. As we discussed earlier, the S3 copy API (PUT HEADER) may fail inside S3, and S3 would return 200 even though the operation failed. Citing the relevant passage from the S3 docs again:
Also as we have established in our discussion s3fs does NOT currently handle errors returned in 200 OK responses. I do believe the lack of handling for such errors in s3fs is the most likely reason for the silent rename failure.
@ggtakec commented on GitHub (May 6, 2017):
@alexeits
I had missed the essence of this problem.
I added a simple check based on your advice.
When renaming, if an error occurs during S3 internal processing, it is now reported as an EIO error.
Please use the latest code in the master branch and check it.
If this fix resolves the problem for you, please close this issue.
Thanks in advance for your help.
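As I read the fix described above (the actual change is C++ inside s3fs and may differ in detail), it amounts to mapping an error-in-200 copy response to -EIO, so the rename syscall fails visibly instead of silently succeeding. A minimal Python sketch of that mapping:

```python
import errno
import xml.etree.ElementTree as ET

def rename_result(status_code, body):
    """Map an S3 copy (rename) response to a FUSE-style return code.

    Returns 0 on success; -EIO when S3 reports an error, including
    the case where the error is embedded in an HTTP 200 response.
    """
    if status_code != 200:
        return -errno.EIO
    if ET.fromstring(body).tag.endswith("Error"):
        return -errno.EIO
    return 0

print(rename_result(200, "<CopyObjectResult/>"))  # 0
# An error hidden in a 200 response now maps to -EIO (-5 on Linux).
print(rename_result(200, "<Error><Code>InternalError</Code></Error>"))
```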
@sqlbot commented on GitHub (May 6, 2017):
@alexeits it seems unlikely that an S3 internal error would result in a 0 byte object, since writes are atomic thus an internal failure should result in nothing happening to the bucket at all.
It seems plausible that there's a consistency model issue at play. I don't think we have a way (from the outside looking in) to determine conclusively whether PUT/Copy always makes a copy of the most recently written instance of the object.
We can conclude anecdotally that S3 doesn't literally overwrite objects internally: overwriting an object doesn't interfere with users who are in the middle of downloading it. More likely, overwriting allocates new storage and pushes an update to the bucket's index, with index replication being the presumed cause of the eventual-consistency behavior.
So while I think the change here is a move in the right direction, I don't know that it will solve the issue, which may be more of a race condition where we inadvertently copy the prior instance of the object -- the empty one we initially create when a file is opened for writing. If this is what's happening then it's not an issue in s3fs but perhaps is something we can find another way to work around.
My buckets use versioning, and I don't think I see this issue... which could imply that buckets with versioning enabled use different internal logic inside S3... different logic that coincidentally mediates the behavior.
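If this race hypothesis is right, error checking alone won't help, because the copy "succeeds" while duplicating a stale zero-byte instance of the source. A defensive workaround (a hypothetical sketch; `do_copy` and `head_size` are stand-ins for real CopyObject and HeadObject calls) is to compare the destination size against the expected source size after the copy and retry on mismatch. Eventual consistency can affect the verification read too, so this only narrows the window rather than closing it.

```python
def copy_and_verify(do_copy, head_size, expected_size, attempts=3):
    """Copy, then verify the destination object's size.

    Guards against a read-after-write race in which the copy
    duplicates a stale (e.g. zero-byte) instance of the source.
    do_copy and head_size are hypothetical stand-ins for the
    real CopyObject and HeadObject requests.
    """
    for _ in range(attempts):
        do_copy()
        if head_size() == expected_size:
            return True
    return False

# Simulation: the first copy lands a stale 0-byte object, the second is correct.
sizes = iter([0, 77896])
print(copy_and_verify(lambda: None, lambda: next(sizes), 77896))  # True
```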
@ggtakec commented on GitHub (May 6, 2017):
@sqlbot Thanks for your kindness.
Yes, the #582 patch does not solve this problem.
It only fixes the fact that s3fs did not inspect the content of the response body.
It is still unclear why a zero byte file results, which is the essence of this problem.
If @alexeits can reproduce the phenomenon with the new code, it will log the error body, and we would like to see that log.
@ggtakec commented on GitHub (May 7, 2017):
@alexeits
Are you using SSE (the use_sse option) where this problem occurred?
I fixed a rename bug related to SSE in #585.
When I reproduced this bug case I got an error, and 0 byte files were created.
If you can, please use the latest code in the master branch.
Regards,
@alexeits commented on GitHub (May 8, 2017):
Thank you @sqlbot and @ggtakec. We do use SSE (use_sse=1). I'll get and deploy the latest fixes.
I'm going to close the issue because it's possible that it was fixed in #585. It's very rare in our environment and getting a definitive confirmation of a fix could take a long time.
@ggtakec commented on GitHub (May 9, 2017):
@alexeits If this problem seems to happen again, please reopen this issue.
Regards,