mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 21:35:58 +03:00
[GH-ISSUE #436] 2nd PUT when Writing Results in Conflict (HTTP 409) #233
Originally created by @dbbyleo on GitHub (Jun 16, 2016).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/436
Hi All,
I'm new to s3fs, but I have successfully set it up and have been able to mount a bucket from a Hitachi Content Platform (HCP) system. I can browse the file system (bucket), and I can read and delete files. But writes result in:
cp: failed to close ‘/mymountpoint/myfilename’: Input/output error
This occurs when I try to write any non-zero-length file, such as a basic text file.
However, I have no problem copying a zero-length file. For example, if I touch a file locally and then copy it to the bucket, this works fine. Touching a file directly in the bucket also works fine. But if I try to copy a text file (with size anything greater than zero), the copy fails.
After sifting through the debug info, I found that s3fs does "double puts."
When touching a file directly in the bucket (this succeeds):
When copying a non-zero text file to the bucket (this fails):
Touching a File Locally then Copying to the Bucket (this succeeds):
Help appreciated.
We're running this on a Debian 8 (Jessie) server with curl 7.38.0.
The s3fs mount command we use is:
s3fs mybucket /mymountpoint -o nocopyapi -o use_path_request_style -o nomultipart -o no_check_certificate -o sigv2 -d -d -f -o f2 -o curldbg -o url=https://mysite.mydomain.com -o passwd_file=/myS3credentials

@sqlbot commented on GitHub (Jun 17, 2016):
This is interesting, but this wouldn't be why the PUT fails.

Using x-amz-metadata-directive: REPLACE is not valid when Content-Length is non-zero. It is not used when a payload is sent in the request -- it is only used with x-amz-copy-source, which is the way the S3 API implements an internal copy request: copy the payload and replace the existing metadata with the metadata supplied in this request.

Background: metadata, as well as the payload of an object, is immutable in S3. To "edit" the metadata, a PUT/Copy must be used, with the new metadata supplied in the PUT request. This creates a "new" object with the same key (path) and new metadata. If the source key and target key in the request are the same, it gives the illusion of editing the metadata, but it is technically a new object with the same key. The same mechanism is used for renaming objects in S3 -- PUT/Copy to a new key, then delete the old one after the copy succeeds. The metadata can be replaced or preserved.
Now, what's interesting here is two things -- one, of course, is that you're using something other than S3 as the back-end (which I did not notice when I first read this issue), and it looks as if HCP may have an issue with operations that are too closely spaced in time on the same object.
Also, you've specified -o nocopyapi -- which should mean that you never see x-amz-metadata-directive in a request, because: ... which means x-amz-metadata-directive would never be used... yet that's what s3fs seems to be doing anyway. Interesting.

Your "working" request seems to point to a bug in s3fs not behaving as documented, while your "broken" request seems to be a case where s3fs is behaving correctly but exposing a problem with the Hitachi Content Platform not being able to accept these requests so closely spaced in time.
Is this a condition you observe every time, or is it intermittent?
S3 itself has a few documented cases where it can return a 409 error, at least one of them (OperationAborted) potentially being retryable with a prospect for success... and if s3fs doesn't have a mechanism for retrying these errors before considering the condition fatal, it probably should have one.

@dbbyleo commented on GitHub (Jun 17, 2016):
After reading your reply, I went back and retested to verify whether I had really hit an anomaly with the use of -o nocopyapi. I found that there was no issue - it behaved just as you said; my previous post was slightly incorrect. I'm going to look into some things you said, but I wanted to post some details after fixing my error.
To answer your question: Yes, this is consistent and reproducible. But I'm curious what you mean by
So the debug info in the tests that fails is the expected behavior? It seems to me that a second PUT request for the same filename would result in a 409 error. Can you explain more why a PUT for same filename shouldn't result in a "conflict" error?
During my retest I kept better documentation, and here are the details for reference. Based on what you've said so far, I think you'll find that the debug info matches what you'd expect.
Mount Command:
With -o nocopyapi
s3fs apple /hcp -o nocopyapi -o use_path_request_style -o nomultipart -o no_check_certificate -o sigv2 -d -d -f -o f2 -o curldbg -o url=https://ahcp3.hcp-demo.hcpdemo.com -o passwd_file=/root/hs3.cred

Here are the results:
Copying a zero file SUCCEEDS.
Copying a non-zero text file FAILS.
Touching/Creating a zero file directly in the bucket FAILS.
Here's the debug info for each of the above:
Copying a zero file SUCCEEDS
... 1st PUT (and only PUT)
Copying a non-zero text file FAILS.
... 1st PUT
... 2nd PUT
Touching/Creating a zero file directly in the bucket FAILS.
# touch /hcp/touchonhcp
... 1st PUT
... 2nd PUT
Then I tested without the -o nocopyapi ...
Mount Command:
Without -o nocopyapi
s3fs apple /hcp -o use_path_request_style -o nomultipart -o no_check_certificate -o sigv2 -d -d -f -o f2 -o curldbg -o url=https://ahcp3.hcp-demo.hcpdemo.com -o passwd_file=/root/hs3.cred

Here are the results:
Copying a zero file SUCCEEDS.
Copying a non-zero text file FAILS.
Touching/Creating a zero file directly in the bucket SUCCEEDS.
Here's the debug info for each of the above:
Copying a zero file SUCCEEDS.
... 1st PUT (and only PUT)
Copying a non-zero text file FAILS.
... 1st PUT
... 2nd PUT
Touching/Creating a zero file directly in the bucket SUCCEEDS.
# touch /hcp/touchonhcp
... 1st PUT
... 2nd PUT
@ggtakec commented on GitHub (Jul 18, 2016):
@dbbyleo I'm sorry for my late reply.
s3fs's upload logic does not depend on the file size.
It is performed in the following sequence:
From your results, HCP has returned a 409 error in the second PUT request.
But I do not know the reason for this.
Please check whether the HEAD request that follows the first PUT (and precedes the second PUT) succeeds.
We should also find out from HCP why it returns the 409 error.
Thanks in advance for your assistance.
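The PUT → HEAD → PUT sequence visible in the debug output above can be modeled with a toy WORM-style store. This is a simplified sketch, not s3fs code; the class and function names are illustrative, and the assumption is that the backend rejects overwrites of existing keys with 409 unless versioning is enabled.

```python
class WormStore:
    """Toy model of a WORM-style backend (assumption: 409 on overwrite)."""
    def __init__(self, versioning=False):
        self.versioning = versioning
        self.objects = {}

    def put(self, key, data):
        if key in self.objects and not self.versioning:
            return 409  # Conflict: the store refuses to overwrite an existing key
        self.objects[key] = data
        return 200

    def head(self, key):
        return 200 if key in self.objects else 404

def s3fs_like_upload(store, key, data):
    """Mimics the observed request sequence:
    1st PUT (zero-length create), HEAD, 2nd PUT (the actual payload)."""
    statuses = [store.put(key, b"")]       # 1st PUT: create an empty object
    statuses.append(store.head(key))       # HEAD: succeeds, the object exists
    statuses.append(store.put(key, data))  # 2nd PUT: 409 on WORM, 200 with versioning
    return statuses

print(s3fs_like_upload(WormStore(versioning=False), "file.txt", b"hi"))  # [200, 200, 409]
print(s3fs_like_upload(WormStore(versioning=True), "file.txt", b"hi"))   # [200, 200, 200]
```

In this model the HEAD between the two PUTs succeeds (matching what was asked above), and it is only the second PUT that fails.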
@dbbyleo commented on GitHub (Jul 19, 2016):
@ggtakec Thanks for the reply. This is still an ongoing issue and appreciate your help.
I have built a new server and installed s3fs 1.80 on Debian Jessie, but I am still getting the same results.
To answer your question: The second HEAD succeeds.
HTTP 409 error is "Conflict." I assume this means "the file already exists."
I've been able to successfully use s3curl to test REST requests against the HCP. I am able to write files to the HCP via s3curl. But I also found that I can't write a file to the HCP if the file already exists. I'm not an expert on HCP, but it seems similar to the s3fs issue, in that the 2nd PUT fails seemingly because the file already exists.
@dbbyleo commented on GitHub (Jul 20, 2016):
I just realized that HCP (Hitachi Content Platform) is a fixed-content system and therefore a WORM device. It seems like this is why I'm having issues with s3fs - the double PUT is simply not allowed on these types of systems, right? Sorry - I'm new to these storage systems and to s3fs altogether...
@gaul commented on GitHub (Jul 21, 2016):
@ggtakec Do we need to create the initial zero-size object? Eliding this will give a better experience with regard to eventual consistency.
@greg-at-symcor-dot-com commented on GitHub (Apr 27, 2017):
This still seems to be an issue when using s3fs with the HCP. Are there any plans to update s3fs to support WORM-compatible writes (as a mode, perhaps)?
@ggtakec commented on GitHub (May 5, 2017):
@dbbyleo @greg-at-symcor-dot-com I'm sorry for my late reply.
I read the HCP documentation and found the following lines in "Storing an object":
In other words, the HCP specification does not allow overwriting (PUT) existing objects as long as versioning is disabled.
If this is correct, s3fs would hit this error not only when creating a new file object but also when updating one.
Could you enable versioning?
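On S3-compatible APIs, bucket versioning is typically enabled by PUTting a VersioningConfiguration document to the bucket's ?versioning subresource. A minimal sketch of that request body follows; whether HCP accepts this S3-style request, rather than requiring versioning to be enabled through its own management interface, is an assumption to verify against the HCP documentation.

```python
# Body of a PUT /mybucket?versioning request per the S3 API
# (e.g. sent with s3curl); "mybucket" is a placeholder name.
versioning_body = (
    '<VersioningConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    "<Status>Enabled</Status>"
    "</VersioningConfiguration>"
)
print(versioning_body)
```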
@greg-at-symcor-dot-com commented on GitHub (May 7, 2017):
Enabling versioning was the trick.
Once I enabled it, I was able to copy files into the s3fs mount.
Thanks!
@ggtakec commented on GitHub (May 9, 2017):
@greg-at-symcor-dot-com
I am glad to hear that it worked nicely.
I'm closing this issue.
Regards,