mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 13:26:00 +03:00
[GH-ISSUE #281] Data corruption when copying from s3fs #143
Originally created by @bruceredmon on GitHub (Oct 19, 2015).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/281
Every 3 to 4 days of use (approximately 5k files/day), we experience corruption in the download data stream. Our files are gzipped, so decompression fails. It appears that an internal error message from AWS is being embedded in the download stream at a 1 MB boundary (true of every corrupt file we've inspected so far).
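The reporter notices the corruption because gzip decompression fails. One quick way to sweep for damaged files is gzip's built-in integrity test. A minimal, self-contained sketch, using a temporary directory and a deliberately truncated file to stand in for a corrupted download:

```shell
# Work in a throwaway directory; the files here are synthetic examples.
tmpdir=$(mktemp -d)

# A valid gzip file downloads should look like.
printf 'hello world\n' | gzip > "$tmpdir/good.gz"
gzip -t "$tmpdir/good.gz" && echo "good.gz: OK"

# Simulate a corrupted download: keep only the first 10 bytes.
head -c 10 "$tmpdir/good.gz" > "$tmpdir/corrupt.gz"
gzip -t "$tmpdir/corrupt.gz" 2>/dev/null || echo "corrupt.gz: integrity check failed"

rm -r "$tmpdir"
```

Against a real mount, the same check can be run over a tree with something like `find /mnt/s3 -name '*.gz' -exec gzip -t {} \;` (mountpoint is a placeholder).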
@ggtakec commented on GitHub (Oct 19, 2015):
Please run s3fs with the "-d" and "-f" options (plus "curldbg"), or with "-o dbglevel=xxx" instead of "-d" (and "-o f2"). (The "dbglevel" option was added in the master branch.)
If you can, we would also like to know how this issue can be reproduced.
I hope this helps us solve the issue.
Thanks in advance for your help.
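The suggestion above, written out as full mount commands. The bucket name and mountpoint below are placeholders, and the exact option spelling depends on the s3fs version in use:

```shell
# Placeholders: "mybucket" and /mnt/s3 are hypothetical.
# Released versions at the time: classic debug flags plus curl debugging.
s3fs mybucket /mnt/s3 -f -d -o curldbg

# Master branch: the newer dbglevel option instead of -d.
s3fs mybucket /mnt/s3 -f -o dbglevel=dbg,curldbg
```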
@bruceredmon commented on GitHub (Oct 26, 2015):
Mounting s3fs volumes with -o retries=4,noatime,dbglevel=info,curldbg. Please let me know if this works or if a higher dbglevel is needed.
@ggtakec commented on GitHub (Nov 1, 2015):
@bruceredmon
There is an AWS document about "We encountered an internal error. Please try again.":
https://forums.aws.amazon.com/message.jspa?messageID=215866
s3fs retries a request when it receives a response code of 500 or higher, up to the retry count. Please try setting a larger value for the "retries" option.
I expect this will resolve the issue.
Thanks in advance for your help.
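Written out as a full mount command, combining the options @bruceredmon reported using with a raised retry count (bucket and mountpoint are placeholders):

```shell
# Placeholders: "mybucket" and /mnt/s3 are hypothetical.
# Raise retries well above the default so transient 5xx responses are retried.
s3fs mybucket /mnt/s3 -o retries=12,noatime,dbglevel=info,curldbg
```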
@gaul commented on GitHub (Nov 1, 2015):
@ggtakec After s3fs exhausts its retries it should return EIO to the caller instead of bogus data. I cannot reproduce these symptoms but we need a fix similar to
a1ca8b7124.
@bruceredmon commented on GitHub (Nov 9, 2015):
Increasing retries from 4 to 12 seems to have restored stability for now, but I agree with @andrewgaul that returning an I/O error would be much better than returning corrupt data.
@ghost commented on GitHub (Jun 16, 2016):
I found weird data corruption when copying files down; it turned out to be use_cache. The local file cache was corrupting the data, so I disabled it and that fixed it.
@arpad9 commented on GitHub (Aug 25, 2017):
I'm having the same problem with the local cache and corrupted data though that seems like a separate issue. I'll put together some debugging on it.
@gaul commented on GitHub (Jan 24, 2019):
@bruceredmon master includes several fixes for data corruption; could you test again and share your results?
@gaul commented on GitHub (Apr 9, 2019):
Closing due to inactivity. Please reopen if symptoms persist.