[GH-ISSUE #226] 125MB tar.gz file error #126
Originally created by @ogg1e on GitHub (Aug 11, 2015).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/226
If I use smaller files in S3 (text, jar, etc.) up to 20MB in size, they work fine through s3fs-fuse. But if I upload a tar.gz file that is 125MB in size, I get an error
when I try to extract it (tar -xvf). If I get the file out of S3 using s3cmd, there's no problem with it and it works fine, so I know the file isn't corrupt.
Here's my /etc/fstab entry using a proxy:
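(The entry itself was not preserved here; a representative s3fs fstab line, with placeholder bucket, mount point, and credential file, might look like the one below. The proxy is typically picked up from libcurl's http_proxy/https_proxy environment variables rather than from an fstab option.)
# all values below are placeholders, not the reporter's actual entry
s3fs#mybucket /mnt/s3 fuse _netdev,allow_other,passwd_file=/etc/passwd-s3fs,url=https://s3.amazonaws.com 0 0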
@sqlbot commented on GitHub (Aug 11, 2015):
This issue seems to leave some questions unanswered.
Among the small files you report having no trouble with, you don't mention "gz." Is s3fs working correctly for smaller gz files and not large ones, or is it actually all .gz files, with the size not actually relevant? What about large files in other formats?
You shouldn't get that message with tar -xvf. You would need -z in the options to get that message, wouldn't you?
What kind of proxy are you using? If your ".gz" files are stored in S3 with Content-Encoding: gzip... well, that is incorrect, and it's a confusingly common mistake I see people make. Content-Encoding is for transparent encoding that the user agent is supposed to remove, and that's of course not the case with a ".gz" file, which you want to remain gzipped end-to-end. In that case, particularly, the proxy could be stripping the gzip wrapper from the file, leading to the corruption.
It might be useful to mention which proxy you are using, and the Content-Type and Content-Encoding of these problematic S3 objects, as seen in the console. I am guessing you did not originally store them with s3fs, but that might be useful information as well, since you seem to be able to download them with s3cmd.
@ogg1e commented on GitHub (Aug 12, 2015):
It does work with smaller tar.gz files.
That error occurs when I do 'tar -xvf'.
I uploaded the files with s3cmd.
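(One way to check the Content-Type and Content-Encoding that sqlbot asked about is s3cmd's info command; the bucket and key below are placeholders.)
# prints the object's MIME type and metadata; a Content-Encoding of
# gzip on a .tar.gz object would point to the proxy-stripping problem
s3cmd info s3://mybucket/archive.tar.gz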
@ogg1e commented on GitHub (Aug 12, 2015):
Just did some more testing. If I upload the file with s3fs, I can use it from s3fs. But if I upload it with s3cmd, I cannot use it with s3fs. So what's the difference between the two? How do I configure both so they work together? The plan was to upload with s3cmd and read with s3fs.
@ggtakec commented on GitHub (Aug 12, 2015):
s3fs attaches attributes to a file (object) as x-amz-meta-* HTTP headers.
These attributes are used by s3fs as file permissions.
s3cmd does not set these attributes.
Because of this difference, s3fs cannot access a file (object) that was uploaded by s3cmd.
This behavior is the same as in a normal file system: the permissions of the directory containing the file are checked as well.
You can see this with the ls command.
Regards,
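(As a quick check from the mounting host, with a placeholder path, listing the object through the mount shows the permissions s3fs reports when the metadata headers are missing.)
# without x-amz-meta-* headers, s3fs has no stored uid/gid/mode for the
# object, so the listed ownership and permissions will not match what
# you expect, and access through the mount can be denied
ls -l /mnt/s3/archive.tar.gz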
@ogg1e commented on GitHub (Aug 12, 2015):
Can I manually add these headers with s3cmd?
@ggtakec commented on GitHub (Aug 12, 2015):
You can set the following HTTP headers on the file with s3cmd:
x-amz-meta-gid
x-amz-meta-uid
x-amz-meta-mode
x-amz-meta-mtime
Or, if your objects do not have any of these headers, you can first run s3fs with the uid/gid options (set to match your account).
Then touch the files, or perform any action (chmod/chown/chgrp/etc.) on them, and s3fs will add those HTTP headers.
After that, remount without the uid/gid options and look at the files; you will see normal permissions.
Regards,
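(A sketch of setting these headers at upload time with s3cmd's --add-header option. The names, uid/gid values, and the mode/mtime encodings below are assumptions; 33188 is the decimal form of a regular file with 0644 permissions.)
# placeholder names and ids; --add-header attaches the x-amz-meta-*
# attributes that s3fs reads back as ownership, mode, and mtime
s3cmd put \
  --add-header=x-amz-meta-uid:1000 \
  --add-header=x-amz-meta-gid:1000 \
  --add-header=x-amz-meta-mode:33188 \
  --add-header=x-amz-meta-mtime:1439337600 \
  archive.tar.gz s3://mybucket/archive.tar.gz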
@gaul commented on GitHub (Aug 21, 2015):
@ogg1e You could also try passing -o umask=0022 to s3fs.
@ogg1e commented on GitHub (Aug 21, 2015):
I just did some more testing, and even when I upload it with s3fs, I can't use it on another server with s3fs mounted to the same bucket:
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
@ggtakec commented on GitHub (Sep 13, 2015):
If you can, please copy the problem file to another directory from server2,
and compare the original and that copy as binary.
If the results are not the same, something is failing in sending/receiving through s3fs.
If both files are the same, the gzip decompression is failing when reading through s3fs.
I think we should identify the source of the problem first.
Regards,
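(A minimal version of this check, with placeholder paths: copy the file out through the s3fs mount, fetch the same object directly with s3cmd, and compare the two byte-for-byte.)
# copy through the s3fs mount on server2
cp /mnt/s3/archive.tar.gz /tmp/via-s3fs.tar.gz
# fetch the same object directly with s3cmd
s3cmd get s3://mybucket/archive.tar.gz /tmp/via-s3cmd.tar.gz
# cmp exits non-zero at the first differing byte
cmp /tmp/via-s3fs.tar.gz /tmp/via-s3cmd.tar.gz && echo identical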
@gaul commented on GitHub (Jan 24, 2019):
@ogg1e Could you retest against master? It includes a number of fixes on the write and error paths which might address this symptom.
@gaul commented on GitHub (Apr 9, 2019):
Closing due to inactivity. Please reopen if symptoms persist.