mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 13:26:00 +03:00
[GH-ISSUE #1547] Input/output bug big files #816
Originally created by @DasMagischeToastbrot on GitHub (Feb 4, 2021).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1547
Additional Information
The following information is very important in helping us help you. Omitting these details may delay your support request or leave it without attention.
Keep in mind that the commands we provide to retrieve information are oriented toward GNU/Linux distributions, so you may need to use different commands if you run s3fs on macOS or BSD.
Version of s3fs being used (s3fs --version)
Version: 1.88
Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse, dpkg -s fuse)
Version: 2.9.4
Kernel information (uname -r)
Linux
GNU/Linux Distribution, if applicable (cat /etc/os-release)
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
/etc/fstab entry, if applicable
s3fs syslog messages (grep s3fs /var/log/syslog, journalctl | grep s3fs, or s3fs outputs)
If you execute s3fs with the dbglevel and curldbg options, you can get detailed debug messages.
Details about issue
Copying files larger than 5 GB throws an error. Up to version 1.87 everything worked fine. The error only occurs when copying from local disk to S3; downloading from S3 to a local disk works fine.
@gaul commented on GitHub (Feb 4, 2021):
Could you further characterize the behavior between 1.87, 1.88, and master? Which specific error is raised? It also appears that you are using multi-part objects (the default and preferred) and not single-part objects -- is this correct?
@DasMagischeToastbrot commented on GitHub (Feb 5, 2021):
In version 1.87 it worked; it didn't throw any error. If I build from 1.88 or from the latest commit on top of 1.88, I get the error output below. Yes, you are right: multipart objects are not disabled.
This run uses filename=1612509376_my_10g_file.txt
dd: error writing ‘/home/itsme/1612509376_my_10g_file.txt’: No space left on device
9419+0 records in
9418+0 records out
9875488768 bytes (9.9 GB) copied, 176.008 s, 56.1 MB/s
checking md5_local hash...
md5_local=dfa84433a44dfb967366c70ba657d5f4
moving file local->s3
mv: failed to close ‘/home/itsme/mypath/1612509376_my_10g_file.txt’: Input/output error
checking md5_s3 hash...
md5sum: /home/itsme/mypath/1612509376_my_10g_file.txt: No space left on device
md5_s3=
Error!
@zpeschke commented on GitHub (Feb 5, 2021):
I can confirm that I run into this same issue on 1.88 on a CentOS 7 VM. Downgrading to 1.87 resolves the issue.
When attempting to upload a 20GB file, 10GBs are consistently transferring before receiving an "Input/output error". I grabbed some debug logs.
Most of the multiparts look to be uploaded fine:
However, at the end of the transfer, s3fs attempts to upload (reupload?) part number 1 which results in an HTTP 400. I get an error message that I'm exceeding the 5GB upload limit.
The specified copy source is larger than the maximum allowable size for a copy source: 5368709120
My guess is that the entire file is attempted to be uploaded in a single multipart at the end of the transfer.
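For context, the size in that error message is exactly S3's 5 GiB cap on an UploadPartCopy source. A quick check (the ~9.9 GB figure is taken from the dd output above) shows why a single copy part covering the whole file would be rejected:

```shell
# S3 rejects UploadPartCopy requests whose copy source exceeds 5 GiB.
limit=$((5 * 1024 * 1024 * 1024))
echo "copy-source limit: $limit bytes"   # prints 5368709120, matching the error

# Size of the file from the dd output above (~9.9 GB)
filesize=9875488768
if [ "$filesize" -gt "$limit" ]; then
  echo "a single copy part of $filesize bytes would be rejected with HTTP 400"
fi
```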
@gaul commented on GitHub (Feb 6, 2021):
Could you try setting -o max_dirty_data to a larger value, e.g., 102400 (100 GB), as a workaround? Perhaps the periodic dirty flushing has a bug in the copy part logic.
@gaul commented on GitHub (Feb 7, 2021):
Please test the referenced PR. In the meantime, -o nomixupload is a more complete workaround. Thanks for reporting this!
@DasMagischeToastbrot commented on GitHub (Feb 9, 2021):
This fixed my problem
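For reference, the two workarounds discussed in the thread would be passed at mount time roughly as follows. This is a sketch: the bucket name and mount point are placeholders, and max_dirty_data is specified in MB (102400 ≈ 100 GB, per the suggestion above):

```shell
# Workaround 1: raise the dirty-data flush threshold so the periodic
# flush (and its copy-part path) is not triggered mid-upload.
s3fs mybucket /mnt/s3 -o max_dirty_data=102400

# Workaround 2: disable mixed uploads entirely, avoiding UploadPartCopy.
s3fs mybucket /mnt/s3 -o nomixupload
```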