[GH-ISSUE #1547] Input/output bug big files #816

Closed
opened 2026-03-04 01:49:02 +03:00 by kerem · 6 comments
Owner

Originally created by @DasMagischeToastbrot on GitHub (Feb 4, 2021).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1547

Additional Information

The following information is very important in helping us help you. Omitting these details may delay your support request, or it may receive no attention at all.
Keep in mind that the commands we provide to retrieve information are oriented toward GNU/Linux distributions, so you may need to use others if you run s3fs on macOS or BSD.

Version of s3fs being used (s3fs --version)

Version: 1.88

Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse, dpkg -s fuse)

Version: 2.9.4

Kernel information (uname -r)

Linux

GNU/Linux Distribution, if applicable (cat /etc/os-release)

NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"

/etc/fstab entry, if applicable

```
s3fs#$PATH /mnt/home/ fuse rw,_netdev,allow_other,endpoint=eu-central-1,iam_role=auto,use_cache=/tmp/s3fs-cache/,ensure_diskfree=5000,uid=myuid,gid=sshusers,umask=002 0 0
```

s3fs syslog messages (grep s3fs /var/log/syslog, journalctl | grep s3fs, or s3fs outputs)

If you execute s3fs with the dbglevel and curldbg options, you can get detailed debug messages.

Details about issue

The following script throws an error for files > 5 GB. Up to version 1.87 everything worked fine. The error only occurs when moving a file from local to s3; loading data from s3 to a local disk works fine.

```
#!/bin/bash
# create a ~10 GB file
FILENAME="$(date +%s)_my_10g_file.txt"
echo "This run uses filename=$FILENAME"
dd if=/dev/urandom of="$HOME/$FILENAME" bs=1M count=10000
DATASETPATH="$HOME/"
echo 'this is my awesome test file' >> "$HOME/$FILENAME"
# loop for 14000 seconds (~4 hours)
START=$(date +%s)
while [ $(( $(date +%s) - 14000 )) -lt "$START" ]; do

   echo "checking md5_local hash..."
   md5_local=$(md5sum "$HOME/$FILENAME" | cut -d ' ' -f 1)
   echo "md5_local=$md5_local"

   echo "moving file local->s3"
   mv "$HOME/$FILENAME" "$DATASETPATH/$FILENAME"

   echo "checking md5_s3 hash..."
   md5_s3=$(md5sum "$DATASETPATH/$FILENAME" | cut -d ' ' -f 1)
   echo "md5_s3=$md5_s3"

   if [[ "$md5_local" == "$md5_s3" ]]
   then
      echo "The checksums match after sync to s3"
   else
      echo "Error!" 1>&2
      exit 64
   fi

   echo "moving file s3->local"
   mv "$DATASETPATH/$FILENAME" "$HOME/$FILENAME"
   #eval md5sum $FILENAME > $HOME/local

   echo "checking md5_local hash..."
   md5_local=$(md5sum "$HOME/$FILENAME" | cut -d ' ' -f 1)
   echo "md5_local=$md5_local"

   if [[ "$md5_local" == "$md5_s3" ]]
   then
      echo "The checksums match after sync to local"
   else
      echo "Error!" 1>&2
      exit 64
   fi

done
echo "Congratulations, everything worked"
```
kerem closed this issue 2026-03-04 01:49:02 +03:00

@gaul commented on GitHub (Feb 4, 2021):

Could you further characterize the behavior between 1.87, 1.88, and master? Which specific error is raised? It also appears that you are using multi-part objects (the default and preferred) and not single-part objects -- is this correct?


@DasMagischeToastbrot commented on GitHub (Feb 5, 2021):

In version 1.87 it worked, so it didn't throw any error. If I build from 1.88, or from the latest commit on top of 1.88, I get the error output below. And yes, you are right: multipart objects are not disabled.

```
This run uses filename=1612509376_my_10g_file.txt
dd: error writing ‘/home/itsme/1612509376_my_10g_file.txt’: No space left on device
9419+0 records in
9418+0 records out
9875488768 bytes (9.9 GB) copied, 176.008 s, 56.1 MB/s
checking md5_local hash...
md5_local=dfa84433a44dfb967366c70ba657d5f4
moving file local->s3
mv: failed to close ‘/home/itsme/mypath/1612509376_my_10g_file.txt’: Input/output error
checking md5_s3 hash...
md5sum: /home/itsme/mypath/1612509376_my_10g_file.txt: No space left on device
md5_s3=
Error!
```


@zpeschke commented on GitHub (Feb 5, 2021):

I can confirm that I run into the same issue with 1.88 on a CentOS 7 VM. Downgrading to 1.87 resolves it.

When attempting to upload a 20 GB file, about 10 GB consistently transfers before I receive an "Input/output error". I grabbed some debug logs.

Most of the multiparts look to be uploaded fine:

```
[INF]       curl.cpp:RequestPerform(2267): HTTP response code 200
[INF]       curl.cpp:insertV4Headers(2598): computing signature [PUT] [/path/to/largefile.tar] [partNumber=512&uploadId=qy96g2q4Lu2hHxmY2C5hG3zV5mRbYTPdNiaimPXZYrFk3Frslaa30P89xaYwL83gV8l664uTeyyTgVDle8ljT2SAY6jYsO4opjRSaJe4gNY-] [90c82d54582f7d0005634f0b21944c22637a86f39572cec4dc06c930538504a0]
[INF]       curl_util.cpp:url_to_host(327): url is https://s3.amazonaws.com
```

However, at the end of the transfer, s3fs attempts to upload (re-upload?) part number 1, which results in an HTTP 400. The error message says I am exceeding the 5 GB copy-source limit:

`The specified copy source is larger than the maximum allowable size for a copy source: 5368709120`

```
[ERR] curl.cpp:RequestPerform(2282): HTTP response code 400, returning EIO. Body Text: <Error><Code>InvalidRequest</Code><Message>The specified copy source is larger than the maximum allowable size for a copy source: 5368709120</Message><RequestId>463F797B1DF57583</RequestId><HostId>...redacted...</HostId></Error>
[WAN] curl_multi.cpp:MultiPerform(171): thread failed - rc(-5)
[WAN] curl_multi.cpp:MultiRead(204): failed a request(400: https://bucket.s3.amazonaws.com/path/to/largefile.tar?partNumber=1&uploadId=qy96g2q4Lu2hHxmY2C5hG3zV5mRbYTPdNiaimPXZYrFk3Frslaa30P89xaYwL83gV8l664uTeyyTgVDle8ljT2SAY6jYsO4opjRSaJe4gNY-)
```

My guess is that, at the end of the transfer, s3fs attempts to upload the entire file as a single part.
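That guess is consistent with the 400 body above: 5368709120 bytes is exactly 5 GiB, the documented maximum size of a single UploadPartCopy source range in S3. A minimal shell sketch of the arithmetic (numbers only, no network calls; the 20 GB file size is the one from this comment):

```shell
#!/bin/sh
# Illustration only: why one copy part covering the whole 20 GB file is
# rejected, and how many <=5 GiB copy parts a flush would need instead.
COPY_SOURCE_LIMIT=5368709120            # 5 GiB, from the error body above
FILE_SIZE=$((20 * 1024 * 1024 * 1024))  # the 20 GB file in this report

if [ "$FILE_SIZE" -gt "$COPY_SOURCE_LIMIT" ]; then
    echo "a single copy part of $FILE_SIZE bytes would be rejected"
fi

# Ceiling division: minimum number of copy parts that stay within the limit.
PARTS=$(( (FILE_SIZE + COPY_SOURCE_LIMIT - 1) / COPY_SOURCE_LIMIT ))
echo "need at least $PARTS copy parts"
```

For a 20 GiB file this reports at least 4 copy parts; any single range above the 5 GiB limit gets the same 400 seen in the logs.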


@gaul commented on GitHub (Feb 6, 2021):

Could you try setting `-o max_dirty_data` to a larger value, e.g., 102400 (100 GB), as a workaround? Perhaps the periodic dirty flushing has a bug in the copy part logic.
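As a sketch, the suggested workaround on a command-line mount would look like this (the bucket name and mount point here are placeholders, not from this report; per the comment above, max_dirty_data is given in MB, so 102400 ≈ 100 GB):

```shell
# Hypothetical mount command; "mybucket" and /mnt/home are placeholders.
# max_dirty_data=102400 (MB) delays dirty-data flushing until ~100 GB.
s3fs mybucket /mnt/home -o iam_role=auto,endpoint=eu-central-1,max_dirty_data=102400
```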


@gaul commented on GitHub (Feb 7, 2021):

Please test the referenced PR. In the meantime, `-o nomixupload` is a more complete workaround. Thanks for reporting this!
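For reference, a sketch of how that workaround would look in the fstab entry from this report, assuming nomixupload is appended like any other s3fs mount option (all other fields unchanged from the original entry):

```
s3fs#$PATH /mnt/home/ fuse rw,_netdev,allow_other,endpoint=eu-central-1,iam_role=auto,use_cache=/tmp/s3fs-cache/,ensure_diskfree=5000,uid=myuid,gid=sshusers,umask=002,nomixupload 0 0
```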


@DasMagischeToastbrot commented on GitHub (Feb 9, 2021):

This fixed my problem
