[GH-ISSUE #2644] Large files cannot be copied using rsync or cp command under Debian Testing (trixie) s3fs (1.93-1+b1) to idrive.com object storage #1262
Originally created by @pallebone on GitHub (Feb 19, 2025).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2644
Version of s3fs being used (s3fs --version): 1.93-1+b1
Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse or dpkg -s fuse): 3.14.0-10
Kernel information (uname -r): 6.12.12-amd64
GNU/Linux Distribution, if applicable (cat /etc/os-release): Debian testing (trixie)
How to run s3fs, if applicable:
s3fs retention /media/Retention -o passwd_file=/etc/passwd-archive -o nonempty -o url=https://w9v5.va21.idrivee2-1.com
Details about issue
I have found a workaround for the issue, as I was able to determine the cause (the file is too large).
When copying a file using the rsync or cp commands, the server hangs: rsync with --progress stalls and copies no further bytes (it must be SIGKILLed).
The cp command is similar; it simply stalls and stops copying after around 2-3GB.
As I have a postgres DB that I back up with tar.gz, resulting in an 18GB archive, I cannot copy this file over to object storage, and the backup scripts for offsite backup fail as a result.
The current resolution I have found is to use the split command, e.g.:
split -b 384M, piping stdout from tar to split's stdin via |, which creates many smaller tar files in place of what was originally a single tar file (a sketch is shown below).
Using rsync or cp in scripts to then copy these smaller files gives no error or hang at all and works perfectly consistently, so there is some strange issue once files become large.
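A minimal sketch of that workaround, assuming a gzipped tar stream and the 384M chunk size mentioned above (the source path, mount point, and file names are illustrative, not from the original report):
    # Stream the archive straight into split so no single large object is ever written
    tar -czf - /var/lib/postgresql/backups \
      | split -b 384M - /media/Retention/db-backup.tar.gz.part-
    # To restore, concatenate the chunks back into one stream
    cat /media/Retention/db-backup.tar.gz.part-* | tar -xzf -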
How can I troubleshoot this problem and try to find a resolution? To be clear, the workaround of splitting files into smaller pieces is perfectly adequate, so there is no rush or anything critical here, but it is strange that large files have a problem.
Kind regards
Pete
@gaul commented on GitHub (Feb 20, 2025):
What do the s3fs logs say? Launch s3fs with -f -d. Can you quantify how large is large? Do you have the same problem with the AWS CLI, e.g., does idrive.com support large uploads?
@pallebone commented on GitHub (Feb 21, 2025):
@gaul - Sure I mounted a bucket with this command:
s3fs -f -d publicscreens /media/PubScreens -o passwd_file=/etc/passwd-publicfs -o nonempty -o url=https://w9v5.va21.idrivee2-1.com
Then I touched a file here:
touch test
Then tried rsync a large file here:
rsync -avW --progress --inplace --ignore-existing --size-only --no-compress /TempBackFile/BigFile /media/PubScreens/
At 9%, or around 5GB (5,319,131,136 bytes), it hung with continual errors until I killed it.
https://pastebin.com/BPehqHzd
As an aside, idrive allows free accounts up to 10GB, so if anyone needs to test this directly on their own box it would be quite easy to do so with a free account, as it breaks with files smaller than that.
P
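For anyone reproducing this on a free account, a minimal sketch (bucket name, mount point, credentials file, and the ~6GB test size are assumptions; the file just needs to exceed the ~5GB point where the hang was observed):
    # Create a ~6GB test file
    fallocate -l 6G /tmp/BigTestFile    # or: dd if=/dev/urandom of=/tmp/BigTestFile bs=1M count=6144
    # In one terminal: mount the bucket in foreground debug mode
    s3fs -f -d mybucket /mnt/s3 -o passwd_file=/etc/passwd-s3fs -o url=https://w9v5.va21.idrivee2-1.com
    # In another terminal: copy the file onto the mount and watch for the stall
    rsync -avW --progress /tmp/BigTestFile /mnt/s3/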
@pallebone commented on GitHub (Feb 21, 2025):
I also want to reiterate that since limiting the files to a small size (I selected 384MB arbitrarily) I have had no issues, so the problem only happens with files that get large. This change was easily achieved with the split command. However, there should be no reason large files don't work.
@gaul commented on GitHub (Feb 25, 2025):
Please provide the logs that https://github.com/s3fs-fuse/s3fs-fuse/issues/2644#issuecomment-2670529006 suggested.
@pallebone commented on GitHub (Feb 25, 2025):
Is the pastebin that I provided not the logs?
@gaul
@gaul commented on GitHub (Feb 25, 2025):
Sorry I missed the pastebin link. But this is truncated before the error occurs:
You can see this since [/BigFile] should be followed by [partNumber=.
@pallebone commented on GitHub (Feb 25, 2025):
I'm sorry, I don't understand what you are telling me. I'm just a normal user, so it's hard for me to follow.
The only process I knew to follow was to do the test and, when it hung and stopped working, copy and paste the entire output from the terminal to pastebin. Was I supposed to do something additional?
@juliogonzalez commented on GitHub (Feb 25, 2025):
@pallebone it seems you didn't paste the full output of:
s3fs -f -d publicscreens /media/PubScreens -o passwd_file=/etc/passwd-publicfs -o nonempty -o url=https://w9v5.va21.idrivee2-1.com/
You may want to instead redirect the log (strictly speaking, stderr and stdout) to a file.
For that, add 2>&1 | tee log.txt to the command, for example:
s3fs -f -d publicscreens /media/PubScreens -o passwd_file=/etc/passwd-publicfs -o nonempty -o url=https://w9v5.va21.idrivee2-1.com/ 2>&1 | tee log.txt
That should create a log from the moment you mount the bucket until you run the test with rsync.
Then if the log is not huge, attach it to this issue with a comment.
Or if it is huge, get the last lines in the log, which should contain info from when rsync hangs (tip: use tail to extract them; see the sketch below).
Also, try what @gaul suggested: test sending the same files to idrive using the AWS CLI. It seems it's supported.
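A small sketch of the tail tip above (the log file name and line count are assumptions):
    # Keep only the last few thousand lines of the s3fs debug log,
    # which should still cover the point where rsync hangs
    tail -n 5000 log.txt > log-tail.txt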
@juliogonzalez commented on GitHub (Feb 25, 2025):
BTW, one more thing: you are on 1.93, which is an old release.
It's what Debian provides for now, but you can try to compile and install 1.95. It's quite simple and documented.
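A typical build-from-source sequence for s3fs-fuse on Debian, following the project's compilation documentation (the dependency list and the v1.95 tag name should be double-checked against the repository; treat this as a sketch):
    # Build dependencies (Debian/Ubuntu package names)
    sudo apt-get install automake autotools-dev fuse g++ git libcurl4-gnutls-dev \
      libfuse-dev libssl-dev libxml2-dev make pkg-config
    # Fetch and build the release
    git clone https://github.com/s3fs-fuse/s3fs-fuse.git
    cd s3fs-fuse
    git checkout v1.95
    ./autogen.sh && ./configure && make
    sudo make install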
@pallebone commented on GitHub (Feb 25, 2025):
Ok, I understand. Let me pause this for now; I will look into a newer version and then rerun the logs as you suggest. I didn't know about the AWS CLI, so I will look into this also. It might take me a while to get to this (a few weeks or so), as the server is in production and I have to create a window where I can mess around on it.
Thanks for the advice and direction @juliogonzalez and @gaul :)
@juliogonzalez commented on GitHub (Feb 25, 2025):
Another option could be a separate server, with equivalent files (or even the same files) rsynced to another idrive account, bucket, or folder (I am not sure what the idrive structure is).
Then if you reproduce the issue with 1.93, you can try AWS CLI and building 1.95, getting the logs, etc.
In short: a staging server :-)
And of course, regarding:
"At 9% or around 5GB (5,319,131,136) it hung with continual errors until I killed it."
...what is the size of the file being copied when it hangs? Is that size supported by the idrive service?
@pallebone commented on GitHub (Feb 25, 2025):
@juliogonzalez Yes size is supported (supposedly, according to their documentation). I created a 55GB or so file as a test. It was to a bucket.
I will of course come back to you when I can.
@pallebone commented on GitHub (Feb 25, 2025):
@juliogonzalez Yes, I agree that in a perfect world a staging server would be good, but I have to pay out of my own pocket, and the budget is only $15 a month for the server. It's just a lemmy server (lemmy.myserv.one). Fitting server + object storage + backups + email and so on into this budget is difficult, so it kind of just has to work, and a lot of testing happens in production.
@pallebone commented on GitHub (Feb 26, 2025):
Today I tried using the AWS CLI, which I have never used before, but I can't work out how to do it. It just gives me errors when I try to copy a file, e.g.:
aws s3 cp BigFile32GB.big s3://publicscreens/Big-Object --endpoint-url https://w9v5.va21.idrivee2-1.com/
upload failed: ./BigFile32GB.big to s3://publicscreens/Big-Object An error occurred (MissingContentLength) when calling the UploadPart operation: You must provide the Content-Length HTTP header.
I tried googling this, but I'm not familiar enough with how it's supposed to work (I have never seen a working example) to know what I am doing wrong.
I will try looking into this further but for now I am stuck.
Kind regards
P
@gaul commented on GitHub (Feb 27, 2025):
The S3 protocol is sometimes incompletely implemented. So idrive.com may have some issues that prevent use with AWS CLI (and maybe s3fs too). You might want to share this issue with their support team.
@gaul commented on GitHub (Feb 27, 2025):
This is probably related to the Amazon client changes discussed here: https://news.ycombinator.com/item?id=43118592
tl;dr: just test with s3fs and share the complete logs as I requested.
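If the MissingContentLength error does come from the newer AWS CLI default integrity checksums discussed in that thread (an assumption, not confirmed here), the SDK's checksum settings can be relaxed for non-AWS endpoints; a sketch:
    # In ~/.aws/config: only compute checksums when an operation requires them
    [default]
    request_checksum_calculation = when_required
    response_checksum_validation = when_required
    # Or per invocation via environment variables
    AWS_REQUEST_CHECKSUM_CALCULATION=when_required \
    AWS_RESPONSE_CHECKSUM_VALIDATION=when_required \
    aws s3 cp BigFile32GB.big s3://publicscreens/Big-Object --endpoint-url https://w9v5.va21.idrivee2-1.com/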
@pallebone commented on GitHub (Mar 18, 2025):
Hello, I have now done what you asked.
I first mounted the drive using this command:
s3fs -f -d publicscreens /media/PubScreens -o passwd_file=/etc/passwd-publicfs -o nonempty -o url=https://q8p0.c17.e2-3.dev > /media/SomeFile.log 2>&1
Then I used rsync:
rsync -avW --progress --inplace --ignore-existing --size-only --no-compress /TempBackFile/BigFile32GB.big /media/PubScreens/TestFol/
This hangs at 5,353,078,784 bytes (15%).
Here is a screenshot:
I then cancel rsync, as the server becomes slower and slower and eventually crashes completely if I do not cancel it.
Here is a video showing it hanging (I am switching between terminal windows in the video):
https://we.tl/t-Fipgp7FFLX
The log of SomeFile.log is here:
https://paste.ee/p/sLmTWzRL
Hopefully this works for you?
Kind regards
Peter