mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 21:35:58 +03:00
[GH-ISSUE #533] Downloaded file getting stored in tmp folder #301
Originally created by @tgmedia-nz on GitHub (Feb 13, 2017).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/533
Additional Information
Version of s3fs being used (s3fs --version)
1.79+git90-g8f11507-2
Version of fuse being used (pkg-config --modversion fuse)
2.9.4-1ubuntu3.1
System information (uname -a)
Linux ch-test 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Distro (cat /etc/issue)
Ubuntu 16.04.1 LTS
s3fs command line used (if applicable)
(not provided; running s3fs with the dbglevel and curldbg options produces detailed debug messages)
Details about issue
sample disk environment:
/ (root) 8GB (ssd)
/data/ 200GB (ssd)
If you download a big file (e.g. 21GB) like so:
cp /backup/testfile.zip /data/testfile.zip
You would expect the file to be stored at /data/testfile.zip; however, watching the process shows the data being placed into a temporary file such as /tmp/tmpfqiprfQ.
This obviously starts to fill up the root partition once the downloaded file is bigger than what is free on that partition.
Any ideas?
@ghost commented on GitHub (Feb 13, 2017):
in fdcache.cpp,
// For cache directory top path
//
#if defined(P_tmpdir)
#define TMPFILE_DIR_0PATH P_tmpdir
#else
#define TMPFILE_DIR_0PATH "/tmp"
#endif
Maybe this is related to your problem.
@tgmedia-nz commented on GitHub (Feb 16, 2017):
Even if we set the TMPFILE_DIR_0PATH, that would mean we'd need double the space in the end. S3cmd doesn't require this for some reason and pipes the data directly into the file or does it in chunks.
Ideas?
@ggtakec commented on GitHub (Mar 26, 2017):
@tgmedia-nz
By default, s3fs creates a temporary file when downloading, then copies it to the destination. This can exhaust disk space, as you observed.
There is an option "ensure_diskfree" to avoid using twice as much disk space.
Please try using the latest code of the master branch.
Thanks in advance for your assistance.
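For reference, ensure_diskfree takes a value in MB; a hypothetical mount line (the bucket name and mount point are placeholders):

```shell
# Keep at least 20 GB free on the partition holding s3fs temp files.
# The ensure_diskfree value is in MB: 20 * 1024 = 20480.
s3fs mybucket /data -o ensure_diskfree=20480
```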
@ggtakec commented on GitHub (Mar 30, 2019):
This issue has been open for a long time.
Does the problem still occur?
We have released version 1.86, which fixes several bugs.
Please use the latest version.
I will close this, but if the problem persists, please reopen or post a new issue.
@jf commented on GitHub (Apr 9, 2019):
I am getting the same problem: both with 1.79+git90-g8f11507-2 (https://packages.ubuntu.com/xenial/s3fs), as well as with the latest 1.85 release (downloaded from https://github.com/s3fs-fuse/s3fs-fuse/releases).
In both cases, copying a file from a locally mounted disk to S3 (I haven't tried the other direction, but I don't care about it) always results in the source file being copied to a temporary file (taking up disk space unnecessarily?) of the form /tmp/tmpfnS9vb while the file is being "copied" to S3. I've tried using -o use_cache="", as well as omitting the use_cache option, but it makes no difference.
I'm thinking this probably has to do with either a FUSE implementation issue, or with s3fs not detecting that the source is a local file and not using the appropriate system calls to open and read it. I'm not familiar enough with the code, nor with C++ (my C++ is rudimentary), to figure it out. @ggtakec, would you be able to shed any light on the matter?
@ggtakec commented on GitHub (Apr 9, 2019):
@jf
s3fs creates temporary files when uploading files.
If the use_cache option is specified, the file is placed under that directory.
If it is not specified, the file is created in /tmp or similar.
This is s3fs behavior, not a FUSE issue.
If disk space is a problem, please try the ensure_diskfree option.
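One way to act on this advice, sketched with placeholder paths from the original report (the bucket name and cache directory are examples), is to point the cache at the larger /data partition instead of the small root partition:

```shell
# Hypothetical mount: keep s3fs temporary/cache files on the 200 GB
# /data partition rather than /tmp, and keep 10 GB (10240 MB) free there.
mkdir -p /data/s3fs-cache
s3fs mybucket /mnt/s3 -o use_cache=/data/s3fs-cache -o ensure_diskfree=10240
```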
@jf commented on GitHub (Apr 9, 2019):
Thanks, @ggtakec . Is there any reason why a copy must be made? Is there any way to disable it? In my use case, "cp /local/file /mnt/s3fs", creating another separate copy of the same file in /tmp is redundant, and takes up extra space.
I will look at the ensure_diskfree option in the meantime, but just want to ask about this.
For ensure_diskfree, I guess if I want to disable any copying, I would set ensure_diskfree to some super-large value, larger than my disk space?
@ggtakec commented on GitHub (Apr 16, 2019):
@jf
s3fs receives the contents of the file written through FUSE.
(It is written in block-sized chunks.)
If you do not specify the use_cache directory, s3fs still needs disk space to temporarily store the data written in those blocks.
The ensure_diskfree option only sets an upper limit that prevents s3fs from using more local disk than necessary.
And if the difference between the actual free disk space and the ensure_diskfree size is less than one part size of a multipart upload, s3fs will fail to upload.
It is unavoidable that s3fs temporarily stores write data. (Writing to memory only is not supported.)
However, the temporary storage can be limited to roughly the size of one multipart-upload part (it is better to make it larger) by not using use_cache and adjusting the ensure_diskfree size.
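As a back-of-the-envelope sketch of that last point (all numbers are made up): if the temp partition has A MB free and multipart_size is P MB, setting ensure_diskfree near A - P caps s3fs's temporary usage at roughly one upload part:

```shell
avail_mb=8000    # free space on the temp partition, in MB (example value)
part_mb=64       # multipart_size, in MB (example value)

# Largest ensure_diskfree that still leaves room for one upload part:
ensure_mb=$((avail_mb - part_mb))
echo "$ensure_mb"    # prints 7936
```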
@bstromski commented on GitHub (Jun 27, 2019):
@ggtakec since filesystem sizes vary, ensure_diskfree would have to be tuned per system running s3fs.
Wouldn't it be easier to have a cache_limit option that prevents the default (/tmp) or the cachedir from exceeding that value?