[GH-ISSUE #1485] Avoiding disk usage for serial reading #781
Originally created by @AlexeyDmitriev on GitHub (Nov 27, 2020).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1485
Additional Information
Version of s3fs being used (s3fs --version)
Amazon Simple Storage Service File System V1.82(commit:unknown) with GnuTLS(gcrypt)
Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse, dpkg -s fuse)
2.9.7-1ubuntu1
Kernel information (uname -r)
5.3.0-1023-aws
GNU/Linux Distribution, if applicable (cat /etc/os-release)
Ubuntu 18.04
Details about issue
So, one of my use cases with s3fs consists mostly of reading whole files serially in a single pass. For example, I could try running:

zcat /mnt/s3/path/to/log.gz | grep 'smth' | wc

The problem is that s3fs eventually stores the whole file on disk, while if it knew that I am only going to read it once, it could keep (probably in memory) only the current chunk that has not yet been fed to the output.
Can this be improved somehow? For example, in my case I think a limit on the cache size for each file could help.
Or maybe there's already an option that I have missed?
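For what it's worth, a workaround I can use in the meantime, outside of s3fs entirely, is to stream the object to stdout with the AWS CLI so nothing is written to local disk (the bucket name and key below are placeholders):

# Hypothetical bucket/key; `aws s3 cp ... -` writes the object to stdout
# instead of a local file, so no on-disk cache is involved.
aws s3 cp s3://my-bucket/path/to/log.gz - | zcat | grep 'smth' | wc

But this gives up the filesystem interface, which is the whole point of using s3fs. On the s3fs side, the closest options I have found are use_cache and ensure_diskfree; if I read the man page correctly, ensure_diskfree is a global free-space threshold for the cache disk rather than a per-file limit, so a mount like the sketch below (paths and sizes are just examples) would bound total cache usage but not solve the single-file case:

# Hypothetical mount: cache objects under /tmp/s3fs-cache, and try to keep
# at least 1024 MB free on that disk by trimming cache files as needed.
s3fs my-bucket /mnt/s3 -o use_cache=/tmp/s3fs-cache -o ensure_diskfree=1024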