mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 21:35:58 +03:00
[GH-ISSUE #1339] Possible to query the filesystem meta cache #718
Originally created by @oucil on GitHub (Jul 24, 2020).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1339
Version of s3fs being used (s3fs --version)
1.85
Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse, dpkg -s fuse)
2.9.7
Kernel information (uname -r)
4.18.0-147.8.1.el8_1.x86_64
GNU/Linux Distribution, if applicable (cat /etc/os-release)
NAME="CentOS Linux"
VERSION="8 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
Details about issue
Not so much an issue as a question/request... if I understand correctly, file metadata is cached on the local filesystem to reduce the number of requests sent to the S3 volume and to speed up responses. One shortcoming of S3 at the moment is the difficulty of getting the total current storage used. Examples include running du through s4cmd or similar, writing recursive scripts, etc. In our use case we're OK with eventually consistent values, so is it possible to directly query the s3fs/fuse meta cache to get a total storage value for the remote filesystem?
Thanks,
Kevin.
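As an aside, the two approaches the report mentions can be sketched as shell commands. This is only an illustration of the usual workarounds, not part of the original thread; /mnt/my-bucket and my-bucket are placeholder names.

```shell
# a) du over the s3fs mount: walks every file, so on a cold cache s3fs
#    may issue one metadata request per object (slow on large buckets)
du -sh /mnt/my-bucket

# b) s4cmd's du subcommand against the bucket directly, bypassing the mount
s4cmd du s3://my-bucket
```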
@gaul commented on GitHub (Jul 26, 2020):
s3fs has a few different kinds of caches. The stat cache caches object metadata which corresponds to the file metadata. By default, s3fs caches 100,000 entries without expiry (although this might change in #1341). You can change the size and duration of the cache via various flags. Does this help?
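For reference, the size and duration flags mentioned above are mount options. A sketch of tuning them at mount time, with placeholder bucket and mountpoint names:

```shell
# Sketch: tune the stat cache at mount time.
# max_stat_cache_size: number of cached metadata entries (default 100,000)
# stat_cache_expire: lifetime of a cache entry in seconds
s3fs my-bucket /mnt/my-bucket \
    -o max_stat_cache_size=200000 \
    -o stat_cache_expire=900
```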
@oucil commented on GitHub (Jul 30, 2020):
Thanks @gaul, not exactly. It was more a question of whether the cache in some way maintains a running total of the file sizes of all objects on the remote volume; if it's limited to 100k objects, then anything beyond that wouldn't be counted. Funnily enough, our provider recently added the ability to see our bucket sizes via API (so far unpublished), so I have the solution I needed. Really appreciate your help though, thank you!
@gaul commented on GitHub (Aug 1, 2020):
The S3 API does not provide any kind of summarization of the total number of objects or their cumulative size. The Swift API does, however. The best s3fs could do is run du or similar, which is expensive because it issues a HeadObject request once per object.
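One way around the per-object HeadObject cost: a paged listing already returns each object's size, so a script can total a bucket from the listing alone. A minimal sketch, assuming the aws CLI is installed and my-bucket is a placeholder name:

```shell
# Sum the size column (3rd field) of a recursive listing; one paged
# ListObjects call per ~1000 keys instead of one HeadObject per object.
# Assumptions: aws CLI installed, my-bucket is a placeholder bucket name.
aws s3 ls s3://my-bucket --recursive | awk '{sum += $3} END {print sum+0}'
```

The awk summation is the only logic here; it works on any listing whose third column is a byte count.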