mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 21:35:58 +03:00
[GH-ISSUE #841] Intelligent Cache Invalidation? #489
Originally created by @jae-63 on GitHub (Oct 15, 2018).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/841
Additional Information
The following information is very important in order to help us to help you. Omission of the following details may delay your support request or receive no attention at all.
Keep in mind that the commands we provide to retrieve information are oriented to GNU/Linux distributions, so you may need different commands if you use s3fs on macOS or BSD.
Version of s3fs being used (s3fs --version)
1.84
Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse, dpkg -s fuse)
Kernel information (uname -r)
4.4.0-1069-aws
GNU/Linux Distribution, if applicable (cat /etc/os-release)
Ubuntu 16.04.4 LTS
Details about issue
This is a design inquiry.
We use S3 as a filesystem from Windows workstations, which I understand aren't currently supported in s3fs-fuse. But we can use a Samba+s3fs-fuse solution for now.
We're experiencing terribly slow retrieval speeds in GovCloud, e.g. "aws s3 ls" typically takes 16 seconds on a bucket that would take 1 second outside GovCloud. So we need a solution that invalidates the cache selectively.
We can write an AWS Lambda function which is triggered on PUTs to our S3 bucket. We'd like to parse the URL of each PUT and then instruct s3fs-fuse to invalidate only the affected portions of the cache. Apart from those explicit invalidations, we would trust the cache completely, i.e. the cache would have a very long lifetime.
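For illustration, a minimal Python sketch of the Lambda side of this idea. It assumes the standard S3 event notification layout (`Records[].s3.object.key`, with URL-encoded keys) and maps each newly created object to the mount-relative paths whose cached state would go stale: the object itself and its parent directory listing. The names `paths_to_invalidate` and `lambda_handler` are hypothetical, and how the result reaches s3fs is left open, since s3fs has no such interface today.

```python
import posixpath
from urllib.parse import unquote_plus


def paths_to_invalidate(event):
    """Return the sorted mount-relative paths affected by an S3 event."""
    paths = set()
    for record in event.get("Records", []):
        # Only react to object creation (PUT, POST, multipart upload, copy).
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        # Keys arrive URL-encoded in S3 notifications ("+" means space).
        key = unquote_plus(record["s3"]["object"]["key"])
        path = "/" + key
        paths.add(path)
        # The parent directory listing changes as well.
        paths.add(posixpath.dirname(path))
    return sorted(paths)


def lambda_handler(event, context):
    # Assumption: a separate step would forward this list to the s3fs host.
    return {"invalidate": paths_to_invalidate(event)}
```

A PUT of `reports/2018/my file.txt` would yield `/reports/2018` and `/reports/2018/my file.txt`.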
Can you direct me to which portions of the s3fs-fuse source code would be appropriate to modify to support this functionality? I'd imagine using a SIGHUP signal along with a predefined path to a file containing the information gleaned from the above-mentioned PUT request.
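The SIGHUP idea can be sketched in Python as follows. This is not s3fs code (s3fs is written in C++). It only illustrates the control flow under stated assumptions: the cache is modeled as a plain dict keyed by path, the invalidation file holds one absolute path per line, and the file location `INVALIDATION_FILE` is a made-up example.

```python
import signal

INVALIDATION_FILE = "/var/run/s3fs-invalidate.list"  # hypothetical path


def apply_invalidations(cache, list_path):
    """Drop cache entries named in list_path, plus anything beneath them."""
    with open(list_path, encoding="utf-8") as fh:
        targets = [line.strip() for line in fh if line.strip()]
    removed = []
    for cached in list(cache):  # copy the keys so we can delete while looping
        for target in targets:
            prefix = target.rstrip("/") + "/"
            if cached == target or cached.startswith(prefix):
                del cache[cached]
                removed.append(cached)
                break
    return sorted(removed)


def install_sighup_handler(cache, list_path=INVALIDATION_FILE):
    """Re-read the invalidation file whenever the process receives SIGHUP."""
    def on_sighup(signum, frame):
        apply_invalidations(cache, list_path)

    signal.signal(signal.SIGHUP, on_sighup)
```

One limitation of this design is that a signal carries no payload, so the writer and the reader of the file have to coordinate, for example by writing a temporary file and renaming it into place before sending the signal.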
I'd think/imagine that others would find this useful as well.
@gaul commented on GitHub (Oct 19, 2018):
You could add a simple HTTP server to s3fs which would allow querying state like number of operations completed via GET and taking actions like invalidating cache via POST. This would generally be useful for debugging and something we could incorporate as long as it defaults to off.
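A rough Python sketch of what such a control endpoint could look like. It is only an illustration of the GET/POST split described above, not a proposal for the actual s3fs implementation: the `/stats` and `/invalidate` routes, the JSON shapes, the `ControlState` class, and the `enabled` flag are all assumptions made for this example.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ControlState:
    """Hypothetical shared state: an operation counter and a path-keyed cache."""
    def __init__(self):
        self.lock = threading.Lock()
        self.operations_completed = 0
        self.cache = {}


def make_handler(state):
    class Handler(BaseHTTPRequestHandler):
        def _reply(self, status, payload):
            body = json.dumps(payload).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def do_GET(self):  # query state
            if self.path != "/stats":
                return self._reply(404, {"error": "not found"})
            with state.lock:
                stats = {"operations_completed": state.operations_completed,
                         "cached_entries": len(state.cache)}
            self._reply(200, stats)

        def do_POST(self):  # take an action
            if self.path != "/invalidate":
                return self._reply(404, {"error": "not found"})
            try:
                length = int(self.headers.get("Content-Length", 0))
                paths = json.loads(self.rfile.read(length))["paths"]
                if not isinstance(paths, list):
                    raise ValueError("paths must be a list")
            except (ValueError, KeyError, TypeError):
                return self._reply(400, {"error": "expected a JSON object with a 'paths' list"})
            removed = []
            with state.lock:
                for p in paths:
                    if isinstance(p, str) and p in state.cache:
                        del state.cache[p]
                        removed.append(p)
            self._reply(200, {"invalidated": removed})

        def log_message(self, fmt, *args):
            pass  # keep the sketch quiet

    return Handler


def start_control_server(state, enabled=False, port=0):
    """Start the endpoint on localhost only; it stays off unless enabled."""
    if not enabled:
        return None
    server = ThreadingHTTPServer(("127.0.0.1", port), make_handler(state))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Binding to 127.0.0.1 and defaulting `enabled` to `False` mirror the "defaults to off" requirement. A remote trigger such as a Lambda function would still need an authenticated path to reach the endpoint.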
BTW, someone was working on porting s3fs to Windows in billziss-gh/winfsp#143, but we have not heard from them in a while.