[GH-ISSUE #841] Intelligent Cache Invalidation? #489

Open
opened 2026-03-04 01:46:03 +03:00 by kerem · 1 comment
Owner

Originally created by @jae-63 on GitHub (Oct 15, 2018).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/841

Additional Information

The following information is very important in order to help us help you. Omitting these details may delay your support request, or it may receive no attention at all.
Keep in mind that the commands we provide to retrieve information are oriented to GNU/Linux distributions, so you may need to use different ones if you use s3fs on macOS or BSD.

Version of s3fs being used (s3fs --version)

1.84

Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse, dpkg -s fuse)

Kernel information (uname -r)

4.4.0-1069-aws

GNU/Linux Distribution, if applicable (cat /etc/os-release)

Ubuntu 16.04.4 LTS

Details about issue

This is a design inquiry.

We use S3 as a filesystem from Windows workstations, which I understand aren't currently supported in s3fs-fuse. But we can use a Samba+s3fs-fuse solution for now.

We're experiencing terribly slow retrieval speeds in GovCloud: for example, "aws s3 ls" typically takes 16 seconds on a bucket where it would take 1 second outside GovCloud. So we want to keep cached data for as long as possible and need a solution that invalidates the cache selectively.

We can write an AWS Lambda function that is triggered on PUTs to our S3 bucket. We'd like to parse the URL of the object that was PUT and then instruct s3fs-fuse to invalidate only the affected portions of the cache. In all other circumstances we would trust the cache completely, i.e. the cache would have a very long life.
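As a rough illustration of the Lambda side of this idea, here is a minimal Python sketch that pulls the affected object keys out of an S3 event notification. The function name `keys_to_invalidate` and the returned `{"invalidate": ...}` shape are made up for this example, and how the keys would then reach the s3fs host is deliberately left open, since that is the part this issue is asking about.

```python
from urllib.parse import unquote_plus


def keys_to_invalidate(event):
    """Return the object keys named by ObjectCreated records in an S3 event."""
    keys = []
    for record in event.get("Records", []):
        # Only react to object creation (PUT, POST, copy, multipart upload).
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        # Keys arrive URL-encoded in the notification, with "+" for spaces.
        keys.append(unquote_plus(record["s3"]["object"]["key"]))
    return keys


def lambda_handler(event, context):
    keys = keys_to_invalidate(event)
    # Hypothetical next step: forward `keys` to the host running s3fs.
    return {"invalidate": keys}
```

Decoding the key matters here: a path containing spaces or parentheses would otherwise not match the path that s3fs sees.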

Can you direct me to the portions of the s3fs-fuse source code that would be appropriate to modify to support this functionality? I'd imagine using a SIGHUP signal along with a file at a predefined path containing the information gleaned from the above-mentioned PUT.
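To make the SIGHUP idea concrete, here is a small Python model of the mechanism. It is not s3fs code (s3fs is written in C++), and nothing in this thread says s3fs supports it: the file path, the `cache` dictionary, and all function names are assumptions for illustration. The signal handler only sets a flag, and the file of paths is read later, outside the handler.

```python
import signal

# Hypothetical location of the file listing paths to invalidate.
INVALIDATION_FILE = "/var/run/s3fs-invalidate.list"

pending_reload = False


def on_sighup(signum, frame):
    # Only set a flag here; do the file I/O outside the signal handler.
    global pending_reload
    pending_reload = True


def install_handler():
    # Unix only: SIGHUP does not exist on Windows.
    signal.signal(signal.SIGHUP, on_sighup)


def apply_invalidations(cache, list_path):
    """Drop every cache entry equal to, or located under, a listed path."""
    with open(list_path) as f:
        targets = [line.strip() for line in f if line.strip()]
    dropped = []
    for key in list(cache):
        for target in targets:
            prefix = target.rstrip("/") + "/"
            if key == target or key.startswith(prefix):
                del cache[key]
                dropped.append(key)
                break
    return dropped
```

A main loop would call `apply_invalidations(cache, INVALIDATION_FILE)` whenever `pending_reload` is set and then clear the flag. Matching on a trailing-slash prefix keeps `/a` from also evicting an unrelated entry such as `/ab.txt`.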

I'd think/imagine that others would find this useful as well.

Author
Owner

@gaul commented on GitHub (Oct 19, 2018):

You could add a simple HTTP server to s3fs that allows querying state, such as the number of operations completed, via GET, and taking actions, such as invalidating the cache, via POST. This would be generally useful for debugging, and it is something we could incorporate as long as it defaults to off.
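The shape of such a control endpoint might look roughly like the following Python sketch. It only models the idea: the real change would live in the s3fs C++ code, and the `/stats` and `/invalidate` routes, the `STATS` and `CACHE` stand-ins, and the newline-separated request body are all assumptions made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Stand-ins for the filesystem's internal state (hypothetical).
STATS = {"operations_completed": 0}
CACHE = {}


class ControlHandler(BaseHTTPRequestHandler):
    def _reply(self, status, payload):
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # GET queries state.
        if self.path == "/stats":
            self._reply(200, STATS)
        else:
            self._reply(404, {"error": "not found"})

    def do_POST(self):
        # POST takes an action: body is one cache path per line.
        if self.path != "/invalidate":
            self._reply(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        paths = self.rfile.read(length).decode().splitlines()
        dropped = []
        for path in paths:
            if path in CACHE:
                del CACHE[path]
                dropped.append(path)
        self._reply(200, {"invalidated": dropped})

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def make_server(port=0):
    # Loopback only; the caller decides whether to start it at all,
    # which is how "defaults to off" is modelled here.
    return ThreadingHTTPServer(("127.0.0.1", port), ControlHandler)
```

With a server started via `make_server(8080).serve_forever()`, a request such as `curl -X POST --data-binary $'/a.txt\n' http://127.0.0.1:8080/invalidate` would evict that one entry and leave the rest of the cache alone.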

BTW someone was working on porting s3fs to Windows in billziss-gh/winfsp#143 but we have not heard from them in a while.
