mirror of
https://github.com/s3fs-fuse/s3fs-fuse.git
synced 2026-04-25 13:26:00 +03:00
[GH-ISSUE #891] modifications to files stored on S3 are not coming through to linux instance / being served by Apache #521
Originally created by @must-defend-500 on GitHub (Jan 13, 2019).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/891
Apologies, I don't know my version number; I'm not sure it matters. I have a simple problem: I recently had to modify an HTML file that is stored on S3.
However, the change is not reflected in what Apache is serving.
I have my S3 bucket mounted on an EC2 instance and am running Apache to serve the bucket's contents.
How can I get my changes reflected on the instance and in the served files? I think all I have to do is clear a cache or delete the file on my instance, so that when Apache requests it, the new one will be pulled. Or can I just manually do a "pull" if I ever have to modify a file again?
Further, will this change eventually come through if left alone?
Thanks so much!
@coderall commented on GitHub (Jan 16, 2019):
s3fs may not be suited for your use case. You cannot see new file content in real time when you modify it with other tools or an SDK. s3fs does not even guarantee data consistency between two different mount points.
@must-defend-500 commented on GitHub (Jan 17, 2019):
Hi Alan, thanks for the response. I don't care about real time. Would you kindly tell me how I can get new file content at all, and what the turnaround is like? I have been unable to find what I am looking for on the web.
Thanks again.
@coderall commented on GitHub (Jan 21, 2019):
Hi @must-defend-500, the problem is that s3fs lacks a notification mechanism to tell you that a file changed. You can try disabling all of s3fs's caches so that you get the newer version of a changed file.
@must-defend-500 commented on GitHub (Jan 21, 2019):
Thanks again, I'll stop bugging you soon. In this case, the files are only updated if I edit them by hand; they are mostly static. I just want my edits to be reflected in what we serve. I am fine with making this happen manually but am not sure how.
Could you tell me more about the cache? What I am looking for is something like this: I edit a file, then I run some s3fs command to update what's on my instance (and therefore what gets served).
@ggtakec commented on GitHub (Jan 21, 2019):
@must-defend-500
The stat_cache_expire option may help you.
s3fs keeps stat information for cached S3 objects, and this option specifies when that stat information expires.
When a cached entry expires, s3fs retrieves fresh stat information for the object, so if the object was updated, the cache will be refreshed as well.
I think that performance will be worse if you set too small a value.
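A mount command using this option might look like the following sketch; the bucket name, mount point, and 60-second expiry are placeholder values, not taken from the thread:

```shell
# Placeholder bucket name and mount point -- adjust to your setup.
# stat_cache_expire=60 makes cached stat entries expire after 60
# seconds, so changes made to objects outside this mount become
# visible within about a minute.
s3fs mybucket /var/www/html -o stat_cache_expire=60
```

A smaller value picks up external changes sooner, at the cost of more metadata requests to S3.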
@coderall commented on GitHub (Jan 22, 2019):
If you remount s3fs, you will always see the newest version of a file.
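As a sketch, a remount could look like this (placeholder bucket and mount point, assuming a FUSE mount that can be unmounted with fusermount):

```shell
# Placeholder bucket name and mount point -- adjust to your setup.
# Unmounting drops all of s3fs's caches; remounting then fetches
# fresh metadata from S3 on the next access.
fusermount -u /var/www/html   # or: sudo umount /var/www/html
s3fs mybucket /var/www/html
```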
@must-defend-500 commented on GitHub (Jan 22, 2019):
Thanks Takeshi, so I can set the cache to expire, and once it does, s3fs will pull fresh data (automatically?). I don't have to leave it set to a small value; perhaps after I edit a file I can set it to a small value and then change it back after the update.
What would the command to set stat_cache_expire look like?
@must-defend-500 commented on GitHub (Jan 22, 2019):
Thanks Alan, that seems like the most straightforward solution. I was just hoping there was something simpler, like a pull or update method.
I am speaking with Takeshi as well. It seems like I may be able to set the cache to expire after I edit a file (just pass a really small value like 1s or 100s or something, and then change it back after the data is updated).
If the cache is expired, I assume s3fs will automatically update (pull) the data from the bucket?
Is this a good solution in your opinion, or would you just remount the bucket?
@gaul commented on GitHub (Jan 22, 2019):
Reducing the stat cache time seems like the best solution. Perhaps we could add a mechanism to expire part of or the entire stat cache in response to something like SIGUSR1? With enough plumbing someone could consume SNS notifications to make this more responsive...
@ggtakec commented on GitHub (Jan 23, 2019):
@must-defend-500
Unfortunately, s3fs does not have a way to change the value of stat_cache_expire dynamically (while s3fs is running).
As @gaul says, it may become possible in the future to trigger a recheck from outside the process, for example with SIGUSR1.
If you only want to reload a single file, you can force a reload by deleting that file's stat information file in the cache directory.
For example, if s3fs is started with the "use_cache=/tmp" option, the stat information files are under the "/tmp/<bucket>/.<bucket>.stat" directory.
That directory contains a file with the same path/name as the target file.
By deleting this stat file, the target file will be reloaded on the next access.
(First, the stat information for the target file is fetched from S3; then the change in the stat information is detected, and finally the file is downloaded again.)
I cannot recommend it much, but this method also exists.
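A sketch of that manual, per-file reload, assuming the cache layout described in this comment and using placeholder names throughout:

```shell
# Placeholders -- adjust bucket, cache dir, and file path to your setup.
# Assumes s3fs was mounted with: s3fs mybucket /var/www/html -o use_cache=/tmp
# Deleting the cached stat file for index.html forces s3fs to fetch fresh
# stat information (and then the file itself) on the next access.
rm /tmp/mybucket/.mybucket.stat/index.html
```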
@gaul commented on GitHub (Jun 23, 2020):
s3fs could also run an HTTP server that allows introspecting on state like the cache. Related to #841.
@sorenwacker commented on GitHub (Oct 11, 2021):
I also do not see any files transferred to S3. It seems everything is just written to the local hard drive.