[GH-ISSUE #891] modifications to files stored on S3 are not coming through to linux instance / being served by Apache #521

Open
opened 2026-03-04 01:46:19 +03:00 by kerem · 12 comments
Owner

Originally created by @must-defend-500 on GitHub (Jan 13, 2019).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/891

Apologies, I don't know my version number or anything; I am not sure it matters. I have a simple problem: I recently had to modify an HTML file that is stored on S3.

However, the change is not reflected by what Apache is serving.

I have my S3 bucket mounted to an EC2 instance and am running Apache to serve the bucket's contents.

How can I get my changes to be reflected on the instance / in the served files? I think all I have to do is clear a cache or delete the file on my instance, and when Apache requests it, it will pull the new one. Or can I just manually do a "pull" if I ever have to modify a file again?

Further, will this change eventually come through if left alone?

Thanks so much!


@coderall commented on GitHub (Jan 16, 2019):

s3fs may not be suited for your use case. You cannot get the new file content in real time when you modify it with other tools or an SDK. s3fs does not even guarantee data consistency between two different mount points.


@must-defend-500 commented on GitHub (Jan 17, 2019):

Hi Alan, thanks for the response. I don't care about real time. Would you
kindly tell me how I can get new file content at all, and what the
turnaround time is like? I have been unable to find what I am looking for
on the web.

Thanks again.



@coderall commented on GitHub (Jan 21, 2019):

Hi @must-defend-500, the problem is that s3fs lacks a notification mechanism to tell you the file changed. You can try disabling all of s3fs's caches so that you see the newer version of a changed file.


@must-defend-500 commented on GitHub (Jan 21, 2019):

Thanks again, I'll stop bugging you soon. In this case, the files are only
updated if I edit them by hand. They are mostly static. I just want my
edits to be reflected in what we serve. I am fine with making this
happen manually but am not sure how.

Could you tell me more about the cache? What I am looking for is something
like this: I edit a file, then I run some s3fs command to update what's on
my instance (and therefore what gets served).



@ggtakec commented on GitHub (Jan 21, 2019):

@must-defend-500
The stat_cache_expire option may help you.
s3fs caches stat information for S3 objects, and this option specifies when that cached stat information expires.
When a cached entry expires, s3fs retrieves fresh stat information for that object, so if the object was updated, the cache is refreshed as well.
Note that performance will get worse if you set too small a value.
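For reference, a hedged sketch of what such a mount could look like. The bucket name, mount point, and the 60-second expiry below are illustrative placeholders, not values from this thread:

```shell
# Remount with a short stat-cache expiry so updated objects are noticed
# within roughly 60 seconds. Smaller values mean fresher data but more
# metadata requests to S3, which hurts performance.
# (Unmount first if the bucket is already mounted: sudo umount /mnt/s3)
s3fs mybucket /mnt/s3 -o stat_cache_expire=60
```

The expiry is a trade-off: since s3fs only re-checks the object's metadata after the cache entry expires, the value is effectively an upper bound on how stale a served file can be.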


@coderall commented on GitHub (Jan 22, 2019):

If you remount s3fs, you will always see the newest version of the file.
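A minimal sketch of the remount approach, with a placeholder bucket name and mount point:

```shell
# Unmount the FUSE filesystem, then mount it again so s3fs starts with
# empty caches and fetches fresh metadata and data from S3.
sudo umount /mnt/s3        # or: fusermount -u /mnt/s3
s3fs mybucket /mnt/s3
```

This is heavy-handed (it briefly makes the whole tree unavailable to Apache), but it guarantees the next reads see the current bucket contents.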


@must-defend-500 commented on GitHub (Jan 22, 2019):

Thanks Takeshi, so I can set the cache to expire, and then once it does,
s3fs will pull fresh data (automatically?). I don't have to leave it set
to a small value; perhaps after I edit a file I can set it to a small
value and then change it back after the update.

What would the command to set stat_cache_expire look like?



@must-defend-500 commented on GitHub (Jan 22, 2019):

Thanks Alan, that seems like the most straightforward solution. I was just
hoping there was something simpler... like a pull or update method.

I am speaking with Takeshi as well. It seems I may be able to make the
cache expire after I edit a file (just pass a really small value like 1s
or 100s or something and then change it back after the data is updated).
If the cache is expired, I assume s3fs will automatically pull the updated
data from the bucket?

Is this a good solution in your opinion or would you just remount the
bucket?



@gaul commented on GitHub (Jan 22, 2019):

Reducing the stat cache time seems like the best solution. Perhaps we could add a mechanism to expire part of or the entire stat cache in response to something like SIGUSR1? With enough plumbing someone could consume SNS notifications to make this more responsive...


@ggtakec commented on GitHub (Jan 23, 2019):

@must-defend-500
Unfortunately, s3fs has no way to change the value of stat_cache_expire dynamically (while s3fs is running).
As @gaul says, it may become possible in the future to trigger a re-check from outside the process, for example via SIGUSR1.

If you only want to reload a single file, you can force a reload by deleting that file's stat information file from the cache directory.

For example, if s3fs is started with the "use_cache=/tmp" option, the stat information files are under the "/tmp/<mount point base name>/.<bucket name>.stat" directory.
Inside that directory there is a file with the same path/name as the target file.
Deleting this stat file should cause the target file to be reloaded on the next access.
(First, the stat information of the target file is fetched from S3; then the change in the stat information is detected, and finally the file is downloaded.)

I can't really recommend it, but this method also exists.
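As a sketch of the workaround described above, with every concrete name (bucket "mybucket", mount point /mnt/s3, and the file path) a hypothetical example built from the pattern in the comment:

```shell
# Assume s3fs was started with: -o use_cache=/tmp
# Per the comment, stat files live under
#   /tmp/<mount point base name>/.<bucket name>.stat
# mirroring each object's path under the mount point.
rm /tmp/s3/.mybucket.stat/site/index.html

# The next read through the mount should re-fetch the object's stat
# information from S3 and, if it changed, re-download the file.
cat /mnt/s3/site/index.html > /dev/null
```

Note the actual layout may differ between s3fs versions; check the cache directory on your instance before deleting anything.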


@gaul commented on GitHub (Jun 23, 2020):

s3fs could also run an HTTP server that allows introspecting on state like the cache. Related to #841.


@sorenwacker commented on GitHub (Oct 11, 2021):

I also do not see any files transferred to S3. It seems everything is just written to the local hard drive.
