[GH-ISSUE #1310] fuse cache on write #701

Open
opened 2026-03-04 01:48:01 +03:00 by kerem · 6 comments

Originally created by @tsmgeek on GitHub (Jun 18, 2020).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/1310

Is there any caching at the fuse layer of files being written?
I'm trying to track down a problem where uploaded files show a zero file size when I look at them after a few minutes; if I check again, the object is actually 0 bytes on storage.
I am wondering if there is a possibility that there was a failure to transfer the binary data to S3, resulting in an empty file, while a stat call immediately afterwards still reports the write as successful.


@gaul commented on GitHub (Jul 26, 2020):

When creating a new file, s3fs creates a zero-byte stub object with file metadata. During writes, s3fs caches files locally until the application calls fsync or close. This means that accessing the file via s3fs may report one size while an external tool might report zero bytes. Does this explain the behavior you observed?
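The cache-until-close behavior described here can be illustrated with plain buffered I/O on a local filesystem. This is a rough analogy only (it is not s3fs itself): bytes handed to a buffered writer are invisible to an outside stat until they are flushed, much as s3fs keeps written data in its local cache until fsync/close uploads it.

```python
import os
import tempfile

# Rough local analogy: data written through a buffered handle is not
# visible to an outside stat until it is flushed, much as s3fs keeps
# written data locally until fsync/close uploads it to S3.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
f = open(path, "w")
f.write("hello")                     # sits in the userspace buffer only
size_before = os.path.getsize(path)  # an external stat still sees 0 bytes
f.flush()                            # push the buffer to the filesystem
size_after = os.path.getsize(path)
f.close()
print(size_before, size_after)       # 0 5
```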


@tsmgeek commented on GitHub (Jul 26, 2020):

I'm observing that any process on the same node sees the file correctly, but on S3 it's zero file size even after fclose. It's as if the write failed, left the stub file on S3, but reported back that it was a successful write.

I assume you are saying the zero-byte stub is written to S3 itself as a form of lock; if that is the case, it would be nice to be able to toggle this off and only write the whole object on fclose.


@gaul commented on GitHub (Aug 1, 2020):

Discussed in https://github.com/s3fs-fuse/s3fs-fuse/issues/1013#issuecomment-484076448.


@tsmgeek commented on GitHub (Aug 1, 2020):

@gaul yes, that seems to cover part of the issue.
Adding a WORM flag as per the other ticket looks like it would help, since any failed write of the data would not leave a zero-byte file on storage.

Note I'm using Wasabi S3 storage, which I think is based on Ceph; WORM would help, as they also have a 90-day retention limit with a minimum billable file size.


@gaul commented on GitHub (Oct 23, 2021):

@tsmgeek Does the fix for #1013 address your symptoms? Could you test with the latest version 1.90?


@tsmgeek commented on GitHub (Oct 26, 2021):

@gaul I think it may do; I'll have to deploy the latest s3fs on our platform to confirm.
We recoded some parts to use the S3 API directly because of this issue, and also added retry loops to fix this problem, so most of it was hidden away by those two fixes.

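The retry loops mentioned here might look something like the following sketch: upload, then verify the stored object size before trusting the write, retrying if the object came back as a zero-byte stub. The `put_object`/`head_size` callables are hypothetical stand-ins for real S3 client calls (e.g. boto3's `put_object` and `head_object`), not the poster's actual code.

```python
import time

def upload_and_verify(put_object, head_size, data, retries=3, delay=0.0):
    """Upload data, then confirm the stored size matches before trusting
    the write; retry if the object came back as a zero-byte stub."""
    for _ in range(retries):
        put_object(data)
        if head_size() == len(data):  # stored object matches what we wrote
            return True
        time.sleep(delay)             # back off before the next attempt
    return False

# Demo with an in-memory fake whose first transfer "fails" and leaves
# a zero-byte stub, mirroring the symptom described in this issue.
store = {}
attempts = {"n": 0}

def fake_put(data):
    attempts["n"] += 1
    store["obj"] = b"" if attempts["n"] == 1 else data

def fake_size():
    return len(store.get("obj", b""))

ok = upload_and_verify(fake_put, fake_size, b"hello")
print(ok, attempts["n"])  # True 2
```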