[GH-ISSUE #2113] Question: Does s3fs support Write Order, Synchronous Write Persistence, Distributed File Locking, and Unique Write Ownership? #1077

Closed
opened 2026-03-04 01:51:11 +03:00 by kerem · 3 comments
Owner

Originally created by @ruthst00 on GitHub (Feb 18, 2023).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2113

We would like to use s3fs with the OCI Object Storage Service to store message files for Tibco Enterprise Message Service (EMS) which has these file system requirements:

  • Write Order: The storage solution must write data blocks to shared storage in the same order as they occur in the data buffer. (Solutions that write data blocks in any other order (for example, to enhance disk efficiency) do not satisfy this requirement.)
  • Synchronous Write Persistence: Upon return from a synchronous write call, the storage solution guarantees that all the data have been written to durable, persistent storage.
  • Distributed File Locking: The EMS servers must be able to request and obtain an exclusive lock on the shared storage. The storage solution must not assign the locks to two servers simultaneously. EMS servers use this lock to determine the primary server.
  • Unique Write Ownership: The EMS server process that has the file lock must be the only server process that can write to the file. Once the system transfers the lock to another server, pending writes queued by the previous owner must fail.

Does s3fs support these requirements?

I asked the same question on [Stack Overflow](https://stackoverflow.com/questions/75496616/does-s3fs-support-write-order-synchronous-write-persistence-distributed-file-l) if you'd prefer to answer the question there.

kerem closed this issue 2026-03-04 01:51:11 +03:00

@michaelsmoody commented on GitHub (Mar 3, 2023):

@ruthst00

I'm sure the developers would chime in more appropriately, but I've never seen anything in s3fs-fuse that would help you accomplish your goals, unfortunately.

I've used many filesystems that would meet those needs, but s3fs-fuse isn't made for such a use case. It does offer POSIX compatibility (though there are currently a few bugs I'm personally waiting on a new release for, even though commits allowing a from-source build to fix them have landed), which addresses some of these requirements, but locking is probably the biggest challenge.

Truly, a distributed filesystem with locking is a major challenge, especially one with low latency. It's why filesystems such as OCFS2, Gluster, Ceph, etc., exist. But while s3fs-fuse allows multiple systems to access the same files, it won't do what you're asking, as near as I can tell, in all of my use and in reviewing the code. Unless there's a mount option I'm unaware of (and again, perhaps there is, developers?), it doesn't do this. And for sure, the mounting servers don't communicate with one another.

Unfortunately, I think your options are the standard clustered file systems.

OCFS2, Gluster, Ceph, etc. These are all great options and have come a long way. Many of them no longer require shared InfiniBand, Fibre Channel, iSCSI, etc. (or their public-cloud equivalents).

There are, of course, other options in the clouds, like AWS FSx (which is simply DFS), AWS EFS (which is mostly NFS), Amazon FSx for OpenZFS, and many others.


@ggtakec commented on GitHub (Mar 12, 2023):

@michaelsmoody
Thanks for your kindness. I think your explanation is enough.

@ruthst00
Write Order may be possible in s3fs at the expense of performance (by giving up parallel uploads), but I don't remember testing s3fs for that purpose.
Second, s3fs does not support a locking mechanism as a network file system.
If you implemented this yourself, it would probably take the form of a lock file or similar, managed by the client application rather than by s3fs.
And I imagine it would inevitably involve deadlocks and malfunctions caused by lock-file manipulation.
These locking operations are difficult to support in the current state of s3fs.
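To make the warning concrete, here is a minimal, hypothetical lock-file sketch of the client-managed approach described above (the function names are made up; nothing here is s3fs API). Note that `O_EXCL` atomicity is only as reliable as the underlying filesystem: across s3fs mounts on different hosts it is not a dependable mutual-exclusion primitive, which is exactly the failure mode being warned about.

```python
import os

def try_acquire(lock_path: str) -> bool:
    """Attempt to create the lock file exclusively; True if we got the lock."""
    try:
        # O_CREAT|O_EXCL fails if the file already exists -- atomic on a local
        # POSIX filesystem, but NOT guaranteed across s3fs mounts on other hosts.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        os.write(fd, str(os.getpid()).encode())  # record the owner's pid
        os.close(fd)
        return True
    except FileExistsError:
        return False

def release(lock_path: str) -> None:
    """Drop the lock by deleting the lock file."""
    os.unlink(lock_path)
```

On a single machine this behaves as expected; the deadlock/stale-lock problems appear once crashed owners leave lock files behind or multiple s3fs clients race on creation.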


@gaul commented on GitHub (Sep 8, 2023):

  • Write Order: The storage solution must write data blocks to shared storage in the same order as they occur in the data buffer. (Solutions that write data blocks in any other order (for example, to enhance disk efficiency) do not satisfy this requirement.)

I don't exactly understand your question but s3fs does not reorder the buffers.

  • Synchronous Write Persistence: Upon return from a synchronous write call, the storage solution guarantees that all the data have been written to durable, persistent storage.

s3fs only guarantees sync-on-close and sync-on-fsync semantics. If it synced on every write call then performance would be ruinously slow due to the S3 immutable object model.
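As a rough illustration of those semantics, a sketch against an ordinary POSIX file descriptor (the path here is a placeholder under `/tmp`; on a real deployment it would live under the s3fs mount):

```python
import os
import tempfile

# Placeholder path -- in practice this would sit on the s3fs mount point.
path = os.path.join(tempfile.gettempdir(), "ems-demo.dat")

fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
os.write(fd, b"message record 1\n")  # under s3fs, write() returning does NOT
                                     # mean the bytes reached object storage
os.fsync(fd)   # fsync forces the flush/upload (sync-on-fsync)
os.close(fd)   # close also flushes remaining buffered data (sync-on-close)
```

In other words, an application needing Synchronous Write Persistence would have to fsync after every write, which under the immutable-object model means re-uploading the object each time.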

  • Distributed File Locking: The EMS servers must be able to request and obtain an exclusive lock on the shared storage. The storage solution must not assign the locks to two servers simultaneously. EMS servers use this lock to determine the primary server.

s3fs does not support this coordination.

  • Unique Write Ownership: The EMS server process that has the file lock must be the only server process that can write to the file. Once the system transfers the lock to another server, pending writes queued by the previous owner must fail.

s3fs does not directly support locking. You can use POSIX advisory locks, but these are handled by the application and the local kernel, not coordinated across hosts.
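For illustration, a minimal advisory-lock sketch using `flock(2)` (the path is a placeholder; nothing s3fs-specific is assumed). The lock is tracked only in the local kernel's lock table, so while a second open file description on the same machine sees the conflict, a second host mounting the same bucket would not:

```python
import fcntl
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "ems.flock")  # placeholder path

fd1 = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
fcntl.flock(fd1, fcntl.LOCK_EX | fcntl.LOCK_NB)  # exclusive lock, held locally

# A second open file description on the SAME machine does see the conflict...
fd2 = os.open(path, os.O_RDWR)
try:
    fcntl.flock(fd2, fcntl.LOCK_EX | fcntl.LOCK_NB)
    conflict_detected = False
except BlockingIOError:
    conflict_detected = True  # ...but another host mounting the bucket would not

fcntl.flock(fd1, fcntl.LOCK_UN)
os.close(fd1)
os.close(fd2)
```

This is why the lock cannot serve as the primary-election mechanism EMS requires: the exclusivity guarantee stops at the local kernel boundary.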
