[GH-ISSUE #566] Question about where to add new metadata features #320

Closed
opened 2026-03-04 01:44:22 +03:00 by kerem · 6 comments

Originally created by @colakong on GitHub (Apr 20, 2017).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/566

Hi everybody,
I'm looking at adding some features to s3fs for our users, to help them validate downloaded objects.

### Adding extra checksum metadata

The first feature would add MD5 checksum metadata to object uploads (separate from the ETag). This is meant to help users validate object downloads without needing to implement a client-side version of S3's multipart checksum for such objects.

It looks like that may be possible in `create_file_object()` in `s3fs.cpp`. Is that the right place for it?
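
To sketch the intended validation flow on the download side (purely illustrative: the `user.md5` metadata key is hypothetical, and nothing in s3fs sets it today):

```
# Hypothetical flow once upload-side metadata exists: compare the
# object's stored whole-file md5 against a locally computed one.
stored=$(getfattr -n user.md5 --only-values s3_mount_point/path/to/banana)
local_sum=$(md5sum s3_mount_point/path/to/banana | cut -c1-32)
[ "$stored" = "$local_sum" ] && echo "checksum OK" || echo "checksum MISMATCH"
```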

### Adding checksum directory structure

The second feature would add a read-only `.s3_obj_chksum` directory at the root of the mount point, which mirrors the mount point's hierarchy except that reading a file returns the corresponding object's checksum.

Is this something that seems reasonable to do, given the current structure of the project? Do you know where a good place to add that feature might be?

Together the features would look like this:

```
For a mounted bucket like:
s3_mount_point/
├── path
│   └── to
│       └── banana
└── .s3_obj_chksum
    └── path
        └── to
            └── banana

user@computer /tmp : cat s3_mount_point/path/to/banana
banana banana 1 2 3
user@computer /tmp :

user@computer /tmp : cat s3_mount_point/.s3_obj_chksum/path/to/banana
7e995973116c45ff560c95a3175d3c0d
user@computer /tmp :
```

@gaul commented on GitHub (Apr 20, 2017):

I strongly prefer to work with and improve the existing ETag-based checksums instead of adding new metadata that serves only a single user. I understand that multipart upload and range requests make this more complicated, but it seems like you can solve your problem today by disabling MPU and exposing the ETag via extended attributes. Further, HTTPS should ensure data integrity in flight while the ETag ensures it at rest. What corruption vector are you worried about?
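
A sketch of that approach (`nomultipart` is an existing s3fs option; the bucket name and the use of the aws CLI to fetch the ETag out-of-band are illustrative):

```
# With multipart uploads disabled, the ETag of an (unencrypted) object
# is its plain md5, so downloads can be verified without new metadata.
s3fs mybucket /mnt/s3 -o nomultipart
etag=$(aws s3api head-object --bucket mybucket --key path/to/banana \
       --query ETag --output text | tr -d '"')
[ "$etag" = "$(md5sum /mnt/s3/path/to/banana | cut -c1-32)" ] && echo "checksum OK"
```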

If you want to do programmatic things with the S3 protocol, perhaps you can write a middleware for [S3Proxy](https://github.com/andrewgaul/s3proxy)?

@colakong commented on GitHub (Apr 21, 2017):

Thanks for the response, Andrew. Good point about the extended attributes 👍

I'm not sure it'll be practical for us to disable MPU given the object sizes we work with, but it's something to consider :)

The features were meant for bit-rot detection: the case where, after an object is uploaded, some portion of it is changed or corrupted in a way that isn't detected by the underlying storage system being exposed through S3.

@gaul commented on GitHub (Apr 23, 2017):

How do you propose to calculate a single checksum over the entire object for a multi-part upload where parts may upload simultaneously or out of order?

The multipart ETag is actually a single-level Merkle tree hash: the MD5 of the concatenated part MD5 digests, suffixed with the number of parts, e.g., `ffffffffffffffffffffffffffffffff-31`. If you used a known part size and if s3fs exposed the ETag, you could actually calculate this in your application.
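
For reference, that calculation can be reproduced in shell (a sketch; the 8 MiB part size is illustrative and must match whatever part size the uploader actually used):

```
# Reproduce a multipart ETag: md5 each part, concatenate the *binary*
# digests, md5 that concatenation, and append "-<number of parts>".
split -b 8M banana part_
for p in part_*; do md5sum "$p" | cut -c1-32 | xxd -r -p; done > digests.bin
echo "$(md5sum digests.bin | cut -c1-32)-$(ls part_* | wc -l)"
```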

However this should not be needed; most S3 storage systems scrub data behind the scenes and proactively repair it based on ETag.

@colakong commented on GitHub (Apr 23, 2017):

The checksum could be computed before a multi-part upload is started.

The extra checksum metadata was intended to help users validate object downloads without needing to implement the s3 multi-part checksums on the client side. I've provided an example of the multi-part checksum calculation, but the preference is to use a checksum for the entire object.

I agree that corruption isn't likely. Our users care very much about the integrity of their data, so the features' primary benefit is to help them feel more comfortable with our object storage system.

@gaul commented on GitHub (Apr 23, 2017):

s3fs supports extended attributes, which turn into S3 object metadata. Thus you can implement these checksums with `setfattr` and `getfattr` today.
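
For example (the `user.md5` attribute name is just a convention here; how s3fs serializes extended attributes into object metadata is an implementation detail):

```
# Store a whole-object md5 as an extended attribute at upload time...
sum=$(md5sum s3_mount_point/path/to/banana | cut -c1-32)
setfattr -n user.md5 -v "$sum" s3_mount_point/path/to/banana

# ...then read it back to verify a download against a fresh md5sum.
getfattr -n user.md5 --only-values s3_mount_point/path/to/banana
```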

@colakong commented on GitHub (Apr 24, 2017):

Great, thank you Andrew :)
