[GH-ISSUE #2649] Cannot use bucket folders created outside s3fs #1266

Open
opened 2026-03-04 01:52:40 +03:00 by kerem · 9 comments

Originally created by @jarau-de on GitHub (Mar 18, 2025).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/2649

s3fs shows folders created outside of s3fs as empty files in the Linux filesystem, and it is not possible to enter them. If a folder is created through the s3fs mount, everything works as expected. It seems the x-amz-meta-* information is what s3fs struggles with: when a folder is created inside s3fs, this metadata is created. Other tools such as s3cmd and Cyberduck work as expected. I also tried complement_stat and compat_dir, without success.


@swesner411 commented on GitHub (Jun 5, 2025):

I'm having the same exact issue as described above and cannot find a fix. Details are the same. Pre-existing 'folders' do not work. Making one through s3fs works as expected. I'm on version 1.93


@vicr-np commented on GitHub (Oct 21, 2025):

In s3fs debug, I see this:

2025-10-21T14:23:34.722Z [INF] cache.cpp:AddStat(320): add stat cache entry[path=/Salesforce/]
2025-10-21T14:23:34.722Z [WAN] string_util.cpp:cvt_strtoofft(93): something error is occurred in convert std::string(-:-:-:-:455) to off_t, thus return 0 as default.

The metadata (from aws s3api head-object) shows

{
    "AcceptRanges": "bytes",
    "LastModified": "2025-07-08T15:29:48+00:00",
    "ContentLength": 0,
    "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
    "ContentType": "application/x-www-form-urlencoded; charset=utf-8",
    "ServerSideEncryption": "AES256",
    "Metadata": {
        "dir": "1",
        "fflags": "0",
        "mdate": "2025-07-08T15:29:47.541Z",
        "owner": "-:-:-:-:455",
        "cdate": "2025-07-08T15:29:47.541Z"
    }
}
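
The warning above can be reproduced in miniature. The following is only a sketch of a strict integer conversion that falls back to a default, matching the "return 0 as default" behaviour reported in the log; it is not the s3fs implementation, and `parse_numeric_header` is a hypothetical name.

```python
def parse_numeric_header(value: str, default: int = 0) -> int:
    """Parse a metadata value as a base-10 integer, or return `default`.

    Assumption: mirrors the fallback described in the log line above;
    the real conversion in s3fs (cvt_strtoofft) may differ in detail.
    """
    try:
        return int(value, 10)
    except ValueError:
        # A colon-separated value such as "-:-:-:-:455" is not an integer.
        return default


print(parse_numeric_header("1000"))         # 1000
print(parse_numeric_header("-:-:-:-:455"))  # 0
```

Under this model, the owner value from the head-object output never yields a usable numeric ID.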


@gaul commented on GitHub (Oct 21, 2025):

@vicr-np could you share how this folder was made? The expected format for this value is an integer UID. I looked through GitHub but cannot find a similar colon-separated format. A web search (https://webtechsurvey.com/response-header/x-amz-meta-owner) suggests that there are multiple values, e.g., 501:-:20:-:4294934976.

@vicr-np commented on GitHub (Oct 23, 2025):

I'm not sure how it was created. I feel like the problem was introduced after creation, but I'm not sure. It's a production bucket, so I'm not comfortable tampering with it.

As far as I know, the only third party application used is S3 Browser. Otherwise, everything is done through the AWS Console or s3fs.


@ggtakec commented on GitHub (Oct 24, 2025):

(This is as far as I've been able to understand from my research. If I'm wrong, please ignore the comments below.)

Is it possible that this environment is outputting objects to S3 using Salesforce and the Output Connector of Salesforce Data Cloud?
If so, depending on the Data Cloud Output Connector you're using, you'll likely need to check the format of this metadata (which I believe means a directory).

If this assumption is correct, the Data Cloud Output Connector and s3fs-fuse handle metadata in different formats, so if you manipulate objects with s3fs-fuse (assuming you can do so), there's a high chance that you won't be able to read them from Salesforce.


@vicr-np commented on GitHub (Oct 24, 2025):

This folder contains data related to Salesforce. There is a folder inside which is used for integration with Salesforce Marketing Cloud. We generally use the S3 API from Java to access these folders. I use s3fs when I need to investigate issues.

I have other unrelated folders that have the same issue, so I don't think it has anything to do with Marketing Cloud.

I will keep investigating and report back if I find anything.


@ggtakec commented on GitHub (Oct 25, 2025):

@vicr-np Thank you for confirming. (I understood that this was different from what I imagined.)

First, regarding the headers relevant to this issue, I'll explain how s3fs-fuse interprets S3 object headers.
s3fs-fuse uses the following headers to recognize an object as a file/directory:

  • x-amz-meta-uid
  • x-amz-meta-gid
  • x-amz-meta-mode

If x-amz-meta-uid or x-amz-meta-gid does not exist, s3fs-fuse will attempt to read the following headers instead:

  • x-amz-meta-owner
  • x-amz-meta-group

(The owner/group headers are used by s3sync and are supported by s3fs-fuse for compatibility.)
The values of these headers (uid/gid/mode/owner/group) are assumed to be numeric when they are loaded.

In your case, it appears that x-amz-meta-owner was used.
Therefore, please identify the program in your environment that outputs this x-amz-meta-owner header in the -:-:-:-:455 format.

Although I have not tried it myself, the following method may temporarily resolve this issue without correcting the root cause.
That is to set x-amz-meta-uid: 0 (or a value other than 0) for the target object.
This will use x-amz-meta-uid instead of the x-amz-meta-owner value, so read-only access should be possible.
[NOTE] However, caution is required.
If you write data from s3fs-fuse that involves modifying an object, this header and other headers will also be updated, which may cause the object to become unreadable in your client program.
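
The header precedence described in this comment can be modeled roughly as follows. This is an illustrative sketch, not s3fs-fuse source code: `resolve_id` is a hypothetical helper, the keys are written without the `x-amz-meta-` prefix (as `head-object` reports them), and returning 0 when neither header exists is an assumption made only to keep the example total.

```python
from typing import Mapping, Tuple


def resolve_id(meta: Mapping[str, str], primary: str, fallback: str) -> Tuple[str, int]:
    """Return (header used, parsed value), preferring `primary` over `fallback`."""
    for name in (primary, fallback):
        if name in meta:
            try:
                return name, int(meta[name], 10)
            except ValueError:
                # Non-numeric value: fall back to 0, as in the debug log.
                return name, 0
    # Assumption: neither header present, so report 0 with no source header.
    return "", 0


broken = {"dir": "1", "owner": "-:-:-:-:455"}
print(resolve_id(broken, "uid", "owner"))   # ('owner', 0)

# Adding a uid entry means the malformed owner value is never consulted.
patched = dict(broken, uid="1000")
print(resolve_id(patched, "uid", "owner"))  # ('uid', 1000)
```

The second call shows why adding `x-amz-meta-uid` is expected to sidestep the malformed `x-amz-meta-owner` value: the fallback header is only read when the primary one is absent.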


@vicr-np commented on GitHub (Oct 27, 2025):

I wasn't able to determine the cause of the corrupt metadata, but was able to fix it with something like this:

aws s3 cp s3://some_bucket/some_directory/ s3://some_bucket/some_directory/ --metadata-directive REPLACE --content-type "application/x-directory"
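
For anyone scripting the same fix rather than running the CLI, a rough Python equivalent is sketched below. It assumes boto3 is installed and credentials are configured; `build_self_copy_request` and `rewrite_directory_marker` are hypothetical helper names, and the bucket and key are the placeholders from the command above. With the REPLACE directive and no metadata supplied, the copy drops the existing user-defined metadata (including the malformed owner value) and sets the new Content-Type.

```python
def build_self_copy_request(bucket: str, key: str) -> dict:
    """Build copy_object arguments that copy an object onto itself."""
    return {
        "Bucket": bucket,
        "Key": key,
        "CopySource": {"Bucket": bucket, "Key": key},
        # REPLACE: keep only the metadata given in this request.
        "MetadataDirective": "REPLACE",
        "Metadata": {},
        "ContentType": "application/x-directory",
    }


def rewrite_directory_marker(bucket: str, key: str) -> dict:
    """Send the self-copy request to S3 (requires boto3 and credentials)."""
    import boto3  # imported here so the builder above works without boto3

    s3 = boto3.client("s3")
    return s3.copy_object(**build_self_copy_request(bucket, key))


if __name__ == "__main__":
    # Placeholder names taken from the CLI example; replace before running.
    print(build_self_copy_request("some_bucket", "some_directory/"))
```

Because this discards all user-defined metadata on the object, it is worth checking with `head-object` first whether any other client depends on those values.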

@ggtakec commented on GitHub (Oct 28, 2025):

@vicr-np Thanks for sharing your fix.
It certainly looks like modifying the Content-Type would have produced the same result.
I'd appreciate it if you could let me know if you have any more information on these headers in the future.

Reference
starred/s3fs-fuse#1266