[GH-ISSUE #109] Google Cloud Storage Browser Interoperability #69

Closed
opened 2026-03-04 01:41:44 +03:00 by kerem · 9 comments

Originally created by @sheinbergon on GitHub (Jan 21, 2015).
Original GitHub issue: https://github.com/s3fs-fuse/s3fs-fuse/issues/109

Hi

When someone creates a "directory" via the Google Cloud Storage Browser,
a key gets created with a content type other than "application/octet-stream"
(application/x-www-form-urlencoded).

Now I know it's not really a directory and that these are all keys (no hierarchy), but
this causes s3fs to misinterpret the directory as a file and deny access to its underlying content.
The only way to fix this is to change the content type of the directory key (and of the underlying files) via

gsutil setmeta -R -h "Content-Type:application/octet-stream" gs://BUCKET/PATH/TO/BROKEN/DIR

and, of course, to avoid creating directories via the UI (create them only through the S3FUSE mount).

I think it would be a good idea to improve the directory identification mechanism, using the underlying GCS SDK, to handle this case.

10x

kerem 2026-03-04 01:41:44 +03:00
  • closed this issue
  • added the need info label

@ggtakec commented on GitHub (Mar 8, 2015):

I'm sorry, but I don't know what Google Cloud Storage sets as the "Content-Type".
If GCS creates a "<directory name>_$folder$" object and sets "application/octet-stream" as its "Content-Type", s3fs can recognize it as a directory object.
If it does not, s3fs currently cannot recognize it as a directory.
(If you can, please let me know the attributes of the directory object on GCS.)

Are you suggesting that s3fs should use the GCS SDK to interpret directories?
I think it is hard to modify s3fs now, because the change would affect other objects created by s3fs, the AWS console, and s3cmd.
I probably need another idea to solve this problem.

Regards,


@kdunn-pivotal commented on GitHub (Mar 13, 2017):

I had luck mounting a Google Storage Bucket using the following syntax:

s3fs pde-kdunn.appspot.com /mnt/s3 -o \
passwd_file=/home/gpadmin/gcp.cred,host=https://storage.googleapis.com,sigv2

where gcp.cred contained:

pde-kdunn.appspot.com:<KEY>:<SECRET>

as found from GCP Console -> Storage -> Settings -> Interoperability for the bucket of interest.


@hamann commented on GitHub (Mar 6, 2019):

I'd like to share our experience with directories created via Google Cloud Storage Browser as of 2019...

With the current version of s3fs-fuse they don't appear as directories, only as regular files, and they can't be accessed. We then inspected those directories with gsutil:

❯ gsutil stat gs://bucket/bar/
gs://bucket/bar/:
    Creation time:          Wed, 06 Mar 2019 09:16:43 GMT
    Update time:            Wed, 06 Mar 2019 09:16:43 GMT
    Storage class:          REGIONAL
    Content-Length:         11
    Content-Type:           text/plain
    Hash (crc32c):          XkI+Dw==
    Hash (md5):             apnFdauH+MfR7R5S5+NJzg==
    ETag:                   CKuC4ZWX7eACEAE=
    Generation:             1551863803035947
    Metageneration:         1

❯ gsutil cat gs://bucket/bar/
placeholder

❯ irb
>> "placeholder".length
=> 11

So such "directories" always have "placeholder" as content, content type "text/plain" and content length 11.

We then modified

https://github.com/s3fs-fuse/s3fs-fuse/blob/0d43d070ccf55a8745d8fe95ec9558e29a85cbb2/src/s3fs_util.cpp#L849-L857

and removed the size constraint && (0 == size || 1 == size), then started s3fs-fuse with -o complement_stat. After that, such directories were treated as directories.

Is there maybe a better way to detect that instead of removing the size constraint or extending it with || 11 == size?
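
To make the trade-off concrete, here is a minimal Python sketch of the size check discussed in this comment. It is not the s3fs code: the function name looks_like_directory, its parameters, and the assumption that a trailing slash on the key plus an allowed size marks a directory placeholder are all hypothetical and only illustrate the idea.

```python
from typing import Iterable


def looks_like_directory(key: str, size: int,
                         placeholder_sizes: Iterable[int] = (0, 1)) -> bool:
    """Hypothetical heuristic (an assumption, not the actual s3fs logic).

    Treat an object as a directory placeholder when its key ends with "/"
    and its size is one of the sizes we accept for placeholder objects.
    """
    if not key.endswith("/"):
        return False
    return size in tuple(placeholder_sizes)


# The object from the gsutil output above: key "bar/", Content-Length 11.
default_check = looks_like_directory("bar/", 11)
# Same object, with the check extended to also accept 11 bytes.
extended_check = looks_like_directory("bar/", 11, placeholder_sizes=(0, 1, 11))
```

Passing placeholder_sizes=(0, 1, 11) corresponds to extending the constraint with || 11 == size. Under this sketch, any other 11-byte object whose key ends with a slash would also be classified as a directory.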


@ggtakec commented on GitHub (Mar 30, 2019):

It seems better to add || 11 == size when the complement_stat option is specified.
To confirm: is the content of a GCS directory object always placeholder (11 bytes)?

@gaul What do you think?


@gaul commented on GitHub (Apr 15, 2019):

Wouldn't checking 11 == size treat all 11 byte objects as directories? I wonder if the Content-MD5 apnFdauH+MfR7R5S5+NJzg== is always the same and we can use this instead? Note that we should probably have an s3fs --gcs flag which enables this behavior.
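
A rough Python sketch of that idea follows, under the assumption that the placeholder body is always the 11 bytes "placeholder" seen in the gsutil output earlier in this thread. The names md5_base64 and is_gcs_browser_directory are hypothetical and are not part of s3fs; the sketch only shows how matching on the base64 MD5 is stricter than matching on size alone.

```python
import base64
import hashlib

# Assumption: body of a directory object created by the GCS Browser,
# taken from the `gsutil cat` output quoted earlier in this thread.
GCS_PLACEHOLDER_BODY = b"placeholder"


def md5_base64(data: bytes) -> str:
    """Return the MD5 digest of data, base64-encoded as in `gsutil stat`."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def is_gcs_browser_directory(key: str, size: int, md5_b64: str) -> bool:
    """Hypothetical check: trailing slash, placeholder size, placeholder MD5."""
    return (
        key.endswith("/")
        and size == len(GCS_PLACEHOLDER_BODY)
        and md5_b64 == md5_base64(GCS_PLACEHOLDER_BODY)
    )


# An unrelated 11-byte object no longer matches, because its MD5 differs.
other_md5 = md5_base64(b"hello world")
false_positive = is_gcs_browser_directory("bar/", 11, other_md5)
```

With this check an ordinary 11-byte object is rejected unless its body is exactly the placeholder text, which narrows the false-positive case raised here but does not remove it entirely.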


@ggtakec commented on GitHub (Apr 16, 2019):

Currently s3fs does not handle Content-MD5 in the HEAD response.
If s3fs handled it, then when an option such as -gcs is specified, get_mode() could identify these directories by checking for apnFdauH+MfR7R5S5+NJzg==.


@gaul commented on GitHub (Jun 4, 2020):

Could someone clarify the state of this issue? #1286 added some support for gsutil headers. Does Google Cloud Storage Browser support this?


@gaul commented on GitHub (Aug 1, 2020):

Closing due to inactivity. Please reopen if symptoms persist.


@haraldschilly commented on GitHub (Feb 11, 2021):

Hi, I tried to mount a GCS bucket with a directory – files show up and I can read the content, but the directory ends up as an empty file, not a directory.

I tried with the binary that Ubuntu (20.04) ships, 1.86, and also built from source (1.88), which has #1286 … same behavior in both cases. I followed the instructions from the wiki for setting the options.
