[GH-ISSUE #1270] Is it possible to download using gsutil command? #181

Open
opened 2026-03-03 12:08:57 +03:00 by kerem · 0 comments
Owner

Originally created by @jdominguez408 on GitHub (Jul 31, 2023).
Original GitHub issue: https://github.com/fsouza/fake-gcs-server/issues/1270

In this example (file /examples/gsutil/gsutil-example.sh):

#!/usr/bin/env bash

# Copyright 2023 Francisco Souza. All rights reserved.
# Use of this source code is governed by a BSD-style
# license that can be found in the LICENSE file.

set -euo pipefail

bucket_name=some-bucket
project_id=test-project
here=$(cd "$(dirname "${0}")" && pwd -P)

# create bucket
gsutil -o "Credentials:gs_json_host=127.0.0.1" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" mb -p "${project_id}" "gs://${bucket_name}"

# list objects in the bucket (should be empty)
gsutil -o "Credentials:gs_json_host=127.0.0.1" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" ls -p "${project_id}" "gs://${bucket_name}"

# upload a couple of files
gsutil -o "Credentials:gs_json_host=127.0.0.1" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" cp "${here}"/hello.txt "${here}"/image.png "gs://${bucket_name}/"

# list objects in the bucket (should include the files that were just uploaded)
gsutil -o "Credentials:gs_json_host=127.0.0.1" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" ls -p "${project_id}" "gs://${bucket_name}"
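
Every command in the example repeats the same three -o overrides, so they could be kept in one place with a small wrapper. This is only a sketch: the function name gsutil_fake is hypothetical and not part of the example script, and the host and port are the ones used above.

```shell
# Hypothetical helper (not part of gsutil-example.sh): runs gsutil with the
# three -o overrides that point it at the local fake-gcs-server instance.
gsutil_fake() {
  gsutil \
    -o "Credentials:gs_json_host=127.0.0.1" \
    -o "Credentials:gs_json_port=4443" \
    -o "Boto:https_validate_certificates=False" \
    "$@"
}
```

With that in place, the listing step would read `gsutil_fake ls -p "${project_id}" "gs://${bucket_name}"`.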
​

The example covers every gsutil operation except download. I tried running the following command, modeled on the ones above:

gsutil -o "Credentials:gs_json_host=127.0.0.1" -o "Credentials:gs_json_port=4443" -o "Boto:https_validate_certificates=False" cp "gs://${bucket_name}/hello.txt" "${here}/hello.txt"


But I always get the same error:

Traceback (most recent call last):
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
    gsutil.RunMain()
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gsutil.py", line 151, in RunMain
    sys.exit(gslib.__main__.main())
             ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 436, in main
    return _RunNamedCommandAndHandleExceptions(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 785, in _RunNamedCommandAndHandleExceptions
    _HandleUnknownFailure(e)
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 633, in _RunNamedCommandAndHandleExceptions
    return command_runner.RunNamedCommand(command_name,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 421, in RunNamedCommand
    return_code = command_inst.RunCommand()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 1131, in RunCommand
    self.Apply(_CopyFuncWrapper,
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1575, in Apply
    self._SequentialApply(func, args_iterator, exception_handler, caller_id,
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1654, in _SequentialApply
    worker_thread.PerformTask(task, self)
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/command.py", line 2404, in PerformTask
    results = task.func(cls, task.args, thread_state=self.thread_gsutil_api)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 673, in _CopyFuncWrapper
    cls.CopyFunc(args,
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 913, in CopyFunc
    _, bytes_transferred, result_url, md5 = copy_helper.PerformCopy(
                                            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 3949, in PerformCopy
    return _DownloadObjectToFile(src_url,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 3141, in _DownloadObjectToFile
    bytes_transferred, server_encoding = _DownloadObjectToFileResumable(
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 2948, in _DownloadObjectToFileResumable
    server_encoding = gsutil_api.GetObjectMedia(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/cloud_api_delegator.py", line 352, in GetObjectMedia
    return self._GetApi(provider).GetObjectMedia(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/gslib/gcs_json_api.py", line 1202, in GetObjectMedia
    apitools_download = apitools_transfer.Download.FromData(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/transfer.py", line 253, in FromData
    url = client.FinalizeTransferUrl(info['url'])
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/base_api.py", line 459, in FinalizeTransferUrl
    url_builder = _UrlBuilder.FromUrl(url)
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/base_api.py", line 189, in FromUrl
    return cls(
           ^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/base_api.py", line 170, in __init__
    components = urllib.parse.urlsplit(_urljoin(
                                       ^^^^^^^^^
  File "/Users/xxx/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/base_api.py", line 160, in _urljoin
    new_base = base if base.endswith('/') else base + '/'
                       ^^^^^^^^^^^^^^^^^^
TypeError: endswith first arg must be bytes or a tuple of bytes, not str


The requests reach fake-gcs-server like this:

[fake-gcs-server] time="2023-08-02T11:15:55Z" level=info msg="127.0.0.1 - - [02/Aug/2023:11:15:55 +0000] \"GET /storage/v1/b/some-bucket/o/hello.txt?alt=json&fields=contentType%2Cgeneration%2Ccrc32c%2Cmd5Hash%2Cetag%2Cname%2CcustomerEncryption%2Csize%2CcontentEncoding%2CmediaLink&projection=noAcl HTTP/1.1\" 200 472"
[fake-gcs-server] time="2023-08-02T11:16:36Z" level=info msg="127.0.0.1 - - [02/Aug/2023:11:16:36 +0000] \"GET /storage/v1/b/some-bucket/o/hello.txt?alt=json&fields=contentType%2Cetag%2CcontentEncoding%2Cgeneration%2Cname%2Csize%2CcustomerEncryption%2Ccrc32c%2CmediaLink%2Cmd5Hash&projection=noAcl HTTP/1.1\" 200 472"

As I understand it, a download request must use alt=media instead of alt=json. It looks like the client (gsutil in this case) sends the request with alt=json. The same command works perfectly against a real bucket. Would it be possible to make fake-gcs-server compatible with gsutil so that downloads work?
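
To check the server independently of gsutil, the two kinds of request can be issued directly with curl. The following is a minimal sketch under some assumptions: the server from the example listens on https://127.0.0.1:4443 with a self-signed certificate, and it serves object content on the same path when alt=media is given, as the real JSON API does. The names FAKE_GCS_URL, object_url, fetch_metadata and fetch_content are hypothetical.

```shell
# Base URL of the emulator; assumed from the host and port in the gsutil options above.
FAKE_GCS_URL="${FAKE_GCS_URL:-https://127.0.0.1:4443}"

# Build the JSON API URL for an object: $1 = bucket, $2 = object name,
# $3 = representation ("json" for metadata, "media" for the object's bytes).
# The object name is not URL-encoded here, so this only suits simple names.
object_url() {
  printf '%s/storage/v1/b/%s/o/%s?alt=%s\n' "$FAKE_GCS_URL" "$1" "$2" "$3"
}

# Fetch object metadata, the same kind of request that appears in the server log.
fetch_metadata() {
  curl --silent --show-error --insecure "$(object_url "$1" "$2" json)"
}

# Fetch object content and write it to the local path given in $3.
fetch_content() {
  curl --silent --show-error --insecure --output "$3" "$(object_url "$1" "$2" media)"
}
```

For example, `fetch_metadata some-bucket hello.txt` should print the JSON metadata, and `fetch_content some-bucket hello.txt ./hello.txt` should save the file. If the second call succeeds, that would suggest the problem lies in the metadata response or in how gsutil processes it, rather than in the download endpoint itself.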
