
close is not idempotent in the face of certain permission errors #1459

Open
@nelhage

Description

Thanks for stopping by to let us know something could be better!

Originally filed as piskvorky/smart_open#858, but I think they're right on reflection that this should be regarded as a python-storage bug.

Environment details

  • OS type and version:
  • Python version: Python 3.11.11
  • pip version: pip 23.3.1
  • google-cloud-storage version:
Name: google-cloud-storage
Version: 2.18.2
Summary: Google Cloud Storage API client library
Home-page: https://github.com/googleapis/python-storage
Author: Google LLC
Author-email: googleapis-packages@google.com
License: Apache 2.0
Location: /root/.pyenv/versions/3.11.11/lib/python3.11/site-packages
Requires: google-api-core, google-auth, google-cloud-core, google-crc32c, google-resumable-media, requests
Required-by: gcsfs, google-cloud-aiplatform
---
Name: google-resumable-media
Version: 2.7.2
Summary: Utilities for Google Media Downloads and Resumable Uploads
Home-page: https://github.com/googleapis/google-resumable-media-python
Author: Google Cloud Platform
Author-email: googleapis-publisher@google.com
License: Apache 2.0
Location: /root/.pyenv/versions/3.11.11/lib/python3.11/site-packages
Requires: google-crc32c
Required-by: google-cloud-bigquery, google-cloud-storage

Steps to reproduce

  1. Given a (bucket, key) pair, where you have permissions to write into some parts of the bucket, but not the specific key
  2. Attempt to write to the key using python-storage
  3. Attempt to close() the write file handle multiple times
  4. The first attempt will fail with an InvalidResponse and a 403 error, as expected
  5. The second and all later attempts will fail with a confusing ValueError raised by an invariant check inside the library.

Note that if you don't have permissions anywhere on the bucket, then the problem doesn't occur, and close consistently raises InvalidResponse.

Code example

Example snippet that demonstrates the problem:

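from google.cloud import storage

# Placeholders: a bucket you can partly write to, and a key you lack permission on.
bucket = "my-bucket"
key = "path/to/forbidden-key"

client = storage.Client()
blob = client.get_bucket(bucket).blob(key)

fh = blob.open("wb")
fh.write(b"hello\n")
# Call close() repeatedly and report how each attempt fails.
for i in range(3):
    try:
        print(f"before {i=} {fh.closed=}")
        fh.close()
        print(f"attempt={i} success!")
    except Exception as ex:
        print(f"attempt={i} failed with: {type(ex)}: {ex}")

Sample output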
  • From a bucket I control, to a key I lack permission on:
before i=0 fh.closed=False
attempt=0 failed with: <class 'google.resumable_media.common.InvalidResponse'>: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PERMANENT_REDIRECT: 308>)
before i=1 fh.closed=False
attempt=1 failed with: <class 'ValueError'>: Upload is in an invalid state. To recover call `recover()`.
before i=2 fh.closed=False
attempt=2 failed with: <class 'ValueError'>: Upload is in an invalid state. To recover call `recover()`.
  • From an attempt to write to a public bucket (gs://gcp-public-data-arco-era5/co/eperm.dat)
before i=0 fh.closed=False
attempt=0 failed with: <class 'google.resumable_media.common.InvalidResponse'>: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.CREATED: 201>)
before i=1 fh.closed=False
attempt=1 failed with: <class 'google.resumable_media.common.InvalidResponse'>: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.CREATED: 201>)
before i=2 fh.closed=False
attempt=2 failed with: <class 'google.resumable_media.common.InvalidResponse'>: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.CREATED: 201>)
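
Until close() behaves consistently, a caller could guard against the second failure mode with a thin wrapper. The sketch below is only an illustration: `CloseOnce` and `FlakyWriter` are hypothetical names, not part of google-cloud-storage or smart_open, and `FlakyWriter` merely imitates the "different error on the second close" pattern from the first sample output without touching the network.

```python
class CloseOnce:
    """Hypothetical wrapper: calls the wrapped close() at most once."""

    def __init__(self, raw):
        self._raw = raw
        self._close_attempted = False
        self._close_error = None

    def write(self, data):
        return self._raw.write(data)

    def close(self):
        if self._close_attempted:
            # Repeat the original outcome instead of touching the writer again.
            if self._close_error is not None:
                raise self._close_error
            return
        self._close_attempted = True
        try:
            self._raw.close()
        except Exception as ex:
            self._close_error = ex
            raise


class FlakyWriter:
    """Stand-in writer whose close() fails differently on later calls."""

    def __init__(self):
        self.close_calls = 0

    def write(self, data):
        return len(data)

    def close(self):
        self.close_calls += 1
        if self.close_calls == 1:
            raise PermissionError("403 on first close")
        raise ValueError("Upload is in an invalid state.")


raw = FlakyWriter()
fh = CloseOnce(raw)
fh.write(b"hello\n")
for i in range(3):
    try:
        fh.close()
    except Exception as ex:
        print(f"attempt={i} failed with: {type(ex).__name__}: {ex}")
# Every attempt reports PermissionError; the wrapped close() runs only once.
```

Re-raising the stored exception, rather than silently returning, keeps the original 403 visible to a caller that closes the handle a second time from a `finally` block.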

Stack trace

Re-pasting the trace from piskvorky/smart_open#858, in which smart_open tries to close the file handle twice (once via a TextIOWrapper, and once directly):

---------------------------------------------------------------------------
InvalidResponse                           Traceback (most recent call last)
File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/smart_open/utils.py:220, in FileLikeProxy.__exit__(self, *args, **kwargs)
    219 try:
--> 220     return super().__exit__(*args, **kwargs)
    221 finally:

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/smart_open/utils.py:207, in TextIOWrapper.__exit__(self, exc_type, exc_val, exc_tb)
    206 if exc_type is None:
--> 207     self.close()

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/google/cloud/storage/fileio.py:437, in BlobWriter.close(self)
    436 if not self._buffer.closed:
--> 437     self._upload_chunks_from_buffer(1)
    438 self._buffer.close()

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/google/cloud/storage/fileio.py:417, in BlobWriter._upload_chunks_from_buffer(self, num_chunks)
    416 for _ in range(num_chunks):
--> 417     upload.transmit_next_chunk(transport, **kwargs)
    419 # Wipe the buffer of chunks uploaded, preserving any remaining data.

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/google/resumable_media/requests/upload.py:515, in ResumableUpload.transmit_next_chunk(self, transport, timeout)
    513     return result
--> 515 return _request_helpers.wait_and_retry(
    516     retriable_request, self._get_status_code, self._retry_strategy
    517 )

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/google/resumable_media/requests/_request_helpers.py:155, in wait_and_retry(func, get_status_code, retry_strategy)
    154 try:
--> 155     response = func()
    156 except _CONNECTION_ERROR_CLASSES as e:

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/google/resumable_media/requests/upload.py:511, in ResumableUpload.transmit_next_chunk.<locals>.retriable_request()
    507 result = transport.request(
    508     method, url, data=payload, headers=headers, timeout=timeout
    509 )
--> 511 self._process_resumable_response(result, len(payload))
    513 return result

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/google/resumable_media/_upload.py:690, in ResumableUpload._process_resumable_response(self, response, bytes_sent)
    670 """Process the response from an HTTP request.
    671
    672 This is everything that must be done after a request that doesn't
   (...)
    688 .. _sans-I/O: https://sans-io.readthedocs.io/
    689 """
--> 690 status_code = _helpers.require_status_code(
    691     response,
    692     (http.client.OK, http.client.PERMANENT_REDIRECT),
    693     self._get_status_code,
    694     callback=self._make_invalid,
    695 )
    696 if status_code == http.client.OK:
    697     # NOTE: We use the "local" information of ``bytes_sent`` to update
    698     #       ``bytes_uploaded``, but do not verify this against other
   (...)
    703     #       * ``stream.tell()`` (relying on fact that ``initiate()``
    704     #         requires stream to be at the beginning)

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/google/resumable_media/_helpers.py:108, in require_status_code(response, status_codes, get_status_code, callback)
    107         callback()
--> 108     raise common.InvalidResponse(
    109         response,
    110         "Request failed with status code",
    111         status_code,
    112         "Expected one of",
    113         *status_codes
    114     )
    115 return status_code

InvalidResponse: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PERMANENT_REDIRECT: 308>)

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[6], line 1
----> 1 with smart_open.smart_open(path, 'w') as fh:
      2     fh.write("hello\n")

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/smart_open/utils.py:222, in FileLikeProxy.__exit__(self, *args, **kwargs)
    220     return super().__exit__(*args, **kwargs)
    221 finally:
--> 222     self.__inner.__exit__(*args, **kwargs)

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/google/cloud/storage/fileio.py:437, in BlobWriter.close(self)
    435 def close(self):
    436     if not self._buffer.closed:
--> 437         self._upload_chunks_from_buffer(1)
    438     self._buffer.close()

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/google/cloud/storage/fileio.py:417, in BlobWriter._upload_chunks_from_buffer(self, num_chunks)
    415 # Upload chunks. The SlidingBuffer class will manage seek position.
    416 for _ in range(num_chunks):
--> 417     upload.transmit_next_chunk(transport, **kwargs)
    419 # Wipe the buffer of chunks uploaded, preserving any remaining data.
    420 self._buffer.flush()

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/google/resumable_media/requests/upload.py:503, in ResumableUpload.transmit_next_chunk(self, transport, timeout)
    424 def transmit_next_chunk(
    425     self,
    426     transport,
   (...)
    430     ),
    431 ):
    432     """Transmit the next chunk of the resource to be uploaded.
    433
    434     If the current upload was initiated with ``stream_final=False``,
   (...)
    501             does not match or is not available.
    502     """
--> 503     method, url, payload, headers = self._prepare_request()
    505     # Wrap the request business logic in a function to be retried.
    506     def retriable_request():

File ~/.pyenv/versions/3.11.11/lib/python3.11/site-packages/google/resumable_media/_upload.py:613, in ResumableUpload._prepare_request(self)
    611     raise ValueError("Upload has finished.")
    612 if self.invalid:
--> 613     raise ValueError(
    614         "Upload is in an invalid state. To recover call `recover()`."
    615     )
    616 if self.resumable_url is None:
    617     raise ValueError(
    618         "This upload has not been initiated. Please call "
    619         "initiate() before beginning to transmit chunks."
    620     )

ValueError: Upload is in an invalid state. To recover call `recover()`.
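
The two tracebacks suggest the following sequence: the 403 response triggers the `_make_invalid` callback before InvalidResponse is raised, and the next `transmit_next_chunk` call then trips the `self.invalid` check in `_prepare_request`. The toy model below sketches that sequence in isolation. It is not the library's code: the classes are hypothetical, and only the names `invalid`, `_make_invalid`, `_prepare_request` and `transmit_next_chunk` are taken from the trace.

```python
import http.client


class InvalidResponse(Exception):
    """Stand-in for google.resumable_media.common.InvalidResponse."""


class ToyResumableUpload:
    """Hypothetical, heavily simplified model of the upload state."""

    def __init__(self):
        self.invalid = False

    def _make_invalid(self):
        self.invalid = True

    def _prepare_request(self):
        if self.invalid:
            raise ValueError(
                "Upload is in an invalid state. To recover call `recover()`."
            )

    def transmit_next_chunk(self, status_code):
        self._prepare_request()
        if status_code not in (http.client.OK, http.client.PERMANENT_REDIRECT):
            # The callback runs first, so the invalid flag outlives the exception.
            self._make_invalid()
            raise InvalidResponse("Request failed with status code", status_code)
        return status_code


class ToyWriter:
    """Hypothetical writer that flushes one final chunk on close()."""

    def __init__(self, status_code):
        self._upload = ToyResumableUpload()
        self._status_code = status_code  # the response the fake server returns
        self.closed = False

    def close(self):
        if not self.closed:
            self._upload.transmit_next_chunk(self._status_code)
        self.closed = True  # never reached when the upload raises


def close_repeatedly(writer, attempts=3):
    """Return the outcome of each close() attempt as a short label."""
    outcomes = []
    for _ in range(attempts):
        try:
            writer.close()
            outcomes.append("ok")
        except Exception as ex:
            outcomes.append(type(ex).__name__)
    return outcomes


print(close_repeatedly(ToyWriter(403)))
# prints ['InvalidResponse', 'ValueError', 'ValueError']
```

In this model the writer is never marked closed after the first failure, so every later close() reaches the upload again and hits the invalid-state check, which matches the `fh.closed=False` lines in the sample output.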

Labels: api: storage
