feat(storage): log additional bytes received from GCS in read path by nidhiii-27 · Pull Request #17423 · googleapis/google-cloud-python

nidhiii-27 · 2026-06-11T09:40:50Z

It has been found that GCS can occasionally send additional bytes while reading from stream. This scenario should be logged properly for debugging and tracking purposes.

Fixes: 475824752

gemini-code-assist

Code Review

This pull request introduces warning logs when more bytes are downloaded than requested from GCS, implemented in both _download.py and blob.py, along with a new unit test. The review feedback highlights several critical and medium-severity issues: an undefined headers variable in _download.py that will cause a NameError, a missing function call to _get_headers in blob.py that will fail at runtime, and a flawed unit test that does not actually execute the download loop due to incorrect mocking of download.finished and consume_next_chunk. Additionally, suggestions are provided to ensure fallback values are correctly applied when bucket or object names are None, and to align the range-handling logic in blob.py with _download.py for consistency.

gemini-code-assist · 2026-06-11T09:43:45Z

+                requested_length = self.end + 1
+
+            if requested_length is not None and self._bytes_downloaded > requested_length:
+                if headers.get("x-goog-stored-content-encoding") != "gzip":


The variable headers is not defined in the scope of _process_response. This will raise a NameError at runtime when this code path is executed. It should be changed to response.headers.

Suggested change

if headers.get("x-goog-stored-content-encoding") != "gzip":

if response.headers.get("x-goog-stored-content-encoding") != "gzip":

gemini-code-assist · 2026-06-11T09:43:45Z

+                if response is not None and not media_helpers._is_decompressive_transcoding(
+                    response, download._get_headers
+                ):


The method download._get_headers is passed as a reference instead of being called. Since _is_decompressive_transcoding expects a dictionary of headers, this will raise an error at runtime when response is not None. It should be called as download._get_headers().

The unit test test__do_download_log_extra_bytes did not catch this because download.finished is a Mock object, which makes not download.finished evaluate to False. As a result, the while loop is skipped, response remains None, and the condition response is not None prevents _is_decompressive_transcoding from being called.

Suggested change

if response is not None and not media_helpers._is_decompressive_transcoding(

response, download._get_headers

):

if response is not None and not media_helpers._is_decompressive_transcoding(

response, download._get_headers()

):

gemini-code-assist · 2026-06-11T09:43:45Z

+                    bucket_name = getattr(self, "client_info_bucket_name", "unknown")
+                    object_name = getattr(self, "client_info_object_name", "unknown")


Since self.client_info_bucket_name and self.client_info_object_name are initialized in __init__ (and thus always exist on self), getattr(self, ..., "unknown") will return None if they were passed as None, rather than falling back to "unknown". To ensure they fall back to "unknown" when they are None, you should use or "unknown".

Suggested change

bucket_name = getattr(self, "client_info_bucket_name", "unknown")

object_name = getattr(self, "client_info_object_name", "unknown")

bucket_name = getattr(self, "client_info_bucket_name", None) or "unknown"

object_name = getattr(self, "client_info_object_name", None) or "unknown"

gemini-code-assist · 2026-06-11T09:43:45Z

+        requested_length = None
+        if start is not None and end is not None:
+            requested_length = end - start + 1
+        elif start is None and end is not None:
+            requested_length = end + 1


For consistency with the logic in _media/_download.py, blob.py should also handle the case where start is negative and end is None (which requests the last -start bytes of the blob).

Suggested change

requested_length = None

if start is not None and end is not None:

requested_length = end - start + 1

elif start is None and end is not None:

requested_length = end + 1

requested_length = None

if start is not None and start < 0 and end is None:

requested_length = -start

elif start is not None and end is not None:

requested_length = end - start + 1

elif start is None and end is not None:

requested_length = end + 1

gemini-code-assist · 2026-06-11T09:43:45Z

+            download = patched.return_value
+            download._bytes_downloaded = 10
+
+            mock_response = mock.Mock()
+            mock_response.headers = {}
+            download.consume.return_value = mock_response
+            download._get_headers.return_value = {}


The test does not actually execute the while not download.finished: loop because download.finished is a Mock object (which is truthy, making not download.finished evaluate to False). Additionally, the test mocks download.consume instead of download.consume_next_chunk.

To make the test effective and ensure the loop runs once, you should mock download.finished to return False then True, and mock download.consume_next_chunk instead of download.consume.

Suggested change

download = patched.return_value

download._bytes_downloaded = 10

mock_response = mock.Mock()

mock_response.headers = {}

download.consume.return_value = mock_response

download._get_headers.return_value = {}

download = patched.return_value

download._bytes_downloaded = 10

type(download).finished = mock.PropertyMock(side_effect=[False, True])

mock_response = mock.Mock()

mock_response.headers = {}

download.consume_next_chunk.return_value = mock_response

download._get_headers.return_value = {}

It has been found that GCS can occasionally send additional bytes while reading from stream. This scenario should be logged properly for debugging and tracking purposes. Fixes: 475824752 [Generated-by: AI]

nidhiii-27 · 2026-06-11T10:23:14Z

Addressed all PR review comments.

nidhiii-27 requested a review from a team as a code owner June 11, 2026 09:40

gemini-code-assist Bot reviewed Jun 11, 2026

View reviewed changes

nidhiii-27 marked this pull request as draft June 11, 2026 09:46

nidhiii-27 added ai-generated storage-feature-parity labels Jun 11, 2026

feat(storage): log additional bytes received from GCS in read path

ad8a555

It has been found that GCS can occasionally send additional bytes while reading from stream. This scenario should be logged properly for debugging and tracking purposes. Fixes: 475824752 [Generated-by: AI]

nidhiii-27 force-pushed the feat/log-additional-bytes branch from d7a9e73 to ad8a555 Compare June 11, 2026 10:23

Fix python formatting

bb8cd28

parthea assigned nidhiii-27 Jun 11, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(storage): log additional bytes received from GCS in read path#17423

feat(storage): log additional bytes received from GCS in read path#17423
nidhiii-27 wants to merge 2 commits into
mainfrom
feat/log-additional-bytes

nidhiii-27 commented Jun 11, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 11, 2026

Uh oh!

gemini-code-assist Bot Jun 11, 2026

Uh oh!

gemini-code-assist Bot Jun 11, 2026

Uh oh!

gemini-code-assist Bot Jun 11, 2026

Uh oh!

gemini-code-assist Bot Jun 11, 2026

Uh oh!

nidhiii-27 commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	if headers.get("x-goog-stored-content-encoding") != "gzip":
	if response.headers.get("x-goog-stored-content-encoding") != "gzip":

		bucket_name = getattr(self, "client_info_bucket_name", "unknown")
		object_name = getattr(self, "client_info_object_name", "unknown")

Conversation

nidhiii-27 commented Jun 11, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

nidhiii-27 commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant