Skip to content

feat(storage): add object contexts in Python GCS SDK#17039

Draft
nidhiii-27 wants to merge 3 commits into
mainfrom
feat-object-contexts-13325527155543531515
Draft

feat(storage): add object contexts in Python GCS SDK#17039
nidhiii-27 wants to merge 3 commits into
mainfrom
feat-object-contexts-13325527155543531515

Conversation

@nidhiii-27
Copy link
Copy Markdown
Contributor

This PR implements the "Object Contexts" feature in the google-cloud-storage Python SDK, ensuring feature parity with the Go and Java SDKs.

Key changes:

  1. Metadata Attachment: Added ObjectCustomContextPayload and ObjectContexts classes to google/cloud/storage/blob.py. These allow users to define custom key-value pairs (contexts) for objects.
  2. Blob Property: Added a contexts property to the Blob class with appropriate getters and setters. Added contexts to _WRITABLE_FIELDS to enable REST API support for patching and updating these contexts.
  3. Advanced Filtering: Updated the list_blobs method in both google/cloud/storage/client.py and google/cloud/storage/bucket.py to include a keyword-only filter_ parameter. This allows server-side filtering of objects based on various attributes, including custom contexts.
  4. gRPC Support: Enhanced google/cloud/storage/_grpc_conversions.py with:
    • blob_to_proto: Support for converting contexts to GCS V2 proto.
    • proto_to_blob: Support for populating Blob.contexts from GCS V2 proto.
    • get_update_mask: Logic for generating gRPC FieldMask for partial context updates (using shallow masks like contexts.custom.<key>).
  5. Testing: Added new unit tests in google/cloud/storage/tests/unit/test_blob.py and google/cloud/storage/tests/unit/test__grpc_conversions.py to verify the new functionality and ensure no regressions (verified with nox -s unit-3.12). Fixed a minor regression in async write tests.

This implementation allows for advanced metadata management and powerful server-side filtering as requested.


PR created automatically by Jules for task 13325527155543531515 started by @nidhiii-27

Implement object contexts feature parity with Go and Java SDKs. Allows users to attach custom key-value metadata payloads to objects and filter by them during list operations.

- Define ObjectCustomContextPayload and ObjectContexts classes in blob.py.
- Add 'contexts' property to Blob class and include it in _WRITABLE_FIELDS.
- Update list_blobs in Client and Bucket to support 'filter_' parameter.
- Implement gRPC conversion logic and update_mask generation in _grpc_conversions.py.
- Add comprehensive unit tests.

Co-authored-by: nidhiii-27 <224584462+nidhiii-27@users.noreply.github.com>
@google-labs-jules
Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for object contexts in the Google Cloud Storage Python SDK, adding the ObjectContexts and ObjectCustomContextPayload classes and a contexts property to the Blob class. It also implements server-side filtering for list_blobs via a new filter_ parameter. Feedback focuses on improving the robustness of protobuf conversions by checking for field presence rather than truthiness, ensuring FieldMask generation for contexts replaces the entire map for consistency with other SDK fields, and hardening the custom context setter to handle None values.

Comment on lines +124 to +126
value = getattr(proto, proto_field, None)
if value:
blob._properties[attr_name] = value
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using if value: to check for field presence in a protobuf message is problematic because it will skip falsy but valid values such as 0, False, or an empty string "". For proto-plus messages, it is more accurate to check for presence using the in operator.

Suggested change
value = getattr(proto, proto_field, None)
if value:
blob._properties[attr_name] = value
if proto_field in proto:
blob._properties[attr_name] = getattr(proto, proto_field)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jules change the if condition as mentioned above

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the condition to if "custom_time" in proto: as suggested.

Comment on lines +161 to +167
if change == "contexts":
contexts = getattr(blob, "contexts", None)
if contexts is None or not contexts.custom:
paths.append("contexts.custom")
else:
for key in contexts.custom:
paths.append(f"contexts.custom.{key}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current logic for generating the FieldMask for contexts uses individual keys for partial updates if the map is non-empty. However, this behavior is inconsistent with how other map fields (like metadata) are handled in the Python SDK, where the entire map is typically replaced to ensure the local state is synchronized with the server. Using individual keys prevents the deletion of keys that are present on the server but missing from the local blob.contexts.custom dictionary during a patch() operation. It is safer and more consistent to use contexts.custom as the path to replace the entire map.

Suggested change
if change == "contexts":
contexts = getattr(blob, "contexts", None)
if contexts is None or not contexts.custom:
paths.append("contexts.custom")
else:
for key in contexts.custom:
paths.append(f"contexts.custom.{key}")
if change == "contexts":
paths.append("contexts.custom")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jules apply the suggestion provided here

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied the suggestion to improve proto_to_blob and get_update_mask logic, including better property mapping and timestamp conversion.


@custom.setter
def custom(self, value):
self["custom"] = value
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The custom setter should ensure that the internal value is always a dictionary, even if None is passed. This prevents potential AttributeError or TypeError in other parts of the code (like blob_to_proto) that expect contexts.custom to be iterable.

Suggested change
self["custom"] = value
self["custom"] = value if value is not None else {}
References
  1. When processing optional parameters with a helper function, prefer placing the null or None check within the function body rather than at the call site to simplify the calling code and improve encapsulation.

google-labs-jules Bot and others added 2 commits May 12, 2026 05:53
- Improve gRPC conversion logic: use 'in' operator for field presence check and improve property mapping.
- Refine update_mask generation logic.
- Update tests to match new conversion logic.

Co-authored-by: nidhiii-27 <224584462+nidhiii-27@users.noreply.github.com>
- Implement ObjectContexts and ObjectCustomContextPayload.
- Add 'contexts' property to Blob.
- Add 'filter_' parameter to list_blobs.
- Add gRPC conversion for contexts and update_mask generation.
- Fix unit tests to avoid ADC errors.
- Format code with ruff.

Co-authored-by: nidhiii-27 <224584462+nidhiii-27@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant