refactor(bigframes): Extract json conversions to distinct ops by TrevorBergeron · Pull Request #17473 · googleapis/google-cloud-python

TrevorBergeron · 2026-06-16T01:19:08Z

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
Ensure the tests and linter pass
Code coverage does not decrease (if any source code was changed)
Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

gemini-code-assist

Code Review

This pull request refactors JSON casting by replacing the generic AsTypeOp with explicit ToJSON and JSONDecode operations across the compilation backends (Ibis, Polars, and SQLGlot). Feedback suggests optimizing DataFrame.astype to avoid a performance regression caused by column-by-column assignment on non-JSON types, and improving type safety in the Polars compiler by using a more specific type annotation for the ToJSON operation.

gemini-code-assist · 2026-06-16T01:20:49Z

+        result = self.copy()
+        for col in result.columns:
+            result[col] = result[col].astype(dtype, errors=errors)
+        return result


Looping over columns and performing individual series assignments (result[col] = result[col].astype(...)) introduces a significant performance regression for non-JSON data types. In bigframes, column-by-column assignment creates multiple intermediate DataFrame states and nested expression trees, which can lead to slow compilation times and complex SQL queries.

We can avoid this overhead by fast-pathing the common cases where no JSON columns are involved, or where all columns are being cast to/from JSON, applying the operation to all columns at once using _apply_unary_op.

safe = errors == "null" dtype = bigframes.dtypes.bigframes_type(dtype) if dtype == bigframes.dtypes.JSON_DTYPE: return self._apply_unary_op(ops.ToJSON(safe=safe)) if all(t == bigframes.dtypes.JSON_DTYPE for t in self.dtypes): return self._apply_unary_op(ops.JSONDecode(to_type=dtype, safe=safe)) if not any(t == bigframes.dtypes.JSON_DTYPE for t in self.dtypes): return self._apply_unary_op(ops.AsTypeOp(to_type=dtype, safe=safe)) result = self.copy() for col in result.columns: result[col] = result[col].astype(dtype, errors=errors) return result

gemini-code-assist · 2026-06-16T01:20:49Z

            return input.str.json_decode(_DTYPE_MAPPING[op.to_type])

+        @compile_op.register(json_ops.ToJSON)
+        def _(self, op: ops.ScalarOp, input: pl.Expr) -> pl.Expr:


For consistency and better type safety, annotate the op parameter with the specific json_ops.ToJSON type instead of the generic ops.ScalarOp, matching the pattern used in other registered operations (such as ops.ToArrayOp and json_ops.JSONDecode).

Suggested change

def _(self, op: ops.ScalarOp, input: pl.Expr) -> pl.Expr:

def _(self, op: json_ops.ToJSON, input: pl.Expr) -> pl.Expr:

refactor(bigframes): Extract json conversions to distinct ops

a967dd9

gemini-code-assist Bot reviewed Jun 16, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(bigframes): Extract json conversions to distinct ops#17473

refactor(bigframes): Extract json conversions to distinct ops#17473
TrevorBergeron wants to merge 1 commit into
mainfrom
tbergeron_json-cast_extract

TrevorBergeron commented Jun 16, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 16, 2026

Uh oh!

gemini-code-assist Bot Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	def _(self, op: ops.ScalarOp, input: pl.Expr) -> pl.Expr:
	def _(self, op: json_ops.ToJSON, input: pl.Expr) -> pl.Expr:

Conversation

TrevorBergeron commented Jun 16, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant