Add parameter to set maxNumRows for dataflow jobs #3632

Merged: VardhanThigle merged 3 commits into GoogleCloudPlatform:main from swatilodha-dev:local on Apr 9, 2026
@swatilodha-dev swatilodha-dev commented Apr 3, 2026

Added support for a runtime parameter, -maxNumRows, that can be set when running gcloud dataflow jobs run.

Currently the default value for this flag is 500.

@swatilodha-dev swatilodha-dev requested a review from a team as a code owner April 3, 2026 17:13
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a configurable limit for the number of rows in Spanner dataflow jobs. By transitioning the row limit parameter to a ValueProvider, the system now supports dynamic runtime adjustments, providing better control over batching behavior and resource utilization during large-scale data imports.

Highlights

  • New Runtime Parameter: Added a new tunable runtime parameter -maxNumRows to the Dataflow job configuration, allowing users to limit the number of rows processed in Spanner write operations.
  • Type Migration: Updated the maxNumRows configuration from a static long to a ValueProvider<Integer> across the pipeline, enabling dynamic configuration at runtime.
  • Logic Adjustments: Refactored batching logic in SpannerIO to utilize the new ValueProvider and added logging to improve observability of the row limit during execution.
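
The highlights above can be illustrated with a minimal sketch. It is not the code from this pull request: it uses `java.util.function.Supplier<Integer>` as a stand-in for Beam's `ValueProvider<Integer>` so that it runs without Beam on the classpath, and the class name `MaxNumRowsSketch` and the `batch` helper are hypothetical. The point it shows is that the row limit is read only when batching actually runs, so a value supplied at job launch takes effect. The real SpannerIO batching logic is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: Supplier<Integer> stands in for Beam's ValueProvider<Integer>.
class MaxNumRowsSketch {
  // Default stated in the PR description.
  static final int DEFAULT_MAX_NUM_ROWS = 500;

  /** Splits rows into batches of at most maxNumRows.get() rows. */
  static <T> List<List<T>> batch(List<T> rows, Supplier<Integer> maxNumRows) {
    // The limit is resolved here, at execution time, not when the pipeline is built.
    int limit = maxNumRows.get();
    if (limit <= 0) {
      throw new IllegalArgumentException("maxNumRows must be positive, got " + limit);
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < rows.size(); start += limit) {
      int end = Math.min(start + limit, rows.size());
      batches.add(new ArrayList<>(rows.subList(start, end)));
    }
    return batches;
  }

  public static void main(String[] args) {
    List<Integer> rows = new ArrayList<>();
    for (int i = 0; i < 1200; i++) {
      rows.add(i);
    }

    // A mutable holder simulates a value that is only known at run time.
    int[] configured = {DEFAULT_MAX_NUM_ROWS};
    Supplier<Integer> maxNumRows = () -> configured[0];

    System.out.println(batch(rows, maxNumRows).size()); // 3 batches: 500, 500, 200

    configured[0] = 100; // e.g. the user passed a smaller limit at launch
    System.out.println(batch(rows, maxNumRows).size()); // 12 batches of 100
  }
}
```

Passing the supplier rather than a plain `int` is the design choice the type migration describes: the same batching code picks up whatever value is available when it executes.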
@rohitwali rohitwali left a comment

Thanks for the changes. Could you also add a section on the testing done with these changes?

Comment thread v1/src/main/java/com/google/cloud/teleport/spanner/spannerio/SpannerIO.java Outdated
Comment thread v1/src/main/java/com/google/cloud/teleport/spanner/spannerio/SpannerIO.java Outdated
@rohitwali rohitwali left a comment

Some of the approval workflows have failed. Can you please take a look?

@swatilodha-dev (Contributor, Author) replied, quoting @rohitwali:

"Some of the approval workflows have failed. Can you please take a look?"

Updated the unit test. Could you please restart the tests? Or is there a way for me to do it?

codecov Bot commented Apr 8, 2026

Codecov Report

❌ Patch coverage is 57.69231% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.30%. Comparing base (574726c) to head (6f676e5).
⚠️ Report is 55 commits behind head on main.

Files with missing lines | Patch % | Lines
...le/cloud/teleport/spanner/TextImportTransform.java | 0.00% | 4 Missing ⚠️
...le/cloud/teleport/spanner/spannerio/SpannerIO.java | 78.94% | 1 Missing and 3 partials ⚠️
...gle/cloud/teleport/spanner/TextImportPipeline.java | 0.00% | 3 Missing ⚠️
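
The patch figure can be cross-checked against the per-file rows above. This is a sketch under two assumptions: that coverage is hits divided by hits plus misses plus partials, and that SpannerIO has 15 covered lines. The hit count is not stated in the report; it is inferred from 78.94% with 4 uncovered lines. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper for re-deriving the patch coverage figure in the report.
class PatchCoverageSketch {
  /** Coverage percentage, assuming hits / (hits + misses + partials) * 100. */
  static double coveragePercent(int hits, int misses, int partials) {
    int total = hits + misses + partials;
    if (total == 0) {
      throw new IllegalArgumentException("no lines to measure");
    }
    return 100.0 * hits / total;
  }

  public static void main(String[] args) {
    // Inferred per-file counts: SpannerIO 15 hits, 1 miss, 3 partials;
    // TextImportTransform 0 hits, 4 misses; TextImportPipeline 0 hits, 3 misses.
    int hits = 15 + 0 + 0;
    int misses = 1 + 4 + 3;
    int partials = 3 + 0 + 0;

    double patch = coveragePercent(hits, misses, partials);
    System.out.println(String.format(Locale.ROOT, "%.5f", patch)); // 57.69231
  }
}
```

Under these assumptions the patch touches 26 measured lines, of which 11 (8 missing plus 3 partial) lack full coverage, matching the "11 lines in your changes missing coverage" line.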
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3632      +/-   ##
============================================
- Coverage     52.34%   52.30%   -0.05%     
+ Complexity     6142     5743     -399     
============================================
  Files          1053     1054       +1     
  Lines         63361    63636     +275     
  Branches       6947     6998      +51     
============================================
+ Hits          33169    33282     +113     
- Misses        27941    28092     +151     
- Partials       2251     2262      +11     
Components | Coverage Δ
spanner-templates | 72.11% <57.69%> (-0.03%) ⬇️
spanner-import-export | 68.76% <57.69%> (-0.15%) ⬇️
spanner-live-forward-migration | 80.43% <ø> (+0.08%) ⬆️
spanner-live-reverse-replication | 77.81% <ø> (+0.04%) ⬆️
spanner-bulk-migration | 89.18% <ø> (+0.01%) ⬆️
gcs-spanner-dv | 85.36% <ø> (+0.05%) ⬆️
Files with missing lines | Coverage Δ
...t/spanner/spannerio/SpannerTransformRegistrar.java | 74.21% <ø> (ø)
...gle/cloud/teleport/spanner/TextImportPipeline.java | 0.00% <0.00%> (ø)
...le/cloud/teleport/spanner/TextImportTransform.java | 40.62% <0.00%> (-0.26%) ⬇️
...le/cloud/teleport/spanner/spannerio/SpannerIO.java | 68.68% <78.94%> (+0.30%) ⬆️

... and 15 files with indirect coverage changes

@rohitwali rohitwali added the improvement label (Making existing code better) on Apr 9, 2026
@VardhanThigle VardhanThigle merged commit 5ef5a13 into GoogleCloudPlatform:main Apr 9, 2026
15 of 16 checks passed
Labels

improvement (Making existing code better), size/L

3 participants