Skip to content

perf(spanner): optimize query result decoding#8452

Open
surbhigarg92 wants to merge 3 commits into
mainfrom
spanner_optimize_decoder
Open

perf(spanner): optimize query result decoding#8452
surbhigarg92 wants to merge 3 commits into
mainfrom
spanner_optimize_decoder

Conversation

@surbhigarg92
Copy link
Copy Markdown
Contributor

@surbhigarg92 surbhigarg92 commented Jun 8, 2026

  • Optimized Row Decoding: Compiles type-specific decoders once per schema and reuses the generated closure for each row, significantly reducing runtime CPU overhead for large result sets.

  • Optimized STRUCT/JSON Decoding: Refactored the JIT STRUCT compiler to pre-compute field mappings, nameless field filtering, and default option configurations (wrapStructs) at compile-time. Explicitly checks empty-string nameless fields (name !== null && name !== undefined) to ensure correct recursive metadata propagation.

  • Optimized DATE and TIMESTAMP Parsing: Implemented high-performance fast-path parsers for standard Zulu timestamp and date formats, avoiding regex and split allocation overhead. Hardened these paths using type-safe Number conversions and strict NaN short-circuiting to safely fall back to standard constructors and prevent silent date corruption.

@product-auto-label product-auto-label Bot added the api: spanner Issues related to the Spanner API. label Jun 8, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request optimizes Spanner decoding performance by extracting regex constants, introducing a pre-compiled column-specific decoder generator (getDecoder), and adding a direct JSON row creation path (_createJsonRow) to bypass wrapper classes. The review feedback highlights critical correctness bugs in codec.ts where logical OR (||) fallbacks incorrectly handle valid falsy values, incomplete bounds checks for PG_OID and INT64 that fail to check against Number.MIN_SAFE_INTEGER, and a potential runtime error when accessing the prototype of columnMetadata if it is a plain object.

Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts
@surbhigarg92 surbhigarg92 force-pushed the spanner_optimize_decoder branch from cf56674 to df8089e Compare June 8, 2026 09:12
@surbhigarg92
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant performance optimizations to the Spanner client's decoding pipeline. Key changes include pre-compiling column-specific decoder functions via 'getDecoder', adding fast-path parsing for standard date and RFC 3339 timestamp strings, optimizing loops, and bypassing class wrappers in JSON mode. Feedback on these changes highlights several correctness issues in 'codec.ts': the date fast-path parser needs to restrict years to '>= 1970' to avoid 2-digit year bugs; nested 'STRUCT' decoders must correctly index nested metadata by field name; empty string field names should be checked using 'name != null' to prevent incorrect fallback; and an unincremented 'index' variable in 'convertFieldsToJson' needs to be replaced with the loop index 'i' to prevent nameless fields from overwriting each other.

Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts Outdated
@surbhigarg92
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant performance optimizations for decoding Spanner values, including fast-path parsing for standard date and UTC timestamp strings, a pre-compiled column-specific decoder (getDecoder), and optimizations in PartialResultStream to bypass class wrappers when creating JSON rows. The review feedback highlights critical improvement opportunities in the fast-path parsing logic, pointing out that using parseInt on unvalidated substrings could lead to silent data corruption for malformed inputs. Additionally, the reviewer identified a bug where struct fields with empty string names would fail to retrieve their associated metadata due to an implicit falsy check.

Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts Outdated
@surbhigarg92 surbhigarg92 force-pushed the spanner_optimize_decoder branch from 6d65860 to 34c1f8b Compare June 8, 2026 11:41
@surbhigarg92
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant performance optimizations for Spanner data decoding. It adds fast-path parsing for standard date and timestamp strings, pre-compiles column-specific decoders via getDecoder to avoid repeated type switch-case evaluations, and implements a direct JSON row creation path in PartialResultStream to bypass wrapper classes. The review feedback highlights critical safety improvements regarding dynamic property access on objects like columnMetadata, val, and columnsMetadata. Specifically, accessing properties using user-defined column names or enum values can accidentally resolve to built-in prototype properties (such as toString or valueOf), leading to runtime errors. Using Object.prototype.hasOwnProperty.call is recommended to safely check for own properties in these cases.

Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts
Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/codec.ts Outdated
Comment thread handwritten/spanner/src/partial-result-stream.ts Outdated
@surbhigarg92 surbhigarg92 force-pushed the spanner_optimize_decoder branch from 34c1f8b to e9809d7 Compare June 8, 2026 12:36
@surbhigarg92
Copy link
Copy Markdown
Contributor Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces high-performance fast-path parsing for standard Spanner dates and UTC timestamps, pre-compiles column-specific decoders via getDecoder, and optimizes JSON row creation in PartialResultStream to bypass wrapper classes. Feedback on these changes highlights several correctness and robustness issues: the fast-path date and timestamp parsers should validate constructed date components to prevent silent rollover of invalid dates, and the timestamp parser needs stricter validation against malformed strings. Additionally, the 2-digit year mapping check should include year 0 to ensure it is correctly preserved, and a defensive check should be added in decodeValue_ to prevent runtime errors when the value is undefined.

Comment thread handwritten/spanner/src/codec.ts
Comment on lines +1239 to +1272
const dotIndex = isoString.indexOf('.', 19);
if (dotIndex !== -1) {
const subSecondsStr = isoString.substring(
dotIndex + 1,
isoString.length - 1,
);
const padded = subSecondsStr.padEnd(9, '0');
milliseconds = Number(padded.substring(0, 3));
microseconds = Number(padded.substring(3, 6));
nanoseconds = Number(padded.substring(6, 9));

if (
Number.isNaN(milliseconds) ||
Number.isNaN(microseconds) ||
Number.isNaN(nanoseconds)
) {
return new PreciseDate(isoString);
}
}

const utcMillis = Date.UTC(
year,
month,
day,
hours,
minutes,
seconds,
milliseconds,
);

const preciseDate = new PreciseDate(utcMillis);
preciseDate.setMicroseconds(microseconds);
preciseDate.setNanoseconds(nanoseconds);
return preciseDate;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The fast-path parser for PreciseDate has a few correctness issues that can lead to silent acceptance of malformed timestamps or silent rollover of invalid dates:

  1. If the sub-seconds part contains non-digits after the 9th decimal place (e.g., "2021-05-11T16:46:04.872345678abcZ"), the parser will truncate it and successfully parse it instead of falling back to the native constructor (which would produce an Invalid Date). We should validate that subSecondsStr contains only digits using DIGITS_REGEX.
  2. If there is no dot but the string has extra characters (e.g., "2021-05-11T16:46:04abcZ"), the parser will ignore them. We should ensure isoString.length === 20 when there is no dot.
  3. Out-of-range components (e.g., month 13 or day 30 in February) will silently roll over via Date.UTC. We should validate the constructed UTC date components against the parsed components to ensure correctness.
    const dotIndex = isoString.indexOf('.', 19);
    if (dotIndex !== -1) {
      if (dotIndex !== 19) {
        return new PreciseDate(isoString);
      }
      const subSecondsStr = isoString.substring(
        dotIndex + 1,
        isoString.length - 1,
      );
      if (!DIGITS_REGEX.test(subSecondsStr)) {
        return new PreciseDate(isoString);
      }
      const padded = subSecondsStr.padEnd(9, '0');
      milliseconds = Number(padded.substring(0, 3));
      microseconds = Number(padded.substring(3, 6));
      nanoseconds = Number(padded.substring(6, 9));
    } else if (isoString.length !== 20) {
      return new PreciseDate(isoString);
    }

    const utcMillis = Date.UTC(
      year,
      month,
      day,
      hours,
      minutes,
      seconds,
      milliseconds,
    );

    const dateCheck = new Date(utcMillis);
    if (
      dateCheck.getUTCFullYear() !== year ||
      dateCheck.getUTCMonth() !== month ||
      dateCheck.getUTCDate() !== day ||
      dateCheck.getUTCHours() !== hours ||
      dateCheck.getUTCMinutes() !== minutes ||
      dateCheck.getUTCSeconds() !== seconds
    ) {
      return new PreciseDate(isoString);
    }

    const preciseDate = new PreciseDate(utcMillis);
    preciseDate.setMicroseconds(microseconds);
    preciseDate.setNanoseconds(nanoseconds);
    return preciseDate;

Comment thread handwritten/spanner/src/codec.ts
Comment thread handwritten/spanner/src/common-grpc/service.ts
@surbhigarg92 surbhigarg92 marked this pull request as ready for review June 8, 2026 13:06
@surbhigarg92 surbhigarg92 requested a review from a team as a code owner June 8, 2026 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: spanner Issues related to the Spanner API.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant