feat(storage): Adding takeover operation checksum for bidi appendable uploads#13361
feat(storage): Adding takeover operation checksum for bidi appendable uploads#13361Dhriti07 wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces CumulativeHasher, a wrapper around Hasher that accumulates checksums and validates them at the end of a full object read. It integrates this cumulative validation into BaseObjectReadSessionStreamRead, GapicUnbufferedReadableByteChannel, and ObjectReadSessionStream, and adds comprehensive unit and integration tests. The reviewer feedback highlights two important issues: first, GapicUnbufferedReadableByteChannel should check !(hasher instanceof Hasher.NoOpHasher) before wrapping to prevent false-positive validation failures when checksumming is disabled; second, the cumulativeHash field in CumulativeHasher should be declared volatile to ensure thread safety and proper memory visibility across threads.
I am having trouble creating individual review comments. Click here to see my feedback.
java-storage/google-cloud-storage/src/main/java/com/google/cloud/storage/GapicUnbufferedReadableByteChannel.java (95-103)
When checksum validation is disabled, hasher is an instance of Hasher.NoOpHasher. In GapicUnbufferedReadableByteChannel, we do not check for Hasher.NoOpHasher before wrapping it in CumulativeHasher. This will cause CumulativeHasher to be used even when checksumming is disabled, leading to a false-positive UncheckedCumulativeChecksumMismatchException at the end of a full-object read (since the cumulative hash will remain 0 while the expected metadata hash is non-zero).
We should add a check !(hasher instanceof Hasher.NoOpHasher) to match the behavior in BaseObjectReadSessionStreamRead.
this.hasher =
(req.getReadOffset() == 0 && !(hasher instanceof Hasher.NoOpHasher))
? new CumulativeHasher(
hasher,
0,
req.getReadLimit() <= 0
? OptionalLong.empty()
: OptionalLong.of(req.getReadLimit()))
: hasher;
java-storage/google-cloud-storage/src/main/java/com/google/cloud/storage/CumulativeHasher.java (37)
The cumulativeHash field is updated by the thread processing the stream responses (e.g., the gRPC callback thread or user thread) and read by another thread (e.g., the executor thread in ObjectReadSessionStream or user thread). To ensure proper memory visibility across different threads and avoid potential concurrency issues, cumulativeHash should be declared as volatile.
private volatile Crc32cLengthKnown cumulativeHash;
Adding takeover operation checksum for bidi appendable uploads