Fix edge cases with UTF-8 strings in ChecksumResultSet #13441
Open
Noremac201 wants to merge 1 commit into
Contributor
Code Review
This pull request updates ChecksumResultSet to ensure a minimum buffer size of 4 bytes when allocating a ByteBuffer and switches from buffer.flip() to buffer.clear() to correctly reset the buffer. It also adds new unit tests covering empty, multi-byte, and mixed UTF-8 strings. The review feedback suggests refactoring these new tests to extract the duplicate mock setup and initialization logic into a helper method, which will improve readability and maintainability.
b0017ca to 1a1b88e
…tSet

1. Ensure a minimum capacity of 4 bytes when allocating the buffer, since 4 bytes is the maximum size of a UTF-8 encoded character. The code sizes the byte buffer from the Java string length, which can be too small to hold even a single UTF-8 character.
2. Use buffer.clear() instead of the second buffer.flip(). This matters for strings that mix multi-byte and single-byte UTF-8 characters. A test was added to demonstrate this: it passes with clear() and fails with flip().

Fixes googleapis#13440
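To illustrate point 1, here is a minimal standalone sketch (not the actual ChecksumResultSet code) of why a buffer sized from `String.length()` can be too small. The class name `Utf8Capacity` and its helper methods are hypothetical and exist only for this example.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the pull request.
class Utf8Capacity {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Runs one encode pass into a buffer of the given capacity and
    // reports how many bytes the encoder managed to write.
    static int bytesWrittenOnFirstPass(String s, int capacity) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        ByteBuffer buffer = ByteBuffer.allocate(capacity);
        encoder.encode(CharBuffer.wrap(s), buffer, true);
        return buffer.position();
    }

    public static void main(String[] args) {
        String euro = "\u20AC"; // EURO SIGN: 1 Java char, 3 UTF-8 bytes
        System.out.println("chars: " + euro.length());
        System.out.println("utf8 bytes: " + utf8Length(euro));
        // A buffer sized from length() cannot hold the character,
        // so the encoder reports overflow without writing anything.
        System.out.println("written, capacity 1: "
            + bytesWrittenOnFirstPass(euro, euro.length()));
        // With the 4-byte minimum the character always fits.
        System.out.println("written, capacity 4: "
            + bytesWrittenOnFirstPass(euro, Math.max(4, euro.length())));
    }
}
```

If a loop keeps retrying on overflow with a buffer this small, the encoder never makes progress, which is why a floor of 4 bytes is enough to guarantee that at least one character fits per pass.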
1a1b88e to 75a42cf
1. Ensure a minimum capacity of 4 bytes when allocating the buffer, since 4 bytes is the maximum size of a UTF-8 encoded character. The code sizes the byte buffer from the Java string length, which can be too small to hold even a single UTF-8 character.
2. Use buffer.clear() instead of the second buffer.flip(). This resets the buffer's write limit back to its full capacity for the next iteration, instead of shrinking it to the size of the previous write. This matters for strings that mix multi-byte and single-byte UTF-8 characters. A test was added to demonstrate this: it passes with clear() and fails with flip().

Fixes #13440
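For reviewers who want to see the flip() versus clear() difference in isolation, the following is a rough sketch, assuming a drain loop built on `CharsetEncoder` and `ByteBuffer`. The class `BufferReset` and its methods are hypothetical and do not reproduce the ChecksumResultSet implementation; a `ByteArrayOutputStream` stands in for whatever consumes the encoded bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the pull request.
class BufferReset {
    // Writes one byte, drains it, then resets the buffer with either
    // clear() or a second flip(), and returns the resulting limit.
    static int limitAfterDrain(boolean useClear) {
        ByteBuffer buffer = ByteBuffer.allocate(4);
        buffer.put((byte) 'a'); // position = 1
        buffer.flip();          // limit = 1, position = 0 (ready to read)
        buffer.get();           // consumer reads the byte, position = 1
        if (useClear) {
            buffer.clear();     // limit = capacity, position = 0
        } else {
            buffer.flip();      // limit = old position (1), position = 0
        }
        return buffer.limit();
    }

    // Encodes a string to UTF-8 through a small reusable buffer,
    // draining and clearing it whenever the encoder reports overflow.
    static byte[] encodeInChunks(String s, int capacity) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap(s);
        // Floor of 4 bytes so any single UTF-8 character fits.
        ByteBuffer buffer = ByteBuffer.allocate(Math.max(4, capacity));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        CoderResult result;
        do {
            result = encoder.encode(in, buffer, true);
            buffer.flip();                                // switch to reading
            out.write(buffer.array(), 0, buffer.limit()); // drain
            buffer.clear();                               // full capacity again
        } while (result.isOverflow());
        encoder.flush(buffer); // emits nothing for UTF-8, kept for completeness
        buffer.flip();
        out.write(buffer.array(), 0, buffer.limit());
        return out.toByteArray();
    }
}
```

In this sketch, a second flip() after a one-byte write leaves the limit at 1, so the next multi-byte character cannot fit, while clear() restores the limit to the full capacity. Calling `BufferReset.encodeInChunks(s, s.length())` on a mixed string should yield the same bytes as `s.getBytes(StandardCharsets.UTF_8)`.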