Skip to content

chore(storage): add microbenchmark for MRD reads#17421

Open
chandra-siri wants to merge 1 commit into
mainfrom
perf-benchmark-mrd-reads
Open

chore(storage): add microbenchmark for MRD reads#17421
chandra-siri wants to merge 1 commit into
mainfrom
perf-benchmark-mrd-reads

Conversation

@chandra-siri

Copy link
Copy Markdown
Contributor

Add a microbenchmark script for measuring multi-range downloader (MRD) reads performance.

@chandra-siri chandra-siri requested a review from a team as a code owner June 11, 2026 05:44
@chandra-siri chandra-siri changed the title chore: add microbenchmark for MRD reads chore(storage): add microbenchmark for MRD reads Jun 11, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new microbenchmark script, benchmark_mrd_reads.py, to measure the performance of GCS Object Range Downloads using the AsyncMultiRangeDownloader (MRD). The review feedback highlights a potential ZeroDivisionError or incorrect slicing during random object generation if the upload chunk size exceeds the buffer size. Additionally, the reviewer suggests refactoring the upload and benchmark execution blocks to ensure that test objects are reliably cleaned up from the GCS bucket even if the upload or benchmark run fails.

Comment on lines +124 to +126
# Generate 10MiB of random buffer to slice from to avoid CPU urandom overhead
buffer_size = min(10 * 1024 * 1024, total_size_bytes)
random_buffer = os.urandom(buffer_size)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

If chunk_size_bytes is larger than buffer_size (which is capped at 10 MiB), bytes_to_write can exceed buffer_size. This leads to two critical issues:

  1. If bytes_to_write is exactly buffer_size + 1, buffer_size - bytes_to_write + 1 becomes 0, causing a ZeroDivisionError when calculating slice_start via modulo.
  2. If bytes_to_write is even larger, buffer_size - bytes_to_write + 1 becomes negative, resulting in incorrect slicing of random_buffer and silent data truncation/corruption during the upload.

To fix this, ensure buffer_size is at least chunk_size_bytes (up to total_size_bytes).

Suggested change
# Generate 10MiB of random buffer to slice from to avoid CPU urandom overhead
buffer_size = min(10 * 1024 * 1024, total_size_bytes)
random_buffer = os.urandom(buffer_size)
# Generate random buffer to slice from to avoid CPU urandom overhead.
# Ensure buffer_size is at least chunk_size_bytes to prevent ZeroDivisionError or slicing beyond the buffer limit.
buffer_size = min(max(10 * 1024 * 1024, chunk_size_bytes), total_size_bytes)
random_buffer = os.urandom(buffer_size)

Comment on lines +218 to +232
try:
chunk_size = min(upload_chunk_size_bytes, size_bytes)
await upload_random_object(
grpc_client,
args.bucket,
object_name,
size_bytes,
chunk_size,
)
except Exception as e:
logging.error(f"Upload failed for size {size_str}: {e}")
sys.exit(1)

try:
chk_label = "Enabled" if enable_chk else "Disabled"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If upload_random_object fails or is interrupted, the benchmark script exits immediately without running the finally block that deletes the test object. This can leave orphaned objects in the GCS bucket.

By merging the upload phase into the main try block, we ensure that any partially uploaded or created object is always cleaned up by the finally block, even if the upload fails.

Suggested change
try:
chunk_size = min(upload_chunk_size_bytes, size_bytes)
await upload_random_object(
grpc_client,
args.bucket,
object_name,
size_bytes,
chunk_size,
)
except Exception as e:
logging.error(f"Upload failed for size {size_str}: {e}")
sys.exit(1)
try:
chk_label = "Enabled" if enable_chk else "Disabled"
try:
chunk_size = min(upload_chunk_size_bytes, size_bytes)
await upload_random_object(
grpc_client,
args.bucket,
object_name,
size_bytes,
chunk_size,
)
chk_label = "Enabled" if enable_chk else "Disabled"

Comment on lines +413 to +423
finally:
try:
logging.info(
f"Cleaning up object gs://{args.bucket}/{object_name}..."
)
await grpc_client.delete_object(args.bucket, object_name)
logging.info("Cleanup complete.")
except Exception as e:
logging.warning(
f"Warning: failed to delete test object {object_name}: {e}"
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This accompanies the change to merge the upload phase into the main try block. Adding an except block to the outer try ensures that any fatal benchmark or upload failures are logged nicely and exit with status code 1, while still executing the finally block to clean up the test object.

Suggested change
finally:
try:
logging.info(
f"Cleaning up object gs://{args.bucket}/{object_name}..."
)
await grpc_client.delete_object(args.bucket, object_name)
logging.info("Cleanup complete.")
except Exception as e:
logging.warning(
f"Warning: failed to delete test object {object_name}: {e}"
)
except Exception as e:
logging.error(f"Benchmark run failed for size {size_str}: {e}")
sys.exit(1)
finally:
try:
logging.info(
f"Cleaning up object gs://{args.bucket}/{object_name}..."
)
await grpc_client.delete_object(args.bucket, object_name)
logging.info("Cleanup complete.")
except Exception as e:
logging.warning(
f"Warning: failed to delete test object {object_name}: {e}"
)
References
  1. Avoid broad except Exception: blocks that silently return None. Instead, log the exception (e.g., using logger.warning) to aid in debugging and prevent masking underlying issues.

@parthea

parthea commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

@chandra-siri, Please could you address the feedback from gemini-code-assist?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants