From 7e325dbf698c3a282de3e7810a350aefcfc9360d Mon Sep 17 00:00:00 2001 From: Heba Alazzeh Date: Mon, 15 Jun 2026 21:27:07 +0000 Subject: [PATCH 01/15] feat: add import profiler tool with dynamic loaded lines code volume metrics --- scripts/import_profiler/documentation.md | 74 ++++++++ scripts/import_profiler/profiler.py | 220 +++++++++++++++++++++++ 2 files changed, 294 insertions(+) create mode 100644 scripts/import_profiler/documentation.md create mode 100644 scripts/import_profiler/profiler.py diff --git a/scripts/import_profiler/documentation.md b/scripts/import_profiler/documentation.md new file mode 100644 index 000000000000..d88696e27629 --- /dev/null +++ b/scripts/import_profiler/documentation.md @@ -0,0 +1,74 @@ +# Python SDK Import Profiler: Documentation & Breakdown + +This document provides a comprehensive guide to the `import_profiler` scripts, directory files, and how to analyze the generated import trace logs to target optimization areas, along with the details of the implemented lazy-loading refactor. + +--- + +## 1. File Guide & Directory Structure +The profiling tool is located in the [scripts/import_profiler/](file:///usr/local/google/home/hebaalazzeh/git/google-cloud-python/scripts/import_profiler/) directory: + +* **[profiler.py](file:///usr/local/google/home/hebaalazzeh/git/google-cloud-python/scripts/import_profiler/profiler.py)**: The core executable script. It is designed as a single-file, self-spawning harness that performs process-isolated importing benchmarks and generates trace logs. +* **[plan.md](file:///usr/local/google/home/hebaalazzeh/git/google-cloud-python/scripts/import_profiler/plan.md)**: The current project phases and roadmap checklist. +* **[status.md](file:///usr/local/google/home/hebaalazzeh/git/google-cloud-python/scripts/import_profiler/status.md)**: Tracks the active task state and hosts recorded baseline performance metrics. + +--- + +## 2. Profiler Mechanism (`profiler.py`) + +**Objective** +The Profiler functions as a process-isolated verification harness designed to capture before-and-after metrics across three distinct vectors: Initialization Latency (ms), Peak Memory Usage (MB), and **Dynamic Code Volume (Loaded Modules & Lines of Code)**. + +**Proposed Solution** +The selected design for the tracking framework is Hybrid Process Orchestration, implemented via a highly isolated Master/Worker architecture within a single self-spawning script (`profiler.py`). This setup directly resolves the "Cache Problem," where Python caches completed imports inside `sys.modules` and loads them in <0.1ms on subsequent calls, making standard loop benchmarking inaccurate. + +* **The Master Process:** Operates as the central harness controller, looping exactly N times. For each iteration, it calls a clean subprocess using `sys.executable` pointing back to itself with a specialized `--worker` flag. This isolates every single run from Python's runtime memory cache, guaranteeing a true, 100% "cold start" baseline. +* **The Worker Process:** Spawns a fresh interpreter instance, executes the dynamic target module import via `importlib.import_module()`, natively tracks system metrics, pipes the final calculation payload out as a clean JSON string to `stdout`, and exits. + +**Telemetry Vector Coverage:** +* **Load Time Latency:** Captured at high-resolution around the target execution window via `time.perf_counter()`. +* **Peak Memory (RAM):** Monitored safely through standard library `tracemalloc` block tracking to catch true memory consumption shifts while avoiding the "observer effect" (profiler overhead distorting data). +* **Dynamic Code Volume:** Measured deterministically by snapshotting the `sys.modules` cache before and after the target import. By isolating the delta of newly loaded modules and dynamically summing their physical source lines of code (LOC), the profiler granularly detects optimization savings (e.g., lazy-loading a large layer or pruning a single function block) that might be too small to reliably appear in RAM metrics. + +**Alternatives** + +* **Pure Python-Native Profiling** + * **Mechanism:** Running a continuous runtime script tracking live initialization performance sequentially via standard libraries. + * **Trade-off:** Discarded because the profiler framework script remains co-located within the active module process pool. This results in heavy observer interference and completely fails to report critical OS-level interpreter boot latencies. +* **Pure Shell/Bash Scripting** + * **Mechanism:** Utilizing OS-level shell commands to time python execution from the shell layer (e.g., `/usr/bin/time -v python -c "import package"`). + * **Trade-off:** Discarded because it relies heavily on brittle string parsing of raw console streams, is highly unportable across disparate platform engines (GNU vs. BSD time formats), and cannot profile deep sub-module dependency chains. +* **Static Analysis (AST Parsing / Disk Footprint)** + * **Mechanism:** Statically traversing target library code structures via Abstract Syntax Trees or file sizes to estimate footprint weight before package loading. + * **Trade-off:** Eliminated because static checks are blind to runtime behavior like lazy-loading. A file might exist on disk but never be imported. Our chosen dynamic module-diffing approach accurately counts only what is *actually* evaluated by the interpreter during execution. + +--- + +## 3. How to Interpret Python `-X importtime` Trace Logs +When running with the `--trace` flag, the script captures the raw stderr trace produced by Python's `-X importtime` option. The trace looks like this: + +``` +import time: self [us] | cumulative | imported package +import time: 536 | 536 | _io +import time: 1077 | 2385 | _frozen_importlib_external +import time: 773659 | 793010 | google.cloud.compute_v1.types.compute +``` + +### Explaining the Fields: +1. **`self [us]` (Microseconds):** The time spent importing the module itself, *excluding* any time spent importing its child dependencies. +2. **`cumulative` (Microseconds):** The total time spent loading the module *including* all nested imports. This represents the total wait time introduced by this line. +3. **Hierarchy Indentation:** Indented packages are sub-imports triggered by the parent module. A package with higher indentation is loaded deeper in the call stack. + +--- + +## 4. Execution Reference +Ensure you are in the correct pyenv virtual environment where packages are installed in editable mode: + +```bash +# 1. Run the profiler to get baseline/optimized outcomes (e.g. 5 iterations) +PYENV_VERSION=py312 python profiler.py --module=google.cloud.compute --iterations=5 +PYENV_VERSION=py312 python profiler.py --module=google.cloud.aiplatform --iterations=5 + +# 2. Run the profiler to generate trace logs +PYENV_VERSION=py312 python profiler.py --module=google.cloud.compute --trace +PYENV_VERSION=py312 python profiler.py --module=google.cloud.aiplatform --trace +``` diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py new file mode 100644 index 000000000000..bb080e9f6278 --- /dev/null +++ b/scripts/import_profiler/profiler.py @@ -0,0 +1,220 @@ +import sys +import json +import time +import subprocess +import statistics +import tracemalloc +import importlib +import csv +import os + +def run_worker(target_module): + """Performs ONE import and returns metrics.""" + tracemalloc.start() + start_time = time.perf_counter() + + modules_before = set(sys.modules.keys()) + + # --- TARGET IMPORT --- + importlib.import_module(target_module) + # --------------------- + + end_time = time.perf_counter() + _, peak = tracemalloc.get_traced_memory() + tracemalloc.stop() + + modules_after = set(sys.modules.keys()) + new_modules = modules_after - modules_before + + loaded_lines = 0 + for m in new_modules: + mod = sys.modules.get(m) + if mod and getattr(mod, '__file__', None) and mod.__file__.endswith('.py'): + try: + with open(mod.__file__, 'r', encoding='utf-8') as f: + loaded_lines += sum(1 for _ in f) + except Exception: + pass + + # Output to stdout for the Master to capture + print(json.dumps({ + "time_ms": (end_time - start_time) * 1000, + "peak_ram_mb": peak / (1024 * 1024), + "loaded_modules": len(new_modules), + "loaded_lines": loaded_lines + })) + +def run_master(iterations, target_module, cpu="0", csv_path=None): + """Orchestrates the benchmark.""" + times, memories = [], [] + loaded_modules_val, loaded_lines_val = 0, 0 + + print(f"Profiling start... Running {iterations} cold-start iterations for {target_module}.") + if cpu.lower() != "none": + print(f"CPU Pinning enabled: Pinning processes to core {cpu} using taskset.") + else: + print("CPU Pinning disabled.") + + for i in range(iterations): + # Build command line + cmd = [] + if cpu.lower() != "none": + cmd += ["taskset", "-c", cpu] + + cmd += [sys.executable, __file__, "--worker", f"--module={target_module}"] + + try: + result = subprocess.run( + cmd, capture_output=True, text=True, check=True + ) + data = json.loads(result.stdout) + times.append(data["time_ms"]) + memories.append(data["peak_ram_mb"]) + loaded_modules_val = data.get("loaded_modules", 0) + loaded_lines_val = data.get("loaded_lines", 0) + except (subprocess.CalledProcessError, FileNotFoundError) as e: + # Fallback if taskset is not found or fails + if cpu.lower() != "none" and i == 0: + print("WARNING: taskset CPU pinning failed or is not available. Falling back to unpinned execution...") + # Try running without taskset + cmd = [sys.executable, __file__, "--worker", f"--module={target_module}"] + result = subprocess.run(cmd, capture_output=True, text=True, check=True) + data = json.loads(result.stdout) + times.append(data["time_ms"]) + memories.append(data["peak_ram_mb"]) + loaded_modules_val = data.get("loaded_modules", 0) + loaded_lines_val = data.get("loaded_lines", 0) + cpu = "none" # Disable cpu pinning for remaining iterations + else: + raise e + + # Write CSV if requested + if csv_path: + with open(csv_path, "w", newline="") as f: + writer = csv.writer(f) + writer.writerow(["Iteration", "Time (ms)", "Peak RAM (MB)"]) + for idx, (t, m) in enumerate(zip(times, memories)): + writer.writerow([idx + 1, f"{t:.2f}", f"{m:.4f}"]) + print(f"Raw metrics successfully exported to CSV: {csv_path}") + + # Compute percentiles (P50, P90, P99) + # statistics.quantiles returns 99 cut points for n=100 + q_time = statistics.quantiles(times, n=100) + q_mem = statistics.quantiles(memories, n=100) + + p50_time, p90_time, p99_time = q_time[49], q_time[89], q_time[98] + p50_mem, p90_mem, p99_mem = q_mem[49], q_mem[89], q_mem[98] + + print(f"\n--- Results for {target_module} ({iterations} iterations) ---") + print(f"Code Volume (Deterministic):") + print(f" Loaded Modules: {loaded_modules_val}") + print(f" Loaded Lines: {loaded_lines_val}") + print(f"Time (ms):") + print(f" P50 (Median): {p50_time:.2f}") + print(f" P90: {p90_time:.2f}") + print(f" P99: {p99_time:.2f}") + print(f" Mean: {statistics.mean(times):.2f}") + print(f" Min: {min(times):.2f}") + print(f" Max: {max(times):.2f}") + if len(times) > 1: + print(f" StdDev: {statistics.stdev(times):.2f}") + + print(f"RAM (MB):") + print(f" P50 (Median): {p50_mem:.4f}") + print(f" P90: {p90_mem:.4f}") + print(f" P99: {p99_mem:.4f}") + print(f" Mean: {statistics.mean(memories):.4f}") + print(f" Min: {min(memories):.4f}") + print(f" Max: {max(memories):.4f}") + if len(memories) > 1: + print(f" StdDev: {statistics.stdev(memories):.4f}") + +def run_trace(target_module): + """Generates importtime trace log and writes it to a file.""" + trace_file = f"import_trace_{target_module.replace('.', '_')}.log" + print(f"Generating importtime trace log for {target_module} -> {trace_file}...") + + # We run: python -X importtime -c "import " + result = subprocess.run( + [sys.executable, "-X", "importtime", "-c", f"import {target_module}"], + capture_output=True, text=True + ) + + with open(trace_file, "w") as f: + f.write(result.stderr) + + print(f"Trace log successfully written to {trace_file}") + +def run_cprofile(target_module): + """Runs cProfile to capture stack traces for latency.""" + import cProfile + import pstats + + prof_file = f"cprofile_{target_module.replace('.', '_')}.prof" + print(f"Generating cProfile data for {target_module} -> {prof_file}...") + + # Run profiling + pr = cProfile.Profile() + pr.enable() + importlib.import_module(target_module) + pr.disable() + + # Save for flame charts (e.g. via snakeviz) + pr.dump_stats(prof_file) + print(f"cProfile stats successfully written to {prof_file}") + + # Print top bottlenecks + print("\n--- Top 15 functions by cumulative time ---") + ps = pstats.Stats(pr).sort_stats(pstats.SortKey.CUMULATIVE) + ps.print_stats(15) + +def run_mprofile(target_module): + """Runs tracemalloc snapshot to see where memory is allocated.""" + print(f"Generating tracemalloc memory snapshot for {target_module}...") + + tracemalloc.start() + importlib.import_module(target_module) + snapshot = tracemalloc.take_snapshot() + tracemalloc.stop() + + print("\n--- Top 15 memory allocations by line ---") + top_stats = snapshot.statistics('lineno') + for stat in top_stats[:15]: + print(stat) + +if __name__ == "__main__": + # Parse CLI arguments + target_module = "google.cloud.compute" + iterations = 50 + trace = False + cprofile = False + mprofile = False + cpu = "0" + csv_path = None + + for arg in sys.argv[1:]: + if arg.startswith("--module="): + target_module = arg.split("=")[1] + elif arg.startswith("--iterations="): + iterations = int(arg.split("=")[1]) + elif arg.startswith("--cpu="): + cpu = arg.split("=")[1] + elif arg.startswith("--csv="): + csv_path = arg.split("=")[1] + elif arg == "--trace": + trace = True + elif arg == "--cprofile": + cprofile = True + elif arg == "--mprofile": + mprofile = True + + if "--worker" in sys.argv: + run_worker(target_module) + elif trace: + run_trace(target_module) + elif cprofile: + run_cprofile(target_module) + elif mprofile: + run_mprofile(target_module) + else: + run_master(iterations, target_module, cpu, csv_path) \ No newline at end of file From 5c0dfcb4374b758b0ad0283cdb1beef26b47c651 Mon Sep 17 00:00:00 2001 From: Heba Alazzeh Date: Mon, 15 Jun 2026 21:35:36 +0000 Subject: [PATCH 02/15] fix: address PR review feedback on profiler tool - Add fallback to handle single-iteration quantile calculation crashes - Normalize .pyc file paths during line counting and log exceptions - Expose worker process stderr to facilitate debugging CalledProcessError - Fix absolute paths in documentation.md to use relative directory paths --- scripts/import_profiler/documentation.md | 8 ++-- scripts/import_profiler/profiler.py | 53 ++++++++++++++++-------- 2 files changed, 39 insertions(+), 22 deletions(-) diff --git a/scripts/import_profiler/documentation.md b/scripts/import_profiler/documentation.md index d88696e27629..d39e055b986d 100644 --- a/scripts/import_profiler/documentation.md +++ b/scripts/import_profiler/documentation.md @@ -5,11 +5,11 @@ This document provides a comprehensive guide to the `import_profiler` scripts, d --- ## 1. File Guide & Directory Structure -The profiling tool is located in the [scripts/import_profiler/](file:///usr/local/google/home/hebaalazzeh/git/google-cloud-python/scripts/import_profiler/) directory: +The profiling tool is located in the [scripts/import_profiler/](./) directory: -* **[profiler.py](file:///usr/local/google/home/hebaalazzeh/git/google-cloud-python/scripts/import_profiler/profiler.py)**: The core executable script. It is designed as a single-file, self-spawning harness that performs process-isolated importing benchmarks and generates trace logs. -* **[plan.md](file:///usr/local/google/home/hebaalazzeh/git/google-cloud-python/scripts/import_profiler/plan.md)**: The current project phases and roadmap checklist. -* **[status.md](file:///usr/local/google/home/hebaalazzeh/git/google-cloud-python/scripts/import_profiler/status.md)**: Tracks the active task state and hosts recorded baseline performance metrics. +* **[profiler.py](./profiler.py)**: The core executable script. It is designed as a single-file, self-spawning harness that performs process-isolated importing benchmarks and generates trace logs. +* **[plan.md](./plan.md)**: The current project phases and roadmap checklist. +* **[status.md](./status.md)**: Tracks the active task state and hosts recorded baseline performance metrics. --- diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py index bb080e9f6278..944f4440d430 100644 --- a/scripts/import_profiler/profiler.py +++ b/scripts/import_profiler/profiler.py @@ -29,12 +29,17 @@ def run_worker(target_module): loaded_lines = 0 for m in new_modules: mod = sys.modules.get(m) - if mod and getattr(mod, '__file__', None) and mod.__file__.endswith('.py'): - try: - with open(mod.__file__, 'r', encoding='utf-8') as f: - loaded_lines += sum(1 for _ in f) - except Exception: - pass + if mod and getattr(mod, '__file__', None): + file_path = mod.__file__ + if file_path.endswith('.pyc'): + file_path = file_path[:-1] + if file_path.endswith('.py'): + try: + with open(file_path, 'r', encoding='utf-8') as f: + loaded_lines += sum(1 for _ in f) + except Exception as e: + import logging + logging.warning(f"Failed to read lines from {file_path}: {e}") # Output to stdout for the Master to capture print(json.dumps({ @@ -78,14 +83,20 @@ def run_master(iterations, target_module, cpu="0", csv_path=None): print("WARNING: taskset CPU pinning failed or is not available. Falling back to unpinned execution...") # Try running without taskset cmd = [sys.executable, __file__, "--worker", f"--module={target_module}"] - result = subprocess.run(cmd, capture_output=True, text=True, check=True) - data = json.loads(result.stdout) - times.append(data["time_ms"]) - memories.append(data["peak_ram_mb"]) - loaded_modules_val = data.get("loaded_modules", 0) - loaded_lines_val = data.get("loaded_lines", 0) - cpu = "none" # Disable cpu pinning for remaining iterations + try: + result = subprocess.run(cmd, capture_output=True, text=True, check=True) + data = json.loads(result.stdout) + times.append(data["time_ms"]) + memories.append(data["peak_ram_mb"]) + loaded_modules_val = data.get("loaded_modules", 0) + loaded_lines_val = data.get("loaded_lines", 0) + cpu = "none" # Disable cpu pinning for remaining iterations + except subprocess.CalledProcessError as err: + print(f"Error in worker process:\n{err.stderr}", file=sys.stderr) + raise err else: + if isinstance(e, subprocess.CalledProcessError): + print(f"Error in worker process:\n{e.stderr}", file=sys.stderr) raise e # Write CSV if requested @@ -99,11 +110,17 @@ def run_master(iterations, target_module, cpu="0", csv_path=None): # Compute percentiles (P50, P90, P99) # statistics.quantiles returns 99 cut points for n=100 - q_time = statistics.quantiles(times, n=100) - q_mem = statistics.quantiles(memories, n=100) - - p50_time, p90_time, p99_time = q_time[49], q_time[89], q_time[98] - p50_mem, p90_mem, p99_mem = q_mem[49], q_mem[89], q_mem[98] + if len(times) > 1: + q_time = statistics.quantiles(times, n=100) + p50_time, p90_time, p99_time = q_time[49], q_time[89], q_time[98] + else: + p50_time = p90_time = p99_time = times[0] if times else 0.0 + + if len(memories) > 1: + q_mem = statistics.quantiles(memories, n=100) + p50_mem, p90_mem, p99_mem = q_mem[49], q_mem[89], q_mem[98] + else: + p50_mem = p90_mem = p99_mem = memories[0] if memories else 0.0 print(f"\n--- Results for {target_module} ({iterations} iterations) ---") print(f"Code Volume (Deterministic):") From 0cb3d1ceb00f91df4f9fbfd0bd34883f67197474 Mon Sep 17 00:00:00 2001 From: Heba Alazzeh Date: Mon, 15 Jun 2026 21:47:46 +0000 Subject: [PATCH 03/15] fix: address further PR feedback on imports, pyc caching, json loading, and encodings - Use importlib.util.source_from_cache for robust .pyc resolution - Move importlib.util and logging imports to module level - Refine json.loads to parse only the last line of stdout - Specify UTF-8 encoding when opening files for writing --- scripts/import_profiler/profiler.py | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py index 944f4440d430..b7cab700fa66 100644 --- a/scripts/import_profiler/profiler.py +++ b/scripts/import_profiler/profiler.py @@ -5,8 +5,10 @@ import statistics import tracemalloc import importlib +import importlib.util import csv import os +import logging def run_worker(target_module): """Performs ONE import and returns metrics.""" @@ -32,13 +34,15 @@ def run_worker(target_module): if mod and getattr(mod, '__file__', None): file_path = mod.__file__ if file_path.endswith('.pyc'): - file_path = file_path[:-1] + try: + file_path = importlib.util.source_from_cache(file_path) + except ValueError: + pass if file_path.endswith('.py'): try: with open(file_path, 'r', encoding='utf-8') as f: loaded_lines += sum(1 for _ in f) except Exception as e: - import logging logging.warning(f"Failed to read lines from {file_path}: {e}") # Output to stdout for the Master to capture @@ -72,7 +76,7 @@ def run_master(iterations, target_module, cpu="0", csv_path=None): result = subprocess.run( cmd, capture_output=True, text=True, check=True ) - data = json.loads(result.stdout) + data = json.loads(result.stdout.strip().splitlines()[-1]) times.append(data["time_ms"]) memories.append(data["peak_ram_mb"]) loaded_modules_val = data.get("loaded_modules", 0) @@ -85,7 +89,7 @@ def run_master(iterations, target_module, cpu="0", csv_path=None): cmd = [sys.executable, __file__, "--worker", f"--module={target_module}"] try: result = subprocess.run(cmd, capture_output=True, text=True, check=True) - data = json.loads(result.stdout) + data = json.loads(result.stdout.strip().splitlines()[-1]) times.append(data["time_ms"]) memories.append(data["peak_ram_mb"]) loaded_modules_val = data.get("loaded_modules", 0) @@ -101,7 +105,7 @@ def run_master(iterations, target_module, cpu="0", csv_path=None): # Write CSV if requested if csv_path: - with open(csv_path, "w", newline="") as f: + with open(csv_path, "w", newline="", encoding="utf-8") as f: writer = csv.writer(f) writer.writerow(["Iteration", "Time (ms)", "Peak RAM (MB)"]) for idx, (t, m) in enumerate(zip(times, memories)): @@ -157,7 +161,7 @@ def run_trace(target_module): capture_output=True, text=True ) - with open(trace_file, "w") as f: + with open(trace_file, "w", encoding="utf-8") as f: f.write(result.stderr) print(f"Trace log successfully written to {trace_file}") From 9340dc942e25fbbc6bc6baf513a2b0c1ccbb6b41 Mon Sep 17 00:00:00 2001 From: Heba Alazzeh Date: Mon, 15 Jun 2026 22:04:44 +0000 Subject: [PATCH 04/15] fix: refactor profiler worker execution and handling - Extract worker logic to _run_worker_and_parse for robust JSON parsing - Add early validation for iterations parameter in run_master - Add non-zero exit code check and warning for run_trace failures --- scripts/import_profiler/profiler.py | 43 +++++++++++++++++++++-------- 1 file changed, 32 insertions(+), 11 deletions(-) diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py index b7cab700fa66..86898810e838 100644 --- a/scripts/import_profiler/profiler.py +++ b/scripts/import_profiler/profiler.py @@ -53,8 +53,27 @@ def run_worker(target_module): "loaded_lines": loaded_lines })) +def _run_worker_and_parse(cmd): + result = subprocess.run(cmd, capture_output=True, text=True, check=True) + try: + lines = result.stdout.strip().splitlines() + if not lines: + raise ValueError("Worker produced no output on stdout.") + data = json.loads(lines[-1]) + for key in ("time_ms", "peak_ram_mb", "loaded_modules", "loaded_lines"): + if key not in data: + raise KeyError(f"Missing key '{key}' in worker output") + return data + except (json.JSONDecodeError, IndexError, KeyError, ValueError) as parse_err: + print(f"Error parsing worker output: {parse_err}", file=sys.stderr) + print(f"Worker stdout:\n{result.stdout}", file=sys.stderr) + print(f"Worker stderr:\n{result.stderr}", file=sys.stderr) + raise parse_err + def run_master(iterations, target_module, cpu="0", csv_path=None): """Orchestrates the benchmark.""" + if iterations < 1: + raise ValueError("Number of iterations must be at least 1.") times, memories = [], [] loaded_modules_val, loaded_lines_val = 0, 0 @@ -73,14 +92,11 @@ def run_master(iterations, target_module, cpu="0", csv_path=None): cmd += [sys.executable, __file__, "--worker", f"--module={target_module}"] try: - result = subprocess.run( - cmd, capture_output=True, text=True, check=True - ) - data = json.loads(result.stdout.strip().splitlines()[-1]) + data = _run_worker_and_parse(cmd) times.append(data["time_ms"]) memories.append(data["peak_ram_mb"]) - loaded_modules_val = data.get("loaded_modules", 0) - loaded_lines_val = data.get("loaded_lines", 0) + loaded_modules_val = data["loaded_modules"] + loaded_lines_val = data["loaded_lines"] except (subprocess.CalledProcessError, FileNotFoundError) as e: # Fallback if taskset is not found or fails if cpu.lower() != "none" and i == 0: @@ -88,12 +104,11 @@ def run_master(iterations, target_module, cpu="0", csv_path=None): # Try running without taskset cmd = [sys.executable, __file__, "--worker", f"--module={target_module}"] try: - result = subprocess.run(cmd, capture_output=True, text=True, check=True) - data = json.loads(result.stdout.strip().splitlines()[-1]) + data = _run_worker_and_parse(cmd) times.append(data["time_ms"]) memories.append(data["peak_ram_mb"]) - loaded_modules_val = data.get("loaded_modules", 0) - loaded_lines_val = data.get("loaded_lines", 0) + loaded_modules_val = data["loaded_modules"] + loaded_lines_val = data["loaded_lines"] cpu = "none" # Disable cpu pinning for remaining iterations except subprocess.CalledProcessError as err: print(f"Error in worker process:\n{err.stderr}", file=sys.stderr) @@ -160,7 +175,13 @@ def run_trace(target_module): [sys.executable, "-X", "importtime", "-c", f"import {target_module}"], capture_output=True, text=True ) - + if result.returncode != 0: + print(f"WARNING: Import failed with exit code {result.returncode}. The trace log may be incomplete or contain errors.", file=sys.stderr) + if result.stdout: + print(f"Worker stdout:\n{result.stdout}", file=sys.stderr) + if result.stderr: + print(f"Worker stderr:\n{result.stderr}", file=sys.stderr) + with open(trace_file, "w", encoding="utf-8") as f: f.write(result.stderr) From d8c26a20138a3cbd4103d843fc7c4007c6a7212e Mon Sep 17 00:00:00 2001 From: Heba Alazzeh <137334116+hebaalazzeh@users.noreply.github.com> Date: Mon, 15 Jun 2026 15:18:19 -0700 Subject: [PATCH 05/15] Update scripts/import_profiler/profiler.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- scripts/import_profiler/profiler.py | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py index 86898810e838..c73fe121a37c 100644 --- a/scripts/import_profiler/profiler.py +++ b/scripts/import_profiler/profiler.py @@ -46,12 +46,13 @@ def run_worker(target_module): logging.warning(f"Failed to read lines from {file_path}: {e}") # Output to stdout for the Master to capture - print(json.dumps({ + metrics = { "time_ms": (end_time - start_time) * 1000, "peak_ram_mb": peak / (1024 * 1024), "loaded_modules": len(new_modules), "loaded_lines": loaded_lines - })) + } + print(f"__METRICS__:{json.dumps(metrics)}") def _run_worker_and_parse(cmd): result = subprocess.run(cmd, capture_output=True, text=True, check=True) From f2c998ce5123170d149f70aeec58cf89950acb5e Mon Sep 17 00:00:00 2001 From: Heba Alazzeh <137334116+hebaalazzeh@users.noreply.github.com> Date: Mon, 15 Jun 2026 15:20:41 -0700 Subject: [PATCH 06/15] Update scripts/import_profiler/profiler.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- scripts/import_profiler/profiler.py | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py index c73fe121a37c..0b9180370210 100644 --- a/scripts/import_profiler/profiler.py +++ b/scripts/import_profiler/profiler.py @@ -58,9 +58,13 @@ def _run_worker_and_parse(cmd): result = subprocess.run(cmd, capture_output=True, text=True, check=True) try: lines = result.stdout.strip().splitlines() - if not lines: - raise ValueError("Worker produced no output on stdout.") - data = json.loads(lines[-1]) + data = None + for line in reversed(lines): + if line.startswith("__METRICS__:"): + data = json.loads(line[len("__METRICS__:"):]) + break + if data is None: + raise ValueError("Worker did not output metrics JSON.") for key in ("time_ms", "peak_ram_mb", "loaded_modules", "loaded_lines"): if key not in data: raise KeyError(f"Missing key '{key}' in worker output") From d7ae73fc3bd6cd33e038b51968ceff048280a98f Mon Sep 17 00:00:00 2001 From: Heba Alazzeh Date: Mon, 15 Jun 2026 22:20:28 +0000 Subject: [PATCH 07/15] refactor: replace manual sys.argv parsing with argparse --- scripts/import_profiler/profiler.py | 56 ++++++++++++----------------- 1 file changed, 22 insertions(+), 34 deletions(-) diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py index 0b9180370210..c2fd76729330 100644 --- a/scripts/import_profiler/profiler.py +++ b/scripts/import_profiler/profiler.py @@ -230,38 +230,26 @@ def run_mprofile(target_module): print(stat) if __name__ == "__main__": - # Parse CLI arguments - target_module = "google.cloud.compute" - iterations = 50 - trace = False - cprofile = False - mprofile = False - cpu = "0" - csv_path = None - - for arg in sys.argv[1:]: - if arg.startswith("--module="): - target_module = arg.split("=")[1] - elif arg.startswith("--iterations="): - iterations = int(arg.split("=")[1]) - elif arg.startswith("--cpu="): - cpu = arg.split("=")[1] - elif arg.startswith("--csv="): - csv_path = arg.split("=")[1] - elif arg == "--trace": - trace = True - elif arg == "--cprofile": - cprofile = True - elif arg == "--mprofile": - mprofile = True - - if "--worker" in sys.argv: - run_worker(target_module) - elif trace: - run_trace(target_module) - elif cprofile: - run_cprofile(target_module) - elif mprofile: - run_mprofile(target_module) + import argparse + parser = argparse.ArgumentParser(description="Python SDK Import Profiler") + parser.add_argument("--module", default="google.cloud.compute", help="Target module to profile") + parser.add_argument("--iterations", type=int, default=50, help="Number of iterations") + parser.add_argument("--cpu", default="0", help="CPU core to pin to (or 'none')") + parser.add_argument("--csv", help="Path to export CSV results") + parser.add_argument("--trace", action="store_true", help="Generate importtime trace log") + parser.add_argument("--cprofile", action="store_true", help="Run cProfile") + parser.add_argument("--mprofile", action="store_true", help="Run tracemalloc memory snapshot") + parser.add_argument("--worker", action="store_true", help=argparse.SUPPRESS) + + args = parser.parse_args() + + if args.worker: + run_worker(args.module) + elif args.trace: + run_trace(args.module) + elif args.cprofile: + run_cprofile(args.module) + elif args.mprofile: + run_mprofile(args.module) else: - run_master(iterations, target_module, cpu, csv_path) \ No newline at end of file + run_master(args.iterations, args.module, args.cpu, args.csv) \ No newline at end of file From 6a94207f06248b0bd52e686cb417c4e4dfc74cfd Mon Sep 17 00:00:00 2001 From: Heba Alazzeh <137334116+hebaalazzeh@users.noreply.github.com> Date: Mon, 15 Jun 2026 15:41:29 -0700 Subject: [PATCH 08/15] Update scripts/import_profiler/profiler.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- scripts/import_profiler/profiler.py | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py index c2fd76729330..5501c1a1cb6c 100644 --- a/scripts/import_profiler/profiler.py +++ b/scripts/import_profiler/profiler.py @@ -234,7 +234,8 @@ def run_mprofile(target_module): parser = argparse.ArgumentParser(description="Python SDK Import Profiler") parser.add_argument("--module", default="google.cloud.compute", help="Target module to profile") parser.add_argument("--iterations", type=int, default=50, help="Number of iterations") - parser.add_argument("--cpu", default="0", help="CPU core to pin to (or 'none')") + default_cpu = "0" if sys.platform.startswith("linux") else "none" + parser.add_argument("--cpu", default=default_cpu, help="CPU core to pin to (or 'none')") parser.add_argument("--csv", help="Path to export CSV results") parser.add_argument("--trace", action="store_true", help="Generate importtime trace log") parser.add_argument("--cprofile", action="store_true", help="Run cProfile") From 1015b2c7374f9575c392ed5da80d5d995db68fc0 Mon Sep 17 00:00:00 2001 From: Heba Alazzeh Date: Mon, 15 Jun 2026 22:53:59 +0000 Subject: [PATCH 09/15] fix: refactor cprofile and mprofile to use isolated subprocesses - Execute cProfile in a fresh subprocess to ensure cold-start accuracy - Execute tracemalloc in a fresh subprocess to capture all memory allocations - Clean up master process exception handling for missing taskset vs crashed workers --- scripts/import_profiler/profiler.py | 63 ++++++++++++++++------------- 1 file changed, 35 insertions(+), 28 deletions(-) diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py index 5501c1a1cb6c..968708d54ca4 100644 --- a/scripts/import_profiler/profiler.py +++ b/scripts/import_profiler/profiler.py @@ -102,11 +102,10 @@ def run_master(iterations, target_module, cpu="0", csv_path=None): memories.append(data["peak_ram_mb"]) loaded_modules_val = data["loaded_modules"] loaded_lines_val = data["loaded_lines"] - except (subprocess.CalledProcessError, FileNotFoundError) as e: - # Fallback if taskset is not found or fails + except FileNotFoundError as e: if cpu.lower() != "none" and i == 0: - print("WARNING: taskset CPU pinning failed or is not available. Falling back to unpinned execution...") - # Try running without taskset + print("WARNING: taskset CPU pinning is not available. Falling back to unpinned execution...") + cpu = "none" cmd = [sys.executable, __file__, "--worker", f"--module={target_module}"] try: data = _run_worker_and_parse(cmd) @@ -114,14 +113,14 @@ def run_master(iterations, target_module, cpu="0", csv_path=None): memories.append(data["peak_ram_mb"]) loaded_modules_val = data["loaded_modules"] loaded_lines_val = data["loaded_lines"] - cpu = "none" # Disable cpu pinning for remaining iterations except subprocess.CalledProcessError as err: print(f"Error in worker process:\n{err.stderr}", file=sys.stderr) raise err else: - if isinstance(e, subprocess.CalledProcessError): - print(f"Error in worker process:\n{e.stderr}", file=sys.stderr) raise e + except subprocess.CalledProcessError as e: + print(f"Error in worker process:\n{e.stderr}", file=sys.stderr) + raise e # Write CSV if requested if csv_path: @@ -193,41 +192,49 @@ def run_trace(target_module): print(f"Trace log successfully written to {trace_file}") def run_cprofile(target_module): - """Runs cProfile to capture stack traces for latency.""" - import cProfile + """Runs cProfile in a clean subprocess to capture stack traces for latency.""" import pstats prof_file = f"cprofile_{target_module.replace('.', '_')}.prof" print(f"Generating cProfile data for {target_module} -> {prof_file}...") - # Run profiling - pr = cProfile.Profile() - pr.enable() - importlib.import_module(target_module) - pr.disable() - - # Save for flame charts (e.g. via snakeviz) - pr.dump_stats(prof_file) + # Run profiling in a clean subprocess to ensure cold-start + result = subprocess.run( + [sys.executable, "-m", "cProfile", "-o", prof_file, "-c", f"import importlib; importlib.import_module('{target_module}')"], + capture_output=True, text=True + ) + if result.returncode != 0: + print(f"Error generating cProfile data:\n{result.stderr}", file=sys.stderr) + return + print(f"cProfile stats successfully written to {prof_file}") # Print top bottlenecks print("\n--- Top 15 functions by cumulative time ---") - ps = pstats.Stats(pr).sort_stats(pstats.SortKey.CUMULATIVE) + ps = pstats.Stats(prof_file).sort_stats(pstats.SortKey.CUMULATIVE) ps.print_stats(15) def run_mprofile(target_module): - """Runs tracemalloc snapshot to see where memory is allocated.""" + """Runs tracemalloc snapshot in a clean subprocess to see where memory is allocated.""" print(f"Generating tracemalloc memory snapshot for {target_module}...") - tracemalloc.start() - importlib.import_module(target_module) - snapshot = tracemalloc.take_snapshot() - tracemalloc.stop() - - print("\n--- Top 15 memory allocations by line ---") - top_stats = snapshot.statistics('lineno') - for stat in top_stats[:15]: - print(stat) + code = ( + "import tracemalloc\n" + "import importlib\n" + "tracemalloc.start()\n" + f"importlib.import_module('{target_module}')\n" + "snapshot = tracemalloc.take_snapshot()\n" + "tracemalloc.stop()\n" + "top_stats = snapshot.statistics('lineno')\n" + "for stat in top_stats[:15]:\n" + " print(stat)\n" + ) + result = subprocess.run([sys.executable, "-c", code], capture_output=True, text=True) + if result.returncode != 0: + print(f"Error generating memory snapshot:\n{result.stderr}", file=sys.stderr) + else: + print("\n--- Top 15 memory allocations by line ---") + print(result.stdout, end="") if __name__ == "__main__": import argparse From e15d9729b47ebf114e1929a02deb5b26cf8f1a53 Mon Sep 17 00:00:00 2001 From: Heba Alazzeh <137334116+hebaalazzeh@users.noreply.github.com> Date: Mon, 15 Jun 2026 16:00:11 -0700 Subject: [PATCH 10/15] Update scripts/import_profiler/profiler.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- scripts/import_profiler/profiler.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py index 968708d54ca4..33e7bd75336c 100644 --- a/scripts/import_profiler/profiler.py +++ b/scripts/import_profiler/profiler.py @@ -176,7 +176,7 @@ def run_trace(target_module): # We run: python -X importtime -c "import " result = subprocess.run( - [sys.executable, "-X", "importtime", "-c", f"import {target_module}"], + [sys.executable, "-X", "importtime", "-c", f"import importlib; importlib.import_module({json.dumps(target_module)})"], capture_output=True, text=True ) if result.returncode != 0: From 051b45594633af93dc22284529337925b8f4155a Mon Sep 17 00:00:00 2001 From: Heba Alazzeh <137334116+hebaalazzeh@users.noreply.github.com> Date: Mon, 15 Jun 2026 16:00:44 -0700 Subject: [PATCH 11/15] Update scripts/import_profiler/profiler.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- scripts/import_profiler/profiler.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py index 33e7bd75336c..ca9a4c4f2dde 100644 --- a/scripts/import_profiler/profiler.py +++ b/scripts/import_profiler/profiler.py @@ -40,7 +40,7 @@ def run_worker(target_module): pass if file_path.endswith('.py'): try: - with open(file_path, 'r', encoding='utf-8') as f: + with open(file_path, 'r', encoding='utf-8', errors='ignore') as f: loaded_lines += sum(1 for _ in f) except Exception as e: logging.warning(f"Failed to read lines from {file_path}: {e}") From 9f55f1c10204830fabdc33571a352f0c676f30d5 Mon Sep 17 00:00:00 2001 From: Heba Alazzeh <137334116+hebaalazzeh@users.noreply.github.com> Date: Mon, 15 Jun 2026 16:01:13 -0700 Subject: [PATCH 12/15] Update scripts/import_profiler/profiler.py Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- scripts/import_profiler/profiler.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py index ca9a4c4f2dde..9bfde3f5cea6 100644 --- a/scripts/import_profiler/profiler.py +++ b/scripts/import_profiler/profiler.py @@ -200,7 +200,7 @@ def run_cprofile(target_module): # Run profiling in a clean subprocess to ensure cold-start result = subprocess.run( - [sys.executable, "-m", "cProfile", "-o", prof_file, "-c", f"import importlib; importlib.import_module('{target_module}')"], + [sys.executable, "-m", "cProfile", "-o", prof_file, "-c", f"import importlib; importlib.import_module({json.dumps(target_module)})"], capture_output=True, text=True ) if result.returncode != 0: From fb35fd328bf7470465004669aa9d258042c383ed Mon Sep 17 00:00:00 2001 From: Heba Alazzeh Date: Mon, 15 Jun 2026 23:03:02 +0000 Subject: [PATCH 13/15] fix: resolve code injection vulnerability when importing target_module - Use json.dumps to safely escape target_module string in all subprocess commands (mprofile, cprofile, trace) --- scripts/import_profiler/profiler.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/scripts/import_profiler/profiler.py b/scripts/import_profiler/profiler.py index 9bfde3f5cea6..648e51924bd4 100644 --- a/scripts/import_profiler/profiler.py +++ b/scripts/import_profiler/profiler.py @@ -174,7 +174,7 @@ def run_trace(target_module): trace_file = f"import_trace_{target_module.replace('.', '_')}.log" print(f"Generating importtime trace log for {target_module} -> {trace_file}...") - # We run: python -X importtime -c "import " + # We run: python -X importtime -c "import importlib; importlib.import_module(...)" result = subprocess.run( [sys.executable, "-X", "importtime", "-c", f"import importlib; importlib.import_module({json.dumps(target_module)})"], capture_output=True, text=True @@ -222,7 +222,7 @@ def run_mprofile(target_module): "import tracemalloc\n" "import importlib\n" "tracemalloc.start()\n" - f"importlib.import_module('{target_module}')\n" + f"importlib.import_module({json.dumps(target_module)})\n" "snapshot = tracemalloc.take_snapshot()\n" "tracemalloc.stop()\n" "top_stats = snapshot.statistics('lineno')\n" From 727d44ecfa7271e90613c00d0fde7b74df7e39ec Mon Sep 17 00:00:00 2001 From: Heba Alazzeh Date: Mon, 15 Jun 2026 23:16:01 +0000 Subject: [PATCH 14/15] docs: simplify profiler mechanism section - Replace detailed implementation architecture and alternative solutions with a concise high-level usage and expected output format per PR review feedback --- scripts/import_profiler/documentation.md | 44 ++++++++++++------------ 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/scripts/import_profiler/documentation.md b/scripts/import_profiler/documentation.md index d39e055b986d..a561b9f124ee 100644 --- a/scripts/import_profiler/documentation.md +++ b/scripts/import_profiler/documentation.md @@ -18,28 +18,28 @@ The profiling tool is located in the [scripts/import_profiler/](./) directory: **Objective** The Profiler functions as a process-isolated verification harness designed to capture before-and-after metrics across three distinct vectors: Initialization Latency (ms), Peak Memory Usage (MB), and **Dynamic Code Volume (Loaded Modules & Lines of Code)**. -**Proposed Solution** -The selected design for the tracking framework is Hybrid Process Orchestration, implemented via a highly isolated Master/Worker architecture within a single self-spawning script (`profiler.py`). This setup directly resolves the "Cache Problem," where Python caches completed imports inside `sys.modules` and loads them in <0.1ms on subsequent calls, making standard loop benchmarking inaccurate. - -* **The Master Process:** Operates as the central harness controller, looping exactly N times. For each iteration, it calls a clean subprocess using `sys.executable` pointing back to itself with a specialized `--worker` flag. This isolates every single run from Python's runtime memory cache, guaranteeing a true, 100% "cold start" baseline. -* **The Worker Process:** Spawns a fresh interpreter instance, executes the dynamic target module import via `importlib.import_module()`, natively tracks system metrics, pipes the final calculation payload out as a clean JSON string to `stdout`, and exits. - -**Telemetry Vector Coverage:** -* **Load Time Latency:** Captured at high-resolution around the target execution window via `time.perf_counter()`. -* **Peak Memory (RAM):** Monitored safely through standard library `tracemalloc` block tracking to catch true memory consumption shifts while avoiding the "observer effect" (profiler overhead distorting data). -* **Dynamic Code Volume:** Measured deterministically by snapshotting the `sys.modules` cache before and after the target import. By isolating the delta of newly loaded modules and dynamically summing their physical source lines of code (LOC), the profiler granularly detects optimization savings (e.g., lazy-loading a large layer or pruning a single function block) that might be too small to reliably appear in RAM metrics. - -**Alternatives** - -* **Pure Python-Native Profiling** - * **Mechanism:** Running a continuous runtime script tracking live initialization performance sequentially via standard libraries. - * **Trade-off:** Discarded because the profiler framework script remains co-located within the active module process pool. This results in heavy observer interference and completely fails to report critical OS-level interpreter boot latencies. -* **Pure Shell/Bash Scripting** - * **Mechanism:** Utilizing OS-level shell commands to time python execution from the shell layer (e.g., `/usr/bin/time -v python -c "import package"`). - * **Trade-off:** Discarded because it relies heavily on brittle string parsing of raw console streams, is highly unportable across disparate platform engines (GNU vs. BSD time formats), and cannot profile deep sub-module dependency chains. -* **Static Analysis (AST Parsing / Disk Footprint)** - * **Mechanism:** Statically traversing target library code structures via Abstract Syntax Trees or file sizes to estimate footprint weight before package loading. - * **Trade-off:** Eliminated because static checks are blind to runtime behavior like lazy-loading. A file might exist on disk but never be imported. Our chosen dynamic module-diffing approach accurately counts only what is *actually* evaluated by the interpreter during execution. +**Usage** + +Run this command to collect the metrics: +```bash +python profiler.py --module --iterations +``` + +**Expected Output** +```text +--- Results for ( iterations) --- +Code Volume (Deterministic): + Loaded Modules: + Loaded Lines: +Time (ms): + P50 (Median):