feat: add import profiler tool with dynamic loaded lines code volume …#17467
feat: add import profiler tool with dynamic loaded lines code volume …#17467hebaalazzeh wants to merge 15 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a process-isolated Python SDK Import Profiler tool (profiler.py) along with its comprehensive documentation (documentation.md). The profiler measures initialization latency, peak memory usage, and dynamic code volume across multiple cold-start iterations. Feedback on the implementation highlights several key improvements: adding a guard to prevent crashes during single-iteration runs, replacing absolute local file paths in the documentation with relative links, handling .pyc files and logging exceptions during line counting, and exposing the worker process's stderr to facilitate debugging when subprocesses fail.
- Add fallback to handle single-iteration quantile calculation crashes - Normalize .pyc file paths during line counting and log exceptions - Expose worker process stderr to facilitate debugging CalledProcessError - Fix absolute paths in documentation.md to use relative directory paths
There was a problem hiding this comment.
Code Review
This pull request introduces a Python SDK import profiler tool along with its documentation. The profiler is designed to run process-isolated benchmarks to measure import latency, peak memory usage, and dynamic code volume. The feedback provided aims to improve the robustness of the tool. Key recommendations include using importlib.util.source_from_cache to correctly resolve source paths from compiled .pyc files, moving imports to the module level, parsing only the last line of subprocess stdout to handle noisy module initializations safely, and explicitly specifying encoding="utf-8" when writing files to ensure platform-independent behavior.
| if file_path.endswith('.pyc'): | ||
| file_path = file_path[:-1] | ||
| if file_path.endswith('.py'): |
There was a problem hiding this comment.
In Python 3, compiled .pyc files are stored in __pycache__ directories with interpreter-specific suffixes (e.g., module.cpython-312.pyc). Simply stripping the last character (file_path[:-1]) results in a non-existent path like __pycache__/module.cpython-312.py, causing the line counter to fail for almost all cached imports.
Use importlib.util.source_from_cache to correctly resolve the source .py file path.
if file_path.endswith('.pyc'):
try:
file_path = importlib.util.source_from_cache(file_path)
except ValueError:
pass
if file_path.endswith('.py'):| import importlib | ||
| import csv | ||
| import os |
…g, and encodings - Use importlib.util.source_from_cache for robust .pyc resolution - Move importlib.util and logging imports to module level - Refine json.loads to parse only the last line of stdout - Specify UTF-8 encoding when opening files for writing
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a Python SDK import profiling tool (profiler.py) along with its comprehensive documentation (documentation.md). The profiler uses a process-isolated master/worker architecture to accurately measure cold-start import latency, peak memory usage, and dynamic code volume. The review feedback highlights two key areas for improvement: robustly parsing the worker's stdout in run_master to prevent crashes from unexpected output (such as warnings) and validating the iteration count, and checking the subprocess return code in run_trace to warn users if an import fails instead of silently writing the error log.
| def run_master(iterations, target_module, cpu="0", csv_path=None): | ||
| """Orchestrates the benchmark.""" | ||
| times, memories = [], [] | ||
| loaded_modules_val, loaded_lines_val = 0, 0 | ||
|
|
||
| print(f"Profiling start... Running {iterations} cold-start iterations for {target_module}.") | ||
| if cpu.lower() != "none": | ||
| print(f"CPU Pinning enabled: Pinning processes to core {cpu} using taskset.") | ||
| else: | ||
| print("CPU Pinning disabled.") | ||
|
|
||
| for i in range(iterations): | ||
| # Build command line | ||
| cmd = [] | ||
| if cpu.lower() != "none": | ||
| cmd += ["taskset", "-c", cpu] | ||
|
|
||
| cmd += [sys.executable, __file__, "--worker", f"--module={target_module}"] | ||
|
|
||
| try: | ||
| result = subprocess.run( | ||
| cmd, capture_output=True, text=True, check=True | ||
| ) | ||
| data = json.loads(result.stdout.strip().splitlines()[-1]) | ||
| times.append(data["time_ms"]) | ||
| memories.append(data["peak_ram_mb"]) | ||
| loaded_modules_val = data.get("loaded_modules", 0) | ||
| loaded_lines_val = data.get("loaded_lines", 0) | ||
| except (subprocess.CalledProcessError, FileNotFoundError) as e: | ||
| # Fallback if taskset is not found or fails | ||
| if cpu.lower() != "none" and i == 0: | ||
| print("WARNING: taskset CPU pinning failed or is not available. Falling back to unpinned execution...") | ||
| # Try running without taskset | ||
| cmd = [sys.executable, __file__, "--worker", f"--module={target_module}"] | ||
| try: | ||
| result = subprocess.run(cmd, capture_output=True, text=True, check=True) | ||
| data = json.loads(result.stdout.strip().splitlines()[-1]) | ||
| times.append(data["time_ms"]) | ||
| memories.append(data["peak_ram_mb"]) | ||
| loaded_modules_val = data.get("loaded_modules", 0) | ||
| loaded_lines_val = data.get("loaded_lines", 0) | ||
| cpu = "none" # Disable cpu pinning for remaining iterations | ||
| except subprocess.CalledProcessError as err: | ||
| print(f"Error in worker process:\n{err.stderr}", file=sys.stderr) | ||
| raise err | ||
| else: | ||
| if isinstance(e, subprocess.CalledProcessError): | ||
| print(f"Error in worker process:\n{e.stderr}", file=sys.stderr) | ||
| raise e |
There was a problem hiding this comment.
The current implementation of run_master parses the worker's stdout by splitting lines and taking the last line. However, if the worker process prints any warnings, debug messages, or if the imported module itself prints to stdout, the last line might not be valid JSON. This will cause the master process to crash with an unhandled json.JSONDecodeError or IndexError without displaying the worker's actual output or traceback, making debugging extremely difficult.
Additionally, there is no validation that iterations is at least 1, which can lead to division-by-zero or statistics errors later in the function.
To address these issues, we can extract the worker execution and robust parsing logic into a helper function _run_worker_and_parse that catches parsing errors and prints the raw stdout/stderr of the worker before raising, and validate iterations at the start of run_master.
def _run_worker_and_parse(cmd):
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
try:
lines = result.stdout.strip().splitlines()
if not lines:
raise ValueError("Worker produced no output on stdout.")
data = json.loads(lines[-1])
for key in ("time_ms", "peak_ram_mb", "loaded_modules", "loaded_lines"):
if key not in data:
raise KeyError(f"Missing key '{key}' in worker output")
return data
except (json.JSONDecodeError, IndexError, KeyError, ValueError) as parse_err:
print(f"Error parsing worker output: {parse_err}", file=sys.stderr)
print(f"Worker stdout:\\n{result.stdout}", file=sys.stderr)
print(f"Worker stderr:\\n{result.stderr}", file=sys.stderr)
raise parse_err
def run_master(iterations, target_module, cpu="0", csv_path=None):
"""Orchestrates the benchmark."""
if iterations < 1:
raise ValueError("Number of iterations must be at least 1.")
times, memories = [], []
loaded_modules_val, loaded_lines_val = 0, 0
print(f"Profiling start... Running {iterations} cold-start iterations for {target_module}.")
if cpu.lower() != "none":
print(f"CPU Pinning enabled: Pinning processes to core {cpu} using taskset.")
else:
print("CPU Pinning disabled.")
for i in range(iterations):
cmd = []
if cpu.lower() != "none":
cmd += ["taskset", "-c", cpu]
cmd += [sys.executable, __file__, "--worker", f"--module={target_module}"]
try:
data = _run_worker_and_parse(cmd)
times.append(data["time_ms"])
memories.append(data["peak_ram_mb"])
loaded_modules_val = data["loaded_modules"]
loaded_lines_val = data["loaded_lines"]
except (subprocess.CalledProcessError, FileNotFoundError) as e:
if cpu.lower() != "none" and i == 0:
print("WARNING: taskset CPU pinning failed or is not available. Falling back to unpinned execution...")
cmd = [sys.executable, __file__, "--worker", f"--module={target_module}"]
try:
data = _run_worker_and_parse(cmd)
times.append(data["time_ms"])
memories.append(data["peak_ram_mb"])
loaded_modules_val = data["loaded_modules"]
loaded_lines_val = data["loaded_lines"]
cpu = "none"
except subprocess.CalledProcessError as err:
print(f"Error in worker process:\n{err.stderr}", file=sys.stderr)
raise err
else:
if isinstance(e, subprocess.CalledProcessError):
print(f"Error in worker process:\n{e.stderr}", file=sys.stderr)
raise e| def run_trace(target_module): | ||
| """Generates importtime trace log and writes it to a file.""" | ||
| trace_file = f"import_trace_{target_module.replace('.', '_')}.log" | ||
| print(f"Generating importtime trace log for {target_module} -> {trace_file}...") | ||
|
|
||
| # We run: python -X importtime -c "import <module>" | ||
| result = subprocess.run( | ||
| [sys.executable, "-X", "importtime", "-c", f"import {target_module}"], | ||
| capture_output=True, text=True | ||
| ) | ||
|
|
||
| with open(trace_file, "w", encoding="utf-8") as f: | ||
| f.write(result.stderr) | ||
|
|
||
| print(f"Trace log successfully written to {trace_file}") |
There was a problem hiding this comment.
In run_trace, if the import fails, subprocess.run will complete with a non-zero exit code, but the script will silently write the stderr to the trace file and print a success message without warning the user that the import actually failed. Adding a check for result.returncode and printing a warning with the worker's stdout/stderr will significantly improve usability and prevent users from analyzing incomplete or failed traces.
def run_trace(target_module):
"""Generates importtime trace log and writes it to a file."""
trace_file = f"import_trace_{target_module.replace('.', '_')}.log"
print(f"Generating importtime trace log for {target_module} -> {trace_file}...")
# We run: python -X importtime -c "import <module>"
result = subprocess.run(
[sys.executable, "-X", "importtime", "-c", f"import {target_module}"],
capture_output=True, text=True
)
if result.returncode != 0:
print(f"WARNING: Import failed with exit code {result.returncode}. The trace log may be incomplete or contain errors.", file=sys.stderr)
if result.stdout:
print(f"Worker stdout:\\n{result.stdout}", file=sys.stderr)
if result.stderr:
print(f"Worker stderr:\\n{result.stderr}", file=sys.stderr)
with open(trace_file, "w", encoding="utf-8") as f:
f.write(result.stderr)
print(f"Trace log successfully written to {trace_file}")- Extract worker logic to _run_worker_and_parse for robust JSON parsing - Add early validation for iterations parameter in run_master - Add non-zero exit code check and warning for run_trace failures
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a Python SDK Import Profiler, which includes a comprehensive documentation guide and a self-spawning profiling script (profiler.py) designed to benchmark import latency, peak memory, and dynamic code volume using a process-isolated master/worker architecture. The feedback recommends enhancing the robustness of the worker's stdout parsing by prefixing metrics with a unique identifier to prevent interference from module initialization logs, and replacing the manual CLI argument parsing with the standard argparse library for improved usability and maintainability.
| if __name__ == "__main__": | ||
| # Parse CLI arguments | ||
| target_module = "google.cloud.compute" | ||
| iterations = 50 | ||
| trace = False | ||
| cprofile = False | ||
| mprofile = False | ||
| cpu = "0" | ||
| csv_path = None | ||
|
|
||
| for arg in sys.argv[1:]: | ||
| if arg.startswith("--module="): | ||
| target_module = arg.split("=")[1] | ||
| elif arg.startswith("--iterations="): | ||
| iterations = int(arg.split("=")[1]) | ||
| elif arg.startswith("--cpu="): | ||
| cpu = arg.split("=")[1] | ||
| elif arg.startswith("--csv="): | ||
| csv_path = arg.split("=")[1] | ||
| elif arg == "--trace": | ||
| trace = True | ||
| elif arg == "--cprofile": | ||
| cprofile = True | ||
| elif arg == "--mprofile": | ||
| mprofile = True | ||
|
|
||
| if "--worker" in sys.argv: | ||
| run_worker(target_module) | ||
| elif trace: | ||
| run_trace(target_module) | ||
| elif cprofile: | ||
| run_cprofile(target_module) | ||
| elif mprofile: | ||
| run_mprofile(target_module) | ||
| else: | ||
| run_master(iterations, target_module, cpu, csv_path) No newline at end of file |
There was a problem hiding this comment.
Using manual argument parsing is fragile, does not support standard help flags (-h/--help), and silently ignores invalid arguments. Replacing it with the standard library argparse module improves maintainability, usability, and robustness.
if __name__ == "__main__":
import argparse
parser = argparse.ArgumentParser(description="Python SDK Import Profiler")
parser.add_argument("--module", default="google.cloud.compute", help="Target module to profile")
parser.add_argument("--iterations", type=int, default=50, help="Number of iterations")
parser.add_argument("--cpu", default="0", help="CPU core to pin to (or 'none')")
parser.add_argument("--csv", help="Path to export CSV results")
parser.add_argument("--trace", action="store_true", help="Generate importtime trace log")
parser.add_argument("--cprofile", action="store_true", help="Run cProfile")
parser.add_argument("--mprofile", action="store_true", help="Run tracemalloc memory snapshot")
parser.add_argument("--worker", action="store_true", help=argparse.SUPPRESS)
args = parser.parse_args()
if args.worker:
run_worker(args.module)
elif args.trace:
run_trace(args.module)
elif args.cprofile:
run_cprofile(args.module)
elif args.mprofile:
run_mprofile(args.module)
else:
run_master(args.iterations, args.module, args.cpu, args.csv)Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a Python SDK Import Profiler tool, consisting of a detailed documentation guide and a self-spawning, process-isolated benchmarking script (profiler.py) designed to measure cold-start import latency, peak memory usage, and dynamic code volume. The reviewer's feedback focuses on improving the robustness and usability of the script. Key recommendations include converting the cpu parameter to a string to prevent potential runtime errors, automatically creating parent directories for the CSV export path to avoid FileNotFoundError, and replacing manual CLI argument parsing with Python's standard argparse module for a more robust and user-friendly interface.
I am having trouble creating individual review comments. Click here to see my feedback.
scripts/import_profiler/profiler.py (78-81)
If cpu is passed as an integer (e.g., cpu=0), calling cpu.lower() will raise an AttributeError. Additionally, passing an integer to the subprocess command list will cause a TypeError during execution. Converting cpu to a string at the beginning of run_master ensures robust handling of both integer and string inputs.
def run_master(iterations, target_module, cpu="0", csv_path=None):
cpu = str(cpu) if cpu is not None else "none"
if iterations < 1:
raise ValueError("Number of iterations must be at least 1.")
scripts/import_profiler/profiler.py (127-128)
If the provided csv_path contains nested directories that do not exist yet, open(csv_path, "w") will raise a FileNotFoundError. Automatically creating the parent directories using os.makedirs makes the tool more robust and utilizes the otherwise unused os import.
if csv_path:
os.makedirs(os.path.dirname(os.path.abspath(csv_path)), exist_ok=True)
with open(csv_path, "w", newline="", encoding="utf-8") as f:
scripts/import_profiler/profiler.py (232-267)
Using manual argument parsing is error-prone, doesn't support standard --help / -h flags (which currently falls back to running the full benchmark), and only supports specific formats like --module=value instead of --module value. Replacing this with Python's standard argparse module provides robust validation, automatic help documentation, and a much better user experience.
if __name__ == "__main__":
import argparse
parser = argparse.ArgumentParser(
description="Python SDK Import Profiler: benchmark and profile module imports."
)
parser.add_argument(
"--module",
default="google.cloud.compute",
help="Target module to profile (e.g., google.cloud.aiplatform)"
)
parser.add_argument(
"--iterations",
type=int,
default=50,
help="Number of cold-start iterations to run"
)
parser.add_argument(
"--cpu",
default="0",
help="CPU core to pin processes to using taskset (set to 'none' to disable)"
)
parser.add_argument(
"--csv",
help="Path to export raw iteration metrics as a CSV file"
)
parser.add_argument(
"--trace",
action="store_true",
help="Generate an importtime trace log file"
)
parser.add_argument(
"--cprofile",
action="store_true",
help="Run cProfile to capture latency stack traces"
)
parser.add_argument(
"--mprofile",
action="store_true",
help="Run tracemalloc memory snapshot"
)
parser.add_argument(
"--worker",
action="store_true",
help="Run in worker mode (internal use)"
)
args = parser.parse_args()
if args.worker:
run_worker(args.module)
elif args.trace:
run_trace(args.module)
elif args.cprofile:
run_cprofile(args.module)
elif args.mprofile:
run_mprofile(args.module)
else:
run_master(args.iterations, args.module, args.cpu, args.csv)|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a Python SDK Import Profiler tool, consisting of a master/worker profiling script (profiler.py) and its comprehensive documentation (documentation.md). The profiler measures import latency, peak memory, and dynamic code volume in a process-isolated manner to avoid caching issues. The feedback suggests a platform-aware improvement to default the CPU pinning option to 'none' on non-Linux systems, preventing unnecessary warnings and fallback attempts on macOS or Windows.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a Python SDK Import Profiler tool, consisting of a comprehensive documentation guide and a master/worker execution script (profiler.py) designed to benchmark module import performance under cold-start conditions. Feedback on the implementation highlights opportunities to preserve the cold-start design by running both cProfile and tracemalloc profiling in isolated subprocesses rather than directly within the master process. Additionally, it is recommended to separate FileNotFoundError and subprocess.CalledProcessError handling in the master process to avoid misleading warnings about taskset availability when a worker process crashes.
| def run_cprofile(target_module): | ||
| """Runs cProfile to capture stack traces for latency.""" | ||
| import cProfile | ||
| import pstats | ||
|
|
||
| prof_file = f"cprofile_{target_module.replace('.', '_')}.prof" | ||
| print(f"Generating cProfile data for {target_module} -> {prof_file}...") | ||
|
|
||
| # Run profiling | ||
| pr = cProfile.Profile() | ||
| pr.enable() | ||
| importlib.import_module(target_module) | ||
| pr.disable() | ||
|
|
||
| # Save for flame charts (e.g. via snakeviz) | ||
| pr.dump_stats(prof_file) | ||
| print(f"cProfile stats successfully written to {prof_file}") | ||
|
|
||
| # Print top bottlenecks | ||
| print("\n--- Top 15 functions by cumulative time ---") | ||
| ps = pstats.Stats(pr).sort_stats(pstats.SortKey.CUMULATIVE) | ||
| ps.print_stats(15) |
There was a problem hiding this comment.
Running cProfile directly in the master process defeats the "cold start" design of the profiler. Since profiler.py already imports heavy standard library modules (like subprocess, json, csv, logging, tracemalloc, statistics), any imports of these modules by the target module will be resolved instantly from Python's sys.modules cache and won't be captured in the profile.
To ensure a true cold-start profile, cProfile should be executed in an isolated subprocess.
def run_cprofile(target_module):
"""Runs cProfile in a clean subprocess to capture stack traces for latency."""
import pstats
prof_file = f"cprofile_{target_module.replace('.', '_')}.prof"
print(f"Generating cProfile data for {target_module} -> {prof_file}...")
# Run profiling in a clean subprocess to ensure cold-start
result = subprocess.run(
[sys.executable, "-m", "cProfile", "-o", prof_file, "-c", f"import importlib; importlib.import_module('{target_module}')"],
capture_output=True, text=True
)
if result.returncode != 0:
print(f"Error generating cProfile data:\n{result.stderr}", file=sys.stderr)
return
print(f"cProfile stats successfully written to {prof_file}")
# Print top bottlenecks
print("\n--- Top 15 functions by cumulative time ---")
ps = pstats.Stats(prof_file).sort_stats(pstats.SortKey.CUMULATIVE)
ps.print_stats(15)- Execute cProfile in a fresh subprocess to ensure cold-start accuracy - Execute tracemalloc in a fresh subprocess to capture all memory allocations - Clean up master process exception handling for missing taskset vs crashed workers
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a Python SDK Import Profiler tool, which includes a comprehensive documentation file (documentation.md) and a core profiling script (profiler.py) designed to measure import latency, peak memory, and dynamic code volume using process-isolated worker runs. The review feedback highlights several critical security and robustness improvements: addressing potential code injection vulnerabilities in run_trace, run_cprofile, and run_mprofile by safely escaping the target_module parameter using json.dumps(), and preventing potential UnicodeDecodeError failures in run_worker when reading source files by adding errors='ignore' to the file-opening logic.
| "import tracemalloc\n" | ||
| "import importlib\n" | ||
| "tracemalloc.start()\n" | ||
| f"importlib.import_module('{target_module}')\n" | ||
| "snapshot = tracemalloc.take_snapshot()\n" | ||
| "tracemalloc.stop()\n" | ||
| "top_stats = snapshot.statistics('lineno')\n" | ||
| "for stat in top_stats[:15]:\n" | ||
| " print(stat)\n" | ||
| ) | ||
| result = subprocess.run([sys.executable, "-c", code], capture_output=True, text=True) |
There was a problem hiding this comment.
Directly interpolating target_module into the code string is vulnerable to code injection. Use json.dumps() to safely escape the module name.
code = (
"import tracemalloc\n"
"import importlib\n"
"tracemalloc.start()\n"
f"importlib.import_module({json.dumps(target_module)})\n"
"snapshot = tracemalloc.take_snapshot()\n"
"tracemalloc.stop()\n"
"top_stats = snapshot.statistics('lineno')\n"
"for stat in top_stats[:15]:\n"
" print(stat)\n"
)Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
- Use json.dumps to safely escape target_module string in all subprocess commands (mprofile, cprofile, trace)
|
/gemini review |
| * **[plan.md](./plan.md)**: The current project phases and roadmap checklist. | ||
| * **[status.md](./status.md)**: Tracks the active task state and hosts recorded baseline performance metrics. |
There was a problem hiding this comment.
Code Review
This pull request introduces a Python SDK Import Profiler tool, including a detailed documentation guide and a self-spawning script (profiler.py) that benchmarks module imports in process-isolated environments. The reviewer feedback highlights two main improvement opportunities: preventing a potential AttributeError by verifying that the module's __file__ attribute is a string before checking its extension, and refactoring the error handling block in run_master to eliminate duplicate metric collection logic.
| if mod and getattr(mod, '__file__', None): | ||
| file_path = mod.__file__ |
There was a problem hiding this comment.
The __file__ attribute of a module can sometimes be None (e.g., for namespace packages or dynamically created modules) or potentially a non-string object in custom environments. Checking if it is a string before performing string operations like .endswith() prevents potential AttributeErrors.
| if mod and getattr(mod, '__file__', None): | |
| file_path = mod.__file__ | |
| if mod and isinstance(getattr(mod, '__file__', None), str): | |
| file_path = mod.__file__ |
| try: | ||
| data = _run_worker_and_parse(cmd) | ||
| times.append(data["time_ms"]) | ||
| memories.append(data["peak_ram_mb"]) | ||
| loaded_modules_val = data["loaded_modules"] | ||
| loaded_lines_val = data["loaded_lines"] | ||
| except FileNotFoundError as e: | ||
| if cpu.lower() != "none" and i == 0: | ||
| print("WARNING: taskset CPU pinning is not available. Falling back to unpinned execution...") | ||
| cpu = "none" | ||
| cmd = [sys.executable, __file__, "--worker", f"--module={target_module}"] | ||
| try: | ||
| data = _run_worker_and_parse(cmd) | ||
| times.append(data["time_ms"]) | ||
| memories.append(data["peak_ram_mb"]) | ||
| loaded_modules_val = data["loaded_modules"] | ||
| loaded_lines_val = data["loaded_lines"] | ||
| except subprocess.CalledProcessError as err: | ||
| print(f"Error in worker process:\n{err.stderr}", file=sys.stderr) | ||
| raise err | ||
| else: | ||
| raise e | ||
| except subprocess.CalledProcessError as e: | ||
| print(f"Error in worker process:\n{e.stderr}", file=sys.stderr) | ||
| raise e |
There was a problem hiding this comment.
There is significant code duplication in the error handling block where metrics are appended to lists and assigned to variables. Refactoring this block to perform the list appending and variable assignment once after the try-except block simplifies the logic and improves maintainability.
try:
data = _run_worker_and_parse(cmd)
except FileNotFoundError as e:
if cpu.lower() != "none" and i == 0:
print("WARNING: taskset CPU pinning is not available. Falling back to unpinned execution...")
cpu = "none"
cmd = [sys.executable, __file__, "--worker", f"--module={target_module}"]
try:
data = _run_worker_and_parse(cmd)
except subprocess.CalledProcessError as err:
print(f"Error in worker process:\\n{err.stderr}", file=sys.stderr)
raise err
else:
raise e
except subprocess.CalledProcessError as e:
print(f"Error in worker process:\\n{e.stderr}", file=sys.stderr)
raise e
times.append(data["time_ms"])
memories.append(data["peak_ram_mb"])
loaded_modules_val = data["loaded_modules"]
loaded_lines_val = data["loaded_lines"]References
- Remove duplicate lines of code, especially duplicate assertions in tests, to keep the codebase clean and avoid redundancy.
| **Proposed Solution** | ||
| The selected design for the tracking framework is Hybrid Process Orchestration, implemented via a highly isolated Master/Worker architecture within a single self-spawning script (`profiler.py`). This setup directly resolves the "Cache Problem," where Python caches completed imports inside `sys.modules` and loads them in <0.1ms on subsequent calls, making standard loop benchmarking inaccurate. | ||
|
|
||
| * **The Master Process:** Operates as the central harness controller, looping exactly N times. For each iteration, it calls a clean subprocess using `sys.executable` pointing back to itself with a specialized `--worker` flag. This isolates every single run from Python's runtime memory cache, guaranteeing a true, 100% "cold start" baseline. | ||
| * **The Worker Process:** Spawns a fresh interpreter instance, executes the dynamic target module import via `importlib.import_module()`, natively tracks system metrics, pipes the final calculation payload out as a clean JSON string to `stdout`, and exits. |
There was a problem hiding this comment.
Maybe a high level description of usage and the expected output is sufficient instead of solution details?
Something like:
Run this command to collect the metrics
./profiler <arg_1> <arg_2>
Expected output:
package name:
load time:
memory:
package size:
- Replace detailed implementation architecture and alternative solutions with a concise high-level usage and expected output format per PR review feedback
This PR introduces profiler.py, a standalone, process-isolated import profiling tool designed to benchmark Python modules without dynamic cache interference.
To evaluate accurate import load times without observer effect, the profiler runs using a Master/Worker architecture to execute each import in a completely isolated sub-process. This ensures a 100% "cold start" evaluation.
Key Features / Telemetry Vectors:
Initialization Latency (ms): Measured at high resolution natively.
Peak Memory Usage (MB): Measured via tracemalloc.
Dynamic Code Volume (LOC): Tracks the deterministic delta in sys.modules to aggregate the exact physical Lines of Code loaded by the interpreter.
Includes detailed breakdown in scripts/import_profiler/documentation.md.
Related Links:
Design Doc: go/sdk-performance-design
For googleapis/python-aiplatform#4749