Skip to content

Impl sys.audithook#7960

Merged
youknowone merged 18 commits into
RustPython:mainfrom
ShaharNaveh:sys-audithook
May 24, 2026
Merged

Impl sys.audithook#7960
youknowone merged 18 commits into
RustPython:mainfrom
ShaharNaveh:sys-audithook

Conversation

@ShaharNaveh
Copy link
Copy Markdown
Contributor

@ShaharNaveh ShaharNaveh commented May 23, 2026

ref: #7956

Summary by CodeRabbit

  • New Features
    • Added comprehensive audit hook support for security monitoring across standard library modules
    • Implemented sys.audit() and sys.addaudithook() functions enabling custom audit event handling and registration
    • Socket operations, syslog operations, marshal dumps, time.sleep, and system monitoring callbacks now emit audit events

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 23, 2026

📝 Walkthrough

Walkthrough

This PR implements Python's sys.audit() hook mechanism in RustPython. It adds audit hook storage to the VM, introduces sys.audit() and sys.addaudithook() as public APIs, and integrates conditional audit event emission across socket, syslog, marshal, time, monitoring, and frame introspection operations.

Changes

Python Audit Hook System

Layer / File(s) Summary
VM audit storage and core sys infrastructure
crates/vm/src/vm/mod.rs, crates/vm/src/vm/thread.rs, crates/vm/src/stdlib/sys.rs
VirtualMachine gains audit_hooks: RefCell<Vec<PyObjectRef>> storage initialized in both regular and threaded contexts. The sys module's placeholder audit is replaced with run_audit_hooks dispatcher, sys.audit(event, *args) that formats args into a tuple and dispatches to all registered hooks (skipping if empty), and sys.addaudithook(hook) that registers new hooks and probes them with the sys.addaudithook event, suppressing registration only on RuntimeError subclass.
Frame introspection audit events
crates/vm/src/stdlib/sys.rs
sys._getframe and sys._getframemodulename now emit audit events after computing frame results. _getframemodulename return type changed to PyResult<PyObjectRef> to propagate audit-raised exceptions.
Socket module audit integration
crates/stdlib/src/socket.rs
socket.socket.__new__, socket.bind, and socket.gethostname conditionally emit audit events with operation-specific arguments (constructor parameters, resolved IP/port, or no args respectively).
Syslog module audit integration
crates/stdlib/src/syslog.rs
openlog and syslog emit audit events with relevant arguments (ident/logoption/facility, priority/message). closelog and setlogmask refactored to accept VirtualMachine and return PyResult to support audit dispatch before host syslog calls.
Other stdlib audit events
crates/vm/src/stdlib/marshal.rs, crates/vm/src/stdlib/time.rs, crates/vm/src/stdlib/sys/monitoring.rs
marshal.dumps emits audit with value and version; time.sleep emits audit with duration; sys.monitoring.register_callback emits audit with the callback function.

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested Reviewers

  • youknowone

Poem

🐇 Hark! The audit trails now gleam so bright,
From socket binds to marshalled bytes in flight,
Each hook a guardian, each call now heard,
Through sys.audit's voice, no secret word!
RustPython watches, catches all with care,
Security audits blooming everywhere! 🔍✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Impl sys.audithook' clearly and specifically identifies the main objective of the pull request - implementing the sys.audithook feature, which aligns with the substantial changes across multiple stdlib modules to add audit hook support.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 23, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[x] test: cpython/Lib/test/test_sys.py (TODO: 7)
[ ] test: cpython/Lib/test/test_syslog.py (TODO: 1)
[x] test: cpython/Lib/test/test_sys_setprofile.py (TODO: 3)
[ ] test: cpython/Lib/test/test_sys_settrace.py (TODO: 30)
[x] test: cpython/Lib/test/test_audit.py (TODO: 15)
[x] test: cpython/Lib/test/audit-tests.py

dependencies:

dependent tests: (233 tests)

  • sys: regrtestdata test___all__ test__colorize test__locale test__osx_support test_android test_annotationlib test_argparse test_array test_asdl_parser test_ast test_asyncio test_audit test_bdb test_bigaddrspace test_bigmem test_bisect test_buffer test_builtin test_bytes test_bz2 test_c_locale_coercion test_calendar test_class test_clinic test_cmath test_cmd test_cmd_line test_cmd_line_script test_code test_code_module test_codeccallbacks test_codecs test_collections test_compile test_compileall test_complex test_concurrent_futures test_context test_contextlib test_coroutines test_csv test_ctypes test_datetime test_dbm test_dbm_sqlite3 test_decimal test_descr test_dict test_difflib test_dis test_doctest test_doctest2 test_docxmlrpc test_dtrace test_dynamic test_dynamicclassattribute test_email test_ensurepip test_enum test_enumerate test_eof test_except_star test_exceptions test_faulthandler test_fcntl test_file test_file_eintr test_fileinput test_fileio test_float test_fork1 test_format test_fractions test_frame test_frozen test_functools test_future_stmt test_gc test_generated_cases test_generators test_genericpath test_genexps test_getopt test_glob test_grammar test_gzip test_hash test_hashlib test_http_cookiejar test_httpservers test_importlib test_inspect test_int test_io test_iter test_itertools test_json test_largefile test_launcher test_list test_listcomps test_locale test_logging test_long test_lzma test_mailbox test_marshal test_math test_memoryio test_memoryview test_metaclass test_mimetypes test_mmap test_monitoring test_msvcrt test_multiprocessing_forkserver test_multiprocessing_main_handling test_multiprocessing_spawn test_netrc test_ntpath test_numeric_tower test_operator test_optparse test_ordered_dict test_os test_osx_env test_pathlib test_patma test_peepholer test_perfmaps test_pickle test_pkg test_pkgutil test_platform test_plistlib test_popen test_posix test_posixpath test_print test_property test_pty test_pwd test_py_compile test_pyclbr test_pydoc test_pyexpat test_quopri test_raise test_range test_re test_regrtest test_repl test_reprlib test_resource test_runpy test_sax test_scope test_script_helper test_select test_selectors test_shutil test_signal test_site test_slice test_smtplib test_socket test_sqlite3 test_ssl test_stable_abi_ctypes test_stat test_statistics test_str test_strftime test_string_literals test_strptime test_strtod test_struct test_subprocess test_support test_sys test_sys_setprofile test_sys_settrace test_sysconfig test_syslog test_tarfile test_tempfile test_termios test_threadedtempfile test_threading test_threading_local test_threadsignals test_time test_timeit test_tomllib test_tools test_trace test_traceback test_tuple test_type_cache test_type_comments test_types test_typing test_unicode_file test_unicode_file_functions test_unicodedata test_unittest test_univnewlines test_urllib test_urllib2 test_urllib2net test_urlparse test_utf8_mode test_uuid test_venv test_wait3 test_wait4 test_wave test_weakref test_webbrowser test_winconsoleio test_winreg test_with test_wsgiref test_xml_etree test_xmlrpc test_xpickle test_zipapp test_zipfile test_zipfile64 test_zipimport test_zipimport_support test_zlib

[x] lib: cpython/Lib/bdb.py
[x] test: cpython/Lib/test/test_bdb.py (TODO: 1)

dependencies:

  • bdb

dependent tests: (1 tests)

  • bdb: test_bdb

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@ShaharNaveh ShaharNaveh marked this pull request as ready for review May 24, 2026 10:12
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/stdlib/src/syslog.rs`:
- Around line 98-104: When auto-opening, the code currently emits the
"syslog.syslog" audit before calling openlog(); change the order so that if
host_syslog::is_open() is false you first emit an audit for "syslog.openlog"
(using the same vm.sys_module.get_attr("audit", vm) and audit.call pattern, e.g.
audit.call((vm.ctx.new_str("syslog.openlog"), /* priority or args as appropriate
*/, vm.ctx.none()), vm) or similar to mirror how syslog.syslog is audited), then
call openlog(OpenLogArgs::default(), vm), and only after openlog succeeds emit
the "syslog.syslog" audit and call audit.call for syslog.syslog and then proceed
to call the syslog handler; locate the logic around
vm.sys_module.get_attr("audit", vm), host_syslog::is_open(), openlog(..., vm)
and the audit.call for "syslog.syslog" to implement this reorder.
- Around line 56-66: The audit payload currently calls args.ident.flatten(),
which conflates "argument omitted" and "explicit None" so openlog(None, ...) is
treated like omitted — change the audit_ident construction to use the outer
Option (args.ident.clone()) so omitted vs explicit None are distinct: if
args.ident is None (argument omitted) fall back to get_argv(vm) as before, if
args.ident is Some(None) produce vm.ctx.none() (explicit None), and if
args.ident is Some(Some(s)) convert s into a PyObjectRef; update the code path
that builds audit_ident (the closure using args.ident.flatten(),
vm.sys_module.get_attr("audit", vm), and get_argv(vm)) to implement this
branching.

In `@crates/vm/src/stdlib/sys.rs`:
- Around line 982-984: The current code checks for "audit" via
vm.sys_module.get_attr and calls it (e.g., the block using
vm.sys_module.get_attr("audit", vm) and audit.call(..., vm)), which allows
bypassing registered audit hooks if sys.audit was removed or replaced; replace
these dynamic attribute lookups with a direct call to run_audit_hooks(...)
passing the same arguments (e.g., the "sys._getframe" name and frame_ref) so
registered hooks always run; apply the same change to the analogous block around
the later occurrence (the code at the other audit call site referenced in the
comment).

In `@crates/vm/src/vm/thread.rs`:
- Line 733: new_thread() currently initializes audit_hooks with
RefCell::new(vec![]), dropping any hooks registered on the parent VM; update
new_thread (in vm::thread.rs) to preserve the parent VM's audit_hooks by cloning
or sharing the existing audit_hooks RefCell contents instead of creating an
empty vec (e.g., derive the new thread's audit_hooks from self.audit_hooks or
the parent ThreadVM's audit_hooks via clone() or by using an Rc/Arc to share the
container), so that sys.addaudithook() handlers registered on the parent are
observed in spawned threads.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 229a247f-bc9d-4092-9c7d-0db605941ce1

📥 Commits

Reviewing files that changed from the base of the PR and between e1d9a11 and b6b2f7d.

⛔ Files ignored due to path filters (4)
  • Lib/test/audit-tests.py is excluded by !Lib/**
  • Lib/test/test_audit.py is excluded by !Lib/**
  • Lib/test/test_bdb.py is excluded by !Lib/**
  • Lib/test/test_sys_setprofile.py is excluded by !Lib/**
📒 Files selected for processing (8)
  • crates/stdlib/src/socket.rs
  • crates/stdlib/src/syslog.rs
  • crates/vm/src/stdlib/marshal.rs
  • crates/vm/src/stdlib/sys.rs
  • crates/vm/src/stdlib/sys/monitoring.rs
  • crates/vm/src/stdlib/time.rs
  • crates/vm/src/vm/mod.rs
  • crates/vm/src/vm/thread.rs

Comment on lines +56 to +66
let ident = match args.ident.clone().flatten() {
Some(args) => Some(args.to_cstring(vm)?),
None => get_argv(vm).map(|argv| argv.to_cstring(vm)).transpose()?,
}
.map(|ident| ident.into_boxed_c_str());

if let Ok(audit) = vm.sys_module.get_attr("audit", vm) {
let audit_ident: PyObjectRef = args.ident.flatten().map_or_else(
|| get_argv(vm).map_or_else(|| vm.ctx.none(), Into::into),
Into::into,
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve ident=None semantics instead of falling back to argv.

args.ident.flatten() merges “argument omitted” and “explicit None”. That makes openlog(None, ...) behave like omitted ident in audit payload (and ident derivation), which diverges from expected behavior.

Suggested fix
-        let ident = match args.ident.clone().flatten() {
-            Some(args) => Some(args.to_cstring(vm)?),
-            None => get_argv(vm).map(|argv| argv.to_cstring(vm)).transpose()?,
-        }
+        let ident = match args.ident.clone() {
+            OptionalArg::Present(Some(s)) => Some(s.to_cstring(vm)?),
+            OptionalArg::Present(None) => None,
+            OptionalArg::Missing => get_argv(vm).map(|argv| argv.to_cstring(vm)).transpose()?,
+        }
         .map(|ident| ident.into_boxed_c_str());

         if let Ok(audit) = vm.sys_module.get_attr("audit", vm) {
-            let audit_ident: PyObjectRef = args.ident.flatten().map_or_else(
-                || get_argv(vm).map_or_else(|| vm.ctx.none(), Into::into),
-                Into::into,
-            );
+            let audit_ident: PyObjectRef = match args.ident {
+                OptionalArg::Present(Some(s)) => s.into(),
+                OptionalArg::Present(None) => vm.ctx.none(),
+                OptionalArg::Missing => get_argv(vm).map_or_else(|| vm.ctx.none(), Into::into),
+            };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let ident = match args.ident.clone().flatten() {
Some(args) => Some(args.to_cstring(vm)?),
None => get_argv(vm).map(|argv| argv.to_cstring(vm)).transpose()?,
}
.map(|ident| ident.into_boxed_c_str());
if let Ok(audit) = vm.sys_module.get_attr("audit", vm) {
let audit_ident: PyObjectRef = args.ident.flatten().map_or_else(
|| get_argv(vm).map_or_else(|| vm.ctx.none(), Into::into),
Into::into,
);
let ident = match args.ident.clone() {
OptionalArg::Present(Some(s)) => Some(s.to_cstring(vm)?),
OptionalArg::Present(None) => None,
OptionalArg::Missing => get_argv(vm).map(|argv| argv.to_cstring(vm)).transpose()?,
}
.map(|ident| ident.into_boxed_c_str());
if let Ok(audit) = vm.sys_module.get_attr("audit", vm) {
let audit_ident: PyObjectRef = match args.ident {
OptionalArg::Present(Some(s)) => s.into(),
OptionalArg::Present(None) => vm.ctx.none(),
OptionalArg::Missing => get_argv(vm).map_or_else(|| vm.ctx.none(), Into::into),
};
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/stdlib/src/syslog.rs` around lines 56 - 66, The audit payload
currently calls args.ident.flatten(), which conflates "argument omitted" and
"explicit None" so openlog(None, ...) is treated like omitted — change the
audit_ident construction to use the outer Option (args.ident.clone()) so omitted
vs explicit None are distinct: if args.ident is None (argument omitted) fall
back to get_argv(vm) as before, if args.ident is Some(None) produce
vm.ctx.none() (explicit None), and if args.ident is Some(Some(s)) convert s into
a PyObjectRef; update the code path that builds audit_ident (the closure using
args.ident.flatten(), vm.sys_module.get_attr("audit", vm), and get_argv(vm)) to
implement this branching.

Comment on lines +98 to 104
if let Ok(audit) = vm.sys_module.get_attr("audit", vm) {
audit.call((vm.ctx.new_str("syslog.syslog"), priority, msg.clone()), vm)?;
}

if !host_syslog::is_open() {
openlog(OpenLogArgs::default(), vm)?;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Emit syslog.openlog audit before first syslog.syslog when auto-opening.

Right now syslog.syslog is audited before the implicit openlog() call. That inverts expected event ordering on first use.

Suggested fix
-        if let Ok(audit) = vm.sys_module.get_attr("audit", vm) {
-            audit.call((vm.ctx.new_str("syslog.syslog"), priority, msg.clone()), vm)?;
-        }
-
         if !host_syslog::is_open() {
             openlog(OpenLogArgs::default(), vm)?;
         }
+
+        if let Ok(audit) = vm.sys_module.get_attr("audit", vm) {
+            audit.call((vm.ctx.new_str("syslog.syslog"), priority, msg.clone()), vm)?;
+        }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/stdlib/src/syslog.rs` around lines 98 - 104, When auto-opening, the
code currently emits the "syslog.syslog" audit before calling openlog(); change
the order so that if host_syslog::is_open() is false you first emit an audit for
"syslog.openlog" (using the same vm.sys_module.get_attr("audit", vm) and
audit.call pattern, e.g. audit.call((vm.ctx.new_str("syslog.openlog"), /*
priority or args as appropriate */, vm.ctx.none()), vm) or similar to mirror how
syslog.syslog is audited), then call openlog(OpenLogArgs::default(), vm), and
only after openlog succeeds emit the "syslog.syslog" audit and call audit.call
for syslog.syslog and then proceed to call the syslog handler; locate the logic
around vm.sys_module.get_attr("audit", vm), host_syslog::is_open(), openlog(...,
vm) and the audit.call for "syslog.syslog" to implement this reorder.

Comment on lines +982 to 984
if let Ok(audit) = vm.sys_module.get_attr("audit", vm) {
audit.call((vm.ctx.new_str("sys._getframe"), frame_ref.to_owned()), vm)?;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Call audit hooks directly instead of resolving sys.audit dynamically.

These paths currently skip auditing when sys.audit is missing/replaced, even if hooks are registered. Use run_audit_hooks(...) directly to avoid bypass through attribute mutation.

Also applies to: 995-997

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/stdlib/sys.rs` around lines 982 - 984, The current code checks
for "audit" via vm.sys_module.get_attr and calls it (e.g., the block using
vm.sys_module.get_attr("audit", vm) and audit.call(..., vm)), which allows
bypassing registered audit hooks if sys.audit was removed or replaced; replace
these dynamic attribute lookups with a direct call to run_audit_hooks(...)
passing the same arguments (e.g., the "sys._getframe" name and frame_ref) so
registered hooks always run; apply the same change to the analogous block around
the later occurrence (the code at the other audit call site referenced in the
comment).

asyncio_running_loop: RefCell::new(None),
asyncio_running_task: RefCell::new(None),
callable_cache: self.callable_cache.clone(),
audit_hooks: RefCell::new(vec![]),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve audit hooks when creating thread VMs.

new_thread() resets audit_hooks to empty, so hooks added via sys.addaudithook() in the parent VM are not observed in spawned threads. This breaks threaded audit dispatch behavior.

Suggested fix
-            audit_hooks: RefCell::new(vec![]),
+            audit_hooks: RefCell::new(self.audit_hooks.borrow().clone()),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
audit_hooks: RefCell::new(vec![]),
audit_hooks: RefCell::new(self.audit_hooks.borrow().clone()),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/vm/thread.rs` at line 733, new_thread() currently initializes
audit_hooks with RefCell::new(vec![]), dropping any hooks registered on the
parent VM; update new_thread (in vm::thread.rs) to preserve the parent VM's
audit_hooks by cloning or sharing the existing audit_hooks RefCell contents
instead of creating an empty vec (e.g., derive the new thread's audit_hooks from
self.audit_hooks or the parent ThreadVM's audit_hooks via clone() or by using an
Rc/Arc to share the container), so that sys.addaudithook() handlers registered
on the parent are observed in spawned threads.

Copy link
Copy Markdown
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@youknowone youknowone merged commit 2fabf38 into RustPython:main May 24, 2026
50 of 51 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants