Conversation
Move os, crt_fd, fileutils, windows, macros modules from
rustpython-common into the new rustpython-host-env crate.
This isolates host OS API wrappers behind a crate boundary
with zero Python runtime dependency.
- Add crates/host_env to workspace
- Drop nix, windows-sys, widestring deps from common
- Wire vm and stdlib to depend on rustpython-host-env
- Migrate all imports from common::{os,crt_fd,fileutils,windows}
to rustpython_host_env::
Agent-Logs-Url: https://github.com/RustPython/RustPython/sessions/48d1e64d-37ce-409f-b511-8e61a349665c Co-authored-by: youknowone <69878+youknowone@users.noreply.github.com>
📝 WalkthroughWalkthroughA new Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 9
🧹 Nitpick comments (3)
crates/host_env/src/msvcrt.rs (2)
58-60:chartou16cast silently truncates supplementary plane characters.Characters above U+FFFF (supplementary planes) will be truncated when cast to
u16. While this matches Windows' 16-bitwchar_tlimitation, the silent data loss could be surprising.📝 Consider documenting or validating BMP range
pub fn putwch(c: char) { + // Note: Windows wchar_t is 16-bit; only BMP characters (U+0000-U+FFFF) supported + debug_assert!(c as u32 <= 0xFFFF, "putwch only supports BMP characters"); unsafe { suppress_iph!(_putwch(c as u16)) }; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/msvcrt.rs` around lines 58 - 60, The putwch function silently truncates non-BMP characters when casting char to u16; locate putwch (and its call to suppress_iph!(_putwch(...))) and add validation or documentation: check if c as u32 <= 0xFFFF before calling _putwch, and if not either map to a replacement (e.g. '\u{FFFD}') or return/propagate an error/log a warning (choose consistent behavior with your crate's error handling); alternatively add a clear doc comment on putwch stating it accepts only BMP characters and that callers must handle supplementary-plane code points themselves.
36-52: Panic on invalid Unicode could be problematic in edge cases.
char::from_u32will returnNonefor surrogate code units (0xD800-0xDFFF). If_getwch/_getwchereturns a surrogate (which shouldn't happen for console input, but could occur with unusual input methods), this will panic.Consider returning a
Resultor using a replacement character instead of panicking.♻️ Alternative: Return Result or use replacement char
pub fn getwch() -> String { let value = unsafe { _getwch() }; - char::from_u32(value) - .unwrap_or_else(|| panic!("invalid unicode {value:`#x`} from _getwch")) - .to_string() + char::from_u32(value) + .unwrap_or(char::REPLACEMENT_CHARACTER) + .to_string() }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/msvcrt.rs` around lines 36 - 52, The current implementations of getwch and getwche panic if char::from_u32 returns None (surrogate/invalid codepoints); to avoid this, stop panicking and normalize invalid values to the Unicode replacement character instead so the API stays unchanged: in getwch and getwche replace the unwrap_or_else panic path with a fallback to '\u{FFFD}' before .to_string(); keep getche as-is (it returns raw bytes). Ensure you remove the panic messages and any panic-causing closures in functions getwch and getwche so invalid input yields U+FFFD rather than aborting.crates/host_env/src/fcntl.rs (1)
57-75: Function namedlockfusesflock-style flags, which could cause confusion.The function is named
lockfbut accepts acmdparameter that's checked againstlibc::LOCK_NB(aflock-style flag). POSIXlockf()uses different command constants (F_LOCK,F_ULOCK,F_TLOCK,F_TEST), whileLOCK_NBis fromflock().This conflation of APIs could confuse callers. Consider renaming to
fcntl_lockor documenting that this expects flock-style flags.📝 Suggested rename for clarity
+/// Performs file locking via fcntl using flock-style flags. +/// If `cmd` includes `LOCK_NB`, uses non-blocking `F_SETLK`; otherwise uses blocking `F_SETLKW`. #[cfg(not(any(target_os = "wasi", target_os = "redox")))] -pub fn lockf(fd: i32, cmd: i32, lock: &libc::flock) -> io::Result<i32> { +pub fn fcntl_lock(fd: i32, cmd: i32, lock: &libc::flock) -> io::Result<i32> {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/host_env/src/fcntl.rs` around lines 57 - 75, The function lockf currently mixes flock-style flags (checks libc::LOCK_NB on the cmd parameter) with fcntl operations (using F_SETLK/F_SETLKW), which is confusing and potentially incorrect; either rename the function to something like fcntl_lock (or fcntl_lockf) and update its docstring to state it expects flock-style nonblocking flag semantics, or change the cmd handling to accept POSIX lockf commands (F_LOCK/F_TLOCK/F_ULOCK/F_TEST) and map them appropriately to F_SETLK/F_SETLKW; locate the function named lockf and its cmd parameter and adjust the name and/or the flag checks and documentation accordingly so callers aren’t misled by the mixed APIs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/host_env/src/select.rs`:
- Around line 171-176: sec_to_timeval can produce negative tv_usec when sec is
negative which yields invalid timeval for select(); update sec_to_timeval so
negative timeouts are handled by clamping to a non-negative timeval (e.g., if
sec <= 0.0 return timeval { tv_sec: 0, tv_usec: 0 }), otherwise compute tv_sec =
sec.trunc() as _ and tv_usec = (sec.fract() * 1e6) as _ (ensuring tv_usec is
non-negative and fits the timeval field); reference sec_to_timeval and timeval
(and select()) when making the change.
In `@crates/host_env/src/shm.rs`:
- Around line 5-6: Replace the infallible conversion using try_into().unwrap()
with a direct cast on FreeBSD: change the conversion of mode in the
#[cfg(target_os = "freebsd")] branch so that the local variable `mode` is cast
to the target type (since `libc::c_uint` and `libc::mode_t` are both u32 on
FreeBSD) rather than calling `try_into().unwrap()`, updating the `mode` binding
used later in functions in this module (shm.rs) accordingly.
In `@crates/host_env/src/syslog.rs`:
- Around line 62-64: The implementation of pub const fn log_mask(pri: i32)
shifts the wrong operand; change the shift to compute 1 << pri (not pri << 1) so
LOG_MASK(0) yields 1 and all masks are correct. Update log_mask to perform the
shift with a literal one of the correct integer type (e.g., 1i32 << pri or (1 as
i32) << pri) and keep the function signature and const semantics intact. Ensure
the return type stays i32 and handle pri as-is (no additional checks needed
here).
- Around line 35-43: syslog() and closelog() have race conditions: ensure
syslog() holds the global_ident read lock while calling libc::syslog() so the
ident Box<CStr> cannot be freed concurrently, and change closelog() to acquire
the global_ident write lock and perform the is_open() check and drop the ident
while holding that write lock (preventing an openlog() from racing in between).
Also correct log_mask() to use the POSIX LOG_MASK formula (return 1 << pri, not
pri << 1). Update references: openlog (stores GlobalIdent), syslog
(acquire/read-lock around libc::syslog call), closelog (do is_open check and
drop under write lock), global_ident (use its locks accordingly), and log_mask
(use 1 << pri).
In `@crates/host_env/src/time.rs`:
- Around line 24-27: get_tz_info currently ignores GetTimeZoneInformation's
return value and returns a zeroed TIME_ZONE_INFORMATION on failure; change
get_tz_info to return a
Result<windows_sys::Win32::System::Time::TIME_ZONE_INFORMATION, std::io::Error>
(or similar error type), call GetTimeZoneInformation and check its return value
against TIME_ZONE_ID_INVALID, and on failure call GetLastError to construct and
return an Err(std::io::Error::from_raw_os_error(...)); update all callers in
crates/vm/src/stdlib/time.rs (the wrapper function and the functions altzone,
timezone, daylight, tzname and the platform module) to handle the Result by
propagating errors or mapping the Ok value to the previous behavior so failures
are not silently converted to a zeroed struct. Ensure unique symbols:
get_tz_info, wrapper function in stdlib/time.rs, altzone, timezone, daylight,
tzname, and the platform module are updated accordingly.
- Around line 31-45: checked_tm_from_struct_time currently only checks
tm.tm_wday < 0 causing a potential out-of-bounds when asctime_from_tm indexes
WDAY_NAME; update checked_tm_from_struct_time to validate the full range for
tm.tm_wday (>=0 && <=6) and ensure tm.tm_mon is validated for 0..=11 as well so
both WDAY_NAME[tm.tm_wday as usize] and MON_NAME[tm.tm_mon as usize] cannot
panic; locate and update the checks around tm.tm_wday and tm.tm_mon in the
checked_tm_from_struct_time function accordingly.
In `@crates/stdlib/src/openssl.rs`:
- Around line 1671-1681: The fallback arm of the map_err on the
rustpython_host_env::fileutils::fopen call currently returns
vm.new_os_error(e.to_string()) which drops the original errno and prevents
mapping to specific OSError subtypes; change that fallback to return the
original IO error converted to a Python exception by calling
e.to_pyexception(vm) so errno is preserved (i.e., replace
vm.new_os_error(e.to_string()) with e.to_pyexception(vm) inside the map_err
closure).
In `@crates/stdlib/src/select.rs`:
- Around line 106-108: The select path is reading the wrong Windows error
namespace; update the error retrieval in the host_select::select() helper
(crates/host_env/src/select.rs) to call WSAGetLastError() on Windows and
construct the io::Error from that code instead of using
GetLastError()/io::Error::last_os_error(); ensure the returned Error codes
propagate into select.rs::select() and socket.rs::sock_select() so the
interrupted-retry logic can detect Winsock-specific errors (e.g.,
WSAEINTR/WSAEINPROGRESS) correctly. Use conditional compilation for Windows
targets to call the WinSock API (WSAGetLastError) and fallback to the existing
path on non-Windows platforms so behavior is unchanged elsewhere.
In `@crates/vm/src/stdlib/msvcrt.rs`:
- Around line 46-54: The wide-char wrappers (getwch, getche, getwche) currently
treat MSVCRT UTF-16 code units as Rust char and can panic on surrogate halves;
update the plumbing to handle UTF-16 code units instead of Unicode scalars:
change the host_msvcrt side bindings for _getwch/_getwche/_putwch/_ungetwch to
use u16/u32 (i.e., raw UTF-16 code units), stop calling char::from_u32 on the
returned values, and in the VM wrappers (getwch, getche, getwche) either return
the raw u16 code units (Vec<u16>) or safely decode sequences using
std::char::decode_utf16 to produce a String without panicking (handling Err by
using replacement U+FFFD); likewise, when sending to putwch/ungetwch, accept
UTF-16 code units (u16/u32) rather than truncating Rust char.
---
Nitpick comments:
In `@crates/host_env/src/fcntl.rs`:
- Around line 57-75: The function lockf currently mixes flock-style flags
(checks libc::LOCK_NB on the cmd parameter) with fcntl operations (using
F_SETLK/F_SETLKW), which is confusing and potentially incorrect; either rename
the function to something like fcntl_lock (or fcntl_lockf) and update its
docstring to state it expects flock-style nonblocking flag semantics, or change
the cmd handling to accept POSIX lockf commands (F_LOCK/F_TLOCK/F_ULOCK/F_TEST)
and map them appropriately to F_SETLK/F_SETLKW; locate the function named lockf
and its cmd parameter and adjust the name and/or the flag checks and
documentation accordingly so callers aren’t misled by the mixed APIs.
In `@crates/host_env/src/msvcrt.rs`:
- Around line 58-60: The putwch function silently truncates non-BMP characters
when casting char to u16; locate putwch (and its call to
suppress_iph!(_putwch(...))) and add validation or documentation: check if c as
u32 <= 0xFFFF before calling _putwch, and if not either map to a replacement
(e.g. '\u{FFFD}') or return/propagate an error/log a warning (choose consistent
behavior with your crate's error handling); alternatively add a clear doc
comment on putwch stating it accepts only BMP characters and that callers must
handle supplementary-plane code points themselves.
- Around line 36-52: The current implementations of getwch and getwche panic if
char::from_u32 returns None (surrogate/invalid codepoints); to avoid this, stop
panicking and normalize invalid values to the Unicode replacement character
instead so the API stays unchanged: in getwch and getwche replace the
unwrap_or_else panic path with a fallback to '\u{FFFD}' before .to_string();
keep getche as-is (it returns raw bytes). Ensure you remove the panic messages
and any panic-causing closures in functions getwch and getwche so invalid input
yields U+FFFD rather than aborting.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 32126555-6507-4aa3-9b49-7cd150eda503
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (56)
Cargo.tomlcrates/common/Cargo.tomlcrates/common/src/lib.rscrates/host_env/Cargo.tomlcrates/host_env/src/crt_fd.rscrates/host_env/src/fcntl.rscrates/host_env/src/fileutils.rscrates/host_env/src/lib.rscrates/host_env/src/macros.rscrates/host_env/src/msvcrt.rscrates/host_env/src/nt.rscrates/host_env/src/os.rscrates/host_env/src/posix.rscrates/host_env/src/select.rscrates/host_env/src/shm.rscrates/host_env/src/signal.rscrates/host_env/src/syslog.rscrates/host_env/src/termios.rscrates/host_env/src/time.rscrates/host_env/src/winapi.rscrates/host_env/src/windows.rscrates/stdlib/Cargo.tomlcrates/stdlib/src/faulthandler.rscrates/stdlib/src/fcntl.rscrates/stdlib/src/mmap.rscrates/stdlib/src/openssl.rscrates/stdlib/src/overlapped.rscrates/stdlib/src/posixshmem.rscrates/stdlib/src/select.rscrates/stdlib/src/socket.rscrates/stdlib/src/syslog.rscrates/stdlib/src/termios.rscrates/vm/Cargo.tomlcrates/vm/src/exceptions.rscrates/vm/src/function/fspath.rscrates/vm/src/lib.rscrates/vm/src/ospath.rscrates/vm/src/stdlib/_codecs.rscrates/vm/src/stdlib/_ctypes/function.rscrates/vm/src/stdlib/_io.rscrates/vm/src/stdlib/_signal.rscrates/vm/src/stdlib/_winapi.rscrates/vm/src/stdlib/msvcrt.rscrates/vm/src/stdlib/nt.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/posix.rscrates/vm/src/stdlib/posix_compat.rscrates/vm/src/stdlib/sys.rscrates/vm/src/stdlib/time.rscrates/vm/src/stdlib/winreg.rscrates/vm/src/stdlib/winsound.rscrates/vm/src/vm/vm_new.rscrates/vm/src/windows.rsexamples/generator.rsexamples/package_embed.rssrc/lib.rs
💤 Files with no reviewable changes (2)
- crates/common/Cargo.toml
- crates/common/src/lib.rs
| pub fn sec_to_timeval(sec: f64) -> timeval { | ||
| timeval { | ||
| tv_sec: sec.trunc() as _, | ||
| tv_usec: (sec.fract() * 1e6) as _, | ||
| } | ||
| } |
There was a problem hiding this comment.
Negative timeout values could produce invalid timeval.
If sec is negative, sec.fract() will also be negative, resulting in a negative tv_usec value. This could cause undefined behavior when passed to select().
🛡️ Proposed fix to handle negative values
pub fn sec_to_timeval(sec: f64) -> timeval {
+ // Clamp negative values to zero to avoid invalid timeval
+ let sec = sec.max(0.0);
timeval {
tv_sec: sec.trunc() as _,
tv_usec: (sec.fract() * 1e6) as _,
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| pub fn sec_to_timeval(sec: f64) -> timeval { | |
| timeval { | |
| tv_sec: sec.trunc() as _, | |
| tv_usec: (sec.fract() * 1e6) as _, | |
| } | |
| } | |
| pub fn sec_to_timeval(sec: f64) -> timeval { | |
| // Clamp negative values to zero to avoid invalid timeval | |
| let sec = sec.max(0.0); | |
| timeval { | |
| tv_sec: sec.trunc() as _, | |
| tv_usec: (sec.fract() * 1e6) as _, | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/host_env/src/select.rs` around lines 171 - 176, sec_to_timeval can
produce negative tv_usec when sec is negative which yields invalid timeval for
select(); update sec_to_timeval so negative timeouts are handled by clamping to
a non-negative timeval (e.g., if sec <= 0.0 return timeval { tv_sec: 0, tv_usec:
0 }), otherwise compute tv_sec = sec.trunc() as _ and tv_usec = (sec.fract() *
1e6) as _ (ensuring tv_usec is non-negative and fits the timeval field);
reference sec_to_timeval and timeval (and select()) when making the change.
| #[cfg(target_os = "freebsd")] | ||
| let mode = mode.try_into().unwrap(); |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In Rust libc bindings on FreeBSD, what is the modeparameter type forshm_open, and can converting from libc::c_uintviaTryInto fail?
💡 Result:
In Rust's libc bindings on FreeBSD, the mode parameter for shm_open is of type libc::mode_t, which is u32 (type alias). The libc::shm_open function is declared as pub unsafe extern "C" fn shm_open(name: *const c_char, oflag: c_int, ...) -> c_int, using variadic arguments (...), so the mode value passed after oflag must be compatible with FreeBSD's C mode_t (also u32). Since libc::c_uint is also u32 on FreeBSD, converting from libc::c_uint to libc::mode_t via TryInto cannot fail, as it is a zero-cost type alias conversion between identical underlying types (u32 to u32).
Citations:
- 1: https://docs.rs/libc/0.2.77/x86_64-apple-darwin/libc/fn.shm_open.html
- 2: https://man.bsd.lv/FreeBSD-13.0/shm_open.2
- 3: https://man.freebsd.org/cgi/man.cgi?apropos=0&manpath=FreeBSD+15.0-RELEASE+and+Ports&query=shm_open&sektion=2
- 4: https://github.com/rust-lang/libc/blob/master/src/unix/bsd/freebsdlike/freebsd/mod.rs
- 5: https://docs.rs/libc/latest/libc/type.mode_t.html
- 6: https://github.com/sfackler/rust-posix-ipc/blob/master/src/shm.rs
🏁 Script executed:
# Find and examine the shm.rs file
fd -type f -name "shm.rs" | head -20Repository: RustPython/RustPython
Length of output: 235
🏁 Script executed:
# Find the shm.rs file in the repository
fd shm.rsRepository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
# Read the shm.rs file to see the function and its context
cat -n crates/host_env/src/shm.rsRepository: RustPython/RustPython
Length of output: 801
🏁 Script executed:
# Check the libc crate definitions for FreeBSD mode_t and c_uint
# First, let's see what's in the Cargo.lock or Cargo.toml to find the libc version
fd -name "Cargo.toml" | xargs grep -l "libc" | head -5Repository: RustPython/RustPython
Length of output: 295
🏁 Script executed:
# Find Cargo.toml files that reference libc
find . -name "Cargo.toml" -type f | head -5Repository: RustPython/RustPython
Length of output: 202
🏁 Script executed:
# Check the main Cargo.toml and the host_env Cargo.toml for libc dependency
grep -r "libc" ./Cargo.toml ./crates/host_env/Cargo.toml 2>/dev/null | head -10Repository: RustPython/RustPython
Length of output: 192
🏁 Script executed:
# Let's verify the actual type definitions for FreeBSD
# Check if there's any documentation or test that shows the types
rg "mode_t|c_uint" crates/host_env/ -A 2 -B 2Repository: RustPython/RustPython
Length of output: 933
Simplify FreeBSD mode conversion using a direct cast.
The try_into().unwrap() is unnecessary since both libc::c_uint and libc::mode_t are u32 on FreeBSD, making the conversion infallible. Use a direct cast instead for clarity:
Proposed fix
pub fn shm_open(name: &CStr, flags: libc::c_int, mode: libc::c_uint) -> io::Result<libc::c_int> {
#[cfg(target_os = "freebsd")]
- let mode = mode.try_into().unwrap();
+ let mode = mode as libc::mode_t;
let fd = unsafe { libc::shm_open(name.as_ptr(), flags, mode) };📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| #[cfg(target_os = "freebsd")] | |
| let mode = mode.try_into().unwrap(); | |
| pub fn shm_open(name: &CStr, flags: libc::c_int, mode: libc::c_uint) -> io::Result<libc::c_int> { | |
| #[cfg(target_os = "freebsd")] | |
| let mode = mode as libc::mode_t; | |
| let fd = unsafe { libc::shm_open(name.as_ptr(), flags, mode) }; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/host_env/src/shm.rs` around lines 5 - 6, Replace the infallible
conversion using try_into().unwrap() with a direct cast on FreeBSD: change the
conversion of mode in the #[cfg(target_os = "freebsd")] branch so that the local
variable `mode` is cast to the target type (since `libc::c_uint` and
`libc::mode_t` are both u32 on FreeBSD) rather than calling
`try_into().unwrap()`, updating the `mode` binding used later in functions in
this module (shm.rs) accordingly.
| pub fn openlog(ident: Option<Box<CStr>>, logoption: i32, facility: i32) { | ||
| let ident = match ident { | ||
| Some(ident) => GlobalIdent::Explicit(ident), | ||
| None => GlobalIdent::Implicit, | ||
| }; | ||
| let mut locked_ident = global_ident().write().expect("syslog lock poisoned"); | ||
| unsafe { libc::openlog(ident.as_ptr(), logoption, facility) }; | ||
| *locked_ident = Some(ident); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat crates/host_env/src/syslog.rsRepository: RustPython/RustPython
Length of output: 1698
🌐 Web query:
POSIX syslog openlog ident pointer lifetime semantics
💡 Result:
According to POSIX.1-2024 (IEEE Std 1003.1-2024), the openlog function takes a const char *ident parameter, which is a pointer to a null-terminated string used as an identifier prepended to syslog messages. The specification states: "The application shall ensure that the string pointed to by ident remains valid during the syslog calls that will prepend this identifier; however, it is unspecified whether changes made to the string will change the identifier prepended by later syslog calls." This means the pointer is stored and used across subsequent syslog calls until closelog or another openlog is called. The application is responsible for ensuring the string's lifetime extends at least until those syslog calls complete (i.e., do not free or invalidate the memory while the syslog connection uses it). Implementations typically store the pointer directly without copying the string, so the memory must persist. GNU C Library documentation confirms: "the string pointer ident will be retained internally by the Syslog routines. You must not free the memory that ident points to." Linux and other man pages note similar behavior: the pointer is stored as-is, and invalidating it leads to undefined results. Best practice: Pass a static string literal or ensure the string outlives all relevant syslog calls (e.g., until closelog).
Citations:
- 1: https://pubs.opengroup.org/onlinepubs/9799919799/functions/openlog.html
- 2: https://pubs.opengroup.org/onlinepubs/9699919799/functions/openlog.html
- 3: https://www.gnu.org/s/libc/manual/html_node/openlog.html
- 4: https://man7.org/linux/man-pages/man3/syslog.3.html
- 5: https://man7.org/linux/man-pages/man3/openlog.3.html
🌐 Web query:
syslog LOG_MASK macro definition 1 << pri
💡 Result:
The LOG_MASK macro in syslog.h is defined as #define LOG_MASK(pri) (1 << (pri)). This creates a bitmask for a specific syslog priority level (0-7), used with setlogmask to control which priority messages are logged. For example, LOG_MASK(LOG_ERR) evaluates to 1 << 3, enabling only error-level messages.
Citations:
- 1: https://github.com/openbsd/src/blob/master/sys/sys/syslog.h
- 2: https://codebrowser.dev/glibc/glibc/misc/sys/syslog.h.html
- 3: https://manpages.org/setlogmask
- 4: http://unix.superglobalmegacorp.com/Net2/newsrc/sys/syslog.h.html
Hold the ident lock during syslog() and move the is_open() check inside closelog()'s write lock.
Per POSIX.1-2024, the ident pointer passed to openlog() is retained by libc and must remain valid across subsequent syslog() calls. The current code has two races:
syslog()callslibc::syslog()without holding the lock onglobal_ident, so a concurrentcloselog()can drop theBox<CStr>while libc still uses the pointer.closelog()checksis_open()(releasing the lock) before acquiring the write lock to drop the ident, allowing a concurrentopenlog()to race in between.
Proposed fixes
pub fn syslog(priority: i32, msg: &CStr) {
+ let _ident_guard = global_ident().read().expect("syslog lock poisoned");
let cformat = c"%s";
unsafe { libc::syslog(priority, cformat.as_ptr(), msg.as_ptr()) };
}
pub fn closelog() {
- if is_open() {
- let mut locked_ident = global_ident().write().expect("syslog lock poisoned");
+ let mut locked_ident = global_ident().write().expect("syslog lock poisoned");
+ if locked_ident.is_some() {
unsafe { libc::closelog() };
*locked_ident = None;
}
}Also, log_mask() implements the wrong formula. The syslog LOG_MASK macro is 1 << pri, not pri << 1. For example, log_mask(5) should evaluate to 32, not 10.
pub const fn log_mask(pri: i32) -> i32 {
- pri << 1
+ 1 << pri
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/host_env/src/syslog.rs` around lines 35 - 43, syslog() and closelog()
have race conditions: ensure syslog() holds the global_ident read lock while
calling libc::syslog() so the ident Box<CStr> cannot be freed concurrently, and
change closelog() to acquire the global_ident write lock and perform the
is_open() check and drop the ident while holding that write lock (preventing an
openlog() from racing in between). Also correct log_mask() to use the POSIX
LOG_MASK formula (return 1 << pri, not pri << 1). Update references: openlog
(stores GlobalIdent), syslog (acquire/read-lock around libc::syslog call),
closelog (do is_open check and drop under write lock), global_ident (use its
locks accordingly), and log_mask (use 1 << pri).
| pub const fn log_mask(pri: i32) -> i32 { | ||
| pri << 1 | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's check the current state of the syslog.rs file
cat -n crates/host_env/src/syslog.rs | head -80Repository: RustPython/RustPython
Length of output: 2174
🏁 Script executed:
# Search for any usages of log_mask in the codebase
rg "log_mask" --type rust -B 2 -A 2Repository: RustPython/RustPython
Length of output: 578
🌐 Web query:
syslog LOG_MASK macro definition standard
💡 Result:
The LOG_MASK macro is defined in the POSIX standard <syslog.h> header as a macro that expands to an expression of type int when its argument pri (of type int) is provided. It generates a bitmask for a specific priority level pri, used as part of the maskpri argument to setlogmask to control which syslog priority levels are logged. Standard POSIX definition (IEEE Std 1003.1): LOG_MASK(pri) - A mask for priority pri. Common implementation across systems (BSD/glibc/OpenBSD/Haiku): #define LOG_MASK(pri) (1 << (pri)) This sets the bit corresponding to the priority value (0-7 for LOG_EMERG to LOG_DEBUG), enabling logging only for messages at that exact priority when passed to setlogmask. The default mask allows all priorities. It is often used with LOG_UPTO(pri), which masks all priorities from LOG_EMERG (0) through pri.
Citations:
- 1: https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/basedefs/syslog.h.html
- 2: https://pubs.opengroup.org/onlinepubs/9799919799/functions/setlogmask.html
- 3: https://man7.org/linux/man-pages/man0/syslog.h.0p.html
- 4: https://man7.org/linux/man-pages/man3/setlogmask.3.html
- 5: https://github.com/openbsd/src/blob/master/sys/sys/syslog.h
- 6: https://codebrowser.dev/glibc/glibc/misc/sys/syslog.h.html
LOG_MASK() shifts the wrong operand.
The syslog mask macro per POSIX standard is 1 << pri, not pri << 1. The current implementation returns 0 for LOG_MASK(0) instead of 1, and produces incorrect masks for all other priorities.
Proposed fix
pub const fn log_mask(pri: i32) -> i32 {
- pri << 1
+ 1 << pri
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| pub const fn log_mask(pri: i32) -> i32 { | |
| pri << 1 | |
| } | |
| pub const fn log_mask(pri: i32) -> i32 { | |
| 1 << pri | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/host_env/src/syslog.rs` around lines 62 - 64, The implementation of
pub const fn log_mask(pri: i32) shifts the wrong operand; change the shift to
compute 1 << pri (not pri << 1) so LOG_MASK(0) yields 1 and all masks are
correct. Update log_mask to perform the shift with a literal one of the correct
integer type (e.g., 1i32 << pri or (1 as i32) << pri) and keep the function
signature and const semantics intact. Ensure the return type stays i32 and
handle pri as-is (no additional checks needed here).
| pub fn get_tz_info() -> windows_sys::Win32::System::Time::TIME_ZONE_INFORMATION { | ||
| let mut info = unsafe { core::mem::zeroed() }; | ||
| unsafe { windows_sys::Win32::System::Time::GetTimeZoneInformation(&mut info) }; | ||
| info |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
For Win32 GetTimeZoneInformation, what return value indicates failure, and should callers check GetLastError in that case?
💡 Result:
The return value indicating failure for Win32 GetTimeZoneInformation is TIME_ZONE_ID_INVALID (0xFFFFFFFF). Callers should check GetLastError in that case to get extended error information, such as for out-of-memory errors.
Citations:
- 1: https://learn.microsoft.com/tr-tr/windows/win32/api/timezoneapi/nf-timezoneapi-gettimezoneinformation
- 2: https://learn.microsoft.com/en-us/windows/win32/api/timezoneapi/nf-timezoneapi-gettimezoneinformation
- 3: https://learn.microsoft.com/en-us/windows/win32/api/timezoneapi/nf-timezoneapi-getdynamictimezoneinformation
- 4: https://learn.microsoft.com/en-us/previous-versions/ms908466(v=msdn.10)
🏁 Script executed:
# Read the time.rs file to verify the current code state
cat -n crates/host_env/src/time.rsRepository: RustPython/RustPython
Length of output: 1776
🏁 Script executed:
# Search for all usages of get_tz_info to understand call sites
rg "get_tz_info" --type rs -B 2 -A 2Repository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
# Search for all usages of get_tz_info
rg "get_tz_info" -A 2 -B 2Repository: RustPython/RustPython
Length of output: 3079
Handle GetTimeZoneInformation failure explicitly.
The return value is currently ignored; on failure this returns a zeroed struct, which silently produces incorrect timezone behavior. The Windows API signals failure with TIME_ZONE_ID_INVALID and requires checking GetLastError for extended error information.
All call sites in crates/vm/src/stdlib/time.rs (wrapper function, altzone, timezone, daylight, tzname, and platform module) currently expect a direct struct return and will need to be updated to handle the Result type.
Proposed fix
+use std::io;
+
#[cfg(target_env = "msvc")]
#[cfg(not(target_arch = "wasm32"))]
-pub fn get_tz_info() -> windows_sys::Win32::System::Time::TIME_ZONE_INFORMATION {
+pub fn get_tz_info() -> io::Result<windows_sys::Win32::System::Time::TIME_ZONE_INFORMATION> {
let mut info = unsafe { core::mem::zeroed() };
- unsafe { windows_sys::Win32::System::Time::GetTimeZoneInformation(&mut info) };
- info
+ let status = unsafe { windows_sys::Win32::System::Time::GetTimeZoneInformation(&mut info) };
+ if status == windows_sys::Win32::System::Time::TIME_ZONE_ID_INVALID {
+ Err(io::Error::last_os_error())
+ } else {
+ Ok(info)
+ }
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/host_env/src/time.rs` around lines 24 - 27, get_tz_info currently
ignores GetTimeZoneInformation's return value and returns a zeroed
TIME_ZONE_INFORMATION on failure; change get_tz_info to return a
Result<windows_sys::Win32::System::Time::TIME_ZONE_INFORMATION, std::io::Error>
(or similar error type), call GetTimeZoneInformation and check its return value
against TIME_ZONE_ID_INVALID, and on failure call GetLastError to construct and
return an Err(std::io::Error::from_raw_os_error(...)); update all callers in
crates/vm/src/stdlib/time.rs (the wrapper function and the functions altzone,
timezone, daylight, tzname and the platform module) to handle the Result by
propagating errors or mapping the Ok value to the previous behavior so failures
are not silently converted to a zeroed struct. Ensure unique symbols:
get_tz_info, wrapper function in stdlib/time.rs, altzone, timezone, daylight,
tzname, and the platform module are updated accordingly.
| pub fn asctime_from_tm(tm: &libc::tm) -> String { | ||
| const WDAY_NAME: [&str; 7] = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]; | ||
| const MON_NAME: [&str; 12] = [ | ||
| "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec", | ||
| ]; | ||
| format!( | ||
| "{} {}{:>3} {:02}:{:02}:{:02} {}", | ||
| WDAY_NAME[tm.tm_wday as usize], | ||
| MON_NAME[tm.tm_mon as usize], | ||
| tm.tm_mday, | ||
| tm.tm_hour, | ||
| tm.tm_min, | ||
| tm.tm_sec, | ||
| tm.tm_year + 1900 | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
rg -n --type rust 'asctime_from_tm\('
rg -n --type rust -C 4 'tm_wday|tm_mon'Repository: RustPython/RustPython
Length of output: 1784
🏁 Script executed:
rg -n --type rust 'asctime_from_tm' --max-count 20Repository: RustPython/RustPython
Length of output: 392
🏁 Script executed:
rg -n --type rust 'tm_wday|tm_mon' -C 3 --max-count 30Repository: RustPython/RustPython
Length of output: 13157
🌐 Web query:
libc::tm tm_wday tm_mon range valid values documentation
💡 Result:
In the C standard library (libc) struct tm, the fields tm_wday and tm_mon have the following valid ranges: tm_wday: days since Sunday [0, 6] (0=Sunday, 1=Monday, ..., 6=Saturday) tm_mon: months since January [0, 11] (0=January, 1=February, ..., 11=December) These ranges are standard across C (cppreference.com citing C11/C17/C23), POSIX (Open Group Base Specifications), and libc man pages (Linux/Ubuntu). Functions like mktime may normalize or ignore tm_wday values outside this range, but the documented valid input/output range is 0-6 for tm_wday and 0-11 for tm_mon.
Citations:
- 1: https://w.cppreference.com/c/chrono/tm.html
- 2: https://manpages.ubuntu.com/manpages/noble/man3/tm.3type.html
- 3: https://man7.org/linux/man-pages/man3/tm.3type.html
- 4: https://pubs.opengroup.org/onlinepubs/009695099/basedefs/time.h.html
- 5: https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/time.h.html
🏁 Script executed:
sed -n '580,600p' crates/vm/src/stdlib/time.rs
sed -n '600,620p' crates/vm/src/stdlib/time.rs
sed -n '350,430p' crates/vm/src/stdlib/time.rsRepository: RustPython/RustPython
Length of output: 4219
🏁 Script executed:
sed -n '440,450p' crates/vm/src/stdlib/time.rsRepository: RustPython/RustPython
Length of output: 448
Add upper bound validation for tm_wday in checked_tm_from_struct_time.
While asctime_from_tm call sites do pass through checked_tm_from_struct_time validation, the bounds check for tm_wday is incomplete. Line 442 only checks tm.tm_wday < 0 but omits the upper bound—should also validate tm.tm_wday > 6 to prevent panic when indexing WDAY_NAME[7]. Similarly, confirm tm_mon validation correctly enforces [0, 11] range (line 425 looks correct).
Validation gaps in crates/vm/src/stdlib/time.rs
Line 442-444 (incomplete):
if tm.tm_wday < 0 {
return Err(vm.new_value_error("day of week out of range"));
}Should be:
if tm.tm_wday < 0 || tm.tm_wday > 6 {
return Err(vm.new_value_error("day of week out of range"));
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/host_env/src/time.rs` around lines 31 - 45,
checked_tm_from_struct_time currently only checks tm.tm_wday < 0 causing a
potential out-of-bounds when asctime_from_tm indexes WDAY_NAME; update
checked_tm_from_struct_time to validate the full range for tm.tm_wday (>=0 &&
<=6) and ensure tm.tm_mon is validated for 0..=11 as well so both
WDAY_NAME[tm.tm_wday as usize] and MON_NAME[tm.tm_mon as usize] cannot panic;
locate and update the checks around tm.tm_wday and tm.tm_mon in the
checked_tm_from_struct_time function accordingly.
| rustpython_host_env::fileutils::fopen(path.as_path(), "rb").map_err(|e| match e | ||
| .kind() | ||
| { | ||
| std::io::ErrorKind::NotFound => vm | ||
| .new_os_subtype_error( | ||
| vm.ctx.exceptions.file_not_found_error.to_owned(), | ||
| Some(libc::ENOENT), | ||
| e.to_string(), | ||
| ) | ||
| .upcast(), | ||
| _ => vm.new_os_error(e.to_string()), |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the current load_dh_params mapping.
fd -a 'openssl\.rs$' . -x sed -n '1666,1682p' {}
# Inspect the generic OSError constructor.
rg -n -C3 --type=rust '\b(pub\s+)?fn new_os_error\b'
# Inspect how std::io::Error is normally translated into Python exceptions.
rg -n -C4 --type=rust 'raw_os_error\(\)|to_pyexception\(.*vm\)|into_pyexception\(.*vm\)|file_not_found_error|permission_error'Repository: RustPython/RustPython
Length of output: 50378
Replace vm.new_os_error() with e.to_pyexception() in the fallback arm to preserve errno.
The _ => vm.new_os_error(e.to_string()) branch discards the errno code from the std::io::Error, mapping failures like EACCES, EISDIR, and ENOTDIR to a plain OSError instead of their proper subtypes (PermissionError, IsADirectoryError, NotADirectoryError). The codebase uses e.to_pyexception(vm) throughout (_io.rs, nt.rs, _winapi.rs) to preserve this information for correct exception type selection.
Suggested fix
- _ => vm.new_os_error(e.to_string()),
+ _ => e.to_pyexception(vm),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| rustpython_host_env::fileutils::fopen(path.as_path(), "rb").map_err(|e| match e | |
| .kind() | |
| { | |
| std::io::ErrorKind::NotFound => vm | |
| .new_os_subtype_error( | |
| vm.ctx.exceptions.file_not_found_error.to_owned(), | |
| Some(libc::ENOENT), | |
| e.to_string(), | |
| ) | |
| .upcast(), | |
| _ => vm.new_os_error(e.to_string()), | |
| rustpython_host_env::fileutils::fopen(path.as_path(), "rb").map_err(|e| match e | |
| .kind() | |
| { | |
| std::io::ErrorKind::NotFound => vm | |
| .new_os_subtype_error( | |
| vm.ctx.exceptions.file_not_found_error.to_owned(), | |
| Some(libc::ENOENT), | |
| e.to_string(), | |
| ) | |
| .upcast(), | |
| _ => e.to_pyexception(vm), |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/stdlib/src/openssl.rs` around lines 1671 - 1681, The fallback arm of
the map_err on the rustpython_host_env::fileutils::fopen call currently returns
vm.new_os_error(e.to_string()) which drops the original errno and prevents
mapping to specific OSError subtypes; change that fallback to return the
original IO error converted to a Python exception by calling
e.to_pyexception(vm) so errno is preserved (i.e., replace
vm.new_os_error(e.to_string()) with e.to_pyexception(vm) inside the map_err
closure).
| let mut tv = timeout.map(host_select::sec_to_timeval); | ||
| let res = | ||
| vm.allow_threads(|| host_select::select(nfds, &mut r, &mut w, &mut x, tv.as_mut())); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
sed -n '1,220p' crates/host_env/src/select.rs
printf '\n--- host_select call sites ---\n'
rg -n "host_select::select|use rustpython_host_env::select" crates/stdlib/src/select.rs crates/stdlib/src/socket.rs
printf '\n--- error retrieval in host wrapper ---\n'
rg -n "last_os_error|WSAGetLastError" crates/host_env/src/select.rsRepository: RustPython/RustPython
Length of output: 4825
Use WSAGetLastError() for WinSock error retrieval on Windows.
The host_select::select() helper in crates/host_env/src/select.rs calls io::Error::last_os_error() on error, which reads GetLastError() instead of WSAGetLastError(). WinSock errors live in a separate error namespace; using the wrong retrieval function will produce incorrect error codes on Windows. This breaks error handling in both select.rs::select() (line 108) and socket.rs::sock_select() (line 2813), affecting the interrupted-retry logic.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/stdlib/src/select.rs` around lines 106 - 108, The select path is
reading the wrong Windows error namespace; update the error retrieval in the
host_select::select() helper (crates/host_env/src/select.rs) to call
WSAGetLastError() on Windows and construct the io::Error from that code instead
of using GetLastError()/io::Error::last_os_error(); ensure the returned Error
codes propagate into select.rs::select() and socket.rs::sock_select() so the
interrupted-retry logic can detect Winsock-specific errors (e.g.,
WSAEINTR/WSAEINPROGRESS) correctly. Use conditional compilation for Windows
targets to call the WinSock API (WSAGetLastError) and fallback to the existing
path on non-Windows platforms so behavior is unchanged elsewhere.
| host_msvcrt::getwch() | ||
| } | ||
| #[pyfunction] | ||
| fn getche() -> Vec<u8> { | ||
| let c = unsafe { _getche() }; | ||
| vec![c as u8] | ||
| host_msvcrt::getche() | ||
| } | ||
| #[pyfunction] | ||
| fn getwche() -> String { | ||
| let c = unsafe { _getwche() }; | ||
| char::from_u32(c).unwrap().to_string() | ||
| host_msvcrt::getwche() |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
sed -n '1,180p' crates/host_env/src/msvcrt.rs
printf '\n--- wide-char wrapper sites ---\n'
rg -n "getwch|getwche|putwch|ungetwch|char::from_u32|as u16|_ungetwch" crates/host_env/src/msvcrt.rs crates/vm/src/stdlib/msvcrt.rsRepository: RustPython/RustPython
Length of output: 5075
The wide-char wrappers incorrectly model UTF-16 console I/O as Unicode scalars.
crates/host_env/src/msvcrt.rs treats _getwch, _getwche, _putwch, and _ungetwch as operating on Unicode code points via char, but MSVCRT APIs work with UTF-16 code units. _getwch and _getwche return u32 values that char::from_u32() cannot handle when they are UTF-16 surrogate code units (0xD800–0xDFFF). An emoji like 😀 arrives as a surrogate pair; the first half will panic. Similarly, putwch and ungetwch truncate char to u16/u32, losing supplementary plane input entirely.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/vm/src/stdlib/msvcrt.rs` around lines 46 - 54, The wide-char wrappers
(getwch, getche, getwche) currently treat MSVCRT UTF-16 code units as Rust char
and can panic on surrogate halves; update the plumbing to handle UTF-16 code
units instead of Unicode scalars: change the host_msvcrt side bindings for
_getwch/_getwche/_putwch/_ungetwch to use u16/u32 (i.e., raw UTF-16 code units),
stop calling char::from_u32 on the returned values, and in the VM wrappers
(getwch, getche, getwche) either return the raw u16 code units (Vec<u16>) or
safely decode sequences using std::char::decode_utf16 to produce a String
without panicking (handling Err by using replacement U+FFFD); likewise, when
sending to putwch/ungetwch, accept UTF-16 code units (u16/u32) rather than
truncating Rust char.
Summary by CodeRabbit