Conversation
Merging this PR will not alter performance
Comparing Footnotes
|
|
The second traceback is about the (possibly) same issue in kanal while writing a file. |
044dfff to
ade6f6c
Compare
|
I believe it's a false positive in the end, muting because the CI is flaky |
ade6f6c to
7b6cd59
Compare
|
|
|
I've tried ignoring sanitization directly on the function, but it still triggers TSan, let's keep the suppression list: diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 44b70f68e..6ca9d1c7a 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -404,7 +404,7 @@ jobs:
ASAN_SYMBOLIZER_PATH: "/usr/bin/llvm-symbolizer"
MSAN_OPTIONS: "symbolize=1"
MSAN_SYMBOLIZER_PATH: "/usr/bin/llvm-symbolizer"
- TSAN_OPTIONS: "symbolize=1:suppressions=$GITHUB_WORKSPACE/vortex-ffi/tsan_suppressions.txt"
+ TSAN_OPTIONS: "symbolize=1"
TSAN_SYMBOLIZER_PATH: "/usr/bin/llvm-symbolizer"
VORTEX_SKIP_SLOW_TESTS: "1"
# -Cunsafe-allow-abi-mismatch=sanitizer: libraries like compiler_builtins
diff --git a/vortex-ffi/tsan_suppressions.txt b/vortex-ffi/tsan_suppressions.txt
deleted file mode 100644
index be979b9de..000000000
--- a/vortex-ffi/tsan_suppressions.txt
+++ /dev/null
@@ -1,4 +0,0 @@
-# https://github.com/vortex-data/vortex/pull/7244
-# This is likely a false positive in TSan for oneshot::channel and kanal
-# where ordering is correct but uses an explicit fence and not relaxed load
-race:oneshot-*/src/channel.rs
diff --git a/vortex-io/build.rs b/vortex-io/build.rs
new file mode 100644
index 000000000..59dbbc366
--- /dev/null
+++ b/vortex-io/build.rs
@@ -0,0 +1,13 @@
+use std::process::Command;
+
+fn main() {
+ println!("cargo:rerun-if-changed=build.rs");
+ let is_nightly = Command::new("rustc")
+ .arg("-V")
+ .output()
+ .map(|output| String::from_utf8_lossy(&output.stdout).contains("nightly"))
+ .unwrap_or(false);
+ if is_nightly {
+ println!("cargo::rustc-cfg=vortex_nightly");
+ }
+}
diff --git a/vortex-io/src/lib.rs b/vortex-io/src/lib.rs
index f30f64ff3..8f0a584ff 100644
--- a/vortex-io/src/lib.rs
+++ b/vortex-io/src/lib.rs
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: Apache-2.0
// SPDX-FileCopyrightText: Copyright the Vortex contributors
+#![cfg_attr(vortex_nightly, feature(sanitize))]
//! Core traits and implementations for asynchronous IO.
//!
diff --git a/vortex-io/src/runtime/handle.rs b/vortex-io/src/runtime/handle.rs
index f04b897b2..25707e55c 100644
--- a/vortex-io/src/runtime/handle.rs
+++ b/vortex-io/src/runtime/handle.rs
@@ -1,6 +1,5 @@
// SPDX-License-Identifier: Apache-2.0
// SPDX-FileCopyrightText: Copyright the Vortex contributors
-
use std::pin::Pin;
use std::sync::Arc;
use std::sync::Weak;
@@ -165,6 +164,10 @@ impl<T> Task<T> {
impl<T> Future for Task<T> {
type Output = T;
+ // https://github.com/vortex-data/vortex/pull/7244
+ // This is likely a false positive in TSan for oneshot::channel and kanal
+ // where ordering is correct but uses an explicit fence and not relaxed load
+ #[cfg_attr(vortex_nightly, sanitize(thread = "off"))]
#[allow(clippy::panic)]
fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
match ready!(self.recv.poll_unpin(cx)) { |
7b6cd59 to
f14f0da
Compare
f14f0da to
13a731b
Compare
There are TSan data race reports like
https://github.com/vortex-data/vortex/actions/runs/23804465237/job/69373850775?pr=7018
Significant parts of stack trace
which happens in sink FFI tests but can be narrowed down to
The underlying issue is that RUNTIME in FFI is a smol::Executor, and as we don't
have a background thread pool, and we use CurrentThreadRuntime, it runs on host
program's threads, in our case,
cargo test's. If t1 and t2 are scheduled torun on different threads, this triggers TSan in vortex-io's spawn() (case 1,
oneshot) and in vortex-file write_internal (case 2, kanal::bounded_async) which
don't protect multiple different thread consumers to read the state.
The solution here is to benchmark replacing oneshot with futures-rs oneshot
channel