substr expression #7898
Polar Signals Profiling Results
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.009x ➖
datafusion / vortex-file-compressed (1.009x ➖, 0↑ 0↓)
File Sizes: PolarSignals Profiling
No file size changes detected.
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.986x ➖, 1↑ 0↓)
datafusion / vortex-compact (1.002x ➖, 1↑ 1↓)
datafusion / parquet (1.020x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.974x ➖, 1↑ 0↓)
duckdb / vortex-compact (0.998x ➖, 0↑ 0↓)
duckdb / parquet (0.994x ➖, 0↑ 0↓)
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
File Sizes: FineWeb NVMe
No file size changes detected.
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.040x ➖, 0↑ 1↓)
datafusion / vortex-compact (1.033x ➖, 0↑ 0↓)
datafusion / parquet (1.045x ➖, 0↑ 3↓)
datafusion / arrow (1.054x ➖, 0↑ 3↓)
duckdb / vortex-file-compressed (1.046x ➖, 0↑ 2↓)
duckdb / vortex-compact (1.041x ➖, 0↑ 2↓)
duckdb / parquet (1.020x ➖, 0↑ 2↓)
duckdb / duckdb (1.030x ➖, 0↑ 0↓)
File Size Changes (9 files changed, -0.0% overall, 3↑ 6↓)
File Sizes: TPC-H SF=1 on NVME
No file size changes detected.
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.966x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.976x ➖, 3↑ 1↓)
datafusion / parquet (0.966x ➖, 3↑ 0↓)
duckdb / vortex-file-compressed (0.991x ➖, 3↑ 1↓)
duckdb / vortex-compact (0.991x ➖, 2↑ 2↓)
duckdb / parquet (0.991x ➖, 0↑ 1↓)
duckdb / duckdb (0.981x ➖, 4↑ 0↓)
File Size Changes (6 files changed, +0.0% overall, 3↑ 3↓)
File Sizes: TPC-DS SF=1 on NVME
No file size changes detected.
Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (1.684x ❌, 0↑ 7↓)
datafusion / vortex-compact (1.009x ➖, 0↑ 0↓)
datafusion / parquet (1.055x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.028x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.036x ➖, 0↑ 1↓)
duckdb / parquet (1.014x ➖, 0↑ 0↓)
Benchmarks: Random Access
Vortex (geomean): 0.873x ✅
unknown / unknown (0.917x ➖, 18↑ 1↓)
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.141x ❌, 0↑ 18↓)
datafusion / vortex-compact (1.133x ❌, 0↑ 18↓)
datafusion / parquet (1.121x ❌, 0↑ 17↓)
datafusion / arrow (1.175x ❌, 0↑ 21↓)
duckdb / vortex-file-compressed (1.139x ❌, 0↑ 12↓)
duckdb / vortex-compact (1.144x ❌, 0↑ 10↓)
duckdb / parquet (1.052x ➖, 0↑ 1↓)
duckdb / duckdb (1.069x ➖, 0↑ 3↓)
File Size Changes (27 files changed, +0.0% overall, 14↑ 13↓)
File Sizes: TPC-H SF=10 on NVME
No file size changes detected.
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.122x ❌, 0↑ 6↓)
duckdb / vortex-compact (1.116x ❌, 0↑ 8↓)
duckdb / parquet (1.081x ➖, 0↑ 4↓)
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
File Sizes: Statistical and Population Genetics
No file size changes detected.
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.911x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.989x ➖, 0↑ 1↓)
datafusion / parquet (1.008x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.987x ➖, 0↑ 1↓)
duckdb / vortex-compact (0.980x ➖, 0↑ 0↓)
duckdb / parquet (0.924x ➖, 0↑ 0↓)
Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.990x ➖, 1↑ 1↓)
datafusion / parquet (0.987x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.985x ➖, 1↑ 1↓)
duckdb / parquet (0.993x ➖, 0↑ 0↓)
duckdb / duckdb (1.000x ➖, 0↑ 0↓)
File Size Changes (105 files changed, +0.0% overall, 59↑ 46↓)
File Sizes: Clickbench on NVME
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Benchmarks: Compression
Vortex (geomean): 1.001x ➖
unknown / unknown (1.006x ➖, 1↑ 3↓)
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy confidence)
datafusion / vortex-file-compressed (0.989x ➖, 1↑ 1↓)
datafusion / vortex-compact (0.913x ➖, 1↑ 0↓)
datafusion / parquet (0.901x ➖, 2↑ 0↓)
duckdb / vortex-file-compressed (1.102x ➖, 0↑ 2↓)
duckdb / vortex-compact (1.098x ➖, 0↑ 1↓)
duckdb / parquet (1.084x ➖, 0↑ 0↓)
No speedup, closing
let arrow_array = string_arr.execute_arrow(Some(&DataType::Utf8), ctx)?;
let result = arrow_string::substring::substring(arrow_array.as_ref(), start - 1, length)?;
from_arrow_array_with_len(result.as_ref(), len, nullable)
This is likely very slow. We can do this via a VarBinView offset rewrite instead, and we should apply push-down to each encoding type.
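To make the offset-rewrite idea concrete, here is a minimal sketch under stated assumptions. The `View` enum, `MAX_INLINED`, `bytes_of`, and `substr_view` are hypothetical stand-ins written for this illustration, not the Vortex `BinaryView` API. The sketch assumes 0-based byte offsets that the caller has already aligned to UTF-8 character boundaries, and a 12-byte inline limit as in the Arrow view layout. The point is that a long substring only needs a new length, prefix, and offset, while the data buffer stays untouched.

```rust
/// Largest string stored directly inside a view (assumed, as in Arrow's view layout).
const MAX_INLINED: usize = 12;

/// Simplified stand-in for a string view (hypothetical, not the Vortex type).
#[derive(Debug, Clone, PartialEq)]
enum View {
    /// Short strings live inside the view itself.
    Inlined(Vec<u8>),
    /// Longer strings point into a shared data buffer.
    Ref { len: u32, prefix: [u8; 4], buffer_index: u32, offset: u32 },
}

/// Resolve a view to the bytes it denotes.
fn bytes_of<'a>(view: &'a View, buffers: &'a [Vec<u8>]) -> &'a [u8] {
    match view {
        View::Inlined(b) => b.as_slice(),
        View::Ref { len, buffer_index, offset, .. } => {
            let start = *offset as usize;
            &buffers[*buffer_index as usize][start..start + *len as usize]
        }
    }
}

/// Substring by rewriting the view. `start` is a 0-based byte offset and
/// `length == None` means "to the end"; out-of-range bounds are clamped.
fn substr_view(view: &View, buffers: &[Vec<u8>], start: usize, length: Option<usize>) -> View {
    let bytes = bytes_of(view, buffers);
    let begin = start.min(bytes.len());
    let end = match length {
        Some(n) => begin.saturating_add(n).min(bytes.len()),
        None => bytes.len(),
    };
    let new_len = end - begin;
    match view {
        // Still too long to inline: keep the buffer, shift the offset,
        // and recompute the 4-byte prefix. No string data is copied.
        View::Ref { buffer_index, offset, .. } if new_len > MAX_INLINED => {
            let mut prefix = [0u8; 4];
            prefix.copy_from_slice(&bytes[begin..begin + 4]);
            View::Ref {
                len: new_len as u32,
                prefix,
                buffer_index: *buffer_index,
                offset: *offset + begin as u32,
            }
        }
        // Short result: copy at most MAX_INLINED bytes into the view.
        _ => View::Inlined(bytes[begin..end].to_vec()),
    }
}
```

In this sketch the cost per element is constant apart from the small inline copy, which is the property the comment is after when it contrasts the rewrite with a round trip through Arrow.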
Merging this PR will improve performance by 23.42%

| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | baseline_eq[4, 65536] | 237.9 µs | 185.1 µs | +28.54% |
| ⚡ | Simulation | baseline_lt[16, 65536] | 275.5 µs | 217.7 µs | +26.57% |
| ⚡ | Simulation | baseline_lt[4, 65536] | 253.2 µs | 201 µs | +25.94% |
| ⚡ | Simulation | baseline_eq[16, 65536] | 260.5 µs | 230 µs | +13.23% |
Comparing myrrc/substr (a284482) with develop (84a4a3f)
Benchmarks: Appian on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.988x ➖, 0↑ 0↓)
datafusion / parquet (0.994x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.996x ➖, 0↑ 0↓)
duckdb / parquet (1.006x ➖, 0↑ 0↓)
duckdb / duckdb (0.998x ➖, 0↑ 0↓)
File Size Changes (4 files changed, -0.1% overall, 1↑ 3↓)
    BinaryView::new_inlined(&buf[new_offset..new_offset + new_len])
} else {
    let buf = array.buffer(r.buffer_index as usize);
    let prefix: [u8; 4] = buf[new_offset..new_offset + 4]
Worth looking at arrow-rs here. You can delegate to arrow-rs, or at least compare against it as a baseline: substring(array, start: u64, length: Option).
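A baseline comparison of this kind can be set up as a differential check. The sketch below is an assumption-laden illustration: `naive_substr` and `agrees_with_baseline` are hypothetical names, and the naive byte-slicing function merely stands in for whatever reference is chosen (in practice that would be the arrow-rs kernel mentioned above). Offsets are 0-based bytes and out-of-range bounds are clamped, which may differ from the reference's real semantics.

```rust
/// Naive reference: slice one byte string with clamped bounds.
/// `start` is a 0-based byte offset; `length == None` means "to the end".
fn naive_substr(s: &[u8], start: usize, length: Option<usize>) -> &[u8] {
    let begin = start.min(s.len());
    let end = match length {
        Some(n) => begin.saturating_add(n).min(s.len()),
        None => s.len(),
    };
    &s[begin..end]
}

/// Hypothetical helper: returns true when `candidate` produces the same
/// bytes as the naive reference for every input string.
fn agrees_with_baseline<F>(
    candidate: F,
    inputs: &[&[u8]],
    start: usize,
    length: Option<usize>,
) -> bool
where
    F: Fn(&[u8], usize, Option<usize>) -> Vec<u8>,
{
    inputs
        .iter()
        .all(|s| candidate(*s, start, length) == naive_substr(*s, start, length))
}
```

Running the same inputs through both paths catches off-by-one errors in offset handling (such as the `start - 1` conversion) before any timing comparison is made.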
No description provided.