Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
[dbsp] Omit unneeded index bounds in storage layer files when not nee…
…ded.

Columns 1 (and greater) in a layer file are essentially B-trees over tuples
(R,K), where R is the row number and K is the key.  Each row in column 0
includes as auxiliary data a range R0..R1 of rows in column 1 (the "row
group"), so R in the B-tree allows that range to be looked up.  Storing
the minimum and maximum keys of child blocks in the index allows efficient
lookup of keys in row groups that span multiple data blocks.

However, when a row group starts at the beginning of a data block, there's
little value in including that key in its index block, because it won't
usually help us to avoid reading the data block.  It only helps if we're
searching for a key less than the minimum, which is rarely the case; more
often, we're iterating through the keys, and usually we're doing nested
iteration through the column-0 keys too, so that we'd typically have to
read the data block for another one of those anyhow.

Similarly, when a row group ends at the end of a data block, there's little
value in including its last key in the index block.

This can be a significant optimization when keys in column 1 are large.

In layer files where each row in column 0 has exactly one row in column 1,
this optimization means that the column-0 index will have no bounds at all.
Other layer files will benefit less.

Signed-off-by: Ben Pfaff <blp@feldera.com>
  • Loading branch information
blp committed Dec 24, 2025
commit 82f2af9fa404ca92a2d213e7660acbc69d4ab78f
23 changes: 21 additions & 2 deletions crates/dbsp/src/storage/file/format.rs
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,18 @@
//! [`IndexBlockHeader::bound_map_offset`] and the format in
//! [`IndexBlockHeader::bound_map_varint`].
//!
//! <a name="omitted-bounds">Omitted bounds</a>: Columns 1 and greater may
//! omit bounds, in which case those bounds are set to 0. A min-bound (the
//! first in each pair) may be omitted for a given child if the first row in
//! that child is the first row in its row group. A max-bound (the second in
//! each pair) may be omitted for a given child if the last row in that child
//! is the last row in its row group. If any bounds are omitted then
//! [FileTrailer::incompatible_features] must include
//! [FileTrailer::OMITTED_BOUNDS].
//!
//! Column 0 may not omit bounds (it consists of a single row group so it
//! could only omit the very first and very last bound in any case).
//!
//! * An array of "row totals", one for each of
//! [`IndexBlockHeader::n_children`]. The first row total is the total number
//! of rows in the first child tree, the second row total is that plus the
Expand Down Expand Up @@ -181,11 +193,18 @@ pub struct FileTrailer {
///
/// If any of these bits are set, the version number must be at least 3.
///
/// No incompatible features are currently defined. This bitmap is for
/// future expansion.
/// [FileTrailer::OMITTED_BOUNDS] is the only current incompatible feature.
pub incompatible_features: u64,
}

impl FileTrailer {
/// Must be set in [FileTrailer::incompatible_features] if any bounds are
/// [omitted].
///
/// [omitted]: crate::storage::file::format#omitted-bounds
pub const OMITTED_BOUNDS: u64 = 1;
}

/// Information about a column.
///
/// Embedded inside the [`FileTrailer`] block.
Expand Down
58 changes: 39 additions & 19 deletions crates/dbsp/src/storage/file/reader.rs
Original file line number Diff line number Diff line change
Expand Up @@ -325,6 +325,10 @@ pub enum CorruptionError {
/// Invalid filter block location.
#[error("Invalid file block location ({0}).")]
InvalidFilterLocation(InvalidBlockLocation),

/// Missing bounds for row group that spans data block.
#[error("Missing bounds for row group that spans data block.")]
MissingBounds,
}

/// Reader for an array of [Varint]s in a storage file.
Expand Down Expand Up @@ -1069,10 +1073,15 @@ where
Err(CorruptionError::MissingRow(row).into())
}

unsafe fn get_bound(&self, index: usize, bound: &mut K) {
unsafe fn get_bound<'a>(&self, index: usize, bound: &'a mut K) -> Option<&'a mut K> {
unsafe {
let offset = self.bounds.get(&self.raw, index) as usize;
bound.deserialize_from_bytes(&self.raw, offset)
if offset != 0 {
bound.deserialize_from_bytes(&self.raw, offset);
Some(bound)
} else {
None
}
}
}

Expand Down Expand Up @@ -1102,12 +1111,16 @@ where
let row = self.get_row_bound(mid) + self.first_row;
let cmp = match range_compare(target_rows, row) {
Equal => {
self.get_bound(mid, bound);
let cmp = compare(bound);
if cmp == Equal {
return Some(mid / 2);
if let Some(bound) = self.get_bound(mid, bound) {
match compare(bound) {
Equal => return Some(mid / 2),
cmp => cmp,
}
} else if mid % 2 == 1 {
Less
} else {
Greater
}
cmp
}
cmp => cmp,
};
Expand Down Expand Up @@ -1147,8 +1160,10 @@ where
let mut end = self.n_children();
while *start < end {
let mid = start.midpoint(end);
self.get_bound(mid * 2, tmp_key);
if &targets[start_index] < tmp_key {
if self
.get_bound(mid * 2, tmp_key)
.is_some_and(|bound| &targets[start_index] < bound)
{
end = mid;
} else {
*start = mid + 1;
Expand All @@ -1170,15 +1185,16 @@ where
}

/// Returns the comparison of the largest bound key using `compare`.
unsafe fn compare_max<C>(&self, key_factory: &dyn Factory<K>, compare: &C) -> Ordering
unsafe fn compare_max<C>(&self, key_factory: &dyn Factory<K>, compare: &C) -> Option<Ordering>
where
C: Fn(&K) -> Ordering,
{
unsafe {
let mut ordering = Equal;
let mut ordering = None;
key_factory.with(&mut |key| {
self.get_bound(self.n_children() * 2 - 1, key);
ordering = compare(key);
ordering = self
.get_bound(self.n_children() * 2 - 1, key)
.map(|key| compare(key));
});
ordering
}
Expand Down Expand Up @@ -1527,11 +1543,12 @@ where
);
}

if file_trailer.incompatible_features != 0 {
return Err(CorruptionError::UnsupportedIncompatibleFeatures(
file_trailer.incompatible_features,
)
.into());
let unsupported_features =
file_trailer.incompatible_features & !FileTrailer::OMITTED_BOUNDS;
if unsupported_features != 0 {
return Err(
CorruptionError::UnsupportedIncompatibleFeatures(unsupported_features).into(),
);
}

assert_eq!(factories.len(), file_trailer.columns.len());
Expand Down Expand Up @@ -2608,7 +2625,10 @@ where
// `index_block` and the greatest value under `index_block` is less
// than the target.
if rows.end > index_block.rows().end
&& index_block.compare_max(row_group.factories.key_factory, compare) == Greater
&& index_block
.compare_max(row_group.factories.key_factory, compare)
.ok_or(CorruptionError::MissingBounds)?
== Greater
{
rows.start = index_block.rows().end;
continue;
Expand Down
10 changes: 10 additions & 0 deletions crates/dbsp/src/storage/file/test.rs
Original file line number Diff line number Diff line change
Expand Up @@ -692,6 +692,16 @@ fn two_columns_max_branch_2_uncompressed() {
);
}

#[test]
fn two_columns_max_branch_3_uncompressed() {
init_test_logger();
test_2_columns_helper(
Parameters::default()
.with_max_branch(3)
.with_compression(None),
);
}

#[test]
fn two_columns_max_branch_2_snappy() {
init_test_logger();
Expand Down
Loading