Conversation
mythical-fred
left a comment
There was a problem hiding this comment.
Two blockers: (1) two unsafe blocks in reader.rs missing // SAFETY: comments (see inline), and (2) docs.feldera.com/docs/operations/metrics.md should be updated with entries for the four new range filter metrics (range_filter_size_bytes, range_filter_hits_count, range_filter_misses_count, range_filter_hit_rate_percent). Looking at past PRs (e.g., the transaction metrics PR), the pattern is a manual edit to metrics.md alongside the code change.
27cc389 to
e987d2e
Compare
|
some benchmarking revealed that this optimization alone does not help much for e.g., delta lake connectors because keys get ingested mostly at random (maybe some z ordering or liquid clustering stuff would help but who knows) it helps for connectors that ingest in semi linear orders (e.g., datagen -- maybe postgres/kafka as well) |
ryzhyk
left a comment
There was a problem hiding this comment.
Looks sane. Thanks!
Given this, we can easily experiment with range-partitioned merges.
mythical-fred
left a comment
There was a problem hiding this comment.
New commits look good — CachePadded for filter stats is a nice improvement. Two unsafe blocks in key_range() still lack // SAFETY: comments; those remain a blocker (see inline).
This is a very cheap filter that can speed up ingest significantly. The idea is to track the min/max for every batch. For seek_key_exact if the value we're seeking is not in side of the range we skip the batch. Some ingest heavy benchmarks: just bloom filter: ╭────────────────────────┬───────────┬──────────┬───────╮ │ Metric │ Value │ Lower │ Upper │ ├────────────────────────┼───────────┼──────────┼───────┤ │ Throughput (records/s) │ 2695496 │ - │ - │ │ Memory │ 14.28 GiB │ 1.78 GiB │ - │ │ Storage │ 79.45 GiB │ 110 B │ - │ │ Uptime [ms] │ 302742 │ - │ - │ │ State Amplification │ 0.43 │ - │ - │ ╰────────────────────────┴───────────┴──────────┴───────╯ range+bloom filter: ╭────────────────────────┬────────────┬──────────┬───────╮ │ Metric │ Value │ Lower │ Upper │ ├────────────────────────┼────────────┼──────────┼───────┤ │ Throughput (records/s) │ 4035088 │ - │ - │ │ Memory │ 23.4 GiB │ 2.42 GiB │ - │ │ Storage │ 112.51 GiB │ 110 B │ - │ │ Uptime [ms] │ 303292 │ - │ - │ │ State Amplification │ 0.41 │ - │ - │ ╰────────────────────────┴────────────┴──────────┴───────╯ Signed-off-by: Gerd Zellweger <mail@gerdzellweger.com>
|
@blp I'd appreciate a short re-review of the part that moves the key range computation into the writer and passes it to the reader when finalizing a batch (as opposed to maintaining min/max throughout the batch building phase) e.g, I'm worried about this part and if it's sane: are there potential problems when the index block is multiple blocks (can this even happen) and would this still work (I think so but best if you have a look). |
|
on the plus side, we found an unrelated bug in FileWSet by adding some more tests to this |
We add stats for the range filter. This lead to some refactoring: Since we now have two filters (with another one on the way) we consolidate the stats into a single struct that can be re-used across filters. It also revealed a performance issue with the current filter stats. Because this function is extremly hot adding the hit and miss atomics in CachePadded led to a 25% increase for the ingest benchmark. Signed-off-by: Gerd Zellweger <mail@gerdzellweger.com>
We didn't load the weight into cursor.diff which caused some newly written tests to fail We did an anlaysis and concluded that this method, while wrong, luckily wasn't used anywhere relevant in real circuits. Signed-off-by: Gerd Zellweger <mail@gerdzellweger.com>
mythical-fred
left a comment
There was a problem hiding this comment.
Checked the new commits ("Better stats for filters", "Fix FileWSet negation") — both look good. The // SAFETY: comment blocker remains: the two unsafe fn key_range() definitions added in reader.rs (one on DataBlock, one on IndexBlock) still have no // SAFETY: comments on the function definitions themselves. The call-site comments in TreeNode::key_range are present, but per Feldera policy each unsafe fn definition needs its own // SAFETY: annotation explaining the invariant. Holding REQUEST_CHANGES until those are added.
mythical-fred
left a comment
There was a problem hiding this comment.
Reconsidering my blocker. I was wrong to hold this up — I checked the rest of reader.rs and no existing unsafe fn definition carries a // SAFETY: doc comment. The convention here is to document at call sites, and the two new call sites in TreeNode::key_range already have the comments (// SAFETY: Unsafe because of serialization). My blocker was inconsistent with the file's established pattern. LGTM.
- Remove legacy maybe_contains_key - Reader no longer "owns" bloom filter directly - Filtering is handled by BatchFilters (which in the future can support multiple different filter impls) This is supposed to make it easier to customize filters in a follow up PR. Signed-off-by: Gerd Zellweger <mail@gerdzellweger.com>
Describe Manual Test Plan
Tested manually with a few pipelines.
Checklist
Describe Incompatible Changes
No incompatible changes.