[dbsp, adapters] Transactional bootstrapping#6223
94add9f to d33c656
mihaibudiu
left a comment
Can't say I spotted the bug that is being fixed...
self.last_checkpoint_sync(),
self.replaying(),
self.circuit.bootstrap_in_progress(),
self.controller.status.bootstrap_in_progress(),
Any reason not to reuse `self.bootstrapping`?
If there is one, maybe this needs an explanation.
Added a comment explaining this.
mythical-fred
left a comment
Approving — solid set of fixes to bootstrapping, well-tested.
The four threads are easy to follow individually and each has matching test coverage: the PK regression test in `replay_tests.rs::regression1`, the new `test_integrate_trace_bootstrap_is_transactional` / `test_accumulate_trace_bootstrap_is_transactional` cases for splitter-style replay, the `is_commit_complete()` tightening in `CircuitHandle`, and the `previously_non_empty_views` assertion in the TPC-H segment test for the #6091 snapshot race.
A few small nits, none blocking; please clean up when convenient:

- `crates/adapters/src/controller.rs`: in the new `bootstrapping` handling, the `else` branch calls `self.circuit.bootstrap_in_progress()` again instead of reusing the local `bootstrapping` you just captured a few lines above. Minor, but the duplication invites future drift.
- `crates/dbsp/src/circuit/circuit_builder.rs`: the comment in `register_replay_stream` calls the change a "workaround" for the input-integral-plus-downstream-integral case. Worth filing a tracking issue to actually de-duplicate the integrals so we don't rely on registration order forever (and so a future refactor that reorders calls doesn't silently break replay).
- `crates/dbsp/src/circuit/replay_tests.rs::regression1`: two commented-out lines (`//let input_stream = input_stream.apply_owned(...)` and `//let output = output_handle2.take_from_all()...`) should be deleted before merge.
- `python/tests/workloads/test_tpch.py`: the 4-line `# log("Waiting for outputs to flush")` block near the end is commented out; remove it or add a comment explaining why it's parked.
- `crates/dbsp/src/circuit/schedule/dynamic_scheduler.rs`: initializing `transaction_phase` to `CommitComplete` works, but the name now reads oddly for "no transaction has ever started." Consider renaming the variant to something neutral (e.g. `Inactive`) in a follow-up.
- Commit body of "Fix bootstrapping of tables with a PK": typo, `simle` -> `simple`.
Doc/changelog boxes are unchecked, but I don't see anything user-visible here (the removed `set_replay_step_size`/`get_replay_step_size` are internal Rust APIs on `DBSPHandle`), so I'd leave that to your judgment.
The "TODO: run the test as part of QA" in the PR description — please file that as a follow-up issue so it doesn't get forgotten.
mythical-fred
left a comment
Adding the nits inline as I should have done in the first review — same content, properly anchored.
Box::new(replay_stream.clone()),
);
// If a replay source already exists, don't overwrite it. This normally shouldn't
// happen as we should not have more than one integral for each stream. One situation
The comment correctly calls this a workaround — could you file a follow-up issue to actually de-duplicate the integrals? Relying on registration order is fragile; a future refactor that swaps the order of input_upsert vs. the downstream join/aggregate construction would silently break replay again, and the symptom ("not materialized") is misleading.
handles: JoinSet::new(),
waiting: false,
transaction_phase: TransactionPhase::Idle,
transaction_phase: TransactionPhase::CommitComplete,
Initializing to CommitComplete reads as "a commit just finished" when in fact no transaction has ever started. Consider renaming the variant to something neutral like Inactive (or NoTransaction) in a follow-up — semantics are fine, just the name.
pipeline.pause()

# log("Waiting for outputs to flush")
Four-line commented-out wait_for_completion block — either delete or add a comment explaining why it's parked, otherwise the next person to touch this file won't know whether to revive it.
blp
left a comment
Thank you!
I'm going to have to try the TPC-H checkpoint mode. I didn't know about that.
41d9eb6 to 7148c81
This fixes an issue when bootstrapping a table with a PK when there is a downstream operator attached to it that creates an integral of the same table. We ended up with two integrals that can both be used to replay the same stream, one with and one without an accumulator.

We used to replay from the last registered replay source, which meant that if the second integral was added in the modified version of the program, it was empty and replay failed, despite the fact that the input integral could be used for replay. To make things worse, we report this error as the input table not being materialized, which is simply wrong.

This commit adds a simle workaround that uses the first registered replay source (by refusing to register new replay sources for the same stream) and a regression test for this.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
Fixes #4736

Bootstrapping used to be performed in a sequence of transactions. Depending on the program, this could be inefficient due to redundant recomputation. In addition, this produced multiple small batches of potentially mutually canceling outputs. We now have all the infra needed to change this.

This commit changes the Z1Trace and AccumulateZ1Trace operators to behave as splitters, i.e., they replay their entire contents across multiple steps within the same transaction.

We also get rid of the `replay_step_size` knob. Instead we use the existing `splitter_chunk_size` setting, which controls the number of records produced by splitters per step.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
Improve TPC-H test in `checkpoint` mode, which can be used to torture-test bootstrapping:

- Configurable number and size of test segments. With these new options we can scale the test up and down using the same dataset (typically TPC-H).
- Check that views are initialized after bootstrapping.

Example: split views into 2 segments, 10M records each:

```
uv run test_tpch.py --s3-bucket feldera-qa-data --s3-prefix tpc-h-100 --s3-region us-west-1 --mode checkpoint --num-segments 2 --segment-size 10000000
```

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
Remove redundant `TransactionPhase::Idle` status. Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
Bootstrapping is considered complete when all replay sources are complete and the bootstrapping transaction has committed. The latter condition was missing. This did not cause any issues because we only called this function between transactions.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
Fix #6091.

Views that don't participate in bootstrapping don't update their snapshots until the first post-bootstrap transaction. As a result, a client making ad hoc queries right after bootstrapping completes could observe empty views.

We fix this by:

1. Forcing an extra transaction after bootstrapping completes (this was already the case).
2. Maintaining `bootstrap_in_progress` status until the extra transaction commits.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
Don't update the snapshot until bootstrapping is complete (including the additional post-bootstrap transaction). This guarantees that:

1. Ad hoc queries observe a consistent snapshot of all views.
2. Connectors configured with `send_snapshot=true` don't receive empty snapshots.

Extended the Python output snapshot test with an extra step that fails without this fix.

Signed-off-by: Leonid Ryzhyk <ryzhyk@gmail.com>
7148c81 to dffc5b1
Fix #6091, #4736
Several important improvements to the bootstrapping mechanism:
See individual commits.