Improve time complexity of streamMoveIdmpKeys() in asmTriggerBackgroundTrim()#15000
Improve time complexity of streamMoveIdmpKeys() in asmTriggerBackgroundTrim()#15000ShubhamTaple wants to merge 4 commits intoredis:unstablefrom
Conversation
|
I don't know about idmp feature but PR makes sense to me. Btw, how many keys do we expect in this dict? any rough guesses? @sggeorgiev Also, I couldn't find a test with idmp keys & asm (maybe I'm mistaken). Perhaps, we may add it as well. |
|
Looks good to me, the optimization makes sense. Regarding the expected number of keys in |
|
@ShubhamTaple could you please add a test for this? Let's be sure we cover those lines. |
|
@tezc sure I will add a test |
…tag from sflush.json and tcl
|
Changes done in latest commit
Note: The above experimental flags are removed to enable testing |
…IMENTAL tag from sflush.json and tcl" This reverts commit f57ef33. Revert test using sflush
|
|
@sggeorgiev @sundb could you review the tests? I'm not qualified to review it. |
|
@sggeorgiev do we plan to change stream_idmp_keys from dict to kvstore to make it distributed by slot? |
|
No. I don't plan to change it. stream_idmp_keys only holds the few streams actively using IDMP, so N is tiny. Kvstore's wins (per-slot rehashing, defrag, slot scans) need large N to matter, and in non-cluster mode it's just a dict with extra overhead. The only upside — using kvstoreMoveDict in ASM trim — is cosmetic at this scale, and the change would touch SWAPDB, tempDb, RDB load, lazyfree, and ASM paths for no measurable gain. |
Fixes #15001
Problem
In PR #14897
streamMoveIdmpKeys()was implemented such that it takes inint slotas input parameter and iterates overserver.db[0].stream_idmp_keysand moves the keys which belong toslotintostream_idmp_keys.Therefore asmTriggerBackgroundTrim() was calling
streamMoveIdmpKeys()for each slot in trim ranges.Time Complexity
Lets say number of slots be
S(it could be 16,384 in worst case)and let
Kbe number of keys in server.db[0].stream_idmp_keysthen the total time complexity would be
O(S x K)Suggested Fix
we have changed the implementation of streamMoveIdmpKeys() such that it now takes in
slotRangeArray *slotsas an input parameter and iterates overserver.db[0].stream_idmp_keysonce, catching all the keys which fall in theslotsrange and moving them tostream_idmp_keysTherefore asmTriggerBackgroundTrim() would now call
streamMoveIdmpKeys()just once.Time Complexity
say
Ris number of ranges inslotsand letKbe number of keys in server.db[0].stream_idmp_keysthen the total time complexity would be
O(R x K)since
O(R x K)is better thanO(S x K)because number of slots would almost always be greater than number of ranges inslotRangeArray *slots(for a single merged range like
0–16383,R = 1butS = 16 384).Note
Medium Risk
Touches atomic slot migration background trimming and stream IDMP tracking, where mistakes could leave stale IDMP entries on the wrong node or delete/miss entries during slot cleanup. Change is localized but affects correctness under multi-range migrations.
Overview
Improves ASM background trimming performance by changing
streamMoveIdmpKeysto accept aslotRangeArrayand move all matching stream IDMP entries in a single scan instead of being called once per slot.Updates
asmTriggerBackgroundTrim()to invoke this new multi-range version once per trim job, and adds unit tests covering multi-range/disjoint-range migrations to ensure IDMP PID tracking and dedup behavior remain correct on both source and destination.Reviewed by Cursor Bugbot for commit 23c5137. Bugbot is set up for automated code reviews on this repo. Configure here.