Skip to content

Significantly speed up BeEquivalentTo for large unordered collections#3188

Merged
dennisdoomen merged 6 commits intofluentassertions:mainfrom
dennisdoomen:perf/speed-up-be-equivalent-to
Apr 12, 2026
Merged

Significantly speed up BeEquivalentTo for large unordered collections#3188
dennisdoomen merged 6 commits intofluentassertions:mainfrom
dennisdoomen:perf/speed-up-be-equivalent-to

Conversation

@dennisdoomen
Copy link
Copy Markdown
Member

@dennisdoomen dennisdoomen commented Apr 6, 2026

Summary

This PR significantly improves the performance of BeEquivalentTo when comparing large unordered collections (using WithoutStrictOrdering()). Focus benchmark: 500 complex objects vs 500 completely different objects.

The work is split across five focused commits.

Changes

1. Replace O(n!) permutation search with O(n^2 log n) greedy assignment (Phase 3)

For n > 8 subjects, the original brute-force permutation search was replaced with a greedy strategy: compute all n^2 failure-count pairs, sort ascending, pick closest unassigned pair. n <= 8 keeps the exact search.

Also extracts ReferentialComparer into its own file and updates the benchmark target.

2. Pre-cache collection index strings 0-1023

AsCollectionItem<T>(int index) called index.ToString() on every invocation. A static string[] cache for indices 0-1023 eliminates millions of small string allocations.

3. Skip failure message formatting in dry-run comparisons

Phases 1 and 2 only need to count failures, not format them. Three-part change:

  • AssertionScope: new UseDryRun flag, RegisterFailure() counter, GetFailureCount() method
  • AssertionChain: when UseDryRun is set, call RegisterFailure() instead of formatting
  • LooselyOrderedEquivalencyStrategy: new TryToMatchCount() for counting; split caches (countCache for ints, fullFailuresCache for strings); FindClosestMismatches takes separate count/full-failures delegates

Eliminates ~3.25M FailureMessageFormatter calls and ~6.5M Regex.Replace calls from Phases 1+2.

4. Clone CyclicReferenceDetector per dry-run comparison

Dry-runs shared the parent context's CyclicReferenceDetector. When the same object appears in multiple positions, the detector incorrectly flagged subsequent dry-runs as cyclic -- causing wrong counts and stack overflows. Fix: clone the detector for each dry-run call. Exposes the setter as internal.

5. Include nested scope failures in dry-run failure count

GetFailureCount() was missing failures from nested assertion scopes (they go into assertionStrategy.FailureMessages not the local counter). Fix: return failureCount + assertionStrategy.FailureMessages.Count(). Also call scope.Discard() after counting to prevent leaking into the outer scope.

Combined effect

Bottleneck Before After
Phase 3 algorithm O(n!) permutations O(n^2 log n) greedy (n > 8)
FailureMessageFormatter (Phases 1+2) ~3.25M calls 0
Regex.Replace (Phases 1+2) ~6.5M calls 0
Index ToString allocations ~3.25M Static cache 0-1023
CyclicReferenceDetector Shared, unbounded Cloned per dry-run

Microbenchmark BeEquivalentToWithDeeplyNestedStructures

Before

Method Job Runtime N Depth Mean Error StdDev Gen0 Gen1 Gen2 Allocated
BeEquivalentTo .NET 8.0 .NET 8.0 1 6 39.20 us 0.775 us 1.634 us 9.2773 0.4883 - 116.18 KB
BeEquivalentTo .NET Framework 4.7.2 .NET Framework 4.7.2 1 6 66.98 us 1.313 us 1.512 us 20.3857 1.0986 - 125.65 KB
BeEquivalentTo .NET 8.0 .NET 8.0 8 6 283.77 us 5.666 us 13.243 us 68.3594 5.8594 - 838.52 KB
BeEquivalentTo .NET Framework 4.7.2 .NET Framework 4.7.2 8 6 482.34 us 6.367 us 5.956 us 146.4844 14.1602 - 901.14 KB
BeEquivalentTo .NET 8.0 .NET 8.0 100 6 3,410.08 us 44.714 us 41.826 us 843.7500 277.3438 - 10337.69 KB
BeEquivalentTo .NET Framework 4.7.2 .NET Framework 4.7.2 100 6 6,046.85 us 95.444 us 89.279 us 1804.6875 398.4375 - 11099.02 KB
BeEquivalentTo .NET 8.0 .NET 8.0 500 6 17,756.40 us 324.924 us 333.673 us 4187.5000 31.2500 - 51622.43 KB
BeEquivalentTo .NET Framework 4.7.2 .NET Framework 4.7.2 500 6 30,578.22 us 560.673 us 524.454 us 9000.0000 312.5000 125.0000 55414.37 KB

After

Method Job Runtime N Depth Mean Error StdDev Gen0 Gen1 Gen2 Allocated
BeEquivalentTo .NET 8.0 .NET 8.0 1 6 35.61 us 0.707 us 1.380 us 9.0332 0.4883 - 111.83 KB
BeEquivalentTo .NET Framework 4.7.2 .NET Framework 4.7.2 1 6 60.76 us 1.203 us 1.181 us 19.5313 0.9766 - 120.7 KB
BeEquivalentTo .NET 8.0 .NET 8.0 8 6 245.86 us 4.754 us 7.943 us 64.4531 5.8594 - 803.77 KB
BeEquivalentTo .NET Framework 4.7.2 .NET Framework 4.7.2 8 6 429.76 us 5.432 us 5.081 us 140.1367 11.7188 - 861.65 KB
BeEquivalentTo .NET 8.0 .NET 8.0 100 6 3,039.03 us 40.167 us 37.572 us 804.6875 113.2813 - 9906.72 KB
BeEquivalentTo .NET Framework 4.7.2 .NET Framework 4.7.2 100 6 5,232.71 us 100.977 us 108.045 us 1726.5625 179.6875 - 10608.83 KB
BeEquivalentTo .NET 8.0 .NET 8.0 500 6 15,162.26 us 248.119 us 232.090 us 4031.2500 546.8750 31.2500 49477.68 KB
BeEquivalentTo .NET Framework 4.7.2 .NET Framework 4.7.2 500 6 26,038.15 us 414.944 us 367.837 us 8593.7500 781.2500 - 52979.31 KB

PerformanceSpecs / Compare_complex_collection

Before
37-39 seconds

After
12-13 seconds

@dennisdoomen dennisdoomen marked this pull request as draft April 6, 2026 07:38
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 6, 2026

Qodana for .NET

It seems all right 👌

No new problems were found according to the checks applied

💡 Qodana analysis was run in the pull request mode: only the changed files were checked
☁️ View the detailed Qodana report

Contact Qodana team

Contact us at qodana-support@jetbrains.com

@dennisdoomen dennisdoomen force-pushed the perf/speed-up-be-equivalent-to branch 4 times, most recently from 9973452 to 75ccee2 Compare April 6, 2026 16:01
@dennisdoomen dennisdoomen changed the title perf: significantly speed up BeEquivalentTo for large unordered collections Significantly speed up BeEquivalentTo for large unordered collections Apr 6, 2026
@dennisdoomen dennisdoomen force-pushed the perf/speed-up-be-equivalent-to branch 3 times, most recently from d141c65 to e0e1ef5 Compare April 8, 2026 19:50
dennisdoomen and others added 5 commits April 8, 2026 22:07
…collections

Replace the O(n! * n) permutation search in FindClosestMismatches with a
two-strategy approach:
- For n <= 8 items remaining, keep the exact permutation search (globally
  optimal; factorial(8) = 40,320, well within cost).
- For n > 8, use a greedy O(n^2 log n) assignment that sorts all n^2
  (expectation, subject) pairs by failure count and picks the closest
  unassigned pair first. Ties are broken by expectation index then subject
  index to preserve deterministic, natural-order error messages.

Also extract ReferentialComparer to its own file and bump Reflectify to 1.9.0.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
AsCollectionItem<T>(int) now returns a pre-computed string for indices 0-1023
instead of calling ToString() on every comparison. For large collections this
avoids repeated short-lived string allocations in hot paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add UseDryRun flag to AssertionScope. When set, AssertionChain.FailWith()
increments a counter instead of formatting and storing the full message.
LooselyOrderedEquivalencyStrategy uses a dry-run scope in TryToMatchCount
to count failures cheaply, split caches (countCache for counts, fullFailuresCache
for strings), and updates FindClosestMismatches to take separate getFailureCount
and getFullFailures delegates — deferring string allocation to the winning
assignment only.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…k overflows

Dry-run comparisons in TryToMatchCount and TryToMatch share the parent
context's CyclicReferenceDetector by default. When the same object appears
in multiple positions in an unordered collection, the detector could flag
subsequent dry-runs as cyclic references and skip them — producing wrong
failure counts and potential stack overflows. Fix by cloning the detector
before each dry-run invocation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GetFailureCount() was only counting direct RegisterFailure() calls.
Failures from nested assertion scopes (e.g. via AssertionRuleEquivalencyStep)
are stored in assertionStrategy.FailureMessages rather than the local counter,
so they were invisible to TryToMatchCount — causing mismatches to look like
matches (count=0) and corrupting the optimal assignment.

Also call scope.Discard() after GetFailureCount() so that the dry-run scope
does not propagate its failure messages to the parent scope on disposal.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dennisdoomen dennisdoomen force-pushed the perf/speed-up-be-equivalent-to branch from e0e1ef5 to 4883856 Compare April 8, 2026 20:08
@dennisdoomen dennisdoomen marked this pull request as ready for review April 8, 2026 20:08
@dennisdoomen dennisdoomen added this to the 8.10 milestone Apr 8, 2026
@dennisdoomen dennisdoomen requested a review from jnyrup April 8, 2026 20:16
Copy link
Copy Markdown
Member

@jnyrup jnyrup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - and what a pleasure to review with the focused commits ♥️

Did you do a benchmark on Compare_complex_collection to get before and after cpu/mem numbers?

For later work, I'm wondering if we might be able to utilize UseDryRun more places?

Comment thread Src/FluentAssertions/Equivalency/Steps/LooselyOrderedEquivalencyStrategy.cs Outdated
@dennisdoomen
Copy link
Copy Markdown
Member Author

Looks good - and what a pleasure to review with the focused commits

I'm using Copilot CLI heavily, so thanks to Copilot ;-)

For later work, I'm wondering if we might be able to utilize UseDryRun more places?

Most likely. But haven't investigated that yet.

@dennisdoomen
Copy link
Copy Markdown
Member Author

Did you do a benchmark on Compare_complex_collection to get before and after cpu/mem numbers?

Added all the benchmarks.

The strategy already owns TryToMatchCount and TryToMatch, so keeping the closest-match helper chain static forced delegate plumbing without adding value. Make those helpers instance methods so they can use the existing instance members directly while preserving the matching logic, scoring, and caches.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dennisdoomen dennisdoomen merged commit f429c2e into fluentassertions:main Apr 12, 2026
8 checks passed
@dennisdoomen dennisdoomen deleted the perf/speed-up-be-equivalent-to branch April 12, 2026 13:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants