Cap metric label cardinality + hoist peer-tag schema; rename to ClientStatsAggregator #11387
Draft
dougqh wants to merge 8 commits into
Conversation
Replaces the per-field DDCache layer inside AggregateEntry with the two new
cardinality handlers. Each per-field handler holds a small HashMap working
set; when its budget is exhausted, subsequent values collapse to a stable
"blocked_by_tracer" sentinel UTF8BytesString rather than growing without
bound. The handlers are reset on the aggregator thread at the end of each
report() cycle (10s default), so the cardinality budget refreshes per
reporting interval.
Caches replaced (limits preserved from the prior DDCache sizes):
RESOURCE_HANDLER 32
SERVICE_HANDLER 32
OPERATION_HANDLER 64
SERVICE_SOURCE_HANDLER 16
TYPE_HANDLER 8
SPAN_KIND_HANDLER 16
HTTP_METHOD_HANDLER 8
HTTP_ENDPOINT_HANDLER 32
GRPC_STATUS_CODE_HANDLER 32
PEER_TAG_HANDLERS per-tag-name TagCardinalityHandler, each 512
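The budgeted-canonicalization idea described above can be sketched as follows. This is a minimal illustration, not the real handler: plain `String` stands in for `UTF8BytesString`, and the constructor/method shapes beyond `register`/`reset` are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a budgeted canonicalizer: interns values up to a fixed
// budget, then collapses everything else to a shared sentinel. Plain String
// stands in for UTF8BytesString; not thread-safe, matching the real handlers'
// single-aggregator-thread contract.
final class PropertyCardinalityHandler {
  static final String SENTINEL = "blocked_by_tracer";

  private final int limit;
  private final Map<String, String> workingSet = new HashMap<>();

  PropertyCardinalityHandler(int limit) {
    this.limit = limit;
  }

  /** Returns the canonical instance for {@code value}, or the sentinel once over budget. */
  String register(String value) {
    String canonical = workingSet.get(value);
    if (canonical != null) {
      return canonical;
    }
    if (workingSet.size() >= limit) {
      return SENTINEL; // over budget: collapse instead of growing without bound
    }
    canonical = value; // the real handler would allocate a UTF8BytesString here
    workingSet.put(value, canonical);
    return canonical;
  }

  /** Called on the aggregator thread at the end of each report() cycle. */
  void reset() {
    workingSet.clear();
  }
}
```

With a limit of 32, registering 50 distinct services yields 32 interned values plus the shared sentinel. After `reset()`, the same content resolves to a freshly interned but content-equal instance, which is why lookups downstream must compare by content rather than identity.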
Two production-only changes to the handlers as the user wrote them:
- Fixed import: datadog.collections.tagmap6lazy.TagMap doesn't exist;
TagCardinalityHandler now imports datadog.trace.api.TagMap which has the
Entry API the handler uses.
- Added TagCardinalityHandler.register(String) overload so AggregateEntry's
peer-tag canonicalization doesn't have to allocate a TagMap.Entry per
call -- the snapshot already carries peer-tag values as a flattened
String[] {name, value, ...}.
AggregateEntry split into two construction paths:
- forSnapshot(snapshot, agg): the hot path; runs each field through the
appropriate handler.
- of(...): test-only factory; bypasses the handlers and creates UTF8
instances directly, so tests don't pollute static handler state. Content-
equality on the resulting entry still matches the production-built one.
Thread-safety: handlers are HashMap-backed and not safe for concurrent
access. Both forSnapshot and resetCardinalityHandlers must be called from
the aggregator thread. After the prior commits that moved MetricKey
construction to the aggregator thread, this is the only thread that
canonicalizes; the test factory path runs on test threads but doesn't
touch the handlers.
Reset semantics: clearing the handler's working set drops the {value ->
UTF8BytesString} mapping but doesn't invalidate existing AggregateEntry
fields -- those keep their UTF8BytesString references alive on their own.
Subsequent snapshots with the same content still resolve to the existing
entries via content-equality matches(). New values after reset get freshly
allocated UTF8BytesStrings via the handler.
Known limitation (not fixed here): hashOf(SpanSnapshot) hashes from the
raw snapshot fields, not from the post-handler canonical form. So when
cardinality is exceeded, multiple distinct raw values that collapse to
the "blocked_by_tracer" sentinel still produce distinct hashes and land
in different AggregateEntry buckets -- the wire payload will carry
multiple rows that are all labeled as blocked. This is the same behavior the
prior DDCache-based design would have had at capacity. Collapsing those
into a single sentinel entry would require canonicalizing before hashing
and is a follow-up.
Tests: new CardinalityHandlerTest covers PropertyCardinalityHandler and
TagCardinalityHandler in isolation (hit/miss, over-limit blocking, reset
behavior, sentinel stability). Existing ConflatingMetricAggregatorTest /
SerializingMetricWriterTest / AggregateTableTest all pass unchanged
because the test factory bypasses handlers.
Benchmarks (2 forks x 5 iter x 15s) -- producer side unchanged because
the handlers live on the consumer thread:
SimpleSpan bench: 3.114 +- 0.045 us/op (prior: 3.123 +- 0.018)
DDSpan bench: 2.364 +- 0.113 us/op (prior: 2.412 +- 0.022)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The prior commit ran every snapshot through the cardinality handlers but still
hashed the raw snapshot fields. When a field exceeded its cardinality budget
the handlers collapsed many distinct values to a single "blocked_by_tracer"
sentinel, but the raw hashes were still all different -- so the blocked entries
fragmented across the AggregateTable. This commit makes hash + match work off
the canonical (post-handler) UTF8BytesString fields, so blocked values land in
the same bucket and merge into one entry.
How the lookup path changes
---------------------------
A new package-private AggregateEntry.Canonical scratch buffer:
- holds the 10 canonical UTF8BytesString refs, primitives, peerTags list,
and the precomputed keyHash;
- exposes populate(SpanSnapshot) which runs each field through the
appropriate handler and computes the long hash from the canonical refs;
- exposes matches(AggregateEntry) for content-equality lookup;
- exposes toEntry(AggregateMetric) which copies its refs into a fresh
AggregateEntry on miss.
AggregateTable holds one Canonical instance and reuses it per findOrInsert.
On a hit nothing is allocated -- the buffer's refs feed the bucket walk and
matches() directly. On a miss the refs are copied into the new entry and the
buffer is overwritten on the next call.
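The reuse-on-hit / copy-on-miss pattern can be sketched with a reduced two-field key. Everything here is illustrative: the real Canonical carries ten `UTF8BytesString` refs plus primitives and the peerTags list, and the real table is hash-bucketed rather than the flat list used below for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Sketch of the reusable-probe-key pattern: one mutable scratch buffer is
// repopulated per lookup, drives content-equality matching, and is only
// copied into an owned key object on a miss.
final class ScratchKeyTable {
  static final class Entry {
    final String service;
    final String operation;

    Entry(String service, String operation) {
      this.service = service;
      this.operation = operation;
    }
  }

  // Scratch buffer fields: populated per lookup, never escape on a hit.
  private String service;
  private String operation;
  private final List<Entry> entries = new ArrayList<>();

  Entry findOrInsert(String rawService, String rawOperation) {
    // populate(): in the real code each field runs through its handler here.
    this.service = rawService;
    this.operation = rawOperation;
    for (Entry e : entries) {
      if (matches(e)) {
        return e; // hit: nothing allocated
      }
    }
    // miss: copy the buffer's refs into a fresh owned entry
    Entry fresh = new Entry(service, operation);
    entries.add(fresh);
    return fresh;
  }

  // Content equality, so existing entries survive handler resets that
  // reissue distinct-but-equal instances for the same content.
  private boolean matches(Entry e) {
    return Objects.equals(service, e.service) && Objects.equals(operation, e.operation);
  }

  int size() {
    return entries.size();
  }
}
```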
Hash function
-------------
hashOf now takes UTF8BytesString fields (plus primitives + peerTags list)
instead of raw CharSequence/String from the snapshot. UTF8BytesString.hashCode
returns the underlying String's hash, so:
- content-equal entries built via AggregateEntry.of(...) (test factory,
bypasses handlers) produce the same hash as entries built via
Canonical.toEntry(...) (production, via handlers);
- all values that collapsed to "blocked_by_tracer" share that sentinel
instance and therefore that hashCode -- they land in the same bucket and
merge into one entry.
Matches
-------
The SpanSnapshot-keyed matches() on AggregateEntry is gone. Lookup goes
through Canonical.matches(entry) which compares the buffer's UTF8 fields
against the entry's UTF8 fields via Objects.equals (content equality on
UTF8BytesString). This is needed because across handler resets the
UTF8BytesString instance referenced by an existing entry differs from the
freshly-issued instance for the same content -- content-equality lets the
existing entry survive resets.
The peerTagPairsRaw field on AggregateEntry was previously kept for matching
against snapshot.peerTagPairs (the flat String[]). Canonical.matches uses
List.equals on the encoded UTF8 peerTags directly, so peerTagPairsRaw is
dropped.
New test in AggregateTableTest -- cardinalityBlockedValuesCollapseIntoOneEntry
inserts 50 distinct services into a table whose SERVICE_HANDLER has a
cardinality limit of 32, and asserts the final size is 33 (the 32 in-budget
services plus a single collapsed "blocked_by_tracer" entry, not 50 separate
entries).
Benchmarks (2 forks x 5 iter x 15s) -- producer side unchanged:
SimpleSpan bench: 3.117 +- 0.026 us/op (prior: 3.114 +- 0.045)
DDSpan bench: 2.344 +- 0.114 us/op (prior: 2.364 +- 0.113)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…chema-indexed handlers
Replaces the producer's early {@code (name, value)}-pair encoding with a
schema-based design: peer-tag values are captured into a parallel String
array, and the consumer applies the matching {@link TagCardinalityHandler}
by index using a {@link PeerTagSchema}'s parallel name/handler arrays.
This removes the {@code Map<String, TagCardinalityHandler>} the prior commit
left in {@code AggregateEntry} -- handler lookup is now a single array
dereference instead of a hashmap probe.
PeerTagSchema
-------------
New package-private class that holds:
- {@code String[] names} -- peer-tag names in stable order
- {@code TagCardinalityHandler[] handlers} -- parallel to names
Two schemas exist: a static singleton {@code INTERNAL} for the internal-kind
{@code base.service} case, and a {@code CURRENT} schema for the peer-
aggregation kinds (client/producer/consumer) that lazily refreshes when
{@code features.peerTags()} returns a different set of names.
Each {@link SpanSnapshot} captures the schema reference it was built against
so producer and consumer agree on the indexing even if {@code CURRENT}
changes between capture and consumption.
A fast-path identity check (cached last input Set instance) keeps the
{@code currentSyncedTo} call cheap: when the producer hands in the same
Set instance as last time -- the steady-state case --
{@code currentSyncedTo} returns immediately without iterating names. The
{@code matches()} loop only runs when the Set instance changes, which in
production is rare (only on remote-config reconfiguration).
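The identity fast path can be sketched like this. It is a simplification under stated assumptions: the real class also runs a content `matches()` check before rebuilding, holds a parallel handlers array, and its field names may differ.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch of the identity fast path: the current schema caches the last input
// Set *instance*; in steady state the producer hands in the same instance, so
// a sync is one volatile read plus a reference comparison, and the names are
// only re-derived when the instance actually changes (remote-config refresh).
final class PeerTagSchemaSketch {
  final String[] names;
  private final Set<String> syncedFrom; // identity-cached input

  private PeerTagSchemaSketch(Set<String> peerTagNames) {
    this.syncedFrom = peerTagNames;
    this.names = peerTagNames.toArray(new String[0]);
  }

  private static volatile PeerTagSchemaSketch CURRENT =
      new PeerTagSchemaSketch(new LinkedHashSet<>());

  static PeerTagSchemaSketch currentSyncedTo(Set<String> peerTagNames) {
    PeerTagSchemaSketch current = CURRENT; // single volatile read
    if (current.syncedFrom == peerTagNames) {
      return current; // steady state: same Set instance, no iteration
    }
    // rare path: a new Set instance arrived; the real code first checks
    // whether the contents still match before rebuilding (omitted here).
    PeerTagSchemaSketch rebuilt = new PeerTagSchemaSketch(peerTagNames);
    CURRENT = rebuilt;
    return rebuilt;
  }
}
```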
Snapshot shape
--------------
{@code SpanSnapshot.peerTagPairs} (a flat {@code [name0, value0, name1,
value1, ...]} array) is replaced by:
- {@code PeerTagSchema peerTagSchema} -- nullable; schema for the values
- {@code String[] peerTagValues} -- parallel to schema.names
The producer captures only values; the consumer constructs the encoded
{@code "name:value"} UTF8 forms via {@code schema.handler(i).register(value)}
on its own thread.
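The values-only capture and index-based consumer encoding can be sketched as below. This is a hedged illustration: plain `String` concatenation stands in for the `UTF8BytesString` forms produced via `schema.handler(i).register(value)`, and the class/method names are invented for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of the schema-indexed split: the producer captures a values array
// parallel to the schema's names (no encoding on the hot path); the consumer
// later builds "name:value" forms by index, with no per-tag map probes.
final class PeerTagCaptureSketch {
  final String[] names; // stable order, shared by producer and consumer

  PeerTagCaptureSketch(String... names) {
    this.names = names;
  }

  // Producer side: one array slot per schema entry, null when the span
  // doesn't carry that tag. No map iteration, no string concatenation.
  String[] captureValues(Map<String, String> spanTags) {
    String[] values = new String[names.length];
    for (int i = 0; i < names.length; i++) {
      values[i] = spanTags.get(names[i]);
    }
    return values;
  }

  // Consumer side: encode by index; the real code routes each value through
  // the matching per-tag cardinality handler on the aggregator thread.
  List<String> encode(String[] values) {
    List<String> encoded = new ArrayList<>(values.length);
    for (int i = 0; i < values.length; i++) {
      if (values[i] != null) {
        encoded.add(names[i] + ":" + values[i]);
      }
    }
    return encoded;
  }
}
```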
Consumer-side cleanups bundled in
---------------------------------
While here, also addresses the perf review items raised against the prior
commit:
- {@code hashOf}'s peer-tag loop is now indexed iteration; no more
iterator allocation per snapshot.
- {@code Canonical} now owns a reusable {@code peerTagsBuffer} ArrayList
that's cleared+refilled per {@code populate} call -- zero allocation
on the hit path. The buffer is copied into an immutable list only on
miss when the entry needs to own it long-term.
- {@code Canonical.matches} uses indexed list comparison; no iterator
alloc in {@code List.equals}.
- The {@code HashMap<String, TagCardinalityHandler> PEER_TAG_HANDLERS}
on {@code AggregateEntry} is gone, replaced by the {@link
PeerTagSchema}'s parallel array layout.
Benchmark (2 forks x 5 iter x 15s)
----------------------------------
SimpleSpan bench: 3.165 +- 0.032 us/op (prior: 3.117 +- 0.026)
DDSpan bench: 2.727 +- 0.018 us/op (prior: 2.344 +- 0.114)
Some producer-side regression from the per-snapshot schema sync (volatile
read + identity check). The fast-path identity comparison keeps it small;
hoisting the sync out of the per-snapshot loop is possible but would change
behavior in the edge case where {@code features.peerTags()} returns
different Sets within a single trace (covered by an existing test). Choosing
correctness over the marginal speedup.
Tests
-----
AggregateTableTest's snapshot builder is updated to construct a schema +
values via {@code PeerTagSchema.currentSyncedTo}, exercising the same code
path as production. Existing peer-tag test in {@code
ConflatingMetricAggregatorTest} still passes unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The "Conflating" in the name dates from the prior design that used a Batch pool + pending map to conflate up to 64 hits per inbox slot. That mechanism is gone -- the producer now publishes one SpanSnapshot per span and the consumer's AggregateTable is the conflation point. The new name matches the existing protocol/metric terminology (HealthMetrics.onClientStat*, stats.flush_payloads, etc.).

File renames:
ConflatingMetricsAggregator.java -> ClientStatsAggregator.java
ConflatingMetricAggregatorTest.groovy -> ClientStatsAggregatorTest.groovy
ConflatingMetricsAggregatorBenchmark -> ClientStatsAggregatorBenchmark
ConflatingMetricsAggregatorDDSpan* -> ClientStatsAggregatorDDSpan*

Plus all symbol references in MetricsAggregatorFactory and the test fixtures that referenced the old class name. No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three small follow-ups carried over from a /techdebt pass:

- TracerHealthMetrics: previousCounts array was sized 51, but the prior commits added a 52nd reporter (statsInboxFull). Without this fix the new counter's report() call would throw ArrayIndexOutOfBoundsException; the Flush task swallows that exception, so the failure would be silent (statsInboxFull would just never make it to statsd).
- Aggregator: removes the now-dead public clearAggregates() method. The ClearSignal route from ClientStatsAggregator.disable() supplanted it several commits ago; the method had no remaining callers.
- TagCardinalityHandler: removes the unused register(TagMap.Entry) overload and its isValidType helper. The String-keyed overload covers all current callers (AggregateEntry's peer-tag canonicalization).
- PeerTagSchema: spotless-driven javadoc reflow only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ClientStatsAggregator.publish was calling features.peerTags() + PeerTagSchema.currentSyncedTo for every span. Peer-tag configuration is stable for the duration of a single trace publish in production -- DDAgentFeaturesDiscovery returns the same Set instance until remote-config reconfiguration -- so the per-snapshot sync is wasted work. Move the sync to once per publish(trace) and pass the resolved schema to the inner publish(span, isTopLevel, peerAggSchema). INTERNAL-kind spans still use the static PeerTagSchema.INTERNAL regardless.

Behavior boundary
-----------------
Schema changes from features.peerTags() now take effect at the next publish(trace) call rather than mid-trace. Production-equivalent (a trace takes microseconds to milliseconds; remote-config refreshes are seconds apart), but a Spock test that used `>>> [...]` to mock different peerTags() returns on successive calls within one trace no longer makes sense in the new model. That test is rewritten to assert the production-relevant case: peer-tag NAMES are stable, peer-tag VALUES vary per span, and distinct value combinations produce distinct aggregate buckets.

Benchmark (2 forks x 5 iter x 15s)
----------------------------------
SimpleSpan bench: 3.133 +- 0.057 us/op (prior: 3.165 +- 0.032)
DDSpan bench: 2.454 +- 0.082 us/op (prior: 2.727 +- 0.018)

Recovers ~270 ns/op on the DDSpan bench -- most of the regression introduced by the per-snapshot lookup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
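The hoist described in this commit can be sketched with a sync counter standing in for the features.peerTags() + currentSyncedTo work (all names below are illustrative, not the real API):

```java
import java.util.List;

// Sketch of the once-per-trace hoist: the schema sync runs once per
// publish(trace) and the resolved schema is threaded into the per-span path,
// instead of being re-resolved for every span.
final class PublishHoistSketch {
  int syncCalls;
  int spansPublished;

  private String syncSchema() {
    syncCalls++; // volatile read + identity check in the real code
    return "schema";
  }

  void publish(List<String> trace) {
    String peerAggSchema = syncSchema(); // hoisted: once per trace
    for (String span : trace) {
      publish(span, peerAggSchema);
    }
  }

  private void publish(String span, String peerAggSchema) {
    spansPublished++; // real code aggregates the span against peerAggSchema
  }
}
```

The trade-off noted above follows directly from this shape: a schema change observed between two spans of the same trace is deferred to the next `publish(trace)` call.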
JFR profiling showed ~21% of producer CPU time spent in tag-map lookups during ClientStatsAggregator.publish. One of those lookups -- span.kind -- is redundant because DDSpanContext already caches the kind as a byte ordinal that resolves to a String via a small array.

- Add CoreSpan.getSpanKindString() with a default that falls back to the tag map for non-DDSpan impls; DDSpan overrides to delegate to the context's cached resolution.
- Hoist the schema.names array out of the capturePeerTagValues loop.
- Avoid an unnecessary toString() in isSynthetic by declaring SYNTHETICS_ORIGIN as a String and using contentEquals.

Benchmark (ClientStatsAggregatorDDSpanBenchmark):
before: 2.410 us/op
after: 1.995 us/op (~17% improvement)
vs. master baseline (6.428 us/op): now ~3.2x faster.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
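The byte-ordinal caching that getSpanKindString() leans on can be sketched as follows. The kind list and class name here are illustrative; the point is that hot-path resolution is a bounds check plus an array load rather than a tag-map probe.

```java
// Sketch of span-kind caching: the span context stores span.kind as a small
// ordinal once at construction time; resolving it back to a String is an
// array load, with no map lookup and no allocation.
final class SpanKindCacheSketch {
  private static final String[] KINDS = {"client", "server", "producer", "consumer", "internal"};

  private final byte kindOrdinal; // cached when the span is built

  SpanKindCacheSketch(String kind) {
    byte ordinal = -1;
    for (byte i = 0; i < KINDS.length; i++) {
      if (KINDS[i].equals(kind)) {
        ordinal = i;
        break;
      }
    }
    this.kindOrdinal = ordinal;
  }

  /** Hot-path resolution: bounds check + array load. */
  String getSpanKindString() {
    return kindOrdinal >= 0 ? KINDS[kindOrdinal] : null;
  }
}
```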
Captures the producer/consumer split, the canonical-key trick that makes cardinality-blocking actually save space, the once-per-trace peer-tag schema sync, the role of each file in datadog.trace.common.metrics, and the rationale behind the redesign from ConflatingMetricsAggregator.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dougqh
commented
May 15, 2026
 * peerTagNames} differ from the cached set. Designed to be called from the producer hot path: the
 * common case is a single volatile read and an array-length / set-contains comparison.
 */
static PeerTagSchema currentSyncedTo(Set<String> peerTagNames) {
Contributor
Author
I'm not crazy about this part. I'd like to figure out how to make schema regeneration clear at the feature discovery level.
Summary
Sits on top of #11382. Bounds client-stats label cardinality, reworks peer-tag handling, renames the aggregator, and adds a design doc.
- Replaces the per-field `DDCache`s with `PropertyCardinalityHandler`/`TagCardinalityHandler`. Each has a per-field budget; once exhausted, overflow values canonicalize to a `blocked_by_tracer` sentinel `UTF8BytesString`. Handlers reset every reporting cycle.
- `AggregateTable.findOrInsert` runs every label through its handler before computing the lookup hash, so cardinality-blocked values collapse into one bucket instead of fragmenting into N entries.
- Reworks peer-tag capture (from a flat `(name, value)` `String[]` to a `PeerTagSchema.names`-parallel value array); `tag:value` interning happens on the aggregator thread via `TagCardinalityHandler`. The schema is synced once per trace via `PeerTagSchema.currentSyncedTo(Set)` with an identity-check fast path.
- Renames `ConflatingMetricsAggregator` → `ClientStatsAggregator`.
- Producer fast paths: resolve the `span.kind` byte ordinal through a new `CoreSpan.getSpanKindString()` (skips a tag-map lookup per metrics-eligible span); hoist `schema.names` out of `capturePeerTagValues`; avoid `toString()` allocation in `isSynthetic`.
- Fixes a `TracerHealthMetrics.previousCounts` size bug that would have silently dropped the new `statsInboxFull` counter; drops the dead `clearAggregates()`.
- Adds `docs/client_metrics_design.md` covering the pipeline shape, the canonical-key trick, thread-safety contract, reporting cadence, failure modes, and benchmark numbers.

Benchmark
`ClientStatsAggregatorDDSpanBenchmark` (64 client-kind DDSpans per op, real `CoreTracer`): ~3.2× over master end to end.
Test plan
`:dd-trace-core:test` -- metrics tests pass (existing + new `AggregateTableTest` cases for cardinality collapse)

🤖 Generated with Claude Code