Fix waf.updates metric success:N/A tag from sequential counter metric values#11604
Fix waf.updates metric success:N/A tag from sequential counter metric values#11604jandro996 wants to merge 2 commits into
Conversation
This comment has been minimized.
This comment has been minimized.
…of sequential counters Removes wafInitCounter and wafUpdatesCounter AtomicInteger fields from WafMetricCollector. Both wafInit() and wafUpdates() now use literal 1L, matching RFC semantics (COUNT-1-per-event). The old sequential counters (1, 2, 3, ...) caused success:N/A when the rawMetricsQueue was full and a metric was dropped — the counter advanced but the offer() silently failed, creating gaps the backend interpreted as success:N/A. Also updates currentRuleVersion from e.wafDiagnostics.rulesetVersion in the InvalidRuleSetException catch block of handleWafUpdateResultReport(), so wafUpdates() emits with the correct event_rules_version even on partial WAF config errors. Raises log level from debug to warn.
a1992d9 to
a063275
Compare
🟢 Java Benchmark SLOs — All performance SLOs passed
PR vs. master results
Commit: Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion. |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a06327584d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| && !e.wafDiagnostics.rulesetVersion.isEmpty() | ||
| && (!defaultConfigActivated || currentRuleVersion == null)) { | ||
| currentRuleVersion = e.wafDiagnostics.rulesetVersion; | ||
| statsReporter.setRulesVersion(currentRuleVersion); |
There was a problem hiding this comment.
Avoid publishing failed rules as the active version
When an RC update throws InvalidRuleSetException before loading any valid rules, this path still updates statsReporter to the new diagnostics version. WAFStatsReporter.processTraceSegment then tags every subsequent trace with _dd.appsec.event_rules.version whenever rulesVersion is set, even though the WAF continues running the previous/default rules; the success path avoids this by requiring !wafDiagnostics.rules.getLoaded().isEmpty(), so the error path should keep the same guard for trace tagging or otherwise separate the failed-update metric version from the active rules version.
Useful? React with 👍 / 👎.
| noExceptionThrown() | ||
| } | ||
|
|
||
| void 'currentRuleVersion is updated from diagnostics when InvalidRuleSetException is thrown on RC update'() { |
There was a problem hiding this comment.
Move the new regression test to JUnit 5
The root AGENTS.md for this repo says, “Test frameworks: Always use JUnit 5. Do not write new Groovy / Spock tests,” but this change adds a new Spock test method to the Groovy spec. Even though the surrounding file is existing Groovy, new coverage should be added as a JUnit 5 test (or the relevant spec migrated) to avoid expanding the deprecated test style.
Useful? React with 👍 / 👎.
…loaded on error On InvalidRuleSetException, always update currentRuleVersion and setRuleVersion() on modules so wafUpdates() emits with the correct event_rules_version. But only update statsReporter (trace tagging) if rules.getLoaded() is non-empty, mirroring the success path guard. This avoids tagging traces with a version whose rules failed to load.
What Does This Do
AtomicIntegercounters (wafInitCounter,wafUpdatesCounter) fromWafMetricCollectorand uses1Las the metric value for bothwafInit()andwafUpdates(), matching RFC semantics (COUNT-1-per-event)AppSecConfigServiceImpl.handleWafUpdateResultReport(), updatescurrentRuleVersionfrome.wafDiagnostics.rulesetVersionin theInvalidRuleSetExceptioncatch block sowafUpdates()emits with the correctevent_rules_versioneven when a partial WAF config error occursInvalidRuleSetExceptionduring WAF config updates fromdebugtowarncurrentRuleVersionis populated from diagnostics whenInvalidRuleSetExceptionis thrown during a remote config updateMotivation
appsec.waf.updateswas reportingsuccess:N/Afor approximately 10% of events.The root cause was the use of sequential
AtomicIntegercounters (1, 2, 3, ...) as the metric value. The telemetry backend aggregates metrics with the same tags across polling windows. When therawMetricsQueue(capacity 1024) is full and a metric offer is dropped, the counter has already been incremented — the next successful emission sends valueN+1with no precedingN, creating a gap the backend interprets assuccess:N/A.The RFC specifies
waf.initandwaf.updatesas COUNT-1-per-event metrics: each emission should carry value1, not a cumulative count.The secondary fix ensures
currentRuleVersionis set from WAF diagnostics even whenInvalidRuleSetExceptionis thrown, sowafUpdates()carries the correctevent_rules_versiontag in error scenarios.Additional Notes
The condition
(!defaultConfigActivated || currentRuleVersion == null)in the error path mirrors the guard used in the success path: it prevents overwriting a user-customized rule version with a default config version, while still allowing the version to be set on the first update or when the default config is active.Contributor Checklist
type:and (comp:orinst:) labels in addition to any other useful labelsclose,fix, or any linking keywords when referencing an issueUse
solvesinstead, and assign the PR milestone to the issueJira ticket: APPSEC-62740
Note: Once your PR is ready to merge, add it to the merge queue by commenting
/merge./merge -ccancels the queue request./merge -f --reason "reason"skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.