Revert "Ignore capturing connection continuation for armeria (#11657)"#11685
Revert "Ignore capturing connection continuation for armeria (#11657)"#11685r1viollet wants to merge 1 commit into
Conversation
This reverts commit 0e13e90.
|
🟢 Java Benchmark SLOs — All performance SLOs passed
PR vs. master results
Commit: Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion. |
|
Hi! 👋 Thanks for your pull request! 🎉 To help us review it, please make sure to:
If you need help, please check our contributing guidelines. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 23d65d61b9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| "io.grpc.netty.shaded.io.netty.util.concurrent.GlobalEventExecutor", | ||
| "com.linecorp.armeria.client.HttpClientFactory", | ||
| "com.linecorp.armeria.client.HttpChannelPool" | ||
| "io.grpc.netty.shaded.io.netty.util.concurrent.GlobalEventExecutor" |
There was a problem hiding this comment.
Preserve the Armeria async-propagation guard
By ending knownMatchingTypes() before Armeria's HttpClientFactory/HttpChannelPool, this instrumentation no longer disables async propagation while Armeria creates its pool/connect tasks. When the first Armeria gRPC request is made under an active parent span, executor instrumentation can capture that parent into connection setup with no corresponding consumer, keeping the trace pending until the monitor flushes and allowing the strict gRPC tests to fail; the new test overrides only hide that regression. Please keep an Armeria-specific guard that avoids the shutdown crash without re-enabling these leaked continuations.
Useful? React with 👍 / 👎.
What Does This Do
Reverts #11657 (
0e13e90dac).Motivation
0e13e90dacwas bisected as the first commit that introduces a SIGSEGV / JVM-shutdown hang in test suites running with the profiler enabled (DataDog/profiling-backend CI). The failure is reproduced at ~17% per run.Reproducer (linux/glibc, JDK 26-zulu,
CI_JOB_NAMEset):Bisect trail (each commit built from source, tested 8 iterations except where noted):
v1.63.0ee9a8df466dd22dbf41c82ced127ea553bc3edc2477a0ce8393ac2bc4a970e13e90dac139a1ba3c7(HEAD)Note on mechanism: the profiling-backend test suite does not use Armeria, so the new class matchers (
HttpClientFactory,HttpChannelPool) and method-level advice will never match at runtime. The crash is therefore not caused by the Armeria instrumentation logic itself. The likely explanation is that registering two additionaltransformer.applyAdvice()calls changes agent initialisation timing or thread startup order enough to trigger an existing shutdown race inlibjavaProfiler.so's thread-exit cleanup path (the native profiler has had a series of shutdown-time race fixes in recent weeks). The revert removes the trigger; a proper fix for the underlying native race is being tracked separately.Additional Notes
DataDog/profiling-backend CI is temporarily protected by pinning
dd-java-agentto Maven release1.61.1(DataDog/profiling-backend#8634) while this is resolved.Jira ticket: [PROF-XXXX]