feat: native socket I/O tracking via PLT hooks (PROF-10637) #488
Track libc send/recv calls with inverse-transform sampling (PID rate control, ~5000 events/min) and emit NativeSocketEvent JFR events.

- PLT-hook send/recv in all loaded native libraries via LibraryPatcher
- NativeSocketSampler: byte-weighted sampling, fd-to-addr cache, PID controller
- New JFR type: datadog.NativeSocketEvent (8 fields)
- Activated by 'nativesocket' profiler argument; Linux only, macOS no-op
- 12 integration tests (Netty NIO) + GTest unit tests for hook invocation

Resolves: PROF-10637
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
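A rough sketch of how PID rate control toward a target event rate could look. The class name, the gains, and the `adjustInterval` helper are hypothetical and are not taken from this PR; the only figure from the commit message is the target of roughly 5000 events per minute.

```cpp
#include <algorithm>

// Hypothetical PID controller sketch; names and gains are illustrative
// assumptions, not the PR's actual implementation.
class PidController {
public:
    PidController(double target, double kp, double ki, double kd)
        : _target(target), _kp(kp), _ki(ki), _kd(kd) {}

    // Takes the event count observed in the last window and returns a
    // control signal: negative when the observed rate exceeds the target.
    double compute(double observed) {
        double error = _target - observed;
        _integral += error;                      // accumulated error (I term)
        double derivative = error - _prev_error; // change in error (D term)
        _prev_error = error;
        return _kp * error + _ki * _integral + _kd * derivative;
    }

private:
    double _target, _kp, _ki, _kd;
    double _integral = 0.0;
    double _prev_error = 0.0;
};

// Applies the signal to a mean sampling interval expressed in bytes.
// Too many events gives a negative signal, which enlarges the interval
// and therefore lowers the sampling rate.
inline double adjustInterval(double interval_bytes, double signal,
                             double min_interval) {
    return std::max(min_interval, interval_bytes - signal);
}
```

With a target of 5000 and proportional gain only, an observed window of 6000 events yields a signal of -1000, so the interval grows; the integral and derivative terms smooth out persistent or rapidly changing error.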
CI Test Results
Run: #24522147526 | Commit:
Status Overview
Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled

Failed Tests
- musl-amd64/debug / 25-librca
- musl-aarch64/debug / 21-librca
- musl-amd64/debug / 11-librca
- musl-aarch64/debug / 17-librca
- musl-aarch64/debug / 11-librca
- musl-amd64/debug / 21-librca
- musl-amd64/debug / 17-librca
- musl-aarch64/debug / 25-librca
- glibc-aarch64/debug / 8-j9

No detailed failure information available for any of these jobs. Check the job logs.

Summary: Total: 32 | Passed: 23 | Failed: 9
Updated: 2026-04-16 17:11:59 UTC
Remove !Platform.isAarch64() guard: JDK17/21/25 on aarch64 already skip via Platform.isJavaVersion(8). For 8-j9 on aarch64, J9's libnet.so calls send/recv via PLT just as on amd64; PLT hooking works the same way on both architectures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
What does this PR do?:
Intercepts libc send/recv calls via PLT hooking and records datadog.NativeSocketEvent JFR events with byte-weighted inverse-transform sampling (PID rate control, target ~5000 events/min).
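As an illustration of byte-weighted inverse-transform sampling, here is a minimal sketch. It assumes the gaps between sample points are exponentially distributed in byte space and that each sampled call is weighted by the inverse of its selection probability. The class name, the seed parameter, and the weight formula are assumptions made for this example; the PR's NativeSocketSampler may differ.

```cpp
#include <cmath>
#include <cstdint>
#include <random>

// Hypothetical sketch, not the PR's NativeSocketSampler.
class ByteWeightedSampler {
public:
    ByteWeightedSampler(double mean_interval_bytes, uint64_t seed)
        : _interval(mean_interval_bytes), _rng(seed),
          _remaining(nextThreshold()) {}

    // Returns 0.0 when the call is skipped, otherwise the weight (> 1) that
    // the sampled call carries to stand in for the calls that were skipped.
    double record(uint64_t bytes) {
        if (bytes == 0) return 0.0;
        _remaining -= static_cast<double>(bytes);
        if (_remaining > 0.0) return 0.0;
        _remaining = nextThreshold();
        // Probability that a call of this size crosses a sample point.
        double p = 1.0 - std::exp(-static_cast<double>(bytes) / _interval);
        return 1.0 / p;
    }

private:
    // Inverse transform: with U uniform in [0, 1), -mean * ln(1 - U) is
    // exponentially distributed with the given mean.
    double nextThreshold() {
        std::uniform_real_distribution<double> u(0.0, 1.0);
        return -_interval * std::log(1.0 - u(_rng));
    }

    // Declaration order matters: _remaining is initialised last because
    // nextThreshold() reads _interval and _rng.
    double _interval;
    std::mt19937_64 _rng;
    double _remaining;
};
```

Larger transfers are more likely to cross a sample point, so they are sampled more often and carry a smaller weight. Summing the weights of sampled calls gives an estimate of the total number of calls.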
Motivation:
Track blocking TCP socket I/O at the libc function level to surface socket
latency and throughput in the Datadog profiler. Netty with Java NIO transport
(the primary use case) goes through libc send/recv, making PLT patching
the right interception point. The feature is explicitly opt-in via the
nativesocket profiler argument. Implements PROF-10637.

Additional Notes:
Key design decisions:
send/recv only (TCP blocking I/O); UDP sendto/recvfrom and Netty native epoll/io_uring are explicitly out of scope
malloc and locking are safe inside hooks
the same fd may remain valid across chunks and clearing would race with
in-flight recordings
_orig_send/_orig_recv are intentionally not nulled in unpatch_socket_functions to avoid a memory-ordering race with in-flight hook invocations on aarch64
NativeSocketEvent struct lives in nativeSocketSampler.h (not event.h) because it is only used by NativeSocketSampler
#ifdef __linux__)
start() are not intercepted (documented in libraryPatcher.h)

How to test the change?:
NativeSocketSamplerHookTest in ddprof-lib/src/test/cpp/nativeSocketSampler_ut.cpp: verify that send_hook/recv_hook delegate to the installed _orig_send/_orig_recv function pointers

ddprof-test/src/test/java/com/datadoghq/profiler/nativesocket/:
- NativeSocketEnabledTest: events produced when feature is enabled
- NativeSocketEventFieldsTest: all 8 required JFR fields present and valid
- NativeSocketSendRecvSeparateTest: SEND and RECV events tracked independently
- NativeSocketDisabledTest: no events when feature is not enabled
- NativeSocketRateLimitTest: event count is substantially less than operation count (subsampling active), weight > 1 on sampled events
- NativeSocketRemoteAddressTest: remoteAddress field is in ip:port format
- NativeSocketMacOsNoOpTest: no events on macOS (no-op stub)
- NativeSocketStackTraceTest: stack trace captured on events
- NativeSocketBytesAccuracyTest: bytesTransferred field matches actual bytes
- NativeSocketUdpExcludedTest: UDP sends do not produce events
- NativeSocketEventThreadTest: eventThread field populated with calling thread
- NativeSocketNettyNioTest: Netty 4.x with NioEventLoopGroup produces events

Spec: #486
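The hook delegation that the unit test checks, together with the decision to leave the original function pointers set after unpatching, can be sketched roughly as follows. This is a simplified stand-in and not the PR's code: `g_orig_send`, `g_bytes_seen`, `install_hook`, and `unpatch_sketch` are hypothetical names, and the real hooks also handle sampling and event recording.

```cpp
#include <sys/types.h>  // ssize_t (POSIX)
#include <atomic>
#include <cstddef>
#include <cstdint>

// Same signature as libc send().
using send_fn = ssize_t (*)(int, const void*, size_t, int);

// Hypothetical globals standing in for the PR's _orig_send and its sampler.
static std::atomic<send_fn> g_orig_send{nullptr};
static std::atomic<uint64_t> g_bytes_seen{0};

// The hook forwards to the saved original and only observes the result.
static ssize_t send_hook(int fd, const void* buf, size_t len, int flags) {
    send_fn fn = g_orig_send.load(std::memory_order_acquire);
    if (fn == nullptr) return -1;  // hook reached before installation
    ssize_t n = fn(fd, buf, len, flags);
    if (n > 0) {
        g_bytes_seen.fetch_add(static_cast<uint64_t>(n),
                               std::memory_order_relaxed);
    }
    return n;
}

// Publish the original pointer before any PLT entry is redirected.
static void install_hook(send_fn original) {
    g_orig_send.store(original, std::memory_order_release);
}

// Restoring the PLT entries is omitted here. The saved pointer is left in
// place on purpose: a thread already inside send_hook can still delegate
// safely after unpatching.
static void unpatch_sketch() {}
```

Leaving the pointer set means a call that entered the hook just before unpatching still reaches the real function instead of dereferencing a null pointer.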
For Datadog employees:
credentials of any kind, I've requested a review from
@DataDog/security-design-and-guidance.