Enable x86 TSC monotonic clock by default with runtime calibration #15018
fcostaoliveira wants to merge 5 commits into redis:unstable from
Conversation
Remove the USE_PROCESSOR_CLOCK compile-time gate for x86_64 Linux so the hardware TSC is enabled automatically when the CPU advertises constant_tsc. Replace the fragile "model name" GHz regex parsing with runtime calibration: measure RDTSC ticks over a 10 ms wall-clock interval to determine the TSC frequency. This works on CPUs whose /proc/cpuinfo model-name line does not include a "@ X.XGHz" suffix. With the HW clock active, the call() hot path can use getMonotonicUs() instead of gettimeofday(), eliminating 2-3 system calls per command on x86.
🤖 Augment PR Summary: Enables the x86_64 Linux TSC-based monotonic clock by default, using runtime calibration.
```c
/* Sleep ~10 ms to accumulate enough ticks for an accurate measurement. */
struct timespec req = {0, 10000000};
nanosleep(&req, NULL);
```
src/monotonic.c:90: The nanosleep() and surrounding clock_gettime() calls ignore return values; if nanosleep is interrupted (EINTR) or clock_gettime fails, the calibration can compute an incorrect mono_ticksPerMicrosecond and skew all monotonic timing. Consider checking/handling these return codes so you reliably fall back to the POSIX path on failure.
Severity: medium
Already addressed in 2488aba4b (on this branch — Handle calibration syscall failures in monotonicInit_x86linux):
- `clock_gettime()` return values checked at both sample points; on failure we log and return, leaving `getMonotonicUs` NULL so the POSIX clock path is used.
- `nanosleep()` return value checked; `EINTR` retries with the remaining time, any other errno logs and returns to the POSIX fallback.
- `errno.h` included for the `EINTR` check.
Thread can be resolved.
Check return values of clock_gettime() and nanosleep() during TSC calibration. On failure (or EINTR for nanosleep), fall back to the POSIX clock path instead of computing an incorrect tick rate.
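A minimal sketch of the `EINTR` handling this commit describes, assuming the names below (illustrative only, not the exact `monotonic.c` code):

```c
#include <errno.h>
#include <stdbool.h>
#include <time.h>

/* EINTR-safe ~10 ms sleep for TSC calibration (sketch). Returns false on
 * any error other than EINTR, so the caller can fall back to the POSIX
 * clock path instead of computing a bogus tick rate. */
static bool sleep_for_calibration(void) {
    struct timespec req = {0, 10 * 1000 * 1000}, rem;
    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR) return false; /* real failure: abort calibration */
        req = rem;                        /* interrupted: sleep the remainder */
    }
    return true;
}
```

The key detail is resuming from `rem` rather than restarting the full interval, so repeated signals cannot stretch the calibration window.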
Addresses ShooterIT's review comment on src/monotonic.c. Invariant TSC on modern x86 is guaranteed to be monotonic across a single core's context, but TSC migration across sockets/cores with misaligned TSC, virtualisation, or firmware quirks can still produce a non-monotonic sample pair. Subtracting uint64_t in that case wraps to a huge value and computes a nonsense tick rate. Guard against tsc_end <= tsc_start in monotonicInit_x86linux and bail out to the POSIX clock path when detected, matching the behaviour of the other calibration-failure branches in the same function.
CE Performance Automation: step 1 of 2 (build) DONE. This comment was automatically generated because a benchmark was triggered.
You can check a comparison in detail via the grafana link.
CE Performance Automation: step 2 of 2 (benchmark) RUNNING... This comment was automatically generated because a benchmark was triggered. Started benchmark suite at 2026-04-22 13:13:42.993307 and took 130.941803 seconds up until now. In total it will run 7 benchmarks.
```c
struct timespec ts_start, ts_end;
uint64_t tsc_start, tsc_end;

if (clock_gettime(CLOCK_MONOTONIC, &ts_start) != 0) {
```
This seems very fragile and inaccurate to me.
A context switch between the system call and the TSC read would cause inaccuracies that in rare cases could be huge; besides, even a small inaccuracy can cause a large drift after a long period.
Luckily we don't use the monotonic clock for anything important, but still, I think this manual calibration is wrong.
How about using it only when the parsing of the official one fails?
```c
const char *monotonicInit(void) {
#if defined(USE_PROCESSOR_CLOCK) && defined(__x86_64__) && defined(__linux__)
```
I remember this being discussed back when this code was originally added; two claims from that discussion bother me:
- There is hardware with unreliable clocks. I remember seeing a comment listing such hardware in some Linux source file.
- I remember observations indicating that, with the exception of some bad hypervisor platforms, the system call is actually a very fast vDSO call, and that Linux already knows when it's safe to use the HW clock without a real syscall; when it does, it's nearly as fast as using the HW clock directly.
Do you have any new findings on these points, or evidence contradicting that research?
@yoav-steinberg feel free to comment from memory.
Summary
On x86_64 Linux, Redis's hardware TSC clock path was gated behind a compile-time `USE_PROCESSOR_CLOCK` flag. Without it, Redis falls back to `clock_gettime(CLOCK_MONOTONIC)` — a syscall that costs ~50-100 ns per invocation vs ~5-10 ns for RDTSC.
This is already the default on ARM (Generic Timer), but x86 users had to
opt in manually. Additionally, the existing x86 calibration parsed the
CPU "model name" field for a GHz string, which fails on CPUs that don't
include a frequency in that field.
This change:
- Removes the `USE_PROCESSOR_CLOCK` compile-time gate on x86_64 Linux
- Enables the TSC clock only when `constant_tsc` is present in `/proc/cpuinfo` flags
- Calibrates the TSC frequency by measuring RDTSC ticks over a 10 ms `clock_gettime` interval at startup
- Falls back to the POSIX clock when `constant_tsc` is absent

Benchmark Results — io-threads validation (3 datapoints / cell)
Test: `memtier_benchmark-1Mkeys-string-setget2000c-1KiB-pipeline-10` (2000 client connections, 1M keys, 10% SET / 90% GET, 1KiB values, pipeline 10).
Platform: `x86-aws-m7i.metal-24xl` — Intel Xeon Platinum 8488C (Sapphire Rapids), 96 cores bare metal.
Comparison: `hw-clock-x86-default` (3ce8b055a) vs `unstable` (0fa78fd8f), 3 independent runs each.
Topologies: `oss-standalone`, `oss-standalone-02-io-threads`, `oss-standalone-04-io-threads`, `oss-standalone-08-io-threads`, `oss-standalone-12-io-threads`, `oss-standalone-16-io-threads`.

Interpretation. The improvement concentrates exactly where the theory predicts: topologies where multiple io-threads contend on `clock_gettime` (8/12/16 io-threads) show consistent +6.4% to +8.9% gains with non-overlapping confidence intervals across 3 runs. Low-thread-count topologies (standalone, 2/4 io-threads) are flat — no regressions, but no measurable win either, since the syscall-per-command cost isn't the bottleneck at low concurrency.

ARM: No change expected and none observed — ARM already uses the HW clock path (Generic Timer / `CNTVCT_EL0`) by default. This PR only affects the x86_64 Linux path.

The underlying mechanism is the existing `call()` optimization at `server.c:3910-3935`, which skips `ustime()`/`gettimeofday()` when a HW monotonic clock is available. By flipping x86 TSC on by default (when `constant_tsc` is reported), that fast path is taken on Intel Sapphire Rapids and equivalents without the user having to rebuild.

Note
Medium Risk
Changes the default monotonic clock source on x86_64 Linux and adds a startup calibration path; incorrect calibration or platform quirks could impact timing behavior, though it falls back to POSIX on detected issues.
Overview
On x86_64 Linux, the hardware TSC monotonic clock path is now attempted by default (no longer gated by `USE_PROCESSOR_CLOCK`) when `/proc/cpuinfo` reports `constant_tsc`.
The x86 init logic drops CPU model-name GHz parsing and instead calibrates `ticks/us` at startup by measuring `RDTSC` over a ~10 ms `CLOCK_MONOTONIC` interval, with additional error handling (`EINTR`-safe `nanosleep`, non-monotonic TSC detection) that triggers fallback to the POSIX clock.
Reviewed by Cursor Bugbot for commit ddcc038.