sysstat: avoid all-zero CPU rows on counter regression #433

Open
orasagar wants to merge 1 commit into sysstat:master from orasagar:sar-cpu-counter-regression

@orasagar orasagar commented Jun 3, 2026

sar can report an all-zero CPU utilization row when per-CPU
/proc/stat counters move backward between two samples. In that case
get_per_cpu_interval() subtracts the previous summed counters from the
current summed counters using unsigned arithmetic. A single regressing
field can underflow the interval to a huge value, causing all displayed
percentages for that CPU to round down to 0.00.
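
The wraparound can be illustrated with a minimal sketch. The struct `cpu_sample`, its three fields, and the helpers `interval_by_sums()` and `pct()` are hypothetical stand-ins chosen for illustration; they are not the actual sysstat structures or the body of get_per_cpu_interval().

```c
/* Hypothetical, reduced stand-in for one per-CPU /proc/stat sample (jiffies). */
struct cpu_sample {
    unsigned long long user;
    unsigned long long system;
    unsigned long long idle;
};

/* Aggregate-sum subtraction: the pattern that wraps on a regression. */
static unsigned long long interval_by_sums(const struct cpu_sample *prev,
                                           const struct cpu_sample *curr)
{
    unsigned long long prev_total = prev->user + prev->system + prev->idle;
    unsigned long long curr_total = curr->user + curr->system + curr->idle;

    /* Unsigned subtraction wraps around if curr_total < prev_total. */
    return curr_total - prev_total;
}

/* Percentage of the interval spent in one counter. */
static double pct(unsigned long long delta, unsigned long long interval)
{
    return interval ? 100.0 * (double)delta / (double)interval : 0.0;
}
```

With a previous sample of {1000, 500, 8000} and a current sample of {1010, 100, 8050}, the summed counters drop by 340, so the unsigned result wraps to a value just below the maximum of unsigned long long. Dividing any plausible per-counter delta by that interval gives a percentage far below 0.005, which prints as 0.00.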

Compute the per-CPU interval as the sum of non-negative deltas for each
CPU counter instead of subtracting aggregate sums. Also apply the guest
counter interval correction only when the parent user/nice counter has
not regressed.
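
A rough sketch of the per-counter approach follows. The struct `cpu_sample` and the helpers `delta_or_zero()` and `interval_by_deltas()` are hypothetical names used only for illustration, the struct is reduced to three counters, and the guest counter correction is not modeled here.

```c
/* Hypothetical, reduced stand-in for one per-CPU /proc/stat sample (jiffies). */
struct cpu_sample {
    unsigned long long user;
    unsigned long long system;
    unsigned long long idle;
};

/* Delta between two readings, clamped to zero if the counter moved backward. */
static unsigned long long delta_or_zero(unsigned long long prev,
                                        unsigned long long curr)
{
    return curr >= prev ? curr - prev : 0ULL;
}

/* Interval as the sum of non-negative per-counter deltas. */
static unsigned long long interval_by_deltas(const struct cpu_sample *prev,
                                             const struct cpu_sample *curr)
{
    return delta_or_zero(prev->user, curr->user)
         + delta_or_zero(prev->system, curr->system)
         + delta_or_zero(prev->idle, curr->idle);
}
```

For a previous sample of {1000, 500, 8000} and a current sample of {1010, 100, 8050}, the regressing system counter contributes 0 and the interval is 10 + 0 + 50 = 60 rather than a wrapped value. When no counter regresses, the result is the same as subtracting the aggregate sums.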

This is reproducible with sar -P ALL using two /proc/stat samples where
a per-CPU user/system counter decreases between reads.

Signed-off-by: Sagar Sagar <sagar.sagar@oracle.com>

orasagar commented Jun 3, 2026

This can be seen with CPU hotplug when a CPU goes offline and later comes back online with one or more per-CPU /proc/stat counters lower than the previous sample.
