sysstat: avoid all-zero CPU rows on counter regression #433
Open
orasagar wants to merge 1 commit into
sar can report an all-zero CPU utilization row when per-CPU /proc/stat counters move backward between two samples. In that case get_per_cpu_interval() subtracts the previous summed counters from the current summed counters using unsigned arithmetic. A single regressing field can underflow the interval to a huge value, causing all displayed percentages for that CPU to round down to 0.00.

Compute the per-CPU interval as the sum of non-negative deltas for each CPU counter instead of subtracting aggregate sums. Also apply the guest counter interval correction only when the parent user/nice counter has not regressed.

This is reproducible with sar -P ALL using two /proc/stat samples where a per-CPU user/system counter decreases between reads.

Signed-off-by: Sagar Sagar <sagar.sagar@oracle.com>
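The difference between the two interval computations can be sketched roughly as follows. This is an illustration only, not the patch itself: struct cpu_sample, its field names, and both function names are hypothetical stand-ins for the sysstat structures, and the guest counter correction is not modeled. It assumes the regressing field drops by more than the other fields advance, so the summed counters go backward.

```c
#include <stdint.h>

/* Hypothetical, simplified per-CPU counters; NOT the sysstat struct. */
struct cpu_sample {
    uint64_t user, nice, sys, idle, iowait, hardirq, softirq, steal;
};

static uint64_t sum_counters(const struct cpu_sample *s)
{
    return s->user + s->nice + s->sys + s->idle +
           s->iowait + s->hardirq + s->softirq + s->steal;
}

/* Aggregate subtraction: wraps around to a huge unsigned value
 * whenever the current sum is smaller than the previous sum. */
static uint64_t interval_from_sums(const struct cpu_sample *prev,
                                   const struct cpu_sample *curr)
{
    return sum_counters(curr) - sum_counters(prev);
}

/* A counter that moved backward contributes 0 instead of wrapping. */
static uint64_t delta_or_zero(uint64_t prev, uint64_t curr)
{
    return curr > prev ? curr - prev : 0;
}

/* Sum of non-negative per-counter deltas. */
static uint64_t interval_from_deltas(const struct cpu_sample *prev,
                                     const struct cpu_sample *curr)
{
    return delta_or_zero(prev->user, curr->user) +
           delta_or_zero(prev->nice, curr->nice) +
           delta_or_zero(prev->sys, curr->sys) +
           delta_or_zero(prev->idle, curr->idle) +
           delta_or_zero(prev->iowait, curr->iowait) +
           delta_or_zero(prev->hardirq, curr->hardirq) +
           delta_or_zero(prev->softirq, curr->softirq) +
           delta_or_zero(prev->steal, curr->steal);
}
```

For example, if user goes from 1000 to 100 while sys and idle advance by 50 and 150, interval_from_sums() returns 2^64 - 700, whereas interval_from_deltas() returns 200.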
Author
This can be seen with CPU hotplug, when a CPU goes offline and later comes back online with one or more per-CPU /proc/stat counters lower than in the previous sample.
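To show why such a sample ends up as an all-zero row, here is a small sketch of the percentage arithmetic. The helpers pct_of_interval() and format_pct() are hypothetical and only assume that each displayed value is a counter delta divided by the interval and printed with two decimals; they are not sysstat functions.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: share of the interval taken by one counter delta. */
static double pct_of_interval(uint64_t delta, uint64_t itv)
{
    if (itv == 0)
        return 0.0;   /* avoid dividing by zero on an empty interval */
    return (double)delta / (double)itv * 100.0;
}

/* Format the percentage with two decimals, as in a utilization row. */
static void format_pct(char *buf, size_t len, uint64_t delta, uint64_t itv)
{
    snprintf(buf, len, "%.2f", pct_of_interval(delta, itv));
}
```

With an interval that wrapped to 2^64 - 700, an idle delta of 150 ticks formats as 0.00. With an interval of 200 ticks the same delta formats as 75.00.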