8269 dtrace stddev aggregation is normalized incorrectly

illumos/illumos-gate@79809f9cf4
79809f9cf4

https://www.illumos.org/issues/8269
  It seems that currently normalization of stddev aggregation is done
  incorrectly.
  We divide both the sum of values and the sum of their squares by the
  normalization factor. But we should divide the sum of squares by the
  normalization factor squared to scale the original values properly.

Reviewed by: Bryan Cantrill <bryan@joyent.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>
This commit is contained in:
avg 2017-06-09 15:04:10 +00:00
parent 96f21d6535
commit 9afe119166

View File

@ -381,8 +381,10 @@ dt_stddev(uint64_t *data, uint64_t normal)
* The standard approximation for standard deviation is
* sqrt(average(x**2) - average(x)**2), i.e. the square root
* of the average of the squares minus the square of the average.
* When normalizing, we should divide the sum of x**2 by normal**2.
*/
dt_divide_128(data + 2, normal, avg_of_squares);
dt_divide_128(avg_of_squares, normal, avg_of_squares);
dt_divide_128(avg_of_squares, data[0], avg_of_squares);
norm_avg = (int64_t)data[1] / (int64_t)normal / (int64_t)data[0];