8269 dtrace stddev aggregation is normalized incorrectly

illumos/illumos-gate@79809f9cf4
79809f9cf4

https://www.illumos.org/issues/8269
  It seems that currently normalization of stddev aggregation is done
  incorrectly.
  We divide both the sum of values and the sum of their squares by the
  normalization factor. But we should divide the sum of squares by the
  normalization factor squared to scale the original values properly.

Reviewed by: Bryan Cantrill <bryan@joyent.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>
This commit is contained in:
Andriy Gapon 2017-06-09 15:04:10 +00:00
parent 27ba1b79ca
commit 79edb7989a

View File

@ -381,8 +381,10 @@ dt_stddev(uint64_t *data, uint64_t normal)
* The standard approximation for standard deviation is
* sqrt(average(x**2) - average(x)**2), i.e. the square root
* of the average of the squares minus the square of the average.
* When normalizing, we should divide the sum of x**2 by normal**2.
*/
dt_divide_128(data + 2, normal, avg_of_squares);
dt_divide_128(avg_of_squares, normal, avg_of_squares);
dt_divide_128(avg_of_squares, data[0], avg_of_squares);
norm_avg = (int64_t)data[1] / (int64_t)normal / (int64_t)data[0];