ministat: disallow negative variance / nan Stddev

When all values were identical, Var() could return a negative value due
to limited floating-point precision. Stddev takes the square root of
that result, so it was reported as "nan".

Variance cannot actually be negative, so just return 0.  We can later
investigate alternate algorithms for calculating variance to reduce the
effect of catastrophic cancellation here.
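
The commit does not name a replacement algorithm. As one possible
candidate, Welford's online algorithm avoids subtracting two large,
nearly equal sums. The sketch below is illustrative only: welford_var()
and its array-based interface are hypothetical and not part of ministat.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of ministat: sample variance computed
 * with Welford's online update instead of syy - sy^2/n.
 */
static double
welford_var(const double *x, size_t n)
{
	double mean = 0.0;	/* running mean of the values seen so far */
	double m2 = 0.0;	/* running sum of squared deviations */
	size_t i;

	if (n < 2)
		return (0.0);	/* sample variance is undefined; report 0 */
	for (i = 0; i < n; i++) {
		double delta = x[i] - mean;

		mean += delta / (double)(i + 1);
		/* Uses the old and the updated mean, one factor each. */
		m2 += delta * (x[i] - mean);
	}
	return (m2 / (double)(n - 1));
}
```

For identical inputs every delta after the first is exactly zero, so m2
stays at zero and no clamp is needed in that case.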

Reported by:	Arshan Khanifar <arshankhanifar_gmail.com>
Approved by:	phk
Sponsored by:	The FreeBSD Foundation
Ed Maste 2018-02-21 15:54:23 +00:00
parent cb4985fbf6
commit a9cf54b0c9


@@ -208,6 +208,12 @@ static double
Var(struct dataset *ds)
{
	/*
	 * Due to limited precision it is possible that sy^2/n > syy,
	 * but variance cannot actually be negative.
	 */
	if (ds->syy <= ds->sy * ds->sy / ds->n)
		return (0);
	return (ds->syy - ds->sy * ds->sy / ds->n) / (ds->n - 1.0);
}