ministat: disallow negative variance / nan Stddev

When all values are identical, Var() could return a small negative
value due to limited floating-point precision.  Stddev() takes the
square root of that value, so "nan" was reported as Stddev.

Variance cannot actually be negative, so return 0 in that case.  We can
later investigate alternative algorithms for calculating variance that
reduce the effect of catastrophic cancellation here.
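
One candidate for such an alternative is Welford's online algorithm,
sketched below.  This is only an illustration and is not part of this
commit; the function name welford_variance and its array-based interface
are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Welford's online algorithm: update a running mean and a running sum
 * of squared deviations (m2) one sample at a time.  Each term added to
 * m2 is a product of two deviations from the mean, so the result does
 * not depend on subtracting two large, nearly equal sums.
 * Illustrative sketch only; not the code used by ministat.
 */
static double
welford_variance(const double *x, size_t n)
{
	double mean = 0.0, m2 = 0.0;
	size_t i;

	assert(n >= 2);		/* sample variance needs two points */
	for (i = 0; i < n; i++) {
		double delta = x[i] - mean;

		mean += delta / (double)(i + 1);
		m2 += delta * (x[i] - mean);
	}
	return (m2 / (double)(n - 1));
}
```

With identical inputs the mean equals the value after the first sample,
every later delta is exactly zero, and the function returns 0 without
needing a clamp.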

Reported by:	Arshan Khanifar <arshankhanifar_gmail.com>
Approved by:	phk
Sponsored by:	The FreeBSD Foundation
emaste 2018-02-21 15:54:23 +00:00
parent b53f4cb1fe
commit b3fd6ddf73


@@ -208,6 +208,12 @@ static double
Var(struct dataset *ds)
{
	/*
	 * Due to limited precision it is possible that sy^2/n > syy,
	 * but variance cannot actually be negative.
	 */
	if (ds->syy <= ds->sy * ds->sy / ds->n)
		return (0);
	return (ds->syy - ds->sy * ds->sy / ds->n) / (ds->n - 1.0);
}