9d7dc13615
and csqrtl(). When one component is huge and the other is tiny, scaling down the tiny component gave spurious underflow. When both components are denormal, not scaling them up gave inaccuracies of 34+ ulps on not very carefully selected args. Fixing this reduces the maximum error to 1.6 ulps on the same set of args (mosly not denormal ones). The scaling used multiplication of a complex variable by 2, but clang messes this on amd64 up by losing the sign of -0.0. Calculate the components separately, as is well known to be needed for operations on more exceptional values.