Don't micro-optimize the uniprocessor case; use the same loop there. Submitted by: Bhanu Prakash Reviewed by: kib, jhb