_get_curthread(). This is similar to the kernel's curthread. Doing
this saves stack overhead and is more convenient for the programmer.
- Pass the pointer to the newly created thread to _thread_init().
- Remove _get_curthread_slow().
This was changed because originally we were blocking on the umtx and
allowing the kernel to do the queueing. It was decided that the
library should queue the threads and start them in the order it
chooses, and that the umtx code would just be used like spinlocks.
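
The queueing decision above can be illustrated with a minimal sketch.
It is not the library's code: struct spinlock is a hypothetical
stand-in for a umtx used only as a spinlock (built here on a C11
atomic_flag), and struct thread, struct run_queue and the rq_*()
functions are names invented for this example. The point it shows is
that the lock only guards the queue for a short time, while the wake-up
order (FIFO here) is decided in userland rather than by the kernel.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a umtx that is used purely as a spinlock. */
struct spinlock {
	atomic_flag held;
};

/* Hypothetical thread descriptor; only what the queue needs. */
struct thread {
	int id;
	struct thread *next;
};

/* Library-owned queue: the library, not the kernel, picks the order. */
struct run_queue {
	struct spinlock lock;
	struct thread *head;
	struct thread *tail;
};

static void
spin_lock(struct spinlock *l)
{
	/* Busy-wait until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&l->held,
	    memory_order_acquire))
		;
}

static void
spin_unlock(struct spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static void
rq_init(struct run_queue *q)
{
	atomic_flag_clear(&q->lock.held);
	q->head = q->tail = NULL;
}

/* Append a waiting thread; the lock is held only for the list update. */
static void
rq_enqueue(struct run_queue *q, struct thread *t)
{
	t->next = NULL;
	spin_lock(&q->lock);
	if (q->tail != NULL)
		q->tail->next = t;
	else
		q->head = t;
	q->tail = t;
	spin_unlock(&q->lock);
}

/* Remove the next thread to start (FIFO), or NULL if none is waiting. */
static struct thread *
rq_dequeue(struct run_queue *q)
{
	struct thread *t;

	spin_lock(&q->lock);
	t = q->head;
	if (t != NULL) {
		q->head = t->next;
		if (q->head == NULL)
			q->tail = NULL;
	}
	spin_unlock(&q->lock);
	return (t);
}
```

A different policy (for example priority order) would only change
rq_enqueue() or rq_dequeue(); the locking stays the same.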