atomically:
1) Search _thread_list for the thread to join.
2) Search _dead_list for the thread to join.
3) Set the running thread as the joiner.
While we're at it, fix a race in the case where multiple threads try to
join on the same thread. POSIX says that the behavior of multiple joiners
is undefined, but the fix is cheap as a result of the other fix.