a proper solution.
- Add a dummy entry point which just calls the C entry points, and try to make
sure it's the first code in the binary.
- Copy a bit more than func_end to try to copy the whole load_kernel()
function. gcc4 puts code behind the func_end symbol.
Approved by: re (blanket)