amd64: get rid of the pessimized bcopy in syscall arg copy

The code conditionally copied either 5 or 6 args, which is unnecessary.
It can blindly copy 6, which also means the size is known at compilation
time and the copy can be depessimized by the compiler.
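
A minimal userspace sketch of the difference, not the kernel code itself:
struct sketch_args, NREGARGS, copy_variable() and copy_fixed() are
hypothetical names, and memcpy stands in for bcopy. With a runtime count the
compiler generally has to emit a generic copy; with a constant count it is
free to expand the copy inline.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NREGARGS 6	/* assumed: register-passed syscall args on amd64 */

/* Hypothetical stand-in for the kernel's syscall args structure. */
struct sketch_args {
	uint64_t args[8];
};

/* Before: the length depends on a runtime value (5 or 6). */
static void
copy_variable(struct sketch_args *sa, const uint64_t *argp, int regcnt)
{
	assert(regcnt >= 0 && regcnt <= NREGARGS);
	memcpy(sa->args, argp, sizeof(sa->args[0]) * regcnt);
}

/*
 * After: always copy 6 words, so the length is a compile-time constant.
 * The caller must guarantee that 6 words are readable at argp.
 */
static void
copy_fixed(struct sketch_args *sa, const uint64_t *argp)
{
	memcpy(sa->args, argp, sizeof(sa->args[0]) * NREGARGS);
}
```

When only 5 register args are valid, the fixed copy fills one extra slot with
an unused value. In the diff below, that slot is args[regcnt], which the
following copyin overwrites whenever sa->narg > regcnt.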

Note the entire syscall handling code is rather slow.

Tested on Skylake, sample result for getppid (calls/s):
without pti: 7310106 -> 10653569
with pti: 3304843 -> 4148306

Some syscalls (like read) showed no difference; others typically see
very modest wins.
Mateusz Guzik 2018-05-04 04:05:07 +00:00
parent a571c38536
commit f0648bcc04


@@ -908,7 +908,7 @@ cpu_fetch_syscall_args(struct thread *td)
 	error = 0;
 	argp = &frame->tf_rdi;
 	argp += reg;
-	bcopy(argp, sa->args, sizeof(sa->args[0]) * regcnt);
+	bcopy(argp, sa->args, sizeof(sa->args[0]) * 6);
 	if (sa->narg > regcnt) {
 		KASSERT(params != NULL, ("copyin args with no params!"));
 		error = copyin(params, &sa->args[regcnt],