Commit Graph

26 Commits

Author SHA1 Message Date
Konstantin Belousov
95fd15898b Real hardware, as opposed to QEMU, does not allow to have a call gate
in long mode which transfers control to 32bit code segment. Unbreak
the lcall $7,$0 implementation on amd64 by putting the 64bit user code
segment' selector into call gate, and execute the 64bit trampoline
which converts the return frame into 32bit format and switches back to
32bit mode for executing int $0x80 trampoline.

Note that all jumps over the hoops are performed in the user mode.

MFC after:	1 week
2012-08-14 12:13:27 +00:00
Konstantin Belousov
26ccf4f10f Inline the syscallenter() and syscallret(). This reduces the time measured
by the syscall entry speed microbenchmarks by ~10% on amd64.

Submitted by:	jhb
Approved by:	re (bz)
MFC after:	2 weeks
2011-09-11 16:05:09 +00:00
Konstantin Belousov
7332c129e0 Add support for executing the FreeBSD 1/i386 a.out binaries on amd64.
In particular:
- implement compat shims for old stat(2) variants and ogetdirentries(2);
- implement delivery of signals with ancient stack frame layout and
  corresponding sigreturn(2);
- implement old getpagesize(2);
- provide a user-mode trampoline and LDT call gate for lcall $7,$0;
- port a.out image activator and connect it to the build as a module
  on amd64.

The changes are hidden under COMPAT_43.

MFC after:   1 month
2011-04-01 11:16:29 +00:00
John Baldwin
8c7a92bd4a Remove unused KTRACE includes. 2010-08-19 16:41:27 +00:00
Konstantin Belousov
afe1a68827 Reorganize syscall entry and leave handling.
Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
  usermode into struct syscall_args. The structure is machine-depended
  (this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
  from the syscall. It is a generalization of
  cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
  return value.
sv_syscallnames - the table of syscall names.

Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().

The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.

Syscallenter() fetches arguments, calls syscall implementation from
ABI sysent table, and set up return frame. The end of syscall
bookkeeping is done by syscallret().

Take advantage of single place for MI syscall handling code and
implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at syscall entry or return point respectively.  The
EXEC flag augments SCX and notifies debugger that the process address
space was changed by one of exec(2)-family syscalls.

The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.

Reviewed by:	jhb, marcel, marius, nwhitehorn, stas
Tested by:	marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
	stas (mips)
MFC after:	1 month
2010-05-23 18:32:02 +00:00
Konstantin Belousov
5b1162b964 For PT_TO_SCE stop that stops the ptraced process upon syscall entry,
syscall arguments are collected before ptracestop() is called. As a
consequence, debugger cannot modify syscall or its arguments.

For i386, amd64 and ia32 on amd64 MD syscall(), reread syscall number
and arguments after ptracestop(), if debugger modified anything in the
process environment. Since procfs stopeven requires number of syscall
arguments in p_xstat, this cannot be solved by moving stop/trace point
before argument fetching.

Move the code to read arguments into separate function
fetch_syscall_args() to avoid code duplication. Note that ktrace point
for modified syscall is intentionally recorded twice, once with original
arguments, and second time with the arguments set by debugger.

PT_TO_SCX stop is executed after cpu_syscall_set_retval() already.

Reported by:	Ali Polatel <alip exherbo org>
Briefly discussed with:	jhb
MFC after:	3 weeks
2010-01-23 11:45:35 +00:00
Konstantin Belousov
1173b9a2d0 For ia32 syscall(), call cpu_set_syscall_retval(). Update comment inside
cpu_set_syscall_retval() accordingly.

MFC after:	1 week
2009-12-12 20:11:31 +00:00
Attilio Rao
393a081d42 Optimize vmmeter locking.
In particular:
- Add an explicative table for locking of struct vmmeter members
- Apply new rules for some of those members
- Remove some unuseful comments

Heavily reviewed by: alc, bde, jeff
Approved by: jeff (mentor)
2007-06-10 21:59:14 +00:00
Attilio Rao
6759608248 Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
  given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:38:48 +00:00
Kip Macy
e5f8d4099d Newer versions of gcc don't support treating structures passed by value
as if they were really passed by reference. Specifically, the dead stores
elimination pass in the GCC 4.1 optimiser breaks the non-compliant behavior
on which FreeBSD relied. This change brings FreeBSD up to date by switching
trap frames to being explicitly passed by reference.

Reviewed by: kan
Tested by: kan
2006-12-17 06:48:40 +00:00
John Baldwin
cb76d9b05c Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is
now back to just being an argument count.
2006-07-28 20:22:58 +00:00
John Baldwin
af5bf12239 Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to
mark system calls as being MPSAFE:
- Stop conditionally acquiring Giant around system call invocations.
- Remove all of the 'M' prefixes from the master system call files.
- Remove support for the 'M' prefix from the script that generates the
  syscall-related files from the master system call files.
- Don't explicitly set SYF_MPSAFE when registering nfssvc.
2006-07-28 19:05:28 +00:00
John Baldwin
22ea1bc57a Unify the checking for lock misbehavior in the various syscall()
implementations and adjust some of the checks while I'm here:
- Add a new check to make sure we don't return from a syscall in a critical
  section.
- Add a new explicit check before userret() to make sure we don't return
  with any locks held.  The advantage here is that we can include the
  syscall number and name in syscall() whereas that info is not available
  in userret().
- Drop the mtx_assert()'s of sched_lock and Giant.  They are replaced by
  the more general checks just added.

MFC after:	2 weeks
2006-07-27 22:32:30 +00:00
John Baldwin
0c5d1dbd43 Add KTR_SYSC tracing to the syscall() implementations that didn't have it
yet.

MFC after:	1 week
2006-07-27 21:25:50 +00:00
John Baldwin
00f1856905 Add missing ptrace(2) system-call stops to various syscall()
implementations.

MFC after:	1 week
2006-07-27 19:50:16 +00:00
Poul-Henning Kamp
eb2da9a51f Simplify system time accounting for profiling.
Rename struct thread's td_sticks to td_pticks, we will need the
other name for more appropriately named use shortly.  Reduce it
from uint64_t to u_int.

Clear td_pticks whenever we enter the kernel instead of recording
its value as reference for userret().  Use the absolute value of
td->pticks in userret() and eliminate third argument.
2006-02-08 08:09:17 +00:00
Wayne Salamon
4f9ac41fba Call the audit syscall enter/exit functions for the amd64 architecture,
both 32-bit and 64-bit paths. System calls will now be audited.

Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-04 20:37:20 +00:00
David Xu
9104847f21 1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most
changes in MD code are trivial, before this change, trapsignal and
   sendsig use discrete parameters, now they uses member fields of
   ksiginfo_t structure. For sendsig, this change allows us to pass
   POSIX realtime signal value to user code.

2. Remove cpu_thread_siginfo, it is no longer needed because we now always
   generate ksiginfo_t data and feed it to libpthread.

3. Add p_sigqueue to proc structure to hold shared signals which were
   blocked by all threads in the proc.

4. Add td_sigqueue to thread structure to hold all signals delivered to
   thread.

5. i386 and amd64 now return POSIX standard si_code, other arches will
   be fixed.

6. In this sigqueue implementation, pending signal set is kept as before,
   an extra siginfo list holds additional siginfo_t data for signals.
   kernel code uses psignal() still behavior as before, it won't be failed
   even under memory pressure, only exception is when deleting a signal,
   we should call sigqueue_delete to remove signal from sigqueue but
   not SIGDELSET. Current there is no kernel code will deliver a signal
   with additional data, so kernel should be as stable as before,
   a ksiginfo can carry more information, for example, allow signal to
   be delivered but throw away siginfo data if memory is not enough.
   SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
   not be caught or masked.
   The sigqueue() syscall allows user code to queue a signal to target
   process, if resource is unavailable, EAGAIN will be returned as
   specification said.
   Just before thread exits, signal queue memory will be freed by
   sigqueue_flush.
   Current, all signals are allowed to be queued, not only realtime signals.

Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
2005-10-14 12:43:47 +00:00
John Baldwin
aa9aa68d2f Use PCPU_LAZY_INC() for cnt.v_{intr,trap,syscalls} rather than atomic
operations in some places and simple non-per CPU math in others.
2005-04-12 23:18:54 +00:00
Julian Elischer
99e9dcb817 Remove sched_free_thread() which was only used
in diagnostics. It has outlived its usefulness and has started
causing panics for people who turn on DIAGNOSTIC, in what is otherwise
good code.

MFC after:	2 days
2004-08-31 06:12:13 +00:00
Peter Wemm
0d2a298904 Initial landing of SMP support for FreeBSD/amd64.
- This is heavily derived from John Baldwin's apic/pci cleanup on i386.
- I have completely rewritten or drastically cleaned up some other parts.
  (in particular, bootstrap)
- This is still a WIP.  It seems that there are some highly bogus bioses
  on nVidia nForce3-150 boards.  I can't stress how broken these boards
  are.  I have a workaround in mind, but right now the Asus SK8N is broken.
  The Gigabyte K8NPro (nVidia based) is also mind-numbingly hosed.
- Most of my testing has been with SCHED_ULE.  SCHED_4BSD works.
- the apic and acpi components are 'standard'.
- If you have an nVidia nForce3-150 board, you are stuck with 'device
  atpic' in addition, because they somehow managed to forget to connect the
  8254 timer to the apic, even though its in the same silicon!  ARGH!
  This directly violates the ACPI spec.
2003-11-17 08:58:16 +00:00
Peter Wemm
76caec589f MFi386 by jhb: use symbolic constants for the IDT entries. 2003-09-22 22:09:02 +00:00
Peter Wemm
46159d1fd6 Switch to using the emulator in the common compat area.
Still work-in-progress.
2003-08-23 00:04:53 +00:00
David E. O'Brien
56ae44c5df Use __FBSDID().
Brought to you by:	a boring talk at Ottawa Linux Symposium
2003-07-25 21:19:19 +00:00
Peter Wemm
1f5b79bc16 Make this compile with WITNESS enabled. It wants the syscall names. 2003-05-31 06:49:53 +00:00
Peter Wemm
d85631c4ac Add BASIC i386 binary support for the amd64 kernel. This is largely
stolen from the ia64/ia32 code (indeed there was a repocopy), but I've
redone the MD parts and added and fixed a few essential syscalls.  It
is sufficient to run i386 binaries like /bin/ls, /usr/bin/id (dynamic)
and p4.  The ia64 code has not implemented signal delivery, so I had
to do that.

Before you say it, yes, this does need to go in a common place.  But
we're in a freeze at the moment and I didn't want to risk breaking ia64.
I will sort this out after the freeze so that the common code is in a
common place.

On the AMD64 side, this required adding segment selector context switch
support and some other support infrastructure.  The %fs/%gs etc code
is hairy because loading %gs will clobber the kernel's current MSR_GSBASE
setting.  The segment selectors are not used by the kernel, so they're only
changed at context switch time or when changing modes.  This still needs
to be optimized.

Approved by:	re (amd64/* blanket)
2003-05-14 04:10:49 +00:00