Commit Graph

205 Commits

Author SHA1 Message Date
jhb
38bfdc988f - Use a custom version of copyinuio() to implement readv/writev using
kern_readv/writev.
- Use kern_settimeofday() and kern_adjtime() rather than stackgapping it.
2005-03-31 22:58:13 +00:00
ps
90b32391ca Use kern_kevent instead of the stackgap for 32bit syscall wrapping.
Submitted by:	jhb
Tested on:	amd64
2005-03-01 17:45:55 +00:00
ps
3477112899 Ooops. I will compile test before committing. The stackgap version
of kevent32 will be going away shortly, so this is temporary until
I commit the non-stackgap version.
2005-03-01 13:50:57 +00:00
ps
7f5f318c48 Correct the freebsd32_kevent prototype. 2005-03-01 06:32:53 +00:00
jhb
54ea9f1912 Regen. 2005-02-24 18:24:29 +00:00
jhb
aa85f8d747 Use msync() to implement msync() for freebsd32 emulation. This isn't quite
right for certain MAP_FIXED mappings on ia64 but it will work fine for all
other mappings and works fine on amd64.

Requested by:	ps, Christian Zander
MFC after:	1 week
2005-02-24 18:24:16 +00:00
jhb
847763ff6e - Add a custom version of exec_copyin_args() to deal with the 32-bit
pointers in argv and envv in userland and use that together with
  kern_execve() and exec_free_args() to implement freebsd32_execve()
  without using the stackgap.
- Fix freebsd32_adjtime() to call adjtime() rather than utimes().  Still
  uses stackgap for now.
- Use kern_setitimer(), kern_getitimer(), kern_select(), kern_utimes(),
  kern_statfs(), kern_fstatfs(), kern_fhstatfs(), kern_stat(),
  kern_fstat(), and kern_lstat().

Tested by:	cokane (amd64)
Silence on:	amd64, ia64
2005-02-18 18:56:04 +00:00
ps
87f1a6a1a6 Add a 32bit syscall wrapper for modstat
Obtained from:	Yahoo!
2005-01-19 17:53:06 +00:00
ps
db53196a48 - rename nanosleep1 to kern_nanosleep
- Add a 32bit syscall entry for nanosleep

Reviewed by:	peter
Obtained from:	Yahoo!
2005-01-19 17:44:59 +00:00
jhb
830736d271 Regenerate. 2005-01-04 18:54:40 +00:00
jhb
adb721c906 Partial sync up to the master syscalls.master file:
- Mark mount, unmount and nmount MPSAFE.
- Add a stub for _umtx_op().
- Mark open(), link(), unlink(), and freebsd32_sigaction() MPSAFE.

Pointy hats to:	several
2005-01-04 18:53:32 +00:00
das
130bed6547 Don't include sys/user.h merely for its side-effect of recursively
including other headers.
2004-11-27 06:51:39 +00:00
marks
dca89dc1d6 Rebuild from compat/freebsd32/syscalls.master:1.43
Reviewed by:	imp, phk, njl, peter
Approved by:	njl
2004-11-18 23:56:09 +00:00
marks
60b0534c96 32-bit FreeBSD ABI compatibility stubs from syscalls.master:1.179
Reviewed by:	imp, phk, njl, peter
Approved by:	njl
2004-11-18 23:54:26 +00:00
rwatson
e095dbaea3 Rebuild from FreeBSD32 syscalls.master:1.42. 2004-10-23 20:05:42 +00:00
rwatson
8bec59acae 32-bit FreeBSD ABI compatibility stubs from syscalls.master:1.178. 2004-10-23 20:04:56 +00:00
peter
09964b7499 Put on my peril sensitive sunglasses and add a flags field to the internal
sysctl routines and state.  Add some code to use it for signalling the need
to downconvert a data structure to 32 bits on a 64 bit OS when requested by
a 32 bit app.

I tried to do this in a generic abi wrapper that intercepted the sysctl
oid's, or looked up the format string etc, but it was a real can of worms
that turned into a fragile mess before I even got it partially working.

With this, we can now run 'sysctl -a' on a 32 bit sysctl binary and have
it not abort.  Things like netstat, ps, etc have a long way to go.

This also fixes a bug in the kern.ps_strings and kern.usrstack hacks.
These do matter very much because they are used by libc_r and other things.
2004-10-11 22:04:16 +00:00
mtm
0a21f474dc Close a race between a thread exiting and the freeing of it's stack.
After some discussion the best option seems to be to signal the thread's
death from within the kernel. This requires that thr_exit() take an
argument.

Discussed with: davidxu, deischen, marcel
MFC after: 3 days
2004-10-06 14:23:00 +00:00
jhb
ce2d3f89af Rework how we store process times in the kernel such that we always store
the raw values including for child process statistics and only compute the
system and user timevals on demand.

- Fix the various kern_wait() syscall wrappers to only pass in a rusage
  pointer if they are going to use the result.
- Add a kern_getrusage() function for the ABI syscalls to use so that they
  don't have to play stackgap games to call getrusage().
- Fix the svr4_sys_times() syscall to just call calcru() to calculate the
  times it needs rather than calling getrusage() twice with associated
  stackgap, etc.
- Add a new rusage_ext structure to store raw time stats such as tick counts
  for user, system, and interrupt time as well as a bintime of the total
  runtime.  A new p_rux field in struct proc replaces the same inline fields
  from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime).  A new p_crux
  field in struct proc contains the "raw" child time usage statistics.
  ruadd() has been changed to handle adding the associated rusage_ext
  structures as well as the values in rusage.  Effectively, the values in
  rusage_ext replace the ru_utime and ru_stime values in struct rusage.  These
  two fields in struct rusage are no longer used in the kernel.
- calcru() has been split into a static worker function calcru1() that
  calculates appropriate timevals for user and system time as well as updating
  the rux_[isu]u fields of a passed in rusage_ext structure.  calcru() uses a
  copy of the process' p_rux structure to compute the timevals after updating
  the runtime appropriately if any of the threads in that process are
  currently executing.  It also now only locks sched_lock internally while
  doing the rux_runtime fixup.  calcru() now only requires the caller to
  hold the proc lock and calcru1() only requires the proc lock internally.
  calcru() also no longer allows callers to ask for an interrupt timeval
  since none of them actually did.
- calcru() now correctly handles threads executing on other CPUs.
- A new calccru() function computes the child system and user timevals by
  calling calcru1() on p_crux.  Note that this means that any code that wants
  child times must now call this function rather than reading from p_cru
  directly.  This function also requires the proc lock.
- This finishes the locking for rusage and friends so some of the Giant locks
  in exit1() and kern_wait() are now gone.
- The locking in ttyinfo() has been tweaked so that a shared lock of the
  proctree lock is used to protect the process group rather than the process
  group lock.  By holding this lock until the end of the function we now
  ensure that the process/thread that we pick to dump info about will no
  longer vanish while we are trying to output its info to the console.

Submitted by:	bde (mostly)
MFC after:	1 month
2004-10-05 18:51:11 +00:00
peter
79e1f83f8d Regen 2004-07-14 00:03:51 +00:00
peter
89261c4c9f Unmapped syscalls should be NOPROTO so that we don't get a duplicate
prototype.  (kldunloadf in this case)
2004-07-14 00:03:30 +00:00
phk
b0e6874188 Give kldunload a -f(orce) argument.
Add a MOD_QUIESCE event for modules.  This should return error (EBUSY)
of the module is in use.

MOD_UNLOAD should now only fail if it is impossible (as opposed to
inconvenient) to unload the module.  Valid reasons are memory references
into the module which cannot be tracked down and eliminated.

When kldunloading, we abandon if MOD_UNLOAD fails, and if -force is
not given, MOD_QUIESCE failing will also prevent the unload.

For backwards compatibility, we treat EOPNOTSUPP from MOD_QUIESCE as
success.

Document that modules should return EOPNOTSUPP for unknown events.
2004-07-13 19:36:59 +00:00
phk
7b891087ed Add kldunloadf() system call. Stay tuned for follwing commit messages. 2004-07-13 19:35:11 +00:00
marcel
622fe058c9 Change the thread ID (thr_id_t) used for 1:1 threading from being a
pointer to the corresponding struct thread to the thread ID (lwpid_t)
assigned to that thread. The primary reason for this change is that
libthr now internally uses the same ID as the debugger and the kernel
when referencing to a kernel thread. This allows us to implement the
support for debugging without additional translations and/or mappings.

To preserve the ABI, the 1:1 threading syscalls, including the umtx
locking API have not been changed to work on a lwpid_t. Instead the
1:1 threading syscalls operate on long and the umtx locking API has
not been changed except for the contested bit. Previously this was
the least significant bit. Now it's the most significant bit. Since
the contested bit should not be tested by userland, this change is
not expected to be visible. Just to be sure, UMTX_CONTESTED has been
removed from <sys/umtx.h>.

Reviewed by: mtm@
ABI preservation tested on: i386, ia64
2004-07-02 00:40:07 +00:00
marcel
e84fdd61ba Regen. 2004-07-02 00:38:56 +00:00
phk
40dd98a3bd Second half of the dev_t cleanup.
The big lines are:
	NODEV -> NULL
	NOUDEV -> NODEV
	udev_t -> dev_t
	udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.
2004-06-17 17:16:53 +00:00
marcel
8c5804d307 Fix build for non-COMPAT_FREEBSD4 configurations. Make the FreeBSD 4
statfs functions conditional upon the option.
2004-04-24 04:31:59 +00:00
peter
462ac75706 Regen 2004-04-14 23:17:57 +00:00
peter
ba2b6ac30f Catch up to the not-so-recent statfs(2) changes. 2004-04-14 23:17:37 +00:00
mtm
02e9e2319a Regen for libthr thread synchronization syscalls. 2004-03-27 14:34:17 +00:00
mtm
adb111ed69 Separate thread synchronization from signals in libthr. Instead
use msleep() and wakeup_one().

Discussed with: jhb, peter, tjr
2004-03-27 14:30:43 +00:00
jhb
275240297d - Replace wait1() with a kern_wait() function that accepts the pid,
options, status pointer and rusage pointer as arguments.  It is up to
  the caller to copyout the status and rusage to userland if needed.  This
  lets us axe the 'compat' argument and hide all that functionality in
  owait(), by the way.  This also cleans up some locking in kern_wait()
  since it no longer has to drop locks around copyout() since all the
  copyout()'s are deferred.
- Convert owait(), wait4(), and the various ABI compat wait() syscalls to
  use kern_wait() rather than wait1() or wait4().  This removes a bit
  more stackgap usage.

Tested on:	i386
Compiled on:	i386, alpha, amd64
2004-03-17 20:00:00 +00:00
peter
4f161dd3db Regen (FWIW) 2004-02-21 23:38:58 +00:00
peter
e2003f8831 Try and make the compat sigreturn prototypes closer to reality. 2004-02-21 23:37:33 +00:00
deischen
1425928c32 Regen. 2004-02-03 05:20:28 +00:00
deischen
4ddd55694e Sync with kern/syscalls.master. 2004-02-03 05:18:48 +00:00
peter
d11dc3ad9b Regen 2004-01-28 23:45:48 +00:00
peter
b26e4fdf5e Add getitimer swab stub 2004-01-28 23:45:37 +00:00
peter
72906fa267 GC unused 'syshide' override to /dev/null. This was here to disable
the output of the namespc column.  Its functionality was removed some time
ago, but the overrides and the namespc column remained.
2003-12-24 00:32:07 +00:00
peter
38f4513fb8 Regen (should be a NOP except for rcsid) 2003-12-23 04:07:47 +00:00
peter
7afe90e680 GC unused namespc column. 2003-12-23 04:07:22 +00:00
peter
7cd274c812 Regen 2003-12-23 03:21:49 +00:00
peter
6538c4df01 freebsd32_fstat(2) is now MPSAFE 2003-12-23 03:21:06 +00:00
peter
03780390b0 Rather than screw around with the (unsafe) stackgap, call vn_stat/fo_stat
directly for stat/fstat/lstat syscall emulation.  It turns out not only
safer, but the code is smaller this way too.
2003-12-23 03:20:49 +00:00
peter
e646517cd8 Regen 2003-12-23 02:48:58 +00:00
peter
04289884a4 Eliminate stackgap usage for the (woefully incomplete) path translations
since it isn't needed here anymore.
Use standard open(2)/access(2) and chflags(2) syscalls now.
2003-12-23 02:48:11 +00:00
peter
7c39c30764 regen 2003-12-11 02:36:37 +00:00
peter
d4883a4357 Mark freebsd32_gettimeofday() as mpsafe 2003-12-11 02:36:07 +00:00
peter
0cde709881 Just implementing a 32 bit version of gettimeofday() was smaller than
the wrapper code.  And it doesn't use the stackgap as a bonus.
2003-12-11 02:34:49 +00:00
peter
a767a6c392 Regen 2003-12-10 22:33:45 +00:00
peter
25170ce26a Add missing extattr_list_fd(), extattr_list_file(), extattr_list_link()
and kse_switchin() syscall slots.
2003-12-10 22:33:27 +00:00
peter
3db2893823 The osigpending, oaccept, orecvfrom and ogetdirentries entries were
accidently being compiled in as standard.  These are part of the
set of unimplemented COMPAT_43 syscall set.
2003-12-10 22:31:46 +00:00
peter
4de1c73f10 Regen 2003-11-08 07:31:49 +00:00
peter
742af7ab5a "implement" vfork(). Add comments next to the other syscalls that need
to be implemented.  This is enough to run i386 /bin/tcsh.  /bin/sh is still
not happy because of some strange job control problem.
2003-11-08 07:31:30 +00:00
peter
39c76d997c Dont write to the stackgap directly in execve(). 2003-11-07 21:27:13 +00:00
jhb
b6b663a274 Regen. 2003-11-07 20:30:30 +00:00
jhb
08067cd608 Sync with global syscalls.master by marking ptrace(), dup(), pipe(),
ktrace(), freebsd32_sigaltstack(), sysarch(), issetugid(), utrace(), and
freebsd32_sigaction() as MP safe.
2003-11-07 20:29:53 +00:00
peter
11bf308b70 Add CTASSERT()'s to check that the sizes of our replicas of the 32 bit
structures come out the right size.

Fix the ones that broke.  stat32 had some missing fields from the end
and statfs32 was broken due to the strange definition of MNAMELEN
(which is dependent on sizeof(long))

I'm not sure if this fixes any actual problems or not.
2003-10-30 02:40:30 +00:00
peter
e95056563d Switch to using the emulator in the common compat area.
Still work-in-progress.
2003-08-23 00:04:53 +00:00
peter
e2c08ea16b Initial sweep to de-i386-ify this 2003-08-22 23:07:28 +00:00
peter
2c79c9e29b Regen 2003-08-22 22:52:04 +00:00
peter
763024f66d Begin attempting to consolidate the two different i386 emulations
on ia64 and amd64.  I'm attempting to keep the generic 32bit-on-64bit
binary support seperate from the i386 support and the MD backend support.
2003-08-22 22:51:48 +00:00
peter
ba0d622c9f Regen 2003-08-21 03:48:50 +00:00
peter
96b31600a1 This is too funny for words. Swap syscalls 416 and 417 around. It works
better that way when sigaction() and sigreturn() do the right thing.
2003-08-21 03:48:05 +00:00
obrien
02a4f42b9a Use __FBSDID().
Brought to you by:	a boring talk at Ottawa Linux Symposium
2003-07-25 21:19:19 +00:00
peter
a32db9797c Regenerate. 2003-05-31 06:51:04 +00:00
peter
3afe1a4377 Make this compile with WITNESS enabled. It wants the syscall names. 2003-05-31 06:49:53 +00:00
peter
99d1672b3d Deal with the user VM space expanding. 32 bit applications do not like
having their stack at the 512GB mark.  Give 4GB of user VM space for 32
bit apps.  Note that this is significantly more than on i386 which gives
only about 2.9GB of user VM to a process (1GB for kernel, plus page
table pages which eat user VM space).

Approved by: re (blanket)
2003-05-23 05:07:33 +00:00
peter
c177f59bbf Regen
Approved by: re (amd64 blanket)
2003-05-14 04:11:25 +00:00
peter
770abdbb9c Add BASIC i386 binary support for the amd64 kernel. This is largely
stolen from the ia64/ia32 code (indeed there was a repocopy), but I've
redone the MD parts and added and fixed a few essential syscalls.  It
is sufficient to run i386 binaries like /bin/ls, /usr/bin/id (dynamic)
and p4.  The ia64 code has not implemented signal delivery, so I had
to do that.

Before you say it, yes, this does need to go in a common place.  But
we're in a freeze at the moment and I didn't want to risk breaking ia64.
I will sort this out after the freeze so that the common code is in a
common place.

On the AMD64 side, this required adding segment selector context switch
support and some other support infrastructure.  The %fs/%gs etc code
is hairy because loading %gs will clobber the kernel's current MSR_GSBASE
setting.  The segment selectors are not used by the kernel, so they're only
changed at context switch time or when changing modes.  This still needs
to be optimized.

Approved by:	re (amd64/* blanket)
2003-05-14 04:10:49 +00:00
jhb
dcf45ed625 Regen. 2003-04-25 15:59:44 +00:00
jhb
011ef0f3d8 Oops, the thr_* and jail_attach() syscall entries should be NOPROTO rather
than STD.
2003-04-25 15:59:18 +00:00
jhb
968ad4dbc6 Regen. 2003-04-24 20:50:57 +00:00
jhb
4d35246c8d Fix the thr_create() entry by adding a trailing \. Also, sync up the
MP safe flag for thr_* with the main table.
2003-04-24 20:49:46 +00:00
jhb
146e8aecec - Replace inline implementations of sigprocmask() with calls to
kern_sigprocmask() in the various binary compatibility emulators.
- Replace calls to sigsuspend(), sigaltstack(), sigaction(), and
  sigprocmask() that used the stackgap with calls to the corresponding
  kern_sig*() functions instead without using the stackgap.
2003-04-22 18:23:49 +00:00
mike
75859ca578 o In struct prison, add an allprison linked list of prisons (protected
by allprison_mtx), a unique prison/jail identifier field, two path
  fields (pr_path for reporting and pr_root vnode instance) to store
  the chroot() point of each jail.
o Add jail_attach(2) to allow a process to bind to an existing jail.
o Add change_root() to perform the chroot operation on a specified
  vnode.
o Generalize change_dir() to accept a vnode, and move namei() calls
  to callers of change_dir().
o Add a new sysctl (security.jail.list) which is a group of
  struct xprison instances that represent a snapshot of active jails.

Reviewed by:	rwatson, tjr
2003-04-09 02:55:18 +00:00
jeff
5f8f1497c8 - Add thr and umtx system calls. 2003-04-01 01:15:56 +00:00
jeff
23844ff023 - Add a placeholder for sigwait 2003-03-31 23:36:40 +00:00
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
phk
4bfb37f22e Remove #include <sys/dkstat.h> 2003-02-16 14:13:23 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
rwatson
22c41db3e5 Synchronize to kern/syscalls.master:1.139.
Obtained from:	TrustedBSD Project
2002-12-29 20:33:26 +00:00
marcel
3c86a795f0 Regen: swapoff 2002-12-16 00:49:36 +00:00
marcel
4451a382e7 Change swapoff from MNOPROTO to UNIMPL. The former doesn't work. 2002-12-16 00:48:52 +00:00
dillon
b43fb3e920 This is David Schultz's swapoff code which I am finally able to commit.
This should be considered highly experimental for the moment.

Submitted by:	David Schultz <dschultz@uclink.Berkeley.EDU>
MFC after:	3 weeks
2002-12-15 19:17:57 +00:00
alfred
d070c0a52d SCARGS removal take II. 2002-12-14 01:56:26 +00:00
alfred
4f48184fb2 Backout removal SCARGS, the code freeze is only "selectively" over. 2002-12-13 22:41:47 +00:00
alfred
d19b4e039d Remove SCARGS.
Reviewed by: md5
2002-12-13 22:27:25 +00:00
deischen
54d9a4c0f7 Regenerate after adding syscalls. 2002-11-16 23:48:14 +00:00
deischen
280e9bbfe8 Add *context() syscalls to ia64 32-bit compatability table as requested
in kern/syscalls.master.
2002-11-16 15:15:17 +00:00
rwatson
3f3d082989 Sync to src/sys/kern/syscalls.master 2002-11-02 23:55:30 +00:00
peter
a75c662939 Stake a claim on 418 (__xstat), 419 (__xfstat), 420 (__xlstat) 2002-10-19 22:25:31 +00:00
peter
6f9d4eb337 Grab 416/417 real estate before I get burned while testing again.
This is for the not-quite-ready signal/fpu abi stuff.  It may not see
the light of day, but I'm certainly not going to be able to validate it
when getting shot in the foot due to syscall number conflicts.
2002-10-19 22:09:23 +00:00
rwatson
f3cd77cf07 Add a placeholder for the execve_mac() system call, similar to SELinux's
execve_secure() system call, which permits a process to pass in a label
for a label change during exec.  This permits SELinux to change the
label for the resulting exec without a race following a manual label
change on the process.  Because this interface uses our general purpose
MAC label abstraction, we call it execve_mac(), and wrap our port of
SELinux's execve_secure() around it with appropriate sid mappings.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 21:06:57 +00:00
peter
b3c87d7052 re-regen. Sigh. 2002-10-09 22:40:41 +00:00
peter
f7a3aba232 Sigh. Fix fat-fingering of diff. I knew this was going to happen. 2002-10-09 22:40:02 +00:00
peter
8a5224b1c2 regenerate. sendfile stuff and other recently picked up stubs. 2002-10-09 22:28:48 +00:00
peter
54a5ebbeb2 Try and deal with the #ifdef COMPAT_FREEBSD4 sendfile stuff. This would
have been a lot easier if do_sendfile() was usable externally.
2002-10-09 22:27:24 +00:00
peter
641a3d5cb3 Try and patch up some tab-to-space spammage. 2002-10-09 22:14:35 +00:00
peter
dbe70a6b44 Add placeholder stubs for nsendfile, mac_syscall, ksem_close, ksem_post,
ksem_wait, ksem_trywait, ksem_init, ksem_open, ksem_unlink, ksem_getvalue,
ksem_destroy, __mac_get_pid, __mac_get_link, __mac_set_link,
extattr_set_link, extattr_get_link, extattr_delete_link.
2002-10-09 22:10:23 +00:00
archie
9301eb9484 Let kse_wakeup() take a KSE mailbox pointer argument.
Reviewed by:	julian
2002-10-02 16:48:16 +00:00
archie
904b65e85d Make the following name changes to KSE related functions, etc., to better
represent their purpose and minimize namespace conflicts:

	kse_fn_t		-> kse_func_t
	struct thread_mailbox	-> struct kse_thr_mailbox
	thread_interrupt()	-> kse_thr_interrupt()
	kse_yield()		-> kse_release()
	kse_new()		-> kse_create()

Add missing declaration of kse_thr_interrupt() to <sys/kse.h>.
Regenerate the various generated syscall files. Minor style fixes.

Reviewed by:	julian
2002-09-25 18:10:42 +00:00
peter
c93c6004b2 Regenerate 2002-07-20 02:56:34 +00:00
peter
cc7b2e4248 Infrastructure tweaks to allow having both an Elf32 and an Elf64 executable
handler in the kernel at the same time.  Also, allow for the
exec_new_vmspace() code to build a different sized vmspace depending on
the executable environment.  This is a big help for execing i386 binaries
on ia64.   The ELF exec code grows the ability to map partial pages when
there is a page size difference, eg: emulating 4K pages on 8K or 16K
hardware pages.

Flesh out the i386 emulation support for ia64.  At this point, the only
binary that I know of that fails is cvsup, because the cvsup runtime
tries to execute code in pages not marked executable.

Obtained from:  dfr (mostly, many tweaks from me).
2002-07-20 02:56:12 +00:00
dfr
cfb2ec9f72 Initial support for executing IA-32 binaries. This will not compile
without a few patches for the rest of the kernel to allow the image
activator to override exec_copyout_strings and setregs.

None of the syscall argument translation has been done. Possibly, this
translation layer can be shared with any platform that wants to support
running ILP32 binaries on an LP64 host (e.g. sparc32 binaries?)
2002-04-10 19:34:51 +00:00