80 Commits

Author SHA1 Message Date
John Baldwin
03badf38ab Remove an unused syscallarg() macro leftover from this code's origins in
NetBSD.

Approved by:	re (scottl)
2005-07-07 19:26:43 +00:00
John Baldwin
c87b5f76aa - Implement svr4_emul_find() using kern_alternate_path(). This changes
the semantics in that the returned filename to use is now a kernel
  pointer rather than a user space pointer.  This required changing the
  arguments to the CHECKALT*() macros some and changing the various system
  calls that used pathnames to use the kern_foo() functions that can accept
  kernel space filename pointers instead of calling the system call
  directly.
- Use kern_open(), kern_access(), kern_msgctl(), kern_execve(),
  kern_mkfifo(), kern_mknod(), kern_statfs(), kern_fstatfs(),
  kern_setitimer(), kern_stat(), kern_lstat(), kern_fstat(), kern_utimes(),
  kern_pathconf(), and kern_unlink().
2005-02-07 21:53:42 +00:00
Warner Losh
898b0535b7 Start each of the license/copyright comments with /*- 2005-01-05 22:34:37 +00:00
John Baldwin
c88379381b - Move the function prototypes for kern_setrlimit() and kern_wait() to
sys/syscallsubr.h where all the other kern_foo() prototypes live.
- Resort kern_execve() while I'm there.
2005-01-05 22:19:44 +00:00
David Schultz
6004362e66 Don't include sys/user.h merely for its side-effect of recursively
including other headers.
2004-11-27 06:51:39 +00:00
Poul-Henning Kamp
124e4c3be8 Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST.
Use this in all the places where sleeping with the lock held is not
an issue.

The distinction will become significant once we finalize the exact
lock-type to use for this kind of case.
2004-11-13 11:53:02 +00:00
John Baldwin
78c85e8dfc Rework how we store process times in the kernel such that we always store
the raw values including for child process statistics and only compute the
system and user timevals on demand.

- Fix the various kern_wait() syscall wrappers to only pass in a rusage
  pointer if they are going to use the result.
- Add a kern_getrusage() function for the ABI syscalls to use so that they
  don't have to play stackgap games to call getrusage().
- Fix the svr4_sys_times() syscall to just call calcru() to calculate the
  times it needs rather than calling getrusage() twice with associated
  stackgap, etc.
- Add a new rusage_ext structure to store raw time stats such as tick counts
  for user, system, and interrupt time as well as a bintime of the total
  runtime.  A new p_rux field in struct proc replaces the same inline fields
  from struct proc (i.e. p_[isu]ticks, p_[isu]u, and p_runtime).  A new p_crux
  field in struct proc contains the "raw" child time usage statistics.
  ruadd() has been changed to handle adding the associated rusage_ext
  structures as well as the values in rusage.  Effectively, the values in
  rusage_ext replace the ru_utime and ru_stime values in struct rusage.  These
  two fields in struct rusage are no longer used in the kernel.
- calcru() has been split into a static worker function calcru1() that
  calculates appropriate timevals for user and system time as well as updating
  the rux_[isu]u fields of a passed in rusage_ext structure.  calcru() uses a
  copy of the process' p_rux structure to compute the timevals after updating
  the runtime appropriately if any of the threads in that process are
  currently executing.  It also now only locks sched_lock internally while
  doing the rux_runtime fixup.  calcru() now only requires the caller to
  hold the proc lock and calcru1() only requires the proc lock internally.
  calcru() also no longer allows callers to ask for an interrupt timeval
  since none of them actually did.
- calcru() now correctly handles threads executing on other CPUs.
- A new calccru() function computes the child system and user timevals by
  calling calcru1() on p_crux.  Note that this means that any code that wants
  child times must now call this function rather than reading from p_cru
  directly.  This function also requires the proc lock.
- This finishes the locking for rusage and friends so some of the Giant locks
  in exit1() and kern_wait() are now gone.
- The locking in ttyinfo() has been tweaked so that a shared lock of the
  proctree lock is used to protect the process group rather than the process
  group lock.  By holding this lock until the end of the function we now
  ensure that the process/thread that we pick to dump info about will no
  longer vanish while we are trying to output its info to the console.

Submitted by:	bde (mostly)
MFC after:	1 month
2004-10-05 18:51:11 +00:00
John Baldwin
4afec35169 Add a proc *p pointer for td->td_proc to make this code easier to read. 2004-09-24 20:26:15 +00:00
John Baldwin
7eaec467d8 Various small style fixes. 2004-09-22 15:24:33 +00:00
John Baldwin
b7e23e826c - Replace wait1() with a kern_wait() function that accepts the pid,
options, status pointer and rusage pointer as arguments.  It is up to
  the caller to copyout the status and rusage to userland if needed.  This
  lets us axe the 'compat' argument and hide all that functionality in
  owait(), by the way.  This also cleans up some locking in kern_wait()
  since it no longer has to drop locks around copyout() since all the
  copyout()'s are deferred.
- Convert owait(), wait4(), and the various ABI compat wait() syscalls to
  use kern_wait() rather than wait1() or wait4().  This removes a bit
  more stackgap usage.

Tested on:	i386
Compiled on:	i386, alpha, amd64
2004-03-17 20:00:00 +00:00
John Baldwin
91d5354a2c Locking for the per-process resource limits structure.
- struct plimit includes a mutex to protect a reference count.  The plimit
  structure is treated similarly to struct ucred in that is is always copy
  on write, so having a reference to a structure is sufficient to read from
  it without needing a further lock.
- The proc lock protects the p_limit pointer and must be held while reading
  limits from a process to keep the limit structure from changing out from
  under you while reading from it.
- Various global limits that are ints are not protected by a lock since
  int writes are atomic on all the archs we support and thus a lock
  wouldn't buy us anything.
- All accesses to individual resource limits from a process are abstracted
  behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return
  either an rlimit, or the current or max individual limit of the specified
  resource from a process.
- dosetrlimit() was renamed to kern_setrlimit() to match existing style of
  other similar syscall helper functions.
- The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit()
  (it didn't used the stackgap when it should have) but uses lim_rlimit()
  and kern_setrlimit() instead.
- The svr4 compat no longer uses the stackgap for resource limits calls,
  but uses lim_rlimit() and kern_setrlimit() instead.
- The ibcs2 compat no longer uses the stackgap for resource limits.  It
  also no longer uses the stackgap for accessing sysctl's for the
  ibcs2_sysconf() syscall but uses kernel_sysctl() instead.  As a result,
  ibcs2_sysconf() no longer needs Giant.
- The p_rlimit macro no longer exists.

Submitted by:	mtm (mostly, I only did a few cleanups and catchups)
Tested on:	i386
Compiled on:	alpha, amd64
2004-02-04 21:52:57 +00:00
Robert Watson
679365e7b9 Reduce gratuitous includes: don't include jail.h if it's not needed.
Presumably, at some point, you had to include jail.h if you included
proc.h, but that is no longer required.

Result of:	self injury involving adding something to struct prison
2004-01-21 17:10:47 +00:00
Alexander Kabaev
501f5ff123 Do not call VOP_GETATTR in getdents function. It does not serve any
purpose and the resulting vattr structure was ignored. In addition,
the VOP_GETATTR call was made with no vnode lock held, resulting in
vnode locking violation panic with debug kernels.

Reported by:	truckman

Approved by:	re@ (rwatson)
2003-11-19 04:12:32 +00:00
Tim J. Robbins
78b73f3e05 Fix some security bugs in the SVR4 emulator:
- Return NULL instead of returning memory outside of the stackgap
  in stackgap_alloc() (FreeBSD-SA-00:42.linux)
- Check for stackgap_alloc() returning NULL in svr4_emul_find(),
  and clean_pipe().
- Avoid integer overflow on large nfds argument in svr4_sys_poll()
- Reject negative nbytes argument in svr4_sys_getdents()
- Don't copy out past the end of the struct componentname
  pathname buffer in svr4_sys_resolvepath()
- Reject out-of-range signal numbers in svr4_sys_sigaction(),
  svr4_sys_signal(), and svr4_sys_kill().
- Don't malloc() user-specified lengths in show_ioc() and
  show_strbuf(), place arbitrary limits instead.
- Range-check lengths in si_listen(), ti_getinfo(), ti_bind(),
  svr4_do_putmsg(), svr4_do_getmsg(), svr4_stream_ti_ioctl().

Some fixes obtain from OpenBSD.
2003-10-20 10:38:48 +00:00
Poul-Henning Kamp
887dd1d910 Remove dangling extern reference to swap_pager_full 2003-08-06 18:40:49 +00:00
Poul-Henning Kamp
3b6d965263 Add a f_vnode field to struct file.
Several of the subtypes have an associated vnode which is used for
stuff like the f*() functions.

By giving the vnode a speparate field, a number of checks for the specific
subtype can be replaced simply with a check for f_vnode != NULL, and
we can later free f_data up to subtype specific use.

At this point in time, f_data still points to the vnode, so any code I
might have overlooked will still work.
2003-06-22 08:41:43 +00:00
David E. O'Brien
ad39da7821 Use __FBSDID(). 2003-06-10 21:44:29 +00:00
John Baldwin
90af4afacb - Merge struct procsig with struct sigacts.
- Move struct sigacts out of the u-area and malloc() it using the
  M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
  sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
  that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
  and thread_stopped() are now MP safe.

Reviewed by:	arch@
Approved by:	re (rwatson)
2003-05-13 20:36:02 +00:00
John Baldwin
4cac73c759 The proc lock is sufficient to test p_state against PRS_ZOMBIE, so don't
needlessly lock sched_lock.
2003-04-17 22:09:08 +00:00
Dag-Erling Smørgrav
521f364b80 More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9). 2003-03-02 16:54:40 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Matthew Dillon
de6edd1bd9 Add missing #include
Submitted by:	"Sam Leffler" <sam@errno.com>
2003-01-14 21:31:31 +00:00
Matthew Dillon
1c33791c80 Apply bandaid to bring svr4_sys_waitsys() in line with exit1(). This
routine really need to be gutted and merged with exit1().

Reviewed by:	jhb
2003-01-13 22:44:23 +00:00
Matthew Dillon
48e3128b34 Bow to the whining masses and change a union back into void *. Retain
removal of unnecessary casts and throw in some minor cleanups to see if
anyone complains, just for the hell of it.
2003-01-13 00:33:17 +00:00
Matthew Dillon
cd72f2180b Change struct file f_data to un_data, a union of the correct struct
pointer types, and remove a huge number of casts from code using it.

Change struct xfile xf_data to xun_data (ABI is still compatible).

If we need to add a #define for f_data and xf_data we can, but I don't
think it will be necessary.  There are no operational changes in this
commit.
2003-01-12 01:37:13 +00:00
Alfred Perlstein
d1e405c5ce SCARGS removal take II. 2002-12-14 01:56:26 +00:00
Alfred Perlstein
bc9e75d7ca Backout removal SCARGS, the code freeze is only "selectively" over. 2002-12-13 22:41:47 +00:00
Alfred Perlstein
0bbe7292e1 Remove SCARGS.
Reviewed by: md5
2002-12-13 22:27:25 +00:00
Jake Burkholder
d920042a00 Make this compile. 2002-08-31 19:33:59 +00:00
Matthew Dillon
cac4515267 Implement data, text, and vmem limit checking in the elf loader and svr4
compat code.  Clean up accounting for multiple segments.  Part 1/2.

Submitted by:	Andrey Alekseyev <uitm@zenon.net> (with some modifications)
MFC after:	3 days
2002-08-30 18:09:46 +00:00
Robert Watson
6835e7427c On MAC check failure for readdir, use 'goto out' to use the common exit
handling, rather than returning directly to prevent leaking of vnode
reference/lock.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-08-15 15:46:10 +00:00
Robert Watson
7d834ce78d Enforce MAC policies for the locally implemented vnode services in
SVR4 emulation relating to readdir() and fd_revoke().  All other
services appear to be implemented by simply wrapping existing
FreeBSD native system call implementations, so don't require local
instrumentation in the emulator module.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-08-12 01:42:21 +00:00
Julian Elischer
e602ba25fd Part 1 of KSE-III
The ability to schedule multiple threads per process
(one one cpu) by making ALL system calls optionally asynchronous.
to come: ia64 and power-pc patches, patches for gdb, test program (in tools)

Reviewed by:	Almost everyone who counts
	(at various times, peter, jhb, matt, alfred, mini, bernd,
	and a cast of thousands)

	NOTE: this is still Beta code, and contains lots of debugging stuff.
	expect slight instability in signals..
2002-06-29 17:26:22 +00:00
Bruce Evans
5cc5b1d344 Fixed a printf format error. It was old and should have been detected by
gcc-2.9x, but somehow wasn't fixed already.
2002-05-25 10:48:03 +00:00
John Baldwin
ea97757a54 - Lock proctree_lock instead of pgrpsess_lock.
- Exclusively lock proctree_lock while calling leavepgrp().
2002-04-16 17:04:21 +00:00
John Baldwin
44731cab3b Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API.  The entire API now consists of two functions
similar to the pre-KSE API.  The suser() function takes a thread pointer
as its only argument.  The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0.  The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on:	smp@
2002-04-01 21:31:13 +00:00
Alfred Perlstein
8899023f66 Make the reference counting of 'struct pargs' SMP safe.
There is still some locations where the PROC lock should be held
in order to prevent inconsistent views from outside (like the
proc->p_fd fix for kern/vfs_syscalls.c:checkdirs()) that can be
fixed later.

Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-03-27 21:36:18 +00:00
Jeff Roberson
851031501a Remove references to vm_zone.h and switch over to the new uma API. 2002-03-20 10:35:22 +00:00
Alfred Perlstein
b63dc6ad47 Remove __P. 2002-03-20 05:48:58 +00:00
John Baldwin
a854ed9893 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
Seigo Tanimura
f591779bb5 Lock struct pgrp, session and sigio.
New locks are:

- pgrpsess_lock which locks the whole pgrps and sessions,
- pg_mtx which protects the pgrp members, and
- s_mtx which protects the session members.

Please refer to sys/proc.h for the coverage of these locks.

Changes on the pgrp/session interface:

- pgfind() needs the pgrpsess_lock held.

- The caller of enterpgrp() is responsible to allocate a new pgrp and
  session.

- Call enterthispgrp() in order to enter an existing pgrp.

- pgsignal() requires a pgrp lock held.

Reviewed by:	jhb, alfred
Tested on:	cvsup.jp.FreeBSD.org
		(which is a quad-CPU machine running -current)
2002-02-23 11:12:57 +00:00
Seigo Tanimura
af46379bcb Lock the caller process if the pid passed to getsid() or getpgid()
equals to zero.
2002-01-19 06:34:58 +00:00
Seigo Tanimura
a6fccfb5f2 For getsid(), return the sid stored in struct session. This prevents
panic in case where a session has no session leader.

Inspired by:	Solaris 8
2002-01-19 05:31:51 +00:00
Alfred Perlstein
7e9aab77c5 make compile, add missing { and variable declaration. 2002-01-19 03:48:38 +00:00
Alfred Perlstein
426da3bcfb SMP Lock struct file, filedesc and the global file list.
Seigo Tanimura (tanimura) posted the initial delta.

I've polished it quite a bit reducing the need for locking and
adapting it for KSE.

Locks:

1 mutex in each filedesc
   protects all the fields.
   protects "struct file" initialization, while a struct file
     is being changed from &badfileops -> &pipeops or something
     the filedesc should be locked.

1 mutex in each struct file
   protects the refcount fields.
   doesn't protect anything else.
   the flags used for garbage collection have been moved to
     f_gcflag which was the FILLER short, this doesn't need
     locking because the garbage collection is a single threaded
     container.
  could likely be made to use a pool mutex.

1 sx lock for the global filelist.

struct file *	fhold(struct file *fp);
        /* increments reference count on a file */

struct file *	fhold_locked(struct file *fp);
        /* like fhold but expects file to locked */

struct file *	ffind_hold(struct thread *, int fd);
        /* finds the struct file in thread, adds one reference and
                returns it unlocked */

struct file *	ffind_lock(struct thread *, int fd);
        /* ffind_hold, but returns file locked */

I still have to smp-safe the fget cruft, I'll get to that asap.
2002-01-13 11:58:06 +00:00
Paul Saab
cbc89bfbfe Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader
tunable.

Reviewed by:	peter
MFC after:	2 weeks
2001-10-10 23:06:54 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Assar Westerlund
57762323e5 get rid of some printf and pointer type warnings 2001-07-22 00:12:22 +00:00
Robert Watson
b1fc0ec1a7 o Merge contents of struct pcred into struct ucred. Specifically, add the
real uid, saved uid, real gid, and saved gid to ucred, as well as the
  pcred->pc_uidinfo, which was associated with the real uid, only rename
  it to cr_ruidinfo so as not to conflict with cr_uidinfo, which
  corresponds to the effective uid.
o Remove p_cred from struct proc; add p_ucred to struct proc, replacing
  original macro that pointed.
  p->p_ucred to p->p_cred->pc_ucred.
o Universally update code so that it makes use of ucred instead of pcred,
  p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo,
  cr_{r,sv}{u,g}id instead of p_*, etc.
o Remove pcred0 and its initialization from init_main.c; initialize
  cr_ruidinfo there.
o Restruction many credential modification chunks to always crdup while
  we figure out locking and optimizations; generally speaking, this
  means moving to a structure like this:
        newcred = crdup(oldcred);
        ...
        p->p_ucred = newcred;
        crfree(oldcred);
  It's not race-free, but better than nothing.  There are also races
  in sys_process.c, all inter-process authorization, fork, exec, and
  exit.
o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid;
  remove comments indicating that the old arrangement was a problem.
o Restructure exec1() a little to use newcred/oldcred arrangement, and
  use improved uid management primitives.
o Clean up exit1() so as to do less work in credential cleanup due to
  pcred removal.
o Clean up fork1() so as to do less work in credential cleanup and
  allocation.
o Clean up ktrcanset() to take into account changes, and move to using
  suser_xxx() instead of performing a direct uid==0 comparision.
o Improve commenting in various kern_prot.c credential modification
  calls to better document current behavior.  In a couple of places,
  current behavior is a little questionable and we need to check
  POSIX.1 to make sure it's "right".  More commenting work still
  remains to be done.
o Update credential management calls, such as crfree(), to take into
  account new ruidinfo reference.
o Modify or add the following uid and gid helper routines:
      change_euid()
      change_egid()
      change_ruid()
      change_rgid()
      change_svuid()
      change_svgid()
  In each case, the call now acts on a credential not a process, and as
  such no longer requires more complicated process locking/etc.  They
  now assume the caller will do any necessary allocation of an
  exclusive credential reference.  Each is commented to document its
  reference requirements.
o CANSIGIO() is simplified to require only credentials, not processes
  and pcreds.
o Remove lots of (p_pcred==NULL) checks.
o Add an XXX to authorization code in nfs_lock.c, since it's
  questionable, and needs to be considered carefully.
o Simplify posix4 authorization code to require only credentials, not
  processes and pcreds.  Note that this authorization, as well as
  CANSIGIO(), needs to be updated to use the p_cansignal() and
  p_cansched() centralized authorization routines, as they currently
  do not take into account some desirable restrictions that are handled
  by the centralized routines, as well as being inconsistent with other
  similar authorization instances.
o Update libkvm to take these changes into account.

Obtained from:	TrustedBSD Project
Reviewed by:	green, bde, jhb, freebsd-arch, freebsd-audit
2001-05-25 16:59:11 +00:00