Commit Graph

227 Commits

Author SHA1 Message Date
Don Lewis
47934cef8f Split the mlock() kernel code into two parts, mlock(), which unpacks
the syscall arguments and does the suser() permission check, and
kern_mlock(), which does the resource limit checking and calls
vm_map_wire().  Split munlock() in a similar way.

Enable the RLIMIT_MEMLOCK checking code in kern_mlock().

Replace calls to vslock() and vsunlock() in the sysctl code with
calls to kern_mlock() and kern_munlock() so that the sysctl code
will obey the wired memory limits.

Nuke the vslock() and vsunlock() implementations, which are no
longer used.

Add a member to struct sysctl_req to track the amount of memory
that is wired to handle the request.

Modify sysctl_wire_old_buffer() to return an error if its call to
kern_mlock() fails.  Only wire the minimum of the length specified
in the sysctl request and the length specified in its argument list.
It is recommended that sysctl handlers that use sysctl_wire_old_buffer()
should specify reasonable estimates for the amount of data they
want to return so that only the minimum amount of memory is wired
no matter what length has been specified by the request.

Modify the callers of sysctl_wire_old_buffer() to look for the
error return.

Modify sysctl_old_user to obey the wired buffer length and clean up
its implementation.

Reviewed by:	bms
2004-02-26 00:27:04 +00:00
Poul-Henning Kamp
dc08ffec87 Device megapatch 4/6:
Introduce d_version field in struct cdevsw, this must always be
initialized to D_VERSION.

Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing
four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
2004-02-21 21:10:55 +00:00
Dag-Erling Smørgrav
44f4b94b38 Don't bother storing a result when all you need are the side effects. 2004-02-16 18:38:46 +00:00
David Malone
a82294d01c In fdcheckstd the descriptor table should never be shared, so just
KASSERT this rather than trying to deal with what happens when file
descriptors change out from under us.
2004-02-15 21:14:48 +00:00
John Baldwin
91d5354a2c Locking for the per-process resource limits structure.
- struct plimit includes a mutex to protect a reference count.  The plimit
  structure is treated similarly to struct ucred in that is is always copy
  on write, so having a reference to a structure is sufficient to read from
  it without needing a further lock.
- The proc lock protects the p_limit pointer and must be held while reading
  limits from a process to keep the limit structure from changing out from
  under you while reading from it.
- Various global limits that are ints are not protected by a lock since
  int writes are atomic on all the archs we support and thus a lock
  wouldn't buy us anything.
- All accesses to individual resource limits from a process are abstracted
  behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return
  either an rlimit, or the current or max individual limit of the specified
  resource from a process.
- dosetrlimit() was renamed to kern_setrlimit() to match existing style of
  other similar syscall helper functions.
- The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit()
  (it didn't used the stackgap when it should have) but uses lim_rlimit()
  and kern_setrlimit() instead.
- The svr4 compat no longer uses the stackgap for resource limits calls,
  but uses lim_rlimit() and kern_setrlimit() instead.
- The ibcs2 compat no longer uses the stackgap for resource limits.  It
  also no longer uses the stackgap for accessing sysctl's for the
  ibcs2_sysconf() syscall but uses kernel_sysctl() instead.  As a result,
  ibcs2_sysconf() no longer needs Giant.
- The p_rlimit macro no longer exists.

Submitted by:	mtm (mostly, I only did a few cleanups and catchups)
Tested on:	i386
Compiled on:	alpha, amd64
2004-02-04 21:52:57 +00:00
Dag-Erling Smørgrav
a6d4491c71 Restore correct semantics for F_DUPFD fcntl. This should fix the errors
people have been getting with configure scripts.
2004-01-17 00:59:04 +00:00
Dag-Erling Smørgrav
56a9fc0e93 WITNESS won't let us hold two filedesc locks at the same time, so juggle
fdp and newfdp around a bit.
2004-01-16 21:54:56 +00:00
Dag-Erling Smørgrav
ddce426f69 Remove two KASSERTs which were overly paranoid. 2004-01-16 08:45:56 +00:00
Dag-Erling Smørgrav
12d568c2b1 Take care to drop locks when calling malloc() 2004-01-15 18:50:11 +00:00
Dag-Erling Smørgrav
a2fe44e8cf New file descriptor allocation code, derived from similar code introduced
in OpenBSD by Niels Provos.  The patch introduces a bitmap of allocated
file descriptors which is used to locate available descriptors when a new
one is needed.  It also moves the task of growing the file descriptor table
out of fdalloc(), reducing complexity in both fdalloc() and do_dup().

Debts of gratitude are owed to tjr@ (who provided the original patch on
which this work is based), grog@ (for the gdb(4) man page) and rwatson@
(for assistance with pxeboot(8)).
2004-01-15 10:15:04 +00:00
Dag-Erling Smørgrav
c9de31f55f Mechanical whitespace cleanup. 2004-01-11 19:39:14 +00:00
Alan Cox
0e88a71798 Remove long dead code, specifically, code related to munmapfd().
(See also vm/vm_mmap.c revision 1.173.)
2004-01-11 06:59:21 +00:00
David Malone
70ad6c2190 Plug a leak of open files that happens when you exec a suid program
with one of std{in,out,err} open. This helps with the file descriptor
leaks reported on -current. This should probably be merged into 5.2.

Reviewed by:	ru
Tested by:	Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net>
2003-12-28 19:27:14 +00:00
David Malone
e1419c08e2 falloc allocates a file structure and adds it to the file descriptor
table, acquiring the necessary locks as it works. It usually returns
two references to the new descriptor: one in the descriptor table
and one via a pointer argument.

As falloc releases the FILEDESC lock before returning, there is a
potential for a process to close the reference in the file descriptor
table before falloc's caller gets to use the file. I don't think this
can happen in practice at the moment, because Giant indirectly protects
closes.

To stop the file being completly closed in this situation, this change
makes falloc set the refcount to two when both references are returned.
This makes life easier for several of falloc's callers, because the
first thing they previously did was grab an extra reference on the
file.

Reviewed by:	iedowse
Idea run past:	jhb
2003-10-19 20:41:07 +00:00
Robert Watson
c142b0fcfe Remove the global variable 'cmask', which was used to initialize the
fd_cmask field in the file descriptor structure for the first process
indirectly from CMASK, and when an fd structure is initialized before
being filled in, and instead just use CMASK.  This appears to be an
artifact left over from the initial integration of quotas into BSD.

Suggested by:	peter
2003-10-02 03:57:59 +00:00
David Malone
d2cce3d6e8 Do some minor Giant pushdown made possible by copyin, fget, fdrop,
malloc and mbuf allocation all not requiring Giant.

1) ostat, fstat and nfstat don't need Giant until they call fo_stat.
2) accept can copyin the address length without grabbing Giant.
3) sendit doesn't need Giant, so don't bother grabbing it until kern_sendit.
4) move Giant grabbing from each indivitual recv* syscall to recvit.
2003-08-04 21:28:57 +00:00
Alan Cox
fbe1bdddcc Revision 1.51 of vm/uma_core.c modified uma_large_free() to acquire Giant
when needed.  So, don't do it here.
2003-07-29 05:23:19 +00:00
Robert Watson
2e4a71cdb1 When exporting file descriptor data for threads invoking the
kern.file sysctl, don't return information about processes that
fail p_cansee(td, p).  This prevents sockstat and related
programs from seeing file descriptors owned by processes not
in the same jail as the thread, as well as having implications
for MAC, etc.

This is a partial solution: it permits an information leak about
the number of descriptors in the sizing calculation (but this is
not new information, you can also get it from kern.openfiles),
and doesn't attempt to mask file descriptors based on the
properties of the descriptor, only the process referencing it.
However, it provides most of what you want under most
circumstances, without complicating the locking.

PR:	54211
Based on a patch submitted by:	Pawel Jakub Dawidek <nick@garage.freebsd.pl>
2003-07-28 16:03:53 +00:00
Poul-Henning Kamp
7c89f162bc Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout. 2003-07-27 17:04:56 +00:00
Alan Cox
18e8d4e79c revision 1.51 of vm/uma_core.c modified uma_large_malloc() to acquire
Giant when needed.
2003-07-25 22:26:43 +00:00
Don Lewis
857d9c60d0 Extend the mutex pool implementation to permit the creation and use of
multiple mutex pools with different options and sizes.  Mutex pools can
be created with either the default sleep mutexes or with spin mutexes.
A dynamically created mutex pool can now be destroyed if it is no longer
needed.

Create two pools by default, one that matches the existing pool that
uses the MTX_NOWITNESS option that should be used for building higher
level locks, and a new pool with witness checking enabled.

Modify the users of the existing mutex pool to use the appropriate pool
in the new implementation.

Reviewed by:	jhb
2003-07-13 01:22:21 +00:00
Poul-Henning Kamp
1226914c17 Use the f_vnode field to tell which file descriptors have a vnode. 2003-07-04 12:20:27 +00:00
Poul-Henning Kamp
3b6d965263 Add a f_vnode field to struct file.
Several of the subtypes have an associated vnode which is used for
stuff like the f*() functions.

By giving the vnode a speparate field, a number of checks for the specific
subtype can be replaced simply with a check for f_vnode != NULL, and
we can later free f_data up to subtype specific use.

At this point in time, f_data still points to the vnode, so any code I
might have overlooked will still work.
2003-06-22 08:41:43 +00:00
Poul-Henning Kamp
eaaca5deee Don't (re)initialize f_gcflag to zero.
Move initialization of DTYPE_VNODE specific field f_seqcount into
the DTYPE_VNODE specific code.
2003-06-20 08:02:30 +00:00
Alfred Perlstein
bab88630ba Unlock the struct file lock before aquiring Giant, otherwise
we can deadlock because of lock order reversals.  This was not
caught because Witness ignores pool mutexes right now.

Diagnosis and help: truckman
Noticed by: pho
2003-06-19 18:13:07 +00:00
Mike Silbersack
4d7dfc31b8 Add a rate limited message reporting when kern.maxfiles is exceeded,
reporting who did it.

Also, fix a style bug introduced in the previous change.

MFC after:	1 week
2003-06-19 04:07:12 +00:00
Mike Silbersack
438f085b2f Reserve the last 5% of file descriptors for root use. This should allow
systems to fail more gracefully when a file descriptor exhaustion situation
occurs.

Original patch by:	David G. Andersen <dga@lcs.mit.edu>
PR:			45353
MFC after:		1 week
2003-06-18 18:57:58 +00:00
Poul-Henning Kamp
7c2d2efd58 Initialize struct fileops with C99 sparse initialization. 2003-06-18 18:16:40 +00:00
David E. O'Brien
677b542ea2 Use __FBSDID(). 2003-06-11 00:56:59 +00:00
Tor Egge
ad05d58087 Add tracking of process leaders sharing a file descriptor table and
allow a file descriptor table to be shared between multiple process
leaders.

PR:		50923
2003-06-02 16:05:32 +00:00
Poul-Henning Kamp
90471005e1 Remove needless return
Found by:       FlexeLint
2003-05-31 20:16:44 +00:00
Robert Watson
c1dca9ab07 VOP_PATHCONF() requires a vnode lock; this patch adds locking to
fpathconf(). The lock is held for direct calls to VOP_PATHCONF() in
pathconf() already.

Approved by:	re (jhb)
Pointed out by:	DEBUG_VFS_LOCKS
2003-05-15 21:13:08 +00:00
Mark Murray
51da11a27a Fix some easy, global, lint warnings. In most cases, this means
making some local variables static. In a couple of cases, this means
removing an unused variable.
2003-04-30 12:57:40 +00:00
Alexander Kabaev
104a9b7e3e Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on:	standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
2003-04-29 13:36:06 +00:00
Poul-Henning Kamp
7ac40f5f59 Gigacommit to improve device-driver source compatibility between
branches:

Initialize struct cdevsw using C99 sparse initializtion and remove
all initializations to default values.

This patch is automatically generated and has been tested by compiling
LINT with all the fields in struct cdevsw in reverse order on alpha,
sparc64 and i386.

Approved by:    re(scottl)
2003-03-03 12:15:54 +00:00
Tor Egge
c6faf3bf1d Remove unneeded code added in revision 1.188. 2003-03-01 17:18:28 +00:00
Scott Long
3303c14b57 Don't NULL out p_fd until after closefd() has been called. This isn't
totally correct, but it has caused breakage for too long.  I welcome
someone with more fd fu to fix it correctly.
2003-02-24 05:46:55 +00:00
Mike Makonnen
750a91d8b1 Remove a comment which hasn't been true since rev. 1.158
Approved by:	jhb, markm (mentor)(implicit)
2003-02-22 05:59:48 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Tor Egge
218a01e062 Avoid file lock leakage when linuxthreads port or rfork is used:
- Mark the process leader as having an advisory lock
  - Check if process leader is marked as having advisory lock when
    closing file
  - Check that file is still open after lock has been obtained
  - Don't allow file descriptor table sharing between processes
    with different leaders

PR:		10265
Reviewed by:	alfred
2003-02-15 22:43:05 +00:00
Alfred Perlstein
e7d6662f1b Do not allow kqueues to be passed via unix domain sockets. 2003-02-15 06:04:55 +00:00
Alfred Perlstein
edf6699ae6 Fix LOR with PROC/filedesc. Introduce fdesc_mtx that will be used as a
barrier between free'ing filedesc structures.  Basically if you want to
access another process's filedesc, you want to hold this mutex over the
entire operation.
2003-02-15 05:52:56 +00:00
Alfred Perlstein
42e1b74af2 Don't lock FILEDESC under PROC.
The locking here needs to be revisited, but this ought to get rid of the
LOR messages that people are complaining about for now.  I imagine either
I or someone else interested with smp will eventually clear this up.
2003-02-11 07:20:52 +00:00
Poul-Henning Kamp
4af0d0c21f NODEVFS cleanup: remove #ifdefs 2003-01-30 12:35:40 +00:00
Jeffrey Hsu
a448a15bc1 Add missing SMP file locks around read-modify-write operations on
the flag field.

Reviewed by:	rwatson
2003-01-21 20:20:48 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Poul-Henning Kamp
7e760e148a Originally when DEVFS was added, a global variable "devfs_present"
was used to control code which were conditional on DEVFS' precense
since this avoided the need for large-scale source pollution with
#include "opt_geom.h"

Now that we approach making DEVFS standard, replace these tests
with an #ifdef to facilitate mechanical removal once DEVFS becomes
non-optional.

No functional change by this commit.
2003-01-19 11:03:07 +00:00
Matthew Dillon
48e3128b34 Bow to the whining masses and change a union back into void *. Retain
removal of unnecessary casts and throw in some minor cleanups to see if
anyone complains, just for the hell of it.
2003-01-13 00:33:17 +00:00
Matthew Dillon
cd72f2180b Change struct file f_data to un_data, a union of the correct struct
pointer types, and remove a huge number of casts from code using it.

Change struct xfile xf_data to xun_data (ABI is still compatible).

If we need to add a #define for f_data and xf_data we can, but I don't
think it will be necessary.  There are no operational changes in this
commit.
2003-01-12 01:37:13 +00:00
Jacques Vidrine
f0c093284d Correct file descriptor leaks in lseek and do_dup.
The leak in lseek was introduced in vfs_syscalls.c revision 1.218.
The leak in do_dup was introduced in kern_descrip.c revision 1.158.

Submitted by:	iedowse
2003-01-06 13:19:05 +00:00