Commit Graph

176 Commits

Author SHA1 Message Date
Konstantin Belousov
75d633cbf6 Move SysV IPC freebsd32 compat shims from freebsd32_misc.c to corresponding
sysv_{msg,sem,shm}.c files.

Mark SysV IPC freebsd32 syscalls as NOSTD and add required
SYSCALL_INIT_HELPER/SYSCALL32_INIT_HELPERs to provide auto
register/unregister on module load.

This makes COMPAT_FREEBSD32 functional with SysV IPC compiled and loaded
as modules.

Reviewed by:	jhb
MFC after:	2 weeks
2010-03-19 11:04:42 +00:00
Ruslan Ermilov
4d9d1e823c - Rename tunable kern.ipc.shmmaxpgs to kern.ipc.shmall.
- Explain the fuss when initializing shmmax.

PR:	75542 (mistakenly closed instead of PR 75541)
2009-10-24 19:00:58 +00:00
John Baldwin
af88b2c276 Use the correct cast for the arguments passed to freebsd_shmctl() in
oshmctl().

Submitted by:	kib
2009-06-25 17:11:27 +00:00
John Baldwin
ca99828420 Tweak the oshmctl() compile fix: convert the K&R definition to ANSI. 2009-06-25 13:36:57 +00:00
Robert Watson
893ef4d21c oshmctl() now requires a sysv_shm.c-local function prototype. 2009-06-25 07:16:10 +00:00
John Baldwin
b648d4806b Change the ABI of some of the structures used by the SYSV IPC API:
- The uid/cuid members of struct ipc_perm are now uid_t instead of unsigned
  short.
- The gid/cgid members of struct ipc_perm are now gid_t instead of unsigned
  short.
- The mode member of struct ipc_perm is now mode_t instead of unsigned short
  (this is merely a style bug).
- The rather dubious padding fields for ABI compat with SV/I386 have been
  removed from struct msqid_ds and struct semid_ds.
- The shm_segsz member of struct shmid_ds is now a size_t instead of an
  int.  This removes the need for the shm_bsegsz member in struct
  shmid_kernel and should allow for complete support of SYSV SHM regions
  >= 2GB.
- The shm_nattch member of struct shmid_ds is now an int instead of a
  short.
- The shm_internal member of struct shmid_ds is now gone.  The internal
  VM object pointer for SHM regions has been moved into struct
  shmid_kernel.
- The existing __semctl(), msgctl(), and shmctl() system call entries are
  now marked COMPAT7 and new versions of those system calls which support
  the new ABI are now present.
- The new system calls are assigned to the FBSD-1.1 version in libc.  The
  FBSD-1.0 symbols in libc now refer to the old COMPAT7 system calls.
- A simplistic framework for tagging system calls with compatibility
  symbol versions has been added to libc.  Version tags are added to
  system calls by adding an appropriate __sym_compat() entry to
  src/lib/libc/incldue/compat.h. [1]

PR:		kern/16195 kern/113218 bin/129855
Reviewed by:	arch@, rwatson
Discussed with:	kan, kib [1]
2009-06-24 21:10:52 +00:00
John Baldwin
45f48220bb Deprecate the msgsys(), semsys(), and shmsys() system calls by moving
them under COMPAT_FREEBSD[4567].  Starting with FreeBSD 5.0 the SYSV IPC
API was implemented via direct system calls (e.g. msgctl(), msgget(), etc.)
rather than indirecting through the var-args *sys() system calls.  The
shmsys() system call was already effectively deprecated for all but
COMPAT_FREEBSD4 already as its implementation for the !COMPAT_FREEBSD4 case
was to simply invoke nosys().
2009-06-24 20:01:13 +00:00
John Baldwin
71361470b1 - Move syscall function argument structure types to be just above the
relevenat system call function.
- Whitespace fixes.
2009-06-24 13:35:38 +00:00
Konstantin Belousov
3364c323e6 Implement global and per-uid accounting of the anonymous memory. Add
rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved
for the uid.

The accounting information (charge) is associated with either map entry,
or vm object backing the entry, assuming the object is the first one
in the shadow chain and entry does not require COW. Charge is moved
from entry to object on allocation of the object, e.g. during the mmap,
assuming the object is allocated, or on the first page fault on the
entry. It moves back to the entry on forks due to COW setup.

The per-entry granularity of accounting makes the charge process fair
for processes that change uid during lifetime, and decrements charge
for proper uid when region is unmapped.

The interface of vm_pager_allocate(9) is extended by adding struct ucred *,
that is used to charge appropriate uid when allocation if performed by
kernel, e.g. md(4).

Several syscalls, among them is fork(2), may now return ENOMEM when
global or per-uid limits are enforced.

In collaboration with:	pho
Reviewed by:	alc
Approved by:	re (kensmith)
2009-06-23 20:45:22 +00:00
Alan Cox
cbe9534615 Eliminate an instance of VM_PROT_READ_IS_EXEC that I overlooked in r190705. 2009-06-09 17:18:41 +00:00
Robert Watson
bcf11e8d00 Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.

Discussed with:	pjd
2009-06-05 14:55:22 +00:00
Jamie Gritton
0304c73163 Add hierarchical jails. A jail may further virtualize its environment
by creating a child jail, which is visible to that jail and to any
parent jails.  Child jails may be restricted more than their parents,
but never less.  Jail names reflect this hierarchy, being MIB-style
dot-separated strings.

Every thread now points to a jail, the default being prison0, which
contains information about the physical system.  Prison0's root
directory is the same as rootvnode; its hostname is the same as the
global hostname, and its securelevel replaces the global securelevel.
Note that the variable "securelevel" has actually gone away, which
should not cause any problems for code that properly uses
securelevel_gt() and securelevel_ge().

Some jail-related permissions that were kept in global variables and
set via sysctls are now per-jail settings.  The sysctls still exist for
backward compatibility, used only by the now-deprecated jail(2) system
call.

Approved by:	bz (mentor)
2009-05-27 14:11:23 +00:00
Konstantin Belousov
45329b60da Systematically use vm_size_t to specify the size of the segment for VM KPI.
Do not overload the local variable size in kern_shmat() due to vm_size_t
change.
Fix style bug by adding explicit comparision with 0.

Discussed with:	bde
MFC after:	1 week
2009-03-05 11:45:42 +00:00
Konstantin Belousov
65067cc8b0 Correct types of variables used to track amount of allocated SysV shared
memory from int to size_t. Implement a workaround for current ABI not
allowing to properly save size for and report more then 2Gb sized segment
of shared memory.

This makes it possible to use > 2 Gb shared memory segments on 64bit
architectures. Please note the new BUGS section in shmctl(2) and
UPDATING note for limitations of this temporal solution.

Reviewed by:	csjp
Tested by:	Nikolay Dzham <i levsha org ua>
MFC after:	2 weeks
2009-03-02 18:53:30 +00:00
Christian S.J. Peron
4f18813f1f Make sure we restrict Linux only IPC calls from being executed
through the FreeBSD ABI.  IPC_INFO, SHM_INFO, SHM_STAT were added
specifically for Linux binary support.  They are not documented
as being a part of the FreeBSD ABI, also, the structures necessary
for them have been hidden away from the users for a long time.

Also, the Linux ABI layer uses it's own structures to populate the
responses back to the user to ensure that the ABI is consistent.

I think there is a bit more separation work that needs to happen.

Reviewed by:	jhb
Discussed with:	jhb
Discussed on:	freebsd-arch@ (very briefly)
MFC after:	1 month
2008-02-12 20:55:03 +00:00
Robert Watson
30d239bc4c Merge first in a series of TrustedBSD MAC Framework KPI changes
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:

  mac_<object>_<method/action>
  mac_<object>_check_<method/action>

The previous naming scheme was inconsistent and mostly
reversed from the new scheme.  Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier.  Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods.  Also simplify, slightly,
some entry point names.

All MAC policy modules will need to be recompiled, and modules
not updates as part of this commit will need to be modified to
conform to the new KPI.

Sponsored by:	SPARTA (original patches against Mac OS X)
Obtained from:	TrustedBSD Project, Apple Computer
2007-10-24 19:04:04 +00:00
Robert Watson
873fbcd776 Further system call comment cleanup:
- Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde)
- Remove extra blank lines in some cases.
- Add extra blank lines in some cases.
- Remove no-op comments consisting solely of the function name, the word
  "syscall", or the system call name.
- Add punctuation.
- Re-wrap some comments.
2007-03-05 13:10:58 +00:00
Robert Watson
0c14ff0eb5 Remove 'MPSAFE' annotations from the comments above most system calls: all
system calls now enter without Giant held, and then in some cases, acquire
Giant explicitly.

Remove a number of other MPSAFE annotations in the credential code and
tweak one or two other adjacent comments.
2007-03-04 22:36:48 +00:00
Robert Watson
3d50b06b8e Remove call to ipcperm() in shmget_existing(). The flags argument is
ignored on other systems I investigated when accessing an existing
memory segment rather than creating a new one.  This call to ipcperm()
is the only one to pass in a complete mode flag to the permission
checks rather than a simple access request mask, and caused problems
for the revised ipcperm() based on the priv(9) interface, which can
now be restored.

PR:	106078
2007-02-19 22:56:10 +00:00
Robert Watson
aed5570872 Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h.  sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA
2006-10-22 11:52:19 +00:00
Robert Watson
f50c4fd817 Remove MAC_DEBUG + MPRINTF debugging from System V IPC. This no longer
appears to be serving a useful purpose, as it was used during initial
development of MAC support for System V IPC.

MFC after:	1 month
Obtained from:	TrustedBSD Project
Suggested by:	Christopher dot Vance at SPARTA dot com
2006-09-20 13:40:00 +00:00
Robert Watson
b37ffd3189 Move some functions and definitions from uipc_socket2.c to uipc_socket.c:
- Move sonewconn(), which creates new sockets for incoming connections on
  listen sockets, so that all socket allocate code is together in
  uipc_socket.c.

- Move 'maxsockets' and associated sysctls to uipc_socket.c with the
  socket allocation code.

- Move kern.ipc sysctl node to uipc_socket.c, add a SYSCTL_DECL() for it
  to sysctl.h and remove lots of scattered implementations in various
  IPC modules.

- Sort sodealloc() after soalloc() in uipc_socket.c for dependency order
  reasons.  Statisticize soalloc() and sodealloc() as they are now
  required only in uipc_socket.c, and are internal to the socket
  implementation.

After this change, socket allocation and deallocation is entirely
centralized in one file, and uipc_socket2.c consists entirely of socket
buffer manipulation and default protocol switch functions.

MFC after:	1 month
2006-06-10 14:34:07 +00:00
Paul Saab
fbb273bc05 Properly support for FreeBSD 4 32bit System V shared memory.
Submitted by:	peter
Obtained from:	Yahoo!
MFC after:	3 weeks
2006-03-30 07:42:32 +00:00
Robert Watson
7723d5ed12 Re-order MAC and DAC checks in shmget() in order to give precedence to
the MAC result, as well as avoid losing the DAC check result when MAC
is enabled.

MFC after:	3 days
Reported by:	Patrick LeBlanc <Patrick dot LeBlanc at sparta dot com>
2005-10-04 16:40:20 +00:00
Christian S.J. Peron
9baea4b4b4 Change the data type of the upper shared memory limits from a signed
integer to an unsigned long. This lifts variables like the maximum
number of pages available for shared memory from 2^31 to 2^32 on 32
bit architectures, and from 2^31 to 2^64 on 64 bit architectures.

It should be noted that this changes breaks ABI on 64 bit architectures
because the size of the shmmax, shmmin, shmmni, shmseg and shmall members
of the shminfo structure has changed.

Silence on:	current@
2005-08-06 07:20:18 +00:00
John Baldwin
a4c24c66bc Actually use the iterating variable in the for loop when trying to avoid
overflow.

Reported by:	Vladislav Shabanov vs at rambler-co dot ru
MFC after:	1 week
Glanced at:	alfred
2005-05-12 20:04:48 +00:00
Christian S.J. Peron
84f85aedef Add much needed descriptions for a number of the IPC related sysctl OIDs.
This information will be very useful for people who are tuning applications
which have a dependence on IPC mechanisms.

The following OIDs were documented:

Message queues:
 kern.ipc.msgmax
 kern.ipc.msgmni
 kern.ipc.msgmnb
 kern.ipc.msgtlq
 kern.ipc.msgssz
 kern.ipc.msgseg

Semaphores:
 kern.ipc.semmap
 kern.ipc.semmni
 kern.ipc.semmns
 kern.ipc.semmnu
 kern.ipc.semmsl
 kern.ipc.semopm
 kern.ipc.semume
 kern.ipc.semusz
 kern.ipc.semvmx
 kern.ipc.semaem

Shared memory:
 kern.ipc.shmmax
 kern.ipc.shmmin
 kern.ipc.shmmni
 kern.ipc.shmseg
 kern.ipc.shmall
 kern.ipc.shm_use_phys
 kern.ipc.shm_allow_removed
 kern.ipc.shmsegs

These new descriptions can be viewed using sysctl -d

PR:		kern/65219
Submitted by:	Dan Nelson <dnelson at allantgroup dot com> (modified)
No objections:	developers@
Descriptions reviewed by: gnn
MFC after:	1 week
2005-02-12 01:22:39 +00:00
Robert Watson
14cedfc842 Invoke label initialization, creation, cleanup, and tear-down MAC
Framework entry points for System V IPC shared memory.

Submitted by:	Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net>
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, SPAWAR, McAfee Research
2005-01-22 19:10:25 +00:00
Warner Losh
9454b2d864 /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 23:35:40 +00:00
Robert Watson
921d05b90d Second of several commits to allow kernel System V IPC data structures
to be modified and extended without breaking the user space ABI:

Use _kernel variants on _ds structures for System V sempahores, message
queues, and shared memory.  When interfacing with userspace, export
only the _ds subsets of the _kernel data structures.  A lot of search
and replace.

Define the message structure in the _KERNEL portion of msg.h so that it
can be used by other kernel consumers, but not exposed to user space.

Submitted by:	Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net>
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, SPAWAR, McAfee Research
2004-11-12 13:23:47 +00:00
Alan Cox
94ddc7076d Push Giant deep into vm_forkproc(), acquiring it only if the process has
mapped System V shared memory segments (see shmfork_myhook()) or requires
the allocation of an ldt (see vm_fault_wire()).
2004-09-03 05:11:32 +00:00
Alexander Kabaev
00fbcda80d Avoid casts as lvalues. 2004-07-28 06:42:41 +00:00
Alan Cox
1a276a3f91 - Use atomic ops for updating the vmspace's refcnt and exitingcnt.
- Push down Giant into shmexit().  (Giant is acquired only if the vmspace
   contains shm segments.)
 - Eliminate the acquisition of Giant from proc_rwmem().
 - Reduce the scope of Giant in exit1(), uncovering the destruction of the
   address space.
2004-07-27 03:53:41 +00:00
Alan Cox
0049f8b27b Eliminate struct shm_handle. It is an unnecessary level of indirection to
a vm_object.
2004-07-09 05:28:38 +00:00
Tim J. Robbins
68ba7a1d57 When no fixed address is given in a shmat() request, pass a hint address
to vm_map_find() that is less likely to be outside of addressable memory
for 32-bit processes: just past the end of the largest possible heap.
This is the same hint that mmap() uses.
2004-06-19 14:46:13 +00:00
Poul-Henning Kamp
77409fe148 Add missing #include <sys/module.h> 2004-05-30 20:34:58 +00:00
Jacques Vidrine
b00a3c85da Correct a reference counting bug in shmat(2). If vm_map_find(9)
failed, the reference count for the virtual memory object referenced
by the specified shared memory segment would have been erroneously
incremented.

Reported by:	Joost Pol <joost@pine.nl>
2004-02-05 18:00:35 +00:00
Robert Watson
a2f88a8b7c Slight whitespace consistency improvement:
Trim trailing whitespace.
  Remove unmatched " " before ")".
2003-11-07 04:47:14 +00:00
Max Khon
2332251c6a Back out the following revisions:
1.36      +73 -60    src/sys/compat/linux/linux_ipc.c
1.83      +102 -48   src/sys/kern/sysv_shm.c
1.8       +4 -0      src/sys/sys/syscallsubr.h

That change was intended to support vmware3, but
wantrem parameter is useless because vmware3 uses SYSV shared memory
to talk with X server and X server is native application.
The patch worked because check for wantrem was not valid
(wantrem and SHMSEG_REMOVED was never checked for SHMSEG_ALLOCATED segments).

Add kern.ipc.shm_allow_removed (integer, rw) sysctl (default 0) which when set
to 1 allows to return removed segments in
shm_find_segment_by_shmid() and shm_find_segment_by_shmidx().

MFC after:	1 week
2003-11-05 01:53:10 +00:00
Mike Silbersack
184dcdc7c8 Change all SYSCTLS which are readonly and have a related TUNABLE
from CTLFLAG_RD to CTLFLAG_RDTUN so that sysctl(8) can provide
more useful error messages.
2003-10-21 18:28:36 +00:00
Jacques Vidrine
01b9dc96e3 Update some argument-documenting comments to match reality.
Add an explicit range check to those same arguments to reduce risk of
cardiac arrest in future code readers.
2003-08-07 16:42:27 +00:00
John Baldwin
8b149b5131 Consistently use the BSD u_int and u_short instead of the SYSV uint and
ushort.  In most of these files, there was a mixture of both styles and
this change just makes them self-consistent.

Requested by:	bde (kern_ktrace.c)
2003-08-07 15:04:27 +00:00
David E. O'Brien
677b542ea2 Use __FBSDID(). 2003-06-11 00:56:59 +00:00
Martin Blapp
f130dcf22a Change the semantics of sysv shm emulation to take a additional
argument to the functions shm{at,ctl}1 and shm_find_segment_by_shmid{x}.
The BSD semantics didn't allow the usage of shared segment after
being marked for removal through IPC_RMID.

The patch involves the following functions:
  - shmat
  - shmctl
  - shm_find_segment_by_shmid
  - shm_find_segment_by_shmidx
  - linux_shmat
  - linux_shmctl

Submitted by:	Orlando Bassotto <orlando.bassotto@ieo-research.it>
Reviewed by:	marcel
2003-05-05 09:22:58 +00:00
Alan Cox
b077a36297 Lock some manipulations of the vm object's flags. 2003-04-13 19:36:18 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Alfred Perlstein
9d4156aed3 Fix logic in loop so it actually executes.
Pointed out by: fjoe
2003-02-16 16:12:10 +00:00
Alfred Perlstein
5015c68a3c prevent overflow in shminfo.shmmax 2003-02-16 06:08:55 +00:00
Alfred Perlstein
e1d7d0bb60 Bring shm functions closer the the opengroup standards.
PR: 47469
Submitted by: Craig Rodrigues <rodrigc@attbi.com>
2003-01-25 21:33:05 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Matthew Dillon
3db161e079 It is possible for an active aio to prevent shared memory from being
dereferenced when a process exits due to the vmspace ref-count being
bumped.  Change shmexit() and shmexit_myhook() to take a vmspace instead
of a process and call it in vmspace_dofree().  This way if it is missed
in exit1()'s early-resource-free it will still be caught when the zombie is
reaped.

Also fix a potential race in shmexit_myhook() by NULLing out
vmspace->vm_shm prior to calling shm_delete_mapping() and free().

MFC after:	7 days
2003-01-13 23:04:32 +00:00
Alan Cox
49bf855d20 Lock the vm object when performing back-to-back vm_object_clear_flag() and
vm_object_set_flag().
2003-01-02 18:32:13 +00:00
Alfred Perlstein
b618bb96f0 return foo -> return (foo) 2002-08-15 02:10:12 +00:00
Alfred Perlstein
8209f090f1 Change struct vmspace->vm_shm from void * to struct shmmap_state *, this
removes the need for casts in several cases.
2002-07-22 16:22:27 +00:00
Alfred Perlstein
2cc593fd8e Remove caddr_t. 2002-07-22 16:12:55 +00:00
Alfred Perlstein
4d77a549fe Remove __P. 2002-03-19 21:25:46 +00:00
John Baldwin
c6f55f33ea - Use td_ucred for jail checks.
- Move jail checks and some other checks involving constants and stack
  variables out from under Giant.  This isn't perfectly safe atm because
  jail_sysvipc_allowed is read w/o a lock meaning that its value could be
  stale.  This global variable will soon become a per-jail flag, however,
  at which time it will either not need a lock or will use the prison lock.
2002-03-05 18:57:36 +00:00
John Baldwin
a854ed9893 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
Alfred Perlstein
21d56e9c33 Make AIO a loadable module.
Remove the explicit call to aio_proc_rundown() from exit1(), instead AIO
will use at_exit(9).

Add functions at_exec(9), rm_at_exec(9) which function nearly the
same as at_exec(9) and rm_at_exec(9), these functions are called
on behalf of modules at the time of execve(2) after the image
activator has run.

Use a modified version of tegge's suggestion via at_exec(9) to close
an exploitable race in AIO.

Fix SYSCALL_MODULE_HELPER such that it's archetecuterally neutral,
the problem was that one had to pass it a paramater indicating the
number of arguments which were actually the number of "int".  Fix
it by using an inline version of the AS macro against the syscall
arguments.  (AS should be available globally but we'll get to that
later.)

Add a primative system for dynamically adding kqueue ops, it's really
not as sophisticated as it should be, but I'll discuss with jlemon when
he's around.
2001-12-29 07:13:47 +00:00
Michael Reifenberger
491dec936c Introduce [IPC|SHM]_[INFO|STAT] to shmctl to make
`/compat/linux/usr/bin/ipcs -m` happy.
2001-10-28 09:29:10 +00:00
Paul Saab
cbc89bfbfe Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader
tunable.

Reviewed by:	peter
MFC after:	2 weeks
2001-10-10 23:06:54 +00:00
Michael Reifenberger
b3a4bc4247 PR: kern/29698 (part)
Reviewed by:	audit
Add tunables for the sem* and shm* syscontrols for tuning on boottime
until they become dynamic.
SAP R/3 doesn't like the compiled in defaults.
2001-09-13 20:20:09 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Matthew Dillon
b6a4b4f9ae Giant Pushdown: sysv shm, sem, and msg calls. 2001-08-31 00:02:18 +00:00
Matthew Dillon
0cddd8f023 With Alfred's permission, remove vm_mtx in favor of a fine-grained approach
(this commit is just the first stage).  Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.
2001-07-04 16:20:28 +00:00
Dima Dorfman
a723c4e173 Export via sysctl:
* all members of msginfo from sysv_msg.c;
  * msqids from sysv_msg.c;
  * sema from sysv_sem.c; and
  * shmsegs from sysv_shm.c;

These will be used by ipcs(1) in non-kvm mode.

Reviewed by:	tmm
2001-05-30 03:28:59 +00:00
Dima Dorfman
028f979d1d Correct style bugs with regards to long lines and comments.
Reviewed by:	bde
2001-05-23 23:38:05 +00:00
Dima Dorfman
a8dbafbe87 Correct the vm_mtx handling; specifically, don't acquire it in
shm_deallocate_segment because shmexit_myhook calls it, and the latter
should always be called with it already held.

Submitted by:	dwmalone, dd
Approved by:	alfred
2001-05-22 03:56:26 +00:00
John Baldwin
9dceb26b23 Sort includes. 2001-05-21 18:52:02 +00:00
Alfred Perlstein
67d1f21cbe Aquire vm mutex when releasing sysv shm segments.
Obtained from: Dima Dorfman <dima@unixfreak.org>
2001-05-20 20:37:47 +00:00
Alfred Perlstein
2395531439 Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level
vm operations.

faults can not be taken without holding Giant.

Memory subsystems can now call the base page allocators safely.

Almost all atomic ops were removed as they are covered under the
vm mutex.

Alpha and ia64 now need to catch up to i386's trap handlers.

FFS and NFS have been tested, other filesystems will need minor
changes (grabbing the vm lock when twiddling page properties).

Reviewed (partially) by: jake, jhb
2001-05-19 01:28:09 +00:00
Matthew Dillon
1766b2e5fa Raise the SysV shared memory defaults to more reasonable values.
Mainly increases the shared memory limit from 4M to 32M (approx).
Many more programs these days use SysV shared memory, especially X-related
programs.
2001-05-04 18:43:19 +00:00
Mark Murray
fb919e4d5a Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by:	bde (with reservations)
2001-05-01 08:13:21 +00:00
Robert Watson
91421ba234 o Move per-process jail pointer (p->pr_prison) to inside of the subject
credential structure, ucred (cr->cr_prison).
o Allow jail inheritence to be a function of credential inheritence.
o Abstract prison structure reference counting behind pr_hold() and
  pr_free(), invoked by the similarly named credential reference
  management functions, removing this code from per-ABI fork/exit code.
o Modify various jail() functions to use struct ucred arguments instead
  of struct proc arguments.
o Introduce jailed() function to determine if a credential is jailed,
  rather than directly checking pointers all over the place.
o Convert PRISON_CHECK() macro to prison_check() function.
o Move jail() function prototypes to jail.h.
o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the
  flag in the process flags field itself.
o Eliminate that "const" qualifier from suser/p_can/etc to reflect
  mutex use.

Notes:

o Some further cleanup of the linux/jail code is still required.
o It's now possible to consider resolving some of the process vs
  credential based permission checking confusion in the socket code.
o Mutex protection of struct prison is still not present, and is
  required to protect the reference count plus some fields in the
  structure.

Reviewed by:	freebsd-arch
Obtained from:	TrustedBSD Project
2001-02-21 06:39:57 +00:00
Brian Feldman
a02f31364e It is _DEFINITELY_ not okay to change shmseg on a running system. 2001-02-04 20:10:32 +00:00
Dag-Erling Smørgrav
faa784b70c Use predictable internal names for the sysvipc modules, so we have a
chance of getting dependencies working.
2001-01-14 18:04:30 +00:00
Alfred Perlstein
78525ce318 sysvipc loadable.
new syscall entry lkmressys - "reserved loadable syscall"

Make syscall_register allow overwriting of such entries (lkmressys).
2000-12-01 08:57:47 +00:00
Robert Watson
cb1f0db9db o Deny access to System V IPC from within jail by default, as in the
current implementation, jail neither virtualizes the Sys V IPC namespace,
  nor provides inter-jail protections on IPC objects.
o Support for System V IPC can be enabled by setting jail.sysvipc_allowed=1
  using sysctl.
o This is not the "real fix" which involves virtualizing the System V
  IPC namespace, but prevents processes within jail from influencing those
  outside of jail when not approved by the administrator.

Reported by:	Paulo Fragoso <paulo@nlink.com.br>
2000-10-31 01:34:00 +00:00
Matthew Dillon
8b03c8ed5e This is a cleanup patch to Peter's new OBJT_PHYS VM object type
and sysv shared memory support for it.  It implements a new
    PG_UNMANAGED flag that has slightly different characteristics
    from PG_FICTICIOUS.

    A new sysctl, kern.ipc.shm_use_phys has been added to enable the
    use of physically-backed sysv shared memory rather then swap-backed.
    Physically backed shm segments are not tracked with PV entries,
    allowing programs which use a large shm segment as a rendezvous
    point to operate without eating an insane amount of KVM in the
    PV entry management.  Read: Oracle.

    Peter's OBJT_PHYS object will also allow us to eventually implement
    page-table sharing and/or 4MB physical page support for such segments.
    We're half way there.
2000-05-29 22:40:54 +00:00
Peter Wemm
24488c7498 Provide a temporary undocumented option: SHM_PHYS_BACKED. This will
become sysctl and/or flags controlled later.  It's mainly here for an
easy place to test the physical memory backed objects.
2000-05-21 13:52:13 +00:00
Peter Wemm
255108f385 Make sysv-style shared memory tuneable params fully runtime adjustable
via sysctl.  It's done pretty simply but it should be quite adequate.
Also move SHMMAXPGS from $machine/include/vmparam.h as the comments that
went with it were wrong... we don't allocate KVM space for the pages so
that comment is bogus..  The only practical limit is how much physical
ram you want to lock up as this stuff isn't paged out or swap backed.
2000-03-30 07:17:05 +00:00
Alan Cox
af25d10c91 shmat: If VM_PROT_READ_IS_EXEC is defined and prot includes VM_PROT_READ,
VM_PROT_EXECUTE must be added to prot before calling vm_map_find.

Without this change, an mprotect on a shmat'ed region fails (when
it shouldn't).  This bug was reported Feb 28 by Brooks Davis
<brooks@one-eyed-alien.net> on -hackers.

Reviewed by:	bde
Approved by:	jkh
2000-03-10 09:11:24 +00:00
Poul-Henning Kamp
923502ff91 useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>.  This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.
1999-10-29 18:09:36 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Alan Cox
dc92aa57fd For consistency with other implementations, check for the existence
of the segment before checking its permissions.

PR:		kern/11999
Submitted by:	Brooks Davis <brooks@one-eyed-alien.net>
1999-06-19 23:53:13 +00:00
Poul-Henning Kamp
1c308b817a Change suser_xxx() to suser() where it applies. 1999-04-27 12:21:16 +00:00
Matthew Dillon
1c7c3c6a86 This is a rather large commit that encompasses the new swapper,
changes to the VM system to support the new swapper, VM bug
    fixes, several VM optimizations, and some additional revamping of the
    VM code.  The specific bug fixes will be documented with additional
    forced commits.  This commit is somewhat rough in regards to code
    cleanup issues.

Reviewed by:	"John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
1999-01-21 08:29:12 +00:00
David Greenman
6cde7a165f Fixed two potentially serious classes of bugs:
1) The vnode pager wasn't properly tracking the file size due to
   "size" being page rounded in some cases and not in others.
   This sometimes resulted in corrupted files. First noticed by
   Terry Lambert.
   Fixed by changing the "size" pager_alloc parameter to be a 64bit
   byte value (as opposed to a 32bit page index) and changing the
   pagers and their callers to deal with this properly.
2) Fixed a bogus type cast in round_page() and trunc_page() that
   caused some 64bit offsets and sizes to be scrambled. Removing
   the cast required adding casts at a few dozen callers.
   There may be problems with other bogus casts in close-by
   macros. A quick check seemed to indicate that those were okay,
   however.
1998-10-13 08:24:45 +00:00
Doug Rabson
069e9bc1b4 Change various syscalls to use size_t arguments instead of u_int.
Add some overflow checks to read/write (from bde).

Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags
and vm_object::paging_in_progress to use operations which are not
interruptable.

Reviewed by: Bruce Evans <bde@zeta.org.au>
1998-08-24 08:39:39 +00:00
John Dyson
96fb8cf258 Fix the shm panic. I mistakenly used the shadow_count to keep the object
from being split, and instead added an OBJ_NOSPLIT.
1998-05-04 17:12:53 +00:00
John Dyson
cbd8ec0902 Work around some VM bugs, the worst being an overly aggressive
swap space free calculation.  More complete fixes will be forthcoming,
in a week.
1998-05-04 03:01:44 +00:00
Poul-Henning Kamp
227ee8a188 Eradicate the variable "time" from the kernel, using various measures.
"time" wasn't a atomic variable, so splfoo() protection were needed
around any access to it, unless you just wanted the seconds part.

Most uses of time.tv_sec now uses the new variable time_second instead.

gettime() changed to getmicrotime(0.

Remove a couple of unneeded splfoo() protections, the new getmicrotime()
is atomic, (until Bruce sets a breakpoint in it).

A couple of places needed random data, so use read_random() instead
of mucking about with time which isn't random.

Add a new nfs_curusec() function.

Mark a couple of bogosities involving the now disappeard time variable.

Update ffs_update() to avoid the weird "== &time" checks, by fixing the
one remaining call that passwd &time as args.

Change profiling in ncr.c to use ticks instead of time.  Resolution is
the same.

Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call
hzto() which subtracts time" sequences.

Reviewed by:	bde
1998-03-30 09:56:58 +00:00
Eivind Eklund
303b270b0a Staticize. 1998-02-09 06:11:36 +00:00
Eivind Eklund
5591b823d1 Make COMPAT_43 and COMPAT_SUNOS new-style options. 1997-12-16 17:40:42 +00:00
Poul-Henning Kamp
cb226aaa62 Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warning, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
recompiled.
1997-11-06 19:29:57 +00:00
Poul-Henning Kamp
a1c995b626 Last major round (Unless Bruce thinks of somthing :-) of malloc changes.
Distribute all but the most fundamental malloc types.  This time I also
remembered the trick to making things static:  Put "static" in front of
them.

A couple of finer points by:	bde
1997-10-12 20:26:33 +00:00
Poul-Henning Kamp
55166637cd Distribute and statizice a lot of the malloc M_* types.
Substantial input from:	bde
1997-10-11 18:31:40 +00:00
Bruce Evans
1fd0b0588f Removed unused #includes. 1997-08-02 14:33:27 +00:00
Peter Wemm
6875d25465 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
John Dyson
996c772f58 This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
		Mount_std mounts will not work until the getfsent
		library routine is changed.

Reviewed by:	various people
Submitted by:	Jeffery Hsu <hsu@freebsd.org>
1997-02-10 02:22:35 +00:00