Commit Graph

91 Commits

Author SHA1 Message Date
trasz
802017a04b Add kern.racct.enable tunable and RACCT_DISABLED config option.
The point of this is to be able to add RACCT (with RACCT_DISABLED)
to GENERIC, to avoid having to rebuild the kernel to use rctl(8).

Differential Revision:	https://reviews.freebsd.org/D2369
Reviewed by:	kib@
MFC after:	1 month
Relnotes:	yes
Sponsored by:	The FreeBSD Foundation
2015-04-29 10:23:02 +00:00
mjg
a9faac8f4b Avoid dynamic syscall overhead for statically compiled modules.
The kernel tracks syscall users so that modules can safely unregister them.

But if the module is not unloadable or was compiled into the kernel, there is
no need to do this.

Achieve this by adding SY_THR_STATIC_KLD macro which expands to SY_THR_STATIC
during kernel build and 0 otherwise.

Reviewed by:	kib (previous version)
MFC after:	2 weeks
2014-10-26 19:42:44 +00:00
hselasky
35b126e324 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
gjb
fc21f40567 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
hselasky
bd1ed65f0f Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
kmacy
99851f359e In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by:	rwatson
Approved by:	re (bz)
2011-09-16 13:58:51 +00:00
trasz
4a17b24427 All the racct_*() calls need to happen with the proc locked. Fixing this
won't happen before 9.0.  This commit adds "#ifdef RACCT" around all the
"PROC_LOCK(p); racct_whatever(p, ...); PROC_UNLOCK(p)" instances, in order
to avoid useless locking/unlocking in kernels built without "options RACCT".
2011-07-06 20:06:44 +00:00
trasz
97c31dedde Style fix.
Submitted by:	jhb@
2011-04-06 19:08:50 +00:00
trasz
d3c78eed8e Add accounting for SysV-related resources.
Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-04-06 18:11:24 +00:00
trasz
d6f4192036 Add ucred pointer to the SysV-related memory structures. This is required
for racct.

Note that after this commit, ipcs(1) needs to be rebuilt.  Otherwise, it will
fail with "ipcs: sysctlbyname: kern.ipc.msqids: Cannot allocate memory".

Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib (earlier version)
2011-04-06 16:59:54 +00:00
netchild
cc4128c6b1 Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/
PMC/SYSV/...).

No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availablility if
needed.

Sponsored by:   Google Summer of Code 2010
Submitted by:   kibab
Reviewed by:    arch@ (parts by rwatson, trasz, jhb)
X-MFC after:    to be determined in last commit with code from this project
2011-02-25 10:11:01 +00:00
mdf
ca68749f2a Specify a CTLTYPE_FOO so that a future sysctl(8) change does not need
to rely on the format string.
2011-01-18 21:14:18 +00:00
trasz
234ab3f035 Remove useless NULL checks for M_WAITOK mallocs. 2010-12-02 01:14:45 +00:00
kib
b27fa06f97 Move SysV IPC freebsd32 compat shims from freebsd32_misc.c to corresponding
sysv_{msg,sem,shm}.c files.

Mark SysV IPC freebsd32 syscalls as NOSTD and add required
SYSCALL_INIT_HELPER/SYSCALL32_INIT_HELPERs to provide auto
register/unregister on module load.

This makes COMPAT_FREEBSD32 functional with SysV IPC compiled and loaded
as modules.

Reviewed by:	jhb
MFC after:	2 weeks
2010-03-19 11:04:42 +00:00
jhb
6f52fe78fb Change the ABI of some of the structures used by the SYSV IPC API:
- The uid/cuid members of struct ipc_perm are now uid_t instead of unsigned
  short.
- The gid/cgid members of struct ipc_perm are now gid_t instead of unsigned
  short.
- The mode member of struct ipc_perm is now mode_t instead of unsigned short
  (this is merely a style bug).
- The rather dubious padding fields for ABI compat with SV/I386 have been
  removed from struct msqid_ds and struct semid_ds.
- The shm_segsz member of struct shmid_ds is now a size_t instead of an
  int.  This removes the need for the shm_bsegsz member in struct
  shmid_kernel and should allow for complete support of SYSV SHM regions
  >= 2GB.
- The shm_nattch member of struct shmid_ds is now an int instead of a
  short.
- The shm_internal member of struct shmid_ds is now gone.  The internal
  VM object pointer for SHM regions has been moved into struct
  shmid_kernel.
- The existing __semctl(), msgctl(), and shmctl() system call entries are
  now marked COMPAT7 and new versions of those system calls which support
  the new ABI are now present.
- The new system calls are assigned to the FBSD-1.1 version in libc.  The
  FBSD-1.0 symbols in libc now refer to the old COMPAT7 system calls.
- A simplistic framework for tagging system calls with compatibility
  symbol versions has been added to libc.  Version tags are added to
  system calls by adding an appropriate __sym_compat() entry to
  src/lib/libc/incldue/compat.h. [1]

PR:		kern/16195 kern/113218 bin/129855
Reviewed by:	arch@, rwatson
Discussed with:	kan, kib [1]
2009-06-24 21:10:52 +00:00
jhb
0894d349bd Deprecate the msgsys(), semsys(), and shmsys() system calls by moving
them under COMPAT_FREEBSD[4567].  Starting with FreeBSD 5.0 the SYSV IPC
API was implemented via direct system calls (e.g. msgctl(), msgget(), etc.)
rather than indirecting through the var-args *sys() system calls.  The
shmsys() system call was already effectively deprecated for all but
COMPAT_FREEBSD4 already as its implementation for the !COMPAT_FREEBSD4 case
was to simply invoke nosys().
2009-06-24 20:01:13 +00:00
jhb
d8d39adf3c - Move syscall function argument structure types to be just above the
relevenat system call function.
- Whitespace fixes.
2009-06-24 13:35:38 +00:00
rdivacky
f56dfc12fb In non-debugging mode make this define (void)0 instead of nothing. This
helps to catch bugs like the below with clang.

	if (cond);		<--- note the trailing ;
	   something();

Approved by:	ed (mentor)
Discussed on:	current@
2009-06-21 07:54:47 +00:00
rwatson
f4934662e5 Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.

Discussed with:	pjd
2009-06-05 14:55:22 +00:00
jamie
a013e0afcb Add hierarchical jails. A jail may further virtualize its environment
by creating a child jail, which is visible to that jail and to any
parent jails.  Child jails may be restricted more than their parents,
but never less.  Jail names reflect this hierarchy, being MIB-style
dot-separated strings.

Every thread now points to a jail, the default being prison0, which
contains information about the physical system.  Prison0's root
directory is the same as rootvnode; its hostname is the same as the
global hostname, and its securelevel replaces the global securelevel.
Note that the variable "securelevel" has actually gone away, which
should not cause any problems for code that properly uses
securelevel_gt() and securelevel_ge().

Some jail-related permissions that were kept in global variables and
set via sysctls are now per-jail settings.  The sysctls still exist for
backward compatibility, used only by the now-deprecated jail(2) system
call.

Approved by:	bz (mentor)
2009-05-27 14:11:23 +00:00
rwatson
60570a92bf Merge first in a series of TrustedBSD MAC Framework KPI changes
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:

  mac_<object>_<method/action>
  mac_<object>_check_<method/action>

The previous naming scheme was inconsistent and mostly
reversed from the new scheme.  Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier.  Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods.  Also simplify, slightly,
some entry point names.

All MAC policy modules will need to be recompiled, and modules
not updates as part of this commit will need to be modified to
conform to the new KPI.

Sponsored by:	SPARTA (original patches against Mac OS X)
Obtained from:	TrustedBSD Project, Apple Computer
2007-10-24 19:04:04 +00:00
rwatson
00b02345d4 Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.

Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.

We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths.  Do, however, move those prototypes to priv.h.

Reviewed by:	csjp
Obtained from:	TrustedBSD Project
2007-06-12 00:12:01 +00:00
rwatson
69938bd196 Further system call comment cleanup:
- Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde)
- Remove extra blank lines in some cases.
- Add extra blank lines in some cases.
- Remove no-op comments consisting solely of the function name, the word
  "syscall", or the system call name.
- Add punctuation.
- Re-wrap some comments.
2007-03-05 13:10:58 +00:00
rwatson
300d4098cf Remove 'MPSAFE' annotations from the comments above most system calls: all
system calls now enter without Giant held, and then in some cases, acquire
Giant explicitly.

Remove a number of other MPSAFE annotations in the credential code and
tweak one or two other adjacent comments.
2007-03-04 22:36:48 +00:00
rwatson
41001412d8 Do allow privilege to create over-sized messages on System V IPC
message queues in jail.
2007-02-19 13:23:45 +00:00
jkim
0099defbac MFP4: (part of) 110058
copyin()/copyout() for message type is separated from msgsnd()/msgrcv() and
it is done from its wrapper functions to support 32-bit emulations.  After I
implemented this, I have briefly referenced NetBSD and Darwin.  NetBSD passes
copyin()/copyout() function pointers from wrappers.  Darwin passes size of
message type as an argument, which is actually similar to my first
implementation (P4 109706).  We may revisit these implementations later.
2006-12-20 19:26:30 +00:00
jkim
edc10a6695 Fix msgsnd(3)/msgrcv(3) deadlock under heavy resource pressure by timing out
msgsnd and rechecking resources.  This problem was found while I was running
Linux Test Project test suite (test cases: msgctl08, msgctl09).
Change `msgwait' to `msgsnd' and `msgrcv' to distinguish its sleeping
conditions.  Few cosmetic changes to debugging messages.
2006-11-17 20:43:01 +00:00
rwatson
10d0d9cf47 Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
rwatson
7beaaf5cd2 Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h.  sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA
2006-10-22 11:52:19 +00:00
rwatson
41fdf2d481 Remove MAC_DEBUG + MPRINTF debugging from System V IPC. This no longer
appears to be serving a useful purpose, as it was used during initial
development of MAC support for System V IPC.

MFC after:	1 month
Obtained from:	TrustedBSD Project
Suggested by:	Christopher dot Vance at SPARTA dot com
2006-09-20 13:40:00 +00:00
rwatson
120490c1a5 Move some functions and definitions from uipc_socket2.c to uipc_socket.c:
- Move sonewconn(), which creates new sockets for incoming connections on
  listen sockets, so that all socket allocate code is together in
  uipc_socket.c.

- Move 'maxsockets' and associated sysctls to uipc_socket.c with the
  socket allocation code.

- Move kern.ipc sysctl node to uipc_socket.c, add a SYSCTL_DECL() for it
  to sysctl.h and remove lots of scattered implementations in various
  IPC modules.

- Sort sodealloc() after soalloc() in uipc_socket.c for dependency order
  reasons.  Statisticize soalloc() and sodealloc() as they are now
  required only in uipc_socket.c, and are internal to the socket
  implementation.

After this change, socket allocation and deallocation is entirely
centralized in one file, and uipc_socket2.c consists entirely of socket
buffer manipulation and default protocol switch functions.

MFC after:	1 month
2006-06-10 14:34:07 +00:00
csjp
17aca298fa Add much needed descriptions for a number of the IPC related sysctl OIDs.
This information will be very useful for people who are tuning applications
which have a dependence on IPC mechanisms.

The following OIDs were documented:

Message queues:
 kern.ipc.msgmax
 kern.ipc.msgmni
 kern.ipc.msgmnb
 kern.ipc.msgtlq
 kern.ipc.msgssz
 kern.ipc.msgseg

Semaphores:
 kern.ipc.semmap
 kern.ipc.semmni
 kern.ipc.semmns
 kern.ipc.semmnu
 kern.ipc.semmsl
 kern.ipc.semopm
 kern.ipc.semume
 kern.ipc.semusz
 kern.ipc.semvmx
 kern.ipc.semaem

Shared memory:
 kern.ipc.shmmax
 kern.ipc.shmmin
 kern.ipc.shmmni
 kern.ipc.shmseg
 kern.ipc.shmall
 kern.ipc.shm_use_phys
 kern.ipc.shm_allow_removed
 kern.ipc.shmsegs

These new descriptions can be viewed using sysctl -d

PR:		kern/65219
Submitted by:	Dan Nelson <dnelson at allantgroup dot com> (modified)
No objections:	developers@
Descriptions reviewed by: gnn
MFC after:	1 week
2005-02-12 01:22:39 +00:00
jhb
71c05d27c0 - Tweak kern_msgctl() to return a copy of the requested message queue id
structure in the struct pointed to by the 3rd argument for IPC_STAT and
  get rid of the 4th argument.  The old way returned a pointer into the
  kernel array that the calling function would then access afterwards
  without holding the appropriate locks and doing non-lock-safe things like
  copyout() with the data anyways.  This change removes that unsafeness and
  resulting race conditions as well as simplifying the interface.
- Implement kern_foo wrappers for stat(), lstat(), fstat(), statfs(),
  fstatfs(), and fhstatfs().  Use these wrappers to cut out a lot of
  code duplication for freebsd4 and netbsd compatability system calls.
- Add a new lookup function kern_alternate_path() that looks up a filename
  under an alternate prefix and determines which filename should be used.
  This is basically a more general version of linux_emul_convpath() that
  can be shared by all the ABIs thus allowing for further reduction of
  code duplication.
2005-02-07 18:44:55 +00:00
sobomax
896df27c1a Split out kernel side of msgctl(2) into two parts: the first that pops data
from the userland and pushes results back and the second which does
actual processing. Use the latter to eliminate stackgap in the linux wrapper
of that syscall.

MFC after:      2 weeks
2005-01-26 00:46:36 +00:00
rwatson
a9307575e8 Invoke label initialization, creation, cleanup, and tear-down MAC
Framework entry points for System V IPC message queues.

Submitted by:	Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net>
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, SPAWAR, McAfee Research
2005-01-22 18:51:43 +00:00
imp
20280f1431 /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 23:35:40 +00:00
rwatson
44386c7719 Make the sysctls kern.ipc.msgmnb and kern.ipc.msgtql into tunables as
is the case for most other sysctls in the System V IPC message queue
implementation.

PR:		75541
Submitted by:	Sergiy Vyshnevetskiy <serg at vostok dot net>
MFC after:	2 weeks
2004-12-30 13:56:34 +00:00
rwatson
891fcc9766 Second of several commits to allow kernel System V IPC data structures
to be modified and extended without breaking the user space ABI:

Use _kernel variants on _ds structures for System V sempahores, message
queues, and shared memory.  When interfacing with userspace, export
only the _ds subsets of the _kernel data structures.  A lot of search
and replace.

Define the message structure in the _KERNEL portion of msg.h so that it
can be used by other kernel consumers, but not exposed to user space.

Submitted by:	Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net>
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, SPAWAR, McAfee Research
2004-11-12 13:23:47 +00:00
phk
30a7ac8468 Add missing #include <sys/module.h> 2004-05-30 20:34:58 +00:00
rwatson
f74e2ef8ef Slight whitespace consistency improvement:
Trim trailing whitespace.
  Remove unmatched " " before ")".
2003-11-07 04:47:14 +00:00
silby
f0e686a675 Change all SYSCTLS which are readonly and have a related TUNABLE
from CTLFLAG_RD to CTLFLAG_RDTUN so that sysctl(8) can provide
more useful error messages.
2003-10-21 18:28:36 +00:00
nectar
df9de6c5cd Update some argument-documenting comments to match reality.
Add an explicit range check to those same arguments to reduce risk of
cardiac arrest in future code readers.
2003-08-07 16:42:27 +00:00
obrien
3b8fff9e4c Use __FBSDID(). 2003-06-11 00:56:59 +00:00
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
alfred
3d1537da2b fix warnings 2003-01-26 23:25:00 +00:00
alfred
1cd571649a Add const qualifier to data argument for msgsnd.
PR: standards/45274
Submitted by: Craig Rodrigues <rodrigc@attbi.com>
2003-01-26 20:09:34 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
maxim
523771c06a o Clear a high bit of ipc_perm.seq so msgget(3) never returns a
negative message queue id.

PR:		kern/46122
Submitted by:	Vladimir B.Grebenschikov <vova@sw.ru>
MFC after:	2 weeks
2002-12-15 09:41:46 +00:00
alfred
24b9035a3a Make SYSVMSG mpsafe. Right now there is a global lock over the
entire subsystem, we could move to per-message queue locks, however
the messages themselves seem to come from a global pool and to avoid
over-locking this code (locking individual queues, then the global
pool) I've opted to just do it this way.

Requested by: rwatson
Tested by: NetBSD's regression suite.
2002-08-13 08:00:36 +00:00
alfred
d1de8b21e3 Cleanup:
Define a debug printf macro rather than wrapping all calls to printf
with #ifdefs.
2002-07-22 18:27:54 +00:00