Commit Graph

283 Commits

Author SHA1 Message Date
jhb
3706cd3509 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
dd
c8a6bd9922 Introduce a version field to `struct xucred' in place of one of the
spares (the size of the field was changed from u_short to u_int to
reflect what it really ends up being).  Accordingly, change users of
xucred to set and check this field as appropriate.  In the kernel,
this is being done inside the new cru2x() routine which takes a
`struct ucred' and fills out a `struct xucred' according to the
former.  This also has the pleasant sideaffect of removing some
duplicate code.

Reviewed by:	rwatson
2002-02-27 04:45:37 +00:00
iedowse
293f2d1cfe Sockets passed into uipc_abort() have been allocated by sonewconn()
but never accept'ed, so they must be destroyed. Originally, unp_drop()
detected this situation by checking if so->so_head is non-NULL.
However, since revision 1.54 of uipc_socket.c (Feb 1999), so->so_head
is set to NULL before calling soabort(), so any unix-domain sockets
waiting to be accept'ed are leaked if the server socket is closed.

Resolve this by moving the socket destruction code into uipc_abort()
itself, and making it unconditional (the other caller of unp_drop()
never needs the socket to be destroyed). Use unp_detach() to avoid
the original code duplication when destroying the socket.

PR:		kern/17895
Reviewed by:	dwmalone (an earlier version of the patch)
MFC after:	1 week
2002-02-25 00:03:34 +00:00
alfred
13c64df775 Remove a bogus FILEDESC_UNLOCK.
Submitted by: tanimura
2002-01-14 19:45:03 +00:00
alfred
844237b396 SMP Lock struct file, filedesc and the global file list.
Seigo Tanimura (tanimura) posted the initial delta.

I've polished it quite a bit reducing the need for locking and
adapting it for KSE.

Locks:

1 mutex in each filedesc
   protects all the fields.
   protects "struct file" initialization, while a struct file
     is being changed from &badfileops -> &pipeops or something
     the filedesc should be locked.

1 mutex in each struct file
   protects the refcount fields.
   doesn't protect anything else.
   the flags used for garbage collection have been moved to
     f_gcflag which was the FILLER short, this doesn't need
     locking because the garbage collection is a single threaded
     container.
  could likely be made to use a pool mutex.

1 sx lock for the global filelist.

struct file *	fhold(struct file *fp);
        /* increments reference count on a file */

struct file *	fhold_locked(struct file *fp);
        /* like fhold but expects file to locked */

struct file *	ffind_hold(struct thread *, int fd);
        /* finds the struct file in thread, adds one reference and
                returns it unlocked */

struct file *	ffind_lock(struct thread *, int fd);
        /* ffind_hold, but returns file locked */

I still have to smp-safe the fget cruft, I'll get to that asap.
2002-01-13 11:58:06 +00:00
rwatson
36784fd2c4 o Back out portions of 1.50 and 1.47, eliminating sonewconn3() and
always deriving the credential for a newly accepted connection from
  the listen socket.  Previously, the selection of the credential
  depended on the protocol: UNIX domain sockets would use the
  connecting process's credential, and protocols supporting a creation
  of the socket before the receiving end called accept() would use
  the listening socket.  After this change, it is always the listening
  credential.

Reviewed by:	green
2001-12-13 22:09:37 +00:00
dillon
86ed17d675 Give struct socket structures a ref counting interface similar to
vnodes.  This will hopefully serve as a base from which we can
expand the MP code.  We currently do not attempt to obtain any
mutex or SX locks, but the door is open to add them when we nail
down exactly how that part of it is going to work.
2001-11-17 03:07:11 +00:00
rwatson
8cf42b482a o Replace reference to 'struct proc' with 'struct thread' in 'struct
sysctl_req', which describes in-progress sysctl requests.  This permits
  sysctl handlers to have access to the current thread, permitting work
  on implementing td->td_ucred, migration of suser() to using struct
  thread to derive the appropriate ucred, and allowing struct thread to be
  passed down to other code, such as network code where td is not currently
  available (and curproc is used).

o Note: netncp and netsmb are not updated to reflect this change, as they
  are not currently KSE-adapted.

Reviewed by:		julian
Obtained from:	TrustedBSD Project
2001-11-08 02:13:18 +00:00
dwmalone
162339bdef When scanning for control messages, don't process the data mbufs.
This could cause hangs if a unix domain socket was closed with data
still to be read from it.

Tested by:	Andrea Campi <andrea@webcom.it>
2001-10-29 20:04:03 +00:00
rwatson
f51eaee62f - Combine kern.ps_showallprocs and kern.ipc.showallsockets into
a single kern.security.seeotheruids_permitted, describes as:
  "Unprivileged processes may see subjects/objects with different real uid"
  NOTE: kern.ps_showallprocs exists in -STABLE, and therefore there is
  an API change.  kern.ipc.showallsockets does not.
- Check kern.security.seeotheruids_permitted in cr_cansee().
- Replace visibility calls to socheckuid() with cr_cansee() (retain
  the change to socheckuid() in ipfw, where it is used for rule-matching).
- Remove prison_unpcb() and make use of cr_cansee() against the UNIX
  domain socket credential instead of comparing root vnodes for the
  UDS and the process.  This allows multiple jails to share the same
  chroot() and not see each others UNIX domain sockets.
- Remove unused socheckproc().

Now that cr_cansee() is used universally for socket visibility, a variety
of policies are more consistently enforced, including uid-based
restrictions and jail-based restrictions.  This also better-supports
the introduction of additional MAC models.

Reviewed by:	ps, billf
Obtained from:	TrustedBSD Project
2001-10-09 21:40:30 +00:00
ps
38383190d5 Only allow users to see their own socket connections if
kern.ipc.showallsockets is set to 0.

Submitted by:	billf (with modifications by me)
Inspired by:	Dave McKay (aka pm aka Packet Magnet)
Reviewed by:	peter
MFC after:	2 weeks
2001-10-05 07:06:32 +00:00
dwmalone
86cf053ae0 Hopefully improve control message passing over Unix domain sockets.
1) Allow the sending of more than one control message at a time
over a unix domain socket. This should cover the PR 29499.

2) This requires that unp_{ex,in}ternalize and unp_scan understand
mbufs with more than one control message at a time.

3) Internalize and externalize used to work on the mbuf in-place.
This made life quite complicated and the code for sizeof(int) <
sizeof(file *) could end up doing the wrong thing. The patch always
create a new mbuf/cluster now. This resulted in the change of the
prototype for the domain externalise function.

4) You can now send SCM_TIMESTAMP messages.

5) Always use CMSG_DATA(cm) to determine the start where the data
in unp_{ex,in}ternalize. It was using ((struct cmsghdr *)cm + 1)
in some places, which gives the wrong alignment on the alpha.
(NetBSD made this fix some time ago).

This results in an ABI change for discriptor passing and creds
passing on the alpha. (Probably on the IA64 and Spare ports too).

6) Fix userland programs to use CMSG_* macros too.

7) Be more careful about freeing mbufs containing (file *)s.
This is made possible by the prototype change of externalise.

PR:		29499
MFC after:	6 weeks
2001-10-04 13:11:48 +00:00
julian
5596676e6c KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
julian
b753f3c491 Forgot to remove this un-needed test. (M_WAITOK won't fail)
I vaguely remember someone once proving it COULD return NULL..
was that changed?

Reminded by: BDE

MFC after:	2 weeks
2001-08-19 04:30:13 +00:00
julian
70b8f5f2d3 fix typo
Submitted by:	Ian Dowse <iedowse@maths.tcd.ie>
2001-08-18 17:43:29 +00:00
julian
e4062474eb Don't alocate a 400 byte buffer on the stack,
Nor 800 bytes of structures..

MFC after:	2 weeks
2001-08-18 02:53:50 +00:00
dd
5e416567e1 Implement a LOCAL_PEERCRED socket option which returns a
`struct xucred` with the credentials of the connected peer.
Obviously this only works (and makes sense) on SOCK_STREAM
sockets.  This works for both the connect(2) and listen(2)
callers.

There is precise documentation of the semantics in unix(4).

Reviewed by:	dwmalone (eyeballed)
2001-08-17 22:01:18 +00:00
rwatson
f504530d9f o Merge contents of struct pcred into struct ucred. Specifically, add the
real uid, saved uid, real gid, and saved gid to ucred, as well as the
  pcred->pc_uidinfo, which was associated with the real uid, only rename
  it to cr_ruidinfo so as not to conflict with cr_uidinfo, which
  corresponds to the effective uid.
o Remove p_cred from struct proc; add p_ucred to struct proc, replacing
  original macro that pointed.
  p->p_ucred to p->p_cred->pc_ucred.
o Universally update code so that it makes use of ucred instead of pcred,
  p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo,
  cr_{r,sv}{u,g}id instead of p_*, etc.
o Remove pcred0 and its initialization from init_main.c; initialize
  cr_ruidinfo there.
o Restruction many credential modification chunks to always crdup while
  we figure out locking and optimizations; generally speaking, this
  means moving to a structure like this:
        newcred = crdup(oldcred);
        ...
        p->p_ucred = newcred;
        crfree(oldcred);
  It's not race-free, but better than nothing.  There are also races
  in sys_process.c, all inter-process authorization, fork, exec, and
  exit.
o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid;
  remove comments indicating that the old arrangement was a problem.
o Restructure exec1() a little to use newcred/oldcred arrangement, and
  use improved uid management primitives.
o Clean up exit1() so as to do less work in credential cleanup due to
  pcred removal.
o Clean up fork1() so as to do less work in credential cleanup and
  allocation.
o Clean up ktrcanset() to take into account changes, and move to using
  suser_xxx() instead of performing a direct uid==0 comparision.
o Improve commenting in various kern_prot.c credential modification
  calls to better document current behavior.  In a couple of places,
  current behavior is a little questionable and we need to check
  POSIX.1 to make sure it's "right".  More commenting work still
  remains to be done.
o Update credential management calls, such as crfree(), to take into
  account new ruidinfo reference.
o Modify or add the following uid and gid helper routines:
      change_euid()
      change_egid()
      change_ruid()
      change_rgid()
      change_svuid()
      change_svgid()
  In each case, the call now acts on a credential not a process, and as
  such no longer requires more complicated process locking/etc.  They
  now assume the caller will do any necessary allocation of an
  exclusive credential reference.  Each is commented to document its
  reference requirements.
o CANSIGIO() is simplified to require only credentials, not processes
  and pcreds.
o Remove lots of (p_pcred==NULL) checks.
o Add an XXX to authorization code in nfs_lock.c, since it's
  questionable, and needs to be considered carefully.
o Simplify posix4 authorization code to require only credentials, not
  processes and pcreds.  Note that this authorization, as well as
  CANSIGIO(), needs to be updated to use the p_cansignal() and
  p_cansched() centralized authorization routines, as they currently
  do not take into account some desirable restrictions that are handled
  by the centralized routines, as well as being inconsistent with other
  similar authorization instances.
o Update libkvm to take these changes into account.

Obtained from:	TrustedBSD Project
Reviewed by:	green, bde, jhb, freebsd-arch, freebsd-audit
2001-05-25 16:59:11 +00:00
markm
bcca5847d5 Undo part of the tangle of having sys/lock.h and sys/mutex.h included in
other "system" header files.

Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.

Sort sys/*.h includes where possible in affected files.

OK'ed by:	bde (with reservations)
2001-05-01 08:13:21 +00:00
tmm
731887b731 Change uipc_sockaddr so that a sockaddr_un without a path is returned
nam for an unbound socket instead of leaving nam untouched in that case.
This way, the getsockname() output can be used to determine the address
family of such sockets (AF_LOCAL).

Reviewed by:	iedowse
Approved by:	rwatson
2001-04-24 19:09:23 +00:00
rwatson
ab5676fc87 o Move per-process jail pointer (p->pr_prison) to inside of the subject
credential structure, ucred (cr->cr_prison).
o Allow jail inheritence to be a function of credential inheritence.
o Abstract prison structure reference counting behind pr_hold() and
  pr_free(), invoked by the similarly named credential reference
  management functions, removing this code from per-ABI fork/exit code.
o Modify various jail() functions to use struct ucred arguments instead
  of struct proc arguments.
o Introduce jailed() function to determine if a credential is jailed,
  rather than directly checking pointers all over the place.
o Convert PRISON_CHECK() macro to prison_check() function.
o Move jail() function prototypes to jail.h.
o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the
  flag in the process flags field itself.
o Eliminate that "const" qualifier from suser/p_can/etc to reflect
  mutex use.

Notes:

o Some further cleanup of the linux/jail code is still required.
o It's now possible to consider resolving some of the process vs
  credential based permission checking confusion in the socket code.
o Mutex protection of struct prison is still not present, and is
  required to protect the reference count plus some fields in the
  structure.

Reviewed by:	freebsd-arch
Obtained from:	TrustedBSD Project
2001-02-21 06:39:57 +00:00
bmilekic
4b6a7bddad * Rename M_WAIT mbuf subsystem flag to M_TRYWAIT.
This is because calls with M_WAIT (now M_TRYWAIT) may not wait
  forever when nothing is available for allocation, and may end up
  returning NULL. Hopefully we now communicate more of the right thing
  to developers and make it very clear that it's necessary to check whether
  calls with M_(TRY)WAIT also resulted in a failed allocation.
  M_TRYWAIT basically means "try harder, block if necessary, but don't
  necessarily wait forever." The time spent blocking is tunable with
  the kern.ipc.mbuf_wait sysctl.
  M_WAIT is now deprecated but still defined for the next little while.

* Fix a typo in a comment in mbuf.h

* Fix some code that was actually passing the mbuf subsystem's M_WAIT to
  malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the
  value of the M_WAIT flag, this could have became a big problem.
2000-12-21 21:44:31 +00:00
phk
54ca48450c Convert all users of fldoff() to offsetof(). fldoff() is bad
because it only takes a struct tag which makes it impossible to
use unions, typedefs etc.

Define __offsetof() in <machine/ansi.h>

Define offsetof() in terms of __offsetof() in <stddef.h> and <sys/types.h>

Remove myriad of local offsetof() definitions.

Remove includes of <stddef.h> in kernel code.

NB: Kernelcode should *never* include from /usr/include !

Make <sys/queue.h> include <machine/ansi.h> to avoid polluting the API.

Deprecate <struct.h> with a warning.  The warning turns into an error on
01-12-2000 and the file gets removed entirely on 01-01-2001.

Paritials reviews by:   various.
Significant brucifications by:  bde
2000-10-27 11:45:49 +00:00
truckman
0575d8a3f9 Remove uidinfo hash table lookup and maintenance out of chgproccnt() and
chgsbsize(), which are called rather frequently and may be called from an
interrupt context in the case of chgsbsize().  Instead, do the hash table
lookup and maintenance when credentials are changed, which is a lot less
frequent.  Add pointers to the uidinfo structures to the ucred and pcred
structures for fast access.  Pass a pointer to the credential to chgproccnt()
and chgsbsize() instead of passing the uid.  Add a reference count to the
uidinfo structure and use it to decide when to free the structure rather
than freeing the structure when the resource consumption drops to zero.
Move the resource tracking code from kern_proc.c to kern_resource.c.  Move
some duplicate code sequences in kern_prot.c to separate helper functions.
Change KASSERTs in this code to unconditional tests and calls to panic().
2000-09-05 22:11:13 +00:00
green
0fa5ae2859 Remove any possibility of hiwat-related race conditions by changing
the chgsbsize() call to use a "subject" pointer (&sb.sb_hiwat) and
a u_long target to set it to.  The whole thing is splnet().

This fixes a problem that jdp has been able to provoke.
2000-08-29 11:28:06 +00:00
mckusick
a3d0c189ea Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).
2000-07-11 22:07:57 +00:00
phk
e5de271d47 Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
Pointed out by:	bde
2000-07-04 11:25:35 +00:00
phk
61ff05be25 Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our
sources:

        -sysctl_vm_zone SYSCTL_HANDLER_ARGS
        +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
2000-07-03 09:35:31 +00:00
alfred
7f71a1a091 fix races in the uidinfo subsystem, several problems existed:
1) while allocating a uidinfo struct malloc is called with M_WAITOK,
   it's possible that while asleep another process by the same user
   could have woken up earlier and inserted an entry into the uid
   hash table.  Having redundant entries causes inconsistancies that
   we can't handle.

   fix: do a non-waiting malloc, and if that fails then do a blocking
   malloc, after waking up check that no one else has inserted an entry
   for us already.

2) Because many checks for sbsize were done as "test then set" in a non
   atomic manner it was possible to exceed the limits put up via races.

   fix: instead of querying the count then setting, we just attempt to
   set the count and leave it up to the function to return success or
   failure.

3) The uidinfo code was inlining and repeating, lookups and insertions
   and deletions needed to be in their own functions for clarity.

Reviewed by: green
2000-06-22 22:27:16 +00:00
shin
567d8629de Enable SCM_RIGHTS on alpha. Allocate necessary buffer as conversion between
int and struct file *.

Approved by: jkh

Submitted by: brian
Reviewed by: bde, brian, peter
2000-03-09 15:15:27 +00:00
eivind
87724eb673 Introduce NDFREE (and remove VOP_ABORTOP) 1999-12-15 23:02:35 +00:00
phk
8fca18de89 This is a partial commit of the patch from PR 14914:
Alot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY
   structures for list operations.  This patch makes all list operations
   in sys/kern use the queue(3) macros, rather than directly accessing the
   *Q_{HEAD,ENTRY} structures.

This batch of changes compile to the same object files.

Reviewed by:    phk
Submitted by:   Jake Burkholder <jake@checker.org>
PR:     14914
1999-11-16 10:56:05 +00:00
peter
787140aa42 Trim unused options (or #ifdef for undoc options).
Submitted by:	phk
1999-10-11 15:19:12 +00:00
green
f980526bf6 Implement RLIMIT_SBSIZE in the kernel. This is a per-uid sockbuf total
usage limit.
1999-10-09 20:42:17 +00:00
guido
12294bc989 Do not follow symlinks when binding a unix domain socket.
This fixes the ssh 1.2.27 vulnerability as reported in bugtraq.
1999-09-29 21:09:41 +00:00
green
4395e552e2 Change so_cred's type to a ucred, not a pcred. THis makes more sense, actually.
Make a sonewconn3() which takes an extra argument (proc) so new sockets created
with sonewconn() from a user's system call get the correct credentials, not
just the parent's credentials.
1999-09-19 02:17:02 +00:00
green
8703e5f3a4 Get rid of some evil defines (a pair of snd and rcv.) 1999-09-17 21:38:24 +00:00
peter
3b842d34e8 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
phk
7e26ca1d1a Divorce "dev_t" from the "major|minor" bitmap, which is now called
udev_t in the kernel but still called dev_t in userland.

Provide functions to manipulate both types:
        major()         umajor()
        minor()         uminor()
        makedev()       umakedev()
        dev2udev()      udev2dev()

For now they're functions, they will become in-line functions
after one of the next two steps in this process.

Return major/minor/makedev to macro-hood for userland.

Register a name in cdevsw[] for the "filedescriptor" driver.

In the kernel the udev_t appears in places where we have the
major/minor number combination, (ie: a potential device: we
may not have the driver nor the device), like in inodes, vattr,
cdevsw registration and so on, whereas the dev_t appears where
we carry around a reference to a actual device.

In the future the cdevsw and the aliased-from vnode will be hung
directly from the dev_t, along with up to two softc pointers for
the device driver and a few houskeeping bits.  This will essentially
replace the current "alias" check code (same buck, bigger bang).

A little stunt has been provided to try to catch places where the
wrong type is being used (dev_t vs udev_t), if you see something
not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if
it makes a difference.  If it does, please try to track it down
(many hands make light work) or at least try to reproduce it
as simply as possible, and describe how to do that.

Without DEVT_FASCIST I belive this patch is a no-op.

Stylistic/posixoid comments about the userland view of the <sys/*.h>
files welcome now, from userland they now contain the end result.

Next planned step: make all dev_t's refer to the same devsw[] which
means convert BLK's to CHR's at the perimeter of the vnodes and
other places where they enter the game (bootdev, mknod, sysctl).
1999-05-11 19:55:07 +00:00
truckman
df85d5a50f Fix descriptor leak provoked by KKIS.05051999.003b exploit code.
unp_internalize() takes a reference to the descriptor.  If the send
fails after unp_internalize(), the control mbuf would be freed ophaning
the reference.

Tested in -CURRENT by: Pierre Beyssac <beyssac@enst.fr>
1999-05-10 18:09:39 +00:00
phk
ca21a25f17 This Implements the mumbled about "Jail" feature.
This is a seriously beefed up chroot kind of thing.  The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.

For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact:  "real virtual servers".

Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.

Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.

It generally does what one would expect, but setting up a jail
still takes a little knowledge.

A few notes:

   I have no scripts for setting up a jail, don't ask me for them.

   The IP number should be an alias on one of the interfaces.

   mount a /proc in each jail, it will make ps more useable.

   /proc/<pid>/status tells the hostname of the prison for
   jailed processes.

   Quotas are only sensible if you have a mountpoint per prison.

   There are no privisions for stopping resource-hogging.

   Some "#ifdef INET" and similar may be missing (send patches!)

If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!

Tools, comments, patches & documentation most welcome.

Have fun...

Sponsored by:   http://www.rndassociates.com/
Run for almost a year by:       http://www.servetheweb.com/
1999-04-28 11:38:52 +00:00
eivind
c8003f9434 More consistent with surrounding style. (Hey - it looked great in the
diff...)

Prodded by:	bde
1999-04-12 14:34:52 +00:00
eivind
081517e8b2 Staticize. 1999-04-11 02:17:47 +00:00
dfr
22ceb237f0 * Change sysctl from using linker_set to construct its tree using SLISTs.
This makes it possible to change the sysctl tree at runtime.

* Change KLD to find and register any sysctl nodes contained in the loaded
  file and to unregister them when the file is unloaded.

Reviewed by: Archie Cobbs <archie@whistle.com>,
	Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)
1999-02-16 10:49:55 +00:00
dillon
b90a2bf706 The code that reclaims descriptors from in-transit unix domain
descriptor-passing messages was calling sorflush() without checking
    to see if the descriptor was actually a socket.  This can cause a
    crash by exiting programs that use the mechanism under certain
    circumstances.
1999-01-21 09:02:18 +00:00
dillon
df24433bbe This is a rather large commit that encompasses the new swapper,
changes to the VM system to support the new swapper, VM bug
    fixes, several VM optimizations, and some additional revamping of the
    VM code.  The specific bug fixes will be documented with additional
    forced commits.  This commit is somewhat rough in regards to code
    cleanup issues.

Reviewed by:	"John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
1999-01-21 08:29:12 +00:00
phk
13c66194f4 Nitpicking and dusting performed on a train. Removes trivial warnings
about unused variables, labels and other lint.
1998-10-25 17:44:59 +00:00
bde
863d5c8b68 Cast pointers to uintptr_t/intptr_t instead of to u_long/long,
respectively.  Most of the longs should probably have been
u_longs, but this changes is just to prevent warnings about
casts between pointers and integers of different sizes, not
to fix poorly chosen types.
1998-07-15 02:32:35 +00:00
wollman
bbc4497ada Convert socket structures to be type-stable and add a version number.
Define a parameter which indicates the maximum number of sockets in a
system, and use this to size the zone allocators used for sockets and
for certain PCBs.

Convert PF_LOCAL PCB structures to be type-stable and add a version number.

Define an external format for infomation about socket structures and use
it in several places.

Define a mechanism to get all PF_LOCAL and PF_INET PCB lists through
sysctl(3) without blocking network interrupts for an unreasonable
length of time.  This probably still has some bugs and/or race
conditions, but it seems to work well enough on my machines.

It is now possible for `netstat' to get almost all of its information
via the sysctl(3) interface rather than reading kmem (changes to follow).
1998-05-15 20:11:40 +00:00
msmith
964ce778b1 In the words of the submitter:
---------
Make callers of namei() responsible for releasing references or locks
instead of having the underlying filesystems do it.  This eliminates
redundancy in all terminal filesystems and makes it possible for stacked
transport layers such as umapfs or nullfs to operate correctly.

Quality testing was done with testvn, and lat_fs from the lmbench suite.

Some NFS client testing courtesy of Patrik Kudo.

vop_mknod and vop_symlink still release the returned vpp.  vop_rename
still releases 4 vnode arguments before it returns.  These remaining cases
will be corrected in the next set of patches.
---------

Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-05-07 04:58:58 +00:00
des
396b114475 Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108. 1998-04-17 22:37:19 +00:00
eivind
4547a09753 Back out DIAGNOSTIC changes. 1998-02-06 12:14:30 +00:00
eivind
c552a9a1c3 Turn DIAGNOSTIC into a new-style option. 1998-02-04 22:34:03 +00:00
bde
6c224bdd6e Fixed duplicate definitions of M_FILE (one static). 1997-11-23 10:43:49 +00:00
phk
4d26888936 Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by:	-Wunused
1997-11-07 08:53:44 +00:00
phk
36e7a51ea1 Last major round (Unless Bruce thinks of somthing :-) of malloc changes.
Distribute all but the most fundamental malloc types.  This time I also
remembered the trick to making things static:  Put "static" in front of
them.

A couple of finer points by:	bde
1997-10-12 20:26:33 +00:00
peter
13141f4b23 Various select -> poll changes 1997-09-14 02:52:18 +00:00
bde
6ffb8bf9af Removed unused #includes. 1997-09-02 20:06:59 +00:00
bde
a6e315b69d Added used #include - don't depend on <sys/mbuf.h> including
<sys/malloc.h> (unless we only use the bogusly shared M*WAIT flags).
1997-09-02 01:19:47 +00:00
wollman
4542c1cf5d Fix all areas of the system (or at least all those in LINT) to avoid storing
socket addresses in mbufs.  (Socket buffers are the one exception.)  A number
of kernel APIs needed to get fixed in order to make this happen.  Also,
fix three protocol families which kept PCBs in mbufs to not malloc them
instead.  Delete some old compatibility cruft while we're at it, and add
some new routines in the in_cksum family.
1997-08-16 19:16:27 +00:00
wollman
6afbf203bd The long-awaited mega-massive-network-code- cleanup. Part I.
This commit includes the following changes:
1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility
glue for them is deleted, and the kernel will panic on boot if any are compiled
in.

2) Certain protocol entry points are modified to take a process structure,
so they they can easily tell whether or not it is possible to sleep, and
also to access credentials.

3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt()
call.  Protocols should use the process pointer they are now passed.

4) The PF_LOCAL and PF_ROUTE families have been updated to use the new
style, as has the `raw' skeleton family.

5) PF_LOCAL sockets now obey the process's umask when creating a socket
in the filesystem.

As a result, LINT is now broken.  I'm hoping that some enterprising hacker
with a bit more time will either make the broken bits work (should be
easy for netipx) or dike them out.
1997-04-27 20:01:29 +00:00
bde
0d3591bdbd Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined.
Fixed everything that depended on getting fcntl.h stuff from the wrong
place.  Most things don't depend on file.h stuff at all.
1997-03-23 03:37:54 +00:00
wpaul
cdd7ea4262 Add support to sendmsg()/recvmsg() for passing credentials between
processes using AF_LOCAL sockets. This hack is going to be used with
Secure RPC to duplicate a feature of STREAMS which has no real counterpart
in sockets (with STREAMS/TLI, you can apparently use t_getinfo() to learn
UID of a local process on the other side of a transport endpoint).

What happens is this: the client sets up a sendmsg() call with ancillary
data using the SCM_CREDS socket-level control message type. It does not
need to fill in the structure. When the kernel notices the data,
unp_internalize() fills in the cmesgcred structure with the sending
process' credentials (UID, EUID, GID, and ancillary groups). This data
is later delivered to the receiving process. The receiver can then
perform the follwing tests:

- Did the client send ancillary data?
	o Yes, proceed.
	o No, refuse to authenticate the client.

- The the client send data of type SCM_CREDS?
	o Yes, proceed.
	o No, refuse to authenticate the client.

- Is the cmsgcred structure the right size?
	o Yes, proceed.
	o No, signal a possible error.

The receiver can now inspect the credential information and use it to
authenticate the client.
1997-03-21 16:12:32 +00:00
wollman
a6e3c2a3ce Create a new branch of the kernel MIB, kern.ipc, to store
all of the configurables and instrumentation related to
inter-process communication mechanisms.  Some variables,
like mbuf statistics, are instrumented here for the first
time.

For mbuf statistics: also keep track of m_copym() and
m_pullup() failures, and provide for the user's inspection
the compiled-in values of MSIZE, MHLEN, MCLBYTES, and MINCLSIZE.
1997-02-24 20:30:58 +00:00
peter
94b6d72794 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
dyson
10f666af84 This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
		Mount_std mounts will not work until the getfsent
		library routine is changed.

Reviewed by:	various people
Submitted by:	Jeffery Hsu <hsu@freebsd.org>
1997-02-10 02:22:35 +00:00
jkh
808a36ef65 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
julian
421ae6a998 Add comments to hard-to-follow File descriptor handling code 1996-12-05 22:41:13 +00:00
dg
3f0638f73b Move or add #include <queue.h> in preparation for upcoming struct socket
changes.
1996-03-11 15:13:58 +00:00
hsu
986af46af9 Merge in Lite2: LIST replacement for f_filef, f_fileb, and filehead.
Reviewed by:	davidg & bde
1996-03-11 02:17:11 +00:00
phk
9cb413a93c Another mega commit to staticize things. 1995-12-14 09:55:16 +00:00
dyson
a752352fe7 Increase the size of the pipe buffer as denoted by PIPSIZ from
4k to 8k.  This has a significant effect on the pipe performance.  In
the future it might be good to increase this to 16k.  PIPSIZ is now
tunable for experimentation.
1995-08-31 01:39:31 +00:00
bde
b31df09238 Make everything except the unsupported network sources compile cleanly
with -Wnested-externs.
1995-08-16 16:14:28 +00:00
dg
ec426e40ee Move mbuf frees to after call to sorflush().
Submitted by:	Matt Dillon
1995-08-08 02:22:16 +00:00
rgrimes
c86f0c7a71 Remove trailing whitespace. 1995-05-30 08:16:23 +00:00
wollman
37660997fe Make networking domains drop-ins, through the magic of GNU ld. (Some day,
there may even be LKMs.)  Also, change the internal name of `unixdomain'
to `localdomain' since AF_LOCAL is now the preferred name of this family.
Declare netisr correctly and in the right place.
1995-05-11 00:13:26 +00:00
dg
f4e59eab80 Fixed bug caused by attempting a connect with a null 'nam'. 1995-02-15 11:30:35 +00:00
wollman
04d65ef905 Merge in the socket-level support for Transaction TCP. 1995-02-07 02:01:16 +00:00
phk
c3e4945541 All of this is cosmetic. prototypes, #includes, printfs and so on. Makes
GCC a lot more silent.
1994-10-02 17:35:40 +00:00
phk
59934affab A potential panic, found by adding declarations. 1994-09-28 19:55:10 +00:00
dg
8d205697aa Added $Id$ 1994-08-02 07:55:43 +00:00
rgrimes
2469c867a1 The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by:	Rodney W. Grimes
Submitted by:	John Dyson and David Greenman
1994-05-25 09:21:21 +00:00
rgrimes
8fb65ce818 BSD 4.4 Lite Kernel Sources 1994-05-24 10:09:53 +00:00