Commit Graph

244 Commits

Author SHA1 Message Date
John Baldwin
f34fa851e0 Catch up to header include changes:
- <sys/mutex.h> now requires <sys/systm.h>
- <sys/mutex.h> and <sys/sx.h> now require <sys/lock.h>
2001-03-28 09:17:56 +00:00
John Baldwin
931cccf603 Proc locking identical to that of linprocfs' vnops except that we hold the
proc lock while calling psignal.
2001-03-07 03:15:05 +00:00
John Baldwin
30ac5d0f9e Protect read to p_pptr with proc lock rather than proctree lock. 2001-03-07 03:10:20 +00:00
John Baldwin
c65c565b44 Proc locking. Lock around psignal() and also ensure both an exclusive
proctree lock and the process lock are held when updating p_pptr and
p_oppid.  When we are just reaading p_pptr we only need the proc lock and
not a proctree lock as well.
2001-03-07 03:09:40 +00:00
John Baldwin
0087374731 Protect p_flag with the proc lock. 2001-03-07 02:07:56 +00:00
Doug Rabson
a76decc6f7 Remove the copyinstr call which was trying to copy the pathname in from
user space. It has already been copied in and mp->mnt_stat.f_mntonname has
already been initialised by the caller.

This fixes a panic on the alpha caused by the fact that the variable
'size' wasn't initialised because the call to copyinstr() bailed out with
an EFAULT error.
2001-03-03 15:15:33 +00:00
Robert Watson
91421ba234 o Move per-process jail pointer (p->pr_prison) to inside of the subject
credential structure, ucred (cr->cr_prison).
o Allow jail inheritence to be a function of credential inheritence.
o Abstract prison structure reference counting behind pr_hold() and
  pr_free(), invoked by the similarly named credential reference
  management functions, removing this code from per-ABI fork/exit code.
o Modify various jail() functions to use struct ucred arguments instead
  of struct proc arguments.
o Introduce jailed() function to determine if a credential is jailed,
  rather than directly checking pointers all over the place.
o Convert PRISON_CHECK() macro to prison_check() function.
o Move jail() function prototypes to jail.h.
o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the
  flag in the process flags field itself.
o Eliminate that "const" qualifier from suser/p_can/etc to reflect
  mutex use.

Notes:

o Some further cleanup of the linux/jail code is still required.
o It's now possible to consider resolving some of the process vs
  credential based permission checking confusion in the socket code.
o Mutex protection of struct prison is still not present, and is
  required to protect the reference count plus some fields in the
  structure.

Reviewed by:	freebsd-arch
Obtained from:	TrustedBSD Project
2001-02-21 06:39:57 +00:00
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
Poul-Henning Kamp
fc2ffbe604 Mechanical change to use <sys/queue.h> macro API instead of
fondling implementation details.

Created with: sed(1)
Reviewed by: md5(1)
2001-02-04 13:13:25 +00:00
John Baldwin
b939335607 - Catch up to proc flag changes. 2001-01-24 11:20:05 +00:00
Poul-Henning Kamp
49851cc706 Use macro API to <sys/queue.h> 2000-12-31 10:24:19 +00:00
Jake Burkholder
98f03f9030 Protect proc.p_pptr and proc.p_children/p_sibling with the
proctree_lock.

linprocfs not locked pending response from informal maintainer.

Reviewed by:	jhb, -smp@
2000-12-23 19:43:10 +00:00
Robert Watson
f6a99e61c5 o Tighten restrictions on use of /proc/pid/ctl and move access checks
in ctl to using centralized p_can() inter-process access control
  interface.

Reviewed by:	sef
2000-12-13 04:28:24 +00:00
Jake Burkholder
c0c2557090 - Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead
of explicit calls to lockmgr.  Also provides macros for the flags
  pased to specify shared, exclusive or release which map to the
  lockmgr flags.  This is so that the use of lockmgr can be easily
  replaced with optimized reader-writer locks.
- Add some locking that I missed the first time.
2000-12-13 00:17:05 +00:00
Dag-Erling Smørgrav
668891c57b Add a module version (so that linprocfs can properly depend on procfs) 2000-12-09 13:17:51 +00:00
John Baldwin
c3f52eedeb Protect p_stat with the sched_lock.
Reviewed by:	jake
2000-12-02 01:58:15 +00:00
Eivind Eklund
b8c8516a7f More paranoia against overflows 2000-11-08 21:53:05 +00:00
Eivind Eklund
ab3240e198 Fix overflow from jail hostname.
Bug found by:	Esa Etelavuori <eetelavu@cc.hut.fi>
2000-11-01 19:38:08 +00:00
Alfred Perlstein
e5b7b6b78e return correct type for process directory entries, DT_DIR not DT_REG 2000-10-05 23:19:51 +00:00
Dag-Erling Smørgrav
18203af447 Remove a comment that has been not only obsolete but patently wrong for the
last 31 revisions (almost three years).
2000-09-04 18:18:17 +00:00
Robert Watson
84a5637620 o Simplify if/then clause equating ESRCH with ENOENT when hiding a process
Submitted by:	des
2000-09-01 18:41:32 +00:00
Robert Watson
ca94dd37a3 o Make procfs use vaccess() for procfs_access() DAC and super-user checks,
rather than implementing its own {uid,gid,other} checks against vnode
  mode.  Similar change to linprocfs currently under review.

Obtained from:	TrustedBSD Project
2000-09-01 13:41:41 +00:00
Robert Watson
387d2c036b o Centralize inter-process access control, introducing:
int p_can(p1, p2, operation, privused)

  which allows specification of subject process, object process,
  inter-process operation, and an optional call-by-reference privused
  flag, allowing the caller to determine if privilege was required
  for the call to succeed.  This allows jail, kern.ps_showallprocs and
  regular credential-based interaction checks to occur in one block of
  code.  Possible operations are P_CAN_SEE, P_CAN_SCHED, P_CAN_KILL,
  and P_CAN_DEBUG.  p_can currently breaks out as a wrapper to a
  series of static function checks in kern_prot, which should not
  be invoked directly.

o Commented out capabilities entries are included for some checks.

o Update most inter-process authorization to make use of p_can() instead
  of manual checks, PRISON_CHECK(), P_TRESPASS(), and
  kern.ps_showallprocs.

o Modify suser{,_xxx} to use const arguments, as it no longer modifies
  process flags due to the disabling of ASU.

o Modify some checks/errors in procfs so that ENOENT is returned instead
  of ESRCH, further improving concealment of processes that should not
  be visible to other processes.  Also introduce new access checks to
  improve hiding of processes for procfs_lookup(), procfs_getattr(),
  procfs_readdir().  Correct a bug reported by bp concerning not
  handling the CREATE case in procfs_lookup().  Remove volatile flag in
  procfs that caused apparently spurious qualifier warnigns (approved by
  bde).

o Add comment noting that ktrace() has not been updated, as its access
  control checks are different from ptrace(), whereas they should
  probably be the same.  Further discussion should happen on this topic.

Reviewed by:	bde, green, phk, freebsd-security, others
Approved by:	bde
Obtained from:	TrustedBSD Project
2000-08-30 04:49:09 +00:00
Poul-Henning Kamp
39f70682ae Introduce vop_stdinactive() and make it the default if no vop_inactive
is declared.

Sort and prune a few vop_op[].
2000-08-18 10:01:02 +00:00
Poul-Henning Kamp
eb95c536ad Remove unneeded #include <sys/kernel.h> 2000-04-29 15:36:14 +00:00
Brian Feldman
b7db19017b Move procfs_fullpath() to vfs_cache.c, with a rename to textvp_fullpath().
There's no excuse to have code in synthetic filestores that allows direct
references to the textvp anymore.

Feature requested by:	msmith
Feature agreed to by:	warner
Move requested by:	phk
Move agreed to by:	bde
2000-04-26 11:57:45 +00:00
Brian Feldman
28749c79c8 Quiet an unused variable warning by commenting out a variable declaration
that goes with a commented out statement.
2000-04-22 17:58:40 +00:00
Brian Feldman
bb529be712 There's no reason to make "file" 0500 rather than 0555. 2000-04-22 04:01:54 +00:00
Brian Feldman
081d7b00c7 Welcome back our old friend from procfs, "file"! 2000-04-22 03:44:41 +00:00
Peter Wemm
c447342094 Change #ifdef KERNEL to #ifdef _KERNEL in the public headers. "KERNEL"
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot).  This is consistant with the other
BSD's who made this change quite some time ago.  More commits to come.
1999-12-29 05:07:58 +00:00
Peter Wemm
e0f9d2869a Fix typo "," vs ";"
PR:		15696
Submitted by:	Takashi Okumura <taka@cs.pitt.edu>
1999-12-27 16:03:38 +00:00
Eivind Eklund
a6b174a83b Include vm/vm_extern.h to get at prototypes 1999-12-20 18:26:58 +00:00
Robert Watson
91f37dcba1 Second pass commit to introduce new ACL and Extended Attribute system
calls, vnops, vfsops, both in /kern, and to individual file systems that
require a vfsop_ array entry.

Reviewed by:	eivind
1999-12-19 06:08:07 +00:00
Eivind Eklund
762e6b856c Introduce NDFREE (and remove VOP_ABORTOP) 1999-12-15 23:02:35 +00:00
Peter Wemm
99b30c79b0 Don't simulate a pseudo address-space beyond VM_MAXUSER_ADDRESS that
maps onto the upages.  We used to use this extensively, particularly
for ps and gdb.  Both of these have been "fixed".  ps gets the p_stats
via eproc along with all the other stats, and gdb uses the regs, fpregs
etc files.

Once apon a time the UPAGES were mapped here, but that changed back
in January '96.  This essentially kills my revisions 1.16 and 1.17.
The 2-page "hole" above the stack can be reclaimed now.
1999-12-11 10:21:34 +00:00
Poul-Henning Kamp
5ecdb702b0 Remove unused #includes.
Obtained from:	http://bogon.freebsd.dk/include
1999-12-08 08:59:40 +00:00
Poul-Henning Kamp
a8704f8999 Add a sysctl to control if argv is disclosed to the world:
kern.ps_argsopen
It defaults to 1 which means that all users can see all argvs in ps(1).

Reviewed by:	Warner
1999-11-26 08:27:16 +00:00
Poul-Henning Kamp
a9e0361b4a Introduce the new function
p_trespass(struct proc *p1, struct proc *p2)
which returns zero or an errno depending on the legality of p1 trespassing
on p2.

Replace kern_sig.c:CANSIGNAL() with call to p_trespass() and one
extra signal related check.

Replace procfs.h:CHECKIO() macros with calls to p_trespass().

Only show command lines to process which can trespass on the target
process.
1999-11-21 19:03:20 +00:00
Poul-Henning Kamp
da654d9070 s/p_cred->pc_ucred/p_ucred/g 1999-11-21 12:38:21 +00:00
Sean Eric Fagan
13baacebcb A process should be able to examine itself. 1999-11-20 18:22:14 +00:00
Poul-Henning Kamp
6153cb2048 Make proc/*/cmdline use the cached argv if available.
Submitted by:   Paul Saab <paul@mu.org>
Reviewed by:    phk
1999-11-17 21:35:07 +00:00
Poul-Henning Kamp
3cf5d0fd07 The function `procfs_getattr()' in procfs doesn't set the value of
vap->va_fsid, so we cannot get valid information about procfs.

Submitted by:   SAWADA Mizuki miz@pa.aix.or.jp
Reviewed by:    phk
PR:     1654
1999-11-17 21:33:25 +00:00
Alan Cox
b561683329 Passing "0" or "FALSE" as the fourth argument to vm_fault is wrong. It
should be "VM_FAULT_NORMAL".
1999-11-09 01:44:28 +00:00
Sean Eric Fagan
75bd443641 Explain why Warner is right, and I am wrong, in the removing of the
file object.  Also explain some possible directions to re-implement it --
I'm not sure it should be, given the minimal application use.  (Other
than having the debugger automatically access the symbols for a process,
the main use I'd found was with some minor accounting ability, but _that_
depends on it being in the filesystem space; an ioctl access method would
be useless in that case.)

This is a code-less change; only a comment has been added.
1999-11-08 05:13:54 +00:00
Sean Eric Fagan
900e2da760 Make an incredibly stupid change because Warner threatened to do it and
continue doing it despite objections by me (the principal author).

Note that this doesn't fix the real problem -- the real problem is generally
bad setup by ignorant users, and education is the right way to fix it.

So while this doesn't actually solve the prolem mentioned in the complaint
(since it's still possible to do it via other methods, although they mostly
involve a bit more complicity), and there are better methods to do this,
nobody was willing or able to provide me with a real world example that
couldn't be worked around using the existing permissions and group
mechanism.  And therefore, security by removing features is the method of
the day.

I only had three applications that used it, in any event.  One of them would
have made debugging easier, but I still haven't finished it, and won't
now, so it doesn't really matter.
1999-11-07 07:52:02 +00:00
Poul-Henning Kamp
923502ff91 useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments)
of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>.  This puts
the #defines for the vm_inherit_t and vm_prot_t types next to their
typedefs.

This paves the road for the commit to follow shortly: change
useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE}
as argument.
1999-10-29 18:09:36 +00:00
Marcel Moolenaar
2c42a14602 sigset_t change (part 2 of 5)
-----------------------------

The core of the signalling code has been rewritten to operate
on the new sigset_t. No methodological changes have been made.
Most references to a sigset_t object are through macros (see
signalvar.h) to create a level of abstraction and to provide
a basis for further improvements.

The NSIG constant has not been changed to reflect the maximum
number of signals possible. The reason is that it breaks
programs (especially shells) which assume that all signals
have a non-null name in sys_signame. See src/bin/sh/trap.c
for an example. Instead _SIG_MAXSIG has been introduced to
hold the maximum signal possible with the new sigset_t.

struct sigprop has been moved from signalvar.h to kern_sig.c
because a) it is only used there, and b) access must be done
though function sigprop(). The latter because the table doesn't
holds properties for all signals, but only for the first NSIG
signals.

signal.h has been reorganized to make reading easier and to
add the new and/or modified structures. The "old" structures
are moved to signalvar.h to prevent namespace polution.

Especially the coda filesystem suffers from the change, because
it contained lines like (p->p_sigmask == SIGIO), which is easy
to do for integral types, but not for compound types.

NOTE: kdump (and port linux_kdump) must be recompiled.

Thanks to Garrett Wollman and Daniel Eischen for pressing the
importance of changing sigreturn as well.
1999-09-29 15:03:48 +00:00
Alfred Perlstein
c24fda81c9 Seperate the export check in VFS_FHTOVP, exports are now checked via
VFS_CHECKEXP.

Add fh(open|stat|stafs) syscalls to allow userland to query filesystems
based on (network) filehandle.

Obtained from:	NetBSD
1999-09-11 00:46:08 +00:00
Alfred Perlstein
5a5fccc8e7 All unimplemented VFS ops now have entries in kern/vfs_default.c that return
reasonable defaults.

This avoids confusing and ugly casting to eopnotsupp or making dummy functions.
Bogus casting of filesystem sysctls to eopnotsupp() have been removed.

This should make *_vfsops.c more readable and reduce bloat.

Reviewed by:	msmith, eivind
Approved by:	phk
Tested by:	Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>
1999-09-07 22:42:38 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Marcel Moolenaar
63a9927353 Let processes retrieve their argv through procfs. Revert to the original
behaviour in all other cases.

Submitted by: Andrew Gordon <arg@arg1.demon.co.uk>
1999-08-19 19:41:08 +00:00
Bruce Evans
dd26feb90b Fixed printf format errors (%qu -> %llu; the arg was already unsigned long
long to hide problems on alphas).
1999-08-08 13:43:51 +00:00
Poul-Henning Kamp
46dcdb370e Allow jailed proccesses to open non-process vnodes like the root of the fs. 1999-07-09 21:31:44 +00:00
Peter Wemm
ebce412ca2 Use %q rather than rolling a custom routine. 1999-07-09 17:56:59 +00:00
Jonathan Lemon
d6c4f01106 Support for i386 hardware breakpoints.
Submitted by:	Brian Dean <brdean@unx.sas.com>
1999-07-09 04:18:32 +00:00
Jonathan Lemon
ab001a72be Implement support for hardware debug registers on the i386.
Submitted by:	Brian Dean <brdean@unx.sas.com>
1999-07-09 04:16:00 +00:00
Poul-Henning Kamp
7a7404d275 Eliminate the bogus procfs private almost struct dirent structure.
Spotted by: Lars Hamren
Reviewed by: bde
1999-06-13 20:53:16 +00:00
Dmitrij Tejblum
9d3a442583 Don't call calcru() on a swapped-out process. calcru() access p_stats, which
is in U-area.
1999-05-22 20:10:31 +00:00
Poul-Henning Kamp
a6d3121589 Make the type and map files claim 0 bytes size. Tar doesn't get confused
now, but doesn't store any data eiter.

I wonder if we shouldn't claim to be fifos instead...
1999-05-04 08:01:55 +00:00
Poul-Henning Kamp
8902608d57 Add even more () to CHECKIO which by now feels positively LISPish.
Submitted by:	bde
Reviewed by:	phk
1999-05-04 08:00:10 +00:00
Poul-Henning Kamp
d37ed5a03a Add a new "file" to procfs: "rlimit" which shows the resource limits for
the process.

PR:		11342
Submitted by:	Adrian Chadd adrian@freebsd.org
Reviewed by:	phk
1999-04-30 13:04:21 +00:00
Poul-Henning Kamp
75c1354190 This Implements the mumbled about "Jail" feature.
This is a seriously beefed up chroot kind of thing.  The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.

For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact:  "real virtual servers".

Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.

Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.

It generally does what one would expect, but setting up a jail
still takes a little knowledge.

A few notes:

   I have no scripts for setting up a jail, don't ask me for them.

   The IP number should be an alias on one of the interfaces.

   mount a /proc in each jail, it will make ps more useable.

   /proc/<pid>/status tells the hostname of the prison for
   jailed processes.

   Quotas are only sensible if you have a mountpoint per prison.

   There are no privisions for stopping resource-hogging.

   Some "#ifdef INET" and similar may be missing (send patches!)

If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!

Tools, comments, patches & documentation most welcome.

Have fun...

Sponsored by:   http://www.rndassociates.com/
Run for almost a year by:       http://www.servetheweb.com/
1999-04-28 11:38:52 +00:00
Poul-Henning Kamp
1c308b817a Change suser_xxx() to suser() where it applies. 1999-04-27 12:21:16 +00:00
Poul-Henning Kamp
f711d546d2 Suser() simplification:
1:
  s/suser/suser_xxx/

2:
  Add new function: suser(struct proc *), prototyped in <sys/proc.h>.

3:
  s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/

The remaining suser_xxx() calls will be scrutinized and dealt with
later.

There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.

More changes to the suser() API will come along with the "jail" code.
1999-04-27 11:18:52 +00:00
Luoqi Chen
b1028ad122 Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This
is the preparation step for moving pmap storage out of vmspace proper.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		Matthew Dillion	<dillon@apollo.backplane.com>
1999-02-19 14:25:37 +00:00
Matthew Dillon
9fdfe602fc Remove MAP_ENTRY_IS_A_MAP 'share' maps. These maps were once used to
attempt to optimize forks but were essentially given-up on due to
    problems and replaced with an explicit dup of the vm_map_entry structure.
    Prior to the removal, they were entirely unused.
1999-02-07 21:48:23 +00:00
John Polstra
b7429e253a Correct a format mismatch on 64-bit architectures. This should
fix the erroneous values in the procfs "map" file on the Alpha.
1999-02-05 06:18:54 +00:00
Matthew Dillon
831a80b0d5 Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile
1999-01-27 22:42:27 +00:00
Matthew Dillon
1c7c3c6a86 This is a rather large commit that encompasses the new swapper,
changes to the VM system to support the new swapper, VM bug
    fixes, several VM optimizations, and some additional revamping of the
    VM code.  The specific bug fixes will be documented with additional
    forced commits.  This commit is somewhat rough in regards to code
    cleanup issues.

Reviewed by:	"John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
1999-01-21 08:29:12 +00:00
Peter Wemm
75ba77578f A partial implementation of the procfs cmdline pseudo-file. This
is enough to satisfy things like StarOffice.  This is a hack, but doing
it properly would be a LOT of work, and would require extensive grovelling
around in the user address space to find the argv[].

Obtained from: Mostly from Andrzej Bialecki <abial@nask.pl>.
1999-01-05 03:53:06 +00:00
Archie Cobbs
2127f26023 Examine all occurrences of sprintf(), strcat(), and str[n]cpy()
for possible buffer overflow problems. Replaced most sprintf()'s
with snprintf(); for others cases, added terminating NUL bytes where
appropriate, replaced constants like "16" with sizeof(), etc.

These changes include several bug fixes, but most changes are for
maintainability's sake. Any instance where it wasn't "immediately
obvious" that a buffer overflow could not occur was made safer.

Reviewed by:	Bruce Evans <bde@zeta.org.au>
Reviewed by:	Matthew Dillon <dillon@apollo.backplane.com>
Reviewed by:	Mike Spengler <mks@networkcs.com>
1998-12-04 22:54:57 +00:00
David Greenman
730075613a Added a second argument, "activate" to the vm_page_unwire() call so that
the caller can select either inactive or active queue to put the page on.
1998-10-28 13:37:02 +00:00
Bruce Evans
8994ca3ce9 Removed statically configured mount type numbers (MOUNT_*) and all
references to them.

The change a couple of days ago to ignore these numbers in statically
configured vfsconf structs was slightly premature because the cd9660,
cfs, devfs, ext2fs, nfs vfs's still used MOUNT_* instead of the number
in their vfsconf struct.
1998-09-07 13:17:06 +00:00
Alexander Langer
f35f7d0dfd Style fixes and a bug fix: don't remove the exit handler if unmount
fails.

Submitted by:	bde
1998-07-27 22:47:17 +00:00
Alexander Langer
3f47ee5c4d A better solution to the rm_at_exit problem: Register the exit function
during first mount.  Unregister the exit function at last unmount.

Concept by:	sef
Reviewed by:	sef
Implemented by:	alex
1998-07-27 01:07:01 +00:00
Alexander Langer
ca2be56ff9 Override the default VFS LKM dispatch functions so that a module
unload function can be provided (this is necessary to unregister
the at_exit handler).
1998-07-25 15:52:44 +00:00
Bruce Evans
a23d65bfc8 Cast pointers to uintptr_t/intptr_t instead of to u_long/long,
respectively.  Most of the longs should probably have been
u_longs, but this changes is just to prevent warnings about
casts between pointers and integers of different sizes, not
to fix poorly chosen types.
1998-07-15 02:32:35 +00:00
Bruce Evans
ac1e407b32 Fixed printf format errors. 1998-07-11 07:46:16 +00:00
Bruce Evans
96eb19e1a3 Quick fix for type mismatches which were fatal if longs aren't 32
bits.  We used a private, wrong, version of `struct dirent' to help
break getdirentries(), and we use a silly check that the size of this
struct is a power of 2 to help break mount() if getdirentries() would
not work.  This fix just changes the struct to match `struct dirent'
(except for the name length).
1998-07-07 04:08:44 +00:00
Dmitrij Tejblum
6bfc1a02b1 Remove "not hungly" panics. Cookies now used by the linux and ibcs2
emulators. The emulators assume that filesystem may just ignore cookies, and
handle this case correctly. So we just ignore cookies.

Also sync *_readdir "prototypes" with reality.
1998-06-25 16:54:41 +00:00
Bruce Evans
a395dbb153 Avoid a 64-bit division in procfs_readdir(). Fixed related overflows.
Check args using the same expression as in fdesc and kernfs.  The check
was actually already correct, modulo overflow.  It could be tightened
up to either allow huge (aligned) offsets, treating them as EOF, or
disallow all offsets beyond EOF.

Didn't fix invalid address calculation &foo[i] where i may be out of
bounds.

Didn't fix shooting of foot using a private unportable dirent struct.
1998-06-14 12:53:39 +00:00
Peter Wemm
7a204420d3 Don't silently accept attempts to change flags where they are not
supported.
1998-06-10 06:34:57 +00:00
Doug Rabson
ecbb00a262 This commit fixes various 64bit portability problems required for
FreeBSD/alpha.  The most significant item is to change the command
argument to ioctl functions from int to u_long.  This change brings us
inline with various other BSD versions.  Driver writers may like to
use (__FreeBSD_version == 300003) to detect this change.

The prototype FreeBSD/alpha machdep will follow in a couple of days
time.
1998-06-07 17:13:14 +00:00
Tor Egge
afc6ea238f Disallow reading the current kernel stack. Only the user structure and
the current registers should be accessible.
Reviewed by:	David Greenman <dg@root.com>
1998-05-19 00:00:14 +00:00
Mike Smith
79cc756d8b As described by the submitter:
Reverse the VFS_VRELE patch.  Reference counting of vnodes does not need
to be done per-fs.  I noticed this while fixing vfs layering violations.
Doing reference counting in generic code is also the preference cited by
John Heidemann in recent discussions with him.

The implementation of alternative vnode management per-fs is still a valid
requirement for some filesystems but will be revisited sometime later,
most likely using a different framework.

Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-05-06 05:29:41 +00:00
John Dyson
c0877f103f Tighten up management of memory and swap space during map allocation,
deallocation cycles.  This should provide a measurable improvement
on swap and memory allocation on loaded systems.  It is unlikely a
complete solution.  Also, provide more map info with procfs.
Chuck Cranor spurred on this improvement.
1998-04-29 04:28:22 +00:00
Dag-Erling Smørgrav
dc73342347 Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108. 1998-04-17 22:37:19 +00:00
Poul-Henning Kamp
a0502b19d4 Add two new functions, get{micro|nano}time.
They are atomic, but return in essence what is in the "time" variable.
gettime() is now a macro front for getmicrotime().

Various patches to use the two new functions instead of the various
hacks used in their absence.

Some puntuation and grammer patches from Bruce.

A couple of XXX comments.
1998-03-26 20:54:05 +00:00
Mike Smith
34bdbbd0de The intent is to get rid of WILLRELE in vnode_if.src by making
a complement to all ops that return a vpp, VFS_VRELE.  This is
initially only for file systems that implement the following ops
that do a WILLRELE:

	vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link,
	vop_rename, vop_mkdir, vop_rmdir, vop_symlink

This is initial DNA that doesn't do anything yet.  VFS_VRELE is
implemented but not called.

A default vfs_vrele was created for fs implementations that use the
standard vnode management routines.

VFS_VRELE implementations were made for the following file systems:

Standard (vfs_vrele)
	ffs mfs nfs msdosfs devfs ext2fs

Custom
	union umapfs

Just EOPNOTSUPP
	fdesc procfs kernfs portal cd9660

These implementations may change as VOP changes are implemented.

In the next phase, in the vop implementations calls to vrele and the vrele
part of vput will be moved to the top layer vfs_vnops and made visible
to all layers.  vput will be replaced by unlock in these cases.  Unlocking
will still be done in the per fs layer but the refcount decrement will be
triggered at the top because it doesn't hurt to hold a vnode reference a
little longer.  This will have minimal impact on the structure of the
existing code.

This will only be done for vnode arguments that are released by the various
fs vop implementations.

Wider use of VFS_VRELE will likely require restructuring of the code.

Reviewed by:	phk, dyson, terry et. al.
Submitted by:	Michael Hancock <michaelh@cet.co.jp>
1998-03-01 22:46:53 +00:00
Eivind Eklund
303b270b0a Staticize. 1998-02-09 06:11:36 +00:00
Eivind Eklund
0b08f5f737 Back out DIAGNOSTIC changes. 1998-02-06 12:14:30 +00:00
Eivind Eklund
47cfdb166d Turn DIAGNOSTIC into a new-style option. 1998-02-04 22:34:03 +00:00
John Dyson
2d8acc0f4a VM level code cleanups.
1)	Start using TSM.
	Struct procs continue to point to upages structure, after being freed.
	Struct vmspace continues to point to pte object and kva space for kstack.
	u_map is now superfluous.
2)	vm_map's don't need to be reference counted.  They always exist either
	in the kernel or in a vmspace.  The vmspaces are managed by reference
	counts.
3)	Remove the "wired" vm_map nonsense.
4)	No need to keep a cache of kernel stack kva's.
5)	Get rid of strange looking ++var, and change to var++.
6)	Change more data structures to use our "zone" allocator.  Added
	struct proc, struct vmspace and struct vnode.  This saves a significant
	amount of kva space and physical memory.  Additionally, this enables
	TSM for the zone managed memory.
7)	Keep ioopt disabled for now.
8)	Remove the now bogus "single use" map concept.
9)	Use generation counts or id's for data structures residing in TSM, where
	it allows us to avoid unneeded restart overhead during traversals, where
	blocking might occur.
10)	Account better for memory deficits, so the pageout daemon will be able
	to make enough memory available (experimental.)
11)	Fix some vnode locking problems. (From Tor, I think.)
12)	Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp.
	(experimental.)
13)	Significantly shrink, cleanup, and make slightly faster the vm_fault.c
	code.  Use generation counts, get rid of unneded collpase operations,
	and clean up the cluster code.
14)	Make vm_zone more suitable for TSM.

This commit is partially as a result of discussions and contributions from
other people, including DG, Tor Egge, PHK, and probably others that I
have forgotten to attribute (so let me know, if I forgot.)

This is not the infamous, final cleanup of the vnode stuff, but a necessary
step.  Vnode mgmt should be correct, but things might still change, and
there is still some missing stuff (like ioopt, and physical backing of
non-merged cache files, debugging of layering concepts.)
1998-01-22 17:30:44 +00:00
John Dyson
95e5e988e0 Make our v_usecount vnode reference count work identically to the
original BSD code.  The association between the vnode and the vm_object
no longer includes reference counts.  The major difference is that
vm_object's are no longer freed gratuitiously from the vnode, and so
once an object is created for the vnode, it will last as long as the
vnode does.

When a vnode object reference count is incremented, then the underlying
vnode reference count is incremented also.  The two "objects" are now
more intimately related, and so the interactions are now much less
complex.

When vnodes are now normally placed onto the free queue with an object still
attached.  The rundown of the object happens at vnode rundown time, and
happens with exactly the same filesystem semantics of the original VFS
code.  There is absolutely no need for vnode_pager_uncache and other
travesties like that anymore.

A side-effect of these changes is that SMP locking should be much simpler,
the I/O copyin/copyout optimizations work, NFS should be more ponderable,
and further work on layered filesystems should be less frustrating, because
of the totally coherent management of the vnode objects and vnodes.

Please be careful with your system while running this code, but I would
greatly appreciate feedback as soon a reasonably possible.
1998-01-06 05:26:17 +00:00
Sean Eric Fagan
dd30ff81d9 Use CHECKIO in procfs_ioctl() to ensure that any changes in UID/GID result
in the expected failure.
1998-01-06 01:37:12 +00:00
Bruce Evans
a954e88d0b Fixed a missing/misplaced/misstyled prototype. 1997-12-30 08:46:44 +00:00
Bruce Evans
675ea6f083 Unspammed nested include of <vm/vm_zone.h>. 1997-12-27 02:56:39 +00:00
Sean Eric Fagan
d5f81602a7 Clear the p_stops field on change of user/group id, unless the correct
flag is set in the p_pfsflags field.  This, essentially, prevents an SUID
proram from hanging after being traced.  (E.g., "truss /usr/bin/rlogin" would
fail, but leave rlogin in a stopevent state.)  Yet another case where procctl
is (hopefully ;)) no longer needed in the general case.

Reviewed by:	bde (thanks bruce :))
1997-12-20 03:05:47 +00:00
Sean Eric Fagan
d7b7dcba41 Change the ioctls for procfs around a bit; in particular, whever possible,
change from

	ioctl(fd, PIOC<foo>, &i);

to

	ioctl(fd, PIOC<foo>, i);

This is going from the _IOW to _IO ioctl macro.  The kernel, procctl, and
truss must be in synch for it all to work (not doing so will get errors about
inappropriate ioctl's, fortunately).  Hopefully I didn't forget anything :).
1997-12-13 03:13:49 +00:00
Sean Eric Fagan
ba9d19e99b Fix a problem with procfs_exit() that resulted in missing some procfs
nodes; this also apparantly caused a panic in some circumstances.
Also, since procfs_exit() is getting rid of the nodes when a process
exits, don't bother checking for the process' existance in procfs_inactive().
1997-12-12 03:33:43 +00:00