Commit Graph

4908 Commits

Author SHA1 Message Date
Poul-Henning Kamp
2dd527b3ac Move generic disk ioctls from <sys/disklabel.h> to <sys/disk.h>.
Sponsored by:	DARPA & NAI Labs
2002-04-08 09:20:07 +00:00
Poul-Henning Kamp
d39e457bba Put back dumppcb, but this time we put a comment to tell what it is for.
Brucifixion by:	bde
2002-04-08 06:59:13 +00:00
Alan Cox
c0bf5caa74 Restructure aio_return() to eliminate duplicated code and facilitate Giant
push down.
2002-04-08 04:57:56 +00:00
Jeffrey Hsu
20504246d8 There's only one socket zone so we don't need to remember it
in every socket structure.
2002-04-08 03:04:22 +00:00
Maxime Henrion
9d8353732e o Change kernel_vmount() interface to be more convenient : pass two
separate strings instead of passing "foo=bar".
o Don't forget to clear the VMOUNT flag on the vnode when vfs_nmount()
  fails because the fs doesn't implement VFS_NMOUNT (and in vfs_mount()
  when the fs doesn't implement VFS_MOUNT) ; also decrement the vfs
  refcount in the !MNT_UPDATE case.
2002-04-07 13:22:47 +00:00
David Malone
cf4ce70bb3 Remove a comment which relates to the old name cache code, which
was replaced in 1997.

Approved by:	phk
2002-04-07 08:58:31 +00:00
Alan Cox
ae124fc4bd Reduce the duplication of code for error handling in _aio_aqueue(). 2002-04-07 07:17:59 +00:00
Alan Cox
63a4964eec Change jobref and *ijoblist from int to long in order to avoid
a catastrophe after the 2^32nd AIO operation on 64-bit architectures.
2002-04-07 01:28:34 +00:00
Jake Burkholder
98281c99fc Remove a stale comment. 2002-04-06 08:44:04 +00:00
Jake Burkholder
a9f5d33875 Include machine/ktr.h for sparc64 so we pick up KTR_CPU. 2002-04-06 08:43:17 +00:00
Jake Burkholder
a30d7c60f6 Use CTASSERT rather than a runtime check to detect kinfo_proc size changes.
Remove the ugly yuck code to busy wait for 20 seconds.
2002-04-06 08:13:52 +00:00
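The switch from a runtime check to CTASSERT in a30d7c60f6 can be sketched in userland C. This is only an illustration, assuming a C11 compiler: DEMO_CTASSERT, struct demo_record and DEMO_RECORD_SIZE are hypothetical names, and the macro is built on static_assert from <assert.h> rather than on the kernel's own CTASSERT definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CTASSERT(), built on C11 static_assert. */
#define DEMO_CTASSERT(expr) \
    static_assert(expr, "compile-time assertion failed: " #expr)

/* Hypothetical structure whose size is part of an exported interface. */
struct demo_record {
    uint32_t id;
    uint8_t  tag[12];
};

#define DEMO_RECORD_SIZE 16

/* If the layout changes, the build fails here instead of at run time. */
DEMO_CTASSERT(sizeof(struct demo_record) == DEMO_RECORD_SIZE);

static size_t
demo_record_size(void)
{
    return (sizeof(struct demo_record));
}
```

Because the check is evaluated by the compiler, a size mismatch stops the build and no code runs to detect it.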
Yoshihiro Takahashi
d7ef6277af Added the new kernel dumping support for pc98. 2002-04-06 06:41:54 +00:00
Bruce Evans
c78f394575 Updated a doubly stale comment about signotify(). Fixed a nearby long line. 2002-04-05 10:00:37 +00:00
Peter Wemm
911fc92344 Increase the size of the register stack storage on ia64 from 32K to 2MB so
that we can compile gcc.  This is a hack because it adds a fixed 2MB to
each process's VSIZE regardless of how much is really being used since
there is no grow-up stack support.  At least it isn't physical memory.
Sigh.

Add a sysctl to enable tweaking it for new processes.
2002-04-05 01:57:45 +00:00
Thomas Moestl
d7f7792edf Add a generic implementation of inittodr() and resettodr(), as well as
a set of helper routines to deal with real-time clocks. The generic
functions access the clock driver using a kobj interface. This is intended
to reduce code duplication and make it easy to support more than one
clock model on a single architecture.

This code is currently only used on sparc64, but it is planned to convert
the code of the other architectures to it later.
2002-04-04 23:39:10 +00:00
John Baldwin
6008862bc2 Change callers of mtx_init() to pass in an appropriate lock type name. In
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.

Tested on:	i386, alpha, sparc64
2002-04-04 21:03:38 +00:00
John Baldwin
0c88508a78 Change mtx_init() to now take an extra argument. The third argument is
the generic lock type for use with witness.  If this argument is NULL then
the lock name is used as the lock type.  Add a macro for a lock type name
for network driver locks.
2002-04-04 20:52:27 +00:00
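The fallback rule described in 0c88508a78 (a NULL lock type means the lock name doubles as the type) can be sketched as follows. This is a userland illustration, not the kernel code: struct demo_lock_object and demo_lock_init() are hypothetical, and the lo_name field is assumed here alongside the lo_type field named in the log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of struct lock_object: a name plus a generic type. */
struct demo_lock_object {
    const char *lo_name;    /* specific name, e.g. an interface name */
    const char *lo_type;    /* generic class used for grouping */
};

/* Hypothetical initializer: a NULL type falls back to the lock name. */
static void
demo_lock_init(struct demo_lock_object *lo, const char *name, const char *type)
{
    assert(lo != NULL && name != NULL);
    lo->lo_name = name;
    lo->lo_type = (type != NULL) ? type : name;
}
```

With this shape, two network driver locks named after their interfaces can share one type string, while a lock initialized with a NULL type is grouped under its own name.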
John Baldwin
9939f0f11c Set the lock type equal to the lock name for now as all of the current
sx locks don't use very specific lock names.
2002-04-04 20:49:35 +00:00
John Baldwin
b6396e1656 Add a new char * pointer lo_type to struct lock_object that is used to
point to a more generic name for a lock that is more suitable for use by
witness when grouping locks.  For example, although network driver locks
use the interface name for the name of each lock, they should all use the
same witness and be treated the same by witness.  Another example is that
all UMA zone locks should be treated the same.  The witness code has also
been updated to print out the lock type in addition to the lock name in a
few places where it is relevant.
2002-04-04 20:45:21 +00:00
Poul-Henning Kamp
f67ad03a25 Delete the bogus d_boot[01] fields from struct disklabel.
This shrinks the size 4 bytes on alpha, down to the same 276 bytes
as all other platforms.

Construct a hack to make old ioctls work on new kernels.

Once world is recompiled only the new and correct sysctls will be
used.

This hack will become annoying around 1st of May to make people
rebuild their worlds and it will be gone before 5.0.
2002-04-04 20:34:48 +00:00
Bruce Evans
79065dba2a Moved signal handling and rescheduling from userret() to ast() so that
they aren't in the usual path of execution for syscalls and traps.
The main complication for this is that we have to set flags to control
ast() everywhere that changes the signal mask.

Avoid locking in userret() in most of the remaining cases.

Submitted by:	luoqi (first part only, long ago, reorganized by me)
Reminded by:	dillon
2002-04-04 17:49:48 +00:00
Bruce Evans
179235b38b Optimized the check for unmasked pending signals in CURSIG() using a new
inline function sigsetmasked() and a new macro SIGPENDING().  CURSIG()
will soon be moved out of the normal path of execution for syscalls and
traps.  Then its efficiency will be less important but the new interfaces
will be useful for checking for unmasked pending signals in more places.

Submitted by:		luoqi (long ago, in a slightly different form)

Assert that sched_lock is not held in CURSIG().
2002-04-04 15:19:41 +00:00
Alan Cox
9b16adc1e7 o aio_process needn't fhold()/fdrop() the fp now that _aio_aqueue() and
aio_free_entry() do this.
 o Remove two unnecessary/unused variables from aio_process() and one field
   from aiocblist.
2002-04-04 02:13:20 +00:00
Alfred Perlstein
19a0f7e1be Avoid a lock order reversal by dropping the eventhandler_mutex earlier.
We get enough protection from the lock on the individual lists that we
acquire later.

Noticed/Tested by: Steven G. Kargl <kargl@troutmask.apl.washington.edu>
Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-04-04 00:52:03 +00:00
John Baldwin
7049932843 - Axe a stale comment. We haven't allowed the ucred pointer passed to
securelevel_*() to be NULL for a while now.
- Use KASSERT() instead of if (foo) panic(); to optimize the
  !INVARIANTS case.

Submitted by:	Martin Faxer <gmh003532@brfmasthugget.se>
2002-04-03 18:35:25 +00:00
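The point of 7049932843, replacing "if (foo) panic();" with an assertion that disappears when INVARIANTS is off, can be sketched in userland. Everything below is hypothetical: DEMO_KASSERT only imitates the idea of KASSERT(), DEMO_INVARIANTS stands in for the kernel option, and demo_level_gt() is an invented caller.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userland imitation of KASSERT(); not the kernel macro. */
#ifdef DEMO_INVARIANTS
#define DEMO_KASSERT(cond, msg) do {                    \
    if (!(cond)) {                                      \
        fprintf(stderr, "panic: %s\n", (msg));          \
        abort();                                        \
    }                                                   \
} while (0)
#else
/* Without DEMO_INVARIANTS the check compiles away entirely. */
#define DEMO_KASSERT(cond, msg) do { } while (0)
#endif

/* Hypothetical caller: the level pointer must never be NULL. */
static int
demo_level_gt(const int *levelp, int threshold)
{
    DEMO_KASSERT(levelp != NULL, "demo_level_gt: NULL level pointer");
    return (*levelp > threshold);
}
```

An open-coded "if (levelp == NULL) panic(...)" would cost a comparison in every build; the macro form costs nothing unless the debugging option is defined.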
Maxime Henrion
bcc931752f Add two forgotten vfs_unbusy() calls, in vfs_mount() and vfs_nmount().
Reviewed by:	phk
2002-04-03 12:19:03 +00:00
Ruslan Ermilov
12c79eb288 Dike out a highly insecure UCONSOLE option.
TIOCCONS must be able to VOP_ACCESS() /dev/console to succeed.

Obtained from:	OpenBSD
2002-04-03 10:56:59 +00:00
Matthew Dillon
d1b534dfc6 brelse() was improperly clearing B_DELWRI in the B_DELWRI|B_INVAL case
without removing the buffer from the vnode's dirty buffer list, which
can result in a panic in NFS.  Replaced the code with a call to bundirty()
which deals with it properly.

PR:		kern/36108, kern/36174
Submitted by:	various people
Special mention: to Danny Schales <dan@coes.LaTech.edu> for providing a core dump that helped me track this down.
MFC after:	1 day
2002-04-03 00:17:36 +00:00
Dag-Erling Smørgrav
e633070431 Revert to open hashing. It makes the code simpler, and works fairly well
even when the number of records approaches the size of the hash table.
Besides, the previous implementation (using linear probing) was broken :)

Also, use the newly introduced MTX_SYSINIT.
2002-04-02 23:26:32 +00:00
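As a rough illustration of e633070431, here is a small hash table that resolves collisions with per-bucket chains. It assumes that "open hashing" in the log means separate chaining, which is the usual reading but is not confirmed by the log itself. All names (demo_record, demo_lookup, DEMO_HASH_SIZE, DEMO_MAX_RECORDS) and the hash function are hypothetical, not taken from the profiling code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_HASH_SIZE   8      /* number of buckets (hypothetical) */
#define DEMO_MAX_RECORDS 32     /* size of the static record pool */

struct demo_record {
    const char *file;           /* acquisition point: source file */
    int line;                   /* acquisition point: line number */
    struct demo_record *next;   /* next record in the same bucket */
};

static struct demo_record demo_pool[DEMO_MAX_RECORDS];
static size_t demo_used;
static struct demo_record *demo_buckets[DEMO_HASH_SIZE];

static unsigned int
demo_hash(const char *file, int line)
{
    unsigned int h = (unsigned int)line;

    for (const char *p = file; *p != '\0'; p++)
        h = h * 31u + (unsigned char)*p;
    return (h % DEMO_HASH_SIZE);
}

/* Find the record for (file, line), creating it if the pool has room. */
static struct demo_record *
demo_lookup(const char *file, int line)
{
    unsigned int b;
    struct demo_record *r;

    assert(file != NULL);
    b = demo_hash(file, line);
    for (r = demo_buckets[b]; r != NULL; r = r->next)
        if (r->line == line && strcmp(r->file, file) == 0)
            return (r);
    if (demo_used == DEMO_MAX_RECORDS)
        return (NULL);          /* table full: reject the record */
    r = &demo_pool[demo_used++];
    r->file = file;
    r->line = line;
    r->next = demo_buckets[b];  /* push onto the bucket's chain */
    demo_buckets[b] = r;
    return (r);
}
```

A collision only lengthens one chain, so lookups keep working when the number of records approaches or exceeds the number of buckets, and there is no probe sequence to get wrong.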
John Baldwin
c53c013bae - Move the MI mutexes sched_lock and Giant from being declared in the
various machdep.c's to being declared in kern_mutex.c.
- Add a new function mutex_init() used to perform early initialization
  needed for mutexes such as setting up thread0's contested lock list
  and initializing MI mutexes.  Change the various MD startup routines
  to call this function instead of duplicating all the code themselves.

Tested on:	alpha, i386
2002-04-02 22:19:16 +00:00
John Baldwin
7feefcd6ce Spelling police. 2002-04-02 20:44:30 +00:00
John Baldwin
c08cf3c3e8 Enforce an implicit lock order of sleepable locks before non-sleepable
locks.
2002-04-02 19:27:21 +00:00
Andrew R. Reiter
72a492cacf - Add a mutex to lock the global securelevel value.
- Make use of MTX_SYSINIT() as the means to initialize our mutex lock.
2002-04-02 17:43:17 +00:00
Seigo Tanimura
2a60b9b951 Fix leakage of p_pgrp lock. 2002-04-02 17:12:06 +00:00
John Baldwin
48c343df5f Explicitly document how we implicitly enforce the lock order of sleep
locks before spin locks.
2002-04-02 16:51:20 +00:00
Andrew R. Reiter
c27b56999e - Add MTX_SYSINIT and SX_SYSINIT as macro glue for allowing sx and mtx
  locks to be able to set up a SYSINIT call.  This helps in places where
  a lock is needed to protect some data, but the data is not truly
  associated with a subsystem that can properly initialize its lock.
  The macros use the mtx_sysinit() and sx_sysinit() functions,
  respectively, as the handler argument to SYSINIT().

Reviewed by: alfred, jhb, smp@
2002-04-02 16:05:43 +00:00
Dag-Erling Smørgrav
b784ffe91a Instead of get_cyclecount(9), use nanotime(9) to record acquisition and
release times.  Measurements are made and stored in nanoseconds but
presented in microseconds, which should be sufficient for the locks for
which we actually want this (those that are held long and / or often).
Also, rename some variables and structure members to unit-agnostic names.
2002-04-02 14:42:01 +00:00
Poul-Henning Kamp
408ab1b875 Retire the bogus ioctl DIOCGPART in toto.
Once again we can notice that badly thought out hacks ferment and infect
far more code than initially expected.

Sponsored by:	DARPA and NAI Labs.
2002-04-02 11:52:13 +00:00
Marcel Moolenaar
7902451821 Don't compile the dummy dumpsys for ia64. 2002-04-02 10:55:40 +00:00
Robert Watson
3bd1da2958 Update comment regarding the locking of the sysctl tree.
Rename memlock to sysctllock, and MEMLOCK()/MEMUNLOCK() to SYSCTL_LOCK()/
SYSCTL_UNLOCK() and related changes to make the lock names make more
sense.

Submitted by:	Jonathan Mini <mini@haikugeek.com>
2002-04-02 05:50:07 +00:00
Alfred Perlstein
29a2c0cd09 Use sx locks instead of flags+tsleep locks.
Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-04-02 04:20:38 +00:00
Alfred Perlstein
28fe1a715e Use sx locks rather than lockmgr locks for eventhandlers.
Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-04-02 04:18:54 +00:00
Dag-Erling Smørgrav
6c35e80948 Mutex profiling code, conditional on the MUTEX_PROFILING option. Adds the
following sysctl variables:

  debug.mutex.prof.enable	    enable / disable profiling
  debug.mutex.prof.acquisitions	    number of mutex acquisitions recorded
  debug.mutex.prof.records	    number of acquisition points recorded
  debug.mutex.prof.maxrecords	    max number of acquisition points
  debug.mutex.prof.rejected	    number of rejections (due to full table)
  debug.mutex.prof.hashsize	    hash size
  debug.mutex.prof.collisions	    number of hash collisions
  debug.mutex.prof.stats	    profiling statistics

The code records four numbers for each acquisition point (identified by
source file name and line number): longest time held, total time held,
number of non-recursive acquisitions, average time held.  The measurements
are in clock cycles (as returned by get_cyclecount(9)); this may cause
measurements on some SMP systems to be unreliable.  This can probably be
worked around by replacing get_cyclecount(9) by some incarnation of
nanotime(9).

This work was derived from initial patches by eivind.
2002-04-02 00:01:49 +00:00
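The four per-acquisition-point numbers listed in 6c35e80948 (longest time held, total time held, number of acquisitions, average time held) can be kept with three stored counters and one derived value. The sketch below is hypothetical: struct demo_prof and its functions are invented names, and the time unit is left abstract.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-acquisition-point statistics record. */
struct demo_prof {
    uint64_t max_held;      /* longest single hold time */
    uint64_t total_held;    /* sum of all hold times */
    uint64_t acquisitions;  /* number of recorded acquisitions */
};

/* Fold one measured hold time into the record. */
static void
demo_prof_record(struct demo_prof *p, uint64_t held)
{
    assert(p != NULL);
    if (held > p->max_held)
        p->max_held = held;
    p->total_held += held;
    p->acquisitions++;
}

/* The average is derived on demand rather than stored. */
static uint64_t
demo_prof_average(const struct demo_prof *p)
{
    assert(p != NULL);
    if (p->acquisitions == 0)
        return (0);
    return (p->total_held / p->acquisitions);
}
```

Deriving the average at report time keeps the update path down to a comparison and two additions.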
Matthew Dillon
182da8209d Stage-2 commit of the critical*() code. This re-inlines cpu_critical_enter()
and cpu_critical_exit() and moves associated critical prototypes into their
own header file, <arch>/<arch>/critical.h, which is only included by the
three MI source files that need it.

Backout and re-apply improperly committed syntactical cleanups made to files
that were still under active development.  Backout improperly committed program
structure changes that moved localized declarations to the top of two
procedures.  Partially re-apply one of the program structure changes to
move 'mask' into an intermediate block rather than in three separate
sub-blocks to make the code more readable.  Re-integrate bug fixes that Jake
made to the sparc64 code.

Note: In general, developers should not gratuitously move declarations out
of sub-blocks.  They are where they are for reasons of structure, grouping,
readability, compiler-localizability, and to avoid developer-introduced bugs
similar to several found in recent years in the VFS and VM code.

Reviewed by:	jake
2002-04-01 23:51:23 +00:00
John Baldwin
44731cab3b Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API.  The entire API now consists of two functions
similar to the pre-KSE API.  The suser() function takes a thread pointer
as its only argument.  The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0.  The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on:	smp@
2002-04-01 21:31:13 +00:00
John Baldwin
4c44ad8ee5 Whitespace only change: use ANSI function declarations instead of K&R. 2002-04-01 20:13:31 +00:00
Poul-Henning Kamp
c23cda8580 Extend a hack to also hack around PC98's definition of __i386__ 2002-04-01 20:13:03 +00:00
John Baldwin
4269e184e8 Fix style bug in previous commit. 2002-04-01 17:53:42 +00:00
Jake Burkholder
60a57b73ef ktr changes to improve performance and make writing a userland utility to
dump the trace buffer feasible.
- Remove KTR_EXTEND.  This changes the format of the trace entries when
  activated, making writing a userland tool which is not tied to a specific
  kernel configuration difficult.
- Use get_cyclecount() for timestamps.  nanotime() is much too heavy weight
  and requires recursion protection due to ktr traces occurring as a result
  of ktr traces.  KTR_VERBOSE may still require recursion protection, which
  is now conditional on it.
- Allow KTR_CPU to be overridden by MD code.  This is so that it is possible
  to trace early in startup before pcpu and/or curthread are set up.
- Add a version number for the ktr interface.  A userland tool can check this
  to detect mismatches.
- Use an array for the parameters to make decoding in userland easier.
- Add file and line recording to the non-extended traces now that the extended
  version is no more.

These changes will break gdb macros to decode the extended version of the
trace buffer which are floating around.  Users of these macros should either
use the show ktr command in ddb, or use the userland utility which can be run
on a core dump.

Approved by:	jhb
Tested on:	i386, sparc64
2002-04-01 05:35:26 +00:00
Poul-Henning Kamp
81661c94b6 Here follows the new kernel dumping infrastructure.
Caveats:

The new savecore program is not complete in the sense that it emulates
enough of the old savecore's features to do the job, but implements none
of the options yet.

I would appreciate if a userland hacker could help me out getting savecore
to do what we want it to do from a user's point of view, compression,
email-notification, space reservation etc etc.  (send me email if
you are interested).

Currently, savecore will scan all devices marked as "swap" or "dump" in
/etc/fstab _or_ any devices specified on the command-line.

All architectures but i386 lack an implementation of dumpsys(), but
looking at the i386 version it should be trivial for anybody familiar
with the platform(s) to provide this function.

Documentation is quite sparse at this time, more to come.

Details:

ATA and SCSI drivers should work as the dump formatting code has been
removed.  The IDA, TWE and AAC have not yet been converted.

Dumpon now opens the device and uses ioctl(DIOCGKERNELDUMP) to set
the device as dumpdev.  To implement the "off" argument, /dev/null
is used as the device.

Savecore will fail if handed any options since they are not (yet)
implemented.  All devices marked "dump" or "swap" in /etc/fstab
will be scanned and dumps found will be saved to diskfiles
named from the MD5 hash of the header record.  The header record
is dumped in readable format in the .info file.  The kernel
is not saved.  Only complete dumps will be saved.

All maintainer rights for this code are disclaimed: feel free to
improve and extend.

Sponsored by:   DARPA, NAI Labs
2002-03-31 22:37:00 +00:00
Poul-Henning Kamp
1f3a74b1b1 Implement the two "GEOM" ioctls DIOCGSECTORSIZE and DIOCGMEDIASIZE for
the non-GEOM code as well.  This simplifies the kernel-dumping
and disk-management tools as less compatibility cruft will be needed.

Sponsored by:	DARPA and NAI Labs.
2002-03-31 21:17:12 +00:00
Alan Cox
a5c0b1c020 Keep the reference to the file acquired in _aio_aqueue() until the operation
completes.  The reference is released in aio_free_entry().

Submitted by:	tegge
2002-03-31 20:17:56 +00:00
Alfred Perlstein
7b11fea64f Close some holes with p->p_args by NULL'ing out the p->p_args pointer
while holding the proc lock, and by holding the pargs structure when
accessing it from outside of the owner.

Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-03-31 10:33:12 +00:00
Poul-Henning Kamp
8d19a26558 Centralize the "bootdev" and "dumpdev" variables. They are still pretty
bogus all things considered, but at least now they don't camouflage as
being MD variables.
2002-03-31 07:15:28 +00:00
Alan Cox
5e20c11f19 Add a local proc *p in exec_new_vmspace() to avoid repeated dereferencing
to obtain it.
2002-03-31 00:05:30 +00:00
Bruce Evans
4f1f485f34 Fixed handling of short reads in readdisklabel() and writedisklabel().
These functions use DEV_STRATEGY() which can easily return a short
count (with no error) for reads near EOF.  EOF happens for "disks" too
small to contain a label sector (mainly for empty slices).  The functions
didn't understand this at all, and looked for labels in the garbage
in the buffer beyond what DEV_STRATEGY() returned.  The recent UMA
changes combined with my local changes and configuration resulted in
the garbage often containing a valid but garbage label left over from
a previous call.

Bugs in EOF handling in -current limited the problem to "disks" with
size precisely LABELSECTOR sectors.  LABELSECTOR happens to be a very
unusual "disk" size since it is only 0 for non-i386 arches that don't
usually have disks with DOS MBRs.
2002-03-30 16:02:43 +00:00
Dan Moschuk
e7876c0943 Nuke CV_DEBUG in favour of INVARIANTS.
Approved by: jhb
2002-03-30 03:52:52 +00:00
Jake Burkholder
b454c6dd29 Style fixes purposefully left out of last commit. I checked the kse tree
and didn't see any changes that this conflicts with.
2002-03-29 16:45:03 +00:00
Jake Burkholder
d0ce9a7e07 Remove abuse of intr_disable/restore in MI code by moving the loop in ast()
back into the calling MD code.  The MD code must ensure no races between
checking the astpending flag and returning to usermode.

Submitted by:	peter (ia64 bits)
Tested on:	alpha (peter, jeff), i386, ia64 (peter), sparc64
2002-03-29 16:35:26 +00:00
Seigo Tanimura
5cf4bcebbf The description of fd_mtx is "filedesc structure." 2002-03-29 11:26:05 +00:00
Matthew N. Dodd
32bc1098b2 Add resource_list_add_next() which returns the RID for the resource added. 2002-03-29 06:42:54 +00:00
Alfred Perlstein
c1508b28c6 To remove nested include of sys/lock.h and sys/mutex.h from sys/proc.h
make the pargs_* functions into non-inlines in kern/kern_proc.c.

Requested by: bde
2002-03-28 18:12:27 +00:00
Poul-Henning Kamp
45609bea17 Get the magnitude of the NTP adjustment right. 2002-03-28 16:02:44 +00:00
Maxime Henrion
daab5e2472 - Properly sync vfs_nmount() with changes that have already been done
  in vfs_mount(), in particular revisions 1.215, 1.227 and 1.240.
- flag2 is a low quality variable name, change it to kern_flag.
- strncpy NUL-terminates f_fstypename and f_mntonname since the strings
  have length <= <buffer length> - 1, so the explicit NUL-termination is
  bogus.
- M_ZERO'ing space for fstype and fspath is stupid since we never use the
  space beyond the end of the string.
- Do various style(9) cleanups in both functions.

Submitted by:	bde
Reviewed by:	phk
2002-03-28 13:47:32 +00:00
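The strncpy remark in daab5e2472 relies on standard C behaviour: when the source string is shorter than the count, strncpy() fills the rest of the destination with NUL bytes. A minimal userland sketch, with DEMO_NAMELEN and demo_copy_name() as hypothetical names:

```c
#include <assert.h>
#include <string.h>

#define DEMO_NAMELEN 16     /* hypothetical buffer length */

/*
 * Copy a name that is known to be shorter than the buffer.  strncpy()
 * pads the remainder of the destination with NUL bytes, so no explicit
 * terminator is needed in this case.
 */
static void
demo_copy_name(char dst[DEMO_NAMELEN], const char *src)
{
    assert(strlen(src) <= DEMO_NAMELEN - 1);
    strncpy(dst, src, DEMO_NAMELEN);
}
```

The guarantee holds only because the length precondition is met. A source of DEMO_NAMELEN bytes or more would leave the buffer unterminated, which is why the sketch asserts the bound.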
Alan Cox
cd430164f1 Allow recursion on the pipe mutex because filt_piperead() and filt_pipewrite()
can be called both with and without the pipe mutex held.  (For example,
if called by pipeselwakeup(), it is held.  Whereas, if called by kqueue_scan(),
it is not.)

Reviewed by:	alfred
2002-03-27 21:47:50 +00:00
Alfred Perlstein
8899023f66 Make the reference counting of 'struct pargs' SMP safe.
There are still some locations where the PROC lock should be held
in order to prevent inconsistent views from outside (like the
proc->p_fd fix for kern/vfs_syscalls.c:checkdirs()) that can be
fixed later.

Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-03-27 21:36:18 +00:00
Jeff Roberson
f22a4b62f5 Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks
with this flag.  Remove the dup_list and dup_ok code from subr_witness.  Now
we just check for the flag instead of doing string compares.

Also, switch the process lock, process group lock, and uma per cpu locks over
to this interface.  The original mechanism did not work well for uma because
per cpu lock names are unique to each zone.

Approved by:	jhb
2002-03-27 09:23:41 +00:00
Matthew Dillon
e6bbfd402d oops, forgot to commit this. td->td_savecrit = 0 replaced by API
call cpu_thread_link().
2002-03-27 08:26:37 +00:00
Jake Burkholder
f2a79bb9b4 Make this compile.
Pointy hat to:	dillon
2002-03-27 06:44:32 +00:00
Matthew Dillon
d74ac6819b Compromise for critical*()/cpu_critical*() recommit. Cleanup the interrupt
disablement assumptions in kern_fork.c by adding another API call,
cpu_critical_fork_exit().  Cleanup the td_savecrit field by moving it
from MI to MD.  Temporarily move cpu_critical*() from <arch>/include/cpufunc.h
to <arch>/<arch>/critical.c (stage-2 will clean this up).

Implement interrupt deferral for i386 that allows interrupts to remain
enabled inside critical sections.  This also fixes an IPI interlock bug,
and requires uses of icu_lock to be enclosed in a true interrupt disablement.

This is the stage-1 commit.  Stage-2 will occur after stage-1 has stabilized,
and will move cpu_critical*() into its own header file(s) + other things.
This commit may break non-i386 architectures in trivial ways.  This should
be temporary.

Reviewed by:	core
Approved by:	core
2002-03-27 05:39:23 +00:00
Bruce Evans
c0f7f75fd7 "Fixed" -Wshadow warnings by changing the name of some function parameters
from `index' to `indx'.  The correct fix would be to not support or use
index().
2002-03-27 04:04:17 +00:00
Alan Cox
cb100b25ce Remove an unnecessary and inconsistently used variable from exec_new_vmspace(). 2002-03-26 19:20:04 +00:00
Andrew R. Reiter
dcce8874eb - Fixup a few style nits:
  - return error -> return (error);
  - move a declaration to the top of the function.
  - become bug for bug compatible with if (error) lines.

Submitted by: bde
2002-03-26 18:07:10 +00:00
Maxime Henrion
17594b936b As discussed in -arch, add the new nmount(2) system call and the
new vfs_getopt()/vfs_copyopt() API.  This is intended to be used
later, when there will be filesystems implementing the VFS_NMOUNT
operation.  The mount(2) system call will disappear when all
filesystems will be converted to the new API.  Documentation will
be committed in a while.

Reviewed by:	phk
2002-03-26 15:33:44 +00:00
Bruce Evans
237e41fc58 Added used include of <sys/sx.h>. Don't depend on namespace pollution in
<sys/file.h>.
2002-03-26 01:09:51 +00:00
Bruce Evans
ee99e978a3 Added used include of <sys/sx.h>. Don't depend on namespace pollution in
<sys/file.h> or <sys/socketvar.h>.
2002-03-25 21:52:04 +00:00
David E. O'Brien
0beb3ecc6c Commit work-around for panics when mounting FS's that are auto-loaded as
modules (ie. procfs.ko).

When the kernel loads dynamic filesystem module, it looks for any of the
VOP operations specified by the new filesystem that have not been registered
already by the currently known filesystems.  If any of such operations exist,
vfs_add_vnops function calls vfs_opv_recalc function, which rebuilds vop_t
vectors for each filesystem and sets all global pointers like ufs_vnops_p,
devfs_specop_p, etc to the new values and then frees the old pointers.  This
behavior is bad because there might be already active vnodes whose v_op fields
will be left pointing to the random garbage, leading to inevitable crash soon.

Submitted by:	Alexander Kabaev <ak03@gte.com>
2002-03-25 21:30:50 +00:00
Andrew R. Reiter
517f30c2c1 - Recommit the securelevel_gt() calls removed by commits rev. 1.84 of
kern_linker.c and rev. 1.237 of vfs_syscalls.c since these are not the
  source of the recent panics occurring around kldloading file system
  support modules.

Requested by: rwatson
2002-03-25 18:26:34 +00:00
Poul-Henning Kamp
aaead0dfe9 Modernize my email address. 2002-03-25 13:52:45 +00:00
Bruce Evans
70f52b4845 Fixed some style bugs in the removal of __P(()). The main ones were
not removing tabs before "__P((", and not outdenting continuation lines
to preserve non-KNF lining up of code with parentheses.  Switch to KNF
formatting and/or rewrap the whole prototype in some cases.
2002-03-24 05:09:11 +00:00
John Baldwin
d846883bc4 Use td_ucred in several trivial syscalls and remove Giant locking as
appropriate.
2002-03-22 22:32:04 +00:00
John Baldwin
f2ae7368ea Use explicit Giant locks and unlocks rather than instrumented ones for
code that is still not safe.  suser() reads p_ucred so it still needs
Giant for the time being.  This should allow kern.giant.proc to be set
to 0 for the time being.
2002-03-22 21:02:02 +00:00
Robert Watson
29dc1288b0 Merge from TrustedBSD MAC branch:
Move the network code from using cr_cansee() to check whether a
    socket is visible to a requesting credential to using a new
    function, cr_canseesocket(), which accepts a subject credential
    and object socket.  Implement cr_canseesocket() so that it does a
    prison check, a uid check, and add a comment where shortly a MAC
    hook will go.  This will allow MAC policies to separately
    instrument the visibility of sockets from the visibility of
    processes.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-22 19:57:41 +00:00
Alfred Perlstein
db51256707 When "cloning" a pipe's buffer bcopy the data after dropping the pipe's
lock as the data may be paged out and cause a fault.
2002-03-22 16:09:22 +00:00
Robert Watson
7906271f25 In sysctl, req->td is believed always to be non-NULL, so there's no need
to test req->td for NULL values and then do somewhat more bizarre things
relating to securelevel special-casing and suser checks.  Remove the
testing and conditional security checks based on req->td!=NULL, and insert
a KASSERT that td != NULL.  Callers to sysctl must always specify the
thread (be it kernel or otherwise) requesting the operation, or a
number of current sysctls will fail due to assumptions that the thread
exists.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
Discussed with:	bde
2002-03-22 14:58:27 +00:00
Robert Watson
4584bb3945 Since cred never appears to be passed into the securelevel calls as
NULL, turn warning printf's into panic's, since this call has been
restructured such that a NULL cred would result in a page fault anyway.

There appears to be one case where NULL is explicitly passed in in the
sysctl code, and this is believed to be in error, so will be modified.
Securelevels now always require a credential context so that per-jail
securelevels are properly implemented.

Obtained from:	TrustedBSD Project
Sponsored by:	NAI Labs
Discussed with:	bde
2002-03-22 14:49:12 +00:00
Andrew R. Reiter
fe3240e9aa - Back out the commit to make the linker_load_file() securelevel check
made aware in jail environments.  Supposedly something is broken, so
  this should be backed out until further investigation proves otherwise,
  or a proper fix can be provided.
2002-03-22 04:56:09 +00:00
Robert Watson
1b350b4542 Break out the "see_other_uids" policy check from the various
method-based inter-process security checks.  To do this, introduce
a new cr_seeotheruids(u1, u2) function, which encapsulates the
"see_other_uids" logic.  Call out to this policy following the
jail security check for all of {debug,sched,see,signal} inter-process
checks.  This more consistently enforces the check, and makes the
check easy to modify.  Eventually, it may be that this check should
become a MAC policy, loaded via a module.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-22 02:28:26 +00:00
Andrew R. Reiter
e85b9ae9ac - Fix a logic error in checking the securelevel that was introduced in the
previous commit.

Pointy hats to: arr, rwatson
2002-03-21 15:27:39 +00:00
Warner Losh
cb9a238a8a Remove last two abuses of cpu_critical_{enter,exit} in the MI code.
Reviewed by: jake, jhb, rwatson
2002-03-21 06:11:09 +00:00
Benno Rice
565ab9395f Add a change mirroring that made to kern/subr_trap.c and others.
This makes kernel builds with DIAGNOSTIC work again.

Apparently forgotten by:	jhb
Might want to be checked by:	jhb
2002-03-21 02:47:51 +00:00
Jeff Roberson
59295dba57 UMA permitted us to utilize the 'waitok' flag to soalloc. 2002-03-20 21:23:26 +00:00
John Baldwin
01c04d2de9 Change the way we ensure td_ucred is NULL if DIAGNOSTIC is defined.
Instead of caching the ucred reference, just go ahead and eat the
decrement and increment of the refcount.  Now that Giant is pushed down
into crfree(), we no longer have to get Giant in the common case.  In the
case when we are actually free'ing the ucred, we would normally free it on
the next kernel entry, so the cost there is not new, just in a different
place.  This also removes td_cache_ucred from struct thread.  This is
still only done #ifdef DIAGNOSTIC.

[ missed this file in the previous commit ]

Tested on:	i386, alpha
2002-03-20 21:12:04 +00:00
John Baldwin
c1a513c951 - Push down Giant into crfree() in the case that we actually free a ucred.
- Add a cred_free_thread() function (conditional on DIAGNOSTICS) that drops
  a per-thread ucred reference to be used in debugging code when leaving
  the kernel.
2002-03-20 21:00:50 +00:00
Andrew R. Reiter
c457a4403a - Change a check of securelevel to securelevel_gt() call in order to help
against users within a jail attempting to load kernel modules.
- Add a check of securelevel_gt() to vfs_mount() in order to chop some
  low hanging fruit for the repair of securelevel checking of linking and
  unlinking files from within jails.  There is more to be done here.

Reviewed by: rwatson
2002-03-20 16:03:42 +00:00
Andrew R. Reiter
dca9d05526 - Remove a semi-colon from after SYSINIT that was introduced in rev. 1.163. 2002-03-20 14:46:38 +00:00
Jeff Roberson
586c8b6b29 Add calls to uma_zone_set_max() to restore previously enforced limits. 2002-03-20 05:30:58 +00:00
Jeff Roberson
54d77689ed Backout part of my previous commit; I was wrong about vm_zone's handling of
limits on zones w/o objects.
2002-03-20 04:39:32 +00:00
Jeff Roberson
9e9d298a9b Remove references to vm_zone.h and switch over to the new uma API. 2002-03-20 04:11:52 +00:00
Jeff Roberson
c897b81311 Remove references to vm_zone.h and switch over to the new uma API.
Also, remove maxsockets.  If you look carefully you'll notice that the old
zone allocator never honored this anyway.
2002-03-20 04:09:59 +00:00
Alfred Perlstein
4d77a549fe Remove __P. 2002-03-19 21:25:46 +00:00
Alfred Perlstein
1f31a77ce8 don't generate files with __P. 2002-03-19 20:48:32 +00:00
Andrew R. Reiter
08a54da785 - Change a malloc / bzero pair to make use of the M_ZERO malloc(9) flag. 2002-03-19 15:41:21 +00:00
Peter Wemm
30171114b3 Fix a gcc-3.1+ warning.
warning: deprecated use of label at end of compound statement

ie: you cannot do this anymore:
switch(foo) {
....

default:
}
2002-03-19 11:02:06 +00:00
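A minimal sketch of the form gcc-3.1+ accepts for the case described in 30171114b3. The function classify() is hypothetical and is not taken from the committed file; the point is only that the trailing label is followed by a statement.

```c
/* Hypothetical example function; not the code touched by the commit. */
int
classify(int foo)
{
	int kind = 0;

	switch (foo) {
	case 1:
		kind = 1;
		break;
	default:
		/* A label must be followed by a statement; "break;" supplies one. */
		break;
	}
	return (kind);
}
```

An empty statement (a lone ";") after the label would satisfy the compiler as well; "break;" is used here only because it reads more clearly.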
Peter Wemm
3ba30c18a2 Pacify gcc-3.1+, initialize two variables to avoid -Wuninitialized
warnings.
2002-03-19 10:57:40 +00:00
Peter Wemm
a5e7c7da5e Fix warnings on gcc-3.1+ where __func__ is a const char * instead of a
string.
2002-03-19 10:56:46 +00:00
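A hedged illustration of the kind of change a5e7c7da5e implies: once __func__ is an identifier rather than a string literal, it can no longer be pasted next to a literal and has to be passed through a format. The helpers format_where() and report() are invented for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats "<function>: <msg>" into buf. */
int
format_where(char *buf, size_t len, const char *func, const char *msg)
{
	assert(buf != NULL && len > 0);
	/* __func__ cannot be concatenated with a literal, so use %s. */
	return (snprintf(buf, len, "%s: %s", func, msg));
}

int
report(char *buf, size_t len)
{
	/* Old style that no longer compiles: __func__ ": bad state" */
	return (format_where(buf, len, __func__, "bad state"));
}
```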
Jeff Roberson
8355f576a9 This is the first part of the new kernel memory allocator. This replaces
malloc(9) and vm_zone with a slab like allocator.

Reviewed by:	arch@
2002-03-19 09:11:49 +00:00
Alfred Perlstein
4a950215ef Close a race when vfs_syscalls.c:checkdirs() runs.
To do this protect the filedesc pointer in the proc with PROC_LOCK
in both checkdirs() and kern_descrip.c:fdfree().
2002-03-19 04:30:04 +00:00
Bruce Evans
367b50a28f Fixed some printf format errors (hopefully all of the remaining daddr64_t
ones for GENERIC, and all others on the same line as those).  Reformat
the printfs if necessary to avoid new long lines or old format printf
errors.
2002-03-19 04:09:21 +00:00
Andrew R. Reiter
9b3851e9e3 - Lock down the ``module'' structure by adding an SX lock that is used by
all the global bits of ``module'' data.  This commit adds a few generic
  macros, MOD_SLOCK, MOD_XLOCK, etc., that are meant to be used as ways
  of accessing the SX lock.  It is also the first step in helping to lock
  down the kernel linker and module systems.

Reviewed by: jhb, jake, smp@
2002-03-18 07:45:30 +00:00
Kirk McKusick
a0595d0249 Add a flags parameter to VFS_VGET to pass through the desired
locking flags when acquiring a vnode. The immediate purpose is
to allow polling lock requests (LK_NOWAIT) needed by soft updates
to avoid deadlock when enlisting other processes to help with
the background cleanup. For the future it will allow the use of
shared locks for read access to vnodes. This change touches a
lot of files as it affects most filesystems within the system.
It has been well tested on FFS, loopback, and CD-ROM filesystems,
but only lightly on the others, so if you find a problem there, please
let me (mckusick@mckusick.com) know.
2002-03-17 01:25:47 +00:00
Jake Burkholder
ac59490b5e Convert all pmap_kenter/pmap_kremove pairs in MI code to use pmap_qenter/
pmap_qremove.  pmap_kenter is not safe to use in MI code because it is not
guaranteed to flush the mapping from the tlb on all cpus.  If the process
in question is preempted and migrates cpus between the call to pmap_kenter
and pmap_kremove, the original cpu will be left with stale mappings in its
tlb.  This is currently not a problem for i386 because we do not use PG_G on
SMP, and thus all mappings are flushed from the tlb on context switches, not
just user mappings.  This is not the case on all architectures, and if PG_G
is to be used with SMP on i386 it will be a problem.  This was committed by
peter earlier as part of his fine grained tlb shootdown work for i386, which
was backed out for other reasons.

Reviewed by:	peter
2002-03-17 00:56:41 +00:00
Dag-Erling Smørgrav
8bc814e603 Implement PT_IO (read / write arbitrary amounts of data or text).
Submitted by:	Artur Grabowski <art@{blahonga,openbsd}.org>
Obtained from:	OpenBSD
2002-03-16 02:40:02 +00:00
Dag-Erling Smørgrav
a888d317bb PT_[GS]ET{,DB,FP}REGS isn't really optional any more, since we have dummy
backend functions for those archs that don't support them.  I meant to do
this ages ago, but never got around to it.

Inspired by:	OpenBSD
2002-03-15 20:17:12 +00:00
Kirk McKusick
0d2af52141 Introduce the new 64-bit size disk block, daddr64_t. Change
the bio and buffer structures to have daddr64_t bio_pblkno,
b_blkno, and b_lblkno fields which allows access to disks
larger than a Terabyte in size. This change also requires
that the VOP_BMAP vnode operation accept and return daddr64_t
blocks. This delta should not affect system operation in
any way. It merely sets up the necessary interfaces to allow
the development of disk drivers that work with these larger
disk block addresses. It also allows for the development of
UFS2 which will use 64-bit block addresses.
2002-03-15 18:49:47 +00:00
Alfred Perlstein
628abf6c69 Giant pushdown for read/write/pread/pwrite syscalls.
kern/kern_descrip.c:
Acquire Giant in fdrop_locked when file refcount hits zero, this removes
the requirement for the caller to own Giant for the most part.

kern/kern_ktrace.c:
Acquire Giant in ktrgenio, simplifies locking in upper read/write syscalls.

kern/vfs_bio.c:
Acquire Giant in bwillwrite if needed.

kern/sys_generic.c
Giant pushdown, remove Giant for:
   read, pread, write and pwrite.
readv and writev aren't done yet because of the possible malloc calls
for iov to uio processing.

kern/sys_socket.c
Grab giant in the socket fo_read/write functions.

kern/vfs_vnops.c
Grab giant in the vnode fo_read/write functions.
2002-03-15 08:03:46 +00:00
Alfred Perlstein
3b018f572d Bug fixes:
Missed a place where the pipe sleep lock was needed in order to safely grab
Giant, fix it and add an assertion to make sure this doesn't happen again.

Fix typos in the PIPE_GET_GIANT/PIPE_DROP_GIANT that could cause the
wrong mutex to get passed to PIPE_LOCK/PIPE_UNLOCK.

Fix a location where the wrong pipe was being passed to
PIPE_GET_GIANT/PIPE_DROP_GIANT.
2002-03-15 07:18:09 +00:00
Alfred Perlstein
85f190e4d1 Fixes to make select/poll mpsafe.
Problem:
  selwakeup required calling pfind which would cause lock order
  reversals with the allproc_lock and the per-process filedesc lock.
Solution:
  Instead of recording the pid of the select()'ing process into the
  selinfo structure, actually record a pointer to the thread.  To
  avoid dereferencing a bad address all the selinfo structures that
  are in use by a thread are kept in a list hung off the thread
  (protected by sellock).  When a selwakeup occurs the selinfo is
  removed from that thread's list, it is also removed on the way out
  of select or poll where the thread will traverse its list removing
  all the selinfos from its own list.

Problem:
  Previously the PROC_LOCK was used to provide the mutual exclusion
  needed to ensure proper locking, this couldn't work because there
  was a single condvar used for select and poll and condvars can
  only be used with a single mutex.
Solution:
  Introduce a global mutex 'sellock' which is used to provide mutual
  exclusion when recording events to wait on as well as performing
  notification when an event occurs.

Interesting note:
  schedlock is required to manipulate the per-thread TDF_SELECT
  flag, however if given its own field it would not need schedlock,
  also because TDF_SELECT is only manipulated under sellock one
  doesn't actually use schedlock for synchronization, only to protect
  against corruption.

Proc locks are no longer used in select/poll.

Portions contributed by: davidc
2002-03-14 01:32:30 +00:00
Brian Feldman
0e0af8ecda Rename SI_SUB_MUTEX to SI_SUB_MTX_POOL to make the name at all accurate.
While doing this, move it earlier in the sysinit boot process so that the
VM system can use it.

After that, the system is now able to use sx locks instead of lockmgr
locks in the VM system.  To accomplish this, some of the more
questionable uses of the locks (such as testing whether they are
owned or not, as well as allowing shared+exclusive recursion) are
removed, and simpler logic throughout is used so locks should also be
easier to understand.

This has been tested on my laptop for months, and has not shown any
problems on SMP systems, either, so appears quite safe.  One more
user of lockmgr down, many more to go :)
2002-03-13 23:48:08 +00:00
Archie Cobbs
44a8ff315e Add realloc() and reallocf(), and make free(NULL, ...) acceptable.
Reviewed by:	alfred
2002-03-13 01:42:33 +00:00
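A userland sketch of the reallocf() behaviour named in 44a8ff315e, assuming the usual semantics: on failure the original buffer is released, so the caller cannot leak it. The name sketch_reallocf() is made up, and the kernel routine may take further arguments (such as a malloc type) that are omitted here.

```c
#include <stdlib.h>

/* Userland model only; not the kernel implementation. */
void *
sketch_reallocf(void *ptr, size_t size)
{
	void *n;

	n = realloc(ptr, size);
	/*
	 * realloc() leaves ptr intact when it fails for a nonzero size,
	 * so release it here on the caller's behalf.
	 */
	if (n == NULL && size != 0)
		free(ptr);
	return (n);
}
```

With this shape a caller can write p = sketch_reallocf(p, n) without keeping a second pointer around for the error path.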
Jeff Roberson
8de00f4a87 This patch adds the "LOCKSHARED" option to namei which causes it to only acquire shared locks on leafs.
The stat() and open() calls have been changed to make use of this new functionality.  Using shared locks in
these cases is sufficient and can significantly reduce their latency if IO is pending to these vnodes.  Also,
this reduces the number of exclusive locks that are floating around in the system, which helps reduce the
number of deadlocks that occur.

A new kernel option "LOOKUP_SHARED" has been added.  It defaults to off so this patch can be turned on for
testing, and should eventually go away once it is proven to be stable.  I have personally been running this
patch for over a year now, so it is believed to be fully stable.

Reviewed by:	jake, obrien
Approved by:	jake
2002-03-12 04:00:11 +00:00
Poul-Henning Kamp
417fb7f6fa Make the disk_clone() routine more robust for abuse.
Sneak in a trivial bit of the GEOM stuff while we're here anyway.
2002-03-11 08:08:02 +00:00
Seigo Tanimura
183ccde6c6 Stop abusing the pgrpsess_lock. 2002-03-11 07:53:13 +00:00
Seigo Tanimura
aa3bf85c54 Do not lock the pgrpsess_lock exclusively across ttywait().
Spotted by:		David Wolfskill <david@catwhisker.org>
Investigated by:	rwatson
2002-03-11 07:51:08 +00:00
David Malone
6c75a65a00 Don't assign strcmp to a variable called err and then compare it
with zero, just compare strcmp with zero. This fixes the same bug
which Maxim just fixed and fixes some odd style too.

PR:		35712
Reviewed by:	arr
2002-03-10 23:12:43 +00:00
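The style point in 6c75a65a00 can be sketched as follows. The function name_matches() and its arguments are hypothetical; the real change lives in the kernel linker code, which is not reproduced here.

```c
#include <string.h>

/* Hypothetical lookup; returns 1 when name equals one of the entries. */
int
name_matches(const char *name, const char *const *list, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		/* Compare strcmp() with zero directly; no "err" temporary. */
		if (strcmp(name, list[i]) == 0)
			return (1);
	}
	return (0);
}
```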
Maxim Sobolev
832af2d5ed Fix a breakage introduced in rev.1.75 (supposedly style cleanup), which results
in a "missing dependencies" error when loading some kld modules. It is sad to
see how often these days style cleanups break things that weren't broken. Perhaps
people should recall the good old principle: "don't fix it if it isn't broken".
2002-03-10 19:20:01 +00:00
Poul-Henning Kamp
01de1b13b8 Make the proposed name arg to dev_stdclone() const. 2002-03-10 10:50:05 +00:00
Alfred Perlstein
bbbb04ce62 Remove __P 2002-03-09 22:44:37 +00:00
Alfred Perlstein
be4af4b723 Don't deref NULL mutex pointer when pipeclose()'ing a pipe that is not
fully instantiated.

Revert the logic in pipeclose so that we don't have the entire function
pretty much under a single if() statement, instead invert the test and
just return if it fails.

Submitted (in different form) by: bde

Don't use pool mutexes for pipes.  We can not use pool mutexes
because we will need to grab the select lock while holding a pipe
lock which is not allowed because you may not acquire additional
mutexes when holding a pool mutex.

Instead malloc(9) space for the mutex that is shared between the
pipes.
2002-03-09 22:06:31 +00:00
Poul-Henning Kamp
1c1676edca Delete "notyet" code before it becomes "ohh no" code. 2002-03-09 20:11:25 +00:00
Luigi Rizzo
2dbd9d5bc3 Make the DEVICE_POLLING code compile with -Werror and in LINT 2002-03-09 08:02:52 +00:00
John Baldwin
60e269643d - Use a MI critical section in witness_sleep() and witness_list() as they
simply need to prevent switching from another CPU and do not need
  interrupts disabled.
- Add a comment to witness_list() about why displaying spin locks for
  threads on other CPU's really is just a bad idea and probably shouldn't
  be done.
2002-03-08 18:57:57 +00:00
John Baldwin
c29824db05 Read KTR_CPU into a temporary variable so that we use a consistent value
for both the cpumask check and the cpu entry field w/o needing to use
a critical section.
2002-03-08 18:55:59 +00:00
Poul-Henning Kamp
fb92273bdc Move the mount of the root filesystem to happen in the init process before
the exec of /sbin/init.

This allows the scheduler to get started and kthreads a chance to run
before we start filesystem operations.
2002-03-08 10:33:11 +00:00
Mike Silbersack
77a7d074e4 Unconditionally limit maxproc so that it is not possible
to exhaust all kmaps.  The only reward for setting maxproc
to a value which will cause kmap exhaustion is a panic
during a forkbomb attack.

MFC after:	3 days
2002-03-07 04:50:36 +00:00
Jake Burkholder
752dff3d9c Add needed includes of machine/smp.h, remove nested include in sys/smp.h
so that inlines in machine/smp.h can use variables declared in sys/smp.h.
2002-03-07 04:43:51 +00:00
Dag-Erling Smørgrav
e97c3e3d5c Rename runq_find() to runq_findproc(), and hide it behind #ifdef DIAGNOSTIC,
as it can have a severe impact on performance under high load, and the bug
it was meant to catch was fixed ages ago.
2002-03-06 15:34:07 +00:00
Maxim Konovalov
cf11f48256 Fix a typo, unbreak the world.
Thanks to:	mux
Approved by:	ru
2002-03-06 12:28:51 +00:00
Bruce Evans
3006e31679 Don't (blindly) truncate the unit number to 4 digits when formatting the
string returned by device_get_nameunit().
2002-03-06 11:34:02 +00:00
Maxim Konovalov
9dfd307b10 Maximum semid is seminfo.semmni not seminfo.semmsl.
PR:		kern/34979
Submitted by:	James Gritton <jamie@gritton.org>
Reviewed by:	alfred, ru
Approved by:	ru
MFC after:	1 week
2002-03-06 10:52:49 +00:00
Robert Watson
89e1164ee2 Three p_ucred -> td_ucred's missed in jhb's earlier pass; all appear to
be safe.
2002-03-05 19:45:45 +00:00
Robert Watson
b0ad6e203a The change from td->td_proc->p_ucred to td->td_ucred has shortened some
lines: more aggressively line wrap under those circumstances.
2002-03-05 19:31:25 +00:00
John Baldwin
c6f55f33ea - Use td_ucred for jail checks.
- Move jail checks and some other checks involving constants and stack
  variables out from under Giant.  This isn't perfectly safe atm because
  jail_sysvipc_allowed is read w/o a lock meaning that its value could be
  stale.  This global variable will soon become a per-jail flag, however,
  at which time it will either not need a lock or will use the prison lock.
2002-03-05 18:57:36 +00:00
Eivind Eklund
f52bd684f3 * Move bswlist declaration and initialization from kern/vfs_bio.c to
vm/vm_pager.c, which is the only place it is used.
* Make the QUEUE_* definitions and bufqueues local to vfs_bio.c.
* constify buf_wmesg.
2002-03-05 18:20:58 +00:00
Eivind Eklund
04858e7ee4 Change wmesg to const char * instead of char * 2002-03-05 17:45:12 +00:00
Robert Watson
ba51c2659d Part II: update various mechanically generated files to allow for new
system call number allocations.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-05 16:13:01 +00:00
Robert Watson
11ffd032ff Reserve system call numbers for the MAC framework. This will prevent
people working on the MAC tree from getting toasted whenever system call
numbers are allocated in the main tree (for example, for KSE :-).
Calls allocated: __mac_{get,set}_proc, __mac_{get,set}_{fd,file}().

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-05 16:11:11 +00:00
Eivind Eklund
eb8e6d5276 Document all functions, global and static variables, and sysctls.
Includes some minor whitespace changes, and re-ordering to be able to document
properly (e.g, grouping of variables and the SYSCTL macro calls for them, where
the documentation has been added.)

Reviewed by:	phk (but all errors are mine)
2002-03-05 15:38:49 +00:00
Robert Drehmel
6f60771b6d Fix a warning. 2002-03-05 15:19:33 +00:00
Jeff Roberson
88c99cfbc8 Add a new variable mp_maxid. This is used so that per cpu datastructures may
be allocated as arrays indexed by the cpu id.  Previously the only reliable
way to know the max cpu id was through MAXCPU. mp_ncpus isn't useful here
because cpu ids may be sparsely mapped, although x86 and alpha do not do this.

Also, call cpu_mp_probe much earlier so the max cpu id is known before the VM
starts up.  This is intended to help support per cpu queues for the new
allocator, but may be useful elsewhere.

Reviewed by:	jake
Approved by:	jake
2002-03-05 10:01:46 +00:00
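A small sketch of why 88c99cfbc8 sizes per-CPU arrays by the highest cpu id rather than by the CPU count. The structure and function names are invented for illustration; only the idea of an mp_maxid-style bound comes from the commit.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-CPU counters; cpu ids may be sparse (e.g. 0, 2, 5). */
struct pcpu_counters {
	int	maxid;		/* highest valid cpu id, like mp_maxid */
	long	*count;		/* maxid + 1 slots, indexed by cpu id */
};

int
pcpu_counters_init(struct pcpu_counters *pc, int maxid)
{
	/* Size by the highest id, not by the number of CPUs present. */
	pc->count = calloc((size_t)maxid + 1, sizeof(*pc->count));
	if (pc->count == NULL)
		return (-1);
	pc->maxid = maxid;
	return (0);
}

void
pcpu_counters_add(struct pcpu_counters *pc, int cpuid, long n)
{
	assert(cpuid >= 0 && cpuid <= pc->maxid);
	pc->count[cpuid] += n;
}
```

With three CPUs numbered 0, 2 and 5, an array of three slots would be overrun by cpu 5, while maxid + 1 slots cover every id at the cost of a few unused entries.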
Seigo Tanimura
996abba928 Track the number of wired pages to avoid unwiring unwired pages.
Reviewed by:	alfred
2002-03-05 00:51:03 +00:00
Mitsuru IWASAKI
899ccf541a Add generalized power profile code.
This allows other power-management systems (APM for now) to
generate power profile change events (i.e. AC-line status changes), and
allows other kernel components, not only the ACPI components, to be notified
of those events.

 - move subroutines in acpi_powerprofile.c (removed) to kern/subr_power.c
 - call power_profile_set_state() also from APM driver when AC-line
   status changes
 - add call-back function for Crusoe LongRun controlling on power
   profile changes as an example
2002-03-04 18:46:13 +00:00
Bosko Milekic
5a4f147089 Fix bug in mb_alloc that made systems configured with
PAGE_SIZE / MCLBYTES == 1 crash. Fix them by changing the
appropriate "allocate new page and bucket" code in mb_alloc to use
the macro for properly grabbing an allocated object from a bucket,
the one that checks whether the bucket is empty.
This should allow ken to continue testing zero-copy stuff on -CURRENT.

Noticed and provided debug info: ken
2002-03-03 22:10:04 +00:00
Dima Dorfman
e74d483140 Check the version of ex_anon (a `struct xucred') before using it to
fill out netc_anon (a `struct ucred'), and add an XXX around the
entire operation since it isn't clear whether it's doing the right
thing with things like cr_uidinfo and cr_prison.
2002-03-03 06:07:57 +00:00
Seigo Tanimura
92c914f936 Fix lock leakage and late unlock.
Submitted by:	bde
2002-03-02 12:42:24 +00:00
Ian Dowse
167b8d0334 In sosend(), enforce the socket buffer limits regardless of whether
the data was supplied as a uio or an mbuf. Previously the limit was
ignored for mbuf data, and NFS could run the kernel out of mbufs
when an ipfw rule blocked retransmissions.
2002-02-28 11:22:40 +00:00
Warner Losh
0cf3c909d8 Remove now unused struct proc *p.
Approved by: jhb
2002-02-27 20:57:57 +00:00
John Baldwin
bdd67d483c - Change namei() to use td_ucred instead of p_ucred.
- Change the hack in access() that uses a temporary credential to set
  td_ucred to the temp cred instead of p_ucred.
2002-02-27 19:15:29 +00:00
John Baldwin
6f105b3444 - Change unp_listen() to accept a thread rather than a proc as its second
argument.
- Use td_ucred in unp_listen() instead of p_ucred.
2002-02-27 19:14:01 +00:00
John Baldwin
4a7d6cd251 Fix Giant leakage in several error cases in __semctl(). 2002-02-27 19:12:14 +00:00
John Baldwin
6bd7ad69a1 Add a comment about an unlocked access to p_ucred that will go away in
the near future.
2002-02-27 19:10:50 +00:00
Alfred Perlstein
9f01374de5 kill __P. 2002-02-27 18:51:53 +00:00
Alfred Perlstein
566c1313a3 add assertions in the places where giant is required to catch when
the pipe is locked and shouldn't be.

initialize pipe->pipe_mtxp to NULL when creating pipes in order not
to trip the above assertions.

swap pipe lock with giant around calls to pipe_destroy_write_buffer()

pipe_destroy_write_buffer issue noticed by: jhb
2002-02-27 18:49:58 +00:00
John Baldwin
a854ed9893 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
John Baldwin
65e3406d28 Temporarily lock Giant while we update td_ucred. The proc lock doesn't
fully protect p_ucred yet so Giant is needed until all the p_ucred
locking is done.  This is the original reason td_ucred was not used
immediately after its addition.  Unfortunately, not using td_ucred is
not enough to avoid problems.  Since p_ucred could be stale, we could
actually be dereferencing a stale pointer to dink with the refcount, so
we really need Giant to avoid foot-shooting.  This allows td_ucred to
be safely used as well.
2002-02-27 18:30:01 +00:00
Alfred Perlstein
21dbcfd500 Fix a NULL deref panic in pipe_write, we can't blindly lock
pipe->pipe_peer->pipe_mtxp because it may be NULL, so lock the
passed in pipe's mutex instead.
2002-02-27 17:23:16 +00:00
Robert Drehmel
ad1ff0997e Make getcredhostname() take a buffer and the buffer's size
as arguments.  The correct hostname is copied into the buffer
while having the prison's lock acquired in a jailed process'
case.

Reviewed by:	jhb, rwatson
2002-02-27 16:43:20 +00:00
Robert Drehmel
9484d0c0e8 Add a function which returns the correct hostname for a given
credential.

Reviewed by:	phk
2002-02-27 14:58:32 +00:00
Alfred Perlstein
ffddaaeeeb MPsafe fixes:
use SYSINIT to initialize pipe_zone.
use PIPE_LOCK to protect kevent ops.
2002-02-27 11:27:48 +00:00
Seigo Tanimura
2f9325870d Return ESRCH if the target process is not inferior to the curproc.
Spotted by:	HIROSHI OOTA <oota@LSi.nec.co.jp>
2002-02-27 10:38:14 +00:00
Alfred Perlstein
e6be967434 Don't hardcode /sys when making tags, instead use ${.CURDIR}/.. this
fixes a problem where one tries to make tags when the source isn't in
/sys.

Submitted by: Jihui Yang <yangjihui@yahoo.com>
2002-02-27 10:07:15 +00:00
Peter Wemm
d1693e1701 Back out all the pmap related stuff I've touched over the last few days.
There is some unresolved badness that has been eluding me, particularly
affecting uniprocessor kernels.  Turning off PG_G helped (which is a bad
sign) but didn't solve it entirely.  Userland programs still crashed.
2002-02-27 09:51:33 +00:00
Alfred Perlstein
f81b04d96c First rev at making pipe(2) pipes MPsafe.
Both ends of the pipe share a pool_mutex, this makes allocation
and deadlock avoidance easy.

Remove some un-needed FILE_LOCK ops while I'm here.

There are some issues wrt select and the f{s,g}etown code that
we'll have to deal with, I think we may also need to move the calls
to vfs_timestamp outside of the sections covered by PIPE_LOCK.
2002-02-27 07:35:59 +00:00
Dima Dorfman
76183f3453 Introduce a version field to `struct xucred' in place of one of the
spares (the size of the field was changed from u_short to u_int to
reflect what it really ends up being).  Accordingly, change users of
xucred to set and check this field as appropriate.  In the kernel,
this is being done inside the new cru2x() routine which takes a
`struct ucred' and fills out a `struct xucred' according to the
former.  This also has the pleasant side effect of removing some
duplicate code.

Reviewed by:	rwatson
2002-02-27 04:45:37 +00:00
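A rough model of the set-and-check pattern described in 76183f3453. Every structure, field, and constant below is hypothetical (including the version value); the real `struct ucred' and `struct xucred' layouts are not reproduced.

```c
#include <string.h>

#define	MODEL_XUCRED_VERSION	1	/* hypothetical value for this sketch */
#define	MODEL_NGROUPS		16

struct model_ucred {
	unsigned int	uid;
	short		ngroups;
	unsigned int	groups[MODEL_NGROUPS];
};

struct model_xucred {
	unsigned int	version;	/* takes the place of a former spare */
	unsigned int	uid;
	short		ngroups;
	unsigned int	groups[MODEL_NGROUPS];
};

/* Fill the exported form from the internal form, stamping the version. */
void
model_cru2x(const struct model_ucred *cr, struct model_xucred *xcr)
{
	memset(xcr, 0, sizeof(*xcr));
	xcr->version = MODEL_XUCRED_VERSION;
	xcr->uid = cr->uid;
	xcr->ngroups = cr->ngroups;
	memcpy(xcr->groups, cr->groups, sizeof(xcr->groups));
}

/* Consumers reject a structure whose version or group count is unexpected. */
int
model_xucred_ok(const struct model_xucred *xcr)
{
	return (xcr->version == MODEL_XUCRED_VERSION &&
	    xcr->ngroups >= 0 && xcr->ngroups <= MODEL_NGROUPS);
}
```

Keeping the fill logic in one routine is what removes the duplicated code the commit message mentions: every producer stamps the version the same way.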
Peter Wemm
bd1e3a0f89 Jake further reduced IPI shootdowns on sparc64 in loops by using ranged
shootdowns in a couple of key places.  Do the same for i386.  This also
hides some physical addresses from higher levels and has it use the
generic vm_page_t's instead.  This will help for PAE down the road.

Obtained from:	jake (MI code, suggestions for MD part)
2002-02-27 02:14:58 +00:00
Matthew Dillon
181df8c9d4 revert last commit temporarily due to whining on the lists. 2002-02-26 20:33:41 +00:00
Matthew Dillon
f96ad4c223 STAGE-1 of 3 commit - allow (but do not require) interrupts to remain
enabled in critical sections and streamline critical_enter() and
critical_exit().

This commit allows an architecture to leave interrupts enabled inside
critical sections if it so wishes.  Architectures that do not wish to do
this are not affected by this change.

This commit implements the feature for the I386 architecture and provides
a sysctl, debug.critical_mode, which defaults to 1 (use the feature).  For
now you can turn the sysctl on and off at any time in order to test the
architectural changes or track down bugs.

This commit is just the first stage.  Some areas of the code, specifically
the MACHINE_CRITICAL_ENTER #ifdef'd code, are strictly temporary and will
be cleaned up in the STAGE-2 commit when the critical_*() functions are
moved entirely into MD files.

The following changes have been made:

	* critical_enter() and critical_exit() for I386 now simply increment
	  and decrement curthread->td_critnest.  They no longer disable
	  hard interrupts.  When critical_exit() decrements the counter to
	  0 it effectively calls a routine to deal with whatever interrupts
	  were deferred during the time the code was operating in a critical
	  section.

	  Other architectures are unaffected.

	* fork_exit() has been conditionalized to remove MD assumptions for
	  the new code.  Old code will still use the old MD assumptions
	  in regards to hard interrupt disablement.  In STAGE-2 this will
	  be turned into a subroutine call into MD code rather than hardcoded
	  in MI code.

	  The new code places the burden of entering the critical section
	  in the trampoline code where it belongs.

	* I386: interrupts are now enabled while we are in a critical section.
	  The interrupt vector code has been adjusted to deal with the fact.
	  If it detects that we are in a critical section it currently defers
	  the interrupt by adding the appropriate bit to an interrupt mask.

	* In order to accomplish the deferral, icu_lock is required.  This
	  is i386-specific.  Thus icu_lock can only be obtained by mainline
	  i386 code while interrupts are hard disabled.  This change has been
	  made.

	* Because interrupts may or may not be hard disabled during a
	  context switch, cpu_switch() can no longer simply assume that
	  PSL_I will be in a consistent state.  Therefore, it now saves and
	  restores eflags.

	* FAST INTERRUPT PROVISION.  Fast interrupts are currently deferred.
	  The intention is to eventually allow them to operate either while
	  we are in a critical section or, if we are able to restrict the
	  use of sched_lock, while we are not holding the sched_lock.

	* ICU and APIC vector assembly for I386 cleaned up.  The ICU code
	  has been cleaned up to match the APIC code in regards to format
	  and macro availability.  Additionally, the code has been adjusted
	  to deal with deferred interrupts.

	* Deferred interrupts use a per-cpu boolean int_pending, and
	  masks ipending, spending, and fpending.  Being per-cpu variables
	  it is not currently necessary to use "lock;" prefixed bus cycles to modify them.

	  Note that the same mechanism will enable preemption to be
	  incorporated as a true software interrupt without having to
	  further hack up the critical nesting code.

	* Note: the old critical_enter() code in kern/kern_switch.c is
	  currently #ifdef to be compatible with both the old and new
	  methodology.  In STAGE-2 it will be moved entirely to MD code.

Performance issues:

	One of the purposes of this commit is to enhance critical section
	performance, specifically to greatly reduce bus overhead to allow
	the critical section code to be used to protect per-cpu caches.
	These caches, such as Jeff's slab allocator work, can potentially
	operate very quickly making the effective savings of the new
	critical section code's performance very significant.

	The second purpose of this commit is to allow architectures to
	enable certain interrupts while in a critical section.  Specifically,
	the intention is to eventually allow certain FAST interrupts to
	operate rather than defer.

	The third purpose of this commit is to begin to clean up the
	critical_enter()/critical_exit()/cpu_critical_enter()/
	cpu_critical_exit() API which currently has serious cross pollution
	in MI code (in fork_exit() and ast() for example).

	The fourth purpose of this commit is to provide a framework that
	allows kernel-preempting software interrupts to be implemented
	cleanly.  This is currently used for two forward interrupts in I386.
	Other architectures will have the choice of using this infrastructure
	or building the functionality directly into critical_enter()/
	critical_exit().

	Finally, this commit is designed to greatly improve the flexibility
	of various architectures to manage critical section handling,
	software interrupts, preemption, and other highly integrated
	architecture-specific details.
2002-02-26 17:06:21 +00:00
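The nesting-counter scheme described in f96ad4c223 can be modelled in userland roughly as follows. This is a single-CPU toy, not the i386 implementation: struct cpu_model and the model_*() functions are invented, and real interrupt delivery, icu_lock, and the separate fpending/spending masks are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-CPU model; names echo the commit but are not kernel code. */
struct cpu_model {
	int		critnest;	/* like curthread->td_critnest */
	uint32_t	ipending;	/* deferred interrupt bits */
	int		handled;	/* handlers run so far */
};

static void
run_handler(struct cpu_model *c, int irq)
{
	(void)irq;
	c->handled++;
}

void
model_critical_enter(struct cpu_model *c)
{
	c->critnest++;		/* no hard interrupt disable */
}

void
model_interrupt(struct cpu_model *c, int irq)
{
	assert(irq >= 0 && irq < 32);
	if (c->critnest > 0)
		c->ipending |= UINT32_C(1) << irq;	/* defer */
	else
		run_handler(c, irq);
}

void
model_critical_exit(struct cpu_model *c)
{
	int irq;

	assert(c->critnest > 0);
	if (--c->critnest != 0)
		return;
	/* Outermost exit: replay whatever was deferred. */
	for (irq = 0; irq < 32; irq++) {
		if (c->ipending & (UINT32_C(1) << irq)) {
			c->ipending &= ~(UINT32_C(1) << irq);
			run_handler(c, irq);
		}
	}
}
```

Entering and leaving the section costs only an increment and a decrement in the common case, which is the performance argument the commit makes; the replay loop runs only when something was actually deferred.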
Bruce Evans
ffe4d2f7c7 Fixed 3 regressions in rev.1.99 (clobbering of the English fix in rev.1.98,
and 2 unformattings).
2002-02-26 16:17:45 +00:00
Søren Schmidt
ed57cfc480 Hide "bla bla exists, skipping it" behind bootverbose. 2002-02-26 10:38:33 +00:00
Poul-Henning Kamp
c91f7a7332 Cast the variable, not the constant to 64 bits. 2002-02-26 09:27:39 +00:00
Poul-Henning Kamp
0f5c7c4b1c Fix warning in !SMP case.
Submitted by:	 Maxime Henrion <mux@mu.org>
2002-02-26 09:21:52 +00:00
Poul-Henning Kamp
1634e90817 Remove unused variable. 2002-02-26 09:16:27 +00:00
Peter Wemm
e2256f43ed Fix warning. s/microuptime()/binuptime()/ for switchtime initial value. 2002-02-26 01:03:39 +00:00
Peter Wemm
bd47bef5aa Fix a warning. Do not assume pointer == long. 2002-02-26 00:55:27 +00:00
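The commit above does not show its diff, so the following is only a generic illustration of not assuming pointer == long: an integer type defined to hold a pointer is used instead. The helper names are hypothetical.

```c
#include <stdint.h>

/* uintptr_t is wide enough for a pointer on every platform that provides it. */
uintptr_t
ptr_to_uint(const void *p)
{
	return ((uintptr_t)p);
}

const void *
uint_to_ptr(uintptr_t v)
{
	return ((const void *)v);
}
```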
Peter Wemm
6bd95d70db Work-in-progress commit syncing up pmap cleanups that I have been working
on for a while:
- fine grained TLB shootdown for SMP on i386
- ranged TLB shootdowns.. eg: specify a range of pages to shoot down with
  a single IPI, since the IPI is very expensive.  Adjust some callers
  that used to trigger this inside tight loops to do a ranged shootdown
  at the end instead.
- PG_G support for SMP on i386 (options ENABLE_PG_G)
- defer PG_G activation till after we decide what we are going to do with
  PSE and the 4MB pages at the start of the kernel.  This should solve
  some rumored strangeness about stale PG_G entries getting stuck
  underneath the 4MB pages.
- add some instrumentation for the fine TLB shootdown
- convert some asm instruction wrappers from functions to inlines.  gcc
  seems to do a fair bit better with this.
- [temporarily!] pessimize the tlb shootdown IPI handlers.  I will fix
  this again shortly.

This has been working fairly well for me for a while, but I have tweaked
it again prior to commit since my last major testing round.  The only
outstanding problem that I know of is PG_G related, which is why there
is an option for it (not on by default for SMP).  I have seen world
speedups of a few percent (as much as 4 or 5% in one case) but I have
*not* accurately measured this - I am a bit sceptical of these numbers.
2002-02-25 23:49:51 +00:00
Ian Dowse
ddb7d629f1 Sockets passed into uipc_abort() have been allocated by sonewconn()
but never accept'ed, so they must be destroyed. Originally, unp_drop()
detected this situation by checking if so->so_head is non-NULL.
However, since revision 1.54 of uipc_socket.c (Feb 1999), so->so_head
is set to NULL before calling soabort(), so any unix-domain sockets
waiting to be accept'ed are leaked if the server socket is closed.

Resolve this by moving the socket destruction code into uipc_abort()
itself, and making it unconditional (the other caller of unp_drop()
never needs the socket to be destroyed). Use unp_detach() to avoid
the original code duplication when destroying the socket.

PR:		kern/17895
Reviewed by:	dwmalone (an earlier version of the patch)
MFC after:	1 week
2002-02-25 00:03:34 +00:00
Poul-Henning Kamp
5b7d8efa8d Add a generation number to timecounters and spin if it changes under
our feet when we look inside timecounter structures.

Make the "sync_other" code more robust by never overwriting the
tc_next field.

Add counters for the bin[up]time functions.

Call tc_windup() in tc_init() and switch_timecounter() to make sure
we have all the fields set right.
2002-02-24 20:04:07 +00:00
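The generation-number idea in 5b7d8efa8d can be sketched as a reader that retries when the generation changes underneath it. The structure, the field names, and the convention that generation 0 means "update in progress" are assumptions made for this sketch, not details taken from the commit.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical timecounter snapshot; field names are illustrative. */
struct tc_model {
	atomic_uint	gen;		/* 0 means "update in progress" */
	uint64_t	offset_sec;
	uint64_t	offset_frac;
};

/* Single writer assumed. */
void
tc_model_update(struct tc_model *tc, uint64_t sec, uint64_t frac)
{
	unsigned int next;

	next = atomic_load(&tc->gen) + 1;
	if (next == 0)
		next = 1;		/* never publish generation 0 */
	atomic_store(&tc->gen, 0);	/* mark fields as unstable */
	tc->offset_sec = sec;
	tc->offset_frac = frac;
	atomic_store(&tc->gen, next);	/* publish */
}

void
tc_model_read(struct tc_model *tc, uint64_t *sec, uint64_t *frac)
{
	unsigned int gen;

	do {
		gen = atomic_load(&tc->gen);
		*sec = tc->offset_sec;
		*frac = tc->offset_frac;
		/* Retry if an update was running or finished meanwhile. */
	} while (gen == 0 || gen != atomic_load(&tc->gen));
}
```

The structure must start with a nonzero generation, otherwise the reader would spin forever. The data fields are plain here for brevity; code meant for truly concurrent readers and writers would need to make those accesses race-free as well.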
Poul-Henning Kamp
e9be968e95 Fix a typo (?) in the previous commit that told ttyprintf() to print the integer
part of the user-time as a 64bit quantity.  This resulted in weird
output from SIGINFO.
2002-02-24 19:56:41 +00:00
Seigo Tanimura
f591779bb5 Lock struct pgrp, session and sigio.
New locks are:

- pgrpsess_lock which locks the whole pgrps and sessions,
- pg_mtx which protects the pgrp members, and
- s_mtx which protects the session members.

Please refer to sys/proc.h for the coverage of these locks.

Changes on the pgrp/session interface:

- pgfind() needs the pgrpsess_lock held.

- The caller of enterpgrp() is responsible to allocate a new pgrp and
  session.

- Call enterthispgrp() in order to enter an existing pgrp.

- pgsignal() requires a pgrp lock held.

Reviewed by:	jhb, alfred
Tested on:	cvsup.jp.FreeBSD.org
		(which is a quad-CPU machine running -current)
2002-02-23 11:12:57 +00:00
Jake Burkholder
39dda4e363 Make this compile.
Pointy hat to:	julian
2002-02-23 01:42:13 +00:00
Julian Elischer
77c4066424 Add some DIAGNOSTIC code.
While in userland, keep the thread's ucred reference in a shadow
field so that the usual place to store it is NULL.
If DIAGNOSTIC is not set, the thread ucred is kept valid until the next
kernel entry, at which time it is checked against the process cred
and possibly corrected. Produces a BIG speedup in
kernels with INVARIANTS set. (A previous commit corrected it
for the non INVARIANTS case already)

Reviewed by:	dillon@freebsd.org
2002-02-22 23:58:22 +00:00
Andrew R. Reiter
e68baa7073 - Whitespace fixes leftover from previous commit.
Submitted by:	bde
2002-02-22 13:43:56 +00:00
Andrew R. Reiter
54c94c8a35 - Whitespace fixup left over from previous commit.
- Remove bogus cast.

Submitted by:	bde
2002-02-22 13:33:10 +00:00
Poul-Henning Kamp
1cbb9c3b03 Convert p->p_runtime and PCPU(switchtime) to bintime format. 2002-02-22 13:32:01 +00:00
Poul-Henning Kamp
4e2befc031 Use better scaling factor for NTPs correction.
Explain the magic.
2002-02-22 12:59:20 +00:00
Poul-Henning Kamp
57c10583aa GC: BIO_ORDERED, various infrastructure dealing with BIO_ORDERED. 2002-02-22 09:26:35 +00:00
Poul-Henning Kamp
986066d065 Replace bowrite() with BUF_WRITE in ufs.
Remove bowrite(), it is now unused.

This is the first step in getting entirely rid of BIO_ORDERED which is
a generally accepted evil thing.

Approved by:	mckusick
2002-02-22 09:03:00 +00:00
Andrew R. Reiter
8e92b63c6f - Massive style fixup.
Reviewed by: mike
Approved by: dfr
2002-02-22 04:14:49 +00:00
Boris Popov
cebcee2e9e Add support for iovcnt greater than 1. This should resolve problems
with some applications.

Obtained from:	Darwin project
MFC after:	2 weeks
2002-02-21 16:23:38 +00:00
Bruce Evans
19610b66d8 Fixed some style bugs. Added a comment about a bug in PT_SSTEP.
Approved by:	des
2002-02-21 04:47:38 +00:00
Bruce Evans
4b1aa58b5f Recover bits that were lost in transition in rev.1.76:
- P_INMEM checks in all the functions.  P_INMEM must be checked because
  PHOLD() is broken.  The old bits had bogus locking (using sched_lock)
  to lock P_INMEM.  After removing the P_INMEM checks, we were left with
  just the bogus locking.
- large comments.  They were too large, but better than nothing.

Remove obfuscations that were gained in transition in rev.1.76:
- PROC_REG_ACTION() is even more of an obfuscation than PROC_ACTION().

The change copies procfs_machdep.c rev.1.22 of i386/procfs_machdep.c
verbatim except for "fixing" the old-style function headers and adjusting
function names and comments.  It doesn't remove the bogus locking.

Approved by:	des
2002-02-21 04:37:55 +00:00
Julian Elischer
fd21c2b51c Oops, used wrong error value for unimplemented syscalls. 2002-02-20 22:27:09 +00:00
Peter Wemm
114730b0a8 Tidy up some unused variables 2002-02-20 21:25:44 +00:00
Andrew R. Reiter
b65420f968 - Fix style further by adding parentheses around return values so that
they look like:
	return (val);  instead of:  return val;
2002-02-20 16:05:30 +00:00
Andrew R. Reiter
287698b4f1 - Style.9 formatting fix; this commit is mostly white space related with
the next commit actually doing the:
	return val; -> return (val);
  changes.  This commit was done in preparation for getting ``struct
  modules'' locked down.

Reviewed by: bde
Approved by: dfr
2002-02-20 14:30:02 +00:00
Robert Watson
ec20f901a2 More cleanups relating to vm object allocation failure: make sure we
call VOP_CLOSE() with vp unlocked; clean up the return path a little,
inasmuch as our namei/vnode operation return paths can be cleaned
up.  For a return case that was apparently never taken, this sure
is ugly.

Reviewed by:	jeffr
2002-02-20 00:11:57 +00:00
Mike Silbersack
cc6712ea04 A few misc forkbomb defenses:
- Leave 10 processes for root-only use, the previous
  value of 1 was insufficient to run ps ax | more.
- Remove the printing of "proc: table full".  When the table
  really is full, this would flood the screen/logs, making
  the problem tougher to deal with.
- Force any process trying to fork beyond its user's maximum
  number of processes to sleep for .5 seconds before returning
  failure.  This turns 2000 rampaging fork monsters into 2000
  harmlessly snoozing fork monsters.

Reviewed by:	dillon, peter
MFC after:	1 week
2002-02-19 03:15:28 +00:00
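The policy described in cc6712ea04 can be written down as a small decision function. This is only a sketch of the commit message, not the fork code itself: `fork_check()`, the enum, and the parameter names are hypothetical, and treating root as exempt from the per-user limit is an assumption made here for illustration.

```c
#include <stdbool.h>

#define ROOT_RESERVED	10	/* slots kept for root-only use */

enum fork_verdict {
	FORK_OK,
	FORK_TABLE_FULL,	/* fail at once, without logging */
	FORK_USER_LIMIT_SLEEP	/* sleep 0.5 s, then fail */
};

static enum fork_verdict
fork_check(int nprocs, int maxproc, bool is_root, int user_procs,
    int user_limit)
{
	/* The table is full, or only the root-reserved slots remain. */
	if (nprocs >= maxproc ||
	    (!is_root && nprocs >= maxproc - ROOT_RESERVED))
		return (FORK_TABLE_FULL);
	/* Assumption: root is not subject to the per-user limit. */
	if (!is_root && user_procs >= user_limit)
		return (FORK_USER_LIMIT_SLEEP);
	return (FORK_OK);
}
```

A caller would sleep for half a second before returning the error in the `FORK_USER_LIMIT_SLEEP` case, which is what slows a fork bomb down.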
Julian Elischer
c28841c1da Add stub syscalls and definitions for KSE calls.
"Book'em Danno"
2002-02-19 02:40:31 +00:00
Julian Elischer
8a2c87e7c7 Add 5 KSE syscalls. Two will be implemented with the next KSE
step and the others are reservations for coming code.
All will be stubbed in this kernel in the next commit.
This will allow people to easily make KSE binaries for userland testing
(the syscalls will be in libc) but they will still need a real KSE kernel
to test it. (libc looks in /sys to decide what it should add stubs for).
2002-02-19 02:19:36 +00:00
Matthew Dillon
3e1ce344ba Load the current timecounter into tc. The timecounter global can change
at any time and we do not want to call one timecounter's function with
another timecounter's structural pointer.

MFC after:	3 days
2002-02-18 19:49:30 +00:00
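The hazard fixed in 3e1ce344ba is reading a changeable global pointer twice. A minimal sketch, with hypothetical names (`struct tc`, `current_tc`, `read_count()`) and a C11 atomic pointer standing in for the kernel global:

```c
#include <stdatomic.h>

/* Hypothetical, simplified timecounter: a method plus private state. */
struct tc {
	unsigned (*get)(struct tc *);
	unsigned base;
};

/* Stand-in for the global that may be switched at any time. */
static struct tc *_Atomic current_tc;

static unsigned
tc_get_base(struct tc *t)
{
	return (t->base);
}

static unsigned
read_count(void)
{
	/*
	 * Load the global once.  Writing current_tc->get(current_tc)
	 * instead would read it twice and could pair one counter's
	 * function with another counter's structure.
	 */
	struct tc *tc = atomic_load(&current_tc);

	return (tc->get(tc));
}
```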
Matthew Dillon
735da6de88 Add kern_giant_ucred to instrument Giant around ucred related operations
such as getgid(), setgid(), etc...
2002-02-18 17:51:47 +00:00
Poul-Henning Kamp
68edc1b939 Make v_addpollinfo() visible and non-inline.
Have callers only call it as needed.
Add necessary call in ufs_kqfilter().

Test-case found by:	Andrew Gallatin <gallatin@cs.duke.edu>
2002-02-18 16:18:02 +00:00
Robert Watson
b541b65d91 Rehash of 1.43: simply remove the comment, since it's highly redundant
and only partially correct.
2002-02-18 16:02:24 +00:00
Ian Dowse
b01bcf4c74 Add the braces missed by revision 1.131.
Pointy hat to:	rwatson
2002-02-18 12:46:18 +00:00
Poul-Henning Kamp
21dcdb38e1 Take the common case of gettimeofday(&tv, NULL) out from under Giant. 2002-02-18 08:40:28 +00:00
Poul-Henning Kamp
90737495aa Remove yet another redundant VN_KNOTE() macro. 2002-02-18 08:24:48 +00:00
Matthew Dillon
5638baf0c6 The ICANON flag is an lflag, not an iflag.
Submitted by:	Neelkanth Natu <neelnatu@yahoo.com>
MFC after:	3 days
2002-02-18 06:07:11 +00:00
Robert Watson
4729fbd85f When vn_open() is failing because it cannot allocate a vm object, call
VOP_CLOSE() on the vnode, so that VOP_OPEN() and VOP_CLOSE() calls
are symmetric in all failure cases.  This prevents an 'open' reference
from being leaked in that unlikely failure scenario.
2002-02-18 00:26:10 +00:00
Robert Watson
3056874a81 style(9) prefers formatted comments in '/*' ... '*/' as opposed to
#if 0'd.
2002-02-18 00:23:44 +00:00
Robert Watson
eae1306746 Per discussion at BSDCon, note that the vop_getattr locking protocol
should require a shared lock, rather than an exclusive lock, which can
improve performance.  No actual code change here, since a number of
VFS locking fixes are in the works.
2002-02-18 00:22:57 +00:00
Poul-Henning Kamp
4b55dbe36b Move the stuff related to select and poll out of struct vnode.
The use of the zone allocator may or may not be overkill.
There is an XXX: over in ufs/ufs/ufs_vnops.c that jlemon may need
to revisit.

This shaves about 60 bytes off struct vnode which on my laptop means
600k less RAM used for vnodes.
2002-02-17 21:15:36 +00:00
Poul-Henning Kamp
362912ebcc Remove cache_purgeleafdirs(), it has been #if 0 for quite some time. 2002-02-17 20:40:29 +00:00
Daniel Eischen
1e599eee20 Regenerate these files after change to syscalls.master. 2002-02-17 17:42:47 +00:00
Daniel Eischen
bc874287e9 Fix prototype to sigreturn to use struct __ucontext instead of ucontext_t. 2002-02-17 17:41:28 +00:00
Matthew Dillon
e1bca29fae replace the embedded cr_mtx in the ucred structure with cr_mtxp (a mutex
pointer), and use the mutex pool routines.  This greatly reduces the size
of the ucred structure.
2002-02-17 07:30:34 +00:00
Julian Elischer
2eb927e2bb If the credential on an incoming thread is correct, don't bother
reacquiring it. In the same vein, don't bother dropping the thread cred
when going to userland. We are guaranteed to need it when we come back,
(which we are guaranteed to do).

Reviewed by:	jhb@freebsd.org, bde@freebsd.org (slightly different version)
2002-02-17 01:09:56 +00:00
Brian Feldman
1b56782026 (Doing that whole test-immediately-after-commit-thing like obrien sez:)
Forgot to include lock.h and mutex.h for GIANT_REQUIRED.
2002-02-16 17:44:43 +00:00
Brian Feldman
1fd9f8f438 Add revoke_and_destroy_dev(), to be used by devices which decide when
they choose to destroy themselves without regard to whether or not
they are open.
2002-02-16 17:35:05 +00:00
Bruce Evans
8c3d74f4bf Fixed a typo in rev.1.65 that gave a reference to a nonexistent variable.
This was not detected by LINT because LINT is missing COMPAT_SUNOS.
2002-02-15 03:54:01 +00:00
Luigi Rizzo
e522304423 Make this compile after changes to kse structures.
This escaped because DEVICE_POLLING is disabled in LINT, as it is
not compatible with SMP. In fact, it is only a runtime problem,
so if we could recognize that we are building a LINT kernel
we could as well disable the check for SMP being defined.

Reported-by: Joe Clarke
2002-02-15 02:50:07 +00:00
Alan Cox
9fbd7ccf00 o Clearing p/td_retval[0] after aio_newproc() is unnecessary. (We stopped
calling rfork() to create aio threads in revision 1.46.)
o Don't recompute the FILE * when it's already stored in the kernel's AIOCB.
2002-02-12 17:40:41 +00:00
Alan Cox
96347d1e6d The previous commit included a change to fill_kinfo_proc() that results
in a NULL pointer dereference.  Repair this mistake.
2002-02-12 04:21:28 +00:00
Luigi Rizzo
daccb6386b MFS: synchronize the code with the version in -stable, specifically:
+ SYSCTL_ULONG -> SYSCTL_UINT
+ some procedure renaming and variable rearrangement
+ fix the 'interface going deaf' problem same as in -stable.
2002-02-11 23:56:18 +00:00
Julian Elischer
2c1007663f In a threaded world, different priorities become properties of
different entities.  Make it so.

Reviewed by:	jhb@freebsd.org (john baldwin)
2002-02-11 20:37:54 +00:00
David E. O'Brien
952539e39a Allow one to specify the AWK used in the environment (command line).
Gawk is blowing up when run natively on the sparc64 -- leading to totally
bogus kernel values (all "0x0").  Good ole BWK awk works fine however.
2002-02-11 03:54:30 +00:00
Poul-Henning Kamp
d9888e41d5 GC the unused einval()
Obtained from:	~bde/sys.dif.gz
2002-02-10 22:07:41 +00:00
Poul-Henning Kamp
58a24f7938 Style(9) nits.
Obtained from:	~bde/sys.dif.gz
2002-02-10 22:04:44 +00:00
Robert Watson
1745909176 Add a comment indicating that the locking protocol should be updated
to be 'L L L' for vop_getattr().  Don't update it yet, because there
are still many offenders.
2002-02-10 21:46:16 +00:00
Robert Watson
5da271f5a6 Add a comment indicating that VOP_GETATTR() is called without appropriate
locking in the core dump code.  This should be fixed.
2002-02-10 21:45:16 +00:00
Robert Watson
1ea030d8fe Make sure to hold vnode lock when calling into VOP_GETATTR().
Discussed with:	mckusick, phk
2002-02-10 21:44:30 +00:00
Robert Watson
894c9fe04e Add a comment indicating that the vnode locking in this section of the
kernel linker code may be wrong: it fails to hold a lock across the
call to VOP_GETATTR(), and vn_rdwr() with IO_NODELOCKED.
2002-02-10 21:29:02 +00:00
Robert Watson
c0a9dc83c8 Make sure to grab vnode lock on a vnode before calling VOP_GETATTR()
to perform an ownership test in revoke().  This is also required for
MAC hooks so that the vnode lock is held during a call to the MAC
framework.  Release the lock before calling VOP_REVOKE().

Discussed with:	phk, mckusick
2002-02-10 20:45:43 +00:00
Robert Watson
56e04d01c0 Remove a stray 'const' that slipped into extattr_set_vp(), and could
result in compiler warnings.
2002-02-10 05:31:55 +00:00
Robert Watson
1aa1d02a98 Part II: Update system calls for extended attributes. Rebuild of
generated files.
2002-02-10 04:44:37 +00:00
Robert Watson
74237f55b0 Part I: Update extended attribute API and ABI:
o Modify the system call syntax for extattr_{get,set}_{fd,file}() so
  as not to use the scatter gather API (which appeared not to be used
  by any consumers, and be less portable), rather, accepts 'data'
  and 'nbytes' in the style of other simple read/write interfaces.
  This changes the API and ABI.

o Modify system call semantics so that extattr_get_{fd,file}() return
  a size_t.  When performing a read, the number of bytes read will
  be returned, unless the data pointer is NULL, in which case the
  number of bytes of data available is returned.  This changes the API only.

o Modify the VOP_GETEXTATTR() vnode operation to accept a size_t *
  argument so as to return the size, if desirable.  If set to NULL,
  the size will not be returned.

o Update various filesystems (pseudofs, ufs) to DTRT.

These changes should make extended attributes more useful and more
portable.  More commits to rebuild the system call files, as well
as update userland utilities to follow.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-02-10 04:43:22 +00:00
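The read semantics described in 74237f55b0 (return the number of bytes read, or the attribute size when the data pointer is NULL) can be sketched in userland as follows. `attr_get()` is a hypothetical helper working on an in-memory attribute; it is not the extattr_get_{fd,file}() system call and says nothing about how the filesystems implement it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy up to nbytes of an attribute into data and return the number
 * of bytes copied.  With data == NULL, only report the attribute size.
 */
static size_t
attr_get(const void *attr, size_t attr_len, void *data, size_t nbytes)
{
	size_t n;

	if (data == NULL)
		return (attr_len);	/* size query, nothing is copied */
	n = nbytes < attr_len ? nbytes : attr_len;
	memcpy(data, attr, n);
	return (n);
}
```

A caller would typically query the size first, allocate a buffer of that size, and then call again with the buffer.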
Julian Elischer
237a8a02da Replace accidentally removed setrunqueue()
solves problem with machines failing to sync in booting.
Submitted by: Tor.Egge@cvsup.no.freebsd.org
2002-02-09 01:38:16 +00:00
John Baldwin
18fc2ba9ff Use the mtx_owner() macro in one spot in _mtx_lock_sleep() to make the
code easier to read.
2002-02-09 00:12:53 +00:00
Thomas Moestl
2333d112fb Fix a bug introduced in r. 1.28: when copy{in,out} would fail for an
iovec that was not the last one in the uio, the error would be ignored
silently.

Bug found and fix proposed by:	jhb
2002-02-08 20:19:44 +00:00
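The class of bug fixed in 2333d112fb, an error from an earlier iovec being lost, can be shown with a loop over segments. This is a sketch under assumptions, not the uiomove code: `struct seg`, `fake_copy()` and `move_all()` are hypothetical, and `fake_copy()` merely simulates a failing copy{in,out}.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical segment; "bad" simulates an unmapped user address. */
struct seg {
	int bad;
	size_t len;
};

/* Stand-in for copyin()/copyout(): 0 on success, EFAULT on failure. */
static int
fake_copy(const struct seg *s)
{
	return (s->bad ? EFAULT : 0);
}

/* Move every segment, stopping at the first error. */
static int
move_all(const struct seg *segs, int nseg, size_t *done)
{
	int error = 0;

	*done = 0;
	for (int i = 0; i < nseg; i++) {
		error = fake_copy(&segs[i]);
		if (error != 0)
			break;	/* otherwise a later success overwrites it */
		*done += segs[i].len;
	}
	return (error);
}
```

Without the `break`, a failure on any segment other than the last would be overwritten by the result of the following copy.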
Peter Wemm
1037bbb195 Fix broken Giant locking protocol introduced in rev 1.114. You cannot
unlock Giant if it is not locked in the first place.  This makes the
nfstat(2) syscall (#278) a nice panic(2) implementation.
2002-02-08 09:16:57 +00:00
Peter Wemm
fe0d0493ac Bah, I managed to turn cosmetic things into real bugs. Fix shadowed
variable declarations. :-(  Definitely not my day today.
2002-02-08 08:56:01 +00:00