Commit Graph

5024 Commits

Author SHA1 Message Date
Jeff Roberson
90769c9ed0 Improve the VOP locking asserts
- Add vfs_badlock_print to control whether or not we print lock violations
 - Add vfs_badlock_panic to control whether we panic on lock violations

Both default to on to mimic the original behavior if DEBUG_VFS_LOCKS is on.
2002-06-28 20:58:14 +00:00
Ian Dowse
84b2995b2f In vn_mkdir(), use vrele() instead of vput() on the parent directory
vnode in the case that the target exists and is the same vnode as
the parent (i.e. "mkdir ."). The namei() call does not leave the
vnode locked in this case even though you might expect it to.

This bug was mostly harmless in practice because unlocking an already
unlocked vnode currently does not trigger any panics or warnings.

Reviewed by:	jeff
2002-06-28 20:06:47 +00:00
Jeff Roberson
5c71bc6cf2 Clean up vn_rdwr locking.
- Do shared locks on read.
 - Only do vn_{start,finished}_write when writing.
2002-06-28 17:51:11 +00:00
Brian Feldman
aac12bcfbc Fix a case where a vnode got explicitly unlocked after the pointer to it
got set to NULL.

Revision 1.355: in the box
2002-06-28 16:17:47 +00:00
Luigi Rizzo
d26d355f0e Remove a printf and add a comment on an assumption that could be
occasionally violated by device drivers.
2002-06-27 23:23:04 +00:00
Robert Watson
600c1a5a8e Fix a bug that prevented the deletion of non-default ACLs from being
passed down the VFS stack.  While I'm here, replace a '0' with a 'NULL'
to make the code more readable.

Sponsored by:	DARPA, NAI Labs
Obtained from:	TrustedBSD Project
2002-06-27 19:31:15 +00:00
Robert Watson
cbeb840245 A bit of whitespace magic. 2002-06-27 19:30:11 +00:00
Kenneth D. Merry
98cb733c67 At long last, commit the zero copy sockets code.
MAKEDEV:	Add MAKEDEV glue for the ti(4) device nodes.

ti.4:		Update the ti(4) man page to include information on the
		TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options,
		and also include information about the new character
		device interface and the associated ioctls.

man9/Makefile:	Add jumbo.9 and zero_copy.9 man pages and associated
		links.

jumbo.9:	New man page describing the jumbo buffer allocator
		interface and operation.

zero_copy.9:	New man page describing the general characteristics of
		the zero copy send and receive code, and what an
		application author should do to take advantage of the
		zero copy functionality.

NOTES:		Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS,
		TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT.

conf/files:	Add uipc_jumbo.c and uipc_cow.c.

conf/options:	Add the 5 options mentioned above.

kern_subr.c:	Receive side zero copy implementation.  This takes
		"disposable" pages attached to an mbuf, gives them to
		a user process, and then recycles the user's page.
		This is only active when ZERO_COPY_SOCKETS is turned on
		and the kern.ipc.zero_copy.receive sysctl variable is
		set to 1.

uipc_cow.c:	Send side zero copy functions.  Takes a page written
		by the user and maps it copy on write and assigns it
		kernel virtual address space.  Removes copy on write
		mapping once the buffer has been freed by the network
		stack.

uipc_jumbo.c:	Jumbo disposable page allocator code.  This allocates
		(optionally) disposable pages for network drivers that
		want to give the user the option of doing zero copy
		receive.

uipc_socket.c:	Add kern.ipc.zero_copy.{send,receive} sysctls that are
		enabled if ZERO_COPY_SOCKETS is turned on.

		Add zero copy send support to sosend() -- pages get
		mapped into the kernel instead of getting copied if
		they meet size and alignment restrictions.

uipc_syscalls.c:Un-staticize some of the sf* functions so that they
		can be used elsewhere.  (uipc_cow.c)

if_media.c:	In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid
		calling malloc() with M_WAITOK.  Return an error if
		the M_NOWAIT malloc fails.

		The ti(4) driver and the wi(4) driver, at least, call
		this with a mutex held.  This causes witness warnings
		for 'ifconfig -a' with a wi(4) or ti(4) board in the
		system.  (I've only verified for ti(4)).

ip_output.c:	Fragment large datagrams so that each segment contains
		a multiple of PAGE_SIZE amount of data plus headers.
		This allows the receiver to potentially do page
		flipping on receives.

if_ti.c:	Add zero copy receive support to the ti(4) driver.  If
		TI_PRIVATE_JUMBOS is not defined, it now uses the
		jumbo(9) buffer allocator for jumbo receive buffers.

		Add a new character device interface for the ti(4)
		driver for the new debugging interface.  This allows
		(a patched version of) gdb to talk to the Tigon board
		and debug the firmware.  There are also a few additional
		debugging ioctls available through this interface.

		Add header splitting support to the ti(4) driver.

		Tweak some of the default interrupt coalescing
		parameters to more useful defaults.

		Add hooks for supporting transmit flow control, but
		leave it turned off with a comment describing why it
		is turned off.

if_tireg.h:	Change the firmware rev to 12.4.11, since we're really
		at 12.4.11 plus fixes from 12.4.13.

		Add defines needed for debugging.

		Remove the ti_stats structure, it is now defined in
		sys/tiio.h.

ti_fw.h:	12.4.11 firmware.

ti_fw2.h:	12.4.11 firmware, plus selected fixes from 12.4.13,
		and my header splitting patches.  Revision 12.4.13
		doesn't handle 10/100 negotiation properly.  (This
		firmware is the same as what was in the tree previously,
		with the addition of header splitting support.)

sys/jumbo.h:	Jumbo buffer allocator interface.

sys/mbuf.h:	Add a new external mbuf type, EXT_DISPOSABLE, to
		indicate that the payload buffer can be thrown away /
		flipped to a userland process.

socketvar.h:	Add prototype for socow_setup.

tiio.h:		ioctl interface to the character portion of the ti(4)
		driver, plus associated structure/type definitions.

uio.h:		Change prototype for uiomoveco() so that we'll know
		whether the source page is disposable.

ufs_readwrite.c:Update for new prototype of uiomoveco().

vm_fault.c:	In vm_fault(), check to see whether we need to do a page
		based copy on write fault.

vm_object.c:	Add a new function, vm_object_allocate_wait().  This
		does the same thing that vm_object allocate does, except
		that it gives the caller the opportunity to specify whether
		it should wait on the uma_zalloc() of the object structre.

		This allows vm objects to be allocated while holding a
		mutex.  (Without generating WITNESS warnings.)

		vm_object_allocate() is implemented as a call to
		vm_object_allocate_wait() with the malloc flag set to
		M_WAITOK.

vm_object.h:	Add prototype for vm_object_allocate_wait().

vm_page.c:	Add page-based copy on write setup, clear and fault
		routines.

vm_page.h:	Add page based COW function prototypes and variable in
		the vm_page structure.

Many thanks to Drew Gallatin, who wrote the zero copy send and receive
code, and to all the other folks who have tested and reviewed this code
over the years.
2002-06-26 03:37:47 +00:00
Andrew R. Reiter
e024f583de - Remove Giant acquisition from modevent(), modfnext(), modstat() and
modfind().  Giant is no longer needed by these functions for safe
  execution.

Reviewed by:	jhb
2002-06-26 00:31:44 +00:00
Andrew R. Reiter
4e77f68011 - Alleviate jail() from having the burden of acquiring Giant by simply
removing.  We can do this since we no longer need Giant to safely
  execute jail().

Reviewed by:	rwatson, jhb
2002-06-26 00:29:01 +00:00
Alan Cox
366838ddfe o Eliminate vmspace::vm_minsaddr. It's initialized but never used.
o Replace stale comments in vmspace by "const until freed" annotations
   on some fields.
2002-06-25 18:14:38 +00:00
Jake Burkholder
8ba3d077ff Add an MD callout like cpu_exit, but which is called after sched_lock is
obtained, when all other scheduling activity is suspended.  This is needed
on sparc64 to deactivate the vmspace of the exiting process on all cpus.
Otherwise if another unrelated process gets the exact same vmspace structure
allocated to it (same address), its address space will not be activated
properly.  This seems to fix some spontaneous signal 11 problems with smp
on sparc64.
2002-06-24 15:48:02 +00:00
Maxime Henrion
0ae037eb45 Bring sys/kern/md5c.c in sync with the userland version.
Add a comment so that people don't forget to keep the
version in src/lib/libmd/md5c.c in sync with this one.
This fixes a warning on sparc64.

Reviewed by:	phk
2002-06-24 14:15:25 +00:00
Kirk McKusick
d374764fd3 Use proper size in bzero of stat structure.
Submitted by:	Jake Burkholder <jake@locore.ca>
Sponsored by:	DARPA & NAI Labs.
2002-06-24 07:14:44 +00:00
Jonathan Mini
01ad8a53db Remove unused diagnostic function cread_free_thread().
Approved by:	alfred
2002-06-24 06:22:00 +00:00
Matthew Dillon
727300861d I Noticed a defect in the way wakeup() scans the tailq. Tor noticed an
even worse defect in wakeup_one().  This patch cleans up both.

Submitted by:	tegge
MFC after:	3 days
2002-06-24 00:14:36 +00:00
Maxime Henrion
0a84e7c8f7 More 64 bits platforms warning fixes.
Reviewed by:	rwatson
2002-06-23 18:32:39 +00:00
Kirk McKusick
6524dddcd5 This patch fixes a size problem with the stat structure for
64-bit architectures that was introduced in the UFS2 code
merge two days ago. The stat structure change that caused
the problem was the addition of the file create time.

Submitted by:	Bruce Evans <bde@zeta.org.au>
Sponsored by:	DARPA & NAI Labs.
2002-06-22 22:01:13 +00:00
Maxime Henrion
2853bfa0df We don't need to check the return value of malloc() against
NULL when the M_WAITOK flag is specified.
2002-06-22 21:44:11 +00:00
Matthew Dillon
ed22d6e948 Fix a bug in vfs_bio_clrbuf(). The single-page-clrbuf optimization was
improperly clearing more then just the invalid portions of the page.  (This
bug is not known to have been triggered by anything).

Submitted by:	tegge
MFC after:	7 days
2002-06-22 19:09:35 +00:00
Maxime Henrion
cacd1c9b49 o Remove the initialization of unused fields in the struct
uio now that we don't use uiomove() anymore.
o Enforce stricter checks on the length of the iov's in
  nmount(2) since we now malloc() them individually and
  corrupted iov's could make the kernel crash in malloc()
  with "kmem_map too small".

Reviewed by:	phk
2002-06-22 18:07:05 +00:00
Jonathan Mini
9718382d85 Always drop the p_args reference we held for copyout, even if we're about
to change it. This fixes a leak triggered by setproctitle(3).

Approved by:	alfred
Noticed by:	Peter Jeremy <peter.jeremy@alcatel.com.au>
2002-06-22 10:05:50 +00:00
Kirk McKusick
1c85e6a35d This commit adds basic support for the UFS2 filesystem. The UFS2
filesystem expands the inode to 256 bytes to make space for 64-bit
block pointers. It also adds a file-creation time field, an ability
to use jumbo blocks per inode to allow extent like pointer density,
and space for extended attributes (up to twice the filesystem block
size worth of attributes, e.g., on a 16K filesystem, there is space
for 32K of attributes). UFS2 fully supports and runs existing UFS1
filesystems. New filesystems built using newfs can be built in either
UFS1 or UFS2 format using the -O option. In this commit UFS1 is
the default format, so if you want to build UFS2 format filesystems,
you must specify -O 2. This default will be changed to UFS2 when
UFS2 proves itself to be stable. In this commit the boot code for
reading UFS2 filesystems is not compiled (see /sys/boot/common/ufsread.c)
as there is insufficient space in the boot block. Once the size of the
boot block is increased, this code can be defined.

Things to note: the definition of SBSIZE has changed to SBLOCKSIZE.
The header file <ufs/ufs/dinode.h> must be included before
<ufs/ffs/fs.h> so as to get the definitions of ufs2_daddr_t and
ufs_lbn_t.

Still TODO:
Verify that the first level bootstraps work for all the architectures.
Convert the utility ffsinfo to understand UFS2 and test growfs.
Add support for the extended attribute storage. Update soft updates
to ensure integrity of extended attribute storage. Switch the
current extended attribute interfaces to use the extended attribute
storage. Add the extent like functionality (framework is there,
but is currently never used).

Sponsored by: DARPA & NAI Labs.
Reviewed by:	Poul-Henning Kamp <phk@freebsd.org>
2002-06-21 06:18:05 +00:00
Maxime Henrion
7d2d440991 Change the way we internally store the mount options to
a linked list.  This is to allow the merging of the mount
options in the MNT_UPDATE case, as the current data structure
is unsuitable for this.

There are no functional differences in this commit.

Reviewed by:	phk
2002-06-20 20:03:42 +00:00
Alfred Perlstein
c33c825169 Implement SO_NOSIGPIPE option for sockets. This allows one to request that
an EPIPE error return not generate SIGPIPE on sockets.

Submitted by: lioux
Inspired by: Darwin
2002-06-20 18:52:54 +00:00
Alfred Perlstein
69be5db96f Don't leak resources if fdcheckstd() fails during exec.
Submitted by: Mike Makonnen <makonnen@pacbell.net>
2002-06-20 17:27:28 +00:00
Ian Dowse
99568bcaf7 Display the mutex name in the ^T status line if the selected thread
is blocked on a mutex. Prepend a '*' to distinguish this case as
is done in top(1).
2002-06-20 14:03:36 +00:00
Peter Wemm
9d04103b7c Remove UIO_USERISPACE - we do not support any split instruction/data
address space machines (eg: pdp-11) and are not likely to ever do so.
Nothing in our kernel sets this.
2002-06-20 07:08:43 +00:00
Peter Wemm
2f9267ec23 Move the "- 1" into the RQB_FFS(mask) macro itself so that
implementations can provide a base zero ffs function if they wish.
This changes
  #define RQB_FFS(mask) (ffs64(mask))
  foo = RQB_FFS(mask) - 1;
to
  #define RQB_FFS(mask) (ffs64(mask) - 1)
  foo = RQB_FFS(mask);
On some platforms we can get the "- 1" for free, eg: those that use the
C code for ffs64().

Reviewed by:	jake (in principle)
2002-06-20 06:21:20 +00:00
Andrew R. Reiter
2eb7b21b00 - Remove the lock(9) protecting the kernel linker system.
- Added a mutex, kld_mtx, to protect the kernel_linker system.  Note that
  while ``classes'' is global (to that file), it is only read only after
  SI_SUB_KLD, SI_ORDER_ANY.
- Add a SYSINIT to flip a flag that disallows class registration after
  SI_SUB_KLD, SI_ORDER_ANY.

Idea for ``classes'' read only by:	jake
Reviewed by:	jake
2002-06-19 21:25:59 +00:00
Poul-Henning Kamp
c4bacc1871 Remove the compat bits for the mis-aligned struct disklabel on alpha,
people got three times longer than I promised.

Sponsored by: DARPA & NAI Labs.
2002-06-19 08:37:02 +00:00
Alfred Perlstein
1419eacb86 Squish the "could sleep with process lock" messages caused by calling
uifind() with a proc lock held.

change_ruid() and change_euid() have been modified to take a uidinfo
structure which will be pre-allocated by callers, they will then
call uihold() on the uidinfo structure so that the caller's logic
is simplified.

This allows one to call uifind() before locking the proc struct and
thereby avoid a potential blocking allocation with the proc lock
held.

This may need revisiting, perhaps keeping a spare uidinfo allocated
per process to handle this situation or re-examining if the proc
lock needs to be held over the entire operation of changing real
or effective user id.

Submitted by: Don Lewis <dl-freebsd@catspoiler.org>
2002-06-19 06:39:25 +00:00
Alfred Perlstein
f2102dadf9 setsugid() touches p->p_flag so assert that the proc is locked. 2002-06-18 22:41:35 +00:00
Seigo Tanimura
03e4918190 Remove so*_locked(), which were backed out by mistake. 2002-06-18 07:42:02 +00:00
Maxime Henrion
fe93750656 Change vfs_copyopt() so that the length argument passed to it
must be the exact same size as the mount option.  This makes
vfs_copyopt() much more useful.
2002-06-14 20:04:21 +00:00
Bosko Milekic
ad1026f912 Set system_map for both mbuf_map and clust_map to 1, in mbuf_init().
Submitted by: Tor Egge (tegge)
Pointed out to me by: hsu
2002-06-13 23:53:42 +00:00
Robert Watson
6480dc743e Regen. 2002-06-13 23:44:50 +00:00
Robert Watson
65772a1a0a Keep POSIX.1e capabilities system call placeholders, but remove definitions. 2002-06-13 23:43:53 +00:00
Robert Watson
a3cce19f7d kern_cap.c no longer needed. 2002-06-13 23:19:34 +00:00
Robert Watson
4aaae52d99 opt_cap.c no longer needed 2002-06-13 23:17:39 +00:00
Kelly Yancey
9ae6d334da Make nselcol, the number of select collisions since boot, unsigned as
negative collisions simply doesn't make sense.

PR:		(one small part of) 19720
Approved by:	alfred
2002-06-12 02:08:18 +00:00
Kelly Yancey
e3f0c5755c Time counter stats are unsigned, advertise them to sysctl(8) that way.
PR:		(one small part of) 19720
Approved by:	phk
2002-06-11 19:47:44 +00:00
Kelly Yancey
3316a80bd1 Convert hit and miss counters to unsigned values. Surely negative values
for either does not make sense.

PR:		(one small part of) 19720
2002-06-10 22:40:26 +00:00
John Baldwin
d0c149fce8 We no longer need to acqure Giant in ast() for ktrpsig() in postsig() now
that ktrace no longer needs Giant.
2002-06-07 05:43:40 +00:00
John Baldwin
374a15aa55 - trapsignal() no longer needs to acquire Giant for ktrpsig().
- Catch up to new ktrace API.
2002-06-07 05:43:02 +00:00
John Baldwin
af300f2367 - Proper locking for p_tracep and p_traceflag.
- Catch up to new ktrace API.
2002-06-07 05:42:25 +00:00
John Baldwin
6c84de02e0 Properly lock accesses to p_tracep and p_traceflag. Also make a few
ktrace-only things #ifdef KTRACE that were not before.
2002-06-07 05:41:27 +00:00
John Baldwin
9ba7fe1b76 - Catch up to new ktrace API.
- ktrace trace points in msleep() and cv_wait() no longer need Giant.
2002-06-07 05:39:16 +00:00
John Baldwin
60a9bb197d Catch up to changes in ktrace API. 2002-06-07 05:37:18 +00:00
John Baldwin
ea3fc8e4cd Overhaul the ktrace subsystem a bit. For the most part, the actual vnode
operations to dump a ktrace event out to an output file are now handled
asychronously by a ktrace worker thread.  This enables most ktrace events
to not need Giant once p_tracep and p_traceflag are suitably protected by
the new ktrace_lock.

There is a single todo list of pending ktrace requests.  The various
ktrace tracepoints allocate a ktrace request object and tack it onto the
end of the queue.  The ktrace kernel thread grabs requests off the head of
the queue and processes them using the trace vnode and credentials of the
thread triggering the event.

Since we cannot assume that the user memory referenced when doing a
ktrgenio() will be valid and since we can't access it from the ktrace
worker thread without a bit of hassle anyways, ktrgenio() requests are
still handled synchronously.  However, in order to ensure that the requests
from a given thread still maintain relative order to one another, when a
synchronous ktrace event (such as a genio event) is triggered, we still put
the request object on the todo list to synchronize with the worker thread.
The original thread blocks atomically with putting the item on the queue.
When the worker thread comes across an asynchronous request, it wakes up
the original thread and then blocks to ensure it doesn't manage to write a
later event before the original thread has a chance to write out the
synchronous event.  When the original thread wakes up, it writes out the
synchronous using its own context and then finally wakes the worker thread
back up.  Yuck.  The sychronous events aren't pretty but they do work.

Since ktrace events can be triggered in fairly low-level areas (msleep()
and cv_wait() for example) the ktrace code is designed to use very few
locks when posting an event (currently just the ktrace_mtx lock and the
vnode interlock to bump the refcoun on the trace vnode).  This also means
that we can't allocate a ktrace request object when an event is triggered.
Instead, ktrace request objects are allocated from a pre-allocated pool
and returned to the pool after a request is serviced.

The size of this pool defaults to 100 objects, which is about 13k on an
i386 kernel.  The size of the pool can be adjusted at compile time via the
KTRACE_REQUEST_POOL kernel option, at boot time via the
kern.ktrace_request_pool loader tunable, or at runtime via the
kern.ktrace_request_pool sysctl.

If the pool of request objects is exhausted, then a warning message is
printed to the console.  The message is rate-limited in that it is only
printed once until the size of the pool is adjusted via the sysctl.

I have tested all kernel traces but have not tested user traces submitted
by utrace(2), though they should work fine in theory.

Since a ktrace request has several properties (content of event, trace
vnode, details of originating process, credentials for I/O, etc.), I chose
to drop the first argument to the various ktrfoo() functions.  Currently
the functions just assume the event is posted from curthread.  If there is
a great desire to do so, I suppose I could instead put back the first
argument but this time make it a thread pointer instead of a vnode pointer.

Also, KTRPOINT() now takes a thread as its first argument instead of a
process.  This is because the check for a recursive ktrace event is now
per-thread instead of process-wide.

Tested on:	i386
Compiles on:	sparc64, alpha
2002-06-07 05:32:59 +00:00
John Baldwin
48849938e8 Change the all locks list from a STAILQ to a TAILQ. This bloats struct
lock_object by another pointer (though all of lock_object should be
conditional on LOCK_DEBUG anyways) in exchange for an O(1) TAILQ_REMOVE()
in witness_destroy() (called for every mtx_destroy() and sx_destroy())
instead of an O(n) STAILQ_REMOVE.  Since WITNESS is so dog slow as it is,
the speed-up is worth the space cost.

Suggested by:	iedowse
2002-06-06 20:51:04 +00:00
Chad David
ca18d53eae s/!SIGNOTEMPY/SIGISEMPTY/
Reviewed by: marcel, jhb, alfred
2002-06-06 19:12:41 +00:00
John Baldwin
8dcb900b62 Handle "dead" witnesses better in the situation of several short term locks
being created and destroyed without a single long-term one around to ensure
the witness associated with that group of locks stays alive.  The pipe
mutexes are an example of this group.  For a dead witness we no longer
clear the witness name.  Instead, when looking up the witness for a lock,
if a dead witness' (a witness with a refcount of 0) w_name pointer is
identical to the witness name of the lock then we revive that witness
instead of using a new witness for the lock.  This results in far fewer
dead witness objects and also better preserves locking orders over the long
term resulting in more correct lock order checking.  Note that we can't
ever derefence w_name of a dead witness since we don't know if the string
it is pointing to has been free()'d or kldunload()'d out from under us.
2002-06-06 19:04:38 +00:00
Dag-Erling Smørgrav
edad3af28d Move some sysctls from the debug tree to the vfs tree. 2002-06-06 15:50:22 +00:00
Dag-Erling Smørgrav
4a357a32e0 Gratuitous whitespace cleanup. 2002-06-06 15:46:38 +00:00
Poul-Henning Kamp
53e645d90e Use "bwrbg" as description when we sleep for background writing,
"biord" was misleading in every possible way.
2002-06-06 08:56:10 +00:00
Bruce Evans
6438c894da Fixed overflow in the bounds checking in dscheck(). It assumed that
daadr_t is no larger than a long, and some other relatively harmless
things (*blush*).  Overflow for subtracting a daddr_t from a u_long
caused "truncation" of the i/o for attempts to access blocks beyond
the end of the actually cause expansion of the i/o to a preposterous
size.
2002-06-06 00:35:07 +00:00
John Baldwin
6a95e08f2f Replace thread_runnable() with thread_running() as the latter is more
accurate.

Suggested by:	julian
2002-06-04 22:36:24 +00:00
John Baldwin
7fcca6096f Optimize the adaptive mutex spin a bit. Use a simple while loop with
simple reads (and on IA32, a "pause" instruction for each interation of the
loop) to spin until either the mutex owner field changes, or the lock owner
stops executing.

Suggested by:	tanimura
Tested on:	i386
2002-06-04 21:53:48 +00:00
John Baldwin
5853d37d3b Add a private thread_runnable() macro to make the code more readable and
make the KSE diff easier to maintain.
2002-06-04 21:50:02 +00:00
Dag-Erling Smørgrav
f16a5176ad ANSIfy the one remaining K&R function. 2002-06-02 21:57:28 +00:00
Dag-Erling Smørgrav
e89efc02e0 Whitespace nits. 2002-06-02 21:55:58 +00:00
Dag-Erling Smørgrav
3b1f7e7de0 Add support for 'j' flag. Simplify the size modifier code and reduce code
duplication.  Also add support for 'n' specifier.

Reviewed by:	bde
2002-06-02 21:54:55 +00:00
Jens Schweikhardt
21dc7d4f57 Fix typo in the BSD copyright: s/withough/without/
Spotted and suggested by:	des
MFC after:	3 weeks
2002-06-02 20:05:59 +00:00
Mike Barcroft
6ee093fb8f Add POSIX.1-2001 WCONTINUED option for waitpid(2). A proc flag
(P_CONTINUED) is set when a stopped process receives a SIGCONT and
cleared after it has notified a parent process that has requested
notification via waitpid(2) with WCONTINUED specified in its options
operand.  The status value can be checked with the new WIFCONTINUED()
macro.

Reviewed by:	jake
2002-06-01 18:37:46 +00:00
Archie Cobbs
48d183faca Fix a bug in m_split(): the "m->m_ext.ext_size" field of an mbuf was being
set to zero. This field indicates the total space in the external buffer
and therefore should not be modified after the external buffer is added.

Add a comment warning that the mbufs returned by m_split() might be read-only.

Fix M_TRAILINGSPACE() to return zero if !M_WRITABLE(m).

Reviewed by:	freebsd-net
Obtained from:	Vernier Networks, Inc.
MFC after:	1 week
2002-05-31 22:09:57 +00:00
Dag-Erling Smørgrav
7aa57dca57 Nit: kern.ttys is of type S,xtty, not S,tty. 2002-05-31 16:11:49 +00:00
Seigo Tanimura
4cc20ab1f0 Back out my lats commit of locking down a socket, it conflicts with hsu's work.
Requested by:	hsu
2002-05-31 11:52:35 +00:00
Robert Drehmel
280759e75e - Replace the bandaid introduced in revision 1.110 with
a better solution.
 - Add braces for a ``for'' statement containing a single
   multi-line statement.
2002-05-31 09:41:09 +00:00
Poul-Henning Kamp
eef633a71f Mistyped and lost a '&' in previous commit. 2002-05-30 16:26:39 +00:00
Poul-Henning Kamp
fe71224650 Don't forget to factor in the boottime when we calculate PPS timestamps.
Submitted by:	Akira Watanabe <akira@myaw.ei.meisei-u.ac.jp>
2002-05-30 10:34:01 +00:00
Jeff Roberson
7181624aaa Record the file, line, and pid of the last successful shared lock holder. This
is useful as a last effort in debugging file system deadlocks.  This is enabled
via 'options DEBUG_LOCKS'
2002-05-30 05:55:22 +00:00
Julian Elischer
628855e758 CURSIG() is not a macro so rename it cursig().
Obtained from:	KSE tree
2002-05-29 23:44:32 +00:00
Julian Elischer
2d0231f5da diff reduction from KSE to keep WW-III from happenning on -current 2002-05-29 20:40:50 +00:00
Dag-Erling Smørgrav
6b658142fd Add some checks to prevent NULL dereferences.
Submitted by:	jhay
2002-05-28 14:29:56 +00:00
Maxime Henrion
8eb0098f4c Remove a duplicated vfs_freeopts() that I introduced in last
revision.
2002-05-28 13:27:55 +00:00
Dag-Erling Smørgrav
6c533ac713 Add NAI copyright. 2002-05-28 06:53:41 +00:00
Marcel Moolenaar
52183d0145 Add uuidgen(2) and uuidgen(1).
The uuidgen command, by means of the uuidgen syscall, generates one
or more Universally Unique Identifiers compatible with OSF/DCE 1.1
version 1 UUIDs.

From the Perforce logs (change 11995):

Round of cleanups:
o  Give uuidgen() the correct prototype in syscalls.master
o  Define struct uuid according to DCE 1.1 in sys/uuid.h
o  Use struct uuid instead of uuid_t. The latter is defined
   in sys/uuid.h but should not be used in kernel land.
o  Add snprintf_uuid(), printf_uuid() and sbuf_printf_uuid()
   to kern_uuid.c for use in the kernel (currently geom_gpt.c).
o  Rename the non-standard struct uuid in kern/kern_uuid.c
   to struct uuid_private and give it a slightly better definition
   for better byte-order handling. See below.
o  In sys/gpt.h, fix the broken uuid definitions to match the now
   compliant struct uuid definition. See below.
o  In usr.bin/uuidgen/uuidgen.c catch up with struct uuid change.

A note about byte-order:
        The standard failed to provide a non-conflicting and
unambiguous definition for the binary representation. My initial
implementation always wrote the timestamp as a 64-bit little-endian
(2s-complement) integral. The clock sequence was always written
as a 16-bit big-endian (2s-complement) integral. After a good
nights sleep and couple of Pan Galactic Gargle Blasters (not
necessarily in that order :-) I reread the spec and came to the
conclusion that the time fields are always written in the native
by order, provided the the low, mid and hi chopping still occurs.
The spec mentions that you "might need to swap bytes if you talk
to a machine that has a different byte-order". The clock sequence
is always written in big-endian order (as is the IEEE 802 address)
because its division is resulting in bytes, making the ordering
unambiguous.
2002-05-28 06:16:08 +00:00
Marcel Moolenaar
494eefd86b Add syscall uuidgen() for generating Univerally Unique Identifiers
(UUIDs). On ia64 UUIDs, aka GUIDs, are used by EFI and the firmware
among others. To create GUID Partition Tables (GPTs), we need to
be able to generate UUIDs.
2002-05-28 05:58:06 +00:00
Dag-Erling Smørgrav
1a149fcd67 Introduce struct xtty, used when exporting tty information to userland.
Make kern.ttys export a struct xtty rather than struct tty.  Since struct
tty is no longer exposed to userland, remove the dev_t / udev_t hack.

Sponsored by:	DARPA, NAI Labs
2002-05-28 05:40:53 +00:00
Alan Cox
a739e09c6e o Remove some unnecessary casting from and add some necessary casting to
aio_suspend() and lio_listio().

Submitted by:	bde
2002-05-25 18:39:42 +00:00
Dag-Erling Smørgrav
4b4c18f861 ANSIfy (significant portions were already partly ANSIfied) 2002-05-25 15:52:53 +00:00
Dag-Erling Smørgrav
b7457aabf6 Remove register. 2002-05-25 15:44:38 +00:00
Dag-Erling Smørgrav
dedf14f521 Automated whitespace cleanup. 2002-05-25 15:43:06 +00:00
Jake Burkholder
d2ac231616 Make the run queue parameters machine dependent. Optimize 64 bit
architectures by using a 64 bit word for the bit array which keeps
track of non-empty queues.

Reviewed by:	peter
2002-05-25 01:12:23 +00:00
Peter Wemm
34e3110c70 Fix warnings. Also, removed an unused variable that I found that was just
initialized and never used afterwards.
2002-05-24 06:06:18 +00:00
Maxime Henrion
2274ec995c Style nit, no functional changes. 2002-05-23 23:22:22 +00:00
Maxime Henrion
9ee6bf717f Slightly change the way we pass mount options to the filesystem
VFS_NMOUNT operations.

Reviewed by:	phk
2002-05-23 23:02:19 +00:00
Hajimu UMEMOTO
4b562eede1 In m_aux_delete, no need to chase beyond victim.
Submitted by:	archie
Obtained from:	KAME
2002-05-23 15:59:48 +00:00
John Baldwin
cc5d39f81e Minor nit: get p pointer in msleep() from td->td_proc (where
td == curthread) rather than from curproc.
2002-05-23 04:14:18 +00:00
John Baldwin
a79c98fa98 Whitespace: trim a trailing tab. 2002-05-23 04:12:28 +00:00
Dag-Erling Smørgrav
db586c8b7c Make the counters uintmax_ts, and use %ju rather than %llu. 2002-05-23 03:08:42 +00:00
John Baldwin
6b8c698908 Rename pause() to ia32_pause() so it doesn't conflict with the pause()
function defined in <unistd.h>.  I didn't #ifdef _KERNEL it because the
mutex implementation in libpthread will probably need this.
2002-05-22 20:32:39 +00:00
John Baldwin
0228ea4e0b Rename cpu_pause() to pause(). Originally I was going to make this an
MI API with empty cpu_pause() functions on other arch's, but this
functionality is definitely unique to IA-32, so I decided to leave it
as i386-only and wrap it in #ifdef's.  I should have dropped the cpu_
prefix when I made that decision.

Requested by:	bde
2002-05-22 13:19:22 +00:00
John Baldwin
703fc290fb Add appropriate IA32 "pause" instructions to improve performanec on
Pentium 4's and newer IA32 processors.  The "pause" instruction has been
verified by Intel to be a NOP on all currently existing IA32 processors
prior to the Pentium 4.
2002-05-21 22:26:35 +00:00
Andrew R. Reiter
ec41816009 - td will never be NULL, so the call to soalloc() in socreate() will always
be passed a 1; we can, however, use M_NOWAIT to indicate this.
- Check so against NULL since it's a pointer to a structure.
2002-05-21 21:30:44 +00:00
John Baldwin
0e54ddadd9 Fix an old cut 'n' paste bug inherited from BSD/OS: don't increment 'i'
twice once we are in the long wait stage of spinning on a spin mutex.
2002-05-21 21:27:05 +00:00
Andrew R. Reiter
1515cd22e1 - OR the flag variable with M_ZERO so that the uma_zalloc() handles the
zero'ing out of the allocated memory.  Also removed the logical bzero
  that followed.
2002-05-21 21:18:41 +00:00
John Baldwin
e6302957fe Whitespace fixup, properly indent the body of an else clause. 2002-05-21 21:13:27 +00:00
John Baldwin
2498cf8c42 Add code to make default mutexes adaptive if the ADAPTIVE_MUTEXES kernel
option is used (not on by default).

- In the case of trying to lock a mutex, if the MTX_CONTESTED flag is set,
  then we can safely read the thread pointer from the mtx_lock member while
  holding sched_lock.  We then examine the thread to see if it is currently
  executing on another CPU.  If it is, then we keep looping instead of
  blocking.
- In the case of trying to unlock a mutex, it is now possible for a mutex
  to have MTX_CONTESTED set in mtx_lock but to not have any threads
  actually blocked on it, so we need to handle that case.  In that case,
  we just release the lock as if MTX_CONTESTED was not set and return.
- We do not adaptively spin on Giant as Giant is held for long times and
  it slows SMP systems down to a crawl (it was taking several minutes,
  like 5-10 or so for my test alpha and sparc64 SMP boxes to boot up when
  they adaptively spinned on Giant).
- We only compile in the code to do this for SMP kernels, it doesn't make
  sense for UP kernels.

Tested on:	i386, alpha, sparc64
2002-05-21 20:47:11 +00:00