freebsd-skq

Author	SHA1	Message	Date
jhb	11b212e025	- trapsignal() no longer needs to acquire Giant for ktrpsig(). - Catch up to new ktrace API.	2002-06-07 05:43:02 +00:00
jhb	b83763b249	- Proper locking for p_tracep and p_traceflag. - Catch up to new ktrace API.	2002-06-07 05:42:25 +00:00
jhb	eb29fde68b	Properly lock accesses to p_tracep and p_traceflag. Also make a few ktrace-only things #ifdef KTRACE that were not before.	2002-06-07 05:41:27 +00:00
jhb	fd3d90c2c8	- Catch up to new ktrace API. - ktrace trace points in msleep() and cv_wait() no longer need Giant.	2002-06-07 05:39:16 +00:00
jhb	fbebc83b5b	Catch up to changes in ktrace API.	2002-06-07 05:37:18 +00:00
jhb	ab80d12ef1	Overhaul the ktrace subsystem a bit. For the most part, the actual vnode operations to dump a ktrace event out to an output file are now handled asynchronously by a ktrace worker thread. This enables most ktrace events to not need Giant once p_tracep and p_traceflag are suitably protected by the new ktrace_lock. There is a single todo list of pending ktrace requests. The various ktrace tracepoints allocate a ktrace request object and tack it onto the end of the queue. The ktrace kernel thread grabs requests off the head of the queue and processes them using the trace vnode and credentials of the thread triggering the event. Since we cannot assume that the user memory referenced when doing a ktrgenio() will be valid and since we can't access it from the ktrace worker thread without a bit of hassle anyways, ktrgenio() requests are still handled synchronously. However, in order to ensure that the requests from a given thread still maintain relative order to one another, when a synchronous ktrace event (such as a genio event) is triggered, we still put the request object on the todo list to synchronize with the worker thread. The original thread blocks atomically with putting the item on the queue. When the worker thread comes across a synchronous request, it wakes up the original thread and then blocks to ensure it doesn't manage to write a later event before the original thread has a chance to write out the synchronous event. When the original thread wakes up, it writes out the synchronous event using its own context and then finally wakes the worker thread back up. Yuck. The synchronous events aren't pretty but they do work. Since ktrace events can be triggered in fairly low-level areas (msleep() and cv_wait() for example) the ktrace code is designed to use very few locks when posting an event (currently just the ktrace_mtx lock and the vnode interlock to bump the refcount on the trace vnode). 
This also means that we can't allocate a ktrace request object when an event is triggered. Instead, ktrace request objects are allocated from a pre-allocated pool and returned to the pool after a request is serviced. The size of this pool defaults to 100 objects, which is about 13k on an i386 kernel. The size of the pool can be adjusted at compile time via the KTRACE_REQUEST_POOL kernel option, at boot time via the kern.ktrace_request_pool loader tunable, or at runtime via the kern.ktrace_request_pool sysctl. If the pool of request objects is exhausted, then a warning message is printed to the console. The message is rate-limited in that it is only printed once until the size of the pool is adjusted via the sysctl. I have tested all kernel traces but have not tested user traces submitted by utrace(2), though they should work fine in theory. Since a ktrace request has several properties (content of event, trace vnode, details of originating process, credentials for I/O, etc.), I chose to drop the first argument to the various ktrfoo() functions. Currently the functions just assume the event is posted from curthread. If there is a great desire to do so, I suppose I could instead put back the first argument but this time make it a thread pointer instead of a vnode pointer. Also, KTRPOINT() now takes a thread as its first argument instead of a process. This is because the check for a recursive ktrace event is now per-thread instead of process-wide. Tested on: i386 Compiles on: sparc64, alpha	2002-06-07 05:32:59 +00:00
jhb	165d918ce2	Change the all locks list from a STAILQ to a TAILQ. This bloats struct lock_object by another pointer (though all of lock_object should be conditional on LOCK_DEBUG anyways) in exchange for an O(1) TAILQ_REMOVE() in witness_destroy() (called for every mtx_destroy() and sx_destroy()) instead of an O(n) STAILQ_REMOVE. Since WITNESS is so dog slow as it is, the speed-up is worth the space cost. Suggested by: iedowse	2002-06-06 20:51:04 +00:00
davidc	b44a13481e	s/!SIGNOTEMPY/SIGISEMPTY/ Reviewed by: marcel, jhb, alfred	2002-06-06 19:12:41 +00:00
jhb	de3e290d8f	Handle "dead" witnesses better in the situation of several short term locks being created and destroyed without a single long-term one around to ensure the witness associated with that group of locks stays alive. The pipe mutexes are an example of this group. For a dead witness we no longer clear the witness name. Instead, when looking up the witness for a lock, if a dead witness' (a witness with a refcount of 0) w_name pointer is identical to the witness name of the lock then we revive that witness instead of using a new witness for the lock. This results in far fewer dead witness objects and also better preserves locking orders over the long term resulting in more correct lock order checking. Note that we can't ever dereference w_name of a dead witness since we don't know if the string it is pointing to has been free()'d or kldunload()'d out from under us.	2002-06-06 19:04:38 +00:00
des	936333132d	Move some sysctls from the debug tree to the vfs tree.	2002-06-06 15:50:22 +00:00
des	8aef2ace20	Gratuitous whitespace cleanup.	2002-06-06 15:46:38 +00:00
phk	b98dc1ffce	Use "bwrbg" as description when we sleep for background writing, "biord" was misleading in every possible way.	2002-06-06 08:56:10 +00:00
bde	f55264a991	Fixed overflow in the bounds checking in dscheck(). It assumed that daddr_t is no larger than a long, and some other relatively harmless things (blush). Overflow for subtracting a daddr_t from a u_long caused "truncation" of the i/o for attempts to access blocks beyond the end to actually cause expansion of the i/o to a preposterous size.	2002-06-06 00:35:07 +00:00
jhb	4a77bedabf	Replace thread_runnable() with thread_running() as the latter is more accurate. Suggested by: julian	2002-06-04 22:36:24 +00:00
jhb	408adb7287	Optimize the adaptive mutex spin a bit. Use a simple while loop with simple reads (and on IA32, a "pause" instruction for each iteration of the loop) to spin until either the mutex owner field changes, or the lock owner stops executing. Suggested by: tanimura Tested on: i386	2002-06-04 21:53:48 +00:00
jhb	1ba6786436	Add a private thread_runnable() macro to make the code more readable and make the KSE diff easier to maintain.	2002-06-04 21:50:02 +00:00
des	7464466a40	ANSIfy the one remaining K&R function.	2002-06-02 21:57:28 +00:00
des	f58932ded5	Whitespace nits.	2002-06-02 21:55:58 +00:00
des	a79d7499e2	Add support for 'j' flag. Simplify the size modifier code and reduce code duplication. Also add support for 'n' specifier. Reviewed by: bde	2002-06-02 21:54:55 +00:00
schweikh	28bcbfe85d	Fix typo in the BSD copyright: s/withough/without/ Spotted and suggested by: des MFC after: 3 weeks	2002-06-02 20:05:59 +00:00
mike	1b681bdeaa	Add POSIX.1-2001 WCONTINUED option for waitpid(2). A proc flag (P_CONTINUED) is set when a stopped process receives a SIGCONT and cleared after it has notified a parent process that has requested notification via waitpid(2) with WCONTINUED specified in its options operand. The status value can be checked with the new WIFCONTINUED() macro. Reviewed by: jake	2002-06-01 18:37:46 +00:00
archie	f138a74bc8	Fix a bug in m_split(): the "m->m_ext.ext_size" field of an mbuf was being set to zero. This field indicates the total space in the external buffer and therefore should not be modified after the external buffer is added. Add a comment warning that the mbufs returned by m_split() might be read-only. Fix M_TRAILINGSPACE() to return zero if !M_WRITABLE(m). Reviewed by: freebsd-net Obtained from: Vernier Networks, Inc. MFC after: 1 week	2002-05-31 22:09:57 +00:00
des	ad1dca67b1	Nit: kern.ttys is of type S,xtty, not S,tty.	2002-05-31 16:11:49 +00:00
tanimura	e6fa9b9e92	Back out my last commit of locking down a socket, it conflicts with hsu's work. Requested by: hsu	2002-05-31 11:52:35 +00:00
robert	c2b2e9b3bc	- Replace the bandaid introduced in revision 1.110 with a better solution. - Add braces for a ``for'' statement containing a single multi-line statement.	2002-05-31 09:41:09 +00:00
phk	53143bb2c1	Mistyped and lost a '&' in previous commit.	2002-05-30 16:26:39 +00:00
phk	559ad51949	Don't forget to factor in the boottime when we calculate PPS timestamps. Submitted by: Akira Watanabe <akira@myaw.ei.meisei-u.ac.jp>	2002-05-30 10:34:01 +00:00
jeff	feec324370	Record the file, line, and pid of the last successful shared lock holder. This is useful as a last effort in debugging file system deadlocks. This is enabled via 'options DEBUG_LOCKS'	2002-05-30 05:55:22 +00:00
julian	304195369e	CURSIG() is not a macro so rename it cursig(). Obtained from: KSE tree	2002-05-29 23:44:32 +00:00
julian	200eddc848	diff reduction from KSE to keep WW-III from happening on -current	2002-05-29 20:40:50 +00:00
des	577e468b90	Add some checks to prevent NULL dereferences. Submitted by: jhay	2002-05-28 14:29:56 +00:00
mux	8ba975c439	Remove a duplicated vfs_freeopts() that I introduced in last revision.	2002-05-28 13:27:55 +00:00
des	35f5a040c8	Add NAI copyright.	2002-05-28 06:53:41 +00:00
marcel	58435e6cb7	Add uuidgen(2) and uuidgen(1). The uuidgen command, by means of the uuidgen syscall, generates one or more Universally Unique Identifiers compatible with OSF/DCE 1.1 version 1 UUIDs. From the Perforce logs (change 11995): Round of cleanups: o Give uuidgen() the correct prototype in syscalls.master o Define struct uuid according to DCE 1.1 in sys/uuid.h o Use struct uuid instead of uuid_t. The latter is defined in sys/uuid.h but should not be used in kernel land. o Add snprintf_uuid(), printf_uuid() and sbuf_printf_uuid() to kern_uuid.c for use in the kernel (currently geom_gpt.c). o Rename the non-standard struct uuid in kern/kern_uuid.c to struct uuid_private and give it a slightly better definition for better byte-order handling. See below. o In sys/gpt.h, fix the broken uuid definitions to match the now compliant struct uuid definition. See below. o In usr.bin/uuidgen/uuidgen.c catch up with struct uuid change. A note about byte-order: The standard failed to provide a non-conflicting and unambiguous definition for the binary representation. My initial implementation always wrote the timestamp as a 64-bit little-endian (2s-complement) integral. The clock sequence was always written as a 16-bit big-endian (2s-complement) integral. After a good night's sleep and a couple of Pan Galactic Gargle Blasters (not necessarily in that order :-) I reread the spec and came to the conclusion that the time fields are always written in the native byte order, provided the low, mid and hi chopping still occurs. The spec mentions that you "might need to swap bytes if you talk to a machine that has a different byte-order". The clock sequence is always written in big-endian order (as is the IEEE 802 address) because its division is resulting in bytes, making the ordering unambiguous.	2002-05-28 06:16:08 +00:00
marcel	e2eeb62542	Add syscall uuidgen() for generating Universally Unique Identifiers (UUIDs). On ia64 UUIDs, aka GUIDs, are used by EFI and the firmware among others. To create GUID Partition Tables (GPTs), we need to be able to generate UUIDs.	2002-05-28 05:58:06 +00:00
des	e332aae785	Introduce struct xtty, used when exporting tty information to userland. Make kern.ttys export a struct xtty rather than struct tty. Since struct tty is no longer exposed to userland, remove the dev_t / udev_t hack. Sponsored by: DARPA, NAI Labs	2002-05-28 05:40:53 +00:00
alc	afb615dae0	o Remove some unnecessary casting from and add some necessary casting to aio_suspend() and lio_listio(). Submitted by: bde	2002-05-25 18:39:42 +00:00
des	324a67fe9d	ANSIfy (significant portions were already partly ANSIfied)	2002-05-25 15:52:53 +00:00
des	94fe5108ff	Remove register.	2002-05-25 15:44:38 +00:00
des	f1297851a7	Automated whitespace cleanup.	2002-05-25 15:43:06 +00:00
jake	88bdee3b2f	Make the run queue parameters machine dependent. Optimize 64 bit architectures by using a 64 bit word for the bit array which keeps track of non-empty queues. Reviewed by: peter	2002-05-25 01:12:23 +00:00
peter	c952c3ce19	Fix warnings. Also, removed an unused variable that I found that was just initialized and never used afterwards.	2002-05-24 06:06:18 +00:00
mux	334d1908ec	Style nit, no functional changes.	2002-05-23 23:22:22 +00:00
mux	67080508a8	Slightly change the way we pass mount options to the filesystem VFS_NMOUNT operations. Reviewed by: phk	2002-05-23 23:02:19 +00:00
ume	a58ed55860	In m_aux_delete, no need to chase beyond victim. Submitted by: archie Obtained from: KAME	2002-05-23 15:59:48 +00:00
jhb	bd383063f6	Minor nit: get p pointer in msleep() from td->td_proc (where td == curthread) rather than from curproc.	2002-05-23 04:14:18 +00:00
jhb	2d4c041eb3	Whitespace: trim a trailing tab.	2002-05-23 04:12:28 +00:00
des	2fda28e6ab	Make the counters uintmax_ts, and use %ju rather than %llu.	2002-05-23 03:08:42 +00:00
jhb	096c0249dc	Rename pause() to ia32_pause() so it doesn't conflict with the pause() function defined in <unistd.h>. I didn't #ifdef _KERNEL it because the mutex implementation in libpthread will probably need this.	2002-05-22 20:32:39 +00:00
jhb	2f66cc911b	Rename cpu_pause() to pause(). Originally I was going to make this an MI API with empty cpu_pause() functions on other arch's, but this functionality is definitely unique to IA-32, so I decided to leave it as i386-only and wrap it in #ifdef's. I should have dropped the cpu_ prefix when I made that decision. Requested by: bde	2002-05-22 13:19:22 +00:00
jhb	3b7890a56f	Add appropriate IA32 "pause" instructions to improve performance on Pentium 4's and newer IA32 processors. The "pause" instruction has been verified by Intel to be a NOP on all currently existing IA32 processors prior to the Pentium 4.	2002-05-21 22:26:35 +00:00
arr	8f86bf993e	- td will never be NULL, so the call to soalloc() in socreate() will always be passed a 1; we can, however, use M_NOWAIT to indicate this. - Check so against NULL since it's a pointer to a structure.	2002-05-21 21:30:44 +00:00
jhb	0ceb358d5c	Fix an old cut 'n' paste bug inherited from BSD/OS: don't increment 'i' twice once we are in the long wait stage of spinning on a spin mutex.	2002-05-21 21:27:05 +00:00
arr	8bb819d225	- OR the flag variable with M_ZERO so that the uma_zalloc() handles the zero'ing out of the allocated memory. Also removed the logical bzero that followed.	2002-05-21 21:18:41 +00:00
jhb	6190f4162b	Whitespace fixup, properly indent the body of an else clause.	2002-05-21 21:13:27 +00:00
jhb	d3398f2f58	Add code to make default mutexes adaptive if the ADAPTIVE_MUTEXES kernel option is used (not on by default). - In the case of trying to lock a mutex, if the MTX_CONTESTED flag is set, then we can safely read the thread pointer from the mtx_lock member while holding sched_lock. We then examine the thread to see if it is currently executing on another CPU. If it is, then we keep looping instead of blocking. - In the case of trying to unlock a mutex, it is now possible for a mutex to have MTX_CONTESTED set in mtx_lock but to not have any threads actually blocked on it, so we need to handle that case. In that case, we just release the lock as if MTX_CONTESTED was not set and return. - We do not adaptively spin on Giant as Giant is held for long times and it slows SMP systems down to a crawl (it was taking several minutes, like 5-10 or so for my test alpha and sparc64 SMP boxes to boot up when they adaptively spun on Giant). - We only compile in the code to do this for SMP kernels, it doesn't make sense for UP kernels. Tested on: i386, alpha, sparc64	2002-05-21 20:47:11 +00:00
jhb	fd74bc1d8e	Optimize spin mutexes for UP kernels without debugging to just enter and exit critical sections. We only contest on a spin mutex on an SMP kernel running on an SMP machine.	2002-05-21 20:34:28 +00:00
jhb	a4a680304c	In witness_unlock(), when updating a lock list entry bucket, decrement the count of lock list entries after we fixup the bucket of lock list entries. In theory we can remove the intr_disable/intr_restore() calls now.	2002-05-20 19:16:22 +00:00
jake	dca97f2341	Add a bandaid so that sysctl kern.malloc works on sparc64.	2002-05-20 18:29:37 +00:00
jhb	4423d1f90a	- Allow witness_sleep() to be called when witness hasn't been initialized yet. We just return without performing any checks. - Don't explicitly enter and exit critical sections when walking lock lists. We don't need a critical section to walk the list of sleep locks for a thread. We check to see if a spin lock list is empty before we walk it. If the list is empty we don't need to walk it. If it isn't then we already hold at least one spin lock and are already in a critical section and thus don't need our own explicit critical section.	2002-05-20 17:49:46 +00:00
jhb	bb678d578d	Fix the td_intr_nesting_level check to work ok if a flag like M_ZERO is passed in with M_WAITOK to malloc().	2002-05-20 17:46:57 +00:00
silby	85e17a3398	Subtle fix to the accept filter LRU code. In some cases, a newly initialized socket with no qlimit was being passed in. In order to handle this case properly, we must not use >= when comparing queue sizes to qlimit. As a result of this improper handling, a panic could result in certain cases. PR: 38325 MFC after: 3 days	2002-05-20 17:34:31 +00:00
mux	85aa3f836d	Change two vput() that should have been vrele(). Submitted by: iedowse	2002-05-20 14:59:43 +00:00
tanimura	92d8381dd5	Lock down a socket, milestone 1. o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket. o Determine the lock strategy for each member in struct socket. o Lock down the following members: - so_count - so_options - so_linger - so_state o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket: - sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup() Reviewed by: alfred	2002-05-20 05:41:09 +00:00
marcel	1f1f792674	All signals can be sent to the inferior process when it's restarted, not just the legacy ones. PR: 33299 Submitted by: Alexander N. Kabaev <ak03@gte.com>	2002-05-19 01:37:43 +00:00
jhb	b6d6774e76	Change p_can{debug,see,sched,signal}()'s first argument to be a thread pointer instead of a proc pointer and require the process pointed to by the second argument to be locked. We now use the thread ucred reference for the credential checks in p_can*() as a result. p_canfoo() should now no longer need Giant.	2002-05-19 00:14:50 +00:00
jhb	7df8f89185	Now that daddr_t has grown up, use %lld to printf it and cast it to long long.	2002-05-18 23:46:04 +00:00
phk	c506e4337e	Use btodb() macro. Sponsored by: DARPA & NAI Labs.	2002-05-18 09:34:09 +00:00
eric	4579e1dcd0	Separate "seperate" from kernel source.	2002-05-16 22:43:20 +00:00
trhodes	28d42899b7	More s/file system/filesystem/g	2002-05-16 21:28:32 +00:00
mux	84d9baf797	o Fix vfs_copyopt(), the first argument to bcopy() is the source, not the destination. o Remove some code from vfs_getopt() which was making the interface more complicated to use for a very slight gain.	2002-05-16 17:09:41 +00:00
rwatson	61d5a9043f	p_cansignal() returns an errno value; at some point, the check for inter-process signalling ceased to preserve and return that value, instead always returning EPERM. This meant that it was possible to "probe" the pid space for processes that were not otherwise visible. This change reverts that reversion. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-05-14 23:07:15 +00:00
jeff	ba85b0e087	Disable the shared locking namei() code for now. It breaks several stacking filesystems. This is on hold until the rest of VFS Locking is reviewed and deemed safe. It can be enabled with 'options LOOKUP_SHARED'.	2002-05-14 21:59:49 +00:00
des	f2d1d92921	Remove a printf(3) argument with no corresponding format specifier.	2002-05-14 18:28:06 +00:00
phk	8536ea3cdb	Make daddr_t and u_daddr_t 64bits wide. Retire daddr64_t and use daddr_t instead. Sponsored by: DARPA & NAI Labs.	2002-05-14 11:09:43 +00:00
phk	02fe70f68e	Retire the bogus uses of the disklabel field d_sbsize and begin to initialize it to zero so we don't have to have everybody and their aunt including FFS specific header files. Sponsored by: DARPA & NAI Labs.	2002-05-12 20:49:41 +00:00
marcel	6683d5d11c	Fix alpha build. The alpha has dumpsys implemented. While here, revert the condition to list the machines for which dumpsys has not been implemented. Reported by: wilko	2002-05-12 18:27:28 +00:00
silby	f3419e2e8f	Change the mbuf exhaustion warning message to match the message in -stable.	2002-05-09 20:21:07 +00:00
mini	b6d1cd6b33	Remove trace_req(). Reviewed by: alfred, jhb, peter	2002-05-09 04:13:41 +00:00
alc	eff7d93533	o Correct an error made in revision 1.65: In readv(), if uap->iovcnt is out-of-range, drop the file reference before returning. (This error also exists in the RELENG_4 branch.) o Eliminate the acquisition and release of Giant in readv() now that malloc() and free() are callable without Giant.	2002-05-09 02:30:41 +00:00
alfred	8de609e473	expand_name fixes: .) don't use MAXPATHLEN + 1, fix logic to compensate. .) style(9) function parameters. .) fix line wrapping. .) remove duplicated error and string handling code. .) don't NUL terminate already NUL terminated string. .) all string length variables changed from int to size_t. .) constify variables. .) catch when corename would be truncated. .) cast pid_t and uid_t args for format string. .) add parens around return arguments. Help and suggestions from: bde	2002-05-08 09:06:47 +00:00
jake	4b2b9b41e7	Remove runq_findproc. This never worked right in the first place and can be prohibitively expensive.	2002-05-08 04:39:49 +00:00
alfred	c4da65d875	M_ZERO the temp buffer in expand_name() otherwise if an error occurs while logging we may pass a non NUL terminated string to log(9) for a %s format arg.	2002-05-07 23:37:07 +00:00
peter	890d39a38c	Re-remove kern_random.c and svr4_signal.c. Somehow dillon managed to keep on committing to these while they were in the Attic after they had been removed. I think this was because he had the file checked out and already 'modified' while markm cvs rm'ed them, and cvs screws up when trying to "merge" the modifications with the "rm". And after that the client state was sufficiently hosed to keep it messed up. Yay CVS! (CVS is very fragile for adding and removing files remotely) The existence of these files was pointed out by: ru	2002-05-07 21:54:47 +00:00
tanimura	9070f27e7d	Do not forget to increase the number of completely connected sockets in soisconnected_locked(). Forgotten by: tanimura	2002-05-07 16:17:44 +00:00
jeff	74069a30ee	Switch from just holding the interlock to holding the standard lock throughout getnewvnode(). This is safer. In the future, we should investigate requiring only the interlock to get the vnode object.	2002-05-07 02:44:06 +00:00
alfred	d1e340364b	Make funsetown() take a 'struct sigio **' so that the locking can be done internally. Ensure that no one can fsetown() to a dying process/pgrp. We need to check the process for P_WEXIT to see if it's exiting. Process groups are already safe because there is no such thing as a pgrp zombie, therefore the proctree lock completely protects the pgrp from having sigio structures associated with it after it runs funsetownlst. Add sigio lock to witness list under proctree and allproc, but over proc and pgrp. Seigo Tanimura helped with this.	2002-05-06 19:31:28 +00:00
jhb	1641885111	When checking to see if the init process calls exit1(), compare p to the initproc proc pointer instead of checking to see if the pid is 1. Submitted by: bde	2002-05-06 17:07:10 +00:00
jhb	c08f0c732a	Style fixes in local variable declarations. Submitted by: bde	2002-05-06 17:04:29 +00:00
jhb	a65193d5b1	- Style fixes in some comments. - Whitespace nit. - Sort some includes. Submitted by: bde (mostly)	2002-05-06 15:46:29 +00:00
jeff	bfe0870a56	Hold the currently selected vnode's lock across the call to VOP_GETVOBJECT. Don't try to create a vm object before the file system has a chance to finish initializing it. This is incorrect for a number of reasons. Firstly, that VOP requires a lock which the file system may not have initialized yet. Also, open and others will create a vm object if it is necessary later.	2002-05-06 04:47:43 +00:00
mux	b2f5ccfa53	Add the lchflags(2) syscall. Reviewed by: rwatson	2002-05-05 23:47:41 +00:00
mux	07314cd73a	Add an entry for the lchflags(2) syscall. It's useful to prevent a symlink deletion. Reviewed by: rwatson	2002-05-05 23:37:44 +00:00
jeff	4323e678da	Move a KASSERT() in open() prior to unlocking the vnode. It's not safe to call VOP_GETVOBJECT without a lock.	2002-05-05 23:17:13 +00:00
alc	c5483b3129	o Condition the compilation of uiomoveco() and vm_uiomove() on ENABLE_VFS_IOOPT. o Add a comment to the effect that this code is experimental support for zero-copy I/O.	2002-05-05 22:42:40 +00:00
phk	5020d62430	Expand the one-line function pbreassignbuf() in the only place it is or could be used.	2002-05-05 20:37:08 +00:00
bde	31ade1b13e	Return the correct error code (ENOSYS, not EINVAL) from nosys(). Getting killed by SIGSYS for unimplemented syscalls is bad enough. Obtained from: Lite2 branch The Lite2 branch has some other interesting unmerged (?) bits in this file. They are well hidden among cosmetic regressions.	2002-05-05 04:50:47 +00:00
bde	e0f62a1bbb	Fixed breakage of binary compatibility of the kern.clockrate sysctl in sys/time.h rev.1.53, etc. Zero out the entire struct clkinfo and not just the new spare part of it so that there is no possibility of leaking kernel stack context to userland.	2002-05-05 04:33:09 +00:00
mux	7856f6d21c	Fix a typo. Submitted by: dwmalone	2002-05-04 19:50:09 +00:00
phk	536c2f0f78	Remove a six year old undocumented #ifdef : NO_B_MALLOC.	2002-05-04 19:24:55 +00:00
dillon	226cd40e3d	Remove obsolete code (that was already #if 0'd out). Requested by: Hiten Pandya <hitmaster2k@yahoo.com>	2002-05-04 17:10:15 +00:00
alfred	1d5057f893	style(9): 'if' and 'while' need a space after them.	2002-05-04 07:40:49 +00:00
phk	d26e256ae9	Initialize time_second to 1 instead of zero to pacify slightly bogus arp code. Various minor style fixes from BDE.	2002-05-03 08:46:03 +00:00
tanimura	101b936bbc	As malloc(9) and free(9) are now Giant-free, remove the Giant lock across malloc(9) and free(9) of a pgrp or a session.	2002-05-03 07:46:59 +00:00
tanimura	58f1f5c532	Fix the lock order reversal between the sigio lock and a process/pgrp lock in funsetownlst() by locking the sigio lock across funsetownlst().	2002-05-03 05:32:25 +00:00
peter	ab041d4f7c	Retire makeobjops.pl - replaced by ../tools/makeobjops.awk.	2002-05-02 22:21:59 +00:00
phk	8cabbc69f8	As promised make the hack for sizeof(struct disklabel) on alpha annoying. Run make world (or recompile whatever program whines) to get rid of warning. Compat bits will be removed entirely in about two weeks.	2002-05-02 21:53:39 +00:00
mux	85b0c22bf2	Convert devfs to nmount. Reviewed by: phk	2002-05-02 20:27:42 +00:00
jhb	80604a408d	- Protect randompid and nprocs with the allproc_lock. - Reorder fork1() to do malloc() and other blocking operations prior to acquiring the needed process locks. - The new process inherits the credentials of curthread, not the credentials of the old process. - Document a really weird race that will come up when KSE allows multiple kernel threads per process.	2002-05-02 15:13:45 +00:00
jhb	32bb958227	- Reorder a few things so that when we lock the process at the end of exit1() we don't have to release it until we acquire sched_lock to call cpu_throw(). - Since we can switch at any time due to preemption or a lock release prior to acquiring sched_lock, don't update switchtime and switchticks until the very end of exit1() after we have acquired sched_lock. - Interlock the proctree_lock and proc lock in wait1() and exit1() to avoid lost wakeups when a parent blocks waiting for a child to exit at the bottom of wait1(). In exit1() the proc lock interlocked with proctree_lock (and released after acquiring sched_lock) is that of the parent process. - In wait1() use an exclusive lock of proctree lock while we are looking for a process to harvest. This allows us to completely remove all references to the process once we've found one (i.e., disconnect it from pgrp's, session's, zombproc list, and its parent's children list) "atomically" without needing to worry about a lock upgrade. - We don't need sched_lock to test if p_stat is SZOMB or SSTOP when holding the proc lock since the proc lock is always held when p_stat is set to SZOMB or SSTOP. - Protect nprocs with an xlock of the allproc_lock.	2002-05-02 15:09:58 +00:00
jhb	ce5fb0dc3a	- Reorder execve() so that it performs blocking operations before it locks the process. - Defer other blocking operations such as vrele()'s until after we release locks. - execsigs() now requires the proc lock to be held when it is called rather than locking the process internally.	2002-05-02 15:00:14 +00:00
jeff	6bfc4bdd96	Hide a pointer to the malloc_type bucket at the end of the freed memory. If this memory is modified after it has been freed we can now report its previous owner.	2002-05-02 09:07:04 +00:00
jeff	f7f01600de	malloc/free(9) no longer require Giant. Use the malloc_mtx to protect the mallochash. Mallochash is going to go away as soon as I introduce the kfree/kmalloc api and partially overhaul the malloc wrapper. This can't happen until all users of the malloc api that expect memory to be aligned on the size of the allocation are fixed.	2002-05-02 07:22:19 +00:00
jeff	b152d5fbb5	Remove the temporary alignment check in free(). Implement the following checks on freed memory in the bucket path: - Slab membership - Alignment - Duplicate free This previously was only done if we skipped the buckets. This code will slow down INVARIANTS a bit, but it is smp safe. The checks were moved out of the normal path and into hooks supplied in uma_dbg.	2002-05-02 02:08:48 +00:00
alfred	798c53d495	Redo the sigio locking. Turn the sigio sx into a mutex. Sigio lock is really only needed to protect interrupts from dereferencing the sigio pointer in an object when the sigio itself is being destroyed. In order to do this in the most unintrusive manner change pgsigio's sigio * argument into a **, that way we can lock internally to the function.	2002-05-01 20:44:46 +00:00
peter	84ae7c9225	Cosmetic tweaks. Try and keep the style more consistent, catch some stray whitespace and update a comment.	2002-05-01 02:51:50 +00:00
peter	55a74432bb	kern_tc.c doesn't use <machine/psl.h>, and having this #include breaks other platforms.	2002-05-01 01:31:26 +00:00
obrien	0c1a773004	Remove this Perl script. There have been zero bug reports against vnode_if.awk.	2002-05-01 00:40:44 +00:00
jeff	968fe15c4d	Convert longs to u_longs in stats. This will hold off wrap arounds for a while longer.	2002-04-30 22:39:32 +00:00
alc	0e84366ae7	o Convert the vm_page buckets mutex to a spin lock. (This resolves an issue on the Alpha platform found by jeff@.) o Simplify vm_page_lookup(). Reviewed by: jhb	2002-04-30 21:24:47 +00:00
phk	5ae616a516	Brucifixion ? Yes, out that door, row on the left, one patch each. Many thanks to: bde	2002-04-30 20:42:06 +00:00
dillon	8468513da0	These are Alexander Kabaev's VFSops fixes (see the thread 'Found: module loading breakage'). The patch fixes serious issues with the VFS operations vector array which results in a crash when a filesystem module adding a new VOP is loaded into the kernel. Basically what was happening before was that the old operations vector was being freed and a new one allocated. The original MALLOC code tended to reuse the same address for the case and so the bug did not rear its ugly head until the new memory subsystem was emplaced. This patch replaces the temporary workaround Dave O'Brien committed in 1.58. The patch is clean enough that I intend to MFC it to stable at some point. Submitted by: Alexander Kabaev <ak03@gte.com> MFC after: 1 week	2002-04-30 18:44:32 +00:00
jeff	21868731b0	Add a new UMA debugging facility. This will overwrite freed memory with 0xdeadc0de and then check for it just before memory is handed off as part of a new request. This will catch any post free/pre alloc modification of memory, as well as introduce errors for anything that tries to dereference it as a pointer. This code takes the form of special init, fini, ctor and dtor routines that are specifically used by malloc. It is in a separate file because additional debugging aids will want to live here as well.	2002-04-30 07:54:25 +00:00
jeff	06a56984b5	Move the implementation of M_ZERO into UMA so that it can be passed to uma_zalloc and friends. Remove this functionality from the malloc wrapper. Document this change in uma.h and adjust variable names in uma_core.	2002-04-30 04:26:34 +00:00
tanimura	89ec521d91	Revert the change of #includes in sys/filedesc.h and sys/socketvar.h. Requested by: bde Since locking sigio_lock is usually followed by calling pgsigio(), move the declaration of sigio_lock and the definitions of SIGIO_*() to sys/signalvar.h. While I am here, sort include files alphabetically, where possible.	2002-04-30 01:54:54 +00:00
rwatson	d139b64371	Re-add the 16384 bucket also. Submitted by: green	2002-04-29 17:53:23 +00:00
rwatson	c27fece07b	Revert a portion of kern_malloc.c:1.99, which (in addition to adding malloc profiling) also modified the set of pre-defined buckets for the memory allocator. For reasons unknown to me, this resulted in extensive memory corruption in the kernel, in particular on SMP boxes, so I'm committing this work-around until Jeff gets a chance to debug it properly. David Wolfskill pointed me at this commit as the one that might be a problem; I've been running this code on two dual-processor burn-in boxes for about 12 hours now, and the rate of panics due to memory corruption has dropped to zero (from one every five minutes). Hopefully not treading on the toes of: jeff	2002-04-29 17:12:02 +00:00
dwmalone	2eb82b93ad	Add a sysctl which disables the logging of console output. Approved by: phk MFC after: 2 weeks	2002-04-29 09:15:38 +00:00
asmodai	dafd57693b	Fix indention which I did wrong in a previous commit. Submitted by: bde	2002-04-29 08:18:06 +00:00
phk	307f787e5a	Stylistic sweep through the timecounter code. Renovate comments.	2002-04-28 18:24:21 +00:00
phk	e866359c06	Don't screw up our uptime with historical dates.	2002-04-28 16:51:36 +00:00
iedowse	08fc3f3e82	Avoid the user-visible effect of setting SA_NOCLDWAIT when the SIGCHLD handler is SIG_IGN. This is a reimplementation of the problematic revision 1.131 of kern_exit.c. To avoid accessing process UPAGES, we set a new procsig flag when the SIGCHLD handler is SIG_IGN and use that instead.	2002-04-27 22:41:41 +00:00
peter	cc7a68c868	Finish fixing hints. Remember the use_kenv state for the next run. Otherwise we fall back to using the static hints the next time around. We still have the leftover fallback code there which meant that we skipped the use_hints checking on the second and subsequent calls. Also, be a bit more careful about walking off the end of the envp array. I've extracted this from a larger diff. I hope I didn't miss anything...	2002-04-27 22:32:57 +00:00
peter	c204fdd4f3	Partial fix for hints Obtained from: mux	2002-04-27 22:25:13 +00:00
iedowse	9f30a58b28	Remove a stale comment saying that the vnode lock must be the first element in the structure pointed to by vp->v_data; the vnode lock is now within the vnode structure itself.	2002-04-27 22:20:33 +00:00
tanimura	6d8e4294e0	Fix the code fragment clobbered in my last commit.	2002-04-27 09:33:49 +00:00
tanimura	dbb4756491	Add a global sx sigio_lock to protect the pointer to the sigio object of a socket. This avoids lock order reversal caused by locking a process in pgsigio(). sowakeup() and the callers of it (sowwakeup, soisconnected, etc.) now require sigio_lock to be locked. Provide sowwakeup_locked(), soisconnected_locked(), and so on in case where we have to modify a socket and wake up a process atomically.	2002-04-27 08:24:29 +00:00
phk	bcaaa89ad0	Explain magic number. Add magic date no explanation. Add a delta which was lost in transit yesterday which prevented other timecounters from actually being used.	2002-04-27 07:28:54 +00:00
phk	521d4c87b6	Make the dummy timecounter actually tick or we will never get anywhere.	2002-04-27 07:06:52 +00:00
jhb	366bb5db9c	Whitespace bogon.	2002-04-27 04:48:36 +00:00
marcel	37e2e2ecca	Insert a semi-colon between label 'skip:' and the closing brace of the FOREACH loop to silence GCC 3.	2002-04-27 02:58:18 +00:00
mike	99e543a853	Move the new byte order function prototypes from <sys/param.h> to <sys/endian.h>. This puts us in line with NetBSD and OpenBSD.	2002-04-26 22:48:23 +00:00
phk	4c421c0b9a	Now that the private parts of timecounters are no longer being fingered by other bits of code, split struct timecounter into two. struct timecounter contains just the bits which pertain to the hardware counter and the reading of it. struct timehands (as in "the hands on a clock") contains all the ugly bit fiddling stuff. Statically compile ten timehands. This commit is the functional part. A later cosmetic patch will rename various variables and fieldnames.	2002-04-26 21:51:08 +00:00
phk	d1d55e6cb9	Hide the private parts of timecounter from a couple of places that don't really need to know the gory details.	2002-04-26 21:31:44 +00:00
phk	0054f0f74b	Simplify the RFC2783 and PPS_SYNC timestamp collection API.	2002-04-26 20:24:28 +00:00
phk	04257819a4	Move the winding of timecounters out of hardclock and into a normal timeout loop. Limit the rate at which we wind the timecounters to approx 1000 Hz. This limits the precision of the get{bin,nano,micro}[up]time(9) functions to roughly a millisecond.	2002-04-26 12:37:36 +00:00
phk	91f1d49b73	Various cleanup and sorting of clock reading functions. Add the two functions missing in the complete 12 function complement.	2002-04-26 10:19:29 +00:00
phk	76a2a4c2cf	Rename tco_setscales() and tco_delta() to use the same tc_ prefix as the rest of this file.	2002-04-26 10:11:02 +00:00
phk	f227fb83e6	Remove the tc_update() function. Any frequency change to the timecounter will be used starting at the next second, which is good enough for sysctl purposes. If better adjustment is needed the NTP PLL should be used.	2002-04-26 10:06:26 +00:00
brian	895107253f	Test if rootvnode is NULL rather than if rootdev is NODEV when determining if there's a filesystem present. rootdev can be NODEV in the NFS-mounted root scenario. Discussed with: Harti Brandt <brandt@fokus.gmd.de>, iedowse	2002-04-26 09:52:54 +00:00
silby	dd3cd5fed6	Make sure that sockets undergoing accept filtering are aborted in a LRU fashion when the listen queue fills up. Previously, there was no mechanism to kick out old sockets, leading to an easy DoS of daemons using accept filtering. Reviewed by: alfred MFC after: 3 days	2002-04-26 02:07:46 +00:00
des	b3648bf706	Add the mutex profiling lock to the witness list. This hopefully unbreaks the MUTEX_PROFILING + WITNESS + !WITNESS_SKIPSPIN case. Submitted by: Hiten Pandya <hiten@uk.FreeBSD.org>	2002-04-25 22:48:40 +00:00
bde	e1e6cfc088	Fixed some longstanding bugs in _getenv_static(): - malformed environment strings (ones without an '=') were not rejected. There shouldn't be any of these, but when the static environment is empty it always begins with one of these; this one should be considered as the terminator after the end of the environment, but it isn't. - the comparison of the name being looked up with the name in the environment was fuzzy -- only the characters up to the length of the latter were compared, so _getenv_static("foobar") matched "foo=..." in the environment and everything matched "" in the empty environment. MFC after: 3 days	2002-04-25 20:25:15 +00:00
bde	c7cc23aacf	Break the following implementation of panic(3): #!bin/sh # Original version of this by Michael Reifenberger # <root@nihil.plaut.de>. mdconfig -d -u 11 >/dev/null 2>&1 dd if=/dev/zero of=zz bs=1m count=1 while : do mdconfig -a -t vnode -f zz -u 11 fdisk -f - -iv /dev/md11 <<EOF1 g c1 h64 s32 p 1 165 0 2048 a 1 EOF1 mdconfig -d -u 11 done Garbage pointers in __si_u were not cleared by destroy_dev(). Not clearing si_disk made the above fatal because the disk layer uses si_disk as a flag to indicate that the dev_t has been completely initialized. disk_destroy() clears si_disk for the parent dev_t but doesn't get called for children. Not fixed: - setting the undocumented sysctl debug.free_devt should cause more complete destruction of the dev_t including clearing of __si_u, but actually causes the above to panic a little earlier. - the loop leaks 10 memory allocations per iteration (4 DEVFS, 2 devbuf and 4 dev_t). Reviewed by: timeout by MAINTAINER after 3 months	2002-04-25 13:17:33 +00:00
marcel	56d625090e	Don't use the symbol name to lookup the symbol value when we can use the symbol index defined by the relocation. The elf_lookup() support function is to be used by elf_reloc() when symbol lookups need to be done. The elf_lookup() function operates on the symbol index and will do a symbol name based lookup when such is required, otherwise it uses the symbol index directly. This solves the problem seen on ia64 where the symbol hash table does not contain local symbols and a symbol name based lookup would fail for those symbols. Don't pass the symbol name to elf_reloc(), as it isn't used any more.	2002-04-25 01:22:16 +00:00
tanimura	1616fbed42	Free(9) should be Giant-free. Suggested by: jhb	2002-04-24 09:59:18 +00:00
silby	b4055530fc	Remove sodropablereq - this function hasn't been used since the syncache went in. MFC after: 3 days	2002-04-24 04:11:08 +00:00
hsu	7bef5a6e99	The cold and panicstr variables do not need to be protected by sched_lock. Submitted by: Jennifer Yang (yangjihui@yahoo.com) Reviewed by: jake & jhb in principle	2002-04-23 19:50:22 +00:00
phk	834fdde07a	Add a basic sanity check on pointers passed to free(9). Should be improved by: jeff	2002-04-23 18:50:25 +00:00
phk	bf5ba9f42b	Don't call malloc(9) to allocate zero bytes softc data for devices.	2002-04-23 15:48:23 +00:00
rwatson	780f32f693	Slightly restructure extattr_get_vp() so that there's only one entry point to VOP_GETEXTATTR(). This simplifies code flow when inserting MAC hooks. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-04-23 01:27:38 +00:00
alfred	d4c507ea29	Don't FILEDESC_LOCK around calls to falloc().	2002-04-22 20:09:11 +00:00
des	4d6b787d2d	Usage style sweep: spell "usage" with a small 'u'. Also change one case of blatant __progname abuse (several more remain) This commit does not touch anything in src/{contrib,crypto,gnu}/.	2002-04-22 13:44:47 +00:00
phk	68aee74f02	Comment out Kirk's io-request priority hack until we can do this in a civilized way which doesn't cause grief. The problem is that it is not generally safe to cast a "struct bio *" to a "struct buf *". Things like ccd, vinum, ata-raid and GEOM construct bio's which are not entrails of a struct buf. Also, curthread may or may not have anything to do with the I/O request at hand. The correct solution can either be to tag struct bio's with a priority derived from the requesting thread's nice and have disksort act on this field, this wouldn't address the "silly-seek syndrome" where two equal processes bang the diskheads from one edge to the other of the disk repeatedly. Alternatively, and probably better: a sleep should be introduced either at the time the I/O is requested or at the time it is completed where we can be sure to sleep in the right thread. The sleep also needs to be in constant timeunits, 1/hz can be practically any sub-second size, at high HZ the current code practically doesn't do anything.	2002-04-22 06:53:20 +00:00
marcel	84ecc1bfc1	Add function link_elf_get_gp(), specific to ia64 for now, to get the DT_PLTGOT value. On ia64 this is the value of GP. We need this to construct function descriptors, but the elf file structure is not exported to MD code. Note that the name of the function is based on the meaning that DT_PLTGOT has on ia64. This may differ on other architectures. As such, link_elf_get_gp() has a high level of MD to it. Renaming the function to describe what DT_* value is returned makes it generic, but also makes the MD code less clear and if we only need this on ia64, then a general name for a specific function doesn't help. In short: I don't know what is "right" at this time, so I'll go with what I have.	2002-04-21 21:08:30 +00:00
markm	b0c0526342	Use protected names (_foo) to cutdown on boatloads of lint warnings.	2002-04-21 11:16:10 +00:00
marcel	5de2c9fb38	GCC 3.x WARNS: Add a break to the default case.	2002-04-20 21:56:42 +00:00
tanimura	e2acd5cecf	Push down Giant for setpgid(), setsid() and aio_daemon(). Giant protects only malloc(9) and free(9).	2002-04-20 12:02:52 +00:00
rwatson	30744d9c56	Improve style consistency of vfs_syscalls.c by converting the style used in various extattr_*() calls to match the rest of the file. Originally, these bits at the end looked more like style(9). This patch was submitted by green by way of the TrustedBSD MAC tree, and I fixed a few problems with it on the way through. Someone with more time on their hands should convert the entire file to style(9); this commit is for diff reduction purposes. Submitted by: green Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-04-20 01:37:08 +00:00
rwatson	4d39491e7e	In sendfile(), use the vn_rdwr() helper function, rather than manually constructing a struct aio and invoking VOP_READ() directly. This cleans up the code a little, but also has the advantage of making sure almost all vnode read/write access in the kernel goes through the helper function, meaning that instrumentation of that helper function can impact almost all relevant read/write operations. In this case, it permits us to put MAC hooks into vn_rdwr() and not modify uipc_syscalls.c (yet). In general, if helper vn_*() functions exist, they should be used in preference to direct VOP's in system call service code. Submitted by: green Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-04-19 13:46:24 +00:00
rwatson	63ab78794e	Divorce proc0 and proc1 credentials earlier; while this isn't technically needed in the current code, in the MAC tree, create_init() relies on the ability to modify the credentials present for initproc, and should not perform that modification on a shared credential. Pro-active diff reduction against MAC changes that are in the queue; also facilitates other work, including the capabilities implementation. Submitted by: green Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-04-19 13:35:53 +00:00
phk	f4a2041f29	suser is Giant safe, so optimize a pointless case.	2002-04-19 09:20:13 +00:00
suz	553226e8e1	just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD. (based on freebsd4-snap-20020128) Reviewed by: ume MFC after: 1 week	2002-04-19 04:46:24 +00:00
nectar	fcc5ad0935	When exec'ing a set[ug]id program, make sure that the stdio file descriptors (0, 1, 2) are allocated by opening /dev/null for any which are not already open. Reviewed by: alfred, phk MFC after: 2 days	2002-04-19 00:45:29 +00:00
mux	6961e47900	Avoid calling malloc() or free() while holding the kenv lock. Reviewed by: jake	2002-04-17 17:51:10 +00:00
mux	a207e41bef	Rework the kernel environment subsystem. We now convert the static environment needed at boot time to a dynamic subsystem when VM is up. The dynamic kernel environment is protected by an sx lock. This adds some new functions to manipulate the kernel environment : freeenv(), setenv(), unsetenv() and testenv(). freeenv() has to be called after every getenv() when you have finished using the string. testenv() only tests if an environment variable is present, and doesn't require a freeenv() call. setenv() and unsetenv() are self explanatory. The kenv(2) syscall exports these new functionalities to userland, mainly for kenv(1). Reviewed by: peter	2002-04-17 13:06:36 +00:00
mux	c79270302c	Add an entry for the kenv(2) syscall (code to follow). Reviewed by: peter	2002-04-17 13:05:13 +00:00
iedowse	64322dabea	The recent NFS forced unmount improvements introduced a side-effect where some client operations might be unexpectedly cancelled during an unsuccessful non-forced unmount attempt. This causes problems for amd(8), because it periodically attempts a non-forced unmount to check if the filesystem is still in use. Fix this by adding a new mountpoint flag MNTK_UNMOUNTF that is set only during the operation of a forced unmount. Use this instead of MNTK_UNMOUNT to trigger the cancellation of hung NFS operations. Also correct a problem where dounmount() might inadvertently clear the MNTK_UNMOUNT flag. Reported by: simokawa MFC after: 1 week	2002-04-17 01:07:29 +00:00
jhb	dba04cd736	Lock proctree_lock instead of pgrpsess_lock.	2002-04-16 17:11:34 +00:00
jhb	6cbba0bb03	- Lock proctree_lock instead of pgrpsess_lock. - Use temporary variables to hold a pointer to a pgrp while we dink with it while not holding either the associated proc lock or proctree_lock. It is in theory possible that p->p_pgrp could change out from under us.	2002-04-16 17:09:22 +00:00
jhb	d9a4c30c37	- Lock proctree_lock instead of pgrpsess_lock. - Simplify return logic of setsid() and setpgid().	2002-04-16 17:06:11 +00:00
jhb	2ebbf84d61	- Lock proctree_lock instead of pgrpsess_lock. - Exclusively lock proctree_lock while calling leavepgrp().	2002-04-16 17:04:21 +00:00
jhb	7202da4491	- Merge the pgrpsess_lock and proctree_lock sx locks into one proctree_lock sx lock. Trying to get the lock order between these locks was getting too complicated as the locking in wait1() was being fixed. - leavepgrp() now requires an exclusive lock of proctree_lock to be held when it is called. - fixjobc() no longer gets a shared lock of proctree_lock now that it requires an xlock be held by the caller. - Locking notes in sys/proc.h are adjusted to note that everything that used to be protected by the pgrpsess_lock is now protected by the proctree_lock.	2002-04-16 17:03:05 +00:00
phk	2edc95ffee	Remove two debug printfs which should never have been committed.	2002-04-15 21:08:51 +00:00
jhb	f656d44b0b	You have to cast int64_t's to long long if you printf them with %lld. This now compiles on alpha without a warning. Pointy-hat to: phk	2002-04-15 21:04:32 +00:00
phk	b6bf4c07cf	Improve the implementation of adjtime(2). Apply the change as a continuous slew rather than as a series of discrete steps and make it possible to adjust arbitrarily huge amounts of time in either direction. In practice this is done by hooking into the same once-per-second loop as the NTP PLL and setting a suitable frequency offset deducting the amount slewed from the remainder. If the remaining delta is larger than 1 second we slew at 5000PPM (5msec/sec), for a delta less than a second we slew at 500PPM (500usec/sec) and for the last one second period we will slew at whatever rate (less than 500PPM) it takes to eliminate the delta entirely. The old implementation stepped the clock a number of microseconds every HZ to achieve the same effect, using the same rates of change. Eliminate the global variables tickadj, tickdelta and timedelta and their various use and initializations. This removes the most significant obstacle to running timecounter and NTP housekeeping from a timeout rather than hardclock.	2002-04-15 12:23:11 +00:00
phk	ed0cd9a251	Take the "tickadj" element out of struct clockinfo. Our adjtime(2) implementation is being changed and the very concept of tickadj will no longer be meaningful.	2002-04-15 12:11:06 +00:00
phk	af54e26ee0	In the ntp_adjtime(2) syscall, return our actual estimate of unapplied offset correction instead of the most recent offset applied.	2002-04-15 08:58:24 +00:00
jeff	6cb876e7dd	Finish adding support code for sysctl kern.mprof. This dumps some malloc information related to bucket size efficiency. Three things are printed on each row: Size is the size the user actually asked for rounded to 16 bytes. Requests is the number of times this size was asked for. Real Size is the size we actually handed out. At the end the total memory used and total waste is displayed. Currently my system displays about 33% wasted memory. The intent of this code is to gather statistics for tuning the malloc bucket sizes. It is not intended to be run with INVARIANTS and it is not entirely mp safe. It can be enabled via 'options MALLOC_PROFILE' which was committed earlier.	2002-04-15 05:24:01 +00:00
jeff	da6660250e	Remove malloc_type's ks_limit. Updated the kmemzones logic such that the ks_size bitmap can be used as an index into it to report the size of the zone used. Create the kern.malloc sysctl which replaces the kvm mechanism to report similar data. This will provide an easy place for statistics aggregation if malloc_type statistics become per cpu data. Add some code ifdef'd under MALLOC_PROFILING to facilitate a tool for sizing the malloc buckets.	2002-04-15 04:05:53 +00:00
alfred	0925885691	Don't allow one to trace an ancestor when already traced. PR: kern/29741 Submitted by: Dave Zarzycki <zarzycki@FreeBSD.org> Fix from: Tim J. Robbins <tim@robbins.dropbear.id.au> MFC After: 2 weeks	2002-04-14 17:12:55 +00:00
jeff	9089f1baf8	Use VOP_GETVOBJECT instead of accessing the member directly. This fixed an issue with nullfs and NAMEI shared. Submitted by: Alexander Kabaev	2002-04-14 10:18:48 +00:00
alc	3ad9fd7f0b	Regen	2002-04-14 05:33:58 +00:00
alc	a34b48c478	Remove the requirement that Giant be held around sigreturn().	2002-04-14 05:31:47 +00:00
alc	7e3107d0af	o Use aiocblist::fd_file in the AIO threads rather than recomputing the file * from the calling process's descriptor table. o Eliminate sharing of the calling process's descriptor table with the AIO threads.	2002-04-14 03:04:19 +00:00
jhb	e93a8a367d	- Change killpg1()'s first argument to be a thread instead of a process so we can use td_ucred. - In killpg1(), the proc lock is sufficient to check if p_stat is SZOMB or not. We don't need sched_lock. - Close some races in psignal(). In psignal() there is a big switch statement based on p_stat. All the different cases are assuming that the process (or thread) isn't going to change state out from under it. To ensure this is true, just lock sched_lock for the entire switch. We practically held it the entire time already anyways. This also simplifies the locking somewhat and actually results in fewer lock operations. - Allow signotify() to be called with the sched_lock held since psignal() now does that. - Use td_ucred in a couple of places.	2002-04-13 23:33:36 +00:00
jhb	300593a2cc	- Change donice() to take a thread as the first argument instead of a process so it can use td_ucred. - Require the target process of donice() to be locked when donice() is called. - Use td_ucred. - Lock the target process of p_cansee() and while reading the credentials of a process. - Change the logic of rtprio() slightly so it does its copyin() if needed prior to locking the target process. - rtprio() no longer needs Giant. In theory with full KSE it would still need Giant to protect p_ucred of curproc for the p_canfoo() functions but p_canfoo() will be changing to using td_ucred of curthread before full KSE hits the tree.	2002-04-13 23:28:23 +00:00
jhb	95ee443e6c	- Change the algorithms of the syscalls to modify process credentials to allocate a blank cred first, lock the process, perform checks on the old process credential, copy the old process credential into the new blank credential, modify the new credential, update the process credential pointer, unlock the process, and cleanup rather than trying to allocate a new credential after performing the checks on the old credential. - Cleanup _setugid() a little bit. - setlogin() doesn't need Giant thanks to pgrp/session locking and td_ucred.	2002-04-13 23:07:05 +00:00
jhb	418e247b74	- Change the first argument of ktrcanset(), ktrsetchildren(), and ktrops() to a thread pointer so that ktrcanset() can use td_ucred. - Add some proc locking to partially protect p_tracep and p_traceflag.	2002-04-13 22:54:18 +00:00
tmm	a0622efd75	Use pmap_extract() instead of pmap_kextract() to retrieve the physical address associated with a user virtual address in pipe_build_write_buffer(). Reviewed by: alc	2002-04-13 20:09:06 +00:00
asmodai	4d94ee39e6	Use the correct macros for F_SETFD/F_GETFD instead of magic numbers. Reflect that fact in the manual page. PR: 12723 Submitted by: Peter Jeremy <peter.jeremy@alcatel.com.au> Approved by: bde MFC after: 2 weeks	2002-04-13 10:16:53 +00:00
tmm	86be827a6a	Back out the last revision - it does not work correctly when one of the pages in question is not in the top-level vm object, but in one of the shadow ones. Pointed out by: alc Pointy hat to: tmm	2002-04-13 00:03:07 +00:00
jhb	6629f872ca	Rework ptrace(2) to be more locking friendly. We do any needed copyin()'s and acquire the proctree_lock if needed first. Then we lock the process if necessary and fiddle with it as appropriate. Finally we drop locks and do any needed copyout's. This greatly simplifies the locking.	2002-04-12 21:17:37 +00:00
tmm	1720bac84c	Do not use pmap_kextract() to find out the physical address belonging to a user virtual address; while this happens to work on some architectures, it can't on sparc64, since user and kernel virtual address spaces overlap there (the distinction between them is done via separate address space identifiers). Instead, look up the page in the vm_map of the process in question. Reviewed by: jake	2002-04-12 19:38:41 +00:00
hsu	74de2695a0	Fix corner case where m_len was not being initialized. Submitted by: Maksim Yevmenkin <myevmenk@digisle.net> MFC after: 1 week	2002-04-12 00:01:50 +00:00
jhb	9522d33ac9	- Set the base priority of an ithread that has no handlers when we set its normal priority. - Lock sched_lock while we dink with the priorities. - Remove a few extra blank lines.	2002-04-11 21:03:35 +00:00
alc	200626256b	Regen	2002-04-11 17:35:53 +00:00
alc	8a20a702cf	Remove the requirement that Giant be held around osigreturn(). All platform-specific implementations are MPSAFE.	2002-04-11 17:34:38 +00:00
jhb	ce939dfab8	- Change settime() to take a thread as its first argument instead of a proc so it can use td_ucred. - Push Giant down into the end of settime() where we actually set the time on the timecounter and time of day clock. - Remove Giant from clock_settime(). - Push Giant down in settimeofday() to just protect the 'tz' global variable.	2002-04-10 04:09:07 +00:00
jhb	78e19df6f6	Display the recursion count in the lock_instance in the show locks output. Indirectly requested by: peter	2002-04-10 01:25:11 +00:00
jhb	ad726578d6	Cosmetic fixup in output of lock types in show locks output.	2002-04-10 01:19:53 +00:00
brian	cd534ec28e	In linker_load_module(), check that rootdev != NODEV before calling linker_search_module(). Without this, modules loaded from loader.conf that then try to load in additional modules (such as digi.ko loading a card's BIOS) die badly in the vn_open() called from linker_search_module(). It may be worth checking (KASSERTing?) that rootdev != NODEV in vn_open() too.	2002-04-10 01:14:45 +00:00
brian	8ad55476a0	Change linker_reference_module() so that it's passed a struct mod_depend * (which may be NULL). The only consumer of this function at the moment is digi_loadmoduledata(), and that passes a NULL mod_depend *. In linker_reference_module(), check to see if we've already got the required module loaded. If we have, bump the reference count and return that, otherwise continue the module search as normal.	2002-04-10 01:13:57 +00:00
jhb	97bce5a40f	- Change fill_kinfo_proc() to require that the process is locked when it is called. - Change sysctl_out_proc() to require that the process is locked when it is called and to drop the lock before it returns. If this proves too complex we can change sysctl_out_proc() to simply acquire the lock at the very end and have the calling code drop the lock right after it returns. - Lock the process we are going to export before the p_cansee() in the loop in sysctl_kern_proc() and hold the lock until we call sysctl_out_proc(). - Don't call p_cansee() on the process about to be exported twice in the aforementioned loop.	2002-04-09 20:10:46 +00:00
jhb	026e9455de	Whitespace changes to wrap long lines.	2002-04-09 20:01:16 +00:00
jhb	1fbf4e9848	We don't need Giant to read the pgrp ID since the proc lock has protected p_pgrp since the pgrp locking went in. We also don't need it to check for invalid values in the options argument to wait1(), so push Giant down slightly.	2002-04-09 20:00:40 +00:00
jhb	fc492a338c	- Remove an early KSE diagnostic panic. The thread pointer here is always curthread. - We don't need Giant to do suser() checks now, so don't lock Giant until after the check.	2002-04-09 19:58:38 +00:00
jhb	f7c4d57b64	Don't lock the ithread lock in ithread_create(). The ithread isn't on any lists or in any tables yet so there are no other references to it, thus we don't need to lock it.	2002-04-09 16:26:37 +00:00
phk	a90e28ebbb	Implement DIOCGFRONTSTUFF ioctl which reports how many bytes from the start of the device magic stuff might occupy. Sponsored by: DARPA & NAI Labs.	2002-04-09 15:43:32 +00:00
phk	5b960672bf	Rename DIOCGKERNELDUMP to DIOCSKERNELDUMP as it strictly speaking is a "set" not a "get" operation. Sponsored by: DARPA & NAI Labs.	2002-04-09 10:04:09 +00:00
jeff	0b5e15cef7	Turn #ifdef LOOKUP_SHARED into #ifndef LOOKUP_EXCLUSIVE to enable this behavior by default. Also, change the options line to reflect this. If there are no problems reported this will become the only behavior and the knob will be removed in a month or so. Demanded by: obrien	2002-04-09 05:14:17 +00:00
mux	bf7d877bcd	The fourth parameter to copystr() is a size_t, not an int. Approved by: peter	2002-04-08 21:14:19 +00:00
phk	33405073ec	Move generic disk ioctls from <sys/disklabel.h> to <sys/disk.h>. Sponsored by: DARPA & NAI Labs	2002-04-08 09:20:07 +00:00
phk	e1803d493e	Put back dumppcb, but this time we put a comment to tell what it is for. Brucifixion by: bde	2002-04-08 06:59:13 +00:00
alc	548fecffbc	Restructure aio_return() to eliminate duplicated code and facilitate Giant push down.	2002-04-08 04:57:56 +00:00
hsu	d9992faaf2	There's only one socket zone so we don't need to remember it in every socket structure.	2002-04-08 03:04:22 +00:00
mux	00b6b34450	o Change kernel_vmount() interface to be more convenient : pass two separate strings instead of passing "foo=bar". o Don't forget to clear the VMOUNT flag on the vnode when vfs_nmount() fails because the fs doesn't implement VFS_NMOUNT (and in vfs_mount() when the fs doesn't implement VFS_MOUNT) ; also decrement the vfs refcount in the !MNT_UPDATE case.	2002-04-07 13:22:47 +00:00
dwmalone	d7ca365130	Remove a comment which relates to the old name cache code, which was replaced in 1997. Approved by: phk	2002-04-07 08:58:31 +00:00
alc	bfb320784e	Reduce the duplication of code for error handling in _aio_aqueue().	2002-04-07 07:17:59 +00:00
alc	2f8880db13	Change jobref and *ijoblist from int to long in order to avoid a catastrophe after the 2^32nd AIO operation on 64-bit architectures.	2002-04-07 01:28:34 +00:00
jake	7f897ef089	Remove a stale comment.	2002-04-06 08:44:04 +00:00
jake	553fb6e233	Include machine/ktr.h for sparc64 so we pick up KTR_CPU.	2002-04-06 08:43:17 +00:00
jake	3fadd16f0d	Use CTASSERT rather than a runtime check to detect kinfo_proc size changes. Remove the ugly yuck code to busy wait for 20 seconds.	2002-04-06 08:13:52 +00:00
nyan	e4475cba04	Added the new kernel dumping support for pc98.	2002-04-06 06:41:54 +00:00
bde	572176e33d	Updated a doubly stale comment about signotify(). Fixed a nearby long line.	2002-04-05 10:00:37 +00:00
peter	293815b90a	Increase the size of the register stack storage on ia64 from 32K to 2MB so that we can compile gcc. This is a hack because it adds a fixed 2MB to each process's VSIZE regardless of how much is really being used since there is no grow-up stack support. At least it isn't physical memory. Sigh. Add a sysctl to enable tweaking it for new processes.	2002-04-05 01:57:45 +00:00
tmm	91f571835a	Add a generic implementation of inittodr() and resettodr(), as well as a set of helper routines to deal with real-time clocks. The generic functions access the clock driver using a kobj interface. This is intended to reduce code duplication and make it easy to support more than one clock model on a single architecture. This code is currently only used on sparc64, but it is planned to convert the code of the other architectures to it later.	2002-04-04 23:39:10 +00:00
jhb	db9aa81e23	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
jhb	ec0e08c944	Change mtx_init() to now take an extra argument. The third argument is the generic lock type for use with witness. If this argument is NULL then the lock name is used as the lock type. Add a macro for a lock type name for network driver locks.	2002-04-04 20:52:27 +00:00
jhb	883d8a5526	Set the lock type equal to the lock name for now as all of the current sx locks don't use very specific lock names.	2002-04-04 20:49:35 +00:00
jhb	8143d2b80e	Add a new char * pointer lo_type to struct lock_object that is used to point to a more generic name for a lock that is more suitable for use by witness when grouping locks. For example, although network driver locks use the interface name for the name of each lock, they should all use the same witness and be treated the same as witness. Another example is that all UMA zone locks should be treated the same. The witness code has also been updated to print out the lock type in addition to the lock name in a few places where it is relevant.	2002-04-04 20:45:21 +00:00
phk	38f498fe43	Delete the bogus d_boot[01] fields from struct disklabel. This shrinks the size by 4 bytes on alpha, down to the same 276 bytes as all other platforms. Construct a hack to make old ioctls work on new kernels. Once world is recompiled only the new and correct sysctls will be used. This hack will become annoying around 1st of May to make people rebuild their worlds and it will be gone before 5.0.	2002-04-04 20:34:48 +00:00
bde	14ae95f735	Moved signal handling and rescheduling from userret() to ast() so that they aren't in the usual path of execution for syscalls and traps. The main complication for this is that we have to set flags to control ast() everywhere that changes the signal mask. Avoid locking in userret() in most of the remaining cases. Submitted by: luoqi (first part only, long ago, reorganized by me) Reminded by: dillon	2002-04-04 17:49:48 +00:00
bde	3b8182ff40	Optimized the check for unmasked pending signals in CURSIG() using a new inline function sigsetmasked() and a new macro SIGPENDING(). CURSIG() will soon be moved out of the normal path of execution for syscalls and traps. Then its efficiency will be less important but the new interfaces will be useful for checking for unmasked pending signals in more places. Submitted by: luoqi (long ago, in a slightly different form) Assert that sched_lock is not held in CURSIG().	2002-04-04 15:19:41 +00:00
alc	db11618136	o aio_process needn't fhold()/fdrop() the fp now that _aio_aqueue() and aio_free_entry() do this. o Remove two unnecessary/unused variables from aio_process() and one field from aiocblist.	2002-04-04 02:13:20 +00:00
alfred	6dc270c501	Avoid a lock order reversal by dropping the eventhandler_mutex earlier. We get enough protection from the lock on the individual lists that we acquire later. Noticed/Tested by: Steven G. Kargl <kargl@troutmask.apl.washington.edu> Submitted by: Jonathan Mini <mini@haikugeek.com>	2002-04-04 00:52:03 +00:00
jhb	9fa365d7d6	- Axe a stale comment. We haven't allowed the ucred pointer passed to securelevel_*() to be NULL for a while now. - Use KASSERT() instead of if (foo) panic(); to optimize the !INVARIANTS case. Submitted by: Martin Faxer <gmh003532@brfmasthugget.se>	2002-04-03 18:35:25 +00:00
mux	9effffd331	Add two forgotten vfs_unbusy() calls, in vfs_mount() and vfs_nmount(). Reviewed by: phk	2002-04-03 12:19:03 +00:00
ru	d8ffece3c4	Dike out a highly insecure UCONSOLE option. TIOCCONS must be able to VOP_ACCESS() /dev/console to succeed. Obtained from: OpenBSD	2002-04-03 10:56:59 +00:00
dillon	9a85737b15	brelse() was improperly clearing B_DELWRI in the B_DELWRI\|B_INVAL case without removing the buffer from the vnode's dirty buffer list, which can result in a panic in NFS. Replaced the code with a call to bundirty() which deals with it properly. PR: kern/36108, kern/36174 Submitted by: various people Special mention: to Danny Schales <dan@coes.LaTech.edu> for providing a core dump that helped me track this down. MFC after: 1 day	2002-04-03 00:17:36 +00:00
des	2a92b78602	Revert to open hashing. It makes the code simpler, and works fairly well even when the number of records approaches the size of the hash table. Besides, the previous implementation (using linear probing) was broken :) Also, use the newly introduced MTX_SYSINIT.	2002-04-02 23:26:32 +00:00
jhb	9d3d63fcbc	- Move the MI mutexes sched_lock and Giant from being declared in the various machdep.c's to being declared in kern_mutex.c. - Add a new function mutex_init() used to perform early initialization needed for mutexes such as setting up thread0's contested lock list and initializing MI mutexes. Change the various MD startup routines to call this function instead of duplicating all the code themselves. Tested on: alpha, i386	2002-04-02 22:19:16 +00:00
jhb	9153749ef0	Spelling police.	2002-04-02 20:44:30 +00:00
jhb	2c4739409a	Enforce an implicit lock order of sleepable locks before non-sleepable locks.	2002-04-02 19:27:21 +00:00
arr	e3fb0536de	- Add a mutex to lock the global securelevel value. - Make use of MTX_SYSINIT() as the means to initialize our mutex lock.	2002-04-02 17:43:17 +00:00
tanimura	448edc64b4	Fix leakage of p_pgrp lock.	2002-04-02 17:12:06 +00:00
jhb	77dc513737	Explicitly document how we implicitly enforce the lock order of sleep locks before spin locks.	2002-04-02 16:51:20 +00:00
arr	6ae00dcc9f	- Add MTX_SYSINIT and SX_SYSINIT as macro glue for allowing sx and mtx locks to be able to set up a SYSINIT call. This helps in places where a lock is needed to protect some data, but the data is not truly associated with a subsystem that can properly initialize its lock. The macros use the mtx_sysinit() and sx_sysinit() functions, respectively, as the handler argument to SYSINIT(). Reviewed by: alfred, jhb, smp@	2002-04-02 16:05:43 +00:00
des	cbcf839df4	Instead of get_cyclecount(9), use nanotime(9) to record acquisition and release times. Measurements are made and stored in nanoseconds but presented in microseconds, which should be sufficient for the locks for which we actually want this (those that are held long and / or often). Also, rename some variables and structure members to unit-agnostic names.	2002-04-02 14:42:01 +00:00
phk	4d586060a3	Retire the bogus ioctl DIOCGPART in toto. Once again we can notice that badly thought out hacks ferment and infect far more code than initially expected. Sponsored by: DARPA and NAI Labs.	2002-04-02 11:52:13 +00:00
marcel	5dc73db814	Don't compile the dummy dumpsys for ia64.	2002-04-02 10:55:40 +00:00
rwatson	219837a0a6	Update comment regarding the locking of the sysctl tree. Rename memlock to sysctllock, and MEMLOCK()/MEMUNLOCK() to SYSCTL_LOCK()/ SYSCTL_UNLOCK() and related changes to make the lock names make more sense. Submitted by: Jonathan Mini <mini@haikugeek.com>	2002-04-02 05:50:07 +00:00
alfred	4a63b9d69c	Use sx locks instead of flags+tsleep locks. Submitted by: Jonathan Mini <mini@haikugeek.com>	2002-04-02 04:20:38 +00:00
alfred	cb408d85e7	Use sx locks rather than lockmgr locks for eventhandlers. Submitted by: Jonathan Mini <mini@haikugeek.com>	2002-04-02 04:18:54 +00:00
des	f6a3790f10	Mutex profiling code, conditional on the MUTEX_PROFILING option. Adds the following sysctl variables: debug.mutex.prof.enable enable / disable profiling debug.mutex.prof.acquisitions number of mutex acquisitions recorded debug.mutex.prof.records number of acquisition points recorded debug.mutex.prof.maxrecords max number of acquisition points debug.mutex.prof.rejected number of rejections (due to full table) debug.mutex.prof.hashsize hash size debug.mutex.prof.collisions number of hash collisions debug.mutex.prof.stats profiling statistics The code records four numbers for each acquisition point (identified by source file name and line number): longest time held, total time held, number of non-recursive acquisitions, average time held. The measurements are in clock cycles (as returned by get_cyclecount(9)); this may cause measurements on some SMP systems to be unreliable. This can probably be worked around by replacing get_cyclecount(9) by some incarnation of nanotime(9). This work was derived from initial patches by eivind.	2002-04-02 00:01:49 +00:00
dillon	3ad295d416	Stage-2 commit of the critical*() code. This re-inlines cpu_critical_enter() and cpu_critical_exit() and moves associated critical prototypes into their own header file, <arch>/<arch>/critical.h, which is only included by the three MI source files that need it. Backout and re-apply improperly committed syntactical cleanups made to files that were still under active development. Backout improperly committed program structure changes that moved localized declarations to the top of two procedures. Partially re-apply one of the program structure changes to move 'mask' into an intermediate block rather than in three separate sub-blocks to make the code more readable. Re-integrate bug fixes that Jake made to the sparc64 code. Note: In general, developers should not gratuitously move declarations out of sub-blocks. They are where they are for reasons of structure, grouping, readability, compiler-localizability, and to avoid developer-introduced bugs similar to several found in recent years in the VFS and VM code. Reviewed by: jake	2002-04-01 23:51:23 +00:00
jhb	dc2e474f79	Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag. Discussed on: smp@	2002-04-01 21:31:13 +00:00
jhb	81ff87afc2	Whitespace only change: use ANSI function declarations instead of K&R.	2002-04-01 20:13:31 +00:00
phk	07b4c10b28	Extend a hack to also hack around PC98's definition of __i386__	2002-04-01 20:13:03 +00:00
jhb	7205e92665	Fix style bug in previous commit.	2002-04-01 17:53:42 +00:00
jake	f9f52274db	ktr changes to improve performance and make writing a userland utility to dump the trace buffer feasible. - Remove KTR_EXTEND. This changes the format of the trace entries when activated, making writing a userland tool which is not tied to a specific kernel configuration difficult. - Use get_cyclecount() for timestamps. nanotime() is much too heavy weight and requires recursion protection due to ktr traces occurring as a result of ktr traces. KTR_VERBOSE may still require recursion protection, which is now conditional on it. - Allow KTR_CPU to be overridden by MD code. This is so that it is possible to trace early in startup before pcpu and/or curthread are set up. - Add a version number for the ktr interface. A userland tool can check this to detect mismatches. - Use an array for the parameters to make decoding in userland easier. - Add file and line recording to the non-extended traces now that the extended version is no more. These changes will break gdb macros to decode the extended version of the trace buffer which are floating around. Users of these macros should either use the show ktr command in ddb, or use the userland utility which can be run on a core dump. Approved by: jhb Tested on: i386, sparc64	2002-04-01 05:35:26 +00:00
phk	ef82a51634	Here follows the new kernel dumping infrastructure. Caveats: The new savecore program is not complete in the sense that it emulates enough of the old savecore's features to do the job, but implements none of the options yet. I would appreciate if a userland hacker could help me out getting savecore to do what we want it to do from a user's point of view, compression, email-notification, space reservation etc etc. (send me email if you are interested). Currently, savecore will scan all devices marked as "swap" or "dump" in /etc/fstab _or_ any devices specified on the command-line. All architectures but i386 lack an implementation of dumpsys(), but looking at the i386 version it should be trivial for anybody familiar with the platform(s) to provide this function. Documentation is quite sparse at this time, more to come. Details: ATA and SCSI drivers should work as the dump formatting code has been removed. The IDA, TWE and AAC have not yet been converted. Dumpon now opens the device and uses ioctl(DIOCGKERNELDUMP) to set the device as dumpdev. To implement the "off" argument, /dev/null is used as the device. Savecore will fail if handed any options since they are not (yet) implemented. All devices marked "dump" or "swap" in /etc/fstab will be scanned and dumps found will be saved to diskfiles named from the MD5 hash of the header record. The header record is dumped in readable format in the .info file. The kernel is not saved. Only complete dumps will be saved. All maintainer rights for this code are disclaimed: feel free to improve and extend. Sponsored by: DARPA, NAI Labs	2002-03-31 22:37:00 +00:00
phk	ac4142e564	Implement the two "GEOM" ioctls DIOCGSECTORSIZE and DIOCGMEDIASIZE for the non-GEOM code as well. This simplifies the kernel-dumping and disk-management tools as less compatibility cruft will be needed. Sponsored by: DARPA and NAI Labs.	2002-03-31 21:17:12 +00:00
alc	6985f3b63c	Keep the reference to the file acquired in _aio_aqueue() until the operation completes. The reference is released in aio_free_entry(). Submitted by: tegge	2002-03-31 20:17:56 +00:00
alfred	abeff55bde	Close some holes with p->p_args by NULL'ing out the p->p_args pointer while holding the proc lock, and by holding the pargs structure when accessing it from outside of the owner. Submitted by: Jonathan Mini <mini@haikugeek.com>	2002-03-31 10:33:12 +00:00
phk	87273d930a	Centralize the "bootdev" and "dumpdev" variables. They are still pretty bogus all things considered, but at least now they don't camouflage as being MD variables.	2002-03-31 07:15:28 +00:00
alc	bd314bbdd7	Add a local proc *p in exec_new_vmspace() to avoid repeated dereferencing to obtain it.	2002-03-31 00:05:30 +00:00
bde	2f7ae9b739	Fixed handling of short reads in readdisklabel() and writedisklabel(). These functions use DEV_STRATEGY() which can easily return a short count (with no error) for reads near EOF. EOF happens for "disks" too small to contain a label sector (mainly for empty slices). The functions didn't understand this at all, and looked for labels in the garbage in the buffer beyond what DEV_STRATEGY() returned. The recent UMA changes combined with my local changes and configuration resulted in the garbage often containing a valid but garbage label left over from a previous call. Bugs in EOF handling in -current limited the problem to "disks" with size precisely LABELSECTOR sectors. LABELSECTOR happens to be a very unusual "disk" size since it is only 0 for non-i386 arches that don't usually have disks with DOS MBRs.	2002-03-30 16:02:43 +00:00
dan	ade94cf622	Nuke CV_DEBUG in favour of INVARIANTS. Approved by: jhb	2002-03-30 03:52:52 +00:00
jake	855079d5b7	Style fixes purposefully left out of last commit. I checked the kse tree and didn't see any changes that this conflicts with.	2002-03-29 16:45:03 +00:00
jake	8f9ce8398d	Remove abuse of intr_disable/restore in MI code by moving the loop in ast() back into the calling MD code. The MD code must ensure no races between checking the astpending flag and returning to usermode. Submitted by: peter (ia64 bits) Tested on: alpha (peter, jeff), i386, ia64 (peter), sparc64	2002-03-29 16:35:26 +00:00
tanimura	9ae6d1242c	The description of fd_mtx is "filedesc structure."	2002-03-29 11:26:05 +00:00
mdodd	9e60cd20eb	Add resource_list_add_next() which returns the RID for the resource added.	2002-03-29 06:42:54 +00:00
alfred	1118775014	To remove nested include of sys/lock.h and sys/mutex.h from sys/proc.h make the pargs_* functions into non-inlines in kern/kern_proc.c. Requested by: bde	2002-03-28 18:12:27 +00:00
phk	8834f75902	Get the magnitude of the NTP adjustment right.	2002-03-28 16:02:44 +00:00
mux	ac4c018837	- Properly sync vfs_nmount() with changes that have already been done in vfs_mount(), in particular revisions 1.215, 1.227 and 1.240. - flag2 is a low quality variable name, change it to kern_flag. - strncpy NUL-terminates f_fstypename and f_mntonname since the strings have length <= <buffer length> - 1, so the explicit NUL-termination is bogus. - M_ZERO'ing space for fstype and fspath is stupid since we never use the space beyond the end of the string. - Do various style(9) cleanups in both functions. Submitted by: bde Reviewed by: phk	2002-03-28 13:47:32 +00:00
alc	b35f60b7fd	Allow recursion on the pipe mutex because filt_piperead() and filt_pipewrite() can be called both with and without the pipe mutex held. (For example, if called by pipeselwakeup(), it is held. Whereas, if called by kqueue_scan(), it is not.) Reviewed by: alfred	2002-03-27 21:47:50 +00:00
alfred	c513408927	Make the reference counting of 'struct pargs' SMP safe. There are still some locations where the PROC lock should be held in order to prevent inconsistent views from outside (like the proc->p_fd fix for kern/vfs_syscalls.c:checkdirs()) that can be fixed later. Submitted by: Jonathan Mini <mini@haikugeek.com>	2002-03-27 21:36:18 +00:00
jeff	dff418f166	Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks with this flag. Remove the dup_list and dup_ok code from subr_witness. Now we just check for the flag instead of doing string compares. Also, switch the process lock, process group lock, and uma per cpu locks over to this interface. The original mechanism did not work well for uma because per cpu lock names are unique to each zone. Approved by: jhb	2002-03-27 09:23:41 +00:00
dillon	7fa55182e2	oops, forgot to commit this. td->td_savecrit = 0 replaced by API call cpu_thread_link().	2002-03-27 08:26:37 +00:00
jake	c214910390	Make this compile. Pointy hat to: dillon	2002-03-27 06:44:32 +00:00
dillon	dc5aafeb94	Compromise for critical()/cpu_critical() recommit. Cleanup the interrupt disablement assumptions in kern_fork.c by adding another API call, cpu_critical_fork_exit(). Cleanup the td_savecrit field by moving it from MI to MD. Temporarily move cpu_critical() from <arch>/include/cpufunc.h to <arch>/<arch>/critical.c (stage-2 will clean this up). Implement interrupt deferral for i386 that allows interrupts to remain enabled inside critical sections. This also fixes an IPI interlock bug, and requires uses of icu_lock to be enclosed in a true interrupt disablement. This is the stage-1 commit. Stage-2 will occur after stage-1 has stabilized, and will move cpu_critical() into its own header file(s) + other things. This commit may break non-i386 architectures in trivial ways. This should be temporary. Reviewed by: core Approved by: core	2002-03-27 05:39:23 +00:00
bde	cd522c5374	"Fixed" -Wshadow warnings by changing the name of some function parameters from `index' to `indx'. The correct fix would be to not support or use index().	2002-03-27 04:04:17 +00:00
alc	0afabfc8b7	Remove an unnecessary and inconsistently used variable from exec_new_vmspace().	2002-03-26 19:20:04 +00:00
arr	da9c75ac68	- Fixup a few style nits: - return error -> return (error); - move a declaration to the top of the function. - become bug for bug compatible with if (error) lines. Submitted by: bde	2002-03-26 18:07:10 +00:00
mux	124c6d3a26	As discussed in -arch, add the new nmount(2) system call and the new vfs_getopt()/vfs_copyopt() API. This is intended to be used later, when there will be filesystems implementing the VFS_NMOUNT operation. The mount(2) system call will disappear when all filesystems will be converted to the new API. Documentation will be committed in a while. Reviewed by: phk	2002-03-26 15:33:44 +00:00
bde	4941686e50	Added used include of <sys/sx.h>. Don't depend on namespace pollution in <sys/file.h>.	2002-03-26 01:09:51 +00:00
bde	05400f476f	Added used include of <sys/sx.h>. Don't depend on namespace pollution in <sys/file.h> or <sys/socketvar.h>.	2002-03-25 21:52:04 +00:00
obrien	1e153b6d04	Commit work-around for panics when mounting FS's that are auto-loaded as modules (i.e. procfs.ko). When the kernel loads a dynamic filesystem module, it looks for any of the VOP operations specified by the new filesystem that have not been registered already by the currently known filesystems. If any such operations exist, the vfs_add_vnops function calls the vfs_opv_recalc function, which rebuilds vop_t vectors for each filesystem and sets all global pointers like ufs_vnops_p, devfs_specop_p, etc to the new values and then frees the old pointers. This behavior is bad because there might already be active vnodes whose v_op fields will be left pointing to random garbage, leading to an inevitable crash soon. Submitted by: Alexander Kabaev <ak03@gte.com>	2002-03-25 21:30:50 +00:00
arr	db4f882c76	- Recommit the securelevel_gt() calls removed by commits rev. 1.84 of kern_linker.c and rev. 1.237 of vfs_syscalls.c since these are not the source of the recent panics occurring around kldloading file system support modules. Requested by: rwatson	2002-03-25 18:26:34 +00:00
phk	811d04c86c	Modernize my email address.	2002-03-25 13:52:45 +00:00
bde	90f30ee936	Fixed some style bugs in the removal of __P(()). The main ones were not removing tabs before "__P((", and not outdenting continuation lines to preserve non-KNF lining up of code with parentheses. Switch to KNF formatting and/or rewrap the whole prototype in some cases.	2002-03-24 05:09:11 +00:00
jhb	f89014c6f6	Use td_ucred in several trivial syscalls and remove Giant locking as appropriate.	2002-03-22 22:32:04 +00:00
jhb	59d20d5aab	Use explicit Giant locks and unlocks rather than instrumented ones for code that is still not safe. suser() reads p_ucred so it still needs Giant for the time being. This should allow kern.giant.proc to be set to 0 for the time being.	2002-03-22 21:02:02 +00:00
rwatson	afe2b1f929	Merge from TrustedBSD MAC branch: Move the network code from using cr_cansee() to check whether a socket is visible to a requesting credential to using a new function, cr_canseesocket(), which accepts a subject credential and object socket. Implement cr_canseesocket() so that it does a prison check, a uid check, and add a comment where shortly a MAC hook will go. This will allow MAC policies to separately instrument the visibility of sockets from the visibility of processes. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-03-22 19:57:41 +00:00
alfred	054cce2c17	When "cloning" a pipe's buffer bcopy the data after dropping the pipe's lock as the data may be paged out and cause a fault.	2002-03-22 16:09:22 +00:00
rwatson	a58b691f90	In sysctl, req->td is believed always to be non-NULL, so there's no need to test req->td for NULL values and then do somewhat more bizarre things relating to securelevel special-casing and suser checks. Remove the testing and conditional security checks based on req->td!=NULL, and insert a KASSERT that td != NULL. Callers to sysctl must always specify the thread (be it kernel or otherwise) requesting the operation, or a number of current sysctls will fail due to assumptions that the thread exists. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs Discussed with: bde	2002-03-22 14:58:27 +00:00
rwatson	ef91e0f942	Since cred never appears to be passed into the securelevel calls as NULL, turn warning printf's into panic's, since this call has been restructured such that a NULL cred would result in a page fault anyway. There appears to be one case where NULL is explicitly passed in in the sysctl code, and this is believed to be in error, so will be modified. Securelevels now always require a credential context so that per-jail securelevels are properly implemented. Obtained from: TrustedBSD Project Sponsored by: NAI Labs Discussed with: bde	2002-03-22 14:49:12 +00:00
arr	fc49faf982	- Back out the commit to make the linker_load_file() securelevel check made aware in jail environments. Supposedly something is broken, so this should be backed out until further investigation proves otherwise, or a proper fix can be provided.	2002-03-22 04:56:09 +00:00
rwatson	d8370f667d	Break out the "see_other_uids" policy check from the various method-based inter-process security checks. To do this, introduce a new cr_seeotheruids(u1, u2) function, which encapsulates the "see_other_uids" logic. Call out to this policy following the jail security check for all of {debug,sched,see,signal} inter-process checks. This more consistently enforces the check, and makes the check easy to modify. Eventually, it may be that this check should become a MAC policy, loaded via a module. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-03-22 02:28:26 +00:00
arr	68e226a99e	- Fix a logic error in checking the securelevel that was introduced in the previous commit. Pointy hats to: arr, rwatson	2002-03-21 15:27:39 +00:00
imp	969e82886e	Remove last two abuses of cpu_critical_{enter,exit} in the MI code. Reviewed by: jake, jhb, rwatson	2002-03-21 06:11:09 +00:00
benno	d30ab95478	Add a change mirroring that made to kern/subr_trap.c and others. This makes kernel builds with DIAGNOSTIC work again. Apparently forgotten by: jhb Might want to be checked by: jhb	2002-03-21 02:47:51 +00:00
jeff	f350069589	UMA permitted us to utilize the 'waitok' flag to soalloc.	2002-03-20 21:23:26 +00:00
jhb	2e425ee2fc	Change the way we ensure td_ucred is NULL if DIAGNOSTIC is defined. Instead of caching the ucred reference, just go ahead and eat the decrement and increment of the refcount. Now that Giant is pushed down into crfree(), we no longer have to get Giant in the common case. In the case when we are actually free'ing the ucred, we would normally free it on the next kernel entry, so the cost there is not new, just in a different place. This also removes td_cache_ucred from struct thread. This is still only done #ifdef DIAGNOSTIC. [ missed this file in the previous commit ] Tested on: i386, alpha	2002-03-20 21:12:04 +00:00
jhb	64bf9fe9fa	- Push down Giant into crfree() in the case that we actually free a ucred. - Add a cred_free_thread() function (conditional on DIAGNOSTICS) that drops a per-thread ucred reference to be used in debugging code when leaving the kernel.	2002-03-20 21:00:50 +00:00
arr	fc9167c193	- Change a check of securelevel to securelevel_gt() call in order to help against users within a jail attempting to load kernel modules. - Add a check of securelevel_gt() to vfs_mount() in order to chop some low hanging fruit for the repair of securelevel checking of linking and unlinking files from within jails. There is more to be done here. Reviewed by: rwatson	2002-03-20 16:03:42 +00:00
arr	3780b11057	- Remove a semi-colon from after SYSINIT that was introduced in rev. 1.163.	2002-03-20 14:46:38 +00:00
jeff	dcd2af7655	Add calls to uma_zone_set_max() to restore previously enforced limits.	2002-03-20 05:30:58 +00:00
jeff	803cb2a2ba	Backout part of my previous commit; I was wrong about vm_zone's handling of limits on zones w/o objects.	2002-03-20 04:39:32 +00:00
jeff	35c1a72689	Remove references to vm_zone.h and switch over to the new uma API.	2002-03-20 04:11:52 +00:00
jeff	318cbeeecf	Remove references to vm_zone.h and switch over to the new uma API. Also, remove maxsockets. If you look carefully you'll notice that the old zone allocator never honored this anyway.	2002-03-20 04:09:59 +00:00
alfred	357e37e023	Remove __P.	2002-03-19 21:25:46 +00:00
alfred	5a84f98839	don't generate files with __P.	2002-03-19 20:48:32 +00:00
arr	ae315cb919	- Change a malloc / bzero pair to make use of the M_ZERO malloc(9) flag.	2002-03-19 15:41:21 +00:00
peter	83444279ce	Fix a gcc-3.1+ warning. warning: deprecated use of label at end of compound statement ie: you cannot do this anymore: switch(foo) { .... default: }	2002-03-19 11:02:06 +00:00
peter	a0b32e92d6	Pacify gcc-3.1+, initialize two variables to avoid -Wuninitialized warnings.	2002-03-19 10:57:40 +00:00
peter	4319b6e738	Fix warnings on gcc-3.1+ where __func__ is a const char * instead of a string.	2002-03-19 10:56:46 +00:00
jeff	2923687da3	This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator. Reviewed by: arch@	2002-03-19 09:11:49 +00:00
alfred	21fc25cfdf	Close a race when vfs_syscalls.c:checkdirs() runs. To do this protect the filedesc pointer in the proc with PROC_LOCK in both checkdirs() and kern_descrip.c:fdfree().	2002-03-19 04:30:04 +00:00
bde	df8144b98e	Fixed some printf format errors (hopefully all of the remaining daddr64_t ones for GENERIC, and all others on the same line as those). Reformat the printfs if necessary to avoid new long lines or old format printf errors.	2002-03-19 04:09:21 +00:00
arr	25a6daa828	- Lock down the ``module'' structure by adding an SX lock that is used by all the global bits of ``module'' data. This commit adds a few generic macros, MOD_SLOCK, MOD_XLOCK, etc., that are meant to be used as ways of accessing the SX lock. It is also the first step in helping to lock down the kernel linker and module systems. Reviewed by: jhb, jake, smp@	2002-03-18 07:45:30 +00:00
mckusick	14dd08fd15	Add a flags parameter to VFS_VGET to pass through the desired locking flags when acquiring a vnode. The immediate purpose is to allow polling lock requests (LK_NOWAIT) needed by soft updates to avoid deadlock when enlisting other processes to help with the background cleanup. For the future it will allow the use of shared locks for read access to vnodes. This change touches a lot of files as it affects most filesystems within the system. It has been well tested on FFS, loopback, and CD-ROM filesystems, but only lightly on the others, so if you find a problem there, please let me (mckusick@mckusick.com) know.	2002-03-17 01:25:47 +00:00
jake	34dcf8975d	Convert all pmap_kenter/pmap_kremove pairs in MI code to use pmap_qenter/ pmap_qremove. pmap_kenter is not safe to use in MI code because it is not guaranteed to flush the mapping from the tlb on all cpus. If the process in question is preempted and migrates cpus between the call to pmap_kenter and pmap_kremove, the original cpu will be left with stale mappings in its tlb. This is currently not a problem for i386 because we do not use PG_G on SMP, and thus all mappings are flushed from the tlb on context switches, not just user mappings. This is not the case on all architectures, and if PG_G is to be used with SMP on i386 it will be a problem. This was committed by peter earlier as part of his fine grained tlb shootdown work for i386, which was backed out for other reasons. Reviewed by: peter	2002-03-17 00:56:41 +00:00
des	cba4e41433	Implement PT_IO (read / write arbitrary amounts of data or text). Submitted by: Artur Grabowski <art@{blahonga,openbsd}.org> Obtained from: OpenBSD	2002-03-16 02:40:02 +00:00
des	85d610d6a1	PT_[GS]ET{,DB,FP}REGS isn't really optional any more, since we have dummy backend functions for those archs that don't support them. I meant to do this ages ago, but never got around to it. Inspired by: OpenBSD	2002-03-15 20:17:12 +00:00
mckusick	e929f2e4f0	Introduce the new 64-bit size disk block, daddr64_t. Change the bio and buffer structures to have daddr64_t bio_pblkno, b_blkno, and b_lblkno fields which allows access to disks larger than a Terabyte in size. This change also requires that the VOP_BMAP vnode operation accept and return daddr64_t blocks. This delta should not affect system operation in any way. It merely sets up the necessary interfaces to allow the development of disk drivers that work with these larger disk block addresses. It also allows for the development of UFS2 which will use 64-bit block addresses.	2002-03-15 18:49:47 +00:00
alfred	b0fd50345a	Giant pushdown for read/write/pread/pwrite syscalls. kern/kern_descrip.c: Acquire Giant in fdrop_locked when file refcount hits zero, this removes the requirement for the caller to own Giant for the most part. kern/kern_ktrace.c: Acquire Giant in ktrgenio, simplifies locking in upper read/write syscalls. kern/vfs_bio.c: Acquire Giant in bwillwrite if needed. kern/sys_generic.c: Giant pushdown, remove Giant for: read, pread, write and pwrite. readv and writev aren't done yet because of the possible malloc calls for iov to uio processing. kern/sys_socket.c: Grab Giant in the socket fo_read/write functions. kern/vfs_vnops.c: Grab Giant in the vnode fo_read/write functions.	2002-03-15 08:03:46 +00:00
alfred	2261bd0e24	Bug fixes: Missed a place where the pipe sleep lock was needed in order to safely grab Giant, fix it and add an assertion to make sure this doesn't happen again. Fix typos in the PIPE_GET_GIANT/PIPE_DROP_GIANT that could cause the wrong mutex to get passed to PIPE_LOCK/PIPE_UNLOCK. Fix a location where the wrong pipe was being passed to PIPE_GET_GIANT/PIPE_DROP_GIANT.	2002-03-15 07:18:09 +00:00
alfred	2c16fbdd2a	Fixes to make select/poll mpsafe. Problem: selwakeup required calling pfind which would cause lock order reversals with the allproc_lock and the per-process filedesc lock. Solution: Instead of recording the pid of the select()'ing process into the selinfo structure, actually record a pointer to the thread. To avoid dereferencing a bad address all the selinfo structures that are in use by a thread are kept in a list hung off the thread (protected by sellock). When a selwakeup occurs the selinfo is removed from that thread's list, it is also removed on the way out of select or poll where the thread will traverse its list removing all the selinfos from its own list. Problem: Previously the PROC_LOCK was used to provide the mutual exclusion needed to ensure proper locking, this couldn't work because there was a single condvar used for select and poll and condvars can only be used with a single mutex. Solution: Introduce a global mutex 'sellock' which is used to provide mutual exclusion when recording events to wait on as well as performing notification when an event occurs. Interesting note: schedlock is required to manipulate the per-thread TDF_SELECT flag, however if given its own field it would not need schedlock, also because TDF_SELECT is only manipulated under sellock one doesn't actually use schedlock for synchronization, only to protect against corruption. Proc locks are no longer used in select/poll. Portions contributed by: davidc	2002-03-14 01:32:30 +00:00
green	9a5e1dcf21	Rename SI_SUB_MUTEX to SI_SUB_MTX_POOL to make the name at all accurate. While doing this, move it earlier in the sysinit boot process so that the VM system can use it. After that, the system is now able to use sx locks instead of lockmgr locks in the VM system. To accomplish this, some of the more questionable uses of the locks (such as testing whether they are owned or not, as well as allowing shared+exclusive recursion) are removed, and simpler logic throughout is used so locks should also be easier to understand. This has been tested on my laptop for months, and has not shown any problems on SMP systems, either, so appears quite safe. One more user of lockmgr down, many more to go :)	2002-03-13 23:48:08 +00:00
archie	4ff8306186	Add realloc() and reallocf(), and make free(NULL, ...) acceptable. Reviewed by: alfred	2002-03-13 01:42:33 +00:00
jeff	e6d26e8880	This patch adds the "LOCKSHARED" option to namei which causes it to only acquire shared locks on leafs. The stat() and open() calls have been changed to make use of this new functionality. Using shared locks in these cases is sufficient and can significantly reduce their latency if IO is pending to these vnodes. Also, this reduces the number of exclusive locks that are floating around in the system, which helps reduce the number of deadlocks that occur. A new kernel option "LOOKUP_SHARED" has been added. It defaults to off so this patch can be turned on for testing, and should eventually go away once it is proven to be stable. I have personally been running this patch for over a year now, so it is believed to be fully stable. Reviewed by: jake, obrien Approved by: jake	2002-03-12 04:00:11 +00:00
phk	ad998ff108	Make the disk_clone() routine more robust for abuse. Sneak in a trivial bit of the GEOM stuff while we're here anyway.	2002-03-11 08:08:02 +00:00
tanimura	22c75bf1c9	Stop abusing the pgrpsess_lock.	2002-03-11 07:53:13 +00:00
tanimura	b9e49bfcc9	Do not lock the pgrpsess_lock exclusively across ttywait(). Spotted by: David Wolfskill <david@catwhisker.org> Investigated by: rwatson	2002-03-11 07:51:08 +00:00
dwmalone	532dc5e009	Don't assign strcmp to a variable called err and then compare it with zero, just compare strcmp with zero. This fixes the same bug which Maxim just fixed and fixes some odd style too. PR: 35712 Reviewed by: arr	2002-03-10 23:12:43 +00:00
sobomax	c497ba78d2	Fix a breakage introduced in rev.1.75 (supposedly style cleanup), which results in "missing dependencies" error when loading some kld modules. It is sad to see how often these days style cleanups break things that weren't broken. Perhaps people should recall good old principle: "don't fix it if it isn't broken".	2002-03-10 19:20:01 +00:00
phk	e28f5bfbe1	Make the proposed name arg to dev_stdclone() const.	2002-03-10 10:50:05 +00:00
alfred	00d9ca2b85	Remove __P	2002-03-09 22:44:37 +00:00

5233 Commits