freebsd-skq

Author	SHA1	Message	Date
iedowse	b399d5ecbd	Add a checksum to the kernel message buffer, and update it every time a character is written. Use this at boot time to reject the existing buffer contents if they are corrupt. This fixes a problem seen on some hardware (especially laptops) where the message buffer gets partially corrupted during a short power cycle or reset, but the msgbuf structure is left intact so it gets reused, resulting in random junk and control characters appearing in dmesg and /var/log/messages. PR: kern/28497	2003-03-28 02:50:10 +00:00
tegge	ede5ebede7	Add support for reading directly from file to userland buffer when the O_DIRECT descriptor status flag is set and both offset and length is a multiple of the physical media sector size.	2003-03-26 23:40:42 +00:00
tegge	5e14826743	Adjust the number of vnodes scanned by vlrureclaim() according to the size of the vnode list.	2003-03-26 22:15:58 +00:00
rwatson	84af8bf695	Permit debug.malloc.failure_rate to be specified using a tunable so that the feature can be enabled during the boot process. Note the continued limitation that FreeBSD fails so rapidly with this setting enabled that it's hard to narrow down particular failures for correction; we really need per-malloc type failure rates.	2003-03-26 20:44:29 +00:00
rwatson	68d9c43724	Add a new kernel option, MALLOC_MAKE_FAILURES, which compiles in a debugging feature causing M_NOWAIT allocations to fail at a specified rate. This can be useful for detecting poor handling of M_NOWAIT: the most frequent problems I've bumped into are unconditional deference of the pointer even though it's NULL, and hangs as a result of a lost event where memory for the event couldn't be allocated. Two sysctls are added: debug.malloc.failure_rate How often to generate a failure: if set to 0 (default), this feature is disabled. Otherwise, the frequency of failures -- I've been using 10 (one in ten mallocs fails), but other popular settings might be much lower or much higher. debug.malloc.failure_count Number of times a coerced malloc failure has occurred as a result of this feature. Useful for tracking what might have happened and whether failures are being generated. Useful possible additions: tying failure rate to malloc type, printfs indicating the thread that experienced the coerced failure. Reviewed by: jeffr, jhb	2003-03-26 20:18:40 +00:00
tegge	d9da9de257	fp->f_offset doesn't need any protection when it isn't accessed.	2003-03-26 19:21:12 +00:00
rwatson	e5680de54a	Modify the mac_init_ipq() MAC Framework entry point to accept an additional flags argument to indicate blocking disposition, and pass in M_NOWAIT from the IP reassembly code to indicate that blocking is not OK when labeling a new IP fragment reassembly queue. This should eliminate some of the WITNESS warnings that have started popping up since fine-grained IP stack locking started going in; if memory allocation fails, the creation of the fragment queue will be aborted. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-03-26 15:12:03 +00:00
jhb	671aa92ea0	Remove extraneous check. We are not going to return from copyin/out on the stack of a thread A but actually be thread B instead of thread A.	2003-03-25 20:13:24 +00:00
mdodd	fe36e1c847	Give print_child a default method.	2003-03-25 04:32:52 +00:00
jake	783ae539c3	- Add vm_paddr_t, a physical address type. This is required for systems where physical addresses larger than virtual addresses, such as i386s with PAE. - Use this to represent physical addresses in the MI vm system and in the i386 pmap code. This also changes the paddr parameter to d_mmap_t. - Fix printf formats to handle physical addresses >4G in the i386 memory detection code, and due to kvtop returning vm_paddr_t instead of u_long. Note that this is a name change only; vm_paddr_t is still the same as vm_offset_t on all currently supported platforms. Sponsored by: DARPA, Network Associates Laboratories Discussed with: re, phk (cdevsw change)	2003-03-25 00:07:06 +00:00
jhb	98a481610a	Replace the at_fork, at_exec, and at_exit functions with the slightly more flexible process_fork, process_exec, and process_exit eventhandlers. This reduces code duplication and also means that I don't have to go duplicate the eventhandler locking three more times for each of at_fork, at_exec, and at_exit. Reviewed by: phk, jake, almost complete silence on arch@	2003-03-24 21:15:35 +00:00
jhb	966c72c345	- Remove witness_dead and just use witness_watch instead. If witness_watch is set to 0, it now has the same affect as setting witness_dead used to have. - Added a sysctl handler that allows root to change witness_watch from a non-zero value to zero to disable witness at runtime. Note that you can't turn witness back on once it is off. You can only turn it off as a one-way switch. - Added a comment describing the possible values of witness_watch.	2003-03-24 21:03:53 +00:00
mux	cfd612e2d7	Remove a trailing semicolon in SCHED_QUANTUM definition. Luckily this didn't cause any bugs. Spotted by: Samy Al Bahra <samy@kerneled.com>	2003-03-24 15:16:21 +00:00
cognet	501da8bd5f	s/discriptors/descriptors/	2003-03-23 19:41:34 +00:00
tjr	9785758af0	Remove unused mtx_lock_giant(), mtx_unlock_giant(), related globals and sysctls.	2003-03-23 11:26:11 +00:00
yar	f9968b4d9f	We shouldn't assert that a vode is locked in vop_lock_post() if VOP_LOCK() has failed. Reviewed by: jeff	2003-03-22 13:21:54 +00:00
jhb	300b98684d	Use td_ucred of curthread instead of p_ucred of curproc. This required changing sem_perm() and sem_hasopen() to take a thread instead of a proc for the first argument.	2003-03-20 21:12:31 +00:00
phk	afab808bc4	Backout the getcwd changes, a more comprehensive effort will be needed.	2003-03-20 10:40:45 +00:00
davidxu	8e88e8da05	Adjust code for userland preemptive. Userland can set a quantum in kse_mailbox to schedule an upcall, this is useful for userland timeout routine, for example pthread_cond_timedwait(). Also extract upcall scheduling code from kse_reassign and create a new function called thread_switchout to include these code. Reviewed by: julain	2003-03-19 05:49:38 +00:00
des	d0bee7242c	Unregisterize, ansify.	2003-03-19 00:49:40 +00:00
des	e97206db4c	Whitespace cleanup.	2003-03-19 00:33:38 +00:00
jake	ce68d8fa22	long != int. Use SYSCTL_UINT for kern.devstat.generation. Fixes booting on sparc64.	2003-03-18 23:32:27 +00:00
gallatin	d18eb82da5	Fix a race condition in socow_setup(): The page must be wired before sf_buf_alloc() is called, as sf_buf_alloc() may sleep. If it does sleep, the page might be reclaimed before wiring occurs. Reported by: alc	2003-03-18 18:27:33 +00:00
phk	eef257a93e	If devstat_new_entry() is passed a unit number of -1 assume that the devstat is for an "interior" GEOM node and register using the name argument as a geom identity pointer. Do not put these devstat structures on the list returned by the sysctl. This gives us the ability to tell the two kinds of nodes apart and leave the current "strictly physical" view of devstat intact without modifications, yet be able to use devstat for both kinds of devices. It also saves us bloating struct devstat with another 48 bytes of space for the name. At least for now. Reviewed by: ken	2003-03-18 09:30:31 +00:00
phk	45ffc6d110	Make devstat fully Giant agnostic: Add a mutex and protect the allocation and traversal of the list with it. When we allocate a page for devstat use we drop the mutex and use M_WAITOK this is not nice, but under the given circumstances the best we can do. In the sysctl handler for returning the devstat entries we do not want to hold the mutex across copyout(9) calls, so we keep a very careful eye on the devstat_generation count, and abandon with EBUSY if it changes under our feet. Specifically test for BIO_WRITE, rather than default non-read,non-deletes as write. Make the default be DEVSTAT_NO_DATA. Add atomic increments of the sequence[01] fields so applications using the mmap'ed view stand a chance of detecting updates in progress. Reviewed by: ken	2003-03-18 09:20:20 +00:00
phk	e059b79437	Including <sys/stdint.h> is (almost?) universally only to be able to use %j in printfs, so put a newsted include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.	2003-03-18 08:45:25 +00:00
phk	6c4e57c44b	Make devstat_new_entry() take a const void * rather than const char * argument, GEOM nodes are not identified by ascii string.	2003-03-18 07:52:59 +00:00
jeff	c0f79e52d6	- Unlock the target bp and not the pager buf bp in a failure case in cluster_wbuild(). This was causing strange panics that were widely reported on current@. Big Pointy Hat to: jeff	2003-03-17 18:38:49 +00:00
phk	90f136c592	(This commit certainly increases the need for a wash&clean of vfs_cache.c, but I decided that it was important for this patch to not bit-rot, and since it is mainly moving code around, the total amount of entropy is epsilon /phk) This is a patch to move the common parts of linux_getcwd() back into kern/vfs_cache.c so that the standard FreeBSD libc getcwd() can use it's extended functionality. The linux syscall linux_getcwd() in compat/linux/linux_getcwd.c has been rewritten to use it too. It should be possible to simplify libc's getcwd() after this. No doubt this code needs some cleaning up, since I've left in the sysctl variables I used for debugging. PR: 48169 Submitted by: James Whitwell <abacau@yahoo.com.au>	2003-03-17 12:21:08 +00:00
phk	1ff2d4dcb1	Add a #define for the device name of the mmap device for devstat. Constify the geom identification pointer.	2003-03-16 23:20:05 +00:00
alc	4641d3d127	Pass the sf buf to MEXTADD() as the optional argument. This permits the simplification of socow_iodone() and sf_buf_free(); they don't have to reverse engineer the sf buf from the data's address.	2003-03-16 07:19:12 +00:00
phk	1b534a022f	One devstat_start_transaction_bio() is enough.	2003-03-15 22:20:38 +00:00
phk	f432014308	Run a revision of the devstat interface: Kernel: Change statistics to use the uptime() timescale (ie: relative to boottime) rather than the UTC aligned timescale. This makes the device statistics code oblivious to clock steps. Change timestamps to bintime format, they are cheaper. Remove the "busy_count", and replace it with two counter fields: "start_count" and "end_count", which are updated in the down and up paths respectively. This removes the locking constraint on devstat. Add a timestamp argument to devstat_start_transaction(), this will normally be a timestamp set by the _bio() function in bp->bio_t0. Use this field to calculate duration of I/O operations. Add two timestamp arguments to devstat_end_transaction(), one is the current time, a NULL pointer means "take timestamp yourself", the other is the timestamp of when this transaction started (see above). Change calculation of busy_time to operate on "the salami principle": Only when we are idle, which we can determine by the start+end counts being identical, do we update the "busy_from" field in the down path. In the up path we accumulate the timeslice in busy_time and update busy_from. Change the byte_* and num_* fields into two arrays: bytes[] and operations[]. Userland: Change the misleading "busy_time" name to be called "snap_time" and make the time long double since that is what most users need anyway, fill it using clock_gettime(CLOCK_MONOTONIC) to put it on the same timescale as the kernel fields. Change devstat_compute_etime() to operate on struct bintime. Remove the version 2 legacy interface: the change to bintime makes compatibility far too expensive. Fix a bug in systat's "vm" page where boot relative busy times would be bogus. Bump __FreeBSD_version to 500107 Review & Collaboration by: ken	2003-03-15 21:59:06 +00:00
phk	4a623e073c	Add a devstat_start_transaction_bio() to match the devstat_end_transaction_bio() we already have. For now it just calls devstat_start_transaction(), but that will change shortly.	2003-03-15 10:33:32 +00:00
davidxu	c2573f692d	Export current time when returning from never blocked syscall.	2003-03-14 03:52:16 +00:00
jhb	e90ccce535	Trim some trailing whitespace.	2003-03-13 23:07:09 +00:00
jhb	a21ccbbf8c	Add a new userland-visible ktrace flag KTR_DROP and an internal ktrace flag KTRFAC_DROP to track instances when ktrace events are dropped due to the request pool being exhausted. When a thread tries to post a ktrace event and is unable to due to no available ktrace request objects, it sets KTRFAC_DROP in its process' p_traceflag field. The next trace event to successfully post from that process will set the KTR_DROP flag in the header of the request going out and clear KTRFAC_DROP. The KTR_DROP flag is the high bit in the type field of the ktr_header structure. Older kdump binaries will simply complain about an unknown type when seeing an entry with KTR_DROP set. Note that KTR_DROP being set on a record in a ktrace file does not tell you anything except that at least one event from this process was dropped prior to this event. The user has no way of knowing what types of events were dropped nor how many were dropped. Requested by: phk	2003-03-13 18:31:15 +00:00
jhb	f02ef38080	- Cache a reference to the credential of the thread that starts a ktrace in struct proc as p_tracecred alongside the current cache of the vnode in p_tracep. This credential is then used for all later ktrace operations on this file rather than using the credential of the current thread at the time of each ktrace event. - Now that we have multiple ktrace-related items in struct proc that are pointers, rename p_tracep to p_tracevp to make it less ambiguous. Requested by: rwatson (1)	2003-03-13 18:24:22 +00:00
iedowse	db8dd4828e	In m_dup_pkthdr(), convert the supplied `how' argument into malloc flags when passing it into m_tag_copy_chain(), as m_tag* functions use malloc, not mbuf flags.	2003-03-13 09:02:19 +00:00
jeff	459181e3ed	- Add a lock for protecting against msleep(bp, ...) wakeup(bp) races. - Create a new function bdone() which sets B_DONE and calls wakup(bp). This is suitable for use as b_iodone for buf consumers who are not going through the buf cache. - Create a new function bwait() which waits for the buf to be done at a set priority and with a specific wmesg. - Replace several cases where the above functionality was implemented without locking with the new functions.	2003-03-13 07:31:45 +00:00
jeff	ec5374265b	- Remove a dead check for bp->b_vp == vp in vtruncbuf(). This has not been possible for some time. - Lock the buf before accessing fields. This should very rarely be locked. - Assert that B_DELWRI is set after we acquire the buf. This should always be the case now.	2003-03-13 07:22:53 +00:00
jeff	ae3c8799da	- Remove a race between fsync like functions and flushbufqueues() by requiring locked bufs in vfs_bio_awrite(). Previously the buf could have been written out by fsync before we acquired the buf lock if it weren't for giant. The cluster_wbuild() handles this race properly but the single write at the end of vfs_bio_awrite() would not. - Modify flushbufqueues() so there is only one copy of the loop. Pass a parameter in that says whether or not we should sync bufs with deps. - Call flushbufqueues() a second time and then break if we couldn't find any bufs without deps.	2003-03-13 07:19:23 +00:00
alfred	91e561ec03	Make sure we actually have a dev before dereferencing in case someone botches and sends us a NULL pointer. The other code in this file seems to expect it to be able to handle it behaving this way.	2003-03-13 06:29:44 +00:00
jeff	814703b2a4	- Tune down read_max. For single disks we get no gain out of reading more than a MAXPHYS size block ahead. Having this set too high just leaves other processes starved for IO and screws up interactive response. Let the users with RAID set it higher when they need it.	2003-03-13 06:17:59 +00:00
tjr	2230824221	Tidy up previous change: move comment about obtaining an exclusive reference where it belongs, and remove a blank line to make it more obvious what the comment applies to.	2003-03-13 00:57:47 +00:00
tjr	a9d877b4c2	Back out previous. The locking here needs a rethink.	2003-03-13 00:54:53 +00:00
jhb	954b82f293	- Various little style fixes. - If SYSCTL_OUT() fails in sysctl_kern_proc_args(), return the error instead of ignoring it if we have new arguments for the process. - If the new arguments for a process are too long, return ENOMEM instead of returning success but not doing the actual copy. Submitted by: bde	2003-03-12 20:17:40 +00:00
jhb	7510e91aa2	- Avoid dropping the proc lock around a simple permissions check and just hold hold it across the check to avoid extra lock operations in the common case. - Copy in the new args to a temporary pargs structure before we drop the reference to the old one. Thus, if the copyin() fails, the process arguments are unchanged rather than being deleted. Also, p_args is no longer NULL during the sysctl operation.	2003-03-12 16:14:55 +00:00
tjr	679efe569a	Acquire sched_lock around use of FOREACH_KSEGRP_IN_PROC, accesses to kg_nice and calls to sched_nice() in getpriority() and setpriority() (really donice()).	2003-03-12 11:24:41 +00:00
tjr	59d5730195	In wait1(), remove the zombie process from zombproc before removing it from its pgrp to avoid leaving zombies around with p_pgrp == NULL. This bug was apparent as a NULL-dereference in the pid selection code in fork1().	2003-03-12 11:10:04 +00:00
jhb	0c3ac305c8	Trim an extra blank line that snuck into the last commit.	2003-03-11 22:33:42 +00:00
kan	378cd3b05d	Rename vfs_stdsync function to vfs_stdnosync which matches more closely what function is really doing. Update all existing consumers to use the new name. Introduce a new vfs_stdsync function, which iterates over mount point's vnodes and call FSYNC on each one of them in turn. Make nwfs and smbfs use this new function instead of rolling their own identical sync implementations. Reviewed by: jeff	2003-03-11 22:15:10 +00:00
jhb	b2bb08b487	- Change witness_displaydescendants() to accept the indentation level as a parameter instead of using the level of a given witness. When recursing, pass an indent level of indent + 1. - Make use of the information witness_levelall() provides in witness_display_list() to use an O(n) algorithm instead of an O(n^2) algo to decide which witnesses to display hierarchies from. Basically, we only display a hierarchy for witnesses with a level of 0. - Add a new per-witness flag that is reset at the start of witness_display() for all witness's and is set the first time a witness is displayed in witness_displaydescendants(). If a witness is encountered more than once in the lock order tree (which happens often), witness_displaydescendants() marks the later occurrences with the string "(already displayed)" and doesn't display the subtree under that witness. This avoids duplicating large amounts of the lock order tree in the 'show witness' output in DDB. All these changes serve to make 'show witness' a lot more readable and useful than it was previously.	2003-03-11 22:14:21 +00:00
jhb	736396fc05	- Split the itismychild() function into two functions: insertchild() adds a witness to the child list of a parent witness. rebalancetree() runs through the entire tree removing direct descendants of witnesses who already have said child witness as an indirect descendant through another direct descendant. itismychild() now calls insertchild() followed by rebalancetree() and no longer needs the evil hack of having static recursed variable. - Add a function reparentchildren() that adds all the direct descendants of one witness as direct descendants of another witness. - Change the return value of itismychild() and similar functions so that they return 0 in the case of failure due to lack of resources instead of 1. This makes the return value more intuitive. - Check the return value of itismychild() when defining the static lock order in witness_initialize(). - Don't try to setup a lock instance in witness_lock() if itismychild() fails. Witness is hosed anyways so no need to do any more witness related activity at that point. It also makes the code flow easier to understand. - Add a new depart() function as the opposite of enroll(). When the reference count of a witness drops to 0 in witness_destroy(), this function is called on that witness. First, it runs through the lock order tree using reparentchildren() to reparent direct descendants of the departing witness to each of the witness' parents in the tree. Next, it releases it's own child list and other associated resources. Finally it calls rebalanacetree() to rebalance the lock order tree. - Sort function prototypes into something closer to alphabetical order. As a result of these changes, there should no longer be 'dead' witnesses in the order tree, and repeatedly loading and unloading a module should no longer exhaust witness of its internal resources. Inspired by: gallatin	2003-03-11 22:07:35 +00:00
jhb	5aeddebb51	Trim useless "../" leading strings from filenames passed into witness.	2003-03-11 21:53:12 +00:00
jhb	42e39caaab	Adjust style of #ifdef's and #endif's to be more consistent and in line with recent additions to style(9).	2003-03-11 21:38:49 +00:00
jhb	f7f4727b44	Do the lock order check skip for the LOP_TRYLOCK case after the check for recursing on a lock instead of before. This fixes a bug where WITNESS could get a little confused if you did an sx_tryslock() on a sx lock that you already had an slock on. WITNESS would still function correctly but it could result in weirdness in the output of 'show locks'. This also makes it possible for mtx_trylock() to recurse on a lock.	2003-03-11 20:54:37 +00:00
jhb	ed498389de	Rework the eventhandler locking for hopefully the last time. The scheme used popped into my head during my morning commute a few weeks ago, but it is also very similar (though a bit simpler) to a patch that mini@ developed a while ago. Basically, each eventhandler list has a mutex and a run count. During an eventhandler invocation, the mutex is held while we traverse the list but is dropped while we execute actual handlers. Also, a runcount counter is incremented at the start of an invocation and decremented at the end of an invocation. Adding to the list is not a big deal since the reference of a thread currently executing the handlers remains valid across an add operation. Whether or not new handlers are executed by threads currently executing the handlers for a given list is indeterminate however. The harder case is when a handler is removed from the list. If the runcount is zero, the handler is simply removed from the list directly. If the runcount is not zero, then another thread is currently executing the handlers of this list, so the priority of this handler is set to a magic value (currently -1) to mark it as dead. Dead handlers are not executed during an invocation. If the runcount is zero after it is decremented at the end of an invocation, then a new eventhandler_prune_list() function is called to remove dead handlers from the list. Additional minor notes: - All the common parts of EVENTHANDLER_INVOKE() and EVENTHANDLER_FAST_INVOKE() have been merged into a common _EVENTHANDLER_INVOKE() macro to reduce duplication and ease maintenance. - KTR logging for eventhandlers is now available via the KTR_EVH mask. - The global eventhander_mutex is no longer recursive. Tested by: scottl (SMP i386)	2003-03-11 20:17:00 +00:00
jhb	97c1e71ca2	Axe the useless MTX_SLEEPABLE flag. mutexes are not sleepable locks. Nothing used this flag and WITNESS would have panic'd during mtx_init() if anything had.	2003-03-11 20:02:57 +00:00
jhb	8c7bd00836	Use a shorter and less redundant name for the sysctl tree lock.	2003-03-11 20:01:51 +00:00
jhb	a8c7b4178e	Use the KTR_LOCK mask for logging events via KTR in lockmgr() rather than KTR_LOCKMGR. lockmgr locks are locks just like other locks.	2003-03-11 20:00:37 +00:00
jhb	559c644740	Trim leading "../" sequences from filenames.	2003-03-11 19:56:16 +00:00
jeff	1e928d5480	- Regularize variable usage in cluster_read(). - Issue the io that we will later block on prior to doing cluster read ahead so that it is more likely to be ready when we block. - Loop issuing clustered reads until we've exhausted the seq count supplied by the file system. - Use a sysctl tunable "vfs.read_max" to determine the maximum number of blocks that we'll read ahead.	2003-03-11 06:14:03 +00:00
davidxu	f453b04b6d	Lock proc lock before changing p_flag.	2003-03-11 03:16:02 +00:00
davidxu	b47a4be33e	Fix signal delivering bug for threaded process.	2003-03-11 02:59:50 +00:00
davidxu	bb4f70ad77	Fix threaded process job control bug. SMP tested. Reviewed by: julian	2003-03-11 00:07:53 +00:00
kan	5fc13b9f67	Remove trainling whitespace.	2003-03-10 21:55:00 +00:00
phk	780911f210	PHCC[1]: I had commented the #ifdef INVARIANTS checks out to make sure I ran this code in all kernels and forgot to comment the #ifdefs back in before I committed. Spotted by: bmilekic [1] PHCC = Pointy Hat Correction Commit	2003-03-10 20:24:54 +00:00
phk	b3deffb7e1	Make malloc and mbuf allocation mode flags nonoverlapping. Under INVARIANTS whine if we get incompatible flags. Submitted by: imp	2003-03-10 19:39:53 +00:00
jhb	cb710fa095	Now that we have WITNESS_WARN(), we only call witness_list() from the ddb 'show locks' command. Thus, move witness_list() to the #ifdef DDB section and remove extra checks for calling this function outside of DDB. Also, witness_list() now returns void instead of returning an int. Reported by: Steve Ames <steve@energistic.com> Prodded by: davidxu	2003-03-10 17:03:57 +00:00
phk	ef078ef6a1	Don't call make_dev() before we are ready for it.	2003-03-09 20:42:49 +00:00
alc	3a50d97c5c	Remove some unnecessary actions by the zero-copy setup and teardown code. Remove an incorrect comment. (Incrementing an object's reference count does not prevent a process from exiting. The real concern here is that the physical page must not be deleted until transmission is complete. That is already handled by the VM system and sf_buf_free().) Tested by: ken	2003-03-09 20:38:56 +00:00
phk	2fe1f6204d	Note that MAJOR_AUTO is now the default if d_maj is not initialized. This is more robust and prevents the hijacking of /dev/console for the typical mistake. Remove unneeded MAJOR_AUTO uses, it is only needed explicitly now if the driver source has cross-branch compatibility to old releases.	2003-03-09 11:03:45 +00:00
phk	0d99ca7a9d	Add one little hack to allow us to make MAJOR_AUTO be zero: Let the console driver ask for major 256 and magically change this to mean zero.	2003-03-09 10:28:05 +00:00
davidxu	26081cfe06	Cosmetic change, make it QUEUE_MACRO_DEBUG friendly	2003-03-09 04:27:46 +00:00
tjr	0b60094f80	Hold the proc lock while accessing p_procsig in trapsignal().	2003-03-09 01:40:55 +00:00
phk	cb28d5d218	Retire devstat_add_entry() as a public function and bump __FreeBSD_version to mark this act.	2003-03-08 21:46:43 +00:00
phk	e39377fdd0	Introduce a device driver for /dev/devstat, this will allow us to mmap the device statistics structures into userland instead of using sysctl. Introduce new devstat_new_entry() function which allocates the devstat structure an calls devstat_add_entry() on it.	2003-03-08 19:58:57 +00:00
ken	471eab1868	Zero copy send and receive fixes: - On receive, vm_map_lookup() needs to trigger the creation of a shadow object. To make that happen, call vm_map_lookup() with PROT_WRITE instead of PROT_READ in vm_pgmoveco(). - On send, a shadow object will be created by the vm_map_lookup() in vm_fault(), but vm_page_cowfault() will delete the original page from the backing object rather than simply letting the legacy COW mechanism take over. In other words, the new page should be added to the shadow object rather than replacing the old page in the backing object. (i.e. vm_page_cowfault() should not be called in this case.) We accomplish this by making sure fs.object == fs.first_object before calling vm_page_cowfault() in vm_fault(). Submitted by: gallatin, alc Tested by: ken	2003-03-08 06:58:18 +00:00
davidxu	dd4ead08fe	Lock sched_lock before modifying td_flags.	2003-03-08 04:09:04 +00:00
bbraun	801f005cb2	Fix a spelling error. Submitted by: jkh Reviewed by: zarzycki	2003-03-07 22:47:32 +00:00
jhb	a189180b83	Respect any passed in external lockmgr flags such as LK_NOWAIT in the default implementations of VOP_LOCK() and VOP_UNLOCK(). Tested by: jlemon, phk Glanced at by: jeffr	2003-03-07 20:45:07 +00:00
jhb	7e6cfbee4e	Oops, fix the double faults people were seeing with the recent changes to witness. Sleepable locks such as sx locks always come before all mutexes including Giant. However, the static lock order list placed Giant before the proctree and allproc sx locks. This resulted in witness creating a cycle in its lock order "tree" (real trees don't have cycles) leading to infinite recursion and eventually a double fault. To fix, put Giant after sx locks in the lock order list.	2003-03-06 17:25:06 +00:00
alc	e5b46f193e	Remove GIANT_REQUIRED from sf_buf_free().	2003-03-06 04:48:19 +00:00
rwatson	7974609efe	Instrument sysarch() MD privileged I/O access interfaces with a MAC check, mac_check_sysarch_ioperm(), permitting MAC security policy modules to control access to these interfaces. Currently, they protect access to IOPL on i386, and setting HAE on Alpha. Additional checks might be required on other platforms to prevent bypass of kernel security protections by unauthorized processes. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-03-06 04:47:47 +00:00
alc	c50367da67	Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress. Discussed on: arch@	2003-03-06 03:41:02 +00:00
rwatson	9ecf925a7d	Provide a mac_check_system_swapoff() entry point, which permits MAC modules to authorize disabling of swap against a particular vnode. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-03-05 23:50:15 +00:00
rwatson	3158a8710a	Move the initialization of the vattr flags field in setfflags() to before the MAC check so that we pass the flags field into the MAC check properly initialized. This didn't affect any current MAC modules since they didn't care what the flags argument was (as they were primarily interested in the fact that it was a meta-data write, not the contents of the write), but would be relevant to future modules relying on that field. Submitted by: Mike Halderman <mrh@spawar.navy.mil> Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-03-05 23:15:23 +00:00
peter	fbc7526e8f	Finish driving a stake through the heart of netns and the associated ifdefs scattered around the place - its dead Jim! The SMB stuff had stolen AF_NS, make it official.	2003-03-05 19:24:24 +00:00
das	5ba556c626	Make TTYHOG tunable. Reviewed by: mike (mentor)	2003-03-05 08:16:29 +00:00
jlemon	04e28d5a81	Update netisr handling; Each SWI now registers its queue, and all queue drain routines are done by swi_net, which allows for better queue control at some future point. Packets may also be directly dispatched to a netisr instead of queued, this may be of interest at some installations, but currently defaults to off. Reviewed by: hsu, silby, jayanth, sam Sponsored by: DARPA, NAI Labs	2003-03-04 23:19:55 +00:00
jhb	45fcac94f4	Bah, fix a bogon in the last commit: get the sense of a compare test right so that we allow a sleepable lock to be acquired with Giant held rather than allowing a sleepable lock to be acquired with anything but Giant held.	2003-03-04 22:34:07 +00:00
jeff	bfd7640850	- Hold the buf lock while manipulating and inspecting its fields. - Use gbincore() and not incore() so that we can drop the vnode interlock as we acquire the buflock. - Use GB_LOCK_NOWAIT when getting bufs for read ahead clusters so that we don't block on locked bufs. - Convert a while loop to a howmany() that will most likely be faster on modern processors. There is another while loop divide that was left near by because it is operating on a 64bit int and is most likely faster. - Cleanup the cluster_read() code a little to get rid of a goto and make the logic clearer. Tested on: x86, alpha Tested by: Steve Kargl <sgk@troutmask.apl.washington.edu> Reviewd by: arch	2003-03-04 21:35:28 +00:00
jhb	1ec0222389	Remove safety belt: it is now ok to do a mtx_trylock() on a mutex you already own. The mtx_trylock() will fail however. Enhance the comment at the top of the try lock function to explain this. Requested by: jlemon and his evil netisr locking	2003-03-04 21:32:25 +00:00
jhb	e4bcd25517	Replace calls to WITNESS_SLEEP() and witness_list() with equivalent calls to WITNESS_WARN().	2003-03-04 21:03:05 +00:00
jhb	e87dfc0cde	Add a WITNESS_WARN() call to verify that we hold no locks after running a handler from an interrupt thread.	2003-03-04 21:01:42 +00:00
jhb	c5a53306ce	A small overhaul of witness: - Add a comment about special lock order rules and Giant near the top of subr_witness.c. Specifically, this documents and explains the real lock order relationship between Giant and sleepable locks (i.e. lockmgr locks and sx locks). Basically, Giant can be safely acquired either before or after sleepable locks and the case of Giant before a sleepable lock is exempted as a special case. - Add a new static function 'witness_list_lock()' that displays a single line of information about a struct lock_instance. This is used to make the output of witness messages more consistent and reduce some code duplication. - Fixup a few comments in witness_lock(). - Properly handle the Giant-before-sleepable-lock lock order exception in a more general fashion and remove the no longer needed LI_SLEPT flag. - Break up the last condition before assuming a reversal a bit to try and make the logic less confusing in witness_lock(). - Axe WITNESS_SLEEP() now that LI_SLEPT is no longer needed and replace it with a more general WITNESS_WARN() macro/function combination. WITNESS_WARN() allows you to output a customized message out to the console along with a list of held locks. It will optionally drop into the debugger as well. You can exempt a single lock from the check by passing it in as the second argument. You can also use flags to specify if Giant should be exempt from the check, if all sleepable locks should be exempt from the check, and if witness should panic if any non-exempt locks are found. - Make the witness_list() function static. Other areas of the kernel should use the new WITNESS_WARN() instead.	2003-03-04 20:56:39 +00:00
jhb	f78f351da3	Miscellaneous cleanups to _mtx_lock_sleep(): - Declare some local variables at the top of the function instead of in a nested block. - Use mtx_owned() instead of masking off bits from mtx_lock manually. - Read the value of mtx_lock into 'v' as a separate line rather than inside an if statement for clarity. This code is hairy enough as it is.	2003-03-04 20:32:41 +00:00
jhb	3b2bb7e47b	Properly assert that mtx_trylock() is not called on a mutex we already owned. Previously the KASSERT would only trigger if we successfully acquired a lock that we already held. However, _obtain_lock() fails to acquire locks that we already hold, so the KASSERT was never checked in the case it was supposed to fail.	2003-03-04 20:30:30 +00:00
jeff	8f4d53e3aa	- Create a function sched_interact_score() which decides on the interactivity of a kseg and assigns it a value of 0 through 100. - Use sched_interact_score() to determine the dynamic priority. - Define SCHED_CURR() in terms of sched_interact_score(). - Adjust the maximum slice back down to 100ms. - Remove redundant clearing of ke_runq in sched_wakeup() - Clean up #defines and comment them.	2003-03-04 02:45:59 +00:00

1 2 3 4 5 ...

6187 Commits