freebsd-dev

Author	SHA1	Message	Date
Maxim Konovalov	75d960eb2e	o Remove rev. 1.57 leftover, not reached code.	2006-07-02 20:49:46 +00:00
Maxim Konovalov	e2668f5563	o Fix typo in the comment. PR: kern/99632 Submitted by: clsung	2006-06-30 08:10:55 +00:00
David E. O'Brien	2e4db89cfc	Fix building with GCC 4.2: define data types before referring to them.	2006-06-29 19:37:31 +00:00
John Baldwin	fe95c76276	Fix semctl(2) breakage from the previous commit. Previously __semctl() had a local 'semid' variable which was the array index and used uap->semid as the original IPC id. During the kern_semctl() conversion those two variables were collapsed into a single 'semid' variable breaking the places that needed the original IPC ID. To fix, add a new 'semidx' variable to hold the array index and leave 'semid' unmolested as the IPC id. While I'm here, explicitly document that the (undocumented, at least in semctl(2)) SEM_STAT command curiously expects an array index in the 'semid' parameter rather than an IPC id. Submitted by: maxim	2006-06-29 13:58:36 +00:00
David Xu	5151eeb194	Fix a bug when accumulating run time, if a thread calls yield() syscall, its run time may be lost.	2006-06-29 12:29:20 +00:00
David Xu	d29a8ce69b	Fix system load count (noticed by dephij). Remove incorrect comment.	2006-06-29 09:49:00 +00:00
David Xu	0922ef0c42	Remove unused function declaration. Add else statement in sched_calc_pri. Fix a bug when checking interrupt thread in sched_add.	2006-06-29 05:59:36 +00:00
David Xu	d60003a2e4	Remove load balancer code, since it has serious priority inversion problem which really hurts performance on FreeBSD.	2006-06-29 05:36:34 +00:00
John Baldwin	49d409a108	- Add a kern_semctl() helper function for __semctl(). It accepts a pointer to a copied-in copy of the 'union semun' and a uioseg to indicate which memory space the 'buf' pointer of the union points to. This is then used in linux_semctl() and svr4_sys_semctl() to eliminate use of the stackgap. - Mark linux_ipc() and svr4_sys_semsys() MPSAFE.	2006-06-27 18:28:50 +00:00
John Baldwin	597d608f86	- Expand the scope of Giant some in mount(2) to protect the vfsp structure from going away. mount(2) is now MPSAFE. - Expand the scope of Giant some in unmount(2) to protect the mp structure (or rather, to handle concurrent unmount races) from going away. umount(2) is now MPSAFE, as well as linux_umount() and linux_oldumount(). - nmount(2) and linux_mount() were already MPSAFE.	2006-06-27 14:46:31 +00:00
Pawel Jakub Dawidek	0bd645ae0c	Compress direct cr_ruid comparison and jailed() call to suser_cred(9). Reviewed by: rwatson	2006-06-27 11:32:08 +00:00
Pawel Jakub Dawidek	8838c27693	Use suser_cred(9) instead of checking cr_uid directly. Reviewed by: rwatson	2006-06-27 11:29:38 +00:00
Pawel Jakub Dawidek	2905ade228	- Use suser_cred(9) instead of checking cr_ruid directly. - For privileged processes save two mutex operations. We may want to consider whether it is a good idea to use SUSER_ALLOWJAIL here, but for now I didn't want to change the original behaviour. Reviewed by: rwatson	2006-06-27 11:28:50 +00:00
Sergey Babkin	d81175c738	Backed out the change by request from rwatson. PR: kern/14584	2006-06-26 22:03:22 +00:00
John Baldwin	c94ce032df	Address a problem I missed in removing Giant from the kernel linker. Not all of the module event handlers are MP safe yet, so always acquire Giant for now when invoking module event handlers. Eventually we can add an MPSAFE flag or some such and add appropriate locking to all module event handlers.	2006-06-26 18:34:45 +00:00
John Baldwin	322fb40cbf	Remove duplicate security checks already performed in kern_kldload().	2006-06-26 18:33:32 +00:00
Robert Watson	e83b30bdcb	Trim basically unused 'unp' in uipc_connect().	2006-06-26 16:18:22 +00:00
Sergey Babkin	7a799f1ef0	The common UID/GID space implementation. It has been discussed on -arch in 1999, and there are changes to the sysctl names compared to PR, according to that discussion. The description is in sys/conf/NOTES. Lines in the GENERIC files are added in commented-out form. I'll attach the test script I've used to PR. PR: kern/14584 Submitted by: babkin	2006-06-25 18:37:44 +00:00
Ian Dowse	450ec4ed45	If linker_release_module() fails then we still hold a reference on the linker_file, so record this by restoring the linker_file pointer in fp->file.	2006-06-25 12:36:21 +00:00
Pawel Jakub Dawidek	92c0849935	Simplify the code and remove two mutex operations. MFC after: 2 weeks	2006-06-24 22:55:43 +00:00
John Baldwin	70f3778827	Replace the kld_mtx mutex with a kld_sx sx lock and expand its scope to protect all linker-related data structures including the contents of linker file objects and any linker class data as well. Considering how rarely the linker is used I just went with the simple solution of single-threading the whole thing rather than expending a lot of effort on something more fine-grained and complex. Giant is still explicitly acquired while registering and deregistering sysctl's as well as in the elf linker class while calling kmupetext(). The rest of the linker runs without Giant unless it has to acquire Giant while loading files from a non-MPSAFE filesystem.	2006-06-21 20:42:08 +00:00
John Baldwin	cbda6f950b	- Push down Giant in kldfind() and kldsym(). - Remove several goto's by either using direct return's or else clauses.	2006-06-21 20:15:36 +00:00
John Baldwin	d36e739a0c	Whoops, revert accidental commit.	2006-06-21 17:48:59 +00:00
John Baldwin	9dd44bd79e	Fix two comments and a style fix.	2006-06-21 17:48:03 +00:00
John Baldwin	0df2972736	Various whitespace fixes.	2006-06-21 17:47:45 +00:00
John Baldwin	62d615d508	Conditionally acquire Giant around VFS operations.	2006-06-20 21:31:38 +00:00
John Baldwin	aeeb017bd6	- Push Giant down into linker_reference_module(). - Add a new function linker_release_module() as a more intuitive complement to linker_reference_module() that wraps linker_file_unload(). linker_release_module() can either take the module name and version info passed to linker_reference_module() or it can accept the linker file object returned by linker_reference_module().	2006-06-20 20:54:13 +00:00
John Baldwin	f462ce3edd	Make linker_find_file_by_name() and linker_find_file_by_id() static to simplify linker locking. The only external consumers now use linker_file_foreach().	2006-06-20 20:41:15 +00:00
John Baldwin	932151064a	- Add a new linker_file_foreach() function that walks the list of linker file objects calling a user-specified predicate function on each object. The iteration terminates either when the entire list has been iterated over or the predicate function returns a non-zero value. linker_file_foreach() returns the value returned by the last invocation of the predicate function. It also accepts a void * context pointer that is passed to the predicate function as well. Using an iterator function avoids exposing linker internals to the rest of the kernel making locking simpler. - Use linker_file_foreach() instead of walking the list of linker files manually to lookup ndis files in ndis(4). - Use linker_file_foreach() to implement linker_hwpmc_list_objects().	2006-06-20 20:37:17 +00:00
John Baldwin	aaf3170501	Make linker_file_add_dependency() and linker_load_module() static since only the linker uses them.	2006-06-20 20:18:42 +00:00
John Baldwin	e767366f99	Don't check if malloc(M_WAITOK) returns NULL.	2006-06-20 20:11:00 +00:00
John Baldwin	e5bb3a01d7	Use 'else' to remove another goto.	2006-06-20 19:49:28 +00:00
John Baldwin	73a2437a83	- Remove some useless variable initializations. - Make some conditional free()'s where the condition was always true unconditional.	2006-06-20 19:32:10 +00:00
George V. Neville-Neil	fb11be62a2	Properly cast the values of valsize (the size of the value passed in) in setsockopt so that they can be compared correctly against negative values. Passing in a negative value had a rather negative effect on our socket code, making it impossible to open new sockets. PR: 98858 Submitted by: James.Juran@baesystems.com MFC after: 1 week	2006-06-20 12:36:40 +00:00
Robert Watson	721150ad8f	When retrieving SO_ERROR via getsockopt(), hold the socket lock around the retrieval and replacement with 0. MFC after: 1 week	2006-06-18 19:02:49 +00:00
Yaroslav Tykhiy	42ccd54fec	Add a funny sysctl: debug.kdb.trap_code . It is similar to debug.kdb.trap, except for it tries to cause a page fault via a call to an invalid pointer. This can highlight differences between a fault on data access vs. a fault on code call some CPUs might have. This appeared as a test for a work \ Sponsored by: RiNet (Cronyx Plus LLC)	2006-06-18 12:27:59 +00:00
Robert Watson	cd3a3a269f	Remove sbinsertoob(), sbinsertoob_locked(). They violate (and have basically always violated) invariants of soreceive(), which assume that the first mbuf pointer in a receive socket buffer can't change while the SB_LOCK sleepable lock is held on the socket buffer, which is precisely what these functions do. No current protocols invoke these functions, and removing them will help discourage them from ever being used. I should have removed them years ago, but lost track of it. MFC after: 1 week Prodded almost by accident by: peter	2006-06-17 22:48:34 +00:00
Ed Maste	374875fa56	Add a description for sysctl -d.	2006-06-17 02:58:18 +00:00
Robert Watson	9a44cbf19c	Remove unused (and ifdef'd) unp_abort() and unp_drain(). MFC after: 1 month	2006-06-16 22:11:49 +00:00
David Malone	93ef14a74b	Add a kern.timecounter.tc sysctl tree that contains the mask, frequency, quality and current value of each available time counter. At the moment all of these are read-only, but it might make sense to make some of these read-write in the future. MFC after: 3 months	2006-06-16 20:29:05 +00:00
Yaroslav Tykhiy	be70abccba	Kill an XXX remark that has been untrue since rev. 1.150 of this file.	2006-06-16 07:36:18 +00:00
Christian S.J. Peron	4f0840f348	Axe Giant from vn_fullpath(9). The vnode -> pathname lookup should be filesystem agnostic. We are not touching any file system specific functions in this code path. Since we have a cache lock, there is really no need to keep Giant around here. This eliminates Giant acquisitions for any syscall which is auditing pathnames. Discussed with: jeff	2006-06-16 05:09:28 +00:00
Maxim Konovalov	059d68dea6	o Expand an exclusive lock scope to prevent a race between two simultaneous module_register(). Original work done by: Alex Lyashkov Reviewed by: jhb MFC after: 2 weeks	2006-06-15 08:53:09 +00:00
David Xu	7bb561fbb9	Use scheduler API sched_relinquish() to implement yield() syscall.	2006-06-15 06:41:57 +00:00
David Xu	36ec198bd5	Add scheduler API sched_relinquish(); the API is used to implement the yield() and sched_yield() syscalls. Every scheduler has its own way to relinquish the CPU. The ULE and CORE schedulers have two internal run-queues, and a timesharing thread which calls the yield() syscall should be moved to the inactive queue.	2006-06-15 06:37:39 +00:00
David Xu	c2c1ab1858	Clear ke_runq before calling maybe_preempt, this avoids a KASSERT(ke->ke_runq == NULL) panic when the sched_add is recursively called by maybe_preempt. Reported by: Wojciech A. Koszek < dunstan at freebsd dot czest dot pl >	2006-06-14 03:46:03 +00:00
Xin LI	6ad26d8376	Unexpand an instance of TAILQ_EMPTY()	2006-06-14 03:14:26 +00:00
Marcel Moolenaar	e1684acf38	Unbreak 64-bit architectures. The 3rd argument to kern_kldload() is a pointer to an integer and td->td_retval[0] is of type register_t. On 64-bit architectures register_t is wider than an integer.	2006-06-14 03:01:06 +00:00
David Xu	2c7cae8042	Fix a typo in sched_is_timeshare.	2006-06-13 23:45:59 +00:00
David Xu	e15abbf251	Pass boolean value to __predict_false. Try to keep KSE slot count correct for migrating thread, the count is a bit of a mess.	2006-06-13 23:01:50 +00:00
John Baldwin	edd32c2da2	Use kern_kldload() and kern_kldunload() to load and unload modules when we intend for the user to be able to unload them later via kldunload(2) instead of calling linker_load_module() and then directly adjusting the ref count on the linker file structure. This makes the resulting consumer code simpler and cleaner and better hides the linker internals making it possible to sanely lock the linker.	2006-06-13 21:36:23 +00:00
John Baldwin	b21c9288ce	A couple of minor style tweaks.	2006-06-13 21:34:12 +00:00
John Baldwin	d53885879d	- Add a kern_kldload() that is most of the previous kldload() and push Giant down in it. - Push Giant down in kern_kldunload() and reorganize it slightly to avoid using gotos. Also, expose this function to the rest of the kernel.	2006-06-13 21:28:18 +00:00
John Baldwin	6b3d277ad4	- Push down Giant some in kldstat(). - Use a 'struct kld_file_stat' on the stack to read data under the lock and then do one copyout() w/o holding the lock at the end to push the data out to userland.	2006-06-13 21:11:12 +00:00
John Baldwin	b904477c68	Unexpand TAILQ_FOREACH() and TAILQ_FOREACH_SAFE().	2006-06-13 20:49:07 +00:00
John Baldwin	3a600aeabc	Remove some more pointless goto's and don't check to see if malloc(M_WAITOK) returns NULL.	2006-06-13 20:27:23 +00:00
John Baldwin	2fa6cc80d7	Handle the simple case of just dropping a reference near the start of linker_file_unload() instead of in the middle of a bunch of code for the case of dropping the last reference to improve readability and sanity. While I'm here, remove pointless goto's that were just jumping to a return statement.	2006-06-13 19:45:08 +00:00
Maxim Konovalov	70df31f4de	o There are two methods to get a process's credentials over unix sockets: 1) A sender sends an SCM_CREDS message to a receiver, struct cmsgcred; 2) A receiver sets the LOCAL_CREDS socket option and gets the sender's credentials in a control message, struct sockcred. Both methods use the same control message type SCM_CREDS with the same control message level SOL_SOCKET, so they are indistinguishable for the receiver. A difference in struct cmsgcred and struct sockcred layouts may lead to unwanted effects. Now for sockets with the LOCAL_CREDS option remove all previously linked SCM_CREDS control messages and then add a control message with struct sockcred, so a process that specifically asked for the peer credentials via the LOCAL_CREDS option always gets struct sockcred. PR: kern/90800 Submitted by: Andrey Simonenko Regres. tests: tools/regression/sockets/unix_cmsg/ MFC after: 1 month	2006-06-13 14:33:35 +00:00
David Xu	b41f1452d9	Add scheduler CORE, work I did half a year ago and recently picked up again. The scheduler is forked from ULE, but the algorithm to detect an interactive process is almost completely different from ULE's; it comes from the Linux paper "Understanding the Linux 2.6.8.1 CPU Scheduler", although I still use the same word "score" for a priority boost as in the ULE scheduler. Briefly, the scheduler has the following characteristics: 1. A timesharing process's nice value is seriously respected; the timeslice and interaction-detection algorithms are based on the nice value. 2. Per-CPU scheduling queues and load balancing. 3. O(1) scheduling. 4. Some CPU affinity code in the wakeup path. 5. Support for POSIX SCHED_FIFO and SCHED_RR. Unlike the 4BSD and ULE schedulers, which use the fuzzy RQ_PPQ, the scheduler uses 256 priority queues. Unlike ULE, which uses pull and push, the scheduler uses the pull method; the main reason is to let a relatively idle CPU do the work, but currently the whole scheduler is protected by the big sched_lock, so the benefit is not visible, and it can really be worse than nothing because all other CPUs are locked out while we are doing balancing work, a problem the 4BSD scheduler does not have. The scheduler does not support hyperthreading very well; in fact, it does not distinguish between physical and logical CPUs, which should be improved in the future. The scheduler has a priority inversion problem on MP machines; it is not good for realtime scheduling and can cause realtime processes to starve. As a result, it seems that MySQL super-smack runs better on my Pentium-D machine when using libthr, regardless of whether the kernel is UP or SMP.	2006-06-13 13:12:56 +00:00
John Baldwin	5c69ad8374	Use fget() in kqueue_register() instead of doing all the work by hand.	2006-06-12 21:46:23 +00:00
Warner Losh	ccdc8d9bff	Add a convenience function rman_init_from_resource for initializing a rman from a resource. Also, include _bus.h since the implementation of bus_space isn't needed here, just the definitions of the types.	2006-06-12 04:06:21 +00:00
Ian Dowse	eb1030c4fd	Keep firmware images on the list until they have been unregistered with firmware_unregister(). Previously when the last driver reference had been dropped we would clear the list entry under the assumption that the firmware module was about to be unloaded, but this was not true if the firmware image had been loaded manually with kldload. This makes it possible to manually kldload firmware images as a workaround for drivers such as ipw that attempt to load firmware while resuming after a suspend. Reviewed by: mlaier (an earlier version of the patch)	2006-06-10 17:04:07 +00:00
Robert Watson	b37ffd3189	Move some functions and definitions from uipc_socket2.c to uipc_socket.c: - Move sonewconn(), which creates new sockets for incoming connections on listen sockets, so that all socket allocation code is together in uipc_socket.c. - Move 'maxsockets' and associated sysctls to uipc_socket.c with the socket allocation code. - Move kern.ipc sysctl node to uipc_socket.c, add a SYSCTL_DECL() for it to sysctl.h and remove lots of scattered implementations in various IPC modules. - Sort sodealloc() after soalloc() in uipc_socket.c for dependency order reasons. Staticize soalloc() and sodealloc() as they are now required only in uipc_socket.c, and are internal to the socket implementation. After this change, socket allocation and deallocation is entirely centralized in one file, and uipc_socket2.c consists entirely of socket buffer manipulation and default protocol switch functions. MFC after: 1 month	2006-06-10 14:34:07 +00:00
Robert Watson	e02421f3fb	Rearrange code in soalloc() so that it's less indented by returning early if uma_zalloc() from the socket zone fails. No functional change. MFC after: 1 week	2006-06-08 22:33:18 +00:00
Konstantin Belousov	55aef2632f	Fix the LOR that occurs when MAC is compiled into the kernel and a vnode is destroyed. Reviewed by: rwatson LOR: 189 MFC after: 2 weeks Approved by: kan (mentor)	2006-06-08 07:55:10 +00:00
David Xu	0ae716e5ee	Make ke_rqindex unsigned.	2006-06-06 12:26:17 +00:00
Robert Watson	7ebfc8df78	Audit some arguments to nmount(), mount(), umount(). Submitted by: wsalamon Obtained from: TrustedBSD Project	2006-06-05 15:32:07 +00:00
Robert Watson	6e79e6f805	Audit command, uid arguments for quotactl(). Audit the mode argument to mkfifo(). Audit the target path passed to symlink(). Submitted by: wsalamon Obtained from: TrustedBSD Project	2006-06-05 13:34:23 +00:00
Robert Watson	d3778141bf	Audit path passed to the acct() system call. Obtained from: TrustedBSD Project	2006-06-05 13:02:34 +00:00
John Baldwin	49b94bfc54	Bah, fix fat finger in last. Invert the ~ on MTX_FLAGMASK as it's non-intuitive for the ~ to be built into the mask. All the users now explicitly ~ the mask. In addition, add MTX_UNOWNED to the mask even though it technically isn't a flag. This should unbreak mtx_owner(). Quickly spotted by: kris	2006-06-03 21:11:33 +00:00
John Baldwin	3ce3f44293	In the case of reentering the debugger due to an attempt to perform a context switch while in the debugger, reenter the debugger sooner before performing any statistics updates.	2006-06-03 20:49:44 +00:00
John Baldwin	315ce35f7b	Simplify mtx_owner() so it only reads m->mtx_lock once.	2006-06-03 20:45:00 +00:00
John Baldwin	f781b5a4bb	Style fix to be more like _mtx_lock_sleep(): use 'while (!foo) { ... }' instead of 'for (;;) { if (foo) break; ... }'.	2006-06-03 20:44:01 +00:00
Pawel Jakub Dawidek	1f58dd4956	Fix a problem introduced in revision 1.220. On mount(2) failure, don't forget to unbusy file system before its destruction. This fixes the following warning on mount failure: Mount point <X> had 1 dangling refs Tested by: wkoszek	2006-06-02 20:29:02 +00:00
Doug Ambrisko	51e37c7f37	Make lio ident more consistent with aio ident.	2006-06-02 17:45:48 +00:00
Pawel Jakub Dawidek	f420242b2b	Don't forget to unlock kq lock in low memory situations. OK'ed by: jmg	2006-06-02 13:23:39 +00:00
Pawel Jakub Dawidek	8ebab14c70	Remove confusing done_noglobal label. The KQ_GLOBAL_UNLOCK() macro knows how to handle both situations - when the kq_global lock is and is not held. OK'ed by: jmg	2006-06-02 13:21:21 +00:00
Pawel Jakub Dawidek	241321abc0	Use SLIST_FOREACH_SAFE() macro, because knote_drop() can free an element which can be then used to find next element in the list. OK'ed by: jmg	2006-06-02 13:18:59 +00:00
Olivier Houchard	4bb0f51d1d	sched_rem() already sets ke->ke_state to KES_THREAD, so there's no need to redo it.	2006-06-01 22:45:56 +00:00
Diomidis Spinellis	23efd78d03	Remove two locking assertion entries that: a) were incorrectly written and therefore never compiled into assertions, and b) were incorrectly specified and when compiled resulted in a failed assertion.	2006-05-31 14:06:06 +00:00
Diomidis Spinellis	f69ec7af12	Assertion code specifications are introduced using special character sequences that are distinct from comments. %% is used for argument locks; %! for pre- and post-conditions.	2006-05-30 20:49:54 +00:00
Diomidis Spinellis	b1b4282160	Remove incorrect lock validation specifications that caused failed assertions with DEBUG_VFS_LOCKS. We should reinstate them with correct specifications, possibly after extending vnode_if.awk. Noted by: truckman@	2006-05-30 20:21:51 +00:00
Tor Egge	57051fdc4b	Close race between vmspace_exitfree() and exit1() and races between vmspace_exitfree() and vmspace_free() which could result in the same vmspace being freed twice. Factor out part of exit1() into new function vmspace_exit(). Attach to vmspace0 to allow old vmspace to be freed earlier. Add new function, vmspace_acquire_ref(), for obtaining a vmspace reference for a vmspace belonging to another process. Avoid changing vmspace refcount from 0 to 1 since that could also lead to the same vmspace being freed twice. Change vmtotal() and swapout_procs() to use vmspace_acquire_ref(). Reviewed by: alc	2006-05-29 21:28:56 +00:00
Xin LI	56e26c3e7e	Unexpand TAILQ_FIRST(foo) == NULL to TAILQ_EMPTY(foo).	2006-05-29 05:43:26 +00:00
Kris Kennaway	80a8e5da94	Correct typos MFC after: 2 weeks	2006-05-28 22:15:28 +00:00
Robert Watson	4bb260ad78	In execve(), audit the path name being executed. In the future, it would also be good to audit the interpreter pathname, if any. Obtained from: TrustedBSD Project	2006-05-28 08:28:47 +00:00
Diomidis Spinellis	0e1c7fb8ea	Add missing % signs in the lock annotations of the functions: lookup, rename, strategy, islocked The missing % sign meant that the lines were processed as plain comments and the corresponding assertions were never generated.	2006-05-28 07:24:12 +00:00
Xin LI	e38c7f3ef3	extlen and cpp are not used here in linker_search_kld(), so nuke them. Reported by: Mingyan Guo <guomingyan at gmail dot com> MFC After: 2 weeks	2006-05-27 09:21:41 +00:00
Poul-Henning Kamp	9dd2370db6	If the console has no cncheckc method, use cngetc instead.	2006-05-26 11:00:20 +00:00
Poul-Henning Kamp	8aed7613bd	Don't use CONS_DRIVER() macro to insert dummy element in cons_set	2006-05-26 10:46:38 +00:00
Poul-Henning Kamp	16b1613a31	GC the cn_dbctl_t hook for consoles, it is unused. This used to make syscons switch to vty0 when we entered DDB but this was lost in the KDB shuffle. We may want to bring it back down the road but it should be done by calling cn_init_t/cn_term_t instead, possibly with a flag argument saying "Debugger!"	2006-05-26 10:24:00 +00:00
Craig Rodrigues	0c89bb0a02	Add "update" mount option to global_opts array, for use with vfs_filteropt().	2006-05-26 02:38:48 +00:00
Craig Rodrigues	5eb304a91a	Remove calls to vfs_export() for exporting a filesystem for NFS mounting from individual filesystems. Call it instead in vfs_mount.c, after we call VFS_MOUNT() for a specific filesystem.	2006-05-26 00:32:21 +00:00
Robert Watson	20bdac8a4f	Use getsock() and fput() instead of fgetsock() and fputsock() in sendfile(). This causes sendfile() to use the file descriptor reference to the socket instead of bumping the socket reference count, which avoids an additional refcount operation, as well as a potentially expensive socket refcount drop, which can lead to contention on the accept mutex. This change also has the side effect of further reducing the number of cases where an in-progress I/O operation can occur on a socket after close, as using the file descriptor refcount prevents the socket from closing while in use. MFC after: 3 months	2006-05-25 15:10:13 +00:00
Stephan Uphoff	dcf67e65d2	Do not set B_NOCACHE on buffers when releasing them in flushbuflist(). If B_NOCACHE is set the pages of vm backed buffers will be invalidated. However clean buffers can be backed by dirty VM pages so invalidating them can lead to data loss. Add support for flushing dirty pages in the data invalidation functions of some network file systems. This fixes data losses during vnode recycling (and other code paths using invalbuf(,V_SAVE,,*)) for data written using an mmaped file. Collaborative effort by: jhb@,mohans@,peter@,ps@,ups@ Reviewed by: tegge@ MFC after: 7 days	2006-05-25 01:00:35 +00:00
Sam Leffler	75b773ae3d	When starting up threads in taskqueue_start_threads create them stopped before adjusting their priority and setting them on the run q so they cannot race for resources (pointed out by njl). While here add a console printf when thread creation fails; otherwise no one may notice (e.g. return value is always 0 and caller has no way to verify). Reviewed by: jhb, scottl MFC after: 2 weeks	2006-05-24 22:11:07 +00:00
David Xu	f705bbe8b1	Don't allow non-root user to set a scheduler policy, otherwise this could be a local DOS. Submitted by: Diane Bruce at db at db.net	2006-05-21 00:40:38 +00:00
David Xu	f6c040a2c5	Style fixes. Submitted by: Diane Bruce < db at db dot net >	2006-05-19 06:37:24 +00:00
David Xu	7b8d821268	Move flag TDF_UMTXQ into structure umtxq, this eliminates the requirement of scheduler lock in some umtx code.	2006-05-18 08:43:46 +00:00
Poul-Henning Kamp	d595182f0b	Make the printfs relating to purging threads from a device less intrusive.	2006-05-17 06:37:14 +00:00
Poul-Henning Kamp	c40da00ca3	Since DELAY() was moved, most <machine/clock.h> #includes have been unnecessary.	2006-05-16 14:37:58 +00:00
Paul Saab	6befa6ae1b	Allow concurrent read(2)/readv(2) access to a file. Lock file offset against multiple read calls. Submitted by: ups Obtained from: Yahoo! MFC after: 2 weeks	2006-05-16 07:50:54 +00:00
Kelly Yancey	c9ad8a67af	Restore the ability to mount procfs and fdescfs filesystems via the mount(2) system call: * Add cmount hook to fdescfs and pseudofs (and, by extension, procfs and linprocfs). This (mostly) restores the ability to mount these filesystems using the old mount(2) system call (see below for the rest of the fix). * Remove not-NULL check for the data argument from the mount(2) entry point. Per the mount(2) man page, it is up to the individual filesystem being mounted to verify data. Or, in the case of procfs, etc. the filesystem is free to ignore the data parameter if it does not use it. Enforcing data to be not-NULL in the mount(2) system call entry point prevented passing NULL to filesystems which ignored the data pointer value. Apparently, passing NULL was common practice in such cases, as even our own mount_std(8) used to do it in the pre-nmount(2) world. All userland programs in the tree were converted to nmount(2) long ago, but I've found at least one external program which broke due to this (presumably unintentional) mount(2) API change. One could argue that external programs should also be converted to nmount(2), but then there isn't much point in keeping the mount(2) interface for backward compatibility if it isn't backward compatible.	2006-05-15 19:42:10 +00:00
Benno Rice	77fe443878	The VERBOSE_SYSINIT stuff sees the DDB define a lot better if we include opt_ddb.h. Spotted by: benno Pointy hat to: benno	2006-05-14 07:11:28 +00:00
Craig Rodrigues	5250012a1d	For nmount(), if "rw" is specified as a mount option, add "noro" to the list of mount options. This allows a read-only mount to be converted to read-write via: mount -u -o rw Requested by: kris	2006-05-14 01:51:38 +00:00
John Baldwin	73dbd3da73	Remove various bits of conditional Alpha code and fixup a few comments.	2006-05-12 05:04:46 +00:00
Benno Rice	26ab616fdc	Add a new kernel config option, VERBOSE_SYSINIT. When porting FreeBSD to a new platform, one of the more useful things to do is get mi_startup() to let you know which SYSINIT it's up to. Most people tend to whack a printf in the SYSINIT loop to print the address of the function it's about to call. Going one better, jhb made a version that uses DDB to look up the name of the function and print that instead. This version is essentially his with the addition of some ifdeffery to make it optional and to allow it to work (although using only the function address, not the symbol) if you forgot to enable DDB. All the cool bits by: jhb Approved by: scottl, rink, cognet, imp	2006-05-12 02:01:38 +00:00
Poul-Henning Kamp	99ab8292c7	Remove more straggling CPU_ macro references	2006-05-11 17:53:26 +00:00
David Xu	005efcdb0e	Use wakeup_one to avoid thundering herd. Tested by: kris	2006-05-09 13:00:46 +00:00
David Xu	759ccccadb	Use a dedicated mutex to protect aio queues; the motivation is to reduce lock contention with other parts.	2006-05-09 00:10:11 +00:00
Tor Egge	11991ab418	Call vn_finished_write() before calling the coredump handler which will indirectly call vn_start_write() as necessary for each write.	2006-05-07 22:50:22 +00:00
Tor Egge	d302786c87	Temporarily unlock vnode for new image being executed to avoid lock order reversals that can lead to deadlocks. Normally vn_close(), namei() or vrele() should not be called while holding vnode locks.	2006-05-05 20:25:05 +00:00
Pawel Jakub Dawidek	643df192de	vn_start_write()/vn_finished_write() is not needed here, because vn_start_write() is always called earlier in the code path and calling the function recursively may lead to a deadlock. Confirmed by: tegge MFC after: 2 weeks	2006-04-29 21:57:38 +00:00
Kris Kennaway	cef31ff7d9	Lock giant when assigning ni_vp and keep vfslocked state valid. Committed for: jeff	2006-04-29 07:13:49 +00:00
Pawel Jakub Dawidek	122410eea2	vn_start_write() is called only when v_type != VCHR, so corresponding vn_finished_write() should also be called only then. BTW. I fixed two functions here: vn_rdwr() and vn_write(). The latter seems to be unused. MFC after: 3 weeks	2006-04-28 21:54:05 +00:00
Robert Watson	3bf14fd5e9	Also check use_pty in the ptmx clone lookup; this means that when ptmx support is turned off using the sysctl, we no longer even allow the ptmx device to be looked up. Foot provided by: peter	2006-04-28 21:39:57 +00:00
Marcel Moolenaar	8f405ed335	Remove the puc-specific hacks. The puc(4) driver now properly uses the rman(9) interface.	2006-04-28 21:23:09 +00:00
Jeff Roberson	6ca9fcc586	- Add a BO_NEEDSGIANT flag to the bufobj. This flag forces all child buffers to go on the buf daemon's DIRTYGIANT queue. - Set BO_NEEDSGIANT on ffs's devvp since the ffs_copyonwrite handler runs in the context of the buf daemon and may require Giant.	2006-04-28 01:05:31 +00:00
Jeff Roberson	4b5b86816c	- Consistently track ni_dvp and ni_vp with dvfslocked and vfslocked rather than trying to optimize it into a single lock. This adds more calls to lock giant with non smpsafe filesystems but is the only way to reliably hold the correct lock. - Remove an invalid assert in the mountedhere case in lookup and fix the code to properly deal with the scenario. We can actually have a lookup that returns dp == dvp with mountedhere set with certain unmount races. Tested by: kris Reported by: kris/mohans	2006-04-28 00:59:48 +00:00
John-Mark Gurney	5c06d111b8	back out for now... revert ccpu to being kern.ccpu...	2006-04-27 17:57:59 +00:00
John-Mark Gurney	c71ce6a445	move remaining sysctl into the kern.sched tree...	2006-04-26 19:42:38 +00:00
John Baldwin	ae110b53d1	Add some new commands to hopefully make it easier to diagnose lock-related problems in ddb: - "show threadchain [thread]" will start with the specified thread (or the current kdb thread by default) and show its state. If it is blocked on a lock, it will find the owner of the lock and show its state, etc. - "show allchains" will find all of the threads that are blocked on a lock (but do not have any threads blocked on a lock they hold) and show the resulting thread chain. - "show lockchain <lock>" takes a pointer to a lock_object (such as a mutex or rwlock). If there is a turnstile for that lock, then it will display all the threads blocked on the lock. In addition, for each thread blocked on the lock, it will display any contested locks they hold, and recurse on those locks to show any threads blocked on those locks, etc.	2006-04-25 20:28:17 +00:00
John Baldwin	de833b7c0c	Use db_lookup_thread() to lookup the thread for the passed in address and change 'show locks' to only list the locks for a given thread rather than for all the threads in the process containing a specified thread.	2006-04-25 20:24:23 +00:00
Marius Strobl	fa63296aba	Remove last vestiges of sab(4).	2006-04-25 19:43:53 +00:00
Robert Watson	102ea03373	Extend getsock() to return the struct file flags read while holding the file lock, in the style of fgetsock(). Modify accept1() to use getsock() instead of fgetsock(), relying on the file descriptor reference rather than an acquired socket reference to prevent the listen socket from being destroyed during accept(). This avoids additional reference count operations, which should improve performance, and also avoids accept1() operating on a socket whose file descriptor has been torn down, which may have resulted in protocol shutdown starting. MFC after: 3 months	2006-04-25 11:48:16 +00:00
Maxim Konovalov	481f8fe85f	Inherit LOCAL_CREDS option from listen socket for sockets returned by accept(2). PR: kern/90644 Submitted by: Andrey Simonenko OK'ed by: mdodd Tested by: NetBSD regress/sys/kern/unfdpass/unfdpass.c MFC after: 1 month	2006-04-24 19:09:33 +00:00
Marcel Moolenaar	845652dd28	MFp4: Add the ipend() method to the serdev I/F to allow umbrella drivers to obtain pending interrupt status from subordinate drivers.	2006-04-23 22:12:39 +00:00
Robert Watson	0cec9959e8	Assert that sockets passed into soabort() not be SQ_COMP or SQ_INCOMP, since that removal should have been done a layer up. MFC after: 3 months	2006-04-23 18:15:54 +00:00
Robert Watson	28ea180136	Add missing 'not' to SQ_COMP comment. MFC after: 3 months	2006-04-23 15:37:23 +00:00
Robert Watson	6ca35d4b81	Move handling of SQ_COMP exception case in sofree() to the top of the function along with the remainder of the reference checking code. Move comment from body to header with remainder of comments. Inclusion of a socket in a completed connection queue counts as a true reference, and should not be handled as an under-documented edge case. MFC after: 3 months	2006-04-23 15:33:38 +00:00
John Baldwin	f9ab2f134f	Print td_name instead of p_comm if td_name is non-empty for 'show turnstile' and 'show sleepq'.	2006-04-21 20:40:43 +00:00
Paul Saab	95f16c1e2c	Don't try to kill embryonic processes in killpg1(). This prevents a race condition between fork() and kill(pid,sig) with pid < 0 that can cause a kernel panic. Submitted by: up MFC after: 3 weeks	2006-04-21 19:26:21 +00:00
Paul Saab	4f590175b7	Allow for nmbclusters and maxsockets to be increased via sysctl. An eventhandler is used to update all the various zones that depend on these values.	2006-04-21 09:25:40 +00:00
John-Mark Gurney	be4db476a6	const'ify resource_spec to note that we won't be changing anything while releasing resources... also, NULL out the resources as we free them...	2006-04-20 01:44:16 +00:00
Warner Losh	0385d64761	r_spare1 and r_spare2 aren't needed. They aren't used. They can't be accessed from outside of subr_rman.c. Remove them. Reviewed by: jmg (in theory)	2006-04-19 21:25:55 +00:00
John Baldwin	fea3efe5bf	Implement rw_try_upgrade() and rw_downgrade(). rw_try_upgrade() makes a single attempt at upgrading a read lock to a write lock, and rw_downgrade() converts curthread's write lock into a read lock.	2006-04-19 21:06:52 +00:00
Wojciech A. Koszek	5884c1a098	'owner' is not used without SMP. Fix kernel build for such kernel configurations. Approved by: jhb	2006-04-18 20:32:42 +00:00
John Baldwin	efa86db61d	Adaptively spin before blocking on the turnstile if an rwlock is write locked. In general the adaptive spinning is similar to the same code for mutexes with some extra trickiness in rw_wunlock_hard(). Specifically, even though both wait bits might be set and we might have a turnstile with at least one waiting thread, there might not be any threads blocked on the queue we are not waking up (they might all be spinning), and we should only preserve the waiting flag for the queue we aren't waking up if there are in fact threads blocked on that queue. Secondly, there might not be any threads blocked on the queue we have chosen to waken threads from (there might only be threads blocked on the other queue and the threads for this queue are all spinning) in which case we disown the turnstile instead of doing a broadcast and unpend.	2006-04-18 18:27:54 +00:00
John Baldwin	f1a4b852dc	- Bring back turnstile_empty() which can check to see if an individual queue on a turnstile is empty. - Add a turnstile_disown() function that allows a thread to give up ownership of a turnstile w/o waking up any waiters.	2006-04-18 18:16:54 +00:00
Xin LI	4207c279d4	In vfs_hash_get(): mount point should never be changed so explicitly constify the mp parameter. Reviewed by: phk	2006-04-18 08:05:08 +00:00
John Baldwin	38bf165fa1	- Add a rw_wowner() macro that just returns the owner of a write lock and use it in places that only care about the write owner instead of rw_owner() as a baby step towards limited read-lock owner. - Tidy the code that sets the WAITER flag bits to not duplicate a test around the atomic operation and the KTR trace in both of the lock functions.	2006-04-17 21:11:01 +00:00
John Baldwin	32553b153e	Add a 'show sleepqueue' alias for 'show sleepq' in DDB.	2006-04-17 20:16:32 +00:00
John Baldwin	964b557211	Trim trailing whitespace.	2006-04-17 20:14:51 +00:00
John Baldwin	2971c36136	Add a new module_file() function that returns the linker_file_t associated with a given module_t. I use this in the MOD_LOAD event handler for some test kernel modules to ask the kernel linker to look up the linker sets in my test modules. (I use linker sets to generate the list of possible events that I then signal to execute via a sysctl. On non-amd64, ld(8) would resolve the entire linker set, but on amd64 I have to ask the kernel linker to do it for me, and having the kernel linker do it works on all archs.)	2006-04-17 19:44:44 +00:00
John Baldwin	0f180a7cce	Change msleep() and tsleep() to not alter the calling thread's priority if the specified priority is zero. This avoids a race where the calling thread could read a snapshot of its current priority, then a different thread could change the first thread's priority, then the original thread would call sched_prio() inside msleep() undoing the change made by the second thread. I used a priority of zero as no thread that calls msleep() or tsleep() should be specifying a priority of zero anyway. The various places that passed 'curthread->td_priority' or some variant as the priority now pass 0.	2006-04-17 18:20:38 +00:00
John-Mark Gurney	e98b5a89de	remove duplicate sizeof vnode entry (debug.sizeof.vnode already existed)... move ncsize into debug.sizeof and rename to namecache...	2006-04-16 18:38:30 +00:00
Scott Long	bb141be10a	Take a better stab at making this compile.	2006-04-15 18:54:56 +00:00
Scott Long	83bc5d54c8	Take a stab at making this compile.	2006-04-15 18:04:04 +00:00
John Baldwin	76447e5618	Mark the thread pointer used during an adaptive spin volatile so that the compiler doesn't decide to cache td_state. Caching the state would cause the spinning thread to not notice when the owning thread stopped executing (if it was preempted for example) which could result in livelock.	2006-04-14 19:51:50 +00:00
John Baldwin	a29b4f6eec	Drop the kqueue global mutex as soon as we are finished with it rather than keeping it locked until we exit the function to optimize the case where the lock would be dropped and later reacquired. The optimization was broken when kevent's were moved from UFS to VFS and the knote list lock for a vnode kevent became the lockmgr vnode lock. If one tried to use a kqueue that contained events for a kqueue fd followed by a vnode, then the kq global lock would end up being held when the vnode lock was acquired which could result in sleeping with a mutex held (and subsequent panics) if the vnode lock was contested. Reviewed by: jmg Tested by: ps (on 6.x) MFC after: 3 days	2006-04-14 14:27:28 +00:00
David Xu	cfd6f8cd6c	Clear TDF_SINTR in sleepq_resume_thread, also sleepq_catch_signal does not need to clear it now, this should fix panic when msleep is recursively called. Patch is slightly adjusted after review. Reviewed by: jhb Tested by: Csaba Henk, csaba-ml at creo.hu MFC after: 3 days	2006-04-13 23:29:25 +00:00
John Baldwin	9477358d00	Turn on ithread_destroy() and call it from intr_event_destroy() to tear down an interrupt event's associated thread (if it has one).	2006-04-13 17:29:04 +00:00
Christian S.J. Peron	d5e5634075	Kill the last Giant acquisition in the exit(2) code. This Giant acquisition doesn't appear to be protecting anything. Most consumers of funsetownlst(9) do not appear to be picking up Giant anywhere. This was originally a part of my Giant exit(2) clean up revision 1.272 but I thought it was a good idea to leave it out until we were able to analyze it better. Tested by: kris MFC after: 3 weeks	2006-04-10 14:07:28 +00:00
Pawel Jakub Dawidek	0909f38a3c	On shutdown try to turn off all swap devices. This way GEOM providers are properly closed on shutdown. Requested by: ru Reviewed by: alc MFC after: 2 weeks	2006-04-10 10:03:41 +00:00
David Xu	e631cff309	Use proc lock to prevent a thread from exiting, Giant was no longer used to protect thread list.	2006-04-10 04:55:59 +00:00
Robert Watson	d37b79a00f	Remove UNIX domain socket raw socket support. This feature is documented as being undocumented in Stevens, and was broken in 1997 during network stack infrastructure work. It is the one remaining (and incorrect) direct protocol reference to raw_usrreq.pru_attach; this is incorrect because the raw socket code assumes that raw_uattach is called only after the protocol has allocated a PCB. MFC after: 3 months	2006-04-09 16:29:47 +00:00
Marcel Moolenaar	07c8931358	Add the scc_hwmtx spin mutex, defined by scc(4).	2006-04-07 22:15:54 +00:00
John-Mark Gurney	1c4ca5e5fe	spell unlock correctly, this is relatively minor as it's rare someone would provide a lock method, and want the default unlock, but it is a bug... PR: 95356 Submitted by: Stephen Corteselli MFC after: 3 days	2006-04-07 17:21:27 +00:00
Jeff Roberson	b53bf1269c	- VFS_LOCK_GIANT when recycling a vnode via getnewvnode. We may be recycling for an unrelated filesystem. I really don't like potentially acquiring giant in the context of a giantless filesystem but there are reasonable objections to removing the recycling from this path. Sponsored by: Isilon Systems, Inc.	2006-04-04 06:46:10 +00:00
Jeff Roberson	4b24e4210e	- Properly check against B_DELWRI and B_NEEDSGIANT. This check was incorrectly written and caused some !NEEDSGIANT buffers to be put in the NEEDSGIANT queue. Sponsored by: Isilon Systems, Inc.	2006-04-04 06:44:21 +00:00
Marcel Moolenaar	39eb1d1263	Increment kdb_active after we stopped the other CPUs and decrement kdb_active before we restart them. This avoids false positives on restarted CPUs when they test for kdb_active while kdb_trap() is still finishing up.	2006-04-04 00:40:20 +00:00
Marcel Moolenaar	bfcdefd8aa	Eliminate HAVE_STOPPEDPCBS. On ia64 the PCPU holds a pointer to the PCB in which the context of stopped CPUs is stored. To access this PCB from KDB, we introduce a new define, called KDB_STOPPEDPCB. The definition, when present, lives in <machine/kdb.h> and abstracts where MD code saves the context. Define KDB_STOPPEDPCB on i386, amd64, alpha and sparc64 in accordance to previous code.	2006-04-03 22:51:47 +00:00
Peter Wemm	b9eee07e36	Remove the unused sva and eva arguments from pmap_remove_pages().	2006-04-03 21:16:10 +00:00
Marcel Moolenaar	5991a4f811	In kdb_trap(), change the type of the local variable 'intr' from int to register_t, as intr_disable() returns the latter and register_t may be wider than int. Pointed out by: marius@	2006-04-03 20:55:52 +00:00
Marcel Moolenaar	2fae8f5aed	Replace critical_enter() and critical_exit() in kdb_trap() with intr_disable() and intr_restore() resp. Previously, critical regions would have interrupts disabled, but that was changed. Consequently, the debugger could run with interrupts enabled. This could cause problems for the low-level console code where received characters would trigger an interrupt that causes the interrupt handler to read the character instead of the cngetc() function.	2006-04-03 17:48:09 +00:00
John-Mark Gurney	5e6125891f	mask out any action when copying the flags from the event to the knote.. Pointed out by: Václav Haisman Submitted by: Dan Nelson (slightly modified patch) MFC after: 3 days	2006-04-01 20:15:39 +00:00
Robert Watson	bc725eafc7	Change protocol switch method pru_detach() so that it returns void rather than an error. Detaches do not "fail", they either occur or the protocol flags SS_PROTOREF to take ownership of the socket. soclose() no longer looks at so_pcb to see if it's NULL, relying entirely on the protocol to decide whether it's time to free the socket or not using SS_PROTOREF. so_pcb is now entirely owned and managed by the protocol code. Likewise, no longer test so_pcb in other socket functions, such as soreceive(), which have no business digging into protocol internals. Protocol detach routines no longer try to free the socket on detach, this is performed in the socket code if the protocol permits it. In rts_detach(), no longer test for rp != NULL in detach, and likewise in other protocols that don't permit a NULL so_pcb, reduce the incidence of testing for it during detach. netinet and netinet6 are not fully updated to this change, which will be in an upcoming commit. In their current state they may leak memory or panic. MFC after: 3 months	2006-04-01 15:42:02 +00:00
Robert Watson	ac45e92ff2	Change protocol switch pru_abort() API so that it returns void rather than an int, as an error here is not meaningful. Modify soabort() to unconditionally free the socket on the return of pru_abort(), and modify most protocols to no longer conditionally free the socket, since the caller will do this. This commit likely leaves parts of netinet and netinet6 in a situation where they may panic or leak memory, as they are not fully updated by this commit. This will be corrected shortly in followup commits to these components. MFC after: 3 months	2006-04-01 15:15:05 +00:00
Robert Watson	fa4c5373ce	Add comment to accept1() that it should use getsock() instead of fgetsock() to avoid additional mutex operations, and also to avoid use of soref/sorele which are now not preferred. MFC after: 3 months	2006-04-01 11:14:56 +00:00
Robert Watson	197b35d717	Mark fgetsock() and fputsock() as deprecated: callers should rely on the file descriptor reference, rather than paying additional lock operations to acquire a socket reference from the file descriptor. This will also help to ensure that file descriptor based socket requests are not delivered to a socket after close. Most consumers have already been converted to this model. MFC after: 3 months	2006-04-01 11:09:54 +00:00
Robert Watson	7f689de232	Assert so->so_pcb is NULL in sodealloc() -- the protocol state should not be present at this point. We will eventually remove this assert because the socket layer should never look at so_pcb, but for now it's a useful debugging tool. MFC after: 3 months	2006-04-01 10:45:52 +00:00
Robert Watson	220c1357ed	Add a somewhat sizable comment documenting the semantics of various kernel socket calls relating to the creation and destruction of sockets. This will eventually form the foundation of socket(9), but is currently in too much flux to do so. MFC after: 3 months	2006-04-01 10:43:02 +00:00
Jeff Roberson	0af2472199	- Add an assert to vgone. It is illegal to call vgone without a reference to the vnode. Without a reference the vnode will never be vdestroy'd and the memory will never be reclaimed. Sponsored by: Isilon Systems, Inc.	2006-03-31 23:39:26 +00:00
Jeff Roberson	ba5eb429e3	- When there are dangling vnodes at unmount print them before we panic. Sponsored by: Isilon Systems, Inc.	2006-03-31 23:38:15 +00:00
Jeff Roberson	3bbd6d8ae6	- Release the references acquired by VOP_GETWRITEMOUNT and vfs_getvfs(). Discussed with: tegge Tested by: kris Sponsored by: Isilon Systems, Inc.	2006-03-31 03:54:20 +00:00
Jeff Roberson	94bc95db3c	- Hold a reference from the time vfs_busy starts until vfs_unbusy is called. - vfs_getvfs has to return a reference to prevent the returned mountpoint from changing identities. - Release references acquired via vfs_getvfs. Discussed with: tegge Tested by: kris Sponsored by: Isilon Systems, Inc.	2006-03-31 03:53:25 +00:00
Jeff Roberson	c5fcce21c5	- GETWRITEMOUNT now returns a referenced mountpoint to prevent its identity from changing. This is possible now that mounts are not freed. Discussed with: tegge Tested by: kris Sponsored by: Isilon Systems, Inc.	2006-03-31 03:52:24 +00:00
Jeff Roberson	a218edceb2	- Allocate mounts from a uma zone that uses UMA_ZONE_NOFREE to prevent mount memory from being reclaimed. This resolves a number of race conditions described in vfs_default.c and introduced with the VFS_LOCK_GIANT macros. - Let the mtx and lock remain valid after the mount structure has been freed by using init and fini calls. Technically fini will never be called but is included for completeness. - Consistently use lockmgr directly rather than lockmgr to lock and vfs_unbusy to unlock. Discussed with: tegge Tested by: kris Sponsored by: Isilon Systems, Inc.	2006-03-31 03:49:51 +00:00
Jeff Roberson	fdf86b2dcd	- LK_RETRY means nothing when passed to VOP_LOCK. Call vn_lock instead. - Move the vn_lock of the dvp until after we've unbusied the filesystem to avoid a LOR with the mount point lock. - In the v_mountedhere while loop we acquire a new instance of giant each time through without releasing the first. This would cause us to leak Giant. Sponsored by: Isilon Systems, Inc.	2006-03-31 02:59:23 +00:00
Jeff Roberson	084d64ac21	- Add the B_NEEDSGIANT flag which is only set if the vnode that owns a buf requires Giant. It is set in bgetvp and cleared in brelvp. - Create QUEUE_DIRTY_GIANT for dirty buffers that require giant. - In the buf daemon, only grab giant when processing QUEUE_DIRTY_GIANT and only if we think there are buffers in that queue. Sponsored by: Isilon Systems, Inc.	2006-03-31 02:56:30 +00:00
Sam Leffler	00537061dd	fixup error handling in taskqueue_start_threads: check for kthread_create failing, print a message when we fail for some reason as most callers do not check the return value (e.g. 'cuz they're called from SYSINIT) Reviewed by: scottl MFC after: 1 week	2006-03-30 23:06:59 +00:00
Pawel Jakub Dawidek	177a987379	Fix a panic on sparc64 related to improper alignment - we cannot assume that 'unsigned char *' argument is 4 byte aligned. MFC after: 3 days	2006-03-30 18:45:50 +00:00
Marcel Moolenaar	6174e6ed12	Add scc(4), a driver for serial communications controllers. These controllers typically have multiple channels and support a number of serial communications protocols. The scc(4) driver is itself an umbrella driver that delegates the control over each channel and mode to a subordinate driver (like uart(4)). The scc(4) driver supports the Siemens SAB 82532 and the Zilog Z8530 and replaces puc(4) for these devices.	2006-03-30 18:33:22 +00:00
Paul Saab	fbb273bc05	Properly support FreeBSD 4 32bit System V shared memory. Submitted by: peter Obtained from: Yahoo! MFC after: 3 weeks	2006-03-30 07:42:32 +00:00
John Baldwin	4b3b0413d2	Always explicitly panic in propogate_priority() if we try to propagate a lock's priority to a sleeping thread. When we panic, dump a stack trace of the thread that is asleep if DDB is compiled into the kernel just before calling panic(). This is much more informative and useful for debugging than the current behavior of getting a page fault and not having an easy way of determining which thread caused the original problem. MFC after: 1 week	2006-03-29 23:24:55 +00:00
John-Mark Gurney	4e095bc045	hold the list lock over the f_event and KNOTE_ACTIVATE calls... This closes a race where data could come in before we clear the INFLUX flag, and get skipped over by knote (and hence never be activated, though it should have been)... Found by: glebius & co. Reviewed by: glebius MFC after: 3 days	2006-03-29 18:15:30 +00:00
John Baldwin	33f19bee6f	- Conditionalize Giant around VFS operations for ALQ, ktrace, and generating a coredump as the result of a signal. - Fix a bug where we could leak a Giant lock if vn_start_write() failed in coredump(). Reported by: jmg (2)	2006-03-28 21:30:22 +00:00
John Baldwin	11178ee4c1	Conditionalize locking of Giant for VFS in acct(2). We already conditionally acquired Giant in the other parts of the accounting code.	2006-03-28 21:26:59 +00:00
John Baldwin	861dab08e7	Change vn_open() to honor the MPSAFE flag in the passed in nameidata object and use that instead of testing fdidx against -1 to determine if it should release Giant if Giant was locked due to the requested file residing on a non-MPSAFE VFS. Discussed with: jeff	2006-03-28 21:22:08 +00:00
Dag-Erling Smørgrav	867c089bc7	Revert previous commit at davidxu's insistence. Instead, use __DECONST (argh!) and rearrange the prototypes to make it clear that _umtx_op() is not deprecated.	2006-03-28 14:32:38 +00:00
Dag-Erling Smørgrav	b3efbabe87	The undocumented and deprecated system call _umtx_op() takes two pointer arguments. The first one is never used (all callers pass in 0); the second is sometimes used to pass in a struct timespec * which is used as a timeout and never modified. Constify that argument so callers can pass a const struct timespec * without jumping through hoops.	2006-03-28 09:18:34 +00:00
Alan Cox	7c8dcf2def	Use NET_LOCK_GIANT() and VFS_LOCK_GIANT() instead of unconditionally acquiring Giant in kern_sendfile(). Guard against the forced reclamation of a vnode in kern_sendfile(). Discussed with: jeff Reviewed by: tegge MFC after: 3 weeks	2006-03-27 04:23:16 +00:00
Robert Watson	63b01ffd34	Add a sysctl, regression.sonewconn_earlytest, which when options REGRESSION is enabled, allows user space to dictate that sonewconn() should skip its "skip the hard work" check to see if the listen queue is full, and instead proceed with allocation of a socket and trimming of the overflowed queue. This makes it easier to test the queue overflow logic. MFC after: 1 month	2006-03-26 22:44:37 +00:00
Joseph Koshy	49874f6ea3	MFP4: Support for profiling dynamically loaded objects. Kernel changes: Inform hwpmc of executable objects brought into the system by kldload() and mmap(), and of their removal by kldunload() and munmap(). A helper function linker_hwpmc_list_objects() has been added to "sys/kern/kern_linker.c" and is used by hwpmc to retrieve the list of currently loaded kernel modules. The unused `MAPPINGCHANGE' event has been deprecated in favour of separate `MAP_IN' and `MAP_OUT' events; this change reduces space wastage in the log. Bump the hwpmc's ABI version to "2.0.00". Teach hwpmc(4) to handle the map change callbacks. Change the default per-cpu sample buffer size to hold 32 samples (up from 16). Increment __FreeBSD_version. libpmc(3) changes: Update libpmc(3) to deal with the new events in the log file; bring the pmclog(3) manual page in sync with the code. pmcstat(8) changes: Introduce new options to pmcstat(8): "-r" (root fs path), "-M" (mapfile name), "-q"/"-v" (verbosity control). Option "-k" now takes a kernel directory as its argument but will also work with the older invocation syntax. Rework string handling in pmcstat(8) to use an opaque type for interned strings. Clean up ELF parsing code and add support for tracking dynamic object mappings reported by a v2.0.00 hwpmc(4). Report statistics at the end of a log conversion run depending on the requested verbosity level. Reviewed by: jhb, dds (kernel parts of an earlier patch) Tested by: gallatin (earlier patch)	2006-03-26 12:20:54 +00:00
David Xu	dbbccfe923	1. Move code for scanning pending I/O from aio_fsync to aio_aqueue, it has less overhead. 2. Avoid scheduling task if maximum number of I/O threads is reached.	2006-03-24 00:50:06 +00:00
David Xu	177e987e63	Regenerate.	2006-03-23 08:48:37 +00:00
David Xu	99eee864ad	Implement aio_fsync() syscall.	2006-03-23 08:46:42 +00:00
Pawel Jakub Dawidek	96c0381f5c	Destroy "bip" bio in error case. Found by: Coverity Prevent analysis tool Coverity ID: 795 MFC after: 3 days	2006-03-22 00:42:41 +00:00
Jeff Roberson	bacb51fb67	- Remove explicit giant acquires and replace it with VFS_LOCK_GIANT. Sponsored by: Isilon Systems, Inc.	2006-03-22 00:00:05 +00:00
Jeff Roberson	77c79550af	- Remove explicit calls to lock and unlock Giant and replace them with VFS_LOCK_GIANT/VFS_UNLOCK_GIANT calls. This completely removes Giant acquisition in the syscall path for ffs. Bug fix to kern_fhstatfs from: Todd Miller <Todd.Miller@sparta.com> Sponsored by: Isilon Systems, Inc.	2006-03-21 23:58:37 +00:00

9509 Commits