The previous commit also included changes to all the system call lists,
but it is a tradition to update these lists in a second commit, so rerun
make sysent to update the $FreeBSD$ tags inside these files to refer to
the latest version of syscalls.master.
Requested by: rwatson
For the last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:
- Improved driver model:
The old TTY layer has a driver model that is not abstract enough to
make it friendly to use. A good example is the output path, where the
device drivers directly access the output buffers. This means that an
in-kernel PPP implementation must always convert network buffers into
TTY buffers.
If a PPP implementation were built on top of the new TTY layer
(it still needs a hooks layer, though), it could hand the data
directly to the TTY driver.
- Improved hotplugging:
With the old TTY layer, it isn't entirely safe to destroy TTYs from
the system. This implementation has a two-step destruction design,
where the driver first abandons the TTY. After all threads have left
the TTY, the TTY layer calls a routine in the driver, which can be
used to free resources (unit numbers, etc); see the sketch after
this list.
The pts(4) driver also implements this feature, which means
posix_openpt() will now return PTYs that are created on the fly.
- Improved performance:
One of the major improvements is the per-TTY mutex, which is expected
to improve scalability when compared to the old Giant locking.
Another change is the unbuffered copying to userspace, which is used
on both TTY device nodes and PTY masters.
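A minimal sketch of the two-step destruction mentioned above. The
foo_* driver names, softc layout, unit number allocator and malloc
type are hypothetical; tty_rel_gone() and the tsw_free callback are
the interfaces the new layer provides:

    static void
    foo_detach(struct foo_softc *sc)
    {

            /* Step 1: abandon the TTY; no new threads can enter it. */
            tty_rel_gone(sc->sc_tty);
    }

    static void
    foo_tsw_free(void *softc)
    {
            struct foo_softc *sc = softc;

            /* Step 2: runs once the last thread has left the TTY. */
            free_unr(foo_unrhdr, sc->sc_unit);
            free(sc, M_FOO);
    }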
Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.
Obtained from: //depot/projects/mpsafetty/...
Approved by: philip (ex-mentor)
Discussed: on the lists, at BSDCan, at the DevSummit
Sponsored by: Snow B.V., the Netherlands
dcons(4) fixed by: kan
virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course
of the next few weeks.
Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.
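As a hedged illustration of what the mapping macros look like (the
variable chosen here is only an example):

    /* The global itself is untouched for now... */
    extern int      ipforwarding;
    /* ...and a macro maps the virtualized name back onto it. */
    #define V_ipforwarding  ipforwarding

    /* Consumers are rewritten to spell the virtualized name: */
    if (V_ipforwarding == 0)
            return;         /* host is not acting as a router */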
Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch
- Speed up the lock-ordering lookup by turning the witness graph from a
linked tree into a matrix. A lookup table caches the lock orderings so
that they can be accessed in O(1). Every witness object has a unique
index within this lookup cache table (see the sketch after this list).
- Reduce the lock contention on w_mtx by acquiring it only when a LOR
actually happens and not in the sane case. In order to do this, don't
totally flush the lock lists (the per-CPU spinlock list and the
per-thread sleeplock list), but check ll_count whenever we need to
verify allocation sanity.
- Introduce the function witness_thread_exit() in the witness namespace,
which verifies that a thread doesn't hold any witness occurrence while
exiting.
- Rename the sysctl debug.witness.graphs to debug.witness.fullgraph and
add debug.witness.badstacks, which prints out the stacks for revealed
LORs. This is implemented using the stack(9) support, which makes
WITNESS depend on the STACK option or on the DDB option (which
includes STACK).
- Fix style(9) for src/sys/kern/subr_witness.c
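An illustrative sketch of the O(1) lookup mentioned in the first item;
the table and field names here are placeholders, not necessarily the
ones used in subr_witness.c:

    /* Every witness is assigned a unique index at initialization. */
    static unsigned char    w_order_cache[WITNESS_COUNT][WITNESS_COUNT];

    static int
    witness_order_cached(struct witness *parent, struct witness *child)
    {

            /* One table lookup instead of walking a linked tree. */
            return (w_order_cache[parent->w_index][child->w_index] != 0);
    }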
The hash table approach was developed by Ilya Maykov on behalf of
Isilon Systems, which kindly released the patch.
Jeff Roberson ported the patch to -CURRENT and fixed the w_mtx
contention on behalf of Nokia.
Submitted by: Ilya Maykov <ivmaykov at gmail dot com> (Isilon Systems), jeff
Sponsored by: Nokia
the various copyouts associated with initializing the process's
argv/env data in userspace. It is possible that these copyout
operations can fault under memory pressure, possibly resulting
in deadlocks. This is believed to be safe since none of the
copyout_strings() operations need to interact with the vnode here.
Submitted by: Zhouyi Zhou
PR: kern/111260
Discussed with: kib
MFC after: 3 weeks
There is no reason the fdopen() routine needs Giant. It only sets
curthread->td_dupfd, based on the device unit number of the cdev.
I guess we won't get massive performance improvements here, but still, I
assume we eventually want to get rid of Giant.
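For reference, the routine boils down to roughly this (a sketch, not
the verbatim source):

    static int
    fdopen(struct cdev *dev, int mode, int type, struct thread *td)
    {

            /* Stash the descriptor number encoded in the unit... */
            td->td_dupfd = dev2unit(dev);
            /* ...and let the open path perform the actual dup. */
            return (ENODEV);
    }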
msleep/mtx_sleep or the various cv_*wait*() routines. Currently, the
"unlock" behavior of PDROP and cv_wait_unlock() with Giant is not
permitted, as it would be confusing since Giant is fully unrecursed and
unlocked during a thread sleep.
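For example, a Giant-locked consumer can now do something like this
(sketch; the wait channel, wmesg and timeout are made up):

    int error;

    mtx_assert(&Giant, MA_OWNED);
    /*
     * Giant is fully unrecursed and released across the sleep and
     * reacquired before mtx_sleep() returns; PDROP is not allowed.
     */
    error = mtx_sleep(sc, &Giant, PRIBIO, "foowait", hz);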
This is handy for subsystems which wish to allow unlocked drivers to
continue to use Giant, such as CAM, the new TTY layer, and the new USB
stack. CAM currently uses a hack that I told Scott to use because I
really didn't want to permit this behavior, and the TTY and USB patches
both carry various workarounds to permit it.
MFC after: 2 weeks
routine wakes up proc0 so that proc0 can swap the thread back in.
Historically, this has been done by waking up proc0 directly from
setrunnable() itself via a wakeup(). When waking up a sleeping thread
that was swapped out (the usual case when waking proc0 since only sleeping
threads are eligible to be swapped out), this resulted in a bit of
recursion (e.g. wakeup() -> setrunnable() -> wakeup()).
With sleep queues having separate locks in 6.x and later, this caused a
spin lock LOR (sleepq lock -> sched_lock/thread lock -> sleepq lock).
An attempt was made to fix this in 7.0 by making the proc0 wakeup use
the ithread mechanism for doing the wakeup. However, this required
grabbing proc0's thread lock to perform the wakeup. If proc0 was asleep
elsewhere in the kernel (e.g. waiting for disk I/O), then this degenerated
into the same LOR since the thread lock would be some other sleepq lock.
Fix this by deferring the wakeup of the swapper until after the sleepq
lock held by the upper layer has been locked. The setrunnable() routine
now returns a boolean value to indicate whether or not proc0 needs to be
woken up. The end result is that consumers of the sleepq API, such as
*sleep/wakeup, condition variables, sx locks, and lockmgr, have to wake up
proc0 if they get a non-zero return value from sleepq_abort(),
sleepq_broadcast(), or sleepq_signal().
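The consumer-side pattern then looks roughly like this (a sketch; it
assumes a small kick_proc0()-style helper that performs the actual
proc0 wakeup):

    int wakeup_swapper;

    sleepq_lock(wchan);
    wakeup_swapper = sleepq_broadcast(wchan, SLEEPQ_SLEEP, 0, 0);
    sleepq_release(wchan);
    if (wakeup_swapper)
            kick_proc0();   /* helper that wakes the swapper (proc0) */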
Discussed with: jeff
Glanced at by: sam
Tested by: Jurgen Weber jurgen - ish com au
MFC after: 2 weeks
lstat(2) is called on symlinks -- this code appears never to have
worked. The PR this addresses suggests that the intended
original behavior is the right one, but as bde points out in the
PR comments, we do actually support storing a mode on symlinks,
so returning it seems reasonable.
This is consistent with Mac OS X, which, despite documentation to
the contrary, does return the mode set on a symlink, but not with
some other platforms. The Single Unix Spec requires only that the
returned bits be "meaningful", which seems at best unhelpful as
advice goes.
PR: 25018
MFC after: 3 days
vnode lock may cause a LOR between kld_sx lock and vnode lock.
linker_load_dependencies() drops kld_sx, and another thread may attempt
to load the same kld.
Reported and tested by: pjd
MFC after: 1 week
processes are not producing absolute pathname tokens. It is required
that audited pathnames are generated relative to the global root mount
point. This modification changes our implementation of audit_canon_path(9)
and introduces a new function: vn_fullpath_global(9) which performs a
vnode -> pathname translation relative to the global mount point based
on the contents of the name cache. Much like vn_fullpath,
vn_fullpath_global is a wrapper function which calls vn_fullpath1.
Further, the string parsing routines have been converted to use the
sbuf(9) framework. This change also removes the conditional acquisition
of Giant, since the vn_fullpath1 method will not dip into file system
dependent code.
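Condensed, the wrapper looks roughly like this (a sketch; NULL checks
and some error handling are trimmed):

    int
    vn_fullpath_global(struct thread *td, struct vnode *vn,
        char **retbuf, char **freebuf)
    {
            char *buf;
            int error;

            buf = malloc(MAXPATHLEN, M_TEMP, M_WAITOK);
            /* Resolve against the global root, not the caller's root. */
            error = vn_fullpath1(td, vn, rootvnode, buf, retbuf, MAXPATHLEN);
            if (error == 0)
                    *freebuf = buf;
            else
                    free(buf, M_TEMP);
            return (error);
    }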
The vnode locking was modified to use vhold()/vdrop() instead of
vref() and vrele(). This modifies the hold count instead of the user
count, which makes more sense since it's the kernel that requires the
reference to the vnode. This also makes sure that the vnode does not
get recycled while we hold the reference to it. [1]
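In other words, the change in [1] boils down to:

    vhold(vp);      /* was: vref(vp)  -- take a hold reference   */
    /* ... the vnode cannot be recycled while audit uses it ...  */
    vdrop(vp);      /* was: vrele(vp) -- release the hold        */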
Discussed with: rwatson
Reviewed by: kib [1]
MFC after: 2 weeks
It seems we only use `lbolt' inside the VFS syncer and the TTY layer
now. Because I'm planning to replace the TTY layer next month, there's
no reason to keep `lbolt' if it's only used in a single thread inside
the kernel.
Because the syncer code wanted to wake up the syncer thread before the
timeout, it called sleepq_remove(). Because we now just use a condvar(9)
with a timeout value of `hz', we can wake it up using cv_broadcast()
without waking up any unrelated threads.
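Roughly, the syncer's wait now becomes (a sketch; the condvar name is
illustrative):

    /* Sleep for at most one second, or until someone kicks us. */
    cv_timedwait(&sync_wakeup, &sync_mtx, hz);

    /* ...and speeding up the syncer is simply: */
    cv_broadcast(&sync_wakeup);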
Reviewed by: phk
After the import of the new TTY layer, the TTY_QUOTE definition will not
be present anymore. To make sure clists will still work as expected,
introduce an internal definition called QUOTEMASK.
Maybe we can decide to remove the quote bits entirely, but we still have
to look into this. There may be drivers that still use the quote bits.
Obtained from: //depot/projects/mpsafetty
- When a cpuset is applied to a thread, walk the cpuset to see if it is a
"full" cpuset (includes all available CPUs). If not, set a new
TDS_AFFINITY flag to indicate that this thread can't run on all CPUs.
When inheriting a cpuset from another thread during thread creation, the
new thread also inherits this flag. It is in a new ts_flags field in
td_sched rather than using one of the TDF_SCHEDx flags because fork()
clears td_flags after invoking sched_fork().
- When placing a thread on a runqueue via sched_add(), if the thread is not
pinned or bound but has the TDS_AFFINITY flag set, then invoke a new
routine (sched_pickcpu()) to pick a CPU for the thread to run on next.
sched_pickcpu() walks the cpuset and picks the CPU with the shortest
per-CPU runqueue length. Note that the reason for the TDS_AFFINITY flag
is to avoid having to walk the cpuset and examine runq lengths in the
common case (see the sketch after this list).
- To avoid walking the per-CPU runqueues in sched_pickcpu(), add an array
of counters to hold the length of the per-CPU runqueues and update them
when adding and removing threads to per-CPU runqueues.
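A simplified sketch of the selection done by sched_pickcpu(); the
runqueue-length array name is illustrative:

    static int
    sched_pickcpu(struct thread *td)
    {
            int best, cpu;

            best = -1;
            for (cpu = 0; cpu <= mp_maxid; cpu++) {
                    if (CPU_ABSENT(cpu) ||
                        !CPU_ISSET(cpu, &td->td_cpuset->cs_mask))
                            continue;
                    /* Prefer the allowed CPU with the shortest runqueue. */
                    if (best == -1 ||
                        runq_length[cpu] < runq_length[best])
                            best = cpu;
            }
            return (best);
    }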
MFC after: 2 weeks
- Check whether panicstr is set; if it is, ignore the lock. This helps to
avoid confusion, because lockmgr is a no-op when panicstr isn't NULL, so
asserting anything at this point doesn't make sense and can just race
with another panic.
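In other words, the assertion path now starts with a check along these
lines:

    if (panicstr != NULL)
            return;         /* lockmgr is a no-op after a panic */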
Discussed with: kib
The ttyinfo() routine generates the fancy output when pressing ^T. Right
now it is stored in tty.c. In the MPSAFE TTY code it is already stored
in tty_info.c. To make integration of the MPSAFE TTY code a little
easier, take the same approach.
This makes the TTY code a little bit more readable, because having the
proc_*/thread_* routines in tty.c is very distracting.
Approved by: philip (mentor)
child process immediately after bulk bcopy() without dropping the
process lock.
Since the process is not single-threaded when forking, dropping and
reacquiring the lock allows another thread to change the process title
of the parent in between, which results in the hold being taken on an
invalid pointer. The problem manifested itself as a double free of the
old p_args.
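The fix keeps the hold inside the same locked region, roughly (a
simplified sketch of the fork path; the copy length is elided):

    PROC_LOCK(p1);
    /* bulk copy of the inherited portion of struct proc */
    bcopy(&p1->p_startcopy, &p2->p_startcopy, copylen);
    /* take the p_args reference before the lock is dropped */
    pargs_hold(p2->p_args);
    PROC_UNLOCK(p1);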
Reported by: kris
Reviewed by: jhb
MFC after: 1 week
and there is no need to maintain it.
- Fix vn_get() so that it calls vget(9) with a valid locking request.
vget(9) returns the vnode locked in order to prevent recycling, but in
this case internal XFS locks already prevent that from happening, so
it is safe to drop the vnode lock before returning from vn_get().
- Add a VNASSERT() in vget(9) in order to catch malformed locking requests.
Discussed with: kan, kib
Tested by: Lothar Braun <lothar at lobraun dot de>
interrupt-driven configuration handlers to complete, print out a
diagnostic message every 60 seconds indicating which handlers are
still running. Do this at most 5 times per run so as to avoid
scrolling out any useful information from the kernel message
buffer.
The interval of 60 seconds was selected based on a best guess as
to the nature of "long enough" and may want to be tuned higher
or lower depending on real-world tolerances.
MFC after: 3 days
Discussed with: scottl
for completion in run_interrupt_driven_config_hooks(). This is
helpful when trying to figure out which device drivers have gone
into la-la land during boot-time autoconfiguration.
MFC after: 3 days
- When a tick occurs on a CPU, iterate from cc_softticks until ticks.
The per-cpu tick processing happens asynchronously with the actual
adjustment of the 'ticks' variable. Sometimes the results may
be visible before the local call and sometimes after. Previously this
could cause a one tick window where we didn't evaluate the bucket.
- In softclock fetch curticks before incrementing cc_softticks so we
don't skip insertions which were made for the current time.
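A rough sketch of the second point (not the verbatim code):

    curticks = ticks;               /* snapshot the global first */
    while (cc->cc_softticks != curticks) {
            cc->cc_softticks++;
            bucket = &cc->cc_callwheel[cc->cc_softticks & callwheelmask];
            /* run or requeue the callouts hashed into this bucket */
    }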
Sponsored by: Nokia
sched_tick() to prevent multiple increments for one tick. This pushes
the value out of range and breaks priority calculation.
Reviewed by: kib
Found by: pho/nokia
Sponsored by: Nokia
MFC after: 3 days