freebsd-nq

Author	SHA1	Message	Date
John Baldwin	2c17901060	Add 'compat_freebsd[4567]' features corresponding to the kernel options COMPAT_FREEBSD[4567]. MFC after: 1 week Requested by: kris	2008-01-17 22:46:32 +00:00
Sam Leffler	eeb76a1889	promote ath_defrag to m_collapse (and retire private+unused m_collapse from cxgb) Reviewed by: pyun, jhb, kmacy MFC after: 2 weeks	2008-01-17 21:25:09 +00:00
John Baldwin	cff3c4fdc5	Remove a conditional that is always true. MFC after: 2 weeks	2008-01-17 20:15:15 +00:00
John Baldwin	8ffbe1559e	Add a set of regression tests for the POSIX shm API (shm_open(2) and shm_unlink(2)).	2008-01-16 15:51:24 +00:00
Nate Lawson	e1f13773ec	Remove duplicate cpufreq levels, i.e. ones that are within 25 MHz of each other. The first one survives, the rest are removed. So far, it appears only some acpi_perf(4) BIOS tables have these invalid states, but address this in the core to be sure to handle other potential driver data. PR: kern/114722 Tested by: stefan.lambrev / moneybookers.com MFC after: 3 days	2008-01-16 01:05:21 +00:00
Jeff Roberson	a755f21484	- When executing the 'tryself' branch in sched_pickcpu() look at the lowest priority on the queue for the current cpu vs curthread's priority. In the case that curthread is waking up many threads of a lower priority as would happen with a turnstile_broadcast() or wakeup() of many threads this prevents them from all ending up on the current cpu. - In sched_add() make the relationship between a scheduled ithread and the current cpu advisory rather than strict. Only give the ithread affinity for the current cpu if it's actually being scheduled from a hardware interrupt. This prevents it from migrating when it simply blocks on a lock. Sponsored by: Nokia	2008-01-15 09:03:09 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the unneeded extra argument and pass curthread explicitly to lower layer functions, when necessary. The KPI is broken by this change, which should affect several ports, so a version bump and manpage update will be committed separately. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Attilio Rao	d1127e669c	lockmgr() returns successfully when called during a panic, but it won't actually lock anything. This can lead some paths to reach lockmgr_disown() with an inconsistent lock, which triggers the related assertions. Fix those paths so they recognize the panic situation and do not trigger the assertions. Reported by: pho Submitted by: kib	2008-01-11 16:38:12 +00:00
Robert Watson	d92909c1d4	Don't zero td_runtime when billing thread CPU usage to the process; maintain a separate td_incruntime to hold unbilled CPU usage for the thread that has the previous properties of td_runtime. When thread information is requested using the thread monitoring sysctls, export thread td_runtime instead of process rusage runtime in kinfo_proc. This restores the display of individual ithread and other kernel thread CPU usage since inception in ps -H and top -SH, as well as for libthr user threads, valuable debugging information lost with the move to true kthreads since they are no longer independent processes. There is universal agreement that we should rewrite the process and thread export sysctls, but this commit gets things going a bit better in the meantime. Likewise, there are reservations about the continued validity of statclock given the speed of modern processors. Reviewed by: attilio, emaste, jhb, julian	2008-01-10 22:11:20 +00:00
Robert Watson	8a69e5fa71	Remove "lock pushdown" todo item in comment -- I did that for 7.0. MFC after: 3 weeks	2008-01-10 12:38:17 +00:00
Robert Watson	a635784569	Correct typos in comments. MFC after: 3 weeks	2008-01-10 12:29:12 +00:00
Attilio Rao	cb05b60a89	vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This change makes the code cleaner and in particular removes an annoying dependency, helping the next lockmgr() cleanup. The KPI is, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, subsequent commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
Attilio Rao	6edbb3ee9e	Fix a last second typo about recent lockmgr_disown() introduction.	2008-01-09 00:02:43 +00:00
Attilio Rao	d7a7e17968	Remove explicit calling of lockmgr() with the NULL argument. Now, lockmgr() can only be called passing curthread, and the KASSERT() is updated accordingly. In order to support on-the-fly owner switching, the new function lockmgr_disown() has been introduced and is used in BUF_KERNPROC(). The KPI is therefore changed and the FreeBSD version will be bumped soon. Unlike the previous code, we assume the idle thread cannot try to acquire the lockmgr as it cannot sleep, so drop the related check[1] in BUF_KERNPROC(). Tested by: kris [1] kib asked for a KASSERT in lockmgr_disown() about this condition, but after thinking about it, as this is a well known general rule, I found it not really necessary.	2008-01-08 23:48:31 +00:00
John Baldwin	4ad6d200d6	Regen for shm_open(2) and shm_unlink(2).	2008-01-08 22:01:26 +00:00
John Baldwin	8e38aeff17	Add a new file descriptor type for IPC shared memory objects and use it to implement shm_open(2) and shm_unlink(2) in the kernel: - Each shared memory file descriptor is associated with a swap-backed vm object which provides the backing store. Each descriptor starts off with a size of zero, but the size can be altered via ftruncate(2). The shared memory file descriptors also support fstat(2). read(2), write(2), ioctl(2), select(2), poll(2), and kevent(2) are not supported on shared memory file descriptors. - shm_open(2) and shm_unlink(2) are now implemented as system calls that manage shared memory file descriptors. The virtual namespace that maps pathnames to shared memory file descriptors is implemented as a hash table where the hash key is generated via the 32-bit Fowler/Noll/Vo hash of the pathname. - As an extension, the constant 'SHM_ANON' may be specified in place of the path argument to shm_open(2). In this case, an unnamed shared memory file descriptor will be created similar to the IPC_PRIVATE key for shmget(2). Note that the shared memory object can still be shared among processes by sharing the file descriptor via fork(2) or sendmsg(2), but it is unnamed. This effectively serves to implement the getmemfd() idea bandied about the lists several times over the years. - The backing store for shared memory file descriptors are garbage collected when they are not referenced by any open file descriptors or the shm_open(2) virtual namespace. Submitted by: dillon, peter (previous versions) Submitted by: rwatson (I based this on his version) Reviewed by: alc (suggested converting getmemfd() to shm_open())	2008-01-08 21:58:16 +00:00
John Baldwin	39033470fe	Close a race in the kern.ttys sysctl handler that resulted in panics in dev2udev() when a tty was being detached concurrently with the sysctl handler: - Hold the 'tty_list_mutex' lock while we read all the fields out of the struct tty for copying out later. Previously the pty(4) and pts(4) destroy routines could set t_dev to NULL, drop their reference on the tty and destroy the cdev while the sysctl handler was attempting to invoke dev2udev() on the cdev being destroyed. This happened when the sysctl handler read the value of t_dev prior to it being set to NULL either due to it being stale or due to timing races. By holding the list lock we guarantee that the destroy routines will block in ttyrel() in that case and not destroy the cdev until after we've copied all of our data. We may see a NULL cdev pointer or we may see the previous value, but the previous value will no longer point to a destroyed cdev if we see it. - Fix the ttyfree() routine used by tty device drivers in their detach methods to use ttyrel() on the tty so we don't leak them. Also, fix it to use the same order of operations as pty/pts destruction (set t_dev NULL, ttyrel(), destroy_dev()) so it cooperates with the sysctl handler. MFC after: 3 days Tested by: avatar	2008-01-08 04:53:28 +00:00
Kris Kennaway	357911ce77	Fix logic in skipcount handling (used to sample every 1/N lock operations to reduce profiling overhead)	2008-01-08 01:11:40 +00:00
Robert Watson	57d7e86b65	Free MAC label on a POSIX semaphore when the semaphore is freed. MFC after: 3 days Submitted by: jhb	2008-01-07 22:03:19 +00:00
John Baldwin	e46502943a	Make ftruncate a 'struct file' operation rather than a vnode operation. This makes it possible to support ftruncate() on non-vnode file types in the future. - 'struct fileops' grows a 'fo_truncate' method to handle an ftruncate() on a given file descriptor. - ftruncate() moves to kern/sys_generic.c and now just fetches a file object and invokes fo_truncate(). - The vnode-specific portions of ftruncate() move to vn_truncate() in vfs_vnops.c which implements fo_truncate() for vnode file types. - Non-vnode file types return EINVAL in their fo_truncate() method. Submitted by: rwatson	2008-01-07 20:05:19 +00:00
Bruce Evans	9283848511	In sequential_heuristic(): - spell 16384 as 16384 and not as BKVASIZE. 16384 is (not quite) just a magic size that works well in practice. BKVASIZE should be MAXBSIZE (65536), but is 16384 because i386's don't have enough kva for it to be MAXBSIZE; 16384 works (not so well) for it for much the same reasons that it works well in the heuristic. - expand and/or add comments about this and other details. - don't explicitly inline this function. - fix some other style bugs.	2008-01-05 08:54:51 +00:00
Peter Wemm	4113f8d741	Fall back to the binary-specified interpreter (ld-elf.so.1) if the ABI override binary isn't found. This could probably be smoother, but it is what I did in p4 change #126891 on 2007/09/27. It should solve the "ld-elf32.so.1"-in-chroot problem.	2008-01-05 08:35:56 +00:00
Jeff Roberson	fd0b8c783d	- Restore timeslicing code for all but SCHED_FIFO priority classes. Reported by: Peter Jeremy <peterjeremy@optushome.com.au>	2008-01-05 04:47:31 +00:00
Bjoern A. Zeeb	a82be55d42	Add missing sb_sndptr* fields to db_print_sockbuf(). While here change %d to %u for u_ints. Discussed with: rwatson, kmacy	2008-01-03 15:19:31 +00:00
Jeff Roberson	a57decdf32	- In sysctl_kern_file skip fdps with negative lastfiles. This can happen if there are no files open. Accounting for these can eventually return a negative value for olenp causing sysctl to crash with a bad malloc. Reported by: Pawel Worach <pawel.worach@gmail.com>	2008-01-03 01:26:59 +00:00
David E. O'Brien	bedff79a00	Note what is too {short,long}.	2008-01-02 18:48:27 +00:00
John Baldwin	c0cfd9d113	A few whitespace fixes.	2008-01-02 17:09:15 +00:00
Jeff Roberson	41e0f66d41	- Place the fhold() in unp_internalize_fp to be more consistent with refs. - Clear all of the gc flags before doing a run. Stale flags were causing us to skip some descriptors. - If a unp socket has been marked REF in a gc pass it can't be dead. Found by: rwatson's test tool.	2008-01-01 01:46:42 +00:00
Craig Rodrigues	450ea867c5	In vfs_scanopt(), make sure that the mount option value is not NULL before calling vsscanf(). PR: 118531 Submitted by: Jaakko Heinonen <jh saunalahti fi> MFC after: 3 days	2007-12-31 23:44:53 +00:00
John Baldwin	0deabe7e53	Actually declare the kern.features sysctl node. Pointy hat to: jhb	2007-12-31 22:03:57 +00:00
Jeff Roberson	0c66dc6758	- Pause a while after disabling lock profiling and before resetting it to be sure that all participating CPUs have stopped updating it. - Restore the behavior of printing the name of the lock type in the output.	2007-12-31 03:45:51 +00:00
Jeff Roberson	6f552cb098	- Check the correct variable against NULL in two places. - If the unp_file is NULL that means it has never been internalized and it must be reachable.	2007-12-31 03:44:54 +00:00
Warner Losh	c94a7cac1f	Rather than not redirtying the bp when we get ENXIO, only redirty it when the error is EIO. This catches a much larger class of errors that are unlikely to succeed if retried. Submitted by: bde	2007-12-30 05:53:45 +00:00
Jeff Roberson	397c19d175	Remove explicit locking of struct file. - Introduce a finit() which is used to initialize the fields of struct file in such a way that the ops vector is only valid after the data, type, and flags are valid. - Protect f_flag and f_count with atomic operations. - Remove the global list of all files and associated accounting. - Rewrite the unp garbage collection such that it no longer requires the global list of all files and instead uses a list of all unp sockets. - Mark sockets in the accept queue so we don't incorrectly gc them. Tested by: kris, pho	2007-12-30 01:42:15 +00:00
Alan Cox	f8a47341fe	Add the superpage reservation system. This is "part 2 of 2" of the machine-independent support for superpages. (The earlier part was the rewrite of the physical memory allocator.) The remainder of the code required for superpages support is machine-dependent and will be added to the various pmap implementations at a later date. Initially, I am only supporting one large page size per architecture. Moreover, I am only enabling the reservation system on amd64. (In an emergency, it can be disabled by setting VM_NRESERVLEVELS to 0 in amd64/include/vmparam.h or your kernel configuration file.)	2007-12-29 19:53:04 +00:00
Robert Watson	c5f1beb02a	In "show lockedvnods" DDB command, use db_printf() rather than printf() so that the results end up in the DDB output stream rather than the console output stream. This should likely also be done for the vprint() function it calls. MFC after: 3 months	2007-12-28 00:47:31 +00:00
Attilio Rao	100f241571	Trim out the now unused option LK_EXCLUPGRADE from the lockmgr namespace. This option just adds complexity and the new implementation will no longer support it, so axing it now that it is unused is probably the better idea. FreeBSD version is bumped in order to reflect the KPI breakage introduced by this patch. In the ports tree, kris found that only old OSKit code uses it, but as it is thought to work only on the 2.x kernel series, version bumping will solve any problem.	2007-12-28 00:38:13 +00:00
Attilio Rao	7a1d78fa3f	In order to avoid a huge class of deadlocks (in particular in interactions with the interlock), owner of the lock should be only curthread or at least, for its limited usage, NULL which identifies LK_KERNPROC. The thread "extra argument" for the lockmgr interface is going to be removed in the near future, but for the moment, just let kernel run for some days with this check on in order to find potential deadlocking places around the kernel and fix them.	2007-12-27 22:56:57 +00:00
Robert Watson	0417fe5421	Return ESRCH when a kernel stack is queried on a process in execve() -- p_candebug() will return EAGAIN which, if the other process never leaves execve(), will result in the sysctl spinning and never returning to userspace. Processes should always eventually leave execve(), but spinning in kernel while we wait is bad for countless reasons, and particularly harmful if execve() itself is deadlocked. Possibly we should return another error, or return a marker indicating the thread is in execve() so it can be reported that way in userspace. Reported by: kris	2007-12-27 22:44:01 +00:00
Attilio Rao	98e4f2e2bf	As LK_EXCLUPGRADE is used in conjunction with LK_NOWAIT, LK_UPGRADE becomes equivalent to it, so make the switch. That call is the only remaining LK_EXCLUPGRADE consumer and removing it will prepare the ground for LK_EXCLUPGRADE axing and further lockmgr improvements. Discussed with: jeff, ups	2007-12-27 20:52:05 +00:00
Warner Losh	b27aa20e8d	A partial solution to some of the 'pull the umass device with a mounted FS' problems. These are more along the lines of 'avoiding an avoidable panic' than a complete solution to removable devices. We now close the barn door after the horse has gotten loose and has been hit by a truck, as it were. The barn no longer catches fire in this case, but the horse is still dead :-). The vfs_bio.c fix causes us not to put a failed write back into the dirty pool if the error returned was ENXIO. In that case, the buffer is treated like any other clean buffer that's being returned. ENXIO means the device isn't there anymore and will never be there again in the future, so retrying is futile. The vfs_mount.c fix treats 'ENXIO' as success for unmounting a file system. If the device is gone, retrying later won't help and we'll never be able to unmount the device. These two are part of a larger patch set submitted by the author. The other patches will be forthcoming. I added comments to these two patches. Submitted by: Henrik Gulbrandsen Reviewed by: phk@ PR: usb/46176 (partial)	2007-12-27 16:38:28 +00:00
Robert Watson	618c7db30a	Add textdump(4) facility, which provides an alternative form of kernel dump using mechanically generated/extracted debugging output rather than a simple memory dump. Current sources of debugging output are: - DDB output capture buffer, if there is captured output to save - Kernel message buffer - Kernel configuration, if included in kernel - Kernel version string - Panic message Textdumps are stored in swap/dump partitions as with regular dumps, but are laid out as ustar files in order to allow multiple parts to be stored as a stream of sequentially written blocks. Blocks are written out in reverse order, as the size of a textdump isn't known a priori. As with regular dumps, they will be extracted using savecore(8). One new DDB(4) command is added, "textdump", which accepts "set", "unset", and "status" arguments. By default, normal kernel dumps are generated unless "textdump set" is run in order to schedule a textdump. It can be canceled using "textdump unset" to restore generation of a normal kernel dump. Several sysctls exist to configure aspects of textdumps; debug.ddb.textdump.pending can be read to check whether a textdump is pending, or set/unset in order to control from userspace whether the next kernel dump will be a textdump. While textdumps don't have to be generated as a result of a DDB script run automatically as part of a kernel panic, this is a particularly useful way to use them, as instead of generating a complete memory dump, a simple transcript of an automated DDB session can be captured using the DDB output capture and textdump facilities. This can be used to generate quite brief kernel bug reports rich in debugging information but not dependent on kernel symbol tables or precisely synchronized source code. Most textdumps I generate are less than 100k including the full message buffer. Using textdumps with an interactive debugging session is also useful, with capture being enabled/disabled in order to record some but not all of the DDB session. MFC after: 3 months	2007-12-26 11:32:33 +00:00
Wojciech A. Koszek	4ffcc89aa6	Rewrite kern.console handling in sbuf(9). My intention is to leave kern.console format as is. Thus, no difference in output format should appear after this commit. Reviewed by: cognet@ (mentor) Approved by: cognet@ (mentor)	2007-12-25 21:17:34 +00:00
Robert Watson	3de213cc00	Add a new 'why' argument to kdb_enter(), and a set of constants to use for that argument. This will allow DDB to detect the broad category of reason why the debugger has been entered, which it can use for the purposes of deciding which DDB script to run. Assign approximate why values to all current consumers of the kdb_enter() interface.	2007-12-25 17:52:02 +00:00
Julian Elischer	6829a5c59e	give thread0 the tid 100000 and bump the others to start at 100001 MFC after: 1 week	2007-12-22 04:56:48 +00:00
Wojciech A. Koszek	731016fe36	Make SCHED_ULE buildable with gcc3. Reviewed by: cognet (mentor), jeffr Approved by: cognet (mentor), jeffr	2007-12-21 23:30:18 +00:00
Warner Losh	d4277fef7b	When devclass_get_maxunit is passed a NULL, return -1 to indicate that there's nothing allocated at all yet.	2007-12-19 22:05:07 +00:00
David E. O'Brien	10c2b8e128	Be more exact with sigaction SA_SIGINFO handling. Reviewed by: marcel	2007-12-18 20:39:13 +00:00
Kip Macy	5e0f5cfaed	Add SB_NOCOALESCE flag to disable socket buffer update in place	2007-12-17 10:02:01 +00:00
David Xu	7fab871d8c	Check NULL pointer.	2007-12-17 08:09:37 +00:00
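
Note on 8e38aeff17 (shm_open(2) namespace): the commit message says the pathname-to-object namespace is a hash table keyed by the 32-bit Fowler/Noll/Vo hash of the pathname. The following is a minimal userspace sketch of that idea, not the kernel code. It assumes the standard published FNV-1 offset basis and prime (the kernel's own header may use a different initial value), and the names fnv1_32(), shm_bucket() and SHM_HASH_BUCKETS are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard 32-bit FNV-1 parameters (assumption: kernel constants may differ). */
#define FNV1_32_OFFSET_BASIS	2166136261u
#define FNV1_32_PRIME		16777619u

/* Hypothetical bucket count; a power of two so the hash can be masked. */
#define SHM_HASH_BUCKETS	64u

/* Hash a NUL-terminated pathname with FNV-1: multiply first, then XOR. */
uint32_t
fnv1_32(const char *path)
{
	const unsigned char *p;
	uint32_t hash = FNV1_32_OFFSET_BASIS;

	assert(path != NULL);
	for (p = (const unsigned char *)path; *p != '\0'; p++) {
		hash *= FNV1_32_PRIME;	/* wraps modulo 2^32 */
		hash ^= *p;
	}
	return (hash);
}

/* Map a pathname to a bucket index in the (hypothetical) namespace table. */
uint32_t
shm_bucket(const char *path)
{
	return (fnv1_32(path) & (SHM_HASH_BUCKETS - 1));
}
```

Two different pathnames can land in the same bucket, so a real table would still compare the full pathname when walking a bucket's chain.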

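Note on e1f13773ec (duplicate cpufreq levels): the commit message describes dropping levels that are within 25 MHz of each other, with the first one surviving. The sketch below illustrates that rule on a plain array of frequencies. It is not the cpufreq code: dedup_levels() and DUP_THRESHOLD_MHZ are hypothetical names, the input is assumed to be sorted in descending order, each entry is compared against the most recently kept one, and a gap of exactly 25 MHz is assumed to count as distinct.

```c
#include <assert.h>
#include <stddef.h>

/* Levels closer than this to the last kept level are treated as duplicates. */
#define DUP_THRESHOLD_MHZ	25

/*
 * Compact a descending list of frequencies (in MHz) in place.  The first
 * level of each near-duplicate run survives; later ones are dropped.
 * Returns the number of levels kept.
 */
size_t
dedup_levels(int *mhz, size_t n)
{
	size_t i, kept = 0;

	assert(mhz != NULL || n == 0);
	for (i = 0; i < n; i++) {
		/* Skip entries too close to the most recently kept level. */
		if (kept > 0 && mhz[kept - 1] - mhz[i] < DUP_THRESHOLD_MHZ)
			continue;
		mhz[kept++] = mhz[i];
	}
	return (kept);
}
```

Comparing against the last kept level rather than the immediately preceding entry keeps a long run of closely spaced levels from collapsing into a single survivor.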
10247 Commits