freebsd-skq

Author	SHA1	Message	Date
Pawel Jakub Dawidek	67e83b07c6	Fix information leak. We can find PIDs of running processes from within a jail, etc. by simply calling setpriority(PRIO_PROCESS, <PID>, 0) and checking the return value: 0 means that the process exists and -1 that it doesn't exist. Reviewed by: rwatson MFC after: 1 week	2008-03-16 17:55:06 +00:00
Robert Watson	237fdd787b	In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink	2008-03-16 10:58:09 +00:00
Maxim Sobolev	c9370ff4d0	Properly set size of the file_zone to match kern.maxfiles parameter. Otherwise the parameter is no-op, since zone by default limits number of descriptors to some 12K entries. Attempt to allocate more ends up sleeping on zonelimit. MFC after: 2 weeks	2008-03-16 06:21:30 +00:00
Ruslan Ermilov	1f49b573e1	Fix panic on e.g. "kldload /dev/null". PR: kern/121427 Reviewed by: sem MFC after: 3 days	2008-03-15 17:40:18 +00:00
John Baldwin	eaf86d1678	Add preliminary support for binding interrupts to CPUs: - Add a new intr_event method ie_assign_cpu() that is invoked when the MI code wishes to bind an interrupt source to an individual CPU. The MD code may reject the binding with an error. If an assign_cpu function is not provided, then the kernel assumes the platform does not support binding interrupts to CPUs and fails all requests to do so. - Bind ithreads to CPUs on their next execution loop once an interrupt event is bound to a CPU. Only shared ithreads are bound. We currently leave private ithreads for drivers using filters + ithreads in the INTR_FILTER case unbound. - A new intr_event_bind() routine is used to bind an interrupt event to a CPU. - Implement binding on amd64 and i386 by way of the existing pic_assign_cpu PIC method. - For x86, provide a 'intr_bind(IRQ, cpu)' wrapper routine that looks up an interrupt source and binds its interrupt event to the specified CPU. MI code can currently (ab)use this by doing: intr_bind(rman_get_start(irq_res), cpu); however, I plan to add a truly MI interface (probably a bus_bind_intr(9)) where the implementation in the x86 nexus(4) driver would end up calling intr_bind() internally. Requested by: kmacy, gallatin, jeff Tested on: {amd64, i386} x {regular, INTR_FILTER}	2008-03-14 19:41:48 +00:00
John Baldwin	d628fbfa98	Make the function prototype for cpu_search() match the declaration so that this still compiles with gcc3.	2008-03-14 15:22:38 +00:00
Jeff Roberson	f4d77e9e54	PR 117603 - Close a sleepqueue signal race by interlocking with the per-process spinlock. This was mistakenly omitted from the thread_lock patch and has been a race since. MFC After: 1 week PR: bin/117603 Reported by: Danny Braniss <danny@cs.huji.ac.il>	2008-03-13 00:46:12 +00:00
Jeff Roberson	6617724c5f	Remove kernel support for M:N threading. While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.	2008-03-12 10:12:01 +00:00
Jeff Roberson	c5aa6b581d	- Pass the priority argument from sleep() into sleepq and down into sched_sleep(). This removes extra thread_lock() acquisition and allows the scheduler to decide what to do with the static boost. - Change the priority arguments to cv_ to match sleepq/msleep/etc. where 0 means no priority change. Catch -1 in cv_broadcastpri() and convert it to 0 for now. - Set a flag when sleeping in a way that is compatible with swapping since direct priority comparisons are meaningless now. - Add a sysctl to ule, kern.sched.static_boost, that defaults to on which controls the boost behavior. Turning it off gives better performance in some workloads but needs more investigation. - While we're modifying sleepq, change signal and broadcast to both return with the lock held as the lock was held on enter. Reviewed by: jhb, peter	2008-03-12 06:31:06 +00:00
Jeff Roberson	bdb5bdf0b7	- KSE may free a thread that was never actually forked. This will leave td_cpuset NULL. Check for this condition before dereferencing the cpuset. Reported by: david@catwhisker.org, miwi@freebsd.org Sponsored by: Nokia	2008-03-12 05:01:14 +00:00
Jeff Roberson	c143ac21af	- Fix the invalid priority panics people are seeing by forcing tdq_runq_add to select the runq rather than hoping we set it properly when we adjusted the priority. This involves the same number of branches as before so should perform identically without the extra fragility. Tested by: bz Reviewed by: bz	2008-03-10 22:48:27 +00:00
Jeff Roberson	7217d8d1ee	- Don't rely on a side effect of sched_prio() to set the initial ts_runq for thread0. Set it directly in sched_setup(). This fixes traps on boot seen on some machines. Reported by: phk	2008-03-10 09:50:29 +00:00
Jeff Roberson	8f93d79d05	- Handle kdb switch panics outside of mi_switch() to remove some instructions from the common path and make the code more clear. Whether this has any impact on performance may depend on optimization levels. Sponsored by: Nokia	2008-03-10 03:16:51 +00:00
Jeff Roberson	73daf66f41	Reduce ULE context switch time by over 25%. - Only calculate timeshare priorities once per tick or when a thread is woken from sleeping. - Keep the ts_runq pointer valid after all priority changes. - Call tdq_runq_add() directly from sched_switch() without passing in via tdq_add(). We don't need to adjust loads or runqs anymore. - Sort tdq and ts_sched according to utilization to improve cache behavior. Sponsored by: Nokia	2008-03-10 03:15:19 +00:00
Warner Losh	9ab8f3544a	Tiny bit of KNF to make bus_setup_intr() look like the rest of this function.	2008-03-10 01:48:25 +00:00
Jeff Roberson	1bf6461e98	- Add the missing '2' case to the switch table for kern.smp.topology and assign it to create the flat 'none' topology where all cpus are scheduled as if they are equal and unrelated.	2008-03-10 01:38:53 +00:00
Jeff Roberson	ff256d9c47	- Add an implementation of sched_preempt() that avoids excessive IPIs. - Normalize the preemption/ipi setting code by introducing sched_shouldpreempt() so the logical is identical and not repeated between tdq_notify() and sched_setpreempt(). - In tdq_notify() don't set NEEDRESCHED as we may not actually own the thread lock this could have caused us to lose td_flags settings. - Garbage collect some tunables that are no longer relevant.	2008-03-10 01:32:01 +00:00
Jeff Roberson	1e24c28f46	- Add a sched_preempt() routine to be called by md code after IPI_PREEMPT is delivered. - Add a simple implementation to 4bsd.	2008-03-10 01:30:35 +00:00
Warner Losh	908e1e5df5	Any driver that relies on its parent to set the devclass has no way to know if has siblings that need an actual probe. Introduce a specail return value called BUS_PROBE_NOOWILDCARD. If the driver returns this, the probe is only successful for devices that have had a specific devclass set for them. Reviewed by: current@, jhb@, grehan@	2008-03-09 05:10:22 +00:00
Antoine Brodin	e3ad7f6626	Introduce a new F_DUP2FD command to fcntl(2), for compatibility with Solaris and AIX. fcntl(fd, F_DUP2FD, arg) and dup2(fd, arg) are functionnaly equivalent. Document it. Add some regression tests (identical to the dup2(2) regression tests). PR: 120233 Submitted by: Jukka Ukkonen Approved by: rwaston (mentor) MFC after: 1 month	2008-03-08 22:02:21 +00:00
Robert Watson	36b208e008	Use sbuf routines to construct core dump filenames rather than custom string buffer handling, making the code both easier to read and more robust against string-handling bugs. MFC after: 1 week	2008-03-08 16:31:29 +00:00
Robert Watson	eeccc36738	Unlock the process lock when expand_name() fails, or we may leak the process lock leading to a hang. This bug was introduced in kern_sig.c:1.351, when the call to expand_name() was moved earlier bit this particular error case was not updated.	2008-03-08 15:48:06 +00:00
Robert Watson	b916b56b5a	Add __FBSDID() tag. MFC after: 3 days Pointed out by: antoine	2008-03-07 15:27:08 +00:00
Jeff Roberson	c6440f72b6	- Add a missing unlock to cpuset_setaffinity(CPU_LEVEL_CPUSET, CPU_WHICH_PID) Found by: gallatin	2008-03-06 20:11:24 +00:00
Jeff Roberson	8bd75bdde4	- Don't overwrite the recently allocated 'nset' in cpuset_setthread() by passing it to cpuset_which(). Pass in 'set' instead. This argument is not used but for convenience cpuset_which() nulls all incoming parameters. Submitted by: davidxu	2008-03-05 08:08:32 +00:00
Jeff Roberson	73c40187fd	- Verify that when a user supplies a mask that is bigger than the kernel mask none of the upper bits are set. - Be more careful about enforcing the boundaries of masks and child sets. - Introduce a few more CPU_* macros for implementing these tests. - Change the cpusetsize argument to be bytes rather than bits to match other apis. Sponsored by: Nokia	2008-03-05 01:49:20 +00:00
Ruslan Ermilov	9e47336389	Make it possible to continue working after calling doadump() manually from debugger. (This got broken in rev. 1.122.)	2008-03-04 07:39:31 +00:00
Rafal Jaworowski	6b7ba54456	Initial support for Freescale PowerQUICC III MPC85xx system-on-chip family. The PQ3 is a high performance integrated communications processing system based on the e500 core, which is an embedded RISC processor that implements the 32-bit Book E definition of the PowerPC architecture. For details refer to: http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MPC8555E This port was tested and successfully run on the following members of the PQ3 family: MPC8533, MPC8541, MPC8548, MPC8555. The following major integrated peripherals are supported: * On-chip peripherals bus * OpenPIC interrupt controller * UART * Ethernet (TSEC) * Host/PCI bridge * QUICC engine (SCC functionality) This commit brings the main functionality and will be followed by individual drivers that are logically separate from this base. Approved by: cognet (mentor) Obtained from: Juniper, Semihalf MFp4: e500	2008-03-03 17:17:00 +00:00
Marcel Moolenaar	f5a3ef99c2	Unbreak after cpuset: initialize td_cpuset in sched_fork_thread().	2008-03-02 21:34:57 +00:00
Jeff Roberson	62fa74d95a	Add support for the new cpu topology api: - When searching for affinity search backwards in the tree from the last cpu we ran on while the thread still has affinity for the group. This can take advantage of knowledge of shared L2 or L3 caches among a group of cores. - When searching for the least loaded cpu find the least loaded cpu via the least loaded path through the tree. This load balances system bus links, individual cache levels, and hyper-threaded/SMT cores. - Make the periodic balancer recursively balance the highest and lowest loaded cpu across each link. Add support for cpusets: - Convert the cpuset to a simple native cpumask_t while the kernel still only supports cpumask. - Pass the derived cpumask down through the cpu_search functions to restrict the result cpus. - Make the various steal functions resilient to failure since all threads can not run on all cpus any longer. General improvements: - Precisely track the lowest priority thread on every runq with tdq_setlowpri(). Before it was more advisory but this ended up having pathological behaviors. - Remove many #ifdef SMP conditions to simplify the code. - Get rid of the old cumbersome tdq_group. This is more naturally expressed via the cpu_group tree. Sponsored by: Nokia Testing by: kris	2008-03-02 08:20:59 +00:00
Jeff Roberson	81aa71755b	- Remove the old smp cpu topology specification with a new, more flexible tree structure that encodes the level of cache sharing and other properties. - Provide several convenience functions for creating one and two level cpu trees as well as a default flat topology. The system now always has some topology. - On i386 and amd64 create a seperate level in the hierarchy for HTT and multi-core cpus. This will allow the scheduler to intelligently load balance non-uniform cores. Presently we don't detect what level of the cache hierarchy is shared at each level in the topology. - Add a mechanism for testing common topologies that have more information than the MD code is able to provide via the kern.smp.topology tunable. This should be considered a debugging tool only and not a stable api. Sponsored by: Nokia	2008-03-02 07:58:42 +00:00
Jeff Roberson	4da2b9d42f	- Regen for cpuset Sponsored by: Nokia	2008-03-02 07:41:10 +00:00
Jeff Roberson	d7f687fc9b	Add cpuset, an api for thread to cpu binding and cpu resource grouping and assignment. - Add a reference to a struct cpuset in each thread that is inherited from the thread that created it. - Release the reference when the thread is destroyed. - Add prototypes for syscalls and macros for manipulating cpusets in sys/cpuset.h - Add syscalls to create, get, and set new numbered cpusets: cpuset(), cpuset_{get,set}id() - Add syscalls for getting and setting affinity masks for cpusets or individual threads: cpuid_{get,set}affinity() - Add types for the 'level' and 'which' parameters for the cpuset. This will permit expansion of the api to cover cpu masks for other objects identifiable with an id_t integer. For example, IRQs and Jails may be coming soon. - The root set 0 contains all valid cpus. All thread initially belong to cpuset 1. This permits migrating all threads off of certain cpus to reserve them for special applications. Sponsored by: Nokia Discussed with: arch, rwatson, brooks, davidxu, deischen Reviewed by: antoine	2008-03-02 07:39:22 +00:00
Jeff Roberson	885d51a38a	- Add a new sched_affinity() api to be used in the upcoming cpuset implementation. - Add empty implementations of sched_affinity() to 4BSD and ULE. Sponsored by: Nokia	2008-03-02 07:19:35 +00:00
Attilio Rao	7fbfba7bf8	- Handle buffer lock waiters count directly in the buffer cache instead than rely on the lockmgr support [1]: * bump the waiters only if the interlock is held * let brelvp() return the waiters count * rely on brelvp() instead than BUF_LOCKWAITERS() in order to check for the waiters number - Remove a namespace pollution introduced recently with lockmgr.h including lock.h by including lock.h directly in the consumers and making it mandatory for using lockmgr. - Modify flags accepted by lockinit(): * introduce LK_NOPROFILE which disables lock profiling for the specified lockmgr * introduce LK_QUIET which disables ktr tracing for the specified lockmgr [2] * disallow LK_SLEEPFAIL and LK_NOWAIT to be passed there so that it can only be used on a per-instance basis - Remove BUF_LOCKWAITERS() and lockwaiters() as they are no longer used This patch breaks KPI so __FreBSD_version will be bumped and manpages updated by further commits. Additively, 'struct buf' changes results in a disturbed ABI also. [2] Really, currently there is no ktr tracing in the lockmgr, but it will be added soon. [1] Submitted by: kib Tested by: pho, Andrea Barberio <insomniac at slackware dot it>	2008-03-01 19:47:50 +00:00
Konstantin Belousov	e30cf87ba1	Do not assert any locks for VOP_PRINT. In particular, do not assert that the vnode interlock is not held. vn_printf() already correctly handles locked and unlocked vnode interlocks, and all the in-tree vop_print methods are interlock-agnostic. Some code calls vprintf() with the vnode interlock held, that causes unjustified panics with INVARIANTS (ffs_syncvnode() as example). Reported by: Peter Holm	2008-02-26 12:16:35 +00:00
Attilio Rao	81c794f998	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-25 18:45:57 +00:00
Attilio Rao	628f51d275	Introduce some functions in the vnode locks namespace and in the ffs namespace in order to handle lockmgr fields in a controlled way instead than spreading all around bogus stubs: - VN_LOCK_AREC() allows lock recursion for a specified vnode - VN_LOCK_ASHARE() allows lock sharing for a specified vnode In FFS land: - BUF_AREC() allows lock recursion for a specified buffer lock - BUF_NOREC() disallows recursion for a specified buffer lock Side note: union_subr.c::unionfs_node_update() is the only other function directly handling lockmgr fields. As this is not simple to fix, it has been left behind as "sole" exception.	2008-02-24 16:38:58 +00:00
Colin Percival	491869163b	After finishing sending file data in sendfile(2), don't forget to send the provided trailers. This has been broken since revision 1.240. Submitted by: Dan Nelson PR: kern/120948 "sounds ok to me" from: phk MFC after: 3 days	2008-02-24 00:07:00 +00:00
Dag-Erling Smørgrav	60e15db992	This patch adds a new ktrace(2) record type, KTR_STRUCT, whose payload consists of the null-terminated name and the contents of any structure you wish to record. A new ktrstruct() function constructs and emits a KTR_STRUCT record. It is accompanied by convenience macros for struct stat and struct sockaddr. In kdump(1), KTR_STRUCT records are handled by a dispatcher function that runs stringent sanity checks on its contents before handing it over to individual decoding funtions for each type of structure. Currently supported structures are struct stat and struct sockaddr for the AF_INET, AF_INET6 and AF_UNIX families; support for AF_APPLETALK and AF_IPX is present but disabled, as I am unable to test it properly. Since 's' was already taken, the letter 't' is used by ktrace(1) to enable KTR_STRUCT trace points, and in kdump(1) to enable their decoding. Derived from patches by Andrew Li <andrew2.li@citi.com>. PR: kern/117836 MFC after: 3 weeks	2008-02-23 01:01:49 +00:00
Yaroslav Tykhiy	c6446de05d	Undo the damage I did in sys/kern/vfs_mount.c #1.274 and sbin/mount_nfs/mount_nfs.c #1.76. Let the dragons sleep. Requested by: rodrigc, des PR: kern/120319 (welcome the bug back)	2008-02-18 20:58:57 +00:00
Yaroslav Tykhiy	37ed722f78	Add a remark on a questionable property of vfs_mergeopts().	2008-02-18 10:10:42 +00:00
Antoine Brodin	370f990d30	Make sysctl_kern_arnd return a random buffer instead of a random long, as it is expected by userland (stack protector guard setup for example). PR: 119129 Approved by: rwatson (mentor) MFC after: 1 month	2008-02-17 16:44:48 +00:00
Kris Kennaway	e17660e79c	Switch from conditionally dropping Giant in exit1() to asserting it is not held, which appears to be always true.	2008-02-17 15:28:28 +00:00
Warner Losh	6b4d690c62	Fix typo in comment.	2008-02-17 02:46:54 +00:00
Antoine Brodin	74727f1209	Remove a superfluous line in run_interrupt_driven_config_hooks(), next_entry is already initialized during TAILQ_FOREACH_SAFE(). PR: kern/119604 Approved by: rwatson (mentor) MFC after: 1 month	2008-02-15 21:54:21 +00:00
Attilio Rao	24463dbbee	- Introduce lockmgr_args() in the lockmgr space. This function performs the same operation of lockmgr() but accepting a custom wmesg, prio and timo for the particular lock instance, overriding default values lkp->lk_wmesg, lkp->lk_prio and lkp->lk_timo. - Use lockmgr_args() in order to implement BUF_TIMELOCK() - Cleanup BUF_LOCK() - Remove LK_INTERNAL as it is nomore used in the lockmgr namespace Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-15 21:04:36 +00:00
Yaroslav Tykhiy	38a7fd05f7	In the new order of things dictated by nmount(2), a read-only mount is to be requested via a "ro" option. At the same time, MNT_RDONLY is gradually becoming an indicator of the current state of the FS instead of a command flag. Today passing MNT_RDONLY alone to the kernel's mount machinery will lead to various glitches. (See the PRs for examples.) Therefore mount the root FS with a "ro" option instead of the MNT_RDONLY flag. (Note that MNT_RDONLY still is added to the mount flags internally, by vfs_donmount(), if "ro" was specified.) To be able to pass "ro" cleanly to kernel_vmount(), teach the latter function to accept options with NULL values. Also correct the comment explaining how mount_arg() handles length of -1. PR: bin/106636 kern/120319 Submitted by: Jaakko Heinonen <see PR kern/120319 for email> (originally)	2008-02-14 17:04:31 +00:00
Simon L. B. Nielsen	1b7089994c	Fix sendfile(2) write-only file permission bypass. Security: FreeBSD-SA-08:03.sendfile Submitted by: kib	2008-02-14 11:44:31 +00:00
John Baldwin	ad69e26b69	Add KASSERT()'s to catch attempts to recurse on spin mutexes that aren't marked recursable either via mtx_lock_spin() or thread_lock(). MFC after: 1 week	2008-02-13 23:39:05 +00:00

1 2 3 4 5 ...

10336 Commits