Commit Graph

10450 Commits

David Xu
3bba58f287 Fix a compilation problem. 2008-04-29 05:48:05 +00:00
David Xu
727158f6f6 Introduce commands UMTX_OP_WAIT_UINT_PRIVATE and UMTX_OP_WAKE_PRIVATE
to allow userland to specify that an address is not shared by multiple
processes.
2008-04-29 03:48:48 +00:00
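A minimal userland sketch of the private wait/wake pair introduced above, assuming the _umtx_op(2) prototype from <sys/umtx.h>; the variable and helper names are illustrative, not from the commit:

```c
#include <sys/types.h>
#include <sys/umtx.h>
#include <limits.h>
#include <stddef.h>

static u_int flag;	/* process-private word, never shared across processes */

static void
wait_for_flag(void)
{
	/* Sleep only while the word still holds the expected value 0. */
	while (flag == 0)
		_umtx_op(&flag, UMTX_OP_WAIT_UINT_PRIVATE, 0, NULL, NULL);
}

static void
set_and_wake(void)
{
	flag = 1;
	/* Wake every thread of this process blocked on the private word. */
	_umtx_op(&flag, UMTX_OP_WAKE_PRIVATE, INT_MAX, NULL, NULL);
}
```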
Robert Watson
ae11a989e6 When writing trailers in sendfile(2), don't call kern_writev()
while holding the socket buffer lock.  This leads to an
immediate panic due to recursing the socket buffer lock.  This
bug was introduced in uipc_syscalls.c:1.240, but masked by
another bug until that was fixed in uipc_syscalls.c:1.269.

Note that the current fix isn't perfect, but better than
panicking: normally we guarantee that simultaneous invocations
of a system call to write on a stream socket won't be
interlaced, which is ensured by use of the socket buffer sleep
lock.  This is guaranteed for the sendfile headers, but not
trailers.  In practice, this is likely not a problem, but
should be fixed.

MFC after:	3 days
Pointy hat to:	andre (1.240), cperciva (1.269)
2008-04-27 15:50:00 +00:00
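For context, the trailer path in question is the second half of an sf_hdtr. A hedged userland sketch of the kind of call that exercises it (error handling elided, values illustrative):

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send a file followed by a short trailer, e.g. a chunked-encoding epilogue. */
static int
send_with_trailer(int filefd, int sock, off_t filelen)
{
	struct iovec trl;
	struct sf_hdtr hdtr;
	off_t sbytes;

	trl.iov_base = "\r\n0\r\n\r\n";
	trl.iov_len = 7;
	hdtr.headers = NULL;
	hdtr.hdr_cnt = 0;
	hdtr.trailers = &trl;	/* written via kern_writev() after the file data */
	hdtr.trl_cnt = 1;

	return (sendfile(filefd, sock, 0, filelen, &hdtr, &sbytes, 0));
}
```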
Kris Kennaway
5894445dad * Correct a mis-merge that leaked the PROC_LOCK [1]
* Return ENOENT on error instead of 0 [2]

Submitted by: rdivacky [1], kib [2]
2008-04-26 13:16:55 +00:00
Pawel Jakub Dawidek
3800322fe2 Implement a 'show mount' command in DDB. Without an argument, it prints short
info about all currently mounted file systems. When an address is given
as an argument, it prints detailed info about the given mount point.

MFC after:	2 weeks
2008-04-26 13:04:48 +00:00
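A hedged sketch of how such a DDB "show" command is wired up with the DB_SHOW_COMMAND macro; the command name, function name, and output format below are illustrative, not the committed code:

```c
#include <sys/param.h>
#include <sys/mount.h>
#include <ddb/ddb.h>

DB_SHOW_COMMAND(mount_example, db_show_mount_example)
{
	struct mount *mp;

	if (!have_addr) {
		/* No argument: one summary line per mounted file system. */
		TAILQ_FOREACH(mp, &mountlist, mnt_list)
			db_printf("%p %s on %s (%s)\n", mp,
			    mp->mnt_stat.f_mntfromname,
			    mp->mnt_stat.f_mntonname,
			    mp->mnt_stat.f_fstypename);
		return;
	}
	/* Address given: detailed dump of that mount point. */
	mp = (struct mount *)addr;
	db_printf("mount %p: vnodes %d, flags 0x%jx\n", mp,
	    mp->mnt_nvnodelistsize, (uintmax_t)mp->mnt_flag);
}
```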
Jeff Roberson
6c47aaae12 - Add an integer argument to idle to indicate how likely we are to wake
from idle over the next tick.
 - Add a new MD routine, cpu_wake_idle() to wakeup idle threads who are
   suspended in cpu specific states.  This function can fail and cause the
   scheduler to fall back to another mechanism (ipi).
 - Implement support for mwait in cpu_idle() on i386/amd64 machines that
   support it.  mwait is a higher performance way to synchronize cpus
   as compared to hlt & ipis.
 - Allow selecting the idle routine by name via sysctl machdep.idle.  This
   replaces machdep.cpu_idle_hlt.  Only idle routines supported by the
   current machine are permitted.

Sponsored by:	Nokia
2008-04-25 05:18:50 +00:00
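A hedged sketch of the mwait idea on i386/amd64 (the function names here are illustrative; the real hooks live behind cpu_idle() and are selected at runtime via the new machdep.idle sysctl): MONITOR arms the CPU to watch a cache line, and MWAIT then stops the core until that line is written or an interrupt arrives, so a peer CPU can wake an idle one with a store instead of an IPI.

```c
#include <sys/types.h>

static u_int idle_state;	/* the scheduler writes here to wake this CPU */

static __inline void
monitor_example(const void *addr, u_long extensions, u_int hints)
{
	__asm __volatile("monitor" : : "a" (addr), "c" (extensions), "d" (hints));
}

static __inline void
mwait_example(u_long extensions, u_int hints)
{
	__asm __volatile("mwait" : : "a" (hints), "c" (extensions));
}

static void
cpu_idle_mwait_example(void)
{
	monitor_example(&idle_state, 0, 0);
	if (idle_state == 0)		/* re-check after arming the monitor */
		mwait_example(0, 0);
}
```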
Kris Kennaway
b1ba81d948 fdhold can return NULL, so add the one remaining missing check for this
condition.

Reviewed by:    attilio
MFC after:      1 week
2008-04-24 22:08:36 +00:00
Konstantin Belousov
12e79a9bbc Allow the vnode zone to return unused memory. The vnode reference
count is (and shall remain) properly maintained for the vnode's whole
lifetime, and VFS shall be safe against vnode memory reclamation.

Proposed by:	jeff
Tested by:	pho
2008-04-24 09:58:33 +00:00
Poul-Henning Kamp
9b4a8ab7ba Now that all platforms use genclock, shuffle things around slightly
for better structure.

Much of this is related to <sys/clock.h>, which should really have
been called <sys/calendar.h>, but unless and until we need the name,
the repocopy can wait.

In general the kernel does not know about minutes, hours, days,
timezones, daylight savings time, leap-years and such.  All that
is theoretically a matter for userland only.

Parts of the kernel code do, however, care: badly designed filesystems
store timestamps in local time, and RTC chips almost universally
track time in a YY-MM-DD HH:MM:SS format, sometimes in the local
timezone instead of UTC.  For this we have <sys/clock.h>.

<sys/time.h> on the other hand, deals with time_t, timeval, timespec
and so on.  These know only seconds and fractions thereof.

Move inittodr() and resettodr() prototypes to <sys/time.h>.
Retain the names, as they are among the few surviving PDP/VAX references.

Move startrtclock() to <machine/clock.h> on the relevant platforms; it
is an MD call between machdep.c/clock.c.  Remove references to it
elsewhere.

Remove a lot of unnecessary <sys/clock.h> includes.

Move the machdep.disable_rtc_set sysctl to subr_rtc.c where it belongs.
XXX: should be kern.disable_rtc_set really, it's not MD.
2008-04-22 19:38:30 +00:00
Pawel Jakub Dawidek
d90d4eb28c Back out the previous revision. For now I can use the _ddb() variants of the stack(9) KPI,
as I use it for debugging only. Once someone needs it for more production
features, the change should be reconsidered.

Requested by:	rwatson
2008-04-21 17:22:35 +00:00
Robert Watson
8501a69cc9 Convert pcbinfo and inpcb mutexes to rwlocks, and modify macros to
explicitly select write locking for all use of the inpcb mutex.
Update some pcbinfo lock assertions to assert locked rather than
write-locked, although in practice almost all uses of the pcbinfo
rwlock main exclusive, and all instances of inpcb lock acquisition
are exclusive.

This change should introduce (ideally) little functional change.
However, it lays the groundwork for significantly increased
parallelism in the TCP/IP code.

MFC after:	3 months
Tested by:	kris (superset of committed patch)
2008-04-17 21:38:18 +00:00
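A hedged sketch of the renamed idiom after the conversion (the function and parameters below are illustrative): the macros now spell out read vs. write, even though most pcbinfo acquisitions and all inpcb acquisitions stay exclusive for now.

```c
#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/rwlock.h>
#include <netinet/in.h>
#include <netinet/in_pcb.h>

static void
drop_connection_example(struct inpcbinfo *pcbinfo, struct inpcb *inp)
{
	/* Global lists/hashes: the pcbinfo rwlock, still taken exclusively. */
	INP_INFO_WLOCK(pcbinfo);
	/* Per-connection state: the inpcb lock is now explicitly a write lock. */
	INP_WLOCK(inp);
	/* ... tear the connection down ... */
	INP_WUNLOCK(inp);
	INP_INFO_WUNLOCK(pcbinfo);
}
```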
Pawel Jakub Dawidek
f55f27f862 Allow linker_search_symbol_name() to be called with the KLD lock held.
The linker_search_symbol_name() function is used by stack_print(),
and stack_print() can be called from a kernel module's unload method.

MFC after:	1 week
2008-04-17 19:19:40 +00:00
Jeff Roberson
1690c6c1be - Add a metric to describe how busy a processor has been over the last
two ticks by counting the number of switches and the load when
   sched_clock() is called.
 - If the busy metric exceeds a threshold allow the idle thread to spin
   waiting for new work for a brief period to avoid using IPIs.  This
   reduces the cost on the sender and receiver as well as reducing wakeup
   latency considerably when it works.

Sponsored by:	Nokia
2008-04-17 09:56:01 +00:00
Jeff Roberson
8df78c41d6 - Make SCHED_STATS more generic by adding a wrapper to create the
variables and sysctl nodes.
 - In the reset handler, walk the children of kern_sched_stats and reset the counters
   via the oid_arg1 pointer.  This allows us to add arbitrary counters to
   the tree and still reset them properly.
 - Define a set of switch types to be passed with flags to mi_switch().
   These types are named SWT_*.  These types correspond to SCHED_STATS
   counters and are automatically handled in this way.
 - Make the new SWT_ types more specific than the older switch stats.
   There are now stats for idle switches, remote idle wakeups, remote
   preemption, ithreads idling, etc.
 - Add switch statistics for ULE's pickcpu algorithm.  These stats include
   how much migration there is, how often affinity was successful, how
   often threads were migrated to the local cpu on wakeup, etc.

Sponsored by:	Nokia
2008-04-17 04:20:10 +00:00
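A hedged sketch of the new calling convention: the switch reason is passed alongside the voluntary/involuntary flag, and the SWT_* value doubles as the index of the SCHED_STATS counter that gets bumped. SWT_RELINQUISH is used here as one example of the new types; the helper name is illustrative.

```c
#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/proc.h>

/* Voluntarily give up the CPU and record why we switched. */
static void
yield_example(struct thread *td)
{
	thread_lock(td);
	mi_switch(SW_VOL | SWT_RELINQUISH, NULL);
	thread_unlock(td);
}
```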
Doug Rabson
a365ea5fba Fix compilation with LOCKF_DEBUG. 2008-04-16 14:08:12 +00:00
Konstantin Belousov
eab626f110 Move the head of byte-level advisory lock list from the
filesystem-specific vnode data to the struct vnode. Provide the
default implementation for the vop_advlock and vop_advlockasync.
Purge the locks on the vnode reclaim by using the lf_purgelocks().
The default implementation is augmented for the nfs and smbfs.
In the nfs_advlock, push the Giant inside the nfs_dolock.

Before the change, the vop_advlock and vop_advlockasync have taken the
unlocked vnode and dereferenced the fs-private inode data, racing with
the vnode reclamation due to forced unmount. Now, the vop_getattr
under the shared vnode lock is used to obtain the inode size, and
later, in the lf_advlockasync, after locking the vnode interlock, the
VI_DOOMED flag is checked to prevent an operation on the doomed vnode.

The implementation of the lf_purgelocks() is submitted by dfr.

Reported by:	kris
Tested by:	kris, pho
Discussed with:	jeff, dfr
MFC after:	2 weeks
2008-04-16 11:33:32 +00:00
David Xu
d61f3de656 Implement POSIX function tcgetsid() which returns session id.
PR: stand/107561
2008-04-15 08:33:32 +00:00
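A small userland usage sketch of the new call:

```c
#include <sys/types.h>
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

int
main(void)
{
	pid_t sid;

	/* Session id of the session that owns the controlling terminal. */
	sid = tcgetsid(STDIN_FILENO);
	if (sid == (pid_t)-1) {
		perror("tcgetsid");
		return (1);
	}
	printf("controlling terminal belongs to session %ld\n", (long)sid);
	return (0);
}
```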
Marcel Moolenaar
495168ba8d Support and switch to the ULE scheduler:
o  Implement IPI_PREEMPT,
o  Set td_lock for the thread being switched out,
o  For ULE & SMP, loop while td_lock points to blocked_lock for
   the thread being switched in,
o  Enable ULE by default in GENERIC and SKI,
2008-04-15 05:02:42 +00:00
Randall Stewart
cf71e4381a Add a pru_flush routine so a transport can
flush itself during shutdown

MFC after:	1 week
2008-04-14 18:06:04 +00:00
Alan Cox
e384d8a89b Initialize the vm object's flags to include OBJ_NOSPLIT, just like the
vm objects that are used by System V shared memory segments.
2008-04-13 21:08:34 +00:00
Attilio Rao
22dd228d5d Use a "rel" memory barrier for disowning the lock, as it comes from an
exclusive locking operation.
2008-04-13 01:21:56 +00:00
Attilio Rao
0b0100db88 struct lock_instance and struct lock_list_entry don't need to be in the
public namespace for WITNESS, as they are only used internally, so
move them (together with all related supporting definitions) into the
subsystem's private namespace.
2008-04-13 01:20:47 +00:00
Poul-Henning Kamp
8d24f82310 fix printf type confusion on amd64 2008-04-12 21:51:54 +00:00
Poul-Henning Kamp
c9ad6040dd Emit summaries of struct c(alendar)t(ime) <-> struct timespec conversions
under bootverbose.

Struct ct is used for setting/reading real time clocks and I'm about
to Do Things to some of those, so a bit of preemptive debugging is
in order.

Remove a pointless __inline.
2008-04-12 20:35:56 +00:00
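A hedged sketch of the conversion pair being instrumented, assuming the clock_ct_to_ts()/clock_ts_to_ct() helpers from <sys/clock.h>: an RTC driver turns its YY-MM-DD HH:MM:SS reading into the timespec the rest of the kernel works with, and back again when setting the chip. The function names below are illustrative.

```c
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/time.h>
#include <sys/clock.h>

/* RTC read path: calendar fields -> timespec (returns EINVAL on bad fields). */
static int
rtc_fields_to_ts_example(struct clocktime *ct, struct timespec *ts)
{
	return (clock_ct_to_ts(ct, ts));
}

/* RTC write path: timespec -> calendar fields for the chip's registers. */
static void
ts_to_rtc_fields_example(struct timespec *ts, struct clocktime *ct)
{
	clock_ts_to_ct(ts, ct);
	if (bootverbose)
		printf("ct: %04d-%02d-%02d %02d:%02d:%02d\n",
		    ct->year, ct->mon, ct->day, ct->hour, ct->min, ct->sec);
}
```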
Attilio Rao
e5f94314ad - Re-introduce WITNESS support for lockmgr.  The only difference from the
  old implementation is that the lockmgr*() functions now accept an
  LK_NOWITNESS flag which skips order checking for that particular call.
- Remove a useless stub in witness_checkorder() (the check above means it
  can never be reached) and allow witness_upgrade() to accept non-try
  operations too.
2008-04-12 19:57:30 +00:00
Attilio Rao
872b7289fd - Remove a stale comment.
- Add an extra assertion in order to catch malformed requested operations.
2008-04-12 13:56:17 +00:00
Attilio Rao
1859cffaef Add missing stubs for spinlocks cpuset and intrcnt.
Submitted by:	kris
2008-04-12 13:51:18 +00:00
Xin LI
31c50f53da Instead of rolling our own jail number allocation procedure, use
alloc_unr() to do it.

Submitted by:	Ed Schouten <ed 80386 nl>
PR:		kern/122270
MFC after:	1 month
2008-04-11 21:31:15 +00:00
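A hedged sketch of the unr(9) facility being adopted; the names and id bounds below are illustrative, not the committed jail code:

```c
#include <sys/param.h>
#include <sys/systm.h>

static struct unrhdr *jail_ids;

static void
jail_ids_init(void)
{
	/* Allocator for ids 1..65535; NULL means use an internal mutex. */
	jail_ids = new_unrhdr(1, 65535, NULL);
}

static int
jail_id_alloc(void)
{
	return (alloc_unr(jail_ids));	/* lowest free id, or -1 if exhausted */
}

static void
jail_id_free(int id)
{
	free_unr(jail_ids, id);
}
```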
John Baldwin
03c7442d75 Use kthread_exit() to terminate a taskqueue thread rather than kproc_exit()
now that the taskqueue threads are kthreads rather than kprocs.

Reported by:	kris
2008-04-11 17:35:54 +00:00
Jeff Roberson
9b33b154b5 - Add the interrupt vector number to intr_event_create so MI code can
look up hard interrupt events by number.  Ignore the irq# for soft intrs.
 - Add support to cpuset for binding hardware interrupts.  This has the
   side effect of binding any ithread associated with the hard interrupt.
   As per restrictions imposed by MD code we can only bind interrupts to
   a single cpu presently.  Interrupts can be 'unbound' by binding them
   to all cpus.

Reviewed by:	jhb
Sponsored by:	Nokia
2008-04-11 03:26:41 +00:00
Pawel Jakub Dawidek
b03d720760 - Use LK_TYPE_MASK where needed. Actually after sys/sys/lockmgr.h:1.69 it is
no longer needed, but for now we still want to be consistent with other
  similar checks in the tree.
- Call ASSERT_VOP_ELOCKED() only when vget() returns 0.

Reviewed by:	jeff
2008-04-09 20:19:55 +00:00
Sam Leffler
6c6eaea6dd Do image loading in a context known to have a root directory:
o create a private task queue thread that sets up root and current
  directories (hooking mountroot event as needed); this is necessary
  because task queue threads are parented from proc0 and it does not
  have a reference to rootvnode (lost when / mounting moved to init)
o bounce image load + unload requests through the private task q so
  we can load images even when the request is made from a thread that
  does not have sufficient context (e.g. task q thread)
o add a check in the task q thread to fail requests before root is
  mounted (just in case)

Reviewed by:	jhb, mlaier, luigi (glance)
MFC after:	1 month
2008-04-09 19:07:48 +00:00
Sam Leffler
00c71fb7c3 o add a mountroot event handler that fires when / is mounted; this information
was lost when root started being mounted by init
o remove SI_SUB_MOUNT_ROOT since it's no longer meaningful

MFC after:	2 weeks
2008-04-08 17:53:33 +00:00
Sam Leffler
175611b668 Change taskqueue_start_threads to create threads instead of procs.
Reviewed by:	jhb
2008-04-08 17:48:02 +00:00
Konstantin Belousov
48b05c3f82 Implement the linux syscalls
openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat,
    renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat.

Submitted by:	rdivacky
Sponsored by:	Google Summer of Code 2007
Tested by:	pho
2008-04-08 09:45:49 +00:00
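The semantics follow the usual *at pattern: paths are resolved relative to an already-open directory descriptor rather than the current directory, which closes the classic lookup races. A hedged userland illustration, written the way a Linux binary running under the emulation layer would use it (helper name illustrative; older glibc may need _ATFILE_SOURCE):

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Create "name" inside dirpath without re-walking the full path. */
static int
create_in_dir(const char *dirpath, const char *name)
{
	int dfd, fd;

	dfd = open(dirpath, O_RDONLY);
	if (dfd == -1)
		return (-1);
	fd = openat(dfd, name, O_CREAT | O_WRONLY | O_EXCL, 0644);
	close(dfd);
	return (fd);
}
```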
Attilio Rao
e0f62984c1 - Use a different encoding for lockmgr options: encode them as individual
  bits in order to allow per-bit checks on the options flag, in particular
  in the consumer code [1]
- Re-enable the check against TDP_DEADLKTREAT as the anti-waiters
  starvation patch allows exclusive waiters to override new shared
  requests.

[1] Requested by:	pjd, jeff
2008-04-07 14:46:38 +00:00
Don Lewis
8a3724388b vfs_syscalls.c 1.452 mistakenly swapped the behavior of chown() and lchown(). 2008-04-07 00:29:32 +00:00
Attilio Rao
047dd67e96 Optimize lockmgr in order to get rid of the pool mutex interlock, of the
state transitioning flags and of the msleep(9) calls.
Use, instead, an algorithm very similar to what sx(9) and rwlock(9)
already do, with direct access to the sleepqueue(9) primitive.

In order to avoid writer starvation a mechanism very similar to what
rwlock(9) now uses is implemented, with the corresponding per-thread
shared lockmgr counter.

This patch also adds 2 new functions to the lockmgr KPI: lockmgr_rw() and
lockmgr_args_rw().  These two are like the 2 "normal" versions, but they
both accept a rwlock as the interlock.  In order to realize this, the
general lockmgr manager function "__lockmgr_args()" has been implemented
through the generic lock layer.  It supports all the blocking primitives,
but currently only these 2 mappers live.

The patch drops the support for WITNESS for the moment, but it will
probably be added back soon.  Also, there is a little race in the draining
code which is also present in the current CVS stock implementation: if
some sharers, once they wake up, are on the runqueue they can contend the
lock with the exclusive drainer.  This is hard to fix, but the now-committed
code mitigates this issue a lot better than the (past) CVS version.
In addition, the KA_HELD and KA_UNHELD assertions have been made no-ops
because they are dangerous and will no longer be supported soon.

In order to avoid namespace pollution, stack.h is split into two
parts: one which includes only the "struct stack" definition (_stack.h)
and one defining the KPI.  In this way, the newly added _lockmgr.h can
just include _stack.h.

The kernel ABI is heavily changed by this commit (the now-committed
version of "struct lock" is a lot smaller than the previous one) and the
KPI is broken by the lockmgr_rw() / lockmgr_args_rw() introduction,
so manpages and __FreeBSD_version will be updated accordingly.

Tested by:      kris, pho, jeff, danger
Reviewed by:    jeff
Sponsored by:   Google, Summer of Code program 2007
2008-04-06 20:08:51 +00:00
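A hedged sketch of the new interlock variants described above: same contract as lockmgr(9), but the interlock handed in (and dropped by lockmgr before it returns) is an rwlock rather than a mutex. The lock names and the lookup step are illustrative.

```c
#include <sys/param.h>
#include <sys/lock.h>
#include <sys/lockmgr.h>
#include <sys/rwlock.h>

static struct lock	obj_lock;	/* lockinit()ed at attach time */
static struct rwlock	obj_list_lock;	/* protects the lookup structure */

static void
lock_object_example(void)
{
	rw_wlock(&obj_list_lock);
	/* ... look the object up under the rwlock ... */
	/* LK_INTERLOCK: lockmgr drops the rwlock for us before returning. */
	lockmgr_rw(&obj_lock, LK_EXCLUSIVE | LK_INTERLOCK, &obj_list_lock);
	/* ... use the object ... */
	lockmgr(&obj_lock, LK_RELEASE, NULL);
}
```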
Jeff Roberson
ce62b59c88 - Correct a major error introduced in the per-cpu timeout commit. Sleep
and wakeup require the same wait channel to function properly.

Found by:	kris
Pointy hat:	me
2008-04-06 11:08:49 +00:00
John Baldwin
8aa9e82e67 Move INTR_FILTER from opt_global.h to its own header. 2008-04-05 20:13:15 +00:00
John Baldwin
1ee1b68792 Add a MI intr_event_handle() routine for the non-INTR_FILTER case. This
allows all the INTR_FILTER #ifdef's to be removed from the MD interrupt
code.
- Rename the intr_event 'eoi', 'disable', and 'enable' hooks to
  'post_filter', 'pre_ithread', and 'post_ithread' to be less x86-centric.
  Also, add a comment describing what the MI code expects them to do.
- On amd64, i386, and powerpc this is effectively a NOP.
- On arm, don't bother masking the interrupt unless the ithread is
  scheduled in the non-INTR_FILTER case to match what INTR_FILTER did.
  Also, don't bother unmasking the interrupt in the post_filter case if
  we never masked it.  The INTR_FILTER case had been doing this by having
  arm_unmask_irq for the post_filter (formerly 'eoi') hook.
- On ia64, stray interrupts are now masked for the non-INTR_FILTER case.
  They were already masked in the INTR_FILTER case.
- On sparc64, use a NULL pre_ithread hook and use intr_enable_eoi() for
  both the 'post_filter' and 'post_ithread' hooks to match what the
  non-INTR_FILTER code did.
- On sun4v, retire the ithread wrapper hack by using an appropriate
  'post_ithread' hook instead (it's what 'post_ithread'/'enable' was
  designed to do even in 5.x).

Glanced at by:	piso
Reviewed by:	marius
Requested by:	marius [1], [5]
Tested on:	amd64, i386, arm, sparc64
2008-04-05 19:58:30 +00:00
Alan Cox
7630c26507 Reintroduce UMA_SLAB_KMAP; however, change its spelling to
UMA_SLAB_KERNEL for consistency with its sibling UMA_SLAB_KMEM.
(UMA_SLAB_KMAP met its original demise in revision 1.30 of
vm/uma_core.c.)  UMA_SLAB_KERNEL is now required by the jumbo frame
allocators.  Without it, UMA cannot correctly return pages from the
jumbo frame zones to the VM system because it resets the pages' object
field to NULL instead of the kernel object.  In more detail, the jumbo
frame zones are created with the option UMA_ZONE_REFCNT.  This causes
UMA to overwrite the pages' object field with the address of the slab.
However, when UMA wants to release these pages, it doesn't know how to
restore the object field, so it sets it to NULL.  This change teaches
UMA how to reset the object field to the kernel object.

Crashes reported by: kris
Fix tested by: kris
Fix discussed with: jeff
MFC after: 6 weeks
2008-04-04 18:41:12 +00:00
Jeff Roberson
00ca09449d - Add sysctls at debug.rwlock to control the behavior of the speculative
spinning when readers hold a lock.  This spinning is speculative because,
   unlike the write case, we can not test whether the owners are running.
 - Add speculative read spinning for readers who are blocked by pending
   writers while a read lock is still held.  This allows the thread to
   spin until the write lock succeeds, after which it may spin until the
   writer has released the lock.  This prevents excessive context switches
   when readers and writers both hold the lock for brief periods.

Sponsored by:	Nokia
2008-04-04 10:00:46 +00:00
Jeff Roberson
3bc8c68d9f - Add a Nokia copyright to cpuset to reflect their generous
contribution to this work.
2008-04-04 01:22:04 +00:00
Jeff Roberson
0502fe2e43 - Allow static_boost to specify no boost with '0', the traditional fixed
   kernel priority boost with '1', or, with any value greater than two, a
   boost to that priority (used only when it is less than the current
   thread's priority).  Default the boost to PRI_MIN_TIMESHARE to prevent
   regular user-space threads from starving threads in the kernel.  This
   also prevents these user threads from being scheduled as if they were
   high fixed-priority kernel threads.
 - Restore the setting of lowpri in tdq_choose().  It has to be either here
   or in sched_switch().  I accidentally removed it from both places.

Tested by:	kris
2008-04-04 01:16:18 +00:00
Jeff Roberson
03d17db7d5 - Don't check for the ITHD pri class in tdq_load_add and rem. 4BSD doesn't
do this either.  Simply check P_NOLOAD.  It'd be nice if this was
   in a thread flag so we didn't have an extra cache miss every time we
   add and remove a thread from the run-queue.
2008-04-04 01:04:43 +00:00
Jeff Roberson
e4b1aa6210 - Fix a mis-merge that crept in during the softclock changes.
Spotted by:	jhb
2008-04-04 01:03:23 +00:00
David Xu
44253336b6 Let umtxq_busy() only spin on MP machines.  Rename the function
to do_rwlock_unlock to be consistent with the others.
2008-04-03 11:49:20 +00:00
Jeff Roberson
e8245292a7 - Convert two timeout users to the new callout_reset_curcpu() api.
Sponsored by:	Nokia
2008-04-02 11:21:42 +00:00
Jeff Roberson
8d809d5061 Implement per-cpu callout threads, wheels, and locks.
- Move callout thread creation from kern_intr.c to kern_timeout.c
 - Call callout_tick() on every processor via hardclock_cpu() rather than
   inspecting callout internal details in kern_clock.c.
 - Remove callout implementation details from callout.h
 - Package up all of the global variables into a per-cpu callout structure.
 - Start one thread per-cpu.  Threads are not strictly bound.  They prefer
   to execute on the native cpu but may migrate temporarily if interrupts
   are starving callout processing.
 - Run all callouts by default in the thread for cpu0 to maintain current
   ordering and concurrency guarantees.  Many consumers may not properly
   handle concurrent execution.
 - The new callout_reset_on() api allows specifying a particular cpu to
   execute the callout on.  This may migrate a callout to a new cpu.
   callout_reset() schedules on the last assigned cpu while
   callout_reset_curcpu() schedules on the current cpu.

Reviewed by:	phk
Sponsored by:	Nokia
2008-04-02 11:20:30 +00:00
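A hedged usage sketch of the new API surface described above (callout and function names illustrative): callout_reset_on() pins execution to an explicit CPU, callout_reset_curcpu() to the caller's CPU, and plain callout_reset() keeps the last assigned one.

```c
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/callout.h>

static struct callout tick_co;

static void
tick_fn(void *arg)
{
	/* Re-arm on whichever CPU is running the callout right now. */
	callout_reset_curcpu(&tick_co, hz, tick_fn, arg);
}

static void
tick_start(void)
{
	callout_init(&tick_co, CALLOUT_MPSAFE);
	/* First shot explicitly on CPU 0, the wheel with the old guarantees. */
	callout_reset_on(&tick_co, hz, tick_fn, NULL, 0);
}
```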