freebsd-dev/sys/kern
Alexander Motin 40ea77a036 Merge GEOM direct dispatch changes from the projects/camlock branch.
When safety requirements are met, it allows to avoid passing I/O requests
to GEOM g_up/g_down thread, executing them directly in the caller context.
That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid
several context switches per I/O.

The defined now safety requirements are:
 - caller should not hold any locks and should be reenterable;
 - callee should not depend on GEOM dual-threaded concurency semantics;
 - on the way down, if request is unmapped while callee doesn't support it,
   the context should be sleepable;
 - kernel thread stack usage should be below 50%.

To keep compatibility with GEOM classes not meeting above requirements
new provider and consumer flags added:
 - G_CF_DIRECT_SEND -- consumer code meets caller requirements (request);
 - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done);
 - G_PF_DIRECT_SEND -- provider code meets caller requirements (done);
 - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request).
Capable GEOM class can set them, allowing direct dispatch in cases where
it is safe.  If any of requirements are not met, request is queued to
g_up or g_down thread same as before.

Such GEOM classes were reviewed and updated to support direct dispatch:
CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE,
VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL,
MAP, FLASHMAP, etc).

To declare direct completion capability disk(9) KPI got new flag equivalent
to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION.  da(4) and ada(4) disk
drivers got it set now thanks to earlier CAM locking work.

This change more then twice increases peak block storage performance on
systems with manu CPUs, together with earlier CAM locking changes reaching
more then 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to
256 user-level threads).

Sponsored by:	iXsystems, Inc.
MFC after:	2 months
2013-10-22 08:22:19 +00:00
..
bus_if.m Add a BUS_CHILD_DELETED() method that a bus can hook to allow it to cleanup 2012-08-21 18:13:09 +00:00
capabilities.conf Sort properly. 2013-09-07 19:16:02 +00:00
clock_if.m
cpufreq_if.m
device_if.m Revert r239178 and implement two new functions, namely 2012-08-15 15:42:57 +00:00
dtio_kdtrace.c Change the module name for the I/O provider to "kernel" from 2012-09-25 19:16:28 +00:00
genassym.sh
imgact_aout.c Cosmetics: define FREEBSD32_MINUSER and AOUT32_MINUSER for struct 2012-07-22 13:41:45 +00:00
imgact_elf32.c
imgact_elf64.c
imgact_elf.c Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use 2013-09-09 18:11:59 +00:00
imgact_gzip.c Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use 2013-09-09 18:11:59 +00:00
imgact_shell.c
inflate.c
init_main.c Debugging. My attempt at EVENTHANDLER(multiuser) was a failure; use EVENTHANDLER(mountroot) instead. 2013-10-08 06:54:52 +00:00
init_sysent.c Regen. 2013-09-19 18:56:00 +00:00
kern_acct.c acct: create a special plimit object and set it for exiting processes 2013-06-30 19:08:06 +00:00
kern_alq.c The fix committed in r250951 replaced the reported panic with a deadlock... gold 2013-06-17 09:49:07 +00:00
kern_clock.c Correct a bug that prevented deadlkres from (almost) ever firing. 2013-06-28 15:55:30 +00:00
kern_clocksource.c - Make callout(9) tickless, relying on eventtimers(4) as backend for 2013-03-04 11:09:56 +00:00
kern_condvar.c Fix lc_lock/lc_unlock() support for rmlocks held in shared mode. With 2013-09-20 23:06:21 +00:00
kern_conf.c Reject spaces and double quotation marks in device names. devctl(4) 2012-12-22 13:33:28 +00:00
kern_cons.c cngetc: use cpu_spinwait to ease the cncheckc loop a tiny bit 2012-10-06 19:50:23 +00:00
kern_context.c
kern_cpu.c Revert r175376 and tune cpufreq(4) frequency comparison logic instead. 2012-03-10 18:56:16 +00:00
kern_cpuset.c Several improvements to rmlock(9). Many of these are based on patches 2013-06-25 18:44:15 +00:00
kern_ctf.c Remove the support for using non-mpsafe filesystem modules. 2012-10-22 17:50:54 +00:00
kern_descrip.c When growing the file descriptor table, new larger memory chunk is 2013-10-09 18:41:35 +00:00
kern_dtrace.c Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs. 2011-11-07 06:44:47 +00:00
kern_environment.c r249408 and r249436 cause a NULL pointer dereference on the CUBIEBOARD 2013-04-16 22:09:08 +00:00
kern_et.c Fix incorrect assertion that caused panic when periodic-only timers used. 2013-03-13 06:42:01 +00:00
kern_event.c Add a resource limit for the total number of kqueues available to the 2013-10-21 16:44:53 +00:00
kern_exec.c Add a sysctl kern.disallow_high_osrel which disables executing the 2013-10-15 06:38:40 +00:00
kern_exit.c Specify SDT probe argument types in the probe definition itself rather than 2013-08-15 04:08:55 +00:00
kern_fail.c Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs. 2011-11-07 06:44:47 +00:00
kern_ffclock.c Revise the sysctl handling code and restructure the hierarchy of sysctls 2011-12-01 07:19:13 +00:00
kern_fork.c Extend the support for exempting processes from being killed when swap is 2013-09-19 18:53:42 +00:00
kern_gzio.c Remove the support for using non-mpsafe filesystem modules. 2012-10-22 17:50:54 +00:00
kern_hhook.c Move hhook's per-vnet initialisation to an earlier SYSINIT SI_SUB stage to 2013-06-15 10:08:34 +00:00
kern_idle.c
kern_intr.c Snapshot. This passes the build test, but has not yet been finished or debugged. 2013-10-04 06:55:06 +00:00
kern_jail.c Keep PRIV_KMEM_READ permitted inside jails as it is on the outside. 2013-09-06 17:32:29 +00:00
kern_khelp.c Cleanup and simplification in khelp_{register|deregister}_helper(). No 2013-06-15 06:45:17 +00:00
kern_kthread.c Do not use potentially stale thread in kthread_add() 2013-08-17 17:02:43 +00:00
kern_ktr.c ktr: correctly handle possible wrap-around in the boot buffer 2013-02-08 07:29:07 +00:00
kern_ktrace.c Fix panic in ktrcapfail() when no capability rights are passed. 2013-09-18 19:26:08 +00:00
kern_linker.c Rename the kld_unload event handler to kld_unload_try, and add a new 2013-08-24 21:13:38 +00:00
kern_lock.c Add LK_TRYUPGRADE operation for lockmgr(9), which attempts to 2013-09-29 18:02:23 +00:00
kern_lockf.c Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs. 2011-11-07 06:44:47 +00:00
kern_lockstat.c
kern_loginclass.c
kern_malloc.c Tidy up kmeminit(): Since r245575, 'nmbclusters' is calculated after 2013-10-05 18:53:03 +00:00
kern_mbuf.c Ignore attempts to set the nmbcluster sysctls to their current value 2013-10-10 16:11:34 +00:00
kern_mib.c fix some fat-fingering in r246246 2013-02-02 14:19:50 +00:00
kern_module.c Fix a typo. 2012-08-22 20:01:57 +00:00
kern_mtxpool.c
kern_mutex.c Fix lc_lock/lc_unlock() support for rmlocks held in shared mode. With 2013-09-20 23:06:21 +00:00
kern_ntptime.c rename scheduler->swapper and SI_SUB_RUN_SCHEDULER->SI_SUB_LAST 2013-07-24 09:45:31 +00:00
kern_osd.c
kern_physio.c Fix some issues in change 254760 pointed out by Bruce Evans: 2013-08-29 16:41:40 +00:00
kern_pmc.c Add software PMC support. 2012-03-28 20:58:30 +00:00
kern_poll.c Remove unsigned comparison < 0 2013-08-07 07:22:56 +00:00
kern_priv.c Make the comments a little more clear about PRIV_KMEM_*, explicitly 2013-07-06 00:10:52 +00:00
kern_proc.c Extend the support for exempting processes from being killed when swap is 2013-09-19 18:53:42 +00:00
kern_prot.c Style fix 2012-11-14 10:33:12 +00:00
kern_racct.c Accessing td_state requires thread lock to be held. 2013-03-14 23:20:18 +00:00
kern_rangelock.c Change the queue of locks in kern_rangelock.c from holding lock requests in 2013-08-15 20:19:17 +00:00
kern_rctl.c Add CPU percentage limit enforcement to RCTL. The resouce name is "pcpu". 2012-10-26 16:01:08 +00:00
kern_resource.c Add a resource limit for the total number of kqueues available to the 2013-10-21 16:44:53 +00:00
kern_rmlock.c Fix lc_lock/lc_unlock() support for rmlocks held in shared mode. With 2013-09-20 23:06:21 +00:00
kern_rwlock.c Consistently use the same value to indicate exclusively-held and 2013-09-22 14:09:07 +00:00
kern_sdt.c FreeBSD's DTrace implementation has a few problems with respect to handling 2013-08-13 03:10:39 +00:00
kern_sema.c
kern_sharedpage.c Remove the deprecated VM_ALLOC_RETRY flag for the vm_page_grab(9). 2013-08-22 07:39:53 +00:00
kern_shutdown.c Switch the vm_object mutex to be a rwlock. This will enable in the 2013-03-09 02:32:23 +00:00
kern_sig.c Change the cap_rights_t type from uint64_t to a structure that we can extend 2013-09-05 00:09:56 +00:00
kern_switch.c Add a comment on why inlining critical_enter() may not be a good idea 2012-12-09 04:54:22 +00:00
kern_sx.c Consistently use the same value to indicate exclusively-held and 2013-09-22 14:09:07 +00:00
kern_synch.c Make load average sampling asynchronous to hardclock ticks. This improves 2013-09-24 07:03:16 +00:00
kern_syscalls.c
kern_sysctl.c Add a helpful message that can help point to why a sysctl tree removal failed 2013-08-09 01:04:44 +00:00
kern_tc.c - Make callout(9) tickless, relying on eventtimers(4) as backend for 2013-03-04 11:09:56 +00:00
kern_thr.c Stop treating td_sigmask specially for the purposes of new thread 2012-05-26 20:03:47 +00:00
kern_thread.c Another NFS SIGSTOP related fix: Ignore thread suspend requests due to 2013-03-21 14:06:27 +00:00
kern_time.c Implement compat32 wrappers for the ktimer_* syscalls. 2013-07-21 19:43:52 +00:00
kern_timeout.c Make the callout arithmetic more robust adding checks for overflow. 2013-09-26 10:06:50 +00:00
kern_umtx.c Fix two issues with the spin loops in the umtx(2) implementation. 2013-06-13 09:33:22 +00:00
kern_uuid.c Further restrict the MAC addresses that we use for UUID generation 2013-07-24 18:13:43 +00:00
kern_xxx.c
ksched.c sched_rr_interval() seems always returned period in hz ticks, but same 2012-08-10 18:19:57 +00:00
link_elf_obj.c Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use 2013-09-09 18:11:59 +00:00
link_elf.c Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use 2013-09-09 18:11:59 +00:00
linker_if.m
Make.tags.inc - Trim an unused and bogus Makefile for mount_smbfs. 2013-06-28 21:00:08 +00:00
Makefile
makesyscalls.sh Error out on failure to open specified config file 2013-10-16 17:03:46 +00:00
md4c.c
md5c.c
p1003_1b.c
posix4_mib.c
sched_4bsd.c rename scheduler->swapper and SI_SUB_RUN_SCHEDULER->SI_SUB_LAST 2013-07-24 09:45:31 +00:00
sched_ule.c Micro-optimize cpu_search(), allowing compiler to use more efficient inline 2013-09-07 15:16:30 +00:00
serdev_if.m
stack_protector.c
subr_acl_nfs4.c Fix bug where NFSv4 ACL enforcement code wouldn't unconditionally 2012-04-17 14:54:00 +00:00
subr_acl_posix1e.c Add module load/unload stubs. 2012-03-13 20:27:48 +00:00
subr_autoconf.c
subr_blist.c Remove reference to the rlist code from comments, and fix a typo visible 2013-02-05 20:08:33 +00:00
subr_bufring.c
subr_bus_dma.c Move an assertion to the right spot; only bus_dmamap_load_mbuf(9) 2013-06-01 11:42:47 +00:00
subr_bus.c Add YARROW_RNG and FORTUNA_RNG to sys/conf/options. 2013-10-08 11:05:26 +00:00
subr_busdma_bufalloc.c Replace kernel virtual address space allocation with vmem. This provides 2013-08-07 06:21:20 +00:00
subr_capability.c Fix panic in cap_rights_is_valid() when invalid rights are provided - 2013-09-07 19:03:16 +00:00
subr_clock.c
subr_counter.c Revert r249590 and in case if mp_ncpus isn't initialized use MAXCPU. This 2013-07-23 11:16:40 +00:00
subr_devstat.c Merge GEOM direct dispatch changes from the projects/camlock branch. 2013-10-22 08:22:19 +00:00
subr_disk.c
subr_dummy_vdso_tc.c Implement mechanism to export some kernel timekeeping data to 2012-06-22 07:06:40 +00:00
subr_eventhandler.c
subr_fattime.c
subr_firmware.c Correct sizeof usage 2012-06-25 05:41:16 +00:00
subr_hash.c Convert panic()s to KASSERT()s. This is an optimisation for 2012-01-23 16:31:46 +00:00
subr_hints.c Style fixes. 2012-09-04 23:16:55 +00:00
subr_kdb.c - Extend the KDB interface to add a per-debugger callback to print a 2012-04-12 17:43:59 +00:00
subr_kobj.c As it turns out, r186347 actually is insufficient to avoid the use of the 2011-11-15 20:11:03 +00:00
subr_lock.c Several improvements to rmlock(9). Many of these are based on patches 2013-06-25 18:44:15 +00:00
subr_log.c MFcalloutng (r244255 by mav, with minor changes): 2013-03-04 16:07:55 +00:00
subr_mbpool.c Give (*ext_free) an int return value allowing for very sophisticated 2013-08-25 10:57:09 +00:00
subr_mchain.c Mechanically substitute flags from historic mbuf allocator with 2012-12-05 08:04:20 +00:00
subr_module.c
subr_msgbuf.c - Clean up timestamps in msgbuf code. The timestamps should now be 2012-03-19 00:36:32 +00:00
subr_param.c Implement the concept of the unmapped VMIO buffers, i.e. buffers which 2013-03-19 14:13:12 +00:00
subr_pcpu.c Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs. 2011-11-07 06:44:47 +00:00
subr_pctrie.c - Add a new general purpose path-compressed radix trie which can be used 2013-05-12 04:05:01 +00:00
subr_power.c
subr_prf.c Reduce the scope of the proctree_lock. If several processes cause 2013-09-13 06:39:10 +00:00
subr_prof.c Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. 2011-11-07 15:43:11 +00:00
subr_rman.c Unlock in the error path to prevent a lock leak. 2012-05-31 17:27:05 +00:00
subr_rtc.c Core structure and functions to support a feed-forward clock within the kernel. 2011-11-19 14:10:16 +00:00
subr_sbuf.c Always request zeroed memory, in case we're dumb enough to leak it later. 2013-09-22 23:47:56 +00:00
subr_scanf.c Xen netback driver rewrite. 2012-01-26 16:35:09 +00:00
subr_sglist.c
subr_sleepqueue.c Partially revert r195702. Deferring stops is now implemented via a set of 2013-03-18 17:23:58 +00:00
subr_smp.c Fix ia64 and mips kernel builds due to XENHVM=>GENERIC integration in 2013-09-22 02:46:13 +00:00
subr_stack.c Constify stack argument for functions that don't modify it. 2011-11-16 19:06:55 +00:00
subr_syscall.c Fix build on ARM (and probably other platforms) 2012-12-28 06:52:53 +00:00
subr_taskqueue.c Add comments that taskqueue_enqueue_locked() returns without the lock. 2013-10-21 21:16:50 +00:00
subr_trap.c Partially revert r195702. Deferring stops is now implemented via a set of 2013-03-18 17:23:58 +00:00
subr_turnstile.c Update the comment: we do show the backtrace of misbehaving thread. 2013-02-17 21:37:32 +00:00
subr_uio.c Remove zero-copy sockets code. It only worked for anonymous memory, 2013-09-16 06:25:54 +00:00
subr_unit.c Move the definition of the struct unrhdr into a separate header file, 2013-08-30 07:37:45 +00:00
subr_vmem.c Added sysctl to turn off calls to vmem_check(). 2013-08-20 11:06:56 +00:00
subr_witness.c Trim a couple of panic messages. 2013-09-04 11:52:28 +00:00
sys_capability.c This looks like a typo that breaks the build. Yell at me if this isn't the 2013-09-05 03:36:57 +00:00
sys_generic.c By default, allow up to SSIZE_MAX i/o for non-devfs files. 2013-10-15 06:35:22 +00:00
sys_pipe.c Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use 2013-09-09 18:11:59 +00:00
sys_procdesc.c Change the cap_rights_t type from uint64_t to a structure that we can extend 2013-09-05 00:09:56 +00:00
sys_process.c Extend the support for exempting processes from being killed when swap is 2013-09-19 18:53:42 +00:00
sys_socket.c Make sendfile() a method in the struct fileops. Currently only 2013-08-15 07:54:31 +00:00
syscalls.c Regen. 2013-09-19 18:56:00 +00:00
syscalls.master Extend the support for exempting processes from being killed when swap is 2013-09-19 18:53:42 +00:00
systrace_args.c Regenerate syscall argument strings after r255777. 2013-09-21 23:06:36 +00:00
sysv_ipc.c
sysv_msg.c
sysv_sem.c
sysv_shm.c Add a mmap flag (MAP_32BIT) on 64-bit platforms to request that a mapping use 2013-09-09 18:11:59 +00:00
tty_compat.c
tty_info.c Fix whitespace inconsistencies in TTY code. 2012-02-06 18:15:46 +00:00
tty_inq.c Use strchr() and strrchr(). 2012-01-02 12:12:10 +00:00
tty_outq.c
tty_pts.c Make sendfile() a method in the struct fileops. Currently only 2013-08-15 07:54:31 +00:00
tty_tty.c
tty_ttydisc.c Correct SIGTTIN handling. 2012-10-25 09:05:21 +00:00
tty.c Change the cap_rights_t type from uint64_t to a structure that we can extend 2013-09-05 00:09:56 +00:00
uipc_accf.c
uipc_debug.c Fix socket buffer timeouts precision using the new sbintime_t KPI instead 2013-09-01 23:34:53 +00:00
uipc_domain.c - Implement two new system calls: 2013-03-02 21:11:30 +00:00
uipc_mbuf2.c Mechanically substitute flags from historic mbuf allocator with 2012-12-05 08:04:20 +00:00
uipc_mbuf.c Pad m_hdr on 32bit architectures to to prevent alignment and padding 2013-08-27 20:52:02 +00:00
uipc_mqueue.c Fix !CAPABILITIES build. 2013-09-05 10:24:09 +00:00
uipc_sem.c Change the cap_rights_t type from uint64_t to a structure that we can extend 2013-09-05 00:09:56 +00:00
uipc_shm.c Implement sendfile(2) for the posix shared memory segment file descriptor, 2013-09-11 06:41:15 +00:00
uipc_sockbuf.c - Substitute sbdrop_internal() with sbcut_internal(). The latter doesn't free 2013-10-09 11:57:53 +00:00
uipc_socket.c Remove zero-copy sockets code. It only worked for anonymous memory, 2013-09-16 06:25:54 +00:00
uipc_syscalls.c Print more useful information about the transfer that trigger the assertion. 2013-10-21 16:17:46 +00:00
uipc_usrreq.c Provide pr_ctloutput method for AF_LOCAL/SOCK_SEQPACKET sockets. 2013-09-11 18:22:30 +00:00
vfs_acl.c Change the cap_rights_t type from uint64_t to a structure that we can extend 2013-09-05 00:09:56 +00:00
vfs_aio.c The fget() function now takes pointer to cap_rights_t, so change 0 to NULL. 2013-09-05 11:59:23 +00:00
vfs_bio.c MFprojects/camlock r256619: 2013-10-21 06:44:55 +00:00
vfs_cache.c namecache sdt: freebsd doesn't support structured characters yet 2013-07-09 08:58:34 +00:00
vfs_cluster.c When allocating a pbuf for the cluster write, do not sleep waiting 2013-08-27 01:31:12 +00:00
vfs_default.c - Convert the bufobj lock to rwlock. 2013-05-31 00:43:41 +00:00
vfs_export.c Further refine the handling of stop signals in the NFS client. The 2013-02-21 19:02:50 +00:00
vfs_extattr.c Change the cap_rights_t type from uint64_t to a structure that we can extend 2013-09-05 00:09:56 +00:00
vfs_hash.c Add exported vfs_hash_index() function, which calculates the canonical 2013-01-14 05:41:40 +00:00
vfs_init.c Revert accidental commit. 2013-06-29 05:05:57 +00:00
vfs_lookup.c Fix panic in ktrcapfail() when no capability rights are passed. 2013-09-18 19:26:08 +00:00
vfs_mount.c Change len checks for fstypelen and fspathlen to be against absolute len 2013-10-03 22:52:03 +00:00
vfs_mountroot.c In r243868, the error message buffer errmsg have been changed from 2013-09-09 05:01:18 +00:00
vfs_subr.c Do not flush buffers when the v_object of the passed vnode does not 2013-10-09 18:43:29 +00:00
vfs_syscalls.c Correct the logic broken in my last commit. 2013-09-05 09:36:19 +00:00
vfs_vnops.c When opening or closing fifo, ensure that the vnode is locked 2013-09-13 06:52:23 +00:00
vnode_if.src remove vop_lookup_pre and vop_lookup_post 2012-11-22 10:36:10 +00:00