Commit Graph

9864 Commits

Author SHA1 Message Date
Mark Johnston
c50c331896 Remove an unneeded typedef of ip6_t from the DTrace ip provider library.
It causes an error when ipfilter is enabled, since ipl.ko contains an
identical typedef.

PR:		203092
MFC after:	1 week
2015-09-15 05:16:26 +00:00
Hans Petter Selasky
9acc0eafd7 Implement callout_drain_async(), inspired by the projects/hps_head
branch.

This function is used to drain a callout via a callback instead of
blocking the caller until the drain is complete. Refer to the
callout_drain_async() manual page for a detailed description.

Limitation: If a lock is used with the callout, the callout can only
be drained asynchronously one time unless the callout_init_mtx()
function is called again. This limitation is not present in
projects/hps_head and will require more invasive changes to the
timeout code, which was not in the scope of this patch.

Differential Revision:	https://reviews.freebsd.org/D3521
Reviewed by:		wblock
MFC after:		1 month
2015-09-14 10:52:26 +00:00
Alexander Motin
d36c617616 CTL documentation update, mostly for HA. 2015-09-12 10:23:23 +00:00
Edward Tomasz Napierala
31066e58e5 Point potential geom_fox(4) users to gmultipath(8).
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-09-12 08:54:24 +00:00
Mark Johnston
99fdade2c6 Document stack_save_td(9) and stack_save_td_running(9).
Reviewed by:	wblock
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3243
2015-09-11 03:56:04 +00:00
Hiroki Sato
b1c250ff3f - Remove GIF_{SEND,ACCEPT}_REVETHIP.
- Simplify EADDRNOTAVAIL and EAFNOSUPPORT conditions.

MFC after:	3 days
2015-09-10 05:59:39 +00:00
Allan Jude
7245b843bb Document the sctp blackhole sysctl MIB
PR:		184110
Submitted by:	Marie Helene Kvello-Aune <marieheleneka@gmail.com>
Reviewed by:	wblock
Approved by:	wblock (mentor)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D3528
2015-09-07 01:21:56 +00:00
Baptiste Daroussin
6fa997e2c1 Cross reference sesutil(8) and ses(4)
Submitted by:	trasz
MFC after:	1 month (with r287473)
2015-09-05 10:29:47 +00:00
Xin LI
28ffe927c2 Expose an interface to determine if an ACE is inherited.
Submitted by:	sef
Reviewed by:	trasz
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D3540
2015-09-04 00:14:20 +00:00
Conrad Meyer
14bdbaf2e4 Detect badly behaved coredump note helpers
Coredump notes depend on being able to invoke dump routines twice; once
in a dry-run mode to get the size of the note, and another to actually
emit the note to the corefile.

When a note helper emits a different length section the second time
around than the length it requested the first time, the kernel produces
a corrupt coredump.

NT_PROCSTAT_FILES output length, when packing kinfo structs, is tied to
the length of filenames corresponding to vnodes in the process' fd table
via vn_fullpath.  As vnodes may move around during dump, this is racy.

So:

 - Detect badly behaved notes in putnote() and pad underfilled notes.

 - Add a fail point, debug.fail_point.fill_kinfo_vnode__random_path to
   exercise the NT_PROCSTAT_FILES corruption.  It simply picks random
   lengths to expand or truncate paths to in fo_fill_kinfo_vnode().

 - Add a sysctl, kern.coredump_pack_fileinfo, to allow users to
   disable kinfo packing for PROCSTAT_FILES notes.  This should avoid
   both FILES note corruption and truncation, even if filenames change,
   at the cost of about 1 kiB in padding bloat per open fd.  Document
   the new sysctl in core.5.

 - Fix note_procstat_files to self-limit in the 2nd pass.  Since
   sometimes this will result in a short write, pad up to our advertised
   size.  This addresses note corruption, at the risk of sometimes
   truncating the last several fd info entries.

 - Fix NT_PROCSTAT_FILES consumers libutil and libprocstat to grok the
   zero padding.

With suggestions from:	bjk, jhb, kib, wblock
Approved by:	markj (mentor)
Relnotes:	yes
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3548
2015-09-03 20:32:10 +00:00
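The two-pass scheme and the padding fix described in the commit above can be modelled in userland. This is a minimal Python sketch under stated assumptions, not the kernel code: `put_note`, `make_racy_helper`, and the byte strings are hypothetical names chosen for illustration.

```python
def put_note(helper, out):
    """Model of a two-pass note writer.

    `helper(buf)` is a hypothetical callable: with buf=None it returns the
    size it intends to write (dry run); with a bytearray it appends the
    note body. Neither name comes from the kernel sources.
    """
    advertised = helper(None)      # pass 1: dry run, size only
    body = bytearray()
    helper(body)                   # pass 2: actually emit the note
    if len(body) > advertised:
        # Overfilled: limit to the advertised size (may drop trailing data).
        body = body[:advertised]
    elif len(body) < advertised:
        # Underfilled: zero-pad so the advertised size stays truthful.
        body.extend(b"\0" * (advertised - len(body)))
    out.extend(body)
    return advertised


def make_racy_helper(first, second):
    """Helper whose output differs between passes, like a renamed file."""
    def helper(buf):
        if buf is None:
            return len(first)
        buf.extend(second)
        return len(second)
    return helper
```

In this model the consumer always sees exactly the advertised number of bytes, which is why the commit also teaches the note consumers to accept zero padding.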
Edward Tomasz Napierala
b8c19fd719 It's 2015, and some people are still trying to use fdisk and then
go asking what debug flags to set for GEOM to make it work.  Advise
them to use gpart(8) instead.

Something similar should probably be done with disklabel,
but I need to rewrite the disklabel examples first.

Reviewed by:	wblock@
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D3315
2015-09-02 14:08:43 +00:00
Mark Johnston
e98a67279a nv.h lives in sys/ as of r279439. 2015-08-28 00:12:59 +00:00
Warner Losh
ae1f3df434 New 1-Wire bus implementation. 1-Wire controller is abstracted, though
only gpiobus configured via FDT is supported. Bus enumeration is
supported. Devices are created for each device found. 1-Wire
temperature controllers are supported, but other drivers could be
written. Temperatures are polled and reported via a sysctl.  Errors
are reported via sysctl counters. Mis-wired bus detection is included
to aid troubleshooting. See ow(4), owc(4) and ow_temp(4) for
details of what's supported and known issues.

This has been tested on Raspberry Pi-B, Pi2 and Beagle Bone Black
with up to 7 devices.

Differential Revision: https://reviews.freebsd.org/D2956
Relnotes: yes
MFC after: 2 weeks
Reviewed by: loos@ (with many insightful comments)
2015-08-27 23:33:38 +00:00
Kristof Provost
64b3b4d611 pf: Remove support for 'scrub fragment crop|drop-ovl'
The crop/drop-ovl fragment scrub modes are not very useful and likely to confuse
users into making poor choices.
It's also a fairly large amount of complex code, so just remove the support
altogether.

Users who have 'scrub fragment crop|drop-ovl' in their pf configuration will be
implicitly converted to 'scrub fragment reassemble'.

Reviewed by:	gnn, eri
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D3466
2015-08-27 21:27:47 +00:00
Ed Schouten
bc1ace0b96 Decompose linkat()/renameat() rights to source and target.
To make it easier to understand how Capsicum interacts with linkat() and
renameat(), rename the rights to CAP_{LINK,RENAME}AT_{SOURCE,TARGET}.

This also addresses a shortcoming in Capsicum, where it isn't possible
to disable linking to files stored in a directory. Creating hardlinks
essentially makes it possible to access files with additional rights.

Reviewed by:	rwatson, wblock
Differential Revision:	https://reviews.freebsd.org/D3411
2015-08-27 15:16:41 +00:00
Conrad Meyer
e974f91c38 Import ioat(4) driver
I/OAT is also referred to as Crystal Beach DMA and is a Platform Storage
Extension (PSE) on some Intel server platforms.

This driver currently supports DMA descriptors only and is part of a
larger effort to upstream an interconnect between multiple systems using
the Non-Transparent Bridge (NTB) PSE.

For now, this driver is only built on AMD64 platforms.  It may be ported
to work on i386 later, if that is desired.  The hardware is exclusive to
x86.

Further documentation on ioat(4), including API documentation and usage,
can be found in the new manual page.

Bring in a test tool, ioatcontrol(8), in tools/tools/ioat.  The test
tool is not hooked up to the build and is not intended for end users.

Submitted by:	jimharris, Carl Delsey <carl.r.delsey@intel.com>
Reviewed by:	jimharris (reviewed my changes)
Approved by:	markj (mentor)
Relnotes:	yes
Sponsored by:	Intel
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3456
2015-08-24 19:32:03 +00:00
Edward Tomasz Napierala
fbefacfc26 Tweak the "rctl_enable" description to not give the impression
of being disabled by default.

MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-08-23 13:51:06 +00:00
Mark Murray
e866d8f05b Make the UMA harvesting go away completely if not wanted. Default to "not wanted".
Provide and document the RANDOM_ENABLE_UMA option.

Change RANDOM_FAST to RANDOM_UMA to clarify the harvesting.

Remove RANDOM_DEBUG option, replace with SDT probes. These will be of
use to folks measuring the harvesting effect when deciding whether to
use RANDOM_ENABLE_UMA.

Requested by:	scottl and others.
Approved by:	so (/dev/random blanket)
Differential Revision:    https://reviews.freebsd.org/D3197
2015-08-22 12:59:05 +00:00
Luiz Otavio O Souza
0a70aaf8f5 Add ALTQ(9) support for the CoDel algorithm.
CoDel is a parameterless queue discipline that handles variable bandwidth
and RTT.

It can be used as the single queue discipline on an interface or as a sub
discipline of existing queue disciplines such as PRIQ, CBQ, HFSC, FAIRQ.

Differential Revision:	https://reviews.freebsd.org/D3272
Reviewed by:	rpaulo, gnn (previous version)
Obtained from:	pfSense
Sponsored by:	Rubicon Communications (Netgate)
2015-08-21 22:02:22 +00:00
Bryan Drewery
75824a7b3e Remove reference to non-existent kern_openat(9).
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2015-08-20 22:14:43 +00:00
Bryan Drewery
7ec1b6b672 Add link for rw_unlock(9) to rwlock(9).
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2015-08-20 18:22:06 +00:00
Luiz Otavio O Souza
3df058ffaf Add the GPIO driver for the ADI Engineering RCC-VE and RCC-DFF/DFFv2.
This driver allows reading the software reset switch state and controlling the
status LEDs.

The GPIO pins have their direction (input/output) locked down to prevent
possible short circuits.

Note that most people get a reset button that is a hardware reset.  The
software reset button is available on boards from Netgate.

Sponsored by:	Rubicon Communications (Netgate)
2015-08-18 21:05:56 +00:00
Mark Murray
646041a89a Add DEV_RANDOM pseudo-option and use it to "include out" random(4)
if desired.

Retire randomdev_none.c and introduce random_infra.c for resident
infrastructure. Completely stub out random(4) calls in the "without
DEV_RANDOM" case.

Add RANDOM_LOADABLE option to allow loadable Yarrow/Fortuna/LocallyWritten
algorithm.  Add a skeleton "other" algorithm framework for folks
to add their own processing code. NIST, anyone?

Retire the RANDOM_DUMMY option.

Build modules for Yarrow, Fortuna and "other".

Use atomics for the live entropy rate-tracking.

Convert ints to bools for the 'seeded' logic.

Move _write() function from the algorithm-specific areas to randomdev.c

Get rid of reseed() function - it is unused.

Tidy up the opt_*.h includes.

Update documentation for random(4) modules.

Fix test program (reviewers, please leave this).

Differential Revision:    https://reviews.freebsd.org/D3354
Reviewed by:              wblock,delphij,jmg,bjk
Approved by:              so (/dev/random blanket)
2015-08-17 07:36:12 +00:00
Sean Bruno
38be29d321 Add capability to disable CRC stripping. This breaks IPMI/BMC capabilities on certain adapters.
Linux has been doing the exact same thing since 2008.

eb7c3adb1c

PR:	161277
Differential Revision:	https://reviews.freebsd.org/D3282
Submitted by:	Fravadona@gmail.com
Reviewed by:	erj wblock
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	Limelight Networks
2015-08-16 19:06:23 +00:00
Enji Cooper
b3667a140d Regen src.conf.5 per r286822 2015-08-16 10:10:58 +00:00
Mariusz Zaborski
347a39b4a6 Add support for the arrays in nvlist library.
- Add
  nvlist_{add,get,take,move,exists,free}_{number,bool,string,nvlist,
  descriptor} functions.
- Add support for (un)packing arrays.
- Add the nvl_array_next field to the nvlist structure.
  If an array is added by the nvlist_{move,add}_nvlist_array function
  this field will contain the next element in the array.
- Add the nitems field to the nvpair and nvpair_header structure.
  This field contains the number of elements in the array.
- Add special flag (NV_FLAG_IN_ARRAY) which is set if nvlist is a part of
  an array.
- Add special type (NV_TYPE_NVLIST_ARRAY_NEXT). This type is used only
  on packing/unpacking.
- Add new API for traversing arrays (nvlist_get_array_next).
- Add the nvlist_get_pararr function which combines the
  nvlist_get_array_next and nvlist_get_parent functions. If the nvlist is in
  an array, it returns the next element of the array. If the nvlist is the last
  element of the array, or is not in an array, it returns its
  container (parent). This function should simplify traversing an nvlist.
- Add tests for new features.
- Add documentation for new functions.
- Add my copyright.
- Regenerate the sys/cddl/compat/opensolaris/sys/nvpair.h file.

PR:		191083
Reviewed by:	allanjude (doc)
Approved by:	pjd (mentor)
2015-08-15 06:34:49 +00:00
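The behaviour described for nvlist_get_pararr in the commit above (next array element if one exists, otherwise the parent) can be illustrated with a small model. This is a hedged Python sketch, not the libnv implementation: the `Node` class, its field names, and `make_array` are assumptions made for the example.

```python
class Node:
    """Stand-in for an nvlist; the field names are illustrative only."""

    def __init__(self, name):
        self.name = name
        self.parent = None       # container nvlist, if any
        self.array_next = None   # next element when part of an array


def make_array(parent, names):
    """Build a chained array of child nodes owned by `parent`."""
    nodes = [Node(n) for n in names]
    for node, nxt in zip(nodes, nodes[1:] + [None]):
        node.parent = parent
        node.array_next = nxt
    return nodes


def get_pararr(node):
    """Return the next array element if there is one, else the parent."""
    if node.array_next is not None:
        return node.array_next
    return node.parent
```

A caller can walk sideways through an array and then climb back to the container with the same call, which is the simplification the commit message refers to.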
Alan Cox
cff0a327b8 Stop describing an acquire operation as a read barrier and a release
operation as a write barrier.  That description has never been correct,
and it has caused confusion.  An acquire operation orders writes as well
as reads, and a release operation orders reads as well as writes.

Also, explicitly say that a thread doesn't see its own accesses being
reordered.  The reordering of a thread's accesses is only (potentially)
visible to another thread.  Thus, memory barriers need only be used to
control the ordering of accesses between threads, not within a thread.

Reviewed by:	bde, kib
Discussed with:	jhb
MFC after:	1 week
2015-08-14 17:49:03 +00:00
Ed Maste
84465e31bd Update src.conf(5) after r286730 2015-08-13 17:54:28 +00:00
Christian Brueffer
d3c2497cab Small cleanup.
- fix mandoc -Tlint warnings
- use appropriate macros
- canonize FreeBSD spelling
2015-08-13 16:11:04 +00:00
Ian Lepore
4159fbab87 Add a new PPS driver for AM335x (beaglebone) timer hardware. This can be
used as a module or compiled-in.
2015-08-13 15:19:30 +00:00
Ian Lepore
e8bac3f240 If a specific timecounter has been chosen via sysctl, and a new timecounter
with higher quality registers (presumably in a module that has just been
loaded), do not undo the user's choice by switching to the new timecounter.

Document that behavior, and also the fact that there is no way to unregister
a timecounter (and thus no way to unload a module containing one).
2015-08-12 20:50:20 +00:00
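The selection rule described in the commit above can be sketched as a small model: a newly registered timecounter of higher quality becomes active unless the user has already made an explicit choice. This is an illustrative Python sketch, not the kernel logic; `TimecounterRegistry` and its methods are hypothetical names, and the counter names and quality values in the usage note are example data.

```python
class TimecounterRegistry:
    """Toy model of timecounter selection; there is no unregister()."""

    def __init__(self):
        self.counters = {}         # name -> quality
        self.active = None         # name of the counter in use
        self.user_chosen = False   # set once the user picks explicitly

    def register(self, name, quality):
        """Add a counter; switch to it only if no explicit choice exists."""
        self.counters[name] = quality
        if self.user_chosen:
            return                 # respect the user's earlier choice
        if self.active is None or quality > self.counters[self.active]:
            self.active = name

    def choose(self, name):
        """Model of the user selecting a counter, e.g. via a sysctl."""
        if name not in self.counters:
            raise KeyError(name)
        self.active = name
        self.user_chosen = True
```

For example, after registering "i8254" (quality 0) and "HPET" (quality 950), the model selects "HPET". If the user then chooses "i8254", registering "TSC" (quality 1000) leaves "i8254" active.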
Christian Brueffer
548afe2bec Fix mandoc warnings/errors.
MFC after:	1 week
2015-08-12 11:56:19 +00:00
Mariusz Zaborski
89ca10c6e2 Make the nvlist_next(9) function handle NULL pointer variable.
This simplifies removing the first element from nvlist.

Reviewed by:	AllanJude
Approved by:	pjd (mentor)
2015-08-11 17:41:32 +00:00
Ian Lepore
196d3019a8 Allow the choice of PPS signal captured by uart(4) to be runtime-configured,
eliminating the need to build a custom kernel to use the CTS signal.

The historical UART_PPS_ON_CTS kernel option is still honored, but now it
can be overridden at runtime using a tunable to configure all uart devices
(hw.uart.pps_mode) or specific devices (dev.uart.#.pps_mode).  The per-
device config is both a tunable and a writable sysctl.

This syncs the PPS capabilities of uart(4) with the enhancements recently
added to ucom(4) for capturing from USB serial devices.

Relnotes:	yes
2015-08-10 20:08:09 +00:00
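One way to picture the layering described in the commit above is as a lookup with fallbacks. The following Python sketch is an assumption-laden model: the commit names the tunables hw.uart.pps_mode and dev.uart.#.pps_mode, but the precedence shown here (per-device over global over compiled-in default) and the string mode values are assumptions for illustration, and `effective_pps_mode` is a hypothetical helper.

```python
def effective_pps_mode(compiled_default, tunables, unit):
    """Resolve the PPS mode for one uart unit.

    compiled_default: mode implied by the kernel config (assumed input).
    tunables: dict of tunable name -> value, standing in for the kernel
              environment.
    unit: device unit number, substituted for '#' in dev.uart.#.pps_mode.
    """
    per_device = tunables.get(f"dev.uart.{unit}.pps_mode")
    if per_device is not None:
        return per_device          # assumed: most specific setting wins
    global_mode = tunables.get("hw.uart.pps_mode")
    if global_mode is not None:
        return global_mode
    return compiled_default        # e.g. what UART_PPS_ON_CTS would imply
```

The mode values are placeholders; consult uart(4) for the actual encoding.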
Christian Brueffer
db51871b42 Xref iwm(4). 2015-08-10 10:54:35 +00:00
Christian Brueffer
1db4188894 Hook up iwm.4 and iwmfw.4 to the build. 2015-08-10 10:36:08 +00:00
Alexander Motin
ff7b06db23 Document kern.cam.ctl.debug sysctl.
MFC after:	1 week
2015-08-09 10:11:04 +00:00
Alan Cox
a61bd9573f Revise the text about the atomicity of the defined operations across
multiple processors.  In particular, clearly state that the operations
are always atomic when they are applied to the default memory type
that is used by the kernel (and applications).

Reviewed by:	kib, jhb (an earlier version)
MFC after:	1 week
2015-08-09 07:45:15 +00:00
Pawel Jakub Dawidek
445bda3f4f Allow disabling BIO_DELETE passthru in fstab for swap-on-geli devices by
passing the 'notrim' option.

PR:		198863
Submitted by:	Matthew D. Fuller fullermd at over-yonder dot net
2015-08-08 09:57:38 +00:00
Rui Paulo
d4886179cb Import OpenBSD's iwm WiFi driver for Intel 3160/7260/7265.
There are still several bugs, but I've been using it for a while now.
Thanks to all the testers and to Adrian for his help with this
driver.

This driver isn't connected to the build yet, but it will be soon.

There's no MFC planned because the driver isn't very stable yet.

Reviewed by:	adrian
Obtained from:	https://github.com/rpaulo/iwm
Tested by:	adrian, gjb, dumbbell (others that I forgot).
Relnotes:	yes
2015-08-08 06:06:48 +00:00
Marcel Moolenaar
aaa8b90caf Document the application interface. 2015-08-08 04:59:27 +00:00
Jason A. Harmening
cbaa6a0e0c Create man page for pmap_quick_enter_page(9) and pmap_quick_remove_page(9)
Reviewed by:	kib, brueffer, wblock
Approved by:	kib (mentor)
Differential Revision:	https://reviews.freebsd.org/D3312
2015-08-07 12:13:15 +00:00
Kevin Lo
0fa4d4b570 Add support for ASUS WL-100g. 2015-08-07 02:05:16 +00:00
Ian Lepore
374b1ec1ea Document the recently added get-bitmode and eeprom read/write functionality. 2015-08-06 20:59:03 +00:00
Kevin Lo
3072411ee8 Add support for Planex GW-NS300N. 2015-08-04 15:04:28 +00:00
Edward Tomasz Napierala
cdc3449233 Revert r286236; vgonel() is a static function.
Sponsored by:	The FreeBSD Foundation
2015-08-04 08:16:18 +00:00
Edward Tomasz Napierala
6a968be547 Document vgonel(9).
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-08-03 16:30:47 +00:00
Ed Schouten
43a81e6322 Add a manual page for the cloudabi and cloudabi64 kernel modules.
CloudABI has two separate kernel modules: cloudabi and cloudabi64. The
first module contains all the pointer size independent code, whereas
cloudabi64 contains the actual 64-bits specific system calls and the ELF
loader.

Reviewed by:	wblock
Obtained from:	https://github.com/NuxiNL/freebsd
Differential Revision:	https://reviews.freebsd.org/D3258
2015-08-02 14:56:30 +00:00
Mark Johnston
16f3fdf55f Regenerate after r286174. 2015-08-02 00:56:16 +00:00
John-Mark Gurney
94d919b999 mark this function as deprecated, and put the warning first, since I
doubt most people will read to the end...  Note the use of sys/cdefs.h
for pre-C11 compilers...

I didn't include a note about being compatible w/ userland since a
C11 feature should be obviously usable in userland...

Suggested by:	imp
2015-08-02 00:22:14 +00:00
John-Mark Gurney
215397449a The implementation note isn't true anymore..
Not that anyone reads it, but those that do, remind them that this
isn't usable in userland...  I can't wait till this doc is wrong..
2015-07-31 03:28:02 +00:00
Christian Brueffer
703a08b23c The kernel option and module are actually called pmspcv.
MFC after:	3 days
2015-07-30 19:08:23 +00:00
Ed Maste
5be09b1082 Regenerate src.conf(5) after r286016 and r286030 2015-07-29 18:55:51 +00:00
Christian Brueffer
8843746ab4 Remove the AUTHORS section until it's clear who exactly wrote the driver. 2015-07-29 16:37:36 +00:00
Warner Losh
aa255ef6dd Teach sysctl about the new optional suffix after IK to specify
precision. Update input as well. Add IK to the manual (it was missing
completely).

Differential Revision: https://reviews.freebsd.org/D3181
2015-07-29 02:34:25 +00:00
Michael Gmelin
ca2e4ecd73 isl(4), driver for Intersil I2C ISL29018 Digital Ambient Light Sensor
Differential Revision:	https://reviews.freebsd.org/D2811
Reviewed by:	adrian, wblock
Approved by:	adrian, wblock
Relnotes:	yes
2015-07-25 20:17:19 +00:00
Michael Gmelin
46f07718f7 cyapa(4), driver for the Cypress APA I2C trackpad
Differential Revision:	https://reviews.freebsd.org/D3068
Reviewed by:	kib, wblock
Approved by:	kib
Relnotes:	yes
2015-07-25 18:14:35 +00:00
Edward Tomasz Napierala
208a8b9532 Update Capsicum and Mandatory Access Control manual pages
to no longer claim they are experimental.

Reviewed by:	rwatson@, wblock@
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D2985
2015-07-25 15:56:49 +00:00
Kristof Provost
e600320b2a Pf can reassemble IPv6 fragments now.
Obtained from: bluhm (OpenBSD)
Sponsored by: Essen FreeBSD Hackathon
2015-07-25 14:06:32 +00:00
Christian Brueffer
79cdd7a420 Add a basic manpage for the pms driver.
MFC after:	1 week
Committed from:	Essen FreeBSD Hackathon
2015-07-24 21:48:53 +00:00
Brooks Davis
bdf80fecf0 Document the fact that tunables can be set in device.hints.
Reviewed by:	wblock
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D3153
2015-07-23 17:27:10 +00:00
Marcel Moolenaar
be00e09818 Check the hw.proto.attach environment variable for devices that
proto(4) should attach to instead of the normal driver.

Document the variable.
2015-07-19 23:37:45 +00:00
Edward Tomasz Napierala
bd425507b7 Expand sysctl descriptions in iscsi(4) and ctl(4).
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2015-07-18 15:27:12 +00:00
Ed Schouten
50f960e60e Fix a small typo: "the the".
Spotted by:	wblock
2015-07-16 15:43:55 +00:00
Ed Schouten
707d98fe2f Implement the CloudABI random_get() system call.
The random_get() system call works similar to getentropy()/getrandom()
on OpenBSD/Linux. It fills a buffer with random data.

This change introduces a new function, read_random_uio(), that is used
to implement read() on the random devices. We can call into this
function from within the CloudABI compatibility layer.

Approved by:	secteam
Reviewed by:	jmg, markm, wblock
Obtained from:	https://github.com/NuxiNL/freebsd
Differential Revision:	https://reviews.freebsd.org/D3053
2015-07-14 18:45:15 +00:00
Christian Brueffer
9870187435 Markup fixes. 2015-07-13 15:26:03 +00:00
Christian Brueffer
647cdd025a Fix a typo and duplicate word. 2015-07-13 14:25:15 +00:00
Mark Murray
3aa77530ca * Address review (and add a bit myself).
- Tweak man page.
 - Remove all mention of RANDOM_FORTUNA. If the system owner wants YARROW or DUMMY, they ask for it, otherwise they get FORTUNA.
 - Tidy up headers a bit.
 - Tidy up declarations a bit.
 - Make static in a couple of places where needed.
 - Move Yarrow/Fortuna SYSINIT/SYSUNINIT to randomdev.c, moving us towards a single file where the algorithm context is used.
 - Get rid of random_*_process_buffer() functions. They were only used in one place each, and are better subsumed into those places.
 - Remove *_post_read() functions as they are stubs everywhere.
 - Assert against buffer size illegalities.
 - Clean up some silly code in the randomdev_read() routine.
 - Make the harvesting more consistent.
 - Make some requested argument name changes.
 - Tidy up and clarify a few comments.
 - Make some requested comment changes.
 - Make some requested macro changes.

* NOTE: the thing calling itself a 'unit test' is not yet a proper
  unit test, but it helps me ensure things work. It may be a proper
  unit test at some time in the future, but for now please don't make
  any assumptions or hold any expectations.

Differential Revision:	https://reviews.freebsd.org/D2025
Approved by:	so (/dev/random blanket)
2015-07-12 18:14:38 +00:00
Adrian Chadd
6520495abc Add an initial NUMA affinity/policy configuration for threads and processes.
This is based on work done by jeff@ and jhb@, as well as the numa.diff
patch that has been circulating when someone asks for first-touch NUMA
on -10 or -11.

* Introduce a simple set of VM policy and iterator types.
* tie the policy types into the vm_phys path for now, mirroring how
  the initial first-touch allocation work was enabled.
* add syscalls to control changing thread and process defaults.
* add a global NUMA VM domain policy.
* implement a simple cascade policy order - if a thread policy exists, use it;
  if a process policy exists, use it; use the default policy.
* processes inherit policies from their parent processes, threads inherit
  policies from their parent threads.
* add a simple tool (numactl) to query and modify default thread/process
  policies.
* add documentation for the new syscalls, for numa and for numactl.
* re-enable first touch NUMA again by default, as now policies can be
  set in a variety of methods.

This is only relevant for very specific workloads.

This doesn't pretend to be a final NUMA solution.

The previous defaults in -HEAD (with MAXMEMDOM set) can be achieved by
'sysctl vm.default_policy=rr'.

This is only relevant if MAXMEMDOM is set to something other than 1.
Ie, if you're using GENERIC or a modified kernel with non-NUMA, then
this is a glorified no-op for you.

Thank you to Norse Corp for giving me access to rather large
(for FreeBSD!) NUMA machines in order to develop and verify this.

Thank you to Dell for providing me with dual socket sandybridge
and westmere v3 hardware to do NUMA development with.

Thank you to Scott Long at Netflix for providing me with access
to the two-socket, four-domain haswell v3 hardware.

Thank you to Peter Holm for running the stress testing suite
against the NUMA branch during various stages of development!

Tested:

* MIPS (regression testing; non-NUMA)
* i386 (regression testing; non-NUMA GENERIC)
* amd64 (regression testing; non-NUMA GENERIC)
* westmere, 2 socket (thankyou norse!)
* sandy bridge, 2 socket (thankyou dell!)
* ivy bridge, 2 socket (thankyou norse!)
* westmere-EX, 4 socket / 1TB RAM (thankyou norse!)
* haswell, 2 socket (thankyou norse!)
* haswell v3, 2 socket (thankyou dell)
* haswell v3, 2x18 core (thankyou scott long / netflix!)

* Peter Holm ran a stress test suite on this work and found one
  issue, but has not been able to verify it (it doesn't look NUMA
  related, and he only saw it once over many testing runs.)

* I've tested bhyve instances running in fixed NUMA domains and cpusets;
  all seems to work correctly.

Verified:

* intel-pcm - pcm-numa.x and pcm-memory.x, whilst selecting different
  NUMA policies for processes under test.

Review:

This was reviewed through phabricator (https://reviews.freebsd.org/D2559)
as well as privately and via emails to freebsd-arch@.  The git history
with specific attributes is available at https://github.com/erikarn/freebsd/
in the NUMA branch (https://github.com/erikarn/freebsd/compare/local/adrian_numa_policy).

This has been reviewed by a number of people (stas, rpaulo, kib, ngie,
wblock) but not achieved a clear consensus.  My hope is that with further
exposure and testing more functionality can be implemented and evaluated.

Notes:

* The VM doesn't handle unbalanced domains very well, and if you have an overly
  unbalanced memory setup whilst under high memory pressure, VM page allocation
  may fail leading to a kernel panic.  This was a problem in the past, but it's
  much more easily triggered now with these tools.

* This work only controls the path through vm_phys; it doesn't yet strongly/predictably
  affect contigmalloc, KVA placement, UMA, etc.  So, driver placement of memory
  isn't really guaranteed in any way.  That's next on my plate.

Sponsored by:	Norse Corp, Inc.; Dell
2015-07-11 15:21:37 +00:00
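The cascade order described in the commit above (thread policy, then process policy, then the global default) amounts to a first-match lookup. This is a minimal Python sketch of that rule only; `effective_policy` is a hypothetical name, and the policy strings are labels borrowed from the commit text ("rr", first touch), not kernel constants.

```python
def effective_policy(thread_policy, process_policy, default_policy):
    """Pick the policy to apply, using None to mean 'not set'."""
    if thread_policy is not None:
        return thread_policy       # a thread policy exists: use it
    if process_policy is not None:
        return process_policy      # otherwise fall back to the process
    return default_policy          # otherwise use the global default
```

With a global default of "rr", a thread with no policy in a process set to "first-touch" resolves to "first-touch", while a thread with its own policy always uses that policy.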
John-Mark Gurney
f405d8eb61 some additional improvements to the documentation...
Sponsored by:	Netflix, Inc.
2015-07-11 04:20:56 +00:00
John-Mark Gurney
94b591186d yet more documentation improvements... Many changes were made to the
OCF w/o documentation...

Document the new (8+ year old) device_t way of handling things, that
_unregister_all will leave no threads in newsession, the _SYNC flag,
the requirement that a flag be specified...

Other minor changes like breaking up a wall of text into paragraphs...
2015-07-08 22:46:45 +00:00
Patrick Kelsey
fe3ff217dd Replace use of .Po Pc with the preferred .Pq for single line
enclosures in iovctl.conf(5), iovctl(8), pci(9), and
pci_iov_schema(9).

Differential Revision: https://reviews.freebsd.org/D3000
Reviewed by: wblock
Approved by: jmallett (mentor)
2015-07-08 16:16:44 +00:00
Warner Losh
abb14e5ded The results of the vote are in. This reflects that vote. Single
line statements inside of braces is recognized as an acceptable
style.
	http://reviews.freebsd.org/V3
As always, this isn't license for wholesale change, etc.
2015-07-06 20:10:47 +00:00
Mark Johnston
363f89481b Remove a BUGS entry that was addressed by r282300. 2015-07-05 23:24:52 +00:00
Mark Johnston
4b4ad3a219 Rename the dtrace-* man pages to dtrace_* for consistency with other
subsection man pages (e.g. geom_*, mac_*, snd_*).
2015-07-05 23:23:12 +00:00
Konstantin Belousov
e6e979f5d0 Document the locking context for the directly dispatched callouts.
Cross-reference timeout(9).

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-07-05 19:29:24 +00:00
Mariusz Zaborski
58c86148dd Move nvlist documentation to the FreeBSD Kernel Developer's sections.
Approved by:	pjd (mentor)
2015-07-04 10:27:30 +00:00
Edward Tomasz Napierala
a2c782d103 Make ctl(4) appear in "man -k iscsi" results.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2015-07-03 16:55:08 +00:00
Marcel Moolenaar
15abbe5137 Minor update to the proto(4) man page:
1.  We now support ISA devices
2.  DMA support has been added
2015-07-03 16:20:14 +00:00
John-Mark Gurney
e08d13cf83 more word smithing wrt the crd_inject field...
We've already defined IV earlier, so no need to expand it a second
time here...
2015-07-03 01:55:06 +00:00
John-Mark Gurney
2ca5eb5d2d update the documentation of the _IV_ flags... _IV_PRESENT doesn't
mean what you think it should...  This will be fixed in the future
with a flag rename, but document what the flag really does and make
the _IV_ flags clear what their presence (or lack thereof) means...

Reviewed by:	gnn, eri (both earlier version)
2015-07-03 00:37:16 +00:00
John-Mark Gurney
0812640475 add units to the length and count so that it's clear what they
are measured in...  Gems like: "len is the length of the buffer."
aren't useful in man pages.
2015-06-30 19:06:14 +00:00
Mark Murray
c4f9c760c9 Updated random(4) boot/shutdown scripting.
Fix the man pages as well.

Differential Revision: https://reviews.freebsd.org/D2924
Approved by: so (delphij)
2015-06-30 17:09:41 +00:00
Mark Murray
d1b06863fb Huge cleanup of random(4) code.
* GENERAL
- Update copyright.
- Make kernel options for RANDOM_YARROW and RANDOM_DUMMY. Set
  neither to ON, which means we want Fortuna
- If there is no 'device random' in the kernel, there will be NO
  random(4) device in the kernel, and the KERN_ARND sysctl will
  return nothing. With RANDOM_DUMMY there will be a random(4) that
  always blocks.
- Repair kern.arandom (KERN_ARND sysctl). The old version went
  through arc4random(9) and was a bit weird.
- Adjust arc4random stirring a bit - the existing code looks a little
  suspect.
- Fix the nasty pre- and post-read overloading by providing explicit
  functions to do these tasks.
- Redo read_random(9) so as to duplicate random(4)'s read internals.
  This makes it a first-class citizen rather than a hack.
- Move stuff out of locked regions when it does not need to be
  there.
- Trim RANDOM_DEBUG printfs. Some are excess to requirement, some
  behind boot verbose.
- Use SYSINIT to sequence the startup.
- Fix init/deinit sysctl stuff.
- Make relevant sysctls also tunables.
- Add different harvesting "styles" to allow for different requirements
  (direct, queue, fast).
- Add harvesting of FFS atime events. This needs to be checked for
  weighing down the FS code.
- Add harvesting of slab allocator events. This needs to be checked for
  weighing down the allocator code.
- Fix the random(9) manpage.
- Loadable modules are not present for now. These will be re-engineered
  when the dust settles.
- Use macros for locks.
- Fix comments.

* src/share/man/...
- Update the man pages.

* src/etc/...
- The startup/shutdown work is done in D2924.

* src/UPDATING
- Add UPDATING announcement.

* src/sys/dev/random/build.sh
- Add copyright.
- Add libz for unit tests.

* src/sys/dev/random/dummy.c
- Remove; no longer needed. Functionality incorporated into randomdev.*.

* live_entropy_sources.c live_entropy_sources.h
- Remove; content moved.
- move content to randomdev.[ch] and optimise.

* src/sys/dev/random/random_adaptors.c src/sys/dev/random/random_adaptors.h
- Remove; plugability is no longer used. Compile-time algorithm
  selection is the way to go.

* src/sys/dev/random/random_harvestq.c src/sys/dev/random/random_harvestq.h
- Add early (re)boot-time randomness caching.

* src/sys/dev/random/randomdev_soft.c src/sys/dev/random/randomdev_soft.h
- Remove; no longer needed.

* src/sys/dev/random/uint128.h
- Provide a fake uint128_t; if a real one ever arrives, we can use
  that instead. All that is needed here is N=0, N++, N==0, and some
  localised trickery is used to manufacture a 128-bit 0ULLL.
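
  A minimal sketch of what such a stand-in type could look like, assuming
  two 64-bit halves and only the three operations named above (N=0, N++,
  N==0). The struct and function names are hypothetical and are not taken
  from the actual uint128.h:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a 128-bit counter built from two 64-bit
 * halves. Names are illustrative only, not those used in uint128.h.
 */
struct fake_uint128 {
	uint64_t lo;	/* low 64 bits */
	uint64_t hi;	/* high 64 bits */
};

/* N = 0 */
static inline void
fake_uint128_clear(struct fake_uint128 *n)
{
	n->lo = 0;
	n->hi = 0;
}

/* N++ : carry into the high half when the low half wraps to zero. */
static inline void
fake_uint128_increment(struct fake_uint128 *n)
{
	if (++n->lo == 0)
		n->hi++;
}

/* N == 0 */
static inline bool
fake_uint128_is_zero(const struct fake_uint128 *n)
{
	return (n->lo == 0 && n->hi == 0);
}
```

  With a compiler that offers a native 128-bit integer, the same three
  operations would reduce to plain assignment, increment, and comparison.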

* src/sys/dev/random/unit_test.c src/sys/dev/random/unit_test.h
- Improve unit tests; previously the testing human needed clairvoyance;
  now the test will do a basic check of compressibility. Clairvoyant
  talent is still a good idea.
- This is still a long way off a proper unit test.

* src/sys/dev/random/fortuna.c src/sys/dev/random/fortuna.h
- Improve messy union to just uint128_t.
- Remove unneeded 'static struct fortuna_start_cache'.
- Tighten up arithmetic.
- Provide a method to allow external junk to be introduced; harden
  it against blatant by compress/hashing.
- Assert that locks are held correctly.
- Fix the nasty pre- and post-read overloading by providing explicit
  functions to do these tasks.
- Turn into self-sufficient module (no longer requires randomdev_soft.[ch])

* src/sys/dev/random/yarrow.c src/sys/dev/random/yarrow.h
- Improve messy union to just uint128_t.
- Remove unneeded 'static struct start_cache'.
- Tighten up arithmetic.
- Provide a method to allow external junk to be introduced; harden
  it against blatant by compress/hashing.
- Assert that locks are held correctly.
- Fix the nasty pre- and post-read overloading by providing explicit
  functions to do these tasks.
- Turn into self-sufficient module (no longer requires randomdev_soft.[ch])
- Fix some magic numbers elsewhere used as FAST and SLOW.

Differential Revision: https://reviews.freebsd.org/D2025
Reviewed by: vsevolod,delphij,rwatson,trasz,jmg
Approved by: so (delphij)
2015-06-30 17:00:45 +00:00
Sean Bruno
8971c13ffa Delete the reference to VLAN handling being disabled by default. This is
no longer the case.

PR:		118693
MFC after:	3 days
2015-06-29 17:59:00 +00:00
Hans Petter Selasky
38b622e199 Make the system queue header file fully usable within C++ programs by
adding macros to define class lists.

This change is backwards compatible for all use within C and C++
programs. Only C++ programs will have added support to use the queue
macros within classes. Previously the queue macros could only be used
within structures.

The queue.3 manual page has been updated to describe the new
functionality and some alphabetic sorting has been done while
at it.

Differential Revision:	https://reviews.freebsd.org/D2745
PR:			200827 (exp-run)
MFC after:		2 weeks
2015-06-28 21:06:45 +00:00
Ermal Luçi
a5b789f65a ALTQ FAIRQ discipline import from DragonFly
Differential Revision:  https://reviews.freebsd.org/D2847
Reviewed by:    glebius, wblock(manpage)
Approved by:    gnn(mentor)
Obtained from:  pfSense
Sponsored by:   Netgate
2015-06-24 19:16:41 +00:00
Kevin Lo
2907b9f461 Mention using ports/net/malo-firmware-kmod to install the firmware. 2015-06-24 09:28:43 +00:00
Simon J. Gerraty
cc2520d2f3 Fix generation of src.conf.5
Since makeman turns all options on, we need to guard some things from
make(showconfig)
2015-06-22 20:21:57 +00:00
Edward Tomasz Napierala
c6ba08803a Expand sysctls descriptions for iscsi(4) and ctl(4).
Differential Revision:	https://reviews.freebsd.org/D2876
Reviewed by:	wblock@
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2015-06-21 14:21:38 +00:00
Konstantin Belousov
7705435606 The barriers, provided by _acq and _rel atomics, are acquire and
release barriers, not read and write barriers.  They fence all memory
accesses from the respective side, not limited by the kind of
operation.
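
The distinction can be illustrated with a small publish/consume sketch.
It uses C11 <stdatomic.h> as a stand-in for the kernel's _acq and _rel
atomic operations, which is an assumption made for illustration; the
function and variable names are hypothetical. The release store orders
every earlier access (here a plain write) before the flag update, and
the acquire load orders every later access (here a plain read) after it:

```c
#include <assert.h>
#include <stdatomic.h>

static int payload;		/* plain, non-atomic data */
static atomic_int ready;	/* flag; zero-initialised static storage */

/*
 * Producer side: the release store keeps all preceding loads and
 * stores, not only stores, from being reordered after it.
 */
static void
publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/*
 * Consumer side: the acquire load keeps all following loads and
 * stores, not only loads, from being reordered before it.
 * Returns 1 and fills *out if the payload has been published.
 */
static int
consume(int *out)
{
	if (atomic_load_explicit(&ready, memory_order_acquire) == 0)
		return (0);
	*out = payload;
	return (1);
}
```

A consumer that observes the flag set is therefore guaranteed to read
the published payload, even though the payload itself is never accessed
atomically.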

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-06-20 17:18:46 +00:00
Xin LI
10b942e5d3 Fix markups and change e.g./eg. to e.g.,.
MFC after:	2 weeks
2015-06-19 21:35:56 +00:00
Xin LI
908e4e9754 Fix markups.
MFC after:	2 weeks
2015-06-19 21:35:24 +00:00
Warner Losh
196f0f2b53 Back out contested change until dispute is resolved. This proved to be
more contentious than I expected.
2015-06-19 21:30:45 +00:00
Xin LI
fdc84efdd1 Document kern.cam.ada.legacy_aliases, while I'm there also fix some typos.
MFC after:	2 weeks
2015-06-19 21:26:06 +00:00
Christian Brueffer
61d010fae9 Document title should be in CAPS. 2015-06-18 16:31:32 +00:00
Christian Brueffer
cf28e39c96 Remove EOL whitespace. 2015-06-18 16:29:11 +00:00
Warner Losh
c97426f4d7 Bump date.
Submitted by: Xin Li
2015-06-17 22:06:27 +00:00
Warner Losh
b6c9950099 Update style.9 to reflect consensus on the developers' mailing list
allowing redundant braces.

Differential Revision: https://reviews.freebsd.org/D2842
2015-06-17 21:25:36 +00:00
Sergey Kandaurov
ccc785556c Deshallify. 2015-06-15 23:30:54 +00:00