Commit Graph

534 Commits

Author SHA1 Message Date
Sean Bruno
34bac11eba Add iflib man pages for developers.
Doc review is probably waranted here for editing.

Submitted by:	Nicole Graziano
2017-01-28 00:40:36 +00:00
Ed Schouten
1e1f3941e4 Add support for attaching aggregation labels to sysctl objects.
I'm currently working on writing a metrics exporter for the Prometheus
monitoring system to provide access to sysctl metrics. Prometheus and
sysctl have some structural differences:

- sysctl is a tree of string component names.
- Prometheus uses a flat namespace for its metrics, but allows you to
  attach labels with values to them, so that you can do aggregation.

An initial version of my exporter simply translated

    hw.acpi.thermal.tz1.temperature

to

    sysctl_hw_acpi_thermal_tz1_temperature_celcius

while we should ideally have

    sysctl_hw_acpi_thermal_temperature_celcius{thermal_zone="tz1"}

allowing you to graph all thermal zones on a system in one go.

The change presented in this commit adds support for accomplishing this,
by providing the ability to attach labels to nodes. In the example I
gave above, the label "thermal_zone" would be attached to "tz1". As this
is a feature that will only be used very rarely, I decided to not change
the KPI too aggressively.

Discussed on:	hackers@
Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D8775
2016-12-14 12:47:34 +00:00
Alan Cox
590cb3c761 The function vm_page_cache() no longer exists. Remove its man page. 2016-11-27 01:44:31 +00:00
Andriy Voskoboinyk
1af4a58503 mbuf(9), mbuf_tags(9): fix function prototypes.
- Add m_getclr(9) symlink to ObsoleteFiles.inc (removed in r295481).
- Add const qualifiers in m_dup(), m_dup_pkthdr() and m_tag_copy_chain()
(r286450).
- Fix m_dup_pkthdr() definition (it's not the same as m_move_pkthdr()).

MFC after:	5 days
2016-10-10 17:16:02 +00:00
Bryan Drewery
995192edad Add link for vrefl(9).
Sponsored by:	Dell EMC Isilon
MFC after:	1 week
2016-10-06 18:05:25 +00:00
Mariusz Zaborski
5ec81a23a8 Add man page for dnvlist.
Submitted by:	Adam Starak <starak.adam@gmail.com>
Reviewed by:	cem, wblock
2016-10-05 19:01:00 +00:00
John Baldwin
da0fc9250c Reset PCI pass through devices via PCI-e FLR during VM start and end.
Add routines to trigger a function level reset (FLR) of a PCI-express
device via the PCI-express device control register.  This also includes
support routines to wait for pending transactions to complete as well
as calculating the maximum completion timeout permitted by a device.

Change the ppt(4) driver to reset pass through devices before attaching
to a VM during startup and before detaching from a VM during shutdown.

Reviewed by:	imp, wblock (earlier version)
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7751
2016-09-06 21:15:35 +00:00
Mark Johnston
dbbaf04f1e Remove support for idle page zeroing.
Idle page zeroing has been disabled by default on all architectures since
r170816 and has some bugs that make it seemingly unusable. Specifically,
the idle-priority pagezero thread exacerbates contention for the free page
lock, and yields the CPU without releasing it in non-preemptive kernels. The
pagezero thread also does not behave correctly when superpage reservations
are enabled: its target is a function of v_free_count, which includes
reserved-but-free pages, but it is only able to zero pages belonging to the
physical memory allocator.

Reviewed by:	alc, imp, kib
Differential Revision:	https://reviews.freebsd.org/D7714
2016-09-03 20:38:13 +00:00
Mariusz Zaborski
7dcd0f0e7b Introduce cnv man page.
Submitted by:		Adam Starak <starak.adam@gmail.com>
Reviewed by:		cem@, wblock@
Differential Revision:	https://reviews.freebsd.org/D7249
2016-08-27 13:47:52 +00:00
Edward Tomasz Napierala
37e56c6efe Remove lockmgr_waiters(9) and BUF_LOCKWAITERS(9); they were not used
for anything.

Reviewed by:	kib@
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D7420
2016-08-05 13:53:28 +00:00
John Baldwin
0aee83cc1d Permit the name of the /dev/iov entry to be set by the driver.
The PCI_IOV option creates character devices in /dev/iov for each PF
device driver that registers support for creating VFs.  By default the
character device is named after the PF device (e.g. /dev/iov/foo0).
This change adds a variant of pci_iov_attach() called pci_iov_attach_name()
that allows the name of the /dev/iov entry to be specified by the
driver.

Reviewed by:	rstone
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D7400
2016-08-03 17:09:12 +00:00
Konstantin Belousov
a9e182e895 Extract the calculation of the callout fire time into the new function
callout_when(9).  See the man page update for the description of the
intended use.

Tested by:	pho
Reviewed by:	jhb, bjk (man page updates)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
X-Differential revision:	https://reviews.freebsd.org/D7137
2016-07-28 08:57:01 +00:00
Konstantin Belousov
90b581f2cc Implement mtx_trylock_spin(9).
Discussed with:	bde
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D7192
2016-07-23 05:30:55 +00:00
Jonathan T. Looney
cf3c688cc9 Document support for alternate TCP stacks.
Differential Revision:	https://reviews.freebsd.org/D6940
Reviewed by:	hiren
Approved by:	re (gjb)
Sponsored by:	Juniper Networks
2016-06-28 13:37:01 +00:00
John Baldwin
20fee1093e Add sglist functions for working with arrays of VM pages.
sglist_count_vmpages() determines the number of segments required for
a buffer described by an array of VM pages. sglist_append_vmpages()
adds the segments described by such a buffer to an sglist.  The latter
function is largely pulled from sglist_append_bio(), and
sglist_append_bio() now uses sglist_append_vmpages().

Reviewed by:	kib
Sponsored by:	Chelsio Communications
2016-05-20 23:28:43 +00:00
John Baldwin
cc981af204 Add new bus methods for mapping resources.
Add a pair of bus methods that can be used to "map" resources for direct
CPU access using bus_space(9).  bus_map_resource() creates a mapping and
bus_unmap_resource() releases a previously created mapping.  Mappings are
described by 'struct resource_map' object.  Pointers to these objects can
be passed as the first argument to the bus_space wrapper API used for bus
resources.

Drivers that wish to map all of a resource using default settings
(for example, using uncacheable memory attributes) do not need to change.
However, drivers that wish to use non-default settings can now do so
without jumping through hoops.

First, an RF_UNMAPPED flag is added to request that a resource is not
implicitly mapped with the default settings when it is activated.  This
permits other activation steps (such as enabling I/O or memory decoding
in a device's PCI command register) to be taken without creating a
mapping.  Right now the AGP drivers don't set RF_ACTIVE to avoid using
up a large amount of KVA to map the AGP aperture on 32-bit platforms.
Once RF_UNMAPPED is supported on all platforms that support AGP this
can be changed to using RF_UNMAPPED with RF_ACTIVE instead.

Second, bus_map_resource accepts an optional structure that defines
additional settings for a given mapping.

For example, a driver can now request to map only a subset of a resource
instead of the entire range.  The AGP driver could also use this to only
map the first page of the aperture (IIRC, it calls pmap_mapdev() directly
to map the first page currently).  I will also eventually change the
PCI-PCI bridge driver to request mappings of the subset of the I/O window
resource on its parent side to create mappings for child devices rather
than passing child resources directly up to nexus to be mapped.  This
also permits bridges that do address translation to request suitable
mappings from a resource on the "upper" side of the bus when mapping
resources on the "lower" side of the bus.

Another attribute that can be specified is an alternate memory attribute
for memory-mapped resources.  This can be used to request a
Write-Combining mapping of a PCI BAR in an MI fashion.  (Currently the
drivers that do this call pmap_change_attr() directly for x86 only.)

Note that this commit only adds the MI framework.  Each platform needs
to add support for handling RF_UNMAPPED and thew new
bus_map/unmap_resource methods.  Generally speaking, any drivers that
are calling rman_set_bustag() and rman_set_bushandle() need to be
updated.

Discussed on:	arch
Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D5237
2016-05-20 17:57:47 +00:00
John Baldwin
886b793d84 Remove dangling references to rman_await_resource().
This function was removed when RF_TIMESHARE was removed a couple of years
ago.

MFC after:	3 days
2016-05-20 01:17:38 +00:00
Warner Losh
c6755605f6 Per Ravi Pokala's suggestion, rewrite the g_reset_bio description to
be clearer. It also describes it with more nuance. Add missing MLINKS
noticed by trasz@. Bump the date.
2016-05-17 17:08:13 +00:00
Andrew Turner
d7be980dbe Re-commit r299467 having fixed the build:
Add a new get_id interface to pci and pcib. This will allow us to both
detect failures, and get different PCI IDs.

For the former the interface returns an int to signal an error. The ID is
returned at a uintptr_t * argument.

For the latter there is a type argument that allows selecting the ID type.
This only specifies a single type, however a MSI type will be added
to handle the need to find the ID the hardware passes to the ARM GICv3
interrupt controller.

A follow up commit will be made to remove pci_get_rid.

Reviewed by:    jhb, rstone (previous version)
Obtained from:  ABT Systems Ltd
Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D6239
2016-05-16 09:15:50 +00:00
Conrad Meyer
f41be0f076 Revert r299467 to fix the kernel build.
$ svn merge -c -299467 .

Approved by:	build being broken for six hours
2016-05-11 23:00:12 +00:00
Andrew Turner
9a36a337ff Add a new get_id interface to pci and pcib. This will allow us to both
detect failures, and get different PCI IDs.

For the former the interface returns an int to signal an error. The ID is
returned at a uintptr_t * argument.

For the latter there is a type argument that allows selecting the ID type.
This only specifies a single type, however a MSI type will be added
to handle the need to find the ID the hardware passes to the ARM GICv3
interrupt controller.

A follow up commit will be made to remove pci_get_rid.

Reviewed by:	jhb, rstone
Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D6239
2016-05-11 17:07:29 +00:00
John Baldwin
8d791e5af1 Add a new bus method to fetch device-specific CPU sets.
bus_get_cpus() returns a specified set of CPUs for a device.  It accepts
an enum for the second parameter that indicates the type of cpuset to
request.  Currently two valus are supported:

 - LOCAL_CPUS (on x86 this returns all the CPUs in the package closest to
   the device when DEVICE_NUMA is enabled)
 - INTR_CPUS (like LOCAL_CPUS but only returns 1 SMT thread for each core)

For systems that do not support NUMA (or if it is not enabled in the kernel
config), LOCAL_CPUS fails with EINVAL.  INTR_CPUS is mapped to 'all_cpus'
by default.  The idea is that INTR_CPUS should always return a valid set.

Device drivers which want to use per-CPU interrupts should start using
INTR_CPUS instead of simply assigning interrupts to all available CPUs.
In the future we may wish to add tunables to control the policy of
INTR_CPUS (e.g. should it be local-only or global, should it ignore
SMT threads or not).

The x86 nexus driver exposes the internal set of interrupt CPUs from the
the x86 interrupt code via INTR_CPUS.

The ACPI bus driver and PCI bridge drivers use _PXM to return a suitable
LOCAL_CPUS set when _PXM exists and DEVICE_NUMA is enabled.  They also and
the global INTR_CPUS set from the nexus driver with the per-domain set from
_PXM to generate a local INTR_CPUS set for child devices.

Compared to the r298933, this version uses 'struct _cpuset' in
<sys/bus.h> instead of 'cpuset_t' to avoid requiring <sys/param.h>
(<sys/_cpuset.h> still requires <sys/param.h> for MAXCPU even though
<sys/_bitset.h> does not after recent changes).
2016-05-09 20:50:21 +00:00
John Baldwin
8a08b7d36b Revert bus_get_cpus() for now.
I really thought I had run this through the tinderbox before committing,
but many places need <sys/types.h> -> <sys/param.h> for <sys/bus.h> now.
2016-05-03 01:17:40 +00:00
John Baldwin
bc153c692f Add a new bus method to fetch device-specific CPU sets.
bus_get_cpus() returns a specified set of CPUs for a device.  It accepts
an enum for the second parameter that indicates the type of cpuset to
request.  Currently two valus are supported:

 - LOCAL_CPUS (on x86 this returns all the CPUs in the package closest to
   the device when DEVICE_NUMA is enabled)
 - INTR_CPUS (like LOCAL_CPUS but only returns 1 SMT thread for each core)

For systems that do not support NUMA (or if it is not enabled in the kernel
config), LOCAL_CPUS fails with EINVAL.  INTR_CPUS is mapped to 'all_cpus'
by default.  The idea is that INTR_CPUS should always return a valid set.

Device drivers which want to use per-CPU interrupts should start using
INTR_CPUS instead of simply assigning interrupts to all available CPUs.
In the future we may wish to add tunables to control the policy of
INTR_CPUS (e.g. should it be local-only or global, should it ignore
SMT threads or not).

The x86 nexus driver exposes the internal set of interrupt CPUs from the
the x86 interrupt code via INTR_CPUS.

The ACPI bus driver and PCI bridge drivers use _PXM to return a suitable
LOCAL_CPUS set when _PXM exists and DEVICE_NUMA is enabled.  They also and
the global INTR_CPUS set from the nexus driver with the per-domain set from
_PXM to generate a local INTR_CPUS set for child devices.

Reviewed by:	wblock (manpage)
Differential Revision:	https://reviews.freebsd.org/D5519
2016-05-02 18:00:38 +00:00
John Baldwin
a907c6914c Add a new rescan method to the bus interface.
The BUS_RESCAN() method rescans a single bus device checking for devices
that have been added or removed from the bus.  A new 'rescan' command is
added to devctl(8) to trigger a rescan.

Differential Revision:	https://reviews.freebsd.org/D6016
2016-04-27 16:29:03 +00:00
Glen Barber
d60840138f MFH
Sponsored by:	The FreeBSD Foundation
2016-04-04 23:55:32 +00:00
Hans Petter Selasky
a435d46fdf Improve the implementation and documentation of the
SYSCTL_COUNTER_U64_ARRAY() macro.

- Add proper asserts to the SYSCTL_COUNTER_U64_ARRAY() macro that checks
  the size of the first element of the array.
- Add an example to the counter(9) manual page how to use the
  SYSCTL_COUNTER_U64_ARRAY() macro.
- Add some missing symbolic links for counter(9) while at it.
2016-03-16 08:37:52 +00:00
Glen Barber
52259a98ad MFH
Sponsored by:	The FreeBSD Foundation
2016-03-02 16:14:46 +00:00
Mark Johnston
a3f6b02969 Document m_catpkt(), and remove misinformation about m_cat(9).
Since m_cat() may copy data from the second mbuf chain into the last mbuf
of the first chain, it may free the first mbuf of the second chain. Thus,
the second chain is not guaranteed to be valid after m_cat() returns.

Reviewed by:	glebius
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D5497
2016-03-02 04:58:51 +00:00
John Baldwin
cbc4d2db75 Remove taskqueue_enqueue_fast().
taskqueue_enqueue() was changed to support both fast and non-fast
taskqueues 10 years ago in r154167.  It has been a compat shim ever
since.  It's time for the compat shim to go.

Submitted by:	Howard Su <howard0su@gmail.com>
Reviewed by:	sephe
Differential Revision:	https://reviews.freebsd.org/D5131
2016-03-01 17:47:32 +00:00
Glen Barber
bf2df150f1 Separate manual pages into their own package.
Sponsored by:	The FreeBSD Foundation
2016-01-21 16:36:33 +00:00
Konstantin Belousov
48ce5d4cac Provide yet another KPI for cdev creation, make_dev_s(9).
Immediate problem fixed by the new KPI is the long-standing race
between device creation and assignments to cdev->si_drv1 and
cdev->si_drv2, which allows the window where cdevsw methods might be
called with si_drv1,2 fields not yet set.  Devices typically checked
for NULL and returned spurious errors to usermode, and often left some
methods unchecked.

The new function interface is designed to be extensible, which should
allow to add more features to make_dev_s(9) without inventing yet
another name for function to create devices, while maintaining KPI and
even KBI backward-compatibility.

Reviewed by:	hps, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
Differential revision:	https://reviews.freebsd.org/D4746
2016-01-07 20:08:02 +00:00
John Baldwin
ce204e1bd8 Add accessor methods to fetch the BAR holding the MSI-X table and PBA.
While here, explicitly note the requirement that the BAR(s) must be
allocated prior to calling pci_alloc_msix().

Reviewed by:	andrew, emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D4688
2015-12-23 21:51:10 +00:00
Jonathan T. Looney
54503a13d8 Add a safety net to reclaim mbufs when one of the mbuf zones become
exhausted.

It is possible for a bug in the code (or, theoretically, even unusual
network conditions) to exhaust all possible mbufs or mbuf clusters.
When this occurs, things can grind to a halt fairly quickly. However,
we currently do not call mb_reclaim() unless the entire system is
experiencing a low-memory condition.

While it is best to try to prevent exhaustion of one of the mbuf zones,
it would also be useful to have a mechanism to attempt to recover from
these situations by freeing "expendable" mbufs.

This patch makes two changes:

a) The patch adds a generic API to the UMA zone allocator to set a
function that should be called when an allocation fails because the
zone limit has been reached. Because of the way this function can be
called, it really should do minimal work.

b) The patch uses this API to try to free mbufs when an allocation
fails from one of the mbuf zones because the zone limit has been
reached. The function schedules a callout to run mb_reclaim().

Differential Revision:	https://reviews.freebsd.org/D3864
Reviewed by:	gnn
Comments by:	rrs, glebius
MFC after:	2 weeks
Sponsored by:	Juniper Networks
2015-12-20 02:05:33 +00:00
Mark Johnston
711fbd17ec Add helper functions proc_readmem() and proc_writemem().
These helper functions can be used to read in or write a buffer from or to
an arbitrary process' address space. Without them, this can only be done
using proc_rwmem(), which requires the caller to fill out a uio. This is
onerous and results in code duplication; the new functions provide a simpler
interface which is sufficient for most existing callers of proc_rwmem().

This change also adds a manual page for proc_rwmem() and the new functions.

Reviewed by:	jhb, kib
Differential Revision:	https://reviews.freebsd.org/D4245
2015-12-07 21:33:15 +00:00
Christian Brueffer
a8b75f756d Add an MLINK for m_collapse.
PR:		204205
Submitted by:	avos
MFC after:	1 week
2015-12-07 19:08:33 +00:00
Randall Stewart
f5206d3f71 Some basic documentation (a man page) on kern_testfrwk 2015-11-12 11:42:01 +00:00
Randall Stewart
96eacdfdc2 Add the MLINK for async_drain Thanks Edward for the pointer.
MFC after:	1 month
2015-11-11 23:10:09 +00:00
Mark Johnston
635458bc06 Add a manual page for PHOLD() and friends.
MFC after:	1 week
2015-11-08 01:41:44 +00:00
Conrad Meyer
be87839e56 Round out SYSCTL macros to the full set of fixed-width types
Add S8, S16, S32, and U32 types;  add SYSCTL*() macros for them, as well
as for the existing 64-bit types.  (While SYSCTL*QUAD and UQUAD macros
already exist, they do not take the same sort of 'val' parameter that
the other macros do.)

Clean up the documented "types" in the sysctl.9 document.  (These are
macros and thus not real types, but the manual page documents intent.)

The sysctl_add_oid(9) arg2 has been bumped from intptr_t to intmax_t to
accommodate 64-bit types on 32-bit pointer architectures.

This is just the kernel support piece; the userspace sysctl(1) support
will follow in a later patch.

Submitted by:	Ravi Pokala <rpokala@panasas.com>
Reviewed by:	cem
Relnotes:	no
Sponsored by:	Panasas
Differential Revision:	https://reviews.freebsd.org/D4091
2015-11-07 01:43:01 +00:00
Adrian Chadd
791b3571a2 remove \, it confuses things. 2015-11-05 22:50:21 +00:00
John Baldwin
87dd2f95d2 Add a new helper function for PCI devices to locate the upstream
PCI-express root port of a given PCI device.

Reviewed by:	kib, imp
MFC after:	1 week
Sponsored by:	Chelsio
Differential Revision:	https://reviews.freebsd.org/D4089
2015-11-05 21:27:25 +00:00
John Baldwin
ec603c7297 Add helper routines for PCI device drivers to read, write, and modify
PCI-Express capability registers (that is, PCI config registers in the
standard PCI config space belonging to the PCI-Express capability
register set).

Note that all of the current PCI-e registers are either 16 or 32-bits,
so only widths of 2 or 4 bytes are supported.

Reviewed by:	imp
MFC after:	1 week
Sponsored by:	Chelsio
Differential Revision:	https://reviews.freebsd.org/D4088
2015-11-05 21:26:06 +00:00
Conrad Meyer
fb61390c99 sysctl(9): Document U8/U16 types from r289773
Suggested by:	ngie
Sponsored by:	EMC / Isilon Storage Division
2015-10-23 15:08:16 +00:00
Conrad Meyer
5546be25d6 Document cpuset(9)
A follow-up to r289467.

Coerced by:	jhb
Sponsored by:	EMC / Isilon Storage Division
2015-10-20 23:48:14 +00:00
Conrad Meyer
7ebf41220c Document bitset(9) 2015-10-17 19:55:58 +00:00
Enji Cooper
f2e3428854 Remove MLINKS to more non-existent mbuf(9) macros
X-MFC with: r288295
MFC after: 3 days
Sponsored by: EMC / Isilon Storage Division
2015-09-27 04:55:43 +00:00
Enji Cooper
c91afdd4d0 Posthumously remove all references to MFREE(9)
The macro was removed in r90227

MFC after: 3 days
Sponsored by: EMC / Isilon Storage Division
2015-09-27 04:40:54 +00:00
Conrad Meyer
0d27967e31 Document bus_get_resource(9).
Suggested by:	Francois Tigeot
Obtained from:	DragonFlyBSD 09301a2b29f3ae5edd39a858f909f8770372f71e
Sponsored by:	EMC / Isilon Storage Division
2015-09-26 14:52:47 +00:00
Hans Petter Selasky
c55f4c9445 Revert r287780 until more developers have their say.
Differential Revision:	https://reviews.freebsd.org/D3521
Requested by:		gnn
2015-09-22 06:51:55 +00:00