Add some missing errno values to thr_new(2) and pthread_create(3).
In particular, EDEADLK was not documented in the latter.
While I'm here, improve some English and cross-references.
Reviewed by: kib
Sponsored by: Dell Inc.
Differential Revision: https://reviews.freebsd.org/D6663
Add accessor functions to toggle the state per VNET.
The base system (vnet0) will always enable itself with the normal
registration. We will share the registered protocol handlers in all
VNETs minimising duplication and management.
Upon disabling netisr processing for a VNET drain the netisr queue from
packets for that VNET.
Update netisr consumers to (de)register on a per-VNET start/teardown using
VNET_SYS(UN)INIT functionality.
The change should be transparent for non-VIMAGE kernels.
Reviewed by: gnn (, hiren)
Obtained from: projects/vnet
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6691
specific order. VNET_SYSUNINITs however are doing exactly that.
Thus remove the VIMAGE conditional field from the domain(9) protosw
structure and replace it with VNET_SYSUNINITs.
This also allows us to change some order and to make the teardown functions
file local static.
Also convert divert(4) as it uses the same mechanism ip(4) and ip6(4) use
internally.
Slightly reshuffle the SI_SUB_* fields in kernel.h and add a new ones, e.g.,
for pfil consumers (firewalls), partially for this commit and for others
to come.
Reviewed by: gnn, tuexen (sctp), jhb (kernel.h)
Obtained from: projects/vnet
MFC after: 2 weeks
X-MFC: do not remove pr_destroy
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6652
Reflect all recent changes in the manpage:
- add adhoc-demo and hostap into list of supported modes;
add few examples for them;
- mention encryption/decryption offload for CCMP cipher;
- extend list of driver messages in the DIAGNOSTICS;
- document hostap mode limitations / powersave instability
in the CAVEATS section.
Differential Revision: https://reviews.freebsd.org/D5149
Add a bit_count function, which efficiently counts the number of bits set in
a bitstring.
sys/sys/bitstring.h
tests/sys/sys/bitstring_test.c
share/man/man3/bitstring.3
Add bit_alloc
sys/kern/subr_unit.c
Use bit_count instead of a naive counting loop in check_unrhdr, used
when INVARIANTS are enabled. The userland test runs about 6x faster
in a generic build, or 8.5x faster when built for Nehalem, which has
the POPCNT instruction.
sys/sys/param.h
Bump __FreeBSD_version due to the addition of bit_alloc
UPDATING
Add a note about the ABI incompatibility of the bitstring(3)
changes, as suggested by lidl.
Suggested by: gibbs
Reviewed by: gibbs, ngie
MFC after: 9 days
X-MFC-With: 299090, 300538
Relnotes: yes
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D6255
sglist_count_vmpages() determines the number of segments required for
a buffer described by an array of VM pages. sglist_append_vmpages()
adds the segments described by such a buffer to an sglist. The latter
function is largely pulled from sglist_append_bio(), and
sglist_append_bio() now uses sglist_append_vmpages().
Reviewed by: kib
Sponsored by: Chelsio Communications
Add a pair of bus methods that can be used to "map" resources for direct
CPU access using bus_space(9). bus_map_resource() creates a mapping and
bus_unmap_resource() releases a previously created mapping. Mappings are
described by 'struct resource_map' object. Pointers to these objects can
be passed as the first argument to the bus_space wrapper API used for bus
resources.
Drivers that wish to map all of a resource using default settings
(for example, using uncacheable memory attributes) do not need to change.
However, drivers that wish to use non-default settings can now do so
without jumping through hoops.
First, an RF_UNMAPPED flag is added to request that a resource is not
implicitly mapped with the default settings when it is activated. This
permits other activation steps (such as enabling I/O or memory decoding
in a device's PCI command register) to be taken without creating a
mapping. Right now the AGP drivers don't set RF_ACTIVE to avoid using
up a large amount of KVA to map the AGP aperture on 32-bit platforms.
Once RF_UNMAPPED is supported on all platforms that support AGP this
can be changed to using RF_UNMAPPED with RF_ACTIVE instead.
Second, bus_map_resource accepts an optional structure that defines
additional settings for a given mapping.
For example, a driver can now request to map only a subset of a resource
instead of the entire range. The AGP driver could also use this to only
map the first page of the aperture (IIRC, it calls pmap_mapdev() directly
to map the first page currently). I will also eventually change the
PCI-PCI bridge driver to request mappings of the subset of the I/O window
resource on its parent side to create mappings for child devices rather
than passing child resources directly up to nexus to be mapped. This
also permits bridges that do address translation to request suitable
mappings from a resource on the "upper" side of the bus when mapping
resources on the "lower" side of the bus.
Another attribute that can be specified is an alternate memory attribute
for memory-mapped resources. This can be used to request a
Write-Combining mapping of a PCI BAR in an MI fashion. (Currently the
drivers that do this call pmap_change_attr() directly for x86 only.)
Note that this commit only adds the MI framework. Each platform needs
to add support for handling RF_UNMAPPED and thew new
bus_map/unmap_resource methods. Generally speaking, any drivers that
are calling rman_set_bustag() and rman_set_bushandle() need to be
updated.
Discussed on: arch
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D5237
intention of the POSIX IEEE Std 1003.1TM-2008/Cor 1-2013.
A robust mutex is guaranteed to be cleared by the system upon either
thread or process owner termination while the mutex is held. The next
mutex locker is then notified about inconsistent mutex state and can
execute (or abandon) corrective actions.
The patch mostly consists of small changes here and there, adding
neccessary checks for the inconsistent and abandoned conditions into
existing paths. Additionally, the thread exit handler was extended to
iterate over the userspace-maintained list of owned robust mutexes,
unlocking and marking as terminated each of them.
The list of owned robust mutexes cannot be maintained atomically
synchronous with the mutex lock state (it is possible in kernel, but
is too expensive). Instead, for the duration of lock or unlock
operation, the current mutex is remembered in a special slot that is
also checked by the kernel at thread termination.
Kernel must be aware about the per-thread location of the heads of
robust mutex lists and the current active mutex slot. When a thread
touches a robust mutex for the first time, a new umtx op syscall is
issued which informs about location of lists heads.
The umtx sleep queues for PP and PI mutexes are split between
non-robust and robust.
Somewhat unrelated changes in the patch:
1. Style.
2. The fix for proper tdfind() call use in umtxq_sleep_pi() for shared
pi mutexes.
3. Removal of the userspace struct pthread_mutex m_owner field.
4. The sysctl kern.ipc.umtx_vnode_persistent is added, which controls
the lifetime of the shared mutex associated with a vnode' page.
Reviewed by: jilles (previous version, supposedly the objection was fixed)
Discussed with: brooks, Martin Simmons <martin@lispworks.com> (some aspects)
Tested by: pho
Sponsored by: The FreeBSD Foundation
translate the pci rid to a controller ID. The translation could be based
on the 'msi-map' OFW property, a similar ACPI option, or hard-coded for
hardware lacking the above options.
Reviewed by: wma
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Add a new get_id interface to pci and pcib. This will allow us to both
detect failures, and get different PCI IDs.
For the former the interface returns an int to signal an error. The ID is
returned at a uintptr_t * argument.
For the latter there is a type argument that allows selecting the ID type.
This only specifies a single type, however a MSI type will be added
to handle the need to find the ID the hardware passes to the ARM GICv3
interrupt controller.
A follow up commit will be made to remove pci_get_rid.
Reviewed by: jhb, rstone (previous version)
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6239
detect failures, and get different PCI IDs.
For the former the interface returns an int to signal an error. The ID is
returned at a uintptr_t * argument.
For the latter there is a type argument that allows selecting the ID type.
This only specifies a single type, however a MSI type will be added
to handle the need to find the ID the hardware passes to the ARM GICv3
interrupt controller.
A follow up commit will be made to remove pci_get_rid.
Reviewed by: jhb, rstone
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6239
the bio to a pristine state should you wish to re-use it for another
I/O without freeing it. In the bast, a simple bzero was done to do
this, but that may not be sufficient in the future when the bio may
contain state that's not part of the documented API. Besides, it makes
the code clearer as to the intent...
Noticed by: smh@
bus_get_cpus() returns a specified set of CPUs for a device. It accepts
an enum for the second parameter that indicates the type of cpuset to
request. Currently two valus are supported:
- LOCAL_CPUS (on x86 this returns all the CPUs in the package closest to
the device when DEVICE_NUMA is enabled)
- INTR_CPUS (like LOCAL_CPUS but only returns 1 SMT thread for each core)
For systems that do not support NUMA (or if it is not enabled in the kernel
config), LOCAL_CPUS fails with EINVAL. INTR_CPUS is mapped to 'all_cpus'
by default. The idea is that INTR_CPUS should always return a valid set.
Device drivers which want to use per-CPU interrupts should start using
INTR_CPUS instead of simply assigning interrupts to all available CPUs.
In the future we may wish to add tunables to control the policy of
INTR_CPUS (e.g. should it be local-only or global, should it ignore
SMT threads or not).
The x86 nexus driver exposes the internal set of interrupt CPUs from the
the x86 interrupt code via INTR_CPUS.
The ACPI bus driver and PCI bridge drivers use _PXM to return a suitable
LOCAL_CPUS set when _PXM exists and DEVICE_NUMA is enabled. They also and
the global INTR_CPUS set from the nexus driver with the per-domain set from
_PXM to generate a local INTR_CPUS set for child devices.
Compared to the r298933, this version uses 'struct _cpuset' in
<sys/bus.h> instead of 'cpuset_t' to avoid requiring <sys/param.h>
(<sys/_cpuset.h> still requires <sys/param.h> for MAXCPU even though
<sys/_bitset.h> does not after recent changes).
The slowstart_flightsize and local_slowstart_flightsize sysctl's
were removed from the TCP code in 226447 several years ago.
PR: 209376
MFC after: 1 week
This is an initial work in progress to use the replacement bhnd
bus code for devices which support it.
* Add manpage updates for bhnd, bhndb, siba
* Add kernel options for bhnd, bhndbus, etc
* Add initial support in if_bwn_pci / if_bwn_mac for using bhnd
as the bus transport for suppoted NICs
* if_bwn_pci will eventually be the PCI bus glue to interface to bwn,
which will use the right backend bus to attach to, versus direct
nexus/bhnd attachments (as found in embedded broadcom devices.)
The PCI glue defaults to probing at a lower level than the bwn glue,
so bwn should still attach as per normal without a boot time tunable set.
It's also not fully fleshed out - the bwn probe/attach code needs to be
broken out into platform and bus specific things (just like ath, ath_pci,
ath_ahb) before we can shift the driver over to using this.
Tested:
* BCM4311, STA mode
* BCM4312, STA mode
Submitted by: Landon Fuller <landonf@landonf.org>
Differential Revision: https://reviews.freebsd.org/D6191
Two new functions are provided, bit_ffs_at() and bit_ffc_at(), which allow
for efficient searching of set or cleared bits starting from any bit offset
within the bit string.
Performance is improved by operating on longs instead of bytes and using
ffsl() for searches within a long. ffsl() is a compiler builtin in both
clang and gcc for most architectures, converting what was a brute force
while loop search into a couple of instructions.
All of the bitstring(3) API continues to be contained in the header file.
Some of the functions are large enough that perhaps they should be uninlined
and moved to a library, but that is beyond the scope of this commit.
sys/sys/bitstring.h:
Convert the majority of the existing bit string implementation from
macros to inline functions.
Properly protect the implementation from inadvertant macro expansion
when included in a user's program by prefixing all private
macros/functions and local variables with '_'.
Add bit_ffs_at() and bit_ffc_at(). Implement bit_ffs() and
bit_ffc() in terms of their "at" counterparts.
Provide a kernel implementation of bit_alloc(), making the full API
usable in the kernel.
Improve code documenation.
share/man/man3/bitstring.3:
Add pre-exisiting API bit_ffc() to the synopsis.
Document new APIs.
Document the initialization state of the bit strings
allocated/declared by bit_alloc() and bit_decl().
Correct documentation for bitstr_size(). The original code comments
indicate the size is in bytes, not "elements of bitstr_t". The new
implementation follows this lead. Only hastd assumed "elements"
rather than bytes and it has been corrected.
etc/mtree/BSD.tests.dist:
tests/sys/Makefile:
tests/sys/sys/Makefile:
tests/sys/sys/bitstring.c:
Add tests for all existing and new functionality.
include/bitstring.h
Include all headers needed by sys/bitstring.h
lib/libbluetooth/bluetooth.h:
usr.sbin/bluetooth/hccontrol/le.c:
Include bitstring.h instead of sys/bitstring.h.
sbin/hastd/activemap.c:
Correct usage of bitstr_size().
sys/dev/xen/blkback/blkback.c
Use new bit_alloc.
sys/kern/subr_unit.c:
Remove hard-coded assumption that sizeof(bitstr_t) is 1. Get rid of
unrb.busy, which caches the number of bits set in unrb.map. When
INVARIANTS are disabled, nothing needs to know that information.
callapse_unr can be adapted to use bit_ffs and bit_ffc instead.
Eliminating unrb.busy saves memory, simplifies the code, and
provides a slight speedup when INVARIANTS are disabled.
sys/net/flowtable.c:
Use the new kernel implementation of bit-alloc, instead of hacking
the old libc-dependent macro.
sys/sys/param.h
Update __FreeBSD_version to indicate availability of new API
Submitted by: gibbs, asomers
Reviewed by: gibbs, ngie
MFC after: 4 weeks
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D6004
It will be used for the upcoming LRO hash table initialization.
And probably will be useful in other cases, when M_WAITOK can't
be used.
Reviewed by: jhb, kib
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D6138
bus_get_cpus() returns a specified set of CPUs for a device. It accepts
an enum for the second parameter that indicates the type of cpuset to
request. Currently two valus are supported:
- LOCAL_CPUS (on x86 this returns all the CPUs in the package closest to
the device when DEVICE_NUMA is enabled)
- INTR_CPUS (like LOCAL_CPUS but only returns 1 SMT thread for each core)
For systems that do not support NUMA (or if it is not enabled in the kernel
config), LOCAL_CPUS fails with EINVAL. INTR_CPUS is mapped to 'all_cpus'
by default. The idea is that INTR_CPUS should always return a valid set.
Device drivers which want to use per-CPU interrupts should start using
INTR_CPUS instead of simply assigning interrupts to all available CPUs.
In the future we may wish to add tunables to control the policy of
INTR_CPUS (e.g. should it be local-only or global, should it ignore
SMT threads or not).
The x86 nexus driver exposes the internal set of interrupt CPUs from the
the x86 interrupt code via INTR_CPUS.
The ACPI bus driver and PCI bridge drivers use _PXM to return a suitable
LOCAL_CPUS set when _PXM exists and DEVICE_NUMA is enabled. They also and
the global INTR_CPUS set from the nexus driver with the per-domain set from
_PXM to generate a local INTR_CPUS set for child devices.
Reviewed by: wblock (manpage)
Differential Revision: https://reviews.freebsd.org/D5519