improve cancellation robustness.
Introduce a new file operation, fo_aio_queue, which is responsible for
queueing and completing an asynchronous I/O request for a given file.
The AIO subystem now exports library of routines to manipulate AIO
requests as well as the ability to run a handler function in the
"default" pool of AIO daemons to service a request.
A default implementation for file types which do not include an
fo_aio_queue method queues requests to the "default" pool invoking the
fo_read or fo_write methods as before.
The AIO subsystem permits file types to install a private "cancel"
routine when a request is queued to permit safe dequeueing and cleanup
of cancelled requests.
Sockets now use their own pool of AIO daemons and service per-socket
requests in FIFO order. Socket requests will not block indefinitely
permitting timely cancellation of all requests.
Due to the now-tight coupling of the AIO subsystem with file types,
the AIO subsystem is now a standard part of all kernels. The VFS_AIO
kernel option and aio.ko module are gone.
Many file types may block indefinitely in their fo_read or fo_write
callbacks resulting in a hung AIO daemon. This can result in hung
user processes (when processes attempt to cancel all outstanding
requests during exit) or a hung system. To protect against this, AIO
requests are only permitted for known "safe" files by default. AIO
requests for all file types can be enabled by setting the new
vfs.aio.enable_usafe sysctl to a non-zero value. The AIO tests have
been updated to skip operations on unsafe file types if the sysctl is
zero.
Currently, AIO requests on sockets and raw disks are considered safe
and are enabled by default. aio_mlock() is also enabled by default.
Reviewed by: cem, jilles
Discussed with: kib (earlier version)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D5289
taskqueue_enqueue() was changed to support both fast and non-fast
taskqueues 10 years ago in r154167. It has been a compat shim ever
since. It's time for the compat shim to go.
Submitted by: Howard Su <howard0su@gmail.com>
Reviewed by: sephe
Differential Revision: https://reviews.freebsd.org/D5131
is the physical memory size so may be larger than a u_long can hold, e.g.
on ARM with LPAE we could see an address space of up to 40 bits. On ARM
u_long is only 32 bits so the memory size will be truncated, possibly to
zero.
Reported by: bz
Sponsored by: ABT Systems Ltd
Note that isrc_arg member of struct intr_irqsrc is used only for
INTR_SOLO and IPI filter. This should be remembered if IPI filters
and their arguments will be stored on another place.
This option could be unusable very soon, if interrupt controllers
implementations will not be implemented considering it.
Enable system register access for EL2. Alpine-V2 is
the first device requiring this to be enabled.
It is also in-sync with Linux initialization code,
and compatible with Alpine-V2 uboot requirements.
Obtained from: Semihalf
Submitted by: Michal Stanek <mst@semihalf.com>
Sponsored by: Annapurna Labs
Approved by: cognet (mentor)
Reviewed by: wma
Differential revision: https://reviews.freebsd.org/D5394
So that the host could dispatch the TX done back to this TX ring's
owner channel
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5498
Some MPC85xx GPIO controllers are compatible with QorIQ.
It may make more sense in the future to rename this and mpc85xx_gpio.c, as
mpc85xx_gpio.c appears to only be compatible with a few mpc85xx SoCs. All other
MPC85xx SoCs use the same controller as QorIQ.
Summary:
As part of the migration of rman_res_t to be typed to uintmax_t, memory ranges
must be clamped appropriately for the bus, to prevent completely bogus addresses
from being used.
This is extracted from D4544.
Reviewed By: cem
Sponsored by: Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D5134
These firmwares were obtained from the beta "Chelsio T5/T4 Unified Wire
v2.12.0.2 for Linux" release. Changes since last release are listed in the
"Release Notes" accompanying the beta release and are copy-pasted here as well.
The plan is to have only GA'd firmwares in any -STABLE FreeBSD branch so I'll
MFC this (after 2 months) only if it ends up in a GA release.
================================================================================
================================================================================
22.1. T5 Firmware
+++++++++++++++++++++++++++++++++
Version : 1.15.28.0
Date : 02/29/2016
================================================================================
FIXES
-----
BASE:
- Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where the default ingress
queue was ignored.
- Fixed an issue where adapter failed to load fw by adjusting DRAM frequency.
- Fixed an issue in watchdog which was causing VM bring-up failure after
reboot.
- Fixed 40G link failures with some switches when auto-negotiation enabled.
- Fixed to improve on link bring-up time.
- Per port buffer groups size doubled to improve performance.
- Fixed an issue where bogus d3hot bits were set causing traffic stall.
- Fixed an issue where sometimes adapter was not seen after reboot.
- Fixed an issue where iWARP was crashing in conjunction with traffic
management.
- Fixed an issue where link failed to come up after removing twinax cable and
inserting optical module.
OFLD
- Fixed a potential iSCSI data corruption issue by disabling RxFragEn flag.
FOiSCSI
- Fixed an issue in recovery path where connection was getting closed before
recovery processing was done.
- Fixed an issue in TCP port reuse.
- Fixed an issue in recovery path when large number (>64) of iSCSI connections
were in use.
- Returned ENETUNREACH if IP was not been provisioned yet and driver tried to
use given inerface.
ENHANCEMENTS
------------
BASE:
- Added new interface to program DCA settings in SGE contexts; allow 32-byte
IQE size
- Added PTP interface fw_ptp_ts to support PTP Frequeny and Offset adjustment.
- Added MPS raw interface.
ETH:
- New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.
OFLD:
- WR opcode is returned to host in cqe error response.
================================================================================
================================================================================
22.2. T4 Firmware
+++++++++++++++++
Version : 1.15.28.0
Date : 02/29/2016
================================================================================
FIXES
-----
BASE:
- Fixed an issue in FW_RSS_VI_CONFIG_CMD handling where default ingress queue
was ignored.
- Fixed an issue in watchdog which was causing VM bring-up failure after
reboot.
- Per port buffer groups size doubled to improve performance.
- Fixed an issue where iWARP was crashing in conjunction with traffic
management.
FOiSCSI:
- Fixed an issue in recovery path where connection was getting closed before
recovery processing was done.
- Fixed an issue in TCP port reuse.
- Fixed an issue in recovery path when large number (>64) of iSCSI connections
were in use.
- Returned ENETUNREACH if IP had not been provisioned yet and driver tried to
use given inerface.
ENHANCEMENTS
------------
BASE:
- Added MPS raw interface.
ETH:
- New mailbox command FW_DCB_IEEE_CMD api added for IEEE dcbx.
================================================================================
Obtained from: Chelsio Communications
MFC after: 2 months
Sponsored by: Chelsio Communications
The m_ext.ext_cnt pointer becomes a union. It can now hold the refcount
value itself. To tell that m_ext.ext_flags flag EXT_FLAG_EMBREF is used.
The first mbuf to attach a cluster stores the refcount. The further mbufs
to reference the cluster point at refcount in the first mbuf. The first
mbuf is freed only when the last reference is freed.
The benefit over refcounts stored in separate slabs is that now refcounts
of different, unrelated mbufs do not share a cache line.
For EXT_EXTREF mbufs the zone_ext_refcnt is no longer needed, and m_extadd()
becomes void, making widely used M_EXTADD macro safe.
For EXT_SFBUF mbufs the sf_ext_ref() is removed, which was an optimization
exactly against the cache aliasing problem with regular refcounting.
Discussed with: rrs, rwatson, gnn, hiren, sbruno, np
Reviewed by: rrs
Differential Revision: https://reviews.freebsd.org/D5396
Sponsored by: Netflix
Drivers should set their own filters via ic_scan_start()/ic_scan_end()
callbacks; and we don't need frames other than beacons or probe responses.
(Note: this was a noop since r287197 due to promiscuous mode with bridge
workaround)
Tested with Intel 3945BG, RTL8188EU and WUSB54GC in HOSTAP mode.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D5474
- In case, when we are doing <smth> -> INIT (FEXT_REINIT) -> <smth2>
state transition, cancel_scan() may be called in the first transition.
Reenqueue second state transition, so things will be executed in order.
- Discard any AUTH+ state transition request when INIT -> SCAN
transition is not done.
- Allow to track discarded state transitions via 'state' debugging
category.
Tested with:
* RTL8188EU, HOSTAP mode.
* RTL8188CUS, STA mode.
* Intel 3945BG, IBSS and STA modes.
PR: 197498
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D5482
- Pass scan state and additional internal flags as a parameters.
- Add locked version.
Tested with:
* Intel 3945BG, STA mode.
* RTL8188EU, STA mode.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D5148
transmitted
- Use M_TXCB mechanism to report about null data frame transmission.
- Increase timeout from 1 to 10 ms (the previous one may be not enough
for non-empty queue).
Tested with:
* Intel 3945BG, STA mode.
* RTL8188CUS, STA mode.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D5147
scan_curchan_task() functions)
(This part should fix the problem, described in
https://lists.freebsd.org/pipermail/freebsd-wireless/2016-January/006420.html)
- Rename ss_scan_task into ss_scan_start (better describes it's
current purpose)
- Utilize taskqueue_*_timeout() functions instead of current mechanism:
* reschedule scan_curchan_task() via taskqueue_enqueue_timeout()
for every 'maxdwell' msecs (will replace infinite loop + sleeping
for 'maxdwell' period via cv_wait());
* rerun the task immediately when an external event occurs
(instead of waking it up via cv_signal())
Also, use mtx_sleep() to wait for null frame transmission
(allows to drop conditional variable).
Tested with:
* Intel 3945BG, STA mode;
* RTL8188EU, STA mode.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D5145
Use u_long instead of uint32_t variables to avoid overflow
in case of PA space bigger than 32-bit.
Obtained from: Semihalf
Submitted by: Michal Stanek <mst@semihalf.com>
Sponsored by: Annapurna Labs
Approved by: cognet (mentor)
Reviewed by: andrew, br, wma
Differential revision: https://reviews.freebsd.org/D5393
(32 and 64-bit, LE and BE).
The changes were tested with QEMU's 'mips' target.
Most of the implementation was lifted from the ARM version, the appropriate
MIPS-specific things were implemented.
With these changes I am able to go all the way through the u-boot->ubldr->kernel
boot chain in QEMU on all combinations of bit-ness and endian-ness.
For the tests I've used FAT32 disk images (as FAT32 is supported by U-boot),
which include /boot/kernel/kernel and /boot/kernel/ubldr.bin
In U-boot I do:
fatload ide 0 <LOAD_ADDR> /boot/kernel/ubldr.bin; go <LOAD_ADDR>
where LOAD_ADDR is 80800000 for 32-bit and ffffffff80800000 for 64-bit
Then it's the usual ubldr that takes over and loads and starts a kernel.
Approved by: adrian (mentor)
Sponsored by: Smartcom - Bulgaria AD
Differential Revision: https://reviews.freebsd.org/D5313
ubldr.
The changes are mostly dealing with removing unnecessary casts from the U-Boot
API (we're passing only pointers, no obvious reason to cast them to uint32_t),
cleaning up some compiler warnings and using the proper printf format
specifiers in order to be able to compile cleanly for both 32-bit and 64-bit
MIPS targets.
Reviewed by: imp
Approved by: adrian (mentor)
Sponsored by: Smartcom - Bulgaria AD
Differential Revision: https://reviews.freebsd.org/D5312
It would serve as a debug tool, if the shared buffer ring's indices
stopped updating.
Submitted by: HongJiang Zhang <honzhan microsoft com>
Reviewed by: sephe, Jun Su <junsu microsoft com>
Modified by: sephe
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5402
sfence only makes sure about the store-store order, which is not
sufficient here. Use atomic_thread_fence_seq_cst() as suggested
jhb and kib (a locked op in the nutshell, which should have the
Reviewed by: jhb, kib, Jun Su <junsu microsoft com>
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5436
Unlike buf_ring_peek, it only supports single consumer mode, and it
clears the cons_head if DEBUG_BUFRING/INVARIANTS is defined.
The normal use case of drbr_peek for network drivers is:
m = drbr_peek(br);
err = hw_spec_encap(&m); /* could m_defrag/m_collapse */
(*)
if (err) {
if (m == NULL)
drbr_advance(br);
else
drbr_putback(br, m);
/* break the loop */
}
drbr_advance(br);
The race is:
If hw_spec_encap() m_defrag or m_collapse the mbuf, i.e. the old mbuf
was freed, or like the Hyper-V's network driver, that transmission-
done does not even require the TX lock; then on the other CPU at the
(*) time, the freed mbuf could be recycled and being drbr_enqueue even
before the current CPU had the chance to call drbr_{advance,putback}.
This triggers a panic in drbr_enqueue duplicated element check, if
DEBUG_BUFRING/INVARIANTS is defined.
Use buf_ring_peek_clear_sc() in drbr_peek() to fix the above race.
This change is a NO-OP, if neither DEBUG_BUFRING nor INVARIANTS are
defined.
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5416
Freescale's QorIQ line includes a new ethernet controller, based on their
Datapath Acceleration Architecture (DPAA). This uses a combination of a Frame
manager, Buffer manager, and Queue manager to improve performance across all
interfaces by being able to pass data directly between hardware acceleration
interfaces.
As part of this import, Freescale's Netcomm Software (ncsw) driver is imported.
This was an attempt by Freescale to create an OS-agnostic sub-driver for
managing the hardware, using shims to interface to the OS-specific APIs. This
work was abandoned, and Freescale's primary work is in the Linux driver (dual
BSD/GPL license). Hence, this was imported directly to sys/contrib, rather than
going through the vendor area. Going forward, FreeBSD-specific changes may be
made to the ncsw code, diverging from the upstream in potentially incompatible
ways. An alternative could be to import the Linux driver itself, using the
linuxKPI layer, as that would maintain parity with the vendor-maintained driver.
However, the Linux driver has not been evaluated for reliability yet, and may
have issues with the import, whereas the ncsw-based driver in this commit was
completed by Semihalf 4 years ago, and is very stable.
Other SoC modules based on DPAA, which could be added in the future:
* Security and Encryption engine (SEC4.x, SEC5.x)
* RAID engine
Additional work to be done:
* Implement polling mode
* Test vlan support
* Add support for the Pattern Matching Engine, which can do regular expression
matching on packets.
This driver has been tested on the P5020 QorIQ SoC. Others listed in the
dtsec(4) manual page are expected to work as the same DPAA engine is included in
all.
Obtained from: Semihalf
Relnotes: Yes
Sponsored by: Alex Perez/Inertial Computing
urtwn_set_rx_bssid_all() will allow to receive beacons only
when they are not denied by filter.
Revealed by D5474.
Tested with RTL8188CUS, HOSTAP mode.
Reviewed by: kevlo
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D5477