This patchset implements a new TX lock, covering both the per-VAP (and
thus per-node) TX locking and the serialisation through to the underlying
physical device.
This implements the hard requirement that frames to the underlying physical
device are scheduled to the underlying device in the same order that they
are processed at the VAP layer. This includes adding extra encapsulation
state (such as sequence numbers and CCMP IV numbers.) Any order mismatch
here will result in dropped packets at the receiver.
There are multiple transmit contexts from the upper protocol layers as well
as the "raw" interface via the management and BPF transmit paths.
All of these need to be correctly serialised or bad behaviour will result
under load.
The specifics:
* add a new TX IC lock - it will eventually just be used for serialisation
to the underlying physical device but for now it's used for both the
VAP encapsulation/serialisation and the physical device dispatch.
This lock is specifically non-recursive.
* Methodize the parent transmit, vap transmit and ic_raw_xmit function
pointers; use lock assertions in the parent/vap transmit routines.
* Add a lock assertion in ieee80211_encap() - the TX lock must be held
here to guarantee sensible behaviour.
* Refactor out the packet sending code from ieee80211_start() - now
ieee80211_start() is just a loop over the ifnet queue and it dispatches
each VAP packet send through ieee80211_start_pkt().
Yes, I will likely rename ieee80211_start_pkt() to something that
better reflects its status as a VAP packet transmit path. More on
that later.
* Add locking around the management and BAR TX sending - to ensure that
encapsulation and TX are done hand-in-hand.
* Add locking in the mesh code - again, to ensure that encapsulation
and mesh transmit are done hand-in-hand.
* Add locking around the power save queue and ageq handling, when
dispatching to the parent interface.
* Add locking around the WDS handoff.
* Add a note in the mesh dispatch code that the TX path needs to be
re-thought-out - right now it's doing a direct parent device transmit
rather than going via the vap layer. It may "work", but it's likely
incorrect (as it bypasses any possible per-node power save and
aggregation handling.)
Why not a per-VAP or per-node lock?
Because in order to ensure per-VAP ordering, we'd have to hold the
VAP lock across parent->if_transmit(). There are a few problems
with this:
* There's some state being setup during each driver transmit - specifically,
the encryption encap / CCMP IV setup. That should eventually be dragged
back into the encapsulation phase but for now it lives in the driver TX path.
This should be locked.
* Two drivers (ath, iwn) re-use the node->ni_txseqs array in order to
allocate sequence numbers when doing transmit aggregation. This should
also be locked.
* Drivers may have multiple frames queued already - so when one calls
if_transmit(), it may end up dispatching multiple frames for different
VAPs/nodes, each needing a different lock when handling that particular
end destination.
So to be "correct" locking-wise, we'd end up needing to grab a VAP or
node lock inside the driver TX path when setting up crypto / AMPDU sequence
numbers, and we may already _have_ a TX lock held - mostly for the same
destination vap/node, but sometimes it'll be for others. That could lead
to LORs and thus deadlocks.
So for now, I'm sticking with an IC TX lock. It has the advantage of
papering over the above and it also has the added advantage that I can
assert that it's being held when doing a parent device transmit.
I'll look at splitting the locks out a bit more later on.
General outstanding net80211 TX path issues / TODO:
* Look into separating out the VAP serialisation and the IC handoff.
It's going to be tricky as parent->if_transmit() doesn't give me the
opportunity to split queuing from driver dispatch. See above.
* Work with monthadar to fix up the mesh transmit path so it doesn't go via
the parent interface when retransmitting frames.
* Push the encryption handling back into the driver, if it's at all
architectually sane to do so. I know it's possible - it's what mac80211
in Linux does.
* Make ieee80211_raw_xmit() queue a frame into VAP or parent queue rather
than doing a short-cut direct into the driver. There are QoS issues
here - you do want your management frames to be encapsulated and pushed
onto the stack sooner than the (large, bursty) amount of data frames
that are queued. But there has to be a saner way to do this.
* Fragments are still broken - drivers need to be upgraded to an if_transmit()
implementation and then fragmentation handling needs to be properly fixed.
Tested:
* STA - AR5416, AR9280, Intel 5300 abgn wifi
* Hostap - AR5416, AR9160, AR9280
* Mesh - some testing by monthadar@, more to come.
* If the first call succeeded but failed to transmit, a timer would
reschedule it via bar_timeout(). Unfortunately bar_timeout() didn't
check the return value from the ieee80211_send_bar() reattempt and
if that failed (eg the driver ic_raw_xmit() failed), it would never
re-arm the timer.
* If BARPEND is cleared (which ieee80211_send_bar() will do if it can't
TX), then re-arming the timer isn't enough - once bar_timeout() occurs,
it'll see BARPEND is 0 and not run through the rest of the routine.
So when rearming the timer, also set that flag.
* If the TX wasn't occuring, bar_tx_complete() wouldn't be called and the
driver callback wouldn't be called either. So the driver had no idea
that the BAR TX attempt had failed. In the ath(4) case, TX would stay
paused.
(There's no callback to indicate that BAR TX had failed or not;
only a "BAR TX was attempted". That's a separate, later problem.)
So call the driver callback (ic_bar_response()) before the ADDBA session
is torn down, so it has a chance of being notified that things didn't
quite go to plan.
I've verified that yes, this does suspend traffic for ath(4), retry BAR
TX even if the driver is failing ic_raw_xmit(), and then eventually giving
up and sending a DELBA. I'll address the "out of ath_buf" issue in ath(4)
in a subsequent commit - this commit just fixes the edge case where any
driver is (way) out of internal buffers/descriptors and fails frame TX.
PR: kern/168170
Reviewed by: bschmidt
MFC after: 1 month
Currently, a channel width change updates the 802.11n HT info data in
net80211 but it doesn't trigger any device changes. So the device
driver may decide that HT40 frames can be transmitted but the last
device channel set only had HT20 set.
Now, a task is scheduled so a hardware reset or change isn't done
during any active ongoing RX. It also means that it's serialised
with the other task operations (eg channel change.)
This isn't the final incantation of this work, see below.
For now, any unmodified drivers will simply receive a channel
change log entry. A subsequent patch to ath(4) will introduce
some basic channel change handling (by resetting the NIC.)
Other NICs may need to update their rate control information.
TODO:
* There's still a small window at the present moment where the
channel width has been updated but the task hasn't been fired.
The final version of this should likely pass in a channel width
field to the driver and let the driver atomically do whatever
it needs to before changing the channel.
PR: kern/166286
On a TX failure, ic_raw_xmit will still call ieee80211_node_free().
There's no need to call it here.
Submitted by: moonlightakkiy@yahoo.ca
Approved by: re (kib)
TX for the given TID needs to be paused during ADDBA requests (and unpaused
once the session is established.) Since net80211 currently doesn't implement
software aggregation, if this pause/unpause is done in the driver (as it
is in my development branch) then it will need to be unpaused both on
ADDBA response and on ADDBA timeout.
This callback allows the driver to unpause TX for the relevant TID.
Reviewed by: bschmidt
Intel 4965 devices for example have HT40 on 2GHz completely disabled
but it is still supported for 5GHz. To handle that in sta mode we
need to check if we can "upgrade" to a HT40 channel after the
association, if that is not possible but we are still announcing
support to the remote side we are left with a very flabby connection.
Reviewed by: adrian
Contrary to the rateset information in legacy frames the MCS Set
field also contains TX capability information in cases where the
number of available TX and RX spartial streams differ. Because a
rateset doesn't contain that information we have to pull the
those directly from the hardware capabilities.
Get rid of the assumption that every device is capable of 40MHz,
SGI and 2 spartial streams. Instead of printing, in the worst case,
8 times 76 MCS rates, print logically connect ranges and the
support RX/TX streams.
A device without 40MHz and SGI support looks like:
ath0: 2T2R
ath0: 11na MCS 20Mhz
ath0: MCS 0-7: 6.5Mbps - 65Mbps
ath0: MCS 8-15: 13Mbps - 130Mbps
ath0: 11ng MCS 20Mhz
ath0: MCS 0-7: 6.5Mbps - 65Mbps
ath0: MCS 8-15: 13Mbps - 130Mbps
adjust the IEEE80211_HTRATE_MAXSIZE constant, only MCS0 - 76 are valid
the other bits in the mcsset IE (77 - 127) are either reserved or used
for TX parameters.
o Process the BAR frame on the adhoc, mesh and sta modes
o Fix the format of the ADDBA reply frame
o Fix references to the spec section numbers
Also, print the all the MCS rates in bootverbose.
Sponsored by: iXsystems, Inc.
Obtained from: //depot/user/rpaulo/80211n/...
o add a new facility for components to register send+recv handlers
o ieee80211_send_action and ieee80211_recv_action now use the registered
handlers to dispatch operations
o rev ieee80211_send_action api to enable passing arbitrary data
o rev ieee80211_recv_action api to pass the 802.11 frame header as it may
be difficult to locate
o update existing IEEE80211_ACTION_CAT_BA and IEEE80211_ACTION_CAT_HT handling
o update mwl for api rev
Reviewed by: rpaulo
Approved by: re (kensmith)
o don't increment extracted tid, this was a vestige of IEEE80211_NONQOS_TID
being defined as 0 (w/ real tid's +1)
o handle 4-address frames (add IEEE80211_IS_DSTODS to check if an 802.11
header is DSTODS)
Submitted by: cbzimmer
Reviewed by: avatar
o replace DLT_IEEE802_11 support in net80211 with DLT_IEEE802_11_RADIO
and remove explicit bpf support from wireless drivers; drivers now
use ieee80211_radiotap_attach to setup shared data structures that
hold the radiotap header for each packet tx/rx
o remove rx timestamp from the rx path; it was used only by the tdma support
for debugging and was mostly useless due to it being 32-bits and mostly
unavailable
o track DLT_IEEE80211_RADIO bpf attachments and maintain per-vap and
per-com state when there are active taps
o track the number of monitor mode vaps
o use bpf tap and monitor mode vap state to decide when to collect radiotap
state and dispatch frames; drivers no longer explicitly directly check
bpf state or use bpf calls to tap frames
o handle radiotap state updates on channel change in net80211; drivers
should not do this (unless they bypass net80211 which is almost always
a mistake)
o update various drivers to be more consistent/correct in handling radiotap
o update ral to include TSF in radiotap'd frames
o add promisc mode callback to wi
Reviewed by: cbzimmer, rpaulo, thompsa
the net80211 layer has complete control over the handling of mgt frames
(in particular, the ac, tx rate, and retry count); this also allows us
to purge the M_LINK0 flag that was attached to mbufs to mark them as
needing encryption for shared key auth
o change ieee80211_send_setup to take a tid parameter so it can be used
to setup QoS frames
o correct BAR frame construction for AMPDU
o retransmit BAR frames until ACK'd or timeout (use tunables to
control behaviour, default is very aggressive)
o defer seq# update until BAR frame is ACK'd
o add BAR response handling callback for driver to interpose and
push new state to device or push pending aggregates
While here also:
o add backpointer to node in the per-tid tx aggregation data structure
o move ampdu tx state setup/teardown work to separate functions
o change ieee80211_parse_htcap and ieee80211_parse_htinfo to save only
internal state obtained from the ie's; no dynamic state such as
ni_chw is altered
o add ieee80211_ht_updateparams to parse ht cap+info ie's and update
dynamic node state
o change ieee80211_ht_node_init to not take an htcap ie that is parsed;
instead have the caller make a separate call as one caller wants to
parse the ie while another wants to parse both cap+info ie's and
update state so can better do this with ieee80211_ht_updateparams
These changes fix sta mode state handling where the node's channel
width was shifted to ht20/ht40 prematurely.
For receive:
o explicitly tag rx frames w/ M_AMPDU instead of passing frames through
the reorder processing according to the node having HT and the frame
being QoS data
o relax ieee80211_ampdu_reorder asserts to allow any frame to be passed
in, unsuitable frames are returned to the caller for normal processing;
this permits drivers that cannot inspect the PLCP to mark all data
frames as potential ampdu candidates with only a small penalty
o add M_AMPDU_MPDU to identify frames resubmitted from the reorder q
For transmit:
o tag aggregation candidates with M_AMPDU_MPDU
o fix the QoS ack policy set in ampdu subframes; we only support immediate
BA streams which should be marked for "normal ack" to get implicit block
ack behaviour; interestingly certain vendor parts BA'd frames with the
11e BA ack policy set
o do not assign a sequence # to aggregation candidates; this must be done
when frames are submitted for transmit (NB: this can/will be handled
better when aggregation is pulled up to net80211)
properly; was doing this only on a change in the protection mode
so the advertised protection mode in the beacon would oscillate.
Submitted by: Chris Zimmermann
o use txa_start to form the addba request and purge txa_seqstart
o fill txa_start before calling ic_addba_request to permit drivers
to override when they handle seq# generation (e.g. mwl)
Note this includes changes to all drivers and moves some device firmware
loading to use firmware(9) and a separate module (e.g. ral). Also there
no longer are separate wlan_scan* modules; this functionality is now
bundled into the wlan module.
Supported by: Hobnob and Marvell
Reviewed by: many
Obtained from: Atheros (some bits)