from a management frame transmission.
This bug is a bit loopy, so here goes.
The underlying cause is pretty easy to understand - the node isn't
referenced before passing into the callout, so if the node is deleted
before the callout fires, it'll dereference free'd memory.
The code path however is slightly more convoluted.
The functions _say_ mgt_tx - ie management transmit - which is partially
true. Yes, that callback is attached to the mbuf for some management
frames. However, it's only for frames relating to scanning and
authentication attempts. It helpfully drives the VAP state back to
"SCAN" if the transmission fails _OR_ (as I subsequently found out!)
if the transmission succeeds but the state machine doesn't make progress
towards being authenticated and active.
Now, the code itself isn't terribly clear about this.
It _looks_ like it's just handling the transmit failure case.
However, when you look at what goes on in the transmit success case, it's
moving the VAP state back to SCAN if it hasn't changed state since
the time the callback was scheduled. Ie, if it's in ASSOC or AUTH still,
it'll go back to SCAN. But if it has transitioned to the RUN state,
the comparison will fail and it'll not transition things back to the
SCAN state.
So, to fix this, I decided to leave everything the way it is and merely
fix the locking and remove the node reference.
The _better_ fix would be to turn this callout into a "assoc/auth request"
timeout callback and make the callout locked, thus eliminating all races.
However, until all the drivers have been fixed so that transmit completions
occur outside of any locking that's going on, it's going to be impossible
to do this without introducing LORs. So, I leave some of the evilness
in there.
Tested:
* AR5212, ath(4), STA mode
* 5100 and 4965 wifi, iwn(4), STA mode
I changed it to use if_transmit a while ago but apparently with monitor
mode the if_transmit method is overridden.
This is (mostly) a workaround until a more permanent solution can be
found.
Submitted by: Patrick Kelsey <kelsey@ieee.org>
Approved by: re@ (gjb)
The aim of this function is to eventually be the completion entry point
for all 802.11 encapsulated mbufs. All the wifi drivers end up doing
what is in this function so it's an easy win to turn it into a net80211
method and abstract out this code.
Ideally the drivers will all eventually be modified to queue up completed
mbufs and call this function with all the driver locks not held.
This will allow for some much more interesting software queue handling
in the future (like net80211 based A-MSDU, fast-frames, A-MPDU aggregation
and retransmission.)
Tested:
* ath(4), iwn(4)
together.
Add M_FLAG_PRINTF for use with printf(9) %b indentifier.
Use the generic mbuf flags print names in the net80211 code and adjust
the protocol specific bits for their new positions.
Change SCTP M_PROTO mapping from 5 to 1 to fit within the 16bit field
they use internally to store some additional information.
Discussed with: trociny, glebius
M_LASTFRAG flags to protocol specific flags.
Remove the now unused M_FRAG, M_FIRSTFRAG and M_LASTFRAG mbuf flags.
Discussed with: trociny, glebius, adrian
upper layer(s).
This eliminates the if_snd queue from net80211. Yay!
This unfortunately has a few side effects:
* It breaks ALTQ to net80211 for now - sorry everyone, but fixing
parallelism and eliminating the if_snd queue is more important
than supporting this broken traffic scheduling model. :-)
* There's no VAP and IC flush methods just yet - I think I'll add
some NULL methods for now just as placeholders.
* It reduces throughput a little because now net80211 will drop packets
rather than buffer them if the driver doesn't do its own buffering.
This will be addressed in the future as I implement per-node software
queues.
Tested:
* ath(4) and iwn(4) in STA operation
the normal and the mesh transmit paths can use.
The API is a bit horrible because it both consumes the mbuf and frees
the node reference regardless of whether it succeeds or not.
It's a hold-over from how the code behaves; it'd be nice to have it
not free the node reference / mbuf if TX fails and let the caller
decide what to do.
* Add 802.11n 2ghz and 5ghz tables, including legacy rates and up to
MCS23 rates (3x3.)
* Populate the rate code -> rate index lookup table with MCS _and_
normal rates, but _not_ the basic rate flag. Since the basic rate flag
is the same as the MCS flag, we can only use one.
* Introduce some accessor inlines that do PLCP and rate table lookup/access
and enforce that it doesn't set the basic rate bit. They're not
designed for MCS rates, so it will panic.
* Start converting drivers that use the rate table stuff to use the
accessor inlines and strip the basic flag.
* Teach AMRR about basic 11n - it's still as crap for MCS as it is
being used by iwn, so it's not a step _backwardS_.
* Convert iwn over to accept 11n MCS rates rather than 'translate' legacy
to MCS rates. It doesn't use a lookup table any longer; instead it's a
function which takes the current node (for HT parameters) and the
rate code, and returns the hardware PLCP code to use.
Tested:
* ath - it's a no-op, and it works that way
* iwn - both 11n and non-11n
The "find node" function call will increase the node reference anyway;
so there's no reason to hold the node table lock during the MLME change.
The only reason I could think of is to stop overlapping mlme ioctls
from causing issues, but this should be fixed a different way.
This fixes a whole class of LORs that creep up when nodes are being
timed out or removed by hostapd.
Tested:
* AR5416, hostap, with nodes coming and going. No LORs or stability
issues were observed.
When creating fragment frames, the header length should honour the
DATAPAD flag.
This fixes the fragments that are queued to the ath(4) driver but it
doesn't yet fix fragment transmission. That requires further changes
to the ath(4) transmit path. Well, strictly speaking, it requires
further changes to _all_ wifi driver transmit paths, but this is at least
a start.
Tested:
* AR5416, STA mode, w/ fragthreshold set to 256.
before using said node.
The "blessed" way here is to take a node reference before referencing
anything inside the node, otherwise the node can be freed between
the time the pointer is copied/dereferenced and the time the node contents
are used.
This mirrors fixes that I've done elsewhere in the net80211/driver
stack.
PR: kern/178470
the given node.
This takes into account the per-node cap, the ic cap and the
per-channel regulatory caps.
This is designed to replace references to ni_txpower in various net80211
drivers - ni_txpower doesn't necessarily reflect the actual cap for
the given node (eg if the node has the default value of 50dBm (100) and
the administrator has manually configured a lower TX power.)
This patchset implements a new TX lock, covering both the per-VAP (and
thus per-node) TX locking and the serialisation through to the underlying
physical device.
This implements the hard requirement that frames to the underlying physical
device are scheduled to the underlying device in the same order that they
are processed at the VAP layer. This includes adding extra encapsulation
state (such as sequence numbers and CCMP IV numbers.) Any order mismatch
here will result in dropped packets at the receiver.
There are multiple transmit contexts from the upper protocol layers as well
as the "raw" interface via the management and BPF transmit paths.
All of these need to be correctly serialised or bad behaviour will result
under load.
The specifics:
* add a new TX IC lock - it will eventually just be used for serialisation
to the underlying physical device but for now it's used for both the
VAP encapsulation/serialisation and the physical device dispatch.
This lock is specifically non-recursive.
* Methodize the parent transmit, vap transmit and ic_raw_xmit function
pointers; use lock assertions in the parent/vap transmit routines.
* Add a lock assertion in ieee80211_encap() - the TX lock must be held
here to guarantee sensible behaviour.
* Refactor out the packet sending code from ieee80211_start() - now
ieee80211_start() is just a loop over the ifnet queue and it dispatches
each VAP packet send through ieee80211_start_pkt().
Yes, I will likely rename ieee80211_start_pkt() to something that
better reflects its status as a VAP packet transmit path. More on
that later.
* Add locking around the management and BAR TX sending - to ensure that
encapsulation and TX are done hand-in-hand.
* Add locking in the mesh code - again, to ensure that encapsulation
and mesh transmit are done hand-in-hand.
* Add locking around the power save queue and ageq handling, when
dispatching to the parent interface.
* Add locking around the WDS handoff.
* Add a note in the mesh dispatch code that the TX path needs to be
re-thought-out - right now it's doing a direct parent device transmit
rather than going via the vap layer. It may "work", but it's likely
incorrect (as it bypasses any possible per-node power save and
aggregation handling.)
Why not a per-VAP or per-node lock?
Because in order to ensure per-VAP ordering, we'd have to hold the
VAP lock across parent->if_transmit(). There are a few problems
with this:
* There's some state being setup during each driver transmit - specifically,
the encryption encap / CCMP IV setup. That should eventually be dragged
back into the encapsulation phase but for now it lives in the driver TX path.
This should be locked.
* Two drivers (ath, iwn) re-use the node->ni_txseqs array in order to
allocate sequence numbers when doing transmit aggregation. This should
also be locked.
* Drivers may have multiple frames queued already - so when one calls
if_transmit(), it may end up dispatching multiple frames for different
VAPs/nodes, each needing a different lock when handling that particular
end destination.
So to be "correct" locking-wise, we'd end up needing to grab a VAP or
node lock inside the driver TX path when setting up crypto / AMPDU sequence
numbers, and we may already _have_ a TX lock held - mostly for the same
destination vap/node, but sometimes it'll be for others. That could lead
to LORs and thus deadlocks.
So for now, I'm sticking with an IC TX lock. It has the advantage of
papering over the above and it also has the added advantage that I can
assert that it's being held when doing a parent device transmit.
I'll look at splitting the locks out a bit more later on.
General outstanding net80211 TX path issues / TODO:
* Look into separating out the VAP serialisation and the IC handoff.
It's going to be tricky as parent->if_transmit() doesn't give me the
opportunity to split queuing from driver dispatch. See above.
* Work with monthadar to fix up the mesh transmit path so it doesn't go via
the parent interface when retransmitting frames.
* Push the encryption handling back into the driver, if it's at all
architectually sane to do so. I know it's possible - it's what mac80211
in Linux does.
* Make ieee80211_raw_xmit() queue a frame into VAP or parent queue rather
than doing a short-cut direct into the driver. There are QoS issues
here - you do want your management frames to be encapsulated and pushed
onto the stack sooner than the (large, bursty) amount of data frames
that are queued. But there has to be a saner way to do this.
* Fragments are still broken - drivers need to be upgraded to an if_transmit()
implementation and then fragmentation handling needs to be properly fixed.
Tested:
* STA - AR5416, AR9280, Intel 5300 abgn wifi
* Hostap - AR5416, AR9160, AR9280
* Mesh - some testing by monthadar@, more to come.
* The following bit flags where incroccetly defined:
o Mesh Control Present
o Mesh Power Save Level
o RSPI
This is now corrected according to Table 8.4 as per IEEE 802.11 2012;
Approved by: adrian (mentor)
since the former is defined everywhere. This cuts off some code not
necessary on non strict aligment arches.
Reviewed by: adrian
Sponsored by: Nginx, Inc.
* Add the superg.h header to allow ieee80211_check_ff() to work
* Since the assert stuff creates assertions based on line numbers and there
was a conflict, just nudge things down a bit.
* Added hwmp_update_transmitter function that checks if the metric
to the transmitter have improved. If old FI is invalid or metric
is larger the FI to the transmitter is updated occurdingly.
This is a recommendation from the 802.11 2012 standard, table 13-9;
Approved by: adrian (mentor)
* When calling ieee80211_mesh_rt_flush_peer, the rt->rt_dest argument
should not be passed because it can get freed before invalidating
the other routes that depends on it to compare with next_hop.
Use PERR_DADDR(i) instead;
Approved by: adrian (mentor)
* The standard is unclear about what should happen in case a mesh STA (not
marked as a mesh gate) recevies a PREQ for a destination that is marked
as proxy. Solution for now is not to do intermediate reply at all, and
let the PREQ reach the mesh gate;
Approved by: adrian (mentor)
* Original PREP frame is transmitted only by the target mesh STA or the
mesh STA that is the proxy target;
* Fixed so that metric value is not over written incorrectly in
hwmp_recv_preq for when replying back with a PREP;
Approved by: adrian (mentor)