freebsd-skq/sys/dev/ath
adrian 7945cc929c Implement my first cut at filtered frames in aggregation sessions.
The hardware can optionally "filter" frames if successive transmissions
to a given node (ie, "entry in the keycache") fail.  That way the hardware
can implement a kind of early abort of all the other frames queued to
that destination, rather than simply trying to TX each frame to that
destination (and failing.)

The background:

* If a frame comes back as being filtered, the hardware didn't try to
  TX it (or it was outside the TX burst opportunity.) So, take it as a hint
  that some (but not all, see below) frames to the destination may be
  filtered.

* If the CLRDMASK bit is set in a TX descriptor, the "filter to this
  destination" bit in the keycache entry is cleared and TX to that host
  will be unconditionally retried.

* Right now everything has the CLRDMASK bit set, so filtered frames
  tend to be aggregates and frames that fall outside of the WME burst
  window. It was a bit worse in the past as I had messed up the TX
  flags and CLRDMASK wasn't being set on aggregate frames.

The annoying bits:

* It's easy (ish) to do for aggregate session frames - firstly, they
  can be retried in any order as long as they're within the BAW, and
  there's already a bunch of infrastructure tracking how many frames
  the TID has queued to the hardware (tid->hwq_depth.) However, for
  frames that bypassed the software queue, hwq_depth doesn't get
  incremented. I'll fix that in a subsequent commit.

* For non-aggregate session frames, the only retries that can occur
  are ones for sequence numbers that hvaen't successfully been TXed yet.
  Since there's no re-ordering going on in non-aggregate sessions, if any
  subsequent seqno frames make it out, any filtered frames before that
  seqno need to be dropped.

  Hence why this initially is just for aggregate session frames.

* Since there may be intermediary frames to the destination that
  have CLRDMASK set - for example, any directly dispatched management
  frames to that destination - it's possible that there will be some
  filtered frames followed up by some non filtered frames.  Thus,
  it can't be assumed that once you see a filtered frame for the given
  destination node, all subsequent frames for all TIDs will be filtered.

Ok, with that in mind:

* Create a per-TID filtered frame queue for frames that the hardware
  returns as filtered.

* Track filtered frames per-tid, rather than per-node.  It just makes
  the locking much easier.

* When a filtered frame appears in the completion function, the node
  transitions to "filtered", and all subsequent completed error frames
  (filtered or otherwise) are put on the filtered frame queue.  The TID
  is paused once (during the transition from non-filtered to filtered).

* If a filtered frame retry count exceeds SWMAX_RETRIES, a BAR should be
  sent.

* Once all the frames queued to the hardware for the given filtered frame
  TID, transition back from filtered frame to non-filtered frame, which
  means pre-pending all the filtered frames onto the head of the software
  queue, clearing the filtered frame state and unpausing the TID.

Things get quite hairy around handling completion (aggr, non-aggr, norm,
direct-dispatched frames to a hardware queue); whether it's an "error",
"cleanup" or "BAR" state as well as filtered, which order to do things
in (eg do filtered BEFORE checking for BAR, as the filter completion
may be needed to actually transmit a BAR frame.)

This work has definitely reminded me that I have to tidy up all the locking
and remove some of the ridiculous lock/unlock/lock/unlock going on in the
completion functions.

It's also reminded me that I should really split out TID versus hardware TXQ
locking, even if the underlying locking is still the destination hardware TXQ.

Finally, this is all pre-requisite for working on AP mode power save support
(PS-POLL, uAPSD) as well as improving performance to misbehaving nodes (as
they can transition into filter mode, stopping any TX until everything has
caught up.)

Finally (ish) - this should also be done for non-aggregate sessions as
there are still plenty of laptops and mobile devices that don't speak
802.11n but do wish for stable, useful power save AP support where packets
aren't simply dropped.  This requires software retransmission for
non-aggregate sessions to be implemented, which includes the caveats I've
mentioned above.

Finally finally - this doesn't yet do anything about the CLRDMASK bit in the
TX descriptor.  That's still unconditionally set to 1.  I'll debug the
current work (mostly ensuring I haven't busted up the hairy transitions
between BAR, filtered, error (all frames in an aggregate failing) and
cleanup (when transitioning from aggregation -> non-aggregation.))

Finally finally finally - this is all original work by yours truely, rather
than ported from the Atheros internal driver codebase or Linux ath9k.

Tested:
 * AR9280, AR5416 in STA mode
 * AR9280, AR9130 in hostap mode
 * Lots and lots of iperf testing in very marginal and non-marginal conditions,
   complete with inducing filtered frames + BAR TX conditions.
2012-09-18 10:14:17 +00:00
..
ath_dfs/null Disable strong signal diversity when enabling radar pulse detection 2012-09-02 05:01:10 +00:00
ath_hal Add a couple of accessor inline functions for state that exists in net80211. 2012-09-18 01:27:24 +00:00
ath_rate Improve performance of the Sample rate algorithm on 802.11n networks. 2012-09-17 01:09:17 +00:00
ah_osdep.c Fix the HAL debugging to only use one bit to mark a message as unmaskable. 2012-07-01 02:34:32 +00:00
ah_osdep.h Migrate the ath_hal_filltxdesc() API to take a list of buffer/seglen values. 2012-08-05 10:12:27 +00:00
if_ath_ahb.c Flesh out a new DMA map for the EDMA TX completion status, as well 2012-07-23 02:49:25 +00:00
if_ath_beacon.c Extend the beacon code slightly to support AP mode beaconing for the 2012-08-11 23:26:19 +00:00
if_ath_beacon.h Extend the beacon code slightly to support AP mode beaconing for the 2012-08-11 23:26:19 +00:00
if_ath_debug.c Only print the descriptor contents! 2012-08-27 23:27:41 +00:00
if_ath_debug.h Add a debug bit for TX destination filtering. 2012-09-17 01:18:47 +00:00
if_ath_keycache.c Add a 'vap' to ath_keyset(). 2011-11-08 19:25:52 +00:00
if_ath_keycache.h Add a 'vap' to ath_keyset(). 2011-11-08 19:25:52 +00:00
if_ath_led.c Bring over the initial 802.11n bluetooth coexistence support code. 2012-06-26 22:16:53 +00:00
if_ath_led.h Refactor out the software LED config code into a common function, called 2011-12-26 05:46:22 +00:00
if_ath_misc.h Break out the TX completion code into a separate function, so it can be 2012-08-14 22:32:20 +00:00
if_ath_pci.c Flesh out a new DMA map for the EDMA TX completion status, as well 2012-07-23 02:49:25 +00:00
if_ath_rx_edma.c Remove - not needed. 2012-08-29 00:53:58 +00:00
if_ath_rx_edma.h Begin abstracting out the RX path in preparation for RX EDMA support. 2012-07-03 06:59:12 +00:00
if_ath_rx.c Remove extra debugging - there's no longer any need. 2012-08-29 00:53:41 +00:00
if_ath_rx.h Convert sc_rxpending to a per-EDMA queue, and use that for the legacy code. 2012-07-10 00:02:19 +00:00
if_ath_sysctl.c Implement my first cut at filtered frames in aggregation sessions. 2012-09-18 10:14:17 +00:00
if_ath_sysctl.h Break out most of the HAL related tweaks into a per-HAL instance, 2011-06-23 02:38:36 +00:00
if_ath_tdma.c Migrate the TDMA management functions out of if_ath.c into if_ath_tdma.c. 2012-05-20 02:49:42 +00:00
if_ath_tdma.h Migrate the TDMA management functions out of if_ath.c into if_ath_tdma.c. 2012-05-20 02:49:42 +00:00
if_ath_tsf.h Migrate the bulk of the RX routines out from if_ath.c to if_ath_rx.[ch]. 2012-05-20 02:05:10 +00:00
if_ath_tx_edma.c Initialise an uninitialised variable. 2012-08-21 16:44:25 +00:00
if_ath_tx_edma.h Begin separating out the TX DMA setup in preparation for TX EDMA support. 2012-07-23 03:52:18 +00:00
if_ath_tx_ht.c Correctly mask out the RTS/CTS flags when forming aggregates. 2012-09-08 02:56:09 +00:00
if_ath_tx_ht.h Introduce TX aggregation and software TX queue management 2011-11-08 22:43:13 +00:00
if_ath_tx.c Implement my first cut at filtered frames in aggregation sessions. 2012-09-18 10:14:17 +00:00
if_ath_tx.h Revert the ath_tx_draintxq() method, and instead teach it the minimum 2012-08-12 00:46:15 +00:00
if_ath.c Remove TDMA #define entries from if_ath.c; they now exist in if_ath_tdma.h. 2012-09-09 04:53:10 +00:00
if_athdfs.h Change the ath_dfs_process_phy_err() method to take an mbuf rather than 2012-06-24 08:09:06 +00:00
if_athioctl.h Implement my first cut at filtered frames in aggregation sessions. 2012-09-18 10:14:17 +00:00
if_athrate.h Create an ioctl API for fetching the current rate control information. 2012-07-20 01:27:20 +00:00
if_athvar.h Add a per-TID filter queue and filter state bits. 2012-09-17 01:21:55 +00:00