2011-01-29 11:35:23 +00:00
/*-
 * Copyright (c) 2002-2009 Sam Leffler, Errno Consulting
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer,
 *    without modification.
 * 2. Redistributions in binary form must reproduce at minimum a disclaimer
 *    similar to the "NO WARRANTY" disclaimer below ("Disclaimer") and any
 *    redistribution must be conditioned upon including a substantially
 *    similar Disclaimer requirement for further binary redistribution.
 *
 * NO WARRANTY
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTIBILITY
 * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
 * THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR SPECIAL, EXEMPLARY,
 * OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
 * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
 * THE POSSIBILITY OF SUCH DAMAGES.
 *
 * $FreeBSD$
 */
#ifndef __IF_ATH_MISC_H__
#define __IF_ATH_MISC_H__

/*
 * This is where definitions for "public things" in if_ath.c
 * will go for the time being.
 *
 * Anything in here should eventually be moved out of if_ath.c
 * and into something else.
 */

/* unaligned little endian access */
#define LE_READ_2(p) \
	((u_int16_t) \
	 ((((u_int8_t *)(p))[0] ) | (((u_int8_t *)(p))[1] << 8)))
#define LE_READ_4(p) \
	((u_int32_t) \
	 ((((u_int8_t *)(p))[0] ) | (((u_int8_t *)(p))[1] << 8) | \
	  (((u_int8_t *)(p))[2] << 16) | (((u_int8_t *)(p))[3] << 24)))

2012-07-09 08:37:59 +00:00
extern int ath_rxbuf;
extern int ath_txbuf;
extern int ath_txbuf_mgmt;

2011-01-29 11:35:23 +00:00
extern int ath_tx_findrix(const struct ath_softc *sc, uint8_t rate);

2012-06-13 06:57:55 +00:00
extern struct ath_buf * ath_getbuf(struct ath_softc *sc,
    ath_buf_type_t btype);
extern struct ath_buf * _ath_getbuf_locked(struct ath_softc *sc,
    ath_buf_type_t btype);
2011-11-08 21:13:05 +00:00
extern struct ath_buf * ath_buf_clone(struct ath_softc *sc,
2013-04-01 20:57:13 +00:00
    struct ath_buf *bf);
2012-06-13 06:57:55 +00:00
/* XXX change this to NULL the buffer pointer? */
2011-11-08 21:49:33 +00:00
extern void ath_freebuf(struct ath_softc *sc, struct ath_buf *bf);
2012-06-13 05:39:16 +00:00
extern void ath_returnbuf_head(struct ath_softc *sc, struct ath_buf *bf);
extern void ath_returnbuf_tail(struct ath_softc *sc, struct ath_buf *bf);
2011-01-29 11:35:23 +00:00

2011-11-08 18:56:52 +00:00
extern int ath_reset(struct ifnet *, ATH_RESET_TYPE);
2011-11-08 21:49:33 +00:00
extern void ath_tx_default_comp(struct ath_softc *sc, struct ath_buf *bf,
    int fail);
Introduce TX aggregation and software TX queue management
for Atheros AR5416 and later wireless devices.
This is a very large commit - the complete history can be
found in the user/adrian/if_ath_tx branch.
Legacy (ie, pre-AR5416) devices also use the per-software
TXQ support and (in theory) can support non-aggregation
ADDBA sessions. However, the net80211 stack doesn't currently
support this.
In summary:
TX path:
* queued frames normally go onto a per-TID, per-node queue
* some special frames (eg ADDBA control frames) are thrown
directly onto the relevant hardware queue so they can
go out before any software queued frames are queued.
* Add methods to create, suspend, resume and tear down an
aggregation session.
* Add in software retransmission of both normal and aggregate
frames.
* Add in completion handling of aggregate frames, including
parsing the block ack bitmap provided by the hardware.
* Write an aggregation function which can assemble frames into
an aggregate based on the selected rate control and channel
configuration.
* The per-TID queues are locked based on their target hardware
TX queue. This matches what ath9k/atheros does, and thus
simplified porting over some of the aggregation logic.
* When doing TX aggregation, stick the sequence number allocation
in the TX path rather than net80211 TX path, and protect it
by the TXQ lock.
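
The block ack bitmap parsing mentioned in the TX path list can be sketched roughly as follows. This is a stand-alone illustration and not the driver's code: the names `ba_index`, `ba_is_acked` and `ba_count_failed` are hypothetical, and the 64-entry window is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SEQ_RANGE	4096u	/* 802.11 sequence numbers are 12 bits wide */
#define BA_WINDOW	64	/* assumed bitmap width for this sketch */

/* Offset of 'seq' from the window start, modulo the sequence space. */
static int
ba_index(uint16_t ba_start, uint16_t seq)
{
	return ((int)(((unsigned)seq - (unsigned)ba_start) & (SEQ_RANGE - 1u)));
}

/* Returns 1 if the bitmap marks 'seq' as received, 0 otherwise. */
static int
ba_is_acked(uint16_t ba_start, uint64_t bitmap, uint16_t seq)
{
	int idx = ba_index(ba_start, seq);

	if (idx >= BA_WINDOW)
		return (0);	/* outside the window: treat as not acked */
	return ((int)((bitmap >> idx) & 1u));
}

/* Count the subframes of an aggregate that need a software retry. */
static int
ba_count_failed(uint16_t ba_start, uint64_t bitmap,
    const uint16_t *seqs, int n)
{
	int i, nbad = 0;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		if (!ba_is_acked(ba_start, bitmap, seqs[i]))
			nbad++;
	return (nbad);
}
```

The modulo arithmetic in `ba_index` is what lets a window that starts at 4094 cover sequence numbers 4094, 4095, 0 and 1 without special casing the wrap.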
Rate control:
* Delay rate control selection until the frame is about to
be queued to the hardware, so retried frames can have their
rate control choices changed. Frames with a static rate
control selection have that applied before each TX, just
to simplify the TX path (ie, not have "static" and "dynamic"
rate control special cased.)
* Teach ath_rate_sample about aggregates - both completion and
errors.
* Add an EWMA for tracking what the current "good" MCS rate is
based on failure rates.
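
As a rough illustration of the EWMA idea in the last bullet, the sketch below smooths a per-rate failure percentage with integer arithmetic. The structure `mcs_ewma`, the function `mcs_ewma_update` and the weight of 75 used in the note are all assumptions for the example, not what ath_rate_sample actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-rate state; not the rate control module's structure. */
struct mcs_ewma {
	int valid;	/* 0 until the first sample arrives */
	uint32_t pct;	/* smoothed failure percentage, 0..100 */
};

/*
 * Fold one TX completion into the average. 'nframes' is the number of
 * subframes sent and 'nbad' the number that failed. 'weight' (0..100)
 * is the share of the old value that is kept.
 */
static void
mcs_ewma_update(struct mcs_ewma *e, int nframes, int nbad, uint32_t weight)
{
	uint32_t sample;

	assert(nframes > 0 && nbad >= 0 && nbad <= nframes && weight <= 100);
	sample = (uint32_t)nbad * 100 / (uint32_t)nframes;
	if (!e->valid) {
		/* First sample seeds the average directly. */
		e->pct = sample;
		e->valid = 1;
		return;
	}
	e->pct = (e->pct * weight + sample * (100 - weight)) / 100;
}
```

With a weight of 75, one fully failed aggregate after a clean history moves the average to 25, so a single bad burst nudges the estimate without immediately disqualifying the rate.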
Misc:
* Introduce a bunch of dirty hacks and workarounds so TID mapping
and net80211 frame inspection can be kept out of the net80211
layer. Because of the way this code works (and it's from Atheros
and Linux ath9k), there is a consistent, 1:1 mapping between
TID and AC. So we need to ensure that frames going to a specific
TID will _always_ end up on the right AC, and vice versa, or the
completion/locking will simply get very confused. I plan on
addressing this mess in the future.
Known issues:
* There is no BAR frame transmission just yet. A whole lot of
tidying up needs to occur before BAR frame TX can occur in the
"correct" place - ie, once the TID TX queue has been drained.
* Interface reset/purge/etc results in frames in the TX and RX
queues being removed. This creates holes in the sequence numbers
being assigned and the TX/RX AMPDU code (on either side) just
hangs.
* There's no filtered frame support at the present moment, so
stations going into power saving mode will simply have a number
of frames dropped - likely resulting in a traffic "hang".
* Raw frame TX is going to just not function with 11n aggregation.
Likely this needs to be modified to always override the sequence
number if the frame is going into an aggregation session.
However, general raw frame injection currently doesn't work in
general in net80211, so let's just ignore this for now until
this is sorted out.
* HT protection is just not implemented and won't be until the above
is sorted out. In addition, the AR5416 has issues RTS protecting
large aggregates (anything >8k), so the work around needs to be
ported and tested. Thus, this will be put on hold until the above
work is complete.
* The rate control module 'sample' is the only currently supported
module; onoe/amrr haven't been tested and have likely bit rotted
a little. I'll follow up with some commits to make them work again
for non-11n rates, but they won't be updated to handle 11n and
aggregation. If someone wishes to do so then they're welcome to
send along patches.
* .. and "sample" doesn't really do a good job of 11n TX. Specifically,
the metrics used (packet TX time and failure/success rates) aren't as
useful for 11n. It's likely that it should be extended to take into
account the aggregate throughput possible and then choose a rate
which maximises that. Ie, it may be acceptable for a higher MCS rate
with a higher failure to be used if it gives a more acceptable
throughput/latency than a lower MCS rate @ a lower error rate.
Again, patches will be gratefully accepted.
Because of this, ATH_ENABLE_11N is still not enabled by default.
Sponsored by: Hobnob, Inc.
Obtained from: Linux, Atheros
2011-11-08 22:43:13 +00:00
extern void ath_tx_update_ratectrl(struct ath_softc *sc,
    struct ieee80211_node *ni, struct ath_rc_series *rc,
    struct ath_tx_status *ts, int frmlen, int nframes, int nbad);
2011-11-08 21:49:33 +00:00

Overhaul the TXQ locking (again!) as part of some beacon/cabq timing
related issues.
Moving the TX locking under one lock made things easier to progress on
but it had one important side-effect - it increased the latency when
handling CABQ setup when sending beacons.
This commit introduces a bunch of new changes and a few unrelated changes
that are just easier to lump in here.
The aim is to have the CABQ locking separate from other locking.
The CABQ transmit path in the beacon process thus doesn't have to grab
the general TX lock, reducing lock contention/latency and making it
more likely that we'll make the beacon TX timing.
The second half of this commit is the CABQ related setup changes needed
for sane looking EDMA CABQ support. Right now the EDMA TX code naively
assumes that only one frame (MPDU or A-MPDU) is being pushed into each
FIFO slot. For the CABQ this isn't true - a whole list of frames is
being pushed in - and thus CABQ handling breaks very quickly.
The aim here is to set up the CABQ list and then push _that list_ to
the hardware for transmission. I can then extend the EDMA TX code
to stamp that list as being "one" FIFO entry (likely by tagging the
last buffer in that list as "FIFO END") so the EDMA TX completion code
correctly tracks things.
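
The "FIFO END" tagging idea could look something like the sketch below. The commit only describes this as a likely future approach, so everything here is hypothetical: `toy_buf`, its `fifo_end` flag and both helper functions are invented for illustration and do not correspond to real `ath_buf` fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a TX buffer on a singly linked list. */
struct toy_buf {
	struct toy_buf *next;
	int fifo_end;	/* 1 on the last buffer of a FIFO entry */
};

/* Mark the last buffer so the whole list counts as one FIFO entry. */
static void
toy_mark_fifo_end(struct toy_buf *head)
{
	struct toy_buf *bf;

	if (head == NULL)
		return;
	for (bf = head; bf->next != NULL; bf = bf->next)
		bf->fifo_end = 0;
	bf->fifo_end = 1;
}

/* Completion side: count how many FIFO entries a buffer list retires. */
static int
toy_count_fifo_entries(const struct toy_buf *head)
{
	int n = 0;

	for (; head != NULL; head = head->next)
		if (head->fifo_end)
			n++;
	return (n);
}
```

A completion path written this way would advance its FIFO bookkeeping once per tagged buffer rather than once per frame, which is the property the CABQ needs.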
Major:
* Migrate the per-TXQ add/removal locking back to per-TXQ, rather than
a single lock.
* Leave the software queue side of things under the ATH_TX_LOCK lock,
(continuing) to serialise things as they are.
* Add a new function which is called whenever there's a beacon miss,
to print out some debugging. This is primarily designed to help
me figure out if the beacon miss events are due to a noisy environment,
issues with the PHY/MAC, or other.
* Move the CABQ setup/enable to occur _after_ all the VAPs have been
looked at. This means that for multiple VAPS in bursted mode, the
CABQ gets primed once all VAPs are checked, rather than being primed
on the first VAP and then having frames appended after this.
Minor:
* Add a (disabled) twiddle to let me enable/disable cabq traffic.
It's primarily there to let me easily debug what's going on with beacon
and CABQ setup/traffic; there's some DMA engine hangs which I'm finally
trying to trace down.
* Clear bf_next when flushing frames; it should quieten some warnings
that show up when a node goes away.
Tested:
* AR9280, STA/hostap, up to 4 vaps (staggered)
* AR5416, STA/hostap, up to 4 vaps (staggered)
TODO:
* (Lots) more AR9380 and later testing, as I may have missed something here.
* Leverage this to fix CABQ handling for AR9380 and later chips.
* Force bursted beaconing on the chips that default to staggered beacons and
ensure the CABQ stuff is all sane (eg, the MORE bits that aren't being
correctly set when chaining descriptors.)
2013-03-24 00:03:12 +00:00
extern int ath_hal_gethangstate(struct ath_hal *ah, uint32_t mask,
    uint32_t *hangs);

2011-11-08 21:49:33 +00:00
extern void ath_tx_freebuf(struct ath_softc *sc, struct ath_buf *bf,
    int status);
2013-03-26 19:46:51 +00:00
extern void ath_txq_freeholdingbuf(struct ath_softc *sc,
    struct ath_txq *txq);
2011-03-02 16:03:19 +00:00

2012-05-20 04:14:29 +00:00
extern void ath_txqmove(struct ath_txq *dst, struct ath_txq *src);

2012-05-20 02:05:10 +00:00
extern void ath_mode_init(struct ath_softc *sc);

extern void ath_setdefantenna(struct ath_softc *sc, u_int antenna);

2012-05-20 04:14:29 +00:00
extern void ath_setslottime(struct ath_softc *sc);

2012-07-27 05:34:45 +00:00
extern int ath_descdma_alloc_desc(struct ath_softc *sc,
    struct ath_descdma *dd, ath_bufhead *head, const char *name,
2012-07-27 05:48:42 +00:00
    int ds_size, int ndesc);
2012-07-09 08:37:59 +00:00
extern int ath_descdma_setup(struct ath_softc *sc, struct ath_descdma *dd,
2012-07-23 23:40:13 +00:00
    ath_bufhead *head, const char *name, int ds_size, int nbuf,
    int ndesc);
2012-07-14 02:07:51 +00:00
extern int ath_descdma_setup_rx_edma(struct ath_softc *sc,
    struct ath_descdma *dd, ath_bufhead *head, const char *name,
    int nbuf, int desclen);
2012-07-09 08:37:59 +00:00
extern void ath_descdma_cleanup(struct ath_softc *sc,
    struct ath_descdma *dd, ath_bufhead *head);

2012-07-31 03:09:48 +00:00
extern void ath_legacy_attach_comp_func(struct ath_softc *sc);
2012-08-12 00:37:29 +00:00

2012-08-12 00:46:15 +00:00
extern void ath_tx_draintxq(struct ath_softc *sc, struct ath_txq *txq);

2012-08-12 00:37:29 +00:00
extern void ath_legacy_tx_drain(struct ath_softc *sc,
    ATH_RESET_TYPE reset_type);
2012-07-31 03:09:48 +00:00

2012-08-14 22:32:20 +00:00
extern void ath_tx_process_buf_completion(struct ath_softc *sc,
    struct ath_txq *txq, struct ath_tx_status *ts, struct ath_buf *bf);

extern int ath_stoptxdma(struct ath_softc *sc);

2012-10-28 21:13:12 +00:00
extern void ath_tx_update_tim(struct ath_softc *sc,
    struct ieee80211_node *ni, int enable);

2012-05-20 02:05:10 +00:00
/*
 * This is only here so that the RX proc function can call it.
 * It's very likely that the "start TX after RX" call should be
 * done via something in if_ath.c, moving "rx tasklet" into
 * if_ath.c and do the ath_start() call there. Once that's done,
 * we can kill this.
 */
extern void ath_start(struct ifnet *ifp);
Push the actual TX processing into the ath taskqueue, rather than having
it run out of multiple concurrent contexts.
Right now the ath(4) TX processing is a bit hairy. Specifically:
* It was running out of ath_start(), which could occur from multiple
concurrent sending processes (as if_start() can be started from multiple
sending threads nowadays.. sigh)
* during RX if fast frames are enabled (so not really at the moment, not
until I fix this particular feature again..)
* during ath_reset() - so anything which calls that
* during ath_tx_proc*() in the ath taskqueue - ie, TX is attempted again
after TX completion, as there's now hopefully some ath_bufs available.
* Then, the ic_raw_xmit() method can queue raw frames for transmission
at any time, from any net80211 TX context. Ew.
This has caused packet ordering issues in the past - specifically,
there's absolutely no guarantee that preemption won't occur _during_
ath_start() by the TX completion processing, which will call ath_start()
again. It's a mess - 802.11 really, really wants things to be in
sequence or things go all kinds of loopy.
So:
* create a new task struct for TX'ing;
* make the if_start method simply queue the task on the ath taskqueue;
* make ath_start() just be called by the new TX task;
* make ath_tx_kick() just schedule the ath TX task, rather than directly
calling ath_start().
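
The four steps above boil down to "every caller just schedules one task, and only that task transmits". The user-space toy model below illustrates that shape. It is not kernel code: `toy_task`, `toy_softc` and the `toy_*` functions are hypothetical, there is no locking, and the real driver would use taskqueue(9) instead. The handler signature mirrors `ath_start_task(void *arg, int npending)`.

```c
#include <assert.h>
#include <stddef.h>

typedef void toy_task_fn_t(void *arg, int npending);

/* Hypothetical stand-in for a task on a single-threaded taskqueue. */
struct toy_task {
	toy_task_fn_t *fn;
	void *arg;
	int pending;	/* enqueue requests since the last run */
};

/* Hypothetical driver state touched only by the TX task. */
struct toy_softc {
	int tx_runs;
	int last_npending;
};

/* Any context may call this; repeated requests are coalesced. */
static void
toy_task_enqueue(struct toy_task *t)
{
	t->pending++;
}

/* Called only by the single worker; runs the handler at most once. */
static int
toy_task_run(struct toy_task *t)
{
	int n = t->pending;

	if (n == 0)
		return (0);
	t->pending = 0;
	t->fn(t->arg, n);
	return (1);
}

/* The one place where frames would actually be pushed to the hardware. */
static void
toy_start_task(void *arg, int npending)
{
	struct toy_softc *sc = arg;

	sc->tx_runs++;
	sc->last_npending = npending;
}
```

Three kicks from three different contexts result in a single run of the handler, which is why the ordering problems described above go away once everything funnels through the task.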
Now yes, this means that I've taken a step backwards in terms of
concurrency - TX -and- RX now occur in the same single-task taskqueue.
But there's nothing stopping me from separating out the TX / TX completion
code into a separate taskqueue which runs in parallel with the RX path,
if that ends up being appropriate for some platforms.
This fixes the CCMP/seqno concurrency issues that creep up when you
transmit large amounts of uni-directional UDP traffic (>200MBit) on a
FreeBSD STA -> AP, as now there's only one TX context no matter what's
going on (TX completion->retry/software queue,
userland->net80211->ath_start(), TX completion -> ath_start());
but it won't fix any concurrency issues between raw transmitted frames
and non-raw transmitted frames (eg EAPOL frames on TID 16 and any other
TID 16 multicast traffic that gets put on the CABQ.) That is going to
require a bunch more re-architecture before it's feasible to fix.
In any case, this is a big step towards making the majority of the TX
path locking irrelevant, as now almost all TX activity occurs in the
taskqueue.
Phew.
2012-10-14 20:44:08 +00:00
extern void ath_start_task(void *arg, int npending);
2012-05-20 02:05:10 +00:00

2013-02-07 02:15:25 +00:00
/*
 * Kick the frame TX task.
 */
2012-06-05 03:14:49 +00:00
static inline void
ath_tx_kick(struct ath_softc *sc)
{

Pull out the if_transmit() work and revert back to ath_start().
My changes caused some rather significant behavioural changes to throughput.
The two issues I noticed:
* With if_start and the ifnet mbuf queue, any temporary latency
would get eaten up by some mbufs being queued. With ath_transmit()
queuing things to ath_buf's, I'd only get 512 TX buffers before I
couldn't queue any further frames.
* There's also some non-zero latency involved with TX being pushed
into a taskqueue via direct dispatch. Any time the scheduler didn't
immediately schedule the ath TX task would cause extra latency.
Various 1ge/10ge drivers implement both direct dispatch (if the TX
lock can be acquired) and deferred task transmission (if the TX lock
can't be acquired), with frames being pushed into a drbr queue.
I'll have to do this at some point, but until I figure out how to
deal with 802.11 fragments, I'll have to wait a while longer.
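
The "direct dispatch if the lock is free, otherwise defer to a task" pattern described for the 1ge/10ge drivers can be sketched in user space as below. This is a single-threaded model of the control flow only: `toy_txq`, the fixed-size ring and the plain-integer try-lock are assumptions for illustration and are not safe for real concurrent use.

```c
#include <assert.h>

#define TOY_QLEN	8

/* Hypothetical TX queue: a try-lock plus a small software ring. */
struct toy_txq {
	int locked;		/* stands in for the TX lock */
	int ring[TOY_QLEN];	/* stands in for the software ring */
	int nqueued;
	int nsent;
};

static int
toy_trylock(struct toy_txq *q)
{
	if (q->locked)
		return (0);
	q->locked = 1;
	return (1);
}

static void
toy_unlock(struct toy_txq *q)
{
	q->locked = 0;
}

/* Push everything in the ring to the "hardware", in order. */
static void
toy_drain(struct toy_txq *q)
{
	q->nsent += q->nqueued;
	q->nqueued = 0;
}

/*
 * Returns 1 if the frame went out directly, 0 if it was left for the
 * deferred task, and -1 if the ring was full and the frame was dropped.
 */
static int
toy_transmit(struct toy_txq *q, int frame)
{
	if (q->nqueued == TOY_QLEN)
		return (-1);
	q->ring[q->nqueued++] = frame;	/* always enqueue to keep ordering */
	if (!toy_trylock(q))
		return (0);		/* lock holder or task will drain */
	toy_drain(q);
	toy_unlock(q);
	return (1);
}

/* Deferred task body; a real task would block on the lock instead. */
static void
toy_tx_task(struct toy_txq *q)
{
	if (!toy_trylock(q))
		return;
	toy_drain(q);
	toy_unlock(q);
}
```

Enqueueing before trying the lock keeps frames in order even when some go out directly and others wait for the task. The `-1` return is the "no explicit backpressure" case: once the ring is full, frames are simply dropped.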
So what I saw:
* lots of extra latency, especially under load - if the taskqueue
wasn't immediately scheduled, things went pear shaped;
* any extra latency would result in TX ath_buf's taking their sweet time
being replenished, so any further calls to ath_transmit() would drop
mbufs.
* .. yes, there's no explicit backpressure here - things are just dropped.
Eek.
With this, the general performance has gone up, but those subtle if_start()
related race conditions are back. For some reason, this is doubly-obvious
with the AR5416 NIC and I don't quite understand why yet.
There's an unrelated issue with AR5416 performance in STA mode (it's
fine in AP mode when bridging frames, weirdly..) that requires a little
further investigation. Specifically - it works fine on a Lenovo T40
(single core CPU) running a March 2012 9-STABLE kernel, but a Lenovo T60
(dual core) running an early November 2012 kernel behaves very poorly.
The same hardware with an AR9160 or AR9280 behaves perfectly.
2013-02-13 05:32:19 +00:00
	ATH_TX_LOCK(sc);
	ath_start(sc->sc_ifp);
	ATH_TX_UNLOCK(sc);
2012-06-05 03:14:49 +00:00
}
2012-05-20 04:14:29 +00:00

2013-02-07 02:15:25 +00:00
/*
 * Kick the software TX queue task.
 */
static inline void
ath_tx_swq_kick(struct ath_softc *sc)
{

2013-02-11 07:49:40 +00:00
	taskqueue_enqueue(sc->sc_tq, &sc->sc_txqtask);
2013-02-07 02:15:25 +00:00
}

2011-01-29 11:35:23 +00:00
#endif