freebsd-nq/sys/dev/ath/if_ath_misc.h

151 lines
5.3 KiB
C
Raw Normal View History

/*-
* Copyright (c) 2002-2009 Sam Leffler, Errno Consulting
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer,
* without modification.
* 2. Redistributions in binary form must reproduce at minimum a disclaimer
* similar to the "NO WARRANTY" disclaimer below ("Disclaimer") and any
* redistribution must be conditioned upon including a substantially
* similar Disclaimer requirement for further binary redistribution.
*
* NO WARRANTY
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTIBILITY
* AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
* THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR SPECIAL, EXEMPLARY,
* OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
* IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
* THE POSSIBILITY OF SUCH DAMAGES.
*
* $FreeBSD$
*/
#ifndef __IF_ATH_MISC_H__
#define __IF_ATH_MISC_H__
/*
* This is where definitions for "public things" in if_ath.c
* will go for the time being.
*
* Anything in here should eventually be moved out of if_ath.c
* and into something else.
*/
/* unaligned little endian access */
#define LE_READ_2(p) \
((u_int16_t) \
((((u_int8_t *)(p))[0] ) | (((u_int8_t *)(p))[1] << 8)))
#define LE_READ_4(p) \
((u_int32_t) \
((((u_int8_t *)(p))[0] ) | (((u_int8_t *)(p))[1] << 8) | \
(((u_int8_t *)(p))[2] << 16) | (((u_int8_t *)(p))[3] << 24)))
extern int ath_rxbuf;
extern int ath_txbuf;
extern int ath_txbuf_mgmt;
extern int ath_tx_findrix(const struct ath_softc *sc, uint8_t rate);
extern struct ath_buf * ath_getbuf(struct ath_softc *sc,
ath_buf_type_t btype);
extern struct ath_buf * _ath_getbuf_locked(struct ath_softc *sc,
ath_buf_type_t btype);
extern struct ath_buf * ath_buf_clone(struct ath_softc *sc,
struct ath_buf *bf);
/* XXX change this to NULL the buffer pointer? */
extern void ath_freebuf(struct ath_softc *sc, struct ath_buf *bf);
extern void ath_returnbuf_head(struct ath_softc *sc, struct ath_buf *bf);
extern void ath_returnbuf_tail(struct ath_softc *sc, struct ath_buf *bf);
extern int ath_reset(struct ifnet *, ATH_RESET_TYPE);
extern void ath_tx_default_comp(struct ath_softc *sc, struct ath_buf *bf,
int fail);
Introduce TX aggregation and software TX queue management for Atheros AR5416 and later wireless devices. This is a very large commit - the complete history can be found in the user/adrian/if_ath_tx branch. Legacy (ie, pre-AR5416) devices also use the per-software TXQ support and (in theory) can support non-aggregation ADDBA sessions. However, the net80211 stack doesn't currently support this. In summary: TX path: * queued frames normally go onto a per-TID, per-node queue * some special frames (eg ADDBA control frames) are thrown directly onto the relevant hardware queue so they can go out before any software queued frames are queued. * Add methods to create, suspend, resume and tear down an aggregation session. * Add in software retransmission of both normal and aggregate frames. * Add in completion handling of aggregate frames, including parsing the block ack bitmap provided by the hardware. * Write an aggregation function which can assemble frames into an aggregate based on the selected rate control and channel configuration. * The per-TID queues are locked based on their target hardware TX queue. This matches what ath9k/atheros does, and thus simplified porting over some of the aggregation logic. * When doing TX aggregation, stick the sequence number allocation in the TX path rather than net80211 TX path, and protect it by the TXQ lock. Rate control: * Delay rate control selection until the frame is about to be queued to the hardware, so retried frames can have their rate control choices changed. Frames with a static rate control selection have that applied before each TX, just to simplify the TX path (ie, not have "static" and "dynamic" rate control special cased.) * Teach ath_rate_sample about aggregates - both completion and errors. * Add an EWMA for tracking what the current "good" MCS rate is based on failure rates. Misc: * Introduce a bunch of dirty hacks and workarounds so TID mapping and net80211 frame inspection can be kept out of the net80211 layer. Because of the way this code works (and it's from Atheros and Linux ath9k), there is a consistent, 1:1 mapping between TID and AC. So we need to ensure that frames going to a specific TID will _always_ end up on the right AC, and vice versa, or the completion/locking will simply get very confused. I plan on addressing this mess in the future. Known issues: * There is no BAR frame transmission just yet. A whole lot of tidying up needs to occur before BAR frame TX can occur in the "correct" place - ie, once the TID TX queue has been drained. * Interface reset/purge/etc results in frames in the TX and RX queues being removed. This creates holes in the sequence numbers being assigned and the TX/RX AMPDU code (on either side) just hangs. * There's no filtered frame support at the present moment, so stations going into power saving mode will simply have a number of frames dropped - likely resulting in a traffic "hang". * Raw frame TX is going to just not function with 11n aggregation. Likely this needs to be modified to always override the sequence number if the frame is going into an aggregation session. However, general raw frame injection currently doesn't work in general in net80211, so let's just ignore this for now until this is sorted out. * HT protection is just not implemented and won't be until the above is sorted out. In addition, the AR5416 has issues RTS protecting large aggregates (anything >8k), so the work around needs to be ported and tested. Thus, this will be put on hold until the above work is complete. * The rate control module 'sample' is the only currently supported module; onoe/amrr haven't been tested and have likely bit rotted a little. I'll follow up with some commits to make them work again for non-11n rates, but they won't be updated to handle 11n and aggregation. If someone wishes to do so then they're welcome to send along patches. * .. and "sample" doesn't really do a good job of 11n TX. Specifically, the metrics used (packet TX time and failure/success rates) isn't as useful for 11n. It's likely that it should be extended to take into account the aggregate throughput possible and then choose a rate which maximises that. Ie, it may be acceptable for a higher MCS rate with a higher failure to be used if it gives a more acceptable throughput/latency then a lower MCS rate @ a lower error rate. Again, patches will be gratefully accepted. Because of this, ATH_ENABLE_11N is still not enabled by default. Sponsored by: Hobnob, Inc. Obtained from: Linux, Atheros
2011-11-08 22:43:13 +00:00
extern void ath_tx_update_ratectrl(struct ath_softc *sc,
struct ieee80211_node *ni, struct ath_rc_series *rc,
struct ath_tx_status *ts, int frmlen, int nframes, int nbad);
Overhaul the TXQ locking (again!) as part of some beacon/cabq timing related issues. Moving the TX locking under one lock made things easier to progress on but it had one important side-effect - it increased the latency when handling CABQ setup when sending beacons. This commit introduces a bunch of new changes and a few unrelated changs that are just easier to lump in here. The aim is to have the CABQ locking separate from other locking. The CABQ transmit path in the beacon process thus doesn't have to grab the general TX lock, reducing lock contention/latency and making it more likely that we'll make the beacon TX timing. The second half of this commit is the CABQ related setup changes needed for sane looking EDMA CABQ support. Right now the EDMA TX code naively assumes that only one frame (MPDU or A-MPDU) is being pushed into each FIFO slot. For the CABQ this isn't true - a whole list of frames is being pushed in - and thus CABQ handling breaks very quickly. The aim here is to setup the CABQ list and then push _that list_ to the hardware for transmission. I can then extend the EDMA TX code to stamp that list as being "one" FIFO entry (likely by tagging the last buffer in that list as "FIFO END") so the EDMA TX completion code correctly tracks things. Major: * Migrate the per-TXQ add/removal locking back to per-TXQ, rather than a single lock. * Leave the software queue side of things under the ATH_TX_LOCK lock, (continuing) to serialise things as they are. * Add a new function which is called whenever there's a beacon miss, to print out some debugging. This is primarily designed to help me figure out if the beacon miss events are due to a noisy environment, issues with the PHY/MAC, or other. * Move the CABQ setup/enable to occur _after_ all the VAPs have been looked at. This means that for multiple VAPS in bursted mode, the CABQ gets primed once all VAPs are checked, rather than being primed on the first VAP and then having frames appended after this. Minor: * Add a (disabled) twiddle to let me enable/disable cabq traffic. It's primarily there to let me easily debug what's going on with beacon and CABQ setup/traffic; there's some DMA engine hangs which I'm finally trying to trace down. * Clear bf_next when flushing frames; it should quieten some warnings that show up when a node goes away. Tested: * AR9280, STA/hostap, up to 4 vaps (staggered) * AR5416, STA/hostap, up to 4 vaps (staggered) TODO: * (Lots) more AR9380 and later testing, as I may have missed something here. * Leverage this to fix CABQ hanling for AR9380 and later chips. * Force bursted beaconing on the chips that default to staggered beacons and ensure the CABQ stuff is all sane (eg, the MORE bits that aren't being correctly set when chaining descriptors.)
2013-03-24 00:03:12 +00:00
extern int ath_hal_gethangstate(struct ath_hal *ah, uint32_t mask,
uint32_t *hangs);
extern void ath_tx_freebuf(struct ath_softc *sc, struct ath_buf *bf,
int status);
extern void ath_txq_freeholdingbuf(struct ath_softc *sc,
struct ath_txq *txq);
extern void ath_txqmove(struct ath_txq *dst, struct ath_txq *src);
extern void ath_mode_init(struct ath_softc *sc);
extern void ath_setdefantenna(struct ath_softc *sc, u_int antenna);
extern void ath_setslottime(struct ath_softc *sc);
extern int ath_descdma_alloc_desc(struct ath_softc *sc,
struct ath_descdma *dd, ath_bufhead *head, const char *name,
int ds_size, int ndesc);
extern int ath_descdma_setup(struct ath_softc *sc, struct ath_descdma *dd,
ath_bufhead *head, const char *name, int ds_size, int nbuf,
int ndesc);
extern int ath_descdma_setup_rx_edma(struct ath_softc *sc,
struct ath_descdma *dd, ath_bufhead *head, const char *name,
int nbuf, int desclen);
extern void ath_descdma_cleanup(struct ath_softc *sc,
struct ath_descdma *dd, ath_bufhead *head);
extern void ath_legacy_attach_comp_func(struct ath_softc *sc);
extern void ath_tx_draintxq(struct ath_softc *sc, struct ath_txq *txq);
extern void ath_legacy_tx_drain(struct ath_softc *sc,
ATH_RESET_TYPE reset_type);
extern void ath_tx_process_buf_completion(struct ath_softc *sc,
struct ath_txq *txq, struct ath_tx_status *ts, struct ath_buf *bf);
extern int ath_stoptxdma(struct ath_softc *sc);
extern void ath_tx_update_tim(struct ath_softc *sc,
struct ieee80211_node *ni, int enable);
/*
* This is only here so that the RX proc function can call it.
* It's very likely that the "start TX after RX" call should be
* done via something in if_ath.c, moving "rx tasklet" into
* if_ath.c and do the ath_start() call there. Once that's done,
* we can kill this.
*/
extern void ath_start(struct ifnet *ifp);
Push the actual TX processing into the ath taskqueue, rather than having it run out of multiple concurrent contexts. Right now the ath(4) TX processing is a bit hairy. Specifically: * It was running out of ath_start(), which could occur from multiple concurrent sending processes (as if_start() can be started from multiple sending threads nowdays.. sigh) * during RX if fast frames are enabled (so not really at the moment, not until I fix this particular feature again..) * during ath_reset() - so anything which calls that * during ath_tx_proc*() in the ath taskqueue - ie, TX is attempted again after TX completion, as there's now hopefully some ath_bufs available. * Then, the ic_raw_xmit() method can queue raw frames for transmission at any time, from any net80211 TX context. Ew. This has caused packet ordering issues in the past - specifically, there's absolutely no guarantee that preemption won't occuring _during_ ath_start() by the TX completion processing, which will call ath_start() again. It's a mess - 802.11 really, really wants things to be in sequence or things go all kinds of loopy. So: * create a new task struct for TX'ing; * make the if_start method simply queue the task on the ath taskqueue; * make ath_start() just be called by the new TX task; * make ath_tx_kick() just schedule the ath TX task, rather than directly calling ath_start(). Now yes, this means that I've taken a step backwards in terms of concurrency - TX -and- RX now occur in the same single-task taskqueue. But there's nothing stopping me from separating out the TX / TX completion code into a separate taskqueue which runs in parallel with the RX path, if that ends up being appropriate for some platforms. This fixes the CCMP/seqno concurrency issues that creep up when you transmit large amounts of uni-directional UDP traffic (>200MBit) on a FreeBSD STA -> AP, as now there's only one TX context no matter what's going on (TX completion->retry/software queue, userland->net80211->ath_start(), TX completion -> ath_start()); but it won't fix any concurrency issues between raw transmitted frames and non-raw transmitted frames (eg EAPOL frames on TID 16 and any other TID 16 multicast traffic that gets put on the CABQ.) That is going to require a bunch more re-architecture before it's feasible to fix. In any case, this is a big step towards making the majority of the TX path locking irrelevant, as now almost all TX activity occurs in the taskqueue. Phew.
2012-10-14 20:44:08 +00:00
extern void ath_start_task(void *arg, int npending);
Be (very) careful about how to add more TX DMA work. The list-based DMA engine has the following behaviour: * When the DMA engine is in the init state, you can write the first descriptor address to the QCU TxDP register and it will work. * Then when it hits the end of the list (ie, it either hits a NULL link pointer, OR it hits a descriptor with VEOL set) the QCU stops, and the TxDP points to the last descriptor that was transmitted. * Then when you want to transmit a new frame, you can then either: + write the head of the new list into TxDP, or + you write the head of the new list into the link pointer of the last completed descriptor (ie, where TxDP points), then kick TxE to restart transmission on that QCU> * The hardware then will re-read the descriptor to pick up the link pointer and then jump to that. Now, the quirks: * If you write a TxDP when there's been no previous TxDP (ie, it's 0), it works. * If you write a TxDP in any other instance, the TxDP write may actually fail. Thus, when you start transmission, it will re-read the last transmitted descriptor to get the link pointer, NOT just start a new transmission. So the correct thing to do here is: * ALWAYS use the holding descriptor (ie, the last transmitted descriptor that we've kept safe) and use the link pointer in _THAT_ to transmit the next frame. * NEVER write to the TxDP after you've done the initial write. * .. also, don't do this whilst you're also resetting the NIC. With this in mind, the following patch does basically the above. * Since this encapsulates Sam's issues with the QCU behaviour w/ TDMA, kill the TDMA special case and replace it with the above. * Add a new TXQ flag - PUTRUNNING - which indicates that we've started DMA. * Clear that flag when DMA has been shutdown. * Ensure that we're not restarting DMA with PUTRUNNING enabled. * Fix the link pointer logic during TXQ drain - we should always ensure the link pointer does point to something if there's a list of frames. Having it be NULL as an indication that DMA has finished or during a reset causes trouble. Now, given all of this, i want to nuke axq_link from orbit. There's now HAL methods to get and set the link pointer of a descriptor, so what we should do instead is to update the right link pointer. * If there's a holding descriptor and an empty TXQ list, set the link pointer of said holding descriptor to the new frame. * If there's a non-empty TXQ list, set the link pointer of the last descriptor in the list to the new frame. * Nuke axq_link from orbit. Note: * The AR9380 doesn't need this. FIFO TX writes are atomic. As long as we don't append to a list of frames that we've already passed to the hardware, all of the above doesn't apply. The holding descriptor stuff is still needed to ensure the hardware can re-read a completed descriptor to move onto the next one, but we restart DMA by pushing in a new FIFO entry into the TX QCU. That doesn't require any real gymnastics. Tested: * AR5210, AR5211, AR5212, AR5416, AR9380 - STA mode.
2013-05-18 18:27:53 +00:00
extern void ath_tx_dump(struct ath_softc *sc, struct ath_txq *txq);
/*
* Kick the frame TX task.
*/
static inline void
ath_tx_kick(struct ath_softc *sc)
{
Migrate ath(4) to now use if_transmit instead of the legacy if_start and if queue mechanism; also fix up (non-11n) TX fragment handling. This may result in a bit of a performance drop for now but I plan on debugging and resolving this at a later stage. Whilst here, fix the transmit path so fragment transmission works. The TX fragmentation handling is a bit more special. In order to correctly transmit TX fragments, there's a bunch of corner cases that need to be handled: * They must be transmitted back to back, in the same order.. * .. ie, you need to hold the TX lock whilst transmitting this set of fragments rather than interleaving it with other MSDUs destined to other nodes; * The length of the next fragment is required when transmitting, in order to correctly set the NAV field in the current frame to the length of the next frame; which requires .. * .. that we know the transmit duration of the next frame, which .. * .. requires us to set the rate of all fragments to the same length, or make the decision up-front, etc. To facilitate this, I've added a new ath_buf field to describe the length of the next fragment. This avoids having to keep the mbuf chain together. This used to work before my 11n TX path work because the ath_tx_start() routine would be handed a single mbuf with m_nextpkt pointing to the next frame, and that would be maintained all the way up to when the duration calculation was done. This doesn't hold true any longer - the actual queuing may occur at any point in the future (think ath_node TID software queuing) so this information needs to be maintained. Right now this does work for non-11n frames but it doesn't at all enforce the same rate control decision for all frames in the fragment. I plan on fixing this in a followup commit. RTS/CTS has the same issue, I'll look at fixing this in a subsequent commit. Finaly, 11n fragment support requires the driver to have fully decided what the rate scenario setup is - including 20/40MHz, short/long GI, STBC, LDPC, number of streams, etc. Right now that decision is (currently) made _after_ the NAV field value is updated. I'll fix all of this in subsequent commits. Tested: * AR5416, STA, transmitting 11abg fragments * AR5416, STA, 11n fragments work but the NAV field is incorrect for the reasons above. TODO: * It would be nice to be able to queue mbufs per-node and per-TID so we can only queue ath_buf entries when it's time to assemble frames to send to the hardware. But honestly, we should just do that level of software queue management in net80211 rather than ath(4), so I'm going to leave this alone for now. * More thorough AP, mesh and adhoc testing. * Ensure that net80211 doesn't hand us fragmented frames when A-MPDU has been negotiated, as we can't do software retransmission of fragments. * .. set CLRDMASK when transmitting fragments, just to ensure.
2013-05-26 22:23:39 +00:00
/* XXX NULL for now */
}
/*
* Kick the software TX queue task.
*/
static inline void
ath_tx_swq_kick(struct ath_softc *sc)
{
taskqueue_enqueue(sc->sc_tq, &sc->sc_txqtask);
}
#endif