freebsd-dev

Author	SHA1	Message	Date
Adrian Chadd	9a2de0c3d6	[ath] reset hardware if this particular mac bug is seen. I have to dig into why I'm seeing it on chips as late as the AR9380 era stuff (as it's marked as an AR5416 bug, but who knows!) but i'm seeing aggregate TX frames complete with no blockack bit set. So, everything should be treated as a failure and do a hardware reset for good measure. Tested: * AR9380, STA mode * AR9580 (5GHz), AP mode	2020-05-21 04:26:20 +00:00
Adrian Chadd	051ea90c43	[ath_rate_sample] Limit the tx schedules for A-MPDU ; don't take short retries into account and remove the requirement that the MCS rate is "higher" if we're considering a new rate. Ok, another fun one. * In order for reliable non-software retried higher MCS rates, the TX schedules (inconsistently!) use hard-coded lower rates at the end of the schedule. Now, hard-coded is a problem because (a) it means that aggregate formation is limited by the SLOWEST rate, so I never formed large A-MPDU frames for 3 stream rates, and (b) if the AP disables lower rates as base rates, it complains about "unknown rix" every frame you transmit at that rate. So, for now just disable the third and fourth schedule entry for AMPDUs. Now I'm forming 32k and 64k aggregates for the higher density MCS rates much more reliably. It would be much nicer if the rate schedule stuff wasn't fixed but instead I'd just populate ath_rc_series[] when I fetch the rates. This is all a holdover of ye olde pre-11n stuff and I really just need to nuke it. But for now, ye hack. * The check for "is this MCS rate better" based on MCS itself is just garbage. It meant things like going MCS0->7 would be fine, and say 0->8->16 is fine, (as they're equivalent encoding but 1,2,3 spatial streams), BUT it meant going something like MCS7->11 would fail even though it's likely that MCS11 would just be better, both for EWMA/BER and throughput. So for now just use the average tx time. The "right" way for this comparison would be to compare PHY bitrates rather than MCS / rate indexes, but I'm not yet there. The bit rates ARE available in the PHY index, but honestly I have a lot of other cleaning up to do here before I think about that. * Don't include the RTS/CTS retry count (and thus time) into the average tx time calculation. It just makes temporary failures make the rate look bad by QUITE A LOT, as RTS/CTS exchanges are (a) long, and (b) mostly irrelevant to the actual rate being tried. 
If we keep hitting RTS/CTS failures then there's something ELSE wrong on the channel, not our selected rate.	2020-05-16 05:07:45 +00:00
Adrian Chadd	cce6344402	[ath] [ath_rate] Extend ath_rate_sample to better handle 11n rates and aggregates. My initial rate control code was .. suboptimal. I wanted to at least get MCS rates sent, but it didn't do anywhere near enough to handle low signal level links or remotely keep accurate statistics. So, 8 years later, here's what I should've done back then. * Firstly, I wasn't at all tracking packet sizes other than the two buckets (250 and 1600 bytes.) So, extend it to include 4096, 8192, 16384, 32768 and 65536. I may go add 2048 at some point if I find it's useful. This is important for a few reasons. First, when forming A-MPDU or AMSDU aggregates the frame sizes are larger, and thus the TX time calculation is woefully, increasingly wrong. Secondly, the behaviour of 802.11 channels isn't some fixed thing, both due to channel conditions and radios themselves. Notably, there was some observations done a few years ago on 11n chipsets which noticed longer aggregates showed an increase in failed A-MPDU sub-frame reception as you got further along in the transmit time. It could be due to a variety of things - transmitter linearity, channel conditions changing, frequency/phase drift, etc - but the observation was to potentially form shorter aggregates to improve BER. * .. and then modify the ath TX path to report the length of the aggregate sent, so as the statistics kept would line up with the correct bucket. * Then on the rate control look-up side - i was also only using the first frame length for an A-MPDU rate control lookup which isn't good enough here. So, add a new method that walks the TID software queue for that node to find out what the likely length of data available is. It isn't ALL of the data in the queue because we'll only ever send enough data to fit inside the block-ack window, so limit how many bytes we return to roughly what ath_tx_form_aggr() would do. * .. 
and cache that in the first ath_buf in the aggregate so it and the eventual AMPDU length can be returned to the rate control code. * THEN, modify the rate control code to look at them both when deciding which bucket to attribute the sent frame on. I'm erring on the side of caution and using the size bucket that the lookup is based on. Ok, so now the rate lookups and statistics are "more correct". However, MCS rates are not the same as 11abg rates in that they're not a monotonically incrementing set of faster rates and you can't assume that just because a given MCS rate fails, the next higher one wouldn't work better or be a lower average tx time. So, I had to do a bunch of surgery to the best rate and sample rate math. This is the bit that's a WIP. * First, simplify the statistics updates (update_stats()) to do a single pass on all rates. * Next, make sure that each rate average tx time is updated based on /its/ failure/success. Eg if you sent a frame with { MCS15, MCS12, MCS8 } and MCS8 succeeded, MCS15 and MCS12 would have their average tx time updated for /their/ part of the transmission, not the whole transmission. * Next, EWMA wasn't being fully calculated based on the /failures/ in each of the rate attempts. So, if MCS15, MCS12 failed above but MCS8 didn't, then ensure that the statistics noted that /all/ subframes failed at those rates, rather than the eventual set of transmitted/sent frames. This ensures the EWMA /and/ average TX time are updated correctly. * When picking a sample rate and initial rate, probe rates around the current MCS but limit it to MCS0..7 /for all spatial streams/, rather than doing crazy things like hitting MCS7 and then probing MCS8 - MCS8 is basically MCS0 but two spatial streams. It's a /lot/ slower than MCS7. Also, the reverse is true - if we're at MCS8 then don't probe MCS7 as part of it, it's not likely to succeed. 
* Fix bugs in pick_best_rate() where I was /immediately/ choosing the highest MCS rate if there weren't any frames yet transmitted. I was defaulting to 25% EWMA and .. then each comparison would accept the higher rate. Just skip those; sampling will fill in the details. So, this seems to work a lot better. It's not perfect; I'm still seeing a lot of instability around higher MCS rates because there are bursts of loss/retransmissions that aren't /too/ bad. But i'll keep iterating over this and tidying up my hacks. Ok, so why is this still something I'm poking at, rather than porting minstrel_ht? ath_rate_sample tries to minimise airtime, not maximise throughput. I have extended it with an EWMA based on sub-frame success/failures - high MCS rates that have partially successful receptions still show super short average frame times, but a /lot/ of retransmits have to happen for that to work. So for MCS rates I also track this EWMA and ensure that the rates I'm choosing don't have super crappy packet failures. I don't mind getting lower peak throughput versus minstrel_ht; instead I want to see if I can make "minimise airtime" work well. Tested: * AR9380, STA mode * AR9344, STA mode * AR9580, STA/AP mode	2020-05-15 18:51:20 +00:00
Adrian Chadd	84f950a54d	[ath] [ath_rate] Add some extra data into the rate control lookup. Right now (well, since I did this in 2011/2012) the rate control code makes some super bad choices for 11n aggregates/rates, and it tracks statistics even more questionably. It's been long enough and I'm now trying to use it again daily, so let's start by: * telling the rate control code if it's an aggregate or not; * being clearer about the TID - yes it can be extracted from the ath_buf but this way it can be overridden by the caller without changing the TID itself. (This is for doing experiments with voice/video QoS at some point..) * Return an optional field to limit how long the aggregate is in microseconds. Right now the rate control code supplies a rate table and the ath aggr form code will look at the rate table and limit the aggregate size to 4ms at the slowest rate. Yeah, this is pretty terrible. * Add some more TODO comments around handling txpower, rate and handling filtered frames status so if I continue to have spoons for this I can go poke at it.	2020-05-13 00:05:11 +00:00
Adrian Chadd	9fbe631a1a	[net80211] convert all of the WME use over to a temporary copy of WME info. This removes the direct WME info access in the ieee80211com struct and instead provides a method of fetching the data. Right now it's a no-op but eventually it'll turn into a per-VAP method for drivers that support it (eg iwn, iwm, upcoming ath10k work) as things like p2p support require this kind of behaviour. Tested: * ath(4), STA and AP mode TODO: * yes, this is slightly stack size-y, but it is an important first step to get drivers migrated over to a sensible WME API. A lot of per-phy things need to be converted to per-VAP before P2P, 11ac firmware, etc stuff shows up.	2018-01-02 00:07:28 +00:00
Pedro F. Giffuni	718cf2ccb9	sys/dev: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.	2017-11-27 14:52:40 +00:00
Adrian Chadd	15e58d4d26	[ath] prepare for "correct" group (bcast/mcast) address frame handling and software/hardware queue TID mapping. When I initially did this 11n TX work in days of yonder, my 802.11 standards clue was ... not as finely tuned. One of the things in 802.11-2012 (which I guess technically was after I did this work, but I'm sure it was like this in the previous rev?) is that among other traffic classes, three things are important: * group addressed frames should be default non-QoS, even if they're QoS frames, and * group addressed frames should have a seqno out of a different space than the per-TID QoS one; and because of this * group addressed frames, being non-QoS, should never be in the Block-ACK window for TX. Now, net80211 and now this code cheats by using the non-QOS TID, but ideally we'd introduce a separate seqno space just for multicast/group traffic for TX and RX comparison. Later extensions (eg reliable multicast / multimedia) express what one should do when doing multicast traffic in a TID. Now, technically we /could/ do group traffic as QoS traffic and throw it into a per-TID seqno space, but this definitely introduces ordering issues when you take into account things like CABQ behaviour. (Ie, if some traffic in the TID goes into the CABQ and some doesn't, because it's doing a split of multicast and non-multicast traffic, then you have seqno ordering issues.) So, until someone implements 802.11vv reliable multicast / multimedia extensions, group traffic is non-QoS. Next, software/hardware queue TID mapping. In the past I believed the WME tagging of frames because well, net80211 had a habit of tagging things like management traffic with it. But, then we also map QoS traffic categories to TIDs as well. So, we should obey the TID! But! then it put some management traffic into higher WME categories too, as those frames don't have QoS TIDs. But! 
It'd do things like put things like QoS action frames into higher WME categories, when they should be kept in-order with the rest of the traffic for that TID. So! Given all of this, the ath(4) driver does overrides to not trust the WME category. I .. am undoing some of this. Now, the TID has a 1:1 mapping to the hardware queue. The TID is the primary source of truth now for all QoS traffic. The WME is only used for non-QoS traffic. This now means that any TID traffic queued should be consistently queued regardless of WME, so things like the "TX finished, do more TX" that is occurring right now for transmit handling should be "better". The consistent {TID, WME} -> hardware queue mapping is important for transmit completion. It's used to schedule more traffic for that particular TID, because that {many TID}:{1 TXQ} mapping in ath_tx_tid_sched() is used for driving completion. Ie, when the hardware queue completes, it'll walk that list of scheduled TIDs attached to that TXQ. The eventual aim is to get ready for some other features around putting some data into other hardware queues (eg for better PS-POLL support, uAPSD support, correct-er TDMA support, etc) which requires that I tidy all of this up in preparation for then introducing further TID scheduling that isn't linked to a hardware TXQ (likely a per-WME, per-TID driver queue, and a per-node driver queue) to enable that. Tested: * AR9380, STA mode * AR9380, AR9580, AP mode	2017-03-19 05:00:14 +00:00
Adrian Chadd	39d5467677	[ath] log seqno, type and subtype when assigning sequence numbers for A-MPDU. This is just to improve adrian-debugging.	2017-01-31 20:57:40 +00:00
Adrian Chadd	57af292d36	[ath] fix thresholds for deciding to queue to the software queue and populate hardware frames This is two fixes, which establishes what I /think/ is pretty close to the theoretical PHY maximum speed on the AR9380 devices. * When doing A-MPDU on a TID, don't queue to the hardware directly if the hardware queue is busy. This gives us time to get more packets queued up (and the hardware is busy, so there's no point in queuing more to the hardware right now) to potentially form an A-MPDU. This fixes up the throughput issue I was seeing where a couple hundred single frames were being sent a second interspersed between A-MPDU frames. It just happened that the software queue had exactly one frame in it at that point. Queuing it until the hardware finishes transmitting isn't exactly costly. * When determining whether to dequeue from a software node/TID queue into the hardware queue, fix up the checks to work right for EDMA chips (ar9380 and later.) Before it was not dispatching anything until the FIFO was empty. Now we allow it to dispatch another aggregate up to the hardware aggregate limit, like I intended with the earlier work. This allows a 5GHz HT40, short-GI, "htprotmode off" test at MCS23 to achieve 357 Mbit/sec in a one-way UDP test. The stars have to be aligned /just right/ so there are no retries but it can happen. Just don't expect it to work in an OTA test if your 2yo is running around the room - MCS23 is very very sensitive to channel conditions. Tested: * AR9380 STA (test) -> AR9580 hostap TODO: * More thorough testing on pre-AR9380 chips (AR5416, AR9160, AR9280) * (Finally) teach ath_rate_sample about throughput/latency rather than air time, so I can get good transmit rates with a 2yo running around.	2017-01-23 04:30:08 +00:00
Andriy Voskoboinyk	887a63246c	net80211: remove IEEE80211_RADIOTAP_TSFT field from transmit definitions. This field may be used for received frames only. Differential Revision: https://reviews.freebsd.org/D3826 Differential Revision: https://reviews.freebsd.org/D3827	2016-09-20 18:53:42 +00:00
Adrian Chadd	5abc0b2590	[ath] set the relevant TOA/TOD locationing bits when trying to do locationing. * Don't do RTS/CTS - experiments show that we get ACK frames for each of them and this ends up causing the timestamps to look all funny. * Set the HAL_TXDESC_POS bit, so the AR9300 HAL sets up the hardware to return location and CSI information.	2016-09-12 04:55:13 +00:00
Adrian Chadd	7ff1939db0	[ath] [ath_hal] break out the duration calculation to optionally include SIFS. The pre-11n calculations include SIFS, but the 11n ones don't. The reason is that (mostly) the 11n hardware is doing the SIFS calculation for us but the pre-11n hardware isn't. This means that we're over-shooting the times in the duration field for non-11n frames on 11n hardware, which is OK, if not a little inefficient. Now, this is all fine for what the hardware needs for doing duration math for ACK, RTS/CTS, frame length, etc, but it isn't useful for doing PHY duration calculations. Ie, given a frame to TX and its timestamp, what would the end of the actual transmission time be; and similar for an RX timestamp and figuring out its original length. So, this adds a new field to the duration routines which requests SIFS or no SIFS to be included. All the callers currently will call it requesting SIFS, so this /should/ be a glorious no-op. I'm however planning some future work around airtime fairness and positioning which requires these routines to have SIFS be optional. Notably though, the 11n version doesn't do any SIFS addition at the moment. I'll go and tweak and verify all of the packet durations before I go and flip that part on. Tested: * AR9330, STA mode * AR9330, AP mode * AR9380, STA mode	2016-07-15 06:39:35 +00:00
Adrian Chadd	bcf5fc498a	[ath] commit initial bluetooth coexistence support for the MCI NICs. This is the initial framework to call into the MCI HAL routines and drive the basic state engine. The MCI bluetooth coex model uses a command channel between wlan and bluetooth, rather than a 2-wire or 3-wire signaling protocol to control things. This means the wlan and bluetooth chip exchange a lot more information and signaling, even at the per-packet level. The NICs in question can share the input LNA and output PA on the die, so they absolutely can't stomp on each other in a silly fashion. It also allows for the bluetooth side to signal when profiles come and go, so the driver can take appropriate control. There's also the possibility of dynamic bluetooth/wlan duty cycle control which I haven't yet really played with. It configures things up with a static "wlan wins everything" coexistence, configures up the available 2GHz channel map for bluetooth, sets a static duty cycle for bluetooth/wifi traffic priority and drives the basics needed to keep the MCI HAL code happy. It doesn't do any actual coexistence except to default to "wlan wins everything", which at least demonstrates that things do indeed work. Bluetooth inquiry frames still trump wifi (including beacons), so that demonstrates things really do indeed seem to work. Tested: * AR9462 (WB222), STA mode + bt * QCA9565 (WB335), STA mode + bt TODO: * .. the rest of coexistence. yes, bluetooth, not people. That stuff's hard. * It doesn't do the initial BT side calibration, which requires a WLAN chip reset. I'll fix up the reset path a bit more first before I enable that. * The 1-ant and 2-ant configuration bits aren't being set correctly in if_ath_btcoex.c - I'll dig into that and fix it in a subsequent commit. * It's not enabled by default for WB222/WB225 even though I believe it now can be - I'll chase that up in a subsequent commit. Obtained from: Qualcomm Atheros, Linux ath9k	2016-06-02 00:51:36 +00:00
Pedro F. Giffuni	f6b6084b8e	dev/ath: minor spelling fixes in comments. No functional change. Reviewed by: adrian	2016-05-02 19:56:48 +00:00
Adrian Chadd	82525db1eb	[ath] turn the BA hardware bug back into a printf(). I saw this happen a couple of times and all I saw was a dump of the transmit descriptors. Log the message for now so I can see what happened.	2016-04-29 01:52:06 +00:00
Adrian Chadd	d957a93abe	net80211: move ieee80211_free_node() call on error from ic_raw_xmit() to ieee80211_raw_output(). This doesn't free the mbuf upon error; the driver ic_raw_xmit method is still doing that. Submitted by: <s3erios@gmail.com> Differential Revision: https://reviews.freebsd.org/D3774	2015-10-12 04:55:20 +00:00
Adrian Chadd	d07be335a0	net80211: separate mbuf cleanup from ieee80211_fragment() * Create ieee80211_free_mbuf() which frees a list of mbufs. * Use it in the fragment transmit path and ath / uath transmit paths. * Call it in xmit_pkt() if the transmission fails; otherwise fragments may be leaked. This should be a big no-op. Submitted by: <s3erios@gmail.com> Differential Revision: https://reviews.freebsd.org/D3769	2015-10-12 03:27:08 +00:00
Gleb Smirnoff	7a79cebfba	Replay r286410. Change KPI of how device drivers that provide wireless connectivity interact with the net80211 stack. Historical background: originally wireless devices created an interface, just like Ethernet devices do. Name of an interface matched the name of the driver that created it. Later, wlan(4) layer was introduced, and the wlanX interfaces became the actual interface, leaving original ones as "a parent interface" of wlanX. Kernelwise, the KPI between net80211 layer and a driver became a mix of methods that pass a pointer to struct ifnet as identifier and methods that pass pointer to struct ieee80211com. From user point of view, the parent interface just hangs on in the ifconfig list, and user can't do anything useful with it. Now, the struct ifnet goes away. The struct ieee80211com is the only KPI between a device driver and net80211. Details: - The struct ieee80211com is embedded into drivers softc. - Packets are sent via new ic_transmit method, which is very much like the previous if_transmit. - Bringing parent up/down is done via new ic_parent method, which notifies driver about any changes: number of wlan(4) interfaces, number of them in promisc or allmulti state. - Device specific ioctls (if any) are received on new ic_ioctl method. - Packets/errors accounting are done by the stack. In certain cases, when driver experiences errors and can not attribute them to any specific interface, driver updates ic_oerrors or ic_ierrors counters. Details on interface configuration with new world order: - A sequence of commands needed to bring up wireless DOESN'T change. - /etc/rc.conf parameters DON'T change. - List of devices that can be used to create wlan(4) interfaces is now provided by net.wlan.devices sysctl. Most drivers in this change were converted by me, except for wpi(4), that was done by Andriy Voskoboinyk. Big thanks to Kevin Lo for testing changes to at least 8 drivers. 
Thanks to pluknet@, Oliver Hartmann, Olivier Cochard, gjb@, mmoll@, op@ and lev@, who also participated in testing. Reviewed by: adrian Sponsored by: Netflix Sponsored by: Nginx, Inc.	2015-08-27 08:56:39 +00:00
Adrian Chadd	3797bf0896	Remove most of the references of ifp->if_softc and replace with references to ic->ic_softc. This is in preparation for gleb's ifnet work. Tested: * ath(4), STA mode * ath(4), hostap mode * make universe	2015-08-17 02:04:11 +00:00
Adrian Chadd	ba2c1fbc03	Revert the wifi ifnet changes until things are more baked and tested. * 286410 * 286413 * 286416 The initial commit broke a variety of debug and features that aren't in the GENERIC kernels but are enabled in other platforms.	2015-08-08 01:10:17 +00:00
Gleb Smirnoff	79d2c5e857	Change KPI of how device drivers that provide wireless connectivity interact with the net80211 stack. Historical background: originally wireless devices created an interface, just like Ethernet devices do. Name of an interface matched the name of the driver that created it. Later, wlan(4) layer was introduced, and the wlanX interfaces became the actual interface, leaving original ones as "a parent interface" of wlanX. Kernelwise, the KPI between net80211 layer and a driver became a mix of methods that pass a pointer to struct ifnet as identifier and methods that pass pointer to struct ieee80211com. From user point of view, the parent interface just hangs on in the ifconfig list, and user can't do anything useful with it. Now, the struct ifnet goes away. The struct ieee80211com is the only KPI between a device driver and net80211. Details: - The struct ieee80211com is embedded into drivers softc. - Packets are sent via new ic_transmit method, which is very much like the previous if_transmit. - Bringing parent up/down is done via new ic_parent method, which notifies driver about any changes: number of wlan(4) interfaces, number of them in promisc or allmulti state. - Device specific ioctls (if any) are received on new ic_ioctl method. - Packets/errors accounting are done by the stack. In certain cases, when driver experiences errors and can not attribute them to any specific interface, driver updates ic_oerrors or ic_ierrors counters. Details on interface configuration with new world order: - A sequence of commands needed to bring up wireless DOESN'T change. - /etc/rc.conf parameters DON'T change. - List of devices that can be used to create wlan(4) interfaces is now provided by net.wlan.devices sysctl. Most drivers in this change were converted by me, except for wpi(4), that was done by Andriy Voskoboinyk. Big thanks to Kevin Lo for testing changes to at least 8 drivers. 
Thanks to Olivier Cochard, gjb@, mmoll@, op@ and lev@, who also participated in testing. Details here: https://wiki.freebsd.org/projects/ifnet/net80211 Still, drivers: ndis, wtap, mwl, ipw, bwn, wi, upgt, uath were not tested. Changes to mwl, ipw, bwn, wi, upgt are trivial and chances of problems are low. The wtap wasn't compilable even before this change. But the ndis driver is complex, and it is likely to be broken with this commit. Help with testing and debugging it is appreciated. Differential Revision: D2655, D2740 Sponsored by: Nginx, Inc. Sponsored by: Netflix	2015-08-07 11:43:14 +00:00
Gleb Smirnoff	76e6fd5d6c	Use device_printf() instead of if_printf(). No functional changes.	2015-05-29 14:35:16 +00:00
Gleb Smirnoff	2127b2e232	Mechanically convert to if_inc_counter().	2014-09-18 20:47:39 +00:00
Adrian Chadd	f5c30c4e8d	Bring over some initial power save management support, reset path fixes and beacon programming / debugging into the ath(4) driver. The basic power save tracking: * Add some new code to track the current desired powersave state; and * Add some reference count tracking so we know when the NIC is awake; then * Add code in all the points where we're about to touch the hardware and push it to force-wake. Then, how things are moved into power save: * Only move into network-sleep during a RUN->SLEEP transition; * Force wake the hardware up everywhere that we're about to touch the hardware. The net80211 stack takes care of doing RUN<->SLEEP<->(other) state transitions so we don't have to do it in the driver. Next, when to wake things up: * In short - everywhere we touch the hardware. * The hardware will take care of staying awake if things are queued in the transmit queue(s); it'll then transit down to sleep if there's nothing left. This way we don't have to track the software / hardware transmit queue(s) and keep the hardware awake for those. Then, some transmit path fixes that aren't related but useful: * Force EAPOL frames to go out at the lowest rate. This improves reliability during the encryption handshake after 802.11 negotiation. Next, some reset path fixes! * Fix the overlap between reset and transmit pause so we don't transmit frames during a reset. * Some noisy environments will end up taking a lot longer to reset than normal, so extend the reset period and raise the reset interval to be more realistic and give the hardware some time to finish calibration. * Skip calibration during the reset path. Tsk! Then, beacon fixes in station mode! * Add a _lot_ more debugging in the station beacon reset path. This is all quite fluid right now. * Modify the STA beacon programming code to try and take the TU gap between desired TSF and the target TU into account. (Lifted from QCA.) 
Tested: * AR5210 * AR5211 * AR5212 * AR5413 * AR5416 * AR9280 * AR9285 TODO: * More AP, IBSS, mesh, TDMA testing * Thorough AR9380 and later testing! * AR9160 and AR9287 testing Obtained from: QCA	2014-04-30 02:19:41 +00:00
Adrian Chadd	f172ef758e	Rewrite the cleanup code to, well, actually work right. The existing cleanup code was based on the Atheros reference driver from way back and stuff that was in Linux ath9k. It turned out to be .. rather silly. Specifically: * The whole method of determining whether there's hardware-queued frames was fragile and the BAW would never quite work right afterwards. * The cleanup path wouldn't correctly pull apart aggregate frames in the queue, so frames would not be freed and the BAW wouldn't be correctly updated. So to implement this: * Pull the aggregate frames apart correctly and handle each separately; * Make the atid->incomp counter just track the number of hardware queued frames rather than try to figure it out from the BAW; * Modify the aggregate completion path to handle it as a single frame (atid->incomp tracks the one frame now, not the subframes) and remove the frames from the BAW before completing them as normal frames; * Make sure bf->bf_next is NULled out correctly; * Make both aggregate session and non-aggregate path frames now be handled via the incompletion path. TODO: * kill atid->incomp; the driver tracks the hardware queued frames for each TID and so we can just use that. This is a stability fix that should be merged back to stable/10. Tested: * AR5416, STA MFC after: 7 days	2014-04-21 06:07:08 +00:00
Adrian Chadd	1771c64935	* Modify the debugging output from pause/resume to note the TID and STA MAC * Now that the paused < 0 bugs have been identified, make the DPRINTF() a device_printf() again. Anything else that shows up here needs to be fixed immediately. Tested: * AR5416, STA mode MFC after: 7 days	2014-04-21 02:09:14 +00:00
Adrian Chadd	706bb44485	Make sure bf_next is NULL'ed out when we're completing up an aggregate frame through the cleanup path. Whilst here, fix the indenting for something I messed up. Tested: * AR5416, STA mode	2014-04-21 02:05:51 +00:00
Adrian Chadd	59fbb5304d	Fix a cleanup hang if cleanup gets called _during_ an active cleanup. During power save testing I noticed that the cleanup code is being called during a RUN->RUN state transition. It's because the net80211 stack is treating that (for reasons I don't quite know yet) as a reassociation and this calls the node cleanup code. The reason it's seeing a RUN->RUN transition is because during active power save stuff it's possible that the RUN->SLEEP and SLEEP->RUN transitions happen so quickly that the deferred net80211 vap state code "loses" a transition, namely the intermediary SLEEP transition. So, this was causing the node reassociation code to sometimes be called twice in quick succession and this would result in ath_tx_tid_cleanup() being called again. The code calling it would always call pause, and then only call resume if the TID didn't have "cleanup_inprogress" set. Unfortunately it didn't check if it was already set on entry, so it would pause but not call resume. Thus, paused would be called more than once (once before each entry into ath_tx_tid_cleanup()) but resume would only be called once when the cleanup state was finished. This doesn't entirely fix all of the issues seen in the cleanup path but it's a necessary first step. Since this is a stability fix, it should be merged to stable/10 at some point. Tested: * AR5416, STA mode MFC after: 7 days	2014-04-21 01:02:49 +00:00
Adrian Chadd	42fdd8e726	Add some debugging and forcing of the BAW to match what the current tracked BAW actually is. The net80211 code that completes a BAR will set tid->txa_start (the BAW start) to whatever value was called when sending the BAR. Now, in case there's bugs in my driver code that cause the BAW to slip along, we should make sure that the new BAW we start at is actually what we currently have it at, not what we've sent. This totally breaks the specification and so this stays a printf(). If it happens then I need to know and fix it. Whilst here, add some debugging updates: * add TID logging to places where it's useful; * use SEQNO().	2014-04-08 07:14:14 +00:00
Adrian Chadd	8ec9220e81	Don't do continue inside the scheduler loop; we really need to check if we've hit the end of the list and cycled around to the first node again. Obtained from: DragonflyBSD	2014-04-08 07:10:52 +00:00
Adrian Chadd	1f7373066f	Correct the actual definition of ath_tx_tid_filt_comp_single() to match how it's used. This is another bug that led to aggregate traffic hanging because the BAW tracking stopped being accurate. In this instance, a filtered frame that exceeded retries would return a non-error, which would mean the caller would never remove it from the BAW. But it wouldn't be added to the filtered list, so it would be lost forever. There'd thus be a hole in the BAW that would never get transmitted and this leads to a traffic hang. Tested: * Routerstation Pro, AR9220 AP	2014-04-08 07:08:59 +00:00
Adrian Chadd	c5d230ab42	Add a comment explaining the obvious.	2014-04-08 07:01:27 +00:00
Adrian Chadd	a3fd3b1429	Don't resume a TID on each filtered frame completion - only do it if we did suspend it. The whole suspend/resume TID queue thing is supposed to be a matched reference count - a subsystem (eg addba negotiation, BAR transmission, filtered frames, etc) is supposed to call pause() once and then resume() once. ath_tx_tid_filt_comp_complete() is called upon the completion of any filtered frame, regardless of whether the driver had already seen a filtered frame and called pause(). So only call resume() if tid->isfiltered = 1, which indicates that we had called pause() once. This fixes a seemingly whacked and different problem - traffic hangs. What was actually going on: * There'd be some marginal link with crappy behaviour, causing filtered frames and BAR TXing to occur; * A BAR TX would occur, setting the new BAW (block-ack window) to seqno n; * .. and pause() would be called, blocking further transmission; * A filtered frame completion would occur from the hardware, but with tid->isfiltered = 0 which indicates we haven't actually marked the queue yet as filtered; * ath_tx_tid_filt_comp_complete() would call resume(), continuing transmission; * Some frames would be queued to the hardware, since the TID is now no longer paused; * .. and if some make it out and are ACked successfully, the new BAW may be seqno n+1 or more; * .. then the BAR TX completes and sets the new seqno back to n. At this point the BAW tracking would be loopy because the BAW start was modified but the BAW ring buffer wasn't updated in lock step. Tested: * Routerstation Pro + AR9220 AP	2014-04-08 07:00:43 +00:00
Adrian Chadd	6fc621c22c	Throw the flush messages behind ATH_DEBUG_RESET as well. These are needed to diagnose TX hangs that I and hiren are seeing. Without it, the only way we'll see debugging is by having ATH_DEBUG_SW_TX enabled and that is going to be very, very spammy. ATH_DEBUG_RESET is fine; it's only going to be done during stuck beacon situations in AP mode. Whilst I'm here, and now that it's behind debugging, let's just disable the "print only one" conditional. I'll eventually make it more tunable. Tested: * AR9220, hostap mode.	2014-03-20 23:16:58 +00:00
Rui Paulo	a2be2710b4	Call ieee80211_dump_pkt() based on IFF_DUMPPKTS(). MFC after: 3 days	2014-03-08 19:35:31 +00:00
Kevin Lo	5945b5f5ab	Rename definition of IEEE80211_FC1_WEP to IEEE80211_FC1_PROTECTED. The origin of WEP comes from IEEE Std 802.11-1997 where it defines whether the frame body of MAC frame has been encrypted using WEP algorithm or not. IEEE Std. 802.11-2007 changes WEP to Protected Frame, indicates whether the frame is protected by a cryptographic encapsulation algorithm. Reviewed by: adrian, rpaulo	2014-01-08 08:06:56 +00:00
Olivier Houchard	f431664c05	Include <sys/ktr.h>, since we need it if ATH_DEBUG is defined.	2013-10-28 20:26:34 +00:00
Gleb Smirnoff	76039bc84f	The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare for this event by adding if_var.h to files that do need it. Also, include all includes that now are included due to implicit pollution via if_var.h Sponsored by: Netflix Sponsored by: Nginx, Inc.	2013-10-26 17:58:36 +00:00
Rui Paulo	b372f122ab	Add a missing comma.	2013-10-17 05:51:54 +00:00
Rui Paulo	83bbd5ebf9	Move a lot of debugging printf's to DPRINTF. Approved by: adrian MFC after: 2 weeks	2013-10-17 01:53:07 +00:00
Adrian Chadd	272a8ab68a	Log the MAC address of the node in question rather than the pointer.	2013-08-17 01:14:28 +00:00
Adrian Chadd	5da3fc1048	Shuffle around the cleanup unpause calls a bit.	2013-05-29 01:40:13 +00:00
Adrian Chadd	cd7dffd058	Migrate ath(4) to now use if_transmit instead of the legacy if_start and if queue mechanism; also fix up (non-11n) TX fragment handling. This may result in a bit of a performance drop for now but I plan on debugging and resolving this at a later stage. Whilst here, fix the transmit path so fragment transmission works. The TX fragmentation handling is a bit more special. In order to correctly transmit TX fragments, there's a bunch of corner cases that need to be handled: * They must be transmitted back to back, in the same order.. * .. ie, you need to hold the TX lock whilst transmitting this set of fragments rather than interleaving it with other MSDUs destined to other nodes; * The length of the next fragment is required when transmitting, in order to correctly set the NAV field in the current frame to the length of the next frame; which requires .. * .. that we know the transmit duration of the next frame, which .. * .. requires us to set the rate of all fragments to the same length, or make the decision up-front, etc. To facilitate this, I've added a new ath_buf field to describe the length of the next fragment. This avoids having to keep the mbuf chain together. This used to work before my 11n TX path work because the ath_tx_start() routine would be handed a single mbuf with m_nextpkt pointing to the next frame, and that would be maintained all the way up to when the duration calculation was done. This doesn't hold true any longer - the actual queuing may occur at any point in the future (think ath_node TID software queuing) so this information needs to be maintained. Right now this does work for non-11n frames but it doesn't at all enforce the same rate control decision for all frames in the fragment. I plan on fixing this in a followup commit. RTS/CTS has the same issue, I'll look at fixing this in a subsequent commit. 
Finally, 11n fragment support requires the driver to have fully decided what the rate scenario setup is - including 20/40MHz, short/long GI, STBC, LDPC, number of streams, etc. Right now that decision is made _after_ the NAV field value is updated. I'll fix all of this in subsequent commits. Tested: * AR5416, STA, transmitting 11abg fragments * AR5416, STA, 11n fragments work but the NAV field is incorrect for the reasons above. TODO: * It would be nice to be able to queue mbufs per-node and per-TID so we can only queue ath_buf entries when it's time to assemble frames to send to the hardware. But honestly, we should just do that level of software queue management in net80211 rather than ath(4), so I'm going to leave this alone for now. * More thorough AP, mesh and adhoc testing. * Ensure that net80211 doesn't hand us fragmented frames when A-MPDU has been negotiated, as we can't do software retransmission of fragments. * .. set CLRDMASK when transmitting fragments, just to ensure.	2013-05-26 22:23:39 +00:00
Adrian Chadd	72910f03e5	Implement a separate hardware queue threshold for aggregate and non-aggr traffic. When transmitting non-aggregate traffic, we need to keep the hardware busy whilst transmitting or small bursts in txdone/tx latency will kill us. This restores non-aggregate iperf performance, especially when doing TDMA. Tested: * AR5416<->AR5416, TDMA * AR5416 STA <-> AR9280 AP	2013-05-21 18:13:57 +00:00
Adrian Chadd	6112d22c3f	More non-ATH_DEBUG build fixes.	2013-05-19 01:33:17 +00:00
Adrian Chadd	9be82a4209	Be (very) careful about how to add more TX DMA work. The list-based DMA engine has the following behaviour: * When the DMA engine is in the init state, you can write the first descriptor address to the QCU TxDP register and it will work. * Then when it hits the end of the list (ie, it either hits a NULL link pointer, OR it hits a descriptor with VEOL set) the QCU stops, and the TxDP points to the last descriptor that was transmitted. * Then when you want to transmit a new frame, you can then either: + write the head of the new list into TxDP, or + you write the head of the new list into the link pointer of the last completed descriptor (ie, where TxDP points), then kick TxE to restart transmission on that QCU. * The hardware then will re-read the descriptor to pick up the link pointer and then jump to that. Now, the quirks: * If you write a TxDP when there's been no previous TxDP (ie, it's 0), it works. * If you write a TxDP in any other instance, the TxDP write may actually fail. Thus, when you start transmission, it will re-read the last transmitted descriptor to get the link pointer, NOT just start a new transmission. So the correct thing to do here is: * ALWAYS use the holding descriptor (ie, the last transmitted descriptor that we've kept safe) and use the link pointer in _THAT_ to transmit the next frame. * NEVER write to the TxDP after you've done the initial write. * .. also, don't do this whilst you're also resetting the NIC. With this in mind, the following patch does basically the above. * Since this encapsulates Sam's issues with the QCU behaviour w/ TDMA, kill the TDMA special case and replace it with the above. * Add a new TXQ flag - PUTRUNNING - which indicates that we've started DMA. * Clear that flag when DMA has been shut down. * Ensure that we're not restarting DMA with PUTRUNNING enabled. 
* Fix the link pointer logic during TXQ drain - we should always ensure the link pointer does point to something if there's a list of frames. Having it be NULL as an indication that DMA has finished or during a reset causes trouble. Now, given all of this, I want to nuke axq_link from orbit. There's now HAL methods to get and set the link pointer of a descriptor, so what we should do instead is to update the right link pointer. * If there's a holding descriptor and an empty TXQ list, set the link pointer of said holding descriptor to the new frame. * If there's a non-empty TXQ list, set the link pointer of the last descriptor in the list to the new frame. * Nuke axq_link from orbit. Note: * The AR9380 doesn't need this. FIFO TX writes are atomic. As long as we don't append to a list of frames that we've already passed to the hardware, all of the above doesn't apply. The holding descriptor stuff is still needed to ensure the hardware can re-read a completed descriptor to move onto the next one, but we restart DMA by pushing in a new FIFO entry into the TX QCU. That doesn't require any real gymnastics. Tested: * AR5210, AR5211, AR5212, AR5416, AR9380 - STA mode.	2013-05-18 18:27:53 +00:00
Adrian Chadd	97c9a8e806	Add some more debugging printf()s to complain if the ath_buf tx queue doesn't match the actual hardware queue this frame is queued to. I'm trying to ensure that the holding buffers are actually being queued to the same TX queue as the holding buffer that they end up on. I'm pretty sure this is all correct so if this complains, it'll be due to some kind of subtle broken-ness that needs fixing. This is only done for legacy hardware, not EDMA hardware. Tested: * AR5416 STA mode, very lightly	2013-05-17 05:16:30 +00:00
Adrian Chadd	6d07d3e014	Tidy up the debugging - don't bother printing out TID pointers; now that we are printing out the MAC address in these fields, just printing out the TID is enough.	2013-05-16 17:53:12 +00:00
Adrian Chadd	b45a991e92	Limit the number of software queued frames when doing non-aggregation. This should prevent the TX queue being filled with non-aggregate frames, causing starvation and non-fair queue behaviour.	2013-05-16 17:46:32 +00:00
Adrian Chadd	22a3aee637	Implement my first cut at "correct" node power-save and PS-POLL support. This implements PS-POLL awareness in the * Implement frame "leaking", which allows for a software queue to be scheduled even though it's asleep * Track whether a frame has been leaked or not * Leak out a single non-AMPDU frame when transmitting aggregates * Queue BAR frames if the node is asleep * Direct-dispatch the rest of control and management frames. This allows for things like re-association to occur (which involves sending probe req/resp as well as assoc request/response) when the node is asleep and then tries reassociating. * Limit how many frames can sit in the software node queue whilst the node is asleep. net80211 is already buffering frames for us so this is mostly just paranoia. * Add a PS-POLL method which leaks out a frame if there's something in the software queue, else it calls net80211's ps-poll routine. Since the ath PS-POLL routine marks the node as having a single frame to leak, either a software queued frame would leak, OR the next queued frame would leak. The next queued frame could be something from the net80211 power save queue, OR it could be a NULL frame from net80211. TODO: * Don't transmit further BAR frames (eg via a timeout) if the node is currently asleep. Otherwise we may end up exhausting management frames due to the lots of queued BAR frames. I may just undo this bit later on and direct-dispatch BAR frames even if the node is asleep. * It would be nice to burst out a single A-MPDU frame if both ends support this. I may end up adding a FreeBSD IE soon to negotiate this power save behaviour. * I should make STAs time out of power save mode if they've been in power save for more than a handful of seconds. This way cards that get "stuck" in power save mode don't stay there for the "inactivity" timeout in net80211. * Move the queue depth check into the driver layer (ath_start / ath_transmit) rather than doing it in the TX path. 
* There could be some naughty corner cases with ps-poll leaking. Specifically, if net80211 generates a NULL data frame whilst another transmitter sends a normal data frame out net80211 output / transmit, we need to ensure that the NULL data frame goes out first. This is one of those things that should occur inside the VAP/ic TX lock. Grr, more investigations to do.. Tested: * STA: AR5416, AR9280 * AP: AR5416, AR9280, AR9160	2013-05-15 18:33:05 +00:00
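
The matched pause()/resume() rule described in a3fd3b1429 above can be sketched in a few lines of C. This is only an illustration under stated assumptions: struct tid_state and every function name below are hypothetical stand-ins, not the ath(4) driver code, and the BAR path is reduced to a bare pause call.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a per-TID software queue. */
struct tid_state {
	int paused;     /* pause reference count; > 0 blocks transmission */
	int isfiltered; /* 1 once the filtered-frame path has called pause */
};

static void tid_pause(struct tid_state *t)
{
	t->paused++;
}

static void tid_resume(struct tid_state *t)
{
	assert(t->paused > 0); /* every resume must match an earlier pause */
	t->paused--;
}

/* A filtered frame is seen: mark the TID and pause it exactly once. */
static void filt_frame_seen(struct tid_state *t)
{
	if (!t->isfiltered) {
		t->isfiltered = 1;
		tid_pause(t);
	}
}

/* Filtered-frame completion: resume only if this path did the pause. */
static void filt_comp_complete(struct tid_state *t)
{
	if (t->isfiltered) {
		t->isfiltered = 0;
		tid_resume(t);
	}
}

static int tid_can_tx(const struct tid_state *t)
{
	return t->paused == 0;
}

/*
 * The hang scenario from the commit message: a BAR pauses the TID, then
 * a filtered completion arrives while isfiltered is still 0. With the
 * guard in place the BAR's pause is left alone. Returns tid_can_tx().
 */
static int bar_then_stray_completion(void)
{
	struct tid_state t = { 0, 0 };

	tid_pause(&t);          /* BAR transmission pauses the TID */
	filt_comp_complete(&t); /* stray completion, isfiltered == 0 */
	return tid_can_tx(&t);
}
```

With the guard, the stray completion does not drop the BAR's pause reference, so no frames are queued behind the BAR and the BAW start cannot move out of step with the ring buffer.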

190 Commits