Commit Graph

1815 Commits

Author SHA1 Message Date
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Warner Losh
95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
Marius Strobl
077ef5b445 ath(4): Remove MIPS SoC build glue and AR9130 from FreeBSD HAL
All of these are obsoleted by the general removal of MIPS support.

Actually, corresponding to the removed AH_SUPPORT_x, there is more
superfluous support sprinkled across the HAL source. However, that
code is left in place for now in order to ease a sync to NetBSD.

Reviewed by:	emaste (w/ man page fix)
Approved by:	adrian
Differential Revision:	https://reviews.freebsd.org/D41355
2023-08-09 17:44:46 +02:00
Marius Strobl
37c8ee8847 ath(4): Remove MIPS AHB frontend and join PCI one w/ main support again
Following the removal of general MIPS support, there's no longer a need
to have the AHB bus-frontend in place, which according to Linux sources
also isn't used with any non-MIPS SoCs. For simplicity, PCI bus support
is only made conditional on the main one again, i. e. device ath_pci is
removed, and built into the main module, i. e. if_ath_pci.ko obsoleted,
respectively.
Effectively, this reverts the following commits and associated changes:
dba9c85977
e849bb3ecb

Approved by:	adrian
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D41354
2023-08-08 22:30:13 +02:00
Warner Losh
4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Justin Hibbits
a6c0f09aa3 ath(4) IfAPI fixups
Use the if_getcounter() IfAPI instead of accessing the ifnet directly.

Sponsored by:	Juniper Networks, Inc.
2023-02-06 12:32:05 -05:00
Gordon Bergling
e4b7980c5c ath(4): Fix a sysctl description and a typo in a comment
- s/delimeter/delimiter/

MFC after:	7 days
2023-01-19 14:16:05 +01:00
Justin Hibbits
ec22a3a259 DrvAPI: Trivial mechanical conversions for various drivers
Mechanically convert the following drivers, with trivial changes:
* ipw(4)
* igc(4)
* enetc(4)
* malo(4)
* nfe(4)
* bxe(4)
* awg(4)
* otus(4)
* rtwn(4)
* bnxt(4)
* ath(4)

Sponsored by:	Juniper Networks, Inc.
2022-12-21 15:36:59 -05:00
John Baldwin
c1ebd4c98f ath: Fix mismatches in array bounds.
Reported by:	GCC -Warray-parameter
Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D37542
2022-12-07 12:30:42 -08:00
Gordon Bergling
f9a9fe46dc ath(4): Fix two typos in source code comments
- s/overriden/overridden/

MFC after:	3 days
2022-09-03 15:24:48 +02:00
John Baldwin
f859852f25 ath: Remove unused devclass arguments to DRIVER_MODULE. 2022-05-06 15:39:29 -07:00
Gordon Bergling
6cff32b3c4 ath(4): Remove a double word in a few source code comments
- s/for for/for/

MFC after:	3 days
2022-04-09 10:55:00 +02:00
Gordon Bergling
8eb774fe54 ath: Fix a typo in a source code comment
- s/ony/only/

MFC after:	3 days
2022-03-27 19:28:49 +02:00
Adrian Chadd
c50346bcf5 ath: bump the default node queue size to 128 frames, not 64
It turns out that, silly adrian, setting it to 64 means only two
AMPDU frames of 32 subframes each.  Thus, whilst those are in-flight,
any subsequent queues frames to that node get dropped.

This ends up being pretty no bueno for performance if any receive
is also going on at that point.

Instead, set it to 128 for the time being to ensure that SOME
frames get queued in the meantime.  This results in some frames
being immediately available in the software queue for transmit
when the two existing A-MPDU frames have been completely sent,
rather than the queue remaining empty until at least one is sent.

It's not the best solution - I still think I'm scheduling receive
far more often than giving time to schedule transmit work -
but at least now I'm not starving the transmit side.

Before this, a bidirectional iperf would show receive at ~ 150mbit/sec.
but the transmit side at like 10kbit/sec.  With it set to 128 it's
now 150mbit/sec receive, and ~ 10mbit receive.  It's better than 10kbit/sec,
but still not as far as I'd like it to be.

Tested:

* AR9380/QCA934x (TL-WDR4300 AP), Macbook pro test STA + AR9380 test STA
2021-05-22 21:23:00 -07:00
Adrian Chadd
f858e9281c [ath] Handle STA + AP beacon programming without stomping over HW AP beacon programming
I've been using STA+AP modes at home for a couple years now
and I've been finding and fixing a lot of weird corner cases.
This is the eventual patchset I've landed on.

* Don't force beacon resync in STA mode if we're using sw beacon tracking.
  This stops a variety of stomping issues when the STA VAP is reconfigured;
  the AP hardware beacons were being stomped on!

* Use the first AP VAP to configure beacons on, rather than the first VAP.
  This prevents weird behaviour in ath_beacon_config() when the hardware
  is being reconfigured and the STA VAP was the first one created.
* Ensure the beacon interval / timing programming is within the AR9300
  HAL bounds by masking off any flags that may have been there before
  shifting the value up to 1/8 TUs rather than the 1 TU resolution the
  previous chips used.

Now I don't get weird beacon reprogramming during startup, STA state
changes and hardware recovery which showed up as HI-LARIOUS beacon
configurations and STAs that would just disconnect from the AP very
frequently.

Tested:

* AR9344/AR9380, STA and AP and STA+AP modes
2021-05-22 16:39:16 -07:00
Adrian Chadd
1ca3996828 [ath] Add ast_tsfoor to the sysctl statistics array. 2021-05-22 15:54:16 -07:00
Adrian Chadd
61c83c4e8b [ath] Add ath_hal_getnav and ath_hal_setnav so the driver layer
can check the NAV as appropriate.
2021-04-18 22:59:28 -07:00
Adrian Chadd
bed90bf8ed [ath_hal] Add get/set NAV functions
The NAV (network allocation vector) register reflects the current MAC
tracking of NAV - when it will stay quiet before transmitting.

Other devices transmit their frame durations in their 802.11 PHY headers
and all devices that hear a frame - even if it's one in an encoding
they don't understand - will understand the low bitrate PHY header that
includes the frame duration.  So, they'll set NAV to this value so
they'll stay quiet until the transmit completes.

Anyway, sometimes the PHY NAV header is garbled and sometimes, notably
older broadcom devices, will fake a long NAV so they can get "cleaner" air
for local calibration.  When this happens, the hardware will stay quiet
for quite some time and this can lead to missed/stuck beacons, or
(for Very Large Values) a MAC hang.

This code just adds the ability to get/set the NAV; the driver will
need to take care of using it during transmit hangs and beacon misses
to see if it's due to a trash looking NAV.
2021-04-18 22:52:31 -07:00
Adrian Chadd
fb3edd4f3f [ath] do a cold reset if TSFOOR triggers
TSFOOR happens if a beacon with a given TSF isn't received within the
programmed/expected TSF value, plus/minus a fudge range. (OOR == out of range.)

If this happens then it could be because the baseband/mac is stuck, or
the baseband is deaf.  So, do a cold reset and resync the beacon to
try and unstick the hardware.

It also happens when a bad AP decides to err, slew its TSF because they
themselves are resetting and they don't preserve the TSF "well."

This has fixed a bunch of weird corner cases on my 2GHz AP radio upstairs
here where it occasionally goes deaf due to how much 2GHz noise is up
here (and ANI gets a little sideways) and this unsticks the station
VAP.

For AP modes a hung baseband/mac usually ends up as a stuck beacon
and those have been addressed for a long time by just resetting the
hardware.  But similar hangs in station mode didn't have a similar
recovery mechanism.

Tested:

* AR9380, STA mode, 2GHz/5GHz
* AR9580, STA mode, 5GHz
* QCA9344 SoC w/ on-board wifi (TL-WDR4300/3600 devices); 2GHz
  STA mode
2021-03-12 23:30:25 -08:00
Adrian Chadd
51dfae383b [ath] validate ts_antenna before updating tx statistics
Right now ts_antenna is either 0 or 1 in each supported HAL so
this is purely a sanity check.

Later on if I ever get magical free time I may add some extensions
for the NICs that can have slightly more complicated antenna switches
for transmit and I'd like this to not bust memory.
2021-03-12 17:33:07 -08:00
Mark Johnston
0e72eb4602 ath_hal: Stop printing messages during boot
ath_hal is compiled into the kernel by default and so always prints a
message to dmesg even when the system has no ath hardware.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-03-08 12:39:06 -05:00
Adrian Chadd
c1e16b6534 [ath] also remove the magic size value here for the transmit antenna statistics. 2020-12-08 17:28:42 +00:00
Adrian Chadd
68545bc433 [ath] Don't use hard-coded values in the sanity check.
Don't use hard-coded values in the phy error and receive antenna
checks.
2020-12-08 17:27:24 +00:00
Adrian Chadd
2511069a3f [ath] replace the hard-coded magic values in if_athioctl.h with constant defines
Replace some hard-coded magic values in the ioctl stats struct with
#defines.  I'm going to follow up with some more sanity checking in
the receive path that also use these values so we don't do bad
things if the hardware is (more) confused.
2020-12-08 17:25:59 +00:00
Bjoern A. Zeeb
fe5ebb23cc Provide MS() and SM() macros for 80211 and wireless drivers.
We have (two versions) of MS() and SM() macros which we use throughout
the wireless code.  Change all but three places (ath_hal, rtwn, and rsu)
to the newly provided _IEEE80211_MASKSHIFT() and _IEEE80211_SHIFTMASK()
macros.  Also change one internal case using both _S and _M instead of
just _S away from _M (one of the reasons rtwn and rsu were not changed).

This was done semi-mechanically.  No functional changes intended.

Requested by:	gnn (D26091)
Reviewed by:	adrian (pre line wrap)
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC (d/b/a "Netgate")
Differential Revision:	https://reviews.freebsd.org/D26539
2020-09-24 10:57:39 +00:00
Bjoern A. Zeeb
c6167b4bf5 WiFi: fix ieee80211_media_change() callers
In r178354 with the introduction of multi-bss ("vap") support factoring
out started and with r193340 ieee80211_media_change() no longer returned
ENETRESET but only 0 or error.
As ieee80211(9) tells the ieee80211_media_change() function should not
be called directly but is registered with ieee80211_vap_attach() instead.

Some drivers have not been fully converted.  After fixing the return
checking some of these functions were simply wrappers between
ieee80211_vap_attach() and ieee80211_media_change(), so remove the extra
function, where possible as well.

PR:		248955
Submitted by:	Tong Zhang (ztong0001 gmail.com) (original)
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2020-09-07 15:35:40 +00:00
Mateusz Guzik
9966c0f962 ath: clean up empty lines in .c and .h files 2020-09-01 21:41:07 +00:00
Adrian Chadd
b5e7ee4718 [ath_hal] Add KeyMiss for AR5212/AR5416 series chips.
This is a flag from the MAC that says the received packet didn't match
a keycache slot.  This isn't technically a problem as WEP keys don't
match keycache slots (they're "global" keys), but it could be useful
for tracking down CCMP decryption failures.

Right now it's a no-op - it mirrors what the AR9300 HAL does and it
just increments a counter.  But, hey, maybe one day I'll use it for
diagnosing keycache/CCMP decrypt issues.
2020-06-27 02:59:51 +00:00
Adrian Chadd
857e0646ca [if_ath] Don't update the beacon bits from beacon frames in hostapd mode.
This logic is running the beacon receive bits in STA+AP mode on both the
STA and AP side.  The STA side sees its beacons from the BSS fine; the
AP side is seeing other beacons on the same channel but with the BSS
node for some odd reason.  (I think it's a valid reason, but I currently
forget what that valid reason is.)

So, just to be cleaner about things, don't run the nexttbtt/etc bits
at all if we're in hostap mode.  If I ever get mesh working then maybe
I'll make sure it works right on mesh+ap and mesh+sta modes.

Whilst here, log the VAP i'm being called on to make it clearer what
is going on.  I may end up adding a VAP dprintf version of this at
some point.

Tested:

* AR9380, STA (DWDS client) + hostap on the same NIC
2020-06-07 05:08:44 +00:00
Adrian Chadd
f6287cc63c [ath] Don't re-program the beacon timers if we miss a beacon in software-beacon STA mode.
This is something I added a few years ago to handle resyncing the beacon if
we miss a beacon or need to sync after association/reassociation/powersave.

However, if we're doing STA+AP mode (eg DWDS) then we don't want
to reprogram the beacons here; this may upset normal AP operation.
I missed checking for the sc->sc_swbmiss flag so I was reinitialising
the beacon timers after every beacon miss / TSFOOR option, and
that isn't likely good.

This plus ensuring that STA's are created with "-beacon" to disable
BMISS/TSFOOR processing will hopefully quieten some of the issues
I've seen with missed beacons / TSFOOR (out of range) interrupts
coming in when operating in STA mode.

Tested:

* AR9380/AR9580, STA+AP modes
2020-06-01 06:10:25 +00:00
Adrian Chadd
9f716a645d [ath] Update ath_rate_sample to use the same base type as ticks.
Until net80211 grows a specific ticks type that matches the system,
manually use the same type as the kernel/net80211 'ticks' type
(signed int.)

Tested:

* AR9380, STA mode
2020-05-27 22:48:34 +00:00
Adrian Chadd
8c01c3dc46 [ath] [ath_hal] Propagate the HAL_RESET_TYPE through to the chip reset; set it during ath_reset()
Although I added the reset type field to ath_hal_reset() years ago,
I never finished adding it both throughout the HALs and in if_ath.c.

This will eventually deprecate the ath_hal force_full_reset option
because it can be requested at the driver layer.

So:

* Teach ar5416ChipReset() and ar9300_chip_reset() about the HAL type
* Use it in ar5416Reset() and ar9300_reset() when doing a full chip reset
* Extend ath_reset() to include the HAL_RESET_TYPE parameter added in the above functions
* Use HAL_RESET_NORMAL in most calls to ath_reset()
* .. but use HAL_RESET_BBPANIC for the BB panics, and HAL_RESET_FORCE_COLD during fatal, beacon miss and other hardware related hangs.

This should be a glorified no-op outside of actual hardware issues.
I've tested things with ath_hal force_full_reset set to 1 for years now,
so I know that feature and a full reset works (albeit much slower than
a warm reset!) and it does unwedge hardware.

The eventual aim is to use this for all the places where the driver
detects a potential hang as well as if long calibration - ie, noise floor
calibration - fails to complete. That's one of the big hardware related
things that causes station mode operation to hang without easy recovery.

Differential Revision:	https://reviews.freebsd.org/D24981
2020-05-25 22:31:45 +00:00
Adrian Chadd
a100c050eb [ath] Hopefully recover better-er upon RX restart on AR9380.
This is all very long-standing bug stuff that is touchy and still poorly
documented. Ok, here goes.

The basic bug:

* deleting a VAP causes the RX path (and TX path too) to be restarted
  without a full chip reset, which causes RX hangs on the AR9380 and later.
  (ie, the ones with the newer DMA engine.)

The basic fix:

* do an RX flush when stopping RX in ath_vap_delete() to match what happens
  when RX is stopped elsewhere.  This ensures any pending frames are completed
  and we restart at the right spot; it also ensures we don't push new RX buffers
  into the hardware if we're stopping receive.

The other issues I found:

* Don't bother checking the RX packet ring in the deferred read taskqueue;
  that's specifically supposed to be for completing frames rather than
  just yanking them off the receive ring.

* Cancel/drain any pending deferred read taskqueue.  This isn't done inside
  any locks so we should be super careful here.  This stops the hardware
  being reprogrammed at the same time in another thread/CPU whilst we're
  stopping RX.

* .. (yes, this should be better serialised, but that's for another day. maybe.)

* Add more debugging to trace what's going on here.

And the fun bit:

* Reinitialise the RX FIFO ONLY if we've been reset or stopped, rather than just
  reset.  I noticed that after all the above was done I was STILL seeing RXEOL.
  RXEOL isn't enabled on the AR9380 so I'd only see it if I was sending TX frames
  (ie a ping where it'd be transmitted but never received) so I was not being
  spammed by RXEOL.  So, as long as stuff is stopped, restart it.

This seems to be doing the right thing in both AP and STA modes.

What I should do next, if I ever get time:

* as I said above, serialise the receive stop/start to include taskqueues
* monitor RXEOL on the AR9380 and I keep seeing it spammed / lockups, just
  go do a full chip reset to get things back on track. It sucks, but it
  is better than nothing.

Tested:

* AR9380 AP/STA mode, adding/deleting a hostap VAP to trigger the TX/RX
  queue stop/start; whilst also running an iperf through it.  Lots of times.
  Lots.  Of.. Times.
2020-05-21 04:35:12 +00:00
Adrian Chadd
9a2de0c3d6 [ath] reset hardware if this particular mac bug is seen.
I have to dig into why I'm seeing it on chips as late as the AR9380 era
stuff (as it's marked as an AR5416 bug, but who knows!) but i'm seeing
aggregate TX frames complete with no blockack bit set.  So, everything
should be treated as a failure and do a hardware reset for good measure.

Tested:

* AR9380, STA mode
* AR9580 (5GHz), AP mode
2020-05-21 04:26:20 +00:00
Adrian Chadd
8af1445957 [ath_rate_sample] Obey the maximum frame length even when using static rates.
I wasn't enforcing the maximum packet length when using static rates
so although the driver was enforcing it itself OK, the statistics were
sometimes going into the wrong bin.

Tested:

* AR9380, STA mode
2020-05-21 03:53:45 +00:00
Adrian Chadd
cf43155590 [ath_rate_sample] Fix correct status when completing frames with short failures.
My preivous logic was a bit wrong.  This caused transmissions that failed due
to a mix of short and long retries to count intermediate rates as OK if the
LONG retry count indicated some retries had made it to this intermediate rate,
but the SHORT retry count was the one that caused the whole transmit to fail.

Now status is passed in again - and this is the status for the whole transmission -
and then update_stats() does some quick math to see if the current transmission
series hit its long retry count or not before updating things as a success
or failure.
2020-05-16 21:59:41 +00:00
Adrian Chadd
051ea90c43 [ath_rate_sample] Limit the tx schedules for A-MPDU ; don't take short retries
into account and remove the requirement that the MCS rate is "higher" if we're
 considering a new rate.

Ok, another fun one.

* In order for reliable non-software retried higher MCS rates, the TX schedules
  (inconsistently!) use hard-coded lower rates at the end of the schedule.
  Now, hard-coded is a problem because (a) it means that aggregate formation
  is limited by the SLOWEST rate, so I never formed large AMDU frames for
  3 stream rates, and (b) if the AP disables lower rates as base rates, it
  complains about "unknown rix" every frame you transmit at that rate.

  So, for now just disable the third and fourth schedule entry for AMPDUs.
  Now I'm forming 32k and 64k aggregates for the higher density MCS rates
  much more reliably.

  It would be much nicer if the rate schedule stuff wasn't fixed but instead
  I'd just populate ath_rc_series[] when I fetch the rates.  This is all a
  holdover of ye olde pre-11n stuff and I really just need to nuke it.

  But for now, ye hack.

* The check for "is this MCS rate better" based on MCS itself is just garbage.
  It meant things like going MCS0->7 would be fine, and say 0->8->16 is fine,
  (as they're equivalent encoding but 1,2,3 spatial streams), BUT it meant
  going something like MCS7->11 would fail even though it's likely that
  MCS11 would just be better, both for EWMA/BER and throughput.

  So for now just use the average tx time.  The "right" way for this comparison
  would be to compare PHY bitrates rather than MCS / rate indexes, but I'm not
  yet there.  The bit rates ARE available in the PHY index, but honestly
  I have a lot of other cleaning up to here before I think about that.

* Don't include the RTS/CTS retry count (and thus time) into the average tx time
  caluation.  It just makes temporarily failures make the rate look bad by
  QUITE A LOT, as RTS/CTS exchanges are (a) long, and (b) mostly irrelevant
  to the actual rate being tried.  If we keep hitting RTS/CTS failures then
  there's something ELSE wrong on the channel, not our selected rate.
2020-05-16 05:07:45 +00:00
Adrian Chadd
5add701776 [ath_rate_sample] Fix logic for determining whether to bump up an MCS rate.
* Fix formatting, cause reasons;
* Put back the "and the chosen rate is within 90% of the current rate" logic;
* Ensure the best rate and the current rate aren't the same; this ...
* ... fixes the packets_since_switch[] tracking to actually conut how many
  frames since the rate switched, so now I know how stable stuff is; and
* Ensure that MCS can go up to a higher MCS at this or any other spatial stream.
  My previous quick hack attempt was doing > rather than >= so you had to go
  to both a higher root MCS rate (0..7) and spatial stream. Eg, you couldn't
  go from MCS0 (1ss) to MCS8 (2ss) this way.

The best rate and switching rate logic still have a bunch more work to do
because they're still quite touchy when it comes to average tx time but at least
now it's choosing higher rates correctly when it wants to try a higher rate.

Tested:

* AR9380, STA mode
2020-05-16 01:56:06 +00:00
Adrian Chadd
af72b23b65 [ath] [ath_rate_sample] le oops, trim out an #if 1 that I didn't fully delete.
Cool, so now I know it's about 3 weeks between starting on freebsd coding
and breaking the build again. Queue dunce cap.
2020-05-15 20:03:53 +00:00
Adrian Chadd
cce6344402 [ath] [ath_rate] Extend ath_rate_sample to better handle 11n rates and aggregates.
My initial rate control code was .. suboptimal.  I wanted to at least get MCS
rates sent, but it didn't do anywhere near enough to handle low signal level links
or remotely keep accurate statistics.

So, 8 years later, here's what I should've done back then.

* Firstly, I wasn't at all tracking packet sizes other than the two buckets
  (250 and 1600 bytes.)  So, extend it to include 4096, 8192, 16384, 32768 and
  65536.  I may go add 2048 at some point if I find it's useful.

  This is important for a few reasons.  First, when forming A-MPDU or AMSDU
  aggregates the frame sizes are larger, and thus the TX time calculation
  is woefully, increasingly wrong.  Secondly, the behaviour of 802.11 channels
  isn't some fixed thing, both due to channel conditions and radios themselves.
  Notably, there was some observations done a few years ago on 11n chipsets
  which noticed longer aggregates showed an increase in failed A-MPDU sub-frame
  reception as you got further along in the transmit time.  It could be due to
  a variety of things - transmitter linearity, channel conditions changing,
  frequency/phase drift, etc - but the observation was to potentially form
  shorter aggregates to improve BER.

* .. and then modify the ath TX path to report the length of the aggregate sent,
  so as the statistics kept would line up with the correct bucket.

* Then on the rate control look-up side - i was also only using the first frame
  length for an A-MPDU rate control lookup which isn't good enough here.
  So, add a new method that walks the TID software queue for that node to
  find out what the likely length of data available is.  It isn't ALL of the
  data in the queue because we'll only ever send enough data to fit inside the
  block-ack window, so limit how many bytes we return to roughly what ath_tx_form_aggr()
  would do.

* .. and cache that in the first ath_buf in the aggregate so it and the eventual
  AMPDU length can be returned to the rate control code.

* THEN, modify the rate control code to look at them both when deciding which bucket
  to attribute the sent frame on.  I'm erring on the side of caution and using the
  size bucket that the lookup is based on.

Ok, so now the rate lookups and statistics are "more correct".  However, MCS rates
are not the same as 11abg rates in that they're not a monotonically incrementing
set of faster rates and you can't assume that just because a given MCS rate fails,
the next higher one wouldn't work better or be a lower average tx time.

So, I had to do a bunch of surgery to the best rate and sample rate math.
This is the bit that's a WIP.

* First, simplify the statistics updates (update_stats()) to do a single pass on
  all rates.
* Next, make sure that each rate average tx time is updated based on /its/ failure/success.
  Eg if you sent a frame with { MCS15, MCS12, MCS8 } and MCS8 succeeded, MCS15 and MCS
  12 would have their average tx time updated for /their/ part of the transmission,
  not the whole transmission.
* Next, EWMA wasn't being fully calculated based on the /failures/ in each of the
  rate attempts.  So, if MCS15, MCS12 failed above but MCS8 didn't, then ensure
  that the statistics noted that /all/ subframes failed at those rates, rather than
  the eventual set of transmitted/sent frames.   This ensures the EWMA /and/ average
  TX time are updated correctly.
* When picking a sample rate and initial rate, probe rates aroud the current MCS
  but limit it to MCS0..7 /for all spatial streams/, rather than doing crazy things
  like hitting MCS7 and then probing MCS8 - MCS8 is basically MCS0 but two spatial
  streams.  It's a /lot/ slower than MCS7.  Also, the reverse is true - if we're at
  MCS8 then don't probe MCS7 as part of it, it's not likely to succeed.
* Fix bugs in pick_best_rate() where I was /immediately/ choosing the highest MCS
  rate if there weren't any frames yet transmitted.  I was defaulting to 25% EWMA and
  .. then each comparison would accept the higher rate.  Just skip those; sampling
  will fill in the details.

So, this seems to work a lot better.  It's not perfect; I'm still seeing a lot of
instability around higher MCS rates because there are bursts of loss/retransmissions
that aren't /too/ bad.  But i'll keep iterating over this and tidying up my hacks.

Ok, so why this still something I'm poking at? rather than porting minstrel_ht?

ath_rate_sample tries to minimise airtime, not maximise throughput.  I have
extended it with an EWMA based on sub-frame success/failures - high MCS rates
that have partially successful receptions still show super short average frame
times, but a /lot/ of retransmits have to happen for that to work.
So for MCS rates I also track this EWMA and ensure that the rates I'm choosing
don't have super crappy packet failures.  I don't mind not getting lower
peak throughput versus minstrel_ht; instead I want to see if I can make "minimise
airtime" work well.

Tested:

* AR9380, STA mode
* AR9344, STA mode
* AR9580, STA/AP mode
2020-05-15 18:51:20 +00:00
Adrian Chadd
84f950a54d [ath] [ath_rate] Add some extra data into the rate control lookup.
Right now (well, since I did this in 2011/2012) the rate control code
makes some super bad choices for 11n aggregates/rates, and it tracks
statistics even more questionably.

It's been long enough and I'm now trying to use it again daily, so let's
start by:

* telling the rate control code if it's an aggregate or not;
* being clearer about the TID - yes it can be extracted from the
  ath_buf but this way it can be overridden by the caller without
  changing the TID itself.

  (This is for doing experiments with voice/video QoS at some point..)

* Return an optional field to limit how long the aggregate is in
  microseconds.  Right now the rate control code supplies a rate table
  and the ath aggr form code will look at the rate table and limit
  the aggregate size to 4ms at the slowest rate.  Yeah, this is pretty
  terrible.

* Add some more TODO comments around handling txpower, rate and
  handling filtered frames status so if I continue to have spoons for
  this I can go poke at it.
2020-05-13 00:05:11 +00:00
Adrian Chadd
41137b0604 [ath_hal] [ath_hal_ar9300] Fix endian macros to work in and out of kernel tree.
Yes, people shouldn't use bitfields in C for structure parsing.
If someone ever wants a cleanup task then it'd be great to remove them
from this vendor code and other places in the ar9285/ar9287 HALs.

Alas, here we are.

AH_BYTE_ORDER wasn't defined and neither were the two values it could be.
So when compiling ath_ee_print_9300 it'd default to the big endian struct
layout and get a WHOLE lot of stuff wrong.

So:

* move AH_BYTE_ORDER into ath_hal/ah.h where it can be used by everyone.
* ensure that AH_BYTE_ORDER is actually defined before using it!

This should work on both big and little endian platforms.
2020-05-12 02:20:27 +00:00
Rebecca Cran
7e03a82b63 ath_hal: fix typo in ath_hal_printf 2020-03-18 03:14:17 +00:00
Pawel Biernacki
08f5e6bb81 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (7 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all low hanging fruits as MPSAFE.

Reviewed by:	markj
Approved by:	kib (mentor, blanket)
Differential Revision:	https://reviews.freebsd.org/D23626
2020-02-21 16:32:17 +00:00
Adrian Chadd
af2441fbc7 [ath] Attempt to fix epoch handling.
The epoch stuff with taskqueues works fine if the driver never calls
the receive path in other contexts, but this driver does.  If there was
a chip reset during active receive then part of the reset will call
the receive path to flush out any active packets before reinitialising
the receive queue and that needs to be done with the epoch held.

So:

* make the receive task a normal task again
* explicitly call epoch enter/exit around the legacy and newer DMA
  receive paths
* add a couple of epoch asserts to ensure that the receive packet
  path itself is called with epoch held.

This fixes it on my Atom eeepc laptop (circa 2010!) that I did
all of my initial 802.11n work in this driver and net80211.

Tested:

* AR9285, STA mode

TODO:

* Test on EDMA chipset (AR9380)
* Test in AP/adhoc modes, just to be sure (eg for beacon
  receive processing in particular.)
2020-02-20 07:12:43 +00:00
Gleb Smirnoff
6c3e93cb5a Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process
incoming packets in taskqueue context.

Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D23518
2020-02-11 18:57:07 +00:00
Gleb Smirnoff
5ab0c8434a ath(4) processing input packets in taskqueue. Enter network epoch
before calling ieee80211_input_mimo().
2020-01-24 17:11:54 +00:00
Gleb Smirnoff
6545173041 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:06:15 +00:00
Adrian Chadd
1bae1560ee [ath_hal] Fix queue bits a bit
Found by PVS Studio: duplicate assignment; add assignment of tqi_compBuf.

Submitted by:	<mizhka@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D20431
2019-05-28 18:05:10 +00:00
Adrian Chadd
b4967c9b6f [ath_rate_sample] Have the final attempted rate in 11n modes to be the lowest one.
Right now ath_rate_sample has a fixed rate schedule, rather than the minstrel_ht
style "best, good, most reliable" triplet.  So, if higher rates are tried then
it'll not fail back to a lower MCS rate in that transmission schedule.

This means that in low SNR situations it'll not easily drop to MCS0 unless enough
transmissions occur to allow rate control to eventually decide to drop; and if
it's TCP traffic it'll get slowed down because of packet loss.

It's worse for 2-stream and 3-stream rates; it doesn't ever fall back to lower
stream rates, and these higher stream rates required higher SNR to work.

So instead let's (for now?) have each of the 11n transmit rates use MCS0 as
the last attempt. ath_rate_sample will quickly see that rate succeeds more
and will move to it much quicker.

Testing:

* AR9344 (Wasp) - 2G STA mode
2019-05-05 06:32:40 +00:00