Commit Graph

1752 Commits

Author SHA1 Message Date
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Brooks Davis
541d96aaaf Use an accessor function to access ifr_data.
This fixes 32-bit compat (no ioctl command defintions are required
as struct ifreq is the same size).  This is believed to be sufficent to
fully support ifconfig on 32-bit systems.

Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14900
2018-03-30 18:50:13 +00:00
Adrian Chadd
f6ede6300f [ath] Use the BSSID address logic for STA VAPs too.
For DWDS VAPs on ath(4) we need to ensure that the STA vap and hostap VAP
have different MAC addresses.  If the STA code path doesn't utilise the
address assign / reclaim path then it doesn't update the bitmap with which
address was allocated.

This should fix a bunch of corner issues I've been seeing with DWDS STA + AP
VAPs that I was working around with manual MAC address assignment.
2018-02-07 09:37:22 +00:00
Pedro F. Giffuni
ac2fffa4b7 Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
Pedro F. Giffuni
26c1d774b5 dev: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these is likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.
2018-01-13 22:30:30 +00:00
Ed Maste
b081d0e406 ath: fix possible memory disclosure in ioctl handler
Submitted by:	Domagoj Stolfa <domagoj.stolfa@gmail.com>
MFC after:	1 week
2018-01-03 19:28:13 +00:00
Ed Maste
22760c6d0a ath: revert accidental change committed with r327526
It will be recommitted with the correct commit message.
2018-01-03 19:24:21 +00:00
Ed Maste
47b5c71929 embed_mfs: correctly test grep return value
Reported by:	br
MFC with:	r326992
Sponsored by:	The FreeBSD Foundation
2018-01-03 19:22:10 +00:00
Ed Maste
c24972459f ath: fix possible memory disclosures in ioctl handlers
Apply the fix from r327499 to additional ioctl handlers.

Reported by:	Ilja van Sprundel <ivansprundel@ioactive.com>
MFC after:	1 week
MFC with:	r327499
Sponsored by:	The FreeBSD Foundation
2018-01-02 19:34:23 +00:00
Ed Maste
85f385b9aa ath: fix memory disclosure from ath_btcoex_ioctl
The ath_btcoex_ioctl handler allocated a buffer without M_ZERO and
returned it to userland without writing to it.

The device has permissions only for root so this is not urgent, and the
fix can be MFCd and considered for a future EN.

Reported by:	Ilja van Sprundel <ivansprundel@ioactive.com>
Submitted by:	Domagoj Stolfa <domagoj.stolfa@gmail.com>
Reviewed by:	adrian
MFC after:	1 week
2018-01-02 19:29:30 +00:00
Adrian Chadd
9fbe631a1a [net80211] convert all of the WME use over to a temporary copy of WME info.
This removes the direct WME info access in the ieee80211com struct and instead
provides a method of fetching the data.  Right now it's a no-op but eventually
it'll turn into a per-VAP method for drivers that support it (eg iwn, iwm,
upcoming ath10k work) as things like p2p support require this kind of behaviour.

Tested:

* ath(4), STA and AP mode

TODO:

* yes, this is slightly stack size-y, but it is an important first step
  to get drivers migrated over to a sensible WME API.  A lot of per-phy things
  need to be converted to per-VAP before P2P, 11ac firmware, etc stuff shows up.
2018-01-02 00:07:28 +00:00
Eitan Adler
caa7e52f3f kernel: Fix several typos and minor errors
- duplicate words
- typos
- references to old versions of FreeBSD

Reviewed by:	imp, benno
2017-12-27 03:23:21 +00:00
Pedro F. Giffuni
6e778a7efd SPDX: license IDs for some ISC-related files. 2017-12-08 15:57:29 +00:00
Pedro F. Giffuni
718cf2ccb9 sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 14:52:40 +00:00
Adrian Chadd
565312d184 [ath] Begin using the replacement EDCA functions.
As part of ath10k and other chipset support, the EDCA stuff has to be moved
to potentially be per-VAP.  For hardware that doesn't support it (ie,
everything that we currently support) it can just fetch the "current"
global EDCA parameters for the NIC.

This is one of those parameters that is linked to the currently active
channel context / VAP in Linux mac80211 parlance.

Tested:

* ath(4), STA and AP modes
2017-10-12 21:58:51 +00:00
Adrian Chadd
86a656deb0 [ath_hal] if building with ALQ, ensure we actually depend upon ALQ. 2017-06-29 03:59:02 +00:00
Wojciech Macek
7108339449 Increase timeout in Atheros HAL
It turned out, that some models of the Atheros PCIe
adapters (e.g. AR983x family) may fail to attach
due to insufficient timeout value.

Submitted by:   Bartosz Szczepanek <bsz@semihalf.com>
Obtained from:  Semihalf
Sponsored by:   Stormshield
Reviewed by:    adrian
Differential revision: https://reviews.freebsd.org/D10903
2017-05-29 09:21:38 +00:00
Adrian Chadd
7b6899bf2a [ath] fix short-GI wireshark flag.
Yes, HAL_RX_GI means "short guard interval."
2017-05-26 00:48:21 +00:00
Adrian Chadd
f46839b9e3 [ath] [ath_hal] retire AH_SUPPORT_AR5416 changing anything.
Yes, the memory bloat is large, but it's 2017 and I'll fix it later
by making it runtime configurable / per-chip configurable if I ever need to.
2017-05-25 04:26:26 +00:00
Adrian Chadd
41059135ce [ath] [ath_hal] (etc, etc) - begin the task of re-modularising the HAL.
In the deep past, when this code compiled as a binary module, ath_hal
built as a module.  This allowed custom, smaller HAL modules to be built.
This was especially beneficial for small embedded platforms where you
didn't require /everything/ just to run.

However, sometime around the HAL opening fanfare, the HAL landed here
as one big driver+HAL thing, and a lot of the (dirty) infrastructure
(ie, #ifdef AH_SUPPORT_XXX) to build specific subsets of the HAL went away.
This was retained in sys/conf/files as "ath_hal_XXX" but it wasn't
really floated up to the modules themselves.

I'm now in a position where for the reaaaaaly embedded boards (both the
really old and the last couple generation of QCA MIPS boards) having a
cut down HAL module and driver loaded at runtime is /actually/ beneficial.

This reduces the kernel size down by quite a bit.  The MIPS modules look
like this:

adrian@gertrude:~/work/freebsd/head-embedded/src % ls -l ../root/mips_ap/boot/kernel.CARAMBOLA2/ath*ko
-r-xr-xr-x  1 adrian  adrian    5076 May 23 23:45 ../root/mips_ap/boot/kernel.CARAMBOLA2/ath_dfs.ko
-r-xr-xr-x  1 adrian  adrian  100588 May 23 23:45 ../root/mips_ap/boot/kernel.CARAMBOLA2/ath_hal.ko
-r-xr-xr-x  1 adrian  adrian  627324 May 23 23:45 ../root/mips_ap/boot/kernel.CARAMBOLA2/ath_hal_ar9300.ko
-r-xr-xr-x  1 adrian  adrian  314588 May 23 23:45 ../root/mips_ap/boot/kernel.CARAMBOLA2/ath_main.ko
-r-xr-xr-x  1 adrian  adrian   23472 May 23 23:45 ../root/mips_ap/boot/kernel.CARAMBOLA2/ath_rate.ko

And the x86 versions, like this:

root@gertrude:/home/adrian # ls -l /boot/kernel/ath*ko
-r-xr-xr-x  1 root  wheel   36632 May 24 18:32 /boot/kernel/ath_dfs.ko
-r-xr-xr-x  1 root  wheel  134440 May 24 18:32 /boot/kernel/ath_hal.ko
-r-xr-xr-x  1 root  wheel   82320 May 24 18:32 /boot/kernel/ath_hal_ar5210.ko
-r-xr-xr-x  1 root  wheel  104976 May 24 18:32 /boot/kernel/ath_hal_ar5211.ko
-r-xr-xr-x  1 root  wheel  236144 May 24 18:32 /boot/kernel/ath_hal_ar5212.ko
-r-xr-xr-x  1 root  wheel  336104 May 24 18:32 /boot/kernel/ath_hal_ar5416.ko
-r-xr-xr-x  1 root  wheel  598336 May 24 18:32 /boot/kernel/ath_hal_ar9300.ko
-r-xr-xr-x  1 root  wheel  406144 May 24 18:32 /boot/kernel/ath_main.ko
-r-xr-xr-x  1 root  wheel   55352 May 24 18:32 /boot/kernel/ath_rate.ko

.. so you can see, not building the whole HAL can save quite a bit.
For example, if you don't need AR9300 support, you can actually avoid
wasting half a megabyte of RAM.  On embedded routers this is quite a
big deal.

The AR9300 HAL can be later further shrunk because, hilariously,
it indeed supports AH_SUPPORT_<xxx> for optionally adding chipset support.
(I'll chase that down later as it's quite a big savings if you're only
building for a single embedded target.)

So:

* Create a very hackish way to load/unload HAL modules
* Create module metadata for each HAL subtype - ah_osdep_arXXXX.c
* Create module metadata for ath_rate and ath_dfs (bluetooth is
  currently just built as part of it)
* .. yes, this means we could actually build multiple rate control
  modules and pick one at load time, but I'd rather just glue this
  into net80211's rate control code.  Oh well, baby steps.
* Main driver is now "ath_main"
* Create an "if_ath" module that does what the ye olde one did -
  load PCI glue, main driver, HAL and all child modules.
  In this way, if you have "if_ath_load=YES" in /boot/modules.conf
  it will load everything the old way and stuff should still work.
* For module autoloading purposes, I actually /did/ fix up
  the name of the modules in if_ath_pci and if_ath_ahb.

If you want to selectively load things (eg on ye cheape ARM/MIPS platforms
where RAM is at a premium) you should:

* load ath_hal
* load the chip modules in question
* load ath_rate, ath_dfs
* load ath_main
* load if_ath_pci and/or if_ath_ahb depending upon your particular
  bus bind type - this is where probe/attach is done.

TODO:

* AR5312 module and associated pieces - yes, we have the SoC side support
  now so the wifi support would be good to "round things out";
* Just nuke AH_SUPPORT_AR5416 for now and always bloat the packet
  structures; this'll simplify other things.
* Should add a simple refcnt thing to the HAL RF/chip modules so you
  can't unload them whilst you're using them.
* Manpage updates, UPDATING if appropriate, etc.
2017-05-25 04:18:46 +00:00
Adrian Chadd
31ce00f662 [ath] begin migration of AHB support to use the PCI style board data API for calibration data.
This brings the AHB support in line with the PCI support - now other "things"
can wrap up the calibration / board data into a firmware blob and have them
probe/attach after the system has finished booting.

Note that this change requires /all/ of the AHB using kernel configurations
to change - so until I drop those changes in, this breaks AHB.

Fear not, I'll do that soon.

TODO:

* the above stuff.

Tested:

* AR9331, carambola 2, loading if_ath / wlan as modules at run time
2017-05-24 01:02:35 +00:00
Adrian Chadd
15e58d4d26 [ath] prepare for "correct" group (bcast/mcast) address frame handling and software/hardware queue TID mapping.
When I initially did this 11n TX work in days of yonder, my 802.11 standards
clue was ... not as finely tuned.  One of the things in 802.11-2012 (which
I guess technically was after I did this work, but I'm sure it was like this
in the previous rev?) is that among other traffic classes, three things are
important:

* group addressed frames should be default non-QoS, even if they're QoS frames, and
* group addressed frames should have a seqno out of a different space than the
  per-TID QoS one; and because of this
* group addressed frames, being non-QoS, should never be in the Block-ACK window
  for TX.

Now, net80211 and now this code cheats by using the non-QOS TID, but ideally
we'd introduce a separate seqno space just for multicast/group traffic for
TX and RX comparison.

Later extensions (eg reliable multicast / multimedia) express what one should do
when doing multicast traffic in a TID.  Now, technically we /could/ do group traffic
as QoS traffic and throw it into a per-TID seqno space, but this definitely
introduces ordering issues when you take into account things like CABQ behaviour.
(Ie, if some traffic in the TID goes into the CABQ and some doesn't, because
it's doing a split of multicast and non-multicast traffic, then you have
seqno ordering issues.)

So, until someone implements 802.11vv reliable multicast / multimedia extensions,
group traffic is non-QoS.

Next, software/hardware queue TID mapping.  In the past I believed the WME tagging
of frames because well, net80211 had a habit of tagging things like management
traffic with it.  But, then we also map QoS traffic categories to TIDs as well.
So, we should obey the TID!  But! then it put some management traffic into higher
WME categories too, as those frames don't have QoS TIDs.  But! It'd do things like
put things like QoS action frames into higher WME categories, when they should
be kept in-order with the rest of the traffic for that TID.  So! Given all of this,
the ath(4) driver does overrides to not trust the WME category.

I .. am undoing some of this.  Now, the TID has a 1:1 mapping to the hardware
queue.  The TID is the primary source of truth now for all QoS traffic.
The WME is only used for non-QoS traffic.  This now means that any TID traffic
queued should be consistently queued regardless of WME, so things like the
"TX finished, do more TX" that is occuring right now for transmit handling
should be "better".

The consistent {TID, WME} -> hardware queue mapping is important for
transmit completion.  It's used to schedule more traffic for that
particular TID, because that {many TID}:{1 TXQ} mapping in ath_tx_tid_sched()
is used for driving completion.  Ie, when the hardware queue completes,
it'll walk that list of scheduled TIDs attached to that TXQ.

The eventual aim is to get ready for some other features around putting
some data into other hardware queues (eg for better PS-POLL support,
uAPSD, support, correct-er TDMA support, etc) which requires that
I tidy all of this up in preparation for then introducing further
TID scheduling that isn't linked to a hardware TXQ (likely a per-WME, per-TID
driver queue, and a per-node driver queue) to enable that.

Tested:

* AR9380, STA mode
* AR9380, AR9580, AP mode
2017-03-19 05:00:14 +00:00
Warner Losh
fbbd9655e5 Renumber copyright clause 4
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by:	Jan Schaumann <jschauma@stevens.edu>
Pull Request:	https://github.com/freebsd/freebsd/pull/96
2017-02-28 23:42:47 +00:00
Adrian Chadd
e2cf2aa54d [ath] sigh, how'd I miss this. 2017-02-10 07:16:56 +00:00
Adrian Chadd
aa36f34d52 [ath_hal] implement NULL methods for ah_setQuiet for AR5210, AR5211.
Tested:

* "crap, I didn't bring my cardbus collection and T400 with me" compile
  tested.
2017-02-09 23:20:55 +00:00
Adrian Chadd
1410ca560d [ath] initial station side quiet IE support.
This implements hardware assisted quiet IE support.  Quiet time is
an optional interval on DFS channels (but doesn't have to be DFS
only channels! sigh) where the station and AP can be quiet in order
to allow for channel utilisation measurements.  Typically that's
stuff like radar detection, spectral scan, other-BSS frame sniffing,
checking how busy the air is, etc.

The hardware implements it as one of the generic timers, which is
supplied a period, offset from the trigger period and duration
to stay quiet.  The AP can announce quiet time configurations which
change, and so this code also tracks that.

Implementation details:

* track the current quiet time IE
* compare the new one against the previous one - if only the TBTT
  counter changes, don't update things
* If tbttcount=1 then program it into the hardware - that is when
  it is easiest to program the correct starting offset (one TBTT +
  configured offset).
* .. later on check to see if it can be done on any tbttcount
* If the IE goes away then remove the quiet timer and clear the
  config
* Upon reset, state change, new beacon - clear quiet time IE
  and just let it resync from the next beacon.

History:

This was work done initially by sibridgetech.com in 2011/2012/2013
as part of some FreeBSD wifi DFS contracting work they had for a
third party.  They implemented the net80211 quiet time IE pieces
and had some test code for the station side which didn't entirely
use the timers correctly.

I figured out how to use the timers correctly without stopping/starting
the transmit DMA engine each time. When done correctly, the timer
just needs to be programmed once and left alone until the next
configuration change.

So, thanks to Himali Patel and Parthiv Shah for their work way
back then.  I finally figured it out and finished it!

TODO:

* Now, I'd rather net80211 did the quiet time IE tracking and parsing,
  pushing configurations into the driver is needed.  I'll look at
  doing that in a subsequent update.

* This doesn't handle multiple quiet time IEs, which will currently
  just mess things up.  I'll look into supporting that in the future
  (at least by only obeying "one" of them, and then ignoring
  subsequent IEs in a beacon/probe frame.)

* This also implements the STA side and not the AP side - the AP
  side will come later, and involves taking various other intervals
  into account (eg the beacon offset for multi-VAP modes, the
  SWBA time, etc, etc) as well as obtaining the configuration when
  a beacon is configured/generated rather than "hearing" an IE.

* .. investigate supporting quiet IE in mesh, tdma, ibss modes

* .. investigate supporting quiet IE for non-DFS channels
  (so this can be done for say, 2GHz channels.)

* Chances are i should commit NULL methods for the ar5210, ar5211 HALs..

Tested:

* AR9380, STA mode - announcing quiet, removing quiet, changing quite
  time config, whilst doing iperf testing;
* AR9380, AP mode.
2017-02-09 23:15:11 +00:00
Adrian Chadd
f45d974ee9 [ath] prepare for station side quiet time support.
* Track the current quiet time configuration in the ath_vap struct.
* Add an accessor method for calling the quiet time HAL method.
2017-02-07 02:21:34 +00:00
Adrian Chadd
2bc158cf24 [ath_hal] prepare for CAC quiet time.
To support DFS, the NIC needs to be very quiet during this time.
No transmissions including ACKs are allowed.

This is just the initial HAL glue.
2017-02-03 05:15:08 +00:00
Adrian Chadd
f59ff2a32d [ath] ensure the NIC is woken up before changing the spectral scan parameters. 2017-02-02 05:27:48 +00:00
Adrian Chadd
4a3fdd0e13 [ath_hal] add debugging level for spectral scan. 2017-02-02 05:27:27 +00:00
Adrian Chadd
38fbe19b81 [ath] add spectral scan priority. 2017-02-02 05:27:06 +00:00
Adrian Chadd
39d5467677 [ath] log seqno, type and subtype when assigning sequence numbers for A-MPDU.
This is just to improve adrian-debugging.
2017-01-31 20:57:40 +00:00
Adrian Chadd
46be12ae13 [ath] fix "doing stuff before wakeup" warning; add comments for ACK/CTS handling during DFS/PASSIVE channels
* Although the hardware is awake, the power state handling doesn't think so.
  So just explicitly wake it up early in setup so ath_hal calls don't complain.

* We shouldn't be transmitting or ACKing frames during DFS CAC or on passive
  channels before we hear a beacon.  So, start laying down comments in the
  places where this work has to be done.

Note:

* The main bit missing from finishing this particular bit of work is a state
  call to transition a VAP from passive to non-passive when a beacon is heard.
  CAC is easy, it's an interface state.  So, I'll go and add a method to control
  that soon.
2017-01-27 01:17:00 +00:00
Adrian Chadd
3557b26a4b [ath_hal] note that the CCA configuration setting may be chip-dependent.
I bet it isn't, but who knows - this is making assumptions about the
layout of AR_DIAG.
2017-01-24 22:52:09 +00:00
Adrian Chadd
5e2c0d2d47 [ath] modify cabq and per-node packet usage limits.
* limit cabq to 64 - in practice if this stays at ath_txbuf then
  all buffers can be tied up by a very busy broadcast domain (eg ARP
  storm, way too much MDNS/NETBIOS).  It's been like this in the
  freebsd-wifi-build AP project for the longest time.

* Now that I figured out the hilarity inherent in aggregate forming
  and AR9380 EDMA work, change the per-node to 64 frames by default.
  I'll do some more work to shorten the queue latency introduced when
  doing data so TCP isn't so terrible, but it's now no longer /always/
  tens of milliseconds of extra latency  when doing active iperf tests.

Notes:

The reason for the extra latency is partly tx/rx taskqueue handling and
scheduling, and partly due to a lack of airtime/QoS awareness of per-node
traffic.  Ideally we'd have different limits/priorities on the QoS/TID
levels per node so say, voice/video data got a better share of buffer
allocations over best effort/bulk data, but we currently don't implement
that.  It's not /hard/ to do, I just need to do it.

Tested:

* AR9380 (STA), AR9580 (hostap) - both with the relevant changes.
  TCP is now at around 180mbit with rate control and RTS protection
  enabled.  UDP stays at 355mbit at MCS23, no HT protection.
2017-01-23 04:47:38 +00:00
Adrian Chadd
57af292d36 [ath] fix thresholds for deciding to queue to the software queue and populate hardware frames
This is two fixes, which establishes what I /think/ is pretty close to the
theoretical PHY maximum speed on the AR9380 devices.

* When doing A-MPDU on a TID, don't queue to the hardware directly if
  the hardware queue is busy.  This gives us time to get more packets
  queued up (and the hardware is busy, so there's no point in queuing
  more to the hardware right now) to potentially form an A-MPDU.

  This fixes up the throughput issue I was seeing where a couple hundred
  single frames were being sent a second interspersed between A-MPDU
  frames.  It just happened that the software queue had exactly one
  frame in it at that point.  Queuing it until the hardware finishes
  transmitting isn't exactly costly.

* When determining whether to dequeue from a software node/TID queue into
  the hardware queue, fix up the checks to work right for EDMA chips
  (ar9380 and later.)   Before it was not dispatching anything until
  the FIFO was empty.  Now we allow it to dispatch another aggregate
  up to the hardware aggregate limit, like I intended with the earlier
  work.

This allows a 5GHz HT40, short-GI, "htprotmode off" test at MCS23
to achieve 357 Mbit/sec in a one-way UDP test.  The stars have to be
aligned /just right/ so there are no retries but it can happen.
Just don't expect it to work in an OTA test if your 2yo is running
around the room - MCS23 is very very sensitive to channel conditions.

Tested:

* AR9380 STA (test) -> AR9580 hostap

TODO:

* More thorough testing on pre-AR9380 chips (AR5416, AR9160, AR9280)
* (Finally) teach ath_rate_sample about throughput/latency rather than
  air time, so I can get good transmit rates with a 2yo running around.
2017-01-23 04:30:08 +00:00
Adrian Chadd
6eb9f206a6 [ath] [ar9300] ensure the software scheduler is called to form more aggregates for EDMA chips
When investigating performance on UDP TX on the AR9380 I found that the
following sequence was occuring:

* INTR
* EINPROGRESS - nothing yet
* INTR
* TXSTATUS - process a TX completion for an aggregate
* INTR, INTR
* TXSTATUS - process a TX completion for an aggregate
* TXD, TXD ... populate frames from the hardware queue and submit

What should be happening is a completed TXSTATUS fires off more packets
that are queued on active TIDs.

What /was/ happening was after that first TXSTATUS the TX queue hardware queue
was still empty, so it didn't push anything into the FIFO.  Only after the
second TXSTATUS did any progress get made.

This is one of two commits - it ensures that the software TX queue scheduler
is called /after/ TX completion, otherwise no frames from the software staging
queues will be processed into the hardware queues.

The second commit will fix it so it populates aggregate frames correctly
when the above occurs - right now ath_txq_sched() is called, but it doesn't
populate anything because its pre-check conditions are wrong.

Whilst here, add/tweak debugging.

Tested:

* AR9380 STA (testing device) -> AR9580 hostap
2017-01-23 04:20:37 +00:00
Adrian Chadd
a2d74cc30b [ath] only apply the AR9300 delimiter workaround for the first sub-frame.
This is supposed to only be applied to the first subframe and only if
RTS/CTS is being done.  I'm still not yet checking RTS/CTS exchange status
so it's just happening for all subframes on AR9380 and later.

This gets MCS23 throughput up from around 250mbit to 303mbit with RTS/CTS
protection enabled, and around 330mbit with no HT protection enabled.

Now, MCS23 has a PHY rate of 450mbit and we should be seeing closer to
400mbit for a straight one-way UDP test, but this beats the previous
maximum throughput.

Tested:

* AR9380 (STA) -> AR9580 (AP) - STA with the modifications, doing UDP TX
  test using iperf.
2017-01-22 05:45:42 +00:00
Adrian Chadd
28dc144e73 [ath] improve the debugging when looking into the maximum A-MPDU size being chosen.
This is how I caught the "why are we only sending 8K aggregates?" problem.
2017-01-21 23:35:54 +00:00
Andriy Voskoboinyk
8f1e113906 ath: adapt LDPC support checks
Set both IEEE80211_HTCAP_LDPC and IEEE80211_HTC_TXLDPC capability flags
if LDPC is supported + set 'do_ldpc = 1' only when it is not disabled,
not just supported.

Reviewed by:	adrian
Differential Revision:	https://reviews.freebsd.org/D9277
2017-01-21 21:03:26 +00:00
Adrian Chadd
0149d3d476 [ath] ensure both iv_ampdu_limit and iv_ampdu_rxmax is set.
A recent change enforced the VAP limit as well as the peer limit.
I now need to actually set iv_ampdu_limit or we don't transmit more
than 8K sized aggregates.

This restores the expected (suboptimal, but still much faster) behaviour.

Tested:

* AR9380, STA mode
2017-01-21 06:53:30 +00:00
Adrian Chadd
eee8e3627b [ath_hal] mad, mad hacks to get some semblence of correct HT/40 channels populated.
The HT40 channel population logic was "just" doing pairs of channels starting with
the band entry frequency.  Trouble is, a lot of the rules start way off at 5120MHz,
which isn't a valid 5GHz channel.  Then, eg for HT40U, it would populate:

* (5120,5140)
* (5160,5180)
* (5200,5220)
* (5240,5260)

.. as the HT40U pairs, with the first being the primary channel.  Channel 36
is 5180MHz, and since it's not a primary channel here, it wouldn't populate it.
Then, the next HT40U would be 5200/5220, which is highly wrong.

HT40D had the same problem.

So, this just forces that 5GHz HT40 channels start at channel 36 (5180),
no matter what the band edge says.  This includes eg doing 4.9GHz channels.

This erm, meant that the HT40 channels for the low band was always wrong.

Oops!

Tested:

* AR9380, STA mode
* AR9344 SoC, AP mode

MFC after:	1 week
2017-01-05 04:56:04 +00:00
Andriy Voskoboinyk
4a19d71238 net80211 + drivers: convert to ieee80211_crypto_get_key_wepidx().
Proposed by:	adrian
2016-12-07 22:16:07 +00:00
Adrian Chadd
707210ff0e [ath] use the correct AMPDU frame limit for the given node, rather than the global config.
This is important in hostap, ibss, (11s at some magical future date, etc)
where different nodes may have smaller limits.

Oops!

MFC after:	1 week
Relnotes:	Yes
2016-12-03 02:47:41 +00:00
Adrian Chadd
8c03e55dd2 [ath] force wake the hardware if we see a missed beacon.
This adds a workaround to incorrectly behaving APs (ie, FreeBSD APs) which
don't beacon out exactly when they should (at TBTT multiples of beacon
intervals.)

It forces the hardware awake (but leaves it in network-sleep so self
generated frames still state that the hardware is asleep!) and will
remain awake until the next sleep transition driven by net80211.

That way if the beacons are just at the wrong interval, we get a much
better chance of hearing more consecutive beacons before we go to sleep,
thus not constantly disconnecting.

Tested:

* AR9485, STA mode, against a misbehaving FreeBSD AP.
2016-11-28 17:54:29 +00:00
Adrian Chadd
872f3a66dd [ath] revert the previous commit, after reading the 802.11-2012 spec a bit more.
The 802.11-2012 spec talks about this - section 10.1.3.2 - Beacon Generation
in Infrastructure Networks.  So yes, we should be expecting beacons to be
going out in multiples of intval.

Silly adrian.

So:

* fix the FreeBSD APs that are sending beacons at incorrect TBTTs (target
  beacon transmit time); and
* yes indeed we will have to wake up out of network sleep until we sync
  a beacon.
2016-11-28 17:06:35 +00:00
Adrian Chadd
908341abeb [ath] wake up the hardware from power-save before doing transmit completion checking.
This was being done in the pre-AR9380 case, but not for AR9380 and later.
When powersave in STA mode is enabled, this may have lead to the transmit
completion code doing this:

* call the task, which doesn't wake up the hardware
* complete the frames, which doesn't touch the hardware
* schedule pending frames on the hardware queue, which DOES touch the
  hardware, and this will be ignored

This would show up in the logs like this:

(with debugging enabled):
Nov 27 23:03:56 lovelace kernel: Q1[  0] (nseg=1) (DS.V:0xfffffe011bd57300 DS.P:0x49b57300) I: 168cc117 L:00000000 F:0005
...
(in general, doesn't require debugging enabled):
Nov 27 23:03:56 lovelace kernel: ath_hal_reg_write: reg=0x00000804, val=0x49b57300, pm=2

That register is a EDMA TX FIFO register (queue 1), and the val is the descriptor
being written.

Whilst here, make sure the software queue gets kicked here.

Tested;

* AR9485, STA mode + powersave
2016-11-28 08:13:20 +00:00
Adrian Chadd
0ffc652e9e [ath] fix target beacon interval programming for STA mode when in powersave.
This bug has been bugging me for quite some time.  I finally sat down
with enough coffee to figure it out.

The short of it - rounding up to the next intval multiple of the TSF value
only works if the AP is transmitting all its beacons on an interval of
the TSF.  If it isn't - for example, doing staggered beacons on a multi-VAP
setup with a single hardware TSF - then weird things occur.

The long of it -

When powersave is enabled, the MAC and PHY are partially powered off.
They can't receive any packets (or transmit, for that matter.)
The target beacon timer programming will wake up the MAC/PHY just before
the beacon is supposed to be received (well, strictly speaking, at DTIM
so it can see the TIM - traffic information map - telling the STA whether
any traffic is there for it) and it happens automatically.

However, this relies on the target beacon time being programmed correctly.
If it isn't then the hardware will wake up and not hear any beacons -
and then it'll be asleep for said beacons.  After enough of this, net80211
will give up and assume the AP went away.

This should fix both TSFOOR interrupts and disconnects from APs with powersave
enabled.

The annoying bit is that it only happens if APs stagger things or start
on a non-zero TSF.  So, this would sometimes be fine and sometimes not be
fine.

What:

* I don't know (yet) why the code rounds up to the next intval.
  For now, just disable rounding it and trust the value we get.

TODO:

* If we do see a beacon miss in STA mode then we should transition
  out of sleep for a while so we can hear beacons to resync against.
  I'd love a patch from someone to enable that particular behaviour.
  Note - that doesn't require that net80211 brings the chip out of
  sleep state - only that we wake the chip up through to full-on and
  then let it go to sleep again when we've seen a beacon.  The wifi
  stack and AP can still completely just stay believing we're in sleep
  mode.

Tested:

* AR9485, STA mode, powersave enabled

MFC after:	1 week
Relnotes:	Yes
2016-11-28 02:59:33 +00:00
Adrian Chadd
dc87d07103 [ath] include logging of TU versions of the TSF values.
The beacon programming side of things deals in TUs and 1/8th TUs, so
it's good to se the TU value here when debugging beaconing issues.
2016-11-28 02:51:55 +00:00
Adrian Chadd
5566fb10ee [ath] Begin to lay the groundwork for btcoex related ioctl controls. 2016-11-27 18:34:29 +00:00