Commit Graph

1050 Commits

Author SHA1 Message Date
Adrian Chadd
a108d2d6c6 Revert r233227 and followup commits as it breaks CCMP PN replay detection.
This showed up when doing heavy UDP throughput on SMP machines.

The problem with this is because the 802.11 sequence number is being
allocated separately to the CCMP PN replay number (which is assigned
during ieee80211_crypto_encap()).

Under significant throughput (200+ MBps) the TX path would be stressed
enough that frame TX/retry would force sequence number and PN allocation
to be out of order.  So once the frames were reordered via 802.11 seqnos,
the CCMP PN would be far out of order, causing most frames to be discarded
by the receiver.

I've fixed this in some local work by being forced to:

  (a) deal with the issues that lead to the parallel TX causing out of
      order sequence numbers in the first place;
  (b) fix all the packet queuing issues which lead to strange (but mostly
      valid) TX.

I'll begin fixing these in a subsequent commit or five.

PR:		kern/166190
2012-06-11 06:59:28 +00:00
Adrian Chadd
94fe37d25c Add a new ioctl for ath(4) which returns the aggregate statistics. 2012-06-10 06:42:18 +00:00
Adrian Chadd
9f95609828 Mostly revert previous commit(s). After doing a bunch of local testing,
it turns out that it negatively affects performance.  I'm stil investigating
exactly why deferring the IO causes such negative TCP performance but
doesn't affect UDP preformance.

Leave the ath_tx_kick() change in there however; it's going to be useful
to have that there for if_transmit() work.

PR:		kern/168649
2012-06-05 06:03:55 +00:00
Adrian Chadd
14d33c7e35 Create a function - ath_tx_kick() - which is called where ath_start() is
called to "kick" along TX.

For now, schedule a taskqueue call.

Later on I may go back to the direct call of ath_rx_tasklet() - but for
now, this will do.

I've tested UDP and TCP TX. UDP TX still achieves 240MBit, but TCP
TX gets stuck at around 100MBit or so, instead of the 150MBit it should
be at.  I'll re-test with no ACPI/power/sleep states enabled at startup
and see what effect it has.

This is in preparation for supporting an if_transmit() path, which will
turn ath_tx_kick() into a NUL operation (as there won't be an ifnet
queue to service.)

Tested:
	* AR9280 STA

TODO:
	* test on AR5416, AR9160, AR928x STA/AP modes

PR:		kern/168649
2012-06-05 03:14:49 +00:00
Adrian Chadd
470a7f4191 Migrate the TX path to a taskqueue for now, until a better way of
implementing parallel TX and TX/RX completion can be done without
simply abusing long-held locks.

Right now, multiple concurrent ath_start() entries can result in
frames being dequeued out of order.  Well, they're dequeued in order
fine, but if there's any preemption or race between CPUs between:

* removing the frame from the ifnet, and
* calling and runningath_tx_start(), until the frame is placed on a
  software or hardware TXQ

Then although dequeueing the frame is in-order, queueing it to the hardware
may be out of order.

This is solved in a lot of other drivers by just holding a TX lock over
a rather long period of time.  This lets them continue to direct dispatch
without races between dequeue and hardware queue.

Note to observers: if_transmit() doesn't necessarily solve this.
It removes the ifnet from the main path, but the same issue exists if
there's some intermediary queue (eg a bufring, which as an aside also
may pull in ifnet when you're using ALTQ.)

So, until I can sit down and code up a much better way of doing parallel
TX, I'm going to leave the TX path using a deferred taskqueue task.
What I will likely head towards is doing a direct dispatch to hardware
or software via if_transmit(), but it'll require some driver changes to
allow queues to be made without using the really large ath_buf / ath_desc
entries.

TODO:

* Look at how feasible it'll be to just do direct dispatch to
  ath_tx_start() from if_transmit(), avoiding doing _any_ intermediary
  serialisation into a global queue.  This may break ALTQ for example,
  so I have to be delicate.

* It's quite likely that I should break up ath_tx_start() so it
  deposits frames onto the software queues first, and then only fill
  in the 802.11 fields when it's being queued to the hardware.
  That will make the if_transmit() -> software queue path very
  quick and lightweight.

* This has some very bad behaviour when using ACPI and Cx states.
  I'll do some subsequent analysis using KTR and schedgraph and file
  a follow-up PR or two.

PR:		kern/168649
2012-06-04 22:01:12 +00:00
Adrian Chadd
1a67d0266c Add the AR9280 workarounds for PCIe suspend/resume.
These aren't strictly needed at the moment as we're not doing APSM
and forcing the NIC in and out of network sleep.  But, they don't hurt.

Tested:

* AR9280 (mini-PCIe)

Obtained from:	Qualcomm Atheros, Linux ath9k
2012-05-26 01:36:25 +00:00
Adrian Chadd
b815538dec Avoid using hard-coded numbers here. 2012-05-26 01:35:11 +00:00
Adrian Chadd
a35baf81c9 Remove an unneeded field from ath_buf. 2012-05-26 01:34:36 +00:00
Adrian Chadd
cf2a77f961 Add some AR5416/AR5418 WAR's for power-on and suspend/resume:
* Now that ah_configPCIE is called for both power on and suspend/resume,
  make sure the right bit(s) are cleared and set when suspending and
  resuming.  Specifically:

  + force disable/enable the PCIe PHY upon suspend/resume;
  + reprogram the PCIe WAR register when resuming and upon power-on.

* Add a recipe which powers down any PCIe PHY hardware inside the AR5416
  (which is the PCI variant) to save on power.  I have (currently) no way
  to test exactly how much power is saved, if any.

Tested on:

* AR5416 cardbus - although unfortunately pccard/cbb/cardbus currently
  detaches the NIC upon suspend, I don't think it's a proper test case.

* AR5418 PCIe attached to expresscard - since we're not doing PCIe APSM,
  it's also not likely a full/good test case.

In both instances I went through a handful of suspend/resume cycles and
ensured that the STA vap reassociated correctly.

TODO:

* Setup a laptop to simply sit in a suspend/resume loop, making sure that
  the NIC always correctly comes back;

* Start doing suspend/resume tests with actual traffic going on in the
  background, as I bet this process is all quite racy at the present;

* Test adhoc/hostap mode, just to be completely sure it's working correctly;

* See if I can jury rig an external power source to an AR5416 to test out
  whether ah_disablePCIE() works.

Obtained from:	Qualcomm Atheros
2012-05-25 17:53:57 +00:00
Adrian Chadd
5a8ffc7d5c * According to the reference code, AR_WA_D3_L1_DISBABLE is bit 14.
* Add some other WAR bits (very usefully described too) in preparation for
  porting over some suspend/resume fixes from ath9k/Atheros.

Obtained from:	Qualcomm Atheros
2012-05-25 16:45:56 +00:00
Adrian Chadd
ae2a0aa428 oops - ath_hal_disablepcie is actually destined for another purpose,
not to disable the PCIe PHY in prepration for reset.

Extend the enablepci method to have a "poweroff" flag, which if equal
to true means the hardware is about to go to sleep.
2012-05-25 05:01:27 +00:00
Adrian Chadd
d73df6d52c Prepare for improved (read: pcie) suspend/resume support.
* Flesh out the pcie disable method for 11n chips, as they were defaulting
  to the AR5212 (empty) PCIe disable method.

* Add accessor macros for the HAL PCIe enable/disable calls.

* Call disable on ath_suspend()

* Call enable on ath_resume()

NOTE:

* This has nothing to do with the NIC sleep/run state - the NIC still
  will stay in network-run state rather than supporting network-sleep
  state.  This is preparation work for supporting correct suspend/resume
  WARs for the 11n PCIe NICs.

TODO:

* It may be feasible at this point to keep the chip powered down during
  initial probe/attach and only power it up upon the first configure/reset
  pass.  This however would require correct (for values of "correct")
  tracking of the NIC power configuration state from the driver and that
  just isn't attempted at the moment.

Tested:

* AR9280 on my Lenovo T60, but with no suspend/resume pass (yet).
2012-05-25 02:07:59 +00:00
Adrian Chadd
e4f6061912 Re-up the TX ath_buf limit from 128 to 512.
I'll have to leave this high for now, until I've done some significant
surgery with how ath_bufs (and descriptors) are handled.

This should significantly cut down on the opportunities for a full TX
queue hanging traffic.  I'll continue making things work though; I'm
mostly doing this for users. :)
2012-05-22 19:50:21 +00:00
Adrian Chadd
d3a6425b7c Fix up some corner cases with aggregation handling.
I've come across a weird scenario in net80211 where two TX streams will
happily attempt to setup an aggregation session together.
If we're very lucky, it happens concurrently on separate CPUs and the
total lack of locking in the net80211 aggregation code causes this stuff
to race. Badly.

So >1 call would occur to the ath(4) addba start, but only one call would
complete to addba complete or timeout.  The TID would thus stay paused.

The real fix is to implement some proper per-node (or maybe per-TID)
locking in net80211, which then could be leveraged by the ath(4) TX
aggregation code.

Whilst I'm at it, shuffle around the debugging messages a bit.
I like to keep people on their toes.
2012-05-22 06:31:03 +00:00
Adrian Chadd
6deb7f3271 For now, add a quick debugging patch to log when the hw TXQ != the TID/AC. 2012-05-21 22:43:38 +00:00
Adrian Chadd
4dfd450768 Rename ath_tx_cleanup() -> ath_tx_tid_cleanup() in order to not clash
with a symbol in if_ath.c
2012-05-21 22:39:13 +00:00
Adrian Chadd
803f0c592d Re-add 'ic' and properly wrap it in the SUPERG macro. 2012-05-21 17:51:26 +00:00
Bernhard Schmidt
62763b6197 Remove unused variable. 2012-05-20 09:46:48 +00:00
Adrian Chadd
d542f7f686 Migrate the per-frame code out from ath_rx_proc() to ath_rx_pkt().
This will (eventually) be used by the EDMA RX path used by the
AR93xx and later NICs.
2012-05-20 06:35:22 +00:00
Adrian Chadd
ba5c15d9ba Migrate most of the beacon handling functions out to if_ath_beacon.c.
This is also in preparation for supporting AR9300 and later NICs.
2012-05-20 04:14:29 +00:00
Adrian Chadd
a35dae8d87 Migrate the TDMA management functions out of if_ath.c into if_ath_tdma.c.
There's some TX path TDMA code in if_ath_tx.c which should be migrated
out, but first I should likely try and verify/fix/repair the TDMA support
in 9.x and -HEAD.
2012-05-20 02:49:42 +00:00
Adrian Chadd
e60c4fc2c9 Migrate the bulk of the RX routines out from if_ath.c to if_ath_rx.[ch].
* migrate the rx processing out into if_ath_rx.c
* migrate the TSF functions into if_ath_tsf.h, as inlines

This is in prepration for supporting the EDMA RX routines, required to
support the AR93xx series NICs.

TODO:

* ath_start() shouldn't be private, but it's called as part of
  the RX path. I should likely migrate ath_rx_tasklet() back into
  if_ath.c and then return this to be 'static'.  The RX code really
  shouldn't need to see TX routines (and vice versa.)

* ath_beacon_* should be in if_ath_beacon.[ch].

* ath_tdma_* should be in if_ath_tdma.[ch] ...
2012-05-20 02:05:10 +00:00
Adrian Chadd
0e22ed0eb2 Migrate ath_debug and sc_debug from an int to a uint64_t / QUAD;
add some more BAR debugging logic.

* Change the definition of ath_debug and ath_softc.sc_debug  from
  int to uint64_t;
* Change the relevant sysctls;
* Add a new BAR TX debugging field;
* Use this in if_ath_tx.

This has been tested by using the sysctl program, which happily allows
for fields > 32 bits to be configured.
2012-05-15 23:39:37 +00:00
Adrian Chadd
e9a6408e07 Handle non-xretry errors the same as xretry errors for now.
Although I _should_ handle the other errors in various ways (specifically
errors like FILT), treating them as having transmitted successfully
is completely wrong.  Here, they'd be counted as successful and the BAW
would be advanced.. but the RX side wouldn't have received them.

The specific errors I've been seeing here are HAL_TXERR_FILT.

This patch does fix the issue - I've tested it using -i 0.001 pings
(enough to start aggregation) and now the behaviour is correct:

* The RX side never sees a "moved window" error, and
* The TX side sends BARs as needed, with the RX side correctly handling
  them.

PR:		kern/167902
2012-05-15 04:55:15 +00:00
Adrian Chadd
a41607fc2d Add some empty DFS methods for AR5210/AR5211 for now, if DFS is enabled
but these don't exist, the code panics.

I should really just add or use a DFS HAL capability before doing this,
so the methods wouldn't be needed..
2012-05-09 18:17:01 +00:00
Adrian Chadd
78de21c070 Re-enable this particular DELAY() for now, at least until the
TX and RX PCU stop/drain routines have been thoroughly debugged.

It's also very likely that I should add hooks back up to the
interface glue (if_ath_pci / if_ath_ahb) to do any relevant
bus flushes that are required.  A WMAC DDR flush may be required
for the AR9130 SoC.
2012-05-07 18:30:22 +00:00
Adrian Chadd
96ff26ff5d Fix a couple of sc_ac2q[] mappings that were using the TID, not the AC.
PR:		kern/167588
2012-05-04 20:31:27 +00:00
Adrian Chadd
352f07f66d Change the MIB cycle count API to return HAL_BOOL, rather than uint32_t,
to return whether it was successful.

Add placeholder (blank) methods for previous chips, for both it and
the 11n extension channel busy call.
2012-05-01 14:48:51 +00:00
Adrian Chadd
6af850068c After thinking about this a bit more, let's not keep statistics per-channel
in the HAL.  That's very memory hungry (32k just for channel statistics)
which would be better served by keeping a summary in the ANI state.

Or, later, keep a survey history in net80211.

So:

* Migrate the ah_chansurvey array to be a single entry, for the current
  channel.
* Change the ioctl interface and ANI code to just reference that.
* Clear the ah_chansurvey array during channel reset, both in the AR5212
  and AR5416 reset path.
2012-04-28 22:03:19 +00:00
Adrian Chadd
641d61c78e Although not strictly needed, quieten a compiler warning by a user. 2012-04-28 18:56:17 +00:00
Adrian Chadd
af5336e30b Extend the ANI code to implement basic channel survey support.
* Always call ar5416GetListenTime()
* Modify ar5416GetListenTime() to:
  + don't update the ANI state if there isn't any ANI state;
  + don't update the channel survey state if there's no active
    channel - just to be paranoid
  + copy the channel survey results into the current sample slot
    based on the current channel; then increment the sample counter
    and sample history counter.
* Modify ar5416GetMIBCyclesPct() to simply return a HAL_SURVEY_SAMPLE,
  rather than a set of percentages.  The ANI code wasn't using the
  percentages anyway.

TODO:

* Create a new function which fetches the survey results periodically
* .. then modify the ANI code to use the pre-fetched values rather than
  fetching them again
* Roll the 11n ext busy function from ar5416_misc.c to update all the
  counters, then do the result calculation
* .. then, modify the MIB counter routine to correctly fetch a snapshot -
  freeze the counters, fetch the values, then reset the counters.
2012-04-28 08:29:46 +00:00
Adrian Chadd
9dc50580ff Fetch the channel survey code from the HAL.
This information is currently not being populated by any of the HAL
modules.
2012-04-28 08:15:40 +00:00
Adrian Chadd
f452e9d26e Extend the HAL channel survey statistics:
* include ext_chan_busy;
* include ofdm/cck phy error counts, which aren't yet implemented.
2012-04-28 08:12:51 +00:00
Adrian Chadd
9406c902cd Add a comment about this DELAY(), I'm not sure whether it's supposed
to be for a DDR/FIFO flush or something else.
2012-04-28 05:00:47 +00:00
Adrian Chadd
aaaca7e743 Add an AR5416 PCU DMA stop method, as a check for the AR9130 is needed.
The reference driver has a 3ms delay for the AR9130 but I'm not as yet
sure why.  From what I can gather, it's likely waiting for some FIFO
flush to occur.

At some point in the future it may be worthwhile adding a WMAC
FIFO flush here, but that'd require some side-call through to the SoC
DDR flush routines.

Obtained from:	Atheros
2012-04-28 03:07:36 +00:00
Adrian Chadd
39da9d42bd Remove some of the redundant locking done in the TX completion path,
when checking whether BAR frames need to be checked.
2012-04-26 23:57:24 +00:00
Adrian Chadd
8df7248cf3 Add the BT register definitions for AR9285/AR9287 BT coexistence.
Obtained from:	Linux ath9k
2012-04-26 02:03:16 +00:00
Adrian Chadd
f13112ff78 Add placeholder methods for WMI command access (USB, perhaps SDIO later)
which will be needed for AR7010 and AR9287 USB access.

The names differ slightly from Linux and Atheros, for the sake of
consistency.

A lot more work is required in order to convert the 11n HAL support to
fully support USB.
2012-04-25 01:42:22 +00:00
Adrian Chadd
36e9589ef3 Add a note that explains what the current state of the register byte order
macros are.
2012-04-25 01:24:39 +00:00
Adrian Chadd
866e643549 .. oops. 2012-04-20 22:07:21 +00:00
Adrian Chadd
c928fccc09 "Upgrade" the AR9285 code to support PCI/ART EEPROM on flash.
I've just verified that this boots on an Atheros AP91. I haven't verified
it with traffic though, so YMMV.
2012-04-20 21:56:13 +00:00
Adrian Chadd
a47f39da1f Stop using the hardware register value byte order swapping for now,
at least until I can root cause what's going on.

The only platform I've seen this on is the AR9220 when attached to
the AR71xx CPUs.  I get immediate PCIe bus errors and all subsequent
accesses cause further MIPS bus exceptions.  I don't have any other
big-endian platforms to test this on.

If I get a chance (or two), I'll try to whack this on a bus analyser
and see exactly what happens.

I'd rather leave this on, especially for slower, embedded platforms.
But the #ifdef hell is something I'm trying to avoid.
2012-04-19 03:26:21 +00:00
Adrian Chadd
f846cf42ab Run the fatal proc as a proc, rather than where it currently is.
Otherwise the reset path will sleep, which it can't do in this context.
2012-04-17 06:02:41 +00:00
Adrian Chadd
2aa563dfeb Migrate the net80211 TX aggregation state to be from per-AC to per-TID.
TODO:

* Test mwl(4) more thoroughly!

Reviewed by:	bschmidt (for iwn)
2012-04-15 20:29:39 +00:00
Adrian Chadd
82d05362e6 Drop this down from 512 to 128 for now.
This may result in a bit of a throughput drop.  However, any throughput
drop at this point should be investigated and root caused, as it's likely
because TX scheduling (all the way down to how preemption, scheduler work,
etc) is happening in a sub-optimal fashion.

This also makes it much more likely to be reloadable on a live machine.
Allocating 5120 TX ath_buf entries via contigmalloc is very unlikely
after a few hours of using X/Chromium.
2012-04-15 19:54:22 +00:00
Adrian Chadd
b890549d41 Override some default values to work around various issues in the deep,
dirty and murky past.

* Override the default cache line size to be something reasonable if
  it's set to 0.  Some NICs initialise with '0' (eg embedded ones)
  and there are comments in the driver stating that various OSes (eg
  older Linux ones) would incorrectly program things and 0 out this
  register.

* Just default to overriding the latency timer.  Every other driver
  does this.

* Use a default cache line size of 32 bytes.  It should be "reasonable
  enough".

Obtained from:	Linux ath9k, Atheros
2012-04-15 00:04:23 +00:00
Adrian Chadd
27ce86b8b6 Both linux ath9k and the reference driver initialises the PLL here
during chip wakeup.

Obtained from:	Linux ath9k, Atheros
2012-04-14 04:40:11 +00:00
Adrian Chadd
79f57b35ce Upgrade ATH_EEPROM_FIRMWARE to a configuration option. 2012-04-13 18:00:48 +00:00
Adrian Chadd
0f60da6fb4 Introduce the ability to grab local EEPROM data from the firmware(9)
interface.

* Introduce a device hint, 'eeprom_firmware', which is the name of firmware
  to lookup.
* If the lookup succeeds, take a copy of it and use it as the eeprom data.

This isn't enabled by default - you have to define ATH_EEPROM_FIRMWARE.
I'll add it to the configuration variables in a later commit.

TODO:

* just keep a firmware reference in ath_softc, and remove the need to
  waste the extra memory in having sc_eepromdata be a malloc()ed block.
2012-04-13 08:48:38 +00:00
Adrian Chadd
53e98d5a48 Fix the default, non-superg compile.
Pointy-hat-to:	adrian
2012-04-11 02:34:32 +00:00