161 Commits

Author SHA1 Message Date
Adrian Chadd
661deb68d5 Use HAL_NUM_RX_QUEUES rather than a magic constant. 2012-07-19 03:18:15 +00:00
Adrian Chadd
ad3e6dcd37 Break out the TX descriptor link field into HAL methods.
The DMA FIFO chips (AR93xx and later) differ slightly to the legacy
chips:

* The RX DMA descriptors don't have a ds_link field;
* The TX DMA descriptors have a ds_link field however at a different
  offset.

This is a reimplementation based on what the reference driver and ath9k
do.

A subsequent commit will enable it in the TX and beacon paths.

Obtained from:	Linux ath9k, Qualcomm Atheros
2012-07-19 02:25:14 +00:00
Adrian Chadd
0b59717b4b Change the RX EDMA path to first complete the FIFO, then re-populate it
with fresh descriptors, before handling the frames.

Wrap it all in the RX locks.

Since the FIFO is very shallow (16 for HP, 128 for LP) it needs to be
drained and replenished very quickly.  Ideally, I'll eventually move this
RX FIFO drain/fill into the interrupt handler, only deferring the actual
frame completion.
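
A minimal sketch of the "complete first, refill second, handle frames last" ordering described above. It is not the driver code: the ring layout, the use of plain ints as buffer handles and all sk_* names are hypothetical, and locking is left out entirely. Only the depth of 16 comes from the message.

```c
#include <stddef.h>

#define SK_FIFO_DEPTH 16	/* HP queue depth quoted above */

struct sk_fifo {
	int slot[SK_FIFO_DEPTH];	/* buffer ids currently owned by the "hardware" */
	int done[SK_FIFO_DEPTH];	/* non-zero once that slot has completed */
	size_t head;			/* index of the oldest outstanding slot */
	size_t count;			/* number of populated slots */
};

/* Phase 1: pop completed buffers, oldest first, into out[]. */
static size_t
sk_fifo_complete(struct sk_fifo *f, int *out, size_t max)
{
	size_t n = 0;

	while (f->count > 0 && n < max && f->done[f->head]) {
		out[n++] = f->slot[f->head];
		f->done[f->head] = 0;
		f->head = (f->head + 1) % SK_FIFO_DEPTH;
		f->count--;
	}
	return n;
}

/* Phase 2: push fresh buffers until the FIFO is full or fresh[] runs out. */
static size_t
sk_fifo_refill(struct sk_fifo *f, const int *fresh, size_t nfresh)
{
	size_t n = 0;

	while (f->count < SK_FIFO_DEPTH && n < nfresh) {
		size_t tail = (f->head + f->count) % SK_FIFO_DEPTH;

		f->slot[tail] = fresh[n++];
		f->done[tail] = 0;
		f->count++;
	}
	return n;
}
```

Phase 3, handling the frames collected in out[], would run only after both calls return, so the FIFO is never left shallow while frames are being processed.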
2012-07-14 02:52:48 +00:00
Adrian Chadd
2fe91baa92 Create an RX queue lock.
Ideally these locks would go away and there'd be a single driver lock,
like what iwn(4) does.  I'll worry about that later.
2012-07-14 02:22:17 +00:00
Adrian Chadd
d434a377d9 Convert sc_rxpending to a per-EDMA queue, and use that for the legacy code.
Prepare ath_rx_pkt() to handle multiple RX queues, and default the legacy
RX queue to use the HP queue.
2012-07-10 00:02:19 +00:00
Adrian Chadd
3d184db2f8 Further preparations for the RX EDMA support.
Break out the DMA descriptor setup/teardown code into a method.
The EDMA RX code doesn't allocate descriptors, just ath_buf entries.
2012-07-09 08:37:59 +00:00
Adrian Chadd
0a6b6951b2 Introduce the EDMA related HAL capabilities.
Whilst here, fix a typo in a previous commit.

Obtained from:	Qualcomm Atheros
2012-07-09 07:31:26 +00:00
Adrian Chadd
d60a0680ba Extend the RX HAL API to include the RX queue identifier.
The AR93xx and later chips support two RX FIFO queues - a high and low
priority queue.

For legacy chips, just assume the queues are high priority.

This is inspired by the reference driver but is a reimplementation of
the API and code.
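
A minimal sketch of what a queue identifier with a legacy fallback could look like. The type, field and function names are hypothetical stand-ins and not the HAL's actual definitions; the point is only that per-queue state is sized by a named count and that legacy chips always resolve to the high priority queue.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the HAL's RX queue identifiers. */
typedef enum {
	SK_RX_QUEUE_HP = 0,	/* high priority */
	SK_RX_QUEUE_LP = 1,	/* low priority */
	SK_NUM_RX_QUEUES	/* number of queues, used to size arrays */
} sk_rx_queue_t;

struct sk_softc {
	bool has_edma;				/* true for chips with two RX FIFOs */
	int rxpending[SK_NUM_RX_QUEUES];	/* one counter per queue */
};

/* Legacy chips have one RX queue; treat it as the high priority queue. */
static sk_rx_queue_t
sk_rx_queue_for(const struct sk_softc *sc, sk_rx_queue_t wanted)
{
	return sc->has_edma ? wanted : SK_RX_QUEUE_HP;
}
```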
2012-07-09 07:19:11 +00:00
Adrian Chadd
f8cc9b09b0 Begin abstracting out the RX path in preparation for RX EDMA support.
The RX EDMA support requires a modified approach to the RX descriptor
handling.

Specifically:

* There's now two RX queues - high and low priority;
* The RX queues are implemented as FIFOs; they're now an array of pointers
  to buffers;
* .. and the RX buffer and descriptor are in the same "buffer", rather than
  being separate.

So to that end, this commit abstracts out most of the RX related functions
from the bulk of the driver.  Notably, the RX DMA/buffer allocation isn't
updated, primarily because I haven't yet fleshed out what it should look
like.

Whilst I'm here, create a set of matching but mostly unimplemented EDMA
stubs.

Tested:

  * AR9280, station mode

TODO:

  * Thorough AP and other mode testing for non-EDMA chips;
  * Figure out how to allocate RX buffers suitable for RX EDMA, including
    correctly setting the mbuf length to compensate for the RX descriptor
    and completion status area.
2012-07-03 06:59:12 +00:00
Adrian Chadd
577cd9a9b2 Bring over some further HAL capabilities from the Atheros HAL, as well
as an EDMA check function.

For the AR9003 and later NICs, different TX/RX DMA and descriptor handling
code will be conditional on the EDMA check.

Obtained from:	Qualcomm Atheros
2012-07-02 06:02:12 +00:00
Adrian Chadd
375d4f068a Shuffle some more fields in ath_buf so it's not too big.
This shaves off 20 bytes - from 288 bytes to 268 bytes.

However, it's still too big.
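
A minimal sketch of why shuffling fields can shrink a struct. The two structs below are hypothetical and unrelated to the real ath_buf layout; they hold the same fields, but the second orders them largest first so the compiler needs less padding (typically 8 bytes instead of 12 on common ABIs, though the exact sizes are implementation-defined).

```c
#include <stdint.h>

/* Fields in an order that forces padding on common ABIs. */
struct sk_loose {
	uint8_t  a;
	uint32_t b;	/* usually preceded by 3 bytes of padding */
	uint8_t  c;
	uint16_t d;	/* usually preceded by 1 byte of padding */
};

/* Same fields, largest first, so they pack with little or no padding. */
struct sk_tight {
	uint32_t b;
	uint16_t d;
	uint8_t  a;
	uint8_t  c;
};
```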
2012-06-16 04:41:35 +00:00
Adrian Chadd
3dd2db6646 Shave four (or eight) bytes off of ath_buf - this field isn't used. 2012-06-16 04:36:08 +00:00
Adrian Chadd
956ac958bf Shrink ath_buf a little more:
* Resize some types.  In particular, bfs_seqno can be uint16_t for now.
  Previous work would assign the unassigned seqno a value of -1, which
  I obviously can't do here.

* Remove bfs_pktdur.  It was in the original code but nothing so far uses
  it.

This gets ath_buf down (on my i386 system) to 292 bytes from 300 bytes.
I'd rather it be much, much smaller.
2012-06-14 04:24:13 +00:00
Adrian Chadd
23ced6c117 Implement a global (all non-mgmt traffic) TX ath_buf limitation when
ath_start() is called.

This (defaults to 10 frames) gives a little headway in the TX ath_buf
allocation, so buffer cloning is still possible.

This requires a lot more experimenting and tuning.

It also doesn't stop a node/TID from consuming all of the available
ath_buf's, especially when the node is going through high packet loss
or only talking at a low TX rate.  It also doesn't stop a paused TID
from taking all of the ath_bufs.  I'll look at fixing that up in subsequent
commits.

PR:	kern/168170
2012-06-14 00:51:53 +00:00
Adrian Chadd
af33d486ab Implement a separate, smaller pool of ath_buf entries for use by management
traffic.

* Create sc_mgmt_txbuf and sc_mgmt_txdesc, initialise/free them appropriately.
* Create an enum to represent buffer types in the API.
* Extend ath_getbuf() and _ath_getbuf_locked() to take the above enum.
* Right now anything sent via ic_raw_xmit() allocates via ATH_BUFTYPE_MGMT.
  This may not be very useful.
* Add ATH_BUF_MGMT flag (ath_buf.bf_flags) which indicates the current buffer
  is a mgmt buffer and should go back onto the mgmt free list.
* Extend 'txagg' to include debugging output for both normal and mgmt txbufs.
* When checking/clearing ATH_BUF_BUSY, do it on both TX pools.
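
The two-pool scheme in the list above can be sketched roughly as follows. This is an illustration only: the sk_* names, the singly linked free lists and the flag value are hypothetical, and the real ath_getbuf() takes locks that are omitted here.

```c
#include <stddef.h>

typedef enum { SK_BUFTYPE_NORMAL, SK_BUFTYPE_MGMT } sk_buftype_t;

#define SK_BUF_MGMT 0x1u	/* buffer belongs to the mgmt pool */

struct sk_buf {
	struct sk_buf *next;
	unsigned flags;
};

struct sk_softc {
	struct sk_buf *txbuf;		/* free list for normal traffic */
	struct sk_buf *mgmt_txbuf;	/* smaller free list for mgmt traffic */
};

/* Take a buffer from the pool matching the requested type, or NULL. */
static struct sk_buf *
sk_getbuf(struct sk_softc *sc, sk_buftype_t type)
{
	struct sk_buf **head =
	    (type == SK_BUFTYPE_MGMT) ? &sc->mgmt_txbuf : &sc->txbuf;
	struct sk_buf *bf = *head;

	if (bf == NULL)
		return NULL;
	*head = bf->next;
	bf->next = NULL;
	if (type == SK_BUFTYPE_MGMT)
		bf->flags |= SK_BUF_MGMT;
	else
		bf->flags &= ~SK_BUF_MGMT;
	return bf;
}

/* Return a buffer to whichever pool its flag says it came from. */
static void
sk_freebuf(struct sk_softc *sc, struct sk_buf *bf)
{
	struct sk_buf **head =
	    (bf->flags & SK_BUF_MGMT) ? &sc->mgmt_txbuf : &sc->txbuf;

	bf->next = *head;
	*head = bf;
}
```

Because the pools are separate, exhausting the normal list leaves mgmt allocations unaffected.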

Tested:

* STA mode, with heavy UDP injection via iperf.  This filled the TX queue
  however BARs were still going out successfully.

TODO:

* Initialise the mgmt buffers with ATH_BUF_MGMT and then ensure the right
  type is being allocated and freed on the appropriate list.  That'd save
  a write operation (to bf->bf_flags) on each buffer alloc/free.

* Test on AP mode, ensure that BAR TX and probe responses go out nicely
  when the main TX queue is filled (eg with paused traffic to a TID,
  awaiting a BAR to complete.)

PR:		kern/168170
2012-06-13 06:57:55 +00:00
Adrian Chadd
c2ac9655c3 Introduce a new lock debug which is specifically for making sure the
_TID_ lock is held.

For now the TID lock is also the TXQ lock. This is just to make sure
that the right TXQ lock is held for the given TID.
2012-06-11 07:06:49 +00:00
Adrian Chadd
a108d2d6c6 Revert r233227 and followup commits as it breaks CCMP PN replay detection.
This showed up when doing heavy UDP throughput on SMP machines.

The problem with this is because the 802.11 sequence number is being
allocated separately to the CCMP PN replay number (which is assigned
during ieee80211_crypto_encap()).

Under significant throughput (200+ Mbps) the TX path would be stressed
enough that frame TX/retry would force sequence number and PN allocation
to be out of order.  So once the frames were reordered via 802.11 seqnos,
the CCMP PN would be far out of order, causing most frames to be discarded
by the receiver.

I've fixed this in some local work by being forced to:

  (a) deal with the issues that lead to the parallel TX causing out of
      order sequence numbers in the first place;
  (b) fix all the packet queuing issues which lead to strange (but mostly
      valid) TX.

I'll begin fixing these in a subsequent commit or five.

PR:		kern/166190
2012-06-11 06:59:28 +00:00
Adrian Chadd
9f95609828 Mostly revert previous commit(s). After doing a bunch of local testing,
it turns out that it negatively affects performance.  I'm still investigating
exactly why deferring the IO causes such negative TCP performance but
doesn't affect UDP performance.

Leave the ath_tx_kick() change in there however; it's going to be useful
to have that there for if_transmit() work.

PR:		kern/168649
2012-06-05 06:03:55 +00:00
Adrian Chadd
470a7f4191 Migrate the TX path to a taskqueue for now, until a better way of
implementing parallel TX and TX/RX completion can be done without
simply abusing long-held locks.

Right now, multiple concurrent ath_start() entries can result in
frames being dequeued out of order.  Well, they're dequeued in order
fine, but if there's any preemption or race between CPUs between:

* removing the frame from the ifnet, and
* calling and running ath_tx_start(), until the frame is placed on a
  software or hardware TXQ

Then although dequeueing the frame is in-order, queueing it to the hardware
may be out of order.

This is solved in a lot of other drivers by just holding a TX lock over
a rather long period of time.  This lets them continue to direct dispatch
without races between dequeue and hardware queue.

Note to observers: if_transmit() doesn't necessarily solve this.
It removes the ifnet from the main path, but the same issue exists if
there's some intermediary queue (eg a bufring, which as an aside also
may pull in ifnet when you're using ALTQ.)

So, until I can sit down and code up a much better way of doing parallel
TX, I'm going to leave the TX path using a deferred taskqueue task.
What I will likely head towards is doing a direct dispatch to hardware
or software via if_transmit(), but it'll require some driver changes to
allow queues to be made without using the really large ath_buf / ath_desc
entries.

TODO:

* Look at how feasible it'll be to just do direct dispatch to
  ath_tx_start() from if_transmit(), avoiding doing _any_ intermediary
  serialisation into a global queue.  This may break ALTQ for example,
  so I have to be delicate.

* It's quite likely that I should break up ath_tx_start() so it
  deposits frames onto the software queues first, and then only fill
  in the 802.11 fields when it's being queued to the hardware.
  That will make the if_transmit() -> software queue path very
  quick and lightweight.

* This has some very bad behaviour when using ACPI and Cx states.
  I'll do some subsequent analysis using KTR and schedgraph and file
  a follow-up PR or two.

PR:		kern/168649
2012-06-04 22:01:12 +00:00
Adrian Chadd
a35baf81c9 Remove an unneeded field from ath_buf. 2012-05-26 01:34:36 +00:00
Adrian Chadd
ae2a0aa428 oops - ath_hal_disablepcie is actually destined for another purpose,
not to disable the PCIe PHY in preparation for reset.

Extend the enablepci method to have a "poweroff" flag, which if equal
to true means the hardware is about to go to sleep.
2012-05-25 05:01:27 +00:00
Adrian Chadd
d73df6d52c Prepare for improved (read: pcie) suspend/resume support.
* Flesh out the pcie disable method for 11n chips, as they were defaulting
  to the AR5212 (empty) PCIe disable method.

* Add accessor macros for the HAL PCIe enable/disable calls.

* Call disable on ath_suspend()

* Call enable on ath_resume()

NOTE:

* This has nothing to do with the NIC sleep/run state - the NIC still
  will stay in network-run state rather than supporting network-sleep
  state.  This is preparation work for supporting correct suspend/resume
  WARs for the 11n PCIe NICs.

TODO:

* It may be feasible at this point to keep the chip powered down during
  initial probe/attach and only power it up upon the first configure/reset
  pass.  This however would require correct (for values of "correct")
  tracking of the NIC power configuration state from the driver and that
  just isn't attempted at the moment.

Tested:

* AR9280 on my Lenovo T60, but with no suspend/resume pass (yet).
2012-05-25 02:07:59 +00:00
Adrian Chadd
e4f6061912 Re-up the TX ath_buf limit from 128 to 512.
I'll have to leave this high for now, until I've done some significant
surgery with how ath_bufs (and descriptors) are handled.

This should significantly cut down on the opportunities for a full TX
queue hanging traffic.  I'll continue making things work though; I'm
mostly doing this for users. :)
2012-05-22 19:50:21 +00:00
Adrian Chadd
d3a6425b7c Fix up some corner cases with aggregation handling.
I've come across a weird scenario in net80211 where two TX streams will
happily attempt to setup an aggregation session together.
If we're very lucky, it happens concurrently on separate CPUs and the
total lack of locking in the net80211 aggregation code causes this stuff
to race. Badly.

So >1 call would occur to the ath(4) addba start, but only one call would
complete to addba complete or timeout.  The TID would thus stay paused.

The real fix is to implement some proper per-node (or maybe per-TID)
locking in net80211, which then could be leveraged by the ath(4) TX
aggregation code.

Whilst I'm at it, shuffle around the debugging messages a bit.
I like to keep people on their toes.
2012-05-22 06:31:03 +00:00
Adrian Chadd
0e22ed0eb2 Migrate ath_debug and sc_debug from an int to a uint64_t / QUAD;
add some more BAR debugging logic.

* Change the definition of ath_debug and ath_softc.sc_debug  from
  int to uint64_t;
* Change the relevant sysctls;
* Add a new BAR TX debugging field;
* Use this in if_ath_tx.

This has been tested by using the sysctl program, which happily allows
for fields > 32 bits to be configured.
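
A minimal sketch of why the wider type matters. The category names and bit positions are hypothetical, not the driver's actual debug flags; the sketch only shows that a category above bit 31 survives in a 64-bit mask and is lost when the mask is truncated to 32 bits.

```c
#include <stdint.h>

/* Hypothetical debug categories; bit 32 and up need a 64-bit mask. */
#define SK_DEBUG_XMIT	(UINT64_C(1) << 1)
#define SK_DEBUG_TX_BAR	(UINT64_C(1) << 32)

/* Non-zero if the given category is enabled in the mask. */
static int
sk_debug_enabled(uint64_t mask, uint64_t category)
{
	return (mask & category) != 0;
}
```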
2012-05-15 23:39:37 +00:00
Adrian Chadd
352f07f66d Change the MIB cycle count API to return HAL_BOOL, rather than uint32_t,
to return whether it was successful.

Add placeholder (blank) methods for previous chips, for both it and
the 11n extension channel busy call.
2012-05-01 14:48:51 +00:00
Adrian Chadd
f846cf42ab Run the fatal proc as a proc, rather than where it currently is.
Otherwise the reset path will sleep, which it can't do in this context.
2012-04-17 06:02:41 +00:00
Adrian Chadd
82d05362e6 Drop this down from 512 to 128 for now.
This may result in a bit of a throughput drop.  However, any throughput
drop at this point should be investigated and root caused, as it's likely
because TX scheduling (all the way down to how preemption, scheduler work,
etc) is happening in a sub-optimal fashion.

This also makes it much more likely to be reloadable on a live machine.
Allocating 5120 TX ath_buf entries via contigmalloc is very unlikely
after a few hours of using X/Chromium.
2012-04-15 19:54:22 +00:00
Adrian Chadd
f8ab7a9fc9 Convert the flags over to a set of bit flags. 2012-04-10 19:25:43 +00:00
Adrian Chadd
9467e3f3fc Squirrel away SYNC interrupt debugging if it's enabled in the HAL.
Bus errors will show up as various SYNC interrupts which will be passed
back up to ath_intr().
2012-04-10 07:23:37 +00:00
Adrian Chadd
ce656facf3 Store away the RTS aggregate limit from the HAL.
This will be used by some upcoming code to ensure that aggregates
are enforced to be a certain size.  The AR5416 has a limitation on
RTS protected aggregates (8KiB).
2012-04-07 02:51:53 +00:00
Adrian Chadd
875a9451d9 Remove duplicate txflags field from ath_buf.
rename bf_state.bfs_flags to bf_state.bfs_txflags, as that is what
it effectively is.
2012-04-07 02:01:26 +00:00
Adrian Chadd
88b3d48316 Implement BAR TX.
A BAR frame must be transmitted when a frame in an A-MPDU session fails
to transmit - it's retried too often, or it can't be cloned for
re-transmission.  The BAR frame tells the remote side to advance the
left edge of the block-ack window (BAW) to a new value.

In order to do this:

* TX for that particular node/TID must be paused;
* The existing frames in the hardware queue need to be completed, whether
  they're TXed successfully or otherwise;
* The new left edge of the BAW is then communicated to the remote side
  via a BAR frame;
* Once the BAR frame has been successfully TXed, aggregation can resume;
* If the BAR frame can't be successfully TXed, the aggregation session
  is torn down.
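
The window arithmetic behind the steps above can be sketched as follows. This assumes the 12-bit 802.11 sequence space; the sk_* names are hypothetical, and the left-edge helper covers only the simplest case, where the abandoned frame sat at the left edge of the window.

```c
#include <stdint.h>

#define SK_SEQ_RANGE 4096u	/* 802.11 sequence numbers are 12 bits wide */

/* Is seqno inside the window of wnd entries starting at left? */
static int
sk_baw_within(uint16_t left, uint16_t wnd, uint16_t seqno)
{
	/* Modular distance from the left edge, wrapping at 4096. */
	return ((uint16_t)(seqno - left) & (SK_SEQ_RANGE - 1)) < wnd;
}

/* New left edge once the frame at failed_seqno has been given up on. */
static uint16_t
sk_baw_advance(uint16_t failed_seqno)
{
	return (uint16_t)((failed_seqno + 1u) & (SK_SEQ_RANGE - 1));
}
```

The value returned by sk_baw_advance() stands in for the starting sequence number that a BAR frame would carry to the remote side.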

This is a first pass that implements the above.  What needs to be done/
tested:

* What happens during say, a channel reset / stuck beacon _and_ BAR
  TX.  It _should_ be correctly buffered and retried once the
  reset has completed.  But if a bgscan occurs (and they shouldn't,
  grr) the BAR frame will be forcibly failed and the aggregation session
  will be torn down.

  Yes, another reason to disable bgscan until I've figured this out.

* There's way too much locking going on here.  I'm going to do a couple
  of further passes of sanitising and refactoring so the (re) locking
  isn't so heavy.  Right now I'm going for correctness, not speed.

* The BAR TX can fail if the hardware TX queue is full.  Since there's
  no "free" space kept for management frames, a full TX queue (from eg
  an iperf test) can race with your ability to allocate ath_buf/mbufs
  and cause issues.  I'll knock this on the head with a subsequent
  commit.

* I need to do some _much_ more thorough testing in hostap mode to ensure
  that many concurrent traffic streams to different end nodes are correctly
  handled.  I'll find and squish whichever bugs show up here.

But, this is an important step to being able to flip on 802.11n by default.
The last issue (besides bug fixes, of course) is HT frame protection and
I'll address that in a subsequent commit.
2012-04-04 23:45:15 +00:00
Adrian Chadd
33d340324a Correctly handle AR_MoreAggr when assembling multi-descriptor final frames.
Linux ath9k doesn't have this issue as it doesn't try queuing multi-
descriptor frames to the hardware.

Before, I was only setting the first and last descriptor in the final
frame correctly - and that was done by accident. The first descriptor in
the last sub-frame was being correctly updated by ath_tx_setds_11n();
the last descriptor in the last sub-frame was being correctly updated
by ath_buf_set_rate(). But both of those are "incorrect".

The correct behaviour is:

* AR_IsAggr is set for all descriptors for all subframes in an aggregate.
* AR_MoreAggr is set for all descriptors for all non-final sub-frames
  in an aggregate.

Ie, all descriptors in the last sub-frame of an aggregate must have this
field set to 0.
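
A minimal sketch of the rule as stated above. The descriptor struct, the flag values and the function name are hypothetical stand-ins; the real AR_IsAggr and AR_MoreAggr bits are defined by the HAL and live in specific descriptor control words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits standing in for AR_IsAggr / AR_MoreAggr. */
#define SK_IS_AGGR	0x1u
#define SK_MORE_AGGR	0x2u

struct sk_desc {
	uint32_t ctl;	/* stand-in for a descriptor control word */
};

/*
 * descs[] holds every descriptor of the aggregate in order;
 * ndesc[i] is the number of descriptors used by subframe i.
 */
static void
sk_mark_aggr(struct sk_desc *descs, const size_t *ndesc, size_t nsubframes)
{
	size_t i, j, k = 0;

	for (i = 0; i < nsubframes; i++) {
		int last = (i == nsubframes - 1);

		for (j = 0; j < ndesc[i]; j++, k++) {
			/* Every descriptor of every subframe is part of the aggregate. */
			descs[k].ctl |= SK_IS_AGGR;
			/* Only non-final subframes announce that more follows. */
			if (last)
				descs[k].ctl &= ~SK_MORE_AGGR;
			else
				descs[k].ctl |= SK_MORE_AGGR;
		}
	}
}
```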

I still need to do a couple of extra passes to ensure the pad delimiter
field is being correctly handled in all descriptors in the last sub-frame.
2012-04-04 21:49:49 +00:00
Adrian Chadd
03e9308f0a Defer the rescheduling of TID -> TXQ frames in some instances.
Right now ath_txq_sched() is mainly called from the TX ath_tx_processq()
routine, which is (mostly) done as part of the taskqueue.  It shouldn't
be called outside the taskqueue.

But now that I'm about to flip back on BAR TX, I'm going to start
stressing the ath_tx_tid_pause() and ath_tx_tid_resume() paths.
What I don't want to have happen is a reschedule of the TID traffic
_during_ the completion of TX frames.

Ideally I'd like to have a way to flag back up to the processing code
that the current hardware queue should be rechecked for software TID
queue frames.  But for now, this should suffice for the BAR TX case.

I may eventually delete this code once I've brought some further
sanity to the general TX queue/completion path.
2012-03-29 17:39:18 +00:00
Adrian Chadd
0b96ef630b Delay sequence number allocation for A-MPDU until just before the frame
is queued to the hardware.

Because multiple concurrent paths can execute ath_start(), multiple
concurrent paths can push frames into the software/hardware TX queue
and since preemption/interrupting can occur, there's the possibility
that a gap in time will occur between allocating the sequence number
and queuing it to the hardware.

Because of this, it's possible that a thread will have allocated a
sequence number and then be preempted by another thread doing the same.
If the second thread sneaks the frame into the BAW, the (earlier) sequence
number of the first frame will be now outside the BAW and will result
in the frame being constantly re-added to the tail of the queue.
There it will live until the sequence numbers cycle around again.

This also creates a hole in the RX BAW tracking which can also cause
issues.

This patch delays the sequence number allocation to occur only just before
the frame is going to be added to the BAW.  I've been wanting to do this
anyway as part of a general code tidyup but I've not gotten around to it.
This fixes the PR.

However, it still makes it quite difficult to try and ensure in-order
queuing and dequeuing of frames. Since multiple copies of ath_start()
can be run at the same time (eg one TXing process thread, one TX completion
task/one RX task) the driver may end up having frames dequeued and pushed
into the hardware slightly/occasionally out of order.

And, to make matters more annoying, net80211 may have the same behaviour -
in the non-aggregation case, the TX code allocates sequence numbers
before it's thrown to the driver.  I'll open another PR to investigate
this and potentially introduce some kind of final-pass TX serialisation
before frames are thrown to the hardware.  It's also very likely worthwhile
adding some debugging code into ath(4) and net80211 to catch when/if this
does occur.

PR:		kern/166190
2012-03-20 04:50:25 +00:00
Adrian Chadd
b09e37a185 Fix a panic introduced in a previous commit - non-beaconing modes (eg STA)
don't setup the avp mcast queue.

This is a bit annoying though - it turns out the mcast queue isn't
initialised for STA mode but it's then touched to see whether anything
is in it.  That should be fixed in a subsequent commit.

Noticed by:	gperez@entel.upc.edu
PR:		kern/165895
2012-03-10 19:58:23 +00:00
Adrian Chadd
9c85ff9164 Don't flood the cabq/mcastq with frames.
In a very noisy 2.4GHz environment (with HT/40 enabled, making it worse)
I saw the following occur:

* the air was considered "busy" a lot of the time;
* the cabq time is quite short due to staggered beacons being enabled;
* it just wasn't able to keep up TX'ing CABQ frames;
* .. and the cabq would swallow up all the TX ath_buf's.

This patch introduces a twiddle which allows the maximum cabq depth to be
set, forcing further frames to be dropped.

It defaults to the TX buffer count at the moment, so the default behaviour
isn't changed.

I've also started fleshing out a similar setup for the data path, so
it doesn't swallow up all the available TX buffers and prevent management
frames (such as ADDBA) from going out.

PR:		kern/165895
2012-03-10 04:14:04 +00:00
Adrian Chadd
d52f713265 Attempt to further fix some of the concurrency/reset issues that occur.
* ath_reset() is being called in softclock context, which may have the
  thing sleep on a lock.  To avoid this, since we really _shouldn't_
  be sleeping on any locks, break out the no-loss reset path into a tasklet
  and call that from:

  + ath_calibrate()
  + ath_watchdog()

  This has the added advantage that it'll end up also doing the frame
  RX cleanup from within the taskqueue context, rather than the softclock
  context.

* Shuffle around the taskqueue_block() call to be before we grab the lock
  and disable interrupts.

  The trouble here is that taskqueue_block() doesn't block currently
  queued (but not yet running) tasks so calling it doesn't guarantee
  no further tasks (that weren't running on _A_ CPU at the time of this
  call) will complete.  Calling taskqueue_drain() on these tasks won't
  work because if any _other_ thread calls taskqueue_enqueue() for whatever
  reason, everything gets very angry and stops working.

  This slightly changes the race condition enough to let ath_rx_tasklet()
  run before we try disabling it, and thus quietens the warnings a bit.

  The (more) true solution will be doing something like the following:

  * having a taskqueue_blocked mask in ath_softc;
  * having an interrupt_blocked mask in ath_softc;
  * only calling taskqueue_drain() on each individual task _after_ the
    lock has been acquired - that way no further tasklet scheduling
    is going to occur.
  * Then once the tasks have been blocked _and_ the interrupt has been
    disabled, call taskqueue_drain() on each, ensuring that anything
    that _was_ scheduled or running is removed.

  The trouble is if something calls taskqueue_enqueue() on a task
  after taskqueue_block() has been called but BEFORE taskqueue_drain()
  has been called, ta_pending will be set to 1 and taskqueue_drain()
  will sit there stuck in msleep() until you hard-kill the machine.

PR: kern/165382
PR: kern/165220
2012-02-25 19:12:54 +00:00
Adrian Chadd
a865860d09 Add in a new driver feature to allow the TX and RX chainmask to be
overridden at attach time.

Some 802.11n NICs may only have one physical antenna connected.
The radios will be very upset if you try enabling radios which aren't
connected to antennas.

This allows hints to override the TX and RX chainmask.

These hints are:

hint.ath.X.rx_chainmask
hint.ath.X.tx_chainmask

They can be set at either boot time or in kenv before the module is loaded.

This and the previous HAL commit were sponsored in late 2011 by Hobnob, Inc.

Sponsored by:	Hobnob, Inc.
2012-02-10 10:01:09 +00:00
Adrian Chadd
ee2e64dd6b Fix up some style(9) indenting and reorganise some of the hal methods.
There should be no functional change due to this commit.
2012-01-24 06:12:48 +00:00
Adrian Chadd
eb1d1f1de3 Add a missing HAL method macro. I'm using this as part of some personal
DFS radar stuff.
2012-01-24 06:07:05 +00:00
Adrian Chadd
3440495a52 Flesh out configurable hardware based LED blinking.
The hardware (MAC) LED blinking involves a few things:

* Selecting which GPIO pins map to the MAC "power" and "network" lines;
* Configuring the MAC LED state (associated, scanning, idle);
* Configuring the MAC LED blinking type and speed.

The AR5416 HAL configures the normal blinking setup - ie, blink rate based
on TX/RX throughput.  The default AR5212 HAL doesn't program in any
specific blinking type, but the default of 0 is the same.

This code introduces a few things:

* The hardware led override is configured via sysctl 'hardled';
* The MAC network and power LED GPIO lines can be set, or left at -1
  if needed.  This is intended to allow only one of the hardware MUX
  entries to be configured (eg for PCIe cards which only have one LED
  exposed.)

TODO:

* For AR2417, the software LED blinking involves software blinking the
  Network LED.  For the AR5416 and later, this can just be configured
  as a GPIO output line.  I'll chase that up with a subsequent commit.

* Add another software LED blink for "Link", separate from "activity",
  which blinks based on the association state.  This would make my
  D-Link DWA-552 have consistent and useful LED behaviour (as they're
  marked "Link" and "Activity".)

* Don't expose the hardware LED override unless it's an AR5416 or later,
  as the previous generation hardware doesn't have this multiplexing
  setup.
2011-12-26 07:47:05 +00:00
Adrian Chadd
ef27340c5b Flesh out some slightly dirty reset/channel change serialisation code
for the ath(4) driver.

Currently, there's nothing stopping reset, channel change and general
TX/RX from overlapping with each other. This wasn't a big deal with
pre-11n traffic as it just results in some dropped frames.
It's possible this may have also caused some inconsistencies and
badly-setup hardware.

Since locks can't be held across all of this (the Linux solution)
due to LORs with the network stack locks, some state counter
variables are used to track what parts of the code the driver is
currently in.

When the hardware is being reset, it disables the taskqueue and
waits for pending interrupts, tx, rx and tx completion before
it begins the reset or channel change.

TX and RX both abort if called during an active reset or channel
change.

Finally, the reset path now doesn't flush frames if ATH_RESET_NOLOSS
is set. Instead, completed TX and RX frames are passed back up to
net80211 before the reset occurs.

This is not without problems:

* Raw frame xmit are just dropped, rather than placed on a queue.
  The net80211 stack should be the one which queues these frames
  rather than the driver.

* It's all very messy. It'd be better if these hardware operations
  were serialised on some kind of work queue, rather than hoping
  they can be run in parallel.

* The taskqueue block/unblock may occur in parallel with the
  newstate() function - which shuts down the taskqueue and restarts
  it once the new state is known. It's likely these operations should
  be refcounted so the taskqueue is restored once no other areas
  in the code wish to suspend operations.

* .. interrupt disable/enable should likely be refcounted as well.

With this work, the driver does not drop frames during stuck beacon
or fatal errors and thus 11n traffic continues to run correctly.
Default and full resets however do still drop frames and it's possible
this may occur, causing traffic loss and session stalls.

Sponsored by:	Hobnob, Inc.
2011-11-18 05:06:30 +00:00
Adrian Chadd
8f939e7967 Merge in some fixes from the if_ath_tx branch.
* Close down some of the kickpcu races, where the interrupt handler
  can and will run concurrently with the taskqueue.
* Close down the TXQ active/completed race between the interrupt
  handler and the concurrently running tx completion taskqueue
  function.
* Add some tx and rx interrupt count tracking, for debugging.
* Fix the kickpcu logic in ath_rx_proc() to not simply drain and
  restart the TX queue - instead, assume the hardware isn't
  (too) confused and just restart RX DMA. This may break on
  previous chipsets, so if it does I'll add a HAL flag and
  conditionally handle this (ie, for broken chipsets, I'll
  just restore the "stop PCU / flush things / restart PCU"
  logic.)
* Misc stuff

Sponsored by:	Hobnob, Inc.
2011-11-08 18:10:04 +00:00
Adrian Chadd
6b349e5a86 Migrate the STAILQ lists to TAILQs.
A bunch of the 11n TX aggregation logic wants to traverse lists of buffers
in various ways. In order to provide O(1) behaviour in this instance,
use TAILQs.

This does blow out the memory footprint and CPU cycles slightly for some
of these operations. I may convert some of these back to STAILQs once
the rest of the software transmit queue handling has been stabilised.
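
A minimal sketch of what the TAILQ buys, using the standard <sys/queue.h> macros. The struct and function names are hypothetical; the sketch only shows that an element can be unlinked from the middle of the list, and the tail reached, without walking the list from the head.

```c
#include <stddef.h>
#include <sys/queue.h>

struct sk_buf {
	int seqno;
	TAILQ_ENTRY(sk_buf) link;	/* forward and backward linkage */
};
TAILQ_HEAD(sk_bufhead, sk_buf);

/* Unlink one buffer from anywhere in the list without traversing it. */
static void
sk_unlink(struct sk_bufhead *q, struct sk_buf *bf)
{
	TAILQ_REMOVE(q, bf, link);
}

/* Sequence number of the last queued buffer, or -1 if the list is empty. */
static int
sk_last_seqno(struct sk_bufhead *q)
{
	struct sk_buf *bf = TAILQ_LAST(q, sk_bufhead);

	return (bf != NULL) ? bf->seqno : -1;
}
```

The cost is the extra back pointer in every entry and list head, which matches the memory footprint remark above.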

Sponsored by:	Hobnob, Inc.
2011-11-08 17:08:12 +00:00
Adrian Chadd
3dd85b265f Begin merging in some of my 802.11n TX aggregation driver changes.
* Add a PCU lock, which isn't currently used but will eventually be
  used to serialise some of the driver access.

* Add in all the software TX aggregation state, that's kept per-node
  and per-TID.

* Add in the software and aggregation state to ath_buf.

* Add in hooks to ath_softc for aggregation state and the (upcoming)
  aggregation TX state calls.

* Add / fix the HAL access macros.

Obtained from:	Linux, ath9k
Sponsored by:	Hobnob, Inc.
2011-11-08 02:12:11 +00:00
Adrian Chadd
45a94a8341 Update the default AIFS value for hostap mode.
Obtained from:	Linux ath9k, Atheros reference
2011-09-28 02:54:42 +00:00
Adrian Chadd
fc4de9b7fc Update the TSF and next-TBTT methods to work for the AR5416 and later NICs.
This is another commit in a series of TDMA support fixes for the 11n NICs.

* Move ath_hal_getnexttbtt() into the HAL; write methods for it.
  This returns a timer value in TSF, rather than TU.

* Move ath_hal_getcca() and ath_hal_setcca() into the HAL too, where they
  likely now belong.

* Create a new HAL capability: HAL_CAP_LONG_RXDESC_TSF.
  The pre-11n NICs write 15 bit TSF snapshots into the RX descriptor;
  the AR5416 and later write 32 bit TSF snapshots into the RX descriptor.
* Use the new capability to choose between 15 and 32 bit TSF adjustment
  functions in ath_extend_tsf().

* Write ar5416GetTsf64() and ar5416SetTsf64() methods.
  ar5416GetTsf64() tries to compensate for TSF changes at the 32 bit boundary.

According to yin, this fixes the TDMA beaconing on 11n chipsets and TDMA
stations can now associate/talk, but there are still issues with traffic
stability which need to be investigated.

The ath_hal_extendtsf() function is also used in RX packet timestamping;
this may improve adhoc mode on the 11n chipsets. It also will affect the
timestamps seen in radiotap frames.
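
A minimal sketch of the 15 bit case of TSF extension, under the assumption that the full 64-bit TSF is read shortly after the frame was received. The function name is hypothetical and this is not presented as the driver's exact routine; it only illustrates how a short snapshot is merged into the full counter while compensating for a wrap between the snapshot and the read.

```c
#include <stdint.h>

/*
 * rstamp: 15-bit TSF snapshot taken from the RX descriptor.
 * tsf:    full 64-bit TSF read some time after the frame arrived.
 */
static uint64_t
sk_extend_tsf15(uint32_t rstamp, uint64_t tsf)
{
	/*
	 * If the low 15 bits of the current TSF are below the snapshot,
	 * the counter wrapped after the frame arrived; step back one period.
	 */
	if ((tsf & 0x7fff) < rstamp)
		tsf -= 0x8000;
	return (tsf & ~(uint64_t)0x7fff) | rstamp;
}
```

A 32 bit variant would follow the same pattern with a wider mask and period.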

Submitted by:	Kang Yin Su <cantona@cantona.net>
Approved by:	re (kib)
2011-09-08 01:23:05 +00:00
Adrian Chadd
ef068c6d9f And add another missing brace. Another pointy hat moment.
This one however isn't used by any public code yet, so it
didn't break the build.

Approved by:	re (kib, blanket)
2011-08-08 19:03:26 +00:00