Commit Graph

366 Commits

Author SHA1 Message Date
Adrian Chadd
0fbe75a1c9 Re-lock the ath lock after ath_reset() has been called.
The calibrate callout is done with the sc lock held.

This only showed up when using an older NIC (AR5212) whose
radio/phy requires the rfgain adjustment.

Pointy-hat-to:	adrian
Sponsored by:	Hobnob, Inc.
2011-11-23 07:12:26 +00:00
Adrian Chadd
a2d8240de5 Use the correct lock when calling msleep().
This fixes panics that users have been seeing when operating in station mode,
where the interface undergoes a lot more resets then in hostap mode (ie whilst
doing channel scanning.)

Reported by:	arundel, wblock@wonkity.com
Sponsored by:	Hobnob, Inc.
2011-11-21 22:57:28 +00:00
Adrian Chadd
5856d663ae Fix some whitespace pollution. 2011-11-21 21:59:01 +00:00
Adrian Chadd
9a842e8b59 Begin breaking apart the receive setup/stop path in preparation for more
"correct" handling of frames in the RX pending queue during interface
transitions.

* ath_stoprecv() doesn't blank out the descriptor list - that's what
  ath_startrecv() does. So, change a comment to reflect that.

* ath_stoprecv() does include a large (3ms) delay to let pending DMA
  complete. However, I'm under the impression that the stopdma hal
  method does check for a bit in the PCU to indicate DMA has stopped.
  So, to help with fast abort and restart, modify ath_stoprecv() to take
  a flag which indicates whether this is needed.

* Modify the uses of ath_stoprecv() to pass in a flag to support the
  existing behaviour (ie, do the delay.)

* Remove some duplicate PCU teardown code (which wasn't shutting down DMA,
  so it wasn't entirely correct..) and replace it with a call to
  ath_stoprecv(sc, 0) - which disables the DELAY call.

The upshoot of this is now channel change doesn't simply drop completed
frames on the floor, but instead it cleanly handles those frames.
It still discards pending TX frames in the software and hardware queues
as there's no (current) logic which forcibly recalculates the rate control
information (or whether they're appropriate to be on the TX queue after
a channel change), that'll come later.

This still doesn't stop all the sources of queue stalls but it does
tidy up some of the code duplication.

To be complete, queue stalls now occur during normal behaviour -
they only occur after some kind of broken behaviour causes an interface
or node flush, upsetting the TX/RX BAW. Subsequent commits will
incrementally fix these and other related issues.

Sponsored by:	Hobnob, Inc.
2011-11-19 21:05:31 +00:00
Adrian Chadd
ef27340c5b Flesh out some slightly dirty reset/channel change serialisation code
for the ath(4) driver.

Currently, there's nothing stopping reset, channel change and general
TX/RX from overlapping with each other. This wasn't a big deal with
pre-11n traffic as it just results in some dropped frames.
It's possible this may have also caused some inconsistencies and
badly-setup hardware.

Since locks can't be held across all of this (the Linux solution)
due to LORs with the network stack locks, some state counter
variables are used to track what parts of the code the driver is
currently in.

When the hardware is being reset, it disables the taskqueue and
waits for pending interrupts, tx, rx and tx completion before
it begins the reset or channel change.

TX and RX both abort if called during an active reset or channel
change.

Finally, the reset path now doesn't flush frames if ATH_RESET_NOLOSS
is set. Instead, completed TX and RX frames are passed back up to
net80211 before the reset occurs.

This is not without problems:

* Raw frame xmit are just dropped, rather than placed on a queue.
  The net80211 stack should be the one which queues these frames
  rather than the driver.

* It's all very messy. It'd be better if these hardware operations
  were serialised on some kind of work queue, rather than hoping
  they can be run in parallel.

* The taskqueue block/unblock may occur in parallel with the
  newstate() function - which shuts down the taskqueue and restarts
  it once the new state is known. It's likely these operations should
  be refcounted so the taskqueue is restored once no other areas
  in the code wish to suspend operations.

* .. interrupt disable/enable should likely be refcounted as well.

With this work, the driver does not drop frames during stuck beacon
or fatal errors and thus 11n traffic continues to run correctly.
Default and full resets however do still drop frames and it's possible
this may occur, causing traffic loss and session stalls.

Sponsored by:	Hobnob, Inc.
2011-11-18 05:06:30 +00:00
Adrian Chadd
ddbe3036e5 Introduce a work-around for issues with the AR5416 based MAC on SMP devices.
The AR5416 MAC (which shows up in the AR5008, AR9001, AR9002 devices) has
issues with PCI transactions on SMP machines. This work-around enforces
that register access is serialised through a (global for now) spinlock.

This should stop the hangs people have seen with the AR5416 PCI devices
on SMP hosts.

Obtained by:	Linux, Atheros
2011-11-09 22:39:44 +00:00
Adrian Chadd
31fdf3d6d2 Fix the KTR option to compile by default - it was referencing
some unmerged interrupt status debugging code from my branch.

* Add ah_intrstate[8] which will have the record of the last
  call to ath_hal_getintr().
* Wrap the KTR code behind ATH_KTR_INTR_DEBUG.
* Add the HAL interrupt debugging behind AH_INTERRUPT_DEBUGGING.

This is only done for the AR5416 and later NICs but it will be
trivial to add to the earlier NICs if required.

Neither are enabled by default, although to minimise HAL binary
API differences, the ah_intrstate[] array is always compiled into
the ath_hal struct.
2011-11-08 22:50:28 +00:00
Adrian Chadd
eb6f0de09d Introduce TX aggregation and software TX queue management
for Atheros AR5416 and later wireless devices.

This is a very large commit - the complete history can be
found in the user/adrian/if_ath_tx branch.

Legacy (ie, pre-AR5416) devices also use the per-software
TXQ support and (in theory) can support non-aggregation
ADDBA sessions. However, the net80211 stack doesn't currently
support this.

In summary:

TX path:

* queued frames normally go onto a per-TID, per-node queue
* some special frames (eg ADDBA control frames) are thrown
  directly onto the relevant hardware queue so they can
  go out before any software queued frames are queued.
* Add methods to create, suspend, resume and tear down an
  aggregation session.
* Add in software retransmission of both normal and aggregate
  frames.
* Add in completion handling of aggregate frames, including
  parsing the block ack bitmap provided by the hardware.
* Write an aggregation function which can assemble frames into
  an aggregate based on the selected rate control and channel
  configuration.
* The per-TID queues are locked based on their target hardware
  TX queue. This matches what ath9k/atheros does, and thus
  simplified porting over some of the aggregation logic.
* When doing TX aggregation, stick the sequence number allocation
  in the TX path rather than net80211 TX path, and protect it
  by the TXQ lock.

Rate control:

* Delay rate control selection until the frame is about to
  be queued to the hardware, so retried frames can have their
  rate control choices changed. Frames with a static rate
  control selection have that applied before each TX, just
  to simplify the TX path (ie, not have "static" and "dynamic"
  rate control special cased.)
* Teach ath_rate_sample about aggregates - both completion and
  errors.
* Add an EWMA for tracking what the current "good" MCS rate is
  based on failure rates.

Misc:

* Introduce a bunch of dirty hacks and workarounds so TID mapping
  and net80211 frame inspection can be kept out of the net80211
  layer. Because of the way this code works (and it's from Atheros
  and Linux ath9k), there is a consistent, 1:1 mapping between
  TID and AC. So we need to ensure that frames going to a specific
  TID will _always_ end up on the right AC, and vice versa, or the
  completion/locking will simply get very confused. I plan on
  addressing this mess in the future.

Known issues:

* There is no BAR frame transmission just yet. A whole lot of
  tidying up needs to occur before BAR frame TX can occur in the
  "correct" place - ie, once the TID TX queue has been drained.

* Interface reset/purge/etc results in frames in the TX and RX
  queues being removed. This creates holes in the sequence numbers
  being assigned and the TX/RX AMPDU code (on either side) just
  hangs.

* There's no filtered frame support at the present moment, so
  stations going into power saving mode will simply have a number
  of frames dropped - likely resulting in a traffic "hang".

* Raw frame TX is going to just not function with 11n aggregation.
  Likely this needs to be modified to always override the sequence
  number if the frame is going into an aggregation session.
  However, general raw frame injection currently doesn't work in
  general in net80211, so let's just ignore this for now until
  this is sorted out.

* HT protection is just not implemented and won't be until the above
  is sorted out. In addition, the AR5416 has issues RTS protecting
  large aggregates (anything >8k), so the work around needs to be
  ported and tested. Thus, this will be put on hold until the above
  work is complete.

* The rate control module 'sample' is the only currently supported
  module; onoe/amrr haven't been tested and have likely bit rotted
  a little. I'll follow up with some commits to make them work again
  for non-11n rates, but they won't be updated to handle 11n and
  aggregation. If someone wishes to do so then they're welcome to
  send along patches.

* .. and "sample" doesn't really do a good job of 11n TX. Specifically,
  the metrics used (packet TX time and failure/success rates) isn't as
  useful for 11n. It's likely that it should be extended to take into
  account the aggregate throughput possible and then choose a rate
  which maximises that. Ie, it may be acceptable for a higher MCS rate
  with a higher failure to be used if it gives a more acceptable
  throughput/latency then a lower MCS rate @ a lower error rate.
  Again, patches will be gratefully accepted.

Because of this, ATH_ENABLE_11N is still not enabled by default.

Sponsored by:	Hobnob, Inc.
Obtained from:	Linux, Atheros
2011-11-08 22:43:13 +00:00
Adrian Chadd
1f25c0f7a7 Make sure TXEOL is set on default queues. Otherwise we don't get an
interrupt on the completion of a TX queue and this can cause TX
hangs / timeout.

Sponsored by:	Hobnob, Inc.
2011-11-08 21:55:40 +00:00
Adrian Chadd
9352fb7ab0 Refactor out the TX buffer management and completion code in
preparation for TX aggregation.

* Add in logic which calls ath_buf bf->bf_comp if it's set.
  This allows for AMPDU (and RIFS, and FF, if someone desires) code
  to handle completion - which includes freeing subframes, retransmitting
  subframes, etc.

* Break out the buffer free, buffer busy/unbusy default completion handler
  code into separate functions. This allows bf_comp methods to free and
  unbusy each subframe ath_buf as required.

* Break out the statistics update code into a separate function, just
  to clean up the TX completion path a little.

Sponsored by:	Hobnob, Inc.
2011-11-08 21:49:33 +00:00
Adrian Chadd
6edf1dc729 Change the descriptor logic to use bf_lastds to point to the last
descriptor, rather than using the maths involving bf_desc[bf_nseg - 1].

When doing TX aggregation, the status will be updated in the -final-
descriptor of the -final- subframe in an aggregate. Thus bf_lastds
may point to the last descriptor in a completely different ath_buf.

Sponsored by:	Hobnob, Inc.
2011-11-08 21:25:36 +00:00
Adrian Chadd
e346b0732c Change ath_buf allocation to:
* Immediately return NULL if a buffer isn't available;
* Track the "buffers not available" count;
* Clear some fields used for tx aggregation;
* Add ath_buf_clone() which clones the majority of buffer state.
  This is needed when retransmission of a "busy" buffer is required.

Sponsored by:	Hobnob, Inc.
2011-11-08 21:13:05 +00:00
Adrian Chadd
2d43342484 Break out the TX DMA stop code into a separate function.
Sponsored by:	Hobnob, Inc.
2011-11-08 21:06:36 +00:00
Adrian Chadd
55c7b87745 Add a 'vap' to ath_keyset().
Add some code (which is currently disabled) which modifies the group
multicast key cache behaviour. I haven't yet figured out what the
exact/correct behaviour is so I'm leaving it disabled. It's worth
investigating and "correcting", especially for future work with
mesh/ibss and encryption.

Sponsored by:	Hobnob, Inc.
2011-11-08 19:25:52 +00:00
Adrian Chadd
16d4de92f9 Some more various fixes, etc from my 11n branch.
* When doing software TX queue handling and flush, it's possible
  that the deletion of a VAP (eg a STA shutdown) will queue a
  "STA Disassociate" frame whilst the interface is being deleted.
  The VAP is then deleted, and the frame ends up being queued
  to a node that is freed before it can be TX'ed. Things go awry
  at this point.

  There's no way at the present to avoid freeing the underlying node
  when the vap is being deleted. It's too late in the game.

  I suspect the real fix is to make sure the frame is software
  queued with no completion information somehow, so it doesn't
  link back to a node whose underlying VAP has been freed.
  For now, we'll just have to do this.

* Add some comments showing what's going on.

* Move an instance of the ATH_LOCK() around to protect the interrupt
  set. I'll worry about changing that to a PCU lock later on once
  the 11n code is in the tree.

Sponsored by:	Hobnob, Inc.
2011-11-08 19:18:34 +00:00
Adrian Chadd
f52d345218 Add KTR tracepoints to the ath driver, in order to debug TX, RX
and interrupt handling.

Sponsored by:	Hobnob, Inc.
2011-11-08 19:02:59 +00:00
Adrian Chadd
517526efe8 In preparation for supporting 11n TX/RX properly, allow for TX queue draining
and interface resets to be marked as ATH_RESET_DEFAULT, ATH_RESET_FULL,
ATH_RESET_NOLOSS.

Currently a reset is still a reset - ie, all tx/rx frames in the hardware
queues are purged. This means that those frames will be lost to the 11n TX
and RX aggregation state tracking, breaking AMPDU sessions.

The (eventual) new semantics:

* ATH_RESET_DEFAULT:
      full reset, this is the default for reset situations
      which I haven't yet figured out what they should be.
* ATH_RESET_FULL:
      A full reset - for things such as channel changes.
* ATH_RESET_NOLOSS:
      Don't flush TX/RX queues - handle pending RX frames and leave TX
      frames where they are; restart TX DMA from where it was.
2011-11-08 18:56:52 +00:00
Adrian Chadd
4afa805ec7 Break out the node cleanup and node free path, in preparation for
doing software TX queue management.

The software queued TX frames will be freed by the new cleanup
function.

Sponsored by:	Hobnob, Inc.
2011-11-08 18:48:26 +00:00
Adrian Chadd
96ff485dc3 Preparation for correct 802.11n tx/rx handling.
* Change ath_rx_proc() to ath_rx_tasklet(); make that the taskqueue function.
  This way (eventually) ath_rx_proc() can be called from elsewhere in the
  packet reset/processing queue so frames aren't just "flushed" during
  interface resets/reconfigure. This breaks 802.11n RX aggregation tracking.
* Extend ath_tx_proc() to take a 'resched' flag, which marks whether to
  reschedule further RX PCU reads or not.
* Change ath_tx_processq() to take a "dosched" flag, which will eventually
  be used to indicate whether to reschedule the software TX scheduler.

Sponsored by:	Hobnob, Inc.
2011-11-08 18:45:15 +00:00
Adrian Chadd
8f939e7967 Merge in some fixes from the if_ath_tx branch.
* Close down some of the kickpcu races, where the interrupt handler
  can and will run concurrently with the taskqueue.
* Close down the TXQ active/completed race between the interrupt
  handler and the concurrently running tx completion taskqueue
  function.
* Add some tx and rx interrupt count tracking, for debugging.
* Fix the kickpcu logic in ath_rx_proc() to not simply drain and
  restart the TX queue - instead, assume the hardware isn't
  (too) confused and just restart RX DMA. This may break on
  previous chipsets, so if it does I'll add a HAL flag and
  conditionally handle this (ie, for broken chipsets, I'll
  just restore the "stop PCU / flush things / restart PCU"
  logic.)
* Misc stuff

Sponsored by:	Hobnob, Inc.
2011-11-08 18:10:04 +00:00
Adrian Chadd
6b349e5a86 Migrate the STAILQ lists to TAILQs.
A bunch of the 11n TX aggregation logic wants to traverse lists of buffers
in various ways. In order to provide O(1) behaviour in this instance,
use TAILQs.

This does blow out the memory footprint and CPU cycles slightly for some
of these operations. I may convert some of these back to STAILQs once
the rest of the software transmit queue handling has been stabilised.

Sponsored by:	Hobnob, Inc.
2011-11-08 17:08:12 +00:00
Adrian Chadd
3dd85b265f Begin merging in some of my 802.11n TX aggregation driver changes.
* Add a PCU lock, which isn't currently used but will eventually be
  used to serialise some of the driver access.

* Add in all the software TX aggregation state, that's kept per-node
  and per-TID.

* Add in the software and aggregation state to ath_buf.

* Add in hooks to ath_softc for aggregation state and the (upcoming)
  aggregation TX state calls.

* Add / fix the HAL access macros.

Obtained from:	Linux, ath9k
Sponsored by:	Hobnob, Inc.
2011-11-08 02:12:11 +00:00
Ed Schouten
d745c852be Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.
This means that their use is restricted to a single C file.
2011-11-07 06:44:47 +00:00
Adrian Chadd
dcfd99a788 When punting frames to the RX tap, free the mbufs since we've tampered with
their length.

Without this, an error frame mbuf would:

* have its size adjusted;
* thrown at the radiotap code;
* then since it's never consumed, the rxbuf/mbuf is then re-added to the
  RX descriptor list with the small size;
* .. and the hardware ends up (sometimes) only DMA'ing part of a frame into
  the small buffer, chaining RX frames together (setting the more flag).

I discovered this particular issue when doing some promiscuous radiotap
testing; I found that I'd occasionally get rs_more set in RX descriptors
w/ the first frame length being very small (sub-100 bytes.) The driver
handles 2-descriptor RX frames (but not more), so this still worked; it
was just odd.

This is suboptimal and may benefit from being replaced with caching
the m_pkthdr_len and m_len fields, then restoring them after completion.
2011-10-28 15:44:09 +00:00
Adrian Chadd
3f3087fd02 Include opt_ah.h when compiling the driver.
There are HAL methods which are actually direct register
access, rather than simply HAL calls. Because of this, these
register accesses would use the non-debug path in ah_osdep.h
as opt_ah.h isn't included.

With this, the correct register access methods are used,
so debugging traces show things such as TXDP checking and
TSF32 access.
2011-10-18 02:43:59 +00:00
Adrian Chadd
a05e382c09 Don't enable the PHY radar errors in calcrxfilter.
That way the radar errors aren't enabled prematurely.

A DFS tester has reported that radar events are reported
during channel scanning, before DFS is actually enabled.
2011-10-17 14:17:37 +00:00
Adrian Chadd
0e0290482b Don't bother triggering the cabq queue if it's empty.
Obtained from:	Atheros
2011-09-28 03:11:51 +00:00
Adrian Chadd
7b15790a4b Fix lock order to be correcter.
Nothing else locks these two queues (cabq, avp mcastq), but it should
be consistent and correct.
2011-09-28 03:07:51 +00:00
Adrian Chadd
fc4de9b7fc Update the TSF and next-TBTT methods to work for the AR5416 and later NICs.
This is another commit in a series of TDMA support fixes for the 11n NICs.

* Move ath_hal_getnexttbtt() into the HAL; write methods for it.
  This returns a timer value in TSF, rather than TU.

* Move ath_hal_getcca() and ath_hal_setcca() into the HAL too, where they
  likely now belong.

* Create a new HAL capability: HAL_CAP_LONG_RXDESC_TSF.
  The pre-11n NICs write 15 bit TSF snapshots into the RX descriptor;
  the AR5416 and later write 32 bit TSF snapshots into the RX descriptor.
* Use the new capability to choose between 15 and 31 bit TSF adjustment
  functions in ath_extend_tsf().

* Write ar5416GetTsf64() and ar5416SetTsf64() methods.
  ar5416GetTsf64() tries to compensate for TSF changes at the 32 bit boundary.

According to yin, this fixes the TDMA beaconing on 11n chipsets and TDMA
stations can now associate/talk, but there are still issues with traffic
stability which need to be investigated.

The ath_hal_extendtsf() function is also used in RX packet timestamping;
this may improve adhoc mode on the 11n chipsets. It also will affect the
timestamps seen in radiotap frames.

Submitted by:	Kang Yin Su <cantona@cantona.net>
Approved by:	re (kib)
2011-09-08 01:23:05 +00:00
Adrian Chadd
f3fb16875c Fix a missing initialisation of bt_flags when setting up the TDMA beacon.
The AR5212 HAL didn't check this field; timers are enabled a different
way.

The AR5416 HAL however did, and since this field was uninitialised, it had
whatever was on the stack at the time. This lead to "unpredictable"
behaviour.

This allows TDMA to work on the AR5416 and later chipsets.

Thanks to:	paradyse@gmail.com
Approved by:	re (kib, blanket)
2011-08-24 14:11:00 +00:00
Adrian Chadd
f35ec93778 Remove this call, now that I've solved the radar module problem without
needing this particular modification.

It can be called during ath_dfs_radar_enable() and still achieve the
same functionality, so I am.

Approved by:	re (kib, blanket)
2011-08-09 04:24:56 +00:00
Adrian Chadd
8db87e4079 Introduce some more DFS related hooks, inspired both by local work
and the Atheros reference code.

The radar detection code needs to know what the current DFS domain is.
Since net80211 doesn't currently know this information, it's extracted
from the HAL regulatory domain information.

The specifics:

* add a new ath_dfs API hook, ath_dfs_init_radar_filters(), which
  updates the radar filters whenever the regulatory domain changes.
* add HAL_DFS_DOMAIN which describes the currently configured DFS domain .
* add a new HAL internal variable which tracks the currently configured
  HAL DFS domain.
* add a new HAL capability, HAL_CAP_DFS_DMN, which returns the currently
  configured HAL DFS domain setting.
* update the HAL DFS domain setting whenever the channel setting is
  updated.

Since this isn't currently used by any radar code, these should all
be no-ops for existing users.

Obtained from:	Atheros
Submitted by:	KBC Networks, sibridge
Approved by:	re (kib, blanket)
2011-08-08 16:22:42 +00:00
Adrian Chadd
1fdadc0fb6 Fix a corner case in RXEOL handling which was likely introduced by yours
truly.

Before 802.11n, the RX descriptor list would employ the "self-linked tail
descriptor" trick which linked the last descriptor back to itself.
This way, the RX engine would never hit the "end" of the list and stop
processing RX (and assert RXEOL) as it never hit a descriptor whose next
pointer was 0. It would just keep overwriting the last descriptor until
the software freed up some more RX descriptors and chained them onto the
end.

For 802.11n, this needs to stop as a self-linked RX descriptor tickles the
block-ack logic into ACK'ing whatever frames are received into that
self-linked descriptor - so in very busy periods, you could end up with
A-MPDU traffic that is ACKed but never received by the 802.11 stack.
This would cause some confusion as the ADDBA windows would suddenly
be out of sync.

So when that occured here, the last descriptor would be hit and the PCU
logic would stop. It would only start again when the RX descriptor list
was updated and the PCU RX engine was re-tickled. That wasn't being done,
so RXEOL would be continuously asserted and no RX would continue.

This patch introduces a new flag - sc->sc_kickpcu - which when set,
signals the RX task to kick the PCU after its processed whatever packets
it can. This way completed packets aren't discarded.

In case some other task gets called which resets the hardware, don't
update sc->sc_imask - instead, just update the hardware interrupt mask
directly and let either ath_rx_proc() or ath_reset() restore the imask
to its former setting.

Note: this bug was only triggered when doing a whole lot of frame snooping
with serial console IO in the RX task. This would defer interrupt processing
enough to cause an RX descriptor overflow. It doesn't happen in normal
conditions.

Approved by: re (kib, blanket)
2011-08-02 02:46:03 +00:00
Adrian Chadd
73f895fcf6 Disable the RXORN/RXEOL interrupts if RXEOL occurs, preventing an
interrupt storm.

This is easily triggered by flipping on and off tcpdump -y IEEE802_11_RADIO
w/ witness enabled. This causes a whole lot of console IO and when you're
attached to a serial console (eg on my AR7161 embedded board), the RX
interrupt doesn't get called quickly enough and the RX queue fills up.

This wasn't a problem in the past because of the self-linked RX descriptor
trick - the RX would never hit the "end" of the RX descriptor list.
However this isn't possible for 802.11n (see previous commit history for
why.)

Both Linux ath9k and the Atheros reference driver code do this; I'm just
looking now for where they then restart the PCU receive. Right now the RX
will just stop until the interface is reset.

Obtained from:	Linux, Atheros
Approved by:	re (kib)
2011-07-31 16:16:25 +00:00
Adrian Chadd
4b326fda72 Remove two debugging printf()s which snuck in during the testing of the
last commit.

Approved by:	re (kib)
Pointy-hat-to:	adrian@
2011-07-31 08:13:25 +00:00
Adrian Chadd
45abcd6c9c Implement the 4KB split transaction workaround for Merlin (AR9280).
The AR9280 apparently has an issue with descriptors which straddle a page
boundary (4k). I'm not yet sure whether I should use PAGE_SIZE in the
calculations or whether I should use 4096; the reference code uses 4096.

This patch fiddles with descriptor allocation so a descriptor entry
doesn't straddle a 4kb address boundary. The descriptor memory allocation
is made larger to contain extra descriptors and then the descriptor
address is advanced to the next 4kb boundary where needed.

I've tested this both on Merlin (AR9280) and non-Merlin (in this case,
AR9160.)

Obtained from:	Linux, Atheros
Approved by:	re (kib)
2011-07-31 08:01:41 +00:00
Adrian Chadd
f51c84eaf6 This links in the ath dfs ioctl into the driver and defines the
ioctl interface for DFS modules to use.

Since there's no open source dfs code yet, this doesn't introduce any
operational changes.

Approved by:	re (kib)
2011-07-21 14:25:12 +00:00
Adrian Chadd
f52efb6d38 Fix a corner case in STA beacon processing when a CSA is received but
the AP doesn't transmit beacons.

If the AP requests a CSA (ie, a channel switch) and then enters CAC
(channel availability check) for 60 seconds, it doesn't send beacons
and it just listens for radar events (and other things which we don't
do yet.)

Now, ath_newstate() was not resetting the beacon timer config on
a transition to the RUN state when in STA mode - it was setting
sc_syncbeacon, which simply updates the beacon config from the
contents of the next received beacon.

This means the STA never generates beacon miss events.

If the AP goes into CAC for 60 seconds and recovers, the STA will
happily receive the first beacon and reconfigure timers.
But if it gets a radar event after that, it'll change channel
again, not notify the station that it's changed channel..
and since the station is happily waiting for the first beacon
to configure the beacon timer details from, it won't ever
generate a beacon miss interrupt and it'll sit there forever
(or until the AP appears on that channel once again.)

This change forces the last known beacon timer config to be
written to hardware on a transition from CSA->RUN in STA mode.
This forces bmiss events to occur and the STA will eventually
(after a handful of beacon miss events) begin scanning for
another access point.
2011-06-29 13:21:52 +00:00
Adrian Chadd
2fd9aabb1b Fix beacon transmission after a channel set.
The DFS code was tickling the channel set directly whilst going
through the state RUN -> CSA -> RUN. This only changed the channel;
it didn't go via ath_reset(). However in this driver, a channel
change always causes a chip reset, which resets the beacon timer
configuration and interrupt setup. This meant that data would go
out but as the beacon timers never fired, beacons would never
be queued.

The confusing part is that sometimes the state transition was
RUN -> SCAN -> CAC -> RUN (with CSA being in there sometimes);
going via SCAN would clear sc_beacons and thus the transition
to RUN would reprogram beacon transmission.

In case someone tries debugging why suspending a device currently
beaconing (versus just RX'ing beacons which is what occurs in STA
mode), add a silly comment which should hopefully land them at
this commit message. The call to ath_hal_reset() will be clearing
the beacon config and it may not be always reset.
2011-06-26 13:53:24 +00:00
Adrian Chadd
10dc8de41b Add ATH_ENABLE_DFS which enables the DFS flag so the DFS code
can be tested.

This doesn't at all actually do radar detection! It's just
so developers who wish to test the net80211 DFS code can easily
do so. Without this flag, the DFS channels are never marked
DFS and thus the DFS stuff doesn't run.
2011-06-26 13:43:15 +00:00
Adrian Chadd
37931a3544 Break out most of the HAL related tweaks into a per-HAL instance,
rather than global variables.

This specifically allows for debugging to be enabled per-NIC, rather
than globally.

Since the ath driver doesn't know about AH_DEBUG, and to keep the ABI
consistent regardless of whether AH_DEBUG is enabled or not, enable the
debug parameter always but only conditionally compile in the debug
methods if needed.

The ALQ support is currently still global pending some brainstorming.

Submitted by:	ssgriffonuser@gmail.com
Reviewed by:	adrian, bschmidt
2011-06-23 02:38:36 +00:00
Adrian Chadd
373815ef7b Add a missing call to sync the DMAed buffer before the radar event data is extracted. 2011-06-05 03:33:46 +00:00
Adrian Chadd
7e5eb44d14 A few changes to make radar detection implementable in a hal_dfs/
module.

* If sc->sc_dodfs is set to 1 by the ath_dfs_radar_enable(),
  set the relevant rx filter bit to begin receiving radar PHY
  errors. The HAL code already knows how to set the relevant
  error mask register to enable radar events.

* Add a missing call to ath_dfs_radar_enable() after ath_hal_reset()

* change ath_dfs_process_phyerr() to take a const char *buf for now,
  rather than a descriptor. This way it can get access to the packet
  buffer contents.
2011-06-04 04:14:59 +00:00
Adrian Chadd
48237774e4 Flesh out the radar detection related operations for the ath driver.
This is in no way a complete DFS/radar detection implementation!
It merely creates an abstracted interface which allows for future
development of the DFS radar detection code.

Note: Net80211 already handles the bulk of the DFS machinery,
all we need to do here is figure out that a radar event has occured
and inform it as such. It then drives the DFS state engine for us.

The "null" DFS radar detection module is included by default;
it doesn't require a device line.

This commit:

* Adds a simple abstracted layer for radar detection state -
  sys/dev/ath/ath_dfs/;
* Implements a null DFS module which doesn't do anything;
  (ie, implements the exact behaviour at the moment);
* Adds hooks to the ath driver to process received radar events
  and gives the DFS module a chance to determine whether
  a radar has been detected.

Obtained from:	Atheros
2011-06-01 20:09:49 +00:00
Adrian Chadd
9be25f4a3a Set default A-MPDU density/size. 2011-05-30 14:57:00 +00:00
Adrian Chadd
76355edba5 Teach if_ath about devices which have short-GI in 20MHz channel modes.
This has been disabled until now because there hasn't been any supported
device which has this feature. Since the AR9287 is the first device to
support it, and since now the HAL has functional AR9287+11n support,
flip this on.
2011-05-29 00:17:13 +00:00
Adrian Chadd
e9d1191f6a * Add some more TX descriptor error counters; this'll be helpful when
implementing TX aggregation
* Whilst I'm there, comment some RX error counters
2011-05-15 15:54:34 +00:00
Adrian Chadd
dce0bcca8a Now that the devices with functioning ps-poll hardware support have
been enumerated (merlin and later), flick this on.
2011-05-12 14:03:29 +00:00
Adrian Chadd
e57539af23 Cosmetic changes to fit 80 character screen width. 2011-04-29 16:43:30 +00:00
Adrian Chadd
6f5fe81e02 Fix a corner-case of interrupt handling which resulted in potentially
spurious (and fatal) interrupt errors.

One user reported seeing this:

Apr 22 18:04:24 ceres kernel: ar5416GetPendingInterrupts: fatal error,
  ISR_RAC 0x0 SYNC_CAUSE 0x2000

SYNC_CAUSE of 0x2000 is AR_INTR_SYNC_LOCAL_TIMEOUT which is a bus timeout;
this shouldn't cause HAL_INT_FATAL to be set.

After checking out ath9k, ath9k_ar9002_hw_get_isr() clears (*masked)
before continuing, regardless of whether any bits in the ISR registers
are set. So if AR_INTR_SYNC_CAUSE is set to something that isn't
treated as fatal, and AR_ISR isn't read or is read and is 0, then
(*masked) wouldn't be cleared. Thus any of the existing bits set
that were passed in would be preserved in the output.

The caller in if_ath - ath_intr() - wasn't setting the masked value
to 0 before calling ath_hal_getisr(), so anything that was present
in that uninitialised variable would be preserved in the case above
of AR_ISR=0, AR_INTR_SYNC_CAUSE != 0; and if the HAL_INT_FATAL bit
was set, a fatal condition would be interpreted and the chip was
reset.

This patch does the following:

* ath_intr() - set masked to 0 before calling ath_hal_getisr();
* ar5416GetPendingInterrupts() - clear (*masked) before processing
  continues; so if the interrupt source is AR_INTR_SYNC_CAUSE
  and it isn't fatal, the hardware isn't reset via returning
  HAL_INT_FATAL.

This doesn't fix any underlying errors which trigger
AR_INTR_SYNC_LOCAL_TIMEOUT - which is a bus timeout of some
sort - so that likely should be further investigated.
2011-04-23 06:37:09 +00:00