* 286410
* 286413
* 286416
The initial commit broke a variety of debugging options and features that
aren't in the GENERIC kernels but are enabled on other platforms.
with the net80211 stack.
Historical background: originally wireless devices created an interface,
just like Ethernet devices do. The name of the interface matched the name of
the driver that created it. Later, the wlan(4) layer was introduced, and the
wlanX interfaces became the actual interfaces, leaving the original ones as
"a parent interface" of wlanX. Kernel-wise, the KPI between the net80211 layer
and a driver became a mix of methods that pass a pointer to struct ifnet
as the identifier and methods that pass a pointer to struct ieee80211com. From
the user's point of view, the parent interface just hangs around in the
ifconfig list, and the user can't do anything useful with it.
Now, the struct ifnet goes away. The struct ieee80211com is the only
KPI between a device driver and net80211. Details (roughly sketched in
code after this list):
- The struct ieee80211com is embedded into the driver's softc.
- Packets are sent via the new ic_transmit method, which is very much like
the previous if_transmit.
- Bringing the parent up/down is done via the new ic_parent method, which
notifies the driver about any changes: the number of wlan(4) interfaces, and
the number of them in promiscuous or allmulti state.
- Device-specific ioctls (if any) are received via the new ic_ioctl method.
- Packet and error accounting is done by the stack. In certain cases, when
the driver experiences errors and cannot attribute them to any specific
interface, it updates the ic_oerrors or ic_ierrors counters.
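To illustrate the shape of the new KPI, here is a minimal sketch of the
attach-time wiring for a hypothetical foo(4) driver. The ic_* method names
come from the list above; the exact signatures and helper calls should be
double-checked against sys/net80211/ieee80211_var.h:

struct foo_softc {
    struct ieee80211com sc_ic;   /* net80211 state, embedded in the softc */
    /* ... device specific state ... */
};

static int  foo_transmit(struct ieee80211com *, struct mbuf *);
static void foo_parent(struct ieee80211com *);
static int  foo_ioctl(struct ieee80211com *, u_long, void *);

static int
foo_attach(device_t dev, struct foo_softc *sc)
{
    struct ieee80211com *ic = &sc->sc_ic;

    ic->ic_softc = sc;
    ic->ic_name = device_get_nameunit(dev);
    /* ... fill in capabilities, channels, MAC address ... */
    ieee80211_ifattach(ic);

    ic->ic_transmit = foo_transmit;   /* replaces if_transmit */
    ic->ic_parent = foo_parent;       /* up/down, promisc/allmulti changes */
    ic->ic_ioctl = foo_ioctl;         /* device specific ioctls, if any */
    return (0);
}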
Details on interface configuration with the new world order:
- The sequence of commands needed to bring up wireless DOESN'T change.
- /etc/rc.conf parameters DON'T change.
- The list of devices that can be used to create wlan(4) interfaces is
now provided by the net.wlan.devices sysctl (see the sketch below).
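For example, the device list can also be read programmatically with
sysctlbyname(3); a small userland snippet (illustrative only, not part of
this change):

#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
    char *buf;
    size_t len;

    /* First call reports the required buffer size. */
    if (sysctlbyname("net.wlan.devices", NULL, &len, NULL, 0) == -1)
        return (1);
    if ((buf = malloc(len)) == NULL ||
        sysctlbyname("net.wlan.devices", buf, &len, NULL, 0) == -1)
        return (1);
    printf("%s\n", buf);    /* prints something like "iwn0 ath0" */
    free(buf);
    return (0);
}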
Most drivers in this change were converted by me, except for wpi(4),
which was converted by Andriy Voskoboinyk. Big thanks to Kevin Lo for testing
changes to at least 8 drivers. Thanks to Olivier Cochard, gjb@, mmoll@,
op@ and lev@, who also participated in testing. Details here:
https://wiki.freebsd.org/projects/ifnet/net80211
Still, the ndis, wtap, mwl, ipw, bwn, wi, upgt and uath drivers were not
tested. Changes to mwl, ipw, bwn, wi and upgt are trivial, and the chances
of problems are low. The wtap wasn't compilable even before this change.
But the ndis driver is complex, and it is likely to be broken by this
commit. Help with testing and debugging it is appreciated.
Differential Revision: D2655, D2740
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
The 6205 (Taylor Peak) in the Lenovo X230 works fine in 5GHz 11a and 11n HT20,
but not 11n HT40. The NIC goes RX deaf the moment HT40 is configured.
It's so RX deaf that it doesn't even hear beacons and the firmware sends
"BEACON MISS" events. That's pretty deaf.
I tried configuring the HT40 flags in monitor mode and it worked - so
I assumed that doing the transition from a 20MHz to a 40MHz channel
configuration when going auth->assoc (i.e., after the NIC has been partially
configured) is the problem.
So for now, let's just always set them if they're available.
Tested:
* Intel 5300, STA mode, 5GHz HT/40 AP; 2GHz HT/20 AP
* Intel 6205, STA mode, 5GHz HT/40, HT20, 11a AP; 2GHz HT/20 AP
This was pointed out to me by coworkers trying to use FreeBSD-HEAD
in the office on their Thinkpad T420p laptops.
TODO:
* I don't like how the HT40 flags are configured - the whole interop/
protection config should be re-checked. Notably, I think curhtprotmode
is 0 in a lot of cases, which means "no interoperability", and I think
that's busted.
Sponsored by: Norse Corp, Inc.
The iwn(4) firmware forgets most of its channel state after an RXON
command. This means that any beacons it has seen on passive 5GHz channels
are forgotten upon an association/authorisation request.
This unfortunately means that 5GHz association almost always fails -
the assoc and/or auth frames are dropped with a status of "passive
channel, haven't seen a beacon yet" (0x90).
So:
* add an xmit queue, global, to buffer frames
* modify the xmit path to use the mbuf tag from net80211
to specify raw frame details
* buffer xmit frames from both raw and non-raw paths
* if a beacon is seen in the RX path, schedule a taskqueue to
send said frames and un-buffer things.
* flush frames during state change back to INIT, or NIC
down/up/detach.
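A rough sketch of the buffering described above, using the stock mbufq(9)
and taskqueue(9) KPIs (the sc_beacon_wait / sc_xmit_queue / sc_xmit_task
members are illustrative names, not necessarily the driver's exact ones):

/* TX path (both raw and non-raw): buffer while waiting for a beacon. */
static int
iwn_xmit_queue_frame(struct iwn_softc *sc, struct mbuf *m)
{
    if (mbufq_enqueue(&sc->sc_xmit_queue, m) != 0) {
        m_freem(m);
        return (ENOBUFS);
    }
    return (0);
}

/* RX path: a beacon arrived on the passive channel - drain via taskqueue. */
static void
iwn_rx_saw_beacon(struct iwn_softc *sc)
{
    if (sc->sc_beacon_wait) {
        sc->sc_beacon_wait = 0;
        taskqueue_enqueue(sc->sc_tq, &sc->sc_xmit_task);
    }
}

/* State change back to INIT, or NIC down/up/detach: flush what's queued. */
static void
iwn_xmit_queue_drain(struct iwn_softc *sc)
{
    /* (node references held by queued frames also need to be released) */
    mbufq_drain(&sc->sc_xmit_queue);
}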
This isn't the final shape I'd like this to be in but it certainly
is better than 5GHz "not working at all".
Tested:
* Intel 5100, STA mode (before spilling coffee)
* Intel 5300, STA mode (after spilling coffee)
Story:
* This has been bugging me at work for months, which I just
worked around by throwing an ath(4) into my Lenovo T400 cardbus
slot.
* Our ops director discovered that FreeBSD indeed runs well on the
Lenovo T420p, except for that pesky 5GHz thing. So now developers
also can have a T420p running FreeBSD to do work with.
Their #1 feedback to me - "boy it'd be nice if 5GHz wifi worked."
* .. then, I was at NANOG but stuck with 5GHz only wifi and no ath(4)
NIC to put in a laptop - and I snapped.
Thus, the reason this is actually work related.
MFC after: 2 weeks
Sponsored by: Norse Corp, Inc.
stability.
I'll re-enable it once the scan overhaul is done - the NIC itself
can do bgscan, but not how we're doing it.
MFC after: 2 weeks
Sponsored by: Norse Corp, Inc.
* simplify channel logic for determining RF gain setting in scan setup
* don't set TX timer on error
* free node references for unsent frames on device stop
* set maxfrags to IWN_MAX_SCATTER-1 (first segment is used by TX command)
* add missing IWN_UNLOCK() from interrupt path when the hardware
disappears.
* pass control frames to host
* nitems() instead of local macro
Tested:
* Intel 5100, STA mode
PR: kern/196264
Submitted by: Andriy Voskoboinyk <s3erios@gmail.com>
The rules turn out to be:
* for non-aggregation session TX queues - it's either sent or not sent.
* for aggregation session TX queues - if nframes=1, then the status reflects
the completed transmission.
* however, for nframes > 1, this is just a status reflecting what
the initial transmission did. The compressed BA (immediate or delayed)
may not have yet been received, so the actual frame status is in the
compressed BA updates.
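As pseudocode, a TX-done handler following these rules might look like this
(handle_tx_status, rate_control_complete and record_pending_agg_status are
made-up names, purely to show the branching):

static void
handle_tx_status(bool agg_queue, int nframes, int tx_status)
{
    if (!agg_queue || nframes == 1) {
        /*
         * Non-aggregation queue, or a single frame on an aggregation
         * queue: the status is final - the frame was sent or it wasn't.
         */
        rate_control_complete(tx_status);
    } else {
        /*
         * nframes > 1: this status only reflects what the initial
         * transmission did.  The real per-frame outcome arrives later
         * in the compressed BA (immediate or delayed), so defer the
         * accounting until that notification is processed.
         */
        record_pending_agg_status(tx_status);
    }
}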
Whilst here, I fiddled with debugging and formatting a bit.
There are also RTS attempts (what the Atheros chips call "short retries")
which weren't being logged, and they aren't yet being used in the rate
control statistics updates. For now, at least log them.
TODO:
* This still isn't 100% correct! So I have to tinker with this some more.
(The failures aren't always failures..)
* Extend the rate control API in net80211 so it can take both short and
long retry counts.
Tested:
* Intel 5100, STA mode
The (eventual) intention is to create MIB counters for transmitted
frame completion to count how many packets with each status are
transmitted.
Note the difference between A-MPDU and non A-MPDU status.
Obtained from: Linux iwlwifi/dvm driver
but has some retries.
Without this, single frame transmission in AMPDU will always look like
it succeeded fine, and thus AMRR will think it's totally fine to just
keep upping the rate upwards.
Now, this is still not quite right! For multi-frame aggregates the
completion happens in two parts - the TX done and the BA received.
The driver is currently double accounting those a little - there's no
way to say to the rate control code "I completed X frames, Y worked fine,
there were Z retries." And it's a bit odd with iwn, as the firmware
retransmits frames for us so we don't get to see how many retransmits
happened; only that it took longer than normal. I may have to extend
the rate control API to properly track that.
So this may keep the rate lower than it should be, but that's better
than keeping it higher than it should be.
Tested:
* 5100, STA mode
in preparation for the 5300 3x3 NIC.
During this particular adventure, I did indeed discover that a whole
swath of things made little to no sense.
Those included, and are fixed here:
* A lot of the antenna configuration bits assume the NIC has two receive
chains. That's blatantly untrue for NICs that don't.
* There was some disconnect between the antenna configuration used when
forming a PLCP rate DWORD (which includes the transmit antenna
configuration) and the link quality antenna configuration.
So now there are helper functions to return which antenna configuration
to use, and those are used wherever an antenna config is required.
* The 5300 does up to three stream TX/RX (so MCS0->23), however
the link quality table has only 16 slots. This means all of the
rate entries are .. well, dual-stream rates. If this is the case,
the "last MIMO" parameter can't be 16 or it panics the firmware.
Set it to 15.
* .. and since yes, it has 16 slots, it would only try retransmitting
from MCS8->MCS23, which can be quite .. terrible. Hard-code the last
two retry slots to be the lowest configured rate.
* I noticed some transmit configuration command stuff is different
based on firmware API version, so I lifted that code from Linux.
* Add / augment some more logging to make it easier to capture this
stuff.
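A sketch of the link quality fix described above; the structure and field
names here are hypothetical (the real command layout lives in the driver's
register definitions), and it only shows the two rules - cap "last MIMO" at
15 and pin the final two retry slots:

#define LQ_SLOTS 16    /* retry slots in the link quality table */

for (i = 0; i < LQ_SLOTS; i++)
    linkq.retry_rate[i] = pick_retry_rate(ni, i);

/* The "last MIMO" index must not be 16 (== LQ_SLOTS) or the firmware panics. */
linkq.mimo_delimiter = MIN(last_mimo_index, LQ_SLOTS - 1);

/* Hard-code the final two retry slots to the lowest configured rate. */
linkq.retry_rate[LQ_SLOTS - 2] = lowest_rate_plcp;
linkq.retry_rate[LQ_SLOTS - 1] = lowest_rate_plcp;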
Now, 3x3 is still terrible because the link quality configuration is
plainly not good enough. I'll have to think about that.
However, the original goal of this - 3x3 operation on the Intel
5300 NIC - actually worked.
There are also rate control bugs in the way this driver handles
notifying the net80211 rate control code when AMPDU is enabled.
It always steps the rate up to the maximum rate possible - and
this eventually ends in much sadness. I'll fix that later.
As a side note - 2GHz HT40 now works on all the NICs I have tested.
As a second side note - this exposed some bad 3x3 behaviour in
the ath(4) rate control code where it starts off at a 3-stream rate
and doesn't downgrade quickly enough. This makes the initial
dhcp exchange take a long time. I'll fix the ath(4) rate code
to start at a low fixed 1x1 MCS rate and step up if everything
works out.
Tested:
* Intel 2200
* Intel 2230
* Intel 5300
* Intel 5100
* Intel 6205
* Intel 100
TODO:
* Test the other NICs more thoroughly!
Thank you to Michael Kosarev <russiane39@gmail.com> for donating the
Intel 5300 NIC and pestering me about it since last year to try and
make it all work.
"fatal firmware error" happens. Previously it was neccessary to reset
it manually, using "/etc/rc.d/netif restart".
Approved by: adrian@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
"fatal firmware error" happens. Previously it was neccessary to reset
it manually, using "/etc/rc.d/netif restart".
Approved by: adrian@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Centrino 2230 firmware.
This fixes the general statistics block to be actually valid.
I've verified this by contrasting the output of iwnstats before and
after the change. The general block is now correct.
Tested:
* Intel 5100 (old format stats message)
* Intel 2230 (new format stats message)
It turns out that there's a variant format of the RX statistics notification
from the Intel firmware. It's even more whacked - the non-BT variant has
bluetooth fields; apparently some later NICs return even _more_ bluetooth
related fields.
I'll commit the statistics structure changes here - it's a no-op for the
driver. I'll later teach the driver code to populate a statistics structure
from the received message after reformatting things correctly.
I don't _think_ it's going to fix anything related to sensitivity programming
as the CCK/OFDM (non-11n) fields are in the same place for both formats.
But the HT structure and the general statistics aren't in the same place.
I'll go find some NIC(s) that spit out the other format and when I find one,
I'll go and update the driver to handle things correctly.
Tested:
* Intel 5100 (which returns the legacy, non-BT format)
Obtained from: Linux iwlwifi
The origin of WEP comes from IEEE Std 802.11-1997, where it defines
whether the frame body of a MAC frame has been encrypted using the WEP
algorithm or not.
IEEE Std 802.11-2007 renames WEP to Protected Frame, which indicates
whether the frame is protected by a cryptographic encapsulation
algorithm.
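In net80211 terms this is the same bit in the frame control field, now
commonly spelled IEEE80211_FC1_PROTECTED; a typical check (illustrative
snippet only):

struct ieee80211_frame *wh = mtod(m, struct ieee80211_frame *);

/* "Protected Frame" (formerly "WEP") bit in the second frame control byte. */
if (wh->i_fc[1] & IEEE80211_FC1_PROTECTED) {
    /* frame body is protected by a cryptographic encapsulation algorithm */
}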
Reviewed by: adrian, rpaulo
- Simplify MSI allocation and release. For a single MSI, we don't need to
fiddle with the MSI count, and pci_release_msi(9) is smart enough to just
do nothing in case of INTx.
- Don't allocate MSI as RF_SHAREABLE.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
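The resulting single-MSI pattern looks roughly like this (a generic sketch,
not the driver's verbatim code):

/* Attach: try for a single MSI, fall back to INTx. */
int rid, count;

count = 1;
rid = 0;                              /* rid 0 = legacy INTx */
if (pci_alloc_msi(dev, &count) == 0)
    rid = 1;                          /* rid 1 = the single MSI */

/* MSI must not be shareable; INTx typically is. */
sc->irq = bus_alloc_resource_any(dev, SYS_RES_IRQ, &rid,
    RF_ACTIVE | (rid == 0 ? RF_SHAREABLE : 0));

/* Detach: safe to call unconditionally; it's a no-op for plain INTx. */
pci_release_msi(dev);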
MFC after: 1 week
This chip doesn't require the temperature sensor offset, either v1 or
v2. Sending it anyway causes the initial calibration test to fail.
Tested:
* Intel Centrino 6150
Make the scan state optional - we'll obviously need a vap, but we now
won't require the scan state. The only thing the scan state is needed
for is to check for the list of SSIDs to scan, which we can now
just plain ignore by passing in NULL as the scan state pointer.
Tested:
* Intel 5100 (STA)
This is in preparation for being able to use iwn_scan() to do an off
channel scan to reset the RF tuning.
It should be a no-op.
Tested:
* Intel 5100 (STA)
in preparation for the scan based retune logic.
The linux iwlwifi driver does a rescan (onto a non-active channel)
to force an RF retune when the PLCP error rates exceed a certain threshold.
* Add code to track HT PLCP rate errors;
* Separate out the PLCP error count fetch and update so the delta
can be used when checking for PLCP error rates;
* Implement the PLCP error logic from iwlwifi;
* For now, just print out whenever the error rate exceeds the
threshold.
The actual scan based retune will take a bit more effort; the scan
command code right now assumes that a scan state is passed in.
This does need to change to be more flexible (both for this and
in preparation for scanning multiple channels at once.)
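A rough sketch of the delta/threshold check (the stats field names and
plcp_err_threshold are illustrative; the actual thresholds are per-chip
values taken from iwlwifi):

/* How many PLCP errors occurred since the previous statistics update? */
plcp_delta = (cur->ofdm_plcp_err + cur->cck_plcp_err + cur->ht_plcp_err) -
    (last->ofdm_plcp_err + last->cck_plcp_err + last->ht_plcp_err);
msecs = cur_time_ms - last_time_ms;

/*
 * iwlwifi expresses the threshold as errors per 100ms and scales it to
 * the measurement interval; exceeding it means the RF front-end likely
 * needs a retune (eventually: an off-channel scan).
 */
if (plcp_delta > 0 && (plcp_delta * 100) > (msecs * plcp_err_threshold))
    printf("%s: PLCP error rate exceeded threshold\n", __func__);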
Tested:
* 5100 (STA mode)
* 2200 (STA mode)
* 2230 (STA mode)
working on some RF tuning issues.
The linux iwlwifi driver has these thresholds which they use to see
if there are PLCP errors over a certain interval. If they hit this,
they trigger a single-channel (different from active channels!)
scan to retune the RF front-end.
for these chipsets.
* Correctly set the active/passive flag in the scan request - this is
NOT an "is the channel active or passive" flag; it's to do with whether we
have an SSID to actively scan for or not. The firmware takes care
of the active/passive setup of the channel.
* Calculate the active/passive dwell time based on the beacon interval
and the channel mode, rather than using a hard coded value.
* For now, hardcode the scan service_time. It's defined as:
bits 31:22 - the number of beacon intervals to come back onto the home
channel for;
bits 21:0 - the time (in microseconds) to come back onto the home channel for.
When doing an active scan while the NIC is active (whether we're associated
or not - it only matters whether we've set the NIC up with a destination or
not), this determines how much time to stay on the home channel for when
scanning. We can tune this based on the amount of active traffic.
For now it's 4 beacon intervals and 100 microseconds (see the sketch after
this list).
* Fix the "good crc threshold" setting. It differs based on the NIC
firmware. Some older firmware required a workaround; the later
firmware instead treats the field as a flag.
* Enforce that we are not sending a scan command if one is already
pending. Any time this is done is a bug and it absolutely needs
to be fixed - so be very loud.
* Add the SCAN flag to a few debug messages that are scan related but
were previously only logged under STATE.
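A sketch of the service_time packing, matching the bit layout above (the
macro and the field it is written to are hypothetical names):

/* bits 31:22 = beacon intervals on the home channel, bits 21:0 = microseconds */
#define IWN_SCAN_SERVICE_TIME(bintvals, usecs) \
    (((uint32_t)(bintvals) << 22) | ((usecs) & 0x3fffff))

/* The hard-coded value for now: 4 beacon intervals, 100 microseconds. */
scan_hdr->service_time = htole32(IWN_SCAN_SERVICE_TIME(4, 100));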
Now, this does get noisy when you're scanning in an actively busy 2GHz
network, as the firmware (for reasons I don't quite yet understand) seems
hell-bent on staying on some passive channels longer than it should.
However, it should eventually recover and complete the scan.
This is a work in progress; please let me know if things get stuck or
if things improve!
Tested:
* intel centrino 2200
* intel centrino 2230
* intel 6200
* intel 5100
* intel 4965 (gets upset, but that's a known issue)
Obtained from: linux iwlwifi
TX ring according to what the firmware requires.
The firmware requires A-MPDU sub-frames to be at a very specific ring
offset - that is, the ring slot offset should be (seqno % 256).
This holds for every NIC I've tested thus far except the 4965,
which starts erroring out here shortly before the firmware panics.
Which is good, it's doing what it's supposed to (read: capture that
we've screwed up somewhere.)
The specifics about getting this stuff right:
* the initial seqno allocation should match up with the ringid.
* .. yes, this means we can start at a ring offset that isn't zero.
* .. because we program the start seqno in the firmware message
to setup the AC.
* The initial seqno allocation may be non-zero _and_ frames may be
being transmitted during a-mpdu negotiation. I faced similar
issues on ath(4) and had to software queue frames to that node+TID
during A-MPDU negotiation.
* seqno allocation should be in lockstep with ring increments.
* If we fail to transmit some segment, no, we shouldn't reuse that
ring slot. We should just transmit a BAR (which we aren't yet
doing, sigh) and move onto the next seqno.
* In theory there shouldn't be any holes in the seqno space when
we are transmitting frames.
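A sketch of the slot choice this implies (simplified; IWN_TX_RING_COUNT is
the 256-entry TX ring, and the surrounding locking/bookkeeping is omitted):

/*
 * For frames in an aggregation session the firmware wants the frame with
 * sequence number 'seqno' to sit in ring slot (seqno % 256), so the ring
 * index has to stay in lockstep with the 802.11 sequence space.
 */
seqno = ni->ni_txseqs[tid];
ring->cur = seqno % IWN_TX_RING_COUNT;      /* 256 slots */

/* ... build and post the TX descriptor at ring->cur ... */

/* Only advance the sequence number once the frame is actually queued. */
ni->ni_txseqs[tid]++;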
Tested:
* 4965 (throws problems, so yes we have to fix this);
* 5100 (seems ok);
* 6200 (seems ok);
* 2200 (seems ok);
* 2230 (seems ok).
shifts into the sign bit. Instead use (1U << 31), which gets the
expected result.
This fix is not ideal as it assumes a 32-bit int, but it does fix the issue
for most cases.
A similar change was made in OpenBSD.
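For illustration (assuming a 32-bit int):

uint32_t bad  = (1  << 31);   /* shifts a signed 1 into the sign bit: undefined behaviour */
uint32_t good = (1U << 31);   /* unsigned shift: well defined, yields 0x80000000 */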
Discussed with: -arch, rdivacky
Reviewed by: cperciva
This field needs to be (a) set, and (b) greater than the other timeouts
(passive, active, maxquiet, etc.) It also is in microseconds, not
milliseconds.
I hope this will fix the scan hangs that people are seeing.
Obtained from: Linux iwlwifi
The previous code simply hard-coded IWN_ANT_AB, which is only correct for
some of the NICs.
Now, if the NIC is a 1-stream TX device, you need to set IWN_ANT_AB and _not_
just a single antenna - otherwise the Intel 5100 firmware panics the moment
the link quality table is updated.
So!
* no secondary antenna? Set it to IWN_ANT_AB;
* two-stream device? Transmit on the full transmit antenna configuration.
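Roughly, the selection boils down to this (sc->ntxchains and sc->txchainmask
are used here for illustration and may not match the exact softc field names):

uint8_t txant;

if (sc->ntxchains == 1) {
    /*
     * 1-stream TX NICs (e.g. the 5100) want both A and B set here,
     * not a single antenna, or the firmware panics when the link
     * quality table is updated.
     */
    txant = IWN_ANT_AB;
} else {
    /* Two-stream devices transmit on the full TX antenna configuration. */
    txant = sc->txchainmask;
}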
Tested:
* Intel 5100, STA
* Intel 2200 (eadler)
Obtained from: Linux iwlwifi
support.
* Extend the hardware base_params structure to include a bunch of hardware
flags indicating what is and isn't supported.
* Convert a bunch of the initial hardware configuration conditionals to
consult the base_params structure.
* Add new calibration code for temperature calibration for the Centrino 2xxx
series NICs.
* Add new bluetooth coexistence code for Centrino 2xxx series NICs.
* For NICs that support PAN (personal area networking), use a different
transmit queue and command queue setup, in preparation for said
PAN support.
* Extend the calibration array in iwn_softc to include enough space for
the new calibration types.
Tested (by myself, unless otherwise mentioned):
* Intel 4965
* Intel 5100
* Intel 6150
* Intel 2230
* Intel 2200 (eadler)
* Intel 1030
* Intel 6200
* Intel 6230
* Intel 6250
* Intel 6150
* Intel 100
What doesn't work:
* Intel 6235 - fails in calibration at startup
TODO:
* Testing on Intel 53xx series hardware
Submitted by: Cedric Gross <cg@cgross.info>
This is a terrible solution that at least behaves mostly correctly.
It walks the currently active rate table looking for rates to match.
It assumes that the code matches the setup path in the link quality
setup code (much like the previous, much simpler but even more hackish
math did.)
It's O(n), but n<15, so we're okay for the time being.
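For what it's worth, such a walk is just a linear match over the active rate
set; a sketch (rate_to_plcp is a made-up helper standing in for the driver's
own conversion):

static int
plcp_to_rate_index(const struct ieee80211_rateset *rs, uint8_t plcp)
{
    int i;

    /* At most ~15 entries, so a linear scan is fine for now. */
    for (i = 0; i < rs->rs_nrates; i++) {
        if (rate_to_plcp(rs->rs_rates[i] & IEEE80211_RATE_VAL) == plcp)
            return (i);
    }
    return (0);    /* fall back to the lowest rate */
}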
Tested:
* Intel 5100, STA - 11a, 11n, 11bg modes.
(which is a 1x2 device) panics the firmware.
But, for some 6xxx devices that require IWN_ANT_BC for the TX chainmask,
the link quality entries need to represent _that_.
So, revert this for now until I can figure out what is supposed to be
going on.
NIC and pushed up to the driver. Unfortunately this means there's
no rate control notification done. Thus, if the rate control code
makes a decision that hits a crappy rate that can't succeed, the
rate code would never lower the rate and packet loss would continue.
So, fake some rate control notification in this case.
Without this, a far away station with low signal strength would
associate using the management rate (by default the lowest rate)
and then the EAPOL frames would go out at the current AMRR best
guess. This would result in association failing authentication.
Tested:
* Intel 5100, STA
* Intel 2230, STA
This is a no-op for now!
* Add a new flag value for "there are no extra bits" for some random
field;
* Add a definition for the maximum number of calibration entries in
the calibration data cache in iwn_softc. It's not yet used.
* Add regulatory bands for the 2030 NIC.
Submitted by: Cedric Gross <cg@cgross.info>
the rate is 11n, rather than whether the channel is 11n.
This correctly allows the PLCP lookup code to return the legacy rates
even on an 11n channel.
PR: kern/183430
to this event, adding if_var.h to files that do need it. Also, explicitly
include headers that previously were pulled in implicitly via if_var.h
pollution.
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
This is "STA invalid". I saw it during some 4965 testing (kern/183260)
and I still have no idea what is causing it.
Obtained from: Linux drivers/net/wireless/iwlegacy
Some firmware versions seem to get very unhappy if they're sent btcoex
commands when they don't actually have bluetooth hardware in them.
So, disable sending them those commands.
Tested:
* 5100 (which has bluetooth, no problems)
* 4965 (which doesn't have bluetooth, but didn't seem to crash)
* 6200 (no bluetooth, seems to get unhappy being sent bluetooth commands.)
index lookups.
* My recent(ish) change to iwn(4) and the net80211 rate control API to
support 11n rates broke the link quality table use. So, until I or
someone else decides to fix it, let's just disable it for now.
* Teach iwn_tx_data_raw() to use the iwn_rate_to_plcp() function.
* Eliminate two uses of the net80211 rate index lookup functions - they
are only for legacy rates and they're not needed here.
This fixes some invalid looking rate control TX issues that showed up
on my 4965 but it doesn't fix the two TX hangs I've noticed. Those look
like DMA related issues.
Tested:
* 4965, STA mode
* 5100, STA mode
command register. The lazy BAR allocation code in FreeBSD sometimes
disables this bit when it detects a range conflict, and will re-enable
it on demand when a driver allocates the BAR. Thus, the bit is no longer
a reliable indication of capability, and should not be checked. This
results in the elimination of a lot of code from drivers, and also gives
the opportunity to simplify a lot of drivers to use a helper API to set
the busmaster enable bit.
This change fixes some recent reports of disk controllers and their
associated drives/enclosures disappearing during boot.
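The before/after in a typical attach routine looks like this (a generic
sketch):

/* Previously many drivers did something like this at attach time: */
cmd = pci_read_config(dev, PCIR_COMMAND, 2);
if ((cmd & (PCIM_CMD_MEMEN | PCIM_CMD_BUSMASTEREN)) == 0)
    return (ENXIO);             /* unreliable with lazy BAR allocation */

/* Now it suffices to enable bus mastering via the helper and move on: */
pci_enable_busmaster(dev);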
Submitted by: jhb
Reviewed by: jfv, marius, achadd, achim
MFC after: 1 day
* Break out the single, static RX context into a pointer, and ..
* .. extend it to two RX contexts - a default and a PAN context.
Whilst here, add a few extra fields in preparation for further iwn(4)
work.
Tested:
* Intel 4965, STA mode - same level of stability
* Intel 5100, STA mode - no change
Submitted by: Cedric Gross <cg@gross.info>
I haven't yet reviewed the Intel driver(s) in more depth to see if
there are 1x1 NICs that report they support 2 transmit/receive chains..
if so then we'll have to update this.
Tested:
* Intel 4965, which is a 2x2 device with 3 RX and 2 TX chains.
PR: kern/181132
IDs for new devices.
* Add new device IDs
* Extend the ID probe code to include the newer range of bits used
by later model devices
Tested:
* Intel 5100, STA mode
TODO:
* Test on Intel 4965, just to be sure
Submitted by: Cedric GROSS <cg@gross.info>
* Add in some new register debugging under IWN_DEBUG_REGISTER
* Make IWN_DEBUG an option now for building. I'll chase this up
with a commit to 'options' soon.
Submitted by: Cedric GROSS <cg@cgross.info>
* Add 802.11n 2GHz and 5GHz tables, including legacy rates and up to
MCS23 rates (3x3).
* Populate the rate code -> rate index lookup table with MCS _and_
normal rates, but _not_ the basic rate flag. Since the basic rate flag
is the same as the MCS flag, we can only use one.
* Introduce some accessor inlines that do the PLCP and rate table
lookup/access and enforce that the basic rate bit isn't set. They're not
designed for MCS rates, so they will panic if given one.
* Start converting drivers that use the rate table stuff to use the
accessor inlines and strip the basic flag.
* Teach AMRR about basic 11n - it's still as crap for MCS as it is
being used by iwn, so it's not a step _backwards_.
* Convert iwn over to accept 11n MCS rates rather than 'translate' legacy
to MCS rates. It doesn't use a lookup table any longer; instead it's a
function which takes the current node (for HT parameters) and the
rate code, and returns the hardware PLCP code to use.
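A sketch of the kind of accessor inline described above (legacy_rate_lookup
is a made-up name; the rate-table field and the IEEE80211_RATE_VAL /
IEEE80211_RATE_MCS macros are from net80211, as I recall them):

/*
 * Look up a legacy (non-MCS) rate, stripping the basic-rate bit first.
 * The basic-rate flag (0x80) overlaps the MCS flag, so an MCS rate
 * passed here is a bug - panic loudly.
 */
static __inline int
legacy_rate_lookup(const struct ieee80211_rate_table *rt, uint8_t rate)
{
    KASSERT(!(rate & IEEE80211_RATE_MCS),
        ("MCS rate 0x%02x passed to legacy rate lookup", rate));
    return (rt->rateCodeToIndex[rate & IEEE80211_RATE_VAL]);
}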
Tested:
* ath - it's a no-op, and it works that way
* iwn - both 11n and non-11n