For ieee80211_add_channel+*() we are passing in an int flag for
ht40 and in some cases another int flag for vht80 where we'd only
need two bits really.
Convert these variables to a bitflag and fold them together into one.
This also allows for VHT160 and VHT80P80 and whatever may come to
be considered. Define the various options currently needed.
Change the drivers (rtwn and rsu) which actually set this bit to non-0.
For convenience the "1" currently used for HT40 is preserved.
Enhance getflags_5ghz() to handle the full set of VHT flags based
on the input flags from the driver.
Update the regdomain implementation as well to make use of the new
flags and deal with higher [V]HT bandwidths.
ieee80211_add_channel() specifically did not take flags so it will
not support anything beyond 20MHz channels.
Note: I am not entirely happy with the "cbw_flag[s]" name, but we
do use chan_flags elsewhere already.
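As a rough illustration of folding the two int arguments into a single
bitflag, a minimal sketch could look like the following. All SKETCH_*
names, the enum and the helper are made up for this example; only "HT40
keeps the value 1" is taken from the text above, the other values are
assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag names; only HT40 == 1 is taken from the commit text. */
#define	SKETCH_CBW_HT40		0x01
#define	SKETCH_CBW_VHT80	0x02
#define	SKETCH_CBW_VHT160	0x04
#define	SKETCH_CBW_VHT80P80	0x08

/* Hypothetical identifiers for the channel widths the sketch emits. */
enum sketch_width { W_20, W_40, W_80, W_160, W_80P80 };

#define	SKETCH_MAX_WIDTHS	5

/*
 * Expand one bandwidth bitflag into the list of channel widths to
 * generate.  Returns the number of entries written to out[].
 */
static size_t
sketch_expand_widths(int cbw_flags, enum sketch_width out[], size_t outlen)
{
	size_t n = 0;

	assert(outlen >= SKETCH_MAX_WIDTHS);
	out[n++] = W_20;		/* 20MHz is always generated */
	if (cbw_flags & SKETCH_CBW_HT40)
		out[n++] = W_40;
	if (cbw_flags & SKETCH_CBW_VHT80)
		out[n++] = W_80;
	if (cbw_flags & SKETCH_CBW_VHT160)
		out[n++] = W_160;
	if (cbw_flags & SKETCH_CBW_VHT80P80)
		out[n++] = W_80P80;
	return (n);
}
```

A caller that still passes the old literal 1 keeps getting 20 + 40MHz in
this sketch, which is the convenience the preserved HT40 value buys.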
MFC after: 2 weeks
Reviewed by: adrian, gnn
Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
Differential revision: https://reviews.freebsd.org/D26091
In set_vht_extchan() the checks are performed in the order of VHT20/40/80.
That means if a channel has a lower and higher VHT flag set we would
return the lower first.
We normally do not set more than one VHT flag so this change is supposed
to be a NOP but follows the logical thinking order of returning the best
first. Also we nowhere assert a single VHT flag so make sure we'll not
be stuck with VHT20 when we could do more.
While here add the debugging printfs for VHT160 and VHT80P80 which still
need doing once we deal with a driver at that level.
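The effect of the check order can be sketched as below. This is not the
set_vht_extchan() code; the SK_* flag values and both helpers are
hypothetical and only show why "best first" matters once more than one
flag is set.

```c
#include <assert.h>

/* Hypothetical per-channel VHT width flags. */
#define	SK_VHT20	0x01
#define	SK_VHT40	0x02
#define	SK_VHT80	0x04
#define	SK_VHT_ALL	(SK_VHT20 | SK_VHT40 | SK_VHT80)

/* Old order: the lowest width that is set wins. */
static int
sketch_width_lowest_first(unsigned flags)
{
	assert((flags & ~SK_VHT_ALL) == 0);
	if (flags & SK_VHT20)
		return (20);
	if (flags & SK_VHT40)
		return (40);
	if (flags & SK_VHT80)
		return (80);
	return (0);
}

/* New order: the highest width that is set wins. */
static int
sketch_width_best_first(unsigned flags)
{
	assert((flags & ~SK_VHT_ALL) == 0);
	if (flags & SK_VHT80)
		return (80);
	if (flags & SK_VHT40)
		return (40);
	if (flags & SK_VHT20)
		return (20);
	return (0);
}
```

With a single flag set both helpers agree, which is why the change is
expected to be a NOP in practice.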
Reviewed by: adrian, gnn
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
Differential Revision: https://reviews.freebsd.org/D26088
Rather than coding an array size of [4] replace the number with
WME_NUM_AC.
MFC after: 2 weeks
Reviewed by: adrian
Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
Differential Revision: https://reviews.freebsd.org/D26090
For the <VHT-MCS, NSS> tuple, NSS is 1..8 (or in our loop case 0..7
but not 0..6). Correct the boundary to check for < 8 and not < 7.
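A small sketch of the off-by-one, assuming the usual 16-bit VHT MCS map
layout (2 bits per spatial stream, value 3 meaning "not supported"). The
helper names and the limit parameter are hypothetical and exist only to
compare the two loop bounds.

```c
#include <assert.h>
#include <stdint.h>

#define	SKETCH_VHT_MCS_NOT_SUPPORTED	3

/* Return the 2-bit map entry for spatial stream count nss (1..8). */
static unsigned
sketch_mcs_entry(uint16_t mcs_map, int nss)
{
	assert(nss >= 1 && nss <= 8);
	return ((mcs_map >> (2 * (nss - 1))) & 0x3);
}

/*
 * Count the stream counts marked as supported, looking at loop
 * indices 0..limit-1.  limit == 8 covers NSS 1..8; limit == 7
 * silently skips NSS 8.
 */
static int
sketch_count_nss(uint16_t mcs_map, int limit)
{
	int i, n = 0;

	for (i = 0; i < limit; i++) {
		if (sketch_mcs_entry(mcs_map, i + 1) !=
		    SKETCH_VHT_MCS_NOT_SUPPORTED)
			n++;
	}
	return (n);
}
```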
MFC after: 2 weeks
Reviewed by: adrian
Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
Differential Revision: https://reviews.freebsd.org/D26087
In ieee80211_vht_get_chwidth_ie() we need to return 80P80 (3) before
VHT160 (2) as otherwise we'll never use 80P80. Fix the order.
MFC after: 2 weeks
X-MFC with: r364303 (which missed this)
Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
For flags and checks the order goes VHT160 and then VHT80P80 unless
checks are in reverse order ("more comes first") in which case we
deal with VHT80P80 first.
The one reverse order to pick out is where we check channel
preferences. While it may seem that VHT160 is better, finding
two "free" channels (VHT 80+80) is more likely so we do prefer that.
While dealing with VHT160 and VHT80P80 add extra clauses previously
missing or marked TODO in a few places.
Reviewed by: adrian, gnn
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
Differential Revision: https://reviews.freebsd.org/D26002
The standard uses 80+80 and 80p80 but nowhere 80_80.
Switch the latter to 80P80 for all the macros and comments referring
to #defined flags which I could find.
The only place we leave as 80p80 is the ifconfig command line arguments
as we spell them all in lower case.
Ideally we would use 80+80 for any interactions with the user and
80P80 for anything internal but let us not confuse parsers and
hence avoid the '+' in either case.
Reviewed by: adrian, gnn
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
Differential Revision: https://reviews.freebsd.org/D26001
Rather than using magic numbers duplicate IEEE80211_FVHT_VHT* in
ifconfig (cleanup of these and other flags used and not exposed by
net80211 should happen later) and use those.
In the kernel this simplifies one ioctl path (the other one currently
relies on individual bit flags being passed in).
We also re-order the 80P80 and 160 flag for 160 to come before 80+80
and more clearly leave the flags as TODO in one of the 160/80+80 cases.
Reviewed by: adrian
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
Differential Revision: https://reviews.freebsd.org/D26000
Add IEEE80211_IOC_IC_NAME to query the ic_name field and in ifconfig
to print the parent interface again. This functionality was lost
around r287197. It helps in case of multiple wlan interfaces and
multiple underlying hardware devices to keep track which wlan
interface belongs to which physical device.
Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
Reviewed by: adrian, Idwer Vollering
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25832
Split the MANAGE privilege into MANAGE, SETMAC and CREATE_VAP.
+ VAP_MANAGE is everything but setting the MAC and creating a VAP.
+ VAP_SETMAC is setting the MAC address of the VAP.
Typically you wouldn't want the jail to be able to modify this.
+ CREATE_VAP is to create a new VAP. Again, you don't want to be doing
this in a jail, but this DOES stop being able to run some corner
cases like Dynamic WDS (DWDS) AP in a jail/vnet. We can figure this
bit out later.
This allows me to run wpa_supplicant in a jail after transferring
a STA VAP into it. I unfortunately can't currently set the wlan
debugging inside the jail; that would be super useful!
Reviewed by: bz
Differential Revision: https://reviews.freebsd.org/D25630
The later firmware devices (including iwn!) support multiple configuration
contexts for a lot of things, leaving it up to the firmware to decide
which channel and vap is active. This allows for things like off-channel
p2p sta/ap operation and other weird things.
However, net80211 is still focused on a "net80211 drives all" when it comes to driving
the NIC, and as part of this history a lot of these options are global and not per-VAP.
This is fine when net80211 drives things and all VAPs share a single channel - these
parameters importantly really reflect the state of the channel! - but it will increasingly
be not fine when we start supporting more weird configurations and more recent NICs.
Yeah, recent like iwn/iwm.
Anyway - so, migrate all of the HT protection, legacy protection and preamble
stuff to be per-VAP. The global flags are still there; they're now calculated
in a deferred taskqueue that mirrors the old behaviour. Firmware based drivers
which have per-VAP configuration of these parameters can now just listen to the
per-VAP options.
What do I mean by per-channel? Well, the above configuration parameters really
are about interoperation with other devices on the same channel. Eg, HT protection
mode will flip to legacy/mixed if it hears ANY BSS that supports non-HT stations or
indicates it has non-HT stations associated. So, these flags really should be
per-channel rather than per-VAP, and then for things like "do i need short preamble
or long preamble?" turn into a "do I need it for this current operating channel".
Then any VAP using it can query the channel that it's on, reflecting the real
required state.
This patch does none of the above paragraph just yet.
I'm also cheating a bit - I'm currently not using separate taskqueues for
the beacon updates and the per-VAP configuration updates. I can always further
split it later if I need to but I didn't think it was SUPER important here.
So:
* Create vap taskqueue entries for ERP/protection, HT protection and short/long
preamble;
* Migrate the HT station count, short/long slot station count, etc - into per-VAP
variables rather than global;
* Fix a bug with my WME work from a while ago which made it per-VAP - do the WME
beacon update /after/ the WME update taskqueue runs, not before;
* Any time the HT protmode configuration changes or the ERP protection mode
config changes - schedule the task, which will call the driver without the
net80211 lock held and all correctly serialised;
* Use the global flags for beacon IEs and VAP flags for probe responses and
other IE situations.
The primary consumer of this is ath10k. iwn could use it when sending RXON,
but we don't support IBSS or AP modes on it yet, and I'm not yet sure whether
it's required in STA mode (ie whether the firmware parses beacons to change
protection mode or whether we need to.)
Tested:
* AR9280, STA/AP
* AR9380, DWDS STA+STA/AP
* ath10k work, STA/AP
* Intel 6235, STA
* Various rtwn / run NICs, DWDS STA and STA configurations
U-APSD (unscheduled automatic power save delivery) is a power save method
that's a bit better than legacy PS-POLL - stations can mark frames with
an extra flag that tells the AP to leak out more frames after it sends
its own frames rather than needing to send a PS-POLL to get another frame
from the AP.
Now, this code just handles the negotiation bits; it doesn't actually
implement U-APSD. That's up to drivers, and nothing in the tree yet
implements this. I /may/ implement this for ath(4) if I eventually care
enough but right now I plan on just implementing it for firmware offload
based NICs that handle this in the NIC.
I'll commit the ifconfig bit after this and I may have some follow-up
commits as this gets used more by me in local testing.
This should be a glorious no-op for everyone else. If things change
for anyone that isn't fixed by a complete recompile then please reach out
to me.
In the 11n world, most NICs did A-MPDU receive/transmit offloading but
not A-MSDU offloading. So, the net80211 A-MPDU receive path would just
receive MPDUs, do the reordering bit, pass it up to the rest of
net80211 for crypto decap and then do A-MSDU decap before throwing ethernet
frames up to the rest of the system.
However 11ac and 11ax NICs are increasingly doing A-MSDU offload (and
newer 11ax stuff does socket offload, but hey I don't want to scare people
JUST yet) - so although A-MPDU reordering may be done in the OS, A-MSDUs
look like a normal MPDU. This means that all the MSDUs are actually
faked into a set of MPDUs with matching 802.11 header - the sequence number,
QoS header and any encryption verification bits (like IV) are just copied.
This shows up as MASSIVE packet loss in net80211, cause after the first MPDU
we just toss the rest.
(And don't get me started about ethernet decap with A-MPDU host reordering;
we'll have to cross that bridge for later 11ac and 11ax bits too.)
Anyway, this work changes each A-MPDU reorder slot into an mbufq.
The mbufq is treated as a whole set of frames to pass up to the stack
and reordered/de-duped as a group. The last frame in the reorder list
is checked to see if it's an A-MSDU final frame so any duplicates are
correctly tossed rather than double-received. Other than that, the
rest of the logic is unchanged.
The previous commit did a small subset of this - if there wasn't any reordering
going on then it'd accept the A-MSDUs. This is the rest of the needed work.
This is a no-op for 11n NICs doing A-MPDU reordering but needing software
A-MSDU decap - they aren't tagged as A-MSDU and so any subsequent
frames added to the reorder slot are tossed.
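The per-slot behaviour described above could be sketched roughly as
follows. This is not the net80211 code: a fixed array stands in for the
mbufq, and the SK_RX_* flags, the struct and the helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-frame flags set by a driver doing A-MSDU offload. */
#define	SK_RX_AMSDU		0x1	/* frame is a decapped A-MSDU subframe */
#define	SK_RX_AMSDU_MORE	0x2	/* more subframes follow */

#define	SK_SLOT_MAX		8

/* One reorder slot: a small queue of frames sharing a sequence number. */
struct sketch_slot {
	unsigned	flags[SK_SLOT_MAX];
	int		len;
};

/* Returns 1 if the frame was queued, 0 if it was tossed as a duplicate. */
static int
sketch_slot_add(struct sketch_slot *s, unsigned flags)
{
	assert(s != NULL);
	if (s->len > 0) {
		unsigned last = s->flags[s->len - 1];

		/* Slot already complete: plain MPDU or final subframe. */
		if (!(last & SK_RX_AMSDU) || !(last & SK_RX_AMSDU_MORE))
			return (0);
		/* Only further subframes may join an open A-MSDU. */
		if (!(flags & SK_RX_AMSDU))
			return (0);
		if (s->len == SK_SLOT_MAX)
			return (0);
	}
	s->flags[s->len++] = flags;
	return (1);
}
```

An untagged frame fills the slot on its own, so a second frame with the
same sequence number is tossed, matching the 11n no-op case.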
Tested:
* QCA9880 (ath10k/athp) - STA/AP mode;
* RT3593 (if_rsu) - 11n STA+DWDS mode (I'm committing through it rn);
* QCA9380 (if_ath) - STA/AP mode.
This is a new, optional (for now!) method that drivers can use to separate
node allocation and node initialisation. Right now they're the same, and
drivers that need to do node allocation via firmware commands need to sleep
and thus they need to defer node allocation into an internal taskqueue.
Right now they're just separate but not deferred. Later on if I get the time
we'll start deferring the node and key related operations but that requires
making a bunch of other stuff (notably things that generate frames!) also
async/deferred.
Tested:
* RT3593, STA/DWDS mode
* AR9380, STA/AP modes
* QCA9880 (athp) - STA/AP modes
Allow net80211 drivers to register a small vtable of debugnet-related
methods.
This is not a functional change. Driver support is needed, similar to
debugnet(4) for wired NICs.
Reviewed by: adrian, markj (earlier version both)
Differential Revision: https://reviews.freebsd.org/D17308
When doing A-MSDU offload handling the driver is required to mark
A-MSDUs from the same MPDU with the same sequence number.
It then tags them as AMSDU (if it's a decap'ed A-MSDU) and AMSDU_MORE
(saying there's more AMSDUs decapped in the same MSDU.)
This allows encryption and sequence number offload to work right.
In the A-MSDU path the sequence number check looks at the A-MSDU flags
in the frame to see whether it's part of the same seqno and will pass them
(ie, not increment rx_seq until the last A-MSDU is seen from the driver,
or a new seqno shows up.)
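A simplified sketch of that sequence number check follows. It ignores
the retry bit, fragments and sequence number wrap, and the flag names,
struct and helper are hypothetical; it only shows rx_seq being held
until the last offloaded subframe has been seen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-frame flags set by the driver. */
#define	SK_RX_AMSDU		0x1	/* decapped A-MSDU subframe */
#define	SK_RX_AMSDU_MORE	0x2	/* more subframes with this seqno */

struct sketch_rx_state {
	uint16_t	rx_seq;		/* last accepted sequence number */
	int		have_seq;	/* rx_seq is valid */
	int		in_amsdu;	/* an A-MSDU burst is still open */
};

/* Returns 1 to accept the frame, 0 to drop it as a duplicate. */
static int
sketch_rx_check(struct sketch_rx_state *st, uint16_t seqno, unsigned flags)
{
	assert(st != NULL);
	if (st->have_seq && seqno == st->rx_seq) {
		/* Same seqno is only fine inside an open A-MSDU burst. */
		if (!(st->in_amsdu && (flags & SK_RX_AMSDU)))
			return (0);
	} else {
		/* A new seqno always starts over. */
		st->rx_seq = seqno;
		st->have_seq = 1;
	}
	st->in_amsdu = (flags & SK_RX_AMSDU) && (flags & SK_RX_AMSDU_MORE);
	return (1);
}
```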
However, I did this work in the A-MSDU path but not the A-MSDU in A-MPDU path.
For the non A-MSDU offload case the A-MPDU receive reordering will do its
thing and then pass the MPDU up for decap - which then will see it's
an A-MSDU and decap each sub-frame. But this isn't done for offloaded
A-MSDU frames.
This requires two parts:
* Don't bump the RX sequence number, same as above; and
* If frames go into the reordering buffer, they need to be added into the slot
as a set of frames rather than a single frame, so once a new seqno shows up
this slot can be marked as "full" and we can move on.
This patch does the first. The latter requires that I find and commit
work to change rxa_m from an mbuf to an mbufq and then handle A-MSDU
there. But, the first is enough to allow the normal case (ie, no or not
a lot of A-MPDU RX reordering) to work.
This allows the athp driver (QCA9880) throughput to go from VERY low
(like 5mbit TCP, 1/3-1/4 expected UDP throughput) to ~ 250mbit TCP
and > 300mbit UDP on a VHT/40 channel. TCP sucks because, well, it
shows up as MASSIVE packet loss when all but one frame in a decap'ed
A-MSDU stream is dropped. Le whoops.
Now, where'd I put that laptop with the patch for rxa_m mbufq that
I wrote like in 2017...
Tested:
* AR9380, STA/AP mode (a big no-op, no A-MSDU hardware decap);
* if_run (RT3593), STA DWDS mode (A-MPDU / A-MSDU receive, but again
no A-MSDU hardware decap);
* QCA9880, STA/AP mode (which is doing hardware A-MPDU/A-MSDU decap,
but no A-MPDU reordering in the firmware.)
Somewhere in net80211 if_xname is checked against NULL but it doesn't trigger
a compiler warning, but this does. So DTRT for FreeBSD and the other if_xname
dereferences can be converted to this function at a later time.
This removes the requirement to know what's in the ifp.
(If someone wants a quick clean-up task, it'd be nice to convert instances
of ifp dereferencing for if_xname over to this method.)
This updates the logic to allow:
* A-MPDU if available;
* A-MSDU if available and A-MPDU is off/NACKed;
* A-MPDU+A-MSDU if it's available and negotiated;
* Fast frames if the node is 11abg (and not HT/VHT.)
This allows for things to fail back to A-MSDU or fast frames
if A-MPDU isn't available rather than needing to be non-HT/non-VHT.
It also allows A-MPDU+A-MSDU to work if it's negotiated.
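The decision order above can be sketched like this. The struct fields,
enum and helper are hypothetical stand-ins for whatever per-node state
net80211 actually consults; this is an illustration of the precedence,
not the transmit path itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_tx_aggr {
	SK_TX_PLAIN,
	SK_TX_FF,
	SK_TX_AMSDU,
	SK_TX_AMPDU,
	SK_TX_AMPDU_AMSDU
};

/* Hypothetical summary of what was negotiated with a peer. */
struct sketch_node {
	bool	ht_or_vht;	/* node is HT/VHT rather than 11abg */
	bool	ampdu_up;	/* A-MPDU available and not NACKed */
	bool	amsdu_ok;	/* A-MSDU available */
	bool	amsdu_in_ampdu;	/* A-MSDU in A-MPDU was negotiated */
	bool	ff_ok;		/* fast frames available */
};

static enum sketch_tx_aggr
sketch_pick_tx_aggr(const struct sketch_node *n)
{
	assert(n != NULL);
	if (n->ampdu_up) {
		if (n->amsdu_ok && n->amsdu_in_ampdu)
			return (SK_TX_AMPDU_AMSDU);
		return (SK_TX_AMPDU);
	}
	if (n->amsdu_ok)
		return (SK_TX_AMSDU);	/* A-MPDU off or NACKed */
	if (!n->ht_or_vht && n->ff_ok)
		return (SK_TX_FF);	/* 11abg only */
	return (SK_TX_PLAIN);
}
```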
Tested:
* AR9380, STA + AP mode (A-MPDU, A-MSDU, FF, A-MPDU+A-MSDU)
* RT5350, STA mode (A-MSDU, FF)
* AR9170, STA mode (A-MSDU, FF)
Turns out this isn't a required call. I didn't pick it up because my
uncommitted changes involve new updateslot methods for cards I'm working
on.
Dunce hat to: adrian
This is hopefully a big no-op unless you're running some extra
patches to flip on A-MSDU options in a driver.
802.11n supports sending A-MSDU in A-MPDU. That lets you do things
like pack small frames into an A-MSDU and stuff /those/ into an A-MPDU.
It allows for much more efficient airtime because you're not
wasting time sending small frames - which is still a problem when
doing A-MPDU as there's still per-frame overhead and minimum A-MPDU
density requirements.
It, however, is optional for 802.11n. A lot of stuff doesn't advertise
it (but does it, just wait!); and I know that ath10k does it and my
ath(4) driver work supports it.
Now, 802.11ac makes A-MSDU in A-MPDU something that can happen more
frequently, because even though you can send very large A-MPDUs
(like 1 megabyte and larger) you still have the small frame problem.
So, 802.11ac NICs like ath10k and iwm will support A-MSDU in A-MPDU
out of the box if it's enabled - and you can negotiate it.
So, let's lay down the ground work to enable A-MSDU in A-MPDU.
This will allow hardware like iwn(4) and ath(4) which supports
software A-MSDU but hardware A-MPDU to be more efficient.
Drivers that support A-MSDU in A-MPDU will set TX/RX htcap flags.
Note this is separate from the software A-MSDU encap path; /that/
dictates whether net80211 is doing A-MSDU encapsulation or not.
These HTC flags control negotiation, NOT encapsulation.
Once this negotiation and driver bits are done, hardware like
rtwn(4), run(4), and others will be able to use A-MSDU even without
A-MPDU working; right now FF and A-MSDU aren't even attempted
if you're an 11n node. It's a small hold-over from the initial
A-MPDU work and I know how to fix it, but to flip it on properly
I need to be able to negotiate or ignore A-MSDU in A-MPDU.
Oh and the fun part - some 11ac APs I've tested will quite happily
decap A-MSDU in A-MPDU even though they don't negotiate it when
doing 802.11n. So hey, I know it works - I just want to properly
handle things. :-)
Tested:
* AR9380, STA/AP mode
The 11b/11g ERP and slot time update handling are two things which weren't
migrated into the per-VAP state when Sam did the initial VAP work.
That makes sense for a lot of setups where net80211 is driving radio state
and the radio only cares about the shared state.
However, as noted by a now deleted comment, the ERP and slot time updates
aren't EXACTLY correct/accurate - they only take into account the most
RECENTLY created VAP, and the state updates when one creates/destroys
VAPs isn't exactly great.
So:
* track the short slot logic per VAP;
* whenever the slot time configuration changes, just push it into a deferred
task queue update so drivers don't have to serialise it themselves;
* if a driver registers a per-VAP slot time handler then it'll just get the
per VAP one;
* .. if a driver registers a global one then the legacy behaviour is maintained -
a single slot time is calculated and pushed out.
Note that the calculated slot time is better than the existing logic - if ANY
of the VAPs require long slot then it's disabled for all VAPs rather than
whatever the last configured VAP did.
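As a sketch of that calculation, assuming each VAP simply records
whether it can run short slot (the struct and both helpers are
hypothetical, and treating "no VAPs" as short slot is an arbitrary
choice made for the example):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-VAP state. */
struct sketch_vap {
	bool	wants_short_slot;
};

/* New behaviour: short slot only if no VAP requires long slot. */
static bool
sketch_global_short_slot(const struct sketch_vap *vaps, size_t nvaps)
{
	size_t i;

	assert(vaps != NULL || nvaps == 0);
	for (i = 0; i < nvaps; i++) {
		if (!vaps[i].wants_short_slot)
			return (false);
	}
	return (true);
}

/* Old behaviour for comparison: the last configured VAP wins. */
static bool
sketch_last_vap_short_slot(const struct sketch_vap *vaps, size_t nvaps)
{
	assert(vaps != NULL && nvaps > 0);
	return (vaps[nvaps - 1].wants_short_slot);
}
```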
Now, this isn't entirely complete - the rest of ERP tracking around short/long
slot capable station tracking needs to be converted into per-VAP, as well
as the preamble/barker flags. Luckily those also can be done in a similar
fashion - keep per-VAP counters/flags and unify them before doing the driver
update. I'll defer that work until later.
All the existing drivers can keep doing what they're doing with the global
slot time flags as that is maintained. One driver (iwi) used the per-VAP
flags instead of the ic flags, so now that driver will work properly.
This unblocks some ath10k porting work as the firmware takes the slot time
configuration per-VAP rather than globally, and some firmware handles
STA+AP and STA+STA (on same/different channels) configurations where
the firmware will switch slot time as appropriate.
Tested:
* AR9380, STA/AP mode
* AR9880 (ath10k), STA mode
net80211 currently doesn't negotiate A-MSDU in A-MPDU during ADDBA.
I've added the field in net80211 and this commit:
* Prints out the ADDBA field value during ADDBA;
* Adds some comments around where I need to follow up with some
negotiation logic.
Right now we don't have a driver flag anywhere which controls
whether A-MSDU in A-MPDU is allowed. I know it works (I have it
manually turned on at home on a couple test APs, heh!) but
I can't flip it on until we can negotiate it.
Tested:
* AR9380, STA/AP mode, printing out ADDBA requests
Now that I have A-MSDU and A-MPDU coexisting together, we need to actually
announce if (a) it's permitted and (b) figure out if we should use it
when transmitting.
This just adds the field; it doesn't yet include it in ADDBA exchanges.
I'm trying to chase down more weird "I am not doing an incremental scan
when being asked" issues so these debugging statements help.
Notably, I've added more debugging around reasons why the scan is skipped -
eg because the cache is considered hot.
This should be a no-op unless you care about the debugging output!
This sends a probe request after IBSS node discovery through
beacon frames. This allows things like HT and VHT capabilities
to be "negotiated" in adhoc mode.
It is .. kinda fire and pray - this isn't retried after discovery
so it's quite possible that nodes occasionally don't come up with
HT/VHT rate upgrades. At some point it may be a fun side project
to add support for retrying these probe requests/negotiations
after IBSS node discovery.
Tested:
* tested with multiple ath(4) NICs in 11n mode.
Differential Revision: https://reviews.freebsd.org/D24979
The scan task refactoring stuff circa 2014-2016 broke the blocking task
into a taskqueue with some async bits, but it apparently broke scans
being interrupted by traffic.
Notably - the new "field" SCAN_PAUSE sets both SCAN_INTERRUPT and SCAN_CANCEL,
and a bunch of existing code was checking for SCAN_CANCEL only and breaking
the scan. Unfortunately it was then (a) cancelling the scan entirely and
(b) not notifying userland that scan was done.
So:
* Update the calls to scan_end() to only pass in 1 (saying the scan is complete)
if SCAN_CANCEL is set WITHOUT SCAN_INTERRUPT. If both are set then yes,
the scan is interrupted, but it isn't canceled - it's just paused.
* Update the "did the scan flags change whilst the driver was called" logic
to check for canceled scans, not interrupted scans.
* The "scan done" logic now explicitly checks for either interrupted or
completed scans. This accounts for the situation where a scan is being
aborted via traffic but it ALSO happens to have finished (ie the last
channel was checked.)
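The flag relationship behind the first item can be sketched as below.
The SK_SCAN_* values and helper names are hypothetical; only the fact
that the pause value sets both the interrupt and the cancel bit is taken
from the text.

```c
#include <assert.h>

/* Hypothetical flag values. */
#define	SK_SCAN_INTERRUPT	0x1
#define	SK_SCAN_CANCEL		0x2
#define	SK_SCAN_PAUSE		(SK_SCAN_INTERRUPT | SK_SCAN_CANCEL)

static_assert((SK_SCAN_PAUSE & SK_SCAN_CANCEL) != 0,
    "a paused scan also carries the cancel bit");

/* Check for the cancel bit alone: a paused scan looks cancelled. */
static int
sketch_scan_end_arg_old(unsigned flags)
{
	return ((flags & SK_SCAN_CANCEL) != 0);
}

/* Cancel WITHOUT interrupt: only then is the scan really over. */
static int
sketch_scan_end_arg(unsigned flags)
{
	return ((flags & SK_SCAN_PAUSE) == SK_SCAN_CANCEL);
}
```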
This doesn't ENTIRELY fix scanning as the resume function is broken
due to incorrect ticks math. Thus, the second half of this patch
changes the ieee80211_ticks_*() macros to use int instead of long,
matching the logic that the TCP code does with ticks and handles
wrapping / negative ticks values. If cast to long then the wrapping
math wouldn't work right (ie, if ticks was actually negative,
ie, after the system has been up for a while.)
This allows contbgscan() to correctly calculate if a scan should
continue based on ticks and ic->ic_lastdata .
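Why the width of the comparison matters can be shown with a small
sketch. The helper names are hypothetical and this is not the actual
macro text; it assumes a 32-bit int and the usual two's complement
conversion, and uses long long to mimic what a 64-bit long would do.

```c
#include <assert.h>
#include <limits.h>

static_assert(INT_MAX == 2147483647, "sketch assumes a 32-bit int");

/*
 * "a is after b" using int-width wrapping math.  The subtraction is
 * done unsigned to avoid signed overflow, then viewed as an int.
 */
static int
sketch_ticks_after(int a, int b)
{
	return ((int)((unsigned)b - (unsigned)a) < 0);
}

/* Same test in a wider type: the difference no longer wraps. */
static int
sketch_ticks_after_wide(int a, int b)
{
	return ((long long)b - (long long)a < 0);
}
```

Once ticks has wrapped from INT_MAX to INT_MIN, only the int-width
version still sees the new value as being "after" the old one.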
Reviewed by: bz
Differential Revision: https://reviews.freebsd.org/D25031
This is useful when a wireless driver is stopped or started in response
to events like an RF Kill button press. Applications like
wpa_supplicant depend on such events to have a correct view of interface
state.
Reviewed by: adrian, cy, melifaro
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D24925
I'm still not sure whether this is the full solution, but here goes.
I have a two node DWDS setup - a main AP with the ethernet bridge uplink
and a satellite AP in the back of the house. They're both AR9344+AR9580
dual band 11n APs.
The problem was that multicast frames were not going from the DWDS AP to
the DWDS STA. Unicast frames are fine, and multicast frames from the
DWDS STA to AP are fine.
Now, multicast and unicast frames from the STA -> AP are just transmitted
using the unicast key. That's fine. However, the AP -> STA multicast
frames by default are transmitted using the current default / multicast
key, the shared one between all STAs in a BSS. Now, the DWDS implementation
ignores non WDS frames - it only allows about 4 address frames outside
of management / EAPOL frames! - so the STA side ignores the normal multicast
frames.
Instead, the AP side uses ieee80211_dwds_mcast() to send multicast frames
to each WDS VAP that was created as part of the "dynamic" part of DWDS.
This should be queuing them individually to each node instead of using
the normal multicast send path; and this is how they should get turned into
4-addr WDS frames.
HOWEVER, ieee80211_encap() was trying to use the default TX key to queue
them rather than the unicast key that's already setup. Since this synthetic
node doesn't have the default TX key setup, transmission fails. Things
would be fine in WEP and in open mode because in both cases you would
have static keys (or no keys) setup. It just fails in WPA mode.
This resolves the issue. AP DWDS multicast is now sent using the unicast
key just like in STA mode and I'm pretty sure the STA mode side will still
work fine (as it's a STA VAP with a DWDS flag..)
Tested:
* TL-WDR3600/4300 APs
IEEE80211_MESH_RTCMD_ADD was invoking memcmp() to validate the
supplied address directly on the user pointer rather than first doing
a copyin() and validating the copied value.
IEEE80211_MESH_RTCMD_DELETE was passing the user pointer directly to
ieee80211_mesh_rt_del() rather than copying the user buffer into a
temporary kernel buffer.
Reviewed by: brooks, kib
Obtained from: CheriBSD
MFC after: 2 weeks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24562
This pattern is used in callbacks with void * data arguments and seems
both relatively uncommon and relatively harmless. Silence the warning
by casting through uintptr_t.
This warning is on by default in Clang 11.
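The pattern in question looks roughly like the following sketch, where a
callback receives a small integer smuggled through its void * argument.
The callback and helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: arg carries an integer, not a real pointer. */
static void
sketch_cb(void *arg, int *out)
{
	assert(out != NULL);
	/* Go through uintptr_t rather than casting void * straight to int. */
	*out = (int)(uintptr_t)arg;
}

/* Pack a small non-negative value into the void * and read it back. */
static int
sketch_roundtrip(int v)
{
	int out;

	sketch_cb((void *)(uintptr_t)v, &out);
	return (out);
}
```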
Reviewed by: arichardson
Obtained from: CheriBSD (partial)
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24425
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all low hanging fruits as MPSAFE.
Reviewed by: markj
Approved by: kib (mentor, blanket)
Differential Revision: https://reviews.freebsd.org/D23626
This makes amrr_node_stats() cleaner and allows the rate printing to be
reusable.
Submitted by: Neel Chauhan <neel at neelc.org>
Reviewed by: adrian
Differential Revision: https://reviews.freebsd.org/D22318
misunderstanding that the function does not work additively
when repeatedly called for different bands.
Reviewed by: avos (a few months ago)
MFC after: 2 weeks
- Perform ifp mismatch checks (to determine if a send tag is allocated
for a different ifp than the one the packet is being output on), in
ip_output() and ip6_output(). This avoids sending packets with send
tags to ifnet drivers that don't support send tags.
Since we are now checking for ifp mismatches before invoking
if_output, we can now try to allocate a new tag before invoking
if_output sending the original packet on the new tag if allocation
succeeds.
To avoid code duplication for the fragment and unfragmented cases,
add ip_output_send() and ip6_output_send() as wrappers around
if_output and nd6_output_ifp, respectively. All of the logic for
setting send tags and dealing with send tag-related errors is done
in these wrapper functions.
For pseudo interfaces that wrap other network interfaces (vlan and
lagg), wrapper send tags are now allocated so that ip*_output see
the wrapper ifp as the ifp in the send tag. The if_transmit
routines rewrite the send tags after performing an ifp mismatch
check. If an ifp mismatch is detected, the transmit routines fail
with EAGAIN.
- To provide clearer life cycle management of send tags, especially
in the presence of vlan and lagg wrapper tags, add a reference count
to send tags managed via m_snd_tag_ref() and m_snd_tag_rele().
Provide a helper function (m_snd_tag_init()) for use by drivers
supporting send tags. m_snd_tag_init() takes care of the if_ref
on the ifp meaning that code allocating send tags via if_snd_tag_alloc
no longer has to manage that manually. Similarly, m_snd_tag_rele
drops the refcount on the ifp after invoking if_snd_tag_free when
the last reference to a send tag is dropped.
This also closes use after free races if there are pending packets in
driver tx rings after the socket is closed (e.g. from tcpdrop).
In order for m_free to work reliably, add a new CSUM_SND_TAG flag in
csum_flags to indicate 'snd_tag' is set (rather than 'rcvif').
Drivers now also check this flag instead of checking snd_tag against
NULL. This avoids false positive matches when a forwarded packet
has a non-NULL rcvif that was treated as a send tag.
- cxgbe was relying on snd_tag_free being called when the inp was
detached so that it could kick the firmware to flush any pending
work on the flow. This is because the driver doesn't require ACK
messages from the firmware for every request, but instead does a
kind of manual interrupt coalescing by only setting a flag to
request a completion on a subset of requests. If all of the
in-flight requests don't have the flag when the tag is detached from
the inp, the flow might never return the credits. The current
snd_tag_free command issues a flush command to force the credits to
return. However, the credit return is what also frees the mbufs,
and since those mbufs now hold references on the tag, this meant
that snd_tag_free would never be called.
To fix, explicitly drop the mbuf's reference on the snd tag when the
mbuf is queued in the firmware work queue. This means that once the
inp's reference on the tag goes away and all in-flight mbufs have
been queued to the firmware, tag's refcount will drop to zero and
snd_tag_free will kick in and send the flush request. Note that we
need to avoid doing this in the middle of ethofld_tx(), so the
driver grabs a temporary reference on the tag around that loop to
defer the free to the end of the function in case it sends the last
mbuf to the queue after the inp has dropped its reference on the
tag.
- mlx5 preallocates send tags and was using the ifp pointer even when
the send tag wasn't in use. Explicitly use the ifp from other data
structures instead.
- Sprinkle some assertions in various places to assert that received
packets don't have a send tag, and that other places that overwrite
rcvif (e.g. 802.11 transmit) don't clobber a send tag pointer.
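The send tag life cycle from the reference counting item above can be
sketched as follows. This is a single-threaded toy, not the
m_snd_tag_*() implementation: the struct and helpers are hypothetical,
the counter is not atomic, and a boolean stands in for the driver's free
routine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a send tag. */
struct sketch_tag {
	unsigned	refs;
	bool		freed;	/* set where the driver free would run */
};

/* The allocator's reference (e.g. the one held by the inp). */
static void
sketch_tag_init(struct sketch_tag *t)
{
	assert(t != NULL);
	t->refs = 1;
	t->freed = false;
}

/* Taken for every mbuf that carries the tag. */
static void
sketch_tag_ref(struct sketch_tag *t)
{
	assert(t != NULL && t->refs > 0);
	t->refs++;
}

/* The last release is what triggers the free. */
static void
sketch_tag_rele(struct sketch_tag *t)
{
	assert(t != NULL && t->refs > 0);
	if (--t->refs == 0)
		t->freed = true;
}
```

If the socket goes away while a packet is still sitting in a tx ring,
the tag survives in this sketch until that packet's reference is dropped
as well.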
Reviewed by: gallatin, hselasky, rgrimes, ae
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20117
Update node flags when driver supports SMPS, not when it is disabled or
in dynamic mode ((iv_htcaps & HTCAP_SMPS) != 0).
Checked with RTL8188EE (1T1R), STA mode - 'smps' word should disappear
from 'ifconfig wlan0' output.
MFC after: 2 weeks