It seems the EEE made RX MAC enter LPI(Low Power Idle) mode such
that dwc(4) was not able to receive packets. Ideally dwc(4) should
be able to use EEE to save power during periods of low link
utilization(i.e. gating off clock). Due to lack of dwc(4)
datasheet it's not easy to take required steps for EEE on LPI
enter/exit events. Disabling EEE in PHY seems to be easy
workaround until dwc(4) supports EEE.
Updating EEE advertisement register on RTL8211F seems to have no
effect until reprogramming MII_ANAR, MII_100T2CR and MII_BMCR
with auto-negotiation. It's not clear whether it's related with
mii_phy_reset()'s BMCR_ISO handling for RTL8211F though.
It seems rgephy_reset() needs careful investigation for newer
RealTek PHYs.
Ganbold submitted working version based on NetBSD change and
tested lots of changes I made. Thanks a lot!
Submitted by: ganbold (initial version)
In collaboration with: ganbold
Each virtual interface has its own MAC address, queues, and statistics.
The dedicated netmap interfaces (ncxgbeX / ncxlX) were already implemented
as additional VIs on each port. This change allows additional non-netmap
interfaces to be configured on each port. Additional virtual interfaces
use the naming scheme vcxgbeX or vcxlX.
Additional VIs are enabled by setting the hw.cxgbe.num_vis tunable to a
value greater than 1 before loading the cxgbe(4) or cxl(4) driver.
NB: The first VI on each port is the "main" interface (cxgbeX or cxlX).
T4/T5 NICs provide a limited number of MAC addresses for each physical port.
As a result, a maximum of six VIs can be configured on each port (including
the "main" interface and the netmap interface when netmap is enabled).
One user-visible result is that when netmap is enabled, packets received
or transmitted via the netmap interface are no longer counted in the stats
for the "main" interface, but are not accounted to the netmap interface.
The netmap interfaces now also have a new-bus device and export various
information sysctl nodes via dev.n(cxgbe|cxl).X.
The cxgbetool 'clearstats' command clears the stats for all VIs on the
specified port along with the port's stats. There is currently no way to
clear the stats of an individual VI.
Reviewed by: np
MFC after: 1 month
Sponsored by: Chelsio
the firmware because the original flowc sent by t4_tom changed the
firmware-internal state of the tid. For now I'm setting MAX_DSL to 8K
and PDU accordingly. This will be tidied up with the next firmware
update that will handle multiple flowc's correctly.
After an MC reboot, a VF driver may reset before the PF driver has
finished bringing everything back up. This includes the VFs EVB port.
MC_CMD_VADAPTOR_ALLOC is the first MCDI call after an MC reboot to
require the EVB port, so if it fails with MC_CMD_ERR_NO_EVB_PORT,
retry the command a few times after waiting a while.
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4333
Make link control privilege visible to OS driver to guard updates to
flow control and PHY advertised capabilities.
Submitted by: Richard Houldsworth <rhouldsworth at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4330
It allows manftest to program them.
Submitted by: Paul Fox <pfox at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4329
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4328
Common code should infer other privileges from Admin privilege to
support firmware that pre-dates introduction of specific privilege
flags.
Submitted by: Richard Houldsworth <rhouldsworth at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4327
Submitted by: Artem V. Andreev <Artem.Andreev at oktetlabs.ru>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4308
It is really observed in the case of VLAN over sfxge interface.
Also this change makes total value equal to 35 which is default assumed
by the kernel for if_hw_tsomaxsegcount.
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4319
IPv4/IPv6 checksum offloading and VLAN tag insertion/stripping.
Since uether doesn't provide a way to announce driver specific offload
capabilities to upper stack, checksum offloading support needs more work
and will be done in the future.
Special thanks to Hayes Wang from RealTek who gave input.
tested on the Broadwell-Xeon with a hacked up version of pmcstudy -T. I still need
to circle back and add in to pmcstudy all the new tests from the Broadwell Vtune
guide (for the hacked up version I just made it so I could run the -T option). The
Skylake CPU is not yet available (even though Intel is advertising it .. imagine that).
The Skylake PMC's will need to be tested once we can get a sample skylake CPU :-)
Sponsored by: Netflix Inc.
This change will fix kernel panic with uninitialized (zeroed)
RXON structure.
Tested with Intel 3945BG, IBSS mode.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4304
The synth programming here requires the real centre frequency,
which for HT20 channels is the normal channel, but HT40 is
/not/ the primary channel. Everything else was using 'freq',
which is the correct centre frequency, but the hornet config
was using 'ichan' to do the lookup which was also the primary
channel.
So, modify the HAL call that does the mapping to take a frequency
in MHz and return the channel number.
Tested:
* Carambola 2, AR9331, tested both HT/20 and HT/40 operation.
On ARM, we must ensure proper interdevice write ordering.
The AHCI interrupt status register must be updated in HW before
registers in interrupt controller.
Unfortunately, only way how we can do it is readback.
Discussed with: mav
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D4240
Using a typedef for common code return types (rather than "int")
allows the Prefast static analyser to understand when a function
has been successful (and thus when its postconditions must hold).
This greatly reduces then number of false positives reported by
prefast for error paths in common code functions.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Dynamic config partitions on boards that support RFID are divided into
a number of segments, each formatted like a partition, with header,
trailer and end tags. The first segment is the current active
configuration.
The segments are initialised by manftest and each contain a different
configuration e.g. firmware variant. The firmware can be instructed
via RFID to copy a segment over the first segment, hence changing the
active configuration. This allows ops to change the configuration of
a board prior to shipment using RFID.
Changes to the dynamic config may need to be written to all segments (in
particular firmware versions written by manftest) or just the first
segment (changes to the active configuration). See SF-111324-SW.
If only the first segment is written the code still needs to be aware of
the possible presence of subsequent segments as writing to a segment may
cause its size to increase, which would overwrite the subsequent
segments and invalidate them.
Boards that do not support RFID will only have one segment in their
dynamic config partition.
Submitted by: Paul Fox <pfox at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4302
This probe/attaches correctly in my local branch and now displays
a useful message:
ath0: <Qualcomm Atheros QCA953x> at mem 0x18100000-0x1811ffff irq 0 on nexus0
...
ath0: AR9530 mac 1280.0 RF5110 phy 0.0
If the VPD is corrupt and contains an 'RV' keyword before the
END tag, then this function could return without setting the
return code to report the error.
Found by prefast.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Keep prefast happy by returning the initial queue index
from falconsiena_tx_qcreate(). No change in behaviour, as
etxo_qcreate already zeros *addedp before the call.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
PIO is not yet supported in the FreeBSD driver.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Adjust external port mapping table to distinguish Pavia from Monza.
Now the presence of any 40G mode implies at least 2 outputs per
external port. So Pavia 4x10G ports are now mapped to 1,2,3,4;
Monza 4x10G ports map to 1,1,2,2 as before.
Submitted by: Richard Houldsworth <rhouldsworth at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
GCC 4.2.1 used on FreeBSD 8 and 9 branches does not like unnamed
union member in the structure. It is not strictly required in head,
but nice to have to minimize difference with out-of-tree driver.
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
- Make scan aborted by event restart immediately and infinitely.
- Improve handling of some loop events from firmware.
- Remove loop down timer, adding its functionality to scanner thread.
- Some more unification and simplification.
I added MYBEACON support a while ago to listen to beacons that are only
for your configured BSSID. For AR9380 and later NICs this results in
a lot less chip wakeups in station mode as it then only shows you beacons
that are destined to you.
However in IBSS mode you really do want to hear all beacons so you can do
IBSS merges. Oops.
So only use MYBEACON for STA + not-scanning, and just use BEACON for
the other modes it used to use BEACON for.
This doesn't completely fix IBSS merges though - there are still some
conditions to chase down and fix.
And expose vm_memattr_t of current mapping to consumers (as well as the
ability to change it to one of UC, WB, WC).
After short discussion with: jhb (but no review)
Sponsored by: EMC / Isilon Storage Division
- Add error handling for urtwn_(r88e_)read_rom() and
urtwn_efuse_*() functions.
- Remove code duplication between urtwn_efuse_read() and
urtwn_r88e_read_rom().
- Merge r88e_rom and (r92c_)rom structures
(only one of them can be used at the same time).
- Other minor fixes / improvements.
Tested with RTL8188EU, STA mode
(URTWN_DEBUG + USB_DEBUG, hw.usb.urtwn.debug=3, no visual differences).
Reviewed by: kevlo
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4253
Adds a new tunable, ntb.hw.b2b_mw_idx, which specifies the offset (from the
total number of memory windows) to use for register access on hardware with
the SDOORBELL_LOCKUP errata. The default is -1, i.e., the last memory
window.
We map BARs before the b2b_mw_idx is selected, so map them all as memory
windows initially. The register memory window should not be write-combined,
so we explicitly disable WC on the selected MW later.
This introduces a layer of abstraction between consumer memory window
indices, which exclude any exclusive errata-workaround BARs, and internal
memory window indices, which include such BARs. An internal routine,
ntb_user_mw_to_idx(), converts the former to the latter. Public APIs have
been updated to use this instead of assuming the exclusive workaround BAR is
the last available MW.
Sponsored by: EMC / Isilon Storage Division
This should be a big no-op pass; and reduces the size of if_ath.c.
I'm hopefully soon going to take a whack at the USB support for ath(4)
and this'll require some reuse of the busdma memory code.
- In the iSCSI CPL handlers, do not rely on the ulpcb/icl_conn when in
the middle of assembling a PDU. This is so we don't have to grab
various locks and evaluate the kernel state of the connection multiple
times. Instead, the state is evaluated once after the entire PDU is
received. This requires another ULP specific item in toepcb (ulpcb2).
- If there is data in the so_rcv sockbuf of a connection in iSCSI ULP
mode it must be from before the connection got promoted to ULP mode.
Convert the contents of the sockbuf to PDUs and deliver them to ICL.
Do this before delivering the PDUs received on the "normal" ULP path.
- The receive path within ICL is allowed to sleep so it's not
appropriate to deliver PDUs to ICL from the driver's ithread, or from
any other thread with any mutex held. Use worker threads (created
back in r285650 but unused till now) to dispatch received PDUs to ICL.
Assign a worker thread to each connection. For now everything goes to
the first thread.
- Prevent various bad races that are possible when more than one of
a) rx ithread, b) worker thread, and c) icl_conn_close are active at
the same time.
It is perhaps preferable to have a separate limit for send instead of
reusing the receive limit. I'll discuss with trasz@ and mav@ before
pulling this into head.
bridges. Currently this includes information about what resources a
bridge decodes on the upstream side for use by downstream devices including
bus numbers, I/O port resources, and memory resources. Windows and bus
ranges are enumerated for both PCI-PCI bridges and PCI-CardBus bridges.
To simplify the implementation, all enumeration is done by reading the
appropriate config space registers directly rather than querying the
bridge driver in the kernel via new ioctls. This does result in a few
limitations.
First, an unimplemented window in a PCI-PCI bridge cannot be accurately
detected as accurate detection requires writing to the window base
register. That is not safe for pciconf(8). Instead, this assumes that
any window where both the base and limit read as all zeroes is
unimplemented.
Second, the PCI-PCI bridge driver in a tree has a few quirks for
PCI-PCI bridges that use subtractive decoding but do not indicate that
via the progif config register. The list of quirks is duplicated in
pciconf's source.
Reviewed by: imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D4171
Hacks to enable target mode there complicated code, while didn't really
work. And for outdated hardware fixing it is not really interesting.
Initiator mode tested with Qlogic 1080 adapter is still working fine.
While later firmware always registers for RSCN requests, older one does
it only in initiator mode. But in target mode there RSCN can be the only
way to detect gone intiator.
For those chips we are not receiving login events, adding initiators
based on ATIO requests. But there is no port ID in that structure, so
in fabric mode we have to explicitly fetch it from firmware to be able
to do normal scan after that.
Lower the payload data (IP) portion of the MTU from 0x10000 to
IP_MAXPACKET (0xFFFF) to avoid panicing the IP stack.
Sponsored by: EMC / Isilon Storage Division
This feature is disabled by default. To enable it, tune
hw.if_ntb.enable_xeon_watchdog to non-zero.
If enabled, writes an unused NTB register every second to demonstrate to
a hardware watchdog that the NTB device is still alive. Most machines
with NTB will not need this -- you know who you are.
Sponsored by: EMC / Isilon Storage Division
This change simplifies and unifies port adding/updating for loop and
fabric scanners. It also fixes problems with scanning restarts due to
concurrent port databases changes. It also fixes many cosmetic issues.
RX buffers from number of received packets.
Differential Revision: https://reviews.freebsd.org/D4178
Submitted by: Drew Gallatin <gallatin@freebsd.org>
Sponsored by: Mellanox Technologies
MFC after: 3 days
Setting sysctl dev....conf.hw_lro may fail if the net device lro is
turned off. Due to the nature of our sysctl handler we need to set the
values back to 0 and issue an error.
Differential Revision: https://reviews.freebsd.org/D4177
Submitted by: Shahar Klein <shahark@mellanox.com>
Sponsored by: Mellanox Technologies
MFC after: 3 days
Discard the unused rx_free_q. Instead, reuse inputed packets by putting
them back on the *pend* queue after reinitialization.
If tx or rx handlers are unavailable, free mbufs rather than leaking
them.
With this change, if_ntb can receive more than 100
(NTB_QP_DEF_NUM_ENTRIES) packets.
Sponsored by: EMC / Isilon Storage Division
32-bit BARs can only address memory mapped in the low 32 bits of
physical RAM. Expose this as a 'plimit' out parameter from
ntb_mw_get_range().
Fix if_ntb to allocate memory within this limit.
Sponsored by: EMC / Isilon Storage Division
Sometimes they'll read spurious values (observed: 0xc on Broadwell-DE),
failing link negotiation.
Discussed with: Dave Jiang, Allen Hubbe
Sponsored by: EMC / Isilon Storage Division
The tunable 'hw.ntb.enable_writecombine' may be set to zero to
administratively disable write combining the mapped NTB region.
Sponsored by: EMC / Isilon Storage Division
Use bus_space_write instead of (non-volatile) C pointer writes via an
iowrite32() shim in the same places as the Dual BSD/GPL Linux driver.
Update some types to fixed 32-bit sizes.
Sponsored by: EMC / Isilon Storage Division
Current Xen resume code clears all pending bitmap IPIs on resume, which is
not correct. Instead re-inject bitmap IPI vectors on resume to all CPUs in
order to acknowledge any pending bitmap IPIs.
Sponsored by: Citrix Systems R&D
MFC after: 2 weeks
Modern cards in most cases operate abstract port handles, that have no
any relation to real loop IDs. Leave loopid used only where it really
goes about local loop IDs.
While there, fix few more cases where LUNs were still printed in decimal.
It turns out on a 16550 w/ a 25MHz SoC reference clock you get a little
over 3% error at 115200 baud, which causes this to fail.
Just .. cope. Things cope these days.
Default to 30 (3.0%) as before, but allow UART_DEV_TOLERANCE_PCT to be
set at build time to change that.
- icl_cxgbei_conn is _the_ per-connection softc for iSCSI. Retire
iscsi_socket by removing all unneeded fields and moving the rest to
icl_cxgbei_conn.
- Update pdu_queue to use the new t4_push_pdus and associated mbufq in
the TOE driver. Throw away all the callbacks registered during MOD_LOAD
as t4_push_pdus doesn't need them.
- Use the mbuf allocated for the BHS header to store icl_cxgbei_pdu as
well. This eliminates the custom zone for the PDUs and reduces the
number of allocations on the fast path. For each PDU, the old code
used to allocate an icl_cxgbei_pdu, an mbuf for the BHS, and tags for
the BHS and data mbufs. The new code allocates just one mbuf per PDU.
This is convenient for another reason -- it allows t4_tom to deal with
mbufs (which it understands) instead of having to call into the iSCSI
driver.
- Remove the socket upcalls, calls to ICL_DEBUG and ICL_WARN, and all
code within ICL_KERNEL_PROXY. None of this stuff is actually used by
cxgbei, it's probably leftover copy/paste from icl_soft.
- Fold various icl_foo into icl_cxgbei_foo if the cxgbei implementation
was simply a call to the other function.
- Remove set_tcb_field and use t4_set_tcb_field that's already available
in the base driver.
- Fix connection handoff to not assume that there is only one T4/T5
adapter in the system and that's the one handling all offloaded
connections. Walk the list of adapters and match tp->t_tod with the
adapter's toedev instead. This allows multiple TOE devices of multiple
types to coexist.
- Fix connection teardown by not reaching for the inp via the toepcb but
via the socket instead. If the tid is dead in the hardware then the
inp has already been unhooked from the toepcb by t4_tom.
- Add more CTRs. The ones on the normal fast path are disabled by
default to avoid flooding the log.
- Refine pdu_append_data.
- Other miscellaneous changes.
Switch PCI register reads from using magic numbers to using the names
defined in pcireg.h
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4185
Using other values causes VMXNET3_CMD_ENABLE to fail. The Linux
driver also enforces this restriction.
Reviewed by: bryanv
MFC after: 1 week
Sponsored by: Norse
Differential Revision: https://reviews.freebsd.org/D4139
accident with RTL8168G and later chips when the interface actually was
brought up. This is due to the fact that with these MAC variants, RXDV
gate needs be disabled for WOL to work. So do just that in re_setwol()
when IFCAP_WOL is requested.
Reported and tested by: dhw
MFC after: 3 days
The code tracks a counter which is the number of events until the next
sample. On context switch in, it loads the saved counter. On context
switch out, it tries to calculate a new saved counter.
Problems:
1. The saved counter was shared by all threads in a process. However, this
means that all threads would be initially loaded with the same saved
counter. However, that could result in sampling more often than once every
X number of events.
2. The calculation to determine a new saved counter was backwards. It
added when it should have subtracted, and subtracted when it should have
added. Assume a single-threaded process with a reload count of 1000 events.
Assuming the counter on context switch in was 100 and the counter on context
switch out was 50 (meaning the thread has "consumed" 50 more events), the
code would calculate a new saved counter of 150 (instead of the proper 50).
Fix:
1. As soon as the saved counter is used to initialize a monitor for a
thread on context switch in, set the saved counter to the reload count.
That way, subsequent threads to use the saved counter will get the full
reload count, assuring we sample at least once every X number of events
(across all threads).
2. Change the calculation of the saved counter. Due to the change to the
saved counter in #1, we simply need to add (modulo the reload count) the
remaining counter time we retrieve from the CPU when a thread is context
switched out.
Differential Revision: https://reviews.freebsd.org/D4122
Approved by: gnn (mentor)
MFC after: 1 month
Sponsored by: Juniper Networks
On my own tests I see no effect from this change, but I also can't
reproduce the reported problem in general.
PR: 127391
PR: 204554
Submitted by: satz@iranger.com
MFC after: 2 weeks
Changes to the code to gather user stacks:
* Delay setting pmc_cpumask until we actually have the stack.
* When recording user stack traces, only walk the portion of the ring
that should have samples for us.
Sponsored by: Juniper Networks
Approved by: gnn (mentor)
MFC after: 1 month
Currently, there is a single pm_stalled flag that tracks whether a
performance monitor was "stalled" due to insufficent ring buffer
space for samples. However, because the same performance monitor
can run on multiple processes or threads at the same time, a single
pm_stalled flag that impacts them all seems insufficient.
In particular, you can hit corner cases where the code fails to stop
performance monitors during a context switch out, because it thinks
the performance monitor is already stopped. However, in reality,
it may be that only the monitor running on a different CPU was stalled.
This patch attempts to fix that behavior by tracking on a per-CPU basis
whether a PM desires to run and whether it is "stalled". This lets the
code make better decisions about when to stop PMs and when to try to
restart them. Ideally, we should avoid the case where the code fails
to stop a PM during a context switch out.
Sponsored by: Juniper Networks
Reviewed by: jhb
Approved by: gnn (mentor)
Differential Revision: https://reviews.freebsd.org/D4124
There is no need for the upstream and downstream addresses to be
different for the NTB configs. Go to using a single set of address. It
is still possible to configure them differently using module parameter
override however (CEM: tunable).
Authored by: Dave Jiang <dave.jiang@intel.com>
Reviewed by: Allen Hubbe <Allen.Hubbe@emc.com>
Reviewed by: Jon Mason <jdmason@kudzu.us>
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Remove the use of sbuf_data on drained sbufs from the debug sysctls:
* ixl_sysctl_hw_res_alloc
* ixl_sysctl_switch_config
This prevents a kernel panic when accessing these values under a kernel
compiled with INVARIANTS.
Sponsored by: Multiplay
Order of operations issue with the QP Num and MW count, which would
result in the receive buffer pointer being invalid if there are more
than 1 MW. Corrected with parenthesis to enforce the proper order of
operations.
Reported by: John I. Kading <John.Kading@gd-ms.com>
Reported by: Conrad Meyer <cem@FreeBSD.org>
Authored by: Jon Mason <jdmason@kudzu.us>
Obtained from: Linux (Dual BSD/GPL driver)
Sponsored by: EMC / Isilon Storage Division
Because it can sleep drainking link work callout(s). Linux (dual
BSD/GPL driver) does something very similar.
At the same time, switch the NTB CTX lock to a non-spin mutex, because
the taskqueue_swi lock can't be taken after a spin mutex.
Suggested by: Witness
Sponsored by: EMC / Isilon Storage Division
In ntb_poll_link, we are intentionally writing the link bit, which is
absent from db_valid_mask. Don't panic on a kassert when we do so.
The Linux version of this (dual BSD/GPL) driver has the db_valid_mask
assertions in callers of db_iowrite() rather than db_iowrite() itself;
it skips the assertions in the equivalent of ntb_poll_link(). Rather
than duplicating the assertions in every caller, add a db_iowrite_raw()
that doesn't check and use it from ntb_poll_link().
Suggested by: kassert_panic
Sponsored by: EMC / Isilon Storage Division
from Mellanox Technologies. The current driver supports ethernet
speeds up to and including 100 GBit/s. Infiniband support will be
done later.
The code added is not compiled by default, which will be done by a
separate commit.
Sponsored by: Mellanox Technologies
MFC after: 2 weeks
the TOE for LAN operation. It is possible to set this to other values
(cluster for networks with little loss and really tight RTTs, and wan
for relatively large RTTs and/or lossy networks) depending on the
environment in which the TOE is being used.
None of this affects plain NIC operation in any way.
MFC after: 1 week
- Split urtwn_tx_start() into urtwn_tx_data() and urtwn_tx_start()
(the last will be used for beacon updates / raw xmit path).
- Remove unneeded code from _urtwn_getbuf().
- Use CCK11 for data frames in 11b mode.
- Send EAPOL frames at 1 Mbps.
- Reduce code duplication in urtwn_tx_data().
- Fix sequence numbering.
- Add IEEE80211_RADIOTAP_F_WEP flag for encrypted frames.
- Check URTWN_RUNNING flag under lock.
Tested with RTL8188EU, STA mode.
Reviewed by: kevlo
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4017
Right now the only way to force a cold reset is:
* The HAL itself detects it's needed, or
* The sysctl, setting all resets to be cold.
Trouble is, cold resets take quite a bit longer than warm resets.
However, there are situations where a cold reset would be nice.
Specifically, after a stuck beacon, BB/MAC hang, stuck calibration results,
etc.
The vendor HAL has a separate method to set the reset reason (which is
how HAL_RESET_BBPANIC gets set) which informs the HAL during the reset path
why it occured. This is almost but not quite the same; I may eventually
unify both approaches in the future.
This commit just extends HAL_RESET_TYPE to include both status (eg BBPANIC)
and type (eg do COLD.) None of the HAL code uses it yet though; that'll
come later.
It also is a big no-op in each HAL - I need to go teach each of the HALs
about cold/warm reset through this path.
Using unmapped IO is really beneficial when running inside of a VM,
since it avoids IPIs to other vCPUs in order to invalidate the
mappings.
This patch adds unmapped IO support to blkfront. The following tests
results have been obtained when running on a Xen host without HAP:
PVHVM
3165.84 real 6354.17 user 4483.32 sys
PVHVM with unmapped IO
2099.46 real 4624.52 user 2967.38 sys
This is because when running using shadow page tables TLB flushes and
range invalidations are much more expensive, so using unmapped IO
provides a very important performance boost.
Sponsored by: Citrix Systems R&D
MFC after: 2 weeks
X-MFC-with: r290610
dev/xen/blkfront/blkfront.c:
- Add and announce support for unmapped IO.
before their initial configuration is done, it turns out that r281337
has the inverse effect on some older chips. Moreover, as with newer
chips before, two chips seemingly identical according to their MAC
revisions may behave differently in this regard, with most working
but a few not, making changes extremely hard to test.
Closer inspection of the corresponding Linux code suggests that RX
and TX should only be enabled after their initial configuration with
RTL8168G and later chips, i. e. RTL8106E{,US}, RTL8107E, as well as
RTL8168{EP,G,GU,H}, so limit the new code path to these. [1]
- Distinguish between RTL8168H and RTL8107E, with the latter being the
10/100-Mbit/s-only variant of the former.
- For MAC variants that can only do Fast Ethernet at a maximum, ensure
that we don't advertise Gigabit Ethernet speed.
- In re_stop(), do the inverse of re_init_locked() and enable RXDV
gate on RTL8168G and later chips again, matching what Linux does.
PR: 203422 [1]
MFC after: 1 week
- Filter out unneeded frames in STA mode.
- Implement ic_promisc() call.
Tested with RTL8188EU, STA and MONITOR modes.
Reviewed by: kevlo
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D3999
learned that the Power8 and the PS3 have a mix of OFW and FDT. Both have AIM
defined. But currently they are not affected. They have no I2C devices under
OFW.
This version was tested on a Quad G5 and build tested for armv6*.
Discussed with nwhitehorn@
Reviewed by: ian@
32-bits aligned. Merge the two bounce buffers into a single one. Some
rough tests showed that the DWC OTG throughput on RPI2 increased by
10% after this patch.
MFC after: 1 week
The mbuf length fields must be set before m_adj() is called else
m_adj() will not always adjust the mbuf and an unaligned read
exception can trigger inside the network stack. This can happen on
platforms where unaligned reads are not supported. Adjust a length
check to include the 2-byte ethernet alignment while at it.
MFC after: 3 days
- Drop TSF initialization; device can discover it without our help.
- Do not touch R92C_BCN_CTRL_EN_BCN bit in STA mode.
- Add 'static' keyword for function definition.
Tested with RTL8188EU, STA mode.
Reviewed by: kevlo
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D3801
- Fix mbuf leaks in iwn_raw_xmit() and iwn_xmit_task()
(regression since r288178).
- Check IWN_FLAG_RUNNING flag under lock.
- Remove m->m_pkthdr.rcvif initialization (fixed in r283994).
- Enclose some values in return statements into parentheses.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4069
attributes when replying to a TLP from a Root Port. As a workaround,
disable No Snoop and Relaxed Ordering in the Root Port of each T5 adapter
during attach so that CPU-initiated requests do not contain these flags.
Note that this affects CPU-initiated requests to all devices under this
root port.
Reviewed by: np
MFC after: 1 week
Sponsored by: Chelsio
PCI-Express capability registers (that is, PCI config registers in the
standard PCI config space belonging to the PCI-Express capability
register set).
Note that all of the current PCI-e registers are either 16 or 32-bits,
so only widths of 2 or 4 bytes are supported.
Reviewed by: imp
MFC after: 1 week
Sponsored by: Chelsio
Differential Revision: https://reviews.freebsd.org/D4088
Since r288350, ic_wme_task() is called via ieee80211_runtask(),
so, any additional deferring from the driver side is not needed.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4072
- There is no reason to have a special case for iSCSI in t4_rcvd. Either there
is data in the socket buffer (from when the connection was plain TOE, before
being promoted to ulp_mode iSCSI) and sbused _should_ be taken into account,
or sbused is 0 and doesn't affect the calculation of rx_credits.
- write_tx_wr doesn't need special handling for iSCSI either. Its caller should
specify the ulp_submode.
- Replace t4_ulp_push_frames with t4_push_pdus that can deal with PDUs in an
mbufq hanging off the toepcb. This eliminates the "backwards" calls from
t4_tom's tx into the iSCSI driver.
- The iSCSI driver installs a handler for RX_ISCSI_DDP already and the iSCSI
handler for RX_DATA_DDP is identical to the one for RX_ISCSI_DDP. Take
advantage of this to eliminate the last remaining "backwards" call from
do_rx_data_ddp into the iSCSI driver.
- Eliminate the CXGBE_ISCSI_MBUF_TAG abomination.
- For tx, it makes no sense to allocate an mbuf tag just to stash 2 bits worth
of information. Use a spare byte from the mbuf header instead.
- For rx, the per-connection ulpcb is a more natural place to keep information
about the PDU currently being assembled.
to transmit the buffer.
ath_tx_start() may manipulate/reallocate the mbuf as part of the DMA
code, so we can't expect the mbuf can be returned back to the caller.
Now, the net80211 ifnet work changed the semantics slightly so
if an error is returned here, the mbuf/reference is freed by the
caller (here, it's net80211.)
So, once we reach ath_tx_start(), we never return failure. If we fail
then we still return OK and we free the mbuf/noderef ourselves, and
we increment OERRORS.
Certain invalid operations trigger hardware error conditions. Error
conditions that only halt one channel can be detected and recovered by
resetting the channel. Error conditions that halt the whole device are
generally not recoverable.
Add a sysctl to inject channel-fatal HW errors,
'dev.ioat.<N>.force_hw_error=1'.
When a halt due to a channel error is detected, ioat(4) blocks new
operations from being queued on the channel, completes any outstanding
operations with an error status, and resets the channel before allowing
new operations to be queued again.
Update ioat.4 to document error recovery; document blockfill introduced
in r290021 while we are here; document ioat_put_dmaengine() added in
r289907; document DMA_NO_WAIT added in r289982.
Sponsored by: EMC / Isilon Storage Division
Fixes race condition observed under following circumstances:
1) I/O split on 128KB boundary with Intel NVMe controller.
Current Intel controllers produce better latency when
I/Os do not span a 128KB boundary - even if the I/O size
itself is less than 128KB.
2) Per-CPU I/O queues are enabled.
3) Child I/Os are submitted on different submission queues.
4) Interrupts for child I/O completions occur almost
simultaneously.
5) ithread for child I/O A increments bio_inbed, then
immediately is preempted (rendezvous IPI, higher priority
interrupt).
6) ithread for child I/O B increments bio_inbed, then completes
parent bio since all children are now completed.
7) parent bio is freed, and immediately reallocated for a VFS
or gpart bio (including setting bio_children to 1 and
clearing bio_driver1).
8) ithread for child I/O A resumes processing. bio_children
for what it thinks is the parent bio is set to 1, so it
thinks it needs to complete the parent bio.
Result is either calling a NULL callback function, or double freeing
the bio to its uma zone.
PR: 203746
Reported by: Drew Gallatin <gallatin@netflix.com>,
Marc Goroff <mgoroff@quorum.net>
Tested by: Drew Gallatin <gallatin@netflix.com>
MFC after: 3 days
Sponsored by: Intel
status registers for every interrupt. Check a common host channel
status interrupt register first, then conditionally read the
individual host channel status registers.
Submitted by: Sebastian Huber <sebastian.huber@embedded-brains.de>
MFC after: 1 week
- Move all files related to the LinuxKPI into sys/compat/linuxkpi and
its subfolders.
- Update sys/conf/files and some Makefiles to use new file locations.
- Added description of COMPAT_LINUXKPI to sys/conf/NOTES which in turn
adds the LinuxKPI to all LINT builds.
- The LinuxKPI can be added to the kernel by setting the
COMPAT_LINUXKPI option. The OFED kernel option no longer builds the
LinuxKPI into the kernel. This was done to keep the build rules for
the LinuxKPI in sys/conf/files simple.
- Extend the LinuxKPI module to include support for USB by moving the
Linux USB compat from usb.ko to linuxkpi.ko.
- Bump the FreeBSD_version.
- A universe kernel build has been done.
Reviewed by: np @ (cxgb and cxgbe related changes only)
Sponsored by: Mellanox Technologies
AMD64 pmap assumes ranges will be in the DMAP, which isn't necessarily
true for NTB memory windows (especially 64-bit BARs).
Suggested by: pmap_change_attr_locked -> kassert_panic
Sponsored by: EMC / Isilon Storage Division
Allows DMA from/to arbitrary KVA or physical address. /dev/ioat_test
must be enabled by root and is only R/W root, so this is approximately
as dangerous as /dev/mem and /dev/kmem.
Sponsored by: EMC / Isilon Storage Division
For the most of chips (except anscient ones) port handlers have no relation
to port IDs. In such situation old code scanning first 125 handlers was
quite naive. Instead of doing that, send to chip single request to get full
list of port handlers available on specific virtual port and scan only them.
Old code had problems with case of several virtual ports enabled, when port
handlers allocated from global address space could easily go above 125.
This change was successfully tested on 23xx, 24xx and 25xx chips in loop
mode with 4 virtual initiator ports, each seing 50 virtual target ports.
* refactor out the rx filter and operating mode code into a separate
method.
* add some comments about what's left with setting the operating mode
based on what carl9170 does.
* comment out some init from otus_init_mac() - it's no longer needed as
it's always init'ed now.
* add debugging and a missing return around a failure to call m_get2() -
during monitor mode operation I found RXing of frames > 2k, which
fails allocation. I'm sure they're valid (it's configuring 11n RX and
receiving 11n frames even though the driver doesn't "do" 11n)
and may be A-MSDU; but allocations fail and we should handle that
gracefully.
Tested:
* UB82 reference NIC (AR9170 + AR9104 2x2 dual band NIC); STA and
monitor mode operation.
The IOAT hardware supports writing a 64-bit pattern to some destination
buffer. The same limitations on buffer length apply as for copy
operations. Throughput is a bit higher (probably because fill does not
have to spend bandwidth reading from a source in memory).
Support for testing Block Fill has been added to ioatcontrol(8) and the
ioat_test device. ioatcontrol(8) accepts the '-f' flag, which tests
Block Fill. (If the flag is omitted, the tool tests copy by default.)
The '-V' flag, in conjunction with '-f', verifies that buffers are
filled in the expected pattern.
Tested on: Broadwell DE (Xeon D-1500)
Sponsored by: EMC / Isilon Storage Division
Add generic hw descriptor struct and generic control flags struct, in
preparation for other kinds of IOAT operation.
Sponsored by: EMC / Isilon Storage Division
Now on 24xx and above chips it is really possible to simulate several
virtual FC ports with single physical one. For example, it allows to
configure several targets in ctl.conf, assign each of them to separate
virtual port, and let user to control access to them with switch zoning.
I still doubt that all problems are solved there, but at now it passes
at least basic tests.
The new load_ma implementation can cause dereferences when used with
certain drivers, back it out until the reason is found:
Fatal trap 12: page fault while in kernel mode
cpuid = 11; apic id = 03
fault virtual address = 0x30
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff808a2d22
stack pointer = 0x28:0xfffffe07cc737710
frame pointer = 0x28:0xfffffe07cc737790
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 13 (g_down)
trap number = 12
panic: page fault
cpuid = 11
KDB: stack backtrace:
#0 0xffffffff80641647 at kdb_backtrace+0x67
#1 0xffffffff80606762 at vpanic+0x182
#2 0xffffffff806067e3 at panic+0x43
#3 0xffffffff8084eef1 at trap_fatal+0x351
#4 0xffffffff8084f0e4 at trap_pfault+0x1e4
#5 0xffffffff8084e82f at trap+0x4bf
#6 0xffffffff80830d57 at calltrap+0x8
#7 0xffffffff8063beab at _bus_dmamap_load_ccb+0x1fb
#8 0xffffffff8063bc51 at bus_dmamap_load_ccb+0x91
#9 0xffffffff8042dcad at ata_dmaload+0x11d
#10 0xffffffff8042df7e at ata_begin_transaction+0x7e
#11 0xffffffff8042c18e at ataaction+0x9ce
#12 0xffffffff802a220f at xpt_run_devq+0x5bf
#13 0xffffffff802a17ad at xpt_action_default+0x94d
#14 0xffffffff802c0024 at adastart+0x8b4
#15 0xffffffff802a2e93 at xpt_run_allocq+0x193
#16 0xffffffff802c0735 at adastrategy+0xf5
#17 0xffffffff80554206 at g_disk_start+0x426
Uptime: 2m29s
Add a new flag for DMA operations, DMA_NO_WAIT. It behaves much like
other NOWAIT flags -- if queueing an operation would sleep, abort and
return NULL instead.
When growing the internal descriptor ring, the memory allocation is
performed outside of all locks. A lock-protected flag is used to avoid
duplicated work. Threads that cannot sleep and attempt to queue
operations when the descriptor ring is full allocate a larger ring with
M_NOWAIT, or bail if that fails.
ioat_reserve_space() could become an external API if is important to
callers that they have room for a sequence of operations, or that those
operations succeed each other directly in the hardware ring.
This patch splits the internal head index (->head) from the hardware's
head-of-chain (DMACOUNT) register (->hw_head). In the future, for
simplicity's sake, we could drop the 'ring' array entirely and just use
a linked list (with head and tail pointers rather than indices).
Suggested by: Witness
Sponsored by: EMC / Isilon Storage Division
Assertion used here was invalid. If current thread helds any of locks,
we never want to recurse on them.
Obtained from: Semihalf
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Differential revision: https://reviews.freebsd.org/D3903
Add e6000sw driver supporting Marvell 88E6352, 88E6172, 88E6176 switches.
It needs to be attached to mdio interface, exporting SMI access
functionality. e6000sw supports port-based VLAN configuration, per-port
media changing, accessing PHY and switch registers.
e6000sw attaches miibuses and PHY drivers as children. Instead of typical
tick as callout, kthread-based tick is used. This combined with SX locks
allows MDIO read/write calls to sleep. It is expected, because this
hardware requires long delays in SMI read/write procedures, which can not
be handled by busy-waiting.
Reviewed by: adrian
Obtained from: Semihalf
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Differential revision: https://reviews.freebsd.org/D3902
This commit introduces support for etherswitch devices that utilize SMI as
a way of accessing its registers. SMI register is located in address space
of mge -- access to it was exported through MDIO interface.
Attachment functions were enhanced so as to ensure proper initialisation
in both cases: 1) PHYs attached directly to mge, 2) PHYs attached to
switch device and switch attached to mge. Attachment of etherswitch device
depends on dts entry with compatible="mrvl,sw" property. If none is found,
typical PHY attachment procedure follows.
In case of switch attached, PHYs' status and configuration is accessible
via etherswitchcfg, and ifconfig shows always-up, non-configurable mge
interfaces.
Due to the fact that there may be simultaneous accessess to SMI
registers (e.g. from PHY attached to one of mge instances and switch
to the other), SMI access interlock was added. It is SX lock,
because sleep ability is necessary -- busy-waiting would result
in poor performance due to long delays required by hardware.
Underlying switch driver is obliged to use sleepable locks as well.
Reviewed by: adrian
Obtained from: Semihalf
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Differential revision: https://reviews.freebsd.org/D3900
We need to reset the chancmp and chainaddr MMIO registers to bring the
device back to a working state.
Name the chanerr bits while we're here.
Sponsored by: EMC / Isilon Storage Division
We only need to borrow a mutex for the drain sleep and the 0->1
transition, so just reuse an existing one for now.
The wchan is arbitrary. Using refcount itself would have required
__DEVOLATILE(), so use the lock's address instead.
Different uses are tagged by kind, although we only do anything with
that information in INVARIANTS builds.
Sponsored by: EMC / Isilon Storage Division
Callers should have acquired this lock when they invoked ioat_acquire()
before issuing operations. Assert it is held.
Sponsored by: EMC / Isilon Storage Division
This is still the worst possible way to allocate memory if it will ever
be under pressure, but at least it won't deadlock.
Suggested by: WITNESS
Sponsored by: EMC / Isilon Storage Division
Pull out the timer callout delay into IOAT_INTR_TIMO and shorten it
considerably (5s -> 100ms). Single operations do not take 5-10 seconds
and when interrupts aren't working, waiting 100ms sucks a lot less than
5s.
Sponsored by: EMC / Isilon Storage Division
Now 24xx and above chips support full 8-byte LUN address space.
Older FC chips may support up to 16K LUNs when firmware allows.
Tested in both initiator and target modes for 23xx, 24xx and 25xx.
This change allows to decode respective functions in isp(4) in target mode
and pass them through CAM to CTL. Unfortunately neither CAM nor isp(4)
support returning response info for those task management functions now.
On the other side I just have no initiator to test this functionality.
Using unmapped IO is really beneficial when running inside of a VM,
since it avoids IPIs to other vCPUs in order to invalidate the
mappings.
This patch adds unmapped IO support to blkfront. The following tests
results have been obtained when running on a Xen host without HAP:
PVHVM
3165.84 real 6354.17 user 4483.32 sys
PVHVM with unmapped IO
2099.46 real 4624.52 user 2967.38 sys
This is because when running using shadow page tables TLB flushes and
range invalidations are much more expensive, so using unmapped IO
provides a very important performance boost.
Sponsored by: Citrix Systems R&D
MFC after: 2 weeks
X-MFC-with: r289834
FC port database code already notifies CAM about all devices. Additional
full scan is just a waste of time, that by definition won't find anything
that is not present in port database.
registers (they are used for different purposes).
- Wrap R92C_MSR modifications into urtwn_set_mode().
Reviewed by: kevlo
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D3838
* Add a comment about the parameters I should support, stolen shamelessly
from iwn(4);
* Implement the rate bit for the raw transmit path;
* Print out the host-order versions of each of the transmit bits, so
I have a hope in heck of debugging why things are going wrong.
This still doesn't fix 5GHz in the office but that's likely due to a lot
of other configuration parameters being 2GHz-specific. That'll come next.
Tested:
* AR9170 + AR9103 (2/5GHz) 2x2, 5GHz association