So that:
- TCP/IP stack will not do unnecessary IP header checksum for TSO
packets.
- Reduce guest load for non-TSO IP packets.
Reviewed by: adrian
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5099
- For non-TSO offloading, we don't need to access mbuf to know
which csum offloading is requested, we can just use the
CSUM_{IP,TCP,UDP} in the csum_flags.
- For TSO offloading, we still can depend on CSUM_{TSO4,TSO6}
in the csum_flags to tell whether the TSO packet is an IPv4
TSO packet or an IPv6 TSO packet.
This streamlines csum offloading handling (remove the two goto)
and allows us the nuke the unnecessary get_transport_proto_type().
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5098
- Record csum features in softc, so we don't need to duplicate the
logic from attach path to ioctl path.
- Protect if_capenable and if_hwassist changes by main lock.
- Prefer turn on/off bits in if_hwassist explicitly instead of using
XOR.
Reviewed by: adrian, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
MFC after: 1 week
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5085
a mips big-endian board.
This is (hopefully! ish!) a temporary change until a slightly better way
can be found to express this without a config option.
Tested:
* BUFFALO WZR-HP-G300NH 1stGen (by submitter)
Submitted by: Mori Hiroki <yamori813@yahoo.co.jp>
Add #defines for ATA_WRITE_UNCORRECTABLE48 and its features. Update the
decoding in ATACAM to recognize the new values. Also improve command
decoding for a few other commands (SMART, NOP, SET_FEATURES). Bring the
decoding in ata(4) up to parity with ATACAM.
Reviewed by: mav, imp
MFC after: 1 month
Sponsored by: Panasas, Inc.
Differential Revision: https://reviews.freebsd.org/D5181
also for 82598, which doesn't support it.
The legacy code has a check for it, which was missed when the code for dealing with
CSUM_IP6_* was added. Add the same check for FreeBSD 10 and higher.
Differential Revision: https://reviews.freebsd.org/D5192
MD_ROOT_SIZE and embed_mfs.sh were basically retired as part of
https://reviews.freebsd.org/D2903 .
However, when building a kernel with 'options MD_ROOT_SIZE' specified, this
results in a non-working MFS, as within sys/dev/md/md.c we fall within the
wrong # ifdef.
This patch implements the following:
* Allow kernels to be built without the MD_ROOT_SIZE option, which results
in a kernel built as per D2903.
* Allow kernels to be built with the MD_ROOT_SIZE option, which results
in a kernel built similarly to the pre-D2903 way, with the following
differences:
* The MFS is now put in a separate section within the kernel (oldmfs,
so it differs from the mfs section introduced by D2903).
* embed_mfs.sh is changed, so it looks up the oldmfs section within the
kernel, gets its size and offset, sees if the MFS will fit within the
allocated oldmfs section and only if all is well does a dd of the MFS
image into the kernel.
Submitted by: Stanislav Galabov <sgalabov@gmail.com>
Reviewed by: brooks, imp
Differential Revision: https://reviews.freebsd.org/D5093
causes watchdog timeouts when using TSO4 at link speeds below
Gigabit, at least with 82573E. So disable the assist automatically
when at lower speeds.
Submitted by: jfv
Approved by: erj
Obtained from: D3162
MFC after: 3 days
Fix ixgbe reporting of flow control autoneg when running under DBG 1
Reviewed by: erj
MFC after: 2 days
Sponsored by: Multiplay
Differential Revision: https://reviews.freebsd.org/D5089
Setup phy and gbic power as per Linux 4.3.13 driver.
This fixes link not detected on X540-AT2 after booting to Linux which turns
the phy power off on detach.
Reviewed by: sbruno
MFC after: 2 days
Sponsored by: Multiplay
Differential Revision: https://reviews.freebsd.org/D5107
Fix the inverted set of interrupts being used as the mask for ixl.
Without this ixl devices fail to detect link state changes.
Reviewed by: erj, sbruno
MFC after: 2 days
Sponsored by: Multiplay
This is the final step required allowing to compile and to run RISC-V
kernel and userland from HEAD.
RISC-V is a completely open ISA that is freely available to academia
and industry.
Thanks to all the people involved! Special thanks to Andrew Turner,
David Chisnall, Ed Maste, Konstantin Belousov, John Baldwin and
Arun Thomas for their help.
Thanks to Robert Watson for organizing this project.
This project sponsored by UK Higher Education Innovation Fund (HEIF5) and
DARPA CTSRD project at the University of Cambridge Computer Laboratory.
FreeBSD/RISC-V project home: https://wiki.freebsd.org/riscv
Reviewed by: andrew, emaste, kib
Relnotes: Yes
Sponsored by: DARPA, AFRL
Sponsored by: HEIF5
Differential Revision: https://reviews.freebsd.org/D4982
The process is not held since the process_exit hook is called after the
exithold. There is no need to hold the process since the hook will
always see it exiting via the process_exit event.
MFC after: 2 weeks
Sponsored by: EMC / Isilon Storage Division
The NVMe specification does not define a maximum or optimal delete
size, so technically max delete size is min(full size of namespace,
2^32 - 1 LBAs). A single delete operation for a multi-TB NVMe
namespace though may take much longer to complete than the nvme(4)
I/O timeout period. So choose a sensible default here that is still
suitably large to minimize the number of overall delete operations.
This also fixes possible uint32_t overflow on initial TRIM operation
for zpool create operations for NVMe namespaces with >4G LBAs.
MFC after: 3 days
Sponsored by: Intel
ofw_bus_get_node() must be tested against negative values since
missing parent bus method will result in calling the default method
which simply returns (-1): sys/dev/ofw/ofw_bus_if.m
This was lost in the review process.
Obtained from: Semihalf
Sponsored by: Cavium
Some firmware revisions provide different DTB tree that include
odd MDIO placement in the tree.
This commit adds support for 2 new buses:
- MRML bridge (PCIB subordinate)
- MDIO nexus (MRML subordinate)
This allows for the correct MDIO attachment with both - new and old
firmware.
Obtained from: Semihalf
Sponsored by: Cavium
Differential Revision: https://reviews.freebsd.org/D5070
Search for BGX node in DTS in two ways:
1. Try to find it uder root node first
2. If not found under root, find the top level PCI bridge node
and search all nodes below it until appropriate BGX node is found.
Move search code to another function to make the code more clear.
Remove unused variable by the way.
Reviewed by: wma
Obtained from: Semihalf
Sponsored by: Cavium
Differential Revision: https://reviews.freebsd.org/D5066
Use driver settable callbacks for handling of:
- core post reset
- reading actual port speed
Typically, OTG enabled EHCI cores wants setting of USBMODE register,
but this register is not defined in EHCI specification and different
cores can have it on different offset.
Also, for cores with TT extension, actual port speed must be determinable.
But again, EHCI specification not covers this so this patch provides
function for two most common variant of speed bits layout.
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D5088
This fixes some cases where a process could exit without being untracked
by filemon.
Reported by: mjg
MFC after: 2 weeks
Sponsored by: EMC / Isilon Storage Division
It only prints the header from filemon_ioctl. Keep the name though to stay
closer to other implementations.
MFC after: 2 weeks
Sponsored by: EMC / Isilon Storage Division
to be 64-bit on 32-bit architectures. It is not uncommon for device trees
to use the upper 32-bits to store what effectively is an index into the
parent ranges property. In this case, when running with a 32-bit bus_addr_t
and bus_size_t, we would previously truncate the address, this may then
incorrectly match the wrong range, and return the wrong address.
Tested by: bz (earlier version)
- Use taskqueue instead of swi for event handling.
- Scan the interrupt flags in filter
- Disable ringbuffer interrupt mask in filter to ensure no unnecessary
interrupts.
Submitted by: Jun Su <junsu microsoft com>
Reviewed by: adrian, sephe, Dexuan <decui microsoft com>
Approved by: adrian (mentor)
MFC after: 2 weeks
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4920
Summary:
Migrate to using the semi-opaque type rman_res_t to specify rman resources. For
now, this is still compatible with u_long.
This is step one in migrating rman to use uintmax_t for resources instead of
u_long.
Going forward, this could feasibly be used to specify architecture-specific
definitions of resource ranges, rather than baking a specific integer type into
the API.
This change has been broken out to facilitate MFC'ing drivers back to 10 without
breaking ABI.
Reviewed By: jhb
Sponsored by: Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D5075
- Start vap(s) (via ieee80211_start_all()) only when initialization
succeeds; stop the first vap otherwise (via ieee80211_stop());
- Do not try to stop a device multiple times
(move (sc->sc_flags & RTWN_RUNNING) check to urtwn_stop_locked()).
Tested by: kevlo
Reviewed by: kevlo
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D5058
Added hw.ix.flow_control which enables the default flow_control of all ix
interfaces to be set in loader.conf.
Added hw.ix.advertise_speed which enables the default advertised_speed of
all ix interfaces to be set in loader.conf.
Made enable_aim device independent based on hw.ix.enable_aim default.
Reviewed by: erj
MFC after: 1 week
Sponsored by: Multiplay
Differential Revision: https://reviews.freebsd.org/D5060
- Avoid main lock contention by trylock for if_start, if that fails,
schedule TX taskqueue for if_start
- Don't do direct sending if the packet to be sent is large, e.g.
TSO packet.
This change gives me stable 9.1Gbps TCP sending performance w/ TSO
over a 10Gbe directly connected network (the performance fluctuated
between 4Gbps and 9Gbps before this commit). It also improves non-
TSO TCP sending performance a lot.
Reviewed by: adrian, royger
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D5074
Currently when the OF_getprop() function returns with error,
the caller (OF_getencprop()) still changes the buffer endiannes.
This may destroy the default value passed in the input buffer if
used on a Little Endian platform.
Reviewed by: mmel
Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Obtained from: Semihalf
Sponsored by: Cavium
The page information array could contain up to 32 elements (i.e. 512B).
And on network side w/ TSO, 11+ (176B+) elements, i.e. ~44K TSO packet,
in the page information array is quite common.
This saves us some cpu cycles.
Reviewed by: adrian, delphij
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4992
According to all available information, VMSWITCH always does the
TCP segment checksum verification before sending the segment to
guest.
Reviewed by: adrian, delphij, Hongjiang Zhang <honzhan microsoft com>
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4991
All used fields are setup one by one, so there is no need to zero
out this large struct.
While I'm here, move the stack variable near its usage.
Reviewed by: adrian, delphij, Jun Su <junsu microsoft com>
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4978
While I'm here, move stack variables near their usage.
Reviewed by: adrian, delphij, Jun Su <junsu microsoft com>
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4977
- Avoid unnecessary malloc/free on transmission path.
- busdma(9)-fy transmission path.
- Properly handle IFF_DRV_OACTIVE. This should fix the network
stalls reported by many.
- Properly setup TSO parameters.
- Properly handle bpf(4) tapping. This 5 times the performance
during TCP sending test, when there is one bpf(4) attached.
- Allow size of chimney sending be tuned on a running system.
Default value still needs more test to determine.
Reviewed by: adrian, delphij
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4972
support frameworks (i.e. regulators/phy/tsensors/fuses...).
It provides simple unified consumers interface for manipulations with
on-chip resets.
Reviewed by: ian, imp (paritaly)
support frameworks(i.e. reset/regulators/phy/tsensors/fuses...).
The clock framework significantly simplifies handling of complex clock
structures found in modern SoCs. It provides the unified consumers
interface, holds and manages actual clock topology, frequency and gating.
It's tested on three different ARM boards (Nvidia Tegra TK1, Inforce 6410 and
Odroid XU2) and on one MIPS board (Creator Ci20) by kan@.
The framework is still far from perfect and probably doesn't have stable
interface yet, but we want to start testing it on more real boards and
different architectures.
Reviewed by: ian, kan (earlier version)
The iWARP Connection Manager (CM) on FreeBSD creates a TCP socket to
represent an iWARP endpoint when the connection is over TCP. For
servers the current approach is to invoke create_listen callback for
each iWARP RNIC registered with the CM. This doesn't work too well for
INADDR_ANY because a listen on any TCP socket already notifies all
hardware TOEs/RNICs of the new listener. This patch fixes the server
side of things for FreeBSD. We've tried to keep all these modifications
in the iWARP/TCP specific parts of the OFED infrastructure as much as
possible.
Submitted by: Krishnamraju Eraparaju @ Chelsio (with design inputs from Steve Wise)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D4801
When ifconfig sets media then the values displayed by the advertise_speed
value are invalidated.
Fix this by setting the bits correctly including setting advertise to 0 for
media = auto.
Reviewed by: sbruno
MFC after: 1 week
Sponsored by: Multiplay
Differential Revision: https://reviews.freebsd.org/D5034
Windows 10 and Window 2016 will return all zero inquiry data for
non-existing slots. If we dispatched them, then a lot of useless
(0 sized) disks would be created. So we verify the returned inquiry
data (valid type, non-empty vendor/product/revision etc.), before
further dispatching.
Minor white space cleanup and wording fix.
Submitted by: Hongjiang Zhang <honzhan microsoft com>
Reviewed by: adrian, sephe, Jun Su <junsu microsoft com>
Approved by: adrian (mentor)
Modified by: sephe
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4928
Vmbus event handler will need to find the channel by its relative
id, when software interrupt for event happens. The original lookup
searches the channel list, which is not very efficient. We now
create a table indexed by the channel relative id to speed up
the channel lookup.
Submitted by: Hongjiang Zhang <honzhan microsoft com>
Reviewed by: delphij, adrain, sephe, Dexuan Cui <decui microsoft com>
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4802
This teaches the mx25l driver (sys/dev/flash/mx25l.c) to interact with
sys/dev/fdt/fdt_slicer.c and sys/geom/geom_flashmap.c.
This allows systems with SPI flash to benefit from the possibility to define
flash 'slices' via FDT, just the same way that it's currently possible for
CFI and NAND flashes.
Tested:
* Carambola 2, AR9331 + SPI NOR flash
PR: kern/206227
Submitted by: Stanislav Galabov <sgalabov@gmail.com>
use of fdt_fixup_table on PowerPC and ARM. As such we can remove it from
other architectures as it's unneeded.
Reviewed by: nwhitehorn
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D5013
a) Look for the CPL in the payload buffer instead of the descriptor.
b) Retrieve the socket associated with the tid with the inpcb lock held.
Submitted by: Krishnamraju Eraparaju @ Chelsio
Redo LC calibration if temperature changed significantly since last
calibration.
Tested with RTL8188EU/RTL8188CUS in STA mode.
Reviewed by: kevlo
Approved by: adrian (mentor)
Obtained from: NetBSD (mostly)
Differential Revision: https://reviews.freebsd.org/D4966
- Use bitmap for debug output selection.
- Add few new messages (one for URTWN_DEBUG_BEACON
and another one for URTWN_DEBUG_INTR).
- Replace an undocumented URTWN_DEBUG definition with USB_DEBUG.
Tested with RTL8188EU / RTL8188CUS in IBSS / HOSTAP modes.
Reviewed by: kevlo
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4959
(by default it was set to 9us).
Tested with RTL8188EU / RTL8188CUS in STA mode.
Reviewed by: kevlo
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4535
Add support for multiple TX and RX queue pairs. The default number of queues
is set to 4, but can be easily changed from the sysctl node hw.xn.num_queues.
Also heavily refactor netfront driver: break out a bunch of helper
functions and different structures. Use threads to handle TX and RX.
Remove some dead code and fix quite a few bugs as I go along.
Submitted by: Wei Liu <wei.liu2@citrix.com>
Reviewed by: royger
Sponsored by: Citrix Systems R&D
Relnotes: Yes
Differential Revision: https://reviews.freebsd.org/D4193
mv_pci driver omitted slot 0, which can be valid device on Armada38x.
New mechanism detects if device is root link, basing on vendor's
and device's IDs.
It is restricted to Armada38x; on other machines, behaviour remains
the same.
Reviewed by: andrew
Obtained from: Semihalf
Sponsored by: Stormshield
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Differential revision: https://reviews.freebsd.org/D4377
Driver was modified to ensure it attaches properly to "marvell,orion-ehci"
node, which doesn't have error interrupt line defined. Neccessary
ofw_compat_data struct was added and probe procedure was altered.
Reviewed by: andrew, ian
Obtained from: Semihalf
Sponsored by: Stormshield
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Differential revision: https://reviews.freebsd.org/D4369
uart_dev_ns8250 now relies on compatible property instead of additional
'busy-detect' cell. All drivers with compatible = "snps,dw-apb-uart" have
busy detection turned on. DTS files of devices affected by the change
were modified and 'busy-detect' property was removed.
Reviewed by: andrew, ian, imp
Obtained from: Semihalf
Sponsored by: Stormshield
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Differential revision: https://reviews.freebsd.org/D4218
This compatibility string is used in .dts file of Armada38x
and isrequired for driver attachment.
Reviewed by: andrew, ian, imp
Obtained from: Semihalf
Sponsored by: Stormshield
Submitted by: Michal Stanek <mst@semihalf.com>
Differential revision: https://reviews.freebsd.org/D4216
Strict compatibility requirement is a root of problems when simplebus'
node has two compatibility strings (i.e. on Armada38x). Removing this
requirement should not interfere with other platforms.
fdt_is_compatible_strict() and fdt_find_compatible() calls were changed
in fdt_common.c and mv_common.c.
Reviewed by: ian, imp
Obtained from: Semihalf
Sponsored by: Stormshield
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Differential revision: https://reviews.freebsd.org/D4602
New annotation coming into use in the common code. Support its use by
adding null macro definitions for non-Windows platforms.
Submitted by: Richard Houldsworth <rhouldsworth at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
X-MFC with: r293901
to actually wait until the TX FIFOs of UARTs have be drained before
returning. This is done by bringing the equivalent of the TS_BUSY flag
found in the previous implementation back in an ABI-preserving way.
Reported and tested by: Patrick Powell
Most likely, drivers for USB-serial-adapters likewise incorporating
TX FIFOs as well as other terminal devices that buffer output in some
form should also provide implementations of tsw_busy.
MFC after: 3 days
- Add optimizing LRO wrapper which pre-sorts all incoming packets
according to the hash type and flowid. This prevents exhaustion of
the LRO entries due to too many connections at the same time.
Testing using a larger number of higher bandwidth TCP connections
showed that the incoming ACK packet aggregation rate increased from
~1.3:1 to almost 3:1. Another test showed that for a number of TCP
connections greater than 16 per hardware receive ring, where 8 TCP
connections was the LRO active entry limit, there was a significant
improvement in throughput due to being able to fully aggregate more
than 8 TCP stream. For very few very high bandwidth TCP streams, the
optimizing LRO wrapper will add CPU usage instead of reducing CPU
usage. This is expected. Network drivers which want to use the
optimizing LRO wrapper needs to call "tcp_lro_queue_mbuf()" instead
of "tcp_lro_rx()" and "tcp_lro_flush_all()" instead of
"tcp_lro_flush()". Further the LRO control structure must be
initialized using "tcp_lro_init_args()" passing a non-zero number
into the "lro_mbufs" argument.
- Make LRO statistics 64-bit. Previously 32-bit integers were used for
statistics which can be prone to wrap-around. Fix this while at it
and update all SYSCTL's which expose LRO statistics.
- Ensure all data is freed when destroying a LRO control structures,
especially leftover LRO entries.
- Reduce number of memory allocations needed when setting up a LRO
control structure by precomputing the total amount of memory needed.
- Add own memory allocation counter for LRO.
- Bump the FreeBSD version to force recompilation of all KLDs due to
change of the LRO control structure size.
Sponsored by: Mellanox Technologies
Reviewed by: gallatin, sbruno, rrs, gnn, transport
Tested by: Netflix
Differential Revision: https://reviews.freebsd.org/D4914
after changing the HW LRO sysctl when previously in up state.
Reviewed by: gnn
Sponsored by: Mellanox Technologies
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D4941
Ensure that checksum flags and L3/L4 fields are ignored
if lower level errors are reported in the event.
Remove checks for CRC0_ERR (bad iSCSI header CRC) and
CRC1_ERR (bad iSCSI payload or FCoE/FCoIP CRC) as they
are not used by any existing code.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4975
The dynamic config on Medford is stored using two partitions in flash, and at
any time one is the 'current' partition, used to provide the active config,
and the other 'backup' partition is used for writes. This means that there
are two potential partitions that can be used to service reads, and which is
required can depend on, for example, whether the read is to get the current
contents or to verify a write.
When the partition write lock is held, the default behaviour is to read from
the backup partition, which was wrong for most reads in the common code which
require the current partition. This change allows the current partition to be
read whilst the write lock is held.
There is one read in Manftest which needs the backup partition.
ef10_nvram_partn_read_mode() is created to avoid changing
ef10_nvram_partn_read() which shares a prototype with the equivalent Falcon
and Siena methods.
MC_CMD_NVRAM_READ_IN_V2 adds an extra field, but firmware which doesn't support
it just ignores it.
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4974
on FDT/OFW platforms.
After the refactoring of the powerpc code so that OF_decode_addr() is usable
on all FDT/OFW platforms, this switches uart(4) to using it.
Differential Revision: https://reviews.freebsd.org/D4675
tlv_partition_header has field *preset* to support RFID-selectable
segments of dynamic configuration
Submitted by: Mateusz Wrzesinski <mwrzesinski at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
This allows an MTU change to be requested on unpriviliged functions
without also setting all the other parameters supported by MC_CMD_SET_MAC.
The enhanced SET_MAC command was introduced in v4_7 firmware.
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4958
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4961
we want to use it.
The rate table was being initialised in low->high, but the link quality
table was being initialised high->low. So, when we did a lookup, we
would get the indexes wrong.
This started by a patch from dragonflybsd which reversed how the ni->in_ridx[]
array is being used; I'd rather it all be consistent. So, this is consistent.
Inspired by: what I did to iwn(4) a while ago
Inspired by: DragonflyBSD; <imre@vdsz.com>
Previously the polarity was for TTL levels, which are the reverse of RS-232.
Also add handling of the UART_PPS_INVERT_PULSE option bit in the sysctl
value, the same as was recently added to uart(4), so that people using TTL
level connections can request a logical inverting of the signal.
Use the named constants from the new dev/uart/uart_ppstypes.h for the pps
capture modes and option bits.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4957
- Add the structure with already known fields offsets
(some of them were taken from this driver,
some (channel_plan, rf_* fields) - from TP-LINK official driver)
- Fix a typo / dehardcode a constant in RTL8192C ROM structure.
Tested with RTL8188EU, STA mode
Reviewed by: kevlo
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4274
Leaving BIST methods for now as, though the Medford bootrom now has lots
of BIST support, production firmware doesn't appear to have been updated
yet.
Submitted by: Mark Spender <mspender at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4949
Not per PF copies as on Huntington.
Submitted by: Mark Spender <mspender at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4935
Some classes of IOAT hardware prefetch reads. DMA operations that
depend on the result of prior DMA operations must use the DMA_FENCE flag
to prevent stale reads.
(E.g., I've hit this personally on Broadwell-EP. The Broadwell-DE has a
different IOAT unit that is documented to not pipeline DMA operations.)
Sponsored by: EMC / Isilon Storage Division
The "mcdi_err_arg" probe still reports results of failed MCDI
commands, unless the caller invoked efx_mcdi_execute_quiet().
Submitted by: Andy Moreton <amoreton at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4919
Add support for Huntington MCDI licensing interface to common code.
Ported from Linux net driver IOCTL functions with restructuring for
initial support for V3 licensing API.
Submitted by: Richard Houldsworth <rhouldsworth at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4918
Fix an explanatory comment which did not explain very well.
Submitted by: Richard Houldsworth <rhouldsworth at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4916
Fix efx_vpd_hunk_next() which has -- since its inception -- failed to
correctly iterate over the tags and keywords contained in the VPD data.
Only the first tag or keyword would be returned and the next call with
*contp == 1 would walk to the end of the data and finish.
This was spotted when fixing up errors spotted by Prefast code analysis
(which neglected to set all of the out parameters in all successful cases)
Also fix efx_vpd_verify() on Siena and EF10 which (as a side effect of
correctly iterating over all the tags and keywords) was failing as it
detected that both the static VPD and dynamic VPD storage contained an
RV keyword in the VPD-R tag. This is intentional as the static VPD and
dynamic VPD are stored separately (firmware merges their contents and
computes a new RV keyword checksum for the data readable from the VPD
capability in PCIe configuration space).
Submitted by: Andrew Lee <alee at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4915
The only value which has changed is the number of rows
(ER_DZ_EVQ_TMR_REG_ROWS is 2048 vs 1024 for FR_BZ_TIMER_COMMAND_REGP0_ROWS)
but that isn't used, so this shouldn't change behaviour.
Submitted by: Mark Spender <mspender at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4913
If the NVSP protocol version is not greater than NVSP_PROTOCOL_VERSION_2,
then the recv buffer size is 15MB, otherwise the buffer size is 16MB.
Submitted by: Hongjiang Zhang <honzhan microsoft com>
Reviewed by: royger, Dexuan Cui <decui microsoft com>, adrian
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4814
Submitted by: Howard Su <howard0su gmail com>
Reviewed by: royger, Dexuan Cui <decui microsoft com>, adrian
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4693
Submitted by: Howard Su <howard0su@gmail.com>
Reviewed by: delphij, royger, adrian
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4676
We don't need them at all.
Submitted by: Dexuan Cui <decui microsoft com>
Sponsored by: Microsoft OSTC
Reviewed by: royger, adrian, delphij
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4595
This is first step to move the generic part of HV code into kernel instead
of module, so that it is possible to use hypercall to implement some other
paravirtualization code in the kernel.
Submitted by: Howard Su <howard0su@gmail.com>
Reviewed by: royger, delphij, adrian
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D3072
alignment guarantees provided by m_defrag(9), use m_collapse(9)
instead for performance reasons.
While at it, sanitize the statistics softc members, i. e. retire
unused ones and add SYSCTL nodes missing for actually used ones.
Differential Revision: https://reviews.freebsd.org/D4717
MAKEDEV_CHECKNAME flag to the call, this is required to not panic on
race between the clone and destructing the closed master.
Reported by and discussed with: bde
Tested by: pho (as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Pulled firmware_ids.h from firmwaresrc and applied genfwdef script.
Submitted by: Richard Houldsworth <rhouldsworth at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4897
The EFSYS_OPT_RX_HDR_SPLIT optional feature in the common code
implemented the Lookahead Split feature of Windows. This split
received packets at a preconfigured byte offset, and delivered
the header and payload portions to separate receive queues.
Now the common code interface has no callers, so remove it.
Note that this should not be confused with the Header Data Split
feature of Windows, which splits packets at a header boundary.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4888
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4880
option to invert the polarity in software. Also add an option to capture
very narrow pulses by using the hardware's MSR delta-bit capability of
latching line state changes.
This effectively reverts the mistake I made in r286595 which was based on
empirical measurements made on hardware using TTL-level signaling, in which
the logic levels are inverted from RS-232. Thus, this re-syncs the polarity
with the requirements of RFC 2783, which is writen in terms of RS-232
signaling.
Narrow-pulse mode uses the ability of most ns8250 and similar chips to
provide a delta indication in the modem status register. The hardware is
able to notice and latch the change when the pulse width is shorter than
interrupt latency, which results in the signal no longer being asserted by
time the interrupt service code runs. When running in this mode we get
notified only that "a pulse happened" so the driver synthesizes both an
ASSERT and a CLEAR event (with the same timestamp for each). When the pulse
width is about equal to the interrupt latency the driver may intermittantly
see both edges of the pulse. To prevent generating spurious events, the
driver implements a half-second lockout period after generating an event
before it will generate another.
Differential Revision: https://reviews.freebsd.org/D4477
Prior to Medford, option ROM config was stored with one partition
per network port. Medford stores option ROM config in a single
partition (as an array of configurations, one per PF).
Update the EFXname /port to MCDI partition mapping for this.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4885
Make error messages consistent, and remove redundant checks.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4884
This API has been replaced by efx_mac_multicast_list_set()
and has no callers.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4883
New filters types may be added, but the same machinery should be able to
handle them.
Submitted by: Mark Spender <mspender at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4881
These bigs are changed on Medford.
Submitted by: Mark Spender <mspender at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4876
The pktfilter module has been obsolete for some time, as
it was replaced by newer features in filter module. With
the removal of the storport driver, this module has no
users and can be removed.
Submitted by: Andy Moreton <amoreton at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4875
Some new partitions have been added, but they shouldn't need to be
handled any differently.
Submitted by: Mark Spender <mspender at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4873
Rename all except hunt_tx_qdesc_tso_create(), which creates a
fw-assisted TSO v1 descriptor which isn't supported on Medford.
Submitted by: Mark Spender <mspender at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4870
All of these apply to both Huntington and Medford.
Submitted by: Mark Spender <mspender at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4868
This should allow these functions to work for Medford as well.
Submitted by: Mark Spender <mspender at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4866
All these fields will be used in shared ef10 code, so put them in an
ef10 member of a per-architecture union, rather that in the per-chip
union.
Submitted by: Mark Spender <mspender at solarflare.com>
Reviewed by: gnn
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4865
Creating some files together to do the build system changes in one go.
Submitted by: Mark Spender <mspender at solarflare.com>
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D4859
This one mainly avoids mbuf cluster allocation for TCP ACKs during
TCP sending tests. And it gives me ~200Mbps improvement (4.7Gbps
-> 4.9Gbps), when running iperf3 TCP sending test w/ 16 connections.
While I'm here, nuke the unnecessary zeroing out pkthdr.csum_flags.
Reviewed by: adrain
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4853
Many applications and kernel modules (e.g. bridge) rely on the ifmedia
status report; give them what they want.
Submitted by: Dexuan Cui <decui microsoft com>
Reviewed by: Jun Su <junsu microsoftc com>, me, adrian
Modified by: me (minor)
Original differential: https://reviews.freebsd.org/D4611
Differential Revision: https://reviews.freebsd.org/D4852
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
- Implement the LRO using tcp_lro APIs, and LRO is enabled by default.
- Add several stats sysctl nodes.
- Check IP/TCP length before sending the packet to tcp_lro_rx(), if host
does not provide RX csum information (*); and add an option through
sysctl to always trust host TCP segment csum checks (default is off).
- Add sysctl to control the LRO entry depth; it is disabled by default.
It is used to avoid holding too much TCP segments in driver. Limiting
the LRO entry depth helps a lot in a one/two streams RX test.
This one 3x the RX performance on my local test (3Gbps -> 10Gbps), and
~2x the RX performance over a directly connected 40Ge network (5Gbps ->
9Gbps).
(*) It seems the host stops supplying csum information, once the network
load is high. This still needs investigation...
Reviewed by: Hongjiang Zhang <honzhan microsoft com>,
Dexuan Cui <decui microsoft com>,
Jun Su <junsu microsoft com>,
delphij
Tested by: me (local),
Hongjiang Zhang <honzhan microsoft com>
(directly connected 40Ge)
Approved by: delphij (mentor), adrian (mentor, no objection)
With feedback from: delphij, Hongjiang Zhang <honzhan microsoft com>
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4824
This will allow to restore channel list after switching interface
to more restrictive regdomain.
Tested with Intel 3945BG (wpi) only.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4863
by busdma to the blkfront driver must be an integer number of sectors,
and must be aligned in memory on a "sector" boundary.
Having these assertions yesterday would have made finding the bug fixed
in r293698 somewhat easier.
Ensure that all iSCSI sessions are correctly terminated during shutdown.
* Enhances the changes done by r286226 (D3052).
* Add shutdown post sync event to run after filesystem shutdown
(SHUTDOWN_PRI_FIRST) but before CAM shutdown (SHUTDOWN_PRI_DEFAULT).
* Changes iscsi_maintenance_thread to processes terminate in preference to
reconnect.
Reviewed by: trasz
MFC after: 2 weeks
Sponsored by: Multiplay
Differential Revision: https://reviews.freebsd.org/D4429
- Add a description of Advantech PCI-1602 Rev. A boards. [1]
- Properly set up REG_ACR also for PCI-1602 Rev. A based on what the
Advantech-supplied Linux driver does.
- Additionally use the macros of <dev/ic/ns16550.h> to replace existing
magic values and get rid of trivial comments.
- Fix the style of some comments.
PR: 205359 [1]
Submitted by: Jan Mikkelsen (original patch) [1]
up to now.
The new sendfile is the code that Netflix uses to send their multiple tens
of gigabits of data per second. The new implementation features asynchronous
I/O, when I/O operations are launched, but not awaited to be complete. An
explanation of why such behavior is beneficial compared to old one is
going to be too long for a commit message, so we will skip it here.
Additional features of new syscall are extra flags, which provide an
application more control over data sent. The SF_NOCACHE flag tells
kernel that data shouldn't be cached after it was sent. The SF_READAHEAD()
macro allows to specify readahead size in pages.
The new syscalls is a drop in replacement. No modifications are required
to applications. One can take nginx binary for stable/10 and run it
successfully on head. Although SF_NODISKIO lost its original sense, as now
sendfile doesn't block, and now means something completely different (tm),
using the new sendfile the old way is absolutely safe.
Celebrates: Netflix global launch!
Sponsored by: Nginx, Inc.
Sponsored by: Netflix
Relnotes: yes
ioat_acquire_reserve() is an extended version of ioat_acquire(). It
allows users to reserve space in the channel for some number of
descriptors. If this succeeds, it guarantees that at least submission
of N valid descriptors will succeed.
Sponsored by: EMC / Isilon Storage Division
Due to FreeBSD system-wide limits on number of MSI-X vectors
(https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199321),
it may be desirable to allocate fewer than the maximum number
of vectors for an NVMe device, in order to save vectors for
other devices (usually Ethernet) that can take better
advantage of them and may be probed after NVMe.
This tunable is expressed in terms of minimum number of CPUs
per I/O queue instead of max number of queues per controller,
to allow for a more even distribution of CPUs per queue. This
avoids cases where some number of CPUs have a dedicated queue,
but other CPUs need to share queues. Ideally the PR referenced
above will eventually be fixed and the mechanism implemented
here becomes obsolete anyways.
While here, fix a bug in the CPUs per I/O queue calculation to
properly account for the admin queue's MSI-X vector.
Reviewed by: gallatin
MFC after: 3 days
Sponsored by: Intel
This helps immensily with our ability to operate in the Amazon Cloud.
Discussed on Intel Networking Community call this morning.
Submitted by: Jarrod Petz(petz@nisshoko.net)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D4788
the system is booted and running.
Add PHY detection logic to ixgbe_handle_mod() and add locking to
ixgbe_handle_msf() as well.
PR: 150251
Submitted by: aboyer@averesystems.com
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D3188
e1000/e1000e split in linux.
Split rxbuffer and txbuffer apart to support the new RX descriptor format
structures. Move rxbuffer manipulation to em_setup_rxdesc() to unify the
new behavior changes.
Add a RSSKEYLEN macro for help in generating the RSSKEY data structures
in the card.
Change em_receive_checksum() to process the new rxdescriptor format
status bit.
MFC after: 2 weeks
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D3447
Previously nvme(4) would revert to a signle I/O queue if it could not
allocate enought interrupt vectors or NVMe submission/completion queues
to have one I/O queue per core. This patch determines how to utilize a
smaller number of available interrupt vectors, and assigns (as closely
as possible) an equal number of cores to each associated I/O queue.
MFC after: 3 days
Sponsored by: Intel
Instead just use num_io_queues to make this determination.
This prepares for some future changes enabling use of multiple
queues when we do not have enough queues or MSI-X vectors
for one queue per CPU.
MFC after: 3 days
Sponsored by: Intel
This significantly improves parallelism in the most common case.
The taskqueue is still used whenever BIO_ORDERED bios are in flight.
This patch is based heavily on a patch from gallatin@.
MFC after: 3 days
Sponsored by: Intel
This ensures the bio flags are not read after biodone().
The ordering will still be enforced, after the bio is
submitted successfully.
MFC after: 3 days
Sponsored by: Intel
Still wait until all in-flight bios (including the ordered bio)
complete before processing more bios from the queue.
MFC after: 3 days
Sponsored by: Intel
and t_maxseg. This dualism emerged with T/TCP, but was not properly cleaned
up after T/TCP removal. After all permutations over the years the result is
that t_maxopd stores a minimum of peer offered MSS and MTU reduced by minimum
protocol header. And t_maxseg stores (t_maxopd - TCPOLEN_TSTAMP_APPA) if
timestamps are in action, or is equal to t_maxopd otherwise. That's a very
rough estimate of MSS reduced by options length. Throughout the code it
was used in places, where preciseness was not important, like cwnd or
ssthresh calculations.
With this change:
- t_maxopd goes away.
- t_maxseg now stores MSS not adjusted by options.
- new function tcp_maxseg() is provided, that calculates MSS reduced by
options length. The functions gives a better estimate, since it takes
into account SACK state as well.
Reviewed by: jtl
Differential Revision: https://reviews.freebsd.org/D3593
This optimization is not proper (and causes kernel panic),
since driver checks fw_status to optimize away parsing stage
if it was already done.
Reported by: dchagin
It was returning a pointer to stack-allocated memory, so make the
allocation at the caller instead.
Found by: clang static analyzer
Coverity: CID 1245774
Reviewed by: ed, rpaulo
Review URL: https://reviews.freebsd.org/D4740
The fd is closed later in this case. This fixes a "SS_NOFDREF on enter"
panic.
Submitted by: Krishnamraju Eraparaju @ Chelsio
Reviewed by: Steve Wise @ Open Grid Computing
- Separate 'firmware_put(sc->fw_fp, FIRMWARE_UNLOAD); sc->fw_fp = NULL;'
into iwn_unload_firmware().
- Move error handling to the end of iwn_read_firmware().
No functional changes.
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4768
iwn(4) / wpi(4) works in the same way
(read_firmware() -> hw_init() -> firmware_put())
Approved by: adrian (mentor)
Differential Revision: https://reviews.freebsd.org/D4766
- Simplify defragmentation code.
- Use proper number of dma segments for data.
Approved by: adrian (mentor)
Obtained from: DragonFlyBSD (mostly)
Differential Revision: https://reviews.freebsd.org/D4754
stable/10 doesn't have the if_getdrvflags(9) KPI. Reference the field in the
structure directly if the __FreeBSD_version is < 1100022, so the driver can
be built with PCI_IOV support on stable/10, without backporting all of
r266974 (which requires additional changes due to projects/ifnet, etc)
Differential Revision: https://reviews.freebsd.org/D4759
Reviewed by: erj, sbruno
Sponsored by: EMC / Isilon Storage Division
These are going to be much more efficient on low end embedded systems
but unfortunately they make it .. less convenient to implement correct
bus barriers and debugging. They also didn't implement the register
serialisation workaround required for Owl (AR5416.)
So, just remove them for now. Later on I'll just inline the routines
from ah_osdep.c.
- Change order of data in if_iwmvar.h
(like it is in other drivers: defines, data structures,
vap/node structures, softc struct and locks); use indentation.
- Fix IWM_LOCK(_sc) / IWM_UNLOCK(_sc) macro.
- Add IWM_LOCK_INIT / DESTROY(sc) + fix mtx_init() usage.
- Wrap iwm_node casts into IWM_NODE() macro.
- Drop some fields:
* wt_hwqueue from Tx radiotap header;
* macaddr[6] from iwm_vap;
Approved by: adrian
Differential Revision: https://reviews.freebsd.org/D4753
tree parsing opt-out rather than opt-in. All FDT-based systems as well as
PowerPC systems with real Open Firmware use the CHRP-derived binding that
includes it, which makes SPARC the odd man out here. Making it opt-out
avoids astonishment on new platform bring up.
The ath hal and driver code all assume the world is an x86 or the
bus layer does an explicit bus flush after each operation (eg netbsd.)
However, we don't do that.
So, to be "correct" on platforms like sparc64, mips and ppc (and maybe
ARM, I am not sure), just do explicit barriers after each operation.
Now, this does slow things down a tad on embedded platforms but I'd
rather things be "correct" versus "fast." At some later point if someone
wishes it to be fast then we should add the barrier calls to the HAL and
driver.
Tested:
* carambola 2 (AR9331.)
rounding) has better spread. Implement fp16_sin() to go along with
fp16_cos(). In the rendering loop, switch from addition to subtraction
so the center of the pattern will be a trough rather than a peak. This
is completely arbitrary, of course, but looks better to me.