r259150:
Print out the full PCIe link negotiation during dmesg.
I found this useful when checking whether a NIC is in a PCIE 3.0 8x slot
or not.
r283864:
cxgbe: no need to display the per-lane GT/s rating of the pcie link.
r283858:
cxgbe: set minimum burst size when fetching freelist buffers to 128B.
r284007:
cxgbe: set the minimum burst size when fetching fl buffers to 128B for
netmap rx queues too. This should have gone in as part of r283858.
cxgbe/tom: return rx credits promptly if the socket buffer's low water
mark cannot be reached because the window advertised to the peer isn't
wide enough. While here, tweak the normal credit return too.
r278239:
cxgbe(4): reserve id for iSCSI upper layer driver.
r278374:
cxgbe(4): tidy up some of the interaction between the Upper Layer
Drivers (ULDs) and the base if_cxgbe driver.
Track the per-adapter activation of ULDs in a new "active_ulds" field.
This was done pretty arbitrarily before this change -- via TOM_INIT_DONE
in adapter->flags for TOM, and the (1 << MAX_NPORTS) bit in
adapter->offload_map for iWARP.
iWARP and hw-accelerated iSCSI rely on the TOE (supported by the TOM
ULD). The rules are:
a) If the iWARP and/or iSCSI ULDs are available when TOE is enabled then
iWARP and/or iSCSI are enabled too.
b) When the iWARP and iSCSI modules are loaded they go looking for
adapters with TOE enabled and enable themselves on that adapter.
c) You cannot deactivate or unload the TOM module from underneath iWARP
or iSCSI. Any such attempt will fail with EBUSY.
r277637, and r283149 (by emaste@).
r276485 is the real change here, the rest deal with the fallout of
mp_ring's reliance on 64b atomics.
Use the incorrectly spelled 'eigth' from struct pkthdr in this branch
instead of MFC'ing r261733, which would have renamed the field of a
public structure in a -STABLE branch.
---
r276480:
Temporarily unplug cxgbe(4) from !amd64 builds.
r276485:
cxgbe(4): major tx rework.
a) Front load as much work as possible in if_transmit, before any driver
lock or software queue has to get involved.
b) Replace buf_ring with a brand new mp_ring (multiproducer ring). This
is specifically for the tx multiqueue model where one of the if_transmit
producer threads becomes the consumer and other producers carry on as
usual. mp_ring is implemented as standalone code and it should be
possible to use it in any driver with tx multiqueue. It also has:
- the ability to enqueue/dequeue multiple items. This might become
significant if packet batching is ever implemented.
- an abdication mechanism to allow a thread to give up writing tx
descriptors and have another if_transmit thread take over. A thread
that's writing tx descriptors can end up doing so for an unbounded
time period if a) there are other if_transmit threads continuously
feeding the sofware queue, and b) the chip keeps up with whatever the
thread is throwing at it.
- accurate statistics about interesting events even when the stats come
at the expense of additional branches/conditional code.
The NIC txq lock is uncontested on the fast path at this point. I've
left it there for synchronization with the control events (interface
up/down, modload/unload).
c) Add support for "type 1" coalescing work request in the normal NIC tx
path. This work request is optimized for frames with a single item in
the DMA gather list. These are very common when forwarding packets.
Note that netmap tx in cxgbe already uses these "type 1" work requests.
d) Do not request automatic cidx updates every 32 descriptors. Instead,
request updates via bits in individual work requests (still every 32
descriptors approximately). Also, request an automatic final update
when the queue idles after activity. This means NIC tx reclaim is still
performed lazily but it will catch up quickly as soon as the queue
idles. This seems to be the best middle ground and I'll probably do
something similar for netmap tx as well.
e) Implement a faster tx path for WRQs (used by TOE tx and control
queues, _not_ by the normal NIC tx). Allow work requests to be written
directly to the hardware descriptor ring if room is available. I will
convert t4_tom and iw_cxgbe modules to this faster style gradually.
r276498:
cxgbe(4): remove buf_ring specific restriction on the txq size.
r277225:
Make cxgbe(4) buildable with the gcc in base.
r277226:
Allow cxgbe(4) to be built on i386. Driver attach will succeed only on
a subset of i386 systems.
r277227:
Plug cxgbe(4) back into !powerpc && !arm builds, instead of building it
on amd64 only.
r277230:
Build cxgbe(4) on powerpc64 too.
r277637:
Make sure the compiler flag to get cxgbe(4) to compile with gcc is used
only when gcc is being used. This is what r277225 should have been.
Various fixes for suspend and resume of PCI to PCI and PCI to Cardbus
bridges.
274633:
Remove stray empty comment. The code is adequately explained in the
block comment above, so there's nothing to add here.
274639:
Modernize comments about BIOSes being lame since in this detail they
aren't lame, the rules changed along the way. Catch up to 1999 or so
with the new rules.
274663:
Fix typo pointed out by avg@ and Joerg Sonnenberger. Add a clarifying
sentence too.
277233:
Suspend and resume were the only two functions not to follow the brdev
convention here, so fix that.
277234:
Move the suspsned and resume functions to the bus attachment. They
were accessing PCI config registers, which won't work for the ISA
version.
277235:
Always enable I/O, memory and dma cycles. Some BIOSes don't enable
them, sometimes they are reset for power state transitions or during
whatever happens while suspended. Also, it is good practice to always
do this.
281870:
Cosmetic change: use PCIR_SECLAT_2 rather than PCIR_SECLAT_1.
281871:
The minimim grant and maximum latency PCI config registers are only valid
for type 0 devices, not type 1 or 2 bridges. Don't read them for bridge
devices during bus scans and return an error when attempting to read them
as ivars for bridge devices.
281873:
Don't explicitly manage power states for PCI-PCI bridge devices in the
driver's suspend and resume routines. These have been redundant no-ops
since r214065 changed the PCI bus driver to manage power states for
all devices (including type 1/2 bridge devices) during suspend and resume.
281874:
Update the pci_cfg_save/restore routines to operate on bridge devices
(type 1 and type 2) as well as leaf devices (type 0). In particular,
this allows the existing PCI bus logic to save and restore capability
registers such as MSI and PCI-express work for bridge devices rather than
requiring that code to be duplicated in bridge drivers. It also means
that bridge drivers no longer need to save and restore basic registers
such as the PCI command register or BARs nor manage powerstates for the
bridge device.
While here, pci_setup_secbus() has been changed to initialize the 'sec'
and 'sub' fields in the 'secbus' structure instead of requiring the pcib
and pccbb drivers to do this in the NEW_PCIB + PCI_RES_BUS case.
Fix for control endpoint handling in the DWC OTG driver. The data
stage processing is only allowed after the setup complete event has
been received. Else a race may occur and the OUT data can be corrupted.
While at it ensure resetting a FIFO has the required wait loop.
- Updated all files with 2015 Avago copyright, and updated LSI's copyright
dates.
- Changed all of the PCI device strings from LSI to Avago Technologies (LSI).
- Added a sysctl variable to control how StartStopUnit behavior works. User can
select to spin down disks based on if disk is SSD or HDD.
- Inquiry data is required to tell if a disk will support SSU at shutdown or
not. Due to the addition of mprssas_async, which gets Advanced Info but not
Inquiry data, the setting of supports_SSU was moved to the
mprsas_scsiio_complete function, which snoops for any Inquiry commands. And,
since disks are shutdown as a target and not a LUN, this process was
simplified by basing it on targets and not LUNs.
- Added a sysctl variable that sets the amount of time to retry after sending a
failed SATA ID command. This helps with some bad disks and large disks that
require a lot of time to spin up. Part of this change was to add a callout to
handle timeouts with the SATA ID command. The callout function is called
mprsas_ata_id_timeout(). (Fixes PR 191348)
- Changed the way resets work by allowing I/O to continue to devices that are
not currently under a reset condition. This uses devq's instead of simq's and
makes use of the MPSSAS_TARGET_INRESET flag. This change also adds a function
called mprsas_prepare_tm().
- Some changes were made to reduce code duplication when getting a SAS address
for a SATA disk.
- Fixed some formatting and whitespace.
- Bump version of mpr driver to 9.255.01.00-fbsd
PR: 191348
Extend the maximum number of allowed PCM channels in a PCM stream to
127 and decrease the maximum number of sub-channels to 1. These
definitions are only used inside the kernel and can be changed later
if more than one sub-channel is desired. This has been done to allow
so-called USB audio rack modules to work with FreeBSD.
Add support for more than 8 audio channels per PCM stream for USB
audio class compliant devices under FreeBSD. Tested using 16 recording
and 16 playback audio channels simultaneously.
Bump the FreeBSD version to force recompiling all external modules.
Fix two bugs that could result in PMC sampling effectively stopping.
In both cases, the the effect of the bug was that a very small positive
number was written to the counter. This means that a large number of
events needed to occur before the next sampling interrupt would trigger.
Even with very frequently occurring events like clock cycles wrapping all
the way around could take a long time. Both bugs occurred when updating
the saved reload count for an outgoing thread on a context switch.
First, the counter-independent code compares the current reload count
against the count set when the thread switched in and generates a delta
to apply to the saved count. If this delta causes the reload counter
to go negative, it would add a full reload interval to wrap it around to
a positive value. The fix is to add the full reload interval if the
resulting counter is zero.
Second, occasionally the raw counter value read during a context switch
has actually wrapped, but an interrupt has not yet triggered. In this
case the existing logic would return a very large reload count (e.g.
2^48 - 2 if the counter had overflowed by a count of 2). This was seen
both for fixed-function and programmable counters on an E5-2643.
Workaround this case by returning a reload count of zero.
PR: 198149
Sponsored by: Norse Corp, Inc.
- Move hwpmc(4) debugging code under a new HWPMC_DEBUG option instead of
the broader DEBUG option.
- Convert hwpmc(4) debug printfs over to KTR.
Sponsored by: Norse Corp, Inc.
r273480:
cxgbe/iw_cxgbe: wake up waiters after flushing the qp.
r273750:
Some cxgbe/iw_cxgbe fixes:
- Free rt in c4iw_connect only if it is allocated.
- Call soclose instead of so_shutdown if there is an abort from the peer.
- Close socket and return failure if TOE is not enabled.
r273753:
iwcm_event status needs to be populated for close_complete_upcall
r273797:
Always request a completion for every work request for iWARP. The
initial MPA exchange must be tracked this way so that t4_tom's state for
the tid is all clean at the time the tid transitions to RDMA mode. Once
it does, t4_tom is out of the way and iw_cxgbe uses the qp endpoints
directly.
r274461:
iw_cxgbe: don't forget to close the socket in c4iw_connect if soconnect
fails.
- Re-introduce "device ixgbe" option for kernel configs. This allows
users to continue using existing kernel configurations, while still
letting if_ix and if_ixv exist to allow ifconfig magic to work.
Though the amd64 GENERIC config has "device ix" and "device ixv" instead of
"device ixgbe".
This is achieved by making the if_ixgbe module a stub that depends
on if_ix and if_ixv.
- Replace if_baudrate assignment with if_initbaudrate() call.
Thanks to jhb@ for the stub module suggestion and pointing out the if_baudrate
error.
Approved by: jfv (mentor)
Don't free mbufs when stopping an interface in netmap mode.
Currently if you ifconfig down a vtnet interface while it is being used
via netmap, the kernel panics due to trying to treat the cookie values
in the virtio rings as mbufs to be freed. When netmap is enabled, these
cookie values are pointers to something else.
Note that other netmap-aware drivers don't seem to need this as they
store the mbuf pointers in the software rings that mirror the hardware
descriptor rings, and since netmap doesn't touch those, the software
state always has NULL mbuf pointers causing the loops to free mbufs to
not do anything. However, vtnet reuses the same state area for both
netmap and non-netmap mode, so it needs to explicitly avoid looking at
the rings and treating the cookie values as mbufs if netmap is
enabled.
Sponsored by: Norse Corp, Inc.
Fix an off-by-one error by adding proper range checks when parsing the
HDA association descriptors. This fixes a crash during device probe
for some HDA PCI devices.
Use OF_getencprop over OF_getprop and fdt32_to_cpu. The latter may give
us the wrong data in the failure case if shift was not zero.
Remove the extra copy of uart_fdt_get_clock and uart_fdt_get_shift. While
here also use OF_getencprop in uart_fdt_get_clock.
Move uart_fdt_get_clock and uart_fdt_get_shift to uart_bus_fdt.c, we may
not build uart_cpu_fdt.c in all configs.
Fix uart_fdt_get_clock. It should have been using the cell variable passed
in, not value on the stack.
sfxge: relax assertion to allow RST flag in TSO packets
Kernel under stress load, mixed MC reboot and sfupdate really
generates TSO packet with RST flag.
It will generate many TCP packets with RST flag set.
May be RST flag should be set in the last segment only, but it could be
dropped. So, it is safer to keep the flag in all packets to be sure that
connection is reset.
Sponsored by: Solarflare Communications, Inc.
When a netmap process terminates without the full set of buffers it
was granted via rings and ni_bufs_list_head represented in those rings
and lists (e.g., via SIGKILL), those buffers are no longer available
for subsequent users for the lifetime of the system. To mitigate this
resource leak, reset the allocator state when the last ref to that
allocator is released.
Note that this only recovers leaked resources for an allocator when
there are no longer any users of that allocator, so there remain
circumstances in which leaked allocator resources may not ever be
recovered - consider a set of multiple netmap processes that are all
using the same allocator (say, the global allocator) where members of
that set may be killed and restarted over time but at any given point
there is one member of that set running.
Add mutex support to the pps_ioctl() API in the kernel.
Add PPS support to USB serial drivers.
Use correct mode variable for PPS support.
Switch polarity of USB serial PPS events.
The ftdi "get latency" and "get bitmode" device commands are read
operations, not writes.
Implement a mechanism for making changes in the kernel<->driver PPS
interface without breaking ABI or API compatibility with existing drivers.
Bump version number to indicate the new PPS ABI version changes in the
pps_state structure.
Define new linker set, UART_FDT_CLASS_AND_DEVICE, for registering full
(class and device) FDT UART. Define second one, UART_FDT_CLASS, for UART
class only.
Move the uart_class definitions and fdt compat data into the individual
uart implementations, and export them using the new linker-set mechanism.
Add uart driver for Qualcomm MSM 7000/8000 series chips.
It is working on IFC6410 board which has Qualcomm Snapdragon SoC.
Use documented compat string for msm uart.
Make PL011 UART to wait on putc only when TX FIFO is full
Make uart_bus_fdt a decendant of ofwbus
Microsoft vmbus, storage and other related driver enhancements for HyperV.
- Vmbus multi channel support.
- Vector interrupt support.
- Signal optimization.
- Storvsc driver performance improvement.
- Scatter and gather support for storvsc driver.
- Minor bug fix for KVP driver.
Thanks royger, jhb and delphij from FreeBSD community for the reviews
and comments. Also thanks Hovy Xu from NetApp for the contributions to
the storvsc driver.
PR: 195238
Submitted by: whu
Reviewed by: royger
Approved by: royger
Relnotes: yes
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D2575
sfxge: automatically turn off TSO when Tx checksum offload is disabled
Also return error if TSO is requested without Tx checksum offload.
Sponsored by: Solarflare Communications, Inc.
sfxge: allow to disable checksum offloads over VLAN
It just affects capabilities of the created VLAN interface.
Sponsored by: Solarflare Communications, Inc.
sfxge: fix overflow queue freeze
If TxQ lock is obtained, deferred packet list shold be serviced even if
the packet addition fails because of overflow.
Without the patch freeze happens if:
- queue is not blocked (i.e. completion does not trigger unblock and service)
- put-list overflow (1024 entries)
- sfxge_tx_packet_add() acquires TxQ lock just as it is released it in
sfxge_tx_qdpl_service() on the second CPU but before pending check
- sfxge_tx_packet_add() swizzles put-list to get-list, fails because of
non-tcp get-list overflow and returns without packet list service
- sfxge_tx_qdpl_service() on the second CPU checks that there are no
pending packets in the put-list and returns
Other possible solution is to guaranee that maximum length of the put-list
is less than maximum length of any get-list.
Sponsored by: Solarflare Communications, Inc.
sfxge: avoid usage of ifm_data
The driver uses ifm_data to save capabilities mask calculated during
initialization when supported phy modes are discovered.
The patch simply calculates it when either media or options are changed.
Sponsored by: Solarflare Communications, Inc.