When detaching the if_ure(4) driver, the TX active USB transfer array may
point to freed USB transfers. Given that the number of USB transfers is
very low, simply start all transfers every time there is a packet to
keep safe from use-after-free.
PR: 252608
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
-pci_get_class : This function search for a matching pci device based on
the class/subclass and returns a newly created pci_dev.
- pci_{save,restore}_state : This is analogous to ours with the same name
- pci_is_root_bus : Return true if this is the root bus
- pci_get_domain_bus_and_slot : This function search for a matching pci
device based on domain, bus and slot/function concat into a single
unsigned int (devfn) and returns a newly created pci_dev
- pci_bus_{read,write}_config* : Read/Write to the config space.
While here add some helper function to alloc and fill the pci_dev struct.
Reviewed by: hselasky, bz (older version)
Differential Revision: https://reviews.freebsd.org/D27550
pci_find_class_from help finding one or multiple device matching
a class and subclass.
If the from argument is not null we will first loop in the device list
until we find the matching device and only then start to check if the
class/subclass matches.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D27549
The cgem(4) driver was updated to support 64-bit bus addressing in
facdd1cd20. However, the committed version determines this in an
un-idiomatic way. Change the compile-time conditional to check
BUS_SPACE_MAXADDR, rather than comparing int and pointer sizes.
Reported by: jrtc27
Use an interface compatible with the Linux one so that the user-space
libraries already using the Linux interface can be used without much
modifications.
This allows an open privcmd instance to limit against which domains it
can act upon.
Sponsored by: Citrix Systems R&D
Use an interface compatible with the Linux one so that the user-space
libraries already using the Linux interface can be used without much
modifications.
This allows user-space to make use of the dm_op family of hypercalls,
which are used by device models.
Sponsored by: Citrix Systems R&D
The interface is mostly the same as the Linux ioctl, so that we don't
need to modify the user-space libraries that make use of it.
The ioctl is just a proxy for the XENMEM_acquire_resource hypercall.
Sponsored by: Citrix Systems R&D
Restore the hwofs functionality temporarily disabled by
7ba6ecf216 to prevent issues with iflib.
This patch brings the necessary changes to iflib to
enable howfs to allow interface restarts without
disrupting netmap applications actively using its
rings.
After this change, it becomes possible for multiple
non-cooperating netmap applications to use non-overlapping
subsets of the available netmap rings without clashing
with each other.
PR: 252453
MFC after: 1 week
Add 64-bit address support to Cadence CGEM Ethernet driver for use in
other SoCs such as the Zynq UltraScale+ and SiFive HighFive Unleashed.
Reviewed by: philip, 0mp (manpages)
Differential Revision: https://reviews.freebsd.org/D24304
Similarly to what done for iflib in 1d238b07d5,
this patch prevents access to the krings during the interface
reset triggered by netmap_register().
MFC after: 1 week
The netmap_reset() function is meant to be called by the driver
when they initialize (or re-initialize) a hardware ring.
However, since the introduction of support for opening (in
netmap mode) a subset of the available rings, netmap_reset()
may be called multiple times on actively used rings, causing
both kring and netmap ring to transition to an inconsistent
state.
This changes improves the situation by resetting all the
indices fields of the kring to 0, as expected after the
reinitialization of a hardware ring.
PR: 252518
MFC after: 1 week
When different processes open separate subsets of the
available rings of a same netmap interface, a device
reset may be performed while one of the processes
is actively using some rings (e.g., caused by another
process executing a nmport_open()).
With this patch, such situation will cause the
active process to get a POLLERR, so that it can
have a chance to detect the situation.
We also guarantee that no process is running a txsync
or rxsync (ioctl or poll) while an iflib device reset
is in progress.
PR: 252453
MFC after: 1 week
The NVMe byte-swap routines for big-endian platforms used memcpy() to
move the unaligned 64-bit value into a temp register to byte swap it.
Instead of introducing a dependency, manually byte-swap the values in
place.
Point hat: me
The device mapping table contains sc->max_devices entries, so only
indices in [0, sc->max_devices) are valid.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27964
Previously we copied in the request into a stack-allocated structure
that could be smaller than the request size. Furthermore, we checked
the request size only after doing the copyin.
Fix this by allocating a buffer to hold the request, then copying the
buffer's contents into a command descriptor. This is a bit heavy-handed
but I expect the overhead will not be noticeable. The approach of
coping the header in first is susceptible to TOCTOU problems.
Reviewed by: imp
Reported by: maxpl0it@protonmail.com
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27963
safexcel_ring_intr() could fail to observed that sc_blocked is set after
completing all outstanding ops for a ring, in which case blocked ops
would be deferred forever.
Request structures are managed by individual rings, so move the
"blocked" flag into the per-ring state block and use the ring lock to
synchronize with safexcel_process(). Remove sc_mtx since it is now
unused.
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC (Netgate)
mtx_init() does not make a copy of the name so the buffer must be valid
for the lifetime of the driver instance. Store each ring's lock's name
in the ring structure.
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC (Netgate)
The new UAR API already offsets the UAR map pointer the mlx5en(4) is using.
While at it remove some no longer needed variables for keeping track
of the current BF offset.
This fixes a regression issue after the new UAR allocation APIs
were introduced.
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
Add decoding of the Device Self-test log page and the ability to start
or abort a test.
Reviewed by: imp, mav
Tested by: Muhammad Ahmad <muhammad.ahmad@seagate.com>
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D27517
This ioctl would instantly induce a panic, likely since near inception, up
until 0861c7d3e0. Lack of previous interest in fixing it combined with
the problematic interface (exports a pointer, really a physical address)
brings us to the natural conclusion: remove it until a useful consumer
forward.
If it eventually gets resurrected, the interface should definitely not
return in this exact form and likely needs to be reimagined.
The associated KPI, efi_get_table, is left intact for the time being.
Reviewed by: imp, jrtc27
Also discussed with: brooks, jhb
Differential Revision: https://reviews.freebsd.org/D28030
Virtual PMCs could be running on multiple CPUs so this needs to be
a per-CPU value.
Submitted by: rwatson (earlier version)
Reviewed by: gnn
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D27973
When testing hwpmc on arm64 we found the counter could overflow while
reading the event count. Handle this case in the armv7 code by also
checking if the overflow bit is set and incrementing the overflow
cound as needed.
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D27969
This change include several changes as listed below all related to UAR.
UAR is a special PCI memory area where the so-called doorbell register and
blue flame register live. Blue flame is a feature for sending small packets
more efficiently via a PCI memory page, instead of using PCI DMA.
- All structures and functions named xxx_uuars were renamed into xxx_bfreg.
- Remove partially implemented Blueflame support from mlx5en(4) and mlx5ib.
- Implement blue flame register allocator.
- Use blue flame register allocator in mlx5ib.
- A common UAR page is now allocated by the core to support doorbell register
writes for all of mlx5en and mlx5ib, instead of allocating one UAR per
sendqueue.
- Add support for DEVX query UAR.
- Add support for 4K UAR for libmlx5.
Linux commits:
7c043e908a74ae0a935037cdd984d0cb89b2b970
2f5ff26478adaff5ed9b7ad4079d6a710b5f27e7
0b80c14f009758cefeed0edff4f9141957964211
30aa60b3bd12bd79b5324b7b595bd3446ab24b52
5fe9dec0d045437e48f112b8fa705197bd7bc3c0
0118717583cda6f4f36092853ad0345e8150b286
a6d51b68611e98f05042ada662aed5dbe3279c1e
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
- call pci_iov_detach() on detaching from PCI device to take care of hang
on destroying VFs after PF is down.
- disable eswitch SRIOV support right after pci_iov_detach(),
else the eswitch cleanup sometimes occur while the SRIOV flow table
is still present.
Submitted by: kib@
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
Remove wi(4). pccard is going away, and wi only supports PC Card
devices, though it has a minor amount of glue to also support
PCI cards. However, removing the one without removing the other
is hard, so the whole driver is being removed.
Relnotes: Yes
PC Card support is being removed, so remove its attachment here. ndis
is slated to be removed entirely for 13, but that's not been done yet.
Relnotes: Yes
This change includes:
hpen - Generic / MS Windows compatible HID pen tablet driver.
hgame - Generic game controller and joystick driver.
xb360gp - Xbox360-compatible game controller driver.
Submitted by: Greg V <greg_unrelenting.technology>
Reviewed by: hselasky (as part of D27993)
hidmap is a kernel module that maps HID input usages to evdev events.
Following dependent drivers is included in the commit:
hms - HID mouse driver.
hcons - Consumer page AKA Multimedia keys driver.
hsctrl - System Controls page (Power/Sleep keys) driver.
ps4dshock - Sony DualShock 4 gamepad driver.
Reviewed by: hselasky
Differential revision: https://reviews.freebsd.org/D27993
which installs /dev/uhid# alias to hidraw character device for
compatibility with some existing uhid(4) users like Firefox.
As side effect it renames traditional uhid(4) driver to hidraw
to make possible using of common unit number allocator.
Requested by: Greg V <greg_unrelenting.technology>
Reviewed by: hselasky (as part of D27992)
This driver provides raw access to HID devices through uhid(4)-compatible
interface and is based on pre-8.x uhid(4) code. Unlike uhid(4) it does
not take devices in to monopoly ownership and allows parallel access
from other drivers.
hidraw supports Linux's hidraw-compatible interface as well.
Reviewed by: hselasky
Differential revision: https://reviews.freebsd.org/D27992
This allows to mark HID-device interrupt handlers as MP-SAFE.
Atomics-based lockless key event queue with swi_giant taskqueue is used
to pass key-press events into Giant-protected system console.
Reviewed by: hselasky (as part of D27991)
This change implements hid_if.m methods for HID-over-USB protocol [1].
Also, this change adds USBHID_ENABLED kernel option which changes
device_probe() priority and adds/removes PnP records to prefer usbhid
over ums, ukbd, wmt and other USB HID device drivers and vice-versa.
The module is based on uhid(4) driver. It is disabled by default for
now due to conflicts with existing USB HID drivers.
[1] https://www.usb.org/sites/default/files/hid1_11.pdf
Reviewed by: hselasky
Differential revision: https://reviews.freebsd.org/D27893
hidquirk(4) is derived from usb_quirk(4) and inherits all its HID-related
functionality. It does not support ioctl(2) interface yet.
Reviewed by: hselasky
Differential revision: https://reviews.freebsd.org/D27890
This driver provides support for multiple HID driver attachments
to single HID transport backend. This ability existed in Net/OpenBSD
(uhidev and ihidev drivers) but has never been ported to FreeBSD.
Unlike Net/OpenBSD we do not use report number alone to distinct report
source but we follow MS way and use a top level collection (TLC) usage
index that report belongs to as a location key.
The driver performs child device autodiscovery based on HID report
descriptor data, proxying of HID requests from child devices to parent
transport backends and broadcasting of interrupts in backward direction.
Differential revision: https://reviews.freebsd.org/D27888
Create an abstract HID interface that provides hardware independent
access to HID capabilities and functions through the device tree.
hid_if.m resembles existing USBHID KPI and consist of next methods:
HID method USBHID variant
-----------------------------------------------------------------------
hid_intr_setup usbd_transfer_setup (INTERRUPT IN xfer)
hid_intr_unsetup usbd_transfer_unsetup (INTERRUPT IN xfer)
hid_intr_start usbd_transfer_start (INTERRUPT IN xfer)
hid_intr_stop usbd_transfer_drain (INTERRUPT IN xfer)
hid_intr_poll usbd_transfer_poll (INTERRUPT IN xfer)
hid_get_rdesc usbd_req_get_report_descriptor
hid_read No direct analog. Not intended for common use.
hid_write uhid(4) write()
hid_get_report usbd_req_get_report
hid_set_report usbd_req_set_report
hid_set_idle usbd_req_set_idle
hid_set_protocol usbd_req_set_protocol
This change is part of D27888
Also hide shim code added in a previous commit under COMPAT_USBHID12.
Note: it is enough to add -DCOMPAT_USBHID12 to CFLAGS to compile old
code with new HID subsystem, but it is not enough to link it at runtime.
HID dependency has to be added explicitly with MODULE_DEPEND macro.
Reviewed by: manu, hselasky (as part of D27887)
This does an import of quirk stubs, debugging macros from USB code and
numerous usage constants used by dependent drivers.
Besides, this change renames some functions to get a better matching
with userland library and NetBSD/OpenBSD HID code. Namely:
- Old hid_report_size() renamed to hid_report_size_max()
- New hid_report_size() calculates size of given report rather than
maximum size of all reports.
- hid_get_data_unsigned() renamed to hid_get_udata()
- hid_put_data_unsigned() renamed to hid_put_udata()
Compat shim functions are provided in usbhid.h to make possible compile
of legacy code unmodified after this change.
Reviewed by: manu, hselasky
Differential revision: https://reviews.freebsd.org/D27887
It will be used by the upcoming HID-over-i2C implementation. Should be
no-op, except hid.ko module dependency is to be added to affected drivers.
Reviewed by: hselasky, manu
Differential revision: https://reviews.freebsd.org/D27867
It is possible that the client list lock is taken by other process for too
long due to e.g. IO timeouts. Allow user to terminate open() in this case.
Reviewed by: markj (as part of D27865)
At the beginning of evdev there was a LOR between hardware driver's and
evdev client list locks as they were taken in different order at
driver's interrupt and evdev open()/close() handlers.
The LOR was fixed with introduction of evdev_register_mtx() function
which allowed to use a hardware driver's lock as evdev client list lock.
While this works good with PS/2 and USB, this does not work with I2C.
Unlike PS/2 and USB, I2C open()/close() handlers do unbound sleeps
while waiting for I2C bus to release and while performing IO.
This change uses epoch(9) for traversing evdev client list in interrupt
handler to avoid the LOR thus making possible to convert evdev client
list lock to sleepable sx.
While here add brief locking protocol description.
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D27865
hid_locate() currently ignores all HID items which tagged as constant,
i.e. bit 0 of main item data is set to 1. See p.6.2.2.4 of
hid1_11.pdf [1]. Such an items are unconditionally treated as
byte-alignment padding. While that may be right decision for input and
output reports that is wrong for features reports. Feature reports can
contain constant capabilities e.g. 'Contact Count Maximum'.
See: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=232040
Remove check for constant from hid_locate() to make possible parsing of
such a reports.
[1] https://www.usb.org/sites/default/files/documents/hid1_11.pdf
Reviewed by: hselasky
Obtained from: sysutils/iichid
Differential Revision: https://reviews.freebsd.org/D27747
Doing a 'dd' over iscsi will reliably cause stalls. Tx
cleaning _should_ reliably happen as data is sent.
However, currently if the transmit queue fills it will
wait until the iflib timer (hz/2) runs.
This change causes the the tx taskq thread to be run
if there are completed descriptors.
While here:
- make timer interrupt delay a sysctl
- simplify txd_db_check handling
- comment on INTR types
Background on the change:
Initially doorbell updates were minimized by only writing to the register
on every fourth packet. If txq_drain would return without writing to the
doorbell it scheduled a callout on the next tick to do the doorbell write
to ensure that the write otherwise happened "soon". At that time a sysctl
was added for users to avoid the potential added latency by simply writing
to the doorbell register on every packet. This worked perfectly well for
e1000 and ixgbe ... and appeared to work well on ixl. However, as it
turned out there was a race to this approach that would lockup the ixl MAC.
It was possible for a lower producer index to be written after a higher one.
On e1000 and ixgbe this was harmless - on ixl it was fatal. My initial
response was to add a lock around doorbell writes - fixing the problem but
adding an unacceptable amount of lock contention.
The next iteration was to use transmit interrupts to drive delayed doorbell
writes. If there were no packets in the queue all doorbell writes would be
immediate as the queue started to fill up we could delay doorbell writes
further and further. At the start of drain if we've cleaned any packets we
know we've moved the state machine along and we write the doorbell (an
obvious missing optimization was to skip that doorbell write if db_pending
is zero). This change required that tx interrupts be scheduled periodically
as opposed to just when the hardware txq was full. However, that just leads
to our next problem.
Initially dedicated msix vectors were used for both tx and rx. However, it
was often possible to use up all available vectors before we set up all the
queues we wanted. By having rx and tx share a vector for a given queue we
could halve the number of vectors used by a given configuration. The problem
here is that with this change only e1000 passed the necessary value to have
the fast interrupt drive tx when appropriate.
Reported by: mav@
Tested by: mav@
Reviewed by: gallatin@
MFC after: 1 month
Sponsored by: iXsystems
Differential Revision: https://reviews.freebsd.org/D27683
In the conversion into a tunable, we converted the
size of the s/g list used by the driver to be based
off of a hardcoded size of 128k rather than maxphys,
this caused performance problems for us. Revert this
to use the maxphys tunable.
Note that this constant is used to size dynamically allocated
things, and not static data structs, so this is safe.
Reviewed By: imp, kib, mav
Tested By:i dhw
Differential Revision: https://reviews.freebsd.org/D28023
Sponsored by: Netflix
Code changes in this commit were obtained from straight from OpenBSD's
uplcom.c with almost no modification, the list of chip names and USB
IDs was obtained from Linux.
Differential Revision: https://reviews.freebsd.org/D27952
Submitted by: tomli_tomli.me (Yifeng Li)
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
We increment the overflow count when receiving an overflow interrupt
with special care to check if it happens while reading the event counter.
Sponsored by: Innovate UK
It appears we must read MIB values as 2 4-byte words, lower address
first. A single 8-byte MIB read returns the value with the lower 4
bytes copied into the upper 4 bytes, resulting in bogus byte counter
values.
Reviewed by: mw
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC (Netgate)
Differential Revision: https://reviews.freebsd.org/D27870
Release a grabbed page's busy state only after marking it as referenced.
Otherwise there exists a narrow window where the page could be freed
before the update. Before r356902 this was not a problem since the
object lock was held.
Discussed with: kib
Sponsored by: The FreeBSD Foundation
When a break-to-debugger is triggered, kdb will grab the console and vt(4)
will generally switch back to ttyv0. If one issues a continue from the
debugger, then kdb will ungrab the console and the system rolls on.
This change adds a perhaps minor feature: when we're down to grab == 0 and
if vt actually switched away to ttyv0, switch back to the tty it was
previously on before the console was grabbed.
The justification behind this is that a typical flow is to work in
!ttyv0 to avoid console spam while occasionally dropping to ddb to inspect
system state before returning. This could easily enough be tossed behind
a sysctl or something if it's not generally appreciated, but I anticipate
indifference.
Reviewed by: ray
Differential Revision: https://reviews.freebsd.org/D27110
vt_allocate_keyboard only needs to unwind the effects of keyboard-grabbing,
rather than any associated vt window action that may have also happened.
Split out the bits that do the keyboard work into *_noswitch equivalents,
and use those in keyboard allocation. This will be less error-prone when a
later change will offer up different window state behavior when the console
is ungrabbed.
Reviewed by: ray
Differential Revision: https://reviews.freebsd.org/D27110
The firmware header loaded into an rsu(4) device has to be customized
to reflect device settings. The driver was overwriting the header
from the shared firmware image before sending it to the device. If
two devices attached at the same time with different settings, one
device could potentially get a corrupted header. The recent changes
in a095390344 exposed this bug in the
form of a panic as the firmware blobs are now marked read-only in
object files and mapped read-only by the kernel.
To avoid the bug, change the driver to allocate a copy of the firmware
header on the stack that is initialized before writing it to the
device.
PR: 252163
Reported by: vidwer+fbsdbugs@gmail.com
Tested by: vidwer+fbsdbugs@gmail.com
Reviewed by: hselasky, bz, emaste
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27850
The handshake timer can race with another thread sending a FIN or RST
to close a TOE TLS socket. Just bail from the timer without
rescheduling if the connection is closed when the timer fires.
Reported by: Sony Arpita Das @ Chelsio QA
Reviewed by: np
Differential Revision: https://reviews.freebsd.org/D27583
Xenstore watches received are queued in a list and processed in a
deferred thread. Such queuing was done without any checking, so a
guest could potentially trigger a resource starvation against the
FreeBSD kernel if such kernel is watching any user-controlled xenstore
path.
Allowing limiting the amount of pending events a watch can accumulate
to prevent a remote guest from triggering this resource starvation
issue.
For the PV device backends and frontends this limitation is only
applied to the other end /state node, which is limited to 1 pending
event, the rest of the watched paths can still have unlimited pending
watches because they are either local or controlled by a privileged
domain.
The xenstore user-space device gets special treatment as it's not
possible for the kernel to know whether the paths being watched by
user-space processes are controlled by a guest domain. For this reason
watches set by the xenstore user-space device are limited to 1000
pending events. Note this can be modified using the
max_pending_watch_events sysctl of the device.
This is XSA-349.
Sponsored by: Citrix Systems R&D
MFC after: 3 days
The issue was found while building cxgbe with gcc 10 (in illumos),
the array subscription check is warning us about outside the bounds
access.
See also: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
Each entry actually stores a native pointer, not a uint64_t quantity. While
we're here, go ahead and export the pointer as-is rather than converting it
to KVA. This may be more useful as consumers can map /dev/mem and observe
the entry.
For reference, see: sys/contrib/edk2/Include/Uefi/UefiSpec.h
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D27669
Account for any residual bytes. This is only relevant for vnode-backed
md(4) devices.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27738
The divider table already contains the correct HW divider value, it should
not be modified by other flags such as 'CLK_DIV_ZERO_BASED'.
MFC after: 4 weeks
These drivers should have been removed along with tl(4) as part of
7c897ca91f and r347918 respectively
as these fromer made sure to only ever attach to the latter, e. g.:
<...>
static int
tlphy_probe(device_t dev)
{
if (!mii_dev_mac_match(dev, "tl"))
return (ENXIO);
<...>
If LED state is set through evdev interface, than asynchronous nature
of USB transfer callback can lead to change of order of events echoed
back to userland as it causes LED events to be echoed with some lag.
Fix that with echoing of LED events synchronously in ioctl handler.
Reviewed by: hselasky
Obtained from: sysutils/iichid
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D27750
Unlike AT keyboards, HID devices are able to send all pc105 key
states within a single report. Let evdev to transmit all key state
changes within a single report too.
Reviewed by: hselasky
Obtained from: sysutils/iichid
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D27749
Add support for LAN found on Thinkpad USB-C and Thunderbolt Gen 2
docking stations.
Submitted by: ali.abdallah@suse.com
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
Hardware timestamp reporting is disabled by default as it produces many
extra events which are not handled by consumers like libinput.
Add hw.usb.wmt.timestamps=1 tunable to loader.conf to enable it.
Obtained from: sysutils/iichid
In Hybrid mode, the number of contacts that can be reported in one
report is less than the maximum number of contacts that the device
supports. For example, a device that supports a maximum of 4
concurrent physical contacts, can set up its top-level collection to
deliver a maximum of two contacts in one report. If four contact
points are present, the device can break these up into two serial
reports that deliver two contacts each.
Obtained from: sysutils/iichid
Original if_dwc driver used m_defrag as an implementation shortcut but on
1000Mb networks it affects performance. Implement multi-descriptor support for
TX path.
Tested on RK3399-Firefly, patch adds ~15% of network throughput.
Reviewed By: manu
Differential Revision: https://reviews.freebsd.org/D27520
Now that bhyve(8) supports UART, bvmconsole and bvmdebug are no longer needed.
This also removes the '-b' and '-g' flag from bhyve(8). These two flags were
marked deprecated in r368519.
Reviewed by: grehan, kevans
Approved by: kevans (mentor)
Differential Revision: https://reviews.freebsd.org/D27490
g_handleattr_int() consumes the bio if the attribute matches, so when we
check bp->bio_cmd bp may have been freed.
Move GETATTR handling to a separate function to avoid the problem. We
do not need to set bio_completed for such bios, g_handleattr_int() will
handle it. Also remove the setting of bio_resid before the
devstat_end_transaction_bio() call. All of the md(4) bio handlers set
bio_resid already.
Reported by: KASAN
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27724
Otherwise libinput refuses to recoginize some Synaptics touchpads with
"kernel bug: device has min == max on ABS_X" message in Xorg.log.
PR: 251149
Reported-by: Jens Grassel <freebsd-ports@jan0sch.de>
Tested-by: Jens Grassel <freebsd-ports@jan0sch.de>
MFC-after: 2 weeks
Conducted tests showed that Embedded Controller is not mandatory for
WMI extensions to work.
Reported-by: yuripv
Reviewed-by: avg
MFC-after: 2 weeks
Differential-Revision: https://reviews.freebsd.org/D27653
These Mini-Box LCDs are using Microchip components and sub-licensed product
IDs. Whilst here, update the constant names and descriptions for the products
to use the names listed on the manufacturer's website rather than vague ones.
The picoLCD 4x20 is named that on the manufacturer's website so prefer that
name, even though linux-usb.org lists it with the numbers reversed as one might
expect.
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D27670
The SRAT may contain multiple distinct entries that together describe a
contiguous region of physical memory. In this case we were not
coalescing the corresponding entries in the memory affinity table, which
led to fragmented phys_avail[] entries. Since r338431 the vm_phys_segs[]
entries derived from phys_avail[] will be coalesced, resulting in a
situation where vm_phys_segs[] entries do not have a covering
phys_avail[] entry. vm_page_startup() will not add such segments to the
physical memory allocator, leaving them unused.
Reported by: Don Morris <dgmorris@earthlink.net>
Reviewed by: kib, vangyzen
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27620
This caused us to write to the low half of the feature word twice, once with
the high bits and once with the low bits. Common legacy device implementations
seem to be fairly lenient about being able to write to the feature bits
multiple times, but Arm's models use a stricter implementation that will ignore
the second write. This fixes using vtnet(4) on those models.
Reported by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Pointy hat: jrtc27
instead of bailing out with EBUSY if there are any.
If driver module is unloaded, or just device is forcibly detached from
the driver, there is no way for driver to correctly unload otherwise.
Esp. if there are resources dedicated to the VFs which prevent turning
down other resources.
Reviewed by: jhb
Sponsored by: Mellanox Technologies / NVidia Networking
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27615
Add capability to SPIBUS to have child device with IRQ.
For example many ADC chip have a dedicated pin to signal "data ready"
and the host can just wait for a interrupt to go out and read the result.
It is the same code as in R282674 and R282702 for IICBUS by Michal Meloun
Submitted by: Oskar Holmund <oskar.holmlund@ohdata.se>
Differential Revision: https://reviews.freebsd.org/D27396
Current way, hardcoded value plus heuristic is not conform to the PCI(e)
specification and it fails on systems where MSI-X bar is not initialized by
BIOS/ACPI (many arm or arm64 systems for example).
Instead, use the standard PCI(e) capability for determining of
MSIX table bar address.
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D27265
- Use a uintptr_t cast to get the virtual address of a pointer in
USB_P2U() instead of a ptrdiff_t.
- Add offsets to a char * pointer directly without roundtripping the
pointer through a ptrdiff_t in USB_ADD_BYTES().
Reviewed by: imp, hselasky
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27581
The sense_ptr thing is quite broken. As near as I can tell, the
driver tries to copyout to a physical address rather than whatever
user address the sense buffer should be copied to. It is not
immediately obvious what user address the sense buffer should be
copied to.
Reviewed by: imp
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27578
Move initialization of num_altsetting under USB_CFG_INIT, else
there will be a page fault when enumerating USB devices.
PR: 251856
MFC after: 1 week
Submitted by: Ma, Horse <Shichun.Ma@dell.com>
Sponsored by: Mellanox Technologies // NVIDIA Networking
Allow setting the alternate interface number to fail when there is only
one alternate setting present, to comply with the USB specification.
Refactor how iface->num_altsetting is computed.
Bump the __FreeBSD_version due to change of core USB structure.
PR: 251856
MFC after: 1 week
Submitted by: Ma, Horse <Shichun.Ma@dell.com>
Sponsored by: Mellanox Technologies // NVIDIA Networking
Limit the number of alternate settings to 256.
Else the alternate index variable may wrap around.
PR: 251856
MFC after: 1 week
Submitted by: Ma, Horse <Shichun.Ma@dell.com>
Sponsored by: Mellanox Technologies // NVIDIA Networking
This is an import of the Google Summer of Code 2018 project completed by
Christian Kramer (and, sadly, ignored by us for two years now). The goals
stated for that project were:
FreeBSD already has support for interrupts implemented in the GPIO
controller drivers of several SoCs, but there are no interfaces to take
advantage of them out of user space yet. The goal of this work is to
implement such an interface by providing descriptors which integrate
with the common I/O system calls and multiplexing mechanisms.
The initial imported code supports the following functionality:
- A kernel driver that provides an interface to the user space; the
existing gpioc(4) driver was enhanced with this functionality.
- Implement support for the most common I/O system calls / multiplexing
mechanisms:
- read() Places the pin number on which the interrupt occurred in the
buffer. Blocking and non-blocking behaviour supported.
- poll()/select()
- kqueue()
- signal driven I/O. Posting SIGIO when the O_ASYNC was set.
- Many-to-many relationship between pins and file descriptors.
- A file descriptor can monitor several GPIO pins.
- A GPIO pin can be monitored by multiple file descriptors.
- Integration with gpioctl and libgpio.
I added some fixes (mostly to locking) and feature enhancements on top of
the original gsoc code. The feature ehancements allow the user to choose
between detailed and summary event reporting. Detailed reporting provides
a record describing each pin change event. Summary reporting provides the
time of the first and last change of each pin, and a count of how many times
it changed state since the last read(2) call. Another enhancement allows
the recording of multiple state change events on multiple pins between each
call to read(2) (the original code would track only a single event at a time).
The phabricator review for these changes timed out without approval, but I
cite it below anyway, because the review contains a series of diffs that
show how I evolved the code from its original state in Christian's github
repo for the gsoc project to what is being commited here. (In effect,
the phab review extends the VC history back to the original code.)
Submitted by: Christian Kramer
Obtained from: https://github.com/ckraemer/freebsd/tree/gsoc2018
Differential Revision: https://reviews.freebsd.org/D27398
nids(4) was a clever idea in the early 2000's when the market was
flooded with 10/100 NICs with Windows-only drivers, but that hasn't been
the case for ages and the driver has had no meaningful maintenance in
ages. It only supports Windows-XP era drivers.
Reviewed by: imp, bcr
MFC after: 3 days
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27527
The hme (Happy Meal Ethernet) driver was the onboard NIC in most
supported sparc64 platforms. A few PCI NICs do exist, but we have seen
no evidence of use on non-sparc systems.
Reviewed by: imp, emaste, bcr
Sponsored by: DARPA
Now that bhyve(8) supports UART, bvmconsole and bvmdebug are no longer needed.
Mark the '-b' and '-g' flag as deprecated for bhyve(8).
These will be removed in 13.
Reviewed by: jhb, grehan
Approved by: kevans (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D27519
Do not assume that VBE framebuffer metadata can be used. Like with the
EFI fb metadata, it may be null, so we should take care not to
dereference the null vbefb pointer. This avoids a panic when booting
-CURRENT on a gen1 VM in Azure.
Approved by: tsoome
Sponsored by: Miles AS
Differential Revision: https://reviews.freebsd.org/D27533
PCI memory address space is shared between memory-mapped devices (MMIO)
and host memory (which may be remapped by an IOMMU). Device accesses to
an address within a memory aperture in a PCIe root port will be treated
as peer-to-peer and not forwarded to an IOMMU. To avoid this, reserve
the address space of the root port's memory apertures in the address
space used by the IOMMU for remapping.
Reviewed by: kib, tychon
Discussed with: Anton Rang <rang@acm.org>
Tested by: tychon
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D27503
Summary:
r358689 attempted to fix a clang warning/error by inferring the intent
of the condition "(cdb[0] != 0x28 || cdb[0] != 0x2A)". Unfortunately, it looks
like this broke things. Instead, fix this by making this path unconditional,
effectively reverting to the previous state.
PR: kern/251483
Reviewed By: ambrisko
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D27515
Replace some hard-coded magic values in the ioctl stats struct with
#defines. I'm going to follow up with some more sanity checking in
the receive path that also use these values so we don't do bad
things if the hardware is (more) confused.
Sync serial (T1/E1) interfaces are largely irrelevant today and phk
confirms this driver is unnecessary in review D23928.
This leaves ce(4) and cp(4) in the tree. They're likely not relevant
either, but glebius contacted the manufacturer and those devices are
still available for purchase. At glebius' suggestion leave them in
the tree as long as they do not impose a maintenace burden.
Approved by: phk
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
bus_dmamap_sync() ensures that memory that's prepared for PREWRITE can
be DMA'd immediately after it returns. The details differ, but this
mirrors atomic thread release semantics, at least for the buffers
synced.
For non-x86 platforms, bus_dmamap_sync() has the right syncing and
fences. So in the past, wmb() had been omitted for them.
For x86 platforms, the memory ordering is already strong enough to
ensure DMA to the device sees the current contents. As such, we don't
need the wmb() here. It translates to an sfence which is only needed
for writes to regions that have the write combining attribute set or
when some exotic opcodes are used. The nvme driver does neither of
these. Since bus_dmamap_sync() includes atomic_thread_fence_rel, we
can be assured any optimizer won't reorder the bus_dmamap_sync and the
bus_space_write operations. The wmb() was a vestiage of the pre-busdma
version initially committed to the tree.
Reviewed by: kib@, gallatin@, chuck@, mav@
Differential Revision: https://reviews.freebsd.org/D27448
By default, if a TOE TLS socket stops receiving data for more than 5
seconds, revert the connection back to plain TOE mode. This provides
a fallback if the userland SSL library does not support KTLS. In
addition, for client TLS 1.3 sockets using connect(), the TOE socket
blocks before the handshake has completed since the socket option is
only invoked for the final handshake.
The timeout defaults to 5 seconds, but can be changed at boot via the
hw.cxgbe.toe.tls_rx_timeout tunable or for an individual interface via
the dev.<nexus>.toe.tls_rx_timeout sysctl.
Reviewed by: np
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D27470
This includes mbufs waiting for data from sendfile() I/O requests, or
mbufs awaiting encryption for KTLS.
Reviewed by: np
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D27469
If TOE TLS is requested for an unsupported cipher suite or TLS
version, disable TLS processing and fall back to plain TOE. In
addition, if an error occurs when saving the decryption keys in the
card's memory, disable TLS processing and fall back to plain TOE.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D27468
If a TOE TLS socket ends up using an unsupported TLS version or
ciphersuite, it must be downgraded to a "plain" TOE socket with TLS
encryption/decryption performed on the host. The previous
implementation of this fallback was incomplete and resulted in hung
connections.
Reviewed by: np
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D27467
* uninitialised variable use
* Using AXGBE_SET_ADV() where it was intended; using AXGBE_ADV()
seems wrong and also causes a compiler warning.
Reviewed by: rpokala
Differential Revision: https://reviews.freebsd.org/D26839
DTS node can have this property which configure the burst length
for both TX and RX if it's the same.
This unbreak if_dwc on Allwinner A20 and possibly other boards that
uses this prop.
Reported by: qroxana <qroxana@mail.ru>
It is common for freelists to be starving when a netmap application
stops. Mailbox commands to free queues can hang in such a situation.
Avoid that by not freeing the queues when netmap is switched off.
Instead, use an alternate method to stop the queues without releasing
the context ids. If netmap is enabled again later then the same queue
is reinitialized for use. Move alloc_nm_rxq and txq to t4_netmap.c
while here.
MFC after: 1 week
Sponsored by: Chelsio Communications
Allow fdt devices to be used as debug ports for gdb(4).
A debug console can be specified with the "freebsd,debug-path" property
in the device tree's /chosen node, or using the environment variable
hw.fdt.dbgport.
The device should be specified by its name in the device tree, for
example hw.fdt.dbgport="serial2".
PR: 251053
Submitted by: Dmitry Salychev <dsl@mcusim.org>
Submitted by: stevek (original patch, D5986)
Reviewed by: andrew, mhorne
Differential Revision: https://reviews.freebsd.org/D27422
r367917 fixed the backpressure on the netmap rxq being stopped but that
doesn't help if some other netmap rxq is starved (because it is stopping
too although the driver doesn't know this yet) and blocks the pipeline.
An alternate fix that works in all cases will be checked in instead.
Sponsored by: Chelsio Communications
A failure in iflib_device_register() can result in
em_free_pci_resources() being called after receive queues have already
been freed. In particular, a failure to allocate IRQ resources will goto
fail_queues, where IFDI_QUEUES_FREE() will be called via
iflib_tx_structures_free(), preceding the call to IFDI_DETACH().
Cope with this by checking adapter->rx_queues before dereferencing it.
A similar check is present in ixgbe(4) and ixl(4).
MFC after: 1 week
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D27260
- in nvme_qpair_process_completions() do dma sync before completion buffer
is used.
- in nvme_qpair_submit_tracker(), don't do explicit wmb() also for arm
and arm64. Bus_dmamap_sync() on these architectures is sufficient to ensure
that all CPU stores are visible to external (including DMA) observers.
- Allocate completion buffer as BUS_DMA_COHERENT. On not-DMA coherent systems,
buffers continuously owned (and accessed) by DMA must be allocated with this
flag. Note that BUS_DMA_COHERENT flag is no-op on DMA coherent systems
(or coherent buses in mixed systems).
MFC after: 4 weeks
Reviewed by: mav, imp
Differential Revision: https://reviews.freebsd.org/D27446
Some USB WLAN devices have "on-board" storage showing up as umass
and making the root mount wait for a very long time.
The WLAN drivers know how to deal with that an issue an eject
command later when attaching themselves.
Introduce a quirk to not probe these devices as umass and avoid
hangs and confusion altogether.
Reviewed by: hselasky, imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D27434
Otherwise qat_detach() may attempt to deregister an unrelated crypto
driver if an error occurs in qat_attach() before crypto_get_driverid()
is called, since 0 is a valid driver ID.
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC (Netgate)
If firmware_get() fails to find a loaded firmware image, it searches for
candidate KLDs to load. It will search for a KLD containing a module
with the same name as the requested image, and failing that, will load a
KLD with the same basename as the requested image.
The module name given by fw_stub.awk is simply "<mangled KLD name>_fw".
QAT firmware modules contain two images, neither of which match either
of the names used during lookup, so automatic loading of firmware images
after mountroot does not work. Work around this by using the same
string for the first image name and for the KLD basename.
MFC after: 3 days
Sponsored by: Rubicon Communications, LLC (Netgate)
Implement vt_vbefb to support Vesa Bios Extensions (VBE) framebuffer with VT.
vt_vbefb is built based on vt_efifb and is assuming similar data for
initialization, use MODINFOMD_VBE_FB to identify the structure vbe_fb
in kernel metadata.
struct vbe_fb, is populated by boot loader, and is passed to kernel via
metadata payload.
Differential Revision: https://reviews.freebsd.org/D27373
These swapping functions violate BUSDMA contract - we cannot write
to armed (by bus_dmamap_sync(PRE_..)) buffers. Remove them at least
from little endian machines until a better solution will be developed.
Reviewed by: imp
MFC after: 3 weeks
We read the bus end value from the _CRS method. On some systems we need
to further limit it based on the MCFG table.
Support this by setting a default value, then update it if needed in the
_CRS table, and finally reduce it if it is past the end of the MCFG tabel.
This will allow for both systems that use either method to encode this
value.
This partially reverts r347929, removing the error printf.
Reviewed by: philip
Tested by: philip, Andrey Fesenko <f0andrey_gmail.com>
MFC after: 2 weeks
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D27274
With 4KB page size the 2MB is the maximum we can address with one page PRP.
Going further would require chaining, that would add some more complexity.
On the other side, to reduce memory consumption, allocate the PRP memory
respecting maximum transfer size reported in the controller identify data.
Many of NVMe devices support much smaller values, starting from 128KB.
To do that we have to change the initialization sequence to pull the data
earlier, before setting up the I/O queue pairs. The admin queue pair is
still allocated for full MIN(maxphys, 2MB) size, but it is not a big deal,
since there is only one such queue with only 16 trackers.
Reviewed by: imp
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.
Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.
Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.
Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.
Suggested by: mav (*)
Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225
- Remove code duplication by adding two new functions to execute prepared
queue entry via either mbox or request queue and wait for result.
- Since the new function executing via request queue sleeps any way, make
it sleep also in case of overflows or handle shortages. It should make it
more reliable and less affecting other less flexible request queue users.
- Turn isp_target_put_entry() into not target-specific isp_send_entry().
- Make handling of responses with control handles more universal.
- Move RQSTYPE_RPT_ID_ACQ handling into new function.
- Inline isp_handle_other_response(), becoming trivial after above.
- Clean the list of IOCBs from pre-24xx ones.
This driver provides support for Realtek PCI SD card readers. It attaches
mmc(4) bus on card insertion and detaches it on card removal. It has been
tested with RTS5209, RTS5227, RTS5229, RTS522A, RTS525A and RTL8411B. It
should also work with RTS5249, RTL8402 and RTL8411.
PR: 204521
Submitted by: Henri Hennebert (hlh at restart dot be)
Reviewed by: imp, jkim
Differential Revision: https://reviews.freebsd.org/D26435
It was broken by design and unused for years due to conflicts between
different threads, fighting for the same set of mailbox registers, not
designed for multiple requests at a time. So either request has to be
synchronous and spin under the lock, or it should be sent asynchronously
through the queues as Mailbox Command IOCB or some other way.
This removes any OS specifics from the wait code, so it can be inlined.
The ratelimit tags may be shared, especially for unlimited TLS
traffic, and then the refcount is allowed to be greater than one
when freeing the send tag.
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
Before this change in case of request queue overflow driver just froze the
device queue for 100ms to retry after. It was pretty bad for performance.
This change introduces SIM queue freezing when free space on the request
queue drops below 255 entries (worst case of maximum I/O size S/G list),
checking for a chance to release it on I/O completion. If the queue still
get overflowed somehow, the old mechanism is still in place, just with
delay reduced to 10ms.
With the earlier queue length increase overflows should not happen often,
but it is still easily reachable on synthetic tests.
Do not hardcode what we setup for the DMA engine configuration but
lookup the fdt properties and configuring accordingly.
Use a default value of 8 for the burst dma length for both TX and
RX, this is what we used for TX before.
- Allocate 256 handlers more than payload commands for management purposes.
- Increase maximum number of handlers from 8K to 16K by tuning the format.
- Just to be safe limit the number of payload commands to 16K - 256.
- Limit number of target exchanges in mixed mode to the number of atpds.
- If we still somehow get out of atpds -- return BUSY, since we really are.
There is not much to inherit any more, may create more problems than solve.
Instead parent them all directly to upstream.
While there, add missed payload tag and tune scratch tag destructions.
The netmap application using the driver is responsible for replenishing
the receive freelists and they may be totally depleted when the
application exits. Packets in flight, if any, might block the pipeline
in case there aren't enough buffers left in the freelist. Avoid this by
filling up the freelists with a driver allocated buffer.
MFC after: 1 week
Sponsored by: Chelsio Communications
Qlogic chips store S/G lists in the same queue as requests themselves. In
the worst case 1MB I/O may require up to 52 IOCBs, that means queue of 1024
IOCBs can store only 19 of such requests. The increase reduces chances of
overflow, while we should be able to afford additional 512KB of RAM per HBA.
The Linux driver uses comparable numbers.
While there, decouple ATIO queue size from response queue size. There is
no reason for them to be equal.
- Make isp_start() to set all the IOCB fields aside of S/G list, removing
extra information from isp_send_cmd(), now only doing S/G lists and sending.
- Turn DMA setup/free from being card and PCI-specific into OS-specific,
instead add new card-specific method for isp_send_cmd(). Previously this
function was a monster handling all the cards.
- Remove double error code translation.
This removes 288KB (36%) of the driver code and zillions of hacks and
workarounds, making single driver uniformly support several different
generations of hardware interfaces, not counting minor card variations.
After years of the hopeless fight, I don't think it worth to continue
support for hardware obsolete for 15-20 years. Instead much cleaner
now code should allow to move forward toward better locking, multiple
queues and other cool features.
All the remaining Qlogic cards starting from 4Gb 24xx to 32Gb 27xx use
the same hardware/firmware interface with minor incremental improvements,
so it seems to be a good new starting point. Except one PCI-X model all
all of them are PCIe and so still usable in modern systems.
Discussed with: ken, scottl, jpaetzel, imp
Relnotes: yes
Rudimentary AUX multiplexing support was added to kernel to make possible
touchpad initialization on some HP EliteBook laptops with trackpoint.
Disable multiplexer probing on all Lenovo laptops now as they use touchpad
pass-through port rather than AUX multiplexer to connect trackpoint and
at least two model (X120e and X121e) is known for getting PS/2 AUX port
dysfunctional after switching back to hidden multiplexing mode.
AUX MUX probing can be reenabled with setting of hw.psm.mux_disabled loader
tunable to 0.
PR: 249987
Reported by: jwb
MFC after: 2 weeks
It can useful for code outside the VM system to look up the NUMA domain
of a page backing a virtual or physical address, specifically when
creating NUMA-aware data structures. We have _vm_phys_domain() for
this, but the leading underscore implies that it's an internal function,
and vm_phys.h has dependencies on a number of other headers.
Rename vm_phys_domain() to vm_page_domain(), and _vm_phys_domain() to
vm_phys_domain(). Make the latter an inline function.
Add _vm_phys.h and define struct vm_phys_seg there so that it's easier
to use in other headers. Include it from vm_page.h so that
vm_page_domain() can be defined there.
Include machine/vmparam.h from _vm_phys.h since it depends directly on
some constants defined there.
Reviewed by: alc
Reviewed by: dougm, kib (earlier versions)
Differential Revision: https://reviews.freebsd.org/D27207
Some of the PCI ID were described as ENA with LLQ support - it's not
fully accurate and because of that, their names were changed.
Instead of LLQ, use RSERV0 for the description of those devices.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27119
The new HAL allows the driver to read extra ENI stats. Exact meaning of
each of them can be found in base/ena_defs/ena_admin_defs.h file and
structure ena_admin_eni_stats.
Those stats are being updated inside of the timer service, which is
executed every second.
ENI metrics are turned off by default. They can be enabled, using the
sysctl node: dev.ena.X.eni_metrics.update_delay
0 value in this node means that the update is turned off. Other values
determine how many seconds must pass, before ENI metrics will be
updated.
They can be acquired, using sysctl:
sysctl dev.ena.X.eni_metrics
Where X stands for the interface number.
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27118
Refering to guide: https://wiki.freebsd.org/SPDX the SPDX tag should not
replace the standard license text, however it should be added over the
standard license text to make the automation easier.
Because of that, the old license was kept, but the SPDX tag was added
on top of every ENA driver file.
Submited by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27117
For the first descriptor in a chain the data may start at an offset.
It is optional feature of some devices, so the driver must ack that
it supports it.
The data pointer of the mbuf is simply shifted by the given value.
Submitted by: Maciej Bielski <mba@semihalf.com>
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27116
* Use the new API of ena_trace_*
* Fix typo syndrom --> syndrome
* Remove validation of the Rx req ID (already performed in the ena-com)
* Remove usage of deprecated ENA_ASSERT macro
Submitted by: Ido Segev <idose@amazon.com>
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27115
The latest generation hardware requires IO CQ (completion queue)
descriptors memory to be aligned to a 4K. It needs that feature for
the best performance.
Allocating unaligned descriptors will have a big performance impact as
the packet processing in a HW won't be optimized properly. For that
purpose adjust ena_dma_alloc() to support it.
It's a critical fix, especially for the arm64 EC2 instances.
Submitted by: Ido Segev <idose@amazon.com>
Obtained from: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27114
Ecmd memory is not directly related to the request queue, only referenced
from it sometimes in target mode. Separate allocation should be easier
in case of fragmented memory and can be skipped when target is not built.
MFC after: 1 month
The entry->flags field is initialized in iommu_gas_init_domain().
Reviewed by: kib
Sponsored by: Innovate DSbD
Differential Revision: https://reviews.freebsd.org/D27235
This is needed on arm64 for the interface between iommu framework
and iommu controller drivers.
Reviewed by: kib
Sponsored by: Innovate DSbD
Differential Revision: https://reviews.freebsd.org/D27229
APIs that have deferred callbacks should have some kind of cleanup
function that callers can use to fence the callbacks. Otherwise things
like module unloading can lead to dangling function pointers, or worse.
The IB MR code is the only place that calls this function and had a
really poor attempt at creating this fence. Provide a good version in
the core code as future patches will add more places that need this
fence.
Linux commit:
e355477ed9e4f401e3931043df97325d38552d54
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
Report EQE data upon CQ completion to let upper layers use this data.
Linux commit:
4e0e2ea1886afe8c001971ff767f6670312a9b04
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
Enhance mlx5_core_create_cq() to get the command out buffer from the
callers to let them use the output.
Linux commit:
38164b771947be9baf06e78ffdfb650f8f3e908e
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
To prevent a hardware memory leak when a DEVX DCT object is destroyed
without calling drain DCT before, (e.g. under cleanup flow), need to
manage its creation and destruction via mlx5 core.
Linux commit:
c5ae1954c47d3fd8815bd5a592aba18702c93f33
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
Make sure order of cleanup is exactly the opposite of initialization.
Linux commit:
f4044dac63e952ac1137b6df02b233d37696e2f5
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
In standards such as LoPAPR, property names in excess of the usual 31
characters exist.
This breaks property traversal.
While in IEEE 1275-1994, nextprop is defined explicitly to work with a
32-byte region of memory, using a larger buffer should be fine. There is
actually no way to pass a buffer length to the nextprop call in the OF
client interface, so SLOF actually just blindly overflows the buffer.
So we have to defensively make the buffer larger, to avoid memory
corruption when reading out long properties on live OF systems.
Note also that on real-mode OF, things are pretty tight because we are
allocating against a static bounce buffer in low memory, so we can't just
use a huge buffer to work around this without it being wasteful of our
limited amount of 32-bit physical memory.
This allows a patched ofwdump to operate properly on SLOF (i.e. pseries)
systems, as well as any other PowerPC systems with overlength properties.
Reviewed by: jhibbits
MFC after: 2 weeks
Sponsored by: Tag1 Consulting, Inc.
Differential Revision: https://reviews.freebsd.org/D26669
TCP SYNs in inner traffic will hit hardware listeners when VXLAN/NVGRE
rx parsing is enabled in the chip. t4_tom should pass on these SYNs to
the kernel and let it deal with them as if they arrived on the non-TOE
path.
Reported by: Sony at Chelsio
MFC after: 1 week
Sponsored by: Chelsio Communications
the HID volume keys support in the USB audio driver.
While at it re-organize the USB audio sysctls a bit.
Differential Revision: https://reviews.freebsd.org/D27180
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
This both:
- makes ifconfig media line similar to that of other drivers.
- fixes ENXIO in case when paradoxical current media word is not registered.
Now e.g.
ifconfig mce0 -mediaopt txpause,rxpause
works by disabling pauses if enabled.
Sponsored by: Mellanox Technologies/NVidia Networking
MFC after: 1 week
Otherwise, a socket can have a non-NULL tp->tod while TF_TOE is clear.
In particular, if a newly accepted socket falls back to non-TOE due to
an active open failure, the non-TOE socket will still have tp->tod set
even though TF_TOE is clear.
Reviewed by: np
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D27028
Refer to the Linux commit mentioned below for a more detailed description.
Linux commit:
a18177925c252da7801149abe217c05b80884798
Requested by: Isilon
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
The MAC address can be set with the optional mac-addr property in the VF
section of the iovctl.conf(5) used to instantiate the VFs.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Query the firmware for the MAC address set by the PF for the VF and use
it instead of the firmware generated MAC if it's available.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
When using the ALT+CTRL+ESC sequence to break into kdb, the keyboard is
completely borked when you return. watch(8) shows that it's working, but
it's inserting escape sequences.
Further investigation revealed that VT_ALT_TO_ESC_HACK is the default and
directly conflicts with this sequence, so upon return from the debugger
ALKED is set.
If they triggered the break to debugger, it's safe to assume they didn't
mean to use VT_ALT_TO_ESC_HACK, so just unset it to reduce the surprise when
the keyboard seems non-functional upon return.
Reviewed by: tsoome
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D27109
Improve the output of the recently often experienced debug message in order
to gather further data.
PR: 237666
Reviewed by: hselasky
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D27108
vt_generate_cons_palette() does take max values of RGB component colours, not
mask. Also we need to set info->fb_cmsize, or vt_fb_init() will re-initialize
the info->fb_cmap.