Commit Graph

34803 Commits

Author SHA1 Message Date
Eric Joyner
7d48aa4c72 ixgbe(4): Update shared code, add support for X552 1G, fix bug
This patch will:

- Update ixgbe shared code
- Add support for Intel(R) Ethernet Connection X552 1000BASE-T
- Add error handling for link state check preventing VF from stopping traffic
  after changing PF's MTU value

Submitted by: Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by: Intel Networking
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D13885
2018-03-19 20:55:05 +00:00
Ian Lepore
d892051323 Add the device/chip type to the disk d_descr field, and print more info
about the chip including the erase block size at attach time.

Also add myself to the copyrights since at this point svn blame would point
to me as the culprit for much of this.
2018-03-18 18:58:47 +00:00
Ian Lepore
3c9af13c75 Add support for 4K and 32K erase block sizes. Many of the supported chips
have these flags set in the ident table, but there was no code to support
using the smaller erase sizes.
2018-03-18 18:37:47 +00:00
Ian Lepore
c03ab159f6 Make all internal routines return an int error status, and check the
status at all call points.  Combine the get_status and wait_for_ready
routines, since waiting for ready is the only reason to ever get status.
2018-03-18 17:47:57 +00:00
Ian Lepore
89a1585b8d Add sc_parent to the softc and use it in place of device_get_parent() calls
all over the place.  Also pass the softc as the arg to all the internal
functions instead of passing a device_t and calling device_get_softc() in
each function.
2018-03-18 17:25:23 +00:00
Ian Lepore
89a895b63c Bugfix: wait for writes/erases to complete after starting them, instead of
before starting them.

Using the wait-before logic would make sense if there was useful time-
consuming work that could be done between the end of one write and the
beginning of the next, but it also requires doing the wait-for-ready before
reading, because a prior write or erase could still be in progress.  Reading
is the far more common case, so adding a whole extra bus transaction to
check for ready before each read would soak up any small gains that might be
had from doing async writes.
2018-03-18 16:52:31 +00:00
Ian Lepore
19aa9f7183 Eliminate some unneeded intermediate variables. Eliminate some redundant
parens in shift-and-mask expressions.  Reword and reflow some comments.
2018-03-18 16:36:14 +00:00
Ian Lepore
f432eb7ea1 Remove a pointless KASSERT and reword a comment a bit. The KASSERT tested
for the same condition that the preceeding lines checked for and would have
returned EIO, so the assert could never possibly trigger (sc_sectorsize must
inherently be an integer multiple of FLASH_PAGE_SIZE).
2018-03-18 16:10:14 +00:00
Ian Lepore
dac94adb63 Do not overwrite the contents of BIO_WRITE buffers. SPI inherently
transfers data in both directions at once.  When writing to the device,
use a dummy buffer for the incoming data, not the same buffer as the
outgoing data.  Writes are done in FLASH_PAGE_SIZE chunks, which is only
256 bytes, so just put the dummy buffer into the softc.
2018-03-18 15:56:10 +00:00
Conrad Meyer
db488e4f52 random(4): Poll for signals during large reads
Occasionally poll for signals during large reads of the /dev/u?random
devices.  This allows cancellation via SIGINT of accidental invocations of
very large reads.  (A 2GB /dev/random read, which takes about 10 seconds on
my 2017 AMD Zen processor, can be aborted.)

I believe this behavior was intended since 2014 (r273997), just not fully
implemented.

This is motivated by a potential getrandom(2) interface that may not
explicitly forbid extremely large reads on 64-bit platforms -- even larger
than the 2GB limit imposed on devfs I/O by default.  Such reads, if they are
to be allowed, should be cancellable by the user or administrator.

Reviewed by:	delphij
Approved by:	secteam (delphij)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14684
2018-03-16 18:50:26 +00:00
Ian Lepore
9c45f7b4fd Use EFI RTC capabilities info when registering, add bootverbose diagnostics.
Make some small improvements to the efirtc driver by obtaining the clock
capabilities (resolution and whether the sub-second counters are reset) and
using the info when registering the clock. When the hardware zeroes out the
subsecond info on clock-set, schedule clock updates to happen just before
top-of-second, so that the RTC time is closely in-sync with kernel time.

Also, in the identify() routine, always add the driver if EFI runtime
services are available, then decide in probe() whether to attach the driver
or not. If not attaching and bootverbose is on, say why. All of this is
basically to avoid "silent failure" -- if someone thinks there should be an
efi rtc and it's not attaching, at least they can set bootverbose and maybe
get a clue from the output.

Differential Revision:	https://reviews.freebsd.org/D14565 (timed out)
2018-03-16 18:16:27 +00:00
Warner Losh
d85d964829 Try polling the qpairs on timeout.
On some systems, we're getting timeouts when we use multiple queues on
drives that work perfectly well on other systems. On a hunch, Jim
Harris suggested I poll the completion queue when we get a timeout.
This patch polls the completion queue if no fatal status was
indicated. If it had pending I/O, we complete that request and
return. Otherwise, if aborts are enabled and no fatal status, we abort
the command and return. Otherwise we reset the card.

This may clear up the problem, or we may see it result in lots of
timeouts and a performance problem. Either way, we'll know the next
step. We may also need to pay attention to the fatal status bit
of the controller.

PR: 211713
Suggested by: Jim Harris
Sponsored by: Netflix
2018-03-16 05:23:48 +00:00
Andriy Voskoboinyk
5f792f7478 rtwn(4): de-hardcode ('h/w rate index' - 'corresponding MCS index') constant 2018-03-16 01:03:10 +00:00
Andriy Voskoboinyk
46e18fc6d4 urtw(4), zyd(4): reduce code verbosity.
No functional change intended.
2018-03-16 00:38:10 +00:00
Andriy Voskoboinyk
2757acf673 urtw(4): provide names for some commonly used rate indices + drop
now-unused urtw_rate2rtl()
2018-03-16 00:09:16 +00:00
Brooks Davis
064c9c3d42 Add a request structure and make the implementation use it.
This allows compatibility translation to take place on the stack
(md_ioctl is too big) and is more suitable as a public interface within
the kernel than the kern_ioctl interface.

Except for the initialization of the md_req from the md_ioctl
(including detection of kernel md_file pointers) and the updating
of the md_ioctl prior to return, this is a mechanical replacment
of md_ioctl and mdio with md_req and mdr.

Reviewed by:	markj, cem, kib (assorted versions)
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14704
2018-03-15 21:42:49 +00:00
Brooks Davis
b65794ad84 Move implementation of ioctls into kern_*() functions.
Move locks from outside ioctl to the individual implementations.

This is the first step of changing the implementations to act on a
kernel-internal request struct rather than on struct md_ioctl and to
removing the use of kern_ioctl in mountroot.

Reviewed by:	cem, kib, markj (prior version)
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14700
2018-03-15 18:12:55 +00:00
Brooks Davis
94598ac9f9 Restore the behavior of returning the total number of units by
unconditionally incrementing i in the loop;

Reported by:	cem
MFC with:	r330880
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14685
2018-03-15 16:37:43 +00:00
Alexander Motin
db08ef4353 Increase ABOUT FIRMWARE command timeout to 5s.
It seems default timeout of 100ms is not enough for my 2694L card,
while it was perfectly fine for others, even for full-height 2694.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2018-03-15 01:07:21 +00:00
Jung-uk Kim
8438a7a80a Merge ACPICA 20180313. 2018-03-14 23:45:48 +00:00
Jung-uk Kim
ba425ae46d Remove local definitions for _STA method in favor of ACPICA.
These macros were added in ACPICA 20051216, more than a decade ago.
2018-03-14 23:42:28 +00:00
Warner Losh
5d7fd8f726 Fix error messages in cut and pasted code.
Also, fix an unnecessary deref to get ctrlr.

Noticed by: rpokala@
Sponsored by: Netflix
2018-03-14 23:28:28 +00:00
Warner Losh
8b1e6ebe0e When tearing down a queue pair, also delete the queue entries.
The NVME standard has required in section 7.2.6, since at least 1.1,
that a clean shutdown is signalled by deleting the subission and the
completion queues before setting the shutdown bit in CC. The 1.0
standard, apparently, did not and many of the early Intel cards didn't
care. Some newer cards care, at least one whose beta firmware can
scramble the card on an unclean shutdown. Linux has done this for some
time. To make it possible to move forward with an evaluation of this
pre-release card with wonky firmware, delete the queues on the card
when we delete the qpair structures.

Sponsored by: Netflix
2018-03-14 23:01:18 +00:00
Warner Losh
d61cf64d0e Don't make the namespace devices eternal.
We'll need to delete namespaces soon, so go ahead and stop making
these devices eternal. It doesn't help much, and will be getting in
the way soon.

Sponsored by: Netflix
2018-03-14 23:01:04 +00:00
Steven Hartland
7d147b81ea Fix mps deadlock when handling panic
During shutdown mps waits for its SSU requests to complete however when
performing a reboot after handling a panic the scheduler is stopped so
getmicrotime which is used can be non-functional.

Switch to using the same method as shutdown_panic to ensure we actually
complete.

In addition reduce the timeout when RB_NOSYNC is set in howto as we expect
this to fail.

Reviewed by:	slm
MFC after:	1 week
Sponsored by:	Multiplay
Differential Revision:	https://reviews.freebsd.org/D12776
2018-03-14 21:32:23 +00:00
Brooks Davis
f287c3e4d3 Fix FSACTL_GET_NEXT_ADAPTER_FIB under 32-bit compat.
This includes FSACTL_LNX_GET_NEXT_ADAPTER_FIB.

Reviewed by:	cem
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14672
2018-03-14 21:11:41 +00:00
John Baldwin
6b02bb1aa7 Fix the check for an empty send socket buffer on a TOE TLS socket.
Compare sbavail() with the cached sb_off of already-sent data instead of
always comparing with zero.  This will correctly close the connection and
send the FIN if the socket buffer contains some previously-sent data but
no unsent data.

Reported by:	Harsh Jain @ Chelsio
Sponsored by:	Chelsio Communications
2018-03-14 20:49:51 +00:00
John Baldwin
02d2bcfaba Remove TLS-related inlines from t4_tom.h to fix iw_cxgbe(4) build.
- Remove the one use of is_tls_offload() and the function.  AIO special
  handling only needs to be disabled when a TOE socket is actively doing
  TLS offload on transmit.  The TOE socket's mode (which affects receive
  operation) doesn't matter, so remove the check for the socket's mode and
  only check if a TOE socket has TLS transmit keys configured to determine
  if an AIO write request should fall back to the normal socket handling
  instead of the TOE fast path.
- Move can_tls_offload() into t4_tls.c.  It is not used in critical paths,
  so inlining isn't that important.  Change return type to bool while here.

Sponsored by:	Chelsio Communications
2018-03-14 20:46:25 +00:00
Edward Tomasz Napierala
6960c4e135 Fix typo in a warning message.
MFC after:	2 weeks
2018-03-14 18:27:06 +00:00
Warner Losh
807e94b2c3 Implement trim collapsing in nda
When multiple trims are in the queue, collapse them as much as
possible. At present, this usually results in only a few trims being
collapsed together, but more work on that will make it possible to do
hundreds (up to some configurable max).

Sponsored by: Netflix
2018-03-14 16:44:50 +00:00
John Baldwin
1e9538d253 Support for TLS offload of TOE connections on T6 adapters.
The TOE engine in Chelsio T6 adapters supports offloading of TLS
encryption and TCP segmentation for offloaded connections.  Sockets
using TLS are required to use a set of custom socket options to upload
RX and TX keys to the NIC and to enable RX processing.  Currently
these socket options are implemented as TCP options in the vendor
specific range.  A patched OpenSSL library will be made available in a
port / package for use with the TLS TOE support.

TOE sockets can either offload both transmit and reception of TLS
records or just transmit.  TLS offload (both RX and TX) is enabled by
setting the dev.t6nex.<x>.tls sysctl to 1 and requires TOE to be
enabled on the relevant interface.  Transmit offload can be used on
any "normal" or TLS TOE socket by using the custom socket option to
program a transmit key.  This permits most TOE sockets to
transparently offload TLS when applications use a patched SSL library
(e.g. using LD_LIBRARY_PATH to request use of a patched OpenSSL
library).  Receive offload can only be used with TOE sockets using the
TLS mode.  The dev.t6nex.0.toe.tls_rx_ports sysctl can be set to a
list of TCP port numbers.  Any connection with either a local or
remote port number in that list will be created as a TLS socket rather
than a plain TOE socket.  Note that although this sysctl accepts an
arbitrary list of port numbers, the sysctl(8) tool is only able to set
sysctl nodes to a single value.  A TLS socket will hang without
receiving data if used by an application that is not using a patched
SSL library.  Thus, the tls_rx_ports node should be used with care.
For a server mostly concerned with offloading TLS transmit, this node
is not needed as plain TOE sockets will fall back to software crypto
when using an unpatched SSL library.

New per-interface statistics nodes are added giving counts of TLS
packets and payload bytes (payload bytes do not include TLS headers or
authentication tags/MACs) offloaded via the TOE engine, e.g.:

dev.cc.0.stats.rx_tls_octets: 149
dev.cc.0.stats.rx_tls_records: 13
dev.cc.0.stats.tx_tls_octets: 26501823
dev.cc.0.stats.tx_tls_records: 1620

TLS transmit work requests are constructed by a new variant of
t4_push_frames() called t4_push_tls_records() in tom/t4_tls.c.

TLS transmit work requests require a buffer containing IVs.  If the
IVs are too large to fit into the work request, a separate buffer is
allocated when constructing a work request.  This buffer is associated
with the transmit descriptor and freed when the descriptor is ACKed by
the adapter.

Received TLS frames use two new CPL messages.  The first message is a
CPL_TLS_DATA containing the decryped payload of a single TLS record.
The handler places the mbuf containing the received payload on an
mbufq in the TOE pcb.  The second message is a CPL_RX_TLS_CMP message
which includes a copy of the TLS header and indicates if there were
any errors.  The handler for this message places the TLS header into
the socket buffer followed by the saved mbuf with the payload data.
Both of these handlers are contained in tom/t4_tls.c.

A few routines were exposed from t4_cpl_io.c for use by t4_tls.c
including send_rx_credits(), a new send_rx_modulate(), and
t4_close_conn().

TLS keys for both transmit and receive are stored in onboard memory
in the NIC in the "TLS keys" memory region.

In some cases a TLS socket can hang with pending data available in the
NIC that is not delivered to the host.  As a workaround, TLS sockets
are more aggressive about sending CPL_RX_DATA_ACK messages anytime that
any data is read from a TLS socket.  In addition, a fallback timer will
periodically send CPL_RX_DATA_ACK messages to the NIC for connections
that are still in the handshake phase.  Once the connection has
finished the handshake and programmed RX keys via the socket option,
the timer is stopped.

A new function select_ulp_mode() is used to determine what sub-mode a
given TOE socket should use (plain TOE, DDP, or TLS).  The existing
set_tcpddp_ulp_mode() function has been renamed to set_ulp_mode() and
handles initialization of TLS-specific state when necessary in
addition to DDP-specific state.

Since TLS sockets do not receive individual TCP segments but always
receive full TLS records, they can receive more data than is available
in the current window (e.g. if a 16k TLS record is received but the
socket buffer is itself 16k).  To cope with this, just drop the window
to 0 when this happens, but track the overage and "eat" the overage as
it is read from the socket buffer not opening the window (or adding
rx_credits) for the overage bytes.

Reviewed by:	np (earlier version)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D14529
2018-03-13 23:05:51 +00:00
John Baldwin
9689995d23 Simplify error handling in t4_tom.ko module loading.
- Change t4_ddp_mod_load() to return void instead of always returning
  success.  This avoids having to pretend to have proper support for
  unloading when only part of t4_tom_mod_load() has run.
- If t4_register_uld() fails, don't invoke t4_tom_mod_unload() directly.
  The module handling code in the kernel invokes MOD_UNLOAD on a module
  whose MOD_LOAD fails with an error already.

Reviewed by:	np (part of a larger patch)
MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-03-13 21:42:38 +00:00
Brooks Davis
8b9f77a14c Don't overflow the kernel struct mdio in the MDIOCLIST ioctl.
Always terminate the list with -1 and document the ioctl behavior.
This preserves existing behavior as seen from userspace with the
addition of the unconditional termination which will not be seen by
working consumers of MDIOCLIST.

Because this ioctl can only be performed by root (in default
configurations) and is not used in the base system this bug is not
deemed to warrant either a security advisory or an eratta notice.

Reviewed by:	kib
Obtained from:	CheriBSD
Discussed with:	security-officer (gordon)
MFC after:	3 days
Security:	kernel heap buffer overflow
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14685
2018-03-13 20:39:06 +00:00
Brooks Davis
8037cdcd9a Fix ISP_FC_LIP and ISP_RESCAN on big-endian 64-bit systems.
For _IO() ioctls, addr is a pointer to uap->data which is a caddr_t.
When the caddr_t stores an int, dereferencing addr as an (int *) results
in truncation on little-endian 64-bit systems and corruption (owing to
extracting top bits) on big-endian 64-bit systems. In practice the
value of chan was probably always zero on systems of the latter type as
all such FreeBSD platforms use a register-based calling convention.

Reviewed by:	mav
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14673
2018-03-13 19:56:10 +00:00
Kyle Evans
92f1731bf3 Correct minor typo in comment, efi_dmcap -> efi_tmcap 2018-03-13 15:02:46 +00:00
Kyle Evans
8521b4a9df efirtc: Pass a dummy tmcap pointer to efi_get_time_locked
As noted in the comment, UEFI spec claims the capabilities pointer is
optional, but some implementations will choke and attempt to dereference it
without checking. This specific problem was found on a Lenovo Thinkpad X220
that would panic in efirtc_identify.
2018-03-13 15:01:23 +00:00
Roger Pau Monné
c2272faa06 vt_vga: check if VGA is available from ACPI FADT table
On x86 the IA-PC Boot Flags in the FADT can signal whether VGA is
available or not.

Sponsored by:		Citrix systems R&D
Reviewed by:		marcel
Differential revision:	https://reviews.freebsd.org/D14397
2018-03-13 09:38:53 +00:00
Toomas Soome
a5b0fd9ca9 e1000g: this statement may fall through
The gcc 7 does check for switch statement fall through cases, and if legit,
such complaint can besilenced by /* FALLTHROUGH */ comment. Unfortunately
such comment is quite limited, but will still notify the reader.

This patch is backport from illumos, see
https://www.illumos.org/rb/r/941/

Reviewed by:	eadler
Differential Revision:	https://reviews.freebsd.org/D14663
2018-03-12 17:05:53 +00:00
Alexander Motin
01c1be35e0 Print fuses and fna fields in identify data.
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2018-03-12 16:31:25 +00:00
Scott Long
cf6ea6f27a Implement a sysctl to dump in-flight I/O state for debugging. The tool to
parse it will be committed in a separate action.

Sponsored by:	Netflix
2018-03-12 05:02:22 +00:00
Alexander Motin
6b1a96b16b Add new opcodes and statuses from NVMe 1.3a.
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2018-03-11 06:30:09 +00:00
Alexander Motin
3fa5467a06 Add new identify data structures fields from NVMe 1.3a.
Some of them are already supported by existing hardware, so reporting
them `nvmecontrol identify` can be useful.
2018-03-11 05:09:02 +00:00
Emmanuel Vadot
82ce05526b extres/regulators: Add sysctls for regulators
For each regulators create an hw.regulator.<regname>. :
uvolt: Current value
always_on: 1 If the reg is always on
boot_on: 1 If the reg is set at boot time
enable_cnt: Number of consumer(s)
enable_delay: Delay before enabling the regulator
ramp_delay: The Ramp delay
max_uamp: The maximum value of the regulator in uAmps
min_uamp: The minimal value of the regulator in uAmps
max_uvolt: The maximum value of the regulator in uVolts
min_uvolt: The minimal value of the regulator in uVolts

Reviewed by:	ian
Differential Revision:	https://reviews.freebsd.org/D14578
2018-03-11 04:37:05 +00:00
Andriy Voskoboinyk
cf268a8302 otus(4): check mcast / mgt / ucast rates during Tx descriptor setup
These parameters may be changed via ifconfig(8); by default,
mgt / mcast rates are lowest possible and ucast rate is not set
(matches previous configuration).

While here, store some variables locally for better readability.
2018-03-11 00:38:08 +00:00
Andriy Voskoboinyk
8f1ef22fca rtwn(4): reset Tx power values before calling get_txpower()
for RTL8192C / RTL8188E (like it is done for other chipsets).
2018-03-10 23:47:03 +00:00
Andriy Voskoboinyk
63a55eab9b usb/wlan/*: properly include "opt_wlan.h" into all drivers
Without it driver cannot be loaded when wlan(4) module is built with
'options IEEE80211_DEBUG_REFCNT'.
2018-03-10 23:16:24 +00:00
Andriy Voskoboinyk
2e0d057248 run(4): drop few unused variables.
Found by: Clang static analyzer
2018-03-10 22:52:39 +00:00
Edward Tomasz Napierala
8ff2372a2c Check for duplicates when modifying an iSCSI session. Previously we did
this check on open, but "iscsictl -M", or an iSCSI redirect received by
iscsid(8) could end up with two sessions with the same target name and
portal.

MFC after:	2 weeks
2018-03-10 14:21:37 +00:00
Conrad Meyer
918622dbc0 mlx5(4): Remove redundant declaration of mlx5_enter_error_state
Broken in r330644.

Sponsored by:	Dell EMC Isilon
2018-03-10 00:59:48 +00:00
Andriy Voskoboinyk
d1b671061b net80211: wrap protection frame allocation into ieee80211_alloc_prot()
Move copy-pasted code for RTS/CTS frame allocation into net80211.
While here, add stat / debug message for allocation failures
(copied from run(4)) + return error here in bwn(4).

Reviewed by:	adrian
Differential Revision:	https://reviews.freebsd.org/D14628
2018-03-09 11:33:56 +00:00
Emmanuel Vadot
d597cfd520 Fix build when option MMCCAM is defined. 2018-03-08 22:49:36 +00:00
Konstantin Belousov
23fda0ad66 Remove unused variable.
Sponsored by:	The FreeBSD Foundation
2018-03-08 22:04:54 +00:00
Konstantin Belousov
2cec152827 Make mlx5 compilable on ILP32 arches.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-03-08 22:03:43 +00:00
Ed Maste
a1ea91a21f bktr: correct Japan IF frequency
PR:		36451
Submitted by:	Hijiri Umemoto <hijiri at umemoto.org>
MFC after:	2 weeks
2018-03-08 19:24:10 +00:00
Ed Maste
95320b99a5 asmc: update temperature sensor name/description
PR:		225911
Submitted by:	Trev <fbsdbugs4 at sentry.org>
MFC after:	1 week
2018-03-08 18:52:47 +00:00
Andriy Voskoboinyk
222a2f1de9 iwi(4): factor out rateset setup into iwi_set_rateset().
No functional change intended.
2018-03-08 18:42:23 +00:00
Hans Petter Selasky
50e3ba10e8 Set correct SL in completion for RoCE in mlx5ib(4).
There is a difference when parsing a completion entry between Ethernet
and IB ports. When link layer is Ethernet the bits describe the type of
L3 header in the packet. In the case when link layer is Ethernet and VLAN
header is present the value of SL is equal to the 3 UP bits in the VLAN
header. If VLAN header is not present then the SL is undefined and consumer
of the completion should check if IB_WC_WITH_VLAN is set.

While that, this patch also fills the vlan_id field in the completion if
present.

linux commit 12f8fedef2ec94c783f929126b20440a01512c14

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 16:27:31 +00:00
Hans Petter Selasky
0285276c92 Add call to setup firmware data dump structure during device load in
mlx5core.

Do not consider the inability to create a firmware dump fatal, but
inform about the situation and allow the driver to attach. The device
might not implement the needed VSC, or we might not know the layout of
the registers map. In either case, only firmware dump functionality is
limited, the network operations should be fine.

Submitted by:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 16:19:01 +00:00
Hans Petter Selasky
9cb83c4689 Avoid more LFENCE/SFENCe on x86 in mlx5en(4),
by using the FreeBSD native fences.

Submitted by:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:58:30 +00:00
Hans Petter Selasky
7d69d339ad Fix mlx5en(4) driver to properly call m_defrag().
When the mlx5en(4) driver was converted to using BUSDMA(9) the call to
m_defrag() was moved after the part of the TX routine that strips the
header from the mbuf chain. Before it called m_defrag it first trimmed
off the now-empty mbufs from the start of the chain. This has the side
effect of also removing the head of the chain that has M_PKTHDR set.
m_defrag() will not defrag a chain that does not have M_PKTHDR set,
thus it was effectively never defragging the mbuf chains.

As it turns out, trimming the mbufs in this fashion is unnecessary since
the call to bus_dmamap_load_mbuf_sg doesn't map empty mbufs anyway, so
remove it.

Differential Revision:	https://reviews.freebsd.org/D12050
Submitted by:	mjoras@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:53:04 +00:00
Hans Petter Selasky
1eb09ad434 Use vport rather than physical-port MTU in mlx5en(4).
Set and report vport MTU rather than physical MTU,
The driver will set both vport and physical port mtu
and will rely on the query of vport mtu.

SRIOV VFs have to report their MTU to their vport manager (PF),
and this will allow them to work with any MTU they need
without failing the request.

Also for some cases where the PF is not a port owner, PF can
work with MTU less than the physical port mtu if set physical
port mtu didn't take effect.

Based on Linux upstream commit:
cd255efff9baadd654d6160e52d17ae7c568c9d3

Submitted by:	Meny Yossefi <menyy@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:47:17 +00:00
Hans Petter Selasky
7c22ae80d2 Use the device unit number for naming the ifnet interface in mlx5en(4).
Currently the ifnet interface is named mceX, where X is a monotonically
incremented value. If the device is reset due to a fatal error, then the
interface name will change.  Using the device unit number will keep the
naming consistent across the reset logic.

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:43:41 +00:00
Hans Petter Selasky
1a872967ad Remove duplicate prototypes.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:37:09 +00:00
Hans Petter Selasky
e808190a59 Add kernel and userspace code to dump the firmware state of supported
ConnectX-4/5 devices in mlx5core.

The dump is obtained by reading a predefined register map from the
non-destructive crspace, accessible by the vendor-specific PCIe
capability (VSC). The dump is stored in preallocated kernel memory and
managed by the mlx5tool(8), which communicates with the driver using a
character device node.

The utility allows to store the dump in format
    <address> <value>
into a file, to reset the dump content, and to manually initiate the
dump.

A call to mlx5_fwdump() should be added at the places where a dump
must be fetched automatically. The most likely place is right before a
firmware reset request.

Submitted by:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:21:56 +00:00
Hans Petter Selasky
4b95c6659a Add vendor specific capability interface support in mlx5core.
Add the ability to access the vendor specific space gateway in order
to support reading and writing data into the different configuration
domains.

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 11:59:47 +00:00
Hans Petter Selasky
fba294620c Use device_printf() instead of printf() when printing warnings and errors
to dmesg(8) in mlx5core.

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 11:58:27 +00:00
Hans Petter Selasky
10b0804509 Add support for per priority flow control, PFC, to mlx5en(4).
Add support for PFC and implement reading the per priority statistics
using the sysctl(8) interface. PFC is used together with VLAN priority
and can be enabled and disabled on a per priority basis.

Global pause frames and PFC are incompatible features and surrounding
logic has been added to warn the user about misconfiguration.

Update relevant mlx5core APIs for PFC configuration.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 11:40:39 +00:00
Hans Petter Selasky
118063fb70 Add support for explicit congestion notification, ECN, to mlx5ib(4).
ECN configuration and statistics is available through a set of sysctl(8)
nodes under sys.class.infiniband.mlx5_X.cong . The ECN configuration
nodes can also be used as loader tunables.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 11:23:14 +00:00
Hans Petter Selasky
788333d9a6 Use the autogenerated interface file for all commands in mlx5core.
This patch accumulates the following Linux commits:
- 90b3e38d048f09b22fb50bcd460cea65fd00b2d7
  mlx5_core: Modify CQ moderation parameters
- 09a7d9eca1a6cf5eb4f9abfdf8914db9dbd96f08
  mlx5_core: QP/XRCD commands via mlx5 ifc
- 1a412fb1caa2c1b77719ccb5ed8b0c3c2bc65da7
  mlx5_core: Modify QP commands via mlx5 ifc
- ec22eb53106be1472ba6573dc900943f52f8fd1e
  mlx5_core: MKey/PSV commands via mlx5 ifc
- 73b626c182dff06867ceba996a819e8372c9b2ce
  mlx5_core: EQ commands via mlx5 ifc
- 20ed51c643b6296789a48adc3bc2cc875a1612cf
  mlx5_core: Access register and MAD IFC commands via mlx5 ifc
- a533ed5e179cd15512d40282617909d3482a771c
  mlx5_core: Pages management commands via mlx5 ifc
- b8a4ddb2e8f44f872fb93bbda2d541b27079fd2b
  mlx5_core: Add MLX5_ARRAY_SET64 to fix BUILD_BUG_ON
- af1ba291c5e498973cc325c501dd8da80b234571
  mlx5_core: Refactor internal SRQ API
- b06e7de8a9d8d1d540ec122bbdf2face2a211634
  mlx5_core: Refactor device capability function
- c4f287c4a6ac489c18afc4acc4353141a8c53070
  mlx5_core: Unify and improve command interface

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 10:43:42 +00:00
Hans Petter Selasky
ca55159467 Fix race between PCI error handlers and health work in mlx5core.
linux commit 05ac2c0b7438ea08c5d54b48797acf9b22cb2f6f

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 09:58:41 +00:00
Hans Petter Selasky
7053deeb3d Avoid calling sleeping function from the health poll thread in mlx5core.
linux commit c1d4d2e92ad670168a17a57dfa182a5a5baa72d4

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 09:51:33 +00:00
Hans Petter Selasky
a2485fe5a6 Updates for PCI and health monitor recovery in mlx5core.
This patch accumulates the following Linux commits:

mlx5_health.c
- 78ccb25861d76a8fc5c678d762180e6918834200
  mlx5_core: Fix wrong name in struct
- 171bb2c560f45c0427ca3776a4c8f4e26e559400
  mlx5_core: Update health syndromes
- 0144a95e2ad53a40c62148f44fb0c1f9d2a0d1e9
  mlx5_core: Use accessor functions to read from device memory
- ac6ea6e81a80172612e0c9ef93720f371b198918
  mlx5_core: Use private health thread for each device
- fd76ee4da55abb21babfc69310d321b9cb9a32e0
  mlx5_core: Fix internal error detection conditions
- 2241007b3d783cbdbaa78c30bdb1994278b6f9b9
  mlx5: Clear health sick bit when starting health poll
- 712bfef60912d91033cb25739f7444d5b8d8c59f
  mlx5: Fix version printout in case of health issue
- 89d44f0a6c732db23b219be708e2fe1e03ee4842
  mlx5_core: Add pci error handlers to mlx5_core driver

mlx5_cmd.c
- be87544de8df2b1eb34bcb5e32691287d96f9ec4
  mlx5_core: Fix async commands return code
- a31208b1e11df334d443ec8cace7636150bb8ce2
  mlx5_core: New init and exit flow for mlx5_core
- 020446e01eebc9dbe7eda038e570ab9c7ab13586
  mlx5_core: Prepare cmd interface to system errors handling
- 89d44f0a6c732db23b219be708e2fe1e03ee4842
  mlx5_core: Add pci error handlers to mlx5_core driver
- 0d834442cc247c7b3f3bd6019512ae03e96dd99a
  mlx5: Fix teardown errors that happen in pci error handler

mlx5_main.c
- 5fc7197d3a256d9c5de3134870304b24892a4908
  mlx5: Add pci shutdown callback

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 09:47:09 +00:00
Nathan Whitehorn
f9edb09d70 Move the powerpc64 direct map base address from zero to high memory. This
accomplishes a few things:
- Makes NULL an invalid address in the kernel, which is useful for catching
  bugs.
- Lays groundwork for radix-tree translation on POWER9, which requires the
  direct map be at high memory.
- Similarly lays groundwork for a direct map on 64-bit Book-E.

The new base address is chosen as the base of the fourth radix quadrant
(the minimum kernel address in this translation mode) and because all
supported CPUs ignore at least the first two bits of addresses in real
mode, allowing direct-map addresses to be used in real-mode handlers.
This is required by Linux and is part of the architecture standard
starting in POWER ISA 3, so can be relied upon.

Reviewed by:	jhibbits, Breno Leitao
Differential Revision:	D14499
2018-03-07 17:08:07 +00:00
Hans Petter Selasky
2e9c3a4f99 Implement priority to traffic class mapping in mlx5core.
Add support for mapping priority to traffic class via sysctl

Submitted by:	Slava Shwartsman <slavash@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 15:23:07 +00:00
Hans Petter Selasky
cfc9c386eb Implement rate limit per traffic class in mlx5core.
Add support for rate limiting traffic class via sysctl.

Submitted by:	Slava Shwartsman <slavash@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 15:17:36 +00:00
Hans Petter Selasky
d91421515a Implement missing query for current port rate in mlx5ib(4).
- Factor out port speed definitions into new port.h header file,
  similarly as done in Linux upstream.
- Correct two existing port speed definitions in mlx5en according to
  Linux upstream.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 15:03:11 +00:00
Hans Petter Selasky
ecb4fcc48e Add log message for unsupported QSFPs in mlx5core.
Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:51:50 +00:00
Hans Petter Selasky
2289559114 Make sure default VNET is set when adding a new interface in mlx5core.
Adding an interface might be done outside the device_attach() routine
and will then cause a panic, due to the VNET not being set.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:49:27 +00:00
Hans Petter Selasky
11546d068d Add timeout handle to commands with callback in mlx5core.
The current implementation does not handle timeout in case of command
with callback request, and this can lead to deadlock if the command
doesn't get firmware response. Add delayed callback timeout work
before posting the command to firmware. In case of real firmware
command completion we will cancel the delayed work. In case of
firmware command timeout the callback timeout handler will be called
and it will simulate firmware completion with timeout error.

linux commit 65ee67084589c1783a74b4a4a5db38d7264ec8b5

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:41:29 +00:00
Hans Petter Selasky
2327a753b9 Fix potential deadlock in command mode change in mlx5core.
Call command completion handler in case of timeout when working in
interrupts mode. Avoid flushing the commands workqueue after acquiring
the semaphores to prevent a potential deadlock.

linux commit commit 9cba4ebcf374c3772f6eb61f2d065294b2451b49

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:35:28 +00:00
Hans Petter Selasky
ee1b5c5811 Use a macro in mlx5_command_str() instead of copying OP name.
linux commit 42ca502e179d0654ef441333a9d0f35c948734f3

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:29:30 +00:00
Hans Petter Selasky
2f2d3c0cf3 Disable unsupported disassociate ucontext functionality in mlx5ib(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:03:31 +00:00
Hans Petter Selasky
87e303056f Bump version information in mlx4ib(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 13:59:46 +00:00
Hans Petter Selasky
c9a80d0289 The mlx4ib(4) should not be loaded before the ibcore is initialized.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 13:58:58 +00:00
Hans Petter Selasky
54b55cbdd9 Disable unsupported disassociate ucontext functionality in mlx4ib(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 13:57:32 +00:00
Andrew Turner
e0fe10600a Create macros for the ACPI interrupt cross references. This is considered a
band aid until a better solution to find the correct interrupt controller
can be found.

While here fix one place in the GICv3 ITS driver where the offset wasn't
correctly applied.

Sponsored by:	DARPA, AFRL
Sponsored by:	Cavium (Hardware)
2018-03-07 13:16:03 +00:00
Andrew Turner
08fdb4ce38 Add an acpi attachment to the pci_host_generic driver and have the ACPI
bus provide it with its needed memory resources.

This allows us to use PCIe on the ThunderX2 and, with a previous version
of the patch, on the SoftIron 3000 with ACPI.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
Sponsored by:	DARPA, AFRL
Sponsored by:	Cavium (Hardware)
Differential Revision:	https://reviews.freebsd.org/D8767
2018-03-07 10:47:27 +00:00
Oleksandr Tymoshenko
b3e8ee5d05 [ig4] Add support for i2c controllers on Skylake and Kaby Lake
This was tested by Ben on  HP Chromebook 13 G1 with a
Skylake CPU and Sunrise Point-LP I2C controller and by me on
Minnowboard Turbot with Atom E3826 (formerly Bay Trail)

Submitted by:	Ben Pye <ben@curlybracket.co.uk>
Reviewed by:	gonzo
Obtained from:	DragonflyBSD (a4549657 by Imre Vadász)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D13654
2018-03-06 23:39:43 +00:00
Conrad Meyer
015ab09843 psm(4): Initialize variables before use
dxp/dyp could have been used uninitialized in the subsequent debugging log
invocation.

Reported by:	Coverity
Sponsored by:	Dell EMC Isilon
2018-03-06 20:31:14 +00:00
Andrey V. Elsukov
c2a5dc6cd7 Add mapping for several ethernet types used by Linux to FreeBSD
ethernet types.

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D14594
2018-03-06 12:58:00 +00:00
Ian Lepore
b138780b0c Build the ds1672 driver as a module. Add a detach() to unregister the rtc. 2018-03-06 02:30:34 +00:00
Ian Lepore
18029749f4 Fix a paste-o that broke the build. There is no softc pointer here, just
use the dev arg.

Reported by:	Jonathan Looney <jonlooney@gmail.com>
Pointy hat:	ian@
2018-03-06 02:21:41 +00:00
Hans Petter Selasky
1456d97c01 Optimize ibcore RoCE address handle creation from user-space.
Creating a UD address handle from user-space or from the kernel-space,
when the link layer is ethernet, requires resolving the remote L3
address into a L2 address. Doing this from the kernel is easy because
the required ARP(IPv4) and ND6(IPv6) address resolving APIs are readily
available. In userspace such an interface does not exist and kernel
help is required.

It should be noted that in an IP-based GID environment, the GID itself
does not contain all the information needed to resolve the destination
IP address. For example information like VLAN ID and SCOPE ID, is not
part of the GID and must be fetched from the GID attributes. Therefore
a source GID should always be referred to as a GID index. Instead of
going through various racy steps to obtain information about the
GID attributes from user-space, this is now all done by the kernel.

This patch optimises the L3 to L2 address resolving using the existing
create address handle uverbs interface, retrieving back the L2 address
as an additional user-space information structure.

This commit combines the following Linux upstream commits:

IB/core: Let create_ah return extended response to user
IB/core: Change ib_resolve_eth_dmac to use it in create AH
IB/mlx5: Make create/destroy_ah available to userspace
IB/mlx5: Use kernel driver to help userspace create ah
IB/mlx5: Report that device has udata response in create_ah

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-05 14:34:52 +00:00
Ian Lepore
6d27b68a1a Switch to the new bcd_clocktime conversion routines, and add calls to the
new clock_dbgprint_xxx() functions.
2018-03-05 00:43:53 +00:00
Ian Lepore
bbff059175 Switch to the new bcd_clocktime conversion routines, and add calls to the
new clock_dbgprint_xxx() functions.
2018-03-05 00:15:56 +00:00
Ian Lepore
29e14eea8c Switch to the new bcd_clocktime conversion routines, and add calls to the
new clock_dbgprint_xxx() functions.
2018-03-04 23:39:40 +00:00
Ian Lepore
89ba361fcc The year is stored in a single byte in sram, in binary format, as a count
of years since the century, so strip the century out when converting to or
from bcd_clocktime format (the conversion routines will infer century by
pivoting on 70).
2018-03-04 21:58:32 +00:00
Ian Lepore
5deff57b6f Convert to the new(ish) bcd_clocktime conversion functions, add calls to
the new debug output functions, and when setting the clock, propagate the
timespec nsecs to the 1/100ths register.
2018-03-04 21:04:30 +00:00
Ian Lepore
02641ce942 Add calls to the new clock_dbgprint_xxx() functions. Also, stop applying
a local .5 second adjustment to the time, since that is now done by the
code in subr_rtc.c.
2018-03-04 19:32:52 +00:00
Ian Lepore
fa9d44f62e Add calls to the new clock_dbgprint_xxx() functions. 2018-03-04 19:26:47 +00:00