Commit Graph

121123 Commits

Author SHA1 Message Date
Warner Losh
8a3de7bc34 Allow NULL ccb to cam_iosched_bio_complete
When the ccb is NULL to cam_iosched_bio_complete, just update the
other statistics, but not the time. If many operations are collapsed
together, this is needed to keep stats properly for the grouped bp.
This should fix trim accounting.

Sponsored by: Netflix
2018-03-14 16:44:16 +00:00
Nathan Whitehorn
94f513c8db The expression (aim | fdt) is always true on PowerPC. The last PowerPC
platform that can run without a device tree (PS3) still uses the OF_*()
functions to check if one exists and OF_* is used unconditionally in
core parts of the system like powerpc/machdep.c. Reflect this reality
in files.powerpc, for example by changing occurrences of aim | fdt to
standard.
2018-03-14 16:16:25 +00:00
Ed Maste
7b194b3d3b Remove stray ; at end of linux_vdso_deinstall() 2018-03-14 13:20:36 +00:00
Wojciech Macek
22eedd96c7 PowerNV: Fix I2C to compile if FDT is disabled
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-03-14 09:20:03 +00:00
Conrad Meyer
052d3c1290 Update to Zstandard 1.3.3
Includes patch to conditionalize use of __builtin_clz(ll) on __has_builtin().
The issue is tracked upstream at https://github.com/facebook/zstd/pull/884 .
Otherwise, these are vanilla Zstandard 1.3.3 files.

Note that the 1.3.4 release should be due out soon.

Sponsored by:	Dell EMC Isilon
2018-03-14 03:00:17 +00:00
Warner Losh
f8f471cf5f We need opt_compat.h after r330819 and 330820.
Add opt_compat.h to fix the stand-alone build case.

Sponsored by: Netflix.
2018-03-13 23:36:15 +00:00
John Baldwin
1e9538d253 Support for TLS offload of TOE connections on T6 adapters.
The TOE engine in Chelsio T6 adapters supports offloading of TLS
encryption and TCP segmentation for offloaded connections.  Sockets
using TLS are required to use a set of custom socket options to upload
RX and TX keys to the NIC and to enable RX processing.  Currently
these socket options are implemented as TCP options in the vendor
specific range.  A patched OpenSSL library will be made available in a
port / package for use with the TLS TOE support.

TOE sockets can either offload both transmit and reception of TLS
records or just transmit.  TLS offload (both RX and TX) is enabled by
setting the dev.t6nex.<x>.tls sysctl to 1 and requires TOE to be
enabled on the relevant interface.  Transmit offload can be used on
any "normal" or TLS TOE socket by using the custom socket option to
program a transmit key.  This permits most TOE sockets to
transparently offload TLS when applications use a patched SSL library
(e.g. using LD_LIBRARY_PATH to request use of a patched OpenSSL
library).  Receive offload can only be used with TOE sockets using the
TLS mode.  The dev.t6nex.0.toe.tls_rx_ports sysctl can be set to a
list of TCP port numbers.  Any connection with either a local or
remote port number in that list will be created as a TLS socket rather
than a plain TOE socket.  Note that although this sysctl accepts an
arbitrary list of port numbers, the sysctl(8) tool is only able to set
sysctl nodes to a single value.  A TLS socket will hang without
receiving data if used by an application that is not using a patched
SSL library.  Thus, the tls_rx_ports node should be used with care.
For a server mostly concerned with offloading TLS transmit, this node
is not needed as plain TOE sockets will fall back to software crypto
when using an unpatched SSL library.

New per-interface statistics nodes are added giving counts of TLS
packets and payload bytes (payload bytes do not include TLS headers or
authentication tags/MACs) offloaded via the TOE engine, e.g.:

dev.cc.0.stats.rx_tls_octets: 149
dev.cc.0.stats.rx_tls_records: 13
dev.cc.0.stats.tx_tls_octets: 26501823
dev.cc.0.stats.tx_tls_records: 1620

TLS transmit work requests are constructed by a new variant of
t4_push_frames() called t4_push_tls_records() in tom/t4_tls.c.

TLS transmit work requests require a buffer containing IVs.  If the
IVs are too large to fit into the work request, a separate buffer is
allocated when constructing a work request.  This buffer is associated
with the transmit descriptor and freed when the descriptor is ACKed by
the adapter.

Received TLS frames use two new CPL messages.  The first message is a
CPL_TLS_DATA containing the decryped payload of a single TLS record.
The handler places the mbuf containing the received payload on an
mbufq in the TOE pcb.  The second message is a CPL_RX_TLS_CMP message
which includes a copy of the TLS header and indicates if there were
any errors.  The handler for this message places the TLS header into
the socket buffer followed by the saved mbuf with the payload data.
Both of these handlers are contained in tom/t4_tls.c.

A few routines were exposed from t4_cpl_io.c for use by t4_tls.c
including send_rx_credits(), a new send_rx_modulate(), and
t4_close_conn().

TLS keys for both transmit and receive are stored in onboard memory
in the NIC in the "TLS keys" memory region.

In some cases a TLS socket can hang with pending data available in the
NIC that is not delivered to the host.  As a workaround, TLS sockets
are more aggressive about sending CPL_RX_DATA_ACK messages anytime that
any data is read from a TLS socket.  In addition, a fallback timer will
periodically send CPL_RX_DATA_ACK messages to the NIC for connections
that are still in the handshake phase.  Once the connection has
finished the handshake and programmed RX keys via the socket option,
the timer is stopped.

A new function select_ulp_mode() is used to determine what sub-mode a
given TOE socket should use (plain TOE, DDP, or TLS).  The existing
set_tcpddp_ulp_mode() function has been renamed to set_ulp_mode() and
handles initialization of TLS-specific state when necessary in
addition to DDP-specific state.

Since TLS sockets do not receive individual TCP segments but always
receive full TLS records, they can receive more data than is available
in the current window (e.g. if a 16k TLS record is received but the
socket buffer is itself 16k).  To cope with this, just drop the window
to 0 when this happens, but track the overage and "eat" the overage as
it is read from the socket buffer not opening the window (or adding
rx_credits) for the overage bytes.

Reviewed by:	np (earlier version)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D14529
2018-03-13 23:05:51 +00:00
John Baldwin
9689995d23 Simplify error handling in t4_tom.ko module loading.
- Change t4_ddp_mod_load() to return void instead of always returning
  success.  This avoids having to pretend to have proper support for
  unloading when only part of t4_tom_mod_load() has run.
- If t4_register_uld() fails, don't invoke t4_tom_mod_unload() directly.
  The module handling code in the kernel invokes MOD_UNLOAD on a module
  whose MOD_LOAD fails with an error already.

Reviewed by:	np (part of a larger patch)
MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-03-13 21:42:38 +00:00
Brooks Davis
c92c85ffeb md_pad is used by MDIOCLIST and not available for future use.
MFC after:	1 week
2018-03-13 20:54:18 +00:00
Brooks Davis
8b9f77a14c Don't overflow the kernel struct mdio in the MDIOCLIST ioctl.
Always terminate the list with -1 and document the ioctl behavior.
This preserves existing behavior as seen from userspace with the
addition of the unconditional termination which will not be seen by
working consumers of MDIOCLIST.

Because this ioctl can only be performed by root (in default
configurations) and is not used in the base system this bug is not
deemed to warrant either a security advisory or an eratta notice.

Reviewed by:	kib
Obtained from:	CheriBSD
Discussed with:	security-officer (gordon)
MFC after:	3 days
Security:	kernel heap buffer overflow
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14685
2018-03-13 20:39:06 +00:00
Brooks Davis
8037cdcd9a Fix ISP_FC_LIP and ISP_RESCAN on big-endian 64-bit systems.
For _IO() ioctls, addr is a pointer to uap->data which is a caddr_t.
When the caddr_t stores an int, dereferencing addr as an (int *) results
in truncation on little-endian 64-bit systems and corruption (owing to
extracting top bits) on big-endian 64-bit systems. In practice the
value of chan was probably always zero on systems of the latter type as
all such FreeBSD platforms use a register-based calling convention.

Reviewed by:	mav
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14673
2018-03-13 19:56:10 +00:00
Konstantin Belousov
741e1c9196 Revert the chunk from r330410 in vm_page_reclaim_run().
There, the pages freed might be managed but the page's lock is not
owned.  For KPI correctness, the page lock is requried around the call
to vm_page_free_prep(), which is asserted.  Reclaim loop already did
the work which could be done by vm_page_free_prep(), so the lock is
not needed and the only consequence of not owning it is the assert
trigger.

Instead of adding the locking to satisfy the assert, revert to the
code that calls vm_page_free_phys() directly.

Reported by:	pho
Discussed with:	jeff
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-03-13 18:27:23 +00:00
Nathan Whitehorn
9b0ec025d4 Restore missing temporary variable, deleted by accident in r330845. This
unbreaks the ppc32 AIM build.

Reported by:	jhibbits
2018-03-13 18:24:21 +00:00
Kyle Evans
63ee68c220 EFIRT: SetVirtualAddressMap with 1:1 mapping after exiting boot services
This fixes a problem encountered on the Lenovo Thinkpad X220/Yoga 11e where
runtime services would try to inexplicably jump to other parts of memory
where it shouldn't be when attempting to enumerate EFI vars, causing a
panic.

The virtual mapping is enabled by default and can be disabled by setting
efi_disable_vmap in loader.conf(5).

Reviewed by:	kib (earlier version)
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D14677
2018-03-13 17:10:52 +00:00
Ed Maste
a95659f75f Use C99 boolean type for translate_osrel
Migrate to modern types before creating MD Linuxolator bits for new
architectures.

Reviewed by:	cem
Sponsored by:	Turing Robotic Industries Inc.
Differential Revision:	https://reviews.freebsd.org/D14676
2018-03-13 16:40:29 +00:00
Nathan Whitehorn
8864f35942 Execute PowerPC64/AIM kernel from direct map region when possible.
When the kernel can be in real mode in early boot, we can execute from
high addresses aliased to the kernel's physical memory. If that high
address has the first two bits set to 1 (0xc...), those addresses will
automatically become part of the direct map. This reduces page table
pressure from the kernel and it sets up the kernel to be used with
radix translation, for which it has to be up here.

This is accomplished by exploiting the fact that all PowerPC kernels are
built as position-independent executables and relocate themselves
on start. Before this patch, the kernel runs at 1:1 VA:PA, but that
VA/PA is random and set by the bootloader. Very early, it processes
its ELF relocations to operate wherever it happens to find itself.
This patch uses that mechanism to re-enter and re-relocate the kernel
a second time witha new base address set up in the early parts of
powerpc_init().

Reviewed by:	jhibbits
Differential Revision:	D14647
2018-03-13 15:03:58 +00:00
Kyle Evans
92f1731bf3 Correct minor typo in comment, efi_dmcap -> efi_tmcap 2018-03-13 15:02:46 +00:00
Kyle Evans
8521b4a9df efirtc: Pass a dummy tmcap pointer to efi_get_time_locked
As noted in the comment, UEFI spec claims the capabilities pointer is
optional, but some implementations will choke and attempt to dereference it
without checking. This specific problem was found on a Lenovo Thinkpad X220
that would panic in efirtc_identify.
2018-03-13 15:01:23 +00:00
Ed Maste
b7feabf906 Use C99 designated initializers for struct execsw
It it makes use slightly more clear and facilitates grepping.
2018-03-13 13:09:10 +00:00
Roger Pau Monné
4a6d4e7b58 at_rtc: check in ACPI FADT boot flags if the RTC is present
Or else disable the device. Note that the detection can be bypassed by
setting the hw.atrtc.enable option in the loader configuration file.
More information can be found on atrtc(4).

Sponsored by:		Citrix Systems R&D
Reviewed by:		ian
Differential revision:	https://reviews.freebsd.org/D14399
2018-03-13 09:42:33 +00:00
Roger Pau Monné
c2272faa06 vt_vga: check if VGA is available from ACPI FADT table
On x86 the IA-PC Boot Flags in the FADT can signal whether VGA is
available or not.

Sponsored by:		Citrix systems R&D
Reviewed by:		marcel
Differential revision:	https://reviews.freebsd.org/D14397
2018-03-13 09:38:53 +00:00
Ed Maste
4ba257591b Apply some style(9) to Linuxulator linux_sysvec.c comments 2018-03-13 00:40:05 +00:00
Ed Maste
644055e74e imgact_linux.c: use standard indentation
Sponsored by:	Turing Robotic Industries Inc.
2018-03-12 23:28:25 +00:00
Brooks Davis
467e627672 Use the stack for temporary storage in OTIOCCONS.
The old code used the thread's pcb via the uap->data pointer.

Reviewed by:	ed
Approved by:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14674
2018-03-12 23:04:42 +00:00
Brooks Davis
405b67a225 Reject ioctls to SCSI enclosures from 32-bit compat processes.
The ioctl objects contain pointers and require translation and some
refactoring of the infrastructure to work. For now prevent opertion
on garbage values. This is very slightly overbroad in that ENCIOC_INIT
is safe.

Reviewed by:	imp, kib
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14671
2018-03-12 23:02:01 +00:00
Brooks Davis
871dc9833b Reject CAMIOGET and CAMIOQUEUE ioctl's on pass(4) in 32-bit compat mode.
These take a union ccb argument which is full of kernel pointers.
Substantial translation efforts would be required to make this work.
By rejecting the request we avoid processing or returning entierly
wrong data.

Reviewed by:	imp, ken, markj, cem
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14654
2018-03-12 22:58:07 +00:00
Brooks Davis
97519ff698 MIPS: Implement fue*word* and casueword* in assembly.
Remove NO_FUEWORD so the 'e' variants are wrapped by the non-'e'
variants.  This is more correct and leaves sparc64 as the outlier.

Reviewed by:	jmallett, kib
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14603
2018-03-12 22:10:06 +00:00
Toomas Soome
a5b0fd9ca9 e1000g: this statement may fall through
The gcc 7 does check for switch statement fall through cases, and if legit,
such complaint can besilenced by /* FALLTHROUGH */ comment. Unfortunately
such comment is quite limited, but will still notify the reader.

This patch is backport from illumos, see
https://www.illumos.org/rb/r/941/

Reviewed by:	eadler
Differential Revision:	https://reviews.freebsd.org/D14663
2018-03-12 17:05:53 +00:00
Alexander Motin
01c1be35e0 Print fuses and fna fields in identify data.
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2018-03-12 16:31:25 +00:00
Ed Maste
5cc6d253f4 ANSIfy sys/kern/imgact_* 2018-03-12 15:45:50 +00:00
Ed Maste
340f4a8d3e Linuxulator: apply style(9) to return
Sponsored by:	Turing Robotic Industries Inc.
2018-03-12 15:35:24 +00:00
Ian Lepore
22b3d71e82 Give the atrtc_time_lock a unique name.
Reported by:	hps@
2018-03-12 15:26:11 +00:00
Warner Losh
af1823cde8 Tighten up periph lock to avoid some races
Make sure the periph lock is held around rmw access to softc data,
espeically flags, including work flags in iosched.
Add asserts for the periph lock where it should be held.

PR: 226510
Sponsored by: Netflix
Differential Review: https://reviews.freebsd.org/D14456
2018-03-12 15:17:16 +00:00
Andriy Gapon
7471a3fae8 fix r297857, do not modify CPU extension bits under virtual machines
r297857 was meant for real hardware only.

PR:		213155
Submitted by:	mainland@apeiron.net
MFC after:	1 week
2018-03-12 11:28:09 +00:00
Andrey V. Elsukov
12c080e613 Do not try to reassemble IPv6 fragments in "reass" rule.
ip_reass() expects IPv4 packet and will just corrupt any IPv6 packets
that it gets. Until proper IPv6 fragments handling function will be
implemented, pass IPv6 packets to next rule.

PR:		170604
MFC after:	1 week
2018-03-12 09:40:46 +00:00
Conrad Meyer
0a646b9715 Implement NO_WCAST_QUAL for gcc4.2 architectures 2018-03-12 05:41:27 +00:00
Scott Long
cf6ea6f27a Implement a sysctl to dump in-flight I/O state for debugging. The tool to
parse it will be committed in a separate action.

Sponsored by:	Netflix
2018-03-12 05:02:22 +00:00
Emmanuel Vadot
df5f14797b arm: Remove SoC Specific -MMCCAM kernelconfig
One should use the GENERIC-MMCCAM for this.
2018-03-11 23:14:50 +00:00
Ian Lepore
c7053bbe54 Revert r330780, it was improperly tested and results in taking a spin
mutex before acquiring sleep mutexes.

Reported by:	kib@
2018-03-11 20:13:15 +00:00
Ian Lepore
4b502f0016 Remove MTX_NOPROFILE from atrtc_lock, it was inappropriately copy/pasted
from the i8254 driver when I created separate mutexes for each.  The i8254
driver could be the active timecounter, leading to recursion during mutex
profiling, but the atrtc driver cannot be a timecounter, so it isn't needed.
2018-03-11 19:56:07 +00:00
Ian Lepore
86051be993 Eliminate atrtc_time_lock, and use atrtc_lock for efirtc locking. 2018-03-11 19:22:58 +00:00
Andrey V. Elsukov
017a5e581a Rework key_sendup_mbuf() a bit:
o count in_nomem counter when we have failed to allocate mbuf for
  promisc socket;
o count in_msgtarget counter when we have secussfully sent data to socket;
o Since we are sending messages in a loop, returning error on first fail
  interrupts the loop, and all remaining sockets will not receive this
  message. So, do not return error when we have failed to send data to ALL
  or REGISTERED target. Return error only for KEY_SENDUP_ONE case. Now,
  when some socket has overfilled its receive buffer, this will not break
  other sockets.

MFC after:	2 weeks
2018-03-11 19:14:01 +00:00
Ian Lepore
67e2a29216 Everywhere that multiple registers are accessed in sequence, lock/unlock
just once around the whole group of accesses.
2018-03-11 18:54:45 +00:00
Andrey V. Elsukov
d158b221e5 Add KASSERT to check that proper targed was used.
MFC after:	2 weeks
2018-03-11 18:46:40 +00:00
Andrey V. Elsukov
e3004d2429 Replace panic() with KASSERTs.
MFC after:	2 weeks
2018-03-11 18:37:55 +00:00
Ian Lepore
8355852f85 Use separate mutexes for atrtc and i8254 locking. Change all the strange
un-function-like RTC_LOCK/UNLOCK macro usage into normal function calls.
Since there is no longer any need to handle register access from a debugger
context, those function calls can just be regular mutex lock/unlock calls.

Requested by:  bde
2018-03-11 18:20:49 +00:00
Andrey V. Elsukov
dccd41cb39 Check that we have PF_KEY sockets before iterating over all RAW sockets.
MFC after:	2 weeks
2018-03-11 18:10:59 +00:00
Andrey V. Elsukov
26cb1e1dba Remove obsoleted and unused key_sendup() function.
Also remove declaration for nonexistend key_usrreq() function.

MFC after:	2 weeks
2018-03-11 18:03:55 +00:00
Ian Lepore
14d08b45b8 Convert atrtc the new style rtc debugging output. Remove the db show
command handler which provided much the same information.  Removing the
possibility of accessing the hardware regs from the debugger context
paves the way for simplifying the locking code in the driver.
2018-03-11 16:57:14 +00:00
Brooks Davis
31f2f1e895 Remove obsolete pcaudioio.h.
Nothing uses the #define's values or the types.  (Some NTP code does use
an audio_info_t, but it is in #ifdef'd support for Solaris and is not
this audio_info_t).

Sponsored by:	DARPA, AFRL
2018-03-11 16:17:53 +00:00
Alexander Motin
6b1a96b16b Add new opcodes and statuses from NVMe 1.3a.
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2018-03-11 06:30:09 +00:00
Alexander Motin
3fa5467a06 Add new identify data structures fields from NVMe 1.3a.
Some of them are already supported by existing hardware, so reporting
them `nvmecontrol identify` can be useful.
2018-03-11 05:09:02 +00:00
Emmanuel Vadot
82ce05526b extres/regulators: Add sysctls for regulators
For each regulators create an hw.regulator.<regname>. :
uvolt: Current value
always_on: 1 If the reg is always on
boot_on: 1 If the reg is set at boot time
enable_cnt: Number of consumer(s)
enable_delay: Delay before enabling the regulator
ramp_delay: The Ramp delay
max_uamp: The maximum value of the regulator in uAmps
min_uamp: The minimal value of the regulator in uAmps
max_uvolt: The maximum value of the regulator in uVolts
min_uvolt: The minimal value of the regulator in uVolts

Reviewed by:	ian
Differential Revision:	https://reviews.freebsd.org/D14578
2018-03-11 04:37:05 +00:00
Emmanuel Vadot
7904418f8e allwinner: Add IR clock to sun8i
Add ir clock definition to sun8i-r-ccu.
No idea if it's working but aw_cir seems happy now and the frequency
is set to 3Mhz as it should.
2018-03-11 04:01:23 +00:00
Nathan Whitehorn
35feca377d Make FDT-using parts of ofw_machdep.c condition on options FDT. This fixes
the kernel build when options FDT is absent.
2018-03-11 01:09:31 +00:00
Andriy Voskoboinyk
cf268a8302 otus(4): check mcast / mgt / ucast rates during Tx descriptor setup
These parameters may be changed via ifconfig(8); by default,
mgt / mcast rates are lowest possible and ucast rate is not set
(matches previous configuration).

While here, store some variables locally for better readability.
2018-03-11 00:38:08 +00:00
Andriy Voskoboinyk
8f1ef22fca rtwn(4): reset Tx power values before calling get_txpower()
for RTL8192C / RTL8188E (like it is done for other chipsets).
2018-03-10 23:47:03 +00:00
Andriy Voskoboinyk
63a55eab9b usb/wlan/*: properly include "opt_wlan.h" into all drivers
Without it driver cannot be loaded when wlan(4) module is built with
'options IEEE80211_DEBUG_REFCNT'.
2018-03-10 23:16:24 +00:00
Andriy Voskoboinyk
2e0d057248 run(4): drop few unused variables.
Found by: Clang static analyzer
2018-03-10 22:52:39 +00:00
Ian Lepore
72721caf04 Make root mount timeout logic work for filesystems other than ufs.
The vfs.mountroot.timeout tunable and .timeout directive in a mount.conf(5)
file allow specifying a wait timeout for the device(s) hosting the root
filesystem to become usable.  The current mechanism for waiting for devices
and detecting their availability can't be used for zfs-hosted filesystems.
See the comment #20 in the PR for some expanded detail on these points.

This change adds retry logic to the actual root filesystem mount.  That is,
insted of relying on device availability using device name lookups, it uses
the kernel_mount() call itself to detect whether the filesystem can be
mounted, and loops until it succeeds or the configured timeout is exceeded.

These changes are based on the patch attached to the PR, but it's rewritten
enough that all mistakes belong to me.

PR:		208882
X-MFC after:	sufficient testing, and hopefully in time for 11.1
2018-03-10 22:07:57 +00:00
Edward Tomasz Napierala
8ff2372a2c Check for duplicates when modifying an iSCSI session. Previously we did
this check on open, but "iscsictl -M", or an iSCSI redirect received by
iscsid(8) could end up with two sessions with the same target name and
portal.

MFC after:	2 weeks
2018-03-10 14:21:37 +00:00
Oleksandr Tymoshenko
8e6a67d5be [rpi] remove IRQ support for BCM233x RNG
Upstream DTBs don't provide IRQ lines for the RNG. Moreover, harvesting
bytes as often as the RNG interrupt is triggered (87 times per sec) is an
overkill.

For these reasons, get rid of the interrupt mode and make callout mode the
default, with random bits harvested every 4 seconds.

Submitted by:	Sylvain Garrigues <sylgar@gmail.com>
Reviewed by:	ian, imp, manu, mmel
Approved by:	emaste
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D14541
2018-03-10 02:49:58 +00:00
Bryan Drewery
71b126549d Fix rebase mismerge in r330724.
X-MFC-With:	r330724
MFC after:	2 weeks
Sponsored by:	Dell EMC
2018-03-10 02:13:48 +00:00
Bryan Drewery
906ce865c2 Don't skip reading depend for 'make obj' unless it is alone.
This was effectively done in bsd.dep.mk quite some time ago.

MFC after:	2 weeks
Sponsored by:	Dell EMC
2018-03-10 02:10:26 +00:00
Bryan Drewery
003a0576c9 Skip reading depend files with -V unless looking up a depend variable.
This speeds up some simple -V lookups significantly.

Reported by:	bde
MFC after:	2 weeks
Sponsored by:	Dell EMC
2018-03-10 02:10:19 +00:00
Bryan Drewery
d395e093ac Reduce overhead for simple 'make -V' lookups by avoiding 'find sys/'.
Setting -DNO_SKIP_MPATH can be used for debugging.

Reported by:	bde
MFC after:	2 weeks
Sponsored by:	Dell EMC
2018-03-10 02:09:36 +00:00
Conrad Meyer
dec5441a32 subr_gtaskqueue: Fix braino from r330715
Submitted by:	markj
Sponsored by:	Dell EMC Isilon
2018-03-10 01:53:42 +00:00
Conrad Meyer
2e1fccf2cf nvme_da: Fix minor memory leak in error case
Reported by:	cppcheck
Sponsored by:	Dell EMC Isilon
2018-03-10 01:28:55 +00:00
Brooks Davis
5c4ac7c2af Remove obsolete dataacq.h.
Nothing includes this file, lists it in a Makefile, or uses any of the
ioctl definitions.
2018-03-10 01:07:30 +00:00
Conrad Meyer
8e0e6abc1f subr_gtaskqueue: Fix minor leak of tq_name in error case
Reported by:	cppcheck
Sponsored by:	Dell EMC Isilon
2018-03-10 01:01:01 +00:00
Conrad Meyer
918622dbc0 mlx5(4): Remove redundant declaration of mlx5_enter_error_state
Broken in r330644.

Sponsored by:	Dell EMC Isilon
2018-03-10 00:59:48 +00:00
Hans Petter Selasky
be15e1332d Implement proper support for complete_all() in the LinuxKPI.
When complete_all() is called there might be multiple waiters. The
current implementation could only handle one waiter. Make sure the
completion is sticky when complete_all() is called to be compatible
with Linux.

Found by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-09 12:16:55 +00:00
Andriy Voskoboinyk
d1b671061b net80211: wrap protection frame allocation into ieee80211_alloc_prot()
Move copy-pasted code for RTS/CTS frame allocation into net80211.
While here, add stat / debug message for allocation failures
(copied from run(4)) + return error here in bwn(4).

Reviewed by:	adrian
Differential Revision:	https://reviews.freebsd.org/D14628
2018-03-09 11:33:56 +00:00
Andrew Turner
e150585e9e Use the correct address to write back to memory in the GICv3 ITS driver.
This seems to no be needed on supported hardware as they are cache-coherent,
however this may not be the case on all platforms.

Sponsored by:	DARPA, AFRL
2018-03-09 10:34:44 +00:00
Hajimu UMEMOTO
9f5fab694c Fix Bad file descriptor error.
MFC after:	1 week
2018-03-09 04:45:24 +00:00
Brooks Davis
dd51fec3b9 Copyout a whole int to cpuset_domain's policy pointer.
The previous code only copied 16-bits and corrupted the target int.

Reviewed by:	kib, markj
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14611
2018-03-09 00:50:40 +00:00
Sean Bruno
d7fb35d13a Update tcp_lro with tested bugfixes from Netflix and LLNW:
rrs - Lets make the LRO code look for true dup-acks and window update acks
          fly on through and combine.
    rrs - Make the LRO engine a bit more aware of ack-only seq space. Lets not
          have it incorrectly wipe out newer acks for older acks when we have
          out-of-order acks (common in wifi environments).
    jeggleston - LRO eating window updates

Based on all of the above I think we are RFC compliant doing it this way:

https://tools.ietf.org/html/rfc1122

section 4.2.2.16

"Note that TCP has a heuristic to select the latest window update despite
possible datagram reordering; as a result, it may ignore a window update with
a smaller window than previously offered if neither the sequence number nor the
acknowledgment number is increased."

Submitted by:	Kevin Bowling <kevin.bowling@kev009.com>
Reviewed by:	rstone gallatin
Sponsored by:	NetFlix and Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14540
2018-03-09 00:08:43 +00:00
Emmanuel Vadot
1857bc794f arm: Add GENERIC-MMCCAM kernel config
MMCCAM is the new mmc stack currently developped by kibab@, add a kernel
configuration file that include GENERIC so it's easier to test for people.
2018-03-08 22:54:50 +00:00
Emmanuel Vadot
d597cfd520 Fix build when option MMCCAM is defined. 2018-03-08 22:49:36 +00:00
Konstantin Belousov
23fda0ad66 Remove unused variable.
Sponsored by:	The FreeBSD Foundation
2018-03-08 22:04:54 +00:00
Konstantin Belousov
2cec152827 Make mlx5 compilable on ILP32 arches.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-03-08 22:03:43 +00:00
Ed Maste
a1ea91a21f bktr: correct Japan IF frequency
PR:		36451
Submitted by:	Hijiri Umemoto <hijiri at umemoto.org>
MFC after:	2 weeks
2018-03-08 19:24:10 +00:00
Ed Maste
95320b99a5 asmc: update temperature sensor name/description
PR:		225911
Submitted by:	Trev <fbsdbugs4 at sentry.org>
MFC after:	1 week
2018-03-08 18:52:47 +00:00
Andriy Voskoboinyk
222a2f1de9 iwi(4): factor out rateset setup into iwi_set_rateset().
No functional change intended.
2018-03-08 18:42:23 +00:00
Mark Johnston
bde3b1e1a5 Return E2BIG if we run out of space writing a compressed kernel dump.
ENOSPC causes the MD kernel dump code to retry the dump, but this is
undesirable in the case where we legitimately ran out of space.
2018-03-08 17:04:36 +00:00
Hans Petter Selasky
50e3ba10e8 Set correct SL in completion for RoCE in mlx5ib(4).
There is a difference when parsing a completion entry between Ethernet
and IB ports. When link layer is Ethernet the bits describe the type of
L3 header in the packet. In the case when link layer is Ethernet and VLAN
header is present the value of SL is equal to the 3 UP bits in the VLAN
header. If VLAN header is not present then the SL is undefined and consumer
of the completion should check if IB_WC_WITH_VLAN is set.

While that, this patch also fills the vlan_id field in the completion if
present.

linux commit 12f8fedef2ec94c783f929126b20440a01512c14

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 16:27:31 +00:00
Hans Petter Selasky
0285276c92 Add call to setup firmware data dump structure during device load in
mlx5core.

Do not consider the inability to create a firmware dump fatal, but
inform about the situation and allow the driver to attach. The device
might not implement the needed VSC, or we might not know the layout of
the registers map. In either case, only firmware dump functionality is
limited, the network operations should be fine.

Submitted by:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 16:19:01 +00:00
Hans Petter Selasky
9cb83c4689 Avoid more LFENCE/SFENCe on x86 in mlx5en(4),
by using the FreeBSD native fences.

Submitted by:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:58:30 +00:00
Hans Petter Selasky
7d69d339ad Fix mlx5en(4) driver to properly call m_defrag().
When the mlx5en(4) driver was converted to using BUSDMA(9) the call to
m_defrag() was moved after the part of the TX routine that strips the
header from the mbuf chain. Before it called m_defrag it first trimmed
off the now-empty mbufs from the start of the chain. This has the side
effect of also removing the head of the chain that has M_PKTHDR set.
m_defrag() will not defrag a chain that does not have M_PKTHDR set,
thus it was effectively never defragging the mbuf chains.

As it turns out, trimming the mbufs in this fashion is unnecessary since
the call to bus_dmamap_load_mbuf_sg doesn't map empty mbufs anyway, so
remove it.

Differential Revision:	https://reviews.freebsd.org/D12050
Submitted by:	mjoras@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:53:04 +00:00
Hans Petter Selasky
1eb09ad434 Use vport rather than physical-port MTU in mlx5en(4).
Set and report vport MTU rather than physical MTU,
The driver will set both vport and physical port mtu
and will rely on the query of vport mtu.

SRIOV VFs have to report their MTU to their vport manager (PF),
and this will allow them to work with any MTU they need
without failing the request.

Also for some cases where the PF is not a port owner, PF can
work with MTU less than the physical port mtu if set physical
port mtu didn't take effect.

Based on Linux upstream commit:
cd255efff9baadd654d6160e52d17ae7c568c9d3

Submitted by:	Meny Yossefi <menyy@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:47:17 +00:00
Hans Petter Selasky
7c22ae80d2 Use the device unit number for naming the ifnet interface in mlx5en(4).
Currently the ifnet interface is named mceX, where X is a monotonically
incremented value. If the device is reset due to a fatal error, then the
interface name will change.  Using the device unit number will keep the
naming consistent across the reset logic.

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:43:41 +00:00
Hans Petter Selasky
1a872967ad Remove duplicate prototypes.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:37:09 +00:00
Hans Petter Selasky
e808190a59 Add kernel and userspace code to dump the firmware state of supported
ConnectX-4/5 devices in mlx5core.

The dump is obtained by reading a predefined register map from the
non-destructive crspace, accessible by the vendor-specific PCIe
capability (VSC). The dump is stored in preallocated kernel memory and
managed by the mlx5tool(8), which communicates with the driver using a
character device node.

The utility allows to store the dump in format
    <address> <value>
into a file, to reset the dump content, and to manually initiate the
dump.

A call to mlx5_fwdump() should be added at the places where a dump
must be fetched automatically. The most likely place is right before a
firmware reset request.

Submitted by:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:21:56 +00:00
Hans Petter Selasky
4b95c6659a Add vendor specific capability interface support in mlx5core.
Add the ability to access the vendor specific space gateway in order
to support reading and writing data into the different configuration
domains.

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 11:59:47 +00:00
Hans Petter Selasky
fba294620c Use device_printf() instead of printf() when printing warnings and errors
to dmesg(8) in mlx5core.

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 11:58:27 +00:00
Hans Petter Selasky
10b0804509 Add support for per priority flow control, PFC, to mlx5en(4).
Add support for PFC and implement reading the per priority statistics
using the sysctl(8) interface. PFC is used together with VLAN priority
and can be enabled and disabled on a per priority basis.

Global pause frames and PFC are incompatible features and surrounding
logic has been added to warn the user about misconfiguration.

Update relevant mlx5core APIs for PFC configuration.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 11:40:39 +00:00
Hans Petter Selasky
118063fb70 Add support for explicit congestion notification, ECN, to mlx5ib(4).
ECN configuration and statistics is available through a set of sysctl(8)
nodes under sys.class.infiniband.mlx5_X.cong . The ECN configuration
nodes can also be used as loader tunables.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 11:23:14 +00:00
Hans Petter Selasky
788333d9a6 Use the autogenerated interface file for all commands in mlx5core.
This patch accumulates the following Linux commits:
- 90b3e38d048f09b22fb50bcd460cea65fd00b2d7
  mlx5_core: Modify CQ moderation parameters
- 09a7d9eca1a6cf5eb4f9abfdf8914db9dbd96f08
  mlx5_core: QP/XRCD commands via mlx5 ifc
- 1a412fb1caa2c1b77719ccb5ed8b0c3c2bc65da7
  mlx5_core: Modify QP commands via mlx5 ifc
- ec22eb53106be1472ba6573dc900943f52f8fd1e
  mlx5_core: MKey/PSV commands via mlx5 ifc
- 73b626c182dff06867ceba996a819e8372c9b2ce
  mlx5_core: EQ commands via mlx5 ifc
- 20ed51c643b6296789a48adc3bc2cc875a1612cf
  mlx5_core: Access register and MAD IFC commands via mlx5 ifc
- a533ed5e179cd15512d40282617909d3482a771c
  mlx5_core: Pages management commands via mlx5 ifc
- b8a4ddb2e8f44f872fb93bbda2d541b27079fd2b
  mlx5_core: Add MLX5_ARRAY_SET64 to fix BUILD_BUG_ON
- af1ba291c5e498973cc325c501dd8da80b234571
  mlx5_core: Refactor internal SRQ API
- b06e7de8a9d8d1d540ec122bbdf2face2a211634
  mlx5_core: Refactor device capability function
- c4f287c4a6ac489c18afc4acc4353141a8c53070
  mlx5_core: Unify and improve command interface

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 10:43:42 +00:00
Hans Petter Selasky
ca55159467 Fix race between PCI error handlers and health work in mlx5core.
linux commit 05ac2c0b7438ea08c5d54b48797acf9b22cb2f6f

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 09:58:41 +00:00
Hans Petter Selasky
7053deeb3d Avoid calling sleeping function from the health poll thread in mlx5core.
linux commit c1d4d2e92ad670168a17a57dfa182a5a5baa72d4

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 09:51:33 +00:00