Commit Graph

259714 Commits

Author SHA1 Message Date
hselasky
f72eaedbb0 Avoid leaking send queue mbufs during error recovery in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:41:44 +00:00
hselasky
dfd1cfea1e Add helper functions to set/query MCC/MCDA/MCQI registers in mlx5core.
To be used by the mlx5 callbacks exposed to the mlxfw module.

Linux commit:
d2ad488b0073bd1a2c3f5d2ea50a7eb632103e5d

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:41:21 +00:00
hselasky
437610c98c Enhance MCAM reg to allow query on access reg support in mlx5core.
Enhance MCAM to allow the driver to query which access regs are
supported. For now, expose the regs needed for FW flashing.

Linux commit:
0ab87743cc8c5bcd482daf71961ed5fc45349e01

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:41:00 +00:00
hselasky
e726ac7024 Add MCC (Management Component Control) register definitions in mlx5core.
MCC (Management Component Control) allows to control a firmware
component update.

MCDA (Management Component Data Access) allows to read and write
a firmware component.

MCQI (Management Component Query Information) allows to query
information about firmware components.

Linux commit:
4717628938423fcba0aa8fa889e9fed4eb6a655f

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:40:41 +00:00
hselasky
5f30edfcae Add reading the mcam_reg in mlx5core.
Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:40:13 +00:00
hselasky
6533e2408e Query and cache PCAM, MCAM registers on initialization in mlx5core.
On load_one, we now cache our capabilities registers internally, similar
to QUERY_HCA_CAP. Capabilities can later be queried using macros
introduced in this patch.

Linux commit:
71862561f3a62015a11de16d1c306481e8415c08

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:39:53 +00:00
hselasky
5c23588acc Implement PCAM, MCAM access register commands in mlx5core.
Introduced registers will expose capabilities of new registers and
features related to port/management.
Driver will query MCAM and PCAM in order to avoid failing on old
firmwares with lack of support.

Linux commit:
c835ad64683bd3e2d1b31ed2cb1ff4366932edb1

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:39:25 +00:00
hselasky
23804e32d1 Expose PCAM, MCAM registers infrastructure in mlx5core.
PCAM: Ports capabilities mask register.
MCAM: Management capabilities mask register.

PCAM and MCAM registers will provide information regarding firmware
support for different features, in order to avoid cases where new driver
combined with old firmware results in syndromes (for ex. PCIe counters
before this patchset).

Linux commit:
cfdcbceaeffc669b70d904d80a2df9c86c232566

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:39:01 +00:00
hselasky
a054eaa8dc Add sysctl(8) to control fast unload support in mlx5core.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:38:31 +00:00
hselasky
946c834999 Add Fast teardown support to mlx5core.
Today mlx5 devices support two teardown modes:
1- Regular teardown
2- Force teardown

This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the pages.

Fast teardown provides the following advantages:
1- Fix a FW race condition that could cause command timeout
2- Avoid moving to polling mode
3- Close the vport to prevent PCI ACK to be sent without been
   scattered to memory

Linux commit:
fcd29ad17c6ff885dfae58f557e9323941e63ba2

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:38:06 +00:00
hselasky
d6ddda5f54 Make sure the running variable is properly set for ratelimited SQs in mlx5en(4).
Else the SQs won't be properly released when closing rate-limited connections
leading to wrong state transitions on the SQ.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:37:31 +00:00
hselasky
e4d4e793db Implement get and set nic state as global functions in mlx5core.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:37:03 +00:00
hselasky
48029f79e6 Ticks are integer type in FreeBSD.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:36:32 +00:00
hselasky
550d539120 Configure firmware to use RX hash format in mini CQE in mlx5en(4).
When using CQE zipping, one can choose between RX hash and Checksum.
This will indicate the parameter on which a zipping session should be
stopped.

While porting the Linux code, Checksum was chosen. However, the value
of Checksum is not being used anywhere.
For the FreeBSD driver, we prefer to use the RX hash format which will
guarantee the RX hash value for all the mini CQEs.
While at it, make sure to initialize the Checksum value in the
decompressed CQE.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:35:55 +00:00
hselasky
307df95e12 Disable CQE zipping by default in mlx5en(4).
After doing performance measurements, it seems like CQE zipping doesn't
have any significant benefit.
Moreover, we know that this feature is disabled by default on other
operating systems (Linux for example).

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:35:35 +00:00
hselasky
277d924c32 Split mlx5e_update_stats_work() in mlx5en(4).
Split the function into the mlx5e_update_stats_locked() core and make
mlx5e_update_stats_work() call the _locked helper, similar to many other
places in the kernel. This improves the code structure, making the
locking clean.

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:35:14 +00:00
hselasky
7a629e974e Implement fast close of RX channel in mlx5en(4).
Instead of waiting for all jobs to be cancelled, simply close the completion
queue to prevent more completion events and let mlx5e_destroy_rq() cleanup
the remaining mbufs.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:34:42 +00:00
hselasky
caf0b0b91a Correct number of elements for priority to traffic class mappings in mlx5en(4).
The number of priorities is always 8, while the number of traffic classes
supported can vary. While at it convert the sysctl node into an array.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:34:14 +00:00
hselasky
88a5e71b7e Remove unused module parameter in mlx5ib.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:33:29 +00:00
hselasky
622353cd04 Make sure to error out when arming the CQ fails in mlx4ib and mlx5ib.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:33:09 +00:00
hselasky
4222e245e7 Make sure to error out when arming the CQ fails in ibcore.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:32:45 +00:00
hselasky
529eb49cd0 Destroy port stats debug context in correct order in mlx5en(4).
Destroy children nodes before parent nodes.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:32:22 +00:00
hselasky
d67cf46eed Fix tx_jumbo_packets counter in mlx5en(4).
Instead of reading Ethernet RFC 2819 pXtoYoctets counters from
hardware which counts RX octets, count tx_stat_pXtoYoctets from
Ethernet extended counters which counts TX octets.

TX jumbo counters should be accumulated only after the PPCNT
counters were fetched from hardware with their latest value.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:32:03 +00:00
hselasky
aac38825f7 Update Ethernet extended counters in mlx5en(4).
Expose all Ethernet extended counters those counters via debug_stats
sysctl:
dev.mce.X.debug_stats

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:31:32 +00:00
hselasky
0ca4d482ae Protect from infinite sw-reset loop in mlx5core.
Avoid an infinite software firmware reset loop that may be caused by a
hardware bug by limiting the maximum number of resets.
The counter between resets is reset by request for reset, and not by a
successful reset.
The interval between two resets can be configured via sysctl:
hw.mlx5.sw_reset_timeout
which is global to all mlx5 devices in the system.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:30:47 +00:00
hselasky
4d6dbe0567 Disable all MSIX interrupts before shutdown in mlx5.
Make sure the interrupt handlers don't race with the fast unload one
code in the shutdown handler.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:30:18 +00:00
hselasky
15f173a9f9 Import Linux code to implement mlx5_ib_disassociate_ucontext() in mlx5ib.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:29:45 +00:00
hselasky
2e54e9d8f9 Add temperature warning event to log in mlx5core.
Temperature warning event is sent by FW to indicate high temperature
as detected by one of the sensors on the board.
Add handling of this event by writing the numbers of the alert sensors
to the kernel log.

Linux commit:
1865ea9adbfaf341c5cd5d8f7d384f19948b2fe9

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:28:18 +00:00
hselasky
38379ab7c9 Correctly define the interface state bits in mlx5en(4).
While at it remove unused interface state bits. This also fixes and issue
during shutdown:

There is an issue where the firmware fails during mlx5_load_one,
the health_care timer detects the issue and schedules a health_care call.
Then the mlx5_load_one detects the issue, cleans up and quits. Then
the health_care starts and calls mlx5_unload_one to clean up the resources
that no longer exist and causes kernel panic.

The root cause is that the bit MLX5_INTERFACE_STATE_DOWN is not set
after mlx5_load_one fails. The solution is removing the bit
MLX5_INTERFACE_STATE_DOWN and quit mlx5_unload_one if the
bit MLX5_INTERFACE_STATE_UP is not set. The bit MLX5_INTERFACE_STATE_DOWN
is redundant and we can use MLX5_INTERFACE_STATE_UP instead.

Linux commit:
10a8d00707082955b177164d4b4e758ffcbd4017
b3cb5388499c5e219324bfe7da2e46cbad82bfcf

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:27:29 +00:00
hselasky
641eac0b61 Enable FPGA and FPGA QP errors for EQ and call the handler in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:26:33 +00:00
hselasky
90a4e7f88c Add MLX5_FPGA_RELOAD IOCTL(2) to mlx5fpga.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:25:14 +00:00
hselasky
6cc096bffa Add support for Dynamic Interrupt Moderation, DIM, in mlx5en(4).
Add support for DIM based on Linux,
with some minor adaptions specific to FreeBSD.

Linux commit
f97c3dc3c0e8d23a5c4357d182afeef4c67f5c33

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:23:33 +00:00
marius
1bed20a95a Allow to build without INET and INET6 again after r347221.
Submitted by:	cam
2019-05-08 09:03:43 +00:00
delphij
99029fadee Move contrib/zlib to sys/contrib/zlib so that we can use it in kernel.
This is a prerequisite of unifying kernel zlib instances.

Submitted by:	Yoshihiro Ota <ota at j.email.ne.jp>
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20191
2019-05-08 08:43:15 +00:00
dim
5446f43efb Pull in r360099 from upstream llvm trunk (by Eli Friedman):
[ARM] Glue register copies to tail calls.

  This generally follows what other targets do. I don't completely
  understand why the special case for tail calls existed in the first
  place; even when the code was committed in r105413, call lowering
  didn't work in the way described in the comments.

  Stack protector lowering breaks if the register copies are not glued
  to a tail call: we have to insert the stack protector check before
  the tail call, and we choose the location based on the assumption
  that all physical register dependencies of a tail call are adjacent
  to the tail call. (See FindSplitPointForStackProtector.) This is sort
  of fragile, but I don't see any reason to break that assumption.

  I'm guessing nobody has seen this before just because it's hard to
  convince the scheduler to actually schedule the code in a way that
  breaks; even without the glue, the only computation that could
  actually be scheduled after the register copies is the computation of
  the call address, and the scheduler usually prefers to schedule that
  before the copies anyway.

  Fixes https://bugs.llvm.org/show_bug.cgi?id=41417

  Differential Revision: https://reviews.llvm.org/D60427

This should fix several instances of "Bad machine code: Using an
undefined physical register", when compiling ports such as
multimedia/vlc, audio/alsa-lib and devel/avro-c for armv6, with
-fstack-protector-strong.

Reported by:	jbeich
PR:		237074, 237783, 237784
MFC after:	3 days
2019-05-08 05:45:00 +00:00
jhibbits
d539c69d81 powerpc: hide innocuous printf behind bootverbose
NUMA associativity, and OFW node existence, is completely optional, and
shouldn't warn always.
2019-05-08 03:15:22 +00:00
kevans
0f415eea65 tun/tap: merge and rename to tuntap
tun(4) and tap(4) share the same general management interface and have a lot
in common. Bugs exist in tap(4) that have been fixed in tun(4), and
vice-versa. Let's reduce the maintenance requirements by merging them
together and using flags to differentiate between the three interface types
(tun, tap, vmnet).

This fixes a couple of tap(4)/vmnet(4) issues right out of the gate:
- tap devices may no longer be destroyed while they're open [0]
- VIMAGE issues already addressed in tun by kp

[0] emaste had removed an easy-panic-button in r240938 due to devdrn
blocking. A naive glance over this leads me to believe that this isn't quite
complete -- destroy_devl will only block while executing d_* functions, but
doesn't block the device from being destroyed while a process has it open.
The latter is the intent of the condvar in tun, so this is "fixed" (for
certain definitions of the word -- it wasn't really broken in tap, it just
wasn't quite ideal).

ifconfig(8) also grew the ability to map an interface name to a kld, so
that `ifconfig {tun,tap}0` can continue to autoload the correct module, and
`ifconfig vmnet0 create` will now autoload the correct module. This is a
low overhead addition.

(MFC commentary)

This may get MFC'd if many bugs in tun(4)/tap(4) are discovered after this,
and how critical they are. Changes after this are likely easily MFC'd
without taking this merge, but the merge will be easier.

I have no plans to do this MFC as of now.

Reviewed by:	bcr (manpages), tuexen (testing, syzkaller/packetdrill)
Input also from:	melifaro
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D20044
2019-05-08 02:32:11 +00:00
mav
16793845b5 Fix dataset name comparison in zfs_compare().
The code never returned match comparing two datasets (not snapshots).
As result, uu_avl_find(), called from zfs_callback(), never succeeded,
allowing to add same dataset into the list multiple times, for example:

	# zfs get name pers pers pers@z pers@z
	NAME    PROPERTY  VALUE   SOURCE
	pers    name      pers    -
	pers    name      pers    -
	pers@z  name      pers@z  -

With the patch:

	# zfs get name pers pers pers@z pers@z
	NAME    PROPERTY  VALUE   SOURCE
	pers    name      pers    -
	pers@z  name      pers@z  -

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-05-08 01:35:43 +00:00
cem
48e509c3fd random: x86 driver: Prefer RDSEED over RDRAND when available
Per
https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
, RDRAND is a PRNG seeded from the same source as RDSEED.  The source is
more suitable as PRNG seed material, so prefer it when the RDSEED intrinsic
is available (indicated in CPU feature bits).

Reviewed by:	delphij, jhb, imp (earlier version)
Approved by:	secteam(delphij)
Security:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20192
2019-05-08 00:45:16 +00:00
cem
06c7f39d51 vmm(4): Pass through RDSEED feature bit to guests
Reviewed by:	jhb
Approved by:	#bhyve (jhb)
MFC after:	2 leapseconds
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20194
2019-05-08 00:40:08 +00:00
imp
8d59f6cbf0 Add missing newline to debug printf. 2019-05-08 00:09:10 +00:00
cem
c5676feebf Fix libsbuf sbuf_printf_drain symbol version
(Introduced incorrectly in r347229 earlier today.)

As pointed out by kevans, 1.6 should be used for FreeBSD 13, like r340383.

Submitted by:	kevans
Reported by:	kib
Reviewed by:	jilles
X-MFC-with:	 r347229
Differential Revision:	https://reviews.freebsd.org/D20187
2019-05-07 21:15:11 +00:00
cy
fdb74f8a64 Improve the legibility of the login.access.5 man page by separating
each argument into its own paragraph.

MFC after:	3 days
2019-05-07 20:39:39 +00:00
tuexen
4be6e40cbd Remove non-functional SCTP checksum offload support for virtio.
Checksum offloading for SCTP is not currently specified for virtio.
If the hypervisor announces checksum offloading support, it means TCP
and UDP checksum offload. If an SCTP packet is sent and the host announced
checksum offload support, the hypervisor inserts the IP checksum (16-bit)
at the correct offset, but this is not the right checksum, which is a CRC32c.
This results in all outgoing packets having the wrong checksum and therefore
breaking SCTP based communications.

This patch removes SCTP checksum offloading support from the virtio
network interface.

Thanks to Felix Weinrank for making me aware of the issue.

Reviewed by:		bryanv@
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D20147
2019-05-07 20:28:12 +00:00
trasz
8d4b09f508 Support PTRACE_GETREGSET w/ NT_PRSTATUS in Linux ptrace(2).
While Linux strace(1) doesn't strictly require it - it has a fallback
to PTRACE_GETREGS - it's a newer interface, so we better support it
before the old one is deprecated.

Reviewed by:	dchagin
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20152
2019-05-07 19:06:41 +00:00
emaste
206ba42431 make sysent after r347228
Regenerate to add @generated tag in generated files.
2019-05-07 18:10:21 +00:00
cem
38fa0aaa51 device_printf: Use sbuf for more coherent prints on SMP
device_printf does multiple calls to printf allowing other console messages to
be inserted between the device name, and the rest of the message.  This change
uses sbuf to compose to two into a single buffer, and prints it all at once.

It exposes an sbuf drain function (drain-to-printf) for common use.

Update documentation to match; some unit tests included.

Submitted by:	jmg
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D16690
2019-05-07 17:47:20 +00:00
emaste
c7029e0b4a makesyscalls: use @generated tag in generated files
Multiple tools use @generated to identify generated files (for example,
in a review Phabricator will by default hide diffs in generated files).
Use the @generated tag in makesyscalls.sh as we've done for other
generated files.

Reviewed by:	cem
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20183
2019-05-07 16:17:33 +00:00
markj
72c6d2d761 Simplify the test against maxproc in fork1().
Previously nprocs_new would be tested against maxprocs twice when
nprocs_new < maxprocs - 10.  Eliminate the unnecessary comparison.

Submitted by:	Wuyang Chung <wuyang.chung1@gmail.com>
GitHub PR:	https://github.com/freebsd/freebsd/pull/397
MFC after:	1 week
2019-05-07 15:03:26 +00:00
br
25334b0724 Disable interrupts first and then set spinlock_count to 1.
Otherwise interrupt can be generated just after setting spinlock_count
and before disabling interrupts.

Sponsored by:	DARPA, AFRL
2019-05-07 14:32:17 +00:00