Commit Graph

37898 Commits

Author SHA1 Message Date
Hans Petter Selasky
75dc9c41ab Improve debug message to be more precise and clear.
For the sake of the record, this is the last use of the words master and slave
in the FreeBSD's USB stack, drivers and subsystems.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-20 14:16:24 +00:00
Michal Meloun
b72e2878ab Improve if_dwc:
- refactorize packet receive path. Make sure that we don't leak mbufs
   and/or that we don't create holes in RX descriptor ring
 - slightly simplify handling with TX descriptors

MFC after:	4 weeks
2020-06-19 19:26:55 +00:00
Brandon Bergren
37f530582d [PowerPC] De-giant powermac_nvram, update documentation
* Remove the giant lock requirement from powermac_nvram.
* Update manual pages to reflect current state.

Reviewed by:	bcr (manpages), jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D24812
2020-06-19 18:36:10 +00:00
Michal Meloun
188aee740f Finish renaming in if_dwc.
By using DWC TRM terminology, normal descriptor format should be named
extended and alternate descriptor format should be named normal.

Should not been functional change.

MFC after:	4 weeks
2020-06-19 18:34:27 +00:00
Michal Meloun
8d43a8685c Use naming nomenclature used in DesignWare TRM.
Use naming nomenclature used in DesignWare TRM.
This driver was written by using Altera (now Intel) documentation for Arria
FPGA manual. Unfortunately this manual used very different (and in some cases
opposite naming) for registers and descriptor fields. Unfortunately,
this makes future expansion extremely hard.

Should not been functional change.

MFC after:	4 weeks
2020-06-19 18:04:41 +00:00
Andrew Turner
41b84341f5 Use the correct address when creating pci resources
When the PCI and CPU physical addresses are identical it doesn't matter
which is used to create the resources, however on some systems, e.g.
qemu armv7 virt, they are different. This leads to a panic as we try to
map the wrong physical address into the kernel address space.

Reported by:	Jenkins via trasz
Sponsored by:	Innovate UK
2020-06-19 18:00:20 +00:00
Michal Meloun
7f8437c353 Adapt ARMADA8k PCIe driver to newly imported 5.7 DT.
- temporarily disable handling with phy, we don't have driver for it yet
- always clear cause for administartive interrupt.
While I'm in, fix style(9) (mainly whitespace).

MFC after:	4 weeks
2020-06-19 17:33:54 +00:00
Michal Meloun
224c5a9ff3 Revert r362389, it was committed with <patch>.diff instead of <patch>.txt as
commit log.
2020-06-19 17:32:50 +00:00
Michal Meloun
7a5750fd2d diff --git a/sys/dev/pci/pci_dw_mv.c b/sys/dev/pci/pci_dw_mv.c
index 06a29fefbdd..571fc00f6c1 100644
--- a/sys/dev/pci/pci_dw_mv.c
+++ b/sys/dev/pci/pci_dw_mv.c
@@ -64,15 +64,11 @@ __FBSDID("$FreeBSD$");

 #define MV_GLOBAL_CONTROL_REG		0x8000
 #define PCIE_APP_LTSSM_EN		(1 << 2)
-//#define PCIE_DEVICE_TYPE_SHIFT		4
-//#define PCIE_DEVICE_TYPE_MASK		0xF
-//#define PCIE_DEVICE_TYPE_RC		0x4/

 #define MV_GLOBAL_STATUS_REG		0x8008
 #define	 MV_STATUS_RDLH_LINK_UP			(1 << 1)
 #define  MV_STATUS_PHY_LINK_UP			(1 << 9)

-
 #define MV_INT_CAUSE1			0x801C
 #define MV_INT_MASK1			0x8020
 #define  INT_A_ASSERT_MASK			(1 <<  9)
@@ -90,11 +86,7 @@ __FBSDID("$FreeBSD$");
 #define MV_ARUSER_REG			0x805C
 #define MV_AWUSER_REG			0x8060

-
-
 #define	MV_MAX_LANES	8
-
-
 struct pci_mv_softc {
 	struct pci_dw_softc	dw_sc;
 	device_t		dev;
@@ -112,7 +104,6 @@ static struct ofw_compat_data compat_data[] = {
 	{NULL,		 	  0},
 };

-
 static int
 pci_mv_phy_init(struct pci_mv_softc *sc)
 {
@@ -121,18 +112,23 @@ pci_mv_phy_init(struct pci_mv_softc *sc)
 	for (i = 0; i < MV_MAX_LANES; i++) {
 		rv =  phy_get_by_ofw_idx(sc->dev, sc->node, i, &(sc->phy[i]));
 		if (rv != 0 && rv != ENOENT) {
-	  		device_printf(sc->dev, "Cannot get phy[%d]\n", i);
-	  		goto fail;
-	  	}
-	  	if (sc->phy[i] == NULL)
-	  		continue;
-	  	rv = phy_enable(sc->phy[i]);
-	  	if (rv != 0) {
-	  		device_printf(sc->dev, "Cannot enable phy[%d]\n", i);
-	  		goto fail;
-	  	}
-	  }
-	  return (0);
+			device_printf(sc->dev, "Cannot get phy[%d]\n", i);
+/* XXX revert when phy driver will be implemented */
+#if 0
+		goto fail;
+#else
+		continue;
+#endif
+		}
+		if (sc->phy[i] == NULL)
+			continue;
+		rv = phy_enable(sc->phy[i]);
+		if (rv != 0) {
+			device_printf(sc->dev, "Cannot enable phy[%d]\n", i);
+			goto fail;
+		}
+	}
+	return (0);

 fail:
 	for (i = 0; i < MV_MAX_LANES; i++) {
@@ -173,13 +169,14 @@ pci_mv_init(struct pci_mv_softc *sc)
 	/* Enable local interrupts */
 	pci_dw_dbi_wr4(sc->dev, DW_MSI_INTR0_MASK, 0xFFFFFFFF);
 	pci_dw_dbi_wr4(sc->dev, MV_INT_MASK1, 0xFFFFFFFF);
-	pci_dw_dbi_wr4(sc->dev, MV_INT_MASK2, 0xFFFFFFFF);
+	pci_dw_dbi_wr4(sc->dev, MV_INT_MASK2, 0xFFFFFFFD);
 	pci_dw_dbi_wr4(sc->dev, MV_INT_CAUSE1, 0xFFFFFFFF);
 	pci_dw_dbi_wr4(sc->dev, MV_INT_CAUSE2, 0xFFFFFFFF);

 	/* Errors have own interrupt, not yet populated in DTt */
 	pci_dw_dbi_wr4(sc->dev, MV_ERR_INT_MASK, 0);
 }
+
 static int pci_mv_intr(void *arg)
 {
 	struct pci_mv_softc *sc = arg;
@@ -188,8 +185,6 @@ static int pci_mv_intr(void *arg)
 	/* Ack all interrups */
 	cause1 = pci_dw_dbi_rd4(sc->dev, MV_INT_CAUSE1);
 	cause2 = pci_dw_dbi_rd4(sc->dev, MV_INT_CAUSE2);
-	if (cause1 == 0 || cause2 == 0)
-		return(FILTER_STRAY);

 	pci_dw_dbi_wr4(sc->dev, MV_INT_CAUSE1, cause1);
 	pci_dw_dbi_wr4(sc->dev, MV_INT_CAUSE2, cause2);
2020-06-19 17:25:54 +00:00
Michal Meloun
1f446a117e Improve DesignWare PCIe driver:
- only normal memory window is mandatory, prefetchable memory and
  I/O windows should be optional
- full PCIe configuration space is supported
- remove duplicated check from function for accessing configuration space.
  It is already contained in pci_dw_check_dev()

MFC after:	2 weeks
2020-06-19 16:15:06 +00:00
Mike Karels
349eddbd07 Add support for bcm54213PE in brgphy.
This chip is used in the Rasperry Pi 4, and is supported by the if_genet
driver. Currently we use the ukphy mii driver, this patch switches over
to the brgphy mii driver instead. To support the rgmii-rxid phy mode,
which is now the default in the Linux dtb, we add support for clock
skewing.

These changes are taken from OpenBSD and NetBSD, except for the bailout
in brgphy_bcm54xx_clock_delay() in rgmii mode, which was found necessary
after testing.

Submitted by:	Robert Crowston, crowston at protomail.com
Differential Revision:	https://reviews.freebsd.org/D25251
2020-06-18 23:57:10 +00:00
Alexander Motin
ead7e10308 Make polled request timeout less invasive.
Instead of panic after one second of polling, make the normal timeout
handler to activate, reset the controller and abort the outstanding
requests.  If all of it won't happen within 10 seconds then something
in the driver is likely stuck bad and panic is the only way out.

In particular this fixed device hot unplug during execution of those
polled commands, allowing clean device detach instead of panic.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-06-18 19:16:03 +00:00
Andrew Turner
c794cdc0a2 Stop assuming we can print rman_res_t with %lx
This is not the case on armv6 and armv7, where we also build this driver.
Fix by casting through uintmax_t and using %jx.

Sponsored by:	Innovate UK
2020-06-18 06:21:00 +00:00
Andriy Gapon
4c7d1ab06d hdac_intr_handler: keep working until global interrupt status clears
It is plausible that the hardware interrupts a host only when GIS goes
from zero to one.  GIS is formed by OR-ing multiple hardware statuses,
so it's possible that a previously cleared status gets set again while
another status has not been cleared yet.  Thus, there will be no new
interrupt as GIS always stayed set.  If we don't re-examine GIS then we
can leave it set and never get another interrupt again.

Without this change I frequently saw a problem where snd_hda would stop
working.  Setting dev.hdac.1.polling=1 would bring it back to life and
afterwards I could set polling back to zero.  Sometimes the problem
started right after a boot, sometimes it happened after resuming from
S3, frequently it would occur when sound output and input are active
concurrently (such as during conferencing).  I looked at HDAC_INTSTS
while the sound was not working and I saw that both HDAC_INTSTS_GIS and
HDAC_INTSTS_CIS were set, but there were no interrupts.

I have collected some statistics over a period of several days about how
many loops (calls to hdac_one_intr) the new code did for a single
interrupt:
+--------+--------------+
|Loops   |Times Happened|
+--------+--------------+
|0       |301           |
|1       |12857746      |
|2       |280           |
|3       |2             |
|4+      |0             |
+--------+--------------+
I believe that previously the sound would get stuck each time we had to loop
more than once.

The tested hardware is:
hdac1: <AMD (0x15e3) HDA Controller> mem 0xfe680000-0xfe687fff at device 0.6 on pci4
hdacc1: <Realtek ALC269 HDA CODEC> at cad 0 on hdac1

No objections:	mav
MFC after:	5 weeks
Differential Revision: https://reviews.freebsd.org/D25128
2020-06-18 06:12:06 +00:00
Andrew Turner
9a7053ce96 Clean up the pci host generic driver
- Support Prefetchable Memory.
 - Use the correct rman when allocating memory and ioports.
 - Translate PCI addresses in bus_alloc_resource to allow physical
   addresses that are different than pci addresses.

Reviewed by:	Robert Crowston <crowston_protonmail.com>
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D25121
2020-06-17 19:56:17 +00:00
Alexander Motin
550d5d64fe Fix admin qpair leak if detached during initial reset.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-06-17 17:51:40 +00:00
Hans Petter Selasky
11304ef50e Fix HW TLS offload regression issue after r359919, in mlx5en(4).
Changes in the mbuf layout regarding HW TLS, resulted in wrong detection
of starting mbuf. Use a boolean variable to handle this and pass m_adj()
the top mbuf, so that the packet header is adjusted correctly.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-17 11:14:54 +00:00
Hans Petter Selasky
a26df270c9 Allow multicast packets to be received in promiscious mode, in mlx4en(4).
Make sure we disable the multicast filter in promiscious mode aswell as when
the all multicast flag is set.

MFC after:	1 week
Found by:	Tycho Nightingale <tychon@freebsd.org>
Sponsored by:	Mellanox Technologies
2020-06-17 11:12:10 +00:00
Vladimir Kondratyev
94811094f8 evdev: Add AT translated set1 scancodes for 'Eisu' & 'Kana' keys.
PR:		247292
Submitted by:	Yuichiro NAITO <naito.yuichiro@gmail.com>
MFC after:	1 week
2020-06-17 08:35:35 +00:00
Adrian Chadd
209be66e26 [rsu] Update wme ie API use.
Whoops, forgot to land this one too!
2020-06-16 01:11:40 +00:00
Vincenzo Maffione
ef6fdb3312 if_vtnet: let vtnet_rx_vq_intr() and vtnet_rxq_tq_intr() share code
Since the two functions are similar, introduce a common function
(vtnet_rx_vq_process()) to share common code.
This also improves locking, by ensuring vrxs_rescheduled is accessed
under the RXQ lock, and taskqueue_enqueue() is not called under the
lock (therefore avoiding a spurious duplicate lock warning).

Reported by:	jrtc27
MFC after:	2 weeks
2020-06-15 19:46:34 +00:00
Ryan Moeller
cbb9ccf735 Avoid trying to toggle TSO twice
Remove TSO from the toggle mask when automatically disabled by TXCKSUM* in
various NIC drivers.

Reviewed by:	hselasky, np, gallatin, jpaetzel
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25120
2020-06-15 16:35:27 +00:00
Jessica Clarke
576b099a5f vtnet: Fix regression introduced in r361944
For legacy devices that don't support MrgRxBuf (such as bhyve pre-r358180),
r361944 failed to update the receive handler to account for the additional
padding introduced by the unused num_buffers field that is now always present
in struct vtnet_rx_header. Thus, calculate the padding dynamically based on
vtnet_hdr_size.

PR:		247242
Reported by:	thj
Tested by:	thj
2020-06-14 22:39:34 +00:00
Vincenzo Maffione
16f224b5f8 netmap: vtnet: fix races in vtnet_netmap_reg()
The nm_register callback needs to call nm_set_native_flags()
or nm_clear_native_flags() once the device has been stopped.
However, in the current implementation this is not true,
as the device is stopped by vtnet_init_locked(). This causes
race conditions where the driver crashes as soon as it
dequeues netmap buffers assuming they are mbufs (or the other
way around).
To fix the issue, we extend vtnet_init_locked() with a second
argument that, if not zero, will set/clear the netmap flags.
This results in a huge simplification of the nm_register
callback itself.
Also, use netmap_reset() to check if a ring is going to be
re-initialized in netmap mode.

MFC after:	1 week
2020-06-14 20:47:31 +00:00
Brandon Bergren
a4ec123c56 [PowerPC] Fix scc z8530 driver
Parts of the z8530 driver were still using the SUN channel spacing.

This was invalid on PowerMac and QEMU, where the attachment was to escc,
not escc-legacy. This means the driver has apparently NEVER worked properly
on Macintosh hardware.

Add documentation for the channel spacing details, and change to using
driver-specific initialization instead of hardcoded spacing so either
spacing can be used.

Fixes boot hang in QEMU when using the serial console, and fixes use on
Xserve serial (and presumably PowerMacs that have a Stealth Serial port
or similar)

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D24661
2020-06-14 16:47:16 +00:00
Toomas Soome
e7fd9688ea Move font related data structured to sys/font.c and update vtfontcvt
Prepare support to be able to handle font data in loader, consolidate
data structures to sys/font.h and update vtfontcvt.

vtfontcvt update is about to output set of glyphs in form of C source,
the implementation does allow to output compressed or uncompressed font
bitmaps.

Reviewed by:	bcr
Differential Revision:	https://reviews.freebsd.org/D24189
2020-06-14 06:58:58 +00:00
Konstantin Belousov
17edf152e5 Control for Special Register Buffer Data Sampling mitigation.
New microcode update for Intel enables mitigation for SRBDS, which
slows down RDSEED and related instructions.  The update also provides
a control to limit the mitigation to SGX enclaves, which should
restore the speed of random generator by the cost of potential
cross-core bufer sampling.

See https://software.intel.com/security-software-guidance/insights/deep-dive-special-register-buffer-data-sampling

GIve the user control over it.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25221
2020-06-12 22:14:45 +00:00
Alexander Motin
92390644e3 Fix config_intrhook leak on initial reset failure.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-06-12 14:14:01 +00:00
Vincenzo Maffione
6682323732 netmap: introduce netmap_kring_on()
This function returns NULL if the ring identified by
queue id and direction is in netmap mode. Otherwise
return the corresponding kring.
Use this function to replace vtnet_netmap_queue_on().

MFC after:	1 week
2020-06-11 20:35:28 +00:00
Eric Joyner
104d75a051 em(4): Always reinit interface when adding/removing VLAN
This partially reverts r361053 since there have been reports
by users that this breaks some functionality for em(4)
devices; it seems at first glance that some sort of interface
restart is required for those cards.

This isn't a proper fix; this unbreaks those users until a proper
fix is found for their issues.

PR:		240818
Reported by:	Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after:	3 days
2020-06-11 15:59:49 +00:00
Hans Petter Selasky
9c847ffd74 Add missing range checks when receiving USB ethernet packets.
Found by:	Ilja Van Sprundel, IOActive
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2020-06-11 14:31:51 +00:00
Hans Petter Selasky
6fe9e470bb Make sure packets generated by raw IP code is let through by mlx5en(4).
Allow the TCP header to reside in the mbuf following the IP header.
Else such packets will get dropped.

Backtrace:
mlx5e_sq_xmit()
mlx5e_xmit()
ether_output_frame()
ether_output()
ip_output_send()
ip_output()
rip_output()
sosend_generic()
sosend()
kern_sendit()
sendit()
sys_sendto()
amd64_syscall()
fast_syscall_common()

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-11 09:41:54 +00:00
Hans Petter Selasky
b63b61cc75 Extend use of unlikely() in the fast path, in mlx5en(4).
Typically the TCP/IP headers fit within the first mbuf and should not
trigger any of the error cases. Use unlikely() for these cases.

No functional change.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-11 09:38:51 +00:00
Hans Petter Selasky
9eb1e4aa21 Use const keyword when parsing the TCP/IP header in the fast path in mlx5en(4).
When parsing the TCP/IP header in the fast path, make it clear by using
the const keyword, no fields are to be modified inside the transmitted
packet.

No functional change.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-11 09:36:37 +00:00
Andriy Gapon
4b869dd71d iicbb: rebuild the bit-banging algorithms using different primitives
I2C_SET was quite inflexible, it used too long delays as well as some
unnecessary delays.  The new building blocks are iicbb_clockin and
iicbb_clockout.  The former sets SDA and starts the high period of SCL,
the latter executes the low period of SCL.  What happens during the high
phase depends on the operation.  For writes we just hold both lines, for
reads we poll SDA.  S, Sr and P change SDA in the middle of the high
period.

Also, the calculation of udelay has been updated, so that the resulting
period more closely corresponds the requested bus frequency.  There is a
new knob, io_delay, that allows to further adjust udelay based on the
estimated latency of pin toggling operations.

Finally, I slightly changed debug tracing and added error indicators to
it.  The debug prints are compiled in but disabled by default.  This can
be of use if there is any fallout from this change.

Some ideas for further improvements:
- add a function for sub-microsecond delays (e.g., in units of 1/10th of
  a microsecond) and use it for more precise timing of short delays;
- account for the actual time spent in the pin I/O.

Some sample debug output with the new code follows.

Reading temperature and humidity from HTU21 in the bus hold mode:
  <<w80+ we3+ <w81+ .....r6d+ rac+ r94- >>
  <<w80+ we5+ <w81+ .............r47+ re2+ r84- >>
where '<<' is S, '<' is Sr, '>>' is P, '.' is one millisecond of clock
stretching by the slave.

Reading temperature and humidity in the no-hold mode:
  <<w80+ wf3+ >>
  <<w81- >>
  <<w81+ r6d+ r54+ raf- >>
  <<w80+ wf5+ >>
  <<w81- >>
  <<w81+ r48+ r4e+ r9c- >>
where '+' is Ack and '-' is NoAck.
We see that first read attempts are not acknowledged.

MFC after:	4 weeks
Differential Revision: https://reviews.freebsd.org/D22206
2020-06-11 05:34:31 +00:00
Konstantin Belousov
4149c6a3ec Remove double-calls to tc_get_timecount() to warm timecounters.
It seems that second call does not add any useful state change for all
implemented timecounters.

Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2020-06-10 22:30:32 +00:00
Oleksandr Tymoshenko
cbc596d6bf Fix reading EDID on TVs/monitors without E-DCC support
Writing segment id to I2C device 0x30 only required if the segment is
non-zero. On the devices without E-DCC support writing to that address
fails and whole transaction then fails too. To avoid this do
not attempt write to the segment selection device unless required.

MFC after:	2 weeks
2020-06-10 21:38:35 +00:00
John Baldwin
9b6b2f8608 Adjust crypto_apply function callbacks for OCF.
- crypto_apply() is only used for reading a buffer to compute a
  digest, so change the data pointer to a const pointer.

- To better match m_apply(), change the data pointer type to void *
  and the length from uint16_t to u_int.  The length field in
  particular matters as none of the apply logic was splitting requests
  larger than UINT16_MAX.

- Adjust the auth_xform Update callback to match the function
  prototype passed to crypto_apply() and crypto_apply_buf().  This
  removes the needs for casts when using the Update callback.

- Change the Reinit and Setkey callbacks to also use a u_int length
  instead of uint16_t.

- Update auth transforms for the changes.  While here, use C99
  initializers for auth_hash structures and avoid casts on callbacks.

Reviewed by:	cem
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25171
2020-06-10 21:18:19 +00:00
Chuck Tuffli
f14f005113 pci: loosen PCIe hot-plug requirements
The original PCIe hot-plug code required a couple of things which cause
PCI probing errors on the QEMU Q35 system and possibly physical systems
(Dell R6515).

Allocate the hot-plug interrupt as shared to support INTx interrupts.
The hot-plug interrupt mechanism should normally be MSI as PCIe mandates
MSI support, but QEMU's Q35 bridge only provides INTx interrupts.

Second, the code required the Electromechanical Interlock (Slot Status
EIS) to be engaged if present (Slot Capability EIP). Some platforms
including QEMU Q35 set EIP but not EIS. Fix by deleting the check.

Reviewed by: imp, mav, jhb
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D24877
2020-06-10 20:12:45 +00:00
Ruslan Bukin
c7dada4c03 All the ARM Coresight interconnect devices set ResourceProducer on memory
resources, ignore it.

The devices found in the ARM Neoverse N1 System Development Platform
(N1SDP).

Sponsored by:	DARPA, AFRL
2020-06-10 14:39:54 +00:00
Eric Joyner
b4a7ce0690 ixl(4): Add FW recovery mode support and other things
Update the iflib version of ixl driver based on the OOT version ixl-1.11.29.

Major changes:

- Extract iflib specific functions from ixl_pf_main.c to ixl_pf_iflib.c
  to simplify code sharing between legacy and iflib version of driver

- Add support for most recent FW API version (1.10), which extends FW
  LLDP Agent control by user to X722 devices

- Improve handling of device global reset

- Add support for the FW recovery mode

- Use virtchnl function to validate virtual channel messages instead of
  using separate checks

- Fix MAC/VLAN filters accounting

Submitted by:	Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by:	erj@
Tested by:	Jeffrey Pieper <jeffrey.e.pieper@intel.com>
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D24564
2020-06-09 22:42:54 +00:00
Ruslan Bukin
b62d159cb3 Similar to UART on ThunderX2, the ARM Coresight (ETM component)
set ResourceProducer on memory resources: ignore it.

Tested on ARM N1SDP board.

Sponsored by:	DARPA, AFRL
2020-06-09 17:07:42 +00:00
Emmanuel Vadot
4707401c75 coufreq_dt: Rename DEBUG to DPRINTF
DEBUG is a kernel configuration flag and if used cpufreq_dt.c will fail the
build of kernel.

PR:		246867
Submitted by:	Oskar Holmund (oskar.holmlund@ohdata.se)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25080
2020-06-09 09:42:39 +00:00
Jessica Clarke
8c3988dff9 virtio: Support non-legacy network device and queue
The non-legacy interface always defines num_buffers in the header,
regardless of whether VIRTIO_NET_F_MRG_RXBUF, just leaving it unused. We
also need to ensure our virtqueue doesn't filter out VIRTIO_F_VERSION_1
during negotiation, as it supports non-legacy transports just fine. This
fixes network packet transmission on TinyEMU.

Reviewed by:	br, brooks (mentor), jhb (mentor)
Approved by:	br, brooks (mentor), jhb (mentor)
Differential Revision:	https://reviews.freebsd.org/D25132
2020-06-08 21:51:36 +00:00
Jessica Clarke
16ca3d0f59 virtio_mmio: Negotiate the upper half of the feature bits too
The feature bits are exposed as a 32-bit register with 2 banks, so we
should negotiate both halves. Notably, VIRTIO_F_VERSION_1 is in the
upper half, and will be used in an upcoming commit.

The PCI bus driver also has this bug, but the legacy BAR layout did not
include selector registers and is rather different from the modern
layout, so it remains solely as legacy.

Reviewed by:	br, brooks (mentor), jhb (mentor)
Approved by:	br, brooks (mentor), jhb (mentor)
Differential Revision:	https://reviews.freebsd.org/D25131
2020-06-08 21:49:42 +00:00
Alexander Motin
9a4510ac32 Implement zero-copy iSCSI target transmission/read.
Add ICL_NOCOPY flag to icl_pdu_append_data(), specifying that the method
can just reference the data buffer instead of immediately copying it.

Extend the offload KPI with optional PDU queue method, allowing to specify
completion callback, called when all the data referenced by above has been
transferred and won't be accessed any more (the buffers can be freed).

Implement the above functionality in software iSCSI driver using mbufs
with external storage and reference counter.  Note that some NICs (ixl(4))
may keep the mbuf in TX queue for a long time, so CTL has to be ready.

Add optional method to struct ctl_scsiio for buffer reference counting.
Implement it for CTL block backend, allowing to delay free of the struct
ctl_be_block_io and memory it references as needed.  In first reincarnation
of the patch I tried to delay whole I/O as it is done for FibreChannel,
that was cleaner, but due to the above callback delays I had to rewrite
it this way to not leave LUN referenced potentially for hours or more.

All together on sequential read from ZFS ARC this saves about 30% of CPU
time and memory bandwidth by avoiding one of 3 memory copies (the other
two are from ZFS ARC to DMU cache and then from DMU cache to CTL buffers).
On tests with 2x Xeon Silver 4114 this allows to reach full line rate of
100GigE NIC.  Tests with Gold CPUs and two 100GigE NICs are stil TBD,
but expectations to saturate them are pretty high. ;)

Discussed with:	Chelsio
Sponsored by:	iXsystems, Inc.
2020-06-08 20:53:57 +00:00
Adrian Chadd
857e0646ca [if_ath] Don't update the beacon bits from beacon frames in hostapd mode.
This logic is running the beacon receive bits in STA+AP mode on both the
STA and AP side.  The STA side sees its beacons from the BSS fine; the
AP side is seeing other beacons on the same channel but with the BSS
node for some odd reason.  (I think it's a valid reason, but I currently
forget what that valid reason is.)

So, just to be cleaner about things, don't run the nexttbtt/etc bits
at all if we're in hostap mode.  If I ever get mesh working then maybe
I'll make sure it works right on mesh+ap and mesh+sta modes.

Whilst here, log the VAP i'm being called on to make it clearer what
is going on.  I may end up adding a VAP dprintf version of this at
some point.

Tested:

* AR9380, STA (DWDS client) + hostap on the same NIC
2020-06-07 05:08:44 +00:00
Alexander Motin
fc68af7962 Add bunch of HDA controller and codec IDs.
MFC after:	2 weeks
2020-06-05 15:06:58 +00:00
Hans Petter Selasky
eff8154913 USB HID descriptors may push/pop the current state to allow
description of items residing in a so-called union. FreeBSD currently
only supports 4 such pop levels.

If the push level is not restored within the processing of the same
HID item, an invalid memory location may be used for subsequent HID
item processing.

Verify that the push level is always valid when processing HID items.

Reported by:	Andy Nguyen (Google)
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2020-06-05 07:57:16 +00:00
Adrian Chadd
60a9489509 [iwn] Set default ampdu parameters.
These are from the linux iwlwifi driver ;the default use smaller
maximum AMPDUs (8k) and a much smaller density (none.)  The latter
could cause stability issues.

Tested:

* Tested on Intel 6300, STA mode.

Differential Revision: https://reviews.freebsd.org/D25113
2020-06-05 04:24:34 +00:00
Alexander Motin
abab2155ed Limit AHCI to only one MSI if more is not needed.
My AMD Ryzen system has 4 AHCI controllers, each supporting 16 MSI vectors.
Since two of the controllers have only one SATA port, limit to single MSI
saves system 30 interrupt vectors for free.

It may be possible to also limit number of MSI vectors to 4 and 8 for the
other two controllers, but according to the AHCI specification after that
controllers may revert to only one vector, that would be a bigger loss to
risk.

MFC after:	2 weeks
2020-06-05 02:21:46 +00:00
Eric Joyner
51569bd793 em(4): Add support for Comet Lake Mobile Platform, update shared code
This change introduces Comet Lake Mobile Platform support in the e1000
driver along with shared code patches described below.

- Cast return value of e1000_ltr2ns() to higher type to avoid overflow
- Remove useless statement of assigning act_offset
- Add initialization of identification LED
- Fix flow control setup after connected standby:
  After connected standby the driver blocks resets during
  "AdapterStart" and skips flow control setup. This change adds
  condition in e1000_setup_link_ich8lan() to always setup flow control
  and to setup physical interface only when there is no need to block
  resets.

Signed-off-by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>

Submitted by:	Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by:	erj@
Tested by:	Jeffrey Pieper <jeffrey.e.pieper@intel.com>
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D25035
2020-06-04 20:39:28 +00:00
Andriy Gapon
e84d431622 superio: do not assume that current LDN cannot change after config exit
That assumption should be true when superio(4) uses the hardware
exlusively.  But it turns out to not hold on some real systems.
So, err on the side of correctness rather than performance.
Clear current_ldn in sio_conf_exit.

Reported by:	bz
Tested by:	bz
MFC after:	1 week
2020-06-04 13:18:21 +00:00
Adrian Chadd
e649b526cc [run] Fix up tx/rx frame size.
This specifically fixes that TX frames are large enough now to hold a 3900 odd
byte AMSDU (the little ones); me flipping it on earlier messed up transmit!

Tested:

* if_run, STA mode, TX/RX TCP/UDP iperf.  TCP is now back to normal and
  correctly does ~ 3200 byte AMSDU/fast frames (2x1600ish byte MSDUs).
2020-06-03 22:30:44 +00:00
John Baldwin
1a4a7e98eb Explicitly zero IVs on the stack.
Reviewed by:	delphij
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25057
2020-06-03 22:19:52 +00:00
John Baldwin
0065d9a47f Explicitly zero AES key schedules on the stack.
Reviewed by:	delphij
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25057
2020-06-03 22:18:21 +00:00
John Baldwin
20c128da91 Add explicit bzero's of sensitive data in software crypto consumers.
Explicitly zero IVs, block buffers, and hashes/digests.

Reviewed by:	delphij
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25057
2020-06-03 22:11:05 +00:00
Adrian Chadd
53652fb94e [otus] enable 802.11n for 2GHz and 5GHz.
This flips on basic 11n for 2GHz/5GHz station operation.

* It flips on HT20 and MCS rates;
* It enables A-MPDU decap - the payload format is a bit different;
* It does do some basic checks for HT40 but I haven't yet flipped on
  HT40 support;
* It enables software A-MSDU transmit; I honestly don't want to make
  A-MPDU TX work and there are apparently issues with QoS and A-MPDU TX.
  So I totally am ignoring A-MPDU TX;
* MCS rate transmit is fine.

I haven't:

* A-MPDU TX, as I said above;
* made radiotap work fully;
* HT40;
* short-GI support;
* lots of other stuff that honestly no-one is likely to use.

But! Hey, this is another ye olde 11n USB NIC that now works pretty OK
in 11n rates. A-MPDU receive seems fine enough given it's a draft-n
device from before 2010.

Tested:

* Ye olde UB82 Test NIC (AR9170 + AR9104) - 2GHz/5GHz
2020-06-03 20:25:02 +00:00
Vincenzo Maffione
e8c07b1246 netmap: vtnet: clean up rxsync disabled logs
MFC after:	1 week
2020-06-03 17:47:32 +00:00
Vincenzo Maffione
1b6d5a80a6 netmap: vtnet: fix race condition in rxsync
This change prevents a race that happens when rxsync dequeues
N-1 rx packets (with N being the size of the netmap rx ring).
In this situation, the loop exits without re-enabling the
rx interrupts, thus causing the VQ to stall.

MFC after:	1 week
2020-06-03 17:46:21 +00:00
Vincenzo Maffione
2d769e25b1 netmap: vtnet: add vtnrx_nm_refill index to receive queues
The new index tracks the next netmap slot that is going
to be enqueued into the virtqueue. The index is necessary
to prevent the receive VQ and the netmap rx ring from going
out of sync, considering that we never enqueue N slots, but
at most N-1. This change fixes a bug that causes the VQ
and the netmap ring to go out of sync after N-1 packets
have been received.

MFC after:	1 week
2020-06-03 17:42:17 +00:00
Vincenzo Maffione
06f6997eb5 netmap: vale: fix disabled logs
MFC after:	1 week
2020-06-03 05:49:19 +00:00
Vincenzo Maffione
81d2cade1c netmap: vtnet: remove leftover memory barriers
MFC after:	1 week
2020-06-03 05:48:42 +00:00
Vincenzo Maffione
f0d8d352c0 netmap: vtnet: call netmap_rx_irq() under VQ lock
The netmap_rx_irq() function normally wakes up user-space threads
waiting for more packets. In this case, it is not necessary to
call it under the driver queue lock. However, if the interface is
attached to a VALE switch, netmap_rx_irq() ends up calling rxsync
on the interface (see netmap_bwrap_intr_notify()). Although
concurrent rxsyncs are serialized through the kring lock
(see nm_kr_tryget()), the lock acquire operation is not blocking.
As a result, it may happen that netmap_rx_irq() is called on
an RX ring while another instance is running, causing the
second call to fail, and received packets stall in the receive VQ.
We fix this issue by calling netmap_irx_irq() under the VQ lock.

MFC after:	1 week
2020-06-03 05:27:29 +00:00
Vincenzo Maffione
1b89d00bd4 netmap: vtnet: honor NM_IRQ_RESCHED
The netmap_rx_irq() function may return NM_IRQ_RESCHED to inform the
driver that more work is pending, and that netmap expects netmap_rx_irq()
to be called again as soon as possible.
This change implements this behaviour in the vtnet driver.

MFC after:	1 week
2020-06-03 05:09:33 +00:00
Adrian Chadd
6bc40d8d83 [run] note that PHY_HT is for mixed mode.
Submitted by:	Ashish Gupta <ashishgu@andrew.cmu.edu>
Differential Revision:	https://reviews.freebsd.org/D25108
2020-06-02 22:37:53 +00:00
Adrian Chadd
bb7234be77 [run] Set the number of HT chains.
* Set the tx/rx chains based on the existing MIMO eeprom reads
* Add 3-chain rates

Tested:

* MAC/BBP RT5390 (rev 0x0502), RF RT5370 (MIMO 1T1R), 2g/5g STA
* MAC/BBP RT3593 (rev 0x0402), RF RT3053 (MIMO 3T3R), 2g/5g STA
2020-06-02 22:36:17 +00:00
Adrian Chadd
8b05d37a76 [run] Add 11NA flags for 5G NICs that support HT.
Now that I'm a proud owner of an ASUS USB-N66, I can test 2G/5G and
3-stream configurations.

For now, just flip on 5G HT rates.  I've tested this in both
5G HT20 and 5G 11a modes.  It's still one stream for now until
we verify that the number of streams reported (ie the MIMO below)
is actually the number of 11n streams, NOT the number of antennas.
(They don't have to match! You can have more antennas than MIMO
streams!)

Tested:

* run0: MAC/BBP RT3593 (rev 0x0402), RF RT3053 (MIMO 3T3R)
2020-06-02 16:40:58 +00:00
Jason A. Harmening
ef1eabca5d vt(4): reset scrollback and cursor position after clearing history buffer
r361601 implemented basic support for cleaing the console history buffer.
But after clearing the history buffer, it's not especially useful to be
able to scroll back through that buffer, or for the cursor position to
remain at (very likely) the bottom of the screen.

PR:		224436
Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D25079
2020-06-02 01:21:48 +00:00
Vladimir Kondratyev
ec45be6c36 [psm] Workaround active PS/2 multiplexor hang
which happens on some laptops after returning to legacy multiplexing mode
at initialization stage.

PR:		242542
Reported by:	Felix Palmen <felix@palmen-it.de>
MFC after:	1 week
2020-06-02 01:04:49 +00:00
Vladimir Kondratyev
8137fb2e38 [psm] Do not disable trackpoint when hw.psm.elantech.touchpad_off is enabled
PR:		246117
Reported by:	Alexander Sieg <ports@xanderio.de>
MFC after:	1 week
2020-06-02 00:53:39 +00:00
Vincenzo Maffione
9ec71596c0 netmap: if_vtnet: avoid netmap ring wraparound
netmap assumes the one "slot" is left unused to distinguish
the empty ring and full ring conditions. This assumption was
violated by vtnet_netmap_rxq_populate().

MFC after:	1 week
2020-06-01 16:14:29 +00:00
Vincenzo Maffione
36f2d67026 netmap: if_vtnet: replace vtnet_free_used()
The functionality contained in this function is duplicated,
as it is already available in vtnet_txq_free_mbufs()
and vtnet_rxq_free_mbufs().

MFC after:	1 week
2020-06-01 16:12:09 +00:00
Vincenzo Maffione
c9de157d36 netmap: vtnet: fix RX virtqueue initialization bug
The vtnet_netmap_rxq_populate() function erroneously assumed
that kring->nr_hwcur = 0, i.e. the kring was in the initial
state. However, this is not always the case: for example,
when a vtnet reinit is triggered by some changes in the
interface flags or capenable.
This patch changes the behaviour of vtnet_netmap_kring_refill()
so that it always starts publishing the netmap buffers starting
from the current value of kring->nr_hwcur.

MFC after:	1 week
2020-06-01 16:10:44 +00:00
Adrian Chadd
f6287cc63c [ath] Don't re-program the beacon timers if we miss a beacon in software-beacon STA mode.
This is something I added a few years ago to handle resyncing the beacon if
we miss a beacon or need to sync after association/reassociation/powersave.

However, if we're doing STA+AP mode (eg DWDS) then we don't want
to reprogram the beacons here; this may upset normal AP operation.
I missed checking for the sc->sc_swbmiss flag so I was reinitialising
the beacon timers after every beacon miss / TSFOOR option, and
that isn't likely good.

This plus ensuring that STA's are created with "-beacon" to disable
BMISS/TSFOOR processing will hopefully quieten some of the issues
I've seen with missed beacons / TSFOOR (out of range) interrupts
coming in when operating in STA mode.

Tested:

* AR9380/AR9580, STA+AP modes
2020-06-01 06:10:25 +00:00
Adrian Chadd
c775d4ac42 [run] Don't add 11ng channels (2GHz) for RF2020
Don't also add the 11ng channels if we're not in 11n mode or net80211 will
get super weird.
2020-05-30 00:07:42 +00:00
Adrian Chadd
700af579c5 [run] Set ampdu rxmax same as linux; RF2020 isn't an 11n NIC
This is from the linux driver:

* set the ampdu rx max to 32k for 1 stream devics like mine, and
  64k for larger ones
* Don't enable 11n bits for RF2020
2020-05-30 00:06:26 +00:00
Adrian Chadd
f520d76129 [run] Add initial 802.11n support.
* Enable self-generated 11n frames
* add MCS rates for 1-stream and 2-stream rates; will do 3-stream
  once the rest of this tests out OK with other people.
* Hard-code 1 stream for now
* Add A-MPDU RX mbuf tagging
* RTS/CTS if doing RTSCTS in HT protmode as well as legacy; they're
  separate configuration flags
* Update the amrr rate index stuff - walk the rates array like others
  to find the right one - this now works for MCS and CCK/OFDM rates
* Add support for atheros fast frames/AMSDU support as we can generate
  those in net80211.

TODO:

* HT40 isn't enabled yet
* No A-MPDU support just yet; that requires some more firmware research
  and maybe porting some ath(4) A-MPDU support/tracking into net80211
* Short preamble flags aren't set yet for MCS; need to check the linux
  driver and see what's going on there
* Add 3x3 rates and set tx/rx stream configuration appropriately
* More 5GHz testing; I have a 3x3 dual band USB NIC coming soon that'll
  let me test this.
* Figure out why the RX path isn't performing as fast as it could -
  there's only a single buffer loaded at a time for the receive path
  in the USB bulk handler and this may not be super useful.

Tested:

* RT5390 usb, 1x1, RF5370 (2GHz radio), STA mode - A-MSDU TX, A-MPDU RX

Submitted by:	Ashish Gupta <ashishgu@andrew.cmu.edu>
Differential Revision:	https://reviews.freebsd.org/D22840
2020-05-29 15:56:44 +00:00
Andriy Gapon
cffd37da23 do not enable pci bridge decoding on resume until I/O windows are restored
PCI bus driver restores most but not all of a child PCI-PCI bridge
configuration.  The bridge's I/O windows are restored by pcib driver and
that happens later in time.  This can be problematic because the Command
register is restored before the windows are restored.  If the firmware
programs the windows incorrectly or even does not program them at all,
then the bridge can start claiming I/O cycles that are not intended for
it.  This will continue until the correct windows are restored.

I have observed this problem with a buggy BIOS where after resuming from
S3 an I/O port window of a PCI-PCI bridge was configured with zero base
and limit causing the bridge to claim 0x0 - 0xFFF port range.  That
interfered with ACPI port access including ACPI PM Timer at port 0x808,
thus wreaking havoc in the time keeping.

The solution is to restore the Command register of PCI-PCI bridges after
the windows are restored in pcib driver.  While here, I decided that for
other PCI device types (normal and cardbus) it's better to restore the
Command register after their BARs are restored.

To do: per jhb's suggestion, move the window handling to pci driver.

Reviewed by:	imp, jhb, kib
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D25028
2020-05-29 07:50:55 +00:00
Jason A. Harmening
98f7cf022c vt(4): Add support for `vidcontrol -C'
Extract scrollback buffer initialization into a common routine, used both
during vt(4) init and in handling the CONS_CLRHIST ioctl.

PR:		224436
Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D24815
2020-05-28 21:22:30 +00:00
Hans Petter Selasky
506a911bad Implement helper function, usbd_get_max_frame_length(), which allows kernel
device drivers to correctly predict the default USB transfer frame length.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2020-05-28 08:38:25 +00:00
Roger Pau Monné
06592d60f0 xen/control: short circuit xctrl_on_watch_event on spurious event
If there's no data to read from xenstore short-circuit
xctrl_on_watch_event to return early, there's no reason to continue
since the lack of data would prevent matching against any known event
type.

Sponsored by:	Citrix Systems R&D
MFC with:	r352925
MFC after:	1 week
2020-05-28 08:20:16 +00:00
Roger Pau Monné
3e7df58df2 xen/blkfront: use the correct type for disk sectors
The correct type to use to represent disk sectors is blkif_sector_t
(which is an uint64_t underneath). This avoid truncation of the disk
size calculation when resizing on i386, as otherwise the calculation
of d_mediasize in xbd_connect is truncated to the size of unsigned
long, which is 32bits on i386.

Note this issue didn't affect amd64, because the size of unsigned long
is 64bits there.

Sponsored by:	Citrix Systems R&D
MFC after:	1 week
2020-05-28 08:19:13 +00:00
Hans Petter Selasky
5e0552018c Don't allow USB device drivers to parent own interface.
It will prevent proper USB device detach.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2020-05-28 08:05:46 +00:00
Adrian Chadd
9f716a645d [ath] Update ath_rate_sample to use the same base type as ticks.
Until net80211 grows a specific ticks type that matches the system,
manually use the same type as the kernel/net80211 'ticks' type
(signed int.)

Tested:

* AR9380, STA mode
2020-05-27 22:48:34 +00:00
Eric Joyner
71d104536b ice(4): Introduce new driver for Intel E800 Ethernet controllers
The ice(4) driver is the driver for the Intel E8xx series Ethernet
controllers; currently with codenames Columbiaville and
Columbia Park.

These new controllers support 100G speeds, as well as introducing
more queues, better virtualization support, and more offload
capabilities. Future work will enable virtual functions (like
in ixl(4)) and the other functionality outlined above.

For full functionality, the kernel should be compiled with
"device ice_ddp" like in the amd64 NOTES file, and/or
ice_ddp_load="YES" should be added to /boot/loader.conf so that
the DDP package file included in this commit can be downloaded
to the adapter. Otherwise, the adapter will fall back to a single
queue mode with limited functionality.

A man page for this driver will be forthcoming.

MFC after:	1 month
Relnotes:	yes
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D21959
2020-05-26 23:35:10 +00:00
Ruslan Bukin
43843cc281 Rename dmar_get_dma_tag() to acpi_iommu_get_dma_tag().
This is needed for a new IOMMU controller support.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D24943
2020-05-26 16:40:40 +00:00
Marcin Wojtas
2287afd818 Update ENA driver version to v2.2.0
Driver version upgrade is connected with support for the new device
fetures, like Tx drops reporting or disabling meta caching.

Moreover, the driver configuration from the sysctl was reworked to
provide safer and better flow for configuring:
* number of IO queues (new feature),
* drbr size on Tx,
* Rx queue size.

Moreover, a lot of minor bug fixes and improvements were added.

Copyright date in the license of the modified files in this release was
updated to 2020.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
2020-05-26 16:11:46 +00:00
Marcin Wojtas
7381a86f47 Refactor ena_tx_map_mbuf() function
There is no guarantee from bus_dmamap_load_mbuf_sg() for matching
mbuf chain segments to dma physical segments.

This patch ensure correctly mapping to LLQ header and DMA segments.

Submitted by: Ido Segev <idose@amazon.com>
Obtained from: Amazon, Inc.
2020-05-26 16:05:42 +00:00
Marcin Wojtas
9bf7da9517 Fix double-free bug within ena_detach()
There is ena_free_all_io_rings_resources() called twice on device
detach:

ena_detach():

ena_destroy_device():
/* First call */
ena_free_all_io_rings_resources()

/* Second call */
ena_free_all_io_rings_resources()

The double-free causes panic() on kldunload, for example.

As the ena_destroy_device() is also called by ena_reset_task() it is
better to stay unchanged. Thus, remove the "Second call" of the function.

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 16:02:10 +00:00
Marcin Wojtas
0b432b702e Allow disabling meta caching for ENA Tx path
Determined by a flag passed from the device. No metadata is set within
ena_tx_csum when caching is disabled.

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 16:00:30 +00:00
Marcin Wojtas
9762a033da Create ENA IO queues with optional backoff
If requested size of IO queues is not supported try to decrease it until
finding the highest value that can be satisfied.

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:58:48 +00:00
Marcin Wojtas
56d41ad5fe Add sysctl node for ENA IO queues number adjustment
By default, in ena_attach() the driver attempts to acquire
ena_adapter::max_num_io_queues MSI-X vectors for the purpose of IO
queues, however this is not guaranteed. The number of vectors acquired
depends also on system resources availability.

Regardless of that, enable the number of effectively used IO queues to
be further limited through the sysctl node.

Example: Assumming that there are 8 IO queues configured by default, the
command

$ sysctl dev.ena.0.io_queues_nb=4

will reduce the number of available IO queues to 4. Similarly, the value
can be also increased up to maximum supported value. A value higher than
maximum supported number of IO queues is ignored. Zero is ignored too.

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:57:02 +00:00
Marcin Wojtas
e2735b095b Fix assumptions about number of IO queues in the ENA
Make the ena_adapter::num_io_queues a number of effectively used IO
queues. While the ena_adapter::max_num_io_queues is an upper-bound
specified by the HW, the ena_adapter::num_io_queues may be lower than
that, depending on runtime system resources availability.

On reset, there are called ena_destroy_device() and then
ena_restore_device(). The latter calls, in turn, ena_enable_msix(),
which will attempt to re-acquire ena_adapter::max_num_io_queues of
MSIX vectors again.

Thus, the value of ena_adapter::num_io_queues may be different before
and after reset. For this reason, free the IO rings structures (drbr,
counters) in ena_destroy_device() and allocate again in
ena_restore_device().

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:54:32 +00:00
Marcin Wojtas
2182354622 Rework ENA Tx buffer ring size reconfiguration
This method has been aligned with the way how the Rx queue size is being
updated - so it's now done synchronously instead of resetting the
device.

Moreover, the input parameter is now being validated if it's a power of
2. Without this, it can cause kernel panic.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:50:30 +00:00
Marcin Wojtas
7d8c4fee95 Rework ENA Rx queue size configuration
This patch reworks how the Rx queue size is being reconfigured and how
the information from the device is being processed.

Reconfiguration of the queues and reset of the device in order to make
the changes alive isn't the best approach. It can be done synchronously
and it will let to pass information if the reconfiguration was
successful to the user. It now is done in the ena_update_queue_size()
function.

To avoid reallocation of the ring buffer, statistic counters and the
reinitialization of the mutexes when only new size has to be assigned,
the io queues initialization function has been split into 2 stages:
basic, which is just copying appropriate fields and the advanced, which
allocates and inits more advanced structures for the IO rings.

Moreover, now the max allowed Rx and Tx ring size is being kept
statically in the adapter and the size of the variables holding those
values has been changed to uint32_t everywhere.

Information about IO queues size is now being logged in the up routine
instead of the attach.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:48:06 +00:00
Marcin Wojtas
92dc69a727 Mark the ENA driver as epoch ready
Recent changes to the epoch requires driver to notify that they knows
epoch in order to prevent input packet function to enter epoch each
time the packet is received.

ENA is using NET_TASK for handling Rx, so it's entering epoch
automatically whenever this task is being executed.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:45:54 +00:00
Marcin Wojtas
579d23aa96 Improve indentation in ena_up() and ena_down()
If the conditional check for ENA_FLAG_DEV_UP is negated, the body of the
function can have smaller indentation and it makes the code cleaner.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:44:08 +00:00
Marcin Wojtas
02a2a7cea4 Expose argument names for non static ENA driver functions
As functions which are declared in the header files are intended to be
the interface and are going to be used by other files, it's better to
include argument names in the definition, so the caller won't have to
check the .c file in order to check their meaning and order.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:41:53 +00:00
Marcin Wojtas
6959869eae Use single global lock in the ENA driver
Currently, the driver had 2 global locks - one was sx lock used for
up/down synchronization and the second one was mutex, which was used
for link configuration and timer service callout.

It is better to have single lock for that. We cannot use mutex, as it
can sleep and cause witness errors in up/down configuration, so sx lock
seems to be the only choice.

Callout cannot use sx lock, but the timer service is MP safe, so we just
need to avoid race between ena_down() and ena_detach(). It can be
avoided by acquiring sx lock.

Simple macros were added that are encapsulating implementation of the
lock and makes the code cleaner.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:39:41 +00:00