freebsd-dev

Author	SHA1	Message	Date
Hans Petter Selasky	94790180f3	Don't save PCI state when PCI error is detected in mlx5core. When a PCI error is detected the PCI state could be corrupt, don't save it in that flow. Save the state after initialization. After restoring the PCI state during slot reset save it again, restoring the state destroys the previously saved state info. Submitted by: slavash@ MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-23 18:34:35 +00:00
Hans Petter Selasky	f20b553d75	Add mutual exclusion mechanism for software reset of firmware in mlx5core. Since the FW can be shared between PCI functions it is common that more than one health poll will detected a failure, this can lead to multiple resets. The solution is to use a FW locking mechanism using semaphore space to provide a way to synchronize between functions. The FW semaphore is acquired via config cycle access. First the VSEC gateway must be acquired, then the semaphore can be locked by writing a value to it and confirmed it's locked by reading the same value back. The process in the same to free the semaphore, except the value written should be zero. Submitted by: slavash@ MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-23 18:32:03 +00:00
Hans Petter Selasky	fe242ba7c1	Issue a software reset on firmware assert in mlx5core. If a FW assert is considered fatal, indicated by a new bit in the health buffer, reset the FW. After the reset, follow the normal recovery flow. Submitted by: slavash@ MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-23 18:24:09 +00:00
Hans Petter Selasky	1900b6f887	Handle software reset of firmware in error flow in mlx5core. Some mlx5 adapter firmware allows the driver to reset the firmware in the event of an error. When a software reset is issued on any physical function all PFs enter reset state. This is a recoverable condition. The existing recovery flow was designed to allow the recovery of a VF after a PF driver reload. This patch expands the scope of that flow to recover PFs or VFs after a SW reset has been issued. When a software reset is issued the following occurs: 1. The NIC interface mode is set to SW_RESET (7) while the reset is in progress. 2. Once the reset completes the NIC interface mode is set to NIC disabled (1). After the reset has been issued (added in a subsequent patch) the health poll for other functions will detect that the NIC interface state has been set to disabled. This will cause it to enter the existing recovery flow. If the PCI is still working (meaning it doesn't return 0xff on all reads) it means recovery can proceed immediately instead of waiting 60 seconds. The error detetion has also been refactored to avoid incorrect or misleading log messages. Submitted by: slavash@ MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-23 18:20:42 +00:00
Hans Petter Selasky	1fb6089c3b	Hide verbose proclamation of error when forced in mlx5core. When mlx5_enter_error_state() operation is forced by shutdown, the messages surrounding setting the error state are not informational and confuse users. Submitted by: kib@ MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-23 18:11:06 +00:00
Hans Petter Selasky	519774ea5a	Cancel delayed recovery work when unloading the mlx5core driver. linux commit 2a0165a034ac024b60cca49c61e46f4afa2e4d98 Submitted by: Matthew Finlay <matt@mellanox.com> MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-23 18:09:09 +00:00
Hans Petter Selasky	c09025693b	Add support for fast unload in shutdown flow in mlx5core. This patch accumulates the following Linux commits: - 8812c24d28f4972c4f2b9998bf30b1f2a1b62adf net/mlx5: Add fast unload support in shutdown flow - 59211bd3b6329c3e5f4a90ac3d7f87ffa7867073 net/mlx5: Split the load/unload flow into hardware and software flows - 4525abeaae54560254a1bb8970b3d4c225d32ef4 net/mlx5: Expose command polling interface Submitted by: Matthew Finlay <matt@mellanox.com> MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-23 18:02:20 +00:00
Hans Petter Selasky	4bb7662b09	Improve support for health recovery in mlx5core. This patch accumulates the following Linux commits: - 04c0c1ab38e95105d950db5b84e727637e149ce7 net/mlx5: PCI error recovery health care simulation - 0179720d6be2096b8d0a4d143254ff9e77747daa net/mlx5: Introduce trigger_health_work function - 3fece5d676939f42f434c63dfe1bd42d7d94e6f0 net/mlx5: Continue health polling until it is explicitly stopped Submitted by: Matthew Finlay <matt@mellanox.com> MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-23 17:33:14 +00:00
Hans Petter Selasky	177781564f	Create designated workqueue for each mlx5en(4) device instance. The mlx5e_destroy_ifp() function may be called from the system workqueue and in this case trying to flush all works will cause a dead lock. Instead of using the system workqueue, create a designated workqueue for each mlx5en(4) device instance. Submitted by: slavash@ MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-23 16:59:51 +00:00
Warner Losh	b0a5c98898	Convert the PCI ID selection from a simple if into a table. Mark the table with PNP info. Fix compilation by returning FILTER_STRAY in two places, as suggested by comments. Create a simple module from this. Left unconnected because I can't test it as a module.	2018-03-23 15:35:19 +00:00
Warner Losh	f0df5e27ce	Add PNP info to xl as an example.	2018-03-23 15:35:15 +00:00
Warner Losh	9fbcec7d02	kill traling white space	2018-03-23 15:35:07 +00:00
Kenneth D. Merry	8881681b24	Disable T10 Protection Information / EEDP handling for type 2 protection. The mps(4) and mpr(4) drivers and hardware handle T10 Protection Information, which is a system of checksums and guard blocks to protect data while it is being transferred and while it is on disk. It is also known as T10 DIF. For more details, see section 4.22 of the SBC-4 spec. Supporting Type 2 protection requires using 32 byte CDBs, and filling in the fields in those CDBs. We don't yet support that in the da(4) driver. Type 1 and Type 3 protection don't require that, and can be handled by the mps(4)/mpr(4) driver's code and firmware without any additional input from the da(4) driver. If a drive has Type 2 protection enabled (you frequently see this with SAS drives shipped from Dell), don't set the various EEDP fields in the mps(4)/mpr(4) driver command fields. Otherwise, you wind up with errors like this that would otherwise make no sense: (da9:mpr0:0:18:0): READ(10). CDB: 28 00 00 00 00 00 00 02 00 00 (da9:mpr0:0:18:0): CAM status: SCSI Status Error (da9:mpr0:0:18:0): SCSI status: Check Condition (da9:mpr0:0:18:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code) (da9:mpr0:0:18:0): (da9:mpr0:0:18:0): Field Replaceable Unit: 0 (da9:mpr0:0:18:0): Command Specific Info: 0 (da9:mpr0:0:18:0): (da9:mpr0:0:18:0): Descriptor 0x80: f8 21 (da9:mpr0:0:18:0): Descriptor 0x81: 00 00 00 00 00 00 (da9:mpr0:0:18:0): Error 22, Unretryable error In other words, what kind of strange SAS hard drive doesn't support a standard 10 byte SCSI READ command? In this case, one that has Type 2 protection enabled. We can revisit this when we put Type 2 protection support in the da(4) driver, but for now this will help people who put Type 2 formatted drives in a system and wonder what in the world is going on. MFC after: 3 days Sponsored by: Spectra Logic	2018-03-23 13:52:26 +00:00
Andrew Turner	e5812b6d38	If sc->sc_ep_max is already set use it to find the number of RX and TX endpoints. The Allwinner driver will need to set this as the EPINFO register isn't useful there. Submitted by: jmcneill Reviewed by: hselasky Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D5881	2018-03-23 11:08:59 +00:00
Kyle Evans	50da29d25b	efidev: Drop a quick note in about efi_cfgtbl/efi_runtime There's no real annotation for it, so it's not immediately obvious to the unfamiliar that these pointers are to locations in the EFI runtime map unlike the system table pointer immediately above them.	2018-03-23 02:45:09 +00:00
Landon J. Fuller	397b9e40e9	Add missing NULL checks when calling malloc(M_NOWAIT) in bhnd_nv_strdup/bhnd_nv_strndup. If malloc(9) failed during initial bhnd(4) attach, while allocating the root NVRAM path string ("/"), the returned NULL pointer would be passed as the destination to memcpy(). Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>	2018-03-22 22:13:46 +00:00
Kyle Evans	ad456dd9fa	Re-work efidev ordering to fix efirt preloaded by loader on amd64 On amd64, efi_enter calls fpu_kern_enter(). This may not be called until fpuinitstate has been invoked, resulting in a kernel panic with efirt_load="YES" in loader.conf(5). Move fpuinitstate a little earlier in SI_SUB_DRIVERS so that we can squeeze efirt between it and efirtc at SI_SUB_DRIVERS, SI_ORDER_ANY. efidev must be after efirt and doesn't really need to be at SI_SUB_DEVFS, so drop it at SI_SUB_DRIVER, SI_ORDER_ANY. The not immediately obvious dependency of fpuinitstate by efirt has been noted in both places. Discussed with: kib, andrew Reported by: Jakob Alvermark <jakob@alvermark.net> X-MFC-With: r330868	2018-03-22 18:24:00 +00:00
Andrew Turner	112b88e391	Enter into the EFI environment before dereferencing the runtime services pointer. This may be within the EFI address space and not the FreeBSD kernel address space. X-MFC-With: r330868 Sponsored by: DARPA, AFRL	2018-03-22 15:32:57 +00:00
Andrew Turner	c5149a4979	Increase the size of the endpoint buffers. They are double buffered so need to be twice the size. Sponsored by: DARPA, AFRL	2018-03-22 15:24:26 +00:00
Warner Losh	e7932420b0	Revert r331298 Normally, shutdown_nice() just signals init. However, sometimes it calls kern_reboot directly. For that case, r331298 dropped the Giant lock before calling it. This turns out to be incorrect for the more common case where init exists and we just signal it. Restore the old behavior. The direct call to kern_reboot() doesn't sync buffers to the disk, so should work with Giant held, so we don't need to drop locks here for that. Noticed by: bde@ Sponsored by: Netflix	2018-03-22 15:11:53 +00:00
Jonathan T. Looney	2529f56ed3	Add the "TCP Blackbox Recorder" which we discussed at the developer summits at BSDCan and BSDCam in 2017. The TCP Blackbox Recorder allows you to capture events on a TCP connection in a ring buffer. It stores metadata with the event. It optionally stores the TCP header associated with an event (if the event is associated with a packet) and also optionally stores information on the sockets. It supports setting a log ID on a TCP connection and using this to correlate multiple connections that share a common log ID. You can log connections in different modes. If you are doing a coordinated test with a particular connection, you may tell the system to put it in mode 4 (continuous dump). Or, if you just want to monitor for errors, you can put it in mode 1 (ring buffer) and dump all the ring buffers associated with the connection ID when we receive an error signal for that connection ID. You can set a default mode that will be applied to a particular ratio of incoming connections. You can also manually set a mode using a socket option. This commit includes only basic probes. rrs@ has added quite an abundance of probes in his TCP development work. He plans to commit those soon. There are user-space programs which we plan to commit as ports. These read the data from the log device and output pcapng files, and then let you analyze the data (and metadata) in the pcapng files. Reviewed by: gnn (previous version) Obtained from: Netflix, Inc. Relnotes: yes Differential Revision: https://reviews.freebsd.org/D11085	2018-03-22 09:40:08 +00:00
Ravi Pokala	4754f6ad41	jedec_dimm: Use correct string length when populating sc->slotid_str Don't limit the copy to the size of the target string pointer (always 4 on 32-bit / 8 on 64-bit). Instead, just use strdup(). Reported by: Coverity CID: 1386912 Reviewed by: cem, imp MFC after: 1 week	2018-03-22 06:31:05 +00:00
Navdeep Parhar	1b4df78b42	cxgbe(4): Do not read MFG diags information from custom boards. MFC after: 1 week Sponsored by: Chelsio Communications	2018-03-22 04:42:29 +00:00
Navdeep Parhar	5401e09688	cxgbe(4): Tunnel congestion drops on a port should be cleared when the stats for that port are cleared. MFC after: 1 week Sponsored by: Chelsio Communications	2018-03-22 02:04:57 +00:00
Ed Maste	7976b9c5e0	Correct signedness bug in drm_modeset_ctl drm_modeset_ctl() takes a signed in from userland, does a boundscheck, and then uses it to index into a structure and write to it. The boundscheck only checks upper bound, and never checks for nagative values. If the int coming from userland is negative [after conversion] it will bypass the boundscheck, perform a negative index into an array and write to it, causing memory corruption. Note that this is in the "old" drm driver; this issue does not exist in drm2. Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com> Reviewed by: cem MFC after: 1 day Sponsored by: The FreeBSD Foundation	2018-03-22 01:00:55 +00:00
Ed Maste	16451ba2d7	Fix kernel memory disclosure in drm_infobufs drm_infobufs() has a structure on the stack, fills it out and copies it to userland. There are 2 elements in the struct that are not filled out and left uninitialized. This will leak uninitialized kernel stack data to userland. Submitted by: Domagoj Stolfa <ds815@cam.ac.uk> Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com> MFC after: 1 day Security: Kernel memory disclosure (798)	2018-03-21 23:51:14 +00:00
Stephen Hurd	7021bf0569	Update copyright per Matthew Macy "Under my tutelage Nicole did 85% of the work. At the time it seemed simplest for a number of reasons to put my copyright on it. I now consider that to have been a mistake." Submitted by: Matthew Macy <mmacy@mattmacy.io> Reviewed by: shurd Approved by: shurd Differential Revision: https://reviews.freebsd.org/D14766	2018-03-21 15:57:36 +00:00
Andrew Turner	d614c09a82	Use a table to find the endpoint configuration On the Allwinner SoCs we need to set a custom endpoint configuration. To allow for this use a table to store the configuration so the attachment can override it. Reviewed by: hselasky Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14783	2018-03-21 15:17:54 +00:00
Warner Losh	026fb270ca	Unlock giant when calling shutdown_nice()	2018-03-21 14:47:12 +00:00
Warner Losh	4e96c99bdf	Push down Giant one layer. In the days of yore, back when Penitums were the new kids on the block and F00F hacks were all the rage, one needed to take out Giant to do anything moderately complicated with the VM, mappings and such. So the pccard / cardbus code held Giant for the entire insertion or removal process. Today, the VM is MP safe. The lock is only needed for dealing with newbus things. Move locking and unlocking Giant to be only around adding and probing devices in pccard and cardbus.	2018-03-20 22:01:18 +00:00
Ed Maste	fc2a8776a2	Rename assym.s to assym.inc assym is only to be included by other .s files, and should never actually be assembled by itself. Reviewed by: imp, bdrewery (earlier) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D14180	2018-03-20 17:58:51 +00:00
Andrew Turner	ed4c884f2e	Check if the gettime runtime service is valid. The U-Boot efi runtime service expects us to set the address map before calling any runtime services. It will then remap a few functions to their runtime version. One of these is the gettime function. If we call into this without having set a runtime map we get a page fault. Add a check to see if this is valid in efi_init() so we don't try to use the possibly invalid pointer. Reviewed by: imp, kevans (both previous version) X-MFC-With: r330868 Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14759	2018-03-20 13:35:20 +00:00
Warner Losh	afdbfe1e1b	Starting LBA is a 64bit number, so use htole64 instead of htole32. The latter casts the LBA to a 32-bit number before assigning it to the 64 bit structure entity. This works fine on the first 2TB of TRIMs, but terrible beyond that due to trucation. Also, add an assert to make sure we don't end too many DSM TRIM entries in one request. Sponsored by: Netflix	2018-03-20 03:37:14 +00:00
Oleksandr Tymoshenko	108117cc22	[ofw] fix errneous checks for OF_finddevice(9) return value OF_finddevices returns ((phandle_t)-1) in case of failure. Some code in existing drivers checked return value to be equal to 0 or less/equal to 0 which is also wrong because phandle_t is unsigned type. Most of these checks were for negative cases that were never triggered so trhere was no impact on functionality. Reviewed by: nwhitehorn MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D14645	2018-03-20 00:03:49 +00:00
Alexander Motin	5f5baf0e96	Update mpr(4) driver from v15 to v18 from Broadcom site. Version 16 is just a number bump, since we already had those changes. Version 17 introduces new AdapterType value, that allows new user-space tools from Broadcom to differentiate adapter generations 3 and 3.5. Version 18 updates headers and adds SAS_DEVICE_DISCOVERY_ERROR reporting. MFC after: 2 weeks	2018-03-19 23:21:45 +00:00
Eric Joyner	7d48aa4c72	ixgbe(4): Update shared code, add support for X552 1G, fix bug This patch will: - Update ixgbe shared code - Add support for Intel(R) Ethernet Connection X552 1000BASE-T - Add error handling for link state check preventing VF from stopping traffic after changing PF's MTU value Submitted by: Krzysztof Galazka <krzysztof.galazka@intel.com> Reviewed by: Intel Networking Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D13885	2018-03-19 20:55:05 +00:00
Ian Lepore	d892051323	Add the device/chip type to the disk d_descr field, and print more info about the chip including the erase block size at attach time. Also add myself to the copyrights since at this point svn blame would point to me as the culprit for much of this.	2018-03-18 18:58:47 +00:00
Ian Lepore	3c9af13c75	Add support for 4K and 32K erase block sizes. Many of the supported chips have these flags set in the ident table, but there was no code to support using the smaller erase sizes.	2018-03-18 18:37:47 +00:00
Ian Lepore	c03ab159f6	Make all internal routines return an int error status, and check the status at all call points. Combine the get_status and wait_for_ready routines, since waiting for ready is the only reason to ever get status.	2018-03-18 17:47:57 +00:00
Ian Lepore	89a1585b8d	Add sc_parent to the softc and use it in place of device_get_parent() calls all over the place. Also pass the softc as the arg to all the internal functions instead of passing a device_t and calling device_get_softc() in each function.	2018-03-18 17:25:23 +00:00
Ian Lepore	89a895b63c	Bugfix: wait for writes/erases to complete after starting them, instead of before starting them. Using the wait-before logic would make sense if there was useful time- consuming work that could be done between the end of one write and the beginning of the next, but it also requires doing the wait-for-ready before reading, because a prior write or erase could still be in progress. Reading is the far more common case, so adding a whole extra bus transaction to check for ready before each read would soak up any small gains that might be had from doing async writes.	2018-03-18 16:52:31 +00:00
Ian Lepore	19aa9f7183	Eliminate some unneeded intermediate variables. Eliminate some redundant parens in shift-and-mask expressions. Reword and reflow some comments.	2018-03-18 16:36:14 +00:00
Ian Lepore	f432eb7ea1	Remove a pointless KASSERT and reword a comment a bit. The KASSERT tested for the same condition that the preceeding lines checked for and would have returned EIO, so the assert could never possibly trigger (sc_sectorsize must inherently be an integer multiple of FLASH_PAGE_SIZE).	2018-03-18 16:10:14 +00:00
Ian Lepore	dac94adb63	Do not overwrite the contents of BIO_WRITE buffers. SPI inherently transfers data in both directions at once. When writing to the device, use a dummy buffer for the incoming data, not the same buffer as the outgoing data. Writes are done in FLASH_PAGE_SIZE chunks, which is only 256 bytes, so just put the dummy buffer into the softc.	2018-03-18 15:56:10 +00:00
Conrad Meyer	db488e4f52	random(4): Poll for signals during large reads Occasionally poll for signals during large reads of the /dev/u?random devices. This allows cancellation via SIGINT of accidental invocations of very large reads. (A 2GB /dev/random read, which takes about 10 seconds on my 2017 AMD Zen processor, can be aborted.) I believe this behavior was intended since 2014 (r273997), just not fully implemented. This is motivated by a potential getrandom(2) interface that may not explicitly forbid extremely large reads on 64-bit platforms -- even larger than the 2GB limit imposed on devfs I/O by default. Such reads, if they are to be allowed, should be cancellable by the user or administrator. Reviewed by: delphij Approved by: secteam (delphij) Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D14684	2018-03-16 18:50:26 +00:00
Ian Lepore	9c45f7b4fd	Use EFI RTC capabilities info when registering, add bootverbose diagnostics. Make some small improvements to the efirtc driver by obtaining the clock capabilities (resolution and whether the sub-second counters are reset) and using the info when registering the clock. When the hardware zeroes out the subsecond info on clock-set, schedule clock updates to happen just before top-of-second, so that the RTC time is closely in-sync with kernel time. Also, in the identify() routine, always add the driver if EFI runtime services are available, then decide in probe() whether to attach the driver or not. If not attaching and bootverbose is on, say why. All of this is basically to avoid "silent failure" -- if someone thinks there should be an efi rtc and it's not attaching, at least they can set bootverbose and maybe get a clue from the output. Differential Revision: https://reviews.freebsd.org/D14565 (timed out)	2018-03-16 18:16:27 +00:00
Warner Losh	d85d964829	Try polling the qpairs on timeout. On some systems, we're getting timeouts when we use multiple queues on drives that work perfectly well on other systems. On a hunch, Jim Harris suggested I poll the completion queue when we get a timeout. This patch polls the completion queue if no fatal status was indicated. If it had pending I/O, we complete that request and return. Otherwise, if aborts are enabled and no fatal status, we abort the command and return. Otherwise we reset the card. This may clear up the problem, or we may see it result in lots of timeouts and a performance problem. Either way, we'll know the next step. We may also need to pay attention to the fatal status bit of the controller. PR: 211713 Suggested by: Jim Harris Sponsored by: Netflix	2018-03-16 05:23:48 +00:00
Andriy Voskoboinyk	5f792f7478	rtwn(4): de-hardcode ('h/w rate index' - 'corresponding MCS index') constant	2018-03-16 01:03:10 +00:00
Andriy Voskoboinyk	46e18fc6d4	urtw(4), zyd(4): reduce code verbosity. No functional change intended.	2018-03-16 00:38:10 +00:00
Andriy Voskoboinyk	2757acf673	urtw(4): provide names for some commonly used rate indices + drop now-unused urtw_rate2rtl()	2018-03-16 00:09:16 +00:00
Brooks Davis	064c9c3d42	Add a request structure and make the implementation use it. This allows compatibility translation to take place on the stack (md_ioctl is too big) and is more suitable as a public interface within the kernel than the kern_ioctl interface. Except for the initialization of the md_req from the md_ioctl (including detection of kernel md_file pointers) and the updating of the md_ioctl prior to return, this is a mechanical replacment of md_ioctl and mdio with md_req and mdr. Reviewed by: markj, cem, kib (assorted versions) Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14704	2018-03-15 21:42:49 +00:00
Brooks Davis	b65794ad84	Move implementation of ioctls into kern_*() functions. Move locks from outside ioctl to the individual implementations. This is the first step of changing the implementations to act on a kernel-internal request struct rather than on struct md_ioctl and to removing the use of kern_ioctl in mountroot. Reviewed by: cem, kib, markj (prior version) Obtained from: CheriBSD Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14700	2018-03-15 18:12:55 +00:00
Brooks Davis	94598ac9f9	Restore the behavior of returning the total number of units by unconditionally incrementing i in the loop; Reported by: cem MFC with: r330880 Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14685	2018-03-15 16:37:43 +00:00
Alexander Motin	db08ef4353	Increase ABOUT FIRMWARE command timeout to 5s. It seems default timeout of 100ms is not enough for my 2694L card, while it was perfectly fine for others, even for full-height 2694. MFC after: 1 week Sponsored by: iXsystems, Inc.	2018-03-15 01:07:21 +00:00
Jung-uk Kim	8438a7a80a	Merge ACPICA 20180313.	2018-03-14 23:45:48 +00:00
Jung-uk Kim	ba425ae46d	Remove local definitions for _STA method in favor of ACPICA. These macros were added in ACPICA 20051216, more than a decade ago.	2018-03-14 23:42:28 +00:00
Warner Losh	5d7fd8f726	Fix error messages in cut and pasted code. Also, fix an unnecessary deref to get ctrlr. Noticed by: rpokala@ Sponsored by: Netflix	2018-03-14 23:28:28 +00:00
Warner Losh	8b1e6ebe0e	When tearing down a queue pair, also delete the queue entries. The NVME standard has required in section 7.2.6, since at least 1.1, that a clean shutdown is signalled by deleting the subission and the completion queues before setting the shutdown bit in CC. The 1.0 standard, apparently, did not and many of the early Intel cards didn't care. Some newer cards care, at least one whose beta firmware can scramble the card on an unclean shutdown. Linux has done this for some time. To make it possible to move forward with an evaluation of this pre-release card with wonky firmware, delete the queues on the card when we delete the qpair structures. Sponsored by: Netflix	2018-03-14 23:01:18 +00:00
Warner Losh	d61cf64d0e	Don't make the namespace devices eternal. We'll need to delete namespaces soon, so go ahead and stop making these devices eternal. It doesn't help much, and will be getting in the way soon. Sponsored by: Netflix	2018-03-14 23:01:04 +00:00
Steven Hartland	7d147b81ea	Fix mps deadlock when handling panic During shutdown mps waits for its SSU requests to complete however when performing a reboot after handling a panic the scheduler is stopped so getmicrotime which is used can be non-functional. Switch to using the same method as shutdown_panic to ensure we actually complete. In addition reduce the timeout when RB_NOSYNC is set in howto as we expect this to fail. Reviewed by: slm MFC after: 1 week Sponsored by: Multiplay Differential Revision: https://reviews.freebsd.org/D12776	2018-03-14 21:32:23 +00:00
Brooks Davis	f287c3e4d3	Fix FSACTL_GET_NEXT_ADAPTER_FIB under 32-bit compat. This includes FSACTL_LNX_GET_NEXT_ADAPTER_FIB. Reviewed by: cem Obtained from: CheriBSD MFC after: 1 week Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14672	2018-03-14 21:11:41 +00:00
John Baldwin	6b02bb1aa7	Fix the check for an empty send socket buffer on a TOE TLS socket. Compare sbavail() with the cached sb_off of already-sent data instead of always comparing with zero. This will correctly close the connection and send the FIN if the socket buffer contains some previously-sent data but no unsent data. Reported by: Harsh Jain @ Chelsio Sponsored by: Chelsio Communications	2018-03-14 20:49:51 +00:00
John Baldwin	02d2bcfaba	Remove TLS-related inlines from t4_tom.h to fix iw_cxgbe(4) build. - Remove the one use of is_tls_offload() and the function. AIO special handling only needs to be disabled when a TOE socket is actively doing TLS offload on transmit. The TOE socket's mode (which affects receive operation) doesn't matter, so remove the check for the socket's mode and only check if a TOE socket has TLS transmit keys configured to determine if an AIO write request should fall back to the normal socket handling instead of the TOE fast path. - Move can_tls_offload() into t4_tls.c. It is not used in critical paths, so inlining isn't that important. Change return type to bool while here. Sponsored by: Chelsio Communications	2018-03-14 20:46:25 +00:00
Edward Tomasz Napierala	6960c4e135	Fix typo in a warning message. MFC after: 2 weeks	2018-03-14 18:27:06 +00:00
Warner Losh	807e94b2c3	Implement trim collapsing in nda When multiple trims are in the queue, collapse them as much as possible. At present, this usually results in only a few trims being collapsed together, but more work on that will make it possible to do hundreds (up to some configurable max). Sponsored by: Netflix	2018-03-14 16:44:50 +00:00
John Baldwin	1e9538d253	Support for TLS offload of TOE connections on T6 adapters. The TOE engine in Chelsio T6 adapters supports offloading of TLS encryption and TCP segmentation for offloaded connections. Sockets using TLS are required to use a set of custom socket options to upload RX and TX keys to the NIC and to enable RX processing. Currently these socket options are implemented as TCP options in the vendor specific range. A patched OpenSSL library will be made available in a port / package for use with the TLS TOE support. TOE sockets can either offload both transmit and reception of TLS records or just transmit. TLS offload (both RX and TX) is enabled by setting the dev.t6nex.<x>.tls sysctl to 1 and requires TOE to be enabled on the relevant interface. Transmit offload can be used on any "normal" or TLS TOE socket by using the custom socket option to program a transmit key. This permits most TOE sockets to transparently offload TLS when applications use a patched SSL library (e.g. using LD_LIBRARY_PATH to request use of a patched OpenSSL library). Receive offload can only be used with TOE sockets using the TLS mode. The dev.t6nex.0.toe.tls_rx_ports sysctl can be set to a list of TCP port numbers. Any connection with either a local or remote port number in that list will be created as a TLS socket rather than a plain TOE socket. Note that although this sysctl accepts an arbitrary list of port numbers, the sysctl(8) tool is only able to set sysctl nodes to a single value. A TLS socket will hang without receiving data if used by an application that is not using a patched SSL library. Thus, the tls_rx_ports node should be used with care. For a server mostly concerned with offloading TLS transmit, this node is not needed as plain TOE sockets will fall back to software crypto when using an unpatched SSL library. New per-interface statistics nodes are added giving counts of TLS packets and payload bytes (payload bytes do not include TLS headers or authentication tags/MACs) offloaded via the TOE engine, e.g.: dev.cc.0.stats.rx_tls_octets: 149 dev.cc.0.stats.rx_tls_records: 13 dev.cc.0.stats.tx_tls_octets: 26501823 dev.cc.0.stats.tx_tls_records: 1620 TLS transmit work requests are constructed by a new variant of t4_push_frames() called t4_push_tls_records() in tom/t4_tls.c. TLS transmit work requests require a buffer containing IVs. If the IVs are too large to fit into the work request, a separate buffer is allocated when constructing a work request. This buffer is associated with the transmit descriptor and freed when the descriptor is ACKed by the adapter. Received TLS frames use two new CPL messages. The first message is a CPL_TLS_DATA containing the decryped payload of a single TLS record. The handler places the mbuf containing the received payload on an mbufq in the TOE pcb. The second message is a CPL_RX_TLS_CMP message which includes a copy of the TLS header and indicates if there were any errors. The handler for this message places the TLS header into the socket buffer followed by the saved mbuf with the payload data. Both of these handlers are contained in tom/t4_tls.c. A few routines were exposed from t4_cpl_io.c for use by t4_tls.c including send_rx_credits(), a new send_rx_modulate(), and t4_close_conn(). TLS keys for both transmit and receive are stored in onboard memory in the NIC in the "TLS keys" memory region. In some cases a TLS socket can hang with pending data available in the NIC that is not delivered to the host. As a workaround, TLS sockets are more aggressive about sending CPL_RX_DATA_ACK messages anytime that any data is read from a TLS socket. In addition, a fallback timer will periodically send CPL_RX_DATA_ACK messages to the NIC for connections that are still in the handshake phase. Once the connection has finished the handshake and programmed RX keys via the socket option, the timer is stopped. A new function select_ulp_mode() is used to determine what sub-mode a given TOE socket should use (plain TOE, DDP, or TLS). The existing set_tcpddp_ulp_mode() function has been renamed to set_ulp_mode() and handles initialization of TLS-specific state when necessary in addition to DDP-specific state. Since TLS sockets do not receive individual TCP segments but always receive full TLS records, they can receive more data than is available in the current window (e.g. if a 16k TLS record is received but the socket buffer is itself 16k). To cope with this, just drop the window to 0 when this happens, but track the overage and "eat" the overage as it is read from the socket buffer not opening the window (or adding rx_credits) for the overage bytes. Reviewed by: np (earlier version) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D14529	2018-03-13 23:05:51 +00:00
John Baldwin	9689995d23	Simplify error handling in t4_tom.ko module loading. - Change t4_ddp_mod_load() to return void instead of always returning success. This avoids having to pretend to have proper support for unloading when only part of t4_tom_mod_load() has run. - If t4_register_uld() fails, don't invoke t4_tom_mod_unload() directly. The module handling code in the kernel invokes MOD_UNLOAD on a module whose MOD_LOAD fails with an error already. Reviewed by: np (part of a larger patch) MFC after: 1 month Sponsored by: Chelsio Communications	2018-03-13 21:42:38 +00:00
Brooks Davis	8b9f77a14c	Don't overflow the kernel struct mdio in the MDIOCLIST ioctl. Always terminate the list with -1 and document the ioctl behavior. This preserves existing behavior as seen from userspace with the addition of the unconditional termination which will not be seen by working consumers of MDIOCLIST. Because this ioctl can only be performed by root (in default configurations) and is not used in the base system this bug is not deemed to warrant either a security advisory or an eratta notice. Reviewed by: kib Obtained from: CheriBSD Discussed with: security-officer (gordon) MFC after: 3 days Security: kernel heap buffer overflow Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14685	2018-03-13 20:39:06 +00:00
Brooks Davis	8037cdcd9a	Fix ISP_FC_LIP and ISP_RESCAN on big-endian 64-bit systems. For _IO() ioctls, addr is a pointer to uap->data which is a caddr_t. When the caddr_t stores an int, dereferencing addr as an (int *) results in truncation on little-endian 64-bit systems and corruption (owing to extracting top bits) on big-endian 64-bit systems. In practice the value of chan was probably always zero on systems of the latter type as all such FreeBSD platforms use a register-based calling convention. Reviewed by: mav Obtained from: CheriBSD MFC after: 1 week Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14673	2018-03-13 19:56:10 +00:00
Kyle Evans	92f1731bf3	Correct minor typo in comment, efi_dmcap -> efi_tmcap	2018-03-13 15:02:46 +00:00
Kyle Evans	8521b4a9df	efirtc: Pass a dummy tmcap pointer to efi_get_time_locked As noted in the comment, UEFI spec claims the capabilities pointer is optional, but some implementations will choke and attempt to dereference it without checking. This specific problem was found on a Lenovo Thinkpad X220 that would panic in efirtc_identify.	2018-03-13 15:01:23 +00:00
Roger Pau Monné	c2272faa06	vt_vga: check if VGA is available from ACPI FADT table On x86 the IA-PC Boot Flags in the FADT can signal whether VGA is available or not. Sponsored by: Citrix systems R&D Reviewed by: marcel Differential revision: https://reviews.freebsd.org/D14397	2018-03-13 09:38:53 +00:00
Toomas Soome	a5b0fd9ca9	e1000g: this statement may fall through The gcc 7 does check for switch statement fall through cases, and if legit, such complaint can besilenced by /* FALLTHROUGH */ comment. Unfortunately such comment is quite limited, but will still notify the reader. This patch is backport from illumos, see https://www.illumos.org/rb/r/941/ Reviewed by: eadler Differential Revision: https://reviews.freebsd.org/D14663	2018-03-12 17:05:53 +00:00
Alexander Motin	01c1be35e0	Print fuses and fna fields in identify data. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2018-03-12 16:31:25 +00:00
Scott Long	cf6ea6f27a	Implement a sysctl to dump in-flight I/O state for debugging. The tool to parse it will be committed in a separate action. Sponsored by: Netflix	2018-03-12 05:02:22 +00:00
Alexander Motin	6b1a96b16b	Add new opcodes and statuses from NVMe 1.3a. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2018-03-11 06:30:09 +00:00
Alexander Motin	3fa5467a06	Add new identify data structures fields from NVMe 1.3a. Some of them are already supported by existing hardware, so reporting them `nvmecontrol identify` can be useful.	2018-03-11 05:09:02 +00:00
Emmanuel Vadot	82ce05526b	extres/regulators: Add sysctls for regulators For each regulators create an hw.regulator.<regname>. : uvolt: Current value always_on: 1 If the reg is always on boot_on: 1 If the reg is set at boot time enable_cnt: Number of consumer(s) enable_delay: Delay before enabling the regulator ramp_delay: The Ramp delay max_uamp: The maximum value of the regulator in uAmps min_uamp: The minimal value of the regulator in uAmps max_uvolt: The maximum value of the regulator in uVolts min_uvolt: The minimal value of the regulator in uVolts Reviewed by: ian Differential Revision: https://reviews.freebsd.org/D14578	2018-03-11 04:37:05 +00:00
Andriy Voskoboinyk	cf268a8302	otus(4): check mcast / mgt / ucast rates during Tx descriptor setup These parameters may be changed via ifconfig(8); by default, mgt / mcast rates are lowest possible and ucast rate is not set (matches previous configuration). While here, store some variables locally for better readability.	2018-03-11 00:38:08 +00:00
Andriy Voskoboinyk	8f1ef22fca	rtwn(4): reset Tx power values before calling get_txpower() for RTL8192C / RTL8188E (like it is done for other chipsets).	2018-03-10 23:47:03 +00:00
Andriy Voskoboinyk	63a55eab9b	usb/wlan/*: properly include "opt_wlan.h" into all drivers Without it driver cannot be loaded when wlan(4) module is built with 'options IEEE80211_DEBUG_REFCNT'.	2018-03-10 23:16:24 +00:00
Andriy Voskoboinyk	2e0d057248	run(4): drop few unused variables. Found by: Clang static analyzer	2018-03-10 22:52:39 +00:00
Edward Tomasz Napierala	8ff2372a2c	Check for duplicates when modifying an iSCSI session. Previously we did this check on open, but "iscsictl -M", or an iSCSI redirect received by iscsid(8) could end up with two sessions with the same target name and portal. MFC after: 2 weeks	2018-03-10 14:21:37 +00:00
Conrad Meyer	918622dbc0	mlx5(4): Remove redundant declaration of mlx5_enter_error_state Broken in r330644. Sponsored by: Dell EMC Isilon	2018-03-10 00:59:48 +00:00
Andriy Voskoboinyk	d1b671061b	net80211: wrap protection frame allocation into ieee80211_alloc_prot() Move copy-pasted code for RTS/CTS frame allocation into net80211. While here, add stat / debug message for allocation failures (copied from run(4)) + return error here in bwn(4). Reviewed by: adrian Differential Revision: https://reviews.freebsd.org/D14628	2018-03-09 11:33:56 +00:00
Emmanuel Vadot	d597cfd520	Fix build when option MMCCAM is defined.	2018-03-08 22:49:36 +00:00
Konstantin Belousov	23fda0ad66	Remove unused variable. Sponsored by: The FreeBSD Foundation	2018-03-08 22:04:54 +00:00
Konstantin Belousov	2cec152827	Make mlx5 compilable on ILP32 arches. Sponsored by: Mellanox Technologies MFC after: 1 week	2018-03-08 22:03:43 +00:00
Ed Maste	a1ea91a21f	bktr: correct Japan IF frequency PR: 36451 Submitted by: Hijiri Umemoto <hijiri at umemoto.org> MFC after: 2 weeks	2018-03-08 19:24:10 +00:00
Ed Maste	95320b99a5	asmc: update temperature sensor name/description PR: 225911 Submitted by: Trev <fbsdbugs4 at sentry.org> MFC after: 1 week	2018-03-08 18:52:47 +00:00
Andriy Voskoboinyk	222a2f1de9	iwi(4): factor out rateset setup into iwi_set_rateset(). No functional change intended.	2018-03-08 18:42:23 +00:00
Hans Petter Selasky	50e3ba10e8	Set correct SL in completion for RoCE in mlx5ib(4). There is a difference when parsing a completion entry between Ethernet and IB ports. When link layer is Ethernet the bits describe the type of L3 header in the packet. In the case when link layer is Ethernet and VLAN header is present the value of SL is equal to the 3 UP bits in the VLAN header. If VLAN header is not present then the SL is undefined and consumer of the completion should check if IB_WC_WITH_VLAN is set. While that, this patch also fills the vlan_id field in the completion if present. linux commit 12f8fedef2ec94c783f929126b20440a01512c14 MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-08 16:27:31 +00:00
Hans Petter Selasky	0285276c92	Add call to setup firmware data dump structure during device load in mlx5core. Do not consider the inability to create a firmware dump fatal, but inform about the situation and allow the driver to attach. The device might not implement the needed VSC, or we might not know the layout of the registers map. In either case, only firmware dump functionality is limited, the network operations should be fine. Submitted by: kib@ MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-08 16:19:01 +00:00
Hans Petter Selasky	9cb83c4689	Avoid more LFENCE/SFENCe on x86 in mlx5en(4), by using the FreeBSD native fences. Submitted by: kib@ MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-08 15:58:30 +00:00
Hans Petter Selasky	7d69d339ad	Fix mlx5en(4) driver to properly call m_defrag(). When the mlx5en(4) driver was converted to using BUSDMA(9) the call to m_defrag() was moved after the part of the TX routine that strips the header from the mbuf chain. Before it called m_defrag it first trimmed off the now-empty mbufs from the start of the chain. This has the side effect of also removing the head of the chain that has M_PKTHDR set. m_defrag() will not defrag a chain that does not have M_PKTHDR set, thus it was effectively never defragging the mbuf chains. As it turns out, trimming the mbufs in this fashion is unnecessary since the call to bus_dmamap_load_mbuf_sg doesn't map empty mbufs anyway, so remove it. Differential Revision: https://reviews.freebsd.org/D12050 Submitted by: mjoras@ MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-08 15:53:04 +00:00
Hans Petter Selasky	1eb09ad434	Use vport rather than physical-port MTU in mlx5en(4). Set and report vport MTU rather than physical MTU, The driver will set both vport and physical port mtu and will rely on the query of vport mtu. SRIOV VFs have to report their MTU to their vport manager (PF), and this will allow them to work with any MTU they need without failing the request. Also for some cases where the PF is not a port owner, PF can work with MTU less than the physical port mtu if set physical port mtu didn't take effect. Based on Linux upstream commit: cd255efff9baadd654d6160e52d17ae7c568c9d3 Submitted by: Meny Yossefi <menyy@mellanox.com> MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-08 15:47:17 +00:00
Hans Petter Selasky	7c22ae80d2	Use the device unit number for naming the ifnet interface in mlx5en(4). Currently the ifnet interface is named mceX, where X is a monotonically incremented value. If the device is reset due to a fatal error, then the interface name will change. Using the device unit number will keep the naming consistent across the reset logic. Submitted by: Matthew Finlay <matt@mellanox.com> MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-08 15:43:41 +00:00
Hans Petter Selasky	1a872967ad	Remove duplicate prototypes. MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-08 15:37:09 +00:00
Hans Petter Selasky	e808190a59	Add kernel and userspace code to dump the firmware state of supported ConnectX-4/5 devices in mlx5core. The dump is obtained by reading a predefined register map from the non-destructive crspace, accessible by the vendor-specific PCIe capability (VSC). The dump is stored in preallocated kernel memory and managed by the mlx5tool(8), which communicates with the driver using a character device node. The utility allows to store the dump in format <address> <value> into a file, to reset the dump content, and to manually initiate the dump. A call to mlx5_fwdump() should be added at the places where a dump must be fetched automatically. The most likely place is right before a firmware reset request. Submitted by: kib@ MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-08 15:21:56 +00:00
Hans Petter Selasky	4b95c6659a	Add vendor specific capability interface support in mlx5core. Add the ability to access the vendor specific space gateway in order to support reading and writing data into the different configuration domains. Submitted by: Matthew Finlay <matt@mellanox.com> MFC after: 1 week Sponsored by: Mellanox Technologies	2018-03-08 11:59:47 +00:00

1 2 3 4 5 ...

34838 Commits