freebsd-skq

Author	SHA1	Message	Date
dab	490b5e0b9f	asmc: Add support for mid-2011 Macmini 5,2 PR: 225911 Submitted by: trev <fbsdbugs4@sentry.org> Reported by: trev <fbsdbugs4@sentry.org> MFC after: 1 week	2018-12-17 17:21:45 +00:00
avg	260cc02954	add support for marking interrupt handlers as suspended The goal of this change is to fix a problem with PCI shared interrupts during suspend and resume. I have observed a couple of variations of the following scenario. Devices A and B are on the same PCI bus and share the same interrupt. Device A's driver is suspended first and the device is powered down. Device B generates an interrupt. Interrupt handlers of both drivers are called. Device A's interrupt handler accesses registers of the powered down device and gets back bogus values (I assume all 0xff). That data is interpreted as interrupt status bits, etc. So, the interrupt handler gets confused and may produce some noise or enter an infinite loop, etc. This change affects only PCI devices. The pci(4) bus driver marks a child's interrupt handler as suspended after the child's suspend method is called and before the device is powered down. This is done only for traditional PCI interrupts, because only they can be shared. At the moment the change is only for x86. Notable changes in core subsystems / interfaces: - BUS_SUSPEND_INTR and BUS_RESUME_INTR methods are added to bus interface along with convenience functions bus_suspend_intr and bus_resume_intr; - rman_set_irq_cookie and rman_get_irq_cookie functions are added to provide a way to associate an interrupt resource with an interrupt cookie; - intr_event_suspend_handler and intr_event_resume_handler functions are added to the MI interrupt handler interface. I added two new interrupt handler flags, IH_SUSP and IH_CHANGED, to implement the new intr_event functions. IH_SUSP marks a suspended interrupt handler. IH_CHANGED is used to implement a barrier that ensures that a change to the interrupt handler's state is visible to future interrupts. While there, I fixed some whitespace issues in comments and changed a couple of logically boolean variables to be bool. MFC after: 1 month (maybe) Differential Revision: https://reviews.freebsd.org/D15755	2018-12-17 17:11:00 +00:00
avg	dcef4b8263	add a knob that disables detection of write protected disks It has been reported that on some systems (with real hardware passed through to a virtual machine) the WP detection causes USB disk probing failures. While here, also fix the selection of the next state in the case of malloc failure in DA_STATE_PROBE_WP. It was DA_STATE_PROBE_RC unconditionally even when it should have been DA_STATE_PROBE_RC16. PR: 225794 Reported by: David Boyd <David.Boyd49@twc.com> MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D18496	2018-12-17 16:01:37 +00:00
sobomax	53dba18a1b	Allow ng_nat to be attached to an Ethernet interface directly via ng_ether(4) or the like. Add new control message types: setdlt and getdlt to switch from default DLT_RAW (no encapsulation) to DLT_EN10MB (ethernet). Approved by: glebius MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D18535	2018-12-17 16:00:35 +00:00
grog	ac4ff55dd9	Work around BIOS quirks on HPE Proliant MicroServer Gen10 PR: 221350 Submitted by: Bob Bishop Reported by: Rafal Lukawiecki Reviewed by: jhb MFC after: 2 weeks	2018-12-17 07:09:46 +00:00
avos	521cb263a8	Add revision number for TP-Link TL-WN722N to prevent ambiguity between different chipsets. MFC after: 3 days X-MFC with: 341786	2018-12-17 05:07:57 +00:00
mckusick	fae3bd79a5	Clarify panic in set_rootvnode(). Check for panic in vfs_mountroot_shuffle(). Sponsored by: Netflix	2018-12-15 19:18:58 +00:00
mckusick	afd5ac8b62	Under UFS/FFS the VFS_ROOT() function will return an error if the inode check-hash fails. Panic'ing is not an appropriate response. So, check for an error return from VFS_ROOT() and when an error is reported, unwind and return the error. Reported by: Gary Jennejohn (gj) Sponsored by: Netflix	2018-12-15 19:04:50 +00:00
mckusick	bfbf739aad	Ensure that the inode check-hash is not left zeroed out in the case where the check-hash fails. Prior to the fix in -r342133 the inode with the zeroed out check-hash was written back to disk causing further confusion. Reported by: Gary Jennejohn (gj) Sponsored by: Netflix	2018-12-15 18:49:30 +00:00
mckusick	b27d4425e8	Reorder ffs_verify_dinode_ckhash() so that it checks the inode check-hash before copying in the inode so that the mode and link-count are not set if the check-hash fails. This change ensures that the vnode will be properly unwound and recycled rather than being held in the cache. Initialize the file mode to zero so that if the loading of the inode fails (for example because of a check-hash failure), the vnode will be properly unwound and recycled. Reported by: Gary Jennejohn (gj) Sponsored by: Netflix	2018-12-15 18:35:46 +00:00
mckusick	c376c8363f	Must set ip->i_effnlink = ip->i_nlink to avoid a soft updates "panic: softdep_update_inodeblock: bad link count" when releasing a partially initialized vnode after an inode check-hash failure. Reported by: Gary Jennejohn <gljennjohn@gmail.com> Reported by: Peter Holm (pho) Sponsored by: Netflix	2018-12-15 17:58:42 +00:00
hiren	3dc78ca62f	Revert r331567 CC Cubic: fix underflow for cubic_cwnd() This change is causing TCP connections using cubic to hang. Need to dig more to find exact cause and fix it. Reported by: tj at mrsk dot me, Matt Garber (via twitter) Discussed with: sbruno (previously), allanjude, cperciva MFC after: 3 days	2018-12-15 17:01:16 +00:00
brooks	acaa8abf5c	Fix bugs in pluggable CC algorithm and siftr sysctls. Use the sysctl_handle_int() handler to write out the old value and read the new value into a temporary variable. Use the temporary variable for any checks of values rather than using the CAST_PTR_INT() macro on req->newptr. The prior usage read directly from userspace memory if the sysctl() was called correctly. This is unsafe and doesn't work at all on some architectures (at least i386.) In some cases, the code could also be tricked into reading from kernel memory and leaking limited information about the contents or crashing the system. This was true for CDG, newreno, and siftr on all platforms and true for i386 in all cases. The impact of this bug is largest in VIMAGE jails which have been configured to allow writing to these sysctls. Per discussion with the security officer, we will not be issuing an advisory for this issue as root access and a non-default config are required to be impacted. Reviewed by: markj, bz Discussed with: gordon (security officer) MFC after: 3 days Security: kernel information leak, local DoS (both require root) Differential Revision: https://reviews.freebsd.org/D18443	2018-12-15 15:06:22 +00:00
avos	1df4530744	Add new USB id in rtwn_usb(4) (RTL8812AU) PR: 234029 Submitted by: <hakotani000@gmail.com> MFC after: 4 days	2018-12-15 14:58:45 +00:00
trasz	2ee1fcb23d	Add kern.rpc.gss.client_max, to make it possible to bump it easily. This can drastically lower the load on gssd(8) on large NFS servers. Submitted by: Per Andersson <pa at chalmers dot se> Reviewed by: rmacklem@ MFC after: 2 weeks Sponsored by: Chalmers University of Technology Differential Revision: https://reviews.freebsd.org/D18393	2018-12-15 11:32:11 +00:00
cem	b838d6700e	Revert accidentally included changes in r342108 If you're curious, please follow along in https://reviews.freebsd.org/D18537 . Sorry for the noise.	2018-12-15 05:47:22 +00:00
cem	a4ea2c9280	efirt: When present, attempt to use EFI runtime services to shutdown PR: maybe related to 233998 (inconclusive at this time) Submitted by: byuu <byuu AT tutanota.com> (previous version) Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D18506	2018-12-15 05:46:04 +00:00
jhibbits	30f48ca33a	powerpcspe: Don't require FPU_EMU for powerpcspe IEEE emulation Build only the necessary fpu_emu files for supporting the SPE IEEE-754 emulation exception handler. MFC after: 1 week	2018-12-15 04:53:02 +00:00
gonzo	8a26a904f5	[mv_pci] Do not attempt to attach disabled PCI ports Fail probe for PCI port if the respective FDT node is not enabled Differential Revision: https://reviews.freebsd.org/D18385	2018-12-15 02:35:48 +00:00
arichardson	6bf9e29261	make_dtb.sh: Use $CPP instead of assuming that cpp is in $PATH This fixes building in CheriBSD with a strict tmp path since we don't bootstrap a cpp but pass the full path to clang-cpp instead. While touching this file also fix all shellcheck warnings in make_dtb.sh. Reviewed By: manu Differential Revision: https://reviews.freebsd.org/D18376	2018-12-14 23:53:28 +00:00
mw	9aa3cccd72	Fix error check for ACPI_ID_PROBE in the TPM2.0 driver Updated API does not return pointer, so adjust the TPM2.0 driver accordingly. Reported by: jhb Obtained from: Semihalf Sponsored by: Stormshield	2018-12-14 22:22:43 +00:00
gonzo	0c9d4f297b	[twsi] Make extres/clk part conditional based on the EXT_RESOURCES option value This should fix kernel build for ARMADA38X and possibly some other ARM configs Approved by: manu	2018-12-14 21:17:42 +00:00
markj	ebb7bbe94a	Add some more checking to the RISC-V page fault handler. - Panic immediately if witness says we're holding non-sleepable locks. This helps ensure that we don't recurse on the pmap lock in pmap_fault_fixup(). - Panic if the kernel faults on a user address without setting an onfault handler. - Panic if the fault occurred in a critical section or interrupt handler, like we do on other platforms. - Fix some style issues in trap_pfault(). Reviewed by: jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18561	2018-12-14 21:07:12 +00:00
markj	4f87ba02c1	Avoid needless TLB invalidations in pmap_remove_pages(). pmap_remove_pages() is called during process termination, when it is guaranteed that no other CPU may access the mappings being torn down. In particular, it is unnecessary to invalidate each mapping individually since we do a pmap_invalidate_all() at the end of the function. Also don't call pmap_invalidate_all() while holding a PV list lock, the global pvh lock is sufficient. Reviewed by: jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18562	2018-12-14 21:04:30 +00:00
markj	2a30688b57	Assume that pmap_l1() will return a PTE. pmaps on RISC-V always have an L1 page table page, so we don't need to check for this when performing lookups. Reviewed by: jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18563	2018-12-14 21:03:01 +00:00
markj	96ce579a00	Add a QEMU config for RISC-V. Reviewed by: jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18560	2018-12-14 21:00:41 +00:00
markj	fa544b2f75	Enable witness(4) in the RISC-V GENERIC config. Reviewed by: jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18559	2018-12-14 20:57:57 +00:00
imp	d414fafdca	atomic_cmpset return value is also an int.	2018-12-14 19:48:42 +00:00
imp	528cf023e0	atomic_fcmpset* return int, not the type of *. fcmpset returns true/false as an int, so make the return types and variables match the int to be consistent with other arch. Reviewed by: cognet@ Differential Revision: https://reviews.freebsd.org/D18557	2018-12-14 19:14:51 +00:00
markj	f2faa35438	Clean up the riscv pmap_bootstrap() implementation. - Build up phys_avail[] in a single loop, excluding memory used by the loaded kernel. - Fix an array indexing bug in the aforementioned phys_avail[] initialization.[1] - Remove some unneeded code copied from the arm64 implementation. PR: 231515 [1] Reviewed by: jhb MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18464	2018-12-14 18:50:32 +00:00
manu	4cc6000919	allwinner: aw_pwm: Read value at attach The bootloader might have configured the pwm controller so read the values.	2018-12-14 18:39:17 +00:00
manu	7e40e33031	pwm: Convert period and duty to unsigned int We don't need a 64-bit value to store nanoseconds Discussed with: ian, jhibbits	2018-12-14 18:37:26 +00:00
markj	742b58778c	Add support for the nForce MCP89 adapter. PR: 234015 Submitted by: Andrejs Bogdanovs <sinchiroca86@gmail.com> MFC after: 1 week	2018-12-14 18:16:35 +00:00
mw	696d820352	Fix TPM driver compilation from r342084 Include recent ACPI_ID_PROBE API change.	2018-12-14 17:43:35 +00:00
mw	9b71ef63cd	Introduce driver for TPM 2.0 in CRB and FIFO (TIS) modes It was written based on: TCG PC Client Platform TPM Profile (PTP) Specification Version 22, Revision 1.03. It only supports Locality 0. Interrupts are only supported in FIFO mode. The driver in FIFO mode was tested on x86 with Infineon SLB9665 discrete TPM chip. Driver in both modes was also tested on qemu with swtpm running on host. Submitted by: Kornel Duleba <mindal@semihalf.com> Obtained from: Semihalf Sponsored by: Stormshield Differential Revision: https://reviews.freebsd.org/D18048	2018-12-14 16:14:36 +00:00
kadesai	c0c0f0acbf	Compilation failure on ppc and mips due to Revision 342066. Adding extra memset on chain frame. Submitted by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc	2018-12-14 10:49:48 +00:00
manu	3b03c9e70f	arm64: allwinner: axp81x: Fix double inversion for FLDO1 This fixes booting on A64 boards when disabling the unused regulators at boot. We did disable all the regulators handled by register 0x13 which of course contains mandatory regulators for the board to be up. Reported by: Mark Millard <marklmi@yahoo.com> X-MFC-With: r340848	2018-12-14 10:26:17 +00:00
avg	5b7e69bb0f	ichwd: add Sunrise Point-LP ID Submitted by: Tetsuya Uemura <t_uemura@macome.co.jp> Tested by: Tetsuya Uemura <t_uemura@macome.co.jp> MFC after: 2 weeks Relnotes: maybe	2018-12-14 09:30:43 +00:00
avg	a454a47086	ichwd: add support for clearing No Reboot bit in TCOv4 This is based on a patch developed by Tetsuya Uemura <t_uemura@macome.co.jp>. Many thanks! Submitted by: Tetsuya Uemura <t_uemura@macome.co.jp> (earlier version) Tested by: Tetsuya Uemura <t_uemura@macome.co.jp> MFC after: 2 weeks	2018-12-14 09:28:20 +00:00
kadesai	229ef35384	Driver version upgrade 07.708.02.00-fbsd Submitted by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc	2018-12-14 08:06:39 +00:00
kadesai	834a7a817e	This patch will increase debug level as current logging level has very minimal prints and even few important messages will not get logged. Submitted by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc	2018-12-14 08:05:49 +00:00
kadesai	20341fba62	Change IOC INIT wait time to 180 secs to keep it inline with timeout used by internal DCMDs. Submitted by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc	2018-12-14 08:05:01 +00:00
kadesai	e61876e4a9	This patch will add support for NVME PRPs creation by driver for fastpath capable IOs. NVME specification supports specific type of scatter gather list called as PRP (Physical Region Page) for IO data buffers. Since NVME drive is connected behind SAS3.5 tri-mode adapter, MegaRAID driver/firmware has to convert OS SGLs in native NVMe PRP format. For IOs sent to firmware, MegaRAID firmware does this job of OS SGLs to PRP translation and send PRPs to backend NVME device. For fastpath IOs, driver will do this OS SGLs to PRP translation. Submitted by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc	2018-12-14 08:04:16 +00:00
kadesai	dc69dfc3bf	This patch will add support for new DCMD to get PD information and a single data structure to specify LD and JBOD. Submitted by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc	2018-12-14 08:03:28 +00:00
kadesai	514f953e0e	To improve RAID 1/10 Write performance, OS drivers need to issue the required Write IOs as Fast Path IOs (after the appropriate checks allowing Fast Path to be used) to the appropriate physical drives (translated from the OS logical IO) and wait for all Write IOs to complete. Design: A write IO on RAID volume will be examined if it can be sent in Fast Path based on IO size and starting LBA and ending LBA falling on to a Physical Drive boundary. If the underlying RAID volume is a RAID 1/10, driver issues two fast path write IOs one for each corresponding physical drive after computing the corresponding start LBA for each physical drive. Both write IOs will have the same payload and are posted to HW such that replies land in the same reply queue. If there are no resources available for sending two IOs, driver will send the original IO from upper layer to RAID volume through the Firmware. When both IOs are completed by HW, the resources will be released and SCSI IO completion handler will be called. Submitted by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc	2018-12-14 08:02:44 +00:00
kadesai	8384082de5	Detect sequential Write IOs and pass the hint that it is part of sequential stream to help HBA Firmware do the Full Stripe Writes. For read IOs on certain RAID volumes like Read Ahead volumes, this will help driver to send it to Firmware even if the IOs can potentially be sent to hardware directly (called fast path) bypassing firmware. Design: 8 streams are maintained per RAID volume as per the combined firmware/driver design. When there is no stream detected the LRU stream is used for next potential stream and LRU/MRU map is updated to make this as MRU stream. Every time a stream is detected the MRU map is updated to make the current stream as MRU stream. Submitted by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc	2018-12-14 08:01:49 +00:00
kadesai	f3d059705f	This patch will add new interface to support more than 256 JBODs. Submitted by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc	2018-12-14 08:00:45 +00:00
kadesai	bdabc799dc	This patch will add support for divert bitmap in RAID map. Divert bitmap is supported for SAS3.5 adapters only. Submitted by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc	2018-12-14 08:00:01 +00:00
kadesai	51eed342e9	This patch will add support for new Dynamic RaidMap to have different sizes for different number of supported VDs for SAS3.5 MegaRAID adapters. Submitted by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc	2018-12-14 07:59:09 +00:00
kadesai	3d8e3cc029	This patch will add support for next generation(SAS3.5) of Tri mode(SAS, SATA, NVMe) MegaRAID adapters. Submitted by: Sumit Saxena <sumit.saxena@broadcom.com> Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com> Approved by: ken MFC after: 3 days Sponsored by: Broadcom Inc	2018-12-14 07:57:00 +00:00
mjg	8598ea893e	vfs: mostly depessimize NDINIT_ALL 1) filecaps_init was unnecessarily a function call 2) an assignment at the end was preventing tail calling of cap_rights_init Sponsored by: The FreeBSD Foundation	2018-12-14 03:55:08 +00:00
jkim	b72e59ab7d	MFV: r342049 Merge ACPICA 20181213.	2018-12-14 00:40:38 +00:00
mjg	a2507cdb8a	dtrace: fix userspace access on boxes with SMAP dtrace has its own routines which were not updated after SMAP support got implemented. Use ifunc just like for other routines. This in particular fixes ustack(). Reviewed by: markj Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18542	2018-12-13 20:09:38 +00:00
chuck	2fd83c3710	nda(4) fix check for Dataset Management support In the nda(4) driver, only set DISKFLAG_CANDELETE (a.k.a. can support BIO_DELETE) if the drive supports Dataset Management. There are reports that without this check, VMWare Workstation does not work reliably. Fix is to check the ONCS field in the NVMe Controller Data structure for support. This check previously existed but did not survive the big-endian changes. Reported by: yuripv@yuripv.net Reviewed by: imp, mav, jimharris Approved by: imp (mentor) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D18493	2018-12-13 13:25:37 +00:00
ae	b3a32fc226	Plug memory leak for AES_*_NIST_GMAC algorithms. swcr_newsession() allocates sw_ictx for these algorithms, thus we need to free() it in swcr_freesession(). PR: 233907 MFC after: 1 week	2018-12-13 08:59:51 +00:00
jhibbits	9df2142898	powerpc/booke: Change KERNBASE to be physical load address Previous commits have made VM_MIN_KERNEL_ADDRESS its own separate entity, and rebased the kernel around that address instead of KERNBASE. This commit pulls the trigger to rebase KERNBASE to a physical load address. The eventual goal is to align the address with the AIM KERNBASE, but at this time that's not an option. Currently a Book-E kernel must be loaded on a 64MB boundary, due to size issues. The common load address is at the 64MB mark (0x04000000), so simply make that the default KERNBASE. As of this commit, Book-E kernels can be loaded and booted with ubldr. MFC after: 3 weeks	2018-12-13 05:07:39 +00:00
jhibbits	b7802808dc	powerpcspe: Fix GPR handling in SPE exception handler Optimize the exception handler to only save and load the upper word of the GPRs used in the emulating instruction. This reduces the save/load overhead, and as a side effect does not overwrite the upper word of any temporary register. With this commit I am now able to run editors/abiword and math/gnumeric on a e500-based system. MFC after: 1 week MFC With: r341752,r341751	2018-12-13 04:48:28 +00:00
mmacy	6bf6ad0f09	Generalize AES iov optimization Right now, aesni_cipher_alloc does a bit of special-casing for CRYPTO_F_IOV, to not do any allocation if the first uio is large enough for the requested size. While working on ZFS crypto port, I ran into horrible performance because the code uses scatter-gather, and many of the times the data to encrypt was in the second entry. This code looks through the list, and tries to see if there is a single uio that can contain the requested data, and, if so, uses that. This has a slight impact on the current consumers, in that the check is a little more complicated for the ones that use CRYPTO_F_IOV -- but none of them meet the criteria for testing more than one. Submitted by: sef at ixsystems.com Reviewed by: cem@ MFC after: 3 days Sponsored by: iX Systems Differential Revision: https://reviews.freebsd.org/D18522	2018-12-13 04:40:53 +00:00
imp	ea93db4e32	Correctly implement atomic_swap_long for mips64. MIPS64 has 64-bit longs, so use uint64_t for it, otherwise uint32_t. sizeof(long) == sizeof(ptr) for all platforms, so define atomic_swap_ptr in terms of atomic_swap_long. Submitted by: hps@	2018-12-13 00:42:26 +00:00
manu	a3af78d3ad	mv_thermal: Add thermal driver for AP806 and CP110 thermal sensor Sponsored by: Rubicon Communications, LLC ("Netgate")	2018-12-12 22:33:05 +00:00
manu	7feac2f7e8	arm64: mv_cp110_icu: Fix build	2018-12-12 22:24:30 +00:00
manu	cce48d98de	mv_gpio: Since it's also an interrupt controller, attach sooner Sponsored by: Rubicon Communications, LLC ("Netgate")	2018-12-12 22:10:11 +00:00
manu	e22b714d0f	sdhci_xenon: Add Marvell 8k compatible string Sponsored by: Rubicon Communications, LLC ("Netgate")	2018-12-12 22:09:35 +00:00
manu	46170dc3e9	arm64: Add mv_cp110_icu and mv_cp110_gicp icu is an interrupt concentrator in the CP110 block and gicp is a gic extension to allow interrupts in the CP block to be turned into GIC SPI interrupts Sponsored by: Rubicon Communications, LLC ("Netgate")	2018-12-12 22:08:43 +00:00
manu	35ad6ef573	twsi: Clean up marvell part and add support for Marvell 7k/8k Sponsored by: Rubicon Communications, LLC ("Netgate")	2018-12-12 22:05:07 +00:00
manu	6c963cc241	arm64: marvell: Add cp110 clock controller support The cp110 clock controller controls the clocks and gate of the CP110 hardware block. Every clock/gate are implemented except the NAND clock. Sponsored by: Rubicon Communications, LLC ("Netgate")	2018-12-12 22:04:21 +00:00
manu	c6fea0d9a5	arm64: mv_gpio: Add Marvell 8K support While here put the interrupts setup in its own function Sponsored by: Rubicon Communications, LLC ("Netgate")	2018-12-12 22:02:57 +00:00
manu	579c7498f7	arm64: marvell: Add driver for Marvell Ap806 System Controller The first two clocks are for the clusters and their frequencies can be found reading a register. Then a fixed 1200Mhz clock is present and two fixed clocks, 'mss' which is 1200 / 6 and 'sdio' which is 1200 / 3. Sponsored by: Rubicon Communications, LLC ("Netgate")	2018-12-12 22:01:06 +00:00
manu	7abba6b3dd	arm64: mvebu_pinctrl: Add driver for Marvell Pinmux Controller Add a driver compatible with Marvell mvebu-pinctrl and add ap806-pinctrl support. Sponsored by: Rubicon Communications, LLC ("Netgate")	2018-12-12 22:00:05 +00:00
manu	a3e7e99078	arm64: Add new SoC type MARVELL_8K Sponsored by: Rubicon Communications, LLC ("Netgate")	2018-12-12 21:58:30 +00:00
manu	da93ed974d	fdt: Add support for simple-mfd bus Quoting the binding Documentation : "These devices comprise a nexus for heterogeneous hardware blocks containing more than one non-unique yet varying hardware functionality." Reviewed by: loos Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D17751	2018-12-12 21:56:45 +00:00
manu	5da88e2d63	arm64: allwinner: Fix pwm dtso Double patched files ended up in the tree Reported by: kevans	2018-12-12 21:10:34 +00:00
manu	a048865ad1	arm64: allwinner: Add DTSO for pwm and r_pwm Those are both dtso (overlays) for the two pwm controllers found on the A64.	2018-12-12 21:02:22 +00:00
manu	f7c033a4e8	arm64: allwinner: Add pwm driver Add a pwm driver for Allwinner PWM Add pwm and aw_pwm to the GENERIC kernel	2018-12-12 20:58:43 +00:00
manu	561baae05e	Add a pwm subsystem so we can configure pwm controller from kernel and userland. The pwm subsystem consist of API for PWM controllers, pwmbus to register them and a pwm(8) utility to talk to them from userland. Reviewed by: oshgobo (capsicum), bcr (manpage), 0mp (manpage) Differential Revision: https://reviews.freebsd.org/D17938	2018-12-12 20:56:56 +00:00
sobomax	86af1f53ed	Add NETGRAPH_CHECKSUM. MFC after: 1 week	2018-12-12 20:40:01 +00:00
kp	40dbdeaf9d	pf: Fix endless loop on NAT exhaustion with sticky-address When we try to find a source port in pf_get_sport() it's possible that all available source ports will be in use. In that case we call pf_map_addr() to try to find a new source IP to try from. If there are no more available source IPs pf_map_addr() will return 1 and we stop trying. However, if sticky-address is set we'll always return the same IP address, even if we've already tried that one. We need to check the supplied address, because if that's the one we'd set it means pf_get_sport() has already tried it, and we should error out rather than keep trying. PR: 233867 MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D18483	2018-12-12 20:15:06 +00:00
sobomax	2571c24d86	Add NETGRAPH_CHECKSUM. MFC after: 1 week	2018-12-12 19:02:37 +00:00
cem	89e84f5e34	gmirror: Remove a last-minute INVARIANTS breakage in r341840 I mistakenly added a lock assertion to this routine at the last minute without confirming it was held during g_mirror_create. It isn't (it isn't even initialized yet). Mea culpa. Access is exclusive in both callers, just not always by that particular lock. Reported by: lwhsu X-MFC-With: r341840, r341674	2018-12-12 18:13:56 +00:00
vmaffione	14400f1a94	netmap: fix warning in netmap_kloop.c Reported by: markj MFC after: 3 days	2018-12-12 16:32:15 +00:00
markj	9732b465b4	Fix a possible mbuf double free in bwn_dma_tx_start(). If bus_dmamap_load_mbuf() fails following a defrag, the caller of bwn_dma_tx_start() would free the original mbuf after m_defrag() had already done so. Fix this by returning the defragged mbuf to the caller instead. Update bwn_pio_tx_start() similarly for consistency. Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com> Reviewed by: landonf Tested by: landonf MFC after: 3 days admbug: 820 Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18342	2018-12-12 15:49:14 +00:00
dab	67b26ec81c	asmc: Add Support for Macbook Pro 8,1 PR: 217505 Submitted by: John O. Brickley <obryan.brickley@gmail.com>, updated by Maciej Pasternacki <maciej@pasternacki.net> Reported by: John O. Brickley <obryan.brickley@gmail.com> MFC after: 1 week	2018-12-12 13:43:55 +00:00
cem	25cda747b0	gmirror: Fix a bug introduced in r341674 r341674 inadvertently introduced a bug where newer mirror components being tasted would clear the high sc_flags that are not controlled by component metadata, such as G_MIRROR_DEVICE_FLAG_TASTING. This could plausibly expose a small window of time during STARTING where device destruction might race with mirror component addition, probably resulting in a crash. Reviewed by: markj X-MFC-With: r341674 Differential Revision: https://reviews.freebsd.org/D18521	2018-12-12 05:48:27 +00:00
mckusick	830a63af76	Continuing efforts to provide hardening of FFS. This change adds a check hash to the filesystem inodes. Access attempts to files associated with an inode with an invalid check hash will fail with EINVAL (Invalid argument). Access is reestablished after an fsck is run to find and validate the inodes with invalid check-hashes. This check avoids a class of filesystem panics related to corrupted inodes. The hash is done using crc32c. Note this check-hash is for the inode itself and not any of its indirect blocks. Check-hash validation may be extended to also cover indirect block pointers, but that will be a separate (and more costly) feature. Check hashes are added only to UFS2 and not to UFS1 as UFS1 is primarily used in embedded systems with small memories and low-powered processors which need as light-weight a filesystem as possible. Reviewed by: kib Tested by: Peter Holm Sponsored by: Netflix	2018-12-11 22:14:37 +00:00
kp	e7bab0023d	pf: Prevent integer overflow in PF when calculating the adaptive timeout. Mainly states of established TCP connections would be affected resulting in immediate state removal once the number of states is bigger than adaptive.start. Disabling adaptive timeouts is a workaround to avoid this bug. Issue found and initial diff by Mathieu Blanc (mathieu.blanc at cea dot fr) Reported by: Andreas Longwitz <longwitz AT incore.de> Obtained from: OpenBSD MFC after: 2 weeks	2018-12-11 21:44:39 +00:00
mjg	7e31d1de7e	Remove unused argument to priv_check_cred. Patch mostly generated with coccinelle: @@ expression E1,E2; @@ - priv_check_cred(E1,E2,0) + priv_check_cred(E1,E2) Sponsored by: The FreeBSD Foundation	2018-12-11 19:32:16 +00:00
dim	d41b4ec2b4	Upgrade our copies of clang, llvm, lld, lldb, compiler-rt and libc++ to the upstream release_70 branch r348686 (effectively, 7.0.1 rc3). The release will follow very soon, but no more functional changes are expected. Release notes for llvm, clang and lld 7.0.0 are available here: <http://releases.llvm.org/7.0.0/docs/ReleaseNotes.html> <http://releases.llvm.org/7.0.0/tools/clang/docs/ReleaseNotes.html> <http://releases.llvm.org/7.0.0/tools/lld/docs/ReleaseNotes.html> PR: 230240, 230355 Relnotes: yes MFC after: 2 months	2018-12-11 19:05:28 +00:00
shurd	e8708506a8	Fix !tx_abdicate error from r336560 r336560 was supposed to restore pre-r323954 behaviour when tx_abdicate is not set (the default case). However, it appears that rather than the drainage check being made conditional on tx_abdicate being set, it was duplicated so it occurred twice if tx_abdicate was set and once if it was not. Now when !tx_abdicate, drainage is only checked if the doorbell isn't pending. Reported by: lev MFC after: 1 week Sponsored by: Limelight Networks	2018-12-11 17:46:01 +00:00
mjg	9be8aba618	audit: replace open-coded TDP_AUDITREC checks with the macro Sponsored by: The FreeBSD Foundation	2018-12-11 17:14:12 +00:00
markj	b59887227d	Fix the PAE kernel gcc build. The error was caused by map_ucode() casting a vm_paddr_t to a void *. Use a uintptr_t instead to match the caller. Fix some style bugs while here. Reported by: bde Reviewed by: bde MFC after: 1 week Sponsored by: The FreeBSD Foundation	2018-12-11 16:49:01 +00:00
dab	1b02aef1b6	asmc: Add Support for MacBookAir 7,1 and 7,2 PR: 226172 Submitted by: James Wright <james.wright@jigsawdezign.com> Reported by: James Wright <james.wright@jigsawdezign.com> MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D18396	2018-12-11 16:35:59 +00:00
mjg	ba8523cc7c	fd: dedup code in sys_getdtablesize Sponsored by: The FreeBSD Foundation	2018-12-11 12:08:18 +00:00
mjg	59363837d9	Make lim_cur inline if possible. It is a function call only to accommodate some ABIs which install a hook. They only care for 3 types of limits: DATA, STACK, VMEM Instead of always calling the func, see at compilation time if the requested limit is something else and just do the read if so. Sponsored by: The FreeBSD Foundation	2018-12-11 12:01:46 +00:00
mjg	45f96abb72	fd: tidy up closing a fd - avoid a call to knote_close in the common case - annotate mqueue as unlikely Sponsored by: The FreeBSD Foundation	2018-12-11 11:58:44 +00:00
mjg	78cf9b9e38	fd: stop looking for exact freefile after allocation If a lower fd is closed later, the lookup goes to waste. Allocation always performs the lookup anyway. Sponsored by: The FreeBSD Foundation	2018-12-11 11:57:12 +00:00
andrew	3e3733a177	Only read the ACPI proximity tables on arm64 when we are booting from ACPI. Sponsored by: DARPA, AFRL	2018-12-11 11:13:11 +00:00
dim	30c4d65dc9	Merge ^/head r341764 through r341812.	2018-12-11 06:47:04 +00:00
delphij	1fcf9e5d2a	Remove questionable initialization for ICH8M, rely on BIOS to properly initialize the controller. According to the datasheet, the old code checks if port 2 (P2E, 0x4) was the only enabled port (except port 0, which was ignored by mask 0xfe), and issue a write to the PCS register to disable all but port 0, right before ahci_ctlr_reset. Some other operating systems would issue a port enable to all ports, but since the current code only does the special initialization for ICH8M, remove it entirely and rely on BIOS to do the right thing (the alternative would be https://reviews.freebsd.org/D18300?id=50922 , should we see reports that we really need to do it). Reviewed by: mav MFC after: 3 months Differential Revision: https://reviews.freebsd.org/D18300	2018-12-11 05:10:22 +00:00
kib	28de053312	Free bootstacks after AP startup. Bootstacks are unused after APs executed sched_throw() in init_secondary_tail() and started executing on proper idle thread stack. Add sysinit that detects that the idle thread for each CPU was scheduled at least once, and free corresponding bootstack. Slight addition of the code (~200 bytes) is compensated by the saving, because even on typical small modern desktop CPU we leak 128K of memory otherwise (4 pages x 8 threads). Reviewed by: jhb MFC after: 1 week Differential revision: https://reviews.freebsd.org/D18486	2018-12-11 02:54:36 +00:00
kib	4045451f9e	Remove special case handling for getfhat(fd, NULL, handle). There is no reason for it to behave differently from openat(fd, NULL). Also the handling did not work because the substituted path was from the system address space, causing EFAULT. Submitted by: Jack Halford <jack@gandi.net> MFC after: 1 week Differential revision: https://reviews.freebsd.org/D18501	2018-12-11 02:48:49 +00:00
markj	2918dcca3c	Remove an unused malloc(9) type. MFC after: 1 week Sponsored by: The FreeBSD Foundation	2018-12-11 02:16:27 +00:00
markj	32498fda11	Use inline tests for individual PTE bits in the RISC-V pmap. Inline tests for PTE_* bits are easy to read and don't really require a predicate function, and predicates which operate on a pt_entry_t are inconvenient when working with L1 and L2 page table entries. Reviewed by: jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18461	2018-12-11 02:15:56 +00:00
jhibbits	5b0ef9c55b	powerpc/booke: Don't get and use the load offset for TOC on APs The code was a near exact copy of the code in startup, but it doesn't need the complexity since the kernel is already relocated. With VM_MIN_KERNEL_ADDRESS as currently set to KERNBASE, this doesn't cause a problem, because it's a zero offset. However, when KERNBASE is changed to a physical load address, it then has a non-zero offset, and ends up with an invalid stack pointer, causing the AP to hang.	2018-12-11 02:03:00 +00:00
imp	da25748a36	Remove stray hints files.	2018-12-10 21:33:01 +00:00
jhb	7b28e77e79	Don't report stale signal information for non-signal events in ptrace_lwpinfo. Once a signal's siginfo was copied to 'td_si' as part of the signal exchange in issignal(), it was never cleared. This caused future thread events that are reported as SIGTRAP events without signal information to report the stale siginfo in 'td_si'. For example, if a debugger created a new process and used SIGSTOP to stop it after PT_ATTACH, future system call entry / exit events would set PL_FLAG_SI with the SIGSTOP siginfo in pl_siginfo. This broke 'catch syscall' in current versions of gdb as it assumed PL_FLAG_SI with SIGTRAP indicates a breakpoint or single step trap. Reviewed by: kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D18487	2018-12-10 19:39:24 +00:00
luporl	38e00d542b	ppc64: handle exception 0x1500 (soft patch) This change adds a hypervisor trap handler for exception 0x1500 (soft patch), normalizing all VSX registers and returning. This avoids a kernel panic due to unknown exception. Change made with the collaboration of leonardo.bianconi_eldorado.org.br, that found out that this is a hypervisor exception and not a supervisor one, and fixed this in the code. Reviewed by: jhibbits, sbruno Differential Revision: https://reviews.freebsd.org/D17806	2018-12-10 14:54:28 +00:00
hselasky	b2b1b7040b	Remove no longer needed ifdefs in the LinuxKPI, after r341787. Differential Revision: https://reviews.freebsd.org/D18450 Reviewed by: kib@ MFC after: 3 days Sponsored by: Mellanox Technologies	2018-12-10 13:41:33 +00:00
hselasky	dd98a579d3	Implement atomic_swap_xxx() for all platforms. Differential Revision: https://reviews.freebsd.org/D18450 Reviewed by: kib@ MFC after: 3 days Sponsored by: Mellanox Technologies	2018-12-10 13:38:13 +00:00
avos	2d987ba385	rtwn, rsu: add more USB ids. PR: 233638 Submitted by: cezary.sliwa@gmail.com MFC after: 3 days	2018-12-10 09:45:57 +00:00
arybchik	9aa88ec9e9	sfxge(4): use n Tx queues instead of n + 2 on EF10 HW On EF10 HW we can avoid sending packets without checksum offload or with IP-only checksum offload to dedicated queues. Instead, we can use option descriptors to change offload policy on any queue during runtime. Thus, we don't need to create two dedicated queues. Submitted by: Ivan Malov <Ivan.Malov at oktetlabs.ru> Sponsored by: Solarflare Communications, Inc. MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D18390	2018-12-10 09:36:05 +00:00
arybchik	b3dce2d92c	sfxge(4): prepare the number of Tx queues on event queue 0 to become variable The number of Tx queues on event queue 0 can depend on the NIC family type, and this property will be leveraged by future patches. This patch prepares the code for this change. Submitted by: Ivan Malov <Ivan.Malov at oktetlabs.ru> Sponsored by: Solarflare Communications, Inc. MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D18389	2018-12-10 09:35:53 +00:00
arybchik	641fe3596f	sfxge(4): report support for Tx checksum op descriptors FreeBSD driver needs a patch to provide a means for packets which do not need checksum offload but have flow ID set to avoid hitting only the first Tx queue (which has been used for packets not needing checksum offload). This should be possible on Huntington, Medford or Medford2 chips since these support toggling checksum offload on any given queue dynamically by means of pushing option descriptors. The patch for FreeBSD driver will then need a means to figure out whether the feature can be used, and testing adapter family might not be a good solution. This patch adds a feature bit specifically to indicate support for checksum option descriptors. The new feature bits may have more users in future, apart from the mentioned FreeBSD patch. Submitted by: Ivan Malov <Ivan.Malov at oktetlabs.ru> Sponsored by: Solarflare Communications, Inc. MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D18388	2018-12-10 09:35:45 +00:00
arybchik	61f3f706aa	sfxge(4): populate per-event queue stats in sysctl In order to find out why the first event queue and corresponding interrupt is triggered more frequently, it is useful to know which events go to each event queue. Sponsored by: Solarflare Communications, Inc. MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D18418	2018-12-10 09:35:33 +00:00
jhibbits	b73b2f43ff	powerpc/booke: Replace a logical equivalent of pmap_kextract() with a real call No sense in reinventing the wheel here. AP bringup is not a time-critical point.	2018-12-10 04:16:40 +00:00
imp	0887da2d0d	Fix typo in powerpcspe name.	2018-12-09 21:53:45 +00:00
imp	568f85fc1c	Send a START UNIT command when a disk responds with an ASC of 04/1C. This will hopefully spin up a disk that's in low-power mode. Sponsored by: Netflix Submitted by: scottl@	2018-12-09 21:37:34 +00:00
alc	af34503285	blst_leaf_alloc updates bighint for a leaf when an allocation is successful and includes the last block represented by the leaf. The reasoning is that, if the last block is included, then there must be no solution before that one in the leaf, so the leaf cannot provide an allocation that big again; indeed, the leaf cannot provide a solution bigger than range1. Which is all correct, except that if the value of blk passed in did not represent the first block of the leaf, because the cursor was pointing to the middle of the leaf, then a possible solution before the cursor may have been ignored, and bighint cannot be updated. Consider the sequence allocate 63 (returning address 0), free 0,63 (freeing that same block), and allocate 1 (returning 63). The result is that one block is allocated from the first leaf, and the value of bighint is 0, so that nothing can be allocated from that leaf until the only block allocated from that leaf is freed. This change detects that skipped-over solution, and when there is one it makes sure that the value of bighint is not changed when the last block is allocated. Submitted by: Doug Moore <dougm@rice.edu> Tested by: pho X-MFC with: r340402 Differential Revision: https://reviews.freebsd.org/D18474	2018-12-09 17:55:10 +00:00
bde	7173dd8e13	Fix devstat on md devices. devstat_end_transaction() was called before the i/o was actually ended (by delivering it to GEOM), so at least the i/o length was messed up. It was always recorded as 0, so the average transaction size and the average transfer rate was always displayed as 0. devstat_end_transaction() was not called at all for the error case, so there were sometimes multiple starts per end. I didn't observe this in practice and don't know if it did much damage. I think it extended the length of the i/o to the next transaction. Reviewed by: kib	2018-12-09 15:34:20 +00:00
dim	07b9c9ba27	Merge ^/head r340918 through r341763.	2018-12-09 11:39:45 +00:00
scottl	b25216505e	I missed powerpcspe in the previous commit for excluding mps and mpr. I also learned that 'mips' is overly broad and covers 64bit architectures too. However, it's not worth the fight right now, so any refinements will have to come another day.	2018-12-09 06:52:25 +00:00
scottl	a4498f7fb2	Don't allocate the config_intrhook separately from the softc, it's small enough that it costs more code to handle the malloc/free than it saves.	2018-12-09 06:16:54 +00:00
scottl	b0d1efe0ae	Copy and clear the reply descriptor atomically. This prevents concurrency in the interrupt handlers (usually due to timeout/error recovery) from seeing and processing the same descriptor twice.	2018-12-09 06:10:11 +00:00
scottl	050d2d6c93	Remove the mps driver from powerpc 32bit GENERIC, and don't build it and mpr as a module for powerpc or mips. An upcoming commit will cause these drivers to rely on the presence of 64bit atomic operations. Discussed with jhibbits.	2018-12-09 06:06:06 +00:00
jhibbits	f6d390617b	powerpc/SPE: Copy lower part of source register to target for efdabs/efdnabs/efdneg MFC after: 1 week MFC With: r341751	2018-12-09 04:54:55 +00:00
jhibbits	2cbeab925f	powerpc/SPE: Reload vector registers after efdabs/efdnabs/efdneg While here, also style(9)-adjust indents around this code.	2018-12-09 04:13:14 +00:00
sobomax	b3480b6b3e	Hook up ng_checksum(4) module and appropriate manpage to the build. The module was added back in 2016, but has never been connected. MFC after: 1 week	2018-12-09 02:58:53 +00:00
kib	ea29016877	Fix PAE boot. With the introduction of M_EXEC support for kmem_malloc(), some kernel mappings start having NX bit set in the paging structures early, for PAE kernels on machines with NX support, i.e. practically on all machines. In particular, AP trampoline and initialization needs to access pages whose translations have the NX bit set, before initializecpu() is called. Check for CPUID NX feature and enable EFER.NXE before we enable paging in mp boot trampoline. This allows the CPU to use the kernel page table instead of generating page fault due to reserved bit set. PR: 233819 Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-12-08 22:12:57 +00:00
jchandra	5a913206e0	arm64: add ACPI based NUMA support Use the newly defined SRAT/SLIT parsing APIs in arm64 to support ACPI based NUMA. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D17943	2018-12-08 19:42:01 +00:00
jchandra	6fd503bd86	acpica: support parsing of arm64 affinity in acpi_pxm.c ACPI SRAT table on arm64 uses GICC entries to provide CPU locality information. These entries use an AcpiProcessorUid to identify the CPU (unlike on x86 where the entries have an APIC ID). Update acpi_pxm.c to extend the cpu_add/cpu_find/cpu_get_info functions to handle AcpiProcessorUid. Use the updated functions while parsing ACPI_SRAT_GICC_AFFINITY entry for arm64. Also update sys/conf/files.arm64 to build acpi_pxm.c when ACPI is enabled. Reviewed by: markj (previous version) Differential Revision: https://reviews.freebsd.org/D17942	2018-12-08 19:32:23 +00:00
jchandra	2d1461899d	acpica : move SRAT/SLIT parsing to sys/dev/acpica This moves the architecture independent parts of sys/x86/acpica/srat.c to sys/dev/acpica/acpi_pxm.c, to be used later on arm64. The function declarations are moved to sys/dev/acpica/acpivar.h We also need to update sys/conf/files.{i386,amd64} to use the new file. No functional changes. Reviewed by: markj, imp Differential Revision: https://reviews.freebsd.org/D17941	2018-12-08 19:10:58 +00:00
jchandra	acaf867c57	x86/acpica/srat.c: Add API for parsing proximity tables The SLIT and SRAT ACPI tables need to be parsed on arm64 as well, on systems that use UEFI/ACPI firmware and support NUMA. To do this, we need to move most of the logic of x86/acpica/srat.c to dev/acpica and provide an API that architectures can use to parse and configure ACPI NUMA information. This commit adds the API in srat.c as a first step, without making any functional changes. We will move the common code to sys/dev/acpica as the next step. The functions added are: * int acpi_pxm_init(int ncpus, vm_paddr_t maxphys) - to allocate and initialize data structures used * void acpi_pxm_parse_tables(void) - parse SRAT/SLIT, save the cpu and memory proximity information * void acpi_pxm_set_mem_locality(void) - use the saved data to set memory locality * void acpi_pxm_set_cpu_locality(void) - use the saved data to set cpu locality * void acpi_pxm_free(void) - free data structures allocated by init On arm64, we do not have a cpu APIC id that can be used as index to store CPU data, we need to use the Processor Uid. To help with this, define internal functions cpu_add, cpu_find, cpu_get_info to store and get CPU proximity information. Reviewed by: markj, jhb (previous version) Differential Revision: https://reviews.freebsd.org/D17940	2018-12-08 18:34:05 +00:00
mjg	d21951d547	umtx: avoid umtxshm locking on object termination if possible Sample build world result on tmpfs: kern.ipc.umtx_terminate_notempty: 0 kern.ipc.umtx_terminate_empty: 2891815 Sponsored by: The FreeBSD Foundation	2018-12-08 14:04:57 +00:00
mjg	bae6f9dc2d	Remove proctree acquire from note_procstat_proc It is not needed since r340482 ("proc: always store parent pid in p_oppid") Sponsored by: The FreeBSD Foundation	2018-12-08 11:38:39 +00:00
mjg	59185429c4	Fix a corner case in ID bitmap management. If all IDs from trypid to pid_max were used as pids, the code would enter a loop which would be infinite if none of the IDs could become free (e.g. they all belong to processes which have not transitioned to zombie). Fixes: r341684 ("Manage process-related IDs with bitmaps") Sponsored by: The FreeBSD Foundation	2018-12-08 10:22:12 +00:00
mjg	c2763443b4	proc: postpone proc unlock until after reporting with kqueue kqueue would always relock immediately afterwards. While here drop the NULL check for list itself. The list is always allocated. Sponsored by: The FreeBSD Foundation	2018-12-08 06:34:12 +00:00
mjg	af8321b07f	proc: handle sdt exit probe before taking the proc lock Sponsored by: The FreeBSD Foundation	2018-12-08 06:31:43 +00:00
mjg	c0fc2aadad	Provide SDT_PROBES_ENABLED macro. Sponsored by: The FreeBSD Foundation	2018-12-08 06:30:41 +00:00
mjg	4f7d169e9a	amd64: stop re-reading curpc on subyte/suword Originally read value is still safely kept. Re-reading code was there for previous iterations which were partially shared with i386. Sponsored by: The FreeBSD Foundation	2018-12-08 04:53:08 +00:00
kib	fa48020659	Simplify kern_readlink_vp(). When we detected that the vnode is not symlink, return immediately. This moves the readlink code out of else branch and unindents it. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-12-07 23:07:51 +00:00
kib	52dbc76ca2	Fix expression evaluation. Braces were put in the wrong place, causing failing EAGAIN check to return zero result. Remove the problematic assignment from the conditional expression altogether. While there, remove the used-once variable vp, and wrap a too-long line. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2018-12-07 23:05:12 +00:00
imp	b517a81b5f	Even though they are reserved, cdw2 and cdw3 can be set via nvme-cli (and soon nvmecontrol). Go ahead and copy them into rsvd2 and rsvd3. Sponsored by: Netflix	2018-12-07 21:58:08 +00:00
mjg	69fc5b56ce	fd: use racct_set_unlocked Sponsored by: The FreeBSD Foundation	2018-12-07 16:51:38 +00:00
mjg	74492ad8e4	racct: add RACCT_ENABLED macro and racct_set_unlocked This allows removing PROC_LOCK/UNLOCK pairs spread throughout the kernel only used to appease racct_set. Sponsored by: The FreeBSD Foundation	2018-12-07 16:47:34 +00:00
mjg	76d3335601	fd: try to do less work with the lock in dup Sponsored by: The FreeBSD Foundation	2018-12-07 16:44:52 +00:00
mjg	5d5736ca94	vm: use fcmpset for vmspace reference counting Sponsored by: The FreeBSD Foundation	2018-12-07 16:22:54 +00:00
mjg	10f6c42acc	Replace hand-rolled unrefs if > 1 with refcount_release_if_not_last Sponsored by: The FreeBSD Foundation	2018-12-07 16:11:45 +00:00
mjg	6d80602477	refcount: remove a stale comment about conditional ref/unref routines It was there for vfs-specific variants and was copied over when it should not. Sponsored by: The FreeBSD Foundation	2018-12-07 16:10:13 +00:00
avg	72442df1f5	acpi_MatchHid: use ACPI_MATCHHID_NOMATCH instead of FALSE Binary representation of both is the same (zero), but ACPI_MATCHHID_NOMATCH is better for consistency. MFC after: 4 days X-MFC with: r339754	2018-12-07 16:05:39 +00:00
avg	ea228066e0	aibs: fix a typo in the probe method that was introduced in r339754 Because of that typo the driver would try to attach to every device on acpi bus. That disrupted acpi attachment of uart driver, at least. MFC after: 4 days X-MFC with: r339754	2018-12-07 16:01:51 +00:00
markj	7a0ac26a7e	Update the description of the address space layout on RISC-V. This adds more detail and fixes some inaccuracies. Reviewed by: jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18463	2018-12-07 15:56:40 +00:00
markj	a61c5fb063	Rename sptbr to satp per v1.10 of the privileged architecture spec. Add a subroutine for updating satp, for use when updating the active pmap. No functional change intended. Reviewed by: jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D18462	2018-12-07 15:55:23 +00:00
kib	0f30bb9a10	Regen.	2018-12-07 15:19:00 +00:00
kib	48d91fd889	Add new file handle system calls. Namely, getfhat(2), fhlink(2), fhlinkat(2), fhreadlink(2). The syscalls are provided for a NFS userspace server (nfs-ganesha). Submitted by: Jack Halford <jack@gandi.net> Sponsored by: Gandi.net Tested by: pho Feedback from: brooks, markj MFC after: 1 week Differential revision: https://reviews.freebsd.org/D18359	2018-12-07 15:17:29 +00:00
mjg	73f834bfbb	proc: when exiting move to zombproc before taking proctree The kernel was already doing this prior to r329615. It was changed to reduce contention on allproc. However, introduction of pidhash locks and removal of proctree -> allproc ordering from fork thanks to bitmaps fixed things enough to make this change pessimal. waitpid takes proctree on each call and this change (now) causes avoidable stalls if allproc is held. Sponsored by: The FreeBSD Foundation	2018-12-07 12:32:25 +00:00
mjg	6b8dd99954	Manage process-related IDs with bitmaps Currently unique pid allocation on fork often requires a full walk of process, group, session lists to make sure it is not used by anything. This has a side effect of requiring proctree to be held along with allproc, which adds more contention in poudriere -j 128. The patch below implements trivial bitmaps which gets rid of the problem. Dedicated lock is introduced to manage IDs. While here a bug was discovered: all processes would inherit reap id from the first process spawned by init. This had a side effect of keeping the ID used and when allocation rolls over to the beginning it keeps being skipped. The patch is loosely based on initial work by mjoras@. Reviewed by: kib Sponsored by: The FreeBSD Foundation	2018-12-07 12:22:32 +00:00
mjg	a1ba958dc0	Annotate Giant drop/pickup macros with __predict_false They are used in important places of the kernel with the lock not being held majority of the time. Sponsored by: The FreeBSD Foundation	2018-12-07 12:06:03 +00:00
mjg	c54695512c	unr64: use locked variant if not __LP64__ The current ifdefs are not sufficient to distinguish 32- and 64- bit variants, which results e.g. in powerpc64 not using atomics. While some 32-bit archs provide 64-bit atomics, there is no huge advantage of using them on these platforms. Reported by: many Suggested by: jhb Sponsored by: The FreeBSD Foundation	2018-12-07 12:05:11 +00:00
avg	c1fb81f798	daprobedone: announce if a disk is write-protected MFC after: 2 weeks	2018-12-07 12:02:31 +00:00
vmaffione	ab139270f3	netmap: remove dead code obsoleted by iflib The iflib subsystem implements netmap support in a driver-independent way (sys/net/iflib.c). We can therefore remove the headers that used to implement netmap support for all the drivers now supported by iflib (em, igb, ixl, ixgbe, lem). MFC after: 1 week	2018-12-07 11:47:42 +00:00
mmel	4b670ed804	Fix cut&paste typo in atomic_fetchadd_64(). Reported by: Jia-Shiun Li <jiashiun@gmail.com> MFC after: 1 week	2018-12-07 11:10:27 +00:00
cem	42d84ef531	gmirror: Evaluate mirror components against newest metadata copy Re-apply r341665 with format strings fixed. If we happen to taste a stale mirror component first, don't reject valid, newer components that have differing metadata from the stale component (during STARTING). Instead, update our view of the most recent metadata as we taste components. Like mediasize beforehand, remove some checks from g_mirror_check_metadata which would evict valid components due to metadata that can change over a mirror's lifetime. g_mirror_check_metadata is invoked long before we check genid/syncid and decide which component(s) are newest and whether or not we have quorum. Before checking if we can enter RUNNING (i.e., we have quorum) after a NEW component is added, first remove any known stale or inconsistent disks from the mirrorset, rather than removing them after deciding we have quorum. Check if we have quorum after removing these components. Additionally, add a knob, kern.geom.mirror.launch_mirror_before_timeout, to force gmirrors to wait out the full timeout (kern.geom.mirror.timeout) before transitioning from STARTING to RUNNING. This is a kludge to help ensure all eligible, boot-time available mirror components are tasted before RUNNING a gmirror. Add a basic test case for STARTING -> RUNNING startup behavior around stale genids. PR: 232671, 232835 Submitted by: Cindy Yang <cyang AT isilon.com> (previous version) Reviewed by: markj (kernel portions) Discussed with: asomers, Cindy Yang Tested by: pho Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D18062	2018-12-07 02:44:04 +00:00
cem	a6f65b4008	Revert r341665 due to tinderbox breakage I didn't notice that some format strings were non-portable. Will fix and re-commit later.	2018-12-07 00:47:05 +00:00
cem	0b3fcef470	gmirror: Evaluate mirror components against newest metadata copy If we happen to taste a stale mirror component first, don't reject valid, newer components that have differing metadata from the stale component (during STARTING). Instead, update our view of the most recent metadata as we taste components. Like mediasize beforehand, remove some checks from g_mirror_check_metadata which would evict valid components due to metadata that can change over a mirror's lifetime. g_mirror_check_metadata is invoked long before we check genid/syncid and decide which component(s) are newest and whether or not we have quorum. Before checking if we can enter RUNNING (i.e., we have quorum) after a NEW component is added, first remove any known stale or inconsistent disks from the mirrorset, rather than removing them after deciding we have quorum. Check if we have quorum after removing these components. Additionally, add a knob, kern.geom.mirror.launch_mirror_before_timeout, to force gmirrors to wait out the full timeout (kern.geom.mirror.timeout) before transitioning from STARTING to RUNNING. This is a kludge to help ensure all eligible, boot-time available mirror components are tasted before RUNNING a gmirror. When we are instructed to forget mirror components, bump the generation id to avoid confusion with such stale components later. Add a basic test case for STARTING -> RUNNING startup behavior around stale genids. PR: 232671, 232835 Submitted by: Cindy Yang <cyang AT isilon.com> (previous version) Reviewed by: markj (kernel portions) Discussed with: asomers, Cindy Yang Tested by: pho Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D18062	2018-12-06 23:55:39 +00:00
kib	78bf9626b8	Fix build with option RSS, removing unused variables. Reported by: np Sponsored by: Mellanox Technologies MFC after: 1 week	2018-12-06 21:52:40 +00:00
np	beb3c0891f	cxgbe(4): Get Linux cxgb4vf working in bhyve VMs with VFs passed through. cxgb4vf doesn't own the buffer size list but still expects the first two entries to be 4K and some power of 2 respectively. The BSD cxgbe doesn't care where its preferred buffer sizes are as long as they're in the list somewhere, so just move its entries towards the end as a workaround. MFC after: 1 month Sponsored by: Chelsio Communications	2018-12-06 21:33:08 +00:00
cy	89c5bde73b	Remove an ugly Ultrix hack. Ultrix has been AWOL since the last ice age, more to come. MFC after: 1 week	2018-12-06 20:15:54 +00:00
kp	dd11da4a2a	pfsync: Performance improvement pfsync code is called for every new state, state update and state deletion in pf. While pf itself can operate on multiple states at the same time (on different cores, assuming the states hash to a different hashrow), pfsync only had a single lock. This greatly reduced throughput on multicore systems. Address this by splitting the pfsync queues into buckets, based on the state id. This ensures that updates for a given connection always end up in the same bucket, which allows pfsync to still collapse multiple updates into one, while allowing multiple cores to proceed at the same time. The number of buckets is tunable, but defaults to 2 x number of cpus. Benchmarking has shown improvement, depending on hardware and setup, from ~30% to ~100%. MFC after: 1 week Sponsored by: Orange Business Services Differential Revision: https://reviews.freebsd.org/D18373	2018-12-06 19:27:15 +00:00
kib	22599ada54	Appease gcc build, remove duplicated declaration. Reported by: np Sponsored by: Mellanox Technologies MFC after: 1 week	2018-12-06 19:20:00 +00:00
sbruno	7554bfb930	Change u32 to uint32_t to allow the native-xtools target to build libsysdecode. Submitted by: kib	2018-12-06 18:59:33 +00:00
kp	a80983aa7d	pf: add a comment describing why we call pf_map_addr again if port selection process fails Obtained from: OpenBSD	2018-12-06 18:58:54 +00:00
markj	29acbc71ce	Let kern.trap_enotcap be set as a tunable. This is handy for testing programs that are run by rc. MFC after: 1 week Sponsored by: The FreeBSD Foundation	2018-12-06 17:29:37 +00:00
avg	08dcca15a0	acpi_{Device,Battery}IsPresent: restore pre-r330957 behaviour Specifically, assume that the device is present if evaluation of _STA method fails. Before r330957 we ignored any _STA evaluation failure (which was performed by AcpiGetObjectInfo in ACPICA contrib code) for the purpose of acpi_DeviceIsPresent and acpi_BatteryIsPresent. ACPICA 20180313 removed evaluation of _STA from AcpiGetObjectInfo. So, we added evaluation of _STA to acpi_DeviceIsPresent and acpi_BatteryIsPresent. One important difference is that the new code ignored a failure only if _STA did not exist (AE_NOT_FOUND). Any other kind of failure was treated as a fatal failure. Apparently, on some systems we can get AE_NOT_EXIST when evaluating _STA. And that error is not an evil twin of AE_NOT_FOUND, despite a very similar name, but a distinct error related to a missing handler for an ACPI operation region. It's possible that for some people the problem was already fixed by changes in ACPICA and/or in acpi_ec driver (or even in BIOS) that fixed the AE_NOT_EXIST failure related to EC operation region. This work is based on a great analysis by cem and an earlier patch by Ali Abdallah <aliovx@gmail.com>. PR: 227191 Reported by: 0mp MFC after: 2 weeks	2018-12-06 12:34:34 +00:00
vmaffione	4a965dbfd6	netmap: netmap_transmit should honor bpf packet tap hook This allows tcpdump to capture outbound kernel packets while in netmap mode Submitted by: Marc de la Gueronniere <mdelagueronniere@verisign.com> Reviewed by: vmaffione MFC after: 1 week Sponsored by: Verisign, Inc. Differential Revision: https://reviews.freebsd.org/D17896	2018-12-06 09:45:25 +00:00
np	98b7fa17e0	cxgbe(4): Fall back to a basic configuration in case of any error during card initialization. This is an expanded version of r333682. Break up prep_firmware into simpler routines while here. Load the firmware/config KLD only if needed. MFC after: 1 month Sponsored by: Chelsio Communications	2018-12-06 06:18:21 +00:00
jhibbits	f678f101fd	powerpc: Set very low priority mode while waiting for AP unleash event The POWER9 does not recognize 'or 27,27,27' as a thread priority NOP. On earlier POWER architectures, this NOP would note to the processor to give up resources if able, to improve performance of other threads. All processors that support the thread priority NOPs recognize the 'or 31,31,31' NOP as very low priority, so use this to perform a similar function, and not burn cycles on POWER9.	2018-12-06 04:36:02 +00:00
jhibbits	dea3520c76	powerpc: Fix ELFv2 JMP_SLOT relocation fixup The jump slot is a function pointer, not a descriptor pointer, in ELFv2. Just write the pointer itself over, not the contents of the pointer, which would be the first instruction of the function.	2018-12-06 04:30:24 +00:00
jhibbits	52acc74d85	powerpc/powermac: Fix macgpio(4) child interrupt resource handling The 'interrupts' property is actually 2 words, not one, on macgpio child nodes. Open Firmware's getprop function might be returning the value copied, not the total size of the property, but FDT's returns the total size. Prior to this patch, this would cause the SYS_RES_IRQ resource list to not be populated when running with the 'usefdt' loader variable set, to convert the OFW device tree to a FDT. Since the property is always 2 words, read both words, and ignore the second. Tested by: Dennis Clarke (previous attempt) MFC after: 2 weeks	2018-12-06 04:25:12 +00:00
mckusick	a394bed4ee	If the vfs.ffs.dotrimcons sysctl option is enabled while a file deletion is active, specifically after a call to ffs_blkrelease_start() but before the call to ffs_blkrelease_finish(), ffs_blkrelease_start() will have handed out SINGLETON_KEY rather than starting a collection sequence. Thus if we get a SINGLETON_KEY passed to ffs_blkrelease_finish(), we just return rather than trying to finish the nonexistent sequence. Reported by: Warner Losh (imp@) Sponsored by: Netflix	2018-12-06 01:04:56 +00:00
mckusick	2c9178edde	Normally when an attempt is made to mount a UFS/FFS filesystem whose superblock has a check-hash error, an error message noting the superblock check-hash failure is printed and the mount fails. The administrator then runs fsck to repair the filesystem and when successful, the filesystem can once again be mounted. This approach fails if the filesystem in question is a root filesystem from which you are trying to boot. Here, the loader fails when trying to access the filesystem to get the kernel to boot. So it is necessary to allow the loader to ignore the superblock check-hash error and make a best effort to read the kernel. The filesystem may be sufficiently corrupted that the read attempt fails, but there is no harm in trying since the loader makes no attempt to write to the filesystem. Once the kernel is loaded and starts to run, it attempts to mount its root filesystem. Once again, failure means that it breaks to its prompt to ask where to get its root filesystem. Unless you have an alternate root filesystem, you are stuck. Since the root filesystem is initially mounted read-only, it is safe to make an attempt to mount the root filesystem with the failed superblock check-hash. Thus, when asked to mount a root filesystem with a failed superblock check-hash, the kernel prints a warning message that the root filesystem superblock check-hash needs repair, but notes that it is ignoring the error and proceeding. It does mark the filesystem as needing an fsck which prevents it from being enabled for writing until fsck has been run on it. The net effect is that the reboot fails to single user, but at least at that point the administrator has the tools at hand to fix the problem. Reported by: Rick Macklem (rmacklem@) Discussed with: Warner Losh (imp@) Sponsored by: Netflix	2018-12-06 00:09:39 +00:00
brooks	80178b3a55	Further simplify arguments to init. With the removal of BOOTCDROM and fastboot support, this code always passed "-s" or "--". The latter simply terminates getopt(3) processing in init so we only need to pass "-s" in the single user case, or nothing in other cases. The passing of "--" seems to have been done to ensure that the number of arguments passed to init was always the same and thus that argc was the same. Also GC the write-only variable pathlen (not in reviewed version). Reviewed by: kib, jhb Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D18441	2018-12-05 19:18:16 +00:00
alc	df302cc785	Terminate a blist_alloc search when a blst_meta_alloc call fails with cursor == 0. Every call to blst_meta_alloc but the one at the root is made only when the meta-node is known to include a free block, so that either the allocation will succeed, the node hint will be updated, or the last block of the meta-node range is, and remains, free. But the call at the root is made without checking that there is a free block, so in the case that every block is allocated, there is no hint update to prevent the current code from looping forever. Submitted by: Doug Moore <dougm@rice.edu> Reported by: pho Reviewed by: pho Tested by: pho X-MFC with: r340402 Differential Revision: https://reviews.freebsd.org/D17999	2018-12-05 18:26:40 +00:00
brooks	01c59afc06	Remove never enabled support for "fastboot". This has been ifdef notyet since the import of BSD 4.4 Lite Kernel Sources in r1541. Sponsored by: DARPA, AFRL	2018-12-05 17:35:15 +00:00
brooks	07ffa90cb4	Remove ifdef BOOTCDROM option to start init. When BOOTCDROM is defined (via CFLAGS as there is no config option) it causes -C to be passed to init, but our init and the version of sysinstall I glanced at in 6.x don't support -C. The last plausibly related support was removed from the tree in 1995. Reviewed by: kib Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D18431	2018-12-05 17:29:14 +00:00
markj	a74ba1d431	Clamp the INPCB port hash tables to IPPORT_MAX + 1 chains. Memory beyond that limit was previously unused, wasting roughly 1MB per 8GB of RAM. Also retire INP_PCBLBGROUP_PORTHASH, which was identical to INP_PCBPORTHASH. Reviewed by: glebius MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D17803	2018-12-05 17:06:00 +00:00
mjg	f8fa891369	sx: retire SX_NOADAPTIVE The flag has not been used by anything for years and supporting it requires an explicit read from the lock when entering the slow path. The flag value is left unused on purpose. Sponsored by: The FreeBSD Foundation	2018-12-05 16:43:03 +00:00
hselasky	9de0264534	Remove redundant declaration after r341517. MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 15:56:44 +00:00
hselasky	324b106453	Fix build of LinuxKPI on some platforms after r341518. MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 15:53:34 +00:00
hselasky	eca1d53801	Fix LINT build after r341572. MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 15:42:31 +00:00
vmaffione	5b8d772a38	netmap.h: include stdatomic.h The stdatomic.h header exports atomic_thread_fence(), that can be used to implement the nm_stst_barrier() macro needed by netmap. MFC after: 3 days	2018-12-05 15:38:52 +00:00
slavash	13594958bc	mlx4/mlx5: Updated driver version to 3.5.0 Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:25:34 +00:00
slavash	ea81708b05	mlx5en: Implement backpressure indication. The backpressure indication is implemented using an unlimited rate type of mbuf send tag. When the upper layers, typically the socket layer, have obtained such a tag, they can then query the destination driver queue for the current amount of space available in the send queue. A single mbuf send tag may be referenced multiple times and a refcount has been added to the mlx5e_priv structure to track its usage. Because the send tag resides in the mlx5e_channel structure, there is no need to wait for refcounts to reach zero until the mlx4en(4) driver is detached. The channels structure is persistent during the lifetime of the mlx5en(4) driver it belongs to and can thus be accessed without any need of synchronization. The mlx5e_snd_tag structure was extended to contain a type field, because there are now two different tag types which end up in the driver which need to be distinguished. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:25:03 +00:00
slavash	5446d21ddd	mlx5en: Improve configuration of HW LRO. In order to enable HW LRO, both the "hw_lro" sysctl in the mlx5en(4) config space must be set, and the ifconfig(8) LRO capability must be set. Any other settings will disable HW LRO. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:24:33 +00:00
slavash	6820dd38a3	mlx5en: Count all transmitted and received bytes. Add counter for all transmitted and received bytes. Currently only all transmitted and received packets were counted. Fix description of RX LRO counters while at it. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:24:02 +00:00
slavash	24fb5a65d7	mlx5en: Statically allocate and free the channel structure(s). By allocating the worst case size channel structure array at attach time we can eliminate various NULL checks in the fast path. And also reduce the chance for use-after-free issues in the transmit fast path. This change is also a requirement for implementing backpressure support. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:23:31 +00:00
slavash	9162575347	mlx5en: Fix race in mlx5e_ethtool_debug_stats(). Writing to the debug stats variable must be locked, else serialization will be lost which might cause various kernel panics due to creating and destroying sysctls out of order. Make sure the sysctl context is initialized after freeing the sysctl nodes, else they can be freed twice. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:23:01 +00:00
slavash	20eefd3e86	mlx5en: Add support for IFM_10G_LR and IFM_40G_ER4 media types. Inspect the ethernet compliance code to figure out actual cable type by reading the PDDR module info register. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:22:30 +00:00
slavash	2611e22441	mlx5en: Don't set rate on SQs when the SQ is already stopped. This can happen when connections are short lived and leads to a firmware error printout in dmesg, syndrome 0x51cfb0, because the SQ is in the wrong state. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:21:59 +00:00
slavash	ce7a3c5283	mlx5en: Fix for inlining issues in transmit path 1) Don't exceed the driver's own hardcoded TX inline limit. The blueflame register size can be much greater than the hardcoded limit for inlining. Make sure we don't exceed the driver's own limit, because this also means that the maximum number of TX fragments becomes invalid and then memory size assumptions in the TX path no longer hold up. 2) Make sure the mlx5_query_min_inline() function returns an error code. 3) Header inlining is required when using TSO. 4) Catch failure to compute inline header size for TSO. 5) Add support for UDP when computing inline header size. 6) Fix for inlining issues with regards to DSCP. Make sure we inline 4 bytes beyond the ethernet and/or VLAN header to work around a hardware bug extracting the DSCP field from the IPv4/v6 header. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:21:28 +00:00
slavash	e19e32c488	mlx5en: Remove the DRBR and associated logic in the transmit path. The hardware queues are deep enough currently and using the DRBR and associated callbacks only leads to more task switching in the TX path. There is also a race setting the queue_state which can lead to hung TX rings. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:20:57 +00:00
slavash	21d54eb4ed	mlx5en: Implement support for bandwidth limiting by ratio, ETS. Add support for setting the bandwidth limit as a ratio rather than in bits per second. The ratio must be an integer number between 1 and 100 inclusive. Implement the needed firmware commands and SYSCTLs through mlx5en(4). Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:20:26 +00:00
slavash	dd3800d12c	mlx5fpga: Add set and query connect/disconnect FPGA Submitted by: kib@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:19:55 +00:00
slavash	9ba28d9c4d	mlx5fpga: IOCTL for FPGA temperature measurement Submitted by: kib@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:19:23 +00:00
slavash	0479b837bb	mlx5fpga: Support MorseQ board Added and supported new enum "morseQ = 4" for fpga_id field Submitted by: kib@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:18:52 +00:00
slavash	e20d4dfd05	mlx5fpga_tools initial code import. Submitted by: kib@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:17:22 +00:00
slavash	16b94054c2	mlx5fpga: Initial code import. Submitted by: kib@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 14:11:20 +00:00
slavash	4a2699276d	mlx5ib: Set default active width and speed when querying port. Make sure the active width and speed is set in case the translate_eth_proto_oper() function doesn't recognize the current port operation mask. Linux commit: 7672ed33c4c15dbe9d56880683baaba4227cf940 Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:49:11 +00:00
slavash	d61a3353ea	mlx5ib: Make sure the congestion work timer does not escape the drain procedure. If the mlx5_ib_read_cong_stats() function was running when mlx5ib was unloaded, because this function unconditionally restarts the timer, the timer can still be pending after the delayed work has been cancelled. To fix this simply loop on the delayed work cancel procedure as long as it returns non-zero. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:48:39 +00:00
slavash	00d1aa75c0	mlx5ib: Fix null pointer dereference in mlx5_ib_create_srq Although "create_srq_user" does overwrite "in.pas" on some paths, it also contains at least one feasible path which does not overwrite it. Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:48:10 +00:00
slavash	863b4e521d	mlx5ib: Fix sign extension in mlx5_ib_query_device "fw_rev_min(dev->mdev)" with type "unsigned short" (16 bits, unsigned) is promoted in "fw_rev_min(dev->mdev) << 16" to type "int" (32 bits, signed), then sign-extended to type "unsigned long" (64 bits, unsigned). If "fw_rev_min(dev->mdev) << 16" is greater than 0x7FFFFFFF, the upper bits of the result will all be 1. Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:47:41 +00:00
slavash	ba01a3ba32	mlx5: Fix driver version location Driver description should be set by core and not by the Ethernet driver. Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:47:10 +00:00
slavash	d26314fc77	mlx5: Fixes to allow command polling mode to exist alongside event mode. A command is either polling or event driven and the mode cannot change during execution of a command. Make sure the event handler only handle commands which are not polled. This is done by checking the command mode in the command handler before completing commands. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:46:39 +00:00
slavash	40afa928f6	mlx5: Fix wrong size allocation for QoS ETC TC register The driver allocates wrong size (due to wrong struct name) when issuing a query/set request to NIC's register. Linux commit: d14fcb8d877caf1b8d6bd65d444bf62b21f2070c Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:46:09 +00:00
slavash	927825bb6c	mlx5: Add software tx_jumbo_packets counter This counter will represent transmitted packets which have more than 1518 octets. The NIC has multiple hardware counters for counting transmitted packets larger than 1518 octets. Each counter counts the packets in a specific range. We accumulate those counters to have a single counter. Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:45:37 +00:00
slavash	6c134699c3	mlx5: Implement support for configuring PCIe packet write ordering via a sysctl. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:45:08 +00:00
slavash	7384b73a2d	mlx5: Extend vector argument to u64. Else the MLX5_TRIGGERED_CMD_COMP flag will be masked away. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:44:38 +00:00
slavash	a6cbff74f7	mlx5: Add global control to disable firmware reset, for all mlx5 devices. Submitted by: kib@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:44:08 +00:00
slavash	d37a092b07	mlx5: Fix use-after-free in self-healing flow When the mlx5 health mechanism detects a problem while the driver is in the middle of init_one or remove_one, the driver needs to prevent the health mechanism from scheduling future work; if future work is scheduled, there is a problem with use-after-free: the system WQ tries to run the work item (which has been freed) at the scheduled future time. Prevent this by disabling work item scheduling in the health mechanism when the driver is in the middle of init_one() or remove_one(). Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:43:37 +00:00
slavash	f1dbdee4ab	mlx5: Move hw.mlx5 node definition to mlx5_core. Submitted by: kib@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:43:07 +00:00
slavash	bfe42c25f5	mlx5: Convert some spaces into tabs and use device_printf() instead of printf(). Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:42:36 +00:00
slavash	b6bd33a7e4	mlx5: Add SRQ fixes from Linux Combine multiple fixes from Linux to SRQ. Linux commits: c73b791 IB/mlx5: Assign SRQ type earlier 0fd27a8 IB/mlx5: Fix out-of-bound access c2b37f7 IB/mlx5: Fix integer overflows in mlx5_ib_create_srq d63c467 RDMA/mlx5: Fix memory leak in mlx5_ib_create_srq() error path Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:42:06 +00:00
slavash	81609111ff	mlx5: Fix for potential memory leaks. Make sure allocated data gets freed in error cases. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:41:37 +00:00
slavash	268f2aec71	mlx5: Discard unused return values. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:41:06 +00:00
slavash	3e56183e7f	mlx5: Raise fatal IB event when sys error occurs All other mlx5_events report the port number as 1 based, which is how FW reports it in the port event EQE. Reporting 0 for this event causes mlx5_ib to not raise a fatal event notification to registered clients due to a seemingly invalid port. All switch cases in mlx5_ib_event that go through the port check are supposed to set the port now, so just do it once at variable declaration. Linux commit: aba462134634b502d720e15b23154f21cfa277e5 Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:40:36 +00:00
slavash	42900ca3b2	mlx5: Fix integer overflow while resizing CQ The user can provide a very large cqe_size which will cause an integer overflow. Linux commit: 28e9091e3119933c38933cb8fc48d5618eb784c8 Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:40:05 +00:00
slavash	24b7bb65bd	mlx4en: Optimise reception of small packets. Copy small packets like TCP ACKs into a new mbuf reusing the existing mbuf to receive a new ethernet frame. This avoids wasting buffer space for small sized packets. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:39:35 +00:00
slavash	240008849a	mlx4: Make sure default VNET is set when adding a new interface. Adding an interface might be done outside the device_attach() routine and will then cause a panic, due to the VNET not being defined. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:39:05 +00:00
slavash	61a5e7fed7	mlx4en: Remove duplicate statistics variable assignment. The "priv->pkstats.rx_dropped" is written twice in a row. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:38:35 +00:00
slavash	53562971dd	mlx4en: Add support for receiving all data using one or more MCLBYTES sized mbufs. Also when the MTU is greater than MCLBYTES. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:32:46 +00:00
slavash	e108164c56	mlx4en: Add support for netdump. Implement the needed callback functions and support for polling the driver. Differential Revision: https://reviews.freebsd.org/D15259 Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:32:15 +00:00
slavash	a109fe4300	mlx4en: Remove the DRBR and associated logic in the transmit path. The hardware queues are deep enough currently and using the DRBR and associated callbacks only leads to more task switching in the TX path. There is also a race setting the queue_state which can lead to hung TX rings. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:31:45 +00:00
slavash	f646208642	mlx4en: Add driver version to sysctl desc Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:31:14 +00:00
slavash	002f9a9760	mlx4: Add board identifier and firmware version to sysctl In last mlx4 update (r325841) we lost the sysctl to show the firmware version for mlx4 devices. Add both board identifier and firmware version under: sys.device.mlx4_core0.hw sysctl node. Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:30:48 +00:00
slavash	ba283f7367	mlx4core: Add checks for invalid port numbers. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:30:16 +00:00
slavash	f243f307d7	mlx4: Zero initialize device capabilities to avoid use of uninitialized fields. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:29:46 +00:00
slavash	6269d18d7c	mlx4core: Avoid multiplication overflow by casting multiplication. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:29:16 +00:00
slavash	504c72484d	krping: Fix for memory leak in error case. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:27:48 +00:00
slavash	ddf69b52ea	ipoib: Notify on modify QP failure only when relevant Modify QP can fail and it can be acceptable, like when moving from RST to ERR state, all the rest are not acceptable and a message to the log should be printed. The current code prints on all failures and many messages like: "Failed to modify QP to ERROR state" appear, even when supported by the state machine of the QP object. Linux commit: 5dc78ad1904db597bdb4427f3ead437aae86f54c Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:27:17 +00:00
slavash	3050b77031	ipoib: increase the non-cm queue length When a packet needs fragmentation, it might generate more than 3 fragments. With the queue length 3, all fragments are generated faster than the queue is drained, which effectively drops fourth and later fragments on the floor. Submitted by: kib@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:26:47 +00:00
slavash	12154c5ccd	ipoib: Don't do a light flush when MTU is unchanged. When changing the MTU of ibX network interfaces, check that the MTU was really changed before requesting an update of the multicast rules. Else we might go into an infinite loop joining and leaving ibX multicast groups towards the opensm master interface. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:26:17 +00:00
slavash	4093f0685a	ipoib: correct setting MTU from inside ipoib(4). It is not enough to set ifnet->if_mtu to change the interface MTU. The system saves the MTU for a route in the radix tree, and the route cache keeps the interface MTU as well. Since addition of the multicast group causes recalculation of MTU, even bringing the interface up changes MTU from 4042 to 1500, which makes the system configuration inconsistent. Worse, ip_output() prefers route MTU over interface MTU, so large packets are not fragmented and are dropped on the floor. Fix it for ipoib(4) using the same approach (or hack) as was applied for if_tun/if_tap in r339012. Thanks to bz@ for giving the hint. Submitted by: kib@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:25:47 +00:00
slavash	2bd9dcbeba	ibcore: Fix clearing of bound device interface. Binding to a loopback device is not allowed. Make sure the destination device address is global by clearing the bound device interface. Only do this conditionally, else link local addresses won't work. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:25:13 +00:00
slavash	5413daa1a0	ibcore: ip6_dev_find() needs to know the scope ID. Else the wrong network device can be returned for link-local addresses. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:24:43 +00:00
slavash	ed5f1f49e7	ibcore: Fix sleeping in atomic when RoCE is used A couple of places in the CM do spin_lock_irq(&cm_id_priv->lock); ... if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg)) However when the underlying transport is RoCE, this leads to a sleeping function being called with the lock held - the callchain is cm_alloc_response_msg() -> ib_create_ah_from_wc() -> ib_init_ah_from_wc() -> rdma_addr_find_l2_eth_by_grh() -> rdma_resolve_ip() and rdma_resolve_ip() starts out by doing req = kzalloc(sizeof *req, GFP_KERNEL); not to mention rdma_addr_find_l2_eth_by_grh() doing wait_for_completion(&ctx.comp); to wait for the task that rdma_resolve_ip() queues up. Fix this by moving the AH creation out of the lock. Linux commit: c76161181193985087cd716fdf69b5cb6cf9ee85 Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:24:12 +00:00
slavash	cc6a289dfb	ibcore: Add missing unref of netdevice. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:23:44 +00:00
slavash	36bca24c90	ibcore: Fix loopback with rdma-cm. Trying to validate loopback fails because rtalloc1() resolves system local addresses to the loopback network interface, lo0. Fix this by explicitly checking for loopback during validation of the source and destination network address. If the source address belongs to a local network interface and is equal to the destination address, there is no need to run the destination address through rtalloc1(). Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:23:14 +00:00
slavash	2d7a612e9e	ibcore: Make sure all VNETs are scanned for VLAN interfaces. The master network interface and the VLANs may reside in different VNETs. Make sure that all VNETs are searched when scanning for GID entries. Submitted by: netapp Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:22:43 +00:00
slavash	f991e4f2bf	ibcore: Always check return value from ib_init_ah_from_wc(). This prevents code from accepting RoCEv1 connections when only RoCEv2 is enabled and vice versa. Linux commit: 0c4386ec77cfcd0ccbdbe8c2e67dd3a49b2a4c7f Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:22:07 +00:00
slavash	e1a4168430	ibcore: Add missing check for failure. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:21:20 +00:00
slavash	7736259840	ibcore: Fix an array index check The array ib_mad_mgmt_class_table.method_table has MAX_MGMT_CLASS (80) elements. Hence compare the array index with that value instead of with IB_MGMT_MAX_METHODS (128). This patch avoids that Coverity reports the following: Overrunning array class->method_table of 80 8-byte elements at element index 127 (byte offset 1016) using index convert_mgmt_class(mad_hdr->mgmt_class) (which evaluates to 127). Linux commit: 2fe2f378dd45847d2643638c07a7658822087836 Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:20:51 +00:00
slavash	2d572a8cf9	ibcore: Check ib_find_pkey() return value. Linux commit: d3a2418ee36a59bc02e9d454723f3175dcf4bfd9 Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:20:22 +00:00
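
The sign-extension problem described in commit 863b4e521d (mlx5ib: Fix sign extension in mlx5_ib_query_device) can be illustrated with a small model. This is only a sketch, not the driver code: the helper names `as_int32`, `shift_promoted_to_int`, and `shift_widened_first` are hypothetical, Python integers stand in for C types, and the "widen before shifting" variant is one common remedy assumed here for comparison, since the log does not show the actual patch. In C the overflowing signed shift is formally undefined; the model follows the two's-complement behavior the commit message describes.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def as_int32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def shift_promoted_to_int(fw_rev_min):
    """Model `minor << 16` where a 16-bit unsigned `minor` is promoted to int,
    and the int result is then converted to a 64-bit unsigned value."""
    shifted = as_int32((fw_rev_min & 0xFFFF) << 16)  # 32-bit signed result
    return shifted & MASK64  # signed -> unsigned 64-bit sign-extends


def shift_widened_first(fw_rev_min):
    """Model widening `minor` to 64 bits before shifting (assumed remedy)."""
    return ((fw_rev_min & 0xFFFF) << 16) & MASK64


if __name__ == "__main__":
    for minor in (0x7FFF, 0x8000):
        print(hex(minor),
              hex(shift_promoted_to_int(minor)),
              hex(shift_widened_first(minor)))
```

For values below 0x8000 the two helpers agree; from 0x8000 upward the first one sets all of the upper 32 bits, which matches the "greater than 0x7FFFFFFF" condition in the commit message.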

134791 Commits