freebsd-dev

Author	SHA1	Message	Date
Konstantin Belousov	0fd7ea1f21	Add {rd,wr}{fs,gs}base C wrappers for instructions. Tested by: pho (as part of the larger patch) Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-08-14 11:20:54 +00:00
Konstantin Belousov	7bf0049e48	Style. Tested by: pho (as part of the larger patch) Sponsored by: The FreeBSD Foundation MFC after: 3 days	2017-08-14 11:20:10 +00:00
Sepherosa Ziehau	93b4e111bb	hyperv: Update copyright for the files changed in 2017 MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11982	2017-08-14 06:00:50 +00:00
Sepherosa Ziehau	d0cd8231e0	hyperv/hn: Re-set datapath after synthetic parts reattached. Do this even for non-transparent mode VF. Better safe than sorry. MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11981	2017-08-14 05:55:16 +00:00
Sepherosa Ziehau	c2d50b263f	hyperv/hn: Minor cleanup MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11979	2017-08-14 05:46:50 +00:00
Sepherosa Ziehau	a97fff1913	hyperv/hn: Fix/enhance receiving path when VF is activated. - Update hn(4)'s stats properly for non-transparent mode VF. - Allow BPF tapping to hn(4) for non-transparent mode VF. - Don't setup mbuf hash, if 'options RSS' is set. In Azure, when VF is activated, TCP SYN and SYN|ACK go through hn(4) while the rest of segments and ACKs belonging to the same TCP 4-tuple go through the VF. So don't setup mbuf hash, if a VF is activated and 'options RSS' is not enabled. hn(4) and the VF may use neither the same RSS hash key nor the same RSS hash function, so the hash value for packets belonging to the same flow could be different! - Disable LRO. hn(4) will only receive broadcast packets, multicast packets, TCP SYN and SYN|ACK (in Azure), LRO is useless for these packet types. For non-transparent, we definitely _cannot_ enable LRO at all, since the LRO flush will use hn(4) as the receiving interface; i.e. hn_ifp->if_input(hn_ifp, m). While I'm here, remove unapplied comment and minor style change. MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11978	2017-08-14 05:40:52 +00:00
Sepherosa Ziehau	3bed4e54f8	hyperv/hn: Update VF's ibytes properly under transparent VF mode. While I'm here, add a comment about why updating VF's imcast stat is not necessary. MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11948	2017-08-14 05:30:02 +00:00
Ian Lepore	a6e709f29c	Add hinted attachment for non-FDT systems. Also, print a message if setting up the timer fails, because on some types of chips that's the first attempt to access the device. If the chip is missing/non-responsive then you'd get a driver that attached and didn't register the rtc, with no clue about why. On other chip types there are inits that come before timer setup, and they already print messages about errors.	2017-08-14 02:23:10 +00:00
Ian Lepore	1dc8df138d	Add back the drivers for Dallas/Maxim ds13xx and Seiko S35390x now that they've been rewritten/fixed to not cause panics by doing i2c transfers before interrupts are available. PR: 221227	2017-08-14 00:12:14 +00:00
Ian Lepore	098f6cb6e6	Minor fixes and enhancements for the s35390a i2c RTC driver... - Add FDT probe code. - Do i2c transfers with exclusive bus ownership. - Use config_intrhook_oneshot() to defer chip setup because some i2c busses can't do transfers without interrupts. - Add a detach() routine. - Add to module build.	2017-08-14 00:00:24 +00:00
Ian Lepore	90cff13c3c	Remove the old ds1374 driver and use the ds13rtc driver instead. Adjust several mips config files accordingly.	2017-08-13 22:07:42 +00:00
Ian Lepore	3777ed4378	Change "chiptype" to "compatible". Making the hint name the same as the FDT property name should make it easier to document the list of names accepted by both configuration mechanisms.	2017-08-13 21:45:46 +00:00
Ian Lepore	bb2e8108e1	Add a new driver, ds13rtc, that handles all DS13xx series i2c RTC chips. This driver supports only basic timekeeping functionality. It completely replaces the ds133x driver. It can also replace the ds1374 driver, but that will take a few other changes in MIPS code and config, and will be committed separately. It does NOT replace the existing ds1307 driver, which provides access to some of the extended features on the 1307 chip, such as controlling the square wave output signal. If both ds1307 and ds13rtc drivers are present, the ds1307 driver will outbid and win control of the device. This driver can be configured with FDT data, or by using hints on non-FDT systems. In addition to the standard hints for i2c devices, it requires a "chiptype" string of the form "dallas,ds13xx" where 'xx' is the chip id (i.e., the same format as FDT compat strings).	2017-08-13 21:02:40 +00:00
Andrew Turner	062c276886	Add support for multiple GICv3 ITS devices. For this we add sc_irq_base and sc_irq_length to the softc to handle the base number of IRQs available, make gicv3_get_nirqs return the number of available interrupt IDs, and limit which CPUs we send interrupts to based on the numa domain. The last point is only strictly needed on a dual socket ThunderX where we are unable to send MSI/MSI-X interrupts between sockets. Sponsored by: DARPA, AFRL	2017-08-13 18:54:51 +00:00
Ian Lepore	2db14f97de	Add config_intrhook_oneshot(): schedule an intrhook function and unregister it automatically after it runs. The config_intrhook mechanism allows a driver to stall the boot process until device(s) required for booting are available, by not allowing system inits to proceed until all intrhook functions have been unregistered. Virtually all existing code simply unregisters from within the hook function when it gets called. This new function makes that common usage more convenient. Instead of allocating and filling in a struct, passing it to a function that might (in theory) fail, and checking the return code, now a driver can simply call this cannot-fail routine, passing just the intrhook function and its arg. Differential Revision: https://reviews.freebsd.org/D11963	2017-08-13 18:10:24 +00:00
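The one-shot behaviour described in the commit above can be modelled in a few lines. This is a minimal userland sketch in Python, not the kernel code: the class `IntrhookRegistry` and its methods are hypothetical names chosen for illustration, and only the idea (boot stalls while hooks are registered, and a one-shot hook unregisters itself after it runs) comes from the commit message.

```python
class _Hook:
    """One registered intrhook: a function plus its argument."""
    def __init__(self, func, arg):
        self.func = func
        self.arg = arg


class IntrhookRegistry:
    """Hypothetical model of the config_intrhook mechanism."""

    def __init__(self):
        self._pending = []

    def establish(self, func, arg):
        # Classic usage: the caller keeps the hook and must remove it later.
        hook = _Hook(func, arg)
        self._pending.append(hook)
        return hook

    def disestablish(self, hook):
        self._pending.remove(hook)

    def oneshot(self, func, arg):
        # One-shot usage: wrap func so the hook is removed once func has run.
        def run_once(a):
            func(a)
            self.disestablish(hook)
        hook = self.establish(run_once, arg)

    def run_hooks(self):
        # Iterate over a copy because hooks may unregister themselves.
        for hook in list(self._pending):
            hook.func(hook.arg)

    def boot_may_proceed(self):
        # System inits stay stalled while any hook is still registered.
        return not self._pending
```

A driver using the one-shot form only supplies the function and its argument; there is no struct to allocate and no return code to check.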
Kirk McKusick	037331ddbd	When read requests are sent from a filesystem running above g_journal, the g_journal level needs to check whether it is holding a newer copy of the block than that which exists on the disk. If so, it needs to return its copy. If not, it should pass the request down to the disk to fulfill. It currently considers six queues: 0) delayed queue, 1) unsent (current queue), 2) in-flight to the journal (flush queue), 3) active journal (active queue), 4) inactive journal (inactive queue), and 5) inflight to the disk (copy queue). Checking on two of these queues is unnecessary: 0) The delayed requests should not be used for reads because they have not yet been entered into the journal, so their value should reflect the disk contents, not the future contents that are not yet committed. 2) Because all the bio's in the flush queue are also found on the active queue, there is no need to inspect the flush queue for reads since they will be found when searching the active queue. Submitted by: Dr. Andreas Longwitz <longwitz@incore.de> Discussed with: kib MFC after: 1 week	2017-08-13 18:09:22 +00:00
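The read path described in the commit above can be sketched as a lookup over the four queues that are still consulted. This is an illustrative Python model, not g_journal code: the function name `journal_read`, the dictionary layout, and in particular the order in which the queues are searched are assumptions; only the set of queues checked and skipped comes from the commit message.

```python
from typing import Dict, Optional

# Queues still consulted for reads. The delayed and flush queues are skipped:
# delayed requests are not yet in the journal, and every flush-queue bio is
# also on the active queue. The search order below is an assumption.
SEARCHED_QUEUES = ("current", "active", "inactive", "copy")


def journal_read(queues: Dict[str, Dict[int, bytes]],
                 disk: Dict[int, bytes],
                 block: int) -> Optional[bytes]:
    """Return the journal's copy of a block if one is held, else the disk's."""
    for name in SEARCHED_QUEUES:
        data = queues.get(name, {}).get(block)
        if data is not None:
            return data
    # No newer copy is held, so pass the request down to the disk.
    return disk.get(block)
```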
Kirk McKusick	8fccf8ffd7	Eliminate a variable that is only ever set. Submitted by: Dr. Andreas Longwitz <longwitz@incore.de> Discussed with: kib MFC after: 1 week	2017-08-13 18:06:38 +00:00
Alan Cox	bee93d3cf0	The _meta_ functions include a radix parameter, a blk parameter, and another parameter that identifies a starting point in the memory address block. Radix is a power of two, blk is a multiple of radix, and the starting point is in the range [blk, blk+radix), so that blk can always be computed from the other two. This change drops the blk parameter from the meta functions and computes it instead. (On amd64, for example, this change reduces subr_blist.o's text size by 7%.) It also makes the radix parameters unsigned to address concerns that the calculation of '-radix' might overflow without the -fwrapv option. (See https://reviews.freebsd.org/D11819.) Submitted by: Doug Moore <dougm@rice.edu> MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D11964	2017-08-13 16:39:49 +00:00
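The reason blk can be dropped follows from the constraints stated in the commit above: if radix is a power of two, blk is a multiple of radix, and the starting point lies in [blk, blk+radix), then blk is the starting point with its low bits cleared. A minimal Python sketch of that arithmetic, with the hypothetical helper name `meta_block_base`:

```python
def meta_block_base(start: int, radix: int) -> int:
    """Recover blk from a starting point and a power-of-two radix."""
    if radix <= 0 or radix & (radix - 1):
        raise ValueError("radix must be a positive power of two")
    # Clearing the low log2(radix) bits rounds start down to a multiple of
    # radix; in C this is the 'start & -radix' form the message alludes to.
    return start & ~(radix - 1)
```

For example, with radix 16 every starting point from 32 through 47 maps back to blk 32.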
Roger Pau Monné	72446721e4	srat: use pmap_unmapbios To match the pmap_mapbios. Reported by: jhb MFC with: r322403	2017-08-13 14:50:38 +00:00
Nathan Whitehorn	c670f31f19	Move NVME controller shutdown from being called as part of module unloading to being called through the newbus DEVICE_SHUTDOWN() path. This ensures that the NVME controller gets shut down before the device and bus disappear and prevents data corruption on shutdown on at least Samsung EVO 960 SSDs. PR: kern/211852 Reviewed by: imp MFC after: 2 weeks	2017-08-12 22:13:06 +00:00
John Baldwin	992029ba10	Reliably enable debug exceptions on all CPUs. Previously, debug exceptions were only enabled on the boot CPU if DDB was enabled in the dbg_monitor_init() function. APs also called this function, but since mp_machdep.c doesn't include opt_ddb.h, the APs ended up calling an empty stub defined in <machine/debug_monitor.h> instead of the real function. Also, if DDB was not enabled in the kernel, the boot CPU would not enable debug exceptions. Fix this by adding a new dbg_init() function that always clears the OS lock to enable debug exceptions which the boot CPU and the APs call. This function also calls dbg_monitor_init() to enable hardware breakpoints from DDB on all CPUs if DDB is enabled. Eventually base support for hardware breakpoints/watchpoints will need to move out of the DDB-only debug_monitor.c for use by userland debuggers. Reviewed by: andrew Differential Revision: https://reviews.freebsd.org/D12001	2017-08-12 18:42:54 +00:00
John Baldwin	c9ee3caf19	Don't panic for PT_GETFPREGS. Only fetch the VFP state from the CPU if the thread whose registers are being requested is the current thread. If a stopped thread's registers are being fetched by a debugger, the saved state in the PCB is already valid. Reviewed by: andrew MFC after: 1 week	2017-08-12 18:38:18 +00:00
Ian Lepore	4541b9aab6	Bid for the device with BUS_PROBE_GENERIC, because this is very much a generic driver with minimal feature support for a large number of chips. More featureful per-chip drivers might exist (especially out-of-tree) and those should win the bidding even if they use BUS_PROBE_DEFAULT.	2017-08-12 17:39:32 +00:00
Navdeep Parhar	5d973bad2a	cxgbe(4): Save the last reported link parameters and compare them with the current state to determine whether to generate a link-state change notification. This fixes a bug introduced in r321063 that caused the driver to sometimes skip these notifications. Reported by: Jason Eggleston @ LLNW MFC after: 3 days Sponsored by: Chelsio Communications	2017-08-12 14:02:19 +00:00
John Baldwin	319e8f3e36	Fix a typo.	2017-08-11 22:47:32 +00:00
Ed Maste	67d4951ddf	arm: enable ARM_MANY_BOARD in NOTES for LINT build Added in r238189, ARM_MANY_BOARD adds support for multiple ARM boards in a single kernel. Include it for LINT builds to avoid duplicate symbol errors when linking with lld. Sponsored by: The FreeBSD Foundation	2017-08-11 19:49:29 +00:00
Mark Johnston	d055a46cda	Bump KERNELDUMP_BUFFER_SIZE to 4096. The encrypted kernel dump code writes data in blocks of this size. A buffer size of 4096 allows encrypted dumps to work with 4Kn drives. Reviewed by: cem Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D11870	2017-08-11 19:24:08 +00:00
Ian Lepore	c82d887d47	Stop calling atrtc_set() from the xen timer clock_settime() method. That removes the only reference to atrtc_set() from outside of atrtc.c, so make it static. The xen timer driver registers as a realtime clock with 1us resolution. In the past that resulted in only the xen timer's clock_settime() getting called, so it would call atrtc_set() to set the hardware clock as well. As of r32090, the clock_settime() method of all registered realtime clocks gets called, so the xen driver no longer needs to chain-call the lower-resolution driver. Thanks to royger@ for talking me through the xen stuff, and for testing.	2017-08-11 19:02:11 +00:00
Ed Maste	9432a9bd9f	Rename at91_pmc's M_PMC malloc type to avoid duplicate definition M_PMC is defined in sys/dev/hwpmc/hwpmc_mod.c, and the LINT kernel build fails when linking with lld due to a duplicate symbol error. Sponsored by: The FreeBSD Foundation	2017-08-11 18:09:26 +00:00
David C Somayajulu	45f1312387	Performance enhancements to reduce CPU utilization for large number of TCP connections (order of tens of thousands), with predominantly Transmits. Choice to perform receive operations either in IThread or Taskqueue Thread. Submitted by: Vaishali.Kulkarni@cavium.com MFC after: 5 days	2017-08-11 17:43:25 +00:00
Ryan Libby	9de5f67de2	x86/crc32_sse42.c: quiet unused function warning Reviewed by: cem Approved by: markj (mentor) Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D11980	2017-08-11 17:05:31 +00:00
Mark Johnston	af0460beda	Have sendfile_swapin() use vm_page_grab_pages(). Reviewed by: alc, kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D11942	2017-08-11 16:32:24 +00:00
Mark Johnston	9df950b35d	Modify vm_page_grab_pages() to handle VM_ALLOC_NOWAIT. This will allow its use in sendfile_swapin(). Reviewed by: alc, kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D11942	2017-08-11 16:29:22 +00:00
Alan Cox	6921451dab	An invalid page can't be dirty. Reviewed by: kib MFC after: 1 week	2017-08-11 16:27:54 +00:00
Roger Pau Monné	c642d2f5b5	acpi/srat: fix build without DMAP Use pmap_mapbios to map memory used to store the cpus array. Reported by: lwhsu X-MFC-with: r322348	2017-08-11 14:19:55 +00:00
Andrew Turner	a92a2f00b1	Only return the current cpu if it's in the cpumask. When we restrict the cpumask it probably means we are unable to send interrupts to CPUs outside the map. As such only return the current CPU when it's within the mask otherwise return the first valid CPU. This is needed on ThunderX as, in a dual socket configuration, we are unable to send MSI/MSI-X interrupts between sockets. Reviewed by: mmel Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D11957	2017-08-11 12:45:58 +00:00
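The selection rule in the commit above is small enough to state directly. In this Python sketch the name `pick_cpu` is hypothetical, and "first valid CPU" is taken to mean the lowest-numbered CPU in the mask, which is an assumption about the ordering.

```python
from typing import Set


def pick_cpu(current_cpu: int, cpumask: Set[int]) -> int:
    """Prefer the current CPU, but never return a CPU outside the mask."""
    if not cpumask:
        raise ValueError("cpumask must contain at least one CPU")
    if current_cpu in cpumask:
        return current_cpu
    # Assumption: the "first valid CPU" is the lowest-numbered one.
    return min(cpumask)
```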
Hans Petter Selasky	ebf854802d	Make sure the "vm_flags" and "vm_page_prot" fields get set correctly in the VM area structure in the LinuxKPI when doing mmap() and that unsupported bits are masked away. While at it fix some redundant use of parentheses inside some related macros. Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com> MFC after: 1 week Sponsored by: Mellanox Technologies	2017-08-11 10:44:40 +00:00
Mark Johnston	0b7bd01a82	Add a specialized function for DRM drivers to register themselves. Such drivers attach to a vgapci bus rather than directly to a pci bus. For the rest of the LinuxKPI to work correctly in this case, we override the vgapci bus' ivars with those of the grandparent. Reviewed by: hselasky MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D11932	2017-08-11 03:59:48 +00:00
Mark Johnston	7e05ffa6e6	Micro-optimize kmem_unback(). We can remove some unnecessary object radix tree lookups by using the object memq to iterate over pages in the specified range. This does not, however, eliminate the lookup needed in vm_page_free_toq() to remove each tree entry. Reviewed by: alc, kib (previous revision) MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D11945	2017-08-11 03:09:11 +00:00
Mark Johnston	2c642ec1e7	Make vm_page_sunbusy() assert that the page is unlocked. Reviewed by: kib MFC after: 1 week Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D11946	2017-08-10 22:43:38 +00:00
Ian Lepore	c89cfb4377	Ensure the clocks driver is attached before any drivers that need to enable clocks in their attach().	2017-08-10 19:42:30 +00:00
Roger Pau Monné	3f0a9fe06c	mptable: fix i386 build failure Reported by: emaste X-MFC-with: r322347	2017-08-10 17:46:57 +00:00
Kenneth D. Merry	6d4ffcb4ac	Changes to make mps(4) and mpr(4) handle reinit with reallocation. When the mps(4) and mpr(4) drivers need to reinitialize the firmware, they sometimes need to reallocate all of the memory allocated by the driver. The reallocation happens whenever the IOC Facts change. That should only happen after a firmware upgrade. If the reinitialization happens as a result of a timed out command sent to the card, the command that timed out and triggered the reinit may have been freed if iocfacts_allocate() reallocated all memory. If the caller attempts to access the command after that, the kernel will panic because the caller will be dereferencing freed memory. The solution is to set a flag in the softc when we reallocate, and avoid dereferencing the command structure if we've reallocated. The changes are largely the same in both drivers, since mpr(4) is a derivative of mps(4). o In iocfacts_allocate(), if the IOC Facts have changed and we need to reallocate, set the REALLOCATED flag in the softc. o Change wait_command() to take a struct mps_command ** instead of a struct mps_command *. This allows us to NULL out the caller's command pointer if we have to reinit the controller and the data structures get reallocated. (The REALLOCATED flag will be set in the softc if that has happened.) o In every place that calls wait_command(), make sure we handle the case where the command is NULL after the call. o The mpr(4) driver has mpr_request_polled() which can also reinitialize the card. Also check for reallocation there. Reviewed by: scottl, slm MFC after: 1 week Sponsored by: Spectra Logic	2017-08-10 14:59:17 +00:00
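The calling convention described in the commit above can be illustrated with a small model. Python has no pointer-to-pointer, so this sketch returns the (possibly cleared) command alongside the error code instead of NULLing the caller's pointer; `Softc`, `Command`, `reinit`, and the `timed_out` and `ioc_facts_changed` parameters are hypothetical stand-ins, not the drivers' real interfaces.

```python
import errno
from typing import Optional, Tuple


class Command:
    """Stand-in for a driver command structure."""
    def __init__(self, tag: int):
        self.tag = tag


class Softc:
    """Stand-in for the driver softc; only the flag matters here."""
    def __init__(self):
        self.reallocated = False


def reinit(sc: Softc, ioc_facts_changed: bool) -> None:
    # Reallocation happens only when the IOC Facts changed; record it.
    sc.reallocated = ioc_facts_changed


def wait_command(sc: Softc, cm: Command, timed_out: bool,
                 ioc_facts_changed: bool = False
                 ) -> Tuple[int, Optional[Command]]:
    """Return (error, command); command is None if it was freed by a reinit."""
    if not timed_out:
        return 0, cm
    reinit(sc, ioc_facts_changed)
    if sc.reallocated:
        # The command memory is gone, so the caller must not touch it.
        return errno.ETIMEDOUT, None
    return errno.ETIMEDOUT, cm
```

Every caller then checks for a missing command before looking at any of its fields, which mirrors the "handle the case where the command is NULL after the call" rule.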
Sean Bruno	5f593927a8	Purge deprecated locking macros. Submitted by: Matt Macy <matt@mattmacy.io> Sponsored by: Limelight Networks	2017-08-10 14:54:36 +00:00
Ruslan Bukin	af19cc59ca	Support for v1.10 (latest) of RISC-V privilege specification. New version is not compatible on supervisor mode with v1.9.1 (previous version). Highlights: o BBL (Berkeley Boot Loader) provides no initial page tables anymore allowing us to choose VM, to build page tables manually and enable MMU in S-mode. o SBI interface changed. o GENERIC kernel. FDT is now chosen standard for RISC-V hardware description. DTB is now provided by Spike (golden model simulator). This allows us to introduce GENERIC kernel. However, description for console and timer devices is not provided in DTB, so move these devices temporarily to the nexus bus. o Supervisor can't access userspace by default. Solution is to set SUM (permit Supervisor User Memory access) bit in sstatus register. o Compressed extension is now turned on by default. o External GCC 7.1 compiler used. o _gp renamed to __global_pointer$ o Compiler -march= string is now in use allowing us to choose required extensions (compressed, FPU, atomic, etc). Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D11800	2017-08-10 14:18:09 +00:00
Marcin Wojtas	e2b9d20234	Enable OF_setprop API function to add property in FDT This patch modifies function ofw_fdt_setprop (called by OF_setprop), so that it can add property, when replacing is not possible. Adding property is needed to fixup FDT's that have missing properties. Submitted by: Patryk Duda <pdk@semihalf.com> Reviewed by: nwhitehorn, cognet (mentor) Approved by: cognet (mentor) Obtained from: Semihalf Differential Revision: https://reviews.freebsd.org/D11879	2017-08-10 13:45:56 +00:00
Hans Petter Selasky	f6800be3ce	Use integer type to pass around jiffies and/or ticks values in the LinuxKPI because in FreeBSD ticks are 32-bit. MFC after: 1 week Sponsored by: Mellanox Technologies	2017-08-10 13:05:40 +00:00
Hans Petter Selasky	4ef8a6301f	Fixes for wait event in the LinuxKPI. These are regression issues after r319757. 1) Correct the return value from __wait_event_common() from 1 to 0 in case the timeout is specified as MAX_SCHEDULE_TIMEOUT. In the other case __ret is zero and will be substituted in the last part of the macro with the appropriate value before return. 2) Make sure the "timeout" argument is cast to "int" before evaluating negativity. Else the signedness of a "long" might be checked instead of the signedness of an integer. 3) The wait_event() function should not have a return value. Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com> MFC after: 1 week Sponsored by: Mellanox Technologies	2017-08-10 13:00:10 +00:00
Hans Petter Selasky	8ea4441598	Make sure the linux_wait_event_common() function in the LinuxKPI properly handles a timeout value of MAX_SCHEDULE_TIMEOUT which basically means there is no timeout. This is a regression issue after r319757. While at it change the type of returned variable from "long" to "int" to match the actual return type. MFC after: 1 week Sponsored by: Mellanox Technologies	2017-08-10 12:51:04 +00:00
Roger Pau Monné	a74bb29ada	x86: bump MAX_APIC_ID to 512 Introduce a new define to take into account the xAPIC ID limit, for systems where x2APIC is not available/reliable. Also change some of the usages of the APIC ID to use an unsigned int (which is the correct storage type to deal with x2APIC IDs as found in x2APIC MADT entries). This allows booting FreeBSD on a box with 256 CPUs and APIC IDs up to 295: FreeBSD/SMP: Multiprocessor System Detected: 256 CPUs FreeBSD/SMP: 1 package(s) x 64 core(s) x 4 hardware threads Package HW ID = 0 Core HW ID = 0 CPU0 (BSP): APIC ID: 0 CPU1 (AP/HT): APIC ID: 1 CPU2 (AP/HT): APIC ID: 2 CPU3 (AP/HT): APIC ID: 3 [...] Core HW ID = 73 CPU252 (AP): APIC ID: 292 CPU253 (AP/HT): APIC ID: 293 CPU254 (AP/HT): APIC ID: 294 CPU255 (AP/HT): APIC ID: 295 Submitted by: kib (previous version) Relnotes: yes MFC after: 1 month Reviewed by: kib Differential revision: https://reviews.freebsd.org/D11913	2017-08-10 09:16:40 +00:00
Roger Pau Monné	84525e55c1	x86: make the arrays that depend on MAX_APIC_ID dynamic So that MAX_APIC_ID can be bumped without wasting memory. Note that the usage of MAX_APIC_ID in the SRAT parsing forces the parser to allocate memory directly from the phys_avail physical memory array, which is not the best approach probably, but I haven't found any other way to allocate memory so early in boot. This memory is not returned to the system afterwards, but at least it's sized according to the maximum APIC ID found in the MADT table. Sponsored by: Citrix Systems R&D MFC after: 1 month Reviewed by: kib Differential revision: https://reviews.freebsd.org/D11912	2017-08-10 09:16:03 +00:00
Roger Pau Monné	fd1f83fb45	apic_enumerator: only set mp_ncpus and mp_maxid at probe cpus phase Populate the lapics arrays and call cpu_add/lapic_create in the setup phase instead. Also store the max APIC ID found in the newly introduced max_apic_id global variable. This is a requirement in order to make the static arrays currently using MAX_LAPIC_ID dynamic. Sponsored by: Citrix Systems R&D MFC after: 1 month Reviewed by: kib Differential revision: https://reviews.freebsd.org/D11911	2017-08-10 09:15:18 +00:00
Sean Bruno	5c5ca36ca2	Don't leak mbufs if clusters exceeds the number of segments. This would leak mbufs over time causing crashes. PR: 221202 Submitted by: Matt Macy <matt@mattmacy.io> Reported by: gergely.czuczy@harmless.hu Sponsored by: Limelight Networks	2017-08-10 03:43:23 +00:00
Sean Bruno	18a660b344	Export IFCAP_HWSTATS so that we don't experience double stats counting on iflib enabled devices. PR: 220198 Submitted by: Matt Macy <matt@mattmacy.io> Reported by: Ben Woods <woodsb02@freebsd.org> Sponsored by: Limelight Networks	2017-08-10 03:11:05 +00:00
David C Somayajulu	b284b46dc4	Provide a compile-time option to choose receive processing in either Ithread or Taskqueue Thread.	2017-08-09 22:18:49 +00:00
Ryan Libby	2473c1b145	i386/boot2: -fno-asynchronous-unwind-tables for gcc The amd64 build of boot2 was failing with gcc 6.3.0 due to being more than 1 kB too large. It was apparently generating a .eh_frame section which was not being removed by objcopy -S. The .eh_frame section seems to be mandatory per the amd64 ABI, but boot2 is compiled for i386 (uses -m32), and therefore should be optional in this context. Suppress generation of .eh_frame with the -fno-asynchronous-unwind-tables flag to gcc. This saves 1348 bytes (the limit is 7680 bytes). Reviewed by: dim, imp Approved by: markj (mentor) Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D11928	2017-08-09 20:13:49 +00:00
Andrey V. Elsukov	e54647920b	Make user supplied data checks a bit stricter. key_msg2sp() is used for parsing data from setsockopt(IP[V6]_IPSEC_POLICY) call. This socket option is usually used to configure IPsec bypass for socket. Only privileged user can set this socket option. The message syntax is described here http://www.kame.net/newsletter/20021210/ and our libipsec is usually used to create the correct request. Add additional checks: * that sadb_x_ipsecrequest_len is not out of bounds of user supplied buffer * that src/dst's sa_len is the same * that 2*sa_len is not out of bounds of user supplied buffer * that 2*sa_len fits into bounds of sadb_x_ipsecrequest Reported by: Ilja van Sprundel MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D11796	2017-08-09 19:58:38 +00:00
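The additional checks listed in the commit above amount to a handful of length comparisons. The Python sketch below is an illustrative model only: the function name `ipsecrequest_is_sane`, its parameters, and the default `header_len` of 16 bytes are assumptions, and the real key_msg2sp() operates on C structures rather than bare integers.

```python
def ipsecrequest_is_sane(remaining: int, req_len: int,
                         src_sa_len: int, dst_sa_len: int,
                         header_len: int = 16) -> bool:
    """Validate one request's lengths against the user supplied buffer.

    remaining  -- bytes left in the user supplied buffer
    req_len    -- the request's own length field
    header_len -- assumed size of the fixed request header
    """
    # The request itself must lie inside the user supplied buffer.
    if req_len < header_len or req_len > remaining:
        return False
    # Source and destination addresses must have the same length.
    if src_sa_len != dst_sa_len:
        return False
    pair_len = 2 * src_sa_len
    # Both addresses must lie inside the user supplied buffer.
    if pair_len > remaining - header_len:
        return False
    # Both addresses must also fit inside the request's declared length.
    if pair_len > req_len - header_len:
        return False
    return True
```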
Jung-uk Kim	b5669d0aa8	Split identify_cpu() into two functions for amd64 as we do for i386. This reduces diff between amd64 and i386. Also, it fixes a regression introduced in r322076, i.e., identify_hypervisor() failed to identify some hypervisors. This function assumes cpu_feature2 is already initialized. Reported by: dexuan Tested by: dexuan	2017-08-09 18:09:09 +00:00
Gleb Smirnoff	ef3266d58a	Plug uninitialized stack variable leak in sendfile(2). Reported by: Ilja Van Sprundel <ivansprundel ioactive.com> Submitted by: Domagoj Stolfa <domagoj.stolfa gmail.com> MFC after: 1 week Security: uninitialized stack variable leak	2017-08-09 17:48:38 +00:00
Warner Losh	0038725697	Also provide a warning for geom_fox. Differential Review: https://reviews.freebsd.org/D11935 Requested by: jhb@ MFC After: 3 days	2017-08-09 16:37:37 +00:00
Warner Losh	20995eab57	Mark geom classes as deprecated. geom_bsd, geom_mbr and geom_sunlabel have been obsolete since Marcel Moolenaar's geom_part was in FreeBSD 7. They haven't been in GENERIC since FreeBSD 8. Add warning when used. geom_vol_ffs has been obsolete since ufs support to geom_label was committed in FreeBSD 5. It hasn't been in GENERIC since FreeBSD 5. Add warning when used. geom_fox has been obsolete since gmultipath was committed in FreeBSD 7. (no warning added, since this is a very obscure class). These will all be removed in FreeBSD 12. MFC After: 3 days Differential Revision: https://reviews.freebsd.org/D11935 Note: Classes will be removed after MFC	2017-08-09 16:15:24 +00:00
Alexander Motin	2b2a6eb95e	Missing remnant of 322309. MFC after: 1 week	2017-08-09 13:46:16 +00:00
Andrey V. Elsukov	95e8b991ca	Add to if_enc(4) ability to capture packets via BPF after pfil processing. New flag 0x4 can be configured in net.enc.[in|out].ipsec_bpf_mask. When it is set, if_enc(4) additionally captures a packet via BPF after invoking pfil hook. This may be useful for debugging. MFC after: 2 weeks Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D11804	2017-08-09 12:24:07 +00:00
Alexander Motin	6750c3d0fa	Use "Ibex Peak" codename for "5 Series/3400 Series" chipsets. This is shorter and unifies naming with later chipsets. MFC after: 1 week	2017-08-09 12:21:17 +00:00
Alexander Motin	aaa9b2b3f3	Add new Intel Lewisburg and Union Point chipset PCI IDs. While there, polish some old AHCI ones, since they are still reused. MFC after: 1 week	2017-08-09 12:03:12 +00:00
Oleg Bulyzhin	ff21796d25	Fix comment typo.	2017-08-09 10:46:34 +00:00
Hans Petter Selasky	768a720e95	Print maximum MTU when trying to set invalid MTU in the mlx4en(4) driver. Useful for debugging. Submitted by: Sepherosa Ziehau <sephe@dragonflybsd.org> MFC after: 3 days Sponsored by: Mellanox Technologies	2017-08-09 10:32:51 +00:00
Hans Petter Selasky	a3d0173d98	Increment queue drops in the network statistics when transmitted packets are dropped by the mlx4en(4) driver. Submitted by: Sepherosa Ziehau <sephe@dragonflybsd.org> MFC after: 3 days Sponsored by: Mellanox Technologies	2017-08-09 10:30:55 +00:00
Hans Petter Selasky	f7833544f1	Add support for RX and TX statistics when the mlx4en(4) PCI device is in VF or SRIOV mode typically in a virtual machine environment. Submitted by: Sepherosa Ziehau <sephe@dragonflybsd.org> MFC after: 3 days Sponsored by: Mellanox Technologies	2017-08-09 10:27:21 +00:00
Alexander Motin	c901a9b1f1	Do not lose CCB flags after r320493. There is at least CAM_UNLOCKED that should be kept. MFC after: 3 days	2017-08-09 09:13:15 +00:00
Dag-Erling Smørgrav	d80fbbee5f	Correct sysctl names.	2017-08-09 07:24:58 +00:00
Sepherosa Ziehau	9c6cae2431	hyperv/hn: Implement transparent mode network VF. How network VF works with hn(4) on Hyper-V in transparent mode: - Each network VF has a corresponding hn(4). - The network VF and its corresponding hn(4) have the same hardware address. - Once the network VF is attached, the corresponding hn(4) waits several seconds to make sure that the network VF attach routine completes, then: o Set the intersection of the network VF's if_capabilities and the corresponding hn(4)'s if_capabilities to the corresponding hn(4)'s if_capabilities. And adjust the corresponding hn(4) if_capable and if_hwassist accordingly. (*) o Make sure that the corresponding hn(4)'s TSO parameters meet the constraints posed by both the network VF and the corresponding hn(4). (*) o The network VF's if_input is overridden. The overriding if_input changes the input packet's rcvif to the corresponding hn(4). The network layers are tricked into thinking that all packets are received by the corresponding hn(4). o If the corresponding hn(4) was brought up, bring up the network VF. Transmissions dispatched to the corresponding hn(4) are redispatched to the network VF. o Bringing down the corresponding hn(4) also brings down the network VF. o All IOCTLs issued to the corresponding hn(4) are pass-through'ed to the network VF; the corresponding hn(4) changes its internal state if necessary. o The media status of the corresponding hn(4) solely relies on the network VF. o If there are multicast filters on the corresponding hn(4), allmulti will be enabled on the network VF. (**) - Once the network VF is detached, undo all damage done to the corresponding hn(4) in the above item. NOTE: No operation should be issued directly to the network VF, if the network VF transparent mode is enabled. The network VF transparent mode can be enabled by setting tunable hw.hn.vf_transparent to 1. The network VF transparent mode is _not_ enabled by default, as of this commit. The benefit of the network VF transparent mode is that the network VF attachment and detachment are transparent to all network layers; e.g. live migration detaches and reattaches the network VF. The major drawbacks of the network VF transparent mode: - The netmap(4) support is lost, even if the VF supports it. - ALTQ does not work, since if_start method cannot be properly supported. (*) These decisions were made so that things will not be messed up too much during the transition period. (**) This does _not_ need to go through the fancy multicast filter management stuff like what vlan(4) has, at least currently: - As of this writing, multicast does not work in Azure. - As of this writing, multicast packets go through the corresponding hn(4). MFC after: 3 days Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D11803	2017-08-09 05:59:45 +00:00
Kirk McKusick	77b63aa0fc	Since the switch to GPT disk labels, fsck for UFS/FFS has been unable to automatically find alternate superblocks. This checkin places the information needed to find alternate superblocks at the end of the area reserved for the boot block. Filesystems created with a newfs of this vintage or later will create the recovery information. If you have a filesystem created prior to this change and wish to have a recovery block created for your filesystem, you can do so by running fsck in foreground mode (i.e., do not use the -p or -y options). As it starts, fsck will ask ``SAVE DATA TO FIND ALTERNATE SUPERBLOCKS'' to which you should answer yes. Discussed with: kib, imp MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D11589	2017-08-09 05:17:21 +00:00
Alan Cox	5471caf6f1	Introduce vm_page_grab_pages(), which is intended to replace loops calling vm_page_grab() on consecutive page indices. Besides simplifying the code in the caller, vm_page_grab_pages() allows for batching optimizations. For example, the current implementation replaces calls to vm_page_lookup() on consecutive page indices by cheaper calls to vm_page_next(). Reviewed by: kib, markj Tested by: pho (an earlier version) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D11926	2017-08-09 04:23:04 +00:00
Marcin Wojtas	29a263df94	Update pl310 node in Armada 38x DTS to match the one used in Linux Since the cache controller nodes fixup is added to the platform code, this patch aligns it to the Linux device tree representation. Submitted by: Patryk Duda <pdk@semihalf.com> Reviewed by: cognet (mentor) Approved by: cognet (mentor) Obtained from: Semihalf Differential Revision: https://reviews.freebsd.org/D11884	2017-08-09 01:31:05 +00:00
Marcin Wojtas	75b2aa51b4	Enable pl310 coherent operation in platform init for Armada 38x Updating the PL310 software context sc_io_coherent field in the platform_pl310_init() routine for Armada 38x helps to avoid using the 'arm,io-coherent' property, which is by default not present in the device tree node in Linux. This way another step for DT unification between the two operating systems is done. The improvement will also work after enabling PLATFORM for Marvell ARMv7 SoCs. Reviewed by: andrew, cognet (mentor) Approved by: cognet (mentor) Obtained from: Semihalf Differential Revision: https://reviews.freebsd.org/D11883	2017-08-09 01:25:47 +00:00
Marcin Wojtas	30608f6dd6	Remove clock-frequency properties from Armada 38x timer nodes Since the timers' base frequency setting is added to the platform code, this patch removes clock-frequency properties from global and twd timers, aligning both to the Linux device tree. Submitted by: Patryk Duda <pdk@semihalf.com> Reviewed by: cognet (mentor) Approved by: cognet (mentor) Obtained from: Semihalf Differential Revision: https://reviews.freebsd.org/D11882	2017-08-09 01:20:53 +00:00
Marcin Wojtas	1070a9141c	Dynamically configure timers' base frequency for Armada 38x Instead of using 'clock-frequency' device tree property for global/twd mpcore timers of Armada 38x SoCs, set it in platform_late_init stage with arm_tmr_change_frequency() function. Reviewed by: cognet (mentor) Approved by: cognet (mentor) Obtained from: Semihalf Differential Revision: https://reviews.freebsd.org/D11881	2017-08-09 01:14:29 +00:00
Marcin Wojtas	edf3dd3b0a	Enable using ofw_bus_find_compatible in early platform code Before this patch, ofw_bus_find_compatible() used memory allocations to find the compatible node and the property's length. This way there was always a suitably sized buffer for the property; however, this approach also had a disadvantage: ofw_bus_find_compatible() couldn't be used when malloc is not available, e.g. during the fdt fixup stage. In order to remove the usage limitation of ofw_bus_find_compatible(), this patch modifies the function to use ofw_bus_node_is_compatible() (instead of the variant with the _int suffix), which uses a fixed buffer on the stack instead of dynamic allocations. Submitted by: Patryk Duda <pdk@semihalf.com> Reviewed by: nwhitehorn, cognet (mentor) Approved by: cognet (mentor) Obtained from: Semihalf Differential Revision: https://reviews.freebsd.org/D11880	2017-08-09 01:06:40 +00:00
Marcin Wojtas	a355bb8846	Add support for "compatible" parameter in ofw_fdt_fixup Sometimes it's convenient to provide fixup to many boards that use the same SoC family (eg. Marvell Armada 38x). Instead of putting multiple entries in fdt_fixup_table, use one entry which refers to all boards with given SoC. Submitted by: Patryk Duda <pdk@semihalf.com> Reviewed by: nwhitehorn, cognet (mentor) Approved by: cognet (mentor) Obtained from: Semihalf Differential Revision: https://reviews.freebsd.org/D11878	2017-08-09 00:56:29 +00:00
Marcin Wojtas	f2d9a004fb	Restore original /soc ranges on Marvell Armada 38x boards Because fdt_get_ranges can process now multiple 'ranges' entries, restoring the ranges from original Linux device trees is possible. Submitted by: Patryk Duda <pdk@semihalf.com> Reviewed by: cognet (mentor) Approved by: cognet (mentor) Obtained from: Semihalf Differential Revision: https://reviews.freebsd.org/D11877	2017-08-09 00:51:45 +00:00
Marcin Wojtas	bfd084c8c4	Enable parsing simple-bus 'ranges' with multiple entries This patch makes it possible to boot with up to 8 ranges in soc. Dynamic allocation cannot be used, because the fdt_get_ranges function is called early, when malloc is not available. The change is required for the alignment of Marvell Armada 38x device trees present in sys/gnu/dts/arm - originally the platform has 6 entries in simple-bus 'ranges'. Submitted by: Patryk Duda <pdk@semihalf.com> Reviewed by: manu, nwhitehorn, cognet (mentor) Approved by: cognet (mentor) Obtained from: Semihalf Differential Revision: https://reviews.freebsd.org/D11876	2017-08-09 00:45:25 +00:00
Ian Lepore	b8c53507cb	Remove the ds133x and s35390a i2c RTC drivers for now. They both do i2c transfers in their probe() or attach() routines, and that doesn't work when the low-level controller requires interrupts to be functional. The DS133x family of chips is nearly identical to the DS1307 and support for them should be added to that driver, then the ds133x driver can be deleted. The s35390a driver just needs a non-trivial workover. In both cases that work will be done and committed separately.	2017-08-08 22:58:34 +00:00
Kristof Provost	7f3ad01804	pf_get_sport(): Prevent possible endless loop when searching for an unused nat port This is an import of Alexander Bluhm's OpenBSD commit r1.60, the first chunk had to be modified because on OpenBSD the 'cut' declaration is located elsewhere. Upstream report by Jingmin Zhou: https://marc.info/?l=openbsd-pf&m=150020133510896&w=2 OpenBSD commit message: Use a 32 bit variable to detect integer overflow when searching for an unused nat port. Prevents a possible endless loop if high port is 65535 or low port is 0. report and analysis Jingmin Zhou; OK sashan@ visa@ Quoted from: https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/net/pf_lb.c PR: 221201 Submitted by: Fabian Keil <fk@fabiankeil.de> Obtained from: OpenBSD via ElectroBSD MFC after: 1 week	2017-08-08 21:09:26 +00:00
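The overflow described in the pf_get_sport() entry above can be sketched in isolation. This is a minimal illustration, not the pf(4) code: the function names and the `limit` safety cap are hypothetical and exist only so that the wrapping variant terminates in a demo.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from pf): count the ports visited by an upward
 * scan from start to high using a 16-bit counter. When high is 65535 the
 * counter wraps from 65535 to 0, so "port <= high" can never become false.
 * The limit parameter is an assumption added here to stop the demo.
 */
unsigned
scan_up_u16(uint16_t start, uint16_t high, unsigned limit)
{
	unsigned visited = 0;
	uint16_t port;

	for (port = start; port <= high && visited < limit; port++)
		visited++;
	return (visited);
}

/*
 * Same scan with a 32-bit counter: it can hold 65536, so the comparison
 * eventually fails and the loop exits on its own.
 */
unsigned
scan_up_u32(uint16_t start, uint16_t high, unsigned limit)
{
	unsigned visited = 0;
	uint32_t port;

	for (port = start; port <= high && visited < limit; port++)
		visited++;
	return (visited);
}
```

Calling `scan_up_u16(65530, 65535, 100)` runs until the cap is hit, while `scan_up_u32(65530, 65535, 100)` stops after the six ports in the range.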
Warner Losh	5b896b567d	Turns out to be even simpler to just not create /dev/efi if we don't have a efi runtime.	2017-08-08 21:01:11 +00:00
Warner Losh	9057f54d74	Fail to open efirt device when no EFI on system. libefivar expects opening /dev/efi to indicate if we can make efi runtime calls. With a null routine, it was always succeeding, leading efi_variables_supported() to return the wrong value. Only succeed if we have an efi_runtime table. Also, while I'm here, out of an abundance of caution, add a likely redundant check to make sure efi_systbl is not NULL before dereferencing it. I know it can't be NULL if efi_cfgtbl is non-NULL, but the compiler doesn't.	2017-08-08 20:44:16 +00:00
Alexander Motin	3a150601e1	Fix a few issues in the LinuxKPI workqueue. LinuxKPI workqueue wrappers reported "successful" cancellation for work items that had already completed normally. This change brings the reported status in sync with whether a cancellation actually happened. This is required for drm-next operation. Reviewed by: hselasky (earlier version) Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D11904	2017-08-08 19:36:34 +00:00
John Baldwin	d081dfc7cd	Fix a NULL pointer dereference in mly_user_command(). If mly_user_command fails to allocate a command slot it jumps to an 'out' label used for error handling. The error handling code checks for a data buffer in 'mc->mc_data' to free before checking if 'mc' is NULL. Fix by just returning directly if we fail to allocate a command and only using the 'out' label for subsequent errors when there is actual cleanup to perform. PR: 217747 Reported by: PVS-Studio Reviewed by: emaste MFC after: 1 week	2017-08-08 17:49:57 +00:00
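The error-handling shape described in the mly_user_command() entry above can be sketched as follows. This is an assumption-laden illustration, not the mly(4) driver: `struct cmd`, `user_command()` and its two failure flags are hypothetical names used only to show "return directly before anything is allocated, use the label only when there is cleanup to do".

```c
#include <stdlib.h>

/* Hypothetical stand-in for a command slot with an attached data buffer. */
struct cmd {
	void *data;
};

/*
 * Returns 0 on success and -1 on failure. The fail_alloc and fail_later
 * flags are assumptions that let a caller force each error path.
 */
int
user_command(int fail_alloc, int fail_later)
{
	struct cmd *c;
	int error = 0;

	c = fail_alloc ? NULL : calloc(1, sizeof(*c));
	if (c == NULL)
		return (-1);	/* nothing to clean up: do not jump to out */

	c->data = malloc(64);
	if (c->data == NULL || fail_later) {
		error = -1;
		goto out;	/* c is valid here, so out may touch c->data */
	}

out:
	free(c->data);		/* free(NULL) is a no-op */
	free(c);
	return (error);
}
```

Because the label is only reachable once `c` is non-NULL, the cleanup code never dereferences a NULL command pointer.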
Alan Somers	c45796d54e	Make p1003_1b.aio_listio_max a tunable p1003_1b.aio_listio_max is now a tunable. Its value is reflected in the sysctl of the same name, and the sysconf(3) variable _SC_AIO_LISTIO_MAX. Its value will be bounded from below by the compile-time constant AIO_LISTIO_MAX and from above by the compile-time constant MAX_AIO_QUEUE_PER_PROC and the tunable vfs.aio.max_aio_queue. Reviewed by: jhb, kib MFC after: 3 weeks Relnotes: yes Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D11601	2017-08-08 16:14:31 +00:00
Warner Losh	d0e75394cf	Use the correct queue depth for nda devices. Submitted by: Matt Williams	2017-08-08 16:06:16 +00:00
Konstantin Belousov	16997138d3	Fix logic error in the assert, causing the condition to be always true. Also improve the formatting of the corresponding KASSERT message. Based on the submission by: Svyatoslav <razmyslov@viva64.com> Found by: PVS-Studio PR: 217741 Reviewed by: emaste Sponsored by: The FreeBSD Foundation (kib) MFC after: 1 week	2017-08-08 15:46:29 +00:00
Michael Gmelin	9c52035814	Fix typo in cyapa out of bounds check. PR: 217783 Submitted by: razmyslov@viva64.com MFC after: 1 week	2017-08-08 13:27:32 +00:00
Hans Petter Selasky	8508e4d730	Make sure the received IP header gets 32-bit aligned for short packets in the mlx5en(4) driver. MFC after: 1 week Sponsored by: Mellanox Technologies	2017-08-08 11:49:36 +00:00
Hans Petter Selasky	869dd4b498	Count drop events due to lack of PCI bandwidth as queue drops and not as input errors in the mlx5en(4) driver. This improves the sysadmin view of physical port errors. Submitted by: gallatin@ MFC after: 1 week Sponsored by: Mellanox Technologies	2017-08-08 11:36:57 +00:00
Hans Petter Selasky	1a59bf5f7a	Fix for mlx4en(4) to properly call m_defrag(). The m_defrag() function can only defrag mbuf chains which have a valid mbuf packet header. In r291699 when the mlx4en(4) driver was converted into using BUSDMA(9), the call to m_defrag() was moved after the part of the transmit routine which strips the header from the mbuf chain. This effectively disabled the mbuf defrag mechanism and such packets simply got dropped. This patch removes the stripping of mbufs from a chain and loads all mbufs using busdma. If busdma finds there are no segments, unload the DMA map and free the mbuf right away, because that means all data in the mbuf has been inlined in the TX ring. Else proceed as usual. Add a per-ring counter for the number of defrag attempts and make sure the oversized_packets counter gets zeroed while at it. The counters are per-ring to avoid excessive cache misses in the TX path. Submitted by: mjoras@ Differential Revision: https://reviews.freebsd.org/D11683 MFC after: 1 week Sponsored by: Mellanox Technologies	2017-08-08 11:35:02 +00:00
Andriy Gapon	984d43cca5	MFV r322242: 8373 TXG_WAIT in ZIL commit path illumos/illumos-gate@d28671a3b0 `d28671a3b0` https://www.illumos.org/issues/8373 The code that writes ZIL blocks uses dmu_tx_assign(TXG_WAIT) to assign a transaction to a transaction group. That seems to be logically incorrect as writing of the ZIL block does not introduce any new dirty data. Also, when there is a lot of dirty data, the call can introduce significant delays into the ZIL commit path, thus affecting all synchronous writes. Additionally, ARC throttling may affect the ZIL writing. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Approved by: Dan McDonald <danmcd@joyent.com> Author: Andriy Gapon <avg@FreeBSD.org> MFC after: 2 weeks	2017-08-08 11:26:03 +00:00
Andriy Gapon	2653426e89	MFV r322240: 8491 uberblock on-disk padding to reserve space for smoothly merging zpool checkpoint & MMP in ZFS illumos/illumos-gate@79c2b812ee `79c2b812ee` https://www.illumos.org/issues/8491 The zpool checkpoint feature in DxOS added a new field in the uberblock. The Multi-Modifier Protection Pull Request from ZoL adds two new fields in the uberblock (Reference: https://github.com/zfsonlinux/zfs/pull/6279). As these two changes come from two different sources and once upstreamed and deployed will introduce an incompatibility with each other we want to upstream a change that will reserve the padding for both of them so integration goes smoothly and everyone gets both features. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Olaf Faaland <faaland1@llnl.gov> Approved by: Gordon Ross <gwr@nexenta.com> Author: Serapheim Dimitropoulos <serapheim@delphix.com> MFC after: 3 weeks	2017-08-08 11:21:58 +00:00
Andriy Gapon	6f2f8727e3	MFV r322238: 7915 checks in l2arc_evict could use some cleaning up illumos/illumos-gate@267ae6c3a8 `267ae6c3a8` https://www.illumos.org/issues/7915 l2arc_evict() is strictly serialized with respect to l2arc_write_buffers() and l2arc_write_done(). Normally, l2arc_evict() and l2arc_write_buffers() are called from the same thread, so they can not be concurrent. Also, l2arc_write_buffers() uses zio_wait() on the parent zio of all cache zio-s. That ensures that l2arc_write_done() is completed before l2arc_write_buffers() returns. Finally, if a cache device is removed, then l2arc_evict() is called under SCL_ALL in the exclusive mode. That ensures that it can not be concurrent with the normal L2ARC accesses to the device (including writing and evicting buffers). Given the above, some checks and actions in l2arc_evict() do not make sense. For instance, it must never encounter the write head header let alone remove it from the buffer list. Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Reviewed by: Prakash Surya <prakash.surya@delphix.com> Approved by: Matthew Ahrens <mahrens@delphix.com> Author: Andriy Gapon <avg@FreeBSD.org> MFC after: 2 weeks	2017-08-08 11:19:14 +00:00
Andriy Gapon	3cf2ea1aea	MFV r322236: 8126 ztest assertion failed in dbuf_dirty due to dn_nlevels changing illumos/illumos-gate@dcb6872c56 `dcb6872c56` https://www.illumos.org/issues/8126 The sync thread is concurrently modifying dn_phys->dn_nlevels while dbuf_dirty() is trying to assert something about it, without holding the necessary lock. We need to move this assertion further down in the function, after we have acquired the dn_struct_rwlock. Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: Serapheim Dimitropoulos <serapheim@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com> MFC after: 2 weeks	2017-08-08 11:14:40 +00:00
Andriy Gapon	98a9c8e68a	zfs: no need for __DECONST after abd constification in r322233 Note that vdev_label_write_pad2() is FreeBSD specific. MFC after: 2 weeks X-MFC after: r322233	2017-08-08 11:07:34 +00:00
Andriy Gapon	b9a4f29445	MFV r322232: 8426 mark immutable buffer arguments as such in abd.h illumos/illumos-gate@9b195260e2 `9b195260e2` https://www.illumos.org/issues/8426 abd_copy_from_buf and abd_cmp_buf do not modify their void *buf arguments, so qualify them with const. abd_copy_from_buf_off and abd_cmp_buf_off already had that type for the corresponding arguments. Reviewed by: Matt Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Andriy Gapon <avg@FreeBSD.org> MFC after: 2 weeks	2017-08-08 10:59:18 +00:00
Andriy Gapon	9c48e95dd9	MFV r322229: 7600 zfs rollback should pass target snapshot to kernel illumos/illumos-gate@77b171372e `77b171372e` https://www.illumos.org/issues/7600 At present, the kernel side code seems to blindly rollback to whatever happens to be the latest snapshot at the time when the rollback task is processed. The expected target's name should be passed to the kernel driver and the sync task should validate that the target exists and that it is the latest snapshot indeed. Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Andriy Gapon <avg@FreeBSD.org> MFC after: 3 weeks	2017-08-08 10:52:01 +00:00
Andriy Gapon	8605a08bd2	MFV r322227: 8377 Panic in bookmark deletion illumos/illumos-gate@42418f9e73 `42418f9e73` https://www.illumos.org/issues/8377 The problem is that when dsl_bookmark_destroy_check() is executed from open context (the pre-check), it fills in dbda_success based on the existence of the bookmark. But the bookmark (or containing filesystem as in this case) can be destroyed before we get to syncing context. When we re-run dsl_bookmark_destroy_check() in syncing context, it will not add the deleted bookmark to dbda_success, intending for dsl_bookmark_destroy_sync() to not process it. But because the bookmark is still in dbda_success from the open-context call, we do try to destroy it. The fix is that dsl_bookmark_destroy_check() should not modify dbda_success when called from open context. Reviewed by: Paul Dagnelie <pcd@delphix.com> Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com> MFC after: 2 weeks	2017-08-08 10:48:52 +00:00
Andriy Gapon	b4e4140d13	MFV r322223: 8378 crash due to bp in-memory modification of nopwrite block illumos/illumos-gate@b7edcb9408 `b7edcb9408` https://www.illumos.org/issues/8378 The problem is that zfs_get_data() supplies a stale zgd_bp to dmu_sync(), which we then nopwrite against. zfs_get_data() doesn't hold any DMU-related locks, so after it copies db_blkptr to zgd_bp, dbuf_write_ready() could change db_blkptr, and dbuf_write_done() could remove the dirty record. dmu_sync() then sees the stale BP and that the dbuf is not dirty, so it is eligible for nop-writing. The fix is for dmu_sync() to copy db_blkptr to zgd_bp after acquiring the db_mtx. We could still see a stale db_blkptr, but if it is stale then the dirty record will still exist and thus we won't attempt to nopwrite. Reviewed by: Prakash Surya <prakash.surya@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Matthew Ahrens <mahrens@delphix.com> MFC after: 2 weeks	2017-08-08 10:46:51 +00:00
Andriy Gapon	c6fb364293	MFV r322221: 7910 l2arc_write_buffers() may write beyond target_sz FreeBSD note: the essence of this change was committed to FreeBSD in r314274. This commit catches up with differences between what was committed to FreeBSD and what was committed to OpenZFS, mainly more logical variable names. illumos/illumos-gate@16a7e5ac11 `16a7e5ac11` https://www.illumos.org/issues/7910 It seems that the change in issue #6950 resurrected the problem that was earlier fixed by the change in issue #5219. Please also see the following FreeBSD bug report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216178 Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Dan Kimmel <dan.kimmel@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Andriy Gapon <avg@FreeBSD.org> MFC after: 2 weeks	2017-08-08 10:43:41 +00:00
Mark Johnston	c0589825fd	Add round_jiffies_up(), local_clock() and __setup_timer() to the LinuxKPI. Reviewed by: hselasky MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D11871	2017-08-08 04:34:02 +00:00
Mark Johnston	48dac28d63	Add macros for defining attribute groups and for WO and RW attributes. Reviewed by: hselasky MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D11872	2017-08-08 04:30:22 +00:00
Marius Strobl	79f39c6aa1	- If available, use TRIM instead of ERASE for implementing BIO_DELETE. This also involves adding a quirk table as TRIM is broken for some Kingston eMMC devices, though. Compared to ERASE (declared "legacy" in the eMMC specification v5.1), TRIM has the advantage of operating on write sectors rather than on erase sectors, which typically are of a much larger size. Thus, employing TRIM, we don't need to fiddle with coalescing BIO_DELETE requests that are also of (write) sector units into erase sectors, which might not even add up in all cases. - For some SanDisk iNAND devices, the CMD38 argument, e. g. ERASE, TRIM etc., has to be specified via EXT_CSD[113], which now is also handled via a quirk. - My initial understanding was that for eMMC partitions, the granularity should be used as erase sector size, e. g. 128 KB for boot partitions. However, rereading the relevant parts of the eMMC specification v5.1, this isn't actually correct. So drop the code which used partition granularities for delmaxsize and stripesize. For the most part, this change is a NOP, though, because a) for ERASE, mmcsd_delete() used the erase sector size unconditionally for all partitions anyway and b) g_disk_limit() doesn't actually take the stripesize into account. - Take some more advantage of mmcsd_errmsg() in mmcsd(4) for making error codes human readable.	2017-08-07 23:33:05 +00:00
Warner Losh	36d6e01474	Eliminate useless adjustments of aliased device. No need to set any fields in the cloned device. devfs uses symlinks, so the adev entries returned won't be presented to the drivers. Since we don't save copies, nothing else will see them. This code came from the old compat code, and it appears to be obsolete or never needed. Submitted by: kib@ Differential Review: https://reviews.freebsd.org/D11919	2017-08-07 22:42:46 +00:00
Warner Losh	d45e16744f	Add nvd alias to nda nodes. All ndaX and ndaXpY nodes will appear as nvdX and nvdXpY as well (through symlinks in devfs via the normal disk aliasing mechanism in GEOM). Differential Revision: https://reviews.freebsd.org/D11873	2017-08-07 21:12:43 +00:00
Warner Losh	d3517d306c	Expose API to allow disks to ask for alias names in devfs. Implement disk_add_alias to allow aliases to be added to disks. All disks have a primary name (say "foo") and can also have secondary names (say "bar") such that all instances of "foo" also have a "bar" alias. So if you have foo0, foo0p1, foo1, foo1s1 and foo1s1a nodes created by the foo driver and gpart, device nodes bar0, bar0p1, bar1, bar1s1 and bar1s1a will appear as symlinks back to the original nodes. This generalizes to multiple aliases. However, since the unit number follows the primary name, multiple device drivers can't create the same aliases unless those drivers coordinate the unit number space (e.g. you couldn't add an alias 'disk' to both 'da' and 'ada', since it's possible to have both da0 and ada0, which makes 'disk0' ambiguous). Differential Revision: https://reviews.freebsd.org/D11873	2017-08-07 21:12:38 +00:00
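The naming rule in the disk alias entry above (swap the primary name for the alias and keep the unit and partition suffix) can be illustrated with a small string helper. This is only a sketch of the naming scheme: `alias_name()` is a hypothetical function, not part of the disk(9) or GEOM API, and the real aliases are created as devfs symlinks rather than by string manipulation like this.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: derive the alias node name for `node` by replacing
 * the leading `primary` driver name with `alias`. Returns 0 on success, or
 * -1 if node does not start with primary or the result does not fit.
 * Note: this naive prefix match would also accept a different driver whose
 * name merely begins with `primary`; a real implementation would have to
 * check that a unit number follows.
 */
int
alias_name(const char *node, const char *primary, const char *alias,
    char *out, size_t outlen)
{
	size_t plen = strlen(primary);
	int n;

	if (strncmp(node, primary, plen) != 0)
		return (-1);
	n = snprintf(out, outlen, "%s%s", alias, node + plen);
	if (n < 0 || (size_t)n >= outlen)
		return (-1);
	return (0);
}
```

With this helper, `alias_name("foo1s1a", "foo", "bar", buf, sizeof(buf))` yields `bar1s1a`, matching the example in the commit message, and the same call with `nda`/`nvd` maps `nda0p1` to `nvd0p1`.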
Warner Losh	5d7d13290a	Add alias support to gpart. When we're creating new providers for each of the partitions, add aliases to the geom before we create the provider so when geom_dev tastes the provider, the aliases are in place so the proper /dev entries are created. So foo5p6 gets created as an alias for bar5p6 when foo is an alias for bar in the geom we're partitioning with g_part. This also copies aliases from the container geom (eg disk) to the label geom (the disk with GPT partitioning) so that aliases nest properly. Differential Revision: https://reviews.freebsd.org/D11873	2017-08-07 21:12:33 +00:00
Warner Losh	c624eb2598	Add aliasing concept to geom. Add an alias name list to geoms. Use them in geom_dev to create aliases. Previously, geom_dev would create a device node for the name of the geom. Now, additional nodes are created pointing back to the primary node with make_dev_alias_p. Aliases must be in place on the geom before any tasting occurs. Differential Revision: https://reviews.freebsd.org/D11873	2017-08-07 21:12:28 +00:00
Kirk McKusick	6c6118b390	gjournal is broken in handling its flush_queue. If we have 10 bio's in the flush_queue: 1 2 3 4 5 6 7 8 9 10 and another 10 bio's go into the flush queue after only the first five bio's are removed from the flush queue, the queue should look like: 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20, but because of the bug we end up with 6 11 12 13 14 15 16 17 18 19 20 7 8 9 10. So the sequence of the bio's is damaged in the flush queue (and therefore in the journal on disk !). This error can be triggered by ffs_snapshot() when a block is read with readblock() and gjournal finds this block in the broken flush queue before it goes to the correct active queue. The fix is to place all new blocks at the end of the queue. Submitted by: Dr. Andreas Longwitz <longwitz@incore.de> Discussed with: kib MFC after: 1 week	2017-08-07 19:40:03 +00:00
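The ordering invariant from the gjournal entry above (new entries always go to the end of the queue) can be replayed with a toy FIFO. This is a sketch under stated assumptions: the ring-buffer `struct fifo` and its functions are hypothetical and stand in for the bio queue, which gjournal implements differently.

```c
#include <stddef.h>

#define QCAP 32

/* Hypothetical fixed-size FIFO of integers standing in for queued bios. */
struct fifo {
	int items[QCAP];
	size_t head;	/* index of the oldest element */
	size_t count;	/* number of queued elements */
};

/* Append at the tail; returns -1 when the queue is full. */
int
fifo_push_tail(struct fifo *q, int v)
{
	if (q->count == QCAP)
		return (-1);
	q->items[(q->head + q->count) % QCAP] = v;
	q->count++;
	return (0);
}

/* Remove from the head; returns -1 when the queue is empty. */
int
fifo_pop_head(struct fifo *q, int *v)
{
	if (q->count == 0)
		return (-1);
	*v = q->items[q->head];
	q->head = (q->head + 1) % QCAP;
	q->count--;
	return (0);
}

/*
 * Replay the scenario from the commit message: queue 1..10, remove the
 * first five, queue 11..20. Returns 1 if the remaining entries come out
 * as 6..20 in order, 0 otherwise.
 */
int
replay_scenario(void)
{
	struct fifo q = { .head = 0, .count = 0 };
	int i, v;

	for (i = 1; i <= 10; i++)
		fifo_push_tail(&q, i);
	for (i = 0; i < 5; i++)
		fifo_pop_head(&q, &v);
	for (i = 11; i <= 20; i++)
		fifo_push_tail(&q, i);
	for (i = 6; i <= 20; i++) {
		if (fifo_pop_head(&q, &v) != 0 || v != i)
			return (0);
	}
	return (q.count == 0);
}
```

Inserting anywhere other than the tail would break the check in `replay_scenario()`, which is the kind of reordering the commit message describes.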
Kirk McKusick	683590b642	sysctl kern.geom.journal.cache.limit shows negative value for FreeBSD/amd64 system having over 4GB RAM. That's due to: 1) the limit being u_int instead of u_long like vm.kmem_size (the limit is half of vm.kmem_size by default for amd64); 2) sysctl handler g_journal_cache_limit_sysctl() using u_int instead of u_long. The fix is to replace u_int with u_long for the kern.geom.journal.cache.limit sysctl variable. PR: 198500 Submitted by: Dr. Andreas Longwitz <longwitz@incore.de> Reported by: Eugene Grosbein Discussed with: kib MFC after: 1 week	2017-08-07 19:18:27 +00:00
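One plausible reading of the negative kern.geom.journal.cache.limit value above is a 64-bit quantity stored in 32 bits and then displayed as a signed integer. The sketch below shows only that arithmetic effect; the function names and the 6 GiB figure in the usage note are assumptions, not taken from the gjournal source.

```c
#include <stdint.h>

/*
 * Hypothetical: keep half of kmem_size in a 32-bit variable and read it
 * back through a signed 32-bit view. Converting an out-of-range value to
 * int32_t is implementation-defined in C; common compilers wrap it to a
 * negative number.
 */
int32_t
cache_limit_seen_as_int(uint64_t kmem_size)
{
	uint32_t limit = (uint32_t)(kmem_size / 2);

	return ((int32_t)limit);
}

/* The same computation kept in a 64-bit type holds the value intact. */
uint64_t
cache_limit_wide(uint64_t kmem_size)
{
	return (kmem_size / 2);
}
```

For an assumed kmem_size of 6 GiB, the 32-bit path reports a negative limit on typical platforms, while the 64-bit path returns 3 GiB.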
Konstantin Belousov	fe04f5e9d0	Avoid DI recursion when reclaim_pv_chunk() is called from pmap_advise() or pmap_remove(). Reported and tested by: pho (previous version) Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-08-07 17:29:54 +00:00
Konstantin Belousov	1a47eac0f5	Explain why delayed invalidation is not required in pmap_protect() and pmap_remove_pages(). Submitted by: alc MFC after: 1 week	2017-08-07 17:23:10 +00:00
Alexander Motin	e1cf70fbab	Fix hrtimer_active() in case of cancellation. While there, switch to FreeBSD internal callout active status. Reviewed by: markj, hselasky Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D11900	2017-08-07 14:34:05 +00:00
Ruslan Bukin	ca20f8ec29	o Replace __riscv__ with __riscv o Replace __riscv64 with (__riscv && __riscv_xlen == 64) This is required to support new GCC 7.1 compiler. This is compatible with current GCC 6.1 compiler. RISC-V is extensible ISA and the idea here is to have built-in define per each extension, so together with __riscv we will have some subset of these as well (depending on -march string passed to compiler): __riscv_compressed __riscv_atomic __riscv_mul __riscv_div __riscv_muldiv __riscv_fdiv __riscv_fsqrt __riscv_float_abi_soft __riscv_float_abi_single __riscv_float_abi_double __riscv_cmodel_medlow __riscv_cmodel_medany __riscv_cmodel_pic __riscv_xlen Reviewed by: ngie Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D11901	2017-08-07 14:09:57 +00:00
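The macro replacement in the RISC-V entry above can be sketched as a compile-time check. The `riscv_target()` function and the strings it returns are hypothetical; only the `__riscv` and `__riscv_xlen` macro names come from the commit message.

```c
/*
 * Report which of the cases described in the commit message applies to the
 * compiler building this file. Inside #if, an undefined __riscv_xlen
 * evaluates to 0, so the first test is safe on non-RISC-V compilers.
 */
const char *
riscv_target(void)
{
#if defined(__riscv) && __riscv_xlen == 64
	return ("riscv64");		/* replaces the old __riscv64 test */
#elif defined(__riscv)
	return ("riscv, xlen != 64");	/* replaces the old __riscv__ test */
#else
	return ("not riscv");
#endif
}
```

Testing `__riscv_xlen` instead of a dedicated 64-bit macro follows the per-extension scheme the message describes, where each property is exposed through its own built-in define.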
Navdeep Parhar	b96793ae43	cxgbe(4): Add the T6 and T5 Unified Wire configuration files to the kernel, just like for T4, when the driver is compiled into the kernel. Reported by: mav@ MFC after: 3 days Sponsored by: Chelsio Communications	2017-08-07 14:04:19 +00:00
Navdeep Parhar	cc2050c5eb	cxgbe(4): Avoid a NULL dereference that would occur during module unload if there were problems earlier during attach. MFC after: 3 days Sponsored by: Chelsio Communications	2017-08-06 19:45:59 +00:00
Konstantin Belousov	0b9b3897a8	Remove trivial comments. Remove and-ing with UINT_MAX for minor(), cast to int already does the required truncation of significant bits. Requested and reviewed by: bde Sponsored by: The FreeBSD Foundation	2017-08-06 12:27:20 +00:00
Andrew Turner	1f15260790	Mark each cpu in the appropriate cpuset_domain set. This allows devices to handle cases where they can only run on a single domain. To allow all devices access to this set we need to move reading the domain earlier in the boot as it was previously handled in the CPU driver, however this is too late for the GICv3 ITS driver. Sponsored by: DARPA, AFRL	2017-08-05 20:57:34 +00:00
Jung-uk Kim	0105034487	Detect hypervisors early. We used to set lower hz on hypervisors by default but it was broken since r273800 (and r278522, its MFC to stable/10) because identify_cpu() is called too late, i.e., after init_param1(). MFC after: 3 days	2017-08-05 06:56:46 +00:00
Toomas Soome	07672e9c19	libefi/time.c cstyle cleanup libefi/time.c is mix of different styles, this update does cleanup. Also fix 0 versus NULL, and zero the tv structure for case we get error from UEFI firmware. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D11861	2017-08-05 05:20:03 +00:00
Cy Schubert	a3329e0129	Fix matching of NATed ICMP queries (resolving NATed MTU discovery). MFC after: 1 month	2017-08-05 00:28:42 +00:00
Warner Losh	9990efd2e2	Move EFI fmtdev functionality to libefi This patch moves code necessary for the fmtdev functionality from loader to libefi, allowing other applications to make use of it Submitted by: Eric McCorkle Differential Revision: https://reviews.freebsd.org/D11862	2017-08-04 16:33:36 +00:00
Navdeep Parhar	019c1a0111	cxgbe(4): Allow the TOE timer tunables to be set with microsecond precision. These timers are already displayed in microseconds in the sysctl MIB. Add variables to track these tunables while here. MFC after: 3 days Sponsored by: Chelsio Communications	2017-08-04 15:57:10 +00:00
Andrew Turner	49f347f450	Start to teach the GICv3 driver about NUMA. On ThunderX we may have multiple ITS devices, however we only want a single ITS device to be configured on each CPU. To fix this only enable ITS when the node matches the CPUs node. Sponsored by: DARPA, AFRL	2017-08-04 13:08:45 +00:00
Andrew Turner	dbba8930ce	Read the numa-node-id property from each CPU node. This will initially be used to support the dual package ThunderX where we need to send MSI/MSI-X interrupts to the same package as the device the interrupt came from. Sponsored by: DARPA, AFRL	2017-08-04 10:33:22 +00:00
Konstantin Belousov	4e93dbdf47	Relax visibility for some termios symbols. They are defined by XSI or newer SUS. This is a follow-up to r318780. Reported by: jbeich Obtained from: DragonflyBSD commit e08b3836c962 Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-08-04 09:45:40 +00:00
Alan Cox	ba98e6a2d7	In case readers are misled by expressions that combine multiplication and division, add parentheses to make the precedence explicit. Submitted by: Doug Moore <dougm@rice.edu> Requested by: imp Reviewed by: imp MFC after: 1 week X-MFC after: r321840 Differential Revision: https://reviews.freebsd.org/D11815	2017-08-04 04:23:23 +00:00
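Why explicit parentheses help in the entry above: `*` and `/` share a precedence level and associate left to right, so with integer operands the grouping changes the result. The functions below are a hypothetical illustration and are not taken from the code touched by that commit.

```c
/* Parsed as (a * b) / c because * and / associate left to right. */
int
scale_implicit(int a, int b, int c)
{
	return (a * b / c);
}

/* The same grouping, written out so the reader does not have to recall it. */
int
scale_explicit(int a, int b, int c)
{
	return ((a * b) / c);
}

/* A different grouping: integer division truncates before multiplying. */
int
scale_divide_first(int a, int b, int c)
{
	return (a * (b / c));
}
```

With a = 7, b = 4 and c = 8, the first two functions compute 28 / 8 and return 3, while the third computes 7 * 0 and returns 0.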
Warner Losh	f48bfebce1	Add EFI utility functions to libefi This patch adds additional EFI utility functions to convert errno values to EFI_STATUS errors, as well as EFI times to UNIX times. Submitted by: Eric McCorkle Differential Revision: https://reviews.freebsd.org/D11858	2017-08-04 04:20:11 +00:00
Warner Losh	457ea3bce3	Move EFI ZFS functions to libefi This patch moves some EFI ZFS functions from loader to libefi, allowing them to be used by anything that links against libefi. Submitted by: Eric McCorkle Differential Revision: https://reviews.freebsd.org/D11855	2017-08-04 04:20:06 +00:00
Warner Losh	acf82d2659	Add definitions and utilities for EFI drivers This patch adds definitions and utility code for creating EFI drivers using the EFI_DRIVER_BINDING_PROTOCOL. Submitted by: Eric McCorkle Differential Revision: https://reviews.freebsd.org/D11852	2017-08-04 04:16:41 +00:00
Warner Losh	8a5d94f94d	Make nvd vs nda choice boot-time rather than build-time Introduce hw.nvme.use_nvd tunable. This tunable allows both nvd and nda to be installed in the kernel, while allowing only one of them to create devices. This is an all-or-nothing setting, and you can't change it after boot-time. However, it will allow easier A/B testing. Differential Revision: https://reviews.freebsd.org/D11825	2017-08-04 03:40:01 +00:00
Navdeep Parhar	6320b0f850	cxgbe(4): Always use the first and not the last virtual interface associated with a port in begin_synchronized_op. MFC after: 3 days Sponsored by: Chelsio Communications	2017-08-04 01:28:06 +00:00
Conrad Meyer	6f240e18b5	x86: Tag some intrinsics with __pure2 Some C wrappers for x86 instructions do not touch global memory and only act on their arguments; they can be marked __pure2, aka __const__. Without this annotation, Clang 3.9.1 is not intelligent enough on its own to grok that these functions are __const__. Submitted by: Anton Rang <anton.rang AT isilon.com> Sponsored by: Dell EMC Isilon	2017-08-03 22:28:30 +00:00
Mark Johnston	a95435cfed	Bump the maximum file name length in pseudofs filesystems to 48. The previous limit of 24 was somewhat restrictive, and with this change ceil(log2(sizeof(struct pfs_node))) is the same as before in both the ILP32 and LP64 models, so the malloc zone used for allocations of struct pfs_node is the same as before. Approved by: des	2017-08-03 21:35:53 +00:00
Mark Johnston	f2ec04a394	Add subsystem vendor and device ID fields to struct pci_dev. MFC after: 1 week	2017-08-03 21:14:46 +00:00
Emmanuel Vadot	c31654c5b6	arm: Add a GENERIC-NODEBUG kernel config Like amd64 or arm64, provide a GENERIC-NODEBUG configuration file that removes WITNESS, INVARIANTS, etc.	2017-08-03 19:01:46 +00:00
Ian Lepore	ba60088b16	Add missing header file to SRCS. Reported by: manu@	2017-08-03 18:49:15 +00:00
Ian Lepore	094e5e7e12	Switch to iicdev_readfrom/writeto() to do xfers with proper bus ownership. Tested by: manu@	2017-08-03 18:43:54 +00:00
Ian Lepore	854519fdd9	Add an ahci driver for imx6. This was submitted by Rogiel Sulzbach (thank you!) but has a few last-minute changes by me, mostly where the code interfaces to my still-utterly-deficient imx6_ccm clocks implementation. So blame me for any mistakes. Submitted by: Rogiel Sulzbach <rogiel@rogiel.com> Differential Revision: https://reviews.freebsd.org/D11177	2017-08-03 14:43:41 +00:00
Navdeep Parhar	f856f099cb	cxgbe(4): Initial import of the "collect" component of Chelsio unified debug (cudbg) code, hooked up to the main driver via an ioctl. The ioctl can be used to collect the chip's internal state in a compressed dump file. These dumps can be decoded with the "view" component of cudbg. Obtained from: Chelsio Communications MFC after: 2 months Sponsored by: Chelsio Communications	2017-08-03 14:43:30 +00:00
Enji Cooper	1ef2a611de	Revert r321969 My change had good intentions, but the implementation was incorrect: - printf was returning the number of characters in the format string plus the NUL, but failed in two regards implementation wise: -- the pathological case, printf(""), wasn't being handled properly since the pointer is always incremented, so the value returned would be off-by-one. -- printf(3) reports the number of characters printed post-conversion via vfprintf, etc. - putchar(3) should return the character printed or EOF, not the number of characters output to the screen. My goal in making the change (again) was to increase parity, but as bde pointed out these are freestanding functions, so they don't have to conform to libc/POSIX. I argued that the functions should be named differently since the implementation is different enough to warrant it and to allow boot2 code to be usable when linked against sys/boot and libstand and other libraries in base. I have no interest in pushing this change forward more though, as the original concern I had behind the change with zfsboottest was resolved in r321849 and r321852. The next person that updates the toolchain gets to deal with the inconsistency if it's flagged by a newer compiler. MFC after: 1 month Reported by: ed, markj	2017-08-03 13:50:46 +00:00
Hans Petter Selasky	b40951b8cd	Change reject message type when destroying cm_id in ibcore. This patch fixes an interoperability issue between FreeBSD and non-FreeBSD systems when the connection establishment is aborted. Refer to the initial commit in Linux, drivers/infiniband/core/cm.c, for a more detailed description. Obtained from: Linux MFC after: 3 days Sponsored by: Mellanox Technologies	2017-08-03 09:31:10 +00:00
Hans Petter Selasky	44d8a0fc60	Ticks are 32-bit in FreeBSD. MFC after: 3 days Sponsored by: Mellanox Technologies	2017-08-03 09:18:25 +00:00
Hans Petter Selasky	713dd5cb9e	Resolve locking issue for non-sleepable context in the mlx5core. Code inspection reveals the busdma unload and free functions do not write to the associated DMA tag and do not need to be serialized. This allows mlx5_fwp_free() to be called from software interrupt context. MFC after: 3 days Sponsored by: Mellanox Technologies	2017-08-03 09:14:43 +00:00
Hans Petter Selasky	1d4905b5b0	Using GFP_ATOMIC with firmware commands is not supported after busdma was introduced in the mlx5core, because busdma might sleep when loading memory into DMA. MFC after: 3 days Sponsored by: Mellanox Technologies	2017-08-03 09:11:51 +00:00

118076 Commits