freebsd-skq

Author	SHA1	Message	Date
yuripv	fb90dea0cc	linux: provide just one instance of futex_list Move futex_list definition to linux.c which is included once in linux.ko (i386) and in linux_common.ko (amd64 and aarch64) allowing 32/64 bit linux programs to access the same futexes in the latter case. PR: 240989 Reviewed by: dchagin Differential Revision: https://reviews.freebsd.org/D22073	2019-10-18 10:28:08 +00:00
kp	9cc4228e50	pf: Must be in NET_EPOCH to call icmp_error icmp_reflect(), called through icmp_error() requires us to be in NET_EPOCH. Failure to hold it leads to the following panic (with INVARIANTS): panic: Assertion in_epoch(net_epoch_preempt) failed at /usr/src/sys/netinet/ip_icmp.c:742 cpuid = 2 time = 1571233273 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00e0977920 vpanic() at vpanic+0x17e/frame 0xfffffe00e0977980 panic() at panic+0x43/frame 0xfffffe00e09779e0 icmp_reflect() at icmp_reflect+0x625/frame 0xfffffe00e0977aa0 icmp_error() at icmp_error+0x720/frame 0xfffffe00e0977b10 pf_intr() at pf_intr+0xd5/frame 0xfffffe00e0977b50 ithread_loop() at ithread_loop+0x1c6/frame 0xfffffe00e0977bb0 fork_exit() at fork_exit+0x80/frame 0xfffffe00e0977bf0 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00e0977bf0 Note that we now enter NET_EPOCH twice if we enter ip_output() from pf_intr(), but ip_output() will soon be converted to a function that requires epoch, so entering NET_EPOCH directly from pf_intr() makes more sense. Discussed with: glebius@	2019-10-18 03:36:26 +00:00
cem	8980f3ad9a	nvdimm_e820: Fix braino in size=all SPA hint The sentinel value for "use the rest of the region," -1, isn't zero modulo PAGE_SIZE. Relax the check to permit the intended special value. X-MFC-With: r353110 Sponsored by: Dell EMC Isilon	2019-10-18 03:01:21 +00:00
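Note on 8980f3ad9a: the relaxed check can be pictured roughly as below. This is a sketch only. The names `SKETCH_PAGE_SIZE`, `SKETCH_SIZE_ALL` and `spa_size_ok()` are hypothetical, the 4096-byte page size is an assumption, and none of this is taken from the nvdimm_e820 sources.

```c
#include <stdbool.h>
#include <stdint.h>

#define	SKETCH_PAGE_SIZE	4096ULL		/* assumed page size */
#define	SKETCH_SIZE_ALL		UINT64_MAX	/* (uint64_t)-1, "use the rest" */

/*
 * Accept a size hint if it is the "use the rest of the region" sentinel
 * or a whole number of pages.  The sentinel has to be tested explicitly,
 * because (uint64_t)-1 is not a multiple of the page size.
 */
static bool
spa_size_ok(uint64_t size)
{
	if (size == SKETCH_SIZE_ALL)
		return (true);
	return (size % SKETCH_PAGE_SIZE == 0);
}
```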
cem	22adb9b140	x86: Remove unused variable from r353712 It was in my git tree (uncommitted) and didn't get carried over to SVN in r353712. X-MFC-With: r353712	2019-10-18 02:25:30 +00:00
cem	6487f370d2	x86: Fetch and save standard CPUID leaf 6 in identcpu Rather than a few scattered places in the tree. Organize flag names in a contiguous region of specialreg.h. While here, delete deprecated PCOMMIT from leaf 7. No functional change.	2019-10-18 02:18:17 +00:00
cem	b6995dbc52	gdb(4): Implement support for NoAckMode When the underlying debugport transport is reliable, GDB's additional checksums and acknowledgements are redundant. NoAckMode eliminates the acks and allows us to skip checking RX checksums. The GDB packet framing does not change, so unfortunately (valid) checksums are still included as message trailers. The gdb(4) stub in FreeBSD advertises support for the feature in response to the client's 'qSupported' request IFF the current debugport has the gdb_dbfeatures flag GDB_DBGP_FEAT_RELIABLE set. Currently, only netgdb(4) supports this feature. If the remote GDB client supports the feature and does not have it disabled via a GDB configuration knob, it may instruct our gdb(4) stub to enter NoAckMode. Unless and until it issues that command, we must continue to transmit acks as usual (and for now, we continue to wait until we receive them as well, even if we know the debugport is on a reliable transport). In the kernel sources, the sense of the flag representing the state of the feature is reversed from that of the GDB command. (I.e., it is 'gdb_ackmode', not 'gdb_noackmode.') This is to avoid confusing double- negative conditions. For reference, see: * https://sourceware.org/gdb/onlinedocs/gdb/Packet-Acknowledgment.html * https://sourceware.org/gdb/onlinedocs/gdb/General-Query-Packets.html#QStartNoAckMode Reviewed by: jhb, markj (both earlier version) Differential Revision: https://reviews.freebsd.org/D21761	2019-10-17 22:37:25 +00:00
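Note on b6995dbc52: the packet framing that stays in place under NoAckMode is the usual GDB remote serial protocol form `$<payload>#<checksum>`, where the checksum is the sum of the payload bytes modulo 256, written as two hex digits. A minimal sketch follows. `gdb_checksum()` and `gdb_frame()` are hypothetical names, not functions from the kernel gdb(4) stub, and payload escaping is left out.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Sum of the payload bytes modulo 256 (unsigned char wraps at 256). */
static unsigned char
gdb_checksum(const char *payload)
{
	unsigned char sum = 0;

	for (; *payload != '\0'; payload++)
		sum += (unsigned char)*payload;
	return (sum);
}

/*
 * Frame a payload as "$<payload>#<two hex digits>".
 * Returns 0 on success, -1 if the output buffer is too small.
 * Escaping of special characters in the payload is not handled here.
 */
static int
gdb_frame(const char *payload, char *out, size_t outlen)
{
	int n;

	n = snprintf(out, outlen, "$%s#%02x", payload,
	    (unsigned int)gdb_checksum(payload));
	return ((n < 0 || (size_t)n >= outlen) ? -1 : 0);
}
```

With NoAckMode active, the receiver may skip verifying the two trailing hex digits, but the sender still has to emit them because the framing is unchanged.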
markj	b9b9f9a70d	Add an ldscript for amd64 kernel modules. Use it to pad the text and read-only data sections to a 4KB boundary. This will be used to enforce strict memory protections for some sections of loadable kernel modules. Reviewed by: kib MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21970	2019-10-17 21:39:23 +00:00
cem	45bf92cd20	Implement NetGDB(4) NetGDB(4) is a component of a system using a panic-time network stack to remotely debug crashed FreeBSD kernels over the network, instead of traditional serial interfaces. There are three pieces in the complete NetGDB system. First, a dedicated proxy server must be running to accept connections from both NetGDB and gdb(1), and pass bidirectional traffic between the two protocols. Second, the NetGDB client is activated much like ordinary 'gdb' and similarly to 'netdump' in ddb(4) after a panic. Like other debugnet(4) clients (netdump(4)), the network interface on the route to the proxy server must be online and support debugnet(4). Finally, the remote (k)gdb(1) uses 'target remote <proxy>:<port>' (like any other TCP remote) to connect to the proxy server. The NetGDB v1 protocol speaks the literal GDB remote serial protocol, and uses a 1:1 relationship between GDB packets and sequences of debugnet packets (fragmented by MTU). There is no encryption utilized to keep debugging sessions private, so this is only appropriate for local segments or trusted networks. Submitted by: John Reimer <john.reimer AT emc.com> (earlier version) Discussed some with: emaste, markj Relnotes: sure Differential Revision: https://reviews.freebsd.org/D21568	2019-10-17 21:33:01 +00:00
markj	f49b0d8c82	Clean up some nits in link_elf_(un)load_file(). - Remove a redundant assignment of ef->address. - Don't return a Mach error number to the caller if vm_map_find() fails. - Use ptoa() and fix style. MFC after: 2 weeks Sponsored by: Netflix	2019-10-17 21:25:50 +00:00
markj	7f37066f60	Belatedly bump __FreeBSD_version for r353537 and related commits. At least one small update to the out-of-tree DRM drivers is required now that cdev_pager_free_page() expects an xbusy page. Discussed with: jeff, zeising	2019-10-17 20:46:33 +00:00
cem	abc2745a10	debugnet(4): Add optional full-duplex mode It remains unattached to any client protocol. Netdump is unaffected (remaining half-duplex). The intended consumer is NetGDB. Submitted by: John Reimer <john.reimer AT emc.com> (earlier version) Discussed with: markj Differential Revision: https://reviews.freebsd.org/D21541	2019-10-17 20:25:15 +00:00
glebius	7d6b8e7344	Revert two parts of r353292 that enter epoch when processing vlan capabilities. It could be that entering epoch isn't necessary here, but better take a conservative approach. Submitted by: kp	2019-10-17 20:18:07 +00:00
cem	f92f351606	debugnet(4): Infer non-server connection parameters Loosen requirements for connecting to debugnet-type servers. Only require a destination address; the rest can theoretically be inferred from the routing table. Relax corresponding constraints in netdump(4) and move ifp validation to debugnet connection time. Submitted by: John Reimer <john.reimer AT emc.com> (earlier version) Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D21482	2019-10-17 20:10:32 +00:00
cem	d5b17121d8	acpica: Fix for the fix, unfortunately Follow-up to incomplete pedantic change in r353691 by actually fixing the default implementation to match the interface type. Mea culpa. X-MFC-With: r353691, r339754	2019-10-17 19:53:55 +00:00
cem	b0452a96d2	Add ddb(4) 'netdump' command to netdump a core without preconfiguration Add a 'X -s <server> -c <client> [-g <gateway>] -i <interface>' subroutine to the generic debugnet code. The imagined use is both netdump, shown here, and NetGDB (vaporware). It uses the ddb(4) lexer, with some new extensions, to parse out IPv4 addresses. 'Netdump' uses the generic debugnet routine to load a configuration and start a dump, without any netdump configuration prior to panic. Loosely derived from work by: John Reimer <john.reimer AT emc.com> Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D21460	2019-10-17 19:49:20 +00:00
cem	db456c276c	acpica: Match ID_PROBE default implementation to interface After r339754, the additional interface parameter was accidentally left out of the default acpi_generic_id_probe implementation. Apparently this does not cause any real problems, so this fix is mostly stylistic. No functional change intended. X-MFC-With: r339754	2019-10-17 18:45:11 +00:00
cem	4f75ec84a8	Add a very limited DDB dumpon(8)-alike to MI dumper code This allows ddb(4) commands to construct a static dumperinfo during panic/debug and invoke doadump(false) using the provided dumper configuration (always inserted first in the list). The intended usecase is a ddb(4)-time netdump(4) command. Reviewed by: markj (earlier version) Differential Revision: https://reviews.freebsd.org/D21448	2019-10-17 18:29:44 +00:00
cem	07646b2459	debugnet: Respond to broadcast ARP requests The in-tree netdump code has always ignored non-directed ARP requests, and that seems to work most of the time for netdump. In my work and testing on NetGDB, it seems like sometimes the remote FreeBSD conversant (the non-panic system) will send broadcast-destination ARP requests to the debugnet kernel; without this change, those are dropped and the remote will see EHOSTDOWN "Host is down" errors from the userspace interface of the network stack. Discussed with: markj	2019-10-17 17:48:32 +00:00
cem	0478ef8c6b	debugnet(4): Check hardware-validated UDP checksums Similar to INET checksums, lazily validate UDP checksums when the driver has already performed the check for us. Like debugnet(4) INET checksums, validation in software is left as future work. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D21745	2019-10-17 17:19:16 +00:00
glebius	a38665c7e7	Quickly fix up r353683: enter the epoch before calling into netisr_dispatch().	2019-10-17 17:02:50 +00:00
emaste	56393d14f8	Update Conrad Meyer's email cem is now a committer Approved by: cem	2019-10-17 16:38:44 +00:00
cem	f3a0ee41db	Split out a more generic debugnet(4) from netdump(4) Debugnet is a simplistic and specialized panic- or debug-time reliable datagram transport. It can drive a single connection at a time and is currently unidirectional (debug/panic machine transmit to remote server only). It is mostly a verbatim code lift from netdump(4). Netdump(4) remains the only consumer (until the rest of this patch series lands). The INET-specific logic has been extracted somewhat more thoroughly than previously in netdump(4), into debugnet_inet.c. UDP-layer logic and up, as much as possible as is protocol-independent, remains in debugnet.c. The separation is not perfect and future improvement is welcome. Supporting INET6 is a long-term goal. Much of the diff is "gratuitous" renaming from 'netdump_' or 'nd_' to 'debugnet_' or 'dn_' -- sorry. I thought keeping the netdump name on the generic module would be more confusing than the refactoring. The only functional change here is the mbuf allocation / tracking. Instead of initiating solely on netdump-configured interface(s) at dumpon(8) configuration time, we watch for any debugnet-enabled NIC for link activation and query it for mbuf parameters at that time. If they exceed the existing high-water mark allocation, we re-allocate and track the new high-water mark. Otherwise, we leave the pre-panic mbuf allocation alone. In a future patch in this series, this will allow initiating netdump from panic ddb(4) without pre-panic configuration. No other functional change intended. Reviewed by: markj (earlier version) Some discussion with: emaste, jhb Objection from: marius Differential Revision: https://reviews.freebsd.org/D21421	2019-10-17 16:23:03 +00:00
glebius	a6c3971a22	igmp_v1v2_queue_report() doesn't require epoch.	2019-10-17 16:02:34 +00:00
emaste	00976e705f	snd_hda: style(9) whitespace fixup PR: 241299 Submitted by: Neel Chauhan	2019-10-17 14:58:03 +00:00
kib	4cf89d6e98	swapon_check_swzone(): use already calculated static variables. Submitted by: ota@j.email.ne.jp MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22065	2019-10-17 13:49:47 +00:00
emaste	5f17a06d9c	vt: remove comment that is not true since r259680 r259680 added support to vt(4) for printing double-width characters. Remove the comment that claims no support. MFC after: 3 days Sponsored by: The FreeBSD Foundation	2019-10-17 13:08:50 +00:00
avg	830e53a5a5	provide a way to assign taskqueue threads to a kernel process This can be used to group all threads belonging to a single logical entity under a common kernel process. I am planning to use the new interface for ZFS threads. MFC after: 4 weeks	2019-10-17 06:32:34 +00:00
avg	8a5c1a383f	wbwd: small clean-ups and improvements This change applies some suggestions by delphij from D21979. A write-only variable is removed. There is a diagnostic message if the driver does not recognize the chip. A chained if-statement is converted to a switch. MFC after: 3 weeks	2019-10-17 06:21:09 +00:00
philip	28635f8b5e	ether: add older ethertype definitions for QinQ Older network equipment used the ethertypes 0x9100, 0x9200, and 0x9300 for outer VLANs, before standardisation introduced 0x88a8. Submitted by: Lutz Donnerhacke <lutz_donnerhacke.de> Differential Revision: https://reviews.freebsd.org/D21846	2019-10-17 00:34:53 +00:00
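Note on 28635f8b5e: a receive path that wants to recognise both the standard and the older outer-VLAN ethertypes could test for them as sketched below. The four values come from the commit message. The macro and function names are hypothetical and are not claimed to match the definitions added to the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Outer (service) VLAN tag ethertypes named in the commit message. */
#define	SKETCH_ETHERTYPE_QINQ		0x88a8	/* standardised value */
#define	SKETCH_ETHERTYPE_QINQ_9100	0x9100	/* older equipment */
#define	SKETCH_ETHERTYPE_QINQ_9200	0x9200	/* older equipment */
#define	SKETCH_ETHERTYPE_QINQ_9300	0x9300	/* older equipment */

/* Return true if the host-order ethertype marks an outer VLAN tag. */
static bool
is_outer_vlan_ethertype(uint16_t ethertype)
{
	switch (ethertype) {
	case SKETCH_ETHERTYPE_QINQ:
	case SKETCH_ETHERTYPE_QINQ_9100:
	case SKETCH_ETHERTYPE_QINQ_9200:
	case SKETCH_ETHERTYPE_QINQ_9300:
		return (true);
	default:
		return (false);
	}
}
```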
markj	39a2929b65	Formalize the use of linker scripts for kernel modules. Automatically apply ldscript.kmod.${MACHINE_ARCH} if it exists. We already have an i386-specific linker script; rename it accordingly. Note that the linker script is applied when the object files are partially linked. (For amd64 this is also the final link.) Reviewed by: imp, kib Discussed with: jhb MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21887	2019-10-16 22:19:56 +00:00
markj	341d641470	Introduce pmap_change_prot() for amd64. This updates the protection attributes of subranges of the kernel map. Unlike pmap_protect(), which is typically used for user mappings, pmap_change_prot() does not perform lazy upgrades of protections. pmap_change_prot() also updates the aliasing range of the direct map. Reviewed by: kib MFC after: 1 month Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21758	2019-10-16 22:12:34 +00:00
markj	b0130de08d	Use KOBJMETHOD_END in the kernel linker. MFC after: 1 week	2019-10-16 22:06:19 +00:00
markj	84cd531f96	Remove page locking from pmap_mincore(). After r352110 the page lock no longer protects a page's identity, so there is no purpose in locking the page in pmap_mincore(). Instead, if vm.mincore_mapped is set to the non-default value of 0, re-lookup the page after acquiring its object lock, which holds the page's identity stable. The change removes the last callers of vm_page_pa_tryrelock(), so remove it. Reviewed by: kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21823	2019-10-16 22:03:27 +00:00
chs	c0b77cf1b2	Make all the gnop parameters optional in the request from userland, filling in the same defaults that the current userland module uses. This allows an old geom_nop.so userland module to work with a new kernel. Approved by: imp (mentor) Reviewed by: cem Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21972	2019-10-16 21:49:44 +00:00
chs	8356d793be	Add a new gctl_get_paraml_opt() interface to extract optional parameters from the request. It is the same as gctl_get_paraml() except that the request is not marked with an error if the parameter is not present. Approved by: imp (mentor) Reviewed by: cem Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21972	2019-10-16 21:49:39 +00:00
markj	f4dbe34c21	Correct the range boundaries used by kern_mincore(). Reported by: alc Sponsored by: Netflix	2019-10-16 21:47:58 +00:00
kib	0e82dce3d3	Port r353622 to sparc64 and arm v4. Noted by: alc Reviewed by: alc, jeff, markj Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D22056	2019-10-16 21:07:18 +00:00
cem	a5b549b956	ddb: Add support for disassembling 'crc32' on amd64	2019-10-16 18:27:27 +00:00
erj	63b49c2612	Fix compile error introduced in r353658 "adapter" doesn't exist in ixl. Reported by: O. Hartmann <ohartmann@walstatt.org>	2019-10-16 18:12:22 +00:00
erj	02fd9522cb	ixl: report whether device received pause frames From Jake: When updating the device statistics, report whether or not we have received any pause frames to the iflib stack. This allows the iflib stack to avoid generating a Tx hang message while the device is paused. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: gallatin@ Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D21870	2019-10-16 17:19:17 +00:00
erj	9da68e5236	ix: report isc_pause_frames during stat update From Jake: Notify the iflib stack of whether we received any pause frames during the timer window. This allows the stack to avoid reporting a Tx hang due to the device being paused. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: gallatin@ Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D21869	2019-10-16 17:16:32 +00:00
erj	fe09f9c176	e1000: correctly set isc_pause_frames only when XOFF increases From Jake: The e1000 driver sets the iflib shared context isc_pause_frames value to the number of received xoff frames. This is done so that the iflib watchdog timer won't trigger a Tx Hang due to pause frames. Unfortunately, the function simply sets it to the value of the xoffrxc counter. Once the device has received a single XOFF packet, the driver always reports that we received pause frames. This will prevent the Tx hang detection entirely from that point on. Fix this by assigning isc_pause_frames to a non-zero value if we received any XOFF packets in the last timer interval. We could attempt to calculate the total number of received packets by doing a subtraction, but the iflib stack only seems to check if isc_pause_frames is non-zero. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: gallatin@ Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D21868	2019-10-16 17:13:46 +00:00
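Note on fe09f9c176: the idea of reporting pause frames per timer interval, rather than "ever", can be sketched as below. This assumes a cumulative XOFF counter that only grows. `struct pause_tracker` and `pause_frames_since_last_tick()` are hypothetical names and the real driver may track the counter differently.

```c
#include <stdint.h>

/* Per-device state: the cumulative XOFF count seen at the previous tick. */
struct pause_tracker {
	uint64_t	last_xoffrxc;
};

/*
 * Return non-zero only if the cumulative XOFF counter moved since the
 * previous call.  Passing the raw counter through instead would stay
 * non-zero forever after the first XOFF frame, which is the bug the
 * commit message describes.
 */
static int
pause_frames_since_last_tick(struct pause_tracker *pt, uint64_t xoffrxc)
{
	int seen;

	seen = (xoffrxc != pt->last_xoffrxc);
	pt->last_xoffrxc = xoffrxc;
	return (seen);
}
```

A boolean is enough here because, per the commit message, the iflib stack only seems to check whether isc_pause_frames is non-zero.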
glebius	ec1d6a72a3	do_link_state_change() is executed in taskqueue context and in general is allowed to sleep. Don't enter the epoch for the whole duration. If some event handlers need the epoch, they should handle that themselves. Discussed with: hselasky	2019-10-16 16:32:58 +00:00
ian	7c87e295c8	Update some comments; no functional changes. Some historical old comments in this driver indicate that the SD_CAPA register is write-once and after being set one time the values in it cannot be changed. That turns out not to be the case -- the values written to it survive a reset, but they can be rewritten/changed at any time.	2019-10-16 16:26:35 +00:00
ian	976e42830c	Revert r351218 (by manu). While the changes in r351218 appear to be (and should be) correct, they lead to the eMMC on a Beaglebone failing to work in some situations. The TI sdhci hardware is kind of strange. The first device inherently supports 1.8v and 3.3v and the ability to switch between them, and the other two devices must be set to 1.8v in the sdhci power control register to operate correctly, but doing so actually makes them run at 3.3v (unless an external level-shifter is present in the signal path). Even the 1.8v on the first device may actually be 3.3v (or any other value), depending on what voltage is fed to the VDDS1-VDDS7 power supply pins on the am335x chip. Another strange quirk is that the convention for am335x sdhci drivers in linux and uboot and the am335x boot ROM seems to be to set the voltage in the sdhci capabilities register to 3.0v even though the actual voltage is 3.3v. Why this is done is a complete mystery to me, but it seems to be required for correct operation. If we had complete modern support for the am335x chip we could get the actual voltages from the FDT data and the regulator framework. But our am335x code currently doesn't have any regulator framework support. Reverting to the prior code will get the popular Beaglebone boards working again. This is part of the fix for PR 241301, but also requires r353651 for a complete fix. PR: 241301 Discussed with: manu	2019-10-16 16:19:21 +00:00
ian	8eb4642025	Relax the sdhci(4) check that filters out the 1.8v voltage option unless the slot is flagged as 'embedded'. The features related to embedded and shared slots were added in v3.0 of the sdhci spec. Hardware prior to v3 sometimes supported 1.8v on non- removable devices in embedded systems, but had no way to indicate that via the standard sdhci registers (instead they use out of band metadata such as FDT data). This change adds the controller specification version to the check for whether to filter out the 1.8v selection. On older hardware, the 1.8v option is allowed to remain. On 3.0 or later it still requires the embedded-slot flag to remain. This is part of the fix for PR 241301 (eMMC not detected on Beaglebone). Changes to the sdhci_ti driver are also needed for a full fix. PR: 241301	2019-10-16 16:03:19 +00:00
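Note on 8eb4642025: the relaxed filter amounts to a two-input decision, sketched below. `keep_1v8_option()` and its parameters are hypothetical and are not the sdhci(4) code. The spec version is passed as a plain major number for simplicity.

```c
#include <stdbool.h>

/*
 * Decide whether the 1.8v option stays in the list of usable voltages.
 * Controllers older than spec v3.0 cannot flag a slot as embedded, so
 * the option is left alone there.  On v3.0 or later the slot must carry
 * the embedded flag for 1.8v to remain.
 */
static bool
keep_1v8_option(unsigned int spec_major, bool slot_is_embedded)
{
	if (spec_major < 3)
		return (true);
	return (slot_is_embedded);
}
```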
markj	957cc17627	Clear PGA_WRITEABLE in moea_pvo_remove(). moea_pvo_remove() might remove the last mapping of a page, in which case it is clearly no longer writeable. This can happen via pmap_remove(), or when a CoW fault removes the last mapping of the old page. Reported and tested by: bdragon Reviewed by: alc, bdragon, kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22044	2019-10-16 15:50:12 +00:00
avg	423d2bf684	attach itwd to the module build on x86 MFC after: 19 days X-MFC with: r353647	2019-10-16 15:01:44 +00:00
avg	92d6de6711	itwd(4): driver for watchdog function in ITE Super I/O chips The chips are commonly named with "IT" prefix. MFC after: 19 days	2019-10-16 14:57:38 +00:00
avg	6e94de3ffa	wbwd: move to superio(4) bus This allows removing a bunch of low-level code. Also, superio(4) provides safer interaction with other drivers that work with Super I/O configuration registers. Tested only on PCengines APU2: superio0: <Nuvoton NCT5104D/NCT6102D/NCT6106D (rev. B+)> at port 0x2e-0x2f on isa0 wbwd0: <Nuvoton NCT6102 (0xc4/0x53) Watchdog Timer> at WDT ldn 0x08 on superio0 The watchdog output is incorrectly wired on that system and the watchdog does not really do its job, but the pulse can be seen with a signal analyzer. Reviewed by: delphij, bcr (man page) MFC after: 19 days Differential Revision: https://reviews.freebsd.org/D21979	2019-10-16 14:46:04 +00:00
avg	f5dd73c97a	move nctgpio to superio(4) bus This is where it logically belongs. The change allows dropping a bunch of low-level code. Reviewed by: gonzo MFC after: 19 days Differential Revision: https://reviews.freebsd.org/D21980	2019-10-16 14:42:49 +00:00
manu	7198b43642	dwc3: Use a pair of ()'s around arguments for some macros Reported by: hselasky MFC after: 1 week X-MFC-With: r353533	2019-10-16 13:53:53 +00:00
andrew	53cf0bb146	Use tables to store the information to decode the arm64 ID registers. Arm updates these with each new architecture revision. To help keep them updated use a collection of tables to hold the needed information to decode these registers. Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D22020	2019-10-16 13:30:28 +00:00
andrew	7052b2d00d	Stop leaking information from the kernel through timespec The timespec struct holds a seconds value in a time_t and a nanoseconds value in a long. On most architectures these are the same size, however on 32-bit architectures other than i386 time_t is 8 bytes and long is 4 bytes. Most ABIs will then pad a struct holding an 8 byte and 4 byte value to 16 bytes with 4 bytes of padding. When copying one of these structs the compiler is free to copy the padding if it wishes. In this case the padding may contain kernel data that is then leaked to userspace. Fix this by copying the timespec elements rather than the entire struct. This doesn't affect Tier-1 architectures so no SA is expected. admbugs: 651 MFC after: 1 week Sponsored by: DARPA, AFRL	2019-10-16 13:21:01 +00:00
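Note on 7052b2d00d: the difference between copying the whole struct and copying its elements can be sketched as below. `struct ts_like` is a stand-in for a timespec on an ABI with an 8-byte seconds field and a 4-byte nanoseconds field, and `ts_export()` is a hypothetical helper, not kernel code.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in layout: on many ABIs 4 bytes of padding follow nsec. */
struct ts_like {
	int64_t	sec;
	int32_t	nsec;
};

/*
 * Fill a destination that will be handed to another protection domain.
 * The destination is zeroed first and only the named members are then
 * copied.  A whole-struct assignment (*dst = *src) would be free to
 * carry over whatever bytes happen to sit in the source's padding.
 */
static void
ts_export(struct ts_like *dst, const struct ts_like *src)
{
	memset(dst, 0, sizeof(*dst));
	dst->sec = src->sec;
	dst->nsec = src->nsec;
}
```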
avg	d2dd33611e	MFV r353637: 10844 Serialize ZTHR operations to eliminate races illumos/illumos-gate@6a316e1f6d https://www.illumos.org/issues/10844 ZoL 61c3391acc9 Serialize ZTHR operations to eliminate races Portions contributed by: Jerry Jelinek <jerry.jelinek@joyent.com> Author: Serapheim Dimitropoulos <serapheim@delphix.com> Obtained from: illumos, ZoL MFC after: 3 weeks	2019-10-16 09:29:01 +00:00
avg	dee41b6cfc	MFV r353630: 10809 Performance optimization of AVL tree comparator functions illumos/illumos-gate@c4ab0d3f46 https://www.illumos.org/issues/10809 Port ZoL ee36c709c3d Performance optimization of AVL tree comparator functions This is a followup to r337567 that imported the ZoL commit directly into FreeBSD. It seems that at the time we did not have some of the earlier changes, so some pieces of the ZoL change were not applicable. Also, the illumos version got a few style cleanups. Some changes were missed or incorrectly merged (e.g., vdev_cache_lastused_compare and metaslab_rangesize_compare). Obtained from: ZoL, illumos MFC after: 25 days X-MFC after: r353634	2019-10-16 09:20:08 +00:00
hselasky	94dc322ef6	Fix panic in network stack due to use after free when receiving partial fragmented packets before a network interface is detached. When sending IPv4 or IPv6 fragmented packets and a fragment is lost before the network device is freed, the mbuf making up the fragment will remain in the temporary hashed fragment list and cause a panic when it times out due to accessing a freed network interface structure. 1) Make sure the m_pkthdr.rcvif always points to a valid network interface. Else the rcvif field should be set to NULL. 2) Use the rcvif of the last received fragment as m_pkthdr.rcvif for the fully defragged packet, instead of the first received fragment. Panic backtrace for IPv6: panic() icmp6_reflect() # tries to access rcvif->if_afdata[AF_INET6]->xxx icmp6_error() frag6_freef() frag6_slowtimo() pfslowtimo() softclock_call_cc() softclock() ithread_loop() Reviewed by: bz Differential Revision: https://reviews.freebsd.org/D19622 MFC after: 1 week Sponsored by: Mellanox Technologies	2019-10-16 09:11:49 +00:00
avg	4cc84298dd	MFV r348596: 9689 zfs range lock code should not be zpl-specific illumos/illumos-gate@7931524763 FreeBSD note: some tweaking was needed to avoid a conflict with sys/rangelock.h. Author: Matthew Ahrens <mahrens@delphix.com> Obtained from: illumos MFC after: 3 weeks	2019-10-16 09:04:53 +00:00
hselasky	12434b0a36	VLAN_TRUNKDEV() requires epochification in ibcore after r353292. Sponsored by: Mellanox Technologies	2019-10-16 08:56:07 +00:00
hselasky	ae1053fff8	Replace rdma_is_upper_dev_rcu() with rdma_vlan_dev_real_dev() in ibcore. This reduces the number of references to VLAN_TRUNKDEV() in ibcore. Currently only VLAN is supported as a child interface in FreeBSD. Remove superfluous RCU locking. Sponsored by: Mellanox Technologies	2019-10-16 08:55:29 +00:00
hselasky	a8aff284db	VLAN_DEVAT() requires epochification in ipoib after r353292. Sponsored by: Mellanox Technologies	2019-10-16 08:40:58 +00:00
avg	5f53fd39c6	MFV r353628: 10842 Mutex leak in dsl_dataset_hold_obj() illumos/illumos-gate@ad027c0ff9 https://www.illumos.org/issues/10842 ZoL d10b2f1d35b Mutex leak in dsl_dataset_hold_obj() Portions contributed by: Jerry Jelinek <jerry.jelinek@joyent.com> Author: Jorgen Lundman <lundman@lundman.net> Obtained from: illumos, ZoL MFC after: 15 days	2019-10-16 07:57:58 +00:00
kib	ae3c37c2e4	Fix assert in PowerPC pmaps after introduction of object busy. The VM_PAGE_OBJECT_BUSY_ASSERT() in pmap_enter() implementation should be only asserted when the code is executed as result of pmap_enter(), not when the same code is entered from e.g. pmap_enter_quick(). This is relevant for all PowerPC pmap variants, because mmu_*_enter() is used as the backend, and assert is located there. Add a PowerPC private pmap_enter() PMAP_ENTER_QUICK_LOCKED flag to indicate that the call is not from pmap_enter(). For non-quick-locked calls, assert that the object is locked. Reported and tested by: bdragon Reviewed by: alc, bdragon, markj Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D22041	2019-10-16 07:09:15 +00:00
avg	8db35e8374	MFV r353619: 9691 fat zap should prefetch when iterating illumos/illumos-gate@52abb70e07 https://www.illumos.org/issues/9691 When iterating over a ZAP object, we're almost always certain to iterate over the entire object. If there are multiple leaf blocks, we can realize a performance win by issuing reads for all the leaf blocks in parallel when the iteration begins. For example, if we have 10,000 snapshots, "zfs destroy -nv pool/fs@1%9999" can take 30 minutes when the cache is cold. This change provides a >3x performance improvement, by issuing the reads for all ~64 blocks of each ZAP object in parallel. Author: Matthew Ahrens <mahrens@delphix.com> Obtained from: illumos MFC after: 2 weeks	2019-10-16 07:09:00 +00:00
avg	b5daec7303	MFV r353617: 9425 allow channel programs to be stopped via signals illumos/illumos-gate@d0cb1fb926 https://www.illumos.org/issues/9425 Problem Statement ZFS Channel program scripts currently require a timeout, so that hung or long-running scripts return a timeout error instead of causing ZFS to get wedged. This limit can currently be set up to 100 million Lua instructions. Even with a limit in place, it would be desirable to have a sys admin (support engineer) be able to cancel a script that is taking a long time. Proposed Solution Make it possible to abort a channel program by sending an interrupt signal. In the underlying txg_wait_sync function, switch the cv_wait to a cv_wait_sig to catch the signal. Once a signal is encountered, the dsl_sync_task function can install a Lua hook that will get called before the Lua interpreter executes a new line of code. The dsl_sync_task can resume with a standard txg_wait_sync call and wait for the txg to complete. Meanwhile, the hook will abort the script and indicate that the channel program was canceled. The kernel returns an EINTR to indicate that the channel program run was canceled. FreeBSD note: the return value of cv_wait_sig() has inverted meaning between us and illumos. Author: Don Brady <don.brady@delphix.com> Obtained from: illumos MFC after: 4 weeks	2019-10-16 07:00:18 +00:00
avg	935d5ee530	MFV r353615: 9485 Optimize possible split block search space illumos/illumos-gate@a21fe34979 https://www.illumos.org/issues/9485 Port this commit from ZoL: `4589f3ae4c` Author: Brian Behlendorf <behlendorf1@llnl.gov> Obtained from: illumos, ZoL MFC after: 3 weeks	2019-10-16 06:43:22 +00:00
avg	18f0d2ac42	MFV r353613: 10731 zfs: NULL pointer errors FreeBSD already had these changes locally. This commit removes a small formatting difference. MFC after: 1 week	2019-10-16 06:38:05 +00:00
avg	e7993b34fd	MFC r353611: 10330 merge recent ZoL vdev and metaslab changes illumos/illumos-gate@a0b03b161c `a0b03b161c` https://www.illumos.org/issues/10330 3 recent ZoL changes in the vdev and metaslab code which we can pull over: PR 8324 c853f382db 8324 Change target size of metaslabs from 256GB to 16GB PR 8290 b194fab0fb 8290 Factor metaslab_load_wait() in metaslab_load() PR 8286 419ba59145 8286 Update vdev_is_spacemap_addressable() for new spacemap encoding Author: Serapheim Dimitropoulos <serapheimd@gmail.com> Obtained from: illumos, ZoL MFC after: 2 weeks	2019-10-16 06:26:51 +00:00
avg	3f51508cfe	MFV r353608: 10165 libzpool: passing argument 1 to restrict-qualified parameter illumos/illumos-gate@f91fcf59ac `f91fcf59ac` https://www.illumos.org/issues/10165 Author: Toomas Soome <tsoome@me.com> MFC after: 10 days	2019-10-16 06:09:00 +00:00
jhibbits	1537de8003	powerpc/mpc85xx: Fix function type for fsl_pcib_error_intr() Since it's only called as an interrupt handler, fsl_pcib_error_intr() should just match the driver_intr_t type. Reported by: bdragon	2019-10-16 03:03:59 +00:00
jhibbits	cf6db46977	powerpc: Add AmigaOne platform, a subclass of MPC85xx Summary: The AmigaOne platform, encompassing the X5000 and A1222 at this time, is based on the mpc85xx platform, but includes some things not listed in the device tree. Some custom devices, like CPLD, could be added to the device tree with an overlay, or other means. However, some cannot easily be done, such as the power button interrupt. The directory will also become a location to add AmigaOne platform drivers, such as the aforementioned CPLD, and its children. Reviewed by: bdragon Differential Revision: https://reviews.freebsd.org/D21829	2019-10-16 00:38:50 +00:00
kp	038f82f772	Generalize ARM specific comments in devmap The comments in devmap are very ARM specific, this generalizes them for other architectures. Submitted by: Nicholas O'Brien <nickisobrien_gmail.com> Reviewed by: manu, philip Sponsored by: Axiado Differential Revision: https://reviews.freebsd.org/D22035	2019-10-15 23:21:52 +00:00
erj	31482d73c4	ixgbe: Disable EEE for backplane X550EM_X From Zach: Intel documentation indicates that backplane X550EM_X KR devices do not support Energy Efficient Ethernet. Prior to this patch, X552 devices (device ID 0x15AB) will crash the system when transitioning EEE state via sysctl. Signed-off-by: Zach Vargas <zvargas@xes-inc.com> PR: 240320 Submitted by: Zach Vargas <zvargas@xes-inc.com> Reviewed by: erj@ MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D21673	2019-10-15 21:56:19 +00:00
glebius	9101a0b1d1	Missing from r353596.	2019-10-15 21:32:38 +00:00
glebius	072472d2fc	When assertion for a thread not being in an epoch fails also print all entered epochs. Works with EPOCH_TRACE only. Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D22017	2019-10-15 21:24:25 +00:00
trasz	4bf80ad94f	Add copyrights that I forgot to add when splitting arb.h off from tree.h. While here clean up the RCS tags. Suggested by: lstewart MFC after: 2 weeks Sponsored by: Klara Inc, Netflix	2019-10-15 19:44:43 +00:00
jhb	d4c3d2e382	Install an ACPI PCI bus notify handler. Rescan a PCI bus when the ACPI_NOTIFY_BUS_CHECK event is posted to a PCI bus. Reviewed by: scottl MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D21948	2019-10-15 19:12:09 +00:00
jhb	ce12d00cf5	Support hot insertion and removal of PCI devices on EC2. Install ACPI notify handlers on PCI devices with an _EJ0 method. This handler is invoked when devices are added or removed. - When an ACPI_NOTIFY_DEVICE_CHECK event posts, rescan the parent bus device. Note that strictly speaking we only need to rescan the specified device, but BUS_RESCAN is what is available, so we rescan the entire bus. - When an ACPI_NOTIFY_EJECT_REQUEST event posts, detach the device associated with the ACPI handle, invoke the _EJ0 method, and then delete the device. Eventually this might be changed to vector notify events to devd in userspace where devctl can be used instead to permit more complex actions such as graceful unmounting of filesystems. Tested by: cperciva Reviewed by: cperciva, imp, scottl MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D21948	2019-10-15 19:04:39 +00:00
jhb	df7b71835a	Export pci_attach() and pci_detach(). Reviewed by: imp MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D21948	2019-10-15 18:58:01 +00:00
np	04295439a0	cxgbe(4): An EQ update can be requested in a TX_PKTS2 work request. MFC after: 1 week Sponsored by: Chelsio Communications	2019-10-15 17:35:39 +00:00
jhb	38c7492c7c	Use -march=octeon+ for OCTEON1. External binutils requires octeon+ for saa. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D22033	2019-10-15 17:28:26 +00:00
br	08cc5b3944	Fix dwmmc(4) driver attachment when ext_resources are not present. Ignore only ENOENT (no DTS properties found) and ENODEV (driver not present) non-zero return values from ext_resources. Reviewed by: manu Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D22043	2019-10-15 17:24:21 +00:00
jhb	ef102daf0b	Fix a write-only variable warning from external GCC. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D22032	2019-10-15 17:17:16 +00:00
jhb	96cbe399ef	Don't set the OUTPUT_FORMAT explicitly but let ld derive it. This fixes an error with modern ld.bfd and is inline with the changes in r215251 and r217612. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D22031	2019-10-15 17:14:30 +00:00
jhb	0c2b46c560	Update MIPS kernel builds to work with mips-gcc. - Use a default -march of mips64 on N64 and N32 kernels. - Set the endianness (via MIPS_ENDIAN) and ABI (via MIPS_ABI) in CFLAGS from MACHINE_ARCH. ARCH_FLAGS now only sets a different -march value if needed. - TRAMP_ARCH_FLAGS inherits MIPS_ENDIAN from MACHINE_ARCH but does not set the ABI since XLPN32 needs an N64 ABI for the trampoline loader. When TRAMP_ARCH_FLAGS is used it must set both -march and -mabi. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D22030	2019-10-15 17:11:42 +00:00
avg	36182cc5e9	fix up r353565, somehow a few files did not get committed MFC after: 3 weeks X-MFC with: r353565	2019-10-15 15:52:01 +00:00
glebius	7361293b96	Remove pfctlinput2(). It came from KAME and had never ever been in use.	2019-10-15 15:40:03 +00:00
avg	d739910195	MFV r353561: 10343 ZoL: Prefix all refcount functions with zfs_ illumos/illumos-gate@e914ace2e9 `e914ace2e9` https://www.illumos.org/issues/10343 On the openzfs feature/porting matrix, this is listed as: prefix to refcount funcs/types Having these changes will make it easier to share other work across the different ZFS operating systems. PR 7963 424fd7c3e Prefix all refcount functions with zfs_ PR 7885 & 7932 c13060e47 Linux 4.19-rc3+ compat: Remove refcount_t compat PR 5823 & 5842 4859fe796 Linux 4.11 compat: avoid refcount_t name conflict Author: Tim Schumacher <timschumi@gmx.de> Obtained from: illumos, ZoL MFC after: 3 weeks	2019-10-15 15:09:36 +00:00
avg	2cd2119783	MFV r353558: 10572 10579 Fix race in dnode_check_slots_free() illumos/illumos-gate@aa02ea0194 `aa02ea0194` 10572 Fix race in dnode_check_slots_free() https://www.illumos.org/issues/10572 The Fix from ZoL: Currently, dnode_check_slots_free() works by checking dn->dn_type in the dnode to determine if the dnode is reclaimable. However, there is a small window of time between dnode_free_sync() in the first call to dsl_dataset_sync() and when the useraccounting code is run when the type is set DMU_OT_NONE, but the dnode is not yet evictable, leading to crashes. This patch adds the ability for dnodes to track which txg they were last dirtied in and adds a check for this before performing the reclaim. This patch also corrects several instances when dn_dirty_link was treated as a list_node_t when it is technically a multilist_node_t. 10579 Don't allow dnode allocation if dn_holds != 0 https://www.illumos.org/issues/10579 The fix from ZoL: This patch simply fixes a small bug where dnode_hold_impl() could attempt to allocate a dnode that was in the process of being freed, but which still had active references. This patch simply adds the required check. Author: Tom Caputi <tcaputi@datto.com> Reported by: delphij MFC after: 2 weeks X-MFC with: r353176	2019-10-15 14:29:18 +00:00
avg	c233000250	MFV r353551: 10452 ZoL: merge in large dnode feature fixes illumos/illumos-gate@946342a260 `946342a260` https://www.illumos.org/issues/10452 illumos is missing a few small follow up ZoL bug fixes for the large dnode feature. We should pull those in. Those commits are in the ZoL tree as (newest to oldest): PR 8435 - 75d6b7ddca269542279975f716a343bb40a79baf - Add missing copyright notice to large_dnode tests PR 7433 - e14a32b1c844d924b9f093375c0badcf10f61741 - Fix object reclaim when using large dnodes PR 6616 - 48fbb9ddbf2281911560dfbc2821aa8b74127315 - Free objects when receiving full stream as clone PR 6695 - 39f56627ae988d09b4e3803c01c22b2026b2310e - receive_freeobjects() skips freeing some object Portions contributed by: Ned Bass <bass6@llnl.gov> Portions contributed by: Tom Caputi <tcaputi@datto.com> Author: Fabian Grünbichler <f.gruenbichler@proxmox.com> Obtained from: illumos, ZoL MFC after: 2 weeks X-MFC with: r353176	2019-10-15 14:20:11 +00:00
hselasky	d234ae58ee	The two functions ifnet_byindex() and ifnet_byindex_locked() are exactly the same after the network stack was epochified. Merge the two into one function and cleanup all uses of ifnet_byindex_locked(). While at it: - Add branch prediction macros. - Make sure the ifnet pointer is only deferred once, also when code optimisation is disabled. Sponsored by: Mellanox Technologies	2019-10-15 12:08:09 +00:00
hselasky	9eb258b5cf	Exclude the network link eventhandler from epochification after r353292. This fixes the following assert when "options RATELIMIT" is used: panic() malloc() sysctl_add_oid() tcp_rl_ifnet_link() do_link_state_change() taskqueue_run_locked() Sponsored by: Mellanox Technologies	2019-10-15 11:20:16 +00:00
hselasky	a229d895ec	Fix missing epochification of the LinuxKPI after r353292. Sponsored by: Mellanox Technologies	2019-10-15 11:14:14 +00:00
hselasky	0591f58c0a	Fix missing epochification of the ibcore code after r353292. Sponsored by: Mellanox Technologies	2019-10-15 11:12:31 +00:00
hselasky	80029ff884	Fix missing epochification of the ipoib code after r353292. Sponsored by: Mellanox Technologies	2019-10-15 11:11:21 +00:00
jeff	50eb2e4288	(6/6) Convert pmap to expect busy in write related operations now that all callers hold it. This simplifies pmap code and removes a dependency on the object lock. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21596	2019-10-15 03:51:46 +00:00
jeff	786dad5c20	(5/6) Move the VPO_NOSYNC to PGA_NOSYNC to eliminate the dependency on the object lock in vm_page_set_validclean(). Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21595	2019-10-15 03:48:22 +00:00
jeff	e249e932a5	(4/6) Protect page valid with the busy lock. Atomics are used for page busy and valid state when the shared busy is held. The details of the locking protocol and valid and dirty synchronization are in the updated vm_page.h comments. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21594	2019-10-15 03:45:41 +00:00
jeff	0a6e7a4266	(3/6) Add a shared object busy synchronization mechanism that blocks new page busy acquires while held. This allows code that would need to acquire and release a very large number of page busy locks to use the old mechanism where busy is only checked and not held. This comes at the cost of false positives but never false negatives which the single consumer, vm_fault_soft_fast(), handles. Reviewed by: kib Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21592	2019-10-15 03:41:36 +00:00
jeff	209fb8d357	(2/6) Don't release xbusy in vm_page_remove(), defer to vm_page_free_prep(). This persists busy state across operations like rename and replace. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21549	2019-10-15 03:38:02 +00:00
jhibbits	4ec246542a	powerpc/atomic: Fix atomic_cmpset_rel() Need a release barrier, not an acquire barrier, else bad things happen.	2019-10-15 03:37:21 +00:00
jeff	51ed6c3ace	(1/6) Replace busy checks with acquires where it is trivial to do so. This is the first in a series of patches that promotes the page busy field to a first class lock that no longer requires the object lock for consistency. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21548	2019-10-15 03:35:11 +00:00
manu	ce45f9aeb8	arm: allwinner: Add np and nmm clock file to the build MFC after: 1 month	2019-10-14 22:29:20 +00:00
manu	f6952dddfa	arm64: Add Synopsys DWC3 driver This adds a driver for the Synopsys DWC3 controller found on multiple SoCs. It only supports host mode for now. MFC after: 1 month	2019-10-14 22:27:33 +00:00
manu	9116e61067	arm64: allwinner: Add aw_dwc3 driver This is a simplebus-like driver that just deals with clocks and resets and attaches the dwc3 child node. MFC after: 1 month	2019-10-14 22:22:19 +00:00
manu	3432510fca	arm64: allwinner: Add support for the usb3 phy The USB 3 controller in the H6 SoC has a dedicated phy. Add support for it. Mostly imported from NetBSD MFC after: 1 month	2019-10-14 21:58:46 +00:00
manu	0628be9d1d	arm64: allwinner: aw_usbphy: Add support for H6 PHY MFC after: 1 month	2019-10-14 21:56:41 +00:00
manu	a4b286109d	arm64: allwinner: Add H6 GPIO/Pinctrl driver This adds support for Allwinner H6 GPIO and pinctrl driver for both the main pinctrl unit and the 'r_' one. MFC after: 1 month	2019-10-14 21:55:45 +00:00
manu	a6c0ec79f3	arm64: allwinner: Add Allwinner H6 Support This adds support for H6 SoC. Add a CCU driver for H6 that supports all PLLs and most of the clocks that we are interested in for now (i2c, mmc, usb, etc ...) MFC after: 1 month	2019-10-14 21:53:53 +00:00
manu	3ce3414378	arm: allwinner: Disable the clock before changing its freq You aren't supposed to change the freq of a clock when it is enabled, so disable the clock before changing the freq and then re-enable it. MFC after: 1 month	2019-10-14 21:50:44 +00:00
manu	2f3f5200be	arm64: allwinner: Add aw_clk_nmm clock This is a clock type present on Allwinner H6 where the formula is : f = fparent * n / m0 / m1 MFC after: 1 month	2019-10-14 21:49:07 +00:00
manu	382da7bac2	arm64: allwinner: Add new clock aw_clk_np This is a clock type present in Allwinner H6 where the formula is : f = fparent * N / P MFC after: 1 month	2019-10-14 21:47:20 +00:00
manu	8d778d43ab	aw_ccung: Add more debug printfs No functional changes MFC after: 1 month	2019-10-14 21:45:15 +00:00
glebius	6d9d589814	if_delmulti() is never called without ifp argument, assert this instead of doing a useless search through interfaces.	2019-10-14 21:18:37 +00:00
cem	7c575799c3	x86: Use canonical spelling of MOVDIR64B feature/instruction The former spelling probably confused MOVDIR64B with MOVDIRI64. MOVDIR_64B is the 64-byte direct store instruction; MOVDIR_I64 is the 64-bit direct store instruction (underscores added here for clarity; they are not part of the canonical instruction name). No functional change. Sponsored by: Dell EMC Isilon	2019-10-14 20:55:01 +00:00
glebius	dd3da16cda	Convert to if_foreach_llmaddr() KPI. Reviewed by: philip	2019-10-14 20:33:14 +00:00
glebius	b7b62bfe2f	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:32:28 +00:00
tuexen	9c9657076e	Separate out SCTP related dtrace code. This is based on work done by markj@. Discussed with: markj@ MFC after: 3 days	2019-10-14 20:32:11 +00:00
glebius	26caa8963c	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:32:08 +00:00
glebius	5802d623cd	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:31:57 +00:00
glebius	c890fb42c6	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:31:43 +00:00
glebius	456197b99a	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:31:28 +00:00
glebius	e15795add3	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:30:44 +00:00
glebius	b6aceb6e28	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:30:30 +00:00
glebius	6e20ff3010	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:30:06 +00:00
glebius	ee90ab5949	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:29:50 +00:00
glebius	a253dc5437	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:29:32 +00:00
glebius	7cd5b29180	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:29:14 +00:00
glebius	9a9a9736e5	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:26:53 +00:00
glebius	07ae3c950b	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:26:17 +00:00
glebius	35db1c2a00	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:23:16 +00:00
glebius	df24c2c818	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:22:25 +00:00
glebius	79e3cb3789	Convert to if_foreach_llmaddr() KPI. Reviewed by: erj	2019-10-14 20:21:02 +00:00
glebius	c69fd8a5cd	Convert to if_foreach_llmaddr() KPI. Reviewed by: gallatin	2019-10-14 20:18:36 +00:00
luporl	cad5513a14	Fix powerpc/powerpcspe builds Revision 353489 introduced some new function calls in common powerpc code, but these must be called only on powerpc64.	2019-10-14 19:06:17 +00:00
jhb	85cd56a70f	Remove an unused parameter from get_new_keyid().	2019-10-14 18:02:56 +00:00
dougm	01d8b40816	Correct a transcription error that broke GENERIC introduced in r353496.	2019-10-14 17:51:57 +00:00
dougm	6db7785e8f	Move the definition of _vm_map_assert_consistent so that it can use vm_map_free_{left,right} rather than re-implementing them. Use the VM_MAP_FOREACH macro where applicable. Fix some indentation. Suggested by: kib (in a comment on D21964) Tested by: pho (as part of D21964) Differential Revision: https://reviews.freebsd.org/D22011	2019-10-14 17:15:42 +00:00
glebius	5d9a41a2e5	Use epoch(9) directly instead of obsoleted KPI.	2019-10-14 16:37:41 +00:00
glebius	962bb78b05	ipfw(4) rule matching always happens in network epoch.	2019-10-14 16:37:00 +00:00
br	66c939a582	Fix the driver attachment in cases when the external resource devices (resets, regulators, clocks) are not available. Rely on the system initialization done by a bootloader in those cases. This fixes operation on Terasic DE10-Pro (an Intel Stratix 10 development kit). Sponsored by: DARPA, AFRL	2019-10-14 15:52:59 +00:00
glebius	d6d2b9fe51	in6ifa_llaonifp() is never called from fast path, so do not require epoch being entered.	2019-10-14 15:33:53 +00:00
rrs	2271f78dc9	if_hw_tsomaxsegsize needs to be initialized to zero, just like in bbr.c and tcp_output.c	2019-10-14 13:10:29 +00:00
luporl	57d28447c8	[PPC64] Initial kernel minidump implementation Based on POWER9BSD implementation, with all POWER9 specific code removed and addition of new methods in PPC64 MMU interface, to isolate platform specific code. Currently, the new methods are implemented on pseries and PowerNV (D21643). Reviewed by: jhibbits Differential Revision: https://reviews.freebsd.org/D21551	2019-10-14 13:04:04 +00:00
tuexen	c376e82038	Rename sctp_dtrace_declare.h to sctp_kdtrace.h for consistency. MFC after: 3 days	2019-10-14 13:02:49 +00:00
andrew	12778f78cc	Sort the id_aa64*_fields arrays to be in alphanumerical order. Sponsored by: DARPA, AFRL	2019-10-14 09:29:56 +00:00
glebius	187eb910e1	Since EPOCH_TRACE had been moved to opt_global.h, we don't need to waste extra space in struct thread.	2019-10-14 04:17:56 +00:00
glebius	c58d10ed59	Revert r353313. It is not needed with r353357 and is actually incorrect.	2019-10-14 04:10:00 +00:00
mmacy	8078d0b567	Fix sample check in hwpmc Don't drop samples with callchain pending Tested by: mjg@ Submitted by: Rajeeb Barman at amd.com MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D17011	2019-10-13 22:26:55 +00:00
tuexen	aefa767a80	Add missing include which breaks builds without VIMAGE. The bug was introduced by me in r353480. Reported by: Michael Butler MFC after: 3 days	2019-10-13 19:58:37 +00:00
jhibbits	0c685cb80e	powerpc/pmap: Tighten condition for removing tracked pages in Book-E pmap There are cases where there's no vm_page_t structure for a given physical address, such as the CCSR. In this case, trying to obtain the md.page_tracked struct member would lead to a NULL dereference, and panic. Tighten this up by checking for kernel_pmap AND that the page structure actually exists before dereferencing. The flag can only be set when it's tracked in the kernel pmap anyway. MFC after: 3 weeks	2019-10-13 19:33:00 +00:00
tuexen	e037f7f53e	Use an event handler to notify the SCTP about IP address changes instead of calling an SCTP specific function from the IP code. This is a requirement of supporting SCTP as a kernel loadable module. This patch was developed by markj@, I tweaked a bit the SCTP related code. Submitted by: markj@ MFC after: 3 days	2019-10-13 18:17:08 +00:00
markj	8cb00574b8	Move SCTP DTrace probe definitions into a .c file. Previously they were defined in a header which was included exactly once. Change this to follow the usual practice of putting definitions in C files. No functional change intended. Discussed with: tuexen MFC after: 1 week Sponsored by: The FreeBSD Foundation	2019-10-13 16:14:04 +00:00
mjg	db8c5484f2	tmpfs: use MNTK_NOMSYNC Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22009	2019-10-13 15:42:41 +00:00
mjg	e164faaa6f	pseudofs: use MNTK_NOMSYNC Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22009	2019-10-13 15:42:25 +00:00
mjg	4ed7410353	nullfs: use MNTK_NOMSYNC Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22009	2019-10-13 15:42:04 +00:00
mjg	431c8fcde9	devfs: use MNTK_NOMSYNC Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22009	2019-10-13 15:41:47 +00:00
mjg	5b51c4fad1	zfs: use MNTK_NOMSYNC Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22009	2019-10-13 15:41:30 +00:00
mjg	7cb37ce311	vfs: add MNTK_NOMSYNC On many filesystems the traversal is effectively a no-op. Add a way to avoid the overhead. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22009	2019-10-13 15:40:34 +00:00
mjg	c576b0223d	vfs: return free vnode batches in sync instead of vfs_msync It is a more natural fit. vfs_msync only deals with active vnodes. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22008	2019-10-13 15:39:11 +00:00
glebius	b2fd2fd625	vlan_config() isn't always called in epoch context. Reported by: kp	2019-10-13 15:15:09 +00:00
tuexen	164e0334a4	Remove line not needed. Submitted by: markj@ MFC after: 3 days	2019-10-13 09:35:03 +00:00
kib	a3fd50a480	Restore nofaulting operations after r352807 The TDP_NOFAULTING flag should be checked in vm_fault(), not in vm_fault_trap(). Otherwise kernel accesses to userspace, like vn_io_fault(), enter vm locking when it should not. Reported and tested by: pho Reviewed by: alc, markj Sponsored by: The FreeBSD Foundation MFC after: 3 days Differential revision: https://reviews.freebsd.org/D21992	2019-10-13 06:56:45 +00:00
glebius	39cdd43bf3	Don't cover in6_ifattach() with network epoch, as it may call into network drivers ioctls, that may sleep. PR: 241223	2019-10-13 04:25:16 +00:00
markj	fdce34ac5e	Fix the build after r353458. MFC with: r353458 Sponsored by: The FreeBSD Foundation	2019-10-13 00:08:17 +00:00
bdragon	97ab2223d4	Fix read past end of struct in ncsw glue code. The logic in XX_IsPortalIntr() was reading past the end of XX_PInfo. This was causing it to erroneously return 1 instead of 0 in some circumstances, causing a panic on the AmigaOne X5000 due to mixing exclusive and nonexclusive interrupts on the same interrupt line. Since this code is only called a couple of times during startup, use a simple double loop instead of the complex read-ahead single loop. This also fixes a bug where it would never check cpu=0 on type=1. Approved by: jhibbits (mentor) Differential Revision: https://reviews.freebsd.org/D21988	2019-10-12 23:16:17 +00:00
markj	0ca8957b6d	Add a missing include of opt_sctp.h. MFC after: 1 week Sponsored by: The FreeBSD Foundation	2019-10-12 23:01:16 +00:00
markj	f03e9ba4cb	Add a missing include of opt_sctp.h. MFC after: 1 week Sponsored by: The FreeBSD Foundation	2019-10-12 22:58:33 +00:00
mav	d514a0e81a	Allocate device softc from the device domain. Since we are trying to bind device interrupt threads to the device domain, it makes sense to make memory often accessed by them local. If domain is not known, fall back to round-robin. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2019-10-12 19:03:07 +00:00
philip	f27a24e8b2	A comment in subr_devmap.c mentions that devmap_print_table() should be called on bootverbose. Do so on RISC-V too. Submitted by: Nicholas O'Brien <nickisobrien_gmail.com> Reviewed by: imp, kp Sponsored by: Axiado Differential Revision: https://reviews.freebsd.org/D21998	2019-10-12 18:18:11 +00:00
tuexen	b9f81c4458	Ensure that local variables are reset to their initial value when dealing with error cases in a loop over all remote addresses. This issue was found and reported by OSS_Fuzz in: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18080 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18086 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18121 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18163 MFC after: 3 days	2019-10-12 17:57:03 +00:00
kib	0dd6fd8974	devfs_vptocnp(): correct the component name when node is not at top. The node's cdp.si_name is the full path as provided by make_dev(9); it should not be returned by VOP_VPTOCNP() when only the last component is requested. Use the dirent entry instead. With this note, handling of VDIR and VCHR nodes only differs in handling of root vnode, which simplifies and unifies the logic. Reported by: Li, Zhichao1 <Zhichao_Li1@Dell.com> Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-10-11 18:41:24 +00:00
kib	852150953b	Plug the rest of undef behavior places that were missed in r337456. There are three more places in msdosfs_fat.c which might shift one into the sign bit. While there, fix formatting of KASSERTs. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-10-11 18:37:02 +00:00
markj	7761b8de45	Remove an unneeded include of opt_sctp.h. MFC after: 1 week Sponsored by: The FreeBSD Foundation	2019-10-11 17:23:23 +00:00
kp	9f3c88db10	mountroot: run statfs after mounting devfs The usual flow for mounting a file system is to VFS_MOUNT() and then immediately VFS_STATFS(). That's not done in vfs_mountroot_devfs(), which means the mp->mnt_stat.f_iosize field is not correctly populated, which in turn causes us to mark valid aio operations as unsafe (because the io size is set to 0), ultimately causing the aio_test:md_waitcomplete test to fail. Reviewed by: mckusick MFC after: 1 week Sponsored by: Axiado Differential Revision: https://reviews.freebsd.org/D21897	2019-10-11 17:04:38 +00:00
avg	238787c74e	fix up r353340, don't assume that fcmpset has strong semantics fcmpset can have two kinds of semantics, weak and strong. For practical purposes, strong semantics means that if fcmpset fails then the reported current value is always different from the expected value. Weak semantics means that the reported current value may be the same as the expected value even though fcmpset failed. That's a so-called "sporadic" failure. I originally implemented atomic_cas expecting strong semantics, but many platforms actually have weak semantics. Reported by: pkubaj (not confirmed if same issue) Discussed with: kib, mjg MFC after: 19 days X-MFC with: r353340	2019-10-11 17:01:02 +00:00
philip	a850973810	Call devmap_bootstrap in RISC-V machine dependent code to actually create the static device mappings. While RISC-V support was added to subr_devmap.c in r298631, it was never actually initialised in the machine dependent code. Submitted by: Nicholas O'Brien <nickisobrien_gmail.com> Reviewed by: br, kp Sponsored by: Axiado Differential Revision: https://reviews.freebsd.org/D21975	2019-10-11 16:28:46 +00:00
asomers	ed8e9e6d05	MFZol: Fix performance of "zfs recv" with many deletions This patch fixes 2 issues with the DMU free throttle implemented in dmu_free_long_range(). The first issue is that get_next_chunk() was calculating the number of L1 blocks the free would dirty incorrectly. In some cases involving extremely large files, this code would greatly overestimate the number of affected L1 blocks, causing excessive calls to txg_wait_open(). This patch corrects the calculation. The second issue is that the free throttle uses the total number of free'd blocks in all (open, quiescing, and syncing) txgs to determine whether to throttle. This causes large frees (such as those created by the first issue) to cause 4 txg syncs before any further frees were allowed to proceed. This patch ensures that the accounting is done entirely in a per-txg fashion, so that frees from a given txg don't affect those that immediately follow it. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Signed-off-by: Tom Caputi <tcaputi@datto.com> zfsonlinux/zfs@f4c594da94 Freeing throttle should account for holes Deletion throttle currently does not account for holes in a file. This means that it can activate when it shouldn't. To fix it we switch the throttle to be based on the number of L1 blocks we will have to dirty when freeing Reviewed-by: Tom Caputi <tcaputi@datto.com> Reviewed-by: Matt Ahrens <mahrens@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alek Pinchuk <apinchuk@datto.com> zfsonlinux/zfs@65282ee9e0 Submitted by: Alek Pinchuk <pinchuk.alek@gmail.com> Reviewed by: allanjude MFC after: 2 weeks Sponsored by: Axcient Differential Revision: https://reviews.freebsd.org/D21895	2019-10-11 14:59:28 +00:00
mjg	bd2df1a0b7	amd64 pmap: handle fictitious mappings with addresses beyond pv_table There are provisions to do it already with pv_dummy, but new locking code did not account for it. Previous one did not have the problem because it hashed the address into the lock array. While here annotate common vars with __read_mostly and __exclusive_cache_line. Reported by: Thomas Laus Tested by: jkim, Thomas Laus Fixes: r353149 ("amd64 pmap: implement per-superpage locks") Sponsored by: The FreeBSD Foundation	2019-10-11 14:57:47 +00:00
jhibbits	197e87f234	gcore: Add aarch64 32-bit core support Summary: Add trivial 32-bit arm cores on aarch64 support for gcore. This doesn't handle fpregs. Reviewed by: #arm, andrew Sponsored by: Juniper Networks, Inc Differential Revision: https://reviews.freebsd.org/D21947	2019-10-11 14:15:50 +00:00
cem	917384c7c5	Fix braino in r353429 cy@ points out that I got parameter order backwards between definition and invocation of the helper function. He is totally correct. The earlier version of this patch predated the XFree column so this is one I introduced, rather than the original author. Submitted by: cy Reported by: cy X-MFC-With: r353429	2019-10-11 06:02:03 +00:00
cem	43181b339c	ddb: Add CSV option, sorting to 'show (malloc|uma)' Add /i option for machine-parseable CSV output. This allows ready copy/ pasting into more sophisticated tooling outside of DDB. Add total zone size ("Memory Use") as a new column for UMA. For both, sort the displayed list on size (print the largest zones/types first). This is handy for quickly diagnosing "where has my memory gone?" at a high level. Submitted by: Emily Pettigrew <Emily.Pettigrew AT isilon.com> (earlier version) Sponsored by: Dell EMC Isilon	2019-10-11 01:31:31 +00:00
glebius	5a42f81ca8	Don't use if_maddr_rlock() in 802.11, use epoch(9) directly instead.	2019-10-10 23:55:33 +00:00
glebius	6bd890f89f	Don't use if_maddr_rlock() in sppp(4), use epoch(9) directly instead.	2019-10-10 23:54:37 +00:00
glebius	d8ae7b3010	Don't use if_maddr_rlock() in tuntap(4), use epoch(9) directly instead.	2019-10-10 23:51:14 +00:00
glebius	ba3932f44d	Interface output method must be executed in network epoch, so if_addr_rlock() isn't needed here.	2019-10-10 23:50:32 +00:00
glebius	94d93cfa86	Don't use if_maddr_rlock() in ng_eiface(4), use epoch(9) directly instead.	2019-10-10 23:49:19 +00:00
glebius	3c520efd9f	The divert(4) module must always be running in network epoch, thus call to if_addr_rlock() isn't needed.	2019-10-10 23:48:42 +00:00
glebius	1bbf4a3048	Don't use if_maddr_rlock() in ng_ether(4), use epoch(9) directly instead.	2019-10-10 23:47:14 +00:00
glebius	05b77fd477	Add two extra functions that basically give count of addresses on interface. Such function could have been implemented on top of the if_foreach_llm?addr(), but several drivers need counting, so avoid copy-n-paste inside the drivers.	2019-10-10 23:44:56 +00:00
glebius	f4c0c06c47	Provide new KPI for network drivers to access lists of interface addresses. The KPI reveals neither how addresses are stored nor how the access to them is synchronized, nor does it reveal struct ifaddr and struct ifmaddr. Reviewed by: gallatin, erj, hselasky, philip, stevek Differential Revision: https://reviews.freebsd.org/D21943	2019-10-10 23:42:55 +00:00
cem	4d006fe21a	nvdimm(4): Calculate and save memattr once; it never changes Refactor nvdimm_spa_memattr() routine and callers to just save the value at initialization and use the value directly. The reference value from NFIT, MemoryMapping, is read only once, so the associated memattr could never change. No functional change. Sponsored by: Dell EMC Isilon	2019-10-10 22:49:45 +00:00
kib	74ece39650	Typo out->in. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-10-10 18:52:24 +00:00
avg	50cdda62fc	emulate illumos membar_producer with atomic_thread_fence_rel membar_producer is supposed to be a store-store barrier. Also, in the code that FreeBSD has ported from illumos membar_producer is used only with regular stores to regular memory (with respect to caching). We do not have an MI primitive for the store-store barrier, so atomic_thread_fence_rel is the closest we have as it provides (load | store) -> store barrier. Previously, membar_producer was an empty function call on all 32-bit arm-s, 32-bit powerpc, riscv and all mips variants. I think that it was inadequate. On other platforms, such as amd64, arm64, i386, powerpc64, sparc64, membar_producer was implemented using stronger primitives than required for a store-store barrier with respect to regular memory access. For example, it used sfence on amd64 and lock-ed nop in i386 (despite TSO). On powerpc64 we now use recommended lwsync instead of eieio. On sparc64 FreeBSD uses TSO mode. On arm64/aarch64 we now use dmb sy instead of dmb ish. Not sure if this is an improvement, actually. After this change we can drop opensolaris_atomic.S for aarch64, amd64, powerpc64 and sparc64 as all required atomic operations have either direct or light-weight mapping to FreeBSD native atomic operations. Discussed with: kib MFC after: 4 weeks	2019-10-10 07:39:41 +00:00
ambrisko	235ed49b64	This driver attaches to the Intel VMD drive and connects a new PCI domain starting at the max. domain, and then works down. Then existing FreeBSD drivers will attach. Interrupt routing from the VMD MSI-X to the NVME drive is not well known, so any interrupt is sent to all children that register. VROC used Intel meta data so graid(8) works with it. However, graid(8) supports RAID 0,1,10 for read and write. I have some early code to support writes with RAID 5. Note that RAID 5 can have life issues with SSDs since it can cause write amplification from updating the parity data. Hot plug support needs a change to skip the following check to work: if (pcib_request_feature(dev, PCI_FEATURE_HP) != 0) { in sys/dev/pci/pci_pci.c. Looked at by: imp, rpokala, bcr Differential Revision: https://reviews.freebsd.org/D21383	2019-10-10 03:12:17 +00:00
jhb	b6209cc9c7	Add opt_kern_tls.h to the sources from t4_tom.ko. Missed in r353328. Sponsored by: Chelsio Communications	2019-10-09 23:35:42 +00:00
jhb	56a61b7cc2	Don't free the cursor boundary tag during vmem_destroy(). The cursor boundary tag is statically allocated in the vmem instead of from the vmem_bt_zone. Explicitly remove it from the vmem's segment list in vmem_destroy before freeing all the segments from the vmem. Reviewed by: markj MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D21953	2019-10-09 21:20:39 +00:00
jhb	bf3394bad4	Remove adapters from t4_list earlier during detach. This ensures the clip task won't race with t4_destroy_clip_table. While here, make some mutex destroys unconditional since attach always initializes them. Reviewed by: np MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D21952	2019-10-09 21:08:51 +00:00
imp	516b5533ca	Fix casting error from newer gcc Cast the pointers to (uintptr_t) before assigning to type uint64_t. This eliminates an error from gcc when we cast the pointer to a larger integer type.	2019-10-09 21:02:06 +00:00
trasz	bd1ba152e6	Fix the compilation workaround so it's not entirely dead code - clang also defines __GNUC__. Submitted by: cem Sponsored by: Klara Inc, Netflix	2019-10-09 18:46:56 +00:00
hselasky	ff0a2f4dbc	Factor out TCP rateset destruction code. Ensure the epoch_call() function is not called more than one time before the callback has been executed, by always checking the RS_FUNERAL_SCHD flag before invoking epoch_call(). The "rs_number_dead" is balanced again after r353353. Discussed with: rrs@ Sponsored by: Mellanox Technologies	2019-10-09 17:08:40 +00:00
dim	b44399126d	Merge llvm, clang, compiler-rt, libc++, libunwind, lld, lldb and openmp 9.0.0 final release r372316. Release notes for llvm, clang, lld and libc++ 9.0.0 are available here: https://releases.llvm.org/9.0.0/docs/ReleaseNotes.html https://releases.llvm.org/9.0.0/tools/clang/docs/ReleaseNotes.html https://releases.llvm.org/9.0.0/tools/lld/docs/ReleaseNotes.html https://releases.llvm.org/9.0.0/projects/libcxx/docs/ReleaseNotes.html PR: 240629 MFC after: 1 month	2019-10-09 17:06:56 +00:00
glebius	ac17d49973	Revert most of the multicast changes from r353292. This needs a more accurate approach.	2019-10-09 17:03:20 +00:00
glebius	62fd3c33fa	ip6_output() has a complex set of gotos, and some can jump out of the epoch section towards return statement. Since entering epoch is cheap, it is easier to cover the whole function with epoch, rather than try to properly maintain its state.	2019-10-09 17:02:28 +00:00
glebius	9618aadcd8	Cleanup unneeded includes that crept in with r353292.	2019-10-09 16:59:42 +00:00
manu	dd7d8c699a	dwmmc: Reset the dma controller at attach If the bootloader enabled DMA we need to fully reset the DMA controller otherwise we might have some stale data in it that provokes weird behavior. MFC after: 1 week	2019-10-09 16:57:14 +00:00
hselasky	166e590d9c	Fix locking order reversal in the TCP ratelimit code by moving destructors outside the rsmtx mutex. Witness message: lock order reversal: (sleepable after non-sleepable) 1st tcp_rs_mtx (rsmtx) @ sys/netinet/tcp_ratelimit.c:242 2nd sysctl lock (sysctl lock) @ sys/kern/kern_sysctl.c:607 Backtrace: witness_debugger witness_checkorder _rm_wlock_debug sysctl_ctx_free rs_destroy epoch_call_task gtaskqueue_run_locked gtaskqueue_thread_loop Discussed with: rrs@ Sponsored by: Mellanox Technologies	2019-10-09 16:48:48 +00:00
dim	beb54142f7	Merge ^/head r353316 through r353350.	2019-10-09 16:40:22 +00:00
glebius	1408ed7d50	ifnet_byindex_ref() requires network epoch.	2019-10-09 16:21:50 +00:00
glebius	7299f8c33d	Enter network epoch in domain callouts.	2019-10-09 16:21:05 +00:00
vangyzen	bacb9e94fe	Add CTLFLAG_STATS to the dev.ioat.N.stats sysctl OIDs Refer to r353111. MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2019-10-09 12:14:10 +00:00
avg	e9642c209b	cleanup of illumos compatibility atomics atomic_cas_32 is implemented using atomic_fcmpset_32 on all platforms. Ditto for atomic_cas_64 and atomic_fcmpset_64 on platforms that have it. The only exception is sparc64 that provides MD atomic_cas_32 and atomic_cas_64. This is slightly inefficient as fcmpset reports whether the operation updated the target and that information is not needed for cas. Nevertheless, there is less code to maintain and to add for new platforms. Also, the operations are done inline now as opposed to function calls before. atomic_add_64_nv is implemented using atomic_fetchadd_64 on platforms that provide it. casptr, cas32, atomic_or_8, atomic_or_8_nv are completely removed as they have no users. atomic_mtx that is used to emulate 64-bit atomics on platforms that lack them is defined only on those platforms. As a result, platform specific opensolaris_atomic.S files have lost most of their code. The only exception is i386 where the compat+contrib code provides 64-bit atomics for userland use. That code assumes availability of cmpxchg8b instruction. FreeBSD does not have that assumption for i386 userland and does not provide 64-bit atomics. Hopefully, this can and will be fixed. MFC after: 3 weeks	2019-10-09 11:26:36 +00:00
glebius	6bf933c434	Revert changes to rip6_bind() from r353292. This function is always called in syscall context, so it must enter epoch itself. This changeset originates from early version of the patch, and somehow slipped to the final version. Reported by: pho	2019-10-09 05:52:07 +00:00
markj	bb579d181e	Fix handling of empty SCM_RIGHTS messages. As unp_internalize() processes the input control messages, it builds an output mbuf chain containing the internalized representations of those messages. In one special case, that of an empty SCM_RIGHTS message, the message is simply discarded. However, the loop which appends mbufs to the output chain assumed that each iteration would produce an mbuf, resulting in a null pointer dereference if an empty SCM_RIGHTS message was followed by a non-empty message. Fix this by advancing the output mbuf chain tail pointer only if an internalized control message was produced. Reported by: syzbot+1b5cced0f7fad26ae382@syzkaller.appspotmail.com MFC after: 1 week Sponsored by: The FreeBSD Foundation	2019-10-08 23:34:48 +00:00
jhb	0e8a8e5def	Add support for KTLS in the Chelsio TOE module. This adds a TOE hook to allocate a KTLS session. It also recognizes TLS mbufs in the socket buffer and sends those to the NIC using a TLS work request to encrypt the record before segmenting it. TOE TLS support must be enabled via the dev.t6nex.<N>.tls sysctl in addition to enabling KTLS. Reviewed by: np, gallatin Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D21891	2019-10-08 21:40:42 +00:00
jhb	02e5a4c53c	Add a TOE KTLS mode and a TOE hook for allocating TLS sessions. This adds the glue to allocate TLS sessions and invokes it from the TLS enable socket option handler. This also adds some counters for active TOE sessions. The TOE KTLS mode is returned by getsockopt(TLSTX_TLS_MODE) when TOE KTLS is in use on a socket, but cannot be set via setsockopt(). To simplify various checks, a TLS session now includes an explicit 'mode' member set to the value returned by TLSTX_TLS_MODE. Various places that used to check 'sw_encrypt' against NULL to determine software vs ifnet (NIC) TLS now check 'mode' instead. Reviewed by: np, gallatin Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D21891	2019-10-08 21:34:06 +00:00
mjg	b68ee60bab	amd64: plug spurious cld instructions ABI already guarantees the direction is forward. Note this does not take care of i386-specific cld's. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21906	2019-10-08 21:14:11 +00:00
jhb	79f37acc1f	Set the FID field in lookaside crypto requests to the rx queue ID. The PCI block in the adapter requires this field to be set to a valid queue ID. It is not clear why it did not fail on all machines, but the effect was that crypto operations reading input data via DMA failed with an internal PCI read error on machines with 128G or more of RAM. Reported by: gallatin Reviewed by: np MFC after: 3 days Sponsored by: Chelsio Communications	2019-10-08 20:22:05 +00:00
hselasky	a9b059643b	Fix regression issue after r352989: As noted by the commit message, callouts are now persistent and should not be in the auto-zero section of the RQ's and SQ's. This fixes an assert when using the TX completion event factor feature with mlx5en(4). Found by: gallatin@ MFC after: 3 days Sponsored by: Mellanox Technologies	2019-10-08 19:49:25 +00:00
dim	53d410d088	Prepare for merging back to head: * Set tentative merge date * Add UPDATING entry * Bump __FreeBSD_version * Bump FREEBSD_CC_VERSION * Bump LLD_REVISION	2019-10-08 18:21:33 +00:00
dim	1da7bc25ba	Merge ^/head r352764 through r353315.	2019-10-08 18:17:02 +00:00
glebius	be8284f983	Remove epoch assertion from if_setlladdr(). Originally this function was protected by IF_ADDR_LOCK(), which was a mutex, so that two simultaneous if_setlladdr() can't execute. Later it was switched to IF_ADDR_RLOCK(), likely by mistake. Later it was switched to NET_EPOCH_ENTER(). Then I incorrectly added NET_EPOCH_ASSERT() here. In reality ifp->if_addr never goes away and never changes its length. So, doing bcopy() in it is always "safe", meaning it won't dereference a wrong pointer or write into someone else's memory. Of course doing two bcopy() in parallel would result in a mess of two addresses, but net epoch doesn't protect against that, nor did IF_ADDR_RLOCK(). So for now, just remove the assertion and leave for later a proper fix. Reported by: markj	2019-10-08 17:55:45 +00:00
glebius	1c64917ffa	Quickly plug another regression from r353292. Again, multicast locking needs lots of work... Reported by: pho	2019-10-08 16:59:17 +00:00
glebius	32a0d5bbd1	In DIAGNOSTIC block of if_delmulti_ifma_flags() enter the network epoch. This quickly plugs the regression from r353292. The locking of multicast definitely needs a broader review today... Reported by: pho, dhw	2019-10-08 16:45:56 +00:00
markj	8a79c7414f	Simplify pmap_page_array_startup() a bit. No functional change intended. Sponsored by: The FreeBSD Foundation	2019-10-08 16:42:50 +00:00
markj	1e2840f2f0	Avoid erroneously clearing PGA_WRITEABLE in riscv's pmap_enter(). During a CoW fault, we must check for both 4KB and 2MB mappings before clearing PGA_WRITEABLE on the old mapping's page. Previously we were only checking for 4KB mappings. This was missed in r344106. MFC after: 3 days Sponsored by: The FreeBSD Foundation	2019-10-08 15:03:48 +00:00
mjg	236e505a95	amd64 pmap: allocate pv table entries for gaps in PA This matches the state prior to r353149 and fixes crashes with DRM modules. Reported and tested by: cy, garga, Krasznai Andras Fixes: r353149 ("amd64 pmap: implement per-superpage locks") Sponsored by: The FreeBSD Foundation	2019-10-08 14:59:50 +00:00
markj	d203e7ff7d	Clear PGA_WRITEABLE in riscv's pmap_remove_l3(). pmap_remove_l3() may remove the last mapping of a page, in which case it must clear PGA_WRITEABLE. Reported by: Jenkins, via lwhsu MFC after: 1 week Sponsored by: The FreeBSD Foundation	2019-10-08 14:54:35 +00:00
avg	2fa7e77a36	zfs: use atomic_load_64 to read atomic variable in dmu_object_alloc_impl As long as we support ZFS on 32-bit platforms we should do this for all 64-bit variables that are modified in a lockless fashion using atomic operations. Otherwise, there is a risk of reading a torn value. Here is a rationale for why I am doing this in dmu_object_alloc_impl: - it's very recent code - the code deals with object IDs and a number of objects in a file system can overflow 32 bits - incorrect allocation of an object ID may result in hard to debug problems - fixing all plain reads of 64-bit atomic variables is not a trivial undertaking to do in one shot, so I chose to do it incrementally MFC after: 3 weeks X-MFC after: r353301, r353176	2019-10-08 11:27:48 +00:00
tuexen	e8c2889eec	Validate length before using it, not vice versa. r353060 should have contained this... This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18070 MFC after: 3 days	2019-10-08 11:07:16 +00:00
hselasky	131783f8e3	Fix regression issue after r353274: Make sure the vnet_shutdown field is not set until after all VNET_SYSUNINIT()'s in the SI_SUB_VNET_DONE subsystem have been executed. Especially the vnet_if_return() function requires that if_move() is still operational. Reported by: lwhsu@ MFC after: 1 week Sponsored by: Mellanox Technologies	2019-10-08 11:06:24 +00:00
avg	fb316b3e7b	i386: hide more of atomic 64-bit definitions under _KERNEL At the moment i386 does not provide 64-bit atomic operations in userland. Exposing some atomic_*_64 defines can cause unnecessary confusion. Discussed with: kib MFC after: 2 weeks	2019-10-08 10:50:16 +00:00
dougm	918670a5ed	Define macro VM_MAP_ENTRY_FOREACH for enumerating the entries in a vm_map. In case the implementation ever changes from using a chain of next pointers, then changing the macro definition will be necessary, but changing all the files that iterate over vm_map entries will not. Drop a counter in vm_object.c that would have an effect only if the vm_map entry count was wrong. Discussed with: alc Reviewed by: markj Tested by: pho (earlier version) Differential Revision: https://reviews.freebsd.org/D21882	2019-10-08 07:14:21 +00:00
jhibbits	a6677a026c	powerpc: Implement atomic_(f)cmpset_ for short and char This adds two implementations for each atomic_fcmpset_ and atomic_cmpset_ short and char functions, selectable at compile time for the target architecture. By default, it uses a generic shift-and-mask to perform atomic updates to sub-components of 32-bit words from <sys/_atomic_subword.h>. However, if ISA_206_ATOMICS is defined it uses the ll/sc instructions for halfword and bytes, introduced in PowerISA 2.06. These instructions are supported by all IBM processors from POWER7 on, as well as the Freescale/NXP e6500 core. Although the e5500 and e500mc both implement PowerISA 2.06 they do not implement these instructions. As part of this, clean up the atomic_(f)cmpset_acq and _rel wrappers, by using macros to reduce code duplication. ISA_206_ATOMICS requires clang or newer binutils (2.20 or later). Differential Revision: https://reviews.freebsd.org/D21682	2019-10-08 01:36:34 +00:00
markj	eae373986b	Improve locking in the IPV6_V6ONLY socket option handler. Acquire the inp lock before checking whether the socket is already bound, and around updates to the inp_vflag field. MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21867	2019-10-07 23:35:23 +00:00
markj	c0e5f81551	Assert that the PGA_{WRITEABLE,EXECUTABLE} flags do not leak. Reviewed by: alc, kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D21783	2019-10-07 23:31:17 +00:00
mjg	d8e13e75c0	vm: stop trylocking page queues in vm_page_pqbatch_submit About 11 minutes of poudriere -s -j 104 and probing on return value of trylocks reveals that over 10% of attempts fail, which in turn means there are more atomics performed than necessary. Trylocking was there to try preventing migration, but it's not very likely to happen if the lock is uncontested. Reviewed by: markj Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21925	2019-10-07 23:19:09 +00:00
glebius	337378e04f	Widen NET_EPOCH coverage. When epoch(9) was introduced to network stack, it was basically dropped in place of existing locking, which was mutexes and rwlocks. For the sake of performance mutex covered areas were as small as possible, so became epoch covered areas. However, epoch doesn't introduce any contention, it just delays memory reclaim. So, there is no point to minimise epoch covered areas in sense of performance. Meanwhile entering/exiting epoch also has non-zero CPU usage, so doing this less often is a win. Not the least is also code maintainability. In the new paradigm we can assume that at any stage of processing a packet, we are inside network epoch. This makes coding both input and output path way easier. On output path we already enter epoch quite early - in the ip_output(), in the ip6_output(). This patch does the same for the input path. All ISR processing, network related callouts, other ways of packet injection to the network stack shall be performed in net_epoch. Any leaf function that walks network configuration now asserts epoch. Tricky part is configuration code paths - ioctls, sysctls. They also call into leaf functions, so some need to be changed. This patch would introduce more epoch recursions (see EPOCH_TRACE) than we had before. They will be cleaned up separately, as several of them aren't trivial. Note, that unlike a lock recursion the epoch recursion is safe and just wastes a bit of resources. Reviewed by: gallatin, hselasky, cy, adrian, kristof Differential Revision: https://reviews.freebsd.org/D19111	2019-10-07 22:40:05 +00:00
tuexen	03060312fd	In r343587 a simple port filter as sysctl tunable was added to siftr. The new sysctl was not added to the siftr.4 man page at the time. This updates the man page, and removes one left over trailing whitespace. Submitted by: Richard Scheffenegger Reviewed by: bcr@ MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D21619	2019-10-07 20:35:04 +00:00
trasz	008d4a5775	Introduce stats(3), a flexible statistics gathering API. This provides a framework to define a template describing a set of "variables of interest" and the intended way for the framework to maintain them (for example the maximum, sum, t-digest, or a combination thereof). Afterwards the user code feeds in the raw data, and the framework maintains these variables inside user-provided, opaque stats blobs. The framework also provides a way to selectively extract the stats from the blobs. The stats(3) framework can be used in both userspace and the kernel. See the stats(3) manual page for details. This will be used by the upcoming TCP statistics gathering code, https://reviews.freebsd.org/D20655. The stats(3) framework is disabled by default for now, except in the NOTES kernel (for QA); it is expected to be enabled in amd64 GENERIC after a cool down period. Reviewed by: sef (earlier version) Obtained from: Netflix Relnotes: yes Sponsored by: Klara Inc, Netflix Differential Revision: https://reviews.freebsd.org/D20477	2019-10-07 19:05:05 +00:00
hselasky	cd0a9b48d4	Compile time assert a valid subsystem for all VNET init and uninit functions. Using VNET init and uninit functions outside the given range has undefined behaviour. MFC after: 1 week Sponsored by: Mellanox Technologies	2019-10-07 14:24:59 +00:00
hselasky	bd1a11f46e	Factor out VNET shutdown check into an own vnet structure field. Remove the now obsolete vnet_state field. This greatly simplifies the detection of VNET shutdown and avoids code duplication. Discussed with: bz@ MFC after: 1 week Sponsored by: Mellanox Technologies	2019-10-07 14:15:41 +00:00
hselasky	3e8c05776d	Make control endpoint quirk for xhci(4) configurable. MFC after: 1 week Sponsored by: Mellanox Technologies	2019-10-07 13:40:29 +00:00
avg	f562cf16b6	fix up r353168, add atomic_swap_64 to i386 version of opensolaris_atomic.S The compatibility code for the atomic operations in ZFS code is a bit messy. In some cases the native definitions are directly made available, in some cases there are emulated operations in opensolaris_atomic.c and in yet other cases there are atomic operations implemented in assembly that were obtained from OpenSolaris / illumos. This commit adds atomic_swap_64 for use with i386 userland. The code is copied from illumos. I am not sure why FreeBSD does not provide that operation natively. Maybe because we try (or pretend) to support processors that did not have the necessary instructions. While here I also added atomic_load_64 for the same reasons. This is original code based on illumos atomic_swap_64 and FreeBSD atomic_load_acq_64_i586. Pointyhat to: avg MFC after: 1 week	2019-10-07 12:53:27 +00:00
avg	bd8e9d12b1	MFV r350898, r351075: 8423 8199 7432 Implement large_dnode pool feature 8423 8199 7432 Implement large_dnode pool feature 7432 Large dnode pool feature 8199 multi-threaded dmu_object_alloc() 8423 Implement large_dnode pool feature 10406 large_dnode changes broke zfs recv of legacy stream illumos/illumos-gate@54811da5ac `54811da5ac` https://www.illumos.org/issues/8423 https://www.illumos.org/issues/8199 https://www.illumos.org/issues/7432 illumos/illumos-gate@811964cd9f `811964cd9f` https://www.illumos.org/issues/10406 ZoL issues: Improved dnode allocation #6564 Clean up large dnode code #6262 Fix dnode_hold() freeing dnode behavior #8172 Fix dnode allocation race #6414, #6439 Partial: Raw sends must be able to decrease nlevels #6821, #6864 Remove unnecessary txg syncs from receive_object() Closes #7197 This updates FreeBSD large_dnode code (that was imported from ZoL) to a version that was committed to illumos. It has some cleanups, improvements and fixes comparing to what we have in FreeBSD now. I think that the most significant update is 8199 multi-threaded dmu_object_alloc(). This commit reverts r351077 that was a revert of r351074 and r351076 and restores those changes. Required atomic operations should be available now on all platforms where we build ZFS. Obtained from: illumos MFC after: 3 weeks	2019-10-07 08:14:45 +00:00
manu	6fea2fc980	arm: dts: ti: Fix mmc3 instance by setting it to disabled DTS Import of Linux 5.3 added a patch that reworks the L3 mmc instance in the AM335x SoC but removed the status = 'disabled' on the node. This causes the kernel to probe the device even if the board doesn't have this mmc used and since we don't correctly activate the clock for this module we panic with an external data abort. Beaglebone(s) don't have this device anyway so simply disabling it. Patch for the DTS was sent upstream. https://patchwork.kernel.org/patch/11176921/ PR: 241089 Reported by: phk	2019-10-07 08:11:49 +00:00
avg	a7e994092f	ZFS: unconditionally use atomic_swap_64 Previously, the code used a plain store on platforms that lacked atomic_swap_64 and possibly some other platforms as the condition worked only if atomic_swap_64 was a macro. MFC after: 1 week X-MFC after: r353166, r353167	2019-10-07 08:00:54 +00:00
avg	3458e5d1e6	ZFS: add emulation of atomic_swap_64 and atomic_load_64 Some 32-bit platforms do not provide 64-bit atomic operations that ZFS requires, either in userland or at all. We emulate those operations for those platforms using a mutex. That is not entirely correct and it's not very efficient. Besides, the loads are plain loads, so torn values are possible. Nevertheless, the emulation seems to work for some definition of work. This change adds atomic_swap_64, which is already used in ZFS code, and atomic_load_64 that can be used to prevent torn reads. MFC after: 1 week	2019-10-07 07:54:34 +00:00
avg	87b704d0e3	add atomic_load_64 for mipsn32 It's just an alias for atomic_load_acq_64 (same as on i386). MFC after: 1 week	2019-10-07 07:42:26 +00:00
avg	9ee01c07d8	align use of cp15_pmccntr_get with its availability According to ian, the only armv6 cpu we support is the 1176, so this change is effectively a no-op. The change is just to make the code more self-consistent. The issue was noticed by a standalone module build for armv6. Reviewed by: ian MFC after: 3 weeks	2019-10-07 07:37:42 +00:00
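
The fix described in bb579d181e (empty SCM_RIGHTS messages) comes down to advancing a chain's tail pointer only when an iteration actually produced an element. The sketch below illustrates that pattern in plain userland C. It is not the kernel code: `struct node`, `internalize()`, `build_chain()` and `chain_length()` are hypothetical names standing in for mbufs and unp_internalize(), and an "empty message" is modelled simply as a length of zero.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf; not the kernel's struct mbuf. */
struct node {
	struct node *next;
	int payload;
};

/*
 * Hypothetical "internalize" step: returns NULL for an empty message
 * (len == 0) or on allocation failure, otherwise a new node.
 */
static struct node *
internalize(int len)
{
	struct node *n;

	if (len == 0)
		return (NULL);
	n = calloc(1, sizeof(*n));
	if (n != NULL)
		n->payload = len;
	return (n);
}

/*
 * Build an output chain from an array of message lengths.  The tail
 * pointer advances only when an iteration actually produced a node,
 * so a skipped (empty) message never leaves it pointing through NULL.
 */
static struct node *
build_chain(const int *lens, size_t count)
{
	struct node *head = NULL, **tailp = &head, *n;
	size_t i;

	for (i = 0; i < count; i++) {
		n = internalize(lens[i]);
		if (n == NULL)
			continue;	/* nothing produced: leave tail alone */
		*tailp = n;
		tailp = &n->next;
	}
	return (head);
}

/* Count the nodes in a chain. */
static size_t
chain_length(const struct node *n)
{
	size_t len = 0;

	for (; n != NULL; n = n->next)
		len++;
	return (len);
}
```

With an input such as `{ 0, 3, 0, 5 }`, the empty entries are skipped and the two non-empty ones end up linked in order. A loop that unconditionally executed `tailp = &n->next` would dereference NULL on the first empty entry, which is the class of bug the commit message describes.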

138664 Commits