freebsd-nq

Author	SHA1	Message	Date
Bryan Venteicher	fbe0c4f4c7	virtio: Add modern (v1) virtqueue support This only supports the legacy virtqueue format that is now called "Split Virtqueues". Support for the new "Packed Virtqueues" described in v1.1 is left for a later date. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27857	2021-01-19 04:55:23 +00:00
Bryan Venteicher	9da9560c4d	virtio: Add VirtIO PCI modern (V1) support Use the existing legacy PCI driver as the basis for shared code between the legacy and modern PCI drivers. The existing virtio_pci kernel module will contain both the legacy and modern drivers. Changes to the virtqueue and each device driver (network, block, etc) for V1 support come in later commits. Update the MMIO driver to reflect the VirtIO bus method changes, but the modern compliance can be improved on later. Note that the modern PCI driver requires bus_map_resource() to be implemented, which is not the case on all archs. The hw.virtio.pci.transitional tunable default value is zero so transitional devices will continue to be driven via the legacy driver. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27856	2021-01-19 04:55:23 +00:00
Bryan Venteicher	1cd1ed3f5d	Revert: virtio: Support non-legacy network device and queue And subsequent fix `576b099a`. By adding the mergable header to the vtnet_rx_header structure, the size was increased by 2 bytes, breaking the alignment of this structure as described in the preceding comments. Furthermore, the mergable header does not belong in the structure. With the mergable feature, the header is placed in line with the data, so there is no need for a separate segment, and it is misleading to follow the mergable header with any padding. The V1 header is effectively identical to the mergable header, and the driver has long supported the mergable feature. Revert this so the later changes that add V1 support can show how V1 is derived from the existing mergable buffers support, and to facilitate a later MFC. Reviewed by: grehan (mentor) Differential Revision: https://reviews.freebsd.org/D27855	2021-01-19 04:55:23 +00:00
Jamie Gritton	effad35ed1	jail: Clean up some function placement and improve comments. Move prison_hold, prison_hold_locked, prison_proc_hold, and prison_proc_free to a more intuitive part of the file (together with prison_free and prison_free_locked), and add or improve comments to these and others, to better describe what's going on in the prison reference cycle. No functional changes.	2021-01-18 17:23:51 -08:00
Mark Johnston	a45d905616	ppbus: Fix the direction of the PPISEPPA ioctl PR: 252711 Submitted by: Eugene <merfi@nearly.ru>	2021-01-18 19:44:42 -05:00
Oleksandr Tymoshenko	248f0cabca	make maximum interrupt number tunable on ARM, ARM64, MIPS, and RISC-V Use a machdep.nirq tunable instead of the compile-time constant NIRQ as the value for the maximum number of interrupts. This keeps the system footprint small by default, with an option to increase the limit for large systems like server-grade ARM64. Reviewed by: mhorne Differential Revision: https://reviews.freebsd.org/D27844 Submitted by: Klara, Inc. Sponsored by: Ampere Computing	2021-01-18 16:36:39 -08:00
Jamie Gritton	83bc72a04e	jail: Fix a stray mutex from `76ad42abf9`.	2021-01-18 15:47:09 -08:00
Mark Johnston	098c902b52	aesni: Ensure that key schedules are aligned Rather than depending on malloc() returning 16-byte aligned chunks, allocate some extra pad bytes and ensure that key schedules are appropriately aligned. Reviewed by: kib MFC after: 2 weeks Sponsored by: Rubicon Communications, LLC (Netgate) Differential Revision: https://reviews.freebsd.org/D28157	2021-01-18 17:07:56 -05:00
Mark Johnston	5bdb8b273a	safexcel: Maintain per-session context records The context record contains key material precomputed by the driver at session creation time. Rather than storing various components of the context record in each session, go a bit further and store the full context record image so that safexcel_process() can simply copy the image into each request submitted to the hardware. This simplifies the data path and eliminates a bunch of unnecessary conditional logic that was getting executed for each request. MFC after: 1 week Sponsored by: Rubicon Communications, LLC (Netgate)	2021-01-18 17:07:56 -05:00
Mark Johnston	1a6ffed5d7	safexcel: Simplify request allocation Rather than preallocating a set of requests and moving them between queues during state transitions, maintain a shadow of the command descriptor ring to track the driver context of each request. This is simpler and requires less synchronization between safexcel_process() and the ring interrupt handler. MFC after: 1 week Sponsored by: Rubicon Communications, LLC (Netgate)	2021-01-18 17:07:56 -05:00
Mark Johnston	b7e27af36b	safexcel: Handle command/result descriptor exhaustion gracefully Rather than returning a hard error in this case, return ERESTART so that upper layers get a chance to retry the request (or drop it, depending on the desired policy). This case is hard to hit due to the somewhat low bound on queued requests, but that will no longer be true after an upcoming change. MFC after: 1 week Sponsored by: Rubicon Communications, LLC (Netgate)	2021-01-18 17:07:56 -05:00
Mark Johnston	0371c3faaa	safexcel: Add counters for some resource exhaustion conditions This is useful when analyzing performance problems. MFC after: 1 week Sponsored by: Rubicon Communications, LLC (Netgate)	2021-01-18 17:07:55 -05:00
Mark Johnston	e934d455ba	safexcel: Dispatch requests to the current CPU's ring This gives better performance in some tests than the previous policy of statically binding each session to a ring. MFC after: 1 week Sponsored by: Rubicon Communications, LLC (Netgate)	2021-01-18 17:07:55 -05:00
Mark Johnston	4af9323542	linuxkpi: Fix the shrinker scan target Use the number of items scanned to control the duration of the shrink loop. Otherwise, if a consumer like TTM is not able to free the number of items requested for some reason, the shrinker keeps looping forever. Reviewed by: manu Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D28224	2021-01-18 17:07:55 -05:00
Mitchell Horne	a520f5ca58	armv8crypto: print a message on probe failure Similar to the message printed by aesni(4), let the user know if the driver is unsupported by their CPU. PR: 252543 Reported by: gbe MFC after: 3 days Sponsored by: The FreeBSD Foundation	2021-01-18 16:59:21 -04:00
Jamie Gritton	76ad42abf9	jail: Add prison_isvalid() and prison_isalive() prison_isvalid() checks if a prison record can be used at all, i.e. pr_ref > 0. This filters out prisons that aren't fully created, and those that are either in the process of being dismantled, or will be at the next opportunity. While the check for pr_ref > 0 is simple enough to make without a convenience function, this prepares the way for other measures of prison validity. prison_isalive() checks not only validity as far as the usability of the prison structure, but also whether the prison is visible to user space. It replaces a test for pr_uref > 0, which is currently only used within kern_jail.c, and not often there. Both of these functions also assert that either the prison mutex or allprison_lock is held, since it's generally the case that unlocked prisons aren't guaranteed to remain useable for any length of time. This isn't entirely true, for example a thread can assume its own prison is good, but most exceptions will exist inside of kern_jail.c.	2021-01-18 10:56:20 -08:00
Andrew Gallatin	efa9c21bca	KTLS: Enable KERN_TLS in GENERIC on amd64 Based on discussions on freebsd-arch@, enable KERN_TLS in GENERIC on amd64, but leave it disabled via the sysctl kern.ipc.tls.enable. Users wishing to enable ktls must set kern.ipc.tls.enable=1 While here, fix wording in NOTES to mention that KERN_TLS also does receive now. Sponsored by: Netflix Reviewed by: allanjude Differential Revision: https://reviews.freebsd.org/D28163	2021-01-18 13:29:10 -05:00
Lutz Donnerhacke	c3e75b6c1a	netgraph/ng_one2many: Clarification in comments about copy mode The original comment suggests an optimization, which was proven wrong. Reported by: nc Reviewed by: kp, nc Approved by: kp (mentor) Differential Revision: https://reviews.freebsd.org/D23727	2021-01-18 14:10:34 +01:00
Lutz Donnerhacke	7c7c231c14	netgraph/ng_tag: permit variable length data ng_tag(4) operates on arbitrary data of mbuf_tags(9). Those structures are padded to the next multiple of the alignment by the compiler. Hence a valid argument has to be at most as long as the data received. PR: 241462 Reviewed by: kp Approved by: kp (mentor) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D22140	2021-01-18 13:23:22 +01:00
Konstantin Belousov	36bcc44e2c	Add ddb 'show timecounter' command. MFC after: 1 week Sponsored by: The FreeBSD Foundation	2021-01-18 09:51:48 +02:00
Jamie Gritton	25c2c952e3	jail: Add proper prison locking in mqfs_prison_remove.	2021-01-17 17:41:09 -08:00
Lutz Donnerhacke	75e7ef74df	netgraph/ng_source: Allow ng_source to inject into any netgraph network PR: 240530 Reviewed by: kp Approved by: kp (mentor) MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D21968	2021-01-17 22:17:01 +01:00
Alexander V. Chernikov	74935ce881	Enable running fib tests inside vnet jail.	2021-01-17 20:32:26 +00:00
Alexander V. Chernikov	f879876721	Fix IPv4 fib bsearch4() lookup array construction. Current code didn't properly handle the case with nested prefixes like 10.0.0.0/24 && 10.0.0.0/25.	2021-01-17 20:32:26 +00:00
Alexander V. Chernikov	9d6567bc30	Fix panic on vnet creation if fib algo has been set to fixed value. Make fixed algo property per-VNET instead of global.	2021-01-17 20:32:25 +00:00
Alexander V. Chernikov	f9e0752e35	Create new in6_purgeifaddr() which purges bound ifa prefix if it gets unused. Currently if_purgeifaddrs() uses in6_purgeaddr() to remove IPv6 ifaddrs. in6_purgeaddr() does not trigger prefix removal if the number of linked ifas goes to 0, as this is a low-level function. As a result, if_purgeifaddrs() purges all IPv4/IPv6 addresses but keeps the corresponding IPv6 prefixes. Fix this by creating a higher-level wrapper that handles the unused-prefix use case and using it in if_purgeifaddrs(). Differential revision: https://reviews.freebsd.org/D28128	2021-01-17 20:32:25 +00:00
Konstantin Belousov	f3ea417f96	x86 busdma_bounce: use malloc_domainset_aligned(9). This stops busdma bounce making assumptions about alignment of malloc(9) results, which are no longer true. Also add assert that the result of malloc_aligned() fits into single page, which is the assumption of the code. Reported by: dim Reviewed by: andrew, jah, markj Tested by: pho MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D28147	2021-01-17 19:29:05 +02:00
Konstantin Belousov	3b15beb30b	Implement malloc_domainset_aligned(9). Change the power-of-two malloc zones to require alignment equal to the size []. Current uma allocator already provides such alignment, so in fact this change does not change anything except providing future-proof setup. Suggested by: markj [] Reviewed by: andrew, jah, markj Tested by: pho MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D28147	2021-01-17 19:29:05 +02:00
Emmanuel Vadot	72c551930b	Bump __FreeBSD_version after linuxkpi changes	2021-01-17 12:47:28 +01:00
Vladimir Kondratyev	ec25b6fa5f	LinuxKPI: Reimplement irq_work queue on top of fast taskqueue Summary: Linux's irq_work queue was created for asynchronous execution of code from contexts where spin_lock's are not available like "hardware interrupt context". FreeBSD's fast taskqueues was created for the same purposes. Drm-kmod 5.4 uses irq_work_queue() at least in one place to schedule execution of task/work from the critical section that triggers following INVARIANTS-induced panic: ``` panic: acquiring blockable sleep lock with spinlock or critical section held (sleep mutex) linuxkpi_short_wq @ /usr/src/sys/kern/subr_taskqueue.c:281 cpuid = 6 time = 1605048416 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe006b538c90 vpanic() at vpanic+0x182/frame 0xfffffe006b538ce0 panic() at panic+0x43/frame 0xfffffe006b538d40 witness_checkorder() at witness_checkorder+0xf3e/frame 0xfffffe006b538f00 __mtx_lock_flags() at __mtx_lock_flags+0x94/frame 0xfffffe006b538f50 taskqueue_enqueue() at taskqueue_enqueue+0x42/frame 0xfffffe006b538f70 linux_queue_work_on() at linux_queue_work_on+0xe9/frame 0xfffffe006b538fb0 irq_work_queue() at irq_work_queue+0x21/frame 0xfffffe006b538fd0 semaphore_notify() at semaphore_notify+0xb2/frame 0xfffffe006b539020 __i915_sw_fence_notify() at __i915_sw_fence_notify+0x2e/frame 0xfffffe006b539050 __i915_sw_fence_complete() at __i915_sw_fence_complete+0x63/frame 0xfffffe006b539080 i915_sw_fence_complete() at i915_sw_fence_complete+0x8e/frame 0xfffffe006b5390c0 dma_i915_sw_fence_wake() at dma_i915_sw_fence_wake+0x4f/frame 0xfffffe006b539100 dma_fence_signal_locked() at dma_fence_signal_locked+0x105/frame 0xfffffe006b539180 dma_fence_signal() at dma_fence_signal+0x72/frame 0xfffffe006b5391c0 dma_fence_is_signaled() at dma_fence_is_signaled+0x80/frame 0xfffffe006b539200 dma_resv_add_shared_fence() at dma_resv_add_shared_fence+0xb3/frame 0xfffffe006b539270 i915_vma_move_to_active() at 
i915_vma_move_to_active+0x18a/frame 0xfffffe006b5392b0 eb_move_to_gpu() at eb_move_to_gpu+0x3ad/frame 0xfffffe006b539320 eb_submit() at eb_submit+0x15/frame 0xfffffe006b539350 i915_gem_do_execbuffer() at i915_gem_do_execbuffer+0x7d4/frame 0xfffffe006b539570 i915_gem_execbuffer2_ioctl() at i915_gem_execbuffer2_ioctl+0x1c1/frame 0xfffffe006b539600 drm_ioctl_kernel() at drm_ioctl_kernel+0xd9/frame 0xfffffe006b539670 drm_ioctl() at drm_ioctl+0x5cd/frame 0xfffffe006b539820 linux_file_ioctl() at linux_file_ioctl+0x323/frame 0xfffffe006b539880 kern_ioctl() at kern_ioctl+0x1f4/frame 0xfffffe006b5398f0 sys_ioctl() at sys_ioctl+0x12a/frame 0xfffffe006b5399c0 amd64_syscall() at amd64_syscall+0x121/frame 0xfffffe006b539af0 fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe006b539af0 --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800a6f09a, rsp = 0x7fffffffe588, rbp = 0x7fffffffe640 --- KDB: enter: panic ``` Here, the dma_resv_add_shared_fence() performs a critical_enter() and following call of schedule_work() from semaphore_notify() triggers 'acquiring blockable sleep lock with spinlock or critical section held' panic. Switching irq_work implementation to fast taskqueue fixes the panic for me. Other report with the similar bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=247166 Reviewed By: hselasky Differential Revision: https://reviews.freebsd.org/D27171	2021-01-17 12:47:28 +01:00
Marius Strobl	944041f936	wl(4): remove obsolete header It's unused since `09b9789b28` and r304506 respectively and should have gone along with these.	2021-01-17 00:03:17 +01:00
Marius Strobl	daad26e5fc	openpromio(4): remove obsolete pseudo device driver It's unused since `58aa35d429` and r357455 respectively and should have gone along with these.	2021-01-16 23:53:13 +01:00
Marius Strobl	d65427ad58	sym(4): Remove remainder of SYM_SETUP_LP_PROBE_MAP support Missed in `221ac8f4cd` and r339575 respectively.	2021-01-16 23:53:12 +01:00
Alexander V. Chernikov	81728a538d	Split rtinit() into multiple functions. rtinit[1]() is a function used to add or remove interface address prefix routes, similar to ifa_maintain_loopback_route(). It was intended to be family-agnostic. There is a problem with this approach in reality. 1) IPv6 code does not use it for the ifa routes. There is a separate layer, nd6_prelist_(), providing interface for maintaining interface routes. Its part, responsible for the actual route table interaction, mimics rtenty() code. 2) rtinit tries to combine multiple actions in the same function: constructing proper route attributes and handling iterations over multiple fibs, for the non-zero net.add_addr_allfibs use case. It notably increases the code complexity. 3) dstaddr handling. flags parameter re-uses RTF_ flags. As there is no special flag for p2p connections, host routes and p2p routes are handled in the same way. Additionally, mapping IFA flags to RTF flags makes the interface pretty messy. It makes rtinit() clash with ifa_maintain_loopback_route() for IPv4 interface aliases. 4) rtinit() is the last customer passing non-masked prefixes to rib_action(), complicating rib_action() implementation. 5) rtinit() coupled ifa announce/withdrawal notifications, producing "false positive" ifa messages in certain corner cases. To address all these points, the following has been done: * rtinit() has been split into multiple functions: - Route attribute construction was moved to the per-address-family functions, dealing with (2), (3) and (4). - The function providing net.add_addr_allfibs handling and route rtsock notifications is the new routing table interface. - The rtsock ifa notification has been moved out as well. The resulting set of functions is only responsible for the actual route notifications. 
Side effects: * /32 alias does not result in interface routes (/32 route and "host" route) * RTF_PINNED is now set for IPv6 prefixes corresponding to the interface addresses Differential revision: https://reviews.freebsd.org/D28186	2021-01-16 22:42:41 +00:00
Emmanuel Vadot	8ca9ff4f28	mips: Fix build by using the correct device-tree include path	2021-01-16 11:34:10 +01:00
Emmanuel Vadot	a0ee5920be	mips: Add the device-tree path to the include paths	2021-01-16 11:33:37 +01:00
Emmanuel Vadot	fa67846c6f	riscv: Fix build by using the correct device-tree include path	2021-01-16 11:31:39 +01:00
Emmanuel Vadot	384bd0b5b0	riscv: Add the device-tree path to the include path	2021-01-16 11:31:17 +01:00
Mateusz Guzik	fe258f23ef	Save on getpid in setproctitle by supporting -1 as curproc.	2021-01-16 09:36:54 +01:00
Vincenzo Maffione	2968dde3de	axgbe: driver changes for netmap support AMD 10GbE hardware is designed to have two buffers per receive descriptor to support the split header feature. For this purpose, the driver was designed to use 2 iflib freelists per receive queue, so that buffers from the 2 freelists are used to refill an entry in the receive descriptor. The current design holds good with regular data traffic. But, when netmap comes into play, the current design will not fit in. The current netmap interfaces and netmap implementation in iflib don't seem to accommodate the design of 2 freelists per receive queue. So, exercising Netmap capability with inbuilt tools like bridge and pkt-gen doesn't work with the 2 freelists driver design. So, the driver design is changed to accommodate the current netmap interfaces and netmap implementation in iflib by using a single freelist per receive queue approach when Netmap capability is exercised, without disturbing the current 2 freelists approach. The dev.ax.sph_enable tunable can be set to 0 to configure the single free list mode. Thanks to Stephan Dewt for his initial set of code changes for the stated problem. Submitted by: rajesh1.kumar_amd.com Approved by: vmaffione MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D27797	2021-01-16 08:29:33 +00:00
Kirk McKusick	79a5c790bd	Eliminate a locking panic when cleaning up UFS snapshots after a disk failure. Each vnode has an embedded lock that controls access to its contents. However vnodes describing a UFS snapshot all share a single snapshot lock to coordinate their access and update. As part of mounting a UFS filesystem with snapshots, each of the vnodes describing a snapshot has its individual lock replaced with the snapshot lock. When the filesystem is unmounted the vnode's original lock is returned replacing the snapshot lock. When a disk fails while the UFS filesystem it contains is still mounted (for example when a thumb drive is removed) UFS forcibly unmounts the filesystem. The loss of the drive causes the GEOM subsystem to orphan the provider, but the consumer remains until the filesystem has finished with the unmount. Information describing the snapshot locks was being prematurely cleared during the orphaning causing the return of the snapshot vnode's original locks to fail. The fix is to not clear the needed information prematurely. Sponsored by: Netflix	2021-01-15 16:36:42 -08:00
Kirk McKusick	173779b98f	Eliminate lock order reversal in UFS when unmounting filesystems with snapshots. Each vnode has an embedded lock that controls access to its contents. However vnodes describing a UFS snapshot all share a single snapshot lock to coordinate their access and update. As part of mounting a UFS filesystem with snapshots, each of the vnodes describing a snapshot has its individual lock replaced with the snapshot lock. When the filesystem is unmounted the vnode's original lock is returned replacing the snapshot lock. The lock order reversal happens because vnode locks must be acquired before snapshot locks. When unmounting we must lock both the snapshot lock and the vnode lock before swapping them so that the vnode will be continuously locked during the swap. For each vnode representing a snapshot, we must first acquire the snapshot lock to ensure exclusive access to it and its original lock. We then face a lock order reversal when we try to acquire the original vnode lock. The problem is eliminated by doing a non-blocking exclusive lock on the original lock which will always succeed since there are no users of that lock. Sponsored by: Netflix	2021-01-15 16:03:01 -08:00
Andrew Turner	f64329bcdc	Extract the logic from pmap_kextract This allows us to use it when we only need to check if the virtual address is valid. For example when checking if an address in the DMAP region is mapped. Reviewed by: kib, markj Sponsored by: Innovate UK Differential Revision: https://reviews.freebsd.org/D27621	2021-01-15 19:08:01 +00:00
Emmanuel Vadot	7c84a7405b	Remove the old dts imported tree. The new one is in sys/contrib/device-tree	2021-01-15 20:09:55 +01:00
Emmanuel Vadot	efdf807990	Switch to the new device-tree vendor tree The old vendor tree was never fully merged and doing a partial merge isn't supported with git subtree merge, so a new one was created. Switch the build to use the new DTS from sys/contrib/device-tree. This also bumps the DTS used to be in sync with Linux 5.9. While here, change the way to get the linux version: simply hardcode the value in sys/dts/freebsd-compatible.dts and use awk to get that to put it in the CFLAGS. As a bonus we now have the bindings docs available in sys/contrib/device-tree/Bindings/ so no need to link to the Linux repo or to the vendor tree.	2021-01-15 20:08:39 +01:00
Emmanuel Vadot	955b980bdf	gpiokeys: Use the new device-tree vendor include	2021-01-15 20:07:24 +01:00
Emmanuel Vadot	c38fe8789a	arm64: Directly use #include <dt-binding/...> We have it in the includes path and this will help the transition to the new device-tree import in sys/contrib	2021-01-15 20:07:19 +01:00
Emmanuel Vadot	19775aa7bc	Re-apply `f81b2b9a8a` to the new device-tree import	2021-01-15 20:07:13 +01:00
Emmanuel Vadot	78abc9e2e6	Revert upstream commit 27c90e5e48d0 It changed the #pinctrl-cells value to be equal to 2 and the macro that generates the values. Based on the bindings docs, a value of 2 is only acceptable if the node used pinctrl-single,bits and not pinctrl-single,pins. This allows booting further on the beaglebone black with 5.9 DTS	2021-01-15 20:07:08 +01:00
Mitchell Horne	0b92d1dd18	riscv: fix kernel build A more complete fix for this function is being worked on in D28054. Fix the uninitialized variable error so that builds can at least proceed. Reported by: several	2021-01-15 11:57:04 -04:00

135775 Commits