freebsd-skq

Author	SHA1	Message	Date
Allan Jude	ba6e37e47f	ipmi_smbios: Deduplicate smbios entry point discovery logic Sponsored by: Ampere Computing LLC Submitted by: Klara Inc. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D28743	2021-02-23 21:17:37 +00:00
Allan Jude	d0673fe160	smbios: Move smbios driver out from x86 machdep code Add it to the x86 GENERIC and MINIMAL kernels Sponsored by: Ampere Computing LLC Submitted by: Klara Inc. Reviewed by: rpokala Differential Revision: https://reviews.freebsd.org/D28738	2021-02-23 21:17:09 +00:00
Allan Jude	11ba8488b8	iicsmb: Request the bus recursively in bread() ipmi_ssif will `smbus_request_bus()` to do multiple smbus requests (which requests the iicbus), and then here in `bread()` we also need to request the bus because `bread()` takes multiple transactions. This causes deadlock as it's waiting for the bus it already has without `IIC_RECURSIVE`. Sponsored by: Ampere Computing LLC Submitted by: Klara Inc. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D28742	2021-02-23 20:06:16 +00:00
Konstantin Belousov	3ae8d83d04	Remove __NO_TLS. All supported platforms support thread-local vars and __thread. Reviewed by: emaste Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D28796	2021-02-23 20:08:10 +02:00
Alex Richardson	fa32350347	close_range: add audit support This fixes the closefrom test in sys/audit. Includes cherry-picks of the following commits from openbsm: `4dfc628aaf` `99ff6fe32a` `da48a0399e` Reviewed By: kevans Differential Revision: https://reviews.freebsd.org/D28388	2021-02-23 17:47:07 +00:00
Alexander Motin	7d4c444374	Bump CTL block backend threads from 14 to 32 per LUN. This makes random read benchmarks look better on wide ZFS pools. I am not sure where the original value came from, but it has been there for too long now. MFC after: 1 week	2021-02-23 11:03:32 -05:00
Kristof Provost	c139b3c19b	arp/nd: Cope with late calls to iflladdr_event When tearing down vnet jails we can move an if_bridge out (as part of the normal vnet_if_return()). This can, when it's clearing out its list of member interfaces, change its link layer address. That sends an iflladdr_event, but at that point we've already freed the AF_INET/AF_INET6 if_afdata pointers. In other words: when the iflladdr_event callbacks fire we can't assume that ifp->if_afdata[AF_INET] will be set. Reviewed by: donner@, melifaro@ MFC after: 1 week Sponsored by: Orange Business Services Differential Revision: https://reviews.freebsd.org/D28860	2021-02-23 13:54:07 +01:00
Kristof Provost	38c0951386	bridge: Remove members when assigned to a new vnet When the bridge is moved to a different vnet we must remove all of its member interfaces (and span interfaces), because we don't know if those will be moved along with it. We don't want to hold references to interfaces not in our vnet. Reviewed by: donner@ MFC after: 1 week Sponsored by: Orange Business Services Differential Revision: https://reviews.freebsd.org/D28859	2021-02-23 13:54:07 +01:00
Kristof Provost	89fa9c34d7	bridge/stp: Ensure we enter NET_EPOCH whenever we can send traffic Reviewed by: donner@ MFC after: 1 week Sponsored by: Orange Business Services Differential Revision: https://reviews.freebsd.org/D28858	2021-02-23 13:54:07 +01:00
Kristof Provost	711ed156b9	bridge: Support STP on VLAN devices VLAN devices have type IFT_L2VLAN, so the STP code mistakenly believed they couldn't be used for STP. That's not the case, so add IFT_L2VLAN to the check. Reviewed by: donner@ MFC after: 1 week Sponsored by: Orange Business Services Differential Revision: https://reviews.freebsd.org/D28857	2021-02-23 13:54:06 +01:00
Eric Joyner	a7ac518bff	ice_ddp: Update package file to 1.3.19.0 This package is intended to be used with ice(4) version 0.28.1-k. That update will happen in a forthcoming commit. Signed-off-by: Eric Joyner <erj@FreeBSD.org> Sponsored by: Intel Corporation	2021-02-22 18:02:19 -08:00
Jamie Gritton	0a2a96f35a	jail: Don't allow jails under dying parents If a jail is created with jail_set(...JAIL_DYING), and it has a parent currently in a dying state, that will bring the parent jail back to life. Restrict that to require that the parent itself be explicitly brought back first, and not implicitly created along with the new child jail. Differential Revision: https://reviews.freebsd.org/D28515	2021-02-22 17:04:06 -08:00
Jamie Gritton	701d6b50ae	jail: Fix a LOR introduced in `1158508a80`	2021-02-22 15:51:10 -08:00
Alexander V. Chernikov	5964172837	Simplify ifa/ifp refcounting in the routing stack. The routing stack control depends on quite a tree of functions to determine the proper attributes of a route such as a source address (ifa) or transmit ifp of a route. When actually inserting a route, the stack needs to ensure that ifa and ifp point to entities that are still valid. Validity means slightly more than just pointer validity: the stack needs a guarantee that the provided objects are not scheduled for deletion. Currently, callers either ignore it (most ifp parts, historically) or try to use refcounting (ifa parts). Even in the case of ifa refcounting it's not always implemented in a fully safe manner. For example, some codepaths inside rt_getifa_fib() are referencing ifa while not holding any locks, resulting in the possibility of referencing a scheduled-for-deletion ifa. Instead of trying to fix all of the callers by enforcing proper refcounting, switch to a different model. As rib_action() already requires epoch, do not require any stability guarantees other than the epoch-provided one. Use newly-added conditional versions of the refcounting functions (ifa_try_ref(), if_try_ref()) and fail if any of these fails. Reviewed by: donner MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D28837	2021-02-22 23:37:59 +00:00
Alexander V. Chernikov	7563019bc6	Add if_try_ref() to simplify refcount handling inside epoch. When we have an ifp pointer and the code is running inside epoch, epoch guarantees the pointer will not be freed. However, the following case can still happen: * in thread 1 we drop to refcount=0 for ifp and schedule its deletion. * in thread 2 we use this ifp and reference it * destroy callout kicks in * unhappy user reports a bug This can happen with the current implementation of ifnet_byindex_ref(), as we're not holding any locks preventing ifnet deletion by a parallel thread. To address it, add if_try_ref(), which returns failure when referencing an ifp with refcount=0. Additionally, guard the existing if_ref() with a KASSERT to provide a cleaner error in such scenarios. Finally, fix ifnet_byindex_ref() by using if_try_ref() and returning NULL if the latter fails. MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D28836	2021-02-22 23:37:59 +00:00
Mark Johnston	537f92cd35	uma: Update the comment above startup_alloc() to reflect reality The scheme used for early slab allocations changed in commit `a81c400e75`. Reported by: alc Reviewed by: alc MFC after: 1 week	2021-02-22 18:22:51 -05:00
Alexander Motin	d510bf133d	cxgb(4): Rework my commit `9dc7c250`. The previous implementation was reported to try to coalesce packets in situations when it should not, which resulted in an assertion later. This implementation better checks the first packet of the chain for coalescing eligibility. MFC after: 3 days	2021-02-22 17:33:43 -05:00
Mark Johnston	23e875fd97	vm_kern: Avoid sign extension in the KVA_QUANTUM definition Otherwise, on a powerpc64 NUMA system with hashed page tables, the first-level superpage reservation size is large enough that the value of the kernel KVA arena import quantum, KVA_NUMA_IMPORT_QUANTUM, is negative and gets sign-extended when passed to vmem_set_import(). This results in a boot-time hang on such platforms. Reported by: bdragon MFC after: 3 days	2021-02-22 15:50:09 -05:00
Jamie Gritton	811e27fa3c	jail: Add PD_KILL to remove a prison in prison_deref(). Add the PD_KILL flag that instructs prison_deref() to take steps to actively kill a prison and its descendents, namely marking it PRISON_STATE_DYING, clearing its PR_PERSIST flag, and killing any attached processes. This replaces a similar loop in sys_jail_remove(), bringing the operation under the same single hold on allprison_lock that it already has. It is also used to clean up failed jail (re-)creations in kern_jail_set(), which didn't generally take all the proper steps. Differential Revision: https://reviews.freebsd.org/D28473	2021-02-22 12:27:44 -08:00
Cy Schubert	a805ffbcbc	ipfilter: Make LARGE_NAT a tunable. LARGE_NAT is a C macro that increases NAT_SIZE from 127 to 2047, RDR_SIZE from 127 to 2047, HOSTMAP_SIZE from 2047 to 8191, NAT_TABLE_MAX from 30000 to 180000, and NAT_TABLE_SZ from 2047 to 16383. These values can be altered at runtime using the ipf -T command; however, some administrators of large firewalls rebuild the kernel to enable LARGE_NAT at boot. This revision adds the tunable net.inet.ipf.large_nat which allows an administrator to set this option at boot instead of build time. Setting the LARGE_NAT macro to 1 is unaffected, allowing build-time users to continue using the old way.	2021-02-22 11:20:18 -08:00
Alex Richardson	ba2cfa80e1	Fix makefs bootstrap after `d485c77f20` The makefs msdosfs code includes fs/msdosfs/denode.h which directly uses struct buf from <sys/buf.h> rather than the makefs struct m_buf. To work around this problem provide a local denode.h that includes ffs/buf.h and defines buf as an alias for m_buf. Reviewed By: kib, emaste Differential Revision: https://reviews.freebsd.org/D28835	2021-02-22 17:55:45 +00:00
Alexander Motin	6895f89fe5	Coalesce socket reads in software iSCSI. Instead of 2-4 socket reads per PDU this can do as low as one read per megabyte, dramatically reducing TCP overhead and lock contention. With this on iSCSI target I can write more than 4GB/s through a single connection. MFC after: 1 month	2021-02-22 12:51:59 -05:00
Alex Richardson	c1b554c868	if_vtnet: Fix pointer-sign and unused parameter warnings Reviewed By: grehan Differential Revision: https://reviews.freebsd.org/D28726	2021-02-22 17:41:04 +00:00
Hans Petter Selasky	9febbc4541	Fix for natd(8) sending wrong sequence number after TCP retransmission, terminating a TCP connection. If a TCP packet must be retransmitted and the data length has changed in the retransmitted packet, due to the internal workings of TCP, typically when ACK packets are lost, then there is a 30% chance that the logic in GetDeltaSeqOut() will find the correct length, which is the last length received. This can be explained as follows: If a "227 Entering Passive Mode" packet must be retransmitted and the length changes from 51 to 50 bytes, for example, then we have three cases for the list scan in GetDeltaSeqOut(), depending on how many prior packets were received modulo N_LINK_TCP_DATA=3: case 1: index 0: original packet 51 index 1: retransmitted packet 50 index 2: not relevant case 2: index 0: not relevant index 1: original packet 51 index 2: retransmitted packet 50 case 3: index 0: retransmitted packet 50 index 1: not relevant index 2: original packet 51 This patch simply changes the searching order for TCP packets, always starting at the last received packet instead of any received packet, in GetDeltaAckIn() and GetDeltaSeqOut(). Else no functional changes. Discussed with: rscheff@ Submitted by: Andreas Longwitz <longwitz@incore.de> PR: 230755 MFC after: 1 week Sponsored by: Mellanox Technologies // NVIDIA Networking	2021-02-22 17:13:58 +01:00
Roger Pau Monné	808d4aad10	xen-blkback: fix leak of grant maps on ring setup failure Multi page rings are mapped using a single hypercall that gets passed an array of grants to map. One of the grants in the array failing to map would lead to the failure of the whole ring setup operation, but there was no cleanup of the rest of the grant maps in the array that could have likely been created as a result of the hypercall. Add proper cleanup on the failure path during ring setup to unmap any grants that could have been created. This is part of XSA-361. Sponsored by: Citrix Systems R&D	2021-02-22 16:47:52 +01:00
Mark Johnston	608c44f96e	m_uiotombuf_nomap(): Stop clearing PG_ZERO in newly allocated pages The caller should not be passing M_ZERO in the first place, so PG_ZERO will not be preserved by the page allocator and clearing it accomplishes nothing. Reviewed by: gallatin, jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D28808	2021-02-22 10:04:46 -05:00
Stefan Eßer	a0ba293c2f	Add missing entry for zfs_racct.c	2021-02-22 15:06:48 +01:00
Martin Matuska	ba27dd8be8	zfs: merge OpenZFS master-9312e0fd1 Notable upstream changes: 778869fa1 Fix reporting of mount progress e7adccf7f Disable use of hardware crypto offload drivers on FreeBSD 03e02e5b5 Fix checksum errors not being counted on repeated repair 64e0fe14f Restore FreeBSD resource usage accounting 11f2e9a49 Fix panic if scrubbing after removing a slog device MFC after: 2 weeks	2021-02-22 13:01:17 +01:00
Alexander Motin	c02a28754b	Fix build after `2c7dc6bae9`. MFC after: 1 month	2021-02-21 17:21:14 -05:00
Alexander Motin	2c7dc6bae9	Refactor CTL datamove KPI. - Make frontends call unified CTL core method ctl_datamove_done() to report move completion. It allows reducing code duplication in different backends by accounting DMA time in common code. - Add to ctl_datamove_done() and be_move_done() callback samethr argument, reporting whether the callback is called in the same context as ctl_datamove(). It allows some cases like iSCSI write with immediate data or camsim frontend write to save one context switch, since we know that the context is sleepable. - Remove data_move_done() methods from struct ctl_backend_driver, unused since forever. MFC after: 1 month	2021-02-21 16:52:33 -05:00
Jamie Gritton	1158508a80	jail: Add pr_state to struct prison Rather than using references (pr_ref and pr_uref) to deduce the state of a prison, keep track of its state explicitly. A prison is either "invalid" (pr_ref == 0), "alive" (pr_uref > 0) or "dying" (pr_uref == 0). State transitions are generally tied to the reference counts, but with some flexibility: a new prison is "invalid" even though it now starts with a reference, and jail_remove(2) sets the state to "dying" before the user reference count drops to zero (which was previously accomplished via the PR_REMOVE flag). pr_state is protected by both the prison mutex and allprison_lock, so it has the same availability guarantees as the reference counts do. Differential Revision: https://reviews.freebsd.org/D27876	2021-02-21 13:24:47 -08:00
Mateusz Guzik	2443068d48	vfs: shrink struct vnode to 448 bytes on LP64 ... by moving v_hash into a 4 byte hole. Combined with several previous size reductions this makes the size small enough to fit 9 vnodes per page as opposed to 8. Add a compilation time assert so that this is not unknowingly worsened. Note the structure still remains bigger than it should be.	2021-02-21 21:07:14 +00:00
Mateusz Guzik	ee9b37ae5c	jail: fix build after the previous commit Noted by: Michael Butler <imb protected-networks.net>	2021-02-21 21:05:25 +00:00
Jamie Gritton	f7496dcab0	jail: Change the locking around pr_ref and pr_uref Require both the prison mutex and allprison_lock when pr_ref or pr_uref go to/from zero. Adding a non-first or removing a non-last reference remain lock-free. This means that a shared hold on allprison_lock is sufficient for prison_isalive() to be useful, which removes a number of cases of lock/check/unlock on the prison mutex. Expand the locking in kern_jail_set() to keep allprison_lock held exclusive until the new prison is valid, thus making invalid prisons invisible to any thread holding allprison_lock (except of course the one creating or destroying the prison). This renders prison_isvalid() nearly redundant, now used only in asserts. Differential Revision: https://reviews.freebsd.org/D28419 Differential Revision: https://reviews.freebsd.org/D28458	2021-02-21 10:55:44 -08:00
Michael Tuexen	b963ce4588	sctp: improve computation of an alternate net Especially handle the case where the net passed in is about to be deleted and therefore not in the list of nets anymore. MFC after: 3 days Reported by: syzbot+9756917a7c8381adf5e8@syzkaller.appspotmail.com	2021-02-21 17:13:06 +01:00
Michael Tuexen	5ac839029d	sctp: clear a pointer to a net which will be removed MFC after: 3 days	2021-02-21 13:06:05 +01:00
Konstantin Belousov	8b7239681e	ext2fs: clear write cluster tracking on truncation Reviewed by: fsu, mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D28679	2021-02-21 11:38:21 +02:00
Konstantin Belousov	2bfd8992c7	vnode: move write cluster support data to inodes. The data is only needed by filesystems that 1. use buffer cache 2. utilize clustering write support. Requested by: mjg Reviewed by: asomers (previous version), fsu (ext2 parts), mckusick Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D28679	2021-02-21 11:38:21 +02:00
Konstantin Belousov	d485c77f20	Remove #define _KERNEL hacks from libprocstat Make sys/buf.h, sys/pipe.h, sys/fs/devfs/devfs*.h headers usable in userspace, assuming that the consumer has an idea what it is for. Unhide more material from sys/mount.h and sys/ufs/ufs/inode.h, sys/ufs/ufs/ufsmount.h for consumption of userspace tools, with the same caveat. Remove unacceptable hack from usr.sbin/makefs which relied on sys/buf.h being unusable in userspace, where it overrode struct buf with its own definition. Instead, provide struct m_buf and struct m_vnode and adapt code to use local variants. Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D28679	2021-02-21 11:38:21 +02:00
Konstantin Belousov	750ea20d3f	Delete dead CLUSTERDEBUG config option. Reviewed by: mckusick Tested by: pho MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D28679	2021-02-21 11:38:21 +02:00
Mateusz Guzik	81174cd8e2	vfs: employ vfs_ref_from_vp in statfs and fstatfs Avoids locking and unlocking the vnode. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D28695	2021-02-21 00:43:05 +00:00
Mateusz Guzik	a15f787adb	vfs: add vfs_ref_from_vp This generalizes what vop_stdgetwritemount used to be doing. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D28695	2021-02-21 00:43:05 +00:00
Mateusz Guzik	5fa12fe0cd	amd64: implement strlen in assembly, take 2 Tested with glibc test suite. The C variant in libkern performs excessive branching to find the zero byte instead of using the bsfq instruction. The same code patched to use it is still slower than the routine implemented here as the compiler keeps neglecting to perform certain optimizations (like using leaq). On top of that the routine can be used as a starting point for copyinstr which operates on words instead of bytes. The previous attempt had an instance of swapped operands to andq when dealing with fully aligned case, which had a side effect of breaking the code for certain corner cases. Noted by jrtc27. Sample results: $(perl -e "print 'A' x 3"): stock: 211198039 patched: 338626619 asm: 465609618 $(perl -e "print 'A' x 100"): stock: 83151997 patched: 98285919 asm: 120719888 Reviewed by: jhb, kib Differential Revision: https://reviews.freebsd.org/D28779	2021-02-21 00:43:05 +00:00
Jamie Gritton	6e1d1bfcac	jail: Improve locking when removing prisons Change the flow of prison_deref() so it doesn't let go of allprison_lock until it's completely done using it (except for a possible drop as part of an upgrade on its first try). Differential Revision: https://reviews.freebsd.org/D28458 MFC after: 3 days	2021-02-20 14:38:58 -08:00
Richard Scheffenegger	a8e431e153	PRR: use accurate rfc6675_pipe when enabled Reviewed By: #transport, tuexen MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D28816	2021-02-20 20:11:48 +01:00
Alexander V. Chernikov	e5b394f2d0	Fix setting static entries for arp/ndp. rtsock message validation changes committed in `2fe5a79425` did not take llinfo messages into account. Add a special validation case for RTA_GATEWAY llinfo messages. MFC after: 2 days	2021-02-20 18:26:35 +00:00
Navdeep Parhar	0460a45062	cxgbe(4): Use the correct filter width for T5+. T5 and above have extra bits for the optional filter fields. This is a correctness issue and not just a waste because a filter mode valid on a T4 (36b) may not be valid on a T5+ (40b). MFC after: 2 weeks Sponsored by: Chelsio Communications	2021-02-19 14:23:58 -08:00
Navdeep Parhar	c91dda5ad9	cxgbe(4): Add a driver ioctl to set the filter mask. Allow the filter mask (aka the hashfilter mode when hashfilters are in use) to be set any time it is safe to do so. The requested mask must be a subset of the filter mode already. The driver will not change the mode or ingress config just to support a new mask. MFC after: 2 weeks Sponsored by: Chelsio Communications	2021-02-19 14:23:58 -08:00
Navdeep Parhar	7ac8040a99	cxgbe(4): Use firmware commands to get/set filter configuration. 1. Query the firmware for filter mode, mask, and related ingress config instead of trying to figure them out from hardware registers. Read configuration from the registers only when the firmware does not support this query. 2. Use the firmware to set the filter mode. This is the correct way to do it and is more flexible as well. The filter mode (and associated ingress config) can now be changed any time it is safe to do so. The user can specify a subset of a valid mode and the driver will enable enough bits to make sure that the mode is maxed out -- that is, it is not possible to set another bit without exceeding the total width for optional filter fields. This is a hardware requirement that was not enforced by the driver previously. MFC after: 2 weeks Sponsored by: Chelsio Communications	2021-02-19 14:23:58 -08:00
Jamie Gritton	d4380c0cdd	jail: Change both root and working directories in jail_attach(2) jail_attach(2) performs an internal chroot operation, leaving it up to the calling process to assure the working directory is inside the jail. Add a matching internal chdir operation to the jail's root. Also ignore kern.chroot_allow_open_directories, and always disallow the operation if there are any directory descriptors open. Reported by: mjg Approved by: markj, kib MFC after: 3 days	2021-02-19 14:13:35 -08:00
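
The if_try_ref() entry in the log above (7563019bc6) describes a conditional reference: taking a reference only succeeds while the count is still non-zero, so an object already scheduled for deletion cannot be revived. The following is a minimal userspace sketch of that idea using C11 atomics. It is not the kernel code; struct obj, obj_ref(), obj_try_ref() and obj_rele() are hypothetical names chosen for illustration, and assert() stands in for the KASSERT mentioned in the commit message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object such as an ifnet. */
struct obj {
	atomic_uint refcount;	/* 0 means "scheduled for destruction" */
};

/* Unconditional reference: the caller must already hold a reference. */
void
obj_ref(struct obj *o)
{
	unsigned int old = atomic_fetch_add(&o->refcount, 1);

	/* Stand-in for the KASSERT idea: referencing a dead object is a bug. */
	assert(old > 0);
	(void)old;
}

/* Conditional reference: fails instead of reviving a dead object. */
bool
obj_try_ref(struct obj *o)
{
	unsigned int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return (true);
	}
	return (false);
}

/* Drop a reference; returns true if this was the last one. */
bool
obj_rele(struct obj *o)
{
	return (atomic_fetch_sub(&o->refcount, 1) == 1);
}
```

A lookup routine in the style the commit describes would call obj_try_ref() on the pointer it found and return NULL when that call fails, rather than handing out a pointer to an object whose destruction is already pending.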

136285 Commits