freebsd-dev

Author	SHA1	Message	Date
Bjoern A. Zeeb	61a68e50d4	LinuxKPI: 802.11 enhance linuxkpi_ieee80211_iterate_interfaces() Add support for IEEE80211_IFACE_SKIP_SDATA_NOT_IN_DRIVER in linuxkpi_ieee80211_iterate_interfaces() needed by a driver. MFC after: 3 days	2022-02-16 03:56:54 +00:00
Bjoern A. Zeeb	c5b96b3eae	LinuxKPI: 802.11 assign an(y) early chandef The Realtek driver assumes an early chandef to be set. At the time of linuxkpi_ieee80211_ifattach() we do not really know one yet so try to find the first one which is available and set that. This prevents a NULL-deref panic. MFC after: 3 days	2022-02-16 03:48:54 +00:00
Bjoern A. Zeeb	652e22d395	LinuxKPI: 802.11: defer workq allocation until we have a name Turned out all the workq's taskqueues were named "wlanNA" if you had more than one card in a machine as by the time we called wiphy_name() the device name was not set yet and we returned the fallback. Move the alloc_ordered_workqueue() from linuxkpi_ieee80211_alloc_hw() to linuxkpi_ieee80211_ifattach() at which time the device name has to be set to give us a unique name. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2022-02-16 03:26:30 +00:00
Bjoern A. Zeeb	d3ef7fb459	LinuxKPI: 802.11 scan update Realtek's rtw88 is returning a hard-coded 1 in case they cannot hw_scan (fw not advertising it). In that case if we want any scan to run we need to fall-back to sw scan. Start dealing with this. Long-term we probably need to keep internal state. MFC after: 3 days	2022-02-16 03:11:01 +00:00
Mark Johnston	26b08c5d21	armv8crypto: Use cursors to access crypto buffer data Currently armv8crypto copies the scheme used in aesni(9), where payload data and output buffers are allocated on the fly if the crypto buffer is not virtually contiguous. This scheme is simple but incurs a lot of overhead: for an encryption request with a separate output buffer we have to - allocate a temporary buffer to hold the payload - copy input data into the buffer - copy the encrypted payload to the output buffer - zero the temporary buffer before freeing it We have a handy crypto buffer cursor abstraction now, so reimplement the armv8crypto routines using that instead of temporary buffers. This introduces some extra complexity, but gallatin@ reports a 10% throughput improvement with a KTLS workload without additional CPU usage. The driver still allocates an AAD buffer for AES-GCM if necessary. Reviewed by: jhb Tested by: gallatin Sponsored by: Ampere Computing LLC Submitted by: Klara Inc. MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D28950	2022-02-15 21:50:41 -05:00
Mark Johnston	0b3235ef74	armv8crypto: Factor out some duplicated GCM code This is in preparation for using buffer cursors. No functional change intended. Reviewed by: jhb Sponsored by: Ampere Computing LLC Submitted by: Klara Inc. MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D28948	2022-02-15 21:47:41 -05:00
Mark Johnston	09bfa5cf16	opencrypto: Add a routine to copy a crypto buffer cursor This was useful in converting armv8crypto to use buffer cursors. There are some cases where one wants to make two passes over data, and this provides a way to "reset" a cursor. Reviewed by: jhb MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D28949	2022-02-15 21:47:10 -05:00
Bjoern A. Zeeb	6baea3312d	LinuxKPI: skbuff updates Various updates to skbuff for new/updated drivers and some housekeeping: - update types and struct members, add new (stub) functions - improve freeing of frags. - fix an issue with sleeping during alloc for dev_alloc_skb(). - Adjust a KASSERT for skb_reserve() which apparently can be called multiple times if no data was put into the skb yet. - move the sysctl from linux_8022.c (which may be in a different module) to linux_skbuff.c so in case we turn debugging on we do not run into unresolved symbols. Rename the sysctl variable to be less conflicting and update debugging macros along with that; also add IMPROVE(). - add DDB support to show an skbuff. - adjust comments/whitespace. No functional changes intended for iwlwifi. Sponsored by: The FreeBSD Foundation (partially) MFC after: 3 days	2022-02-16 02:10:10 +00:00
Bjoern A. Zeeb	2e183d999c	LinuxKPI: 802.11 header updates and add/adjust source dependencies. This update is for more/newer versions of drivers: - add and properly place more structs, enums, defines needed by drivers. - correct types of struct fields. - make various function arguments const. - move REG_RULE() macro to its own file regulatory.h and use macros for calculations. - add linuxkpi_ieee80211_get_channel() implementation. - change linuxkpi_ieee80211_ifattach() to return int for error checking. No intended functional changes for iwlwifi. Sponsored by: The FreeBSD Foundation (partially) MFC after: 3 days	2022-02-15 23:45:15 +00:00
Bjoern A. Zeeb	064c110f4b	LinuxKPI: lockdep add lockdep_assert_not_held() Add lockdep_assert_not_held() asserting LA_UNLOCKED as needed by a driver. MFC after: 3 days Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D34232	2022-02-15 23:15:00 +00:00
Mateusz Guzik	e68a5225e8	fd: add fde_copy To dedup handrolled memcpy. This will be used later to make fd code atomic-clean.	2022-02-15 17:51:08 +00:00
Mateusz Guzik	ec12b4f4ff	fd: add missing seqc to dupfdopen	2022-02-15 17:51:08 +00:00
Mateusz Guzik	c9a995994b	seqc: rename seqc_consistent_nomb to seqc_consistent_no_fence For more consistency with other primitives.	2022-02-15 17:51:07 +00:00
Richard Scheffenegger	972a7d95eb	iscsi: Use calloutng instead of ticks in iscsi initiator callout *_sbt functions are used to reduce ping/timeout scheduling overhead, while allowing later improvements in the functionality. Keep similar 1000ms callouts while adding a 10 ms window, to allow some kernel scheduling improvements. Reviewed By: jhb Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34222	2022-02-15 17:36:22 +01:00
Mark Johnston	235ed6a486	mlx5e: Make TLS tag zones unmanaged These zones are cache zones used to allocate TLS offload contexts from firmware. Releasing items from the cache is a sleepable operation due to the need to await a response from the firmware command freeing the tag, so items cannot be reclaimed from the zone in non-sleepable contexts. Since the cache size is limited by firmware limits, avoid this by setting UMA_ZONE_UNMANAGED to avoid reclamation by uma_timeout() and the low memory handler. Reviewed by: hselasky, kib MFC after: 3 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34142	2022-02-15 09:25:34 -05:00
Mark Johnston	389a3fa693	uma: Add UMA_ZONE_UNMANAGED Allow a zone to opt out of cache size management. In particular, uma_reclaim() and uma_reclaim_domain() will not reclaim any memory from the zone, nor will uma_timeout() purge cached items if the zone is idle. This effectively means that the zone consumer has control over when items are reclaimed from the cache. In particular, uma_zone_reclaim() will still reclaim cached items from an unmanaged zone. Reviewed by: hselasky, kib MFC after: 3 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34142	2022-02-15 09:25:34 -05:00
Li-Wen Hsu	7442b63231	if_epair: Use ANSI C definition This fixes -Werror=strict-prototypes from gcc9 Sponsored by: The FreeBSD Foundation	2022-02-15 21:45:22 +08:00
Richard Scheffenegger	0c2832ee4f	tcp: Restore 6 tcps padding entries in HEAD The padding in CURRENT shall not shrink. It is designed so that in CURRENT it always stays the same, and then when a new stable is branched, it inherits 6 pointer placeholders that can be used within this stable/X lifetime to extend the structure. Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34269	2022-02-15 09:24:07 +01:00
Kristof Provost	24f0bfbad5	if_epair: implement fanout Allow multiple cores to be used to process if_epair traffic. We do this (if RSS is enabled) based on the RSS hash of the incoming packet. This allows us to distribute the load over multiple cores, rather than sending everything to the same one. We also switch from swi_sched() to taskqueues, which also contributes to better throughput. Benchmark results: With net.isr.maxthreads=-1 Setup A: (cc0 - bridge0 - epair0a) (epair0b - bridge1 - cc1) Before 627 Kpps After (no RSS) 1.198 Mpps After (RSS) 3.148 Mpps Setup B: (cc0 - bridge0 - epaira0) (epair0b - vnet jail - epair1a) (epair1b - bridge1 - cc1) Before 7.705 Kpps After (no RSS) 1.017 Mpps After (RSS) 2.083 Mpps MFC after: 3 weeks Sponsored by: Orange Business Services Differential Revision: https://reviews.freebsd.org/D33731	2022-02-15 09:03:24 +01:00
Wei Hu	de64aa32c8	mana: Add handling of CQE_RX_TRUNCATED The proper way to drop this kind of CQE is advancing rxq tail without indicating the packet to the upper network layer. MFC after: 2 weeks Sponsored by: Microsoft	2022-02-15 07:27:42 +00:00
Bjoern A. Zeeb	05f0b24bfb	Bump __FreeBSD_version to 1400052 for LinuxKPI changes. Add a marker after GUID_INIT() and linux/pm_qos.h were added, so that future version of drm-kmod can selectively remove these bits. The latest port version does not require user updates for this so no UPDATING entry.	2022-02-14 23:55:16 +00:00
Bjoern A. Zeeb	fa6d3522b5	LinuxKPI: add linux/pm_qos.h Add a linux/pm_qos.h with three dummy functions and a struct as needed by a driver and drm-kmod [1] with no intent to support this for the moment. Submitted by: wulf (drm-kmod bits) [1] Sponsored by: The FreeBSD Foundation (drm-kmod requested updates) MFC after: 3 days Reviewed by: hselasky (earlier version), wulf Differential Revision: https://reviews.freebsd.org/D34234	2022-02-14 23:53:17 +00:00
Bjoern A. Zeeb	97009980c4	LinuxKPI: add UUID_STRING_LEN and GUID_INIT to uuid.h Add a definition for UUID_STRING_LEN to uuid.h as needed by a driver. Also add GUID_INIT for drm-kmod [1]. Submitted by: wulf [1] MFC after: 3 days Reviewed by: hselasky (earlier), wulf Differential Revision: https://reviews.freebsd.org/D34235	2022-02-14 23:51:51 +00:00
Bjoern A. Zeeb	cee56e77d7	LinuxKPI: 802.11: get rid of lkpi_ic_getradiocaps warnings Users are seeing warnings about 2 channels (1 per band) triggered by an ioctl from wpa_supplicant usually: lkpi_ic_getradiocaps: Adding chan ... returned error 55 This was an early FAQ. Check the current number of channels against maxchans and the return code from net80211. In case net80211 reports that we reached the limit do not print the warning and do not try to add further channels. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2022-02-14 23:48:31 +00:00
Kristof Provost	78bc3d5e17	vlan: allow net.link.vlan.mtag_pcp to be set per vnet The primary reason for this change is to facilitate testing. MFC after: 1 week	2022-02-14 22:51:10 +01:00
Franco Fichtner	0143a6bb7f	pf: fix set_prio after nv conversion Reviewed by: kp MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D34266	2022-02-14 22:51:10 +01:00
Bjoern A. Zeeb	32cf376a01	net80211: enhance (disabled) debugging Add maxchans to the disabled debugging in addchan() and copychan_prev() to aid debugging possible errors returned due to reaching maxchans limits. Sponsored by: The FreeBSD Foundation MFC after: 3 days	2022-02-14 22:16:59 +00:00
John Baldwin	2f6a842484	Disable -Wreturn-type on GCC. GCC is more pedantic than clang about warning when a function doesn't handle undefined enum values (see GCC bug 87950). Clang's warning gives a more pragmatic coverage and should find any real bugs, so disable the warning for GCC rather than adding __unreachable annotations to appease GCC. Reviewed by: imp, emaste Differential Revision: https://reviews.freebsd.org/D34147	2022-02-14 11:48:47 -08:00
John Baldwin	becaf6433b	Use vmspace->vm_stacktop in place of sv_usrstack in more places. Reviewed by: markj Obtained from: CheriBSD Differential Revision: https://reviews.freebsd.org/D34174	2022-02-14 10:57:30 -08:00
Gleb Smirnoff	65572cade3	unix/dgram: return EAGAIN instead of ENOBUFS when O_NONBLOCK set This is the behavior that some programs expect and what Linux does. For example nginx expects EAGAIN when sending messages to /var/run/log, which it connects to with O_NONBLOCK. Particularly with nginx the problem is magnified by the fact that an ENOBUFS on send(2) is also logged, so the situation creates a log-bomb - a failed log message triggers another log message. Reviewed by: markj Differential revision: https://reviews.freebsd.org/D34187	2022-02-14 09:21:55 -08:00
Mark Johnston	c7cd607a4e	msdosfs: Fix mounting when the device sector size is >512B HugeSectors * BytesPerSec should be computed before converting HugeSectors to a DEV_BSIZE-based count. Fixes: `ba2c98389b` ("msdosfs: sanity check sector count from BPB") Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34264	2022-02-14 10:06:47 -05:00
Mark Johnston	852ff943b9	sleepqueue: Annotate sleepq_max_depth as static MFC after: 1 week Sponsored by: The FreeBSD Foundation	2022-02-14 10:06:47 -05:00
Mark Johnston	893be9d8ac	sleepqueue: Address a lock order reversal After commit `74cf7cae4d` ("softclock: Use dedicated ithreads for running callouts."), there is a lock order reversal between the per-CPU callout lock and the scheduler lock. softclock_thread() locks callout lock then the scheduler lock, when preparing to switch off-CPU, and sleepq_remove_thread() stops the timed sleep callout while potentially holding a scheduler lock. In the latter case, it's the thread itself that's locked, and if the thread is sleeping then its lock will be a sleepqueue lock, but if it's still in the process of going to sleep it'll be a scheduler lock. We could perhaps change softclock_thread() to try to acquire locks in the opposite order, but that'd require dropping and re-acquiring the callout lock, which seems expensive for an operation that will happen quite frequently. We can instead perhaps avoid stopping the td_slpcallout callout if the thread is still going to sleep, which is what this patch does. This will result in a spurious call to sleepq_timeout(), but some counters suggest that this is very rare. PR: 261198 Fixes: `74cf7cae4d` ("softclock: Use dedicated ithreads for running callouts.") Reported and tested by: thj Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34204	2022-02-14 10:06:47 -05:00
Bjoern A. Zeeb	a4529c46d4	LinuxKPI: add the beginning of a tracepoint.h implementation Add a beginning of a tracepoint.h implementation to ease porting drivers making use of this Linux facility. MFC after: 3 days Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D34236	2022-02-14 00:24:43 +00:00
Bjoern A. Zeeb	85d61bd872	LinuxKPI: add NETIF_F_HW_CSUM to netdev_features.h Add NETIF_F_HW_CSUM to netdev_features.h as needed by a driver. MFC after: 3 days Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D34233	2022-02-14 00:22:24 +00:00
Bjoern A. Zeeb	c840d5cec2	LinuxKPI: add kstrtoint_from_user() and DECLARE_FLEX_ARRAY() Add an implementation of kstrtoint_from_user() based on the other implementations and an attempt at DECLARE_FLEX_ARRAY() which works for the driver needing it. MFC after: 3 days Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D34231	2022-02-14 00:20:41 +00:00
Bjoern A. Zeeb	0c37ffda79	LinuxKPI: add an initial ethtool.h Add an initial ethtool.h for a define and a dummy struct for now needed by drivers. MFC after: 3 days Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D34229	2022-02-14 00:19:08 +00:00
Bjoern A. Zeeb	3cd6d6ff52	LinuxKPI: add eth_random_addr() and device_get_mac_address() Add eth_random_addr() and a dummy of device_get_mac_address() pending OF (FDT) support needed by drivers. While here remove a white space in random_ether_addr(). MFC after: 3 days Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D34228	2022-02-14 00:17:14 +00:00
Bjoern A. Zeeb	8f33ad3cf5	LinuxKPI: add more errno Add ENOMEDIUM, ENOSR, and ELNRNG to linux/errno.h needed by drivers. MFC after: 3 days Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D34227	2022-02-14 00:15:41 +00:00
Bjoern A. Zeeb	e5b95b2201	LinuxKPI: add sizeof_field() Add sizeof_field() to linux/compiler.h needed by a driver. MFC after: 3 days Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D34226	2022-02-14 00:13:56 +00:00
Bjoern A. Zeeb	d17b78aa14	LinuxKPI: add __ffs64() Add __ffs64() to linux/bitops.h needed by a driver. Reviewed by: hselasky MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D34225	2022-02-14 00:12:09 +00:00
Bjoern A. Zeeb	2e818fbcfc	LinuxKPI: add get_unaligned_le16() Add get_unaligned_le16() to asm/unaligned.h needed by a driver. MFC after: 3 days Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D34224	2022-02-14 00:09:57 +00:00
Bjoern A. Zeeb	232d323ef2	TCP syncache: enhance KASSERT output Improve the "syncache: mbuf too small" assertion message with various variables (some not actually needed) but enough that it will be obvious if (a) we use IPv4 or IPv6, (b) if UDP tunneling is on, (c) what max_linkhdr is, and (d) what MHLEN is. This should help diagnostics in the future. The case was hit with wireless drivers setting a large ic_headroom and using IPv6. Reviewed by: gallatin, tuexen, rscheff MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D34217	2022-02-14 00:03:20 +00:00
Dimitry Andric	09d0a0fbe8	bwi: Fix clang 14 warning about possible unaligned access On architectures with strict alignment requirements (e.g. arm), clang 14 warns about a packed struct which encloses a non-packed union: In file included from sys/dev/bwi/bwimac.c:79: sys/dev/bwi/if_bwivar.h:308:7: error: field iv_val within 'struct bwi_fw_iv' is less aligned than 'union (unnamed union at sys/dev/bwi/if_bwivar.h:305:2)' and is usually due to 'struct bwi_fw_iv' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access] } iv_val; ^ It appears to help if you also add __packed to the inner union (i.e. iv_val). No change to the layout is intended. MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D34196	2022-02-13 14:35:58 +01:00
Mateusz Guzik	6aa246e605	vfs: convert vnsz2log to a macro	2022-02-13 13:07:08 +00:00
Mateusz Guzik	5c31025060	fd: use FILEDESC_FOREACH_{FDE,FP} where appropriate	2022-02-13 13:07:08 +00:00
Mateusz Guzik	60b699f99c	fd: add FILEDESC_FOREACH_{FDE,FP} Right now they naively walk the fd table just like all the other code, but that's going to change.	2022-02-13 13:07:08 +00:00
Mateusz Guzik	809f3121be	fd: assign fd_freefile early when copying This is to simplify an upcoming change.	2022-02-13 13:07:08 +00:00
Mateusz Guzik	893d20c95a	fd: move fd table sizing out of fdinit now it is placed with the rest of actual initialisation	2022-02-13 13:07:08 +00:00
Mateusz Guzik	4103c3cd5b	fd: drop volatile keyword from refcounts While here move a comment where it belongs and do small whitespace clean up.	2022-02-13 13:07:08 +00:00
Mateusz Guzik	b53133a778	proc: load/store p_cowgen using atomic primitives	2022-02-13 13:07:08 +00:00
Mateusz Guzik	29ee49f66b	thread: remove dead store from thread_cow_update	2022-02-13 13:07:08 +00:00
Richard Scheffenegger	70e9f880d8	iscsi: address unused-but-set-variable warning remove "interrupted" in icl_soft_proxy_connect() Reviewed By: hselasky Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34223	2022-02-12 06:15:02 +01:00
Navdeep Parhar	08c7dc7fd4	cxgbe(4): Fix illegal hardware access in cxgbe_refresh_stats. cxgbe_refresh_stats takes into account VI_SKIP_STATS but not VI_INIT_DONE when deciding whether to read the hardware stats. But before this change VI_SKIP_STATS was set only for VIs with VI_INIT_DONE. That meant that cxgbe_refresh_stats always accessed the hardware for uninitialized VIs, and this is a problem if the adapter is suspended or in the middle of a reset. Fix this by setting VI_SKIP_STATS on all VIs during suspend. While here, ignore VI_INIT_DONE in vi_refresh_stats too to be consistent with cxgbe_refresh_stats. MFC after: 1 week Sponsored by: Chelsio Communications	2022-02-12 09:53:50 -08:00
Ed Maste	acfb506b3d	newvers.sh: allow multiple -V args in one invocation Reviewed by: imp MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34253	2022-02-12 11:06:54 -05:00
Navdeep Parhar	39a36707bd	cxgbe(4): Avoid unsafe hardware access in the ifmedia ioctls. The hardware is unavailable when the device is suspended or in the middle of a reset. MFC after: 1 week Sponsored by: Chelsio Communications	2022-02-11 16:35:27 -08:00
John Baldwin	cd0525f615	ktls: Write-lock the INP when changing a transmit TLS session. The TCP rate pacing code relies on being able to read this pointer safely while holding an INP lock. The initial TLS session pointer is set while holding the write lock already. Reviewed by: gallatin, hselasky Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D34086	2022-02-11 15:16:25 -08:00
Konstantin Belousov	fd8d4e53bc	vdso linker scripts: explicitly specify output arch and target Requested by: jhb Reviewed by: emaste, imp, jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34157	2022-02-12 00:32:23 +02:00
John Baldwin	b0c1600a8c	linuxkpi xarray: Correct expression in assertion. Reported by: GCC -Wparentheses Reviewed by: wulf, hselasky Differential Revision: https://reviews.freebsd.org/D34197	2022-02-11 13:59:27 -08:00
Mateusz Guzik	1d65a9b47e	cache: improve vnode vs name assertion in cache_enter_time	2022-02-11 12:29:26 +00:00
Mateusz Guzik	611470a515	cache: remove NOCACHE handling from cache_fplookup_noentry It was copy-pasted from locked lookup. As LOOKUP operation cannot have the flag set it was always ending up setting MAKEENTRY.	2022-02-11 12:29:26 +00:00
Mateusz Guzik	513c7a6e0c	fd: make fget_unlocked take a thread argument Just like other fget routines. This enables embedding fd table pointer in struct thread, avoiding taking a trip through proc.	2022-02-11 12:29:26 +00:00
Mateusz Guzik	45bb8beacc	fd: elide one acquire fence in fget_unlocked_seq Still validate we got the stable state before returning an error though.	2022-02-11 12:29:26 +00:00
Mateusz Guzik	62849eef5b	fd: split fget_unlocked_seq depending on CAPABILITIES This will simplify an upcoming change.	2022-02-11 12:27:22 +00:00
Mateusz Guzik	b937908e41	fd: split fget_cap depending on CAPABILITIES This will simplify an upcoming change.	2022-02-11 12:13:27 +00:00
Mateusz Guzik	f40dd6c803	tty: switch ttyhook_register to use fget_cap_locked It is still wrong-ish as fget* funcs don't expect to operate on arbitrary file descriptor tables, but this at least moves it out of the way of an upcoming change while being bug-compatible.	2022-02-11 12:13:27 +00:00
Mateusz Guzik	93288e2445	Employ thread_cow_synced in setrlimit In order to avoid proc lock/unlock on next kernel entry.	2022-02-11 11:44:07 +00:00
Mateusz Guzik	32114b639f	Add PROC_COW_CHANGECOUNT and thread_cow_synced Combined they can be used to avoid a proc lock/unlock cycle in the syscall handler for curthread, see upcoming examples.	2022-02-11 11:44:07 +00:00
Mateusz Guzik	8a0cb04df4	Add lim_cowsync, similar to crcowsync	2022-02-11 11:44:07 +00:00
Warner Losh	0987dc5be5	riscv: Add static assert for context size Add a static assert for the siginfo_t, mcontext_t and ucontext_t sizes. These are de-facto ABI options and cannot change size ever. Differential Revision: https://reviews.freebsd.org/D34214	2022-02-10 14:32:21 -07:00
Warner Losh	6e48e160ae	powerpc: Add static assert for context size Add a static assert for the siginfo_t, mcontext_t and ucontext_t sizes. These are de-facto ABI options and cannot change size ever. For powerpc64, also add asserts for {u,m}context32_t and siginfo32. Reviewed by: andrew Differential Revision: https://reviews.freebsd.org/D34213	2022-02-10 14:32:20 -07:00
Warner Losh	690601f0b4	amd64: Add static assert for context size Add a static assert for the siginfo_t, mcontext_t and ucontext_t sizes. These are de-facto ABI options and cannot change size ever. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D34212	2022-02-10 14:32:20 -07:00
Warner Losh	d4f495fbf8	i386: Add static assert for context size Add a static assert for the siginfo_t, mcontext_t and ucontext_t sizes. These are de-facto ABI options and cannot change size ever. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D34211	2022-02-10 14:32:20 -07:00
Warner Losh	0c988f92dc	arm: Add static assert for context size Add a static assert for the siginfo_t, mcontext_t and ucontext_t sizes. These are de-facto ABI options and cannot change size ever. Reviewed by: andrew Differential Revision: https://reviews.freebsd.org/D34210	2022-02-10 14:32:20 -07:00
Warner Losh	3988ca5aab	aarch64: Add static assert for context size Add a static assert for the siginfo{,32}_t, mcontext{,32}_t and ucontext{,32}_t sizes. These are de-facto ABI options and cannot change size ever. Reviewed by: kib, andrew, jhb Differential Revision: https://reviews.freebsd.org/D32958	2022-02-10 14:32:20 -07:00
Jason A. Harmening	974efbb3d5	unionfs: fix typo in comment I deleted the wrong word when writing up a comment in a prior change; the covered vnode may be recursed during any unmount, not just forced unmount.	2022-02-10 15:17:43 -06:00
Mark Johnston	b4f60fab5d	tcp: Avoid conditionally defined fields in union lro_address The layout of the structure ends up depending on whether the including file includes opt_inet.h and opt_inet6.h, so different compilation units can end up seeing different versions of the structure. Fix this by unconditionally defining the address fields. As a side effect, this eliminates some duplication in the kernel's CTF type graph. Reviewed by: rscheff, tuexen MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34242	2022-02-10 15:39:58 -05:00
Richard Scheffenegger	3f169c54ab	tcp: Add/update AccECN related statistics and numbers Reserve counters in the tcps struct in preparation for AccECN, extend the debugging output for TF2 flags, optimize the syncache flags from individual bits to a codepoint for the specific ECN handshake. This is in preparation of AccECN. No functional change except for extended debug output capabilities. Reviewed By: #transport, rrs Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34161	2022-02-10 00:21:31 +01:00
Justin Hibbits	6db44b0158	Fix gzip compressed core dumps on big endian architectures The gzip trailer words (size and CRC) are both little-endian per the spec. MFC after: 3 days Sponsored by: Juniper Networks, Inc.	2022-02-10 09:34:37 -06:00
Konstantin Belousov	b51927b7b0	Revert "vm_pageout_scans: correct detection of active object" This reverts commit `3de96d664a`. Problem is that it is possible to reach the state with ref_count == 1 for the mapped non-anonymous object. For instance, anonymous posix shmfd or linux shmfs object could be mapped, and then corresponding file descriptor closed, dropping the object reference owned by the shmfd/shmfs file. Then the check in inactive scan assumes that the object and page are not mapped and frees the page, while they are not. PR: 261707 Discussed with: markj Sponsored by: The FreeBSD Foundation MFC after: now	2022-02-10 16:55:10 +02:00
Hans Petter Selasky	a30f71704e	mlx5ib: Add support for parsing udata in mlx5_ib_create_flow(). Backport from Linux 5.17 (drivers/infiniband/hw/mlx5/fs.c) This fixes creating flow rules from user-space after the kernel space update based on Linux 5.7-rc1 . Sponsored by: NVIDIA Networking	2022-02-10 11:17:42 +01:00
Hans Petter Selasky	04f407a3e5	mlx5en: Make sure the NIC IP addresses are written to firmware on link up. Fixes `e059c120b4` . PR: 261746 MFC after: 1 day Sponsored by: NVIDIA Networking	2022-02-10 11:17:42 +01:00
Kyle Evans	b9c92d631c	Annotate geom_md with MODULE_VERSION This was missed in `74d6c131cb` where other geom modules were annotated with MODULE_VERSION. Again, the problem is the same: we can't detect that geom_md is loaded into the kernel without it. This was noticed in release builds on the cluster; mdconfig attempts to load geom_md because it can't detect it in the kernel, but the cluster config includes md(4) and does not build the kmod. This problem would have been masked on hosts with the kmod built, as the kmod attempts to register the g_md module and fails. With this commit, mdconfig would not even try to load it again. Reported by: re (cperciva) MFC after: 3 days	2022-02-10 00:16:19 -06:00
Rick Macklem	17a56f3fab	nfsd: Reply NFSERR_SEQMISORDERED for bogus seqid argument The ESXi NFSv4.1 client bogusly sends the wrong value for the csa_sequence argument for a Create_session operation. RFC8881 requires this value to be the same as the sequence reply from the ExchangeID operation most recently done for the client ID. Without this patch, the server replies NFSERR_STALECLIENTID, which is the correct response for an NFSv4.0 SetClientIDConfirm but is not the correct error for NFSv4.1/4.2, which is specified as NFSERR_SEQMISORDERED in RFC8881. This patch fixes this. This change does not fix the issue reported in the PR, where the ESXi client loops, attempting ExchangeID/Create_session repeatedly. Reported by: asomers Tested by: asomers PR: 261291 MFC after: 1 week	2022-02-09 15:17:50 -08:00
Kenneth D. Merry	3090d5045a	Fix non-printable characters in NVMe model and serial numbers. The NVMe 1.4 spec simply says that Model and Serial numbers are ASCII strings. Unlike SCSI, it doesn't prohibit non-printable characters or say that the strings should be padded with spaces. Since 2014, we have had cam_strvis_sbuf(), which gives additional options for handling non-ASCII characters. That behavior hasn't been available for non-sbuf consumers, so users of cam_strvis() were left with having octal ASCII codes inserted. So, to avoid having garbage or octal characters in the strings, use cam_strvis_sbuf() to create a new function, cam_strvis_flag(), and re-implement cam_strvis() using cam_strvis_flag(). Now, for the NVMe drives, we can use cam_strvis_flag with the CAM_STRVIS_FLAG_NONASCII_SPC flag. This transforms non-printable characters into spaces. sys/cam/cam.c: Add a new function, cam_strvis_flag(), that creates an sbuf on the stack with the user's destination buffer, and calls cam_strvis_sbuf() with the given flag argument. Re-implement cam_strvis() to call cam_strvis_flag with the CAM_STRVIS_FLAG_NONASCII_ESC argument. This should be the equivalent of the old cam_strvis() function, except for the overhead of creating the sbuf and calling sbuf_putc/printf. sys/cam/cam.h: Declaration for cam_strvis_flag. sys/cam/nvme/nvme_all.c: In nvme_print_ident, use the NONASCII_SPC flag with cam_strvis_flag(). sys/cam/nvme/nvme_da.c: In ndaregister(), use cam_strvis_flag() with the NONASCII_SPC flag for the disk description and serial number we report to GEOM. sys/cam/nvme/nvme_xpt.c: In nvme_probe_done(), use cam_strvis_flag with the NONASCII_SPC flag when storing the drive serial number in the CAM EDT. MFC after: 1 week Sponsored by: Spectra Logic Differential Revision: https://reviews.freebsd.org/D33973	2022-02-09 17:09:25 -05:00
Alexander Motin	98d59d2e0d	snd_hda: Add some ATI HDMI codec IDs. Submitted by: Dmitry Luhtionov <dmitryluhtionov@gmail.com> MFC after: 1 week	2022-02-09 16:29:23 -05:00
Randall Stewart	cc41c17433	oops, my patch lost the removal of the tlp_threshold counter increments	2022-02-09 16:19:22 -05:00
Randall Stewart	8d64b4b4c4	cleanup of rack variables. During a recent deep dive into all the variables so I could discover why stack switching caused larger retransmits I examined every variable in rack. In the process I found quite a few bits that were not used and needed cleanup. This update pulls out all the unused pieces from rack. Note there are no functional changes here, just the removal of unused variables and a bit of spacing clean up. Reviewed by: Michael Tuexen, Richard Scheffenegger Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D34205	2022-02-09 16:08:32 -05:00
Aleksandr Fedorov	b27e6e91d0	ng_pppoe(4): Add the required NET_EPOCH section to the hook disconnection function. Disconnecting hooks are called outside of NET_EPOCH, but ng_pppoe_disconnect() calls NG_SEND_DATA_ONLY() which should be called in NET_EPOCH. PR: 257067 Reported by: niels=freebsd@bakker.net Reviewed by: vmaffione (mentor), glebius, donner Approved by: vmaffione (mentor), glebius, donner Sponsored by: vstack.com Differential Revision: https://reviews.freebsd.org/D34185	2022-02-09 22:00:50 +03:00
Michael Tuexen	a0aeb1cef5	in_pcb.c: fix compilation of an IPv4 only configuration While there, remove a duplicate inclusion of sysctl.h. Reported by: Gary Jennejohn Fixes: `a35bdd4489` - main - tcp: add sysctl interface for setting socket options Sponsored by: Netflix, Inc.	2022-02-09 19:58:29 +01:00
Michael Tuexen	a35bdd4489	tcp: add sysctl interface for setting socket options This interface allows setting a socket option on a TCP endpoint, which is specified by its inp_gencnt. This interface will be used in an upcoming command line tool tcpsso. Reviewed by: glebius, rrs Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D34138	2022-02-09 12:24:41 +01:00
Michael Tuexen	528c764924	tcp: fix compilation when KERN_TLS is not defined Reported by: Gary Jennejohn Fixes: `fd7daa7271` - main - tcp: make tcp_ctloutput_set() non-static Sponsored by: Netflix, Inc.	2022-02-09 12:16:43 +01:00
Ram Kishore Vegesna	7bf31432fd	ocs_fc: Fix a possible Null pointer dereference Fix a possible Null pointer dereference in ocs_hw_get_profile_list_cb() PR: 261453 Reported by: lwhsu MFC after: 3 days	2022-02-09 16:18:21 +05:30
Stefan Grundmann	06296f77c5	vt: fix splash_cpu logos use of vd_drawrect In the (extremely unlikely) case of vd->vd_height == vt_logo_sprite_height the vd_drawrect code would write outside of frame-buffer memory. MFC after: 1 week Reviewed by: cem Differential Revision: https://reviews.freebsd.org/D34220	2022-02-08 22:22:07 -05:00
Michael Tuexen	fd7daa7271	tcp: make tcp_ctloutput_set() non-static tcp_ctloutput_set() will be used via the sysctl interface in an upcoming command line tool tcpsso. Reviewed by: glebius, rscheff Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D34164	2022-02-08 18:49:44 +01:00
Dimitry Andric	5f2aca8394	Disable clang 14 warning about bitwise operators in zstd Parts of zstd, used in openzfs and other places, trigger a new clang 14 -Werror warning: ``` sys/contrib/zstd/lib/decompress/huf_decompress.c:889:25: error: use of bitwise '&' with boolean operands [-Werror,-Wbitwise-instead-of-logical] (BIT_reloadDStreamFast(&bitD1) == BIT_DStream_unfinished) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` While the warning is benign, it should ideally be fixed upstream and then vendor-imported, but for now silence it selectively. MFC after: 3 days	2022-02-08 21:46:08 +01:00
Dimitry Andric	7d8a4eb943	tty_info: Avoid warning by using logical instead of bitwise operators Since TD_IS_RUNNING() and TD_ON_RUNQ() are defined as logical expressions involving '==', clang 14 warns about them being checked with a bitwise operator instead of a logical one: ``` sys/kern/tty_info.c:124:9: error: use of bitwise '|' with boolean operands [-Werror,-Wbitwise-instead-of-logical] runa = TD_IS_RUNNING(td) | TD_ON_RUNQ(td); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ || sys/sys/proc.h:562:27: note: expanded from macro 'TD_IS_RUNNING' ^ sys/kern/tty_info.c:124:9: note: cast one or both operands to int to silence this warning sys/sys/proc.h:562:27: note: expanded from macro 'TD_IS_RUNNING' ^ sys/kern/tty_info.c:129:9: error: use of bitwise '|' with boolean operands [-Werror,-Wbitwise-instead-of-logical] runb = TD_IS_RUNNING(td2) | TD_ON_RUNQ(td2); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ || sys/sys/proc.h:562:27: note: expanded from macro 'TD_IS_RUNNING' ^ sys/kern/tty_info.c:129:9: note: cast one or both operands to int to silence this warning sys/sys/proc.h:562:27: note: expanded from macro 'TD_IS_RUNNING' ^ ``` Fix this by using logical operators instead. No functional change intended. Reviewed by: cem, emaste, kevans, markj MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D34186	2022-02-08 21:21:04 +01:00
Dimitry Andric	14a15342bb	Remove device lio from i386's LINT-NOIP This fixes link errors for the LINT-NOIP kernel on i386: ``` ld: error: undefined symbol: tcp_lro_flush_all >>> referenced by lio_droq.c >>> lio_droq.o:(lio_droq_process_packets) ld: error: undefined symbol: tcp_lro_rx >>> referenced by lio_core.c >>> lio_core.o:(lio_push_packet) ld: error: undefined symbol: tcp_lro_init >>> referenced by lio_main.c >>> lio_main.o:(lio_attach) ld: error: undefined symbol: tcp_lro_free >>> referenced by lio_main.c >>> lio_main.o:(lio_attach) >>> referenced by lio_main.c >>> lio_main.o:(lio_destroy_nic_device) *** [kernel] Error code 1 ``` MFC after: 3 days	2022-02-08 19:53:52 +01:00
Mark Johnston	c862d5f2a7	riscv: Fix a race in pmap_pinit() All pmaps share the top half of the address space. With 3-level page tables, the top-level kernel map entries are not static: they might change if the kernel map is extended (via pmap_growkernel()) or a 1GB mapping in the direct map is demoted (not implemented yet). Thus the riscv pmap maintains the allpmaps list to synchronize updates to top-level entries. When a pmap is created, it is inserted into this list after copying top-level entries from the kernel pmap. The copying is done without holding the allpmaps lock, and it is possible for pmap_pinit() to race with kernel map updates. In particular, if a thread is modifying L1 entries, and a concurrent pmap_pinit() copies the old version of the entries, it might not receive the update. Fix the problem by copying the kernel map entries after inserting the pmap into the list. This ensures that the nascent pmap always receives updates, though pmap_distribute_l1() may race with the page copy. Reviewed by: mhorne, jhb MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34158	2022-02-08 13:31:55 -05:00
Mark Johnston	5de79eeddb	ktls: Disallow transmitting empty frames outside of TLS 1.0/CBC mode There was nothing preventing one from sending an empty fragment on an arbitrary KTLS TX-enabled socket, but ktls_frame() asserts that this could not happen. Though the transmit path handles this case for TLS 1.0 with AES-CBC, we should be strict and allow empty fragments only in modes where it is explicitly allowed. Modify sosend_generic() to reject writes to a KTLS-enabled socket if the number of data bytes is zero, so that userspace cannot trigger the aforementioned assertion. Add regression tests to exercise this case. Reported by: syzkaller Reviewed by: gallatin, jhb MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34195	2022-02-08 12:40:41 -05:00
Mark Johnston	300cfb96fc	file: Make fget() and getvnode() consistent about initializing fpp Most fget() functions initialize the output parameter to NULL. Make the externally visible interface behave consistently, and make fget_unlocked_seq() private to kern_descrip.c. This fixes at least one bug in a consumer, _filemon_wrapper_openat(), which assumes that getvnode() sets the output file pointer to NULL upon an error. Reported by: syzbot+01c0459408f896a5933a@syzkaller.appspotmail.com Reviewed by: kib MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34190	2022-02-08 12:40:41 -05:00
Franco Fichtner	47ded797ce	netinet: simplify RSS ifdef statements Approved by: transport (rrs) MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D31583	2022-02-07 19:22:03 -07:00
John Baldwin	511b83b167	cxgbei: Replace worker thread pools with per-connection kthreads. Having a single pool of worker threads adds extra complexity and overhead. The software backend also uses per-connection kthreads. Sponsored by: Chelsio Communications	2022-02-07 16:20:40 -08:00
John Baldwin	fd8f61d6e9	cxgbei: Dispatch sent PDUs to the NIC asynchronously. Previously the driver was called to send PDUs to the NIC synchronously from the icl_conn_pdu_queue_cb callback. However, this performed a fair bit of work while holding the icl connection lock. Instead, change the callback to add sent PDUs to a STAILQ and defer dispatching of PDUs to the NIC to a helper thread similar to the scheme used in the TCP iSCSI backend. - Replace rx_flags int and the sole RXF_ACTIVE flag with a simple rx_active bool. - Add a pool of transmit worker threads for cxgbei. - Fix worker thread exit to depend on the wakeup in kthread_exit() to fix a race with module unload. Reported by: mav Sponsored by: Chelsio Communications	2022-02-07 16:20:06 -08:00
Hans Petter Selasky	e85af89fa7	Add more USB host controller PCI ID's. Submitted by: Gary Jennejohn <gljennjohn@gmail.com> MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-08 00:29:24 +01:00
John Baldwin	6426978617	Extend the VMM stats interface to support a dynamic count of statistics. - Add a starting index to 'struct vmstats' and change the VM_STATS ioctl to fetch the 64 stats starting at that index. A compat shim for <= 13 continues to fetch only the first 64 stats. - Extend vm_get_stats() in libvmmapi to use a loop and a static thread local buffer which grows to hold the stats needed. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D27463	2022-02-07 14:11:10 -08:00
Kristof Provost	3f3e4f3c74	dummynet: don't use per-vnet locks to protect global data. The ref_count counter is global (i.e. not per-vnet) so we can't use a per-vnet lock to protect it. Moreover, in callouts curvnet is not set, so we'd end up panicking when trying to use DN_BH_WLOCK(). Instead we use the global sched_lock, which is already used when evaluating ref_count (in unload_dn_aqm()). Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D34059	2022-02-07 22:59:46 +01:00
John Baldwin	7043ca9140	linuxkpi: Add parentheses to pacify -Wparentheses warnings from GCC. Reviewed by: bz, emaste Differential Revision: https://reviews.freebsd.org/D34145	2022-02-07 13:43:22 -08:00
Sebastian Huber	3ec0dc367b	kern_ntptime.c: Remove ntp_init() The ntp_init() function did set a couple of global objects to zero. These objects are in the .bss section and already initialized to zero during kernel or module loading.	2022-02-07 14:16:16 -07:00
John Baldwin	a3d71fffa7	cfiscsi_done: Free the dummy PDU earlier. The dummy PDU needs to be freed before marking task abortion complete as otherwise cfiscsi_session_terminate_tasks can return and destroy the session in another thread before the PDU is freed. Fixes: `2e8d1a5525` iscsi: Allocate a dummy PDU for the internal nexus reset task. Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D34176	2022-02-07 12:55:08 -08:00
John Baldwin	c227269e2f	Stop adding -Wredundant-decls to CWARNFLAGS. clang doesn't implement it, and Linux doesn't enforce it. As a result, new instances keep cropping up both in FreeBSD's code and in upstream sources from vendors. Reviewed by: emaste Differential Revision: https://reviews.freebsd.org/D34144	2022-02-07 12:47:51 -08:00
John Baldwin	949e395966	Trim duplicate code for copying in iovecs for PT_[GS]ETREGSET. Reviewed by: andrew, emaste Differential Revision: https://reviews.freebsd.org/D34177	2022-02-07 11:49:29 -08:00
Robert Wing	c9e023541a	pbuf_ctor(): lock the buffer with LK_NOWAIT This LOR happens when reading from a file backed MD device: lock order reversal: 1st 0xfffffe00431eaac0 pbufwait (pbufwait, lockmgr) @ /cobra/src/sys/vm/vm_pager.c:471 2nd 0xfffff80003f17930 ufs (ufs, lockmgr) @ /cobra/src/sys/dev/md/md.c:977 lock order pbufwait -> ufs attempted at: #0 0xffffffff80c78ead at witness_checkorder+0xbdd #1 0xffffffff80bd6a52 at lockmgr_lock_flags+0x182 #2 0xffffffff80f52d5c at ffs_lock+0x6c #3 0xffffffff80d0f3f4 at _vn_lock+0x54 #4 0xffffffff80708629 at mdstart_vnode+0x499 #5 0xffffffff807060ec at md_kthread+0x20c #6 0xffffffff80bbfcd0 at fork_exit+0x80 #7 0xffffffff810b809e at fork_trampoline+0xe This LOR was previously blessed by witness before commit `531f8cfea0` ("Use dedicated lock name for pbufs"). Instead of blessing ufs and pbufwait, use LK_NOWAIT to prevent recording the lock order. LK_NOWAIT will be a nop here as the lock is dropped in pbuf_dtor(). This takes the same approach as `5875b94c74` ("buf_alloc(): lock the buffer with LK_NOWAIT"). Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D34183	2022-02-07 10:05:20 -09:00
Jung-uk Kim	379797d4b4	usb(4): Belatedly add a PCI device ID for AMD Bolton chipset	2022-02-07 13:48:09 -05:00
Andrew Turner	31cf95cec7	Stop single stepping in signal handlers on arm64 We should clear the single step flag when entering a signal handler and set it when returning. This fixes the ptrace__PT_STEP_with_signal test. While here add support for userspace to set the single step bit as on x86. This can be used by userspace for self tracing. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34170	2022-02-07 15:03:23 +00:00
Richard Scheffenegger	ab001fcdf2	tcp: Apply tcp flags after ECN processing in rack_fast_output() An earlier change missed moving tcp_set_flags() past ECN processing in rack_fast_output(). Reviewed By: rrs, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34180	2022-02-07 03:28:27 +01:00
Andrew Turner	67dc576bae	Fix the signal code on 32-bit breakpoints on arm64 When debugging 32-bit programs a debugger may insert an instruction that will raise the undefined instruction trap. The kernel handles these by raising a SIGTRAP, however the code was incorrect. Fix this by using the expected TRAP_BRKPT signal code. Sponsored by: The FreeBSD Foundation	2022-02-07 11:56:04 +00:00
Randall Stewart	a9696510f5	tcp: Add hystart++ to our cubic implementation. As promised to the transport call on 11/4/22 here is an implementation of hystart++ for cubic. It also cleans up the tcp_congestion function to have a better name. Common variables are moved into the general cc.h structure so that both cubic and newreno can use them for hystart++ Reviewed by: Michael Tuexen, Richard Scheffenegger Sponsored by: Netflix Inc. Differential Revision: https://reviews.freebsd.org/D33035	2022-02-07 06:37:46 -05:00
Roger Pau Monné	476438e81f	xen: remove public headers in sys/xen/interface Those are superseded by the ones in sys/contrib/xen and no longer used. Sponsored by: Citrix Systems R&D	2022-02-07 10:12:34 +01:00
Elliott Mitchell	ad7dd51499	xen: switch to use headers in contrib These headers originate with the Xen project and shouldn't be mixed with the main portion of the FreeBSD kernel. Notably they shouldn't be the target of clean-up commits. Switch to use the headers in sys/contrib/xen. Reviewed by: royger	2022-02-07 10:11:56 +01:00
Roger Pau Monné	3a9fd8242b	xen: import Xen 4.16 public headers in sys/contrib/ The current path of the Xen headers at /sys/xen/interface/ is not correct, as those headers are imported verbatim from the Xen sources and shouldn't be modified, as any modifications would be lost when a new version is imported. Changes to the public headers must be first done in Xen upstream so that they can be backported and new imports will already carry them. Import Xen 4.16 headers in sys/contrib/xen/. It's unlikely that we will import different Xen code, so don't place them inside of any subdirectory. If in the future other pieces of Xen code need to be imported the headers will need to move into an include/ subdirectory. Note that this commit does not yet modify the include path to use the newly imported headers. Sponsored by: Citrix Systems R&D	2022-02-07 10:11:56 +01:00
Elliott Mitchell	b6da4ec609	xen: remove leftover bits missed in commit `ac3ede5371` These fields are now unused, remove them. Fixes: `ac3ede5371` ('x86/xen: remove PVHv1 code') Reviewed by: royger Differential Revision: https://reviews.freebsd.org/D31206	2022-02-07 10:06:27 +01:00
Roger Pau Monné	759ae58c00	xen/grant-table: remove explicit linear mapping additions There's no need to explicitly add linear mappings for the grant table area, as the memory is allocated using xenmem_alloc and it should already have a linear mapping that can be obtained using rman_get_virtual. While there also remove the return value of gnttab_map, since there's no return value anymore. Sponsored by: Citrix Systems R&D Reviewed by: Elliott Mitchell <ehem+freebsd@m5p.com> Differential revision: https://reviews.freebsd.org/D29602	2022-02-07 10:06:27 +01:00
Gordon Bergling	b6724f7004	tegra: Fix a common typo in source code comments - s/ajusted/adjusted/ MFC after: 3 days	2022-02-06 17:31:05 +01:00
Gordon Bergling	5a78ec9e7c	kern_fflock: Fix a typo in a source code comment - s/foward/forward/ MFC after: 3 days	2022-02-06 17:29:43 +01:00
Gordon Bergling	a9bee9c77a	kern_racct: Fix a typo in a source code comment - s/maxumum/maximum/ MFC after: 3 days	2022-02-06 17:28:27 +01:00
Gordon Bergling	8ea3ceda76	fs: fix a few common typos in source code comments - s/quadradically/quadratically/ - s/persistant/persistent/ Obtained from: NetBSD MFC after: 3 days	2022-02-06 13:48:31 +01:00
Gordon Bergling	f32dd4d58a	cam(4): Fix a few typos in source code comments - s/trafer/transfer/ - s/failes/fails/ Obtained from: NetBSD MFC after: 3 days	2022-02-06 13:45:47 +01:00
Gordon Bergling	bc9432d0e7	xen(4): Fix a common typo in a source code comment - s/existance/existence/ MFC after: 3 days	2022-02-06 13:44:49 +01:00
Aleksandr Fedorov	ceaf442ff2	if_vxlan(4): Allow netmap_generic to intercept RX packets. Netmap (generic) intercepts the if_input method to handle RX packets. Call ifp->if_input() instead of netisr_dispatch(). Add stricter check for incoming packet length. This change is very useful with bhyve + vale + if_vxlan. Reviewed by: vmaffione (mentor), kib, np, donner Approved by: vmaffione (mentor), kib, np, donner MFC after: 2 weeks Sponsored by: vstack.com Differential Revision: https://reviews.freebsd.org/D30638	2022-02-06 15:27:46 +03:00
Hans Petter Selasky	42cf33dd1a	Add new USB host controller PCI ID's. Submitted by: Dmitry Luhtionov <dmitryluhtionov@gmail.com> MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-06 13:18:35 +01:00
Konstantin Belousov	0af463e661	ffs_read(): lock buffers after snaplk with LK_NOWITNESS Reviewed and tested by: mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34179	2022-02-06 03:26:22 +02:00
Gleb Smirnoff	c999e3481d	dmesg: detect wrapped msgbuf on the kernel side and if so, skip first line Since `59f256ec35` dmesg(8) will always skip the first line of the message buffer, because it might be incomplete. The problem is that in most cases it is complete, valid and contains the "---<<BOOT>>---" marker. This skip can be disabled with '-a', but that would also unhide all non-kernel messages. Move this functionality from dmesg(8) to the kernel, since the kernel actually knows whether a wrap has happened or not. The main motivation for the change is not actually the value of the "---<<BOOT>>---" marker. The problem breaks unit tests that clear the message buffer, perform a test and then check the message buffer for a result. An example of such a test is sys/kern/sonewconn_overflow.	2022-02-05 13:35:31 -08:00
Richard Scheffenegger	1790549d80	tcp: use TCPSTAT_INC in kernel ecn functions Incorrectly used KMOD_ macro in static kernel ECN functions. Both eventually resolve to counter_s64_add(), but better use the correct macros. Reviewed By: tuexen, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34181	2022-02-05 16:55:22 +01:00
Aleksandr Fedorov	fc035df8af	if_vtnet(4): Restore the ability to set promisc mode. PR: 254343, 255054 Reviewed by: vmaffione (mentor), donner Approved by: vmaffione (mentor), donner MFC after: 2 weeks Sponsored by: vstack.com Differential Revision: https://reviews.freebsd.org/D30639	2022-02-05 18:47:46 +03:00
Richard Scheffenegger	f7220c486c	tcp: move ECN handling code to a common file Reduce the burden to maintain correct and extensible ECN related code across multiple stacks and codepaths. Formally no functional change. Incidentally this establishes correct ECN operation in one instance. Reviewed By: rrs, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34162	2022-02-05 15:04:42 +01:00
Kristof Provost	b21826bf15	pf: deal with tables gaining or losing counters When we create a table without counters, add an entry and later re-define the table to have counters we wound up trying to read non-existent counters. We now cope with this by attempting to add them if needed, removing them when they're no longer needed and not trying to read from counters that are not present. MFC after: 2 weeks Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D34131	2022-02-05 10:29:34 +01:00
Richard Scheffenegger	7994ef3c39	Revert "tcp: move ECN handling code to a common file" This reverts commit `0c424c90ea`.	2022-02-05 01:07:51 +01:00
Alan Somers	18ed2ce77a	fusefs: fix the build without INVARIANTS after `00134a0789` MFC after: 2 weeks MFC with: `00134a0789` Reported by: se	2022-02-04 18:44:27 -07:00
Richard Scheffenegger	0c424c90ea	tcp: move ECN handling code to a common file Reduce the burden to maintain correct and extensible ECN related code across multiple stacks and codepaths. Formally no functional change. Incidentally this establishes correct ECN operation in one instance. Reviewed By: rrs, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34162	2022-02-04 22:54:41 +01:00
John Baldwin	6bea696af2	linux_copyout_strings: Use PROC_PS_STRINGS(). Reviewed by: markj Obtained from: CheriBSD Differential Revision: https://reviews.freebsd.org/D34173	2022-02-04 15:57:57 -08:00
John Baldwin	74fea8eb4f	cxgbei: Rework parsing of pre-offload PDUs. sbcut() returns mbufs in reverse order so is not suitable for reading data from the socket buffer. Instead, check for already-received data in the receive worker thread before passing offload PDUs up to the iSCSI layer. This uses soreceive() to read data from the socket and is also able to use M_WAITOK since it now runs from a worker thread instead of an interrupt thread. Also, fix decoding of the data segment length for pre-offload PDUs. Reported by: Jithesh Arakkan @ Chelsio Fixes: `a8c4147edc` cxgbei: Parse all PDUs received prior to enabling offload mode. Sponsored by: Chelsio Communications	2022-02-04 15:38:49 -08:00
Alan Somers	00134a0789	fusefs: require FUSE_NO_OPENDIR_SUPPORT for NFS exporting FUSE file systems that do not set FUSE_NO_OPENDIR_SUPPORT do not guarantee that d_off will be valid after closing and reopening a directory. That conflicts with NFS's statelessness, which results in unresolvable bugs when NFS reads large directories, if: * The file system _does_ change the d_off field for the last directory entry previously returned by VOP_READDIR, or * The file system deletes the last directory entry previously seen by NFS. Rather than doing a poor job of exporting such file systems, it's better just to refuse. Even though this is technically a breaking change, 13.0-RELEASE's NFS-FUSE support was bad enough that an MFC should be allowed. MFC after: 3 weeks. Reviewed by: rmacklem Differential Revision: https://reviews.freebsd.org/D33726	2022-02-04 16:31:05 -07:00
Alan Somers	4a6526d84a	fusefs: optimize NFS readdir for FUSE_NO_OPENDIR_SUPPORT In its lowest common denominator, FUSE does not require that a directory entry's d_off field is valid outside of the lifetime of the directory's FUSE file handle. But since NFS is stateless, it must reopen the directory on every call to VOP_READDIR. That means reading the directory all the way from the first entry. Not only does this create an O(n^2) condition for large directories, but it can also result in incorrect behavior if either: * The file system _does_ change the d_off field for the last directory entry previously seen by NFS, or * The file system deletes the last directory entry previously seen by NFS. Handily, for file systems that set FUSE_NO_OPENDIR_SUPPORT, d_off is guaranteed to be valid for the lifetime of the directory entry, so there is no need to read the directory from the start. MFC after: 3 weeks Reviewed by: rmacklem	2022-02-04 16:30:58 -07:00
Alan Somers	d088dc76e1	Fix NFS exports of FUSE file systems for big directories The FUSE protocol does not require that a directory entry's d_off field outlive the lifetime of its directory's file handle. Since the NFS server must reopen the directory on every VOP_READDIR call, that means it can't pass uio->uio_offset down to the FUSE server. Instead, it must read the directory from 0 each time. It may need to issue multiple FUSE_READDIR operations until it finds the d_off field that it's looking for. That was the intention behind SVN r348209 and r297887, but a logic bug prevented subsequent FUSE_READDIR operations from ever being issued, rendering large directories incompletely browseable. MFC after: 3 weeks Reviewed by: rmacklem	2022-02-04 16:30:49 -07:00
Emmanuel Vadot	867b4decb4	lindebugfs: Fix write For write operation pseudofs creates an sbuf with the data. Use this data instead of the uio as it's not usable anymore after uiomove. Reviewed by: hselasky MFC after: 1 week Sponsored by: Beckhoff Automation GmbH & Co. KG Differential Revision: https://reviews.freebsd.org/D34114	2022-02-04 14:31:08 +01:00
Konstantin Belousov	9596b349bb	x86 atomic.h: remove obsoleted comment Modules no longer call kernel functions for atomic ops, and since the previous commit, we always use lock prefix. Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com> Reviewed by: jhb, markj MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34153	2022-02-04 14:01:39 +02:00
Konstantin Belousov	9c0b759bf9	x86 atomics: use lock prefix unconditionally Atomics have significant other uses besides providing in-system primitives for safe memory updates. They are used for implementing communication with out of system software or hardware following some protocols. For instance, even a UP kernel might require a protocol using atomics to communicate with a software-emulated device on an SMP hypervisor. Or real hardware might need atomic accesses as part of the proper management protocol. Another point is that UP configurations on x86 are extinct, so the slight performance hit from unconditionally using proper atomics is not important. It is compensated by less code clutter, which in fact improves the UP/i386 lifetime expectations. Requested by: Elliott Mitchell <ehem+freebsd@m5p.com> Reviewed by: Elliott Mitchell, imp, jhb, markj, royger Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34153	2022-02-04 14:01:39 +02:00
Konstantin Belousov	cbf999e75d	x86 atomic.h: cleanup comments for preprocessor directives Reviewed by: Elliott Mitchell, imp, jhb, markj, royger Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34153	2022-02-04 14:01:39 +02:00
Andrew Turner	664640ba6c	Sort the names of the arm64 debug registers While here clean up the names for the naming convention of the other registers in this file. Reviewed by: kib, mhorne (earlier version) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34060	2022-02-04 10:49:27 +00:00
Sylvian Meygret	cd7306bb1f	ip_mroute: split mrouter interface deactivation and if_free Move if_free outside MRW_LOCK. This will silence the LOR message which might appear during deinitialization.	2022-02-04 10:25:07 +01:00
Adrian Chadd	e388de98bd	ar40xx_switch: add initial switch for the IPQ4018/IPQ4019. Summary: This switch is based off of the AR8327/AR8337 external switch/PHY. However unlike the AR8327/AR8337 it itself doesn't have any PHYs; instead an external PHY connects to it using the PSGMII port. Differential Revision: https://reviews.freebsd.org/D34112 Reviewed by: manu This code is inspired by the ar40xx code in openwrt, which itself is based on the Qualcomm QCA-SSDK. Both of these sources are, amusingly, BSD licenced - and thus I have included some of the comments in the hardware workaround paths to document some of the magic numbers.	2022-02-03 21:27:13 -08:00
Adrian Chadd	b509e53896	dts: add IPQ4018/IPQ4019 ethernet MAC and ethernet switch definitions This adds the ethernet MAC and ethernet switch definitions. I've rewritten the header file and the DTS based on documentation and the required driver fields rather than the GPL'ed ones from openwrt. Differential Revision: https://reviews.freebsd.org/D34111 Reviewed by: manu	2022-02-03 21:26:45 -08:00
Adrian Chadd	29332c0dce	qcom_mdio: add initial IPQ4018 MDIO support This adds support for the IPQ4018/IPQ4019 MDIO bus. This is used to talk to external PHYs and switches. (There's an internal switch in the IPQ4018/IPQ4019 as well, but it's accessible via MMIO/AXI.) Differential Revision: https://reviews.freebsd.org/D34110 Reviewed by: manu	2022-02-03 21:26:14 -08:00
Henri Hennebert	ad494d3b2d	rtsx: Update driver version number to 2.1c Differential Revision: https://reviews.freebsd.org/D32154	2022-02-03 18:43:13 -05:00
Henri Hennebert	1e800a5934	rtsx: Do not display pci_read_config() errors during rtsx_init() Differential Revision: https://reviews.freebsd.org/D32154	2022-02-03 18:43:13 -05:00
Henri Hennebert	ec1f122b56	rtsx: Add CTLFLAG_STATS flag for read and write counters Differential Revision: https://reviews.freebsd.org/D32154	2022-02-03 18:43:12 -05:00
Henri Hennebert	7e5933b333	rtsx: Prefer __FreeBSD_version over __FreeBSD__ No functional change. Differential Revision: https://reviews.freebsd.org/D32154	2022-02-03 18:43:12 -05:00
Henri Hennebert	8e9740b62e	rtsx: Convert driver to use the mmc_sim interface A lot more generic cam related things were done in mmc_sim so this simplifies the driver a lot. Differential Revision: https://reviews.freebsd.org/D32154 Reviewed by: imp	2022-02-03 18:43:12 -05:00
Justin Hibbits	aa4736459e	powerpc/atomic: Fix atomic_testand_*_long on powerpc64 After `b5d227b0` FreeBSD was panicking on boot with "Duplicate free" in UMA. Analyzing the asm, the '1' mask was treated as an integer, rather than a long, causing 'slw' (shift left word) to be used for the shifting instruction, not 'sld' (shift left double). This means the upper bits of the bitfield were not getting used, resulting in corruption of the bitfield. While fixing this, the 'and' check of the mask does not need to be recorded, so don't record (drop the '.').	2022-02-03 17:25:39 -06:00
Alexander Motin	3b248a2113	APEI: Make sure event data fit into the buffer. There seem to be systems returning some garbage here. I still don't know why, but at least I hope this check fixes the indefinite printf loop. MFC after: 2 weeks	2022-02-03 15:33:01 -05:00
Richard Scheffenegger	fd723975ec	tcp: fix typo in commit `f026275e26` missed one bitmask inversion while committing D34148 Differential Revision: https://reviews.freebsd.org/D34148 Differential Revision: https://reviews.freebsd.org/D34160	2022-02-03 21:05:09 +01:00
Richard Scheffenegger	3b0ee68050	tcp: Prevent setting of ECN bits with setsockopt() setsockopt() grants full access to the deprecated TOS byte. For TCP, mask out the ECN codepoint, so that only the DSCP portion can be adjusted. Reviewed By: tuexen, hselasky, #manpages, #transport, debdrup Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34154	2022-02-03 20:06:42 +01:00
Jesper Schmitz Mouridsen	ea07ba1170	sys/arm64/iommu/iommu_pmap.c re-add sys/systm.h after `d950c5898a` UINT64_C and bzero were no longer defined Approved by: kib Differential Revision: https://reviews.freebsd.org/D34155	2022-02-03 20:03:29 +01:00
John Baldwin	87c5d39f77	iwlwifi: Disable -Wformat when building with GCC. GCC's -Wformat complains about NULL format strings passed to iwl_fw_dbg_collect_trig (though the function handles NULL format strings). Curious that upstream iwlwifi in Linux is built with GCC and explicitly opts into this warning via the __printf() attribute. Reviewed by: bz Differential Revision: https://reviews.freebsd.org/D34146	2022-02-03 10:48:18 -08:00
Hans Petter Selasky	c830e92924	mlx5ib: Fix whitespace. Found by: kib@ MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-03 17:45:19 +01:00
Cy Schubert	5d4a348d0b	ipfilter: Fix indentation error Fixes: `064a5a9564` MFC after: 3 days	2022-02-03 08:37:11 -08:00
Hans Petter Selasky	12af59c2cf	mlx5ib: Add missing auto generated header file to Makefile. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-03 17:35:12 +01:00
Alexander Motin	1a8d8a3a90	CTL: Fix mode page truncation on HA synchronization. Due to the variable size of struct ctl_ha_msg_mode, ctl_isc_announce_mode() sent only the first 4 bytes of the modified mode page to the other HA side, which caused its corruption there, noticeable only after failover. I've also found a similar bug in ctl_isc_announce_lun(), but there it was sending slightly more than needed, which is a smaller problem. MFC after: 1 week Sponsored by: iXsystems, Inc.	2022-02-03 11:10:12 -05:00
Kyle Evans	642701abc8	kern: harvest entropy from callouts `74cf7cae4d` ("softclock: Use dedicated ithreads for running callouts.") switched callouts away from the swi infrastructure. It turns out that this was a major source of entropy in early boot, which we've now lost. As a result, first boot on hardware without a 'fast' entropy source would block waiting for fortuna to be seeded with little hope of progressing without manual intervention. Let's resolve it by explicitly harvesting entropy in callout_process() if we've handled any callouts. cc/curthread/now seem to be reasonable sources of entropy, so use those. Discussed with: jhb (also proposed initial patch) Reported by: many Reviewed by: cem, markm (both csprng) Differential Revision: https://reviews.freebsd.org/D34150	2022-02-03 10:05:06 -06:00
Richard Scheffenegger	f026275e26	tcp: set IP ECN header codepoint properly TCP RACK can cache the IP header while preparing a new TCP packet for transmission. Thus all the IP ECN codepoint bits need to be assigned, without assuming a clear field beforehand. Reviewed By: tuexen, kbowling, #transport MFC after: 3 days Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34148	2022-02-03 16:53:41 +01:00
Richard Scheffenegger	1ebf460758	tcp: Access all 12 TCP header flags via inline function In order to consistently provide access to all (including reserved) TCP header flag bits, use an accessor function tcp_get_flags and tcp_set_flags. Also expand any flag variable from uint8_t / char to uint16_t. Reviewed By: hselasky, tuexen, glebius, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34130	2022-02-03 16:21:58 +01:00
Mark Johnston	b84ed4e7f6	filemon: Reject FILEMON_SET_FD commands when the fd is a kqueue When FILEMON_SET_FD is used, the filemon handle effectively wraps the passed file. In particular, the handle may be inherited by a child process, or transferred over a unix domain socket, so we must verify that the backing file permits this. Reported by: syzbot+36e6be9e02735fe66ca8@syzkaller.appspotmail.com Reviewed by: emaste MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34128	2022-02-03 09:41:53 -05:00
Michael Tuexen	d51c80351f	rack: fix compilation and small cleanup Fix a function prototype missed in the last commit and whitespace change. Sponsored by: Netflix, Inc.	2022-02-02 09:41:40 +01:00
Michael Tuexen	3b3c08c135	tcp: cleanup functions related to socket option handling Consistently only pass the inp and the sopt around. Don't pass the so around, since in a upcoming commit tcp_ctloutput_set() will be called from a context different from setsockopt(). Also expect the inp to be locked when calling tcp_ctloutput_[gs]et(), this is also required for the upcoming use by tcpsso, a command line tool to set socket options. Reviewed by: glebius, rscheff Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D34151	2022-02-02 09:27:59 +01:00
Warner Losh	e30fceb89b	mps: Use 64-bit chain structures According to Broadcom, mixing 64-bit SGEs with 32-bit chain entries can lead to IOC Fault code 0x40000d04. This fault code has been observed to suddenly increase on certain machines when the OCA firmware images are deployed. The hardware interprets all elements of a 64-bit SGE, even ones marked as 32-bit. Depending on the other bits, this will just work, but sometimes generate the above fault. Broadcom recommends this practice, and the Linux and NetBSD drivers follow it. Rework the chaining code to use MPI2_SGE_CHAIN64 instead of MPI2_SGE_CHAIN32. Adjust MPS_SGC_SIZE from 8 to 12 to match the size of the new structure. Flag the structure as being 64-bits now. Since MPS_SGE64_SIZE and MPS_SGC_SIZE are the same now, mps_push_sge could be simplified (after the same fashion of mpr). The different number of cases collapse to whether or not there's room for the segments and if not we need a chain, however these changes haven't been made yet as the current code handles those cases properly with the new defines. Made chain_busaddr 64-bits, even though we ask for all allocations to be below 4GB for this tag. Use it to set both parts of the CHAIN64 address rather than baking the 4GB assumption. Add asserts around the allocation to detect and BUSDMA bugs in allocation. Remove asserts and associated comment in mpi_pre_fw_download and mpi_pre_fw_upload. The code does not, it seems, depend on this invariant. The mpr driver has similar code, no asserts and also doesn't depend on this. Adjust comments to reflect the updated size. Sponsored by: Netflix Reviewed by: scottl, mav Differential Revision: https://reviews.freebsd.org/D34016	2022-02-02 22:35:33 -07:00
Jason A. Harmening	83d61d5b73	unionfs: do not force LK_NOWAIT if VI_OWEINACT is set I see no apparent need to avoid waiting on the lock just because vinactive() may be called on another thread while the thread that cleared the vnode refcount has the lock dropped. In fact, this can at least lead to a panic of the form "vn_lock: error <errno> incompatible with flags" if LK_RETRY was passed to VOP_LOCK(). In this case LK_NOWAIT may cause the underlying FS to return an error which is incompatible with LK_RETRY. Reported by: pho Reviewed by: kib, markj, pho Differential Revision: https://reviews.freebsd.org/D34109	2022-02-02 21:08:17 -06:00
Jason A. Harmening	6ff167aa42	unionfs: allow lock recursion when reclaiming the root vnode The unionfs root vnode will always share a lock with its lower vnode. If unionfs was mounted with the 'below' option, this will also be the vnode covered by the unionfs mount. During unmount, the covered vnode will be locked by dounmount() while the unionfs root vnode will be locked by vgone(). This effectively requires recursion on the same underlying lock, albeit through two different vnodes. Reported by: pho Reviewed by: kib, markj, pho Differential Revision: https://reviews.freebsd.org/D34109	2022-02-02 21:08:17 -06:00
Jason A. Harmening	0cd8f3e958	unionfs: fix assertion order in unionfs_lock() VOP_LOCK() may be handed a vnode that is concurrently reclaimed. unionfs_lock() accounts for this by checking for empty vnode private data under the interlock. But it incorrectly asserts that the vnode is using the unionfs dispatch table before making this check. Reverse the order, and also update KASSERT_UNIONFS_VNODE() to provide more useful information. Reported by: pho Reviewed by: kib, markj, pho Differential Revision: https://reviews.freebsd.org/D34109	2022-02-02 21:08:17 -06:00
Rick Macklem	e2fe58d61b	nfsd: Allow file owners to perform Open(Delegate_cur) Commit `b0b7d978b6` changed the NFSv4 server's default behaviour to check the file's mode or ACL for permission to open the file, to be Linux and Solaris compatible. However, it turns out that Linux makes an exception for the case of Claim_delegate_cur(_fh). When a NFSv4 client is returning a delegation, it must acquire Opens against the server to replace the ones done locally in the client. The client does this via an Open operation with Claim_delegate_cur(_fh). If this operation fails, due to a change to the file's mode or ACL after the delegation was issued, the client does not have any way to retain the open. As such, the Linux client allows the file's owner to perform an Open with Claim_delegate_cur(_fh) no matter what the mode or ACL allows. This patch makes the FreeBSD server allow this case, to be Linux compatible. This patch only affects the case where delegations are enabled, which is not the default. MFC after: 2 weeks	2022-02-02 14:10:16 -08:00
John Baldwin	63b7c2df8e	Disable -Wunused-function for {ed,x}25519_ref10.c in libsodium.	2022-02-02 12:25:16 -08:00
Konstantin Belousov	21a37c3cc6	Exclude DEBUG_VFS_LOCKS from non-debug kernel configs Sponsored by: The FreeBSD Foundation MFC after: 1 week	2022-02-02 19:27:32 +02:00
Hans Petter Selasky	a88e1a04df	usb(4): Ignore port resume failures. If port resume fails, likely the USB device is detached. Ignore such errors, because else the USB stack might try forever trying to resume the device, before it will proceed detaching it. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-02 13:00:48 +01:00
Konstantin Belousov	e7c5442162	amd64: micro-optimize vptopte()/vtopde() further Eliminate shlq $3,address shift after masking of the va is done, which is needed to convert pt_entry_t[] array index into byte offset. Do it by preshifting the mask, and compensating the right shift of va. Suggested by: alc Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33786	2022-02-02 11:40:04 +02:00
Konstantin Belousov	0b8643eaf6	vmmeter(): Fix detection of the named swap objects Noted and reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33549	2022-02-02 11:39:58 +02:00
Konstantin Belousov	4cf9f5d807	vm_object: restore handling of shadow_count for all type of objects instead of only OBJ_ANON objects that are backing, as it is now. This is required for e.g. vm_meter is_object_active() detection, and should be useful in some more cases. Use refcount KPI for all objects, regardless of owning the object lock, and the fact that currently OBJ_ANON cannot change for the live object. Noted and reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D33549	2022-02-02 11:39:51 +02:00
Wojciech Macek	77223d98b6	ip_mroute: refactor epoch-based locking Remove duplicated epoch_enter and epoch_exit in IP inp/outp routines. Remove unnecessary macros as well. Obtained from: Semihalf Sponsored by: Stormshield Reviewed by: glebius Differential revision: https://reviews.freebsd.org/D34030	2022-02-02 06:48:05 +01:00
Cy Schubert	445ecc480c	ipfilter: Correct a typo in a comment MFC after: 3 days	2022-02-01 19:55:56 -08:00
Mitchell Horne	4e1bc961bb	arm64, riscv: handle RB_KDB This allows entering the debugger at the earliest possible time, if the '-d' argument is passed to the kernel. Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D34120	2022-02-01 13:59:54 -04:00
Mitchell Horne	e6ee2b6506	riscv: add ALT_BREAK_TO_DEBUGGER to GENERIC It allows quickly entering ddb(4) over a serial line. Reviewed by: jhb MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D34119	2022-02-01 13:59:54 -04:00
Richard Scheffenegger	93e28d6e89	tcp: LRO code to deal with all 12 TCP header flags TCP per RFC793 has 4 reserved flag bits for future use. One of those bits may be used for Accurate ECN. This patch is to include these bits in the LRO code to ease the extensibility if/when these bits are used. Reviewed By: hselasky, rrs, #transport Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D34127	2022-02-01 18:41:36 +01:00
John Baldwin	8a67a1a964	<sys/bitstring.h>: Cast _BITSTR_BITS to int in a ternary operator. This fixes a -Wsign-compare error reported by GCC due to the two results of the ternary operator having differing signedness. Reviewed by: dougm, rlibby Differential Revision: https://reviews.freebsd.org/D34122	2022-02-01 09:45:11 -08:00
Kristof Provost	4daa31c108	pflog: align header to 4 bytes, not 8 `6d4baa0d01` incorrectly rounded the length of the pflog header up to 8 bytes, rather than 4. PR: 261566 Reported by: Guy Harris <gharris@sonic.net> MFC after: 1 week Sponsored by: Rubicon Communications, LLC ("Netgate")	2022-02-01 18:17:44 +01:00
Hans Petter Selasky	84d7b8e75f	mlx5en: Implement TLS RX support. TLS RX support is modeled after TLS TX support. The basic structures and layouts are almost identical, except that the send tag created filters RX traffic and not TX traffic. The TLS RX tag keeps track of past TLS records up to a certain limit, approximately 1 Gbyte of TCP data. TLS records of same length are joined into a single database record. Regularly the HW is queried for TLS RX progress information. The TCP sequence number gotten from the HW is then matched against the database of TLS TCP sequence number records and lengths. If a match is found a static params WQE is queued on the IQ and the hardware should immediately resume decrypting TLS data until the next non-sequential TCP packet arrives. Offloading TLS RX data is supported for untagged, prio-tagged, and regular VLAN traffic. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:17 +01:00
Hans Petter Selasky	e6d7ac1d03	mlx5core: Set driver version into firmware. If the driver_version capability bit is enabled, send the driver version to firmware after the init HCA command, for display purposes. Example of driver version: "FreeBSD,mlx5_core,14.0.0,3.x-xxx" Linux commits: 012e50e109fd27ff989492ad74c50ca7ab21e6a1 MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:17 +01:00
Hans Petter Selasky	8e332232a5	mlx5en: Implement one RQT object per channel. These objects will eventually be used to switch TLS RX traffic. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:17 +01:00
Hans Petter Selasky	ea00d7e8ca	mlx5: Add raw ethernet local loopback support. Currently, unicast/multicast loopback raw ethernet (non-RDMA) packets are sent back to the vport. A unicast loopback packet is the packet with destination MAC address the same as the source MAC address. For multicast, the destination MAC address is in the vport's multicast filter list. Moreover, the local loopback is not needed if there is one or none user space context. After this patch, the raw ethernet unicast and multicast local loopback are disabled by default. When there is more than one user space context, the local loopback is enabled. Note that when local loopback is disabled, raw ethernet packets are not looped back to the vport and are forwarded to the next routing level (eswitch, or multihost switch, or out to the wire depending on the configuration). Linux commits: c85023e153e3824661d07307138fdeff41f6d86a 8978cc921fc7fad3f4d6f91f1da01352aeeeff25 MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:16 +01:00
Hans Petter Selasky	c1b76119cb	mlx5: Implement mlx5_nic_vport_update_local_lb() MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:16 +01:00
Hans Petter Selasky	5381f93647	mlx5en: Create TIRs before flowtables. Because flowtables may redirect traffic to TIRs. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:16 +01:00
Hans Petter Selasky	001106f807	mlx5en: Create flowtables in correct order. Because it affects how the flow tables may re-direct traffic. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:16 +01:00
Hans Petter Selasky	2c0ade806a	mlx5: Implement flow steering helper functions for TCP sockets. This change adds convenience functions to setup a flow steering rule based on a TCP socket. The helper function gets all the address information from the socket and returns a steering rule, to be used with HW TLS RX offload. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:16 +01:00
Hans Petter Selasky	0ee1b09eaa	mlx5: Implement offloads flowtable namespace. This namespace will be used for TCP offloads, like hardware decryption of TLS TCP data. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:16 +01:00
Hans Petter Selasky	e059c120b4	mlx5en: Create and destroy all flow tables and rules when the network interface attaches and detaches. Previously flow steering tables and rules were only created and destroyed at link up and down events, respectively. Due to new requirements for adding TLS RX flow tables and rules, the main flow steering table must always be available as there are permanent redirections from the TLS RX flow table to the vlan flow table. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:16 +01:00
Hans Petter Selasky	a8e715d21b	mlx5en: Add race protection for SQ remap Add a refcount for posted WQEs to avoid a race between post WQE and FW command flows. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:16 +01:00
Hans Petter Selasky	aabca1034c	mlx5en: Properly account for no-checksum on tunneled packets. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:15 +01:00
Hans Petter Selasky	06c2bd1872	mlx5en: Force all packets through the indirection table. All packets must go through the indirection table, RQT, because it is not possible to modify the RQN of the TIR for direct dispatchment after it is created, typically when the link goes up and down. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:15 +01:00
Hans Petter Selasky	266c81aae3	mlx5/mlx5en: Add SQ remap support Add support to map an SQ to a specific schedule queue using a special WQE as performance enhancement. SQ remap operation is handled by a privileged internal queue, IQ, and the mapping is enabled from one rate to another. The transition from paced to non-paced should however always go through FW. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:15 +01:00
Hans Petter Selasky	1c407d0494	mlx5: Properly define the reg_umr_sq networking offload capability bit. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:15 +01:00
Hans Petter Selasky	9680b1ba71	mlx5en: Only delete installed VxLAN rules. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:15 +01:00
Hans Petter Selasky	6176a5e338	mlx5en: Fix inverted logical assignment. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:15 +01:00
Hans Petter Selasky	694263572f	mlx5en: Implement support for internal queues, IQ. Internal send queues are regular sendqueues which are reserved for WQE commands towards the hardware and firmware. These queues typically carry resync information for ongoing TLS RX connections and when changing schedule queues for rate limited connections. The internal queue, IQ, code is more or less a stripped down copy of the existing SQ managing code with exception of: 1) An optional single segment memory buffer which can be read or written as a whole by the hardware, may be provided. 2) An optional completion callback for all transmit operations, may be provided. 3) Does not support mbufs. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:15 +01:00
Hans Petter Selasky	21228c67ab	mlx5en: Implement helper functions to open and close TLS TIR context. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:15 +01:00
Hans Petter Selasky	75767cb889	mlx5en: Share DEK objects with TLS RX. The TLS RX support also needs to be able to allocate DEK objects. Share the available objects 1:1. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:14 +01:00
Hans Petter Selasky	fad4b7d1f2	mlx5en: Add missing TLS structure prototype. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:14 +01:00
Hans Petter Selasky	3a1bf85503	mlx5en: Remove unused hardware TLS field. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:14 +01:00
Hans Petter Selasky	33a6a7a72a	mlx5en: Make the receive packet indirection table, RQT, static instead of dynamic. Allocate the RQT once, pointing all initial entries to the drop RQN. When opening the channels simply modify the RQT, directing all traffic to the new RQNs. Similarly when closing the channels point all RQT entries back to the so-called drop RQN. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:14 +01:00
Hans Petter Selasky	7800af352a	mlx5en: Set CQN in RQ parameters for drop RQ. Else creating the drop RQ fails. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:14 +01:00
Hans Petter Selasky	03567b0dfa	mlx5en: Set channel pointer for drop receive queue. A valid channel pointer is needed to get the priv pointer during init. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:14 +01:00
Hans Petter Selasky	4e40e984da	mlx5en: Print error code when opening drop RQ fails. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:14 +01:00
Hans Petter Selasky	27b778ae55	mlx5en: Implement dummy receive queue, RQ, for dropping packets. What is a drop RQ and why is it needed? The RSS indirection table, also called the RQT, selects the destination RQ based on the receive queue number, RQN. The RQT is frequently referred to by flow steering rules to distribute traffic among multiple RQs. The problem is that the RQs cannot be destroyed before the RQT referring them is destroyed too. Further, TLS RX rules may still be referring to the RQT even if the link went down. Because there is no magic RQN for dropping packets, we create a dummy RQ, also called drop RQ, whose sole purpose is to drop all received packets. When the link goes down this RQN is filled in all RQT entries, of the main RQT, so the real RQs which are about to be destroyed can be released and the TLS RX rules can be sustained. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:14 +01:00
Hans Petter Selasky	a60f953424	mlx5en: Make the hw_lro parameter read only tunable. This prevents the so-called TIR context from changing during runtime. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:14 +01:00
Hans Petter Selasky	788e9e7478	mlx5: Remove support for FreeBSD 10 and older. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:13 +01:00
Hans Petter Selasky	2d5e5a0d75	mlx5en: Patch to inhibit transmit doorbell writes during packet reception. During packet reception the network stack frequently transmits data in response to TCP window updates. To reduce the number of transmit doorbells needed, inhibit all transmit doorbells designated for the same channel until after the reception of packets for the given channel is completed. While at it slightly refactor the mlx5e_tx_notify_hw() function: 1) The doorbell information is always stored into sq->doorbell.d64 . No need to pass a separate pointer to this variable. 2) Move checks for skipping doorbell writes inside this function. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 16:21:13 +01:00
Konstantin Belousov	0f7b6e11c0	mlx5en: Use a UMA cache zone for managing TLS send tags Instead of allocating directly from a normal zone. This way import and release are guaranteed to process all allocated and then deallocated items. Also, the release occurs in a sleepable context when caller of uma_zfree() or uma_zdestroy() can sleep itself. MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 14:45:58 +02:00
Konstantin Belousov	028130b8e4	mlx5ib: idiomatic use of preprocessor, in particular paths MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 14:45:58 +02:00
Konstantin Belousov	7060097908	mlx5ib: normalize use of the opt_*.h files MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 14:45:57 +02:00
Konstantin Belousov	89918a2375	mlx5en: idiomatic use of preprocessor, in particular paths MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 14:45:57 +02:00
Konstantin Belousov	b984b95693	mlx5en: normalize use of the opt_*.h files MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 14:45:57 +02:00
Hans Petter Selasky	12c56d7dc4	mlx5: idiomatic use of preprocessor, in particular paths MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 14:45:57 +02:00
Konstantin Belousov	ee9d634bd3	mlx5: normalize use of the opt_*.h files MFC after: 1 week Sponsored by: NVIDIA Networking	2022-02-01 14:45:57 +02:00
Konstantin Belousov	303d3ae7e8	ufs, msdosfs: do not record witness order when creating vnode When allocating new vnode, we need to lock it exclusively before making it externally visible. Since other threads cannot observe the vnode yet, current lock order cannot create LoR conditions. Reviewed by: mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34126	2022-02-01 10:51:55 +02:00
Konstantin Belousov	d51b0786a2	msdosfs_denode.c: some style Reviewed by: mckusick Sponsored by: The FreeBSD Foundation MFC after: 3 days Differential revision: https://reviews.freebsd.org/D34126	2022-02-01 10:51:48 +02:00
Konstantin Belousov	99aa3b731c	ffs: lock buffers after snaplk with LK_NOWITNESS Reviewed by: mckusick Discussed with: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34073	2022-02-01 06:54:50 +02:00
Konstantin Belousov	c02780b78c	Add GB_NOWITNESS flag It prevents WITNESS from recording the lock order for the buffer lock acquired by getblkx(). Reviewed by: mckusick Discussed with: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34073	2022-02-01 06:54:50 +02:00
Konstantin Belousov	e11b2b69c5	ffs_alloc.c: order includes alphabetically Reviewed by: mckusick Discussed with: markj Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34073	2022-02-01 06:54:50 +02:00
Konstantin Belousov	d950c5898a	vm/vm_extern.h, vm/vm_page.h: use sys/kassert.h instead of fatty sys/systm.h. Suggested by: jhb Reviewed by: alc, imp, jhb (previous version) Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34089	2022-02-01 05:55:35 +02:00
Konstantin Belousov	f4cdb9d7c3	vm/vm_pager.h: use sys/systm.h header it is needed for __read_mostly attribute definition, which right now comes from vm/vm_page.h including sys/systm.h Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34089	2022-02-01 05:55:35 +02:00
Konstantin Belousov	54d34bfbdf	Introduce sys/kassert.h It contains assert-related definitions previously provided by sys/systm.h. The new header is leaner than whole systm.h. Include kassert.h from systm.h for compatibility. The copyright assignment to Eivind Eklund was suggested by Kirk McKusick and is based in the commit `5526d2d920`. Suggested by: jhb Reviewed by: alc, imp, jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34089	2022-02-01 05:14:14 +02:00
John Baldwin	53e938e408	hyperv storvsc: Don't abuse struct sglist to hold virtual addresses. struct sglist is intended for holding S/G lists of physical address ranges, not virtual address ranges. GCC 9.x issues several warnings due to casts between pointers and integers of different sizes as a result (vm_paddr_t is 64-bits on i386). Instead, add a local 'struct hv_sglist' which uses an array of 'struct iovec' to hold the S/G list of virtual address ranges. Differential Revision: https://reviews.freebsd.org/D31933	2022-01-31 17:11:27 -08:00
John Baldwin	d782385e9b	tcp_ratelimit: Handle some edge cases with TLS + RL send tags. - After a connection has fallen back from NIC TLS to SW TLS, any pacing rate changes should modify the inpcb send tag even though SB_TLS_IFNET is set. - If a connection tries to modify the pacing rate before the send tag has been converted from plain TLS to TLS + RL, don't fail the rate request set but let it fall through to setting the rate on the non-TLS inpcb RL tag. Reviewed by: gallatin, rrs, hselasky Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D34085	2022-01-31 16:40:04 -08:00
John Baldwin	d958bc7963	ktls: Try to enable TOE TLS after marking existing data not ready. At the moment this is mostly a no-op but in the future there will be in-flight encrypted data which requires software decryption. This same setup is also needed for NIC TLS RX. Note that this does break TOE TLS RX for AES-CBC ciphers since there is no software fallback for AES-CBC receive. This will be resolved one way or another before 14.0 is released. Reviewed by: hselasky Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D34082	2022-01-31 16:39:21 -08:00
Mark Johnston	773e3a71b2	pf: Initialize pf_kpool mutexes earlier There are some error paths in ioctl handlers that will call pf_krule_free() before the rule's rpool.mtx field is initialized, causing a panic with INVARIANTS enabled. Fix the problem by introducing pf_krule_alloc() and initializing the mutex there. This does mean that the rule->krule and pool->kpool conversion functions need to stop zeroing the input structure, but I don't see a nicer way to handle this except perhaps by guarding the mtx_destroy() with a mtx_initialized() check. Constify some related functions while here and add a regression test based on a syzkaller reproducer. Reported by: syzbot+77cd12872691d219c158@syzkaller.appspotmail.com Reviewed by: kp MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D34115	2022-01-31 16:14:00 -05:00
Konstantin Belousov	66c5fbca77	insmntque1(): remove useless arguments Also remove once-used functions to clean up after failed insmntque1(), which were destructor callbacks in previous life. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D34071	2022-01-31 16:49:08 +02:00
Kornel Duleba	1a6d987b7f	enetc: Wait for pending transmissions before disabling TX queues According to the RM it's not safe to disable a TX ring while it is busy transmitting frames. In order to be safe wait until the ring is empty. (cidx==pidx) Use this opportunity to remove a set-but-unused variable. Obtained from: Semihalf Sponsored by: Alstom Group	2022-01-31 08:57:48 +01:00
Kornel Duleba	a6bda3e1ef	enetc: Simplify TX ring credits counting logic According to the RM rings can hold at most ring_size - 1 descriptors at any time. No additional logic is needed since iflib already respects this constraint. Thanks to that the pidx == cidx situation is not ambiguous and indicates an empty ring. Use that to simplify the logic that calculates the amount of processed frames. Obtained from: Semihalf Sponsored by: Alstom Group	2022-01-31 08:57:48 +01:00
Kornel Duleba	f485d733e8	enetc: Disable HW IP packet alignment The NIC can IP align received packets. It was observed that it caused some rare stalls, that required full board reset. Disable this feature for now. It doesn't provide any significant performance improvement anyway. Obtained from: Semihalf Sponsored by: Alstom Group	2022-01-31 08:57:48 +01:00
Konstantin Belousov	8d8589b385	ufs: be more persistent with finishing some operations when the vnode is doomed after relock. The mere fact that the vnode is doomed does not prevent us from doing UFS operations on it while it still belongs to UFS, which is determined by non-NULL v_data. Not finishing some operations, e.g. not syncing the inode block only because the vnode started reclamation, is not correct. Add macro IS_UFS() which encapsulates the v_data != NULL, and use it instead of VN_IS_DOOMED() for places where the operation completion is important. Reviewed by: markj, mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34072	2022-01-31 04:46:21 +02:00
Konstantin Belousov	4559700a0a	ffs_snapblkfree(): add a comment explaining lockmgr invocation Reviewed by: markj, mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34072	2022-01-31 04:46:21 +02:00
Konstantin Belousov	0cdc603308	ufs: Use IS_SNAPSHOT() Reviewed by: markj, mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34072	2022-01-31 04:46:21 +02:00
Konstantin Belousov	3d68c4e175	syncer VOP_FSYNC(): unlock syncer vnode around call to VFS_SYNC() The lock is unnecessary since the mount point is busied, which prevents unmount and syncer vnode deallocation. Having the vnode locked causes innocent LoRs and complicates debugging. Also stop starting write accounting around it. Any caller of VOP_FSYNC() must do it already, and sync_vnode() does. Reported and tested by: pho Reviewed by: markj, mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D34072	2022-01-31 04:46:21 +02:00


141717 Commits