freebsd-nq

Author	SHA1	Message	Date
Mateusz Guzik	9d5a594f0b	ufs: add support for lockless lookup ACLs are not supported, meaning their presence will force the use of the old lookup. Reviewed by: kib Tested by: pho (in a patchset) Differential Revision: https://reviews.freebsd.org/D25579	2020-07-25 10:38:05 +00:00
Mateusz Guzik	c42b77e694	vfs: lockless lookup Provides full scalability as long as all visited filesystems support the lookup and terminal vnodes are different. Inner workings are explained in the comment above cache_fplookup. Capabilities and fd-relative lookups are not supported and will result in immediate fallback to regular code. Symlinks, ".." in the path, mount points without support for lockless lookup and mismatched counters will result in an attempt to get a reference to the directory vnode and continue in regular lookup. If this fails, the entire operation is aborted and regular lookup starts from scratch. However, care is taken that data is not copied again from userspace. Sample benchmark: incremental -j 104 bzImage on tmpfs: before: 142.96s user 1025.63s system 4924% cpu 23.731 total after: 147.36s user 313.40s system 3216% cpu 14.326 total Sample microbenchmark: access calls to separate files in /tmpfs, 104 workers, ops/s: before: 2165816 after: 151216530 Reviewed by: kib Tested by: pho (in a patchset) Differential Revision: https://reviews.freebsd.org/D25578	2020-07-25 10:37:15 +00:00
Mateusz Guzik	07d2145a17	vfs: add the infrastructure for lockless lookup Reviewed by: kib Tested by: pho (in a patchset) Differential Revision: https://reviews.freebsd.org/D25577	2020-07-25 10:32:45 +00:00
Mateusz Guzik	0379ff6ae3	vfs: introduce vnode sequence counters Modified on each permission change and link/unlink. Reviewed by: kib Tested by: pho (in a patchset) Differential Revision: https://reviews.freebsd.org/D25573	2020-07-25 10:31:52 +00:00
Mateusz Guzik	82dc812235	seqc: add a sleepable variant and convert some routines to macros This temporarily duplicates some code. Macro conversion convinces clang to carry predicts into consumers.	2020-07-25 10:29:48 +00:00
Ruslan Bukin	62ad310c93	Split-out the Intel GAS (Guest Address Space) management component from Intel DMAR support, so it can be used on other IOMMU systems. Reviewed by: kib Sponsored by: DARPA/AFRL Differential Revision: https://reviews.freebsd.org/D25743	2020-07-25 09:28:38 +00:00
Mateusz Guzik	d53582388b	Remove duplicated content from _eventhandler.h	2020-07-25 07:48:20 +00:00
Mateusz Guzik	109b537cd7	Remove leftover macros for long gone vmsize mtx	2020-07-25 07:45:44 +00:00
Mateusz Guzik	d1385ab26e	Guard sbcompress_ktls_rx with KERN_TLS Fixes a compilation warning after r363464	2020-07-25 07:15:23 +00:00
Mateusz Guzik	bf71b96c69	Do a lockless check in kthread_suspend_check Otherwise an idle system running lockstat sleep 10 reports contention on process lock coming from bufdaemon. While here fix a style nit.	2020-07-25 07:14:33 +00:00
Michal Meloun	d873a521ca	Revert r363123. As Emanuel pointed out to me, Linux processes these clock assignments in forward order, not in reverse. I misread the original code. The problem with the wrong order of assigned clocks found in tegra (and some imx) DT should be reanalyzed and solved in a different way. MFC with: r363123 Reported by: manu	2020-07-25 06:32:23 +00:00
Rick Macklem	cfaafa7908	Add support for ext_pgs mbufs to nfsm_uiombuflist() and nfsm_split(). This patch uses a slightly different algorithm for nfsm_uiombuflist() for the non-ext_pgs case, where a variable called "mcp" is maintained, pointing to the current location that mbuf data can be filled into. This avoids use of mtod(mp, char *) + mp->m_len to calculate the location, since this does not work for ext_pgs mbufs and I think it makes the algorithm more readable. This change should not result in semantic changes for the non-ext_pgs case. The patch also deletes some unneeded code. It also adds support for anonymous page ext_pgs mbufs to nfsm_split(). This is another in the series of commits that add support to the NFS client and server for building RPC messages in ext_pgs mbufs with anonymous pages. This is useful so that the entire mbuf list does not need to be copied before calling sosend() when NFS over TLS is enabled. At this time for this case, use of ext_pgs mbufs cannot be enabled, since ktls_encrypt() replaces the unencrypted data with encrypted data in place. Until such time as this can be enabled, there should be no semantic change. Also, note that this code is only used by the NFS client for a mirrored pNFS server.	2020-07-24 23:17:09 +00:00
Navdeep Parhar	a2e160c5af	cxgbe(4): Some updates to the common code. Obtained from: Chelsio Communications MFC after: 1 week Sponsored by: Chelsio Communications	2020-07-24 23:15:42 +00:00
Ilya Bakulin	badc50c270	Make it possible to get/set MMC frequency from camcontrol Enhance camcontrol(8) so that it's possible to manually set frequency for SD/MMC cards. While here, display more information about the current controller, such as supported operating modes and VCCQ voltages, as well as current VCCQ voltage. Reviewed by: manu Approved by: imp (mentor) Differential Revision: https://reviews.freebsd.org/D25795	2020-07-24 21:14:59 +00:00
Alexander Motin	9977c593a7	Introduce ipi_self_from_nmi(). It allows safe IPI sending to current CPU from NMI context. Unlike other ipi_*() functions this waits for delivery to leave LAPIC in a state safe for interrupted code. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2020-07-24 20:52:09 +00:00
Alexander Motin	279cd05b7e	Use APIC_IPI_DEST_OTHERS for bitmapped IPIs too. It should save a bunch of LAPIC register accesses. MFC after: 2 weeks	2020-07-24 20:44:50 +00:00
Alexander Motin	23ce462092	Make lapic_ipi_vectored(APIC_IPI_DEST_SELF) NMI safe. Sending an IPI to self or all CPUs does not require a write into the upper part of the ICR, which is prone to races. Previously the code disabled interrupts, but that was not enough for NMIs. Instead, when possible write only the lower part of the register, or use the special SELF IPI register in x2APIC mode. This also removes ICR reads used to preserve reserved bits on write. They were there from the beginning, but I failed to find an explanation why, nor do I see Linux doing it. The specification even says that ICR content may be lost in deep C-states, so if hardware does not bother to preserve it, why should we? MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2020-07-24 19:54:15 +00:00
Emmanuel Vadot	41c653be98	dwmmc: Add MMCCAM part Add support for MMCCAM for dwmmc Submitted by: kibab Tested On: Rock64, RockPro64	2020-07-24 19:52:52 +00:00
Emmanuel Vadot	a6d9c9257c	mmccam: aw_mmc: Only print the new ios value under bootverbose	2020-07-24 18:44:50 +00:00
Emmanuel Vadot	f1ed7b6563	mmccam: Make non bootverbose more readable Remove some debug printfs. Convert some to CAM_DEBUG Only print some when bootverbose is set.	2020-07-24 18:43:46 +00:00
Conrad Meyer	81dc6c2c61	Use gbincore_unlocked for unprotected incore() Reviewed by: markj Sponsored by: Isilon Differential Revision: https://reviews.freebsd.org/D25790	2020-07-24 17:34:44 +00:00
Conrad Meyer	68ee1dda06	Add unlocked/SMR fast path to getblk() Convert the bufobj tries to an SMR zone/PCTRIE and add a gbincore_unlocked() API wrapping this functionality. Use it for a fast path in getblkx(), falling back to locked lookup if we raced a thread changing the buf's identity. Reported by: Attilio Reviewed by: kib, markj Testing: pho (in progress) Sponsored by: Isilon Differential Revision: https://reviews.freebsd.org/D25782	2020-07-24 17:34:04 +00:00
Conrad Meyer	3c30b23519	Use SMR to provide safe unlocked lookup for pctries from SMR zones Adapt r358130, for the almost identical vm_radix, to the pctrie subsystem. Like that change, the tree is kept correct for readers with store barriers and careful ordering. Existing locks serialize writers. Add a PCTRIE_DEFINE_SMR() wrapper that takes an additional smr_t parameter and instantiates a FOO_PCTRIE_LOOKUP_UNLOCKED() function, in addition to the usual definitions created by PCTRIE_DEFINE(). Interface consumers will be introduced in later commits. As future work, it might be nice to add vm_radix algorithms missing from generic pctrie to the pctrie interface, and then adapt vm_radix to use pctrie. Reported by: Attilio Reviewed by: markj Sponsored by: Isilon Differential Revision: https://reviews.freebsd.org/D25781	2020-07-24 17:32:10 +00:00
Mateusz Guzik	138698898f	lockmgr: add missing 'continue' to account for spuriously failed fcmpset PR: 248245 Reported by: gbe Noted by: markj Fixes: r363415 ("lockmgr: add adaptive spinning")	2020-07-24 17:28:24 +00:00
Emmanuel Vadot	bf2868538e	mmccam: Add some aliases for non-mmccam to mmccam transition A new tunable is present, kern.cam.sdda.mmcsd_compat to enable this feature or not (default is enabled)	2020-07-24 17:11:14 +00:00
Juli Mallett	ce219ecd93	Remove reference to nlist(3) missed in SCCS revision 5.26 by mckusick when converting rwhod(8) to using kern.boottime rather than extracting the boot time from kernel memory directly. Reviewed by: imp	2020-07-24 16:58:13 +00:00
Mateusz Piotrowski	d6dade0002	Fix grammar issues and typos Reported by: ian MFC after: 1 week	2020-07-24 15:04:34 +00:00
Mateusz Piotrowski	5ccb7079f8	Document that force_depend() supports only /etc/rc.d scripts Currently, force_depend() from rc.subr(8) does not support depending on scripts outside of /etc/rc.d (like /usr/local/etc/rc.d). The /etc/rc.d path is hard-coded into force_depend(). MFC after: 1 week	2020-07-24 14:17:37 +00:00
Mateusz Guzik	ee74412269	vm: fix swap reservation leak and clean up surrounding code The code did not subtract from the global counter if per-uid reservation failed. Cleanup highlights: - load overcommit once - move per-uid manipulation to dedicated routines - don't fetch wire count if requested size is below the limit - convert return type from int to bool - ifdef the routines with _KERNEL to keep vm.h compilable by userspace Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D25787	2020-07-24 13:23:32 +00:00
Alex Richardson	b798ef6490	Include TMPFS in all the GENERIC kernel configs Being able to use tmpfs without kernel modules is very useful when building small MFS_ROOT kernels without a real file system. Including TMPFS also matches arm/GENERIC and the MIPS std.MALTA configs. Compiling TMPFS only adds 4 .c files so this should not make much of a difference to NO_MODULES build times (as we do for our minimal RISC-V images). Reviewed By: br (earlier version for riscv), brooks, emaste Differential Revision: https://reviews.freebsd.org/D25317	2020-07-24 08:40:04 +00:00
John-Mark Gurney	b6dd8b71d1	fix up docs for m_getjcl as well..	2020-07-24 00:47:14 +00:00
John-Mark Gurney	92b56ebaf7	document that m_get2 only accepts up to MJUMPAGESIZE..	2020-07-24 00:35:21 +00:00
John Baldwin	3c0e568505	Add support for KTLS RX via software decryption. Allow TLS records to be decrypted in the kernel after being received by a NIC. At a high level this is somewhat similar to software KTLS for the transmit path except in reverse. Protocols enqueue mbufs containing encrypted TLS records (or portions of records) into the tail of a socket buffer and the KTLS layer decrypts those records before returning them to userland applications. However, there is an important difference: - In the transmit case, the socket buffer is always a single "record" holding a chain of mbufs. Not-yet-encrypted mbufs are marked not ready (M_NOTREADY) and released to protocols for transmit by marking mbufs ready once their data is encrypted. - In the receive case, incoming (encrypted) data appended to the socket buffer is still a single stream of data from the protocol, but decrypted TLS records are stored as separate records in the socket buffer and read individually via recvmsg(). Initially I tried to make this work by marking incoming mbufs as M_NOTREADY, but there didn't seem to be a non-gross way to deal with picking a portion of the mbuf chain and turning it into a new record in the socket buffer after decrypting the TLS record it contained (along with prepending a control message). Also, such mbufs would need to be "pinned" in some way while they are being decrypted such that a concurrent sbcut() wouldn't free them out from under the thread performing decryption. As such, I settled on the following solution: - Socket buffers now contain an additional chain of mbufs (sb_mtls, sb_mtlstail, and sb_tlscc) containing encrypted mbufs appended by the protocol layer. These mbufs are still marked M_NOTREADY, but soreceive*() generally don't know about them (except that they will block waiting for data to be decrypted for a blocking read).
- Each time a new mbuf is appended to this TLS mbuf chain, the socket buffer peeks at the TLS record header at the head of the chain to determine the encrypted record's length. If enough data is queued for the TLS record, the socket is placed on a per-CPU TLS workqueue (reusing the existing KTLS workqueues and worker threads). - The worker thread loops over the TLS mbuf chain decrypting records until it runs out of data. Each record is detached from the TLS mbuf chain while it is being decrypted to keep the mbufs "pinned". However, a new sb_dtlscc field tracks the character count of the detached record and sbcut()/sbdrop() is updated to account for the detached record. After the record is decrypted, the worker thread first checks to see if sbcut() dropped the record. If so, it is freed (can happen when a socket is closed with pending data). Otherwise, the header and trailer are stripped from the original mbufs, a control message is created holding the decrypted TLS header, and the decrypted TLS record is appended to the "normal" socket buffer chain. (Side note: the SBCHECK() infrastructure was very useful as I was able to add assertions there about the TLS chain that caught several bugs during development.) Tested by: rmacklem (various versions) Relnotes: yes Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D24628	2020-07-23 23:48:18 +00:00
Bryan Drewery	3cee7cb269	Limit gmirror failpoint tests to the test worker This avoids injecting errors into the test system's mirrors. gnop seems like a good solution here but it injects errors at the wrong place vs where these tests expect and does not support a 'max global count' like the failpoints do with 'n*' syntax. Reviewed by: cem, vangyzen Sponsored by: Dell EMC Isilon	2020-07-23 23:29:50 +00:00
John-Mark Gurney	98b765e5c2	update example to make it active when creating a new boot method... Clean up some of the sentences and grammar... make igor happy..	2020-07-23 22:28:35 +00:00
John Baldwin	70d1a4351a	Consolidate duplicated code into a ktls_ocf_dispatch function. This function manages the loop around crypto_dispatch and coordination with ktls_ocf_callback. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D25757	2020-07-23 21:43:06 +00:00
John Baldwin	d7d14db9c5	Set si_trapno to the exception code from esr. Reviewed by: kib Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D25771	2020-07-23 21:40:03 +00:00
John Baldwin	e7aaabe15e	Pass the right size to memcpy() when copying the array of FP registers. The size of the containing structure was passed instead of the size of the array. This happened to be harmless as the extra word copied is one we copy in the next line anyway. Reported by: CHERI (bounds check violation) Reviewed by: brooks, imp Obtained from: CheriBSD MFC after: 1 week Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D25791	2020-07-23 21:33:10 +00:00
John Baldwin	6273c7420d	Set si_addr to badvaddr for TLB faults. Reviewed by: kib Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D25775	2020-07-23 20:08:42 +00:00
Ed Maste	af9de844c4	md5: return non-zero if built-in tests (-x) fail MFC after: 1 week Sponsored by: The FreeBSD Foundation	2020-07-23 20:06:24 +00:00
Michael Tuexen	205f3e1597	Clear the pointer to the socket when closing it also in case of an ungraceful operation. This fixes a use-after-free bug found and reported by Taylor Brandstetter of Google by testing the userland stack. MFC after: 1 week	2020-07-23 19:43:49 +00:00
Ed Maste	e32e868528	modules/crypto: disable optimized assembly skein1024 implementation It is presumably broken in the same way as userland skein1024 (see r363454) PR: 248221	2020-07-23 19:19:33 +00:00
Ed Maste	0d2c19d05b	libmd: temporarily disable optimized assembly skein1024 implementation It is apparently broken when assembled by contemporary GNU as as well as Clang IAS (which is used in the default configuration). PR: 248221 Reported by: pizzamig Sponsored by: The FreeBSD Foundation	2020-07-23 18:55:47 +00:00
Cy Schubert	f0276e8c38	Document the IPFILTER_PREDEFINED environment variable. PR: 248088 Reported by: joeb1@a1poweruser.com MFC after: 1 week	2020-07-23 17:39:49 +00:00
Cy Schubert	795be686d8	Load ipfilter, ipnat, and ippool rules, and start ipmon in a vnet jail. PR: 248109 Reported by: joeb1@a1poweruser.com MFC after: 2 weeks	2020-07-23 17:39:45 +00:00
Mateusz Guzik	c795344ff7	locks: fix a long standing bug for primitives with kdtrace but without spinning In such a case the second argument to lock_delay_arg_init was NULL which was immediately causing a null pointer deref. Since the structure is only used for spin count, provide a dedicated routine initializing it. Reported by: andrew	2020-07-23 17:26:53 +00:00
Doug Moore	e605dcc939	Rank balanced (RB) trees are a class of balanced trees that includes AVL trees, red-black trees, and others. Weak AVL (wavl) trees are a recently discovered member of that class. This change replaces red-black rebalancing with weak AVL rebalancing in the RB tree macros. Wavl trees sit between AVL and red-black trees in terms of how strictly balance is enforced. They have the stricter balance of AVL trees as the tree is built - a wavl tree is an AVL tree until the first deletion. Once removals start, wavl trees are lazier about rebalancing than AVL trees, so that removals can be fast, but the balance of the tree can decay to that of a red-black tree. Subsequent insertions can push balance back toward the stricter AVL conditions. Removing a node from a wavl tree never requires more than two rotations, which is better than either red-black or AVL trees. Inserting a node into a wavl tree never requires more than two rotations, which matches red-black and AVL trees. The only disadvantage of wavl trees to red-black trees is that more insertions are likely to adjust the tree a bit. That's the cost of keeping the tree more balanced. Testing has shown that for the cases where red-black trees do worst, wavl trees' better balance leads to faster lookups, so that if lookups outnumber insertions by a nontrivial amount, lookup time saved exceeds the extra cost of balancing. Reviewed by: alc, gbe, markj Tested by: pho Discussed with: emaste Differential Revision: https://reviews.freebsd.org/D25480	2020-07-23 17:16:20 +00:00
Mark Johnston	7df88b9ddd	rc.firewall: Merge two identical conditions into one. No functional change intended. PR: 247949 Submitted by: Jose Luis Duran <jlduran@gmail.com> MFC after: 1 week	2020-07-23 15:03:28 +00:00
Alexander Motin	81614d236f	Add missing newlines. MFC after: 3 days	2020-07-23 14:33:25 +00:00
Mark Johnston	4cbba6ae24	MFOpenZFS: Fix zpool history unbounded memory usage In the original implementation, zpool history reads the whole history before printing anything, causing memory usage to grow unbounded. We fix this by breaking it into read-print iterations. Reviewed-by: Tom Caputi <tcaputi@datto.com> Reviewed-by: Matt Ahrens <matt@delphix.com> Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chunwei Chen <david.chen@nutanix.com> Closes #9516 Note, this change changes the libzfs.so ABI by modifying the prototype of zpool_get_history(). Since libzfs is effectively private to the base system it is anticipated that this will not be a problem. PR: 247557 Obtained from: OpenZFS Reported and tested by: Sam Vaughan <samjvaughan@gmail.com> Discussed with: freqlabs MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D25745 openzfs/zfs@7125a109dc	2020-07-23 14:21:45 +00:00


251803 Commits
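
Entry 138698898f above mentions a missing 'continue' after a spuriously failed fcmpset. As a rough illustration of that class of bug, the sketch below uses C11 `atomic_compare_exchange_weak_explicit`, which, like fcmpset, may fail spuriously and refreshes the expected value on failure. The lock-word layout and the names `DEMO_WRITE_LOCKED`, `DEMO_ONE_SHARER` and `demo_try_shared` are assumptions made up for this example and do not reflect the actual lockmgr code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lock-word layout: bit 0 = write-locked, rest = sharers. */
#define DEMO_WRITE_LOCKED ((uintptr_t)1)
#define DEMO_ONE_SHARER   ((uintptr_t)2)

/* Try to take a shared hold, spinning at most max_spins times. */
static bool demo_try_shared(_Atomic uintptr_t *lk, int max_spins)
{
    uintptr_t v;
    int spins = 0;

    assert(lk != NULL);
    v = atomic_load_explicit(lk, memory_order_relaxed);
    while (spins < max_spins) {
        if ((v & DEMO_WRITE_LOCKED) == 0) {
            if (atomic_compare_exchange_weak_explicit(lk, &v,
                v + DEMO_ONE_SHARER,
                memory_order_acquire, memory_order_relaxed))
                return true;
            /*
             * The exchange failed, possibly spuriously, and v now
             * holds the current lock word. Re-evaluate it from the
             * top instead of falling through to the spin path with
             * a value that may still be acquirable.
             */
            continue;
        }
        /* Write-locked: burn one spin and reload the lock word. */
        spins++;
        v = atomic_load_explicit(lk, memory_order_relaxed);
    }
    return false;
}
```

Without the `continue`, a failed exchange would be treated the same as a write-locked word, so a lock that is in fact free to share could be handled as if it were contended.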