freebsd-nq

Author	SHA1	Message	Date
Konstantin Belousov	6e0c8e1ae2	Add SOL_LOCAL symbolic constant for unix socket option level. The constant seems to exist on MacOS X >= 10.8. Requested by: swills Reviewed by: allanjude, kevans Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D25933	2020-08-03 22:13:02 +00:00
Warner Losh	e67c55c998	Some functions had the blank lines, others didn't. Most of the ones that didn't were newer, so remove this now-optional blank line everywhere.	2020-08-03 22:12:18 +00:00
Konstantin Belousov	ca9a39acb3	Provide more correct description for sysctl kern.smp.cores. Reported by: dewayne@heuristicsystems.com.au PR: 248454 Sponsored by: The FreeBSD Foundation MFC after: 3 days	2020-08-03 17:17:17 +00:00
Mateusz Guzik	7ad2f1105e	vfs: store precomputed namecache hash in the vnode This significantly speeds up path lookup, Cascade Lake doing access(2) on ufs on /usr/obj/usr/src/amd64.amd64/sys/GENERIC/vnode_if.c, ops/s: before: 2535298 after: 2797621 Over +10%. The reversed order of computation here does not seem to matter for hash distribution. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D25921	2020-08-02 20:02:06 +00:00
Mateusz Guzik	838984de32	vfs: move namecache initialisation into cache_vnode_init	2020-08-02 19:42:06 +00:00
Conrad Meyer	9da903e5d3	Unlocked getblk: Fix new false-positive assertion A free buf's lock may be held (temporarily) due to unlocked lookup, so buf_alloc() must acquire it without LK_NOWAIT. The unlocked getblk path should unlock it promptly once it realizes the identity does not match the buffer it was searching for. Reported by: gallatin Reviewed by: kib Tested by: pho X-MFC-With: r363482 Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D25914	2020-08-02 16:34:27 +00:00
Mateusz Guzik	936c24faba	cred: add more asserts for td_realucred == td_ucred	2020-08-01 16:02:32 +00:00
Mateusz Guzik	8a7ec17095	cache: reshuffle struct cache_fpl and nameidata_saved Shaves 16 bytes.	2020-08-01 06:35:18 +00:00
Mateusz Guzik	5a3944334c	cache: mark climb_mount as __noinline	2020-08-01 06:34:18 +00:00
Mateusz Guzik	85cf316172	vfs: inline NDINIT_ALL The routine takes more than 6 arguments, which on amd64 means some of them have to be passed through the stack.	2020-08-01 06:33:38 +00:00
Mateusz Guzik	14576629bb	vfs: convert ni_rightsneeded to a pointer Shaves 8 bytes of struct nameidata on 64-bit platforms.	2020-08-01 06:33:11 +00:00
Mateusz Guzik	21c162605b	vfs: make rights mandatory for NDINIT_ALL	2020-08-01 06:32:25 +00:00
Conrad Meyer	d6a75d39e9	getblk: Remove a non-sensical LK_NOWAIT | LK_SLEEPFAIL No functional change. LK_SLEEPFAIL implies a behavior that is only possible if the lock operation can sleep. LK_NOWAIT prevents the lock operation from sleeping. Discussed with: kib	2020-07-31 00:13:40 +00:00
Conrad Meyer	59d13f6154	getblk: Avoid sleeping on wrong buf in lockless path If the buffer identity changed during lookup, sleeping could introduce a lock order reversal. Since we do not know if the identity changed until we get the lock, we must try-lock (LK_NOWAIT) only. EINTR and ERESTART error handling becomes irrelevant, as we no longer sleep. Reported by: kib Reviewed by: kib X-MFC-With: r363482 Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D25898	2020-07-31 00:07:01 +00:00
Mateusz Guzik	cb90ef2875	cache: drop the useless numchecks counter	2020-07-30 22:52:18 +00:00
Mateusz Guzik	952759111e	Further depessimize priv_check_cred_vfs_generation	2020-07-30 22:14:04 +00:00
Mateusz Guzik	848f8effdd	vfs: inline vops if there are no pre/post associated calls This removes a level of indirection from frequently used methods, most notably VOP_LOCK1 and VOP_UNLOCK1. Tested by: pho	2020-07-30 15:50:51 +00:00
Mateusz Guzik	2e4f8220e8	vfs: fold poll_no_poll into vop_nopoll The logic was almost completely present in vop_stdpoll anyway.	2020-07-30 15:48:56 +00:00
Mateusz Guzik	b1f910e02c	vfs: short-circuit the common case NDFREE calls Almost all consumers use the NDF_ONLY_PNBUF macro, making them avoidably branch a lot in the NDFREE routine. Also note most of them should not need to call any cleanup anyway as they don't request HASBUF.	2020-07-30 15:47:41 +00:00
Mateusz Guzik	404927357d	vfs: add support for WANTPARENT and LOCKPARENT to lockless lookup This makes the realpath syscall operational with the new lookup. Note that the walk to obtain the full path name still takes locks. Tested by: pho Differential Revision: https://reviews.freebsd.org/D23917	2020-07-30 15:45:11 +00:00
Mateusz Guzik	8230d29357	vfs: support negative entry promotion in lockless lookup Tested by: pho	2020-07-30 15:44:10 +00:00
Mateusz Guzik	4057e3eaaa	vfs: add NOMACCHECK and AUDITVNODE2 to lockless lookup They are both nops since lookup does not progress with either mac or audit enabled. Tested by: pho	2020-07-30 15:43:16 +00:00
Mateusz Guzik	d3e63e8eb2	vfs: make sure startdir_used is always assigned to before use CID: 1431070	2020-07-30 07:11:08 +00:00
Mark Johnston	1b778ba260	Fix a logic error in uipc_ready_scan(). When processing the last record in a socket buffer, take care to avoid a NULL pointer dereference when advancing the record iterator. Reported by: syzbot+6a689cc9c27bd265237a@syzkaller.appspotmail.com Fixes: r359778 MFC after: 1 week Sponsored by: The FreeBSD Foundation	2020-07-30 00:52:37 +00:00
John Baldwin	0f70a1489d	Properly handle a closed TLS socket with pending receive data. If the remote end closes a TLS socket and the socket buffer still contains not-yet-decrypted TLS records but no decrypted TLS records, soreceive needs to block or fail with EWOULDBLOCK. Previously it was trying to return data and dereferencing a NULL pointer. Reviewed by: np Sponsored by: Chelsio Differential Revision: https://reviews.freebsd.org/D25838	2020-07-29 23:24:32 +00:00
Mateusz Guzik	fad6dd772d	vfs: elide MAC-induced locking on rename if there are no relevant hooks	2020-07-29 17:05:31 +00:00
Mateusz Guzik	fd8c6a48ab	vfs: honor error code returned by mac_vnode_check_rename_from MFC after: 3 days	2020-07-29 17:04:33 +00:00
Yoshihiro Takahashi	8f11c99715	- Cleanups related to sparc64 removal. - Remove remains of sparc64 files. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D25831	2020-07-28 10:58:37 +00:00
Kyle Evans	fd35bfaecf	makesyscalls.sh: improve the 'this is going away' message Reported by: Ronald Klop, rgrimes	2020-07-28 01:05:40 +00:00
Kyle Evans	bb97350f28	makesyscalls.sh: spit out a deprecation notice to stderr This has for a while been replaced by makesyscalls.lua in the stock FreeBSD build. Ensure downstreams get some notice that it's going away if they're reliant on it, maybe.	2020-07-27 03:13:23 +00:00
Doug Moore	00fd73d2da	Fix an overflow bug in the blist allocator that needlessly capped max swap size by dividing a value, which was always a multiple of 64, by 64. Remove the code that reduced max swap size down to that cap. Eliminate the distinction between BLIST_BMAP_RADIX and BLIST_META_RADIX. Call them both BLIST_RADIX. Make improvements to the blist self-test code to silence compiler warnings and to test larger blists. Reported by: jmallett Reviewed by: alc Discussed with: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D25736	2020-07-25 18:29:10 +00:00
Mateusz Guzik	e914224af1	fd: put back FILEDESC_SUNLOCK to pwd_hold lost during rebase Reported by: pho	2020-07-25 15:34:29 +00:00
Alexander Motin	aba10e131f	Allow swi_sched() to be called from NMI context. For purposes of handling hardware errors reported via NMIs I need a way to escape NMI context, which is too restrictive to do anything significant. To do so, this change introduces a new swi_sched() flag, SWI_FROMNMI, making it careful about the KPIs it uses. On platforms allowing IPI sending from NMI context (x86 for now) it immediately wakes clk_intr_event via the new IPI_SWI, otherwise it works just like SWI_DELAY. To handle the delayed SWIs this patch calls clk_intr_event on every hardclock() tick. MFC after: 2 weeks Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D25754	2020-07-25 15:19:38 +00:00
Mateusz Guzik	9dbd12fb52	vfs: add support for !LOCKLEAF to lockless lookup Tested by: pho (in a patchset) Differential Revision: https://reviews.freebsd.org/D23916	2020-07-25 10:40:38 +00:00
Mateusz Guzik	c42b77e694	vfs: lockless lookup Provides full scalability as long as all visited filesystems support the lookup and terminal vnodes are different. Inner workings are explained in the comment above cache_fplookup. Capabilities and fd-relative lookups are not supported and will result in immediate fallback to regular code. Symlinks, ".." in the path, mount points without support for lockless lookup and mismatched counters will result in an attempt to get a reference to the directory vnode and continue in regular lookup. If this fails, the entire operation is aborted and regular lookup starts from scratch. However, care is taken that data is not copied again from userspace. Sample benchmark: incremental -j 104 bzImage on tmpfs: before: 142.96s user 1025.63s system 4924% cpu 23.731 total after: 147.36s user 313.40s system 3216% cpu 14.326 total Sample microbenchmark: access calls to separate files in /tmpfs, 104 workers, ops/s: before: 2165816 after: 151216530 Reviewed by: kib Tested by: pho (in a patchset) Differential Revision: https://reviews.freebsd.org/D25578	2020-07-25 10:37:15 +00:00
Mateusz Guzik	07d2145a17	vfs: add the infrastructure for lockless lookup Reviewed by: kib Tested by: pho (in a patchset) Differential Revision: https://reviews.freebsd.org/D25577	2020-07-25 10:32:45 +00:00
Mateusz Guzik	0379ff6ae3	vfs: introduce vnode sequence counters Modified on each permission change and link/unlink. Reviewed by: kib Tested by: pho (in a patchset) Differential Revision: https://reviews.freebsd.org/D25573	2020-07-25 10:31:52 +00:00
Mateusz Guzik	d1385ab26e	Guard sbcompress_ktls_rx with KERN_TLS Fixes a compilation warning after r363464	2020-07-25 07:15:23 +00:00
Mateusz Guzik	bf71b96c69	Do a lockless check in kthread_suspend_check Otherwise an idle system running lockstat sleep 10 reports contention on process lock coming from bufdaemon. While here fix a style nit.	2020-07-25 07:14:33 +00:00
Conrad Meyer	81dc6c2c61	Use gbincore_unlocked for unprotected incore() Reviewed by: markj Sponsored by: Isilon Differential Revision: https://reviews.freebsd.org/D25790	2020-07-24 17:34:44 +00:00
Conrad Meyer	68ee1dda06	Add unlocked/SMR fast path to getblk() Convert the bufobj tries to an SMR zone/PCTRIE and add a gbincore_unlocked() API wrapping this functionality. Use it for a fast path in getblkx(), falling back to locked lookup if we raced a thread changing the buf's identity. Reported by: Attilio Reviewed by: kib, markj Testing: pho (in progress) Sponsored by: Isilon Differential Revision: https://reviews.freebsd.org/D25782	2020-07-24 17:34:04 +00:00
Conrad Meyer	3c30b23519	Use SMR to provide safe unlocked lookup for pctries from SMR zones Adapt r358130, for the almost identical vm_radix, to the pctrie subsystem. Like that change, the tree is kept correct for readers with store barriers and careful ordering. Existing locks serialize writers. Add a PCTRIE_DEFINE_SMR() wrapper that takes an additional smr_t parameter and instantiates a FOO_PCTRIE_LOOKUP_UNLOCKED() function, in addition to the usual definitions created by PCTRIE_DEFINE(). Interface consumers will be introduced in later commits. As future work, it might be nice to add vm_radix algorithms missing from generic pctrie to the pctrie interface, and then adapt vm_radix to use pctrie. Reported by: Attilio Reviewed by: markj Sponsored by: Isilon Differential Revision: https://reviews.freebsd.org/D25781	2020-07-24 17:32:10 +00:00
Mateusz Guzik	138698898f	lockmgr: add missing 'continue' to account for spuriously failed fcmpset PR: 248245 Reported by: gbe Noted by: markj Fixes by: r363415 ("lockmgr: add adaptive spinning")	2020-07-24 17:28:24 +00:00
John Baldwin	3c0e568505	Add support for KTLS RX via software decryption. Allow TLS records to be decrypted in the kernel after being received by a NIC. At a high level this is somewhat similar to software KTLS for the transmit path except in reverse. Protocols enqueue mbufs containing encrypted TLS records (or portions of records) into the tail of a socket buffer and the KTLS layer decrypts those records before returning them to userland applications. However, there is an important difference: - In the transmit case, the socket buffer is always a single "record" holding a chain of mbufs. Not-yet-encrypted mbufs are marked not ready (M_NOTREADY) and released to protocols for transmit by marking mbufs ready once their data is encrypted. - In the receive case, incoming (encrypted) data appended to the socket buffer is still a single stream of data from the protocol, but decrypted TLS records are stored as separate records in the socket buffer and read individually via recvmsg(). Initially I tried to make this work by marking incoming mbufs as M_NOTREADY, but there didn't seem to be a non-gross way to deal with picking a portion of the mbuf chain and turning it into a new record in the socket buffer after decrypting the TLS record it contained (along with prepending a control message). Such mbufs would also need to be "pinned" in some way while they are being decrypted such that a concurrent sbcut() wouldn't free them out from under the thread performing decryption. As such, I settled on the following solution: - Socket buffers now contain an additional chain of mbufs (sb_mtls, sb_mtlstail, and sb_tlscc) containing encrypted mbufs appended by the protocol layer. These mbufs are still marked M_NOTREADY, but soreceive*() generally don't know about them (except that they will block waiting for data to be decrypted for a blocking read). 
- Each time a new mbuf is appended to this TLS mbuf chain, the socket buffer peeks at the TLS record header at the head of the chain to determine the encrypted record's length. If enough data is queued for the TLS record, the socket is placed on a per-CPU TLS workqueue (reusing the existing KTLS workqueues and worker threads). - The worker thread loops over the TLS mbuf chain decrypting records until it runs out of data. Each record is detached from the TLS mbuf chain while it is being decrypted to keep the mbufs "pinned". However, a new sb_dtlscc field tracks the character count of the detached record and sbcut()/sbdrop() is updated to account for the detached record. After the record is decrypted, the worker thread first checks to see if sbcut() dropped the record. If so, it is freed (can happen when a socket is closed with pending data). Otherwise, the header and trailer are stripped from the original mbufs, a control message is created holding the decrypted TLS header, and the decrypted TLS record is appended to the "normal" socket buffer chain. (Side note: the SBCHECK() infrastructure was very useful as I was able to add assertions there about the TLS chain that caught several bugs during development.) Tested by: rmacklem (various versions) Relnotes: yes Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D24628	2020-07-23 23:48:18 +00:00
Mateusz Guzik	c795344ff7	locks: fix a long-standing bug for primitives with kdtrace but without spinning In such a case the second argument to lock_delay_arg_init was NULL which was immediately causing a null pointer deref. Since the structure is only used for spin count, provide a dedicated routine initializing it. Reported by: andrew	2020-07-23 17:26:53 +00:00
Brooks Davis	5a01eca698	Use SI_ORDER_(FOURTH|FIFTH) rather than bespoke versions. No functional change. When these SYSINITs were added these macros didn't exist. Reviewed by: imp Obtained from: CheriBSD MFC after: 1 week Sponsored by: DARPA Differential Revision: https://reviews.freebsd.org/D25758	2020-07-22 23:35:41 +00:00
Mateusz Guzik	31ad4050fe	lockmgr: add adaptive spinning It is very conservative: it spins only when LK_ADAPTIVE is passed, only on an exclusive lock, and never when any waiters are present. The buffer cache remains non-spinning. This reduces total sleep times during buildworld etc., but it does not shorten total real time (culprits are contention in the vm subsystem along with slock + upgrade which is not covered). For microbenchmarks: open3_processes -t 52 (open/close of the same file for writing) ops/s: before: 258845 after: 801638 Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D25753	2020-07-22 12:30:31 +00:00
Mitchell Horne	dc42509049	INTRNG: only shuffle for !EARLY_AP_STARTUP During device attachment, all interrupt sources will bind to the BSP, as it is the only processor online. This means interrupts must be redistributed ("shuffled") later, during SI_SUB_SMP. For the EARLY_AP_STARTUP case, this is no longer true. SI_SUB_SMP will execute much earlier, meaning APs will be online and available before devices begin attachment, and there will therefore be nothing to shuffle. All PIC-conforming interrupt controllers will handle this early distribution properly, except for RISC-V's PLIC. Make the necessary tweak to the PLIC driver. While here, convert irq_assign_cpu from a boolean_t to a bool. Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D25693	2020-07-21 22:47:02 +00:00
Mateusz Guzik	4aff9f5d99	lockmgr: denote recursion with a bit in lock value This reduces excessive reads from the lock. Tested by: pho	2020-07-21 14:42:22 +00:00
Mateusz Guzik	f6b091fbbd	lockmgr: rewrite upgrade to stop always dropping the lock This matches rw and sx locks.	2020-07-21 14:41:25 +00:00
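
The "vnode sequence counters" commit (0379ff6ae3) and the lockless lookup commit (c42b77e694) in the log above describe counters that are modified on each permission change and link/unlink, with "mismatched counters" sending the lookup back to the regular path. The following is a minimal userland sketch of that general pattern, not the kernel's code: the struct, the function names and the `mode` field are hypothetical, and it assumes writers are already serialized by some lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object guarded by a sequence counter. */
struct seq_obj {
	atomic_uint seq;	/* odd while a modification is in progress */
	atomic_int mode;	/* stand-in for the protected state */
};

/* Writer side; assumes writers are serialized by an external lock. */
static void
seq_write(struct seq_obj *o, int new_mode)
{
	atomic_fetch_add(&o->seq, 1);	/* odd: modification in progress */
	atomic_store(&o->mode, new_mode);
	atomic_fetch_add(&o->seq, 1);	/* even again: modification done */
}

/* Snapshot the counter; returns false if a writer is mid-update. */
static bool
seq_read_begin(struct seq_obj *o, unsigned *snap)
{
	*snap = atomic_load(&o->seq);
	return ((*snap & 1) == 0);
}

/* True if nothing was modified since the snapshot was taken. */
static bool
seq_read_consistent(struct seq_obj *o, unsigned snap)
{
	return (atomic_load(&o->seq) == snap);
}

/*
 * Lockless read attempt. A false return means the counters did not match
 * and the caller should fall back to a locked (regular) path.
 */
static bool
seq_read_mode(struct seq_obj *o, int *mode)
{
	unsigned snap;

	if (!seq_read_begin(o, &snap))
		return (false);
	*mode = atomic_load(&o->mode);
	return (seq_read_consistent(o, snap));
}
```

The sketch uses sequentially consistent atomics throughout for simplicity; a production implementation would likely use weaker, carefully chosen barriers.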


17682 Commits
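
The KTLS RX commit in the log above (3c0e568505) says the socket buffer peeks at the TLS record header at the head of the chain to determine the encrypted record's length, and queues the socket for decryption once enough data is present. A rough sketch of that check on a flat byte buffer follows. The helper names are hypothetical and this is not the kernel's mbuf-based code; the only fact relied on is the standard 5-byte TLS record header (1-byte type, 2-byte version, 2-byte big-endian length).

```c
#include <stddef.h>
#include <stdint.h>

#define TLS_HEADER_LEN 5	/* type (1) + version (2) + length (2) */

/*
 * Hypothetical helper: return the full length of the record (header plus
 * payload) that starts at buf, or 0 if fewer than TLS_HEADER_LEN bytes are
 * queued and the length field cannot be read yet.
 */
static size_t
tls_record_total_len(const uint8_t *buf, size_t queued)
{
	if (queued < TLS_HEADER_LEN)
		return (0);
	/* The payload length is a 16-bit big-endian value at offset 3. */
	return (TLS_HEADER_LEN + (((size_t)buf[3] << 8) | buf[4]));
}

/* True when a whole record is queued and could be handed to a worker. */
static int
tls_record_ready(const uint8_t *buf, size_t queued)
{
	size_t total = tls_record_total_len(buf, queued);

	return (total != 0 && queued >= total);
}
```

In the commit's design this decision is made each time an mbuf is appended to the TLS chain; the sketch only shows the length arithmetic.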