Commit Graph

19325 Commits

Author SHA1 Message Date
Mateusz Guzik
3df3d88cc5 vfs: move cn_nameptr assignment out of namei_getpath 2022-09-17 09:08:34 +00:00
Mateusz Guzik
41a0a99f85 vfs: slightly reorganize error handling in chroot
This avoids duplicated NDFREE_NOTHING which will be of importance
later.
2022-09-17 09:08:34 +00:00
Warner Losh
7cd4984e67 SPDX: Not BSD-4-Clause
This is not BSD-4-Clause. It's closer to a modified BSD-2-Clause with 2
added clauses (and the first one has added clauses). Remove
SPDX-License-Identifier since this license doesn't match anything in
SPDX.
2022-09-16 21:49:16 -06:00
Konstantin Belousov
ff41239f58 Add AT_USRSTACK{BASE, LIM} AT vectors, and ELF_BSDF_VMNOOVERCOMMIT flag
Reviewed by:	brooks, imp (previous version)
Discussed with:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D36540
2022-09-16 23:23:26 +03:00
Mateusz Guzik
50176b0296 locks: whack a failed experiment in form of restrict_starvation
This was never enabled and only pollutes the code. The issue will
be addressed later in a different manner.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-09-16 17:29:37 +00:00
Gordon Bergling
4771011b8f kern_jail: Fix a typo in a source code comment
- s/paramter/parameter/

MFC after:	3 days
2022-09-15 10:25:19 +02:00
Warner Losh
9baa0817ec SPDX: Not BSD-4-Clause
This license has 4 clauses, and shares some text with the BSD-4-Clause
license. However, it omits the standard disclaimer and has 2 clauses all
its own. Remove this tag, since it was made in error and this doesn't
match the SPDX copy of the BSD-4-Clause license.

Sponsored by:		Netflix
2022-09-14 21:29:31 -06:00
Mateusz Guzik
d04c7f10d4 vfs: make delmntque return with the interlock held
saves on relocking dance -- the lock is taken immediately
afterwards anyway.
2022-09-14 23:30:19 +00:00
Mateusz Guzik
43fbd0e7a7 lockf: elide vnode interlock in the common case in lf_purgelocks
The interlock was already taken and released when dooming, thus by
API contract locking state cannot be legally installed.

At the same time the state is almost never there to begin with.
2022-09-14 23:04:22 +00:00
Mateusz Guzik
a755fb921e vfs: retire the V_MNTREF flag
Reviewed by:	kib, mckusick
Differential Revision:	https://reviews.freebsd.org/D36521
2022-09-14 18:16:36 +00:00
Mateusz Guzik
61a1d5dde2 vfs: stop using the V_MNTREF flag
Reviewed by:	kib, mckusick
Differential Revision:	https://reviews.freebsd.org/D36521
2022-09-14 18:16:23 +00:00
Mateusz Guzik
f7dc4a71da vfs: plug spurious error checks in namei
error is guaranteed 0 at that point
2022-09-13 23:18:30 +00:00
Mateusz Guzik
b4137c9ed1 vfs: make NDVALIDATE private to vfs_lookup.c
it is not used elsewhere.
2022-09-12 22:50:48 +00:00
Allan Jude
b20ec58669 vfs.typenumhash: fix sysctl description
a string continuation was missing a space, resulting in two words
being smushed together.

Sponsored by:	Klara, Inc.
2022-09-10 22:47:51 +00:00
Mateusz Guzik
1760a6950a Fixup build after recent getsock changes 2022-09-10 20:40:43 +00:00
Mateusz Guzik
3be2225fc8 Remove fflag argument from getsock_cap
Interested callers can obtain it on their own easily enough
and there is no reason to branch on it.
2022-09-10 19:47:47 +00:00
Mateusz Guzik
3212ad15ab Add getsock
All but one consumers of getsock_cap only pass 4 arguments.
Take advantage of it.
2022-09-10 19:47:47 +00:00
Mateusz Guzik
a2ad70923f Add branch prediction hints to getsock_cap 2022-09-10 19:41:52 +00:00
Gleb Smirnoff
e80062a2d4 tcp: avoid call to soisconnected() on transition to ESTABLISHED
This call existed since pre-FreeBSD times, and it is hard to understand
why it was there in the first place.  After 6f3caa6d81 it definitely
became necessary always and commit message from f1ee30ccd6 confirms that.
Now that 6f3caa6d81 is effectively backed out by 07285bb4c2, the call
appears to be useful only for sockets that landed on the incomplete queue,
e.g. sockets that have accept_filter(9) enabled on them.

Provide a new TCP flag to mark connections that are known to be on the
incomplete queue, and call soisconnected() only for those connections.

Reviewed by:		rrs, tuexen
Differential revision:	https://reviews.freebsd.org/D36488
2022-09-08 09:16:04 -07:00
Doug Moore
d0354fa7b6 rb_tree: reduce duplication in balancing code
Change RB_INSERT_COLOR and RB_REMOVE_COLOR so that the blocks of code
that are identical except for left and right being exchanged are made
only one block with a variable to indicate left- or right-handedness.

Rename RB macros so that those not intended for external use begin
with an underscore.

Add comments to the balancing code so that another might understand it.

Reviewed by:	alc, kib
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D36393
2022-09-07 23:46:19 -05:00
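
The idea in d0354fa7b6 of folding two mirror-image code blocks into one block driven by a handedness variable can be illustrated with a plain binary-tree rotation. This is a minimal sketch, not the code from sys/tree.h: the node layout, the `child[2]` array, and the `rotate()` helper are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout: child[0] is the left child, child[1] the right. */
struct node {
	struct node *child[2];
	int key;
};

/*
 * Rotate the subtree rooted at *rootp.  dir == 0 is a left rotation,
 * dir == 1 is a right rotation.  One body serves both cases because
 * every "left" and "right" access is expressed through dir and !dir.
 */
static void
rotate(struct node **rootp, int dir)
{
	struct node *old = *rootp;
	struct node *pivot = old->child[!dir];

	assert(pivot != NULL);
	old->child[!dir] = pivot->child[dir];	/* hand over the inner subtree */
	pivot->child[dir] = old;		/* old root drops below pivot */
	*rootp = pivot;
}
```

Calling `rotate(&root, 0)` and then `rotate(&root, 1)` restores the original shape, which is a quick way to check that the two directions really are mirror images.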
Mateusz Guzik
3e0b486886 vfs: flip a condition around in kern_statat
error tends to be 0.
2022-09-07 20:06:24 +00:00
Hans Petter Selasky
0e391a3197 ktls: Add missing NULL pointer check for TLS RX hardware offload.
The send tag pointer may be NULL when the ktls_reset_receive_tag()
function is invoked. Add check for this.

Reviewed by:	gallatin @
Sponsored by:	NVIDIA Networking
2022-09-06 13:49:23 +02:00
Mateusz Guzik
69413598d2 signal: use proc_iterate to save on work
Most notably poudriere performs kill -9 -1 in jails for each port
being built. This reduces the scan from hundreds of processes to
literally 1.

Reviewed by:	jamie, markj
Differential Revision:	https://reviews.freebsd.org/D34522
2022-09-05 11:54:47 +00:00
Mateusz Guzik
5ecb5444aa jail: add process linkage
It allows iteration over processes belonging to given jail instead of
having to walk the entire allproc list.

Note the iteration can miss processes which remains bug-compatible
with previous code.

Reviewed by:	jamie (previous version), markj (previous version)
Differential Revision:	https://reviews.freebsd.org/D34522
2022-09-05 11:54:47 +00:00
Gordon Bergling
d744e271eb kern: Remove a double word in a source code comment
- s/that that/that/

MFC after:	3 days
2022-09-04 17:32:10 +02:00
Gordon Bergling
49a033d8cf kern: Correct some typos in source code comments
- s/occured/occurred/
- s/the the/the/

MFC after:	3 days
2022-09-04 13:00:01 +02:00
Gordon Bergling
2b7d656f17 kern: Fix a typo in a source code comment
- s/overriden/overridden/

MFC after:	3 days
2022-09-03 15:26:55 +02:00
Gleb Smirnoff
24af7808fa protosw: repair protocol selection logic in socket(2)
Pointy hat to:	glebius
Fixes:		61f7427f02
2022-08-30 21:19:46 -07:00
Gleb Smirnoff
61f7427f02 protosw: cleanup protocols that existed merely to provide pr_input
Since 4.4BSD the protosw was used to implement socket types created
by socket(2) syscall and at the same time to demultiplex incoming IPv4
datagrams (later copied to IPv6).  This story ended with 78b1fc05b2.

These entries (e.g. IPPROTO_ICMP) in inetsw were added to catch
packets in ip_input(), but they would also be returned by pffindproto()
if user says socket(AF_INET, SOCK_RAW, IPPROTO_ICMP).  Thus, for raw
sockets to work correctly, all the entries were pointing at raw_usrreq
differentiating only in the value of pr_protocol.

With 78b1fc05b2 all these entries are no longer needed, as ip_protox
is independent of protosw.  Any socket syscall requesting SOCK_RAW type
would end up with rip_protosw.  And this protosw has its pr_protocol
set to 0, allowing to mark socket with any protocol.

For IPv6 raw socket the change required two small fixes:
o Validate user provided protocol value
o Always use protocol number stored in inp in rip6_attach, instead
  of protosw value, which is now always 0.

Differential revision:	https://reviews.freebsd.org/D36380
2022-08-30 15:09:21 -07:00
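
The effect described in 61f7427f02, where a single raw entry with pr_protocol set to 0 accepts any requested protocol number, can be sketched as a table lookup with a wildcard. This is an illustration under stated assumptions, not the kernel's pffindproto(): the table, the `SK_*` constants, and `find_proto()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical socket type constants for this sketch only. */
#define SK_STREAM	1
#define SK_DGRAM	2
#define SK_RAW		3

struct proto_entry {
	int type;		/* socket type served by this entry */
	int protocol;		/* 0 acts as a wildcard */
	const char *name;
};

/* One raw entry replaces the former per-protocol raw entries. */
static const struct proto_entry table[] = {
	{ SK_STREAM,  6, "tcp" },
	{ SK_DGRAM,  17, "udp" },
	{ SK_RAW,     0, "raw" },
};

/* Return the first entry matching type and protocol, or NULL. */
static const struct proto_entry *
find_proto(int type, int protocol)
{
	size_t i;

	assert(protocol >= 0);
	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		const struct proto_entry *pe = &table[i];

		if (pe->type != type)
			continue;
		if (pe->protocol == protocol || pe->protocol == 0)
			return (pe);
	}
	return (NULL);
}
```

With this layout a request such as `find_proto(SK_RAW, 1)` resolves to the single raw entry, so the caller has to remember the requested protocol number itself, which matches the commit's note about using the value stored in inp rather than the protosw value.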
Gleb Smirnoff
8624f4347e divert: declare PF_DIVERT domain and stop abusing PF_INET
The divert(4) is not a protocol of IPv4.  It is a socket to
intercept packets from ipfw(4) to userland and re-inject them
back.  It can divert and re-inject IPv4 and IPv6 packets today,
but potentially it is not limited to these two protocols.  The
IPPROTO_DIVERT does not belong to known IP protocols, it
doesn't even fit into u_char.  I guess, the implementation of
divert(4) was done the way it is done basically because it was
easier to do it this way, back when protocols for sockets were
intertwined with IP protocols and domains were statically
compiled in.

Moving divert(4) out of inetsw accomplished two important things:

1) IPDIVERT is getting much closer to being independent of INET.
   This will be finalized in following changes.
2) Now divert socket no longer aliases with raw IPv4 socket.
   Domain/proto selection code won't need a hack for SOCK_RAW and
   multiple entries in inetsw implementing different flavors of
   raw socket can merge into one without requirement of raw IPv4
   being the last member of dom_protosw.

Differential revision:	https://reviews.freebsd.org/D36379
2022-08-30 15:09:21 -07:00
Gleb Smirnoff
244e1aeaec domains: merge domain_init() into domain_add()
domain_init() called at SI_SUB_PROTO_DOMAIN/SI_ORDER_SECOND is always
called right after domain_add(), that had been called at SI_ORDER_FIRST.
Note that protocols aren't initialized yet at this point, since they are
usually scheduled to initialize at SI_ORDER_THIRD.

After this merge it becomes clear that DOMF_SUPPORTED / DOMF_INITED
can be garbage collected as they are set & checked in the same function.

For initialization of the domain system itself it is now clear that
domaininit() can be garbage collected and static initializer is enough.
2022-08-29 19:15:01 -07:00
Gleb Smirnoff
e18c5816ea domains: use queue(9) SLIST for linked list of domains 2022-08-29 19:15:01 -07:00
Gleb Smirnoff
d7574c7432 domains: init pr_domain in pr_init() 2022-08-29 19:15:01 -07:00
Gleb Smirnoff
c414347bc5 mbufs: isolate max_linkhdr and max_protohdr handling in the mbuf code
o Statically initialize max_linkhdr to default value without relying
  on domain(9) code doing that.
o Statically initialize max_protohdr to a sane value, without relying
  on TCP being always compiled in.
o Retire max_datalen. Set, but not used.
o Don't make the domain(9) system responsible in validating these
  values and updating max_hdr.  Instead provide KPI max_linkhdr_grow()
  and max_protohdr_grow().
o Call max_linkhdr_grow() from IEEE802.11 and max_protohdr_grow() from
  TCP.  Those are the only protocols today that may want to grow.

Reviewed by:		tuexen
Differential revision:	https://reviews.freebsd.org/D36376
2022-08-29 19:14:25 -07:00
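
The "grow" KPI described in c414347bc5 can be pictured as a pair of setters that only ever raise a statically initialized maximum. The following is a rough sketch, not the code in the mbuf subsystem: the `_sketch` function names, the default values, and the assumption that max_hdr is the sum of the two maxima are all illustrative.

```c
#include <assert.h>

/* Placeholder defaults for this sketch; the real values are not in the log. */
static int max_linkhdr = 16;
static int max_protohdr = 40;
static int max_hdr = 16 + 40;

/* Assumption: max_hdr tracks the sum of the two header maxima. */
static void
hdr_recompute(void)
{
	max_hdr = max_linkhdr + max_protohdr;
}

/* Raise max_linkhdr if a link layer needs more room; never shrink it. */
static void
linkhdr_grow_sketch(int new_size)
{
	assert(new_size >= 0);
	if (new_size > max_linkhdr) {
		max_linkhdr = new_size;
		hdr_recompute();
	}
}

/* Raise max_protohdr if a protocol needs more room; never shrink it. */
static void
protohdr_grow_sketch(int new_size)
{
	assert(new_size >= 0);
	if (new_size > max_protohdr) {
		max_protohdr = new_size;
		hdr_recompute();
	}
}
```

Because the setters are monotonic, the order in which the link layer and the protocol register their needs does not matter, and the domain code no longer has to validate the values.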
Mark Johnston
32faf071bd devstat: Remove DTrace io probes lacking a BIO reference
The io:::start and end probes trace individual I/O requests.

Also remove the unimplemented wait-start and wait-done probes.

PR:		266098
MFC after:	1 week
2022-08-29 13:22:36 -04:00
Doug Moore
5d91386826 rb_tree: avoid extra reads in rebalancing
In RB_INSERT_COLOR and RB_REMOVE_COLOR, avoid reading a parent pointer
from memory, and then reading the left-color bit from memory, and then
reading the right-color bit from memory, since they're all in the same
field. The compiler can't infer that only the first read is really
necessary, so write the code in a way so that it doesn't have to.

Drop RB_RED_LEFT and RB_RED_RIGHT macros that reach into memory to get
those bits.  Drop RB_COLOR, the only thing left using RB_RED_LEFT and
RB_RED_RIGHT after the other changes, and go straight to DIAGNOSTIC
code in subr_stats to implement RB_COLOR for its single, dubious use
there.

Reviewed by:	alc
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D36353
2022-08-29 11:11:31 -05:00
John Baldwin
e3885a7893 soo_stat: Ensure error is always initialized.
In kernels without MAC, error is not set for sockets whose protocol
layer does not implement the pr_sense hook.

Reported by:	Jenkins (powerpc kernel builds)
Fixes:		7c04ca1fad sockets: for stat(2) on a socket don't report hiwat as block size
2022-08-26 11:17:55 -07:00
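
The class of bug fixed in e3885a7893 arises when a return value is assigned only inside optional branches. A minimal sketch of the pattern follows; it is not the soo_stat() source, and `struct proto_sketch`, `pr_sense_sketch`, `MAC_SKETCH`, and the helper functions are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical protocol descriptor with an optional hook. */
struct proto_sketch {
	int (*pr_sense_sketch)(void);	/* may be NULL */
};

/* Example hook that reports a nonzero error. */
static int
sense_fails(void)
{
	return (5);
}

#ifdef MAC_SKETCH
/* Stand-in for a policy check that exists only in some builds. */
static int
mac_check_sketch(void)
{
	return (0);
}
#endif

static int
sock_stat_sketch(const struct proto_sketch *pr)
{
	int error = 0;	/* set up front so every path returns a defined value */

	assert(pr != NULL);
#ifdef MAC_SKETCH
	error = mac_check_sketch();
	if (error != 0)
		return (error);
#endif
	if (pr->pr_sense_sketch != NULL)
		error = pr->pr_sense_sketch();
	return (error);
}
```

Without the initializer, a build with `MAC_SKETCH` undefined and a descriptor whose hook is NULL would return an indeterminate value, which is the situation the commit message describes.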
Gleb Smirnoff
837b7203f0 domains: use struct domain as argument 2022-08-26 10:35:35 -07:00
firk
768f6373eb Fix compat10 semaphore interface race
Wrong has-waiters and missing unconditional _count==0 check may cause
infinite waiting with already non-zero count.
1) properly clear _has_waiters flag when waiting failed to start
2) always check _count before start waiting

PR:	265997
Reviewed by:	kib
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D36272
2022-08-26 20:34:29 +03:00
Gleb Smirnoff
7c04ca1fad sockets: for stat(2) on a socket don't report hiwat as block size
The code appeared in d8392c6c39 without a good explanation.  It is
very unlikely any software in the world needs that.

Differential revision:	https://reviews.freebsd.org/D36283
2022-08-26 08:16:15 -07:00
Mateusz Guzik
49afea1059 proc: read the pid prior to unlocking in report_alive_proc1
In principle another thread could have reaped the process by that time.
2022-08-25 17:26:49 +00:00
Konstantin Belousov
fce3b1c327 fork_exit(): style comment
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D36302
2022-08-24 22:12:53 +03:00
Brooks Davis
840327e5dd mbuf: Don't support PAGE_SIZE < 4K
The Vax supported such things, but FreeBSD does not.  This further
implies that MJUMPAGESIZE > MCLBYTES so assert this and remove code
handling them being equal.

Reviewed by:	kp, imp, jhb
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D36320
2022-08-24 18:34:07 +01:00
Mateusz Guzik
9488262679 rms: add rms_assert_rlock_ok
So that callers which opportunistically elide the lock can still
assert that they can take it.

Reviewed by:
Differential Revision:
2022-08-23 19:15:48 +00:00
Robert Wing
3454a7caa0 kqueue: retire knlist_init_rw_reader()
Last usage was removed in afa85850e7.

Reviewed by:	pauamma, melifaro, kib
Differential Revision:	https://reviews.freebsd.org/D36205
2022-08-20 21:17:39 -08:00
Konstantin Belousov
f829268bcc Remove TDF_DOING_SA
We cannot see a thread with the flag set in unsuspend, after we stopped
doing SINGLE_ALLPROC from user processes.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D36207
2022-08-20 20:34:30 +03:00
Konstantin Belousov
5e5675cb4b Remove struct proc p_singlethr member
It does not serve any purpose after we stopped doing
thread_single(SINGLE_ALLPROC) from stoppable user processes.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D36207
2022-08-20 20:34:30 +03:00
Konstantin Belousov
2842ec6d99 REAP_KILL_PROC: kill processes in the threaded taskqueue context
There is a problem still left after the fixes to REAP_KILL_PROC.  The
handling of the stopping signals by sig_suspend_threads() can occur
outside the stopping process context by tdsendsignal(), and it uses
mostly the same mechanism of aborting sleeps as suspension.  In other
words, it badly interacts with thread_single(SINGLE_ALLPROC).

But unlike single threading from the process context, we cannot wait by
sleep for other single threading requests to pass, because we own
spinlock(s).

Fix this by moving both the thread_single(p2, SINGLE_ALLPROC), and the
signalling, to the threaded taskqueue which cannot be single-threaded
itself.

Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D36207
2022-08-20 20:34:11 +03:00
Konstantin Belousov
5e9bba94bd fork_norfproc(): unlock p1 before retrying
Reported and reviewed by:	markj
Tested by:	pho
Syzkaller:	647212368c3f32c6f13f
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D36207
2022-08-20 20:33:18 +03:00
Konstantin Belousov
0a4f2ac3b7 kern_sig.c: style
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D36207
2022-08-20 20:33:18 +03:00