Commit Graph

144952 Commits

Author SHA1 Message Date
Hans Petter Selasky
1d41a05557 Fix kernel build after fcb3f813f3 .
By adding missing ifdefs for INET and INET6 when building LINT-NOIP .

Differential Revision:	https://reviews.freebsd.org/D36731
Sponsored by:	NVIDIA Networking
2022-10-04 16:08:39 +02:00
Hans Petter Selasky
c2a808b977 Fix kernel build after fcb3f813f3 .
By adding missing ifdefs for INET6 .

Differential Revision:	https://reviews.freebsd.org/D36731
Sponsored by:	NVIDIA Networking
2022-10-04 15:55:36 +02:00
Hans Petter Selasky
9f69c0b87d Fix kernel build after fcb3f813f3 .
By updating function arguments for ipsec_kmod_ctlinput() which is used
when loading IPSEC support via kernel modules.

Differential Revision:	https://reviews.freebsd.org/D36731
Sponsored by:	NVIDIA Networking
2022-10-04 15:42:51 +02:00
Hans Petter Selasky
d97ecf714e time(3): Increase precision of time conversion functions by using gcd.
When converting times to and from units which have many leading zeros,
it pays off to compute the greatest common divisor first, and then do the
scaling product. This way all time unit conversion comes down to scaling a
signed or unsigned 64-bit value by a fraction represented by two signed
or unsigned 32-bit integers.

SBT_1S is defined as 2^32 . When scaling using powers of 10 above 1,
the gcd of SBT_1S and 10^N is always greater than or equal to 4,
when N is greater or equal to 2.

Scaling a sbt value to milliseconds is then done by multiplying by
(1000 / 8) and dividing by (2^32 / 8).

This trick allows for higher precision at very little additional CPU cost.

It shall also be noted that the Xtosbt() functions prior to this patch,
sometimes were off-by-one:

For example when converting 1 / 8 of a second to sbt as 125ms the old sbt
conversion function would compute 0x20000001 while the new function computes
0x20000000 which multiplied by 8 becomes SBT_1S, which is the correct value.

Reviewed by:	kib@
Differential Revision:	https://reviews.freebsd.org/D36857
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-10-04 13:51:06 +02:00
Hans Petter Selasky
2ae0861242 cuse(3): Cosmetic change about testing boolean values.
No functional change intended.

Differential Revision:	https://reviews.freebsd.org/D36633
Suggested by:	jrtc27@ and avg@
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-10-04 13:51:06 +02:00
Hans Petter Selasky
1024bb2633 qdivrem: Predict division by zero as false.
Division by zero triggers an arithmetic exception and should not be very
common. Predict this.

No functional change intended.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-10-04 13:51:06 +02:00
Randall Stewart
cd84e78f09 tcp idle reduce does not work for a server.
TCP has an idle-reduce feature that allows a connection to reduce its
cwnd after it has been idle more than an RTT. This feature only works
for a sending side connection. It does this by at output checking the
idle time (t_rcvtime vs ticks) to see if its more than the RTO timeout.

The problem comes if you are a web server. You get a request and
then send out all the data.. then go idle. The next time you would
send is in response to a request from the peer asking for more data.
But the thing is you updated t_rcvtime when the request came in so
you never reduce.

The fix is to do the idle reduce check also on inbound.

Reviewed by: tuexen, rscheff
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D36721
2022-10-04 07:09:01 -04:00
Gleb Smirnoff
77198a945a tcp_timers: provide tcp_timer_drop() and tcp_timer_close()
Two functions to call tcp_drop() and tcp_close() from a callout context.
Garbage collect tcp_inpinfo_lock_del(), it has a single use now.

Differential revision:	https://reviews.freebsd.org/D36397
2022-10-03 22:21:55 -07:00
Gleb Smirnoff
775e20c159 tcp: make tcp_drop_syn_sent() static 2022-10-03 21:11:17 -07:00
Gleb Smirnoff
fcb3f813f3 netinet*: remove PRC_ constants and streamline ICMP processing
In the original design of the network stack from the protocol control
input method pr_ctlinput was used notify the protocols about two very
different kinds of events: internal system events and receival of an
ICMP messages from outside.  These events were coded with PRC_ codes.
Today these methods are removed from the protosw(9) and are isolated
to IPv4 and IPv6 stacks and are called only from icmp*_input().  The
PRC_ codes now just create a shim layer between ICMP codes and errors
or actions taken by protocols.

- Change ipproto_ctlinput_t to pass just pointer to ICMP header.  This
  allows protocols to not deduct it from the internal IP header.
- Change ip6proto_ctlinput_t to pass just struct ip6ctlparam pointer.
  It has all the information needed to the protocols.  In the structure,
  change ip6c_finaldst fields to sockaddr_in6.  The reason is that
  icmp6_input() already has this address wrapped in sockaddr, and the
  protocols want this address as sockaddr.
- For UDP tunneling control input, as well as for IPSEC control input,
  change the prototypes to accept a transparent union of either ICMP
  header pointer or struct ip6ctlparam pointer.
- In icmp_input() and icmp6_input() do only validation of ICMP header and
  count bad packets.  The translation of ICMP codes to errors/actions is
  done by protocols.
- Provide icmp_errmap() and icmp6_errmap() as substitute to inetctlerrmap,
  inet6ctlerrmap arrays.
- In protocol ctlinput methods either trust what icmp_errmap() recommend,
  or do our own logic based on the ICMP header.

Differential revision:	https://reviews.freebsd.org/D36731
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
809fef2913 netipsec: move specific ipsecmethods declarations to ipsec_support.h
where struct ipsec_methods is defined.  Not a functional change.
Allows further modification of method prototypes without breaking
compilation of other ipsec compilation units.

Differential revision:	https://reviews.freebsd.org/D36730
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
c0fc81e913 netinet*: remove dead code from TCP, UDP, SCTP control input
Now these functions are called only from icmp*_input().  The pointer
to the ICMP data is never NULL and cmd has a limited set of values.

In the past the functions were demultiplexing control messages from
ICMP layer, as well as internally generated events.  In the latter
case the the pointer to IP would be NULL.

Differential revision:	https://reviews.freebsd.org/D36729
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
7f3b00a87a netinet: filter out invalid ICMP responses in ip_icmp()
instead of doing that in every ipproto_ctlinput_t method.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36728
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
53807a8a27 netinet*: use sparse C99 initializer for inetctlerrmap
and mark those PRC_* codes, that are used.  The rest are dead code.
This is not a functional change, but illustrative to make easier
review of following changes.
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
43d39ca7e5 netinet*: de-void control input IP protocol methods
After decoupling of protosw(9) and IP wire protocols in 78b1fc05b2 for
IPv4 we got vector ip_ctlprotox[] that is executed only and only from
icmp_input() and respectively for IPv6 we got ip6_ctlprotox[] executed
only and only from icmp6_input().  This allows to use protocol specific
argument types in these methods instead of struct sockaddr and void.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36727
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
46ddeb6be8 netinet6: retire ip6protosw.h
The netinet/ipprotosw.h and netinet6/ip6protosw.h were KAME relics, with
the former removed in f0ffb944d2 in 2001 and the latter survived until
today.  It has been reduced down to only one useful declaration that
moves to ip6_var.h

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36726
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
0ab46f28dc tcp: remove unnecessary include of tcp6_var.h
Reviewed by:		rscheff, melifaro
Differential revision:	https://reviews.freebsd.org/D36725
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
bb77f0c204 udp: typedef udp tunneling functions to functions, not pointers
With this change one can make a forward declaration of a function
that is of UDP tunneling type.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36724
2022-10-03 20:53:04 -07:00
Gleb Smirnoff
24b96f35b9 netinet*: move ipproto_register() and co to ip_var.h and ip6_var.h
This is a FreeBSD KPI and belongs to private header not netinet/in.h.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36723
2022-10-03 20:53:04 -07:00
John Baldwin
e0df0dce71 powerpc_nvram: Fix a bug in the adler32 checksum.
The Adler32 digest consists of two 16-bit words whose values are
calculated modulo 65521 (largest prime < 2^16).  To avoid two division
instructions per byte, this version copies an optimization found in
zlib which defers the modulus until close to the point that the
intermediate sums can overflow 2^32.  (zlib uses NMAX == 5552 for
this, this version uses 5000)

The bug is that in the deferred modulus case, the modulus was
only applied to the high word (and twice at that) but not to
the low word.  The fix is to apply it to both words.

Reviewed by:	jhibbits
Reported by:	Miod Vallat <miod@openbsd.org>
Differential Revision:	https://reviews.freebsd.org/D36798
2022-10-03 16:10:41 -07:00
John Baldwin
a9fca3b987 Fix various places which cast a pointer to a vm_paddr_t or vice versa.
GCC warns about the mismatched sizes on i386 where vm_paddr_t is 64
bits.

Reviewed by:	imp, markj
Differential Revision:	https://reviews.freebsd.org/D36750
2022-10-03 16:10:41 -07:00
Jung-uk Kim
56cdab3372 bpf: obtain timestamps from controller via pkthdr if available
r325506 (3cf8254f1e) extended struct pkthdr to add packet timestamp in
mbuf(9) chain.  For example, cxgbe(4) and mlx5en(4) support this feature.
Use the timestamp for bpf(4) if it is available.

Reviewed by:	hselasky, kib, np
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D36868
2022-10-03 18:53:40 -04:00
Konstantin Belousov
85b715baae amd64/db_trace.c: remove stray prototype
Sponsored by:	NVIDIA networking
MFC after:	1 week
2022-10-04 01:50:30 +03:00
Richard Scheffenegger
4edff766cb tcp: correct simultaneous SYN ECN reaction in RFC3168 mode.
Ensure that an RFC3168 ECN reaction only occurs on non-SYN
segments.

Reviewed By:    	tuexen, #transport
Sponsored by:   	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D36867
2022-10-04 00:24:28 +02:00
Alfredo Dal'Ava Junior
ff80ab1ceb powerpc: slb_alloc_user_cache: fix missing uma_zalloc wait flag
Flags should be M_WAITOK | M_ZERO instead of just M_ZERO

Reviewed by:	markj
MFC after:	2 days
Sponsored by:	Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision:	https://reviews.freebsd.org/D36865
2022-10-03 20:56:41 -03:00
Johannes Totz
a9a267c1f7 acpi_thermal: Fix a potential stack buffer overflow.
While here, fix a typo as well.

Reviewed by:	jhb
Differential Revision: https://reviews.freebsd.org/D36495
2022-10-03 11:16:27 -07:00
Mitchell Horne
6f8a182b15 db_sym.c: restore sys/systm.h include
This was erroneously dropped in the previous commit.

Reported by:	Jenkins
Fixes:	754cb545b6 ("ddb: de-duplicate decode_syscall()")
2022-10-03 14:51:22 -03:00
Richard Scheffenegger
0924ae8f47 tcp: allow window scale and timestamps to be toggled individually
Simple change to allow for the individual toggling of
RFC7323 window scaling and timestamp option.

Reviewed By:    	rrs, tuexen, glebius, guest-ccui, #transport
Sponsored by:   	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D36863
2022-10-03 19:21:46 +02:00
Mitchell Horne
3a91cecea4 riscv: move GEOM_PART_GPT option to DEFAULTS
This is consistent with other architectures.
2022-10-03 13:49:54 -03:00
Mitchell Horne
791bfa60e8 riscv: decode syscall in ddb backtrace
This presents the existing information in a slightly more readable way.

Reviewed by:	jrtc27, markj, jhb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D36565
2022-10-03 13:49:54 -03:00
Mitchell Horne
754cb545b6 ddb: de-duplicate decode_syscall()
Only i386 and amd64 print the decoded syscall name in the backtrace.
This de-duplication facilitates further changes and adoption by other
platforms.

Reviewed by:	jrtc27, markj, jhb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D36565
2022-10-03 13:49:54 -03:00
Hans Petter Selasky
0def80f1a5 time(3): Align fast clock times to avoid firing multiple timers.
In non-periodic mode absolute timers fire at exactly the time given.
When specifying a fast clock, align the firing time so that less
timer interrupt events are needed.

Reviewed by:	rrs @
Differential Revision:	https://reviews.freebsd.org/D36858
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-10-03 17:53:17 +02:00
Hans Petter Selasky
5e59b2734f cuse(3): Optimise small reads and writes.
When doing small reads and writes use an intermediate buffer to store the
data to save locking the remote process to access data.

Reviewed by:	imp @
Differential Revision:	https://reviews.freebsd.org/D36633
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-10-03 17:35:14 +02:00
Hans Petter Selasky
8f0a3c9c35 cuse(3): Use bool type for boolean value instead of int type.
No functional change intended.

Reviewed by:	imp @
Differential Revision:	https://reviews.freebsd.org/D36633
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-10-03 17:35:14 +02:00
Kristof Provost
8a299958c1 if_epair: fix build with RSS
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-10-03 17:02:55 +02:00
Michael Tuexen
2515552e62 tcp: improve handling of SYN-ACK segments in TIMEWAIT state
Only consider segments with the SYN bit set and the ACK bit cleared
as "new connection attempts", which result in re-using a connection
being in TIMEWAIT state. This results in consistent handling of
SYN-ACK segments.

Reviewed by:		rscheff@
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D36864
2022-10-03 14:46:47 +02:00
Kirk McKusick
dcdba3460d Updates to UFS/FFS superblock integrity checks when reading a superblock.
Further updates based on ways Peter Holm found to corrupt UFS
superblocks in ways that could cause kernel hangs or crashes.

No legitimate superblocks should fail as a result of these changes.

Reported by:  Peter Holm
Tested by:    Peter Holm
Sponsored by: The FreeBSD Foundation
2022-10-03 05:53:10 -07:00
Alfredo Dal'Ava Junior
db79bf75ac powerpc: cpuset: add local functions for copyin/copyout
Add local functions to workaround an instruction segment trap (panic)
when the indirect functions copyin and copyout are called by an external
loadable kernel module (i.e. pfsync, zfs and linuxulator). The crash
was triggered by change 47a57144af, but
kernel binary linked with LLD 9 works fine. LLVM bisect points that LLD
behavior chaged after dc06b0bc9ad055d06535462d91bfc2a744b2f589.

This is know to affect powerpc targets only and the final fix is still
being discussed with the LLVM community.

PR:	266730
Reviewed by:	luporl, jhibbits (on IRC, previous version)
MFC after:	2 days
Sponsored by:	Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision:	https://reviews.freebsd.org/D36234
2022-10-03 12:03:09 -03:00
Michael Tuexen
f8b5681094 tcp: honor drop_synfin sysctl variable in TIME-WAIT
Reviewed by:		rrs@
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D36862
2022-10-03 12:48:30 +02:00
Doug Moore
ffbc2a58b1 Fix LINT build after 368ee2f86a
Reported by:	jenkins
Fixes:	368ee2f86a
2022-10-03 00:15:21 -05:00
Doug Moore
368ee2f86a rb_tree: let insert search start from next node
When the node to insert in the rb_tree is known to precede or follow a
particular node, new methods RB_INSERT_PREV and RB_INSERT_NEXT,
defined here, allow the search for where to insert the new node begin
with that particular node, rather than at the root, to save a bit of
time.

Using those methods, instead of RB_INSERT, in managing a tree in
iommu_gas.c, saves a little time.

Reviewed by:	kib
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D35516
2022-10-02 22:27:21 -05:00
Alexander V. Chernikov
356724fc93 netlink: fix non-default builds (no INET, INET6, ROUTE_MPATH). 2022-10-02 13:12:52 +00:00
Jung-uk Kim
fc083c3eb2 netlink: Fix build without VIMAGE 2022-10-01 21:41:54 -04:00
Rick Macklem
a82308abab nfs_clvnops.c: Fix access to v_mount when vnode unlocked
Commit ab17854f97 fixed access to v_mount when the
vnode is unlocked for nfs_copy_file_range().

This patch does the same for nfs_advlockasync().

MFC after:	1 week
2022-10-01 16:30:07 -07:00
Alexander V. Chernikov
03994c24da netlink: fix build without INVARIANTS
Reported by:	cy
2022-10-01 21:20:50 +00:00
Alexander V. Chernikov
8d9f3e0572 netlink: fix format strings on 32-bit platforms 2022-10-01 21:16:31 +00:00
Alexander V. Chernikov
c90bff3fa7 netlink: fix debugging on 32-bit platforms 2022-10-01 19:03:35 +00:00
Doug Moore
e5f93d1078 show_sysctl_all: reduce copying, please coverity
Modify db_show_sysctl_all so that it does not copy more than once the
data of the input oid, and so that what it passes to db_show_oid does
not alarm coverity.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D36847
2022-10-01 12:20:04 -05:00
Alexander V. Chernikov
dddafa8d25 netlink: make test-includes happy by hiding most of the header
contents under _KERNEL.
2022-10-01 17:01:53 +00:00
Rick Macklem
bffb3d947b nfs_clvnops.c: Fix access to v_mount when vnode unlocked
Commit ab17854f97 fixed access to v_mount when the
vnode is unlocked for nfs_copy_file_range().

This patch does the same for nfs_ioctl().

Reviewed by:	kib, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D36846
2022-10-01 07:43:53 -07:00