Adds a few struct members and a function to get i915_runtime_pm_status()
to compile in drm-kmod.
Differential Revision: https://reviews.freebsd.org/D36749
Sponsored by: Google, Inc. (GSoC 2022)
Note that this required adding missing ()'s around the outermost level
of MSK_READ_MIB*. Otherwise, the void cast was only applied to the
first register read. This also meant that MSK_READ_MIB64 was pretty
broken as the uint64_t cast only applied to the first 16-bit register
read in each MSK_READ_MIB32 invocation and the 32-bit shift was only
applied to the second register read of the pair.
Reviewed by: imp, emaste
Reported by: GCC -Wunused-value
Differential Revision: https://reviews.freebsd.org/D36777
As an optimization, vm_page_activate() avoids requeuing a page that's
already in the active queue. A page's location in the active queue is
mostly unimportant.
When a page is unwired and placed back in the page queues,
vm_page_unwire() avoids moving pages out of PQ_ACTIVE to honour the
request, the idea being that they're likely mapped and so will simply
get bounced back in to PQ_ACTIVE during a queue scan.
In both cases, if the page was logically in PQ_ACTIVE but had not yet
been physically enqueued (i.e., the page is in a per-CPU batch), we
would end up clearing PGA_REQUEUE from the page. Then, batch processing
would ignore the page, so it would end up unwired and not in any queues.
This can arise, for example, when a page is allocated and then
vm_page_activate() is called multiple times in quick succession. The
result is that the page is hidden from the page daemon, so while it will
be freed when its VM object is destroyed, it cannot be reclaimed under
memory pressure.
Fix the bug: when checking if a page is in PQ_ACTIVE, only perform the
optimization if the page is physically enqueued.
PR: 256507
Fixes: f3f38e2580 ("Start implementing queue state updates using fcmpset loops.")
Reviewed by: alc, kib
MFC after: 1 week
Sponsored by: E-CARD Ltd.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D36839
Use time_t rather than uint32_t to represent the timestamps. That means
we have 64 bits rather than 32 on all platforms except i386, avoiding
the Y2K38 issues on most platforms.
Reviewed by: Zhenlei Huang
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D36837
While at it optimise "case 3" into a default.
This way there is no need to initialize the "mark" variable in the beginning,
because all cases set it.
MFC after: 1 week
Sponsored by: NVIDIA Networking
Differential Revision: https://reviews.freebsd.org/D36042
This allows invop-based providers (i.e., fbt and kinst) to expose the
register file of the CPU at the point where the probe fired. It does
not work for SDT providers because their probes are implemented as plain
function calls and so don't save registers. It's not clear what
semantics "regs" should have for them anyway.
This is akin to "uregs", which nominally provides access to the
userspace registers. In fact, DIF already had a DIF_VAR_REGS variable
defined, it was simply unimplemented.
Usage example: print the contents of %rdi upon each call to
amd64_syscall():
fbt::amd64_syscall:entry {printf("%x", regs[R_RDI]);}
Note that the R_* constants are defined in /usr/lib/dtrace/regs_x86.d.
Currently there are no similar definitions for non-x86 platforms.
Reviewed by: christos
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D36799
Summary:
The indirect flag tells the hardware to use a flat or two level table.
As we only support using the flat table ensure the flag that marks
which is in use is set correctly.
We can't rely on this being set correctly as some firmware may set the
indirect flag, e.g. booting from LinuxBoot.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36873
Add sysctl/tunable to control Electromechanical Interlock support.
Disable it by default since Linux does not do it either and it seems
the number of systems having it broken is higher than having working.
This fixes NVMe backplane operation on ASUS RS500A-E11-RS12U server
with AMD EPYC 7402 CPU, where attempts to control reported interlock
for some reason end up in PCIe link loss, while interlock status does
not change (it is not really there).
MFC after: 2 weeks
By adding missing ifdefs for INET and INET6 when building LINT-NOIP .
Differential Revision: https://reviews.freebsd.org/D36731
Sponsored by: NVIDIA Networking
By updating function arguments for ipsec_kmod_ctlinput() which is used
when loading IPSEC support via kernel modules.
Differential Revision: https://reviews.freebsd.org/D36731
Sponsored by: NVIDIA Networking
When converting times to and from units which have many leading zeros,
it pays off to compute the greatest common divisor first, and then do the
scaling product. This way all time unit conversion comes down to scaling a
signed or unsigned 64-bit value by a fraction represented by two signed
or unsigned 32-bit integers.
SBT_1S is defined as 2^32 . When scaling using powers of 10 above 1,
the gcd of SBT_1S and 10^N is always greater than or equal to 4,
when N is greater or equal to 2.
Scaling a sbt value to milliseconds is then done by multiplying by
(1000 / 8) and dividing by (2^32 / 8).
This trick allows for higher precision at very little additional CPU cost.
It shall also be noted that the Xtosbt() functions prior to this patch,
sometimes were off-by-one:
For example when converting 1 / 8 of a second to sbt as 125ms the old sbt
conversion function would compute 0x20000001 while the new function computes
0x20000000 which multiplied by 8 becomes SBT_1S, which is the correct value.
Reviewed by: kib@
Differential Revision: https://reviews.freebsd.org/D36857
MFC after: 1 week
Sponsored by: NVIDIA Networking
Division by zero triggers an arithmetic exception and should not be very
common. Predict this.
No functional change intended.
MFC after: 1 week
Sponsored by: NVIDIA Networking
TCP has an idle-reduce feature that allows a connection to reduce its
cwnd after it has been idle more than an RTT. This feature only works
for a sending side connection. It does this by at output checking the
idle time (t_rcvtime vs ticks) to see if its more than the RTO timeout.
The problem comes if you are a web server. You get a request and
then send out all the data.. then go idle. The next time you would
send is in response to a request from the peer asking for more data.
But the thing is you updated t_rcvtime when the request came in so
you never reduce.
The fix is to do the idle reduce check also on inbound.
Reviewed by: tuexen, rscheff
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D36721
Two functions to call tcp_drop() and tcp_close() from a callout context.
Garbage collect tcp_inpinfo_lock_del(), it has a single use now.
Differential revision: https://reviews.freebsd.org/D36397
In the original design of the network stack from the protocol control
input method pr_ctlinput was used notify the protocols about two very
different kinds of events: internal system events and receival of an
ICMP messages from outside. These events were coded with PRC_ codes.
Today these methods are removed from the protosw(9) and are isolated
to IPv4 and IPv6 stacks and are called only from icmp*_input(). The
PRC_ codes now just create a shim layer between ICMP codes and errors
or actions taken by protocols.
- Change ipproto_ctlinput_t to pass just pointer to ICMP header. This
allows protocols to not deduct it from the internal IP header.
- Change ip6proto_ctlinput_t to pass just struct ip6ctlparam pointer.
It has all the information needed to the protocols. In the structure,
change ip6c_finaldst fields to sockaddr_in6. The reason is that
icmp6_input() already has this address wrapped in sockaddr, and the
protocols want this address as sockaddr.
- For UDP tunneling control input, as well as for IPSEC control input,
change the prototypes to accept a transparent union of either ICMP
header pointer or struct ip6ctlparam pointer.
- In icmp_input() and icmp6_input() do only validation of ICMP header and
count bad packets. The translation of ICMP codes to errors/actions is
done by protocols.
- Provide icmp_errmap() and icmp6_errmap() as substitute to inetctlerrmap,
inet6ctlerrmap arrays.
- In protocol ctlinput methods either trust what icmp_errmap() recommend,
or do our own logic based on the ICMP header.
Differential revision: https://reviews.freebsd.org/D36731
where struct ipsec_methods is defined. Not a functional change.
Allows further modification of method prototypes without breaking
compilation of other ipsec compilation units.
Differential revision: https://reviews.freebsd.org/D36730
Now these functions are called only from icmp*_input(). The pointer
to the ICMP data is never NULL and cmd has a limited set of values.
In the past the functions were demultiplexing control messages from
ICMP layer, as well as internally generated events. In the latter
case the the pointer to IP would be NULL.
Differential revision: https://reviews.freebsd.org/D36729
and mark those PRC_* codes, that are used. The rest are dead code.
This is not a functional change, but illustrative to make easier
review of following changes.
After decoupling of protosw(9) and IP wire protocols in 78b1fc05b2 for
IPv4 we got vector ip_ctlprotox[] that is executed only and only from
icmp_input() and respectively for IPv6 we got ip6_ctlprotox[] executed
only and only from icmp6_input(). This allows to use protocol specific
argument types in these methods instead of struct sockaddr and void.
Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36727
The netinet/ipprotosw.h and netinet6/ip6protosw.h were KAME relics, with
the former removed in f0ffb944d2 in 2001 and the latter survived until
today. It has been reduced down to only one useful declaration that
moves to ip6_var.h
Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36726
With this change one can make a forward declaration of a function
that is of UDP tunneling type.
Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36724
The Adler32 digest consists of two 16-bit words whose values are
calculated modulo 65521 (largest prime < 2^16). To avoid two division
instructions per byte, this version copies an optimization found in
zlib which defers the modulus until close to the point that the
intermediate sums can overflow 2^32. (zlib uses NMAX == 5552 for
this, this version uses 5000)
The bug is that in the deferred modulus case, the modulus was
only applied to the high word (and twice at that) but not to
the low word. The fix is to apply it to both words.
Reviewed by: jhibbits
Reported by: Miod Vallat <miod@openbsd.org>
Differential Revision: https://reviews.freebsd.org/D36798
GCC warns about the mismatched sizes on i386 where vm_paddr_t is 64
bits.
Reviewed by: imp, markj
Differential Revision: https://reviews.freebsd.org/D36750
r325506 (3cf8254f1e) extended struct pkthdr to add packet timestamp in
mbuf(9) chain. For example, cxgbe(4) and mlx5en(4) support this feature.
Use the timestamp for bpf(4) if it is available.
Reviewed by: hselasky, kib, np
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D36868
Ensure that an RFC3168 ECN reaction only occurs on non-SYN
segments.
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D36867