In the common case, kinst emulates a traced instruction by copying it to
a trampoline, where it is followed by a jump back to the original code,
and pointing the interrupted thread's %rip at the trampoline. In
particular, the trampoline is executed with the same CPU context as the
original instruction, so if interrupts are enabled at the point where
the probe fires, they will be enabled when the trampoline is
subsequently executed.
It can happen that an interrupt is raised while a thread is executing a
kinst trampoline. In that case, it is possible that the interrupt
handler will trigger a kinst probe, so we must ensure that the thread
does not recurse and overwrite its trampoline before it is finished
executing the original contents. Otherwise, an attempt to trace code
called from interrupt handlers can crash the kernel.
To that end, add a per-CPU trampoline, used when the probe fired with
interrupts disabled. Note that this is not quite complete since it does
not handle the possibility of kinst probes firing while executing an NMI
handler.
Also ensure that we do not trace instructions which set IF, since in
that case it is not clear which trampoline (the per-thread trampoline or
the per-CPU trampoline) we should use, and since such instructions are
rare.
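
The trampoline choice described above can be sketched roughly as
follows. This is a minimal illustration, not the kinst code: struct
tramp_ctx and select_trampoline() are hypothetical names, and only the
meaning of the IF bit in %rflags (0x200) is taken as fact.

```c
#include <stdbool.h>
#include <stdint.h>

#define PSL_I	0x00000200u	/* x86 RFLAGS interrupt-enable bit (IF) */

/* Hypothetical stand-in for the per-thread and per-CPU trampoline state. */
struct tramp_ctx {
	uint8_t *thread_tramp;	/* probe fired with interrupts enabled */
	uint8_t *cpu_tramp;	/* probe fired with interrupts disabled */
};

/*
 * Pick a trampoline from the interrupted context's saved flags.  A probe
 * that fires with interrupts disabled (for example, inside an interrupt
 * handler) gets the per-CPU trampoline, so it cannot overwrite a
 * per-thread trampoline that is still being executed.
 */
static uint8_t *
select_trampoline(const struct tramp_ctx *ctx, uint64_t rflags)
{
	bool intr_enabled = (rflags & PSL_I) != 0;

	return (intr_enabled ? ctx->thread_tramp : ctx->cpu_tramp);
}
```

An instruction that sets IF would change the answer between probe entry
and trampoline execution, which is one way to see why such instructions
are left untraced.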
Reported and tested by: Domagoj Stolfa
Reviewed by: christos
Fixes: f0bc4ed144 ("kinst: Initial revision")
Differential Revision: https://reviews.freebsd.org/D37619
* Correct some function prototypes which were documented with the wrong
pointer type.
* Clarify return values and requirements for freeing the limit handle.
MFC after: 1 week
Sponsored by: Axcient
Reviewed by: oshogbo
Differential Revision: https://reviews.freebsd.org/D37586
shlib_required/provided is enough for the dependencies and this also
causes problems for packages like rescue which shouldn't depend on runtime
at all.
PR: 268063
Sponsored by: Beckhoff Automation GmbH & Co. KG
This allows the use of chroot and/or jail environments which depend on
interpreters registered with imgact_binmisc to use emulator binaries from
the host to emulate programs inside the chroot.
Reviewed by: imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D37432
ppp supports MSS clamping for TCP/IPv4. This patch
* improves MSS clamping for TCP/IPv4 by using the MSS as specified
in RFC 6691.
* adds support for MSS clamping for TCP/IPv6.
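
As a rough sketch of the arithmetic involved, assuming the RFC 6691
rule that the MSS is the MTU minus the fixed IP and TCP header sizes
(options are not subtracted). The function names below are
hypothetical and are not taken from the ppp sources.

```c
#include <stdint.h>

#define IPV4_HDR_LEN	20u	/* fixed IPv4 header, no options */
#define IPV6_HDR_LEN	40u	/* fixed IPv6 header, no extension headers */
#define TCP_HDR_LEN	20u	/* fixed TCP header, no options */

/* Largest MSS that fits in the link MTU for the given address family. */
static uint16_t
max_mss(uint16_t mtu, int is_ipv6)
{
	uint16_t overhead = (is_ipv6 ? IPV6_HDR_LEN : IPV4_HDR_LEN) +
	    TCP_HDR_LEN;

	return (mtu > overhead ? (uint16_t)(mtu - overhead) : 0);
}

/* Lower an advertised MSS to the link limit; never raise it. */
static uint16_t
clamp_mss(uint16_t advertised, uint16_t mtu, int is_ipv6)
{
	uint16_t limit = max_mss(mtu, is_ipv6);

	return (advertised > limit ? limit : advertised);
}
```

For a 1492-byte PPPoE-style MTU, this gives a limit of 1452 for
TCP/IPv4 and 1432 for TCP/IPv6.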
Reported by: Timo Voelker
Reviewed by: thj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D37624
All the other functions used pointers for from/to instead of
fixed-size array parameters. More importantly, this function can
accept pointers to buffers of multiple blocks, not just a single
block.
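
To illustrate the difference, here is a minimal sketch with a
hypothetical xor_blocks() routine (not the function touched by this
change). An array-style parameter such as "uint8_t to[BLOCK_LEN]" is
still just a pointer, but it suggests a single block; a plain pointer
plus an explicit count states the real contract.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_LEN	16	/* hypothetical block size in bytes */

/*
 * Pointer parameters with an explicit block count: the caller may pass
 * a buffer covering any number of blocks, and the prototype no longer
 * implies a fixed-size array.
 */
static void
xor_blocks(const uint8_t *from, uint8_t *to, size_t nblocks)
{
	for (size_t i = 0; i < nblocks * BLOCK_LEN; i++)
		to[i] ^= from[i];
}
```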
Reported by: GCC -Warray-parameter
Reviewed by: imp, emaste
Differential Revision: https://reviews.freebsd.org/D37547
They are shared between UDP over IPv4 and over IPv6. To prevent all
possible kernel build failures, wrap them in #ifdef _SYS_PROTOSW_H_.
Prompted by feedback from jhb@ and jrtc27@ on c93db4abf4.
Add some extra customization points so that the FreeBSD build
can be adapted to local requirements.
We use these to minimize changes to share/mk.
Reviewed by: stevek
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D37617
Early versions of this code had a free, but this one doesn't need
it. Remove the forgotten free(vv); from earlier versions.
Fixes: ed56dcfc6b
Noticed by: Michael Butler
Sponsored by: Netflix
Do the standard command line parsing, with a small twist to deal with
the quirks of booting via linuxboot to the initrd from the command line
in shell.efi and other observed oddities.
Sponsored by: Netflix
Copy the arg that sets a variable to maximize the reuse of this
routine. There are places we call it from that are const char * and it
might not be safe to cast that away.
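
A minimal sketch of the idea, assuming a "name=value" argument: work on
a private copy so that a const char * caller's string is never
modified. split_assignment() is a hypothetical helper, not the loader
routine itself.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Split "name=value" without touching the caller's string.  On success
 * *name points at a heap copy (the caller frees it) and *value points
 * into that same copy.  Returns 0 on success, -1 on error.
 */
static int
split_assignment(const char *arg, char **name, char **value)
{
	size_t len = strlen(arg) + 1;
	char *copy = malloc(len);
	char *eq;

	if (copy == NULL)
		return (-1);
	memcpy(copy, arg, len);
	eq = strchr(copy, '=');
	if (eq == NULL) {
		free(copy);
		return (-1);
	}
	*eq = '\0';		/* safe: we only modify our own copy */
	*name = copy;
	*value = eq + 1;
	return (0);
}
```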
Sponsored by: Netflix
main() of the boot loader is expected to call devinit() early. We do
this at the same time we do it in the EFI loader (except we don't have a
buffer cache here, we don't need to initialize time and we don't have
special efi partition handles to enumerate). This is just after we probe
for the console.
Sponsored by: Netflix
Copy EFI's bootinfo.c and make minor adjustments for kboot's needs. Do
not connect this to the build just yet until other pieces are in place.
Sponsored by: Netflix
bootinfo.c is about to be shared with kboot since they create
substantially similar environments / metadata tagging / etc. Tag this
with #ifdef EFI for the moment until the proper abstracting out can
happen.
Sponsored by: Netflix
Use only one callout structure per tcpcb that is responsible for handling
all five TCP timeouts. Use the locked version of callout, of course. The
callout function tcp_timer_enter() chooses the soonest timer and executes
it with the lock held. Unless the timer reports that the tcpcb has been
freed, the callout is rescheduled for the next soonest timer, if any.
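
The selection step can be sketched as below. This is an illustration
only: struct conn_timers, TT_DISARMED and soonest_timer() are
hypothetical names, and deadlines are modelled as plain 64-bit
integers rather than the kernel's time types.

```c
#include <stdint.h>

#define TT_COUNT	5		/* five TCP timers per connection */
#define TT_NONE		(-1)		/* no timer is armed */
#define TT_DISARMED	INT64_MAX	/* hypothetical "not armed" marker */

struct conn_timers {
	int64_t deadline[TT_COUNT];	/* absolute expiry time per timer */
};

/*
 * Return the index of the armed timer with the earliest deadline, or
 * TT_NONE when nothing is armed.  A single callout would be scheduled
 * for that deadline and re-armed for the next result after it fires.
 */
static int
soonest_timer(const struct conn_timers *t)
{
	int which = TT_NONE;

	for (int i = 0; i < TT_COUNT; i++) {
		if (t->deadline[i] == TT_DISARMED)
			continue;
		if (which == TT_NONE || t->deadline[i] < t->deadline[which])
			which = i;
	}
	return (which);
}
```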
With a single callout per tcpcb, on connection teardown we should be able
to fully stop the callout and immediately free it, avoiding the use of
callout_async_drain(). There is one gotcha here: callout_stop() can
actually touch our memory when a rare race condition happens. See the
comment above tcp_timer_stop(). Synchronous stop of the callout makes
tcp_discardcb() the single entry point for the tcpcb destructor, merging
tcp_freecb() into the end of that function.
While here, also remove lots of lingering checks in the beginning of
TCP timer functions. With a locked callout they are unnecessary.
While here, clean unused parts of timer KPI for the pluggable TCP stacks.
While here, remove TCPDEBUG from tcp_timer.c, as this allows for more
simplification of TCP timers. The TCPDEBUG is scheduled for removal.
Move the DTrace probes in timers to the beginning of a function, where
a tcpcb always exists.
Discussed with: rrs, tuexen, rscheff (the TCP part of the diff)
Reviewed by: hselasky, kib, mav (the callout part)
Differential revision: https://reviews.freebsd.org/D37321
For the TCP protocol inpcb storage, specify an allocation size that
provides space for most of the data a TCP connection needs, embedding
into struct tcpcb several structures that were previously allocated
separately.
The most important one is the inpcb itself. With embedding we can provide
a strong guarantee that with a valid TCP inpcb the tcpcb is always valid
and vice versa. We also reduce the number of allocs/frees per connection.
The embedded inpcb is placed in the beginning of the struct tcpcb,
since in_pcballoc() requires that. However, later we may want to move
it around for cache line efficiency, and this can be done with a little
effort. The new intotcpcb() macro is ready for such a move.
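
A minimal sketch of how such a macro can stay correct even if the
embedded member moves, using simplified stand-in structures. The
layouts and the t_inpcb member name below are assumptions made for
illustration, and the macro body is one possible definition rather
than a copy of the kernel's.

```c
#include <stddef.h>

/* Simplified stand-ins; the real structures have many more members. */
struct inpcb {
	int inp_flags;
};

struct tcpcb {
	struct inpcb t_inpcb;	/* embedded, currently the first member */
	int t_state;
};

/*
 * Recover the enclosing tcpcb from a pointer to its embedded inpcb.
 * Because the offset comes from offsetof(), the conversion keeps
 * working if t_inpcb is later moved away from the start of the struct.
 */
#define intotcpcb(inp) \
	((struct tcpcb *)((char *)(inp) - offsetof(struct tcpcb, t_inpcb)))
```

With the inpcb as the first member, the macro reduces to a plain cast,
so there is no run-time cost today.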
The congestion algorithm data, the TCP timers and osd(9) data are
also embedded into tcpcb, and the temporary struct tcpcb_mem goes away.
There was no extra allocation here, but we went through an extra pointer
every time we accessed this data.
One interesting side effect is that TCP data is now allocated from an
SMR-protected zone. Potentially this allows the TCP stacks or other
TCP related modules to utilize that for their own synchronization.
A large part of the change was done with a sed script:
s/tp->ccv->/tp->t_ccv./g
s/tp->ccv/\&tp->t_ccv/g
s/tp->cc_algo/tp->t_cc/g
s/tp->t_timers->tt_/tp->tt_/g
s/CCV\(ccv, osd\)/\&CCV(ccv, t_osd)/g
A dependency side effect is that code that needs to know struct tcpcb
must also know struct inpcb, which required adding several
<netinet/in_pcb.h> includes.
Differential revision: https://reviews.freebsd.org/D37127
This should be spelt IMGACT_BINMISC to match the filename. The option
name does not appear outside of sys/conf and this module is typically
used via the kernel module imgact_binmisc.ko.
MFC After: 2 weeks
Notable upstream pull request merges:
#13782 Fix setting the large_block feature after receiving a snapshot
#14157 FreeBSD: stop using buffer cache-only routines on sync
#14172 zed: post a udev change event from spa_vdev_attach()
#14181 zed: unclean disk attachment faults the vdev
#14190 Bump checksum error counter before reporting to ZED
#14196 Remove atomics from zh_refcount
#14197 Don't leak packed received properties
#14198 Switch dnode stats to wmsums
#14199 Remove few pointer dereferences in dbuf_read()
#14200 Micro-optimize zrl_remove()
#14204 Lua: Fix bad bitshift in lua_strx2number()
#14212 Zstd fixes
#14218 Avoid a null pointer dereference in zfs_mount() on FreeBSD
#14235 nopwrites on dmu_sync-ed blocks can result in a panic
#14236 zio can deadlock during device removal
#14247 Micro-optimize fletcher4 calculations
#14261 FreeBSD: zfs_register_callbacks() must implement error check
correctly
Obtained from: OpenZFS
OpenZFS commit: 59493b63c1
The reference to the "DARPA Internet" seems not quite
up to date in 2022, so move that to the HISTORY section.
Mention RFC 2780 and RFC 5237.
Obtained from: NetBSD
MFC after: 3 days
These facilitate customizing the build with minimal churn.
Reviewed by: stevek
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D37592
When PCI_IOV is not enabled, do not attempt to call
iflib_softirq_alloc_generic(...IFLIB_INTR_IOV), as it results
in boot-time warnings similar to:
taskqgroup_attach_cpu: qid not found for iov cpu=2
ixl2: taskqgroup_attach_cpu failed 22
Instead, make it conditional on PCI_IOV like the other
SR-IOV related code.
Reviewed by: erj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D37609