The option is not even recognized and with that patched it does not
compile. Even if it did work, it would be prohibitively expensive to
use.
Interested parties can use pmcstat or dtrace instead.
Until bnxt is fixed on i386, remove it from its lint. Create a new
section of the config file for things that work everywhere, except i386.
Sponsored by: Netflix
Some architectures will pretty-print a system call trap in the
backtrace. Rather than printing the symbol, use the syscallname()
function to pull the string from the sv_syscallnames array corresponding
to the process. This simplifies the function somewhat.
Mostly, this will result in dropping the "sys" prefix, e.g. "sys_exit"
will now be printed simply as "exit".
Make two minor tweaks to the function signature: use a u_int for the
syscall number since this is a more correct type (see the 'code' member
of struct syscall_args), and make the thread pointer the first argument.
The latter is more natural and conventional.
Suggested by: jrtc27
Reviewed by: jrtc27, markj, jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37200
This allows the syscallname() function to give a usable result for Linux
ABIs.
Reported by: jrtc27
Reviewed by: jrtc27, markj, jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37199
Use it and several other vm_page_*_valid() functions in more places.
Suggested and reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D37024
Add a <sys/_pv_entry.h> intended for use in <machine/pmap.h> to
define struct pv_entry, pv_chunk, and related macros and inline
functions.
Note that powerpc does not yet use this as while the mmu_radix pmap
in powerpc uses the new scheme (albeit with fewer PV entries in a
chunk than normal due to an used pv_pmap field in struct pv_entry),
the Book-E pmaps for powerpc use the older style PV entries without
chunks (and thus require the pv_pmap field).
Suggested by: kib
Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36685
Only i386 and amd64 print the decoded syscall name in the backtrace.
This de-duplication facilitates further changes and adoption by other
platforms.
Reviewed by: jrtc27, markj, jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D36565
This matches the return type of pmap_mapdev/bios.
Reviewed by: kib, markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36548
We don't need the 56 bytes at the end of bootinfo, and never had. Don't
copy them from old boot loaders, and don't provide them with new boot
loaders.
Add comments about what versions of FreeBSD 'old' means in various
contexts. Add note that old disk loader (from 1.x/2.x) is doomed to
failure because it doesn't provide metadata that we now require to boot,
and has been since approximately FreeBSD 7.x. Retain all this
information to explain why we have 4 arguments that are always 0, even
though it's ancient history.
This saves 56 bytes in the boot loader.
Sponsored by: Netflix
Reviewed by: phk, rgrimes, kib
Differential Revision: https://reviews.freebsd.org/D36550
Mark the obsolete fields in bootinfo with _was_. Note that the fields
from bi_memdesc_version to the end of the structure never were used in a
release. They were added in April 2010 for i386 EFI booting. The boot
loader set these fields though 2019, but no kernel ever looked at them
and we never supported i386 EFI booting, and likely never will in this
form. They can likely be deleted entirely in the future, but locore.S
needs to change to do that (it also needs to change to drop support for
really old booting scenarios as well, which will eliminate bi_endcommon
too).
All the other fields haven't been used since the 4.x -> 5.x cutover of
the wdc driver to ata.
The bi_bios_dev field is used in the handoff between bootXX and the
loader. The loader uses it to determine what disk it was loaded off of
to detmerine the default root filesystem. It's not used by the i386
kernel anymore to determine anything.
Sponsored by: Netflix
Reviewed by: tsoome
Differential Revision: https://reviews.freebsd.org/D36544
Do not require that %ebx contains idlePTD AKA %kcr3. This also
simplifies KBI contract between copyout_fast and page handler.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
by delegating the work to the slow path.
Some kernel memory, like pipe buffers, is pageable. We must not enable
interrupts, and consequently, preemption, while in critical section in
the fast copyout path, because we use pcpu buffers. If page fault
occurs while copying from the pcpu copyout_buf to kernel memory, abort
fast path and delegate work to the slow implementation.
In collaboration with: pho, tijl
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
and not on the trampoline stack. This is a useful way to ensure that
we did not enabled interrupts while on user %cr3 or trampoline stack.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This changes the default TCP Congestion Control (CC) to CUBIC.
For small, transactional exchanges (e.g. web objects <15kB), this
will not have a material effect. However, for long duration data
transfers, CUBIC allocates a slightly higher fraction of the
available bandwidth, when competing against NewReno CC.
Reviewed By: tuexen, mav, #transport, guest-ccui, emaste
Relnotes: Yes
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D36537
for the "trap with interrupts disabled" warning.
Reviewed by: jhb
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D36302
Also compactify the printfs, and remove comment about 'two prints'.
Their arguments are on same page, so one fault implies another.
Reviewed by: jhb
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D36302
It is enough to have only one 'call calltrap' locally.
Reviewed by: jhb
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D36302
There is no reason to do this. Instead just calculate it later.
Reviewed by: jhb
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D36302
tf_trapno is checked on return from interrupt/exception to determine if
special handling is needed for switching address space. This is due to
the possibility of NMI/MCHK/DBG to occur at arbitrary place in kernel,
where both address space and stack used could be transient. Kernel
saves current %cr3 in tf_err for such events, to restore on return.
If user is able to set tf_trapno, it can trigger that special handling,
and since tf_err is also user-controlled by sigreturn(2), the result is
undefined.
PR: 265889
Reported by: lwhsu
Reviewed by: jhb
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D36302
Which means that we must not copy top 8 bytes from the trampoline stack
for the exception frame to the regular thread kstack. As consequence,
this stops corruption of the pcb. The visible effect was often a broken
fork(2) on the CPU where corruption occured.
Account for the detail by substracting 8 from the copy byte count when
moving exception frames from trampoline to the regular stack.
[irettraps handles segmentation/stack/protection faults which could
occur on the doreti path, where we might already switched stack and
address space]
Reported and tested by: pho
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D36302
Do not blindly account a page fault occuring on the trampoline area,
as the userspace access fault. Check that it occured exactly in the
instruction that does that.
This avoids unneeded switches of address space on faults not needing the
switch, effectively converting machine resets due to tripple faults,
into regular panics.
Reviewed by: jhb
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D36302
This applies one of the changes from
5567d6b4419b02a2099527228b1a51cc55a5b47d to other architectures
besides arm64.
Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36263
It does not work with ULE, which is the default scheduler for over a
decade.
Reviewed by: emaste, kib
Differential Revision: https://reviews.freebsd.org/D36094
Make most AST handlers dynamically registered. This allows to have
subsystem-specific handler source located in the subsystem files,
instead of making subr_trap.c aware of it. For instance, signal
delivery code on return to userspace is now moved to kern_sig.c.
Also, it allows to have some handlers designated as the cleanup (kclear)
type, which are called both at AST and on thread/process exit. For
instance, ast(), exit1(), and NFS server no longer need to be aware
about UFS softdep processing.
The dynamic registration also allows third-party modules to register AST
handlers if needed. There is one caveat with loadable modules: the
code does not make any effort to ensure that the module is not unloaded
before all threads processed through AST handler in it. In fact, this
is already present behavior for hwpmc.ko and ufs.ko. I do not think it
is worth the efforts and the runtime overhead to try to fix it.
Reviewed by: markj
Tested by: emaste (arm64), pho
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35888
This is not completely exhaustive, but covers a large majority of
commands in the tree.
Reviewed by: markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35583
Store the shared page address in struct vmspace.
Also instead of storing absolute addresses of various shared page
segments save their offsets with respect to the shared page address.
This will be more useful when the shared page address is randomized.
Approved by: mw(mentor)
Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D35393
Use a getter macro instead of fetching the sigcode address directly
from a sysent of a given process. It assumes that the sigcode is stored
in the shared page, which is true in all cases, except for a.out
binaries. This will be later useful when the shared page address
randomization is introduced.
No functional change intended.
Approved by: mw(mentor)
Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D35392
On x86 systems, the debug.late_console tunable makes it possible to set
up the console before we call pmap_bootstrap. (The tunable is turned
on by default; setting late_console=0 results in consoles being probed
early.)
Unfortunately this is not compatible with using the ACPI SPCR table to
find the console, since consulting ACPI tables requires mapping memory
addresses. As such, we skip the call to uart_cpu_acpi_spcr from
uart_cpu_x86 in the !late_console case.
Reviewed by: imp
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D35794
Some tools (firecraker loader) only check for notes in PT_NOTE program
headers, so make sure the notes added using the ELFNOTE macro end up
in such header.
Output from readelf -Wl for and amd64 kernel after the change:
Elf file type is EXEC (Executable file)
Entry point 0xffffffff8038a000
There are 11 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0xffffffff80200040 0x0000000000200040 0x000268 0x000268 R 0x8
INTERP 0x0002a8 0xffffffff802002a8 0x00000000002002a8 0x00000d 0x00000d R 0x1
[Requesting program interpreter: /red/herring]
LOAD 0x000000 0xffffffff80200000 0x0000000000200000 0x189e28 0x189e28 R 0x200000
LOAD 0x18a000 0xffffffff8038a000 0x000000000038a000 0xe447e8 0xe447e8 R E 0x200000
LOAD 0xfce7f0 0xffffffff811ce7f0 0x00000000011ce7f0 0x6b955c 0x6b955c R 0x200000
LOAD 0x1800000 0xffffffff81a00000 0x0000000001a00000 0x000140 0x000140 RW 0x200000
LOAD 0x1801000 0xffffffff81a01000 0x0000000001a01000 0x1c8480 0x5ff000 RW 0x200000
DYNAMIC 0x1800000 0xffffffff81a00000 0x0000000001a00000 0x000140 0x000140 RW 0x8
GNU_RELRO 0x1800000 0xffffffff81a00000 0x0000000001a00000 0x000140 0x000140 R 0x1
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0
NOTE 0x1687ae0 0xffffffff81887ae0 0x0000000001887ae0 0x0001c0 0x0001c0 R 0x4
Section to Segment mapping:
Segment Sections...
[...]
10 .note.gnu.build-id .note.Xen
Reported by: cperciva
Fixes: 1a9cdd373a6a ('xen: add PV/PVH kernel entry point')
Fixes: 93ee134a24fa ('Integrate support for xen in to i386 common code.')
Sponsored by: Citrix Systems R&D
Reviewed by: emaste
Differential revision: https://reviews.freebsd.org/D35611
The third argument to this function indicates whether the supplied
ticker is fixed or variable, i.e. requiring calibration. Give this
argument a type and name that better conveys this purpose.
Reviewed by: kib, markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35459
Replace sigframe sf_extramask by native sigset_t and use it to
store/restore the thread signal mask without conversion to/from
Linux signal mask.
Pointy hat to: dchagin
MFC after: 2 weeks
To solve y2k38 problem in the recvmsg syscall the new SO_TIMESTAMP
constant were added on v5.1 Linux kernel. So, old 32-bit binaries
that knows only 32-bit time_t uses the old value of the constant,
and binaries that knows 64-bit time_t uses the new constant.
To determine what size of time_t type is expected by the user-space,
store requested value (SO_TIMESTAMP) in the process emuldata structure.
MFC after: 2 weeks