First, r204494 introduced dpcpu_off in struct __kvm and it was allocated
from _kvm_dpcpu_init() but it was not free(3)'ed from kvm_close(3).
Second, r291406 introduced kvm_nlist2(3) and converted kvm_nlist(3) to
use the new function but it did not free the temporary buffer.
Also, check possible calloc(3) failure while I am in the neighborhood.
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29019
KCSAN complains about racy accesses in the locking code. Those races are
fine since they are inside a TD_SET_RUNNING() loop that expects the value
to be changed by another CPU.
Use relaxed atomic stores/loads to indicate that this variable can be
written/read by multiple CPUs at the same time. This will also prevent
the compiler from doing unexpected re-ordering.
Reported by: GENERIC-KCSAN
Test Plan: KCSAN no longer complains, kernel still runs fine.
Reviewed By: markj, mjg (earlier version)
Differential Revision: https://reviews.freebsd.org/D28569
Instead of trying to maintain pg_jobc counter on each process group
update (and sometimes before), just calculate the counter when needed.
Still, for the benefit of the signal delivery code, explicitly mark
orphaned groups as such with the new process group flag.
This way we prevent bugs in the corner cases where updates to the counter
were missed due to complicated configuration of p_pptr/p_opptr/real_parent
(debugger).
Since we need to iterate over all children of the process on exit, this
change mostly affects the process group entry and leave, where we need
to iterate all process group members to detect orpaned status.
(For MFC, keep pg_jobc around but unused).
Reported by: jhb
Reviewed by: jilles
Tested by: pho
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27871
As of r365978, minidumps include a copy of dump_avail[]. This is an
array of vm_paddr_t ranges. libkvm walks the array assuming that
sizeof(vm_paddr_t) is equal to the platform "word size", but that's not
correct on some platforms. For instance, i386 uses a 64-bit vm_paddr_t.
Fix the problem by always dumping 64-bit addresses. On platforms where
vm_paddr_t is 32 bits wide, namely arm and mips (sometimes), translate
dump_avail[] to an array of uint64_t ranges. With this change, libkvm
no longer needs to maintain a notion of the target word size, so get rid
of it.
This is a no-op on platforms where sizeof(vm_paddr_t) == 8.
Reviewed by: alc, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27082
No functional change intended.
Tracking these structures separately for each proc enables future work to
correctly emulate clone(2) in linux(4).
__FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof.
Reviewed by: kib
Discussed with: markj, mjg
Differential Revision: https://reviews.freebsd.org/D27037
Repeating the default WARNS here makes it slightly more difficult to
experiment with default WARNS changes, e.g. if we did something absolutely
bananas and introduced a WARNS=7 and wanted to try lifting the default to
that.
Drop most of them; there is one in the blake2 kernel module, but I suspect
it should be dropped -- the default WARNS in the rest of the build doesn't
currently apply to kernel modules, and I haven't put too much thought into
whether it makes sense to make it so.
On Ampere Altra systems, the sparse population of RAM within the
physical address space causes the vm_page_dump bitmap to be much
larger than necessary, increasing the size from ~8 Mib to > 2 Gib
(and overflowing `int` for the size).
Changing the page dump bitmap also changes the minidump file
format, so changes are also necessary in libkvm.
Reviewed by: jhb
Approved by: scottl (implicit)
MFC after: 1 week
Sponsored by: Ampere Computing, Inc.
Differential Revision: https://reviews.freebsd.org/D26131
This test checks if value received from kvm_read is sane, based on
value returned by sysctl interface.
This should catch regression on bug fixed by r359160
Reviewed by: jhb
Approved by: jhibbits (mentor)
MFC after: 1 week
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D23783
It seems this manpage was copied from kvm_getloadavg(3), but the
DIAGNOSTICS section was not updated completely. Update the section with
correct information about a return value of -1.
MFC after: 3 days
It was used only to store the bounds of each swap device. However,
since swblk_t is a signed 32-bit int and daddr_t is a signed 64-bit
int, swp_pager_isondev() may return an invalid result if swap devices
are repeatedly added and removed and sw_end for a device ends up
becoming a negative number.
Note that the removed comment about maximum swap size still applies.
Reviewed by: jeff, kib
Tested by: pho
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23666
Revert parts of r353274 replacing vnet_state with a shutdown flag.
Not having the state flag for the current SI_SUB_* makes it harder to debug
kernel or module panics related to VNET bringup or teardown.
Not having the state also does not allow us to check for other dependency
levels between components, e.g. for moving interfaces.
Expand the VNET structure with the new boolean flag indicating that we are
doing a shutdown of a given vnet and update the vnet magic cookie for the
change.
Update libkvm to compile with a bool in the kernel struct.
Bump __FreeBSD_version for (external) module builds to more easily detect
the change.
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23097
This change adds a new libkvm function, kvm_kerndisp(), that can be used to
retrieve the kernel displacement, that is the difference between the kernel's
base virtual address at run time and the kernel base virtual address specified
in the kernel image file.
This will be used by kgdb, to properly relocate kernel symbols, when needed.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D23285
Update a bunch of Makefile.depend files as
a result of adding Makefile.depend.options files
Reviewed by: bdrewery
MFC after: 1 week
Sponsored by: Juniper Networks
Differential Revision: https://reviews.freebsd.org/D22494
This change adds PowerPC64 support for minidumps on libkvm.
Address translation, page walk, and data retrieval were tested and seem to be
working correctly.
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D21555
Add support for kernel threads in kvm_getprocs() and the underlying
kvm_proclist() in libkvm when fetching from a kernel core file. This
has been missing/needed for several releases, when kernel threads became
normal threads. The loop over the processes now contains a sub-loop for
threads, which iterates beyond the first thread only when threads are
requested. Also set some fields such as tid that were previously
uninitialized.
Reviewed by: vangyzen jhb(earlier revision)
MFC after: 4 days
Sponsored by: Forcepoint LLC
Differential Revision: https://reviews.freebsd.org/D21461
All of them are needed to be able to boot to single user and be able
to repair a existing FreeBSD installation so put them directly into
FreeBSD-runtime.
Reviewed by: bapt, gjb
Differential Revision: https://reviews.freebsd.org/D21503
Previous spellings of my name (NGie, Ngie) weren't my legal spelling. Use Enji
instead for clarity.
While here, remove "All Rights Reserved" from copyrights I "own".
MFC after: 1 week
Effectively all i386 kernels now have two pmaps compiled in: one
managing PAE pagetables, and another non-PAE. The implementation is
selected at cold time depending on the CPU features. The vm_paddr_t is
always 64bit now. As result, nx bit can be used on all capable CPUs.
Option PAE only affects the bus_addr_t: it is still 32bit for non-PAE
configs, for drivers compatibility. Kernel layout, esp. max kernel
address, low memory PDEs and max user address (same as trampoline
start) are now same for PAE and for non-PAE regardless of the type of
page tables used.
Non-PAE kernel (when using PAE pagetables) can handle physical memory
up to 24G now, larger memory requires re-tuning the KVA consumers and
instead the code caps the maximum at 24G. Unfortunately, a lot of
drivers do not use busdma(9) properly so by default even 4G barrier is
not easy. There are two tunables added: hw.above4g_allow and
hw.above24g_allow, the first one is kept enabled for now to evaluate
the status on HEAD, second is only for dev use.
i386 now creates three freelists if there is any memory above 4G, to
allow proper bounce pages allocation. Also, VM_KMEM_SIZE_SCALE changed
from 3 to 1.
The PAE_TABLES kernel config option is retired.
In collaboarion with: pho
Discussed with: emaste
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D18894
Move definitions from cpuregs.h into the cca.h, and include cca.h into vm.h.
This is required to make MIPS MD memattr definitions usable in userspace.
Sponsored by: The FreeBSD Foundation, Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D15583
Replace size_t members with ksize_t (uint64_t) and pointer members
(never used as pointers in userspace, but instead as unique
idenitifiers) with kvaddr_t (uint64_t). This makes the structs
identical between 32-bit and 64-bit ABIs.
On 64-bit bit systems, the ABI is maintained. On 32-bit systems,
this is an ABI breaking change. The ABI of most of these structs
was previously broken in r315662. This also imposes a small API
change on userspace consumers who must handle kernel pointers
becoming virtual addresses.
PR: 228301 (exp-run by antoine)
Reviewed by: jtl, kib, rwatson (various versions)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15386
Rather than using #ifdef's around a static char array, use the
existing helper macro from <sys/cdefs.h> for SCCS IDs. To
preserve existing behavior, add -DNO__SCCSID to CFLAGS to not
include SCCS IDs in the built library by default.
Reviewed by: brooks, dab (older version)
Reviewed by: rgrimes
Differential Revision: https://reviews.freebsd.org/D15459
While <sys/sysctl.h> includes <sys/queue.h> unconditionally, it is only
actually used in code which is conditional on _KERNEL. Make the #include
itself conditional as well, and fix userland code that uses <sys/queue.h>
for other purposes but relied on <sys/sysctl.h> to bring it in.
MFC after: 1 week
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using mis-identified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
This API allows callers to enumerate all known pages, including any
direct map & kernel map virtual addresses, physical addresses, size,
offset into the core, & protection configured.
For architectures that support direct map addresses, also generate pages
for any direct map only addresses that are not associated with kernel
map addresses.
Fix page size portability issue left behind from previous kvm page table
lookup interface.
Reviewed by: jhb
Sponsored by: Backtrace I/O
Differential Revision: https://reviews.freebsd.org/D12279
systems that lack the libcall, based on __FreeBSD_version.
kvm_open2(3) wasn't made available until r291406, which is in ^/stable/11,
but not ^/stable/10. This makes some of kvm_geterr_test available for testing
on ^/stable/10.
MFC after: now
Sponsored by: Dell EMC Isilon
Extend the ino_t, dev_t, nlink_t types to 64-bit ints. Modify
struct dirent layout to add d_off, increase the size of d_fileno
to 64-bits, increase the size of d_namlen to 16-bits, and change
the required alignment. Increase struct statfs f_mntfromname[] and
f_mntonname[] array length MNAMELEN to 1024.
ABI breakage is mitigated by providing compatibility using versioned
symbols, ingenious use of the existing padding in structures, and
by employing other tricks. Unfortunately, not everything can be
fixed, especially outside the base system. For instance, third-party
APIs which pass struct stat around are broken in backward and
forward incompatible ways.
Kinfo sysctl MIBs ABI is changed in backward-compatible way, but
there is no general mechanism to handle other sysctl MIBS which
return structures where the layout has changed. It was considered
that the breakage is either in the management interfaces, where we
usually allow ABI slip, or is not important.
Struct xvnode changed layout, no compat shims are provided.
For struct xtty, dev_t tty device member was reduced to uint32_t.
It was decided that keeping ABI compat in this case is more useful
than reporting 64-bit dev_t, for the sake of pstat.
Update note: strictly follow the instructions in UPDATING. Build
and install the new kernel with COMPAT_FREEBSD11 option enabled,
then reboot, and only then install new world.
Credits: The 64-bit inode project, also known as ino64, started life
many years ago as a project by Gleb Kurtsou (gleb). Kirk McKusick
(mckusick) then picked up and updated the patch, and acted as a
flag-waver. Feedback, suggestions, and discussions were carried
by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles),
and Rick Macklem (rmacklem). Kris Moore (kris) performed an initial
ports investigation followed by an exp-run by Antoine Brodin (antoine).
Essential and all-embracing testing was done by Peter Holm (pho).
The heavy lifting of coordinating all these efforts and bringing the
project to completion were done by Konstantin Belousov (kib).
Sponsored by: The FreeBSD Foundation (emaste, kib)
Differential revision: https://reviews.freebsd.org/D10439
- kvm_close: add a testcase to verify support for errno = EINVAL / -1
(see D10065) when kd == NULL is provided to the libcall.
- kvm_geterr:
-- Add a negative testcase for kd == NULL returning "" (see D10022).
-- Add two positive testcases:
--- test the error case using kvm_write on a O_RDONLY descriptor.
--- test the "no error" case using kvm_read(3) and kvm_nlist(3) as
helper routines and by injecting a bogus error message via
_kvm_err (an internal API) _kvm_err was used as there isn't a
formalized way to clear the error output, and because
kvm_nlist always returns ENOENT with the NULL terminator today.
- kvm_open, kvm_open2:
-- Add some basic negative tests for kvm_open(3) and kvm_open2(3).
Testing positive cases with a specific
`corefile`/`execfile`/`resolver` requires more work and would require
user intervention today in order to reliably test this out.
Reviewed by: markj
MFC after: 2 months
Sponsored by: Dell EMC Isilon
Differential Revision: D10024
- Fix -Wunused warnings with *_native detection handlers by marking `kd`
__unused, except with arm/mips, where a slightly more complicated scheme
is required to handle the native case vs the non-native case.
- Fix -Wmissing-variable-declarations warnings by marking struct kvm_arch
objects static.
Differential Revision: D10071
MFC after: 1 week
Reviewed by: vangyzen
Tested with: WIP test code (D10024) // kgdb7121 (i386 crash/kernel on amd64)
Sponsored by: Dell EMC Isilon
Return a NUL string instead of just working by accident with kvm_geterr(3)
when MALLOC_PRODUCTION is disabled (I didn't confirm the MALLOC_PRODUCTION
being enabled path).
Document the new explicit return behavior for kvm_geterr(3), as well
as the previous implicit behavior, i.e., the buffer attached to
returned via kvm_geterr(3) would be empty if a previous error hadn't been
stored in `kd`.
Differential Revision: D10022
MFC after: 1 week
Reviewed by: vangyzen
Sponsored by: Dell EMC Isilon