The type definitions and constants that were used by COMPAT_CLOUDABI64
are a literal copy of some headers stored inside of CloudABI's C
library, cloudlibc. What is annoying is that we can't make use of
cloudlibc's system call list, as the format is completely different and
doesn't provide enough information. It had to be synced in manually.
We recently decided to solve this (and some other problems) by moving
the ABI definitions into a separate file:
https://github.com/NuxiNL/cloudabi/blob/master/cloudabi.txt
This file is processed by a pile of Python scripts to generate the
header files like before, documentation (markdown), but in our case more
importantly: a FreeBSD system call table.
This change discards the old files in sys/contrib/cloudabi and replaces
them by the latest copies, which requires some minor changes here and
there. Because cloudabi.txt also enforces consistent names of the system
call arguments, we have to patch up a small number of system call
implementations to use the new argument names.
The new header files can also be included directly in FreeBSD kernel
space without needing any includes/defines, so we can now remove
cloudabi_syscalldefs.h and cloudabi64_syscalldefs.h. Patch up the
sources to include the definitions directly from sys/contrib/cloudabi
instead.
- truss can now log the system call invoked by a thread during a
voluntary process exit. No return value is logged, but the value passed
to exit() is included in the trace output. Arguments passed to thread
exit system calls such as thr_exit() are not logged as voluntary thread
exits cannot be distinguished from involuntary thread exits during a
system call.
- New events are now reported for thread births and exits similar to the
recently added events for new child processes when following forks.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D5561
Add two new functions, sysdecode_abi_to_freebsd_errno() and
sysdecode_freebsd_to_abi_errno(), which convert errno values between
the native FreeBSD ABI and other supported ABIs. Note that the
mappings are not necessarily perfect meaning in some cases multiple
errors in one ABI might map to a single error in another ABI. In that
case, the reverse mapping will return one of the errors that maps, but
which error is non-deterministic.
Change truss to always report the raw error value to the user but
use libsysdecode to map it to a native errno value that can be used
with strerror() to generate a description. Previously truss reported
the "converted" error value. Now the user will always see the exact
error value that the application sees.
Change kdump to report the truly raw error value to the user. Previously
kdump would report the absolute value of the raw error value (so for
Linux binaries it didn't output the FreeBSD error value, but the positive
value of the Linux error). Now it reports the real (i.e. negative) error
value for Linux binaries. Also, use libsysdecode to convert the native
FreeBSD error reported in the ktrace record to the raw error used by the
ABI. This means that the Linux ABI can now be handled directly in
ktrsysret() and removes the need for linux_ktrsysret().
Reviewed by: bdrewery, kib
Helpful notes: wblock (manpage)
Differential Revision: https://reviews.freebsd.org/D5314
- Consolidate duplicate code for printing the metadata at the start of
each line into a shared function.
- Add an -H option which will log the thread ID of the relevant thread
for each event.
While here, remove some extraneous calls to clock_gettime() in
print_syscall() and print_syscall_ret(). The caller of print_syscall_ret()
always updates the current thread's "after" time before it is called.
Reviewed by: kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D5363
instead of passing some of that state as arguments to print_syscall() and
print_syscallret(). This just makes the calls of these functions shorter
and easier to read.
sysdecode_ioctlname() function. This function matches the behavior
of the truss variant in that it returns a pointer to a string description
for known ioctls. The caller is responsible for displaying unknown
ioctl requests. For kdump this meant moving the logic to handle unknown
ioctl requests out of the generated function and into an ioctlname()
function in kdump.c instead.
Differential Revision: https://reviews.freebsd.org/D4610
system call information such as system call arguments. Initially this
will consist of pulling duplicated code out of truss and kdump though it
may prove useful for other utilities in the future.
This commit moves the shared utrace(2) record parser out of kdump into
the library and updates kdump and truss to use it. One difference from
the previous version is that the library version treats unknown events
that start with the "RTLD" signature as unknown events. This simplifies
the interface and allows the consumer to decide how to handle all
non-recognized events. Instead, this function only generates a string
description for known malloc() and RTLD records.
Reviewed by: bdrewery
Differential Revision: https://reviews.freebsd.org/D4537
CloudABI has approximately 50 system calls that do not depend on the
pointer size of the system. As the ABI is pretty compact, it takes
little effort to each truss(8) the formatting rules for these system
calls. Start off by formatting pointer size independent system calls.
Changes:
- Make it possible to include the CloudABI system call definitions in
FreeBSD userspace builds. Add ${root}/sys to the truss(8) Makefile so
we can pull in <compat/cloudabi/cloudabi_syscalldefs.h>.
- Refactoring: patch up amd64-cloudabi64.c to use the CLOUDABI_*
constants instead of rolling our own table.
- Add table entries for all of the system calls.
- Add new generic formatting types (UInt, IntArray) that we'll be using
to format unsigned integers and arrays of integers.
- Add CloudABI specific formatting types.
Approved by: jhb
Differential Revision: https://reviews.freebsd.org/D3836
This uses the kdump(1) utrace support code directly until a common library
is created.
This allows malloc(3) tracing with MALLOC_CONF=utrace:true and rtld tracing
with LD_UTRACE=1. Unknown utrace(2) data is just printed as hex.
PR: 43819 [inspired by]
Reviewed by: jhb
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D3819
This is done by changing get_syscall() to either lookup the known syscall
or add it into the list with the default handlers for printing.
This also simplifies some code to not have to check if the syscall variable
is set or NULL.
Reviewed by: jhb
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D3792
address's length (and then overriding it if it "looks wrong"), use the
next argument to the system call to determine the length. This is more
reliable since this is what the kernel depends on anyway and is also
simpler.
integer. Fix the argument decoding to treat this as a quad instead of an
int. This includes using QUAD_ALIGN and QUAD_SLOTS as necessary. To
continue printing IDs in decimal, add a new QuadHex argument type that
prints a 64-bit integer in hex, use QuadHex for the existing off_t arguments,
repurpose Quad to print a 64-bit integer in decimal, and use Quad for id_t
arguments.
This fixes the decoding of wait6(2) and procctl(2) on 32-bit platforms.
probably fallout from the removal of the extra padding argument before
off_t in 7. However, that padding still exists for 32-bit powerpc, so
use QUAD_ALIGN.
- Fix QUAD_ALIGN to be zero for powerpc64. It should only be set to 1
for 32-bit platforms that add padding to align 64-bit arguments.
- Refactor the interface between the ABI-independent code and the
ABI-specific backends. The backends now provide smaller hooks to
fetch system call arguments and return values. The rest of the
system call entry and exit handling that was previously duplicated
among all the backends has been moved to one place.
- Merge the loop when waiting for an event with the loop for handling stops.
This also means not emulating a procfs-like interface on top of ptrace().
Instead, use a single event loop that fetches process events via waitid().
Among other things this allows us to report the full 32-bit exit value.
- Use PT_FOLLOW_FORK to follow new child processes instead of forking a new
truss process for each new child. This allows one truss process to monitor
a tree of processes and truss -c should now display one total for the
entire tree instead of separate summaries per process.
- Use the recently added fields to ptrace_lwpinfo to determine the current
system call number and argument count. The latter is especially useful
and fixes a regression since the conversion from procfs. truss now
generally prints the correct number of arguments for most system calls
rather than printing extra arguments for any call not listed in the
table in syscalls.c.
- Actually check the new ABI when processes call exec. The comments claimed
that this happened but it was not being done (perhaps this was another
regression in the conversion to ptrace()). If the new ABI after exec
is not supported, truss detaches from the process. If truss does not
support the ABI for a newly executed process the process is killed
before it returns from exec.
- Along with the refactor, teach the various ABI-specific backends to
fetch both return values, not just the first. Use this to properly
report the full 64-bit return value from lseek(). In addition, the
handler for "pipe" now pulls the pair of descriptors out of the
return values (which is the true kernel system call interface) but
displays them as an argument (which matches the interface exported by
libc).
- Each ABI handler adds entries to a linker set rather than requiring
a statically defined table of handlers in main.c.
- The arm and mips system call fetching code was changed to follow the
same pattern as amd64 (and the in-kernel handler) of fetching register
arguments first and then reading any remaining arguments from the
stack. This should fix indirect system call arguments on at least
arm.
- The mipsn32 and n64 ABIs will now look for arguments in A4 through A7.
- Use register %ebp for the 6th system call argument for Linux/i386 ABIs
to match the in-kernel argument fetch code.
- For powerpc binaries on a powerpc64 system, fetch the extra arguments
on the stack as 32-bit values that are then copied into the 64-bit
argument array instead of reading the 32-bit values directly into the
64-bit array.
Reviewed by: kib (earlier version)
Tested on: amd64 (FreeBSD/amd64 & i386), i386, arm (earlier version)
Tested on: powerpc64 (FreeBSD/powerpc64 & powerpc)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D3575
arrays generically rather than duplicating a hack in all of the backends.
- Add two new system call argument types and use them instead of StringArray
for the argument and environment arguments execve and linux_execve.
- Honor the -a/-e flags in the handling of these new types.
- Instead of printing "<missing argument>" when the decoding is disabled,
print the raw pointer value.
Before truss would fetch 100 string pointers and happily walk off the end
of the array if it never found a NULL. This also means for a short argv
list it could fail entirely if the 100 string pointers spanned into an
unmapped page.
Instead, fetch page-aligned blocks of string pointers in a loop fetching
each string until a NULL is found.
While here, make use of the open memstream file descriptor instead of
allocating a temporary array. This allows us to fetch each string once
instead of twice.
- Print the ident value as decimal instead of hexadecimal for filter types
that use "small" values such as file descriptors and PIDs.
- Decode NOTE_* flags in the fflags field of kevents for several system
filter types.
with open_memstream() to build the string for each argument. This allows
for more complicated argument building without resorting to intermediate
malloc's, etc.
Related, the strsig*() functions no longer return allocated strings but
use a static global buffer instead.
- Don't exit if get_struct() fails, instead print the raw pointer value to
match all other argument decoding cases.
- Use an xlat table instead of a home-rolled switch for the operation name.
- Display the nested socketcall args structure as a structure instead of as
two inline arguments.
sigqueue, sigreturn, sigsuspend, sigtimedwait, sigwait, sigwaitinfo, and
thr_kill.
- Print signal sets as a structure (with {}'s) and in particular use this to
differentiate empty sets from a NULL pointer.
- Decode arguments for some other system calls: issetugid, pipe2, sysarch
(operations are only decoded for amd64 and i386), and thr_self.
especially useful now that libc's open() always calls openat(). While here,
fix a few other things:
- Decode the mode argument passed to access(), eaccess(), and faccessat().
- Decode the atfd paramete to pretty-print AT_FDCWD.
- Decode the special AT_* flags used with some of the *at() system calls.
- Decode arguments for fchmod(), lchmod(), fchown(), lchown(), eaccess(),
and futimens().
- Decode both of the timeval structures passed to futimes() instead of just
the first one.
length. In particular, instead of blinding fetching 1k blocks, do an initial
fetch up to the end of the current page followed by page-sized fetches up to
the maximum size. Previously if the 1k buffer crossed a page boundary and
the second page was not valid, the entire operation would fail.
in a separate word from the _count. This does not permit both items to
be updated atomically in a portable manner. As a result, sem_post()
must always perform a system call to safely clear _has_waiters.
This change removes the _has_waiters field and instead uses the high bit
of _count as the _has_waiters flag. A new umtx object type (_usem2) and
two new umtx operations are added (SEM_WAIT2 and SEM_WAKE2) to implement
these semantics. The older operations are still supported under the
COMPAT_FREEBSD9/10 options. The POSIX semaphore API in libc has
been updated to use the new implementation. Note that the new
implementation is not compatible with the previous implementation.
However, this only affects static binaries (which cannot be helped by
symbol versioning). Binaries using a dynamic libc will continue to work
fine. SEM_MAGIC has been bumped so that mismatched binaries will error
rather than corrupting a shared semaphore. In addition, a padding field
has been added to sem_t so that it remains the same size.
Differential Revision: https://reviews.freebsd.org/D961
Reported by: adrian
Reviewed by: kib, jilles (earlier version)
Sponsored by: Norse
Older binaries are still permitted to use these flags.
PR: 193961 (exp-run in ports)
Differential Revision: https://reviews.freebsd.org/D848
Reviewed by: kib
- Retire long time unused (basically always unused) sys__umtx_lock()
and sys__umtx_unlock() syscalls
- struct umtx and their supporting definitions
- UMUTEX_ERROR_CHECK flag
- Retire UMTX_OP_LOCK/UMTX_OP_UNLOCK from _umtx_op() syscall
__FreeBSD_version is not bumped yet because it is expected that further
breakages to the umtx interface will follow up in the next days.
However there will be a final bump when necessary.
Sponsored by: EMC / Isilon storage division
Reviewed by: jhb
exhausted.
- Add a new protect(1) command that can be used to set or revoke protection
from arbitrary processes. Similar to ktrace it can apply a change to all
existing descendants of a process as well as future descendants.
- Add a new procctl(2) system call that provides a generic interface for
control operations on processes (as opposed to the debugger-specific
operations provided by ptrace(2)). procctl(2) uses a combination of
idtype_t and an id to identify the set of processes on which to operate
similar to wait6().
- Add a PROC_SPROTECT control operation to manage the protection status
of a set of processes. MADV_PROTECT still works for backwards
compatability.
- Add a p_flag2 to struct proc (and a corresponding ki_flag2 to kinfo_proc)
the first bit of which is used to track if P_PROTECT should be inherited
by new child processes.
Reviewed by: kib, jilles (earlier version)
Approved by: re (delphij)
MFC after: 1 month
- Don't treat an options argument of 0 to wait4() as an error in
kdump.
- Decode the wait options passed to wait4() and wait6() in truss
and decode the returned rusage and exit status.
Approved by: re (kib)
MFC after: 1 week
an address in the first 2GB of the process's address space. This flag should
have the same semantics as the same flag on Linux.
To facilitate this, add a new parameter to vm_map_find() that specifies an
optional maximum virtual address. While here, fix several callers of
vm_map_find() to use a VMFS_* constant for the findspace argument instead of
TRUE and FALSE.
Reviewed by: alc
Approved by: re (kib)
address alignment of mappings.
- MAP_ALIGNED(n) requests a mapping aligned on a boundary of (1 << n).
Requests for n >= number of bits in a pointer or less than the size of
a page fail with EINVAL. This matches the API provided by NetBSD.
- MAP_ALIGNED_SUPER is a special case of MAP_ALIGNED. It can be used
to optimize the chances of using large pages. By default it will align
the mapping on a large page boundary (the system is free to choose any
large page size to align to that seems best for the mapping request).
However, if the object being mapped is already using large pages, then
it will align the virtual mapping to match the existing large pages in
the object instead.
- Internally, VMFS_ALIGNED_SPACE is now renamed to VMFS_SUPER_SPACE, and
VMFS_ALIGNED_SPACE(n) is repurposed for specifying a specific alignment.
MAP_ALIGNED(n) maps to using VMFS_ALIGNED_SPACE(n), while
MAP_ALIGNED_SUPER maps to VMFS_SUPER_SPACE.
- mmap() of a device object now uses VMFS_OPTIMAL_SPACE rather than
explicitly using VMFS_SUPER_SPACE. All device objects are forced to
use a specific color on creation, so VMFS_OPTIMAL_SPACE is effectively
equivalent.
Reviewed by: alc
MFC after: 1 month