Decode fields from the siginfo_t stored in the PT_LWPINFO structure when a
signal is caught by a traced process. This includes the signal code
(si_code) as well as additional members such as si_addr, si_pid, etc.
With Flower (CloudABI's network connection daemon) becoming more
complete, there is no longer any need for creating any unconnected
sockets. Socket pairs in combination with file descriptor passing is all
that is necessary, as that is what is used by Flower to pass network
connections from the public internet to listening processes.
Remove all of the kernel bits that were used to implement socket(),
listen(), bindat() and connectat(). In principle, accept() and
SO_ACCEPTCONN may also be removed, but there are still some consumers
left.
Obtained from: https://github.com/NuxiNL/cloudabi
MFC after: 1 month
The CloudABI specification has had some minor changes over the last half
year. No substantial features have been added, but some features that
are deemed unnecessary in retrospect have been removed:
- mlock()/munlock():
These calls tend to be used for two different purposes: real-time
support and handling of sensitive (cryptographic) material that
shouldn't end up in swap. The former use case is out of scope for
CloudABI. The latter may also be handled by encrypting swap.
Removing this has the advantage that we no longer need to worry about
having resource limits put in place.
- SOCK_SEQPACKET:
Support for SOCK_SEQPACKET is rather inconsistent across various
operating systems. Some operating systems supported by CloudABI (e.g.,
macOS) don't support it at all. Considering that they are rarely used,
remove support for the time being.
- getsockname(), getpeername(), etc.:
A shortcoming of the sockets API is that it doesn't allow you to
create socket(pair)s, having fake socket addresses associated with
them. This makes it harder to test applications or transparently
forward (proxy) connections to them.
With CloudABI, we're slowly moving networking connectivity into a
separate daemon called Flower. In addition to passing around socket
file descriptors, this daemon provides address information in the form
of arbitrary string labels. There is thus no longer any need for
requesting socket address information from the kernel itself.
This change also updates consumers of the generated code accordingly.
Even though system calls end up getting renumbered, this won't cause any
problems in practice. CloudABI programs always call into the kernel
through a kernel-supplied vDSO that has the numbers updated as well.
Obtained from: https://github.com/NuxiNL/cloudabi
This change implements NOTE_ABSTIME flag for EVFILT_TIMER, which
specifies that the data field contains absolute time to fire the
event.
To make this useful, data member of the struct kevent must be extended
to 64bit. Using the opportunity, I also added ext members. This
changes struct kevent almost to Apple struct kevent64, except I did
not changed type of ident and udata, the later would cause serious API
incompatibilities.
The type of ident was kept uintptr_t since EVFILT_AIO returns a
pointer in this field, and e.g. CHERI is sensitive to the type
(discussed with brooks, jhb).
Unlike Apple kevent64, symbol versioning allows us to claim ABI
compatibility and still name the new syscall kevent(2). Compat shims
are provided for both host native and compat32.
Requested by: bapt
Reviewed by: bapt, brooks, ngie (previous version)
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D11025
This includes decoding both scheduler policy constants and the sched_param
structure for sched_get_priority_max(), sched_get_priority_min(),
sched_getparam(), sched_getscheduler(), sched_rr_get_interval(),
sched_setparam(), and sched_setscheduler().
The cmd argument passed to extattrctl() is not decoded as a string constant
but is just printed in hex. The value is filesystem-specific but in
practice is only used with UFS1 filesystems.
- dup and dup2 print fd arguments in decimal.
- pread and pwrite are similar to read and write with the addition of the
file offset.
- getdirentries displays the output entries as a string for now and also
prints the value returned in *basep. Eventually the buffer for
getdirentries should perhaps be decoded as an array of dirent
structures.
PR: 214885
Submitted by: Jonathan de Boyne Pollard <J.deBoynePollard-newsgroups@NTLWorld.COM>
Add a new sysdecode_getrusage_who() which decodes the RUSAGE_* constant
passed as the first argument to getrusage(). Use this function in both
kdump and truss to decode the first argument to getrusage().
PR: 215448
Submitted by: Anton Yuzhaninov <citrin+pr@citrin.ru>
MFC after: 1 month
Decoding of the third argument depends on the first one. For doing this,
add a corresponding function to libsysdecode.
Thanks to jhb@ for suggesting this.
Decode the last argument to ioctl() as a pointer rather than an int.
Eventually this could use 'int' for the _IOWINT() case and pointers for
all others.
The last argument to sendto() is a socklen_t value, not a pointer.
Previously, the offset in a system call description specified the
array index of the start of a system call argument. For most system
call arguments this was the same as the index of the argument in the
function signature. 64-bit arguments (off_t and id_t values) passed
on 32-bit platforms use two slots in the array however. This was
handled by adding (QUAD_SLOTS - 1) to the slot indicies of any
subsequent arguments after a 64-bit argument (though written as ("{
Quad, 1 }, { Int, 1 + QUAD_SLOTS }" rather than "{ Quad, 1 }, { Int, 2
+ QUAD_SLOTS - 1 }"). If a system call contained multiple 64-bit
arguments (such as posix_fadvise()), then additional arguments would
need to use 'QUAD_SLOTS * 2' but remember to subtract 2 from the
initial number, etc. In addition, 32-bit powerpc requires 64-bit
arguments to be 64-bit aligned, so if the effective index in the array
of a 64-bit argument is odd, it needs QUAD_ALIGN added to the current
and any subsequent slots. However, if the effective index in the
array of a 64-bit argument was even, QUAD_ALIGN was omitted.
This approach was messy and error prone. This commit replaces it with
automated pre-processing of the system call table to do fixups for
64-bit argument offsets. The offset in a system call description now
indicates the index of an argument in the associated function call's
signature. A fixup function is run against each decoded system call
description during startup on 32-bit platforms. The fixup function
maintains an 'offset' value which holds an offset to be added to each
remaining system call argument's index. Initially offset is 0. When
a 64-bit system call argument is encountered, the offset is first
aligned to a 64-bit boundary (only on powerpc) and then incremented to
account for the second argument slot used by the argument. This
modified 'offset' is then applied to any remaining arguments. This
approach does require a few things that were not previously required:
1) Each system call description must now list arguments in ascending
order (existing ones all do) without using duplicate slots in the
register array. A new assert() should catch any future
descriptions which violate this rule.
2) A system call description is still permitted to omit arguments
(though none currently do), but if the call accepts 64-bit
arguments those cannot be omitted or incorrect results will be
displated on 32-bit systems.
Tested on: amd64 and i386
Prefer ${SRCTOP}/foo over ${.CURDIR}/../../foo and ${SRCTOP}/usr.bin/foo
over ${.CURDIR}/../foo for paths in Makefiles.
Differential Revision: https://reviews.freebsd.org/D9932
Sponsored by: Netflix
Silence on: arch@ (twice)
Avoid always using an O(n^2) loop over known syscall structures with
strcmp() on each system call. Instead, use a per-ABI cache indexed by
the system call number. The first 1024 system calls (which should cover
all of the normal system calls in currently-supported ABIs) use a flat array
indexed by the system call number to find system call structure. For other
system calls, a linked list of structures storing an integer to structure
mapping is stored in the ABI. The linked list isn't very smart, but it
should only be used by buggy applications invoking unknown system calls.
This also fixes handling of unknown system calls which currently trigger
a NULL pointer dereference.
Reviewed by: kib
MFC after: 2 weeks
Restructure this script so that it generates a header of tables instead
of a source file. The tables are included in a flags.c source file which
provides functions to decode various system call arguments.
For functions that decode an enumeration, the function returns a pointer
to a string for known values and NULL for unknown values.
For functions that do more complex decoding (typically of a bitmask), the
function accepts a pointer to a FILE object (open_memstream() can be used
as a string builder) to which decoded values are written. If the
function operates on a bitmask, the function returns true if any bits
were decoded or false if the entire value was valid. Additionally, the
third argument accepts a pointer to a value to which any undecoded bits
are stored. This pointer can be NULL if the caller doesn't care about
remaining bits.
Convert kdump over to using decoder functions from libsysdecode instead of
mksubr. truss also uses decoders from libsysdecode instead of private
lookup tables, though lookup tables for objects not decoded by kdump remain
in truss for now. Eventually most of these tables should move into
libsysdecode as the automated table generation approach from mksubr is
less stale than the static tables in truss.
Some changes have been made to truss and kdump output:
- The flags passed to open() are now properly decoded in that one of
O_RDONLY, O_RDWR, O_WRONLY, or O_EXEC is always included in a decoded
mask.
- Optional arguments to open(), openat(), and fcntl() are only printed
in kdump if they exist (e.g. the mode is only printed for open() if
O_CREAT is set in the flags).
- Print argument to F_GETLK/SETLK/SETLKW in kdump as a pointer, not int.
- Include all procctl() commands.
- Correctly decode pipe2() flags in truss by not assuming full
open()-like flags with O_RDONLY, etc.
- Decode file flags passed to *chflags() as file flags (UF_* and SF_*)
rather than as a file mode.
- Fix decoding of quotactl() commands by splitting out the two command
components instead of assuming the raw command value matches the
primary command component.
In addition, truss and kdump now build without triggering any warnings.
All of the sysdecode manpages now include the required headers in the
synopsis.
Reviewed by: kib (several older versions), wblock (manpages)
MFC after: 2 months
Differential Revision: https://reviews.freebsd.org/D7847
Now that we've switched over to using the vDSO on CloudABI, it becomes a
lot easier for us to phase out old features. System call numbering is no
longer something that's part of the ABI. It's fully based on names. As
long as the numbering used by the kernel and the vDSO is consistent
(which it always is), it's all right.
Let's put this to the test by removing a system call (thread_tcb_set())
that's already unused for quite some time now, but was only left intact
to serve as a placeholder. Sync in the new system call table that uses
alphabetic sorting of system calls.
Obtained from: https://github.com/NuxiNL/cloudabi
trussinfo->curthread must be initialized before calling enter_syscall(),
it is used by t->proc->abi->fetch_args().
Without that truss is segfaulting and the attached program also crash.
Submitted by: Nikita Kozlov (nikita@gandi.net)
Reviewed by: jhb
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D7399
The type definitions and constants that were used by COMPAT_CLOUDABI64
are a literal copy of some headers stored inside of CloudABI's C
library, cloudlibc. What is annoying is that we can't make use of
cloudlibc's system call list, as the format is completely different and
doesn't provide enough information. It had to be synced in manually.
We recently decided to solve this (and some other problems) by moving
the ABI definitions into a separate file:
https://github.com/NuxiNL/cloudabi/blob/master/cloudabi.txt
This file is processed by a pile of Python scripts to generate the
header files like before, documentation (markdown), but in our case more
importantly: a FreeBSD system call table.
This change discards the old files in sys/contrib/cloudabi and replaces
them by the latest copies, which requires some minor changes here and
there. Because cloudabi.txt also enforces consistent names of the system
call arguments, we have to patch up a small number of system call
implementations to use the new argument names.
The new header files can also be included directly in FreeBSD kernel
space without needing any includes/defines, so we can now remove
cloudabi_syscalldefs.h and cloudabi64_syscalldefs.h. Patch up the
sources to include the definitions directly from sys/contrib/cloudabi
instead.
- truss can now log the system call invoked by a thread during a
voluntary process exit. No return value is logged, but the value passed
to exit() is included in the trace output. Arguments passed to thread
exit system calls such as thr_exit() are not logged as voluntary thread
exits cannot be distinguished from involuntary thread exits during a
system call.
- New events are now reported for thread births and exits similar to the
recently added events for new child processes when following forks.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D5561
Add two new functions, sysdecode_abi_to_freebsd_errno() and
sysdecode_freebsd_to_abi_errno(), which convert errno values between
the native FreeBSD ABI and other supported ABIs. Note that the
mappings are not necessarily perfect meaning in some cases multiple
errors in one ABI might map to a single error in another ABI. In that
case, the reverse mapping will return one of the errors that maps, but
which error is non-deterministic.
Change truss to always report the raw error value to the user but
use libsysdecode to map it to a native errno value that can be used
with strerror() to generate a description. Previously truss reported
the "converted" error value. Now the user will always see the exact
error value that the application sees.
Change kdump to report the truly raw error value to the user. Previously
kdump would report the absolute value of the raw error value (so for
Linux binaries it didn't output the FreeBSD error value, but the positive
value of the Linux error). Now it reports the real (i.e. negative) error
value for Linux binaries. Also, use libsysdecode to convert the native
FreeBSD error reported in the ktrace record to the raw error used by the
ABI. This means that the Linux ABI can now be handled directly in
ktrsysret() and removes the need for linux_ktrsysret().
Reviewed by: bdrewery, kib
Helpful notes: wblock (manpage)
Differential Revision: https://reviews.freebsd.org/D5314
- Consolidate duplicate code for printing the metadata at the start of
each line into a shared function.
- Add an -H option which will log the thread ID of the relevant thread
for each event.
While here, remove some extraneous calls to clock_gettime() in
print_syscall() and print_syscall_ret(). The caller of print_syscall_ret()
always updates the current thread's "after" time before it is called.
Reviewed by: kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D5363
instead of passing some of that state as arguments to print_syscall() and
print_syscallret(). This just makes the calls of these functions shorter
and easier to read.
A new sysdecode_syscallname() function accepts a system call code and
returns a string of the corresponding name (or NULL if the code is
unknown). To support different process ABIs, the new function accepts a
value from a new sysdecode_abi enum as its first argument to select the
ABI in use. Current ABIs supported include FREEBSD (native binaries),
FREEBSD32, LINUX, LINUX32, and CLOUDABI64. Note that not all ABIs are
supported by all platforms. In general, a given ABI is only supported
if a platform can execute binaries for that ABI.
To simplify the implementation, libsysdecode's build reuses the
existing pre-generated files from the kernel source tree rather than
duplicating new copies of said files during the build.
kdump(1) and truss(1) now use these functions to map system call
identifiers to names. For kdump(1), a new 'syscallname()' function
consolidates duplicated code from ktrsyscall() and ktrsyscallret().
The Linux ABI no longer requires custom handling for ktrsyscall() and
linux_ktrsyscall() has been removed as a result.
Reviewed by: bdrewery
Differential Revision: https://reviews.freebsd.org/D4823
sysdecode_ioctlname() function. This function matches the behavior
of the truss variant in that it returns a pointer to a string description
for known ioctls. The caller is responsible for displaying unknown
ioctl requests. For kdump this meant moving the logic to handle unknown
ioctl requests out of the generated function and into an ioctlname()
function in kdump.c instead.
Differential Revision: https://reviews.freebsd.org/D4610
system call information such as system call arguments. Initially this
will consist of pulling duplicated code out of truss and kdump though it
may prove useful for other utilities in the future.
This commit moves the shared utrace(2) record parser out of kdump into
the library and updates kdump and truss to use it. One difference from
the previous version is that the library version treats unknown events
that start with the "RTLD" signature as unknown events. This simplifies
the interface and allows the consumer to decide how to handle all
non-recognized events. Instead, this function only generates a string
description for known malloc() and RTLD records.
Reviewed by: bdrewery
Differential Revision: https://reviews.freebsd.org/D4537
This change copies over amd64-cloudabi64.c to aarch64-cloudabi.c and
adjusts it to fetch the proper registers on aarch64. To reduce the
amount of shared code, the errno conversion function is moved into a
separate source file.
Reviewed by: jhb, andrew
Differential Revision: https://reviews.freebsd.org/D4023
This is to make the Makefile more easily extendable for new ABIs.
This also makes several other subtle changes:
- The build now is given a list of ABIs to use based on the MACHINE_ARCH or
MACHINE_CPUARCH. These ABIs have a related path in sys/ that is used
to generate their syscalls. For each ABI to build check for a
ABI.c, MACHINE_ARCH-ABI.c, or a MACHINE_CPUARCH-ABI.c. This matches
the old behavior needed for archs such as powerpc* and mips*.
- The ABI source file selection allows for simpler assignment of common
ABIs such as "fbsd32" from sys/compat/freebsd32, or cloudabi64.
- Expand 'fbsd' to 'freebsd' everywhere for consistency.
- Split out the powerpc-fbsd.c file into a powerpc64-freebsd32.c to be more
like the amd64-freebsd32.c file and to more easily allow the auto-generation
of ABI handling to work.
- Rename 'syscalls.h' to 'fbsd_syscalls.h' to lessen the ambiguity and
avoid confusion with syscall.h (such as in r288997).
- For non-native syscall header files, they are now renamed to be
ABI_syscalls.h, where ABI is what ABI the Makefile is building.
- Remove all of the makesyscalls config files. The "native" one being
name i386.conf was a long outstanding bug. They were all the same
except for the data they generated, so now it is just auto-generated
as a build artifact.
- The syscalls array is now fixed to be static in the syscalls header to
remove the compiler warning about non-extern. This was worked around
in the aarch64-fbsd.c file but not the others.
- All syscall table names are now just 'syscallnames' since they don't
need to be different as they are all static in their own ABI files. The
alternative is to name them ABI_syscallnames which does not seem
necessary.
Reviewed by: ed, jhb
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D3851
Without this, the signals are shown seemingly randomly in the output before
the final summary is shown. This is especially noticeable when there is
not much output from the application being traced.
Discussed with: jhb
Relnotes: yes
CloudABI has approximately 50 system calls that do not depend on the
pointer size of the system. As the ABI is pretty compact, it takes
little effort to each truss(8) the formatting rules for these system
calls. Start off by formatting pointer size independent system calls.
Changes:
- Make it possible to include the CloudABI system call definitions in
FreeBSD userspace builds. Add ${root}/sys to the truss(8) Makefile so
we can pull in <compat/cloudabi/cloudabi_syscalldefs.h>.
- Refactoring: patch up amd64-cloudabi64.c to use the CLOUDABI_*
constants instead of rolling our own table.
- Add table entries for all of the system calls.
- Add new generic formatting types (UInt, IntArray) that we'll be using
to format unsigned integers and arrays of integers.
- Add CloudABI specific formatting types.
Approved by: jhb
Differential Revision: https://reviews.freebsd.org/D3836
This uses the kdump(1) utrace support code directly until a common library
is created.
This allows malloc(3) tracing with MALLOC_CONF=utrace:true and rtld tracing
with LD_UTRACE=1. Unknown utrace(2) data is just printed as hex.
PR: 43819 [inspired by]
Reviewed by: jhb
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D3819
This is done by changing get_syscall() to either lookup the known syscall
or add it into the list with the default handlers for printing.
This also simplifies some code to not have to check if the syscall variable
is set or NULL.
Reviewed by: jhb
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D3792
This change adds the bits that are necessary to fetch system call
arguments and return values from trapframes for CloudABI. This allows us
to properly print system calls with the right name. We need to make sure
that we properly convert error numbers when system calls fail.
We still need to improve truss to pretty-print some of the system calls
that have flags.
address's length (and then overriding it if it "looks wrong"), use the
next argument to the system call to determine the length. This is more
reliable since this is what the kernel depends on anyway and is also
simpler.
integer. Fix the argument decoding to treat this as a quad instead of an
int. This includes using QUAD_ALIGN and QUAD_SLOTS as necessary. To
continue printing IDs in decimal, add a new QuadHex argument type that
prints a 64-bit integer in hex, use QuadHex for the existing off_t arguments,
repurpose Quad to print a 64-bit integer in decimal, and use Quad for id_t
arguments.
This fixes the decoding of wait6(2) and procctl(2) on 32-bit platforms.
probably fallout from the removal of the extra padding argument before
off_t in 7. However, that padding still exists for 32-bit powerpc, so
use QUAD_ALIGN.
- Fix QUAD_ALIGN to be zero for powerpc64. It should only be set to 1
for 32-bit platforms that add padding to align 64-bit arguments.
- Refactor the interface between the ABI-independent code and the
ABI-specific backends. The backends now provide smaller hooks to
fetch system call arguments and return values. The rest of the
system call entry and exit handling that was previously duplicated
among all the backends has been moved to one place.
- Merge the loop when waiting for an event with the loop for handling stops.
This also means not emulating a procfs-like interface on top of ptrace().
Instead, use a single event loop that fetches process events via waitid().
Among other things this allows us to report the full 32-bit exit value.
- Use PT_FOLLOW_FORK to follow new child processes instead of forking a new
truss process for each new child. This allows one truss process to monitor
a tree of processes and truss -c should now display one total for the
entire tree instead of separate summaries per process.
- Use the recently added fields to ptrace_lwpinfo to determine the current
system call number and argument count. The latter is especially useful
and fixes a regression since the conversion from procfs. truss now
generally prints the correct number of arguments for most system calls
rather than printing extra arguments for any call not listed in the
table in syscalls.c.
- Actually check the new ABI when processes call exec. The comments claimed
that this happened but it was not being done (perhaps this was another
regression in the conversion to ptrace()). If the new ABI after exec
is not supported, truss detaches from the process. If truss does not
support the ABI for a newly executed process the process is killed
before it returns from exec.
- Along with the refactor, teach the various ABI-specific backends to
fetch both return values, not just the first. Use this to properly
report the full 64-bit return value from lseek(). In addition, the
handler for "pipe" now pulls the pair of descriptors out of the
return values (which is the true kernel system call interface) but
displays them as an argument (which matches the interface exported by
libc).
- Each ABI handler adds entries to a linker set rather than requiring
a statically defined table of handlers in main.c.
- The arm and mips system call fetching code was changed to follow the
same pattern as amd64 (and the in-kernel handler) of fetching register
arguments first and then reading any remaining arguments from the
stack. This should fix indirect system call arguments on at least
arm.
- The mipsn32 and n64 ABIs will now look for arguments in A4 through A7.
- Use register %ebp for the 6th system call argument for Linux/i386 ABIs
to match the in-kernel argument fetch code.
- For powerpc binaries on a powerpc64 system, fetch the extra arguments
on the stack as 32-bit values that are then copied into the 64-bit
argument array instead of reading the 32-bit values directly into the
64-bit array.
Reviewed by: kib (earlier version)
Tested on: amd64 (FreeBSD/amd64 & i386), i386, arm (earlier version)
Tested on: powerpc64 (FreeBSD/powerpc64 & powerpc)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D3575
arrays generically rather than duplicating a hack in all of the backends.
- Add two new system call argument types and use them instead of StringArray
for the argument and environment arguments execve and linux_execve.
- Honor the -a/-e flags in the handling of these new types.
- Instead of printing "<missing argument>" when the decoding is disabled,
print the raw pointer value.
Before truss would fetch 100 string pointers and happily walk off the end
of the array if it never found a NULL. This also means for a short argv
list it could fail entirely if the 100 string pointers spanned into an
unmapped page.
Instead, fetch page-aligned blocks of string pointers in a loop fetching
each string until a NULL is found.
While here, make use of the open memstream file descriptor instead of
allocating a temporary array. This allows us to fetch each string once
instead of twice.
- Print the ident value as decimal instead of hexadecimal for filter types
that use "small" values such as file descriptors and PIDs.
- Decode NOTE_* flags in the fflags field of kevents for several system
filter types.