summits at BSDCan and BSDCam in 2017.
The TCP Blackbox Recorder allows you to capture events on a TCP connection
in a ring buffer. It stores metadata with the event. It optionally stores
the TCP header associated with an event (if the event is associated with a
packet) and also optionally stores information on the sockets.
It supports setting a log ID on a TCP connection and using this to correlate
multiple connections that share a common log ID.
You can log connections in different modes. If you are doing a coordinated
test with a particular connection, you may tell the system to put it in
mode 4 (continuous dump). Or, if you just want to monitor for errors, you
can put it in mode 1 (ring buffer) and dump all the ring buffers associated
with the connection ID when we receive an error signal for that connection
ID. You can set a default mode that will be applied to a particular ratio
of incoming connections. You can also manually set a mode using a socket
option.
This commit includes only basic probes. rrs@ has added quite an abundance
of probes in his TCP development work. He plans to commit those soon.
There are user-space programs which we plan to commit as ports. These read
the data from the log device and output pcapng files, and then let you
analyze the data (and metadata) in the pcapng files.
Reviewed by: gnn (previous version)
Obtained from: Netflix, Inc.
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D11085
The general idea here is to provide userspace programs with well-defined
sources of entropy, in a fashion that doesn't require opening a new file
descriptor (ulimits) or accessing paths (/dev/urandom may be restricted
by chroot or capsicum).
getrandom(2) is the more general API, and comes from the Linux world.
Since our urandom and random devices are identical, the GRND_RANDOM flag
is ignored.
getentropy(3) is added as a compatibility shim for the OpenBSD API.
truss(1) support is included.
Tests for both system calls are provided. Coverage is believed to be at
least as comprehensive as LTP getrandom(2) test coverage. Additionally,
instructions for running the LTP tests directly against FreeBSD are provided
in the "Test Plan" section of the Differential revision linked below. (They
pass, of course.)
PR: 194204
Reported by: David CARLIER <david.carlier AT hardenedbsd.org>
Discussed with: cperciva, delphij, jhb, markj
Relnotes: maybe
Differential Revision: https://reviews.freebsd.org/D14500
ConnectX-4/5 devices in mlx5core.
The dump is obtained by reading a predefined register map from the
non-destructive crspace, accessible by the vendor-specific PCIe
capability (VSC). The dump is stored in preallocated kernel memory and
managed by the mlx5tool(8), which communicates with the driver using a
character device node.
The utility allows to store the dump in format
<address> <value>
into a file, to reset the dump content, and to manually initiate the
dump.
A call to mlx5_fwdump() should be added at the places where a dump
must be fetched automatically. The most likely place is right before a
firmware reset request.
Submitted by: kib@
MFC after: 1 week
Sponsored by: Mellanox Technologies
nothing - it was checking for ENXIO, which, with devfs, is no longer
returned - and was badly placed anyway, and replaces it with similar
one that works, and is done just before starting getty, instead of being
done when rereading ttys(5).
From the practical point of view, this makes init(8) handle disappearing
terminals (eg /dev/ttyU*) gracefully, without unneccessary getty restarts
and resulting error messages.
Reviewed by: imp@
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D14307
We don't support float in the boot loaders, so don't include
interfaces for float or double in systems headers. In addition, take
the unusual step of spiking double and float to prevent any more
accidental seepage.
We removed the nonnull attributes from our headers long ago, but still
__printflike() includes it implicitly. This will cause the NULL check to
be optimized away in higher -O levels and it will also trigger a
-Wnonnull-compare warning.
Avoid warning with it in vwarnx().
Obtained from: DragonfLyBSD (git 6329e2f68af73662a1960240675e796ab586bcb1)
The GCC attribute causes a warning to be emitted if a caller of the
function with this attribute does not use its return value. Unlike the
traditional realloc, with reallocf(3) we don't have to check for NULL
values but we still have to make sure the result is used.
MFC after: 3 days
The daemonfd function is equivalent to the daemon(3) function expect that
arguments are descriptors. For example dhclient(8) which is sandboxed is
unable to open /dev/null to close stdio instead it's allows to fail
daemon(3) function to close the descriptors and then do it explicit in code.
Instead of such hacks we can use now daemonfd.
This API can be also helpful to migrate system to platforms like CheriBSD.
Reviewed by: brooks@, bcr@, jilles@ (earlier version)
Differential Revision: https://reviews.freebsd.org/D13433
Now that the POSIX working group is going to require that basename(3)
and dirname(3) are thread-safe in future revisions of the standard,
there is even less of a need to provide basename_r(3). Remove this
function to prevent people from writing code that only builds on
FreeBSD and Bionic.
Removing this function seems to break exactly one port: sbruno@'s
qemu-user-static. I will send him a pull request on GitHub in a bit.
__FreeBSD_version will not be bumped, as any value from 2017 can be used
to test for the presence of a thread-safe basename(3)/dirname(3).
PR: https://bugs.freebsd.org/224016
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using mis-identified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
Referring to _DefaultRuneLocale causes this >4KB structure to be copied to
all executables that use <ctype.h> inlines (except PIE executables).
This only affects the case where thread local storage is available.
_CurrentRuneLocale cannot be NULL, so the check can be removed entirely.
_DefaultRuneLocale needs to remain available for now since libc++ uses it.
The __isctype inline in include/_ctype.h also refers to _DefaultRuneLocale
and remains available because it may still be used by third party software.
Reviewed by: bdrewery, theraven
Differential Revision: https://reviews.freebsd.org/D10363
definition of the same in sys/sys/. The problem was discovered while
working on implementing a new C11 gets_s() for libc. (The new gets_s()
requires rsize_t found in include/stddef.h.) The solution to sync the two
definitions was suggested by ed@ while discussing D12667.
Suggested by: ed
MFC after: 2 weeks
Implement the MMC/SD/SDIO protocol within a CAM framework. CAM's
flexible queueing will make it easier to write non-storage drivers
than the legacy stack. SDIO drivers from both the kernel and as
userland daemons are possible, though much of that functionality will
come later.
Some of the CAM integration isn't complete (there are sleeps in the
device probe state machine, for example), but those minor issues can
be improved in-tree more easily than out of tree and shouldn't gate
progress on other fronts. Appologies to reviews if specific items
have been overlooked.
Submitted by: Ilya Bakulin
Reviewed by: emaste, imp, mav, adrian, ian
Differential Review: https://reviews.freebsd.org/D4761
merge with first commit, various compile hacks.
FreeBSD's C library uses __STDC_VERSION__ to determine whether the
compiler provides language features specific to a certain version of the
C standard. __ISO_C_VISIBLE is used to specify which library features
need to be exposed.
max_align_t currently uses __STDC_VERSION__, even though it should be
using __ISO_C_VISIBLE to remain consistent with the rest of the headers
in include/.
Reviewed by: dim
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D11303
Since buildenv exports SYSROOT all of these uses will now look in
WORLDTMP by default.
sys/boot/efi/loader/Makefile
A LIBSTAND hack is no longer required for buildenv.
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
The Termios headers <termios.h> and <sys/_termios.h> used sometimes
_POSIX_SOURCE directly to determine if a thing should be exposed to
the user. This circumvented the feature mechanisms of <sys/cdefs.h>.
Submitted by: Sebastian Huber <sebastian.huber@embedded-brains.de>
MFC after: 2 weeks
Fix warnings about:
- redundant declarations
- a local variable shadowing a global function (dlinfo)
- an old-style function definition (with an empty parameter list)
- a variable that is possibly used uninitialized
"make tinderbox" passes this time, except for a few unrelated
kernel failures.
Reviewed by: kib
MFC after: 3 days
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D10870
Extend the ino_t, dev_t, nlink_t types to 64-bit ints. Modify
struct dirent layout to add d_off, increase the size of d_fileno
to 64-bits, increase the size of d_namlen to 16-bits, and change
the required alignment. Increase struct statfs f_mntfromname[] and
f_mntonname[] array length MNAMELEN to 1024.
ABI breakage is mitigated by providing compatibility using versioned
symbols, ingenious use of the existing padding in structures, and
by employing other tricks. Unfortunately, not everything can be
fixed, especially outside the base system. For instance, third-party
APIs which pass struct stat around are broken in backward and
forward incompatible ways.
Kinfo sysctl MIBs ABI is changed in backward-compatible way, but
there is no general mechanism to handle other sysctl MIBS which
return structures where the layout has changed. It was considered
that the breakage is either in the management interfaces, where we
usually allow ABI slip, or is not important.
Struct xvnode changed layout, no compat shims are provided.
For struct xtty, dev_t tty device member was reduced to uint32_t.
It was decided that keeping ABI compat in this case is more useful
than reporting 64-bit dev_t, for the sake of pstat.
Update note: strictly follow the instructions in UPDATING. Build
and install the new kernel with COMPAT_FREEBSD11 option enabled,
then reboot, and only then install new world.
Credits: The 64-bit inode project, also known as ino64, started life
many years ago as a project by Gleb Kurtsou (gleb). Kirk McKusick
(mckusick) then picked up and updated the patch, and acted as a
flag-waver. Feedback, suggestions, and discussions were carried
by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles),
and Rick Macklem (rmacklem). Kris Moore (kris) performed an initial
ports investigation followed by an exp-run by Antoine Brodin (antoine).
Essential and all-embracing testing was done by Peter Holm (pho).
The heavy lifting of coordinating all these efforts and bringing the
project to completion were done by Konstantin Belousov (kib).
Sponsored by: The FreeBSD Foundation (emaste, kib)
Differential revision: https://reviews.freebsd.org/D10439
patm(4) devices.
Maintaining an address family and framework has real costs when we make
infrastructure improvements. In the case of NATM we support no devices
manufactured in the last 20 years and some will not even work in modern
motherboards (some newer devices that patm(4) could be updated to
support apparently exist, but we do not currently have support).
With this change, support remains for some netgraph modules that don't
require NATM support code. It is unclear if all these should remain,
though ng_atmllc certainly stands alone.
Note well: FreeBSD 11 supports NATM and will continue to do so until at
least September 30, 2021. Improvements to the code in FreeBSD 11 are
certainly welcome.
Reviewed by: philip
Approved by: harti
9899:2011 Appendix K 3.7.4.1.
Other needed supporting types, defines and constraint_handler
infrastructure is added as specified in the C11 spec.
Submitted by: Tom Rix <trix@juniper.net>
Sponsored by: Juniper Networks
Discussed with: ed
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D9903
Differential revision: https://reviews.freebsd.org/D10161
Implement a new init(8) option in /etc/ttys. If this option is present
on the entry in /etc/ttys, the entry will be active if and only if it
exists. If the name starts with a '/', it will be considered an
absolute path. If not, it will be a path relative to /dev.
This allows one to turn off video console getty that aren't present
(while running a getty on them even when they aren't the system
console). Likewise with serial ports.
It differs from onifconsole in only requiring the device exist rather
than it be listed as one of the system consoles.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D10037
Add a clock_nanosleep() syscall, as specified by POSIX.
Make nanosleep() a wrapper around it.
Attach the clock_nanosleep test from NetBSD. Adjust it for the
FreeBSD behavior of updating rmtp only when interrupted by a signal.
I believe this to be POSIX-compliant, since POSIX mentions the rmtp
parameter only in the paragraph about EINTR. This is also what
Linux does. (NetBSD updates rmtp unconditionally.)
Copy the whole nanosleep.2 man page from NetBSD because it is complete
and closely resembles the POSIX description. Edit, polish, and reword it
a bit, being sure to keep any relevant text from the FreeBSD page.
Reviewed by: kib, ngie, jilles
MFC after: 3 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D10020
the default partition, eMMC v4.41 and later devices can additionally
provide up to:
1 enhanced user data area partition
2 boot partitions
1 RPMB (Replay Protected Memory Block) partition
4 general purpose partitions (optionally with a enhanced or extended
attribute)
Of these "partitions", only the enhanced user data area one actually
slices the user data area partition and, thus, gets handled with the
help of geom_flashmap(4). The other types of partitions have address
space independent from the default partition and need to be switched
to via CMD6 (SWITCH), i. e. constitute a set of additional "disks".
The second kind of these "partitions" doesn't fit that well into the
design of mmc(4) and mmcsd(4). I've decided to let mmcsd(4) hook all
of these "partitions" up as disk(9)'s (except for the RPMB partition
as it didn't seem to make much sense to be able to put a file-system
there and may require authentication; therefore, RPMB partitions are
solely accessible via the newly added IOCTL interface currently; see
also below). This approach for one resulted in cleaner code. Second,
it retains the notion of mmcsd(4) children corresponding to a single
physical device each. With the addition of some layering violations,
it also would have been possible for mmc(4) to add separate mmcsd(4)
instances with one disk each for all of these "partitions", however.
Still, both mmc(4) and mmcsd(4) share some common code now e. g. for
issuing CMD6, which has been factored out into mmc_subr.c.
Besides simply subdividing eMMC devices, some Intel NUCs having UEFI
code in the boot partitions etc., another use case for the partition
support is the activation of pseudo-SLC mode, which manufacturers of
eMMC chips typically associate with the enhanced user data area and/
or the enhanced attribute of general purpose partitions.
CAVEAT EMPTOR: Partitioning eMMC devices is a one-time operation.
- Now that properly issuing CMD6 is crucial (so data isn't written to
the wrong partition for example), make a step into the direction of
correctly handling the timeout for these commands in the MMC layer.
Also, do a SEND_STATUS when CMD6 is invoked with an R1B response as
recommended by relevant specifications. However, quite some work is
left to be done in this regard; all other R1B-type commands done by
the MMC layer also should be followed by a SEND_STATUS (CMD13), the
erase timeout calculations/handling as documented in specifications
are entirely ignored so far, the MMC layer doesn't provide timeouts
applicable up to the bridge drivers and at least sdhci(4) currently
is hardcoding 1 s as timeout for all command types unconditionally.
Let alone already available return codes often not being checked in
the MMC layer ...
- Add an IOCTL interface to mmcsd(4); this is sufficiently compatible
with Linux so that the GNU mmc-utils can be ported to and used with
FreeBSD (note that due to the remaining deficiencies outlined above
SANITIZE operations issued by/with `mmc` currently most likely will
fail). These latter will be added to ports as sysutils/mmc-utils in
a bit. Among others, the `mmc` tool of the GNU mmc-utils allows for
partitioning eMMC devices (tested working).
- For devices following the eMMC specification v4.41 or later, year 0
is 2013 rather than 1997; so correct this for assembling the device
ID string properly.
- Let mmcsd.ko depend on mmc.ko. Additionally, bump MMC_VERSION as at
least for some of the above a matching pair is required.
- In the ACPI front-end of sdhci(4) describe the Intel eMMC and SDXC
controllers as such in order to match the PCI one.
Additionally, in the entry for the 80860F14 SDXC controller remove
the eMMC-only SDHCI_QUIRK_INTEL_POWER_UP_RESET.
OKed by: imp
Submitted by: ian (mmc_switch_status() implementation)
Use SRCTOP in place of .CURDIR/.. as appropriate. The hand-crafted
relative paths for the "links" option remain, though, since those are
relative to /usr/include/sys/<blah> not to the source tree.
Differential Revision: https://reviews.freebsd.org/D9932
Sponsored by: Netflix
Silence On: arch@ (twice)
Also mention <time.h> in sem_timedwait(3), because POSIX does,
and because the user will need it for clockid_t, struct timespec,
and TIMER_ABSTIME.
Reported by: bde
MFC after: 9 days
X-MFC with: r314179
Sponsored by: Dell EMC
This function allows the caller to specify the reference clock
and choose between absolute and relative mode. In relative mode,
the remaining time can be returned.
The API is similar to clock_nanosleep(3). Thanks to Ed Schouten
for that suggestion.
While I'm here, reduce the sleep time in the semaphore "child"
test to greatly reduce its runtime. Also add a reasonable timeout.
Reviewed by: ed (userland)
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9656
Replace uses of the GCC __nonnull__ attribute with the clang nullability
qualifiers. The replacement should be transparent for clang developers as
the new qualifiers will produce the same warnings and will be useful for
static checkers but will not cause aggressive optimizations.
GCC will not produce such warnings and developers will have to use
upgraded GCC ports built with the system headers from r312538.
Hinted by: Apple's Libc-1158.20.4, Bionic libc
MFC after: 11.1 Release
Differential Revision: https://reviews.freebsd.org/D9004
While the checks are considered useful, the attribute does dangerous
optimizations, removing NULL checks where they can be needed. Remove the
uses of this attribute introduced in r281130: the changes were inspired on
Google's bionic where this attribute is not used anymore.
The __nonnull() attribute will be deprecrated from our headers and
replaced with the Clang _Nonnull qualifier in the future.
MFC after: 3 days
This change consists of two parts:
- allow libkvm to recognize /dev/vmm/* character devices as devices that
provide access to the physical memory of a system (similarly to /dev/fwmem*)
- allow libkvm to recognize that /dev/vmm/* and /dev/fwmem* devices provide
access to the physical memory of live remote systems and, thus, the memory
is writable
As a result, it should be possible to run commands like
$ kgdb -w /path/to/kernel /dev/fwmem0.0
$ kgdb /path/to/kernel /dev/vmm/guest
Reviewed by: kib, jhb
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D8679
This paves way to implement VDSO for the enlightened time counter.
Reviewed by: kib
MFC after: 1 week
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D8768
VSS stands for "Volume Shadow Copy Service". Unlike virtual machine
snapshot, it only takes snapshot for the virtual disks, so both
filesystem and applications have to aware of it, and cooperate the
whole VSS process.
This driver exposes two device files to the userland:
/dev/hv_fsvss_dev
Normally userland programs should _not_ mess with this device file.
It is currently used by the hv_vss_daemon(8), which freezes and
thaws the filesystem. NOTE: currently only UFS is supported, if
the system mounts _any_ other filesystems, the hv_vss_daemon(8)
will veto the VSS process.
If hv_vss_daemon(8) was disabled, then this device file must be
opened, and proper ioctls must be issued to keep the VSS working.
/dev/hv_appvss_dev
Userland application can opened this device file to receive the
VSS freeze notification, hold the VSS for a while (mainly to flush
application data to filesystem), release the VSS process, and
receive the VSS thaw notification i.e. applications can run again.
The VSS will still work, even if this device file is not opened.
However, only filesystem consistency is promised, if this device
file is not opened or is not operated properly.
hv_vss_daemon(8) is started by devd(8) by default. It can be disabled
by editting /etc/devd/hyperv.conf.
Submitted by: Hongjiang Zhang <honzhan microsoft com>
Reviewed by: kib, mckusick
MFC after: 3 weeks
Sponsored by: Microsoft
Differential Revision: https://reviews.freebsd.org/D8224
Now that the changes to the dirname(3) function had some time to settle,
let's go ahead and use the same approach for replacing basename(3) by a
simple implementation that modifies the input string, thereby making it
thread-safe and guaranteed to succeed.
Unlike dirname(3), this function already had a thread-safe variant
basename_r(3). This function had its own set of problems, like having an
upper bound on the pathname length. Keep this function around for
compatibility, but remove most references from the man page. Make the
man page more similar to that of dirname(3).
As the basename_r(3) function is only provided by FreeBSD (and Bionic),
depending on its use is even more implementation defined than assuming
that basename(3) is thread-safe.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D8382