use of u3g(4) dongles, and in many cases can work out of the box.
Reviewed by: hselasky
Approved by: re (gjb)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D16974
Ths prevents etcupdate and mergemaster from deleting it for now.
Approved by: re (rgrimes), will (mentor)
Differential Revision: https://reviews.freebsd.org/D16975
The switch to lualoader creates a problem with userboot: the host is
inclined to build userboot with Lua, but the host userboot's interpreter
must match what's available on the guest. For almost all FreeBSD guests in
the wild, Lua is not yet available and a Lua-based userboot will fail.
This revision updates userboot protocol to version 5, which adds a
swap_interpreter callback to request a different interpreter, and tries to
determine the proper interpreter to be used based on how the guest
/boot/loader is compiled. This is still a bit of a guess, but it's likely
the best possible guess we can make in order to get it right. The
interpreter is now embedded in the resulting executable, so we can open
/boot/loader on the guest and hunt that down to derive the interpreter it
was built with.
Using -l with bhyveload will not allow an intepreter swap, even if the
loader specified happens to be a userboot with the wrong interpreter. We'll
simply complain about the mismatch and bail out.
For legacy guests without the interpreter marker, we assume they're 4th.
For new guests with the interpreter marker, we'll read it and swap over
to the proper interpreter if it doesn't match what the userboot we're using
was compiled with.
Both flavors of userboot are installed by default, userboot_4th.so and
userboot_lua.so. This fixes the build WITHOUT_FORTH as a coincidence, which
was broken by userboot being forced to 4th.
Reviewed by: imp, jhb, araujo (earlier version)
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D16945
After r336252 it is no longer necessary to have a separate bootpool when
booting from an encrypted disk with UEFI.
This change also switches the EFI System Partition contents from
the 800 KB boot1.efifat to a new 200 MB filesystem created with newfs_msdos
and uses loader.efi directly, instead of boot1.efi.
PR: 228916
Reviewed by: dteske
MFC after: 1 month
Relnotes: yes
Sponsored by: Klara Systems
Differential Revision: https://reviews.freebsd.org/D12315
* Constify rtpref_str declaration
* Remove unused h_errno declaration
* Use time_t type for expire
* Use strlcpy to set static "?" value to ifname
* Rename local variable 's' to stop shadowing global definition
* Close socket used in pfx_flush()
* Use local variables for sock() in setdefif() and getdefif()
* Increase WARNS to 3
Reviewed by: allanjude, kevans
Approved by: allanjude
Sponsored by: Rubicon Communications, LLC (Netgate)
Differential Revision: https://reviews.freebsd.org/D11118
This adds it to devctl, libdevctl, defines the two IOCTLs and
implements the kernel bits. causes any new drivers that are added via
kldload to be deferred until a 'thaw' comes in. These do not stack: it
is an error to freeze while frozen, or thaw while thawed.
Differential Revision: https://reviews.freebsd.org/D16735
This is pkgbase related as it switches to CONFS to properly tag this as a
config file.
Approved by: will (mentor)
Differential Revision: https://reviews.freebsd.org/D16848
For tools that uses bhyve such like libvirt, it is important to be able to
probe what features are supported by the given bhyve binary.
To give more context, libvirt probes bhyve's capabilities in a not very
effective way:
- Running 'bhyve -h' and parsing output.
- To detect devices, it runs 'bhyve -s 0,dev' for every each device and
parses error output to identify if the device is supported or not.
PR: 2101111
Submitted by: novel
MFC after: 2 weeks
Relnotes: yes
Sponsored by: iXsystems Inc.
2^32 bps or greater to be used. Prior to this, bandwidth parameters
would simply wrap at the 2^32 boundary. The computations in the HFSC
scheduler and token bucket regulator have been modified to operate
correctly up to at least 100 Gbps. No other algorithms have been
examined or modified for correct operation above 2^32 bps (some may
have existing computation resolution or overflow issues at rates below
that threshold). pfctl(8) will now limit non-HFSC bandwidth
parameters to 2^32 - 1 before passing them to the kernel.
The extensions to the pf(4) ioctl interface have been made in a
backwards-compatible way by versioning affected data structures,
supporting all versions in the kernel, and implementing macros that
will cause existing code that consumes that interface to use version 0
without source modifications. If version 0 consumers of the interface
are used against a new kernel that has had bandwidth parameters of
2^32 or greater configured by updated tools, such bandwidth parameters
will be reported as 2^32 - 1 bps by those old consumers.
All in-tree consumers of the pf(4) interface have been updated. To
update out-of-tree consumers to the latest version of the interface,
define PFIOC_USE_LATEST ahead of any includes and use the code of
pfctl(8) as a guide for the ioctls of interest.
PR: 211730
Reviewed by: jmallett, kp, loos
MFC after: 2 weeks
Relnotes: yes
Sponsored by: RG Nets
Differential Revision: https://reviews.freebsd.org/D16782
PR#230752 shows a panic where an nfsd thread tries to do soconnect() on
the AF_LOCAL socket used by the nfsuserd while already holding an
exclusive lock on it. I am not 100% sure how this happens, but since an
AF_LOCAL socket is in the file system namespace it is conceivable that it
could lock it and then attempt an upcall to the nfsuserd.
However, reverting r320757 stops the nfsuserd from using an AF_LOCAL
socket, so it should avoid any such panic().
r320757 did fix a problem with running the nfsuserd when jails were
enabled, but that can be dealt with less elegantly by allowing the
use of an alternate address instead of 127.0.0.1.
The gssd daemon also uses an AF_LOCAL socket, but it will do upcalls
before the nfsd thread processes the RPC, so I think it should not
be suseptible to this problem.
PR: 230752
This way the target fails if unifdef doesn't exist or doesn't modify the
file instead of just generating an empty .c file.
I found this while building without inherited $PATH (D16815)
Approved By: jhb (mentor)
The original NVMe API used bit-fields to represent fields in data
structures defined by the specification (e.g. the op-code in the command
data structure). The implementation targeted x86_64 processors and
defined the bit fields for little endian dwords (i.e. 32 bits).
This approach does not work as-is for big endian architectures and was
changed to use a combination of bit shifts and masks to support PowerPC.
Unfortunately, this changed the NVMe API and forces #ifdef's based on
the OS revision level in user space code.
This change reverts to something that looks like the original API, but
it uses bytes instead of bit-fields inside the packed command structure.
As a bonus, this works as-is for both big and little endian CPU
architectures.
Bump __FreeBSD_version to 1200081 due to API change
Reviewed by: imp, kbowling, smh, mav
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D16404
Prevent some classes of foot-shooting that may result in permissions
problems.
Reviewed by: dab, delphij, vangyzen (earlier version)
Relnotes: yes (behavior change)
Sponsored by: Dell EMC Isilon
Differential Revision: D16831
This helps with pkgbase by switching to CONFS so they are properly tagged as
config files.
Approved by: will (mentor)
Differential Revision: https://reviews.freebsd.org/D16833
This helps with pkgbase as it switches these to use CONFS which properly tags
them as config files.
Approved by: will (mentor)
Differential Revision: https://reviews.freebsd.org/D16783
Add a -C option, similar to -B, that allows gstat to produce basic CSV output
with absolute timestamps (ISO 8601, nearly.) Multiple devices are handled by
way of a single-pivot CSV table with duplicated timestamps for each object
output.
Submitted by: Nick Principe <nap__ixsystems.com>
Reviewed by: myself, imp@, asomers (earlier verison), bcr (manpages)
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D16151
For use with things like BOOT_TAG=\"\" -- there are valid reasons to allow
empty strings, especially as these are usually being passed through as
options. The same argument could perhaps be made for the unquoted
variant in things like MODULES_OVERRIDE="", but it's not immediately clear
that this is an issue so I've left it untouched.
MFC after: 3 days
If we can't find a Makefile.inc1 in the specified / default SOURCEDIR, and
there's a Makefile.inc1 in the current directory, offer the user the choice
of using . for SOURCEDIR.
Differential Revsion: https://reviews.freebsd.org/D16709
However, for post-install configuration, bsdinstall
is not of much use. Point the user to bsdconfig instead.
Reviewed by: 0mp, bcr
Approved by: 0mp, bcr
Differential Revision: https://reviews.freebsd.org/D16751
The original commit added granularity to the transaction latency display
in the extended device stats mode, but didn't update the man page.
Reported by: Miroslav Lachman <000.fbsd@quip.cz> via jmg
MFC after: 1 day
This allows preferring small (e.g. ACK) packets, in upload heavy
environments.
It was already possible to mark packets urgent based on destination
port. This option piggy backs on that feature.
created and before exec.start is called. [1]
- Bump __FreeBSD_version.
This allows to attach ZFS datasets and various other things to be
done before any command/service/rc-script is started in the new
jail.
PR: 228066 [1]
Reviewed by: jamie [1]
Submitted by: Stefan Grönke <stefan@gronke.net> [1]
Differential Revision: https://reviews.freebsd.org/D15330 [1]
it read-only instead of just failing if the media is write-protected.
The /net doesn't seem to require the flag.
MFC after: 2 weeks
Relnotes: yes
Sponsored by: DARPA, AFRL
This is pkgbase related as it uses CONFS to tag the file as a config file
Approved by: AllanJude (mentor)
Sponsored by: Essen Hackathon
Differential Revision: https://reviews.freebsd.org/D16693
This is related to pkgbase and changes these to use CONFS so that these are
tagged as config files.
Approved by: AllanJude (mentor)
Sponsored by: Essen Hackathon
Differential Revision: https://reviews.freebsd.org/D16694
This helps with pkgbase by using CONFS to tag these as config files.
Approved by: allanjude (mentor), ian, cy
Sponsored by: Essen Hackathon
Differential Revision: https://reviews.freebsd.org/D16661
This makes pkgbase easier by tagging these as CONFS so they are properly
tagged as config files.
Approved by: will (mentor)
Sponsored by: Essen Hackathon
Differential Revision: https://reviews.freebsd.org/D16553
This helps with pkgbase as this config file will now be tagged as a config
file
Approved by: allanjude (mentor)
Sponsored by: Essen Hackathon
Differential Revision: https://reviews.freebsd.org/D16674
This helps with pkgbase as these config files will be properly tagged as
config files.
Approved by: allanjude (mentor), oshogbo
Differential Revision: https://reviews.freebsd.org/D16679
and make sure it would terminate with nul with strlcpy().
Reviewed by: imp (earlier revision)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D16595
argv has been incremented during argument handling, so elements of the
array are no longer valid. Change the err() arguments so only the
first string pointer in argv is used.
Found during code inspection.
- Remove the compression suffix macros and move them directly into the
compress_type array.
- Remove the hardcoded sizes on the suffix and compression args arrays.
- Simplify the compression args arrays at the expense of a __DECONST
when calling execv().
- Rewrite do_zipwork. The COMPRESS_* macros can directly index the
compress_types array, so the outer loop is not needed. Convert
fixed-length strings into asprintf or sbuf calls.
Submitted by: Dan Nelson <dnelson_1901@yahoo.com>
Reviewed by: gad
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D16518
This actually makes the rights requirements for accessing PCI config
space and BARs using /dev/pci same. Since unchanged /dev/pci mode
only allows write open for root, default configuration de-facto limits
the BAR read to root only. In particular, state-changing reads of the
registers are limited to root.
Discussed with: se
Suggested and reviewed by: jhb (kernel part)
Sponsored by: The FreeBSD Foundation
MFC after: 12 days
Differential revision: https://reviews.freebsd.org/D16580
The zstd invocation constructed by newsyslog contains one more parameter
than invocations for the other supported compression utilities. However,
the maximum number of arguments was hard-coded, leading to an
out-of-bounds array access when using zstd compression.
This patch adds a new sysctl(8) knob "security.jail.vmm_allowed",
by default this option is disable.
Submitted by: Shawn Webb <shawn.webb____hardenedbsd.org>
Reviewed by: jamie@ and myself.
Relnotes: Yes.
Sponsored by: HardenedBSD and G2, Inc.
Differential Revision: https://reviews.freebsd.org/D16057
This is prep for pkging base and helps tag and install config files with the
correct packages.
Approved by: bapt (mentor)
Differential Revision: https://reviews.freebsd.org/D16493
The timespecadd(3) family of macros were imported from NetBSD back in
r35029. However, they were initially guarded by #ifdef _KERNEL. In the
meantime, we have grown at least 28 syscalls that use timespecs in some
way, leading many programs both inside and outside of the base system to
redefine those macros. It's better just to make the definitions public.
Our kernel currently defines two-argument versions of timespecadd and
timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define
three-argument versions. Solaris also defines a three-argument version, but
only in its kernel. This revision changes our definition to match the
common three-argument version.
Bump _FreeBSD_version due to the breaking KPI change.
Discussed with: cem, jilles, ian, bde
Differential Revision: https://reviews.freebsd.org/D14725
Reuse of the index variable in two nested loops resulted in only the first
argument in the list being used (fine for gzip, not fine for zstd). Also
add tests for xz and zstd, and fix the COMPRESS_SUFFIX_MAXLEN macro.
Submitted by: dnelson_1901_yahoo.com
Differential Revision: https://reviews.freebsd.org/D16509
It allows locking or unlocking physical pages in memory within a jail
This allows running elasticsearch with "bootstrap.memory_lock" inside a jail
Reviewed by: jamie@
Differential Revision: https://reviews.freebsd.org/D16342
r336795 adds support for handling of IPv6 addresses returned by getaddrinfo(3)
for DS hostnames. This updates the man page for this change.
This is a content change.
This patch adds code to handle IPv6 addresses returned by getaddrinfo()
for the host entries in the "-p" command line argument.
If the IPv6 address is a link local address, only use it if it is the
only address for the host. This is done since there is no way to know
if the NFSv4.1 pNFS client is in the same scope zone as the MDS.
inet_ntop() is used for the IPv6 address translation, since the client
will have no use for the scope zone suffix and inet_ntop() does not
put this in the address string.
Discussed with: bu7cher@yandex.ru
than the auotmatic selection). This is important in some scripting
environments.
Also, remove bogus checks for bootnum != 0. 0 is a valid bootnum.
Sponsored by: Netflix
It was also leading to segfaults; pw can be NULL when control reaches these
lines now, because of the way my previous change restructured the loops.
Reported by: lwhsu@
pw_scan(3) has been fixed in a way that doesn't perturb other callers of
it or the getpwnam(3) family.
Make pw(8) showuser work the same with or without -R <path> for non-root
users. Without -R, pw(8) uses getpwnam(3), which will open master.passwd
for the root user or passwd for non-root users. With -R <path> pw(8) was
always opening <path>/master.passwd, which would fail for a non-root user,
then falsely claim the userid you're trying to show doesn't exist.
Now for a non-root user it opens <path>/passwd, and populates the fields in
the returned struct passwd which aren't present in that file with well-known
canonical values, which duplicates the behavior of getpwnam(3). The net
effect is that the showuser output is identical whether using -R or not.
Although the ffs (and later msdosfs) implementation in makefs is
independent of the one in kernel, it makes sense to keep differences to
a minimum in order to ease comparison and porting changes across.
Submitted by: Siva Mahadevan
Sponsored by: The FreeBSD Foundation
to the type of rate limiter being configured. For example, the class
WRR scheduler doesn't need any kbps limits (it just needs the weights
for each class), the channel scheduler doesn't need anything except the
aggregate kbps to limit the channel to, and so on.
MFC after: 3 days
Sponsored by: Chelsio Communications
Temporarily decompress a copy of a crash dump compressed with either
gzip or zstd and run various tools against the decompressed copy while
generating the crash information. The uncompressed copy is deleted when
the script exits.
Note that crashinfo is enabled by default, so this will attempt to
decompress the most recent compressed crash dump after a crash that
generates a compressed crash dump. Users who wish to only do offline
analysis of compressed crash dumps can disable crashinfo in rc.conf.
Tested by: ler
Reviewed by: markj
MFC after: 2 weeks
users. Without -R, pw(8) uses getpwnam(3), which will open master.passwd
for the root user or passwd for non-root users. With -R <path> pw(8) was
always opening <path>/master.passwd, which would fail for a non-root user,
then falsely claim the userid you're trying to show doesn't exist.
Now for a non-root user it opens <path>/passwd and zeroes out the 3 fields
that aren't available in the passwd file, which duplicates the behavior of
getpwnam(3). The net effect is that the showuser output is identical
whether using -R or not.
Fix two failing makefs test cases by adding "-M 1m", which was already used
for every other FFS test case. Add a new test case for the underlying
issue: with no -M, -m, or -s options, makefs can underestimate image size.
PR: 229929
Reported by: Jenkins
MFC after: 2 weeks
- Use "Fl -" instead of "Cm --" for long options.
- Sort options alphabetically.
- Pet "mandoc -Tlint".
- Clean up the description of the "--interpreter" option.
- Clean up the description of the first example in the examples section.
- Use ".Bd -literal -offset indent" for all example code blocks for consistency.
- Use "Nm" instead of "Cm binmiscctl".
- Indent all examples for consistency.
Reviewed by: allanjude
Approved by: mat (mentor)
Differential Revision: https://reviews.freebsd.org/D15589
Code analysis and runtime analysis using truss(8) indicate that the only
privileged operations performed by ntpd are adjusting system time, and
(re-)binding to privileged UDP port 123. These changes add a new mac(4)
policy module, mac_ntpd(4), which grants just those privileges to any
process running with uid 123.
This also adds a new user and group, ntpd:ntpd, (uid:gid 123:123), and makes
them the owner of the /var/db/ntp directory, so that it can be used as a
location where the non-privileged daemon can write files such as the
driftfile, and any optional logfile or stats files.
Because there are so many ways to configure ntpd, the question of how to
configure it to run without root privs can be a bit complex, so that will be
addressed in a separate commit. These changes are just what's required to
grant the limited subset of privs to ntpd, and the small change to ntpd to
prevent it from exiting with an error if running as non-root.
Differential Revision: https://reviews.freebsd.org/D16281
Plumb the %VERSREQ from Makefile.<arch> through to the rest of config(8).
We've recorded the config(8) version that we're calling "the end of
envmode and hintmode," and we'll write them out for earlier versions. Later
kernel version bumps will remove envmode/hintmode from the kernel as needed,
which is OK since the current kernel does not use them at all.
These compatibility shims really need to go away when the major version
rolls over...
Discussed with: imp
config-generated hints.c/env.c from r335998 and later are incompatible with
earlier kernels due to no longer setting envmode/hintmode. A minor bump for
this is insufficient, as matching major version with a later minor version
is still viewed as backwards-compatible.
This was an MI kernel change, soo all VERSREQ's are bumped.
PR: bin/229806
Reported by: Andreas Sommer <andreas.sommer87@googlemail.com>
MFC after: 3 days
X-MFC-to: stable/11 stable/10 stable/9
Sponsored by: Smule, Inc.
Normally, we can get away with just reading the 1k buffer for the
string, since the placement of the data is generally no where near the
end of the file. However, it's possible that the string is within the
last 1k of the file, in which case the read will fail, and we'll not
produce the proper records needed for devmatch to work. By reading
using EF_SEG_READ_STRING, we automatically work around these problems
while still retaining safety.
This fix a problem with devmatch where we wouldn't load certain
modules (like ums). This didn't always happen (my tree didn't exhibit
it, while nathan's did because his optimization options were more
agressive).
Reported by: nathanw@
This variable has been given the name "loader_env.disabled" as it's the
primary way most people will have an MD environment. This restores the
previously-default behavior of ignoring the loader(8) environment, which may
be useful for vendor distributions or other scenarios where inheriting the
loader environment may be considered a security issue or potentially
breaking of a more locked-down environment.
As the change to config(5) indicates, disabling the loader environment
should not be a choice made lightly since it may provide ACPI hints and
other useful things that the system can rely on to boot.
An UPDATING entry has been added to mention an upgrade path for those that
may have relied on the previous behavior.
Discussed with: bde
Relnotes: yes (maybe)
The bhyve(8) exit status indicates how the VM was terminated:
0 rebooted
1 powered off
2 halted
3 triple fault
The problem is when we have wrappers around bhyve that parses the exit
error code and gets an exit(1) for an error but interprets it as "powered off".
So to mitigate this issue and makes it less error prone for third part
applications, I have added a new exit code 4 that is "exited due to an error".
For now the bhyve(8) exit status are:
0 rebooted
1 powered off
2 halted
3 triple fault
4 exited due to an error
Reviewed by: @jhb
MFC after: 2 weeks.
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D16161
The pnfsdskill(8) command will normally fail if there is no valid mirror
for the DS to be disabled. However, a system administrator may need to
disable a DS which does not have a valid mirror so that the nfsd threads
can be terminated. This patch adds a "-f" option to pnfsdskill(8) that
uses the kernel changes made by r336141 to implement this "forced" case
of disabling a DS.
This patch only affects the pNFS server.
The RFC 5424 spec mentions that logging FQDNs over short hostnames is
preferred. Alter this code, so that the hostname doesn't get truncated
on startup. Keep track of the length of the short hostname, so that
fprintf() can do the truncation where necessary.
MFC after: 1 month
Tools such as Postfix use slashes in process names for hierarchy
(postfix/qmgr). By allowing these slashes, syslogd is able to extract
the process name and process ID nicely, so that they can be stored in
RFC 5424 message fields.
MFC after: 1 week
r336019 introduced ${SRCTOP}/sys to the include paths in order to pull in a
new sys/{c,}nv.h. This is wrong, because the build tree's ABI isn't
guaranteed to match what's running on the host system.
Fix instead by removing -I${SRCTOP}/sys and installing the libnv headers
with `make -C lib/libnv includes`... this may or may not get re-worked in
the future so that a userland lib isn't installing includes from sys/.
Reported by: bdrewery
r335653 flipped the order in which hints/env files are concatenated to match
the order in which vars are processed by the kernel. This is the other
hammer to drop.
Use nv(9) to de-dupe entries within a single `hint` or `env` file, using the
latest value specified for a key. This leaves some duplicates if a variable
is specified in multiple hint/env files or via `envvar` in a kernel config,
but the reversed order of concatenation (from r335653) makes this a
non-issue as the latest-specified version will be seen first.
This change also silently rewrote hint bits to use the same sanitization
process that ian@ wrote for r335642. To the kernel, hints and env vars are
basically the same thing through early boot, then get merged into the
dynamic environment once kmem becomes available and the dynamic environment
is created. They should be subjected to the same restrictions.
libnv has been added to -legacy for the time being to support the build of
config(8) with the new cnvlist API.
Tested with: universe (11 host & 12 host)
MFC after: 1 month
r335653 flipped the order in which hints/env files are concatenated to match
the order in which vars are processed by the kernel. This is the other
hammer to drop.
Use nv(9) to de-dupe entries within a single `hint` or `env` file, using the
latest value specified for a key. This leaves some duplicates if a variable
is specified in multiple hint/env files or via `envvar` in a kernel config,
but the reversed order of concatenation (from r335653) makes this a
non-issue as the latest-specified version will be seen first.
This change also silently rewrote hint bits to use the same sanitization
process that ian@ wrote for r335642. To the kernel, hints and env vars are
basically the same thing through early boot, then get merged into the
dynamic environment once kmem becomes available and the dynamic environment
is created. They should be subjected to the same restrictions.
MFC after: 1 month
At the moment, hintmode and envmode are used to indicate whether static
hints or static env have been provided in the kernel config(5) and the
static versions are mutually exclusive with loader(8)-provided environment.
hintmode *can* be reconfigured later to pull from the dynamic environment,
thus taking advantage of the loader(8) or post-kmem environment setting.
This changeset fixes both problems at once to move us from a semi-confusing
state to a consistent state: if an environment file, hints file, or
loader(8) environment are provided, we use them in a well-known order of
precedence:
- loader(8) environment
- static environment
- static hints file
Once the dynamic environment is setup this becomes a moot point. The
loader(8) and static environments are merged (respecting the above order of
precedence), and the static hints are merged in on an as-needed basis after
the dynamic environment has been setup.
Hints lookup are changed to respect all of the above. Before the dynamic
environment is setup, lookups use the above-mentioned order and fallback to
the next environment if a matching hint is not found. Once the dynamic
environment is setup, that is used on its own since it captures all of the
above information plus any dynamic kenv settings that came up later in boot.
The following tangentially related changes were made to res_find:
- A hintp cookie is now passed in so that related searches continue using
the chain of environments (or dynamic environment) without relying on
global state
- All three environments will be searched if they actually have valid hints
to use, rather than just choosing the first environment that actually had
a hint and rolling with that only
The hintmode sysctl has been ripped out. static_{env,hints}.disabled are
still honored and will disable their respective environments from being used
for hint lookups and from being merged into the dynamic environment, as
expected.
MFC after: 1 month (maybe)
Differential Revision: https://reviews.freebsd.org/D15953
At the moment, hintmode and envmode are used to indicate whether static
hints or static env have been provided in the kernel config(5) and the
static versions are mutually exclusive with loader(8)-provided environment.
hintmode *can* be reconfigured later to pull from the dynamic environment,
thus taking advantage of the loader(8) or post-kmem environment setting.
This changeset fixes both problems at once to move us from a semi-confusing
state to a consistent state: if an environment file, hints file, or
loader(8) environment are provided, we use them in a well-known order of
precedence:
- loader(8) environment
- static environment
- static hints file
Once the dynamic environment is setup this becomes a moot point. The
loader(8) and static environments are merged (respecting the above order of
precedence), and the static hints are merged in on an as-needed basis after
the dynamic environment has been setup.
Hints lookup are changed to respect all of the above. Before the dynamic
environment is setup, lookups use the above-mentioned order and fallback to
the next environment if a matching hint is not found. Once the dynamic
environment is setup, that is used on its own since it captures all of the
above information plus any dynamic kenv settings that came up later in boot.
The following tangentially related changes were made to res_find:
- A hintp cookie is now passed in so that related searches continue using
the chain of environments (or dynamic environment) without relying on
global state
- All three environments will be searched if they actually have valid hints
to use, rather than just choosing the first environment that actually had
a hint and rolling with that only
The hintmode sysctl has been ripped out. static_{env,hints}.disabled are
still honored and will disable their respective environments from being used
for hint lookups and from being merged into the dynamic environment, as
expected.
MFC after: 1 month (maybe)
Differential Revision: https://reviews.freebsd.org/D15953
The initial work on bhyve NVMe device emulation was done by the GSoC student
Shunsuke Mie and was heavily modified in performan, functionality and
guest support by Leon Dang.
bhyve:
-s <n>,nvme,devpath,maxq=#,qsz=#,ioslots=#,sectsz=#,ser=A-Z
accepted devpath:
/dev/blockdev
/path/to/image
ram=size_in_MiB
Tested with guest OS: FreeBSD Head, Linux Fedora fc27, Ubuntu 18.04,
OpenSuse 15.0, Windows Server 2016 Datacenter.
Tested with all accepted device paths: Real nvme, zdev and also with ram.
Tested on: AMD Ryzen Threadripper 1950X 16-Core Processor and
Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz.
Tests at: https://people.freebsd.org/~araujo/bhyve_nvme/nvme.txt
Submitted by: Shunsuke Mie <sux2mfgj_gmail.com>,
Leon Dang <leon_digitalmsx.com>
Reviewed by: chuck (early version), grehan
Relnotes: Yes
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D14022
For developers gensnmptree can now generate functions for enums to convert
between enums and strings and to check the validity of a value.
The sources in FreeBSD are now in sync with the upstream which allows to
bring in IPv6 modifications.
r335871 added support for an optional suffix of "#mds_path" that can be
applied to each entry in the "-p" option argument. This specifies that
the DS should be used to store files for the file system on the MDS
at "mds_path".
This patch documents this optional suffix.
This is a content change.
Without this patch, the pNFS server distributes the data storage files across
all of the specified DSs.
A tester noted that it would be nice if a system administrator could control
which DSs are used to store the file data for a given exported MDS file system.
This patch adds an optional suffix for each entry in the "-p" option argument
that specifies "store file data for this MDS file system" in this DS.
The patch should only affect sites using the pNFS server (specified via the
"-p" command line option for nfsd.
The interface between the nfsd and the kernel has changed with this patch,
so anyone using the "-p" option needs to rebuild their nfsd from sources
with this patch applied to them.
Discussed with: james.rose@framestore.com
The refactoring of the syslogd code to format messages using iovecs
slightly altered the output of syslogd by placing the facility/priority
after the hostname, as opposed to printing it right before. This change
reverts the behaviour to be consistent with how it was before.
PR: 229457
Reported by: Andre Albsmeier
MFC after: 1 week
When pnfsdscopymr(8) is used to create a mirror of a file on a mirrored
pNFS service, it expects to find an entry in the extended attribute for
IP address 0.0.0.0.
This patch adds a "-m" option which can be used to create these entrie(s).
It also tightens up the checks for use of incompatible command line options.
The kernel code assumes that nfsdargs.addr == NULL and nfsdargs.addrlen == 0
when there is no "-p" argument used for starting the nfsd.
This small patch ensures this is the case. In practice, I believe this always
happened, since "nfsdargs" was the last element on the stack for "main()",
but this little patch ensures it will be the case.
Spotted by inspection while adding a new optional field for "-p".
By using INSTALL_LINK instead of calling ln during install the files
end up in the METALOG file as well if we use -DNO_ROOT and will be
included in a disk image when using makefs with METALOG as the input.
The other file that was not included in METALOG was /var/db/services.db
which is now also included for -DNO_ROOT.
Approved By: brooks (mentor)
Differential Revision: https://reviews.freebsd.org/D15665
As previously noted, kernel's processing of these means that the first
appearance of a hint/variable wins. Flipping the order of concatenation
means that later variables override earlier variables, as expected when one
does:
hints x
hints y
Where perhaps x is:
hint.aw_sid.0.disable=1
and y is:
hint.aw_sid.0.disable=0
The expectation would be that a later appearing variable would override an
earlier appearing variable, such as with `device`/`nodevice`, device.hints,
and other similarly structured data files.
Previously, only one 'env' file could be specified. Later 'env' directives
would overwrite earlier 'env' directives. This is inconsistent with every
other file-accepting directives which process files in order, including
hints.
A caveat applies to both hints and env that isn't mentioned: they're
concatenated in the order of appearance, so they're not actually applied in
the way one might think by supplying:
hints x
hints y
Hints in x will take precedence over same-name hints in y due to how
the kernel processes them, stopping at the first line that matches the hint
we're searching for. Future work will flip the order of concatenation so
that later files may still properly override earlier files.
In practice, this likely doesn't matter at all due to the nature of the
beast.
envvar allows adding individual environment variables to the kernel's static
environment without the overhead of pulling in a full file. envvar in a
config looks like:
envvar some_var=5
All envvar-provided variables will be added after the env file is processed,
so envvar keys that exist in the previous env will be overwritten by
whatever value is set here in the kernel configuration directly.
As an aside, envvar lines are intentionally tokenized differently from
basically every other line. We used a named state when ENVVAR is encountered
to gobble up the rest of the line, which will later be cleaned and validated
in post-processing by sanitize_envline. This turns out to be the simplest
and cleanest way to allow the flexibility that kenv does while not
compromising on silly hacks.
Reviewed by: ian (also contributor of sanitize_envline rewrite)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D15962
The changes made in r326573 required that messages always start with an
RFC 3164 timestamp. It looks like certain devices, but also certain
logging libraries (Python 3's "logging" package) simply don't generate
RFC 3164 formatted messages containing a timestamp.
Make timestamps optional again. When the timestamp is missing, also
assume that the message contains no hostname. The first word of the
message likely already belongs to the message payload.
PR: 229236
Reported by: Michael Grimm & Marek Zarychta
Reviewed by: glebius (cursory)
MFC after: 1 week
userland, conceptually similar to what i2c(8) provides for i2c devices.
Submitted by: Bob Frazier
Differential Revision: https://reviews.freebsd.org/D15029
try to build them if MK_OPENSSL is unset.
Reviewed by: emaste imp kevans
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15211
To conform to RFC 5426, this function is intended to truncate messages
if they exceed the message size limits. Unfortunately, the amount of
space was computed the wrong way around, causing messages to be
truncated entirely.
Reported by: Michael Grimm on stable@
MFC after: 3 days
The usermgmt API was stomping on a global ($user_gid to be specific)
so things would appear to work fine until you tried to make a second
pass into the API with the now-tainted variable contents.
Fixed by localizing menu-specific contents as to not leak outside API.
PR: bin/208774
Reported by: Martin Waschbuesch <martin@waschbuesch.de>
MFC after: 1 week
X-MFC-to: stable/11, stable/10
Sponsored by: Smule, Inc.
PR: bin/203435
Reported by: Andreas Sommer <andreas.sommer87@googlemail.com>
MFC after: 6 days
X-MFC-to: stable/11
X-MFC-with: r335280
Sponsored by: Smule, Inc.
Fix exit status when f_sysrc_set() fails. Errors in the underlying API
provided by bsdconfig(8) -- /usr/share/bsdconfig/sysrc.subr -- were not
being communicated back to the command-line. This was affecting ansible
modules using sysrc as they were not able to accurately test for error.
PR: bin/211448
Reported by: Christian Schwarz <me@cschwarz.com>
MFC after: 3 days
X-MFC-to: stable/11
Sponsored by: Smule, Inc.
This command can be used by a sysadmin to either copy or migrate a data
file on one DS to another DS.
Its main use is to recover data files onto a mirrored DS after the DS has
been repaired and brought back online.
This command allows a sysadmin to display or modify the pnfsd.dsfile extended
attribute used by the pNFS MDS server in various ways.
Its main use is to set a DS's IP address to 0.0.0.0 when that DS has failed,
so that it will not be used for the file when brought back online after
being repaired.
kgdb now handles kernel module state internally, so the asf tool serves
no purpose.
PR: 229046
Reviewed by: brooks
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D15827
This command can be used by a sysadmin to disable a malfunctioning pNFS server
mirrored DS. It is safe to use when a mirrored DS has already been disabled
via an I/O or network partitioning error.
The "-p" option specifies that the nfsd should run a pNFS service instead
of a regular NFS service. The "-m" option is only meaningful when used with
"-p" to specify that mirroring on the DSs should be done and on how many of
them.
This change requires the kernel changes committed as r334930.
The man page update will be committed as a separate commit soon.
allocation, I could identify that actually we use this pointer on pci_emul.c as
well as on vga.c source file.
I have reworked the logic here to make it more readable and also add a warn to
explicit show the function where the memory allocation error could happen,
also sort headers.
Also CID 1194192 was marked as "Intentional".
Obtained from: TrueOS
MFC after: 4 weeks.
Sponsored by: iXsystems Inc.
- fix debugging for family on AMD cpus and add useful debugging for
which file is being selected for update.
Reviewed by: cem
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15574
Fix the build of lib/libpmc and usr.sbin/pmc for gcc on amd64.
Reviewed by: mmacy
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D15723
We no longer need to od these things conditionally, and the fallbacks
are to 4.2BSD era defaults, which nobody uses anymore. Vixie cron has
diverged from upstream anyway in our tree, and it's not clear there's
actually a viable upstream anymore. Plus, we don't follow the
vendor-supplied code pattern here.
I'm doing this to reduce false positives from grep.
given interval, which is counted in seconds since exit of the previous
invocation of the job. Example user crontab entry:
@25 sleep 10
The example will launch 'sleep 10' every 35 seconds. This is a rather
useless example above, but clearly explains the functionality.
The practical goal here is to avoid overlap of previous job invocation
to a new one, or to avoid too short interval(s) for jobs that last long
and doesn't have any point of immediate launch soon after previous run.
Another useful effect of interval jobs can be noticed when a cluster of
machines periodically communicates with a single node. Running the task
time based creates too much load on the node. Running interval based
spreads invocations across machines in cluster. Note that -j/-J won't
help in this case.
Sponsored by: Netflix
- add '-j' options to filter to enable converting native pmc
log format to json lines format to enable the use of scripts
and external tooling
% pmc filter -j pmc.log pmc.jsonl
- Record the tsc value in sampling interrupts as opposed to
recording nanotime when the sample is copied to a global log
in hardclock - potentially many milliseconds later.
- At initialize record the tsc_freq and the time of day to give
us an offset for translating the tsc values in callchain records
Some mpc85xx devices with u-boot need MBR partitioning with a FAT boot
partition. Since the infrastructure is already in place to have a dedicated
boot partition, this adds the necessary bits to use that infrastructure with
mpc85xx boards.
Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15664
By logging all threads and processes 'pmc filter'
can now filter on process or thread name, relieving
the user of the burden of determining which tid or
pid was which when the sample was taken.
% pmc filter -T if_io_tqg -P nginx pmc.log pmc-iflib.log
% pmc filter -x -T idle pmc.log pmc-noidle.log
Summary:
The kernel reads 'kernelname' to set the kern.bootfile sysctl. By setting this,
'make installkernel' will backup the running kernel as appropriate.
Reviewed by: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15660
This adds the -U options to pmcstat which will attribute in-kernel samples
back to the user stack that invoked the system call. It is not the default,
because when looking at kernel profiles it is generally more desirable to
merge all instances of a given system call together.
Although heavily revised, this change is directly derived from D7350 by
Jonathan T. Looney.
Obtained from: jtl
Sponsored by: Juniper Networks, Limelight Networks
- Restore local change to include <net/bpf.h> inside pcap.h.
This fixes ports build problems.
- Update local copy of dlt.h with new DLT types.
- Revert no longer needed <net/bpf.h> includes which were added
as part of r334277.
Suggested by: antoine@, delphij@, np@
MFC after: 3 weeks
Sponsored by: Mellanox Technologies
This will manage pmc functionality with a more
manageable structure of subcommands rather than the
gradually accreted spaghetti logic of overlapping flags
that exists in pmcstat.
This is intended to ultimately have all the same functionality
as pmcannotate+pmccontrol+pmcstat. Currently it just has
"stat" and "system-stat" - counters for the process itself and counters
for the system as a whole respectively (i.e. system-stat includes kernel
threads). Note that the rusage results (page faults/context switches/
user/sys) for stat-system will not account for the system as a whole -
only for the child process specified on the command line.
Implementing stat was suggested by mjg@ and the output is based on that
from Linux's "perf stat".
% pmc stat -- make -j32 buildkernel -DNO_MODULES -ss > /dev/null
9598393 page faults # 0.674 M/sec
387085 voluntary csw # 0.027 M/sec
106989 involuntary csw # 0.008 M/sec
2763965982317 cycles
2542953049760 instructions # 0.920 inst/cycle
511562750157 branches
12917006881 branch-misses # 2.525%
17944429878 cache-references # 0.007 refs/inst
2205119560 cache-misses # 12.289%
43.74 real # 2019.72% cpu
795.09 user # 1817.72% cpu
88.35 sys # 202.00% cpu
% make -j32 buildkernel -DNO_MODULES -ss > /dev/null &
% sudo pmc stat-system cat
^C 103 page faults # 0.811 M/sec
4 voluntary csw # 0.031 M/sec
0 involuntary csw # 0.000 M/sec
2843639070514 cycles
2606171217438 instructions # 0.916 inst/cycle
522450422783 branches
13092862839 branch-misses # 2.506%
18592101113 cache-references # 0.007 refs/inst
2562878667 cache-misses # 13.785%
44.85 real # 0.00% cpu
0.00 user # 0.00% cpu
0.00 sys # 0.00% cpu
Also stdarg(3) says that each invocation of va_start() must be paired
with a corresponding invocation of va_end() in the same function. [1]
Reported by: Coverity
CID: 1194318[0] and 1194332[1]
Discussed with: jhb
MFC after: 4 weeks.
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D15548
strncpy did not guarantee NUL termination of the "stack" string.
Use strlcpy instead. While I'm here, avoid unnecessary memset
and strnlen calls.
Reported by: Coverity
CID: 1381035
Sponsored by: Dell EMC
vendor provided pmu-events tables and sundry cleanups.
The vendor pmu-events tables provide counter descriptions, default
sample rates, event, umask, and flag values for all the counter
configuration permutations. Using this gives us:
- much simpler kernel code for the MD component
- helpful long and short event descriptions
- simpler user code
- sample rates that won't overload the system
Update man page with newer sample types and remove unused sample type.
vendor provided pmu-events tables and sundry cleanups.
The vendor pmu-events tables provide counter descriptions, default
sample rates, event, umask, and flag values for all the counter
configuration permutations. Using this gives us:
- much simpler kernel code for the MD component
- helpful long and short event descriptions
- simpler user code
- sample rates that won't overload the system
Update man page with newer sample types and remove unused sample type.
Squashed commit of the following:
commit 4459d43eff815bec08ccc5533dbe5de846f03128
Author: Matt Macy <mmacy@mattmacy.io>
Date: Sat May 26 00:06:31 2018 -0700
libpmc: fix pmu function signatures for non amd64
commit a2cb8bbc586c65d41f9b291430a2261ec67b59fe
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 22:38:11 2018 -0700
pmcstat: fix indentation of usage
commit f686954b15ff56a833ac80404898977cb80a265b
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 22:19:49 2018 -0700
pmclog(3): add callchain and pmcallocatedyn, remove pcsample
commit 73e13a0d2e9498c81c150d14d022050cee7511bb
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 22:19:00 2018 -0700
pmclog.h: GC pcsample field
commit 3e93ffd65da641fa657539dad3c48e281f8b5798
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 22:05:57 2018 -0700
hwpmc: make Intel core CPUs use external event tables
commit 634f5fae1e1644ac324003136c66cd9c619d1c93
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 22:00:06 2018 -0700
pmclog: update log record types, bump PMC_MAJOR
- explicitly make log record types a multiple of 8 bytes
- hook in pmu event types for pmc_allocate records
- remove references to no longer PCSAMPLE record
commit 83d84fcd2d65bdf6ddcb2e155a22f0cfa2a9c225
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 21:52:10 2018 -0700
libpmc: add support for having vendor table driven pmc_allocate
commit 9e6ad63c40c2fce8404847ace5078ca6cb33a736
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 19:11:33 2018 -0700
hwpmc_core: add accessors for EVSEL & UMASK, make IAP_UMASK useful to user
commit 859dceb93daa6419a48c794db99b6758e5b041c9
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 19:09:45 2018 -0700
pmcstat: update usage and man page as well as make -L consistent with pmccontrol
commit 79c7d8597e28c2eb13f5f9113e65ec2792ca57b1
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 18:07:03 2018 -0700
pmu_util: add support for all current intel event keywords
commit d8089c7f6a6c8527f38324252b1ffb47004694c6
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 17:45:00 2018 -0700
add description for new arguments
commit 058336740bab53c62ec88a3a026ea848cf3878c6
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 17:38:15 2018 -0700
libpmc: move pmu_events table and pmu_utils out of libpmcstat so that they can be used by pmc_allocate
commit 049b66b382e2f833c3f47bc8df9e750cb265709f
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 16:12:41 2018 -0700
pmcstat: hook pmu_events counter description utility routines in
commit f5e01e7b37a691dc045e1aa16b3ebdd162515de8
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 16:11:59 2018 -0700
pmu_events: add utility routines for listing counters and their descriptions
commit cba4d4f8907f772279f86f18f915e0d74d33ac56
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 16:09:50 2018 -0700
pmu-events: expand out skylake regex to simplify string matches
strdup(3) allocates memory for a copy of the string, does the copy and
returns a pointer to it. If there is no sufficient memory NULL is returned
and the global errno is set to ENOMEM.
We do a sanity check to see if it was possible to allocate enough memory.
Also as we allocate memory, we need to free this memory used. Or it will
going out of scope leaks the storage it points to.
Reviewed by: rgrimes
MFC after: 3 weeks.
X-MFC: r332298
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D15550
on thread in post-processing.
To generate stacks for just ${THREADID}:
pmcstat -R ${PREFIX}.pmcstat -L ${THREADID} -z100 -G ${PREFIX}.stacks
Sponsored by: Limelight Networks
will be returned to indicate the error, so I'm applying an assert(3) to do
a sanity check of the return value.
Reported by: Coverity CID: 1391235, 1193654 and 1193651
Reviewed by: grehan
MFC after: 4 weeks.
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D15533