Plumb the %VERSREQ from Makefile.<arch> through to the rest of config(8).
We've recorded the config(8) version that we're calling "the end of
envmode and hintmode," and we'll write them out for earlier versions. Later
kernel version bumps will remove envmode/hintmode from the kernel as needed,
which is OK since the current kernel does not use them at all.
These compatibility shims really need to go away when the major version
rolls over...
Discussed with: imp
config-generated hints.c/env.c from r335998 and later are incompatible with
earlier kernels due to no longer setting envmode/hintmode. A minor bump for
this is insufficient, as matching major version with a later minor version
is still viewed as backwards-compatible.
This was an MI kernel change, soo all VERSREQ's are bumped.
PR: bin/229806
Reported by: Andreas Sommer <andreas.sommer87@googlemail.com>
MFC after: 3 days
X-MFC-to: stable/11 stable/10 stable/9
Sponsored by: Smule, Inc.
Normally, we can get away with just reading the 1k buffer for the
string, since the placement of the data is generally no where near the
end of the file. However, it's possible that the string is within the
last 1k of the file, in which case the read will fail, and we'll not
produce the proper records needed for devmatch to work. By reading
using EF_SEG_READ_STRING, we automatically work around these problems
while still retaining safety.
This fix a problem with devmatch where we wouldn't load certain
modules (like ums). This didn't always happen (my tree didn't exhibit
it, while nathan's did because his optimization options were more
agressive).
Reported by: nathanw@
This variable has been given the name "loader_env.disabled" as it's the
primary way most people will have an MD environment. This restores the
previously-default behavior of ignoring the loader(8) environment, which may
be useful for vendor distributions or other scenarios where inheriting the
loader environment may be considered a security issue or potentially
breaking of a more locked-down environment.
As the change to config(5) indicates, disabling the loader environment
should not be a choice made lightly since it may provide ACPI hints and
other useful things that the system can rely on to boot.
An UPDATING entry has been added to mention an upgrade path for those that
may have relied on the previous behavior.
Discussed with: bde
Relnotes: yes (maybe)
The bhyve(8) exit status indicates how the VM was terminated:
0 rebooted
1 powered off
2 halted
3 triple fault
The problem is when we have wrappers around bhyve that parses the exit
error code and gets an exit(1) for an error but interprets it as "powered off".
So to mitigate this issue and makes it less error prone for third part
applications, I have added a new exit code 4 that is "exited due to an error".
For now the bhyve(8) exit status are:
0 rebooted
1 powered off
2 halted
3 triple fault
4 exited due to an error
Reviewed by: @jhb
MFC after: 2 weeks.
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D16161
The pnfsdskill(8) command will normally fail if there is no valid mirror
for the DS to be disabled. However, a system administrator may need to
disable a DS which does not have a valid mirror so that the nfsd threads
can be terminated. This patch adds a "-f" option to pnfsdskill(8) that
uses the kernel changes made by r336141 to implement this "forced" case
of disabling a DS.
This patch only affects the pNFS server.
The RFC 5424 spec mentions that logging FQDNs over short hostnames is
preferred. Alter this code, so that the hostname doesn't get truncated
on startup. Keep track of the length of the short hostname, so that
fprintf() can do the truncation where necessary.
MFC after: 1 month
Tools such as Postfix use slashes in process names for hierarchy
(postfix/qmgr). By allowing these slashes, syslogd is able to extract
the process name and process ID nicely, so that they can be stored in
RFC 5424 message fields.
MFC after: 1 week
r336019 introduced ${SRCTOP}/sys to the include paths in order to pull in a
new sys/{c,}nv.h. This is wrong, because the build tree's ABI isn't
guaranteed to match what's running on the host system.
Fix instead by removing -I${SRCTOP}/sys and installing the libnv headers
with `make -C lib/libnv includes`... this may or may not get re-worked in
the future so that a userland lib isn't installing includes from sys/.
Reported by: bdrewery
r335653 flipped the order in which hints/env files are concatenated to match
the order in which vars are processed by the kernel. This is the other
hammer to drop.
Use nv(9) to de-dupe entries within a single `hint` or `env` file, using the
latest value specified for a key. This leaves some duplicates if a variable
is specified in multiple hint/env files or via `envvar` in a kernel config,
but the reversed order of concatenation (from r335653) makes this a
non-issue as the latest-specified version will be seen first.
This change also silently rewrote hint bits to use the same sanitization
process that ian@ wrote for r335642. To the kernel, hints and env vars are
basically the same thing through early boot, then get merged into the
dynamic environment once kmem becomes available and the dynamic environment
is created. They should be subjected to the same restrictions.
libnv has been added to -legacy for the time being to support the build of
config(8) with the new cnvlist API.
Tested with: universe (11 host & 12 host)
MFC after: 1 month
r335653 flipped the order in which hints/env files are concatenated to match
the order in which vars are processed by the kernel. This is the other
hammer to drop.
Use nv(9) to de-dupe entries within a single `hint` or `env` file, using the
latest value specified for a key. This leaves some duplicates if a variable
is specified in multiple hint/env files or via `envvar` in a kernel config,
but the reversed order of concatenation (from r335653) makes this a
non-issue as the latest-specified version will be seen first.
This change also silently rewrote hint bits to use the same sanitization
process that ian@ wrote for r335642. To the kernel, hints and env vars are
basically the same thing through early boot, then get merged into the
dynamic environment once kmem becomes available and the dynamic environment
is created. They should be subjected to the same restrictions.
MFC after: 1 month
At the moment, hintmode and envmode are used to indicate whether static
hints or static env have been provided in the kernel config(5) and the
static versions are mutually exclusive with loader(8)-provided environment.
hintmode *can* be reconfigured later to pull from the dynamic environment,
thus taking advantage of the loader(8) or post-kmem environment setting.
This changeset fixes both problems at once to move us from a semi-confusing
state to a consistent state: if an environment file, hints file, or
loader(8) environment are provided, we use them in a well-known order of
precedence:
- loader(8) environment
- static environment
- static hints file
Once the dynamic environment is setup this becomes a moot point. The
loader(8) and static environments are merged (respecting the above order of
precedence), and the static hints are merged in on an as-needed basis after
the dynamic environment has been setup.
Hints lookup are changed to respect all of the above. Before the dynamic
environment is setup, lookups use the above-mentioned order and fallback to
the next environment if a matching hint is not found. Once the dynamic
environment is setup, that is used on its own since it captures all of the
above information plus any dynamic kenv settings that came up later in boot.
The following tangentially related changes were made to res_find:
- A hintp cookie is now passed in so that related searches continue using
the chain of environments (or dynamic environment) without relying on
global state
- All three environments will be searched if they actually have valid hints
to use, rather than just choosing the first environment that actually had
a hint and rolling with that only
The hintmode sysctl has been ripped out. static_{env,hints}.disabled are
still honored and will disable their respective environments from being used
for hint lookups and from being merged into the dynamic environment, as
expected.
MFC after: 1 month (maybe)
Differential Revision: https://reviews.freebsd.org/D15953
At the moment, hintmode and envmode are used to indicate whether static
hints or static env have been provided in the kernel config(5) and the
static versions are mutually exclusive with loader(8)-provided environment.
hintmode *can* be reconfigured later to pull from the dynamic environment,
thus taking advantage of the loader(8) or post-kmem environment setting.
This changeset fixes both problems at once to move us from a semi-confusing
state to a consistent state: if an environment file, hints file, or
loader(8) environment are provided, we use them in a well-known order of
precedence:
- loader(8) environment
- static environment
- static hints file
Once the dynamic environment is setup this becomes a moot point. The
loader(8) and static environments are merged (respecting the above order of
precedence), and the static hints are merged in on an as-needed basis after
the dynamic environment has been setup.
Hints lookup are changed to respect all of the above. Before the dynamic
environment is setup, lookups use the above-mentioned order and fallback to
the next environment if a matching hint is not found. Once the dynamic
environment is setup, that is used on its own since it captures all of the
above information plus any dynamic kenv settings that came up later in boot.
The following tangentially related changes were made to res_find:
- A hintp cookie is now passed in so that related searches continue using
the chain of environments (or dynamic environment) without relying on
global state
- All three environments will be searched if they actually have valid hints
to use, rather than just choosing the first environment that actually had
a hint and rolling with that only
The hintmode sysctl has been ripped out. static_{env,hints}.disabled are
still honored and will disable their respective environments from being used
for hint lookups and from being merged into the dynamic environment, as
expected.
MFC after: 1 month (maybe)
Differential Revision: https://reviews.freebsd.org/D15953
The initial work on bhyve NVMe device emulation was done by the GSoC student
Shunsuke Mie and was heavily modified in performan, functionality and
guest support by Leon Dang.
bhyve:
-s <n>,nvme,devpath,maxq=#,qsz=#,ioslots=#,sectsz=#,ser=A-Z
accepted devpath:
/dev/blockdev
/path/to/image
ram=size_in_MiB
Tested with guest OS: FreeBSD Head, Linux Fedora fc27, Ubuntu 18.04,
OpenSuse 15.0, Windows Server 2016 Datacenter.
Tested with all accepted device paths: Real nvme, zdev and also with ram.
Tested on: AMD Ryzen Threadripper 1950X 16-Core Processor and
Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz.
Tests at: https://people.freebsd.org/~araujo/bhyve_nvme/nvme.txt
Submitted by: Shunsuke Mie <sux2mfgj_gmail.com>,
Leon Dang <leon_digitalmsx.com>
Reviewed by: chuck (early version), grehan
Relnotes: Yes
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D14022
For developers gensnmptree can now generate functions for enums to convert
between enums and strings and to check the validity of a value.
The sources in FreeBSD are now in sync with the upstream which allows to
bring in IPv6 modifications.
r335871 added support for an optional suffix of "#mds_path" that can be
applied to each entry in the "-p" option argument. This specifies that
the DS should be used to store files for the file system on the MDS
at "mds_path".
This patch documents this optional suffix.
This is a content change.
Without this patch, the pNFS server distributes the data storage files across
all of the specified DSs.
A tester noted that it would be nice if a system administrator could control
which DSs are used to store the file data for a given exported MDS file system.
This patch adds an optional suffix for each entry in the "-p" option argument
that specifies "store file data for this MDS file system" in this DS.
The patch should only affect sites using the pNFS server (specified via the
"-p" command line option for nfsd.
The interface between the nfsd and the kernel has changed with this patch,
so anyone using the "-p" option needs to rebuild their nfsd from sources
with this patch applied to them.
Discussed with: james.rose@framestore.com
The refactoring of the syslogd code to format messages using iovecs
slightly altered the output of syslogd by placing the facility/priority
after the hostname, as opposed to printing it right before. This change
reverts the behaviour to be consistent with how it was before.
PR: 229457
Reported by: Andre Albsmeier
MFC after: 1 week
When pnfsdscopymr(8) is used to create a mirror of a file on a mirrored
pNFS service, it expects to find an entry in the extended attribute for
IP address 0.0.0.0.
This patch adds a "-m" option which can be used to create these entrie(s).
It also tightens up the checks for use of incompatible command line options.
The kernel code assumes that nfsdargs.addr == NULL and nfsdargs.addrlen == 0
when there is no "-p" argument used for starting the nfsd.
This small patch ensures this is the case. In practice, I believe this always
happened, since "nfsdargs" was the last element on the stack for "main()",
but this little patch ensures it will be the case.
Spotted by inspection while adding a new optional field for "-p".
By using INSTALL_LINK instead of calling ln during install the files
end up in the METALOG file as well if we use -DNO_ROOT and will be
included in a disk image when using makefs with METALOG as the input.
The other file that was not included in METALOG was /var/db/services.db
which is now also included for -DNO_ROOT.
Approved By: brooks (mentor)
Differential Revision: https://reviews.freebsd.org/D15665
As previously noted, kernel's processing of these means that the first
appearance of a hint/variable wins. Flipping the order of concatenation
means that later variables override earlier variables, as expected when one
does:
hints x
hints y
Where perhaps x is:
hint.aw_sid.0.disable=1
and y is:
hint.aw_sid.0.disable=0
The expectation would be that a later appearing variable would override an
earlier appearing variable, such as with `device`/`nodevice`, device.hints,
and other similarly structured data files.
Previously, only one 'env' file could be specified. Later 'env' directives
would overwrite earlier 'env' directives. This is inconsistent with every
other file-accepting directives which process files in order, including
hints.
A caveat applies to both hints and env that isn't mentioned: they're
concatenated in the order of appearance, so they're not actually applied in
the way one might think by supplying:
hints x
hints y
Hints in x will take precedence over same-name hints in y due to how
the kernel processes them, stopping at the first line that matches the hint
we're searching for. Future work will flip the order of concatenation so
that later files may still properly override earlier files.
In practice, this likely doesn't matter at all due to the nature of the
beast.
envvar allows adding individual environment variables to the kernel's static
environment without the overhead of pulling in a full file. envvar in a
config looks like:
envvar some_var=5
All envvar-provided variables will be added after the env file is processed,
so envvar keys that exist in the previous env will be overwritten by
whatever value is set here in the kernel configuration directly.
As an aside, envvar lines are intentionally tokenized differently from
basically every other line. We used a named state when ENVVAR is encountered
to gobble up the rest of the line, which will later be cleaned and validated
in post-processing by sanitize_envline. This turns out to be the simplest
and cleanest way to allow the flexibility that kenv does while not
compromising on silly hacks.
Reviewed by: ian (also contributor of sanitize_envline rewrite)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D15962
The changes made in r326573 required that messages always start with an
RFC 3164 timestamp. It looks like certain devices, but also certain
logging libraries (Python 3's "logging" package) simply don't generate
RFC 3164 formatted messages containing a timestamp.
Make timestamps optional again. When the timestamp is missing, also
assume that the message contains no hostname. The first word of the
message likely already belongs to the message payload.
PR: 229236
Reported by: Michael Grimm & Marek Zarychta
Reviewed by: glebius (cursory)
MFC after: 1 week
userland, conceptually similar to what i2c(8) provides for i2c devices.
Submitted by: Bob Frazier
Differential Revision: https://reviews.freebsd.org/D15029
try to build them if MK_OPENSSL is unset.
Reviewed by: emaste imp kevans
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15211
To conform to RFC 5426, this function is intended to truncate messages
if they exceed the message size limits. Unfortunately, the amount of
space was computed the wrong way around, causing messages to be
truncated entirely.
Reported by: Michael Grimm on stable@
MFC after: 3 days
The usermgmt API was stomping on a global ($user_gid to be specific)
so things would appear to work fine until you tried to make a second
pass into the API with the now-tainted variable contents.
Fixed by localizing menu-specific contents as to not leak outside API.
PR: bin/208774
Reported by: Martin Waschbuesch <martin@waschbuesch.de>
MFC after: 1 week
X-MFC-to: stable/11, stable/10
Sponsored by: Smule, Inc.
PR: bin/203435
Reported by: Andreas Sommer <andreas.sommer87@googlemail.com>
MFC after: 6 days
X-MFC-to: stable/11
X-MFC-with: r335280
Sponsored by: Smule, Inc.
Fix exit status when f_sysrc_set() fails. Errors in the underlying API
provided by bsdconfig(8) -- /usr/share/bsdconfig/sysrc.subr -- were not
being communicated back to the command-line. This was affecting ansible
modules using sysrc as they were not able to accurately test for error.
PR: bin/211448
Reported by: Christian Schwarz <me@cschwarz.com>
MFC after: 3 days
X-MFC-to: stable/11
Sponsored by: Smule, Inc.
This command can be used by a sysadmin to either copy or migrate a data
file on one DS to another DS.
Its main use is to recover data files onto a mirrored DS after the DS has
been repaired and brought back online.
This command allows a sysadmin to display or modify the pnfsd.dsfile extended
attribute used by the pNFS MDS server in various ways.
Its main use is to set a DS's IP address to 0.0.0.0 when that DS has failed,
so that it will not be used for the file when brought back online after
being repaired.
kgdb now handles kernel module state internally, so the asf tool serves
no purpose.
PR: 229046
Reviewed by: brooks
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D15827
This command can be used by a sysadmin to disable a malfunctioning pNFS server
mirrored DS. It is safe to use when a mirrored DS has already been disabled
via an I/O or network partitioning error.
The "-p" option specifies that the nfsd should run a pNFS service instead
of a regular NFS service. The "-m" option is only meaningful when used with
"-p" to specify that mirroring on the DSs should be done and on how many of
them.
This change requires the kernel changes committed as r334930.
The man page update will be committed as a separate commit soon.
allocation, I could identify that actually we use this pointer on pci_emul.c as
well as on vga.c source file.
I have reworked the logic here to make it more readable and also add a warn to
explicit show the function where the memory allocation error could happen,
also sort headers.
Also CID 1194192 was marked as "Intentional".
Obtained from: TrueOS
MFC after: 4 weeks.
Sponsored by: iXsystems Inc.
- fix debugging for family on AMD cpus and add useful debugging for
which file is being selected for update.
Reviewed by: cem
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15574
Fix the build of lib/libpmc and usr.sbin/pmc for gcc on amd64.
Reviewed by: mmacy
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D15723
We no longer need to od these things conditionally, and the fallbacks
are to 4.2BSD era defaults, which nobody uses anymore. Vixie cron has
diverged from upstream anyway in our tree, and it's not clear there's
actually a viable upstream anymore. Plus, we don't follow the
vendor-supplied code pattern here.
I'm doing this to reduce false positives from grep.
given interval, which is counted in seconds since exit of the previous
invocation of the job. Example user crontab entry:
@25 sleep 10
The example will launch 'sleep 10' every 35 seconds. This is a rather
useless example above, but clearly explains the functionality.
The practical goal here is to avoid overlap of previous job invocation
to a new one, or to avoid too short interval(s) for jobs that last long
and doesn't have any point of immediate launch soon after previous run.
Another useful effect of interval jobs can be noticed when a cluster of
machines periodically communicates with a single node. Running the task
time based creates too much load on the node. Running interval based
spreads invocations across machines in cluster. Note that -j/-J won't
help in this case.
Sponsored by: Netflix
- add '-j' options to filter to enable converting native pmc
log format to json lines format to enable the use of scripts
and external tooling
% pmc filter -j pmc.log pmc.jsonl
- Record the tsc value in sampling interrupts as opposed to
recording nanotime when the sample is copied to a global log
in hardclock - potentially many milliseconds later.
- At initialize record the tsc_freq and the time of day to give
us an offset for translating the tsc values in callchain records
Some mpc85xx devices with u-boot need MBR partitioning with a FAT boot
partition. Since the infrastructure is already in place to have a dedicated
boot partition, this adds the necessary bits to use that infrastructure with
mpc85xx boards.
Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15664
By logging all threads and processes 'pmc filter'
can now filter on process or thread name, relieving
the user of the burden of determining which tid or
pid was which when the sample was taken.
% pmc filter -T if_io_tqg -P nginx pmc.log pmc-iflib.log
% pmc filter -x -T idle pmc.log pmc-noidle.log
Summary:
The kernel reads 'kernelname' to set the kern.bootfile sysctl. By setting this,
'make installkernel' will backup the running kernel as appropriate.
Reviewed by: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15660
This adds the -U options to pmcstat which will attribute in-kernel samples
back to the user stack that invoked the system call. It is not the default,
because when looking at kernel profiles it is generally more desirable to
merge all instances of a given system call together.
Although heavily revised, this change is directly derived from D7350 by
Jonathan T. Looney.
Obtained from: jtl
Sponsored by: Juniper Networks, Limelight Networks
- Restore local change to include <net/bpf.h> inside pcap.h.
This fixes ports build problems.
- Update local copy of dlt.h with new DLT types.
- Revert no longer needed <net/bpf.h> includes which were added
as part of r334277.
Suggested by: antoine@, delphij@, np@
MFC after: 3 weeks
Sponsored by: Mellanox Technologies
This will manage pmc functionality with a more
manageable structure of subcommands rather than the
gradually accreted spaghetti logic of overlapping flags
that exists in pmcstat.
This is intended to ultimately have all the same functionality
as pmcannotate+pmccontrol+pmcstat. Currently it just has
"stat" and "system-stat" - counters for the process itself and counters
for the system as a whole respectively (i.e. system-stat includes kernel
threads). Note that the rusage results (page faults/context switches/
user/sys) for stat-system will not account for the system as a whole -
only for the child process specified on the command line.
Implementing stat was suggested by mjg@ and the output is based on that
from Linux's "perf stat".
% pmc stat -- make -j32 buildkernel -DNO_MODULES -ss > /dev/null
9598393 page faults # 0.674 M/sec
387085 voluntary csw # 0.027 M/sec
106989 involuntary csw # 0.008 M/sec
2763965982317 cycles
2542953049760 instructions # 0.920 inst/cycle
511562750157 branches
12917006881 branch-misses # 2.525%
17944429878 cache-references # 0.007 refs/inst
2205119560 cache-misses # 12.289%
43.74 real # 2019.72% cpu
795.09 user # 1817.72% cpu
88.35 sys # 202.00% cpu
% make -j32 buildkernel -DNO_MODULES -ss > /dev/null &
% sudo pmc stat-system cat
^C 103 page faults # 0.811 M/sec
4 voluntary csw # 0.031 M/sec
0 involuntary csw # 0.000 M/sec
2843639070514 cycles
2606171217438 instructions # 0.916 inst/cycle
522450422783 branches
13092862839 branch-misses # 2.506%
18592101113 cache-references # 0.007 refs/inst
2562878667 cache-misses # 13.785%
44.85 real # 0.00% cpu
0.00 user # 0.00% cpu
0.00 sys # 0.00% cpu
Also stdarg(3) says that each invocation of va_start() must be paired
with a corresponding invocation of va_end() in the same function. [1]
Reported by: Coverity
CID: 1194318[0] and 1194332[1]
Discussed with: jhb
MFC after: 4 weeks.
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D15548
strncpy did not guarantee NUL termination of the "stack" string.
Use strlcpy instead. While I'm here, avoid unnecessary memset
and strnlen calls.
Reported by: Coverity
CID: 1381035
Sponsored by: Dell EMC
vendor provided pmu-events tables and sundry cleanups.
The vendor pmu-events tables provide counter descriptions, default
sample rates, event, umask, and flag values for all the counter
configuration permutations. Using this gives us:
- much simpler kernel code for the MD component
- helpful long and short event descriptions
- simpler user code
- sample rates that won't overload the system
Update man page with newer sample types and remove unused sample type.
vendor provided pmu-events tables and sundry cleanups.
The vendor pmu-events tables provide counter descriptions, default
sample rates, event, umask, and flag values for all the counter
configuration permutations. Using this gives us:
- much simpler kernel code for the MD component
- helpful long and short event descriptions
- simpler user code
- sample rates that won't overload the system
Update man page with newer sample types and remove unused sample type.
Squashed commit of the following:
commit 4459d43eff815bec08ccc5533dbe5de846f03128
Author: Matt Macy <mmacy@mattmacy.io>
Date: Sat May 26 00:06:31 2018 -0700
libpmc: fix pmu function signatures for non amd64
commit a2cb8bbc586c65d41f9b291430a2261ec67b59fe
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 22:38:11 2018 -0700
pmcstat: fix indentation of usage
commit f686954b15ff56a833ac80404898977cb80a265b
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 22:19:49 2018 -0700
pmclog(3): add callchain and pmcallocatedyn, remove pcsample
commit 73e13a0d2e9498c81c150d14d022050cee7511bb
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 22:19:00 2018 -0700
pmclog.h: GC pcsample field
commit 3e93ffd65da641fa657539dad3c48e281f8b5798
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 22:05:57 2018 -0700
hwpmc: make Intel core CPUs use external event tables
commit 634f5fae1e1644ac324003136c66cd9c619d1c93
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 22:00:06 2018 -0700
pmclog: update log record types, bump PMC_MAJOR
- explicitly make log record types a multiple of 8 bytes
- hook in pmu event types for pmc_allocate records
- remove references to no longer PCSAMPLE record
commit 83d84fcd2d65bdf6ddcb2e155a22f0cfa2a9c225
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 21:52:10 2018 -0700
libpmc: add support for having vendor table driven pmc_allocate
commit 9e6ad63c40c2fce8404847ace5078ca6cb33a736
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 19:11:33 2018 -0700
hwpmc_core: add accessors for EVSEL & UMASK, make IAP_UMASK useful to user
commit 859dceb93daa6419a48c794db99b6758e5b041c9
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 19:09:45 2018 -0700
pmcstat: update usage and man page as well as make -L consistent with pmccontrol
commit 79c7d8597e28c2eb13f5f9113e65ec2792ca57b1
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 18:07:03 2018 -0700
pmu_util: add support for all current intel event keywords
commit d8089c7f6a6c8527f38324252b1ffb47004694c6
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 17:45:00 2018 -0700
add description for new arguments
commit 058336740bab53c62ec88a3a026ea848cf3878c6
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 17:38:15 2018 -0700
libpmc: move pmu_events table and pmu_utils out of libpmcstat so that they can be used by pmc_allocate
commit 049b66b382e2f833c3f47bc8df9e750cb265709f
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 16:12:41 2018 -0700
pmcstat: hook pmu_events counter description utility routines in
commit f5e01e7b37a691dc045e1aa16b3ebdd162515de8
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 16:11:59 2018 -0700
pmu_events: add utility routines for listing counters and their descriptions
commit cba4d4f8907f772279f86f18f915e0d74d33ac56
Author: Matt Macy <mmacy@mattmacy.io>
Date: Fri May 25 16:09:50 2018 -0700
pmu-events: expand out skylake regex to simplify string matches
strdup(3) allocates memory for a copy of the string, does the copy and
returns a pointer to it. If there is no sufficient memory NULL is returned
and the global errno is set to ENOMEM.
We do a sanity check to see if it was possible to allocate enough memory.
Also as we allocate memory, we need to free this memory used. Or it will
going out of scope leaks the storage it points to.
Reviewed by: rgrimes
MFC after: 3 weeks.
X-MFC: r332298
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D15550
on thread in post-processing.
To generate stacks for just ${THREADID}:
pmcstat -R ${PREFIX}.pmcstat -L ${THREADID} -z100 -G ${PREFIX}.stacks
Sponsored by: Limelight Networks
will be returned to indicate the error, so I'm applying an assert(3) to do
a sanity check of the return value.
Reported by: Coverity CID: 1391235, 1193654 and 1193651
Reviewed by: grehan
MFC after: 4 weeks.
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D15533
Convert pmcannotate to using $TMPDIR and _PATH_TMP rather than hard
coding /tmp for temporary files. Pmcannotate sometimes needs quite a
lot of space to store the output from objdump, and will fail in odd
ways if that output is truncated due to lack of space in /tmp.
Reviewed by: jtl
Sponsored by: Netflix
a diagnostic message. So we do a sanity checking on the return value
of vq_getchain().
Spotted by: gcc49
Reviewed by: avg
MFC after: 4 weeks
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D15388
According to the Intel SDM (Volme 3, 9.11.7) the BIOS signature MSR
should be zeroed before executing cpuid (although in practice it does
not seem to matter).
PR: 192487
Submitted by: Dan Lukes
Reported by: Henrique de Moraes Holschuh
MFC after: 3 days
The mld6query command relies on KAME behaviour which allows the
ipv6mr_multiaddr member of the request object in a IPV6_JOIN_GROUP
setsockopt() call to be INADDR6_ANY. The FreeBSD stack doesn't allow
this, so mld6query has been non-functional.
Also, add a -g option which sends a General Query (query INADDR6_ANY)
Reviewed by: sbruno, mmacy
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15384
While <sys/sysctl.h> includes <sys/queue.h> unconditionally, it is only
actually used in code which is conditional on _KERNEL. Make the #include
itself conditional as well, and fix userland code that uses <sys/queue.h>
for other purposes but relied on <sys/sysctl.h> to bring it in.
MFC after: 1 week
ioctl frontend ports.
This revision introduces two changes to CTL:
- Changes the way options are passed to CTL_LUN_REQ and CTL_PORT_REQ ioctls.
Removes ctl_be_arg structure and associated logic and replaces it with
nv(3)-based logic for passing in and out arguments.
- Allows creating multiple ioctl frontend ports using either ctladm(8) or
ctld(8).
New frontend ports are represented by /dev/cam/ctl<pp>.<vp> nodes, eg /dev/cam/ctl5.3.
Those device nodes respond only to CTL_IO ioctl.
New command-line options for ctladm:
# creates new ioctl frontend port with using free pp and vp=0
ctladm port -c
# creates new ioctl frontend port with pp=10 and vp=0
ctladm port -c -O pp=10
# creates new ioctl frontend port with pp=11 and vp=12
ctladm port -c -O pp=11 -O vp=12
# removes port with number 4 (it's a "targ_port" number, not pp number)
ctladm port -r -p 4
New syntax for ctl.conf:
target ... {
port ioctl/<pp>
...
}
target ... {
port ioctl/<pp>/<vp>
...
Note: Most of this work was made by jceel@, thank you.
Submitted by: jceel
Reworked by: myself
Reviewed by: mav (earlier versions and recently during the rework)
Obtained from: FreeNAS and TrueOS
Relnotes: Yes
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D9299
The --device and --part command line options were planned for Linux
compatibility mode. However, that mode will never happen, so remove
them as last vestiges of a false start.
Submitted by: Vlad Movchan
Print the boot variables in the order in the BootOrder variable, if it
exists, and then in verbose mode print any unreferneced BootXXXX
variables. If BootOrder isn't set, fall back to printing all the
variables.
Sponsored by: Netflix
by doing most of the work in a new function prison_add_vfs in kern_jail.c
Now a jail-enabled filesystem need only mark itself with VFCF_JAIL, and
the rest is taken care of. This includes adding a jail parameter like
allow.mount.foofs, and a sysctl like security.jail.mount_foofs_allowed.
Both of these used to be a static list of known filesystems, with
predefined permission bits.
Reviewed by: kib
Differential Revision: D14681
The prior code only allowed multiples of 32 for the
numbers of columns. Remove this restriction to allow
a forthcoming UEFI firmware update to allow arbitrary
x,y resolutions.
(the code for handling rows already supported non mult-32 values)
Reviewed by: Leon Dang (original author)
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D15274
This driver was for an early and uncommon legacy PCI 10GbE for a single
ASIC, Intel 82597EX. Intel quickly shifted to the long lived ixgbe family.
Submitted by: kbowling
Reviewed by: brooks imp jeffrey.e.pieper@intel.com
Relnotes: yes
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15234
This driver supports legacy, 32-bit PCI devices, and had an ambiguous
license. Supported devices were already reported to be rare in 2003
(when an earlier version of the driver was removed in r123201).
Reviewed by: rgrimes
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D15245
This commit adds a new debug server to bhyve. Unlike the existing -g
option which provides an efficient connection to a debug server
running in the guest OS, this debug server permits inspection and
control of the guest from within the hypervisor itself without
requiring any cooperation from the guest. It is similar to the debug
server provided by qemu.
To avoid conflicting with the existing -g option, a new -G option has
been added that accepts a TCP port. An IPv4 socket is bound to this
port and listens for connections from debuggers. In addition, if the
port begins with the character 'w', the hypervisor will pause the
guest at the first instruction until a debugger attaches and
explicitly continues the guest. Note that only a single debugger can
attach to a guest at a time.
Virtual CPUs are exposed to the remote debugger as threads. General
purpose register values can be read for each virtual CPU. Other
registers cannot currently be read, and no register values can be
changed by the debugger.
The remote debugger can read guest memory but not write to guest
memory. To facilitate source-level debugging of the guest, memory
addresses from the debugger are treated as virtual addresses (rather
than physical addresses) and are resolved to a physical address using
the active virtual address translation of the current virtual CPU.
Memory reads should honor memory mapped I/O regions, though the debug
server does not attempt to honor any alignment or size constraints
when accessing MMIO.
The debug server provides limited support for controlling the guest.
The guest is suspended when a debugger is attached and resumes when a
debugger detaches. A debugger can suspend a guest by sending a Ctrl-C
request (e.g. via Ctrl-C in GDB). A debugger can also continue a
suspended guest while remaining attached. Breakpoints are not yet
supported. Single stepping is supported on Intel CPUs that support
MTRAP VM exits, but is not available on other systems.
While the current debug server has limited functionality, it should
at least be usable for basic debugging now. It is also a useful
checkpoint to serve as a base for adding additional features.
Reviewed by: grehan
Differential Revision: https://reviews.freebsd.org/D15022
pwd_mkdb has emitted v4 password database records since 2003 (r113596)
in addition to v3, and as of r283981 by default it emitted only v4.
As described in r283981, retire the -l legacy option.
The -B and -L options were originally added to set the endianness of v3
records emitted by pwd_mkdb, but they also set the db hash endiannes and
so have been retained temporarily.
Announced on the FreeBSD-Current and FreeBSD-Stable lists. In stable/11
the man page contains a deprecation notice, and pwd_mkdb will emit a
deprecation notice if the -l option is specified.
Reviewed by: delphij, lidl, rgrimes
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D15144
User-visible changes:
"-u" is added to to list of command line options supported by bthidd.
Use it to enable evdev support. uinput and evdev modules should be
kld-loaded or compiled into the kernel in that case.
bthidd_evdev_support rc.conf variable is added to control enabling of
evdev support in bthidd startup script. Possible values are: "YES", "NO",
"AUTO"(default). Setting bthidd_evdev_support to "AUTO" inserts "-u" option
if kernel is compiled with EVDEV_SUPPORT option enabled.
Support for consumer HID usage page keyboard events is implemented. Most of
them are available only through evdev protocol.
kern.evdev.rcpt_mask sysctl is checked, so "sysctl kern.evdev.rcpt_mask=12"
should be executed if EVDEV_SUPPORT is compiled into kernel.
It is recommended to regenerate bthidd.conf entries with bthidcontrol(8)
"Query" command to set user-friendly names of bluetooth devices.
Reviewed by: emax, gonzo, wblock (docs), bcr (docs, early version)
Differential Revision: https://reviews.freebsd.org/D13456
Extend bthidd.conf format to store name of remote Bluetooth HID devices and
implement querying of this information with bthidcontrol(8) "Query" command.
Reviewed by: emax
Differential Revision: https://reviews.freebsd.org/D13456
become redundant after documenting all the subcommands, and switches
to the new syntax, without the '-d'.
Reviewed by: hselasky@
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
For cross-architecture reproducibility. The db(3) functions work with
hashes of either endianness, and the current (v4) version password db
entries already store integers in network order. Do so with the hash as
well so that identical password databases can be created on big- and
little-endian hosts.
The -B and -L flags exist to set the endianness for legacy (v3) entries
when the -l flag is used, and they will still control hash endianness
(at least until the backwards compatibility infrastructure is removed).
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
- cd9660 relies on an #include "iso.h" but does not build any .c files
out of source, so remove reach-over .PATH
- ffs does not rely on any sys/ headers, so remove -I from CFLAGS.
- ffs_tables from sys/ is used by ffs; move the SRCS entry from the top-
level Makefile to ffs' Makefile.inc.
Sponsored by: The FreeBSD Foundation
r283981 switched pwd_mkdb to emit only v4 database entries by default,
and introduced a -l (legacy) option emit v3 entries in addition. The
commit message claims that legacy support will be removed in 12.0, so
emit a warning now if it is used.
Previously the code only warned about the condition and then happily
proceeded to use the too large value resulting in the array
out-of-bounds access.
Obtained from: Panzura (Chuanbo Zheng)
MFC after: 10 days
Sponsored by: Panzura
supervised program. The existing -r option has a hard-coded delay of one
second. This change adds a -R option which takes a delay in seconds. This
can be used to prevent log spam and rapid restarts, similar to init(8)'s
behavior of adding a delay between rapid restarts when it's supervising a
program.
- Move all of the code responsible for transmitting log messages into a
separate function, fprintlog_write().
- Instead of manually modifying a list of iovecs, add a structure
iovlist with some helper functions.
- Alter the F_FORW (UDP message forwarding) case to also use iovecs like
the other cases. Use sendmsg() instead of sendto().
- In the case of F_FORW, truncate the message to a size dependent on the
address family (AF_INET, AF_INET6), as proposed by RFC 5426.
- Move all traditional message formatting into fprintlog_bsd(). Get rid
of some of the string copying and snprintf()'ing. Simply emit more
iovecs to get the job done.
- Increase ttymsg()'s limit of 7 iovecs to 32. Add a definition for this
limit, so it can be reused by iovlist.
- Add fprintlog_rfc5424() to emit RFC 5424 formatted log entries.
- Add a "-O" command line option to enable RFC 5424 formatting. It would
have been nicer if we supported "-o rfc5424", just like on NetBSD.
Unfortunately, the "-o" flag is already used for a different purpose
on FreeBSD.
- Don't truncate hostnames in the RFC 5424 case, as suggested by that
specific RFC.
For people interested in using this, this feature can be enabled by
adding the following line to /etc/rc.conf:
syslogd_flags="-s -O rfc5424"
Differential Revision: https://reviews.freebsd.org/D15011
COP allows fine-grained control on whether to offload a TCP connection
using t4_tom, and what settings to apply to a connection selected for
offload. t4_tom must still be loaded and IFCAP_TOE must still be
enabled for full TCP offload to take place on an interface. The
difference is that IFCAP_TOE used to be the only knob and would enable
TOE for all new connections on the inteface, but now the driver will
also consult the COP, if any, before offloading to the hardware TOE.
A policy is a plain text file with any number of rules, one per line.
Each rule has a "match" part consisting of a socket-type (L = listen,
A = active open, P = passive open, D = don't care) and a pcap-filter(7)
expression, and a "settings" part that specifies whether to offload the
connection or not and the parameters to use if so. The general format
of a rule is: [socket-type] expr => settings
Example. See cxgbetool(8) for more information.
[L] ip && port http => offload
[L] port 443 => !offload
[L] port ssh => offload
[P] src net 192.168/16 && dst port ssh => offload !nagle !timestamp cong newreno
[P] dst port ssh => offload !nagle ecn cong tahoe
[P] dst port http => offload
[A] dst port 443 => offload tls
[A] dst net 192.168/16 => offload !timestamp cong highspeed
The driver processes the rules for each new listen, active open, or
passive open and stops at the first match. There is an implicit rule at
the end of every policy that prohibits offload when no rule in the
policy matches:
[D] all => !offload
This is a reworked and expanded version of a patch submitted by
Krishnamraju Eraparaju @ Chelsio.
Sponsored by: Chelsio Communications
By popular demand, pkg now walks thought the arguments passed and
if it finds -y or --yes it does accept those as equivalent of
ASSUME_ALWAYS_YES env var.
Requested by: many
MFC after: 1 week
Directory mtime will only change if a file is added or removed, not
modified. For /var/cron/tabs, this is fine because of how crontab(1) manages
it using temp files so all crontab(1) changes will trigger a reload of the
database.
For /etc/cron.d and /usr/local/etc/cron.d, this is not necessarily the case.
Instead of checking their mtime, we should descend into them and check mtime
on all jobs also.
Reported by: des
Reviewed by: bapt
MFC after: 1 week
from userland without the need to use sysctls, it allows the old
sysctls to continue to function, but deprecates them at
FreeBSD_version 1200060 (Relnotes for deprecate).
The command line of bhyve is maintained in a backwards compatible way.
The API of libvmmapi is maintained in a backwards compatible way.
The sysctl's are maintained in a backwards compatible way.
Added command option looks like:
bhyve -c [[cpus=]n][,sockets=n][,cores=n][,threads=n][,maxcpus=n]
The optional parts can be specified in any order, but only a single
integer invokes the backwards compatible parse. [,maxcpus=n] is
hidden by #ifdef until kernel support is added, though the api
is put in place.
bhyvectl --get-cpu-topology option added.
Reviewed by: grehan (maintainer, earlier version),
Reviewed by: bcr (manpages)
Approved by: bde (mentor), phk (mentor)
Tested by: Oleg Ginzburg <olevole@olevole.ru> (cbsd)
MFC after: 1 week
Relnotes: Y
Differential Revision: https://reviews.freebsd.org/D9930
Now that all of parsemsg() parses both RFC 3164 and 5424 messages and
hands them to logmsg(), alter the latter to properly forward all RFC
5424 message attributes to fprintlog(). While there, make some minor
cleanups to this code:
- Instead of extending the existing code that compares hostnames and
message bodies for deduplication, print all of the relevant message
fields into a single string that we can compare ('saved').
- No longer let the behaviour of fprintflog() depend on whether
'msg == NULL' to print repetition messages, Simply decompose this
function into fprintlog_first() and fprintlog_successive(). This
makes the interpretation of function arguments less magical and also
allows us to get consistent behaviour across RFC 3164 and 5424 when
adding support for the RFC 5424 output format.
- As RFC 5424 syslog messages have a dedicated application name field,
alter the repetition messages to be printed on behalf of syslogd on
the current system. Change these messages to use the local hostname,
so that it's obvious which syslogd instance detected the repetition.
Remove f_prevhost, as it has now become unnecessary.
- Remove a useless strdup(). Deconsting the message string is safe in
this specific case.