In particular, this avoids malloc(9) calls from early tunable handling,
when there is no working malloc yet.
Reported and tested by: mav
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Fix a few 'if(' to be 'if (' in a few places, per style(9) and
overwhelming usage in the rest of the kernel / tree.
MFC after: 3 days
Sponsored by: Netflix
Refactor sysctl_sysctl_next_ls():
* Move huge inner loop out of sysctl_sysctl_next_ls() into a separate
non-recursive function, returning the next step to be taken.
* Update resulting node oid parts only on successful lookup
* Make sysctl_sysctl_next_ls() return boolean success/failure instead of errno,
slightly simplifying logic
Reviewed by: freqlabs
Differential Revision: https://reviews.freebsd.org/D27029
Ensure we also skip descendants of SKIP nodes when iterating through children
of an explicitly specified node.
Reported by: np
Reviewed by: np
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D26833
Remove unused oidpp parameter from sysctl_sysctl_next_ls and
add high level comments to describe how it works.
No functional change.
Reviewed by: imp
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D26854
Add a "nextnoskip" sysctl that allows for listing of sysctls intended to be
normally skipped for cost reasons.
This makes it so the names/descriptions of those sysctls can be discovered with
sysctl -aN/sysctl -ad/sysctl -at.
It also makes it so children are visited when a node flagged with CTLFLAG_SKIP
is explicitly requested.
The intended use case is to mark the root "kstat" node with CTLFLAG_SKIP so that
the extensive and expensive stats are skipped by default but may still be easily
obtained without having to know them all (which may not even be possible) and
request each one-by-one.
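
The skip behaviour described above can be sketched in userspace C. This is
only an illustration, not the kernel code: struct node, NODE_SKIP,
count_visible() and count_under() are hypothetical stand-ins for the sysctl
oid tree, CTLFLAG_SKIP, and the default and explicit-node walks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NODE_SKIP 0x1 /* hypothetical stand-in for CTLFLAG_SKIP */

struct node {
	const char *name;
	int flags;
	struct node *child;   /* first child, or NULL */
	struct node *sibling; /* next sibling, or NULL */
};

/*
 * Count the nodes a listing would visit in the sibling list headed by n.
 * When honor_skip is true, a node flagged NODE_SKIP is omitted together
 * with all of its descendants.
 */
static size_t
count_visible(const struct node *n, bool honor_skip)
{
	size_t count = 0;

	for (; n != NULL; n = n->sibling) {
		if (honor_skip && (n->flags & NODE_SKIP) != 0)
			continue;
		count += 1 + count_visible(n->child, honor_skip);
	}
	return (count);
}

/*
 * Listing under an explicitly requested node: its children are visited
 * even if the node itself is flagged, but flagged descendants are still
 * skipped when honor_skip is true.
 */
static size_t
count_under(const struct node *requested, bool honor_skip)
{
	assert(requested != NULL);
	return (count_visible(requested->child, honor_skip));
}
```

Passing honor_skip as false corresponds to the "nextnoskip" style of walk,
where flagged subtrees are listed like any other.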
Reviewed by: jhb
MFC after: 2 weeks
Relnotes: yes
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D26560
Add CTLFLAG_NEEDGIANT flag (modelled after D_NEEDGIANT) that will be used to
mark sysctls that still require locking Giant.
Rewrite sysctl_handle_string() to use internal locking instead of locking
Giant.
Mark SYSCTL_STRING, SYSCTL_OPAQUE and their variants as MPSAFE.
Add infrastructure support for enforcing proper use of CTLFLAG_NEEDGIANT
and CTLFLAG_MPSAFE flags with SYSCTL_PROC and SYSCTL_NODE, not enabled yet.
Reviewed by: kib (mentor)
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D23378
r136999 introduced SYSCTL_DEBUG but apparently "opt_sysctl.h" was never
included, so the option was ignored.
r322954 introduced sysctl.reuse_test with OID number equal to 0, effectively
shadowing the very special sysctl.debug one. Use OID_AUTO as it doesn't need
any special treatment.
Reviewed by: kib (mentor)
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D23056
Several sysctls under the "sysctl" node output to a user buffer while holding a
non-sleepable lock that protects the sysctl topology. They need to wire
the output buffer, or else they may try to sleep on a page fault.
Reviewed by: cem, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22528
Previously userspace would issue one syscall to resolve the sysctl and then
another one to actually use it. Do it all in one trip.
Fallback is provided in case newer libc happens to be running on an older
kernel.
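
The try-then-fall-back pattern can be sketched as follows. This is a
userspace simulation under stated assumptions: struct backend and all of the
function names are hypothetical, the "kernel" calls are stubs, and ENOSYS is
assumed to be how an older kernel reports the missing combined call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

static int trips; /* counts simulated kernel round trips */

/* Hypothetical entry points; a real libc would issue system calls. */
struct backend {
	int (*by_name)(const char *name, int *value);
	int (*name_to_id)(const char *name, int *id);
	int (*by_id)(int id, int *value);
};

/* Simulated newer kernel: resolves and reads in one trip. */
static int
new_by_name(const char *name, int *value)
{
	(void)name;
	trips++;
	*value = 42;
	return (0);
}

/* Simulated older kernel: the combined call does not exist. */
static int
old_by_name(const char *name, int *value)
{
	(void)name;
	(void)value;
	trips++;
	errno = ENOSYS;
	return (-1);
}

/* First half of the two-trip path: resolve the name. */
static int
stub_name_to_id(const char *name, int *id)
{
	(void)name;
	trips++;
	*id = 7;
	return (0);
}

/* Second half of the two-trip path: read by the resolved id. */
static int
stub_by_id(int id, int *value)
{
	trips++;
	*value = (id == 7) ? 42 : 0;
	return (0);
}

/* Try the single-trip call first; fall back only if it is unsupported. */
static int
read_by_name(const struct backend *be, const char *name, int *value)
{
	int id;

	assert(be != NULL && name != NULL && value != NULL);
	if (be->by_name(name, value) == 0)
		return (0);
	if (errno != ENOSYS)
		return (-1);
	if (be->name_to_id(name, &id) != 0)
		return (-1);
	return (be->by_id(id, value));
}
```

In this sketch the fallback costs one extra failed trip on an older kernel,
while the common case on a newer kernel drops from two trips to one.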
Submitted by: Pawel Biernacki
Reported by: kib, brooks
Differential Revision: https://reviews.freebsd.org/D17282
KDB is standard and the kdb_active variable is always available. So,
de-conditionalize inclusion of sys/kdb.h in kern_sysctl.c.
Reported by: Michael Butler <imb AT protected-networks.net>
X-MFC-With: r350713
Sponsored by: Dell EMC Isilon
Implement `sysctl` in `ddb` by overriding `SYSCTL_OUT`. When handling the
req, we install custom ddb in/out handlers. The out handler prints straight
to the debugger, while the in handler ignores all input. This is intended
to allow us to print just about any sysctl.
There is a known issue when used from ddb(4) entered via 'sysctl
debug.kdb.enter=1'. The DDB mode does not quite prevent all lock
interactions, and it is possible for the recursive Giant lock to be unlocked
when the ddb(4) 'sysctl' command is used. This may result in a panic on
return from ddb(4) via 'c' (continue). Obviously, this is not a problem
when debugging already-panicked systems.
Submitted by: Travis Lane (formerly: <travis.lane AT isilon.com>)
Reviewed by: vangyzen (earlier version), Don Morris <dgmorris AT earthlink.net>
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20219
New sysctl/tunables can now set the interval (in seconds) between
rate-limited crypto warnings. The new sysctls are:
- kern.cryptodev_warn_interval for /dev/crypto
- net.inet.ipsec.crypto_warn_interval for IPsec
- kern.kgssapi_warn_interval for KGSSAPI
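
A minimal sketch of interval-based rate limiting, assuming the interval is a
whole number of seconds as described above. warn_allowed() is a hypothetical
helper, not the kernel's implementation, and treating a zero timestamp as
"never warned" is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/*
 * Return true if a warning may be emitted at time now, given that the
 * previous one was emitted at *last (0 means never).  interval is in
 * seconds.  Updates *last when it returns true.
 */
static bool
warn_allowed(time_t *last, time_t now, int interval)
{
	assert(last != NULL && interval >= 0);
	if (*last != 0 && now - *last < interval)
		return (false);
	*last = now;
	return (true);
}
```

Each of the three warnings would keep its own timestamp and interval, so
tuning one does not affect the others.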
Reviewed by: cem
MFC after: 1 month
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D20555
device_printf does multiple calls to printf allowing other console messages to
be inserted between the device name, and the rest of the message. This change
uses sbuf to compose the two into a single buffer, and prints it all at once.
It exposes an sbuf drain function (drain-to-printf) for common use.
Update documentation to match; some unit tests included.
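
The idea of composing first and printing once can be sketched in userspace C.
This uses snprintf/vsnprintf on a fixed buffer rather than sbuf(9), and
compose_device_msg() and device_print() are hypothetical names for this
example only.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Compose "<name><unit>: <message>" into buf in one piece.  Returns the
 * number of bytes stored (excluding the NUL), truncating if buf is too
 * small.
 */
static size_t
compose_device_msg(char *buf, size_t len, const char *name, int unit,
    const char *fmt, ...)
{
	va_list ap;
	int n, m;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	n = snprintf(buf, len, "%s%d: ", name, unit);
	if (n < 0)
		return (0);
	if ((size_t)n >= len)
		return (len - 1);
	va_start(ap, fmt);
	m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
	va_end(ap);
	if (m < 0)
		return ((size_t)n);
	if ((size_t)m >= len - (size_t)n)
		return (len - 1);
	return ((size_t)n + (size_t)m);
}

/* Emit the composed message with a single write to the stream. */
static void
device_print(FILE *fp, const char *msg)
{
	fwrite(msg, 1, strlen(msg), fp);
}
```

Because the prefix and the message reach the output in one call, another
writer cannot interleave text between them.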
Submitted by: jmg
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D16690
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compatibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.
Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensures opt_compat.h
is created on all architectures.
Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.
Reviewed by: kib, cem, jhb, jtl
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14941
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
The previous limit of just one page is hit by ps.
The entire mechanism should be reworked, if not whacked. It seems the intent
is to reduce kernel dos-ability - some handlers wire the amount of memory
passed here. Handlers should probably stop wiring in the first place or in
the worst case indicate they are doing so so that the check is done only if
necessary. It should also probably be a counter, not a lock.
MFC after: 1 week
A sysctl can have a custom handler that may access data that is initialized
via SYSINIT(9) or via a module event handler (also invoked via SYSINIT).
Thus, it is not safe to allow access to the module's sysctl-s until
the initialization is performed. Likewise, we should not allow access
to the sysctl-s after the module is uninitialized.
The latter is easy to achieve by properly ordering linker_file_unregister_sysctls
and linker_file_sysuninit.
The former is not as easy for two reasons:
- the initialization may depend on tunables which get set when sysctl-s are
registered, so we need to set the tunables before running sysinit-s
- the initialization may try to dynamically add more sysctl-s under statically
defined sysctl nodes
So, this change splits the sysctl setup into two phases. In the first phase
the sysctl-s are registered as before but they are disabled and hidden from
consumers. In the second phase, done after sysinit-s, normal access to the
sysctl-s is enabled.
The change should affect only dynamic module loading and unloading after
the system boot-up. Nothing changes for sysctl-s compiled into the kernel
and sysctl-s in preloaded modules.
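
The two phases can be sketched with a small userspace table. OID_DORMANT,
struct oid, and the three functions are hypothetical names for illustration;
the kernel's data structures and locking are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OID_DORMANT 0x1 /* hypothetical: registered but not yet visible */
#define MAX_OIDS 8

struct oid {
	const char *name;
	int flags;
};

static struct oid table[MAX_OIDS];
static size_t noids;

/* Phase 1: register the oid, but keep it hidden from consumers. */
static struct oid *
oid_register_disabled(const char *name)
{
	struct oid *o;

	assert(noids < MAX_OIDS);
	o = &table[noids++];
	o->name = name;
	o->flags = OID_DORMANT;
	return (o);
}

/* Phase 2: called once the module's initializers have run. */
static void
oid_enable(struct oid *o)
{
	o->flags &= ~OID_DORMANT;
}

/* Consumer-facing lookup: dormant oids behave as if they do not exist. */
static struct oid *
oid_lookup(const char *name)
{
	for (size_t i = 0; i < noids; i++) {
		if ((table[i].flags & OID_DORMANT) == 0 &&
		    strcmp(table[i].name, name) == 0)
			return (&table[i]);
	}
	return (NULL);
}
```

Between the two phases the oid already exists, so tunables can be applied and
child oids can be attached, but a consumer's lookup cannot reach its handler.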
Discussed with: hselasky, ian, jhb
Reviewed by: julian, kib
MFC after: 2 weeks
Sponsored by: Panzura
Differential Revision: https://reviews.freebsd.org/D12545
Said checks were inherently racy anyway as jokers could unmap target areas
before the handler got around to accessing them.
This saves time by avoiding locking the address space.
MFC after: 1 week
Print the full conflicting oid path, and include the function name in the
warning so it is clear that the warnings are sysctl-related.
PR: 221853
Submitted by: Fabian Keil <fk AT fabiankeil.de> (earlier version)
Sponsored by: Dell EMC Isilon
This will provide a slightly better smoking gun than just stating
"can't remove non-dynamic nodes!" when calling sysctl_ctx_free(9)
and sysctl_remove_{name,oid}(9) with a non-dynamic (likely
static) sysctl.
MFC after: 1 week
Sponsored by: Dell EMC Isilon
I'm currently working on writing a metrics exporter for the Prometheus
monitoring system to provide access to sysctl metrics. Prometheus and
sysctl have some structural differences:
- sysctl is a tree of string component names.
- Prometheus uses a flat namespace for its metrics, but allows you to
attach labels with values to them, so that you can do aggregation.
An initial version of my exporter simply translated
hw.acpi.thermal.tz1.temperature
to
sysctl_hw_acpi_thermal_tz1_temperature_celcius
while we should ideally have
sysctl_hw_acpi_thermal_temperature_celcius{thermal_zone="tz1"}
allowing you to graph all thermal zones on a system in one go.
The change presented in this commit adds support for accomplishing this,
by providing the ability to attach labels to nodes. In the example I
gave above, the label "thermal_zone" would be attached to "tz1". As this
is a feature that will only be used very rarely, I decided to not change
the KPI too aggressively.
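
How an exporter might consume such labels can be sketched as follows. This is
a hypothetical userspace helper, not part of the exporter or of this change:
struct component and metric_name() are made up for the example, and the unit
suffix shown in the names above is left out for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct component {
	const char *name;  /* sysctl name component, e.g. "tz1" */
	const char *label; /* label attached to the node, or NULL */
};

/* Append src to the NUL-terminated string in dst, truncating at len. */
static void
append(char *dst, size_t len, const char *src)
{
	size_t used = strlen(dst);

	if (used < len - 1)
		snprintf(dst + used, len - used, "%s", src);
}

/*
 * Build a flat metric name.  Unlabelled components are joined with '_';
 * a labelled component is moved into a label="component" pair instead.
 */
static void
metric_name(char *out, size_t len, const struct component *c, size_t n)
{
	char labels[128] = "";

	assert(out != NULL && len > 0);
	out[0] = '\0';
	append(out, len, "sysctl");
	for (size_t i = 0; i < n; i++) {
		if (c[i].label == NULL) {
			append(out, len, "_");
			append(out, len, c[i].name);
			continue;
		}
		append(labels, sizeof(labels), labels[0] == '\0' ? "{" : ",");
		append(labels, sizeof(labels), c[i].label);
		append(labels, sizeof(labels), "=\"");
		append(labels, sizeof(labels), c[i].name);
		append(labels, sizeof(labels), "\"");
	}
	if (labels[0] != '\0') {
		append(out, len, labels);
		append(out, len, "}");
	}
}
```

With "thermal_zone" attached to the "tz1" component, every thermal zone maps
to the same metric name and differs only in the label value.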
Discussed on: hackers@
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D8775
Because the size of bool can be implementation defined, make a bool
sysctl handler which handles bools. Userspace sees the bools like
unsigned 8-bit integers. Values are filtered to either 1 or 0 upon
read and write, similar to what a compiler would do.
Requested by: kmacy@
Sponsored by: Mellanox Technologies
The fail point handler may sleep, but this is not permitted while holding a
rm read lock.
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
Use the right intmax_t type instead of intptr_t in a few remaining
places.
Add support for CTLFLAG_TUN for the new fixed-width types. Bruce will be
upset that the new handlers silently truncate tuned quad-sized inputs,
but so do all of the existing handlers.
Add the new types to debug_dump_node, for whatever use that is.
Bump FreeBSD_version again, for good measure. We are changing
SYSCTL_HANDLER_ARGS and a member of struct sysctl_oid to intmax_t.
Correct the sysctl typed NULL values for the fixed-width types. (Hat
tip: hps@.)
Suggested by: hps (partial)
Sponsored by: EMC / Isilon Storage Division
Add S8, S16, S32, and U32 types; add SYSCTL*() macros for them, as well
as for the existing 64-bit types. (While SYSCTL*QUAD and UQUAD macros
already exist, they do not take the same sort of 'val' parameter that
the other macros do.)
Clean up the documented "types" in the sysctl.9 document. (These are
macros and thus not real types, but the manual page documents intent.)
The sysctl_add_oid(9) arg2 has been bumped from intptr_t to intmax_t to
accommodate 64-bit types on 32-bit pointer architectures.
This is just the kernel support piece; the userspace sysctl(1) support
will follow in a later patch.
Submitted by: Ravi Pokala <rpokala@panasas.com>
Reviewed by: cem
Relnotes: no
Sponsored by: Panasas
Differential Revision: https://reviews.freebsd.org/D4091