This option is intended to be semantically identical to Linux's
SOL_SOCKET:SO_PASSCRED. For now, it is mutually exclusive with the
pre-existing sockopt SOL_LOCAL:LOCAL_CREDS.
Reviewed by: markj (penultimate version)
Differential Revision: https://reviews.freebsd.org/D27011
Our own Ports Collection is not targeting those systems at the moment,
so let's stop documenting bits specific to OpenBSD and NetBSD in the ports
documentation. Especially, that it might bit rot one day.
MFC after: 1 week
It is rather common for the ports users to replace su(1) with sudo(8)
within the SU_CMD variable. Let's document it in the manual page (so far
it's been hidden in a comment within bsd.commands.mk).
MFC after: 2 weeks
This patch also introduces an environment variable BE_UTILITY,
which can be used to specify the utility to use for managing
ZFS boot environments (which can be either bectl or beadm).
While here, fix some typos in the manual page and
remove beadm from section "SEE ALSO".
Reviewed by: bcr, kevans, rpokala
Approved by: will
Differential Revision: https://reviews.freebsd.org/D21111
If you need / want to includerd sys/systm.h, it has to be just after
param.h/types.h. Document this existing practice. Not all kernel files
include systm.h, but when you do, it should be done out of order.
Reviewed by: vangyzen, kib, emaste
Differential Review: https://reviews.freebsd.org/D26981
Foundation copyrights, approved by emaste@. It does not include
files which carry other people's copyrights; if you're one
of those people, feel free to make similar change.
Reviewed by: emaste, imp, gbe (manpages)
Differential Revision: https://reviews.freebsd.org/D26980
libjail is pretty small, so it makes for a good proof of concept demonstrating
how a system library can be wrapped to create a loadable Lua module for flua.
* Introduce 3lua section for man pages
* Add libjail module
Reviewed by: kevans, manpages
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D26080
The NTB hardware starting with Skylake has some changes to the register
map and the doorbell interface. Add a new NTB_XEON_GEN3 device type and
use it to conditionalize driver logic that differs from the existing
Xeon code.
Reviewed by: vangyzen
Discussed with: cem, Bret Ketchum <Bret.Ketchum@dell.com>
MFC after: 1 month
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26683
the failover protocol is supported due to limitations in the IPoIB
architecture. Refer to the lagg(4) manual page for how to configure
and use this new feature. A new network interface type,
IFT_INFINIBANDLAG, has been added, similar to the existing
IFT_IEEE8023ADLAG .
ifconfig(8) has been updated to accept a new laggtype argument when
creating lagg(4) network interfaces. This new argument is used to
distinguish between ethernet and infiniband type of lagg(4) network
interface. The laggtype argument is optional and defaults to
ethernet. The lagg(4) command line syntax is backwards compatible.
Differential Revision: https://reviews.freebsd.org/D26254
Reviewed by: melifaro@
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
Currently, this supports SHA1 and SHA2-{224,256,384,512} both as plain
hashes and in HMAC mode on both amd64 and i386. It uses the SHA
intrinsics when present similar to aesni(4), but uses SSE/AVX
instructions when they are not.
Note that some files from OpenSSL that normally wrap the assembly
routines have been adapted to export methods usable by 'struct
auth_xform' as is used by existing software crypto routines.
Reviewed by: gallatin, jkim, delphij, gnn
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D26821
Only one MIPS-specific driver implements support for one of the
asymmetric operations. There are no in-kernel users besides
/dev/crypto. The only known user of the /dev/crypto interface was the
engine in OpenSSL releases before 1.1.0. 1.1.0 includes a rewritten
engine that does not use the asymmetric operations due to lack of
documentation.
Reviewed by: cem, markj
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D26810
Not that you can regenerate the motd by editing motd.template and
running 'service motd restart' rather than rebooting.
Small wordsmithing by me, and updated the example from FreeBSD 2.1.6.1
release to 12.1 release.
Submitted by: Dan Mack
Add support for ARC-1886, NVMe/SAS/SATA controller.
Many thanks to Areca for continuing to support FreeBSD.
Submitted by: 黃清隆 <ching2048 areca com tw>
MFC after: 2 weeks
This permits requests (netipsec ESP and AH protocol) to provide the
IPsec ESN (Extended Sequence Numbers) in a separate buffer.
As with separate output buffer and separate AAD buffer not all drivers
support this feature. Consumer must request use of this feature via new
session flag.
Submitted by: Grzegorz Jaszczyk <jaz@semihalf.com>
Patryk Duda <pdk@semihalf.com>
Reviewed by: jhb
Differential revision: https://reviews.freebsd.org/D24838
Obtained from: Semihalf
Sponsored by: Stormshield
It is lightweight way to check if an IPv4 address exists.
Submitted by: Roy Marples
Reviewed by: gnn, melifaro
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D26636
arm64 has a similar wrapper. This permits defining <machine/fpu.h> as
the standard header for fpu_kern_*.
Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D26753
- whitespace at end of input line
- skipping paragraph macro: Pp at the end of Sh
- new sentence, new line
- consider using OS macro: Fx
- AUTHORS section without An macro
- skipping paragraph macro: Pp before Ss
These kind of drops come for free in the sense that they do not use the
filter TCAM or any other resource that wouldn't normally be used during
rx. Frames dropped by the hardware get counted in the MAC's rx stats
but are not delivered to the driver.
hw.cxgbe.attack_filter
Set to 1 to enable the "attack filter". Default is 0. The attack
filter will drop an incoming frame if any of these conditions is true:
src ip/ip6 == dst ip/ip6; tcp and src/dst ip is not unicast; src/dst ip
is loopback (127.x.y.z); src ip6 is not unicast; src/dst ip6 is loopback
(::1/128) or unspecified (::/128); tcp and src/dst ip6 is mcast
(ff00::/8).
hw.cxgbe.drop_ip_fragments
Set to 1 to drop all incoming IP fragments. Default is 0. Note that
this drops valid frames.
hw.cxgbe.drop_pkts_with_l2_errors
Set to 1 to drop incoming frames with Layer 2 length or checksum errors.
Default is 1.
hw.cxgbe.drop_pkts_with_l3_errors
Set to 1 to drop incoming frames with IP version, length, or checksum
errors. Default is 0.
hw.cxgbe.drop_pkts_with_l4_errors
Set to 1 to drop incoming frames with Layer 4 length, checksum, or other
errors. Default is 0.
MFC after: 2 weeks
Sponsored by: Chelsio Communications
- Extend the list of main libraries of section 3
- Extend the library functions that are included in the libc
MFC after: 2 weeks
Submitted by: Naga Chaitanya Vellanki <pnagato at protonmail dot com>
Approved by: gbe
Differential Revision: https://reviews.freebsd.org/D26476
This is a simple subsystem that allow drivers to register as a backlight.
Each backlight creates a device node under /dev/backlight/backlightX and
an alias based on the name provided.
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26250
For interfaces that do not support SIOCGIFMEDIA (for which there are
quite a few) the only fallback is to query the interface for
if_data->ifi_link_state. While it's possible to get at if_data for an
interface via getifaddrs(3) or sysctl, both are heavy weight mechanisms.
SIOCGIFDATA is a simple ioctl to retrieve this fast with very little
resource use in comparison. This implementation mirrors that of other
similar ioctls in FreeBSD.
Submitted by: Roy Marples <roy@marples.name>
Reviewed by: markj
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D26538
Document the new powerpc64le arch's initial specifications.
Certain things are subject to change while this is experimental. The most
likely change is that long double may switch to quad, dependent on POWER8
emulation assistance for __float128 being set up in the compiler (as
POWER8 does not have IEEE-compatible 128-bit hardware float, unlike POWER9.)
Sponsored by: Tag1 Consulting, Inc.
Document the calls to send messages to userland via devctl.
devctl_notify will create a message for the specified system,
subsystem and type, optionally adding additional information.
Reviewed by: bcr
Differential Revision: https://reviews.freebsd.org/D26520
Belatedly document the quoting requirements for the devctl protocol. I
thought they'd been previously documented.
Also, while I'm here, make igor happy.
Reviewed by: bcr
Differential Revision: https://reviews.freebsd.org/D26520
This routine centralizes the knowledge needed for properly quoting
'value' in all key="value" items that appear in devctl messages.
Reviewed by: bcr
Differential Revision: https://reviews.freebsd.org/D26520
This allows the PF interfaces to communicate with the VF interfaces over
the internal switch in the ASIC. Fix the GL limits for VM work requests
while here.
MFC after: 3 days
Sponsored by: Chelsio Communications
An upcoming patch to use the bitset macros for tracking vm page
dump information could conceivably need more than INT_MAX bits.
Expand the bit type to long so that the extra range is available
on 64-bit platforms where it would most likely be needed.
CPUSET_COUNT and DOMAINSET_COUNT are also modified to remain of
type `int`.
Reviewed by: kib, markj
Approved by: scottl (implicit)
MFC after: 1 week
Sponsored by: Ampere Computing, Inc.
Differential Revision: https://reviews.freebsd.org/D26190
This adds the getenv_bool() function, to parse a boolean value from a
kernel environment variable or tunable. This works for traditional
boolean values like "0" and "1", and also "true" and "false"
(case-insensitive). These semantics do not yet apply to sysctls declared
using SYSCTL_BOOL with CTLFLAG_TUN (they still only parse 1 and 0).
Also added are two wrapper functions, getenv_is_true() and
getenv_is_false(). These are slightly simpler for callers wishing to
perform a single check of a configuration variable.
Reviewed by: jhb (slightly earlier version)
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26270
bootonce feature is temporary, one time boot, activated by
"bectl activate -t BE", "bectl activate -T BE" will reset the bootonce flag.
By default, the bootonce setting is reset on attempt to boot and the next
boot will use previously active BE.
By setting zfs_bootonce_activate="YES" in rc.conf, the bootonce BE will
be set permanently active.
bootonce dataset name is recorded in boot pool labels, bootenv area.
in case of nextboot, the nextboot_enable boolean variable is recorded in
freebsd:nvstore nvlist, also stored in boot pool label bootenv area.
On boot, the loader will process /boot/nextboot.conf if nextboot_enable
is "YES", and will set nextboot_enable to "NO", preventing /boot/nextboot.conf
processing on next boot.
bootonce and nextboot features are usable in both UEFI and BIOS boot.
To use bootonce/nextboot features, the boot loader needs to be updated on disk;
if loader.efi is stored on ESP, then ESP needs to be updated and
for BIOS boot, stage2 (zfsboot or gptzfsboot) needs to be updated
(gpart or other tools).
At this time, only lua loader is updated.
Sponsored by: Netflix, Klara Inc.
Differential Revision: https://reviews.freebsd.org/D25512
One problem with the bus_space_read_N() and bus_space_write_N() family of
functions is that they provide no protection against exceptions which can
occur when no physical hardware or device responds to the read or write
cycles. In such a situation, the system typically would panic due to a
kernel-mode bus error. The bus_space_peek_N() and bus_space_poke_N() family
of functions provide a mechanism to handle these exceptions gracefully
without the risk of crashing the system.
Typical example is access to PCI(e) configuration space in bus enumeration
function on badly implemented PCI(e) root complexes (RK3399 or Neoverse
N1 N1SDP and/or access to PCI(e) register when device is in deep sleep state.
This commit adds a real implementation for arm64 only. The remaining
architectures have bus_space_peek()/bus_space_poke() emulated by using
bus_space_read()/bus_space_write() (without exception handling).
MFC after: 1 month
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D25371
Hardware assistance includes checksumming (tx and rx), TSO, and RSS on
the inner traffic in a VXLAN tunnel.
Relnotes: Yes
Sponsored by: Chelsio Communications
This lets a VXLAN pseudo-interface take advantage of hardware checksumming (tx
and rx), TSO, and RSS if the NIC is capable of performing these operations on
inner VXLAN traffic.
A VXLAN interface inherits the capabilities of its vxlandev interface if one is
specified or of the interface that hosts the vxlanlocal address. If other
interfaces will carry traffic for that VXLAN then they must have the same
hardware capabilities.
On transmit, if_vxlan verifies that the outbound interface has the required
capabilities and then translates the CSUM_ flags to their inner equivalents.
This tells the hardware ifnet that it needs to operate on the inner frame and
not the outer VXLAN headers.
An event is generated when a VXLAN ifnet starts. This allows hardware drivers to
configure their devices to expect VXLAN traffic on the specified incoming port.
On receive, the hardware does RSS and checksum verification on the inner frame.
if_vxlan now does a direct netisr dispatch to take full advantage of RSS. It is
not very clear why it didn't do this already.
Future work:
Rx: it should be possible to avoid the first trip up the protocol stack to get
the frame to if_vxlan just so it can decapsulate and requeue for a second trip
up the stack. The hardware NIC driver could directly call an if_vxlan receive
routine for VXLAN traffic instead.
Rx: LRO. depends on what happens with the previous item. There will have to to
be a mechanism to indicate that it's time for if_vxlan to flush its LRO state.
Reviewed by: kib@
Relnotes: Yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25873
In D12421, the ability to compile stand/ in little-endian was added, with the
intention to extend loader.kboot to run in Petitboot.
However, no further work was done, as the kernel then gained self-execution
capabilities as Petitboot was taught to load FreeBSD kernels directly.
The FreeBSD installer on powerpc64 (on POWER8 and POWER9) uses
/boot/etc/kboot.conf instead of loader.
As this option does nothing but cause stand/ to be miscompiled and actively
causes confusion, remove it.
(I have a functioning petitboot loader in my local tree, however, it turned
out to be quite inconvient to use due to the current petitboot plugin design
so I put it on hold.)
Reviewed by: emaste, imp, jhibbits
Sponsored by: Tag1 Consulting, Inc.
Differential Revision: https://reviews.freebsd.org/D26430
Submitted by: Ka Ho Ng <khng300@gmail.com>
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26372
When cache precedes files, and nscd is configured to allow negative caching,
commands like "pw groupadd" can fail. The sequence of events looks like:
1. A command like pkg(8) looks up the group, and finds it absent.
2. pkg invokes pw(8) to add the group
3. pkg queries the group, but nscd says it doesn't exist, since it has a
negative cache entry for that group.
See also: https://lists.freebsd.org/pipermail/freebsd-current/2012-January/031595.html
Reviewed by: bcr (manpages)
MFC after: 1 week
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D26184
For historical reasons, defining MALLOC_PRODUCTION in /etc/make.conf has
been used to turn off potentially expensive debug checks and statistics
gathering in the implementation of malloc(3).
It seems more consistent to turn this into a regular src.conf(5) option,
e.g. WITH_MALLOC_PRODUCTION / WITHOUT_MALLOC_PRODUCTION. This can then
be toggled similar to any other source build option, and turned on or
off by default for e.g. stable branches.
Reviewed by: imp, #manpages
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D26337
hints data. Control register 2 holds the settings a user might want to
configure, such as the timeout value for idle busses and whether to enable
the mass-writes feature.
Also add hint support for disconnecting idle busses (which was already
supported using FDT data).
Update the manpage with the new features, and also split the hints section
into separate lists of required and optional hints.
This allows privileged userspace processes to find information about the
physical page backing a given mapping. It is useful in applications
such as DPDK which perform some of their own memory management.
Reviewed by: kib, jhb (previous version)
MFC after: 2 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D26237
It's no longer unusual to be able to build a release with a single
command, so drop "actually" that hints at a surprise. Also just use
"network install directory" instead of referencing FTP; it's more
likely to be HTTP now.
Reviewed by: gjb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26260
Add deprecation notice for apm bios, aka the apm(4) device. The apm(8)
command will remain, at least for a while, since ACPI emulates the apm
ioctl interface.
Discussed on: arch@
Relnotes: yes
MFC After: 3 days
In Linux, ksize() gets the actual amount of memory allocated for a given
object. This commit adds malloc_usable_size() to FreeBSD KPI which does
the same. It also maps LinuxKPI ksize() to newly created function.
ksize() function is used by drm-kmod.
Reviewed by: hselasky, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D26215
sbuf_setpos can only be used to truncate the buffer, never to make it
longer. Update the documentation to reflect this.
Reviewed By: allanjude, phk
Differential Revision: https://reviews.freebsd.org/D26198
crypto(9) functions can now be used on buffers composed of an array of
vm_page_t structures, such as those stored in an unmapped struct bio. It
requires the running to kernel to support the direct memory map, so not all
architectures can use it.
Reviewed by: markj, kib, jhb, mjg, mat, bcr (manpages)
MFC after: 1 week
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D25671
It was a driver for a USB FM tuner that was available in the market in 2002. I
wrote the driver in 2003. I've not used it since 2005 or so, so it's time to
retire this driver. No userland code ever interfaced to the special device it
created. There's no user base: the last bug I received on this driver was in
2004.
Relnotes: Yes
This was discussed in arch@ a while ago. Most of the 16-bit drivers that it
relied on have been removed. There's only a few other drivers remaining that
support it, and those are very rare the days (even the once ubiquitious wi(1)
is now quite rare).
Indvidual drivers will be handled separately before pccard itself is removed.
Add prng(9) as a replacement for random(9) in the kernel.
There are two major differences from random(9) and random(3):
- General prng(9) APIs (prng32(9), etc) do not guarantee an
implementation or particular sequence; they should not be used for
repeatable simulations.
- However, specific named API families are also exposed (for now: PCG),
and those are expected to be repeatable (when so-guaranteed by the named
algorithm).
Some minor differences from random(3) and earlier random(9):
- PRNG state for the general prng(9) APIs is per-CPU; this eliminates
contention on PRNG state in SMP workloads. Each PCPU generator in an
SMP system produces a unique sequence.
- Better statistical properties than the Park-Miller ("minstd") PRNG
(longer period, uniform distribution in all bits, passes
BigCrush/PractRand analysis).
- Faster than Park-Miller ("minstd") PRNG -- no division is required to
step PCG-family PRNGs.
For now, random(9) becomes a thin shim around prng32(). Eventually I
would like to mechanically switch consumers over to the explicit API.
Reviewed by: kib, markj (previous version both)
Discussed with: markm
Differential Revision: https://reviews.freebsd.org/D25916
The current scheme of calling VOP_GETATTR adds avoidable overhead.
An example with tmpfs doing fstat (ops/s):
before: 7488958
after: 7913833
Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D25910
Add IEEE80211_IOC_IC_NAME to query the ic_name field and in ifconfig
to print the parent interface again. This functionality was lost
around r287197. It helps in case of multiple wlan interfaces and
multiple underlying hardware devices to keep track which wlan
interface belongs to which physical device.
Sponsored by: Rubicon Communications, LLC (d/b/a "Netgate")
Reviewed by: adrian, Idwer Vollering
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25832
- Add a better introduction to the DESCRIPTION section
- Add a description for MANPATH and POSIXLY_CORRECT
- Asorted improvements for the usage of some macros
PR: 43823
Submitted by: Lyndon Nerenberg <lyndon at orthanc dot ab dot ca>
Reviewed by: 0mp, bcr
Approved by: 0mp, bcr
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25912
These functions were introduced before UMA started ensuring that freed
memory gets placed in domain-local caches. They no longer serve any
purpose since UMA now provides their functionality by default. Remove
them to simplyify the kernel memory allocator interfaces a bit.
Reviewed by: cem, kib
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25937
The constant seems to exists on MacOS X >= 10.8.
Requested by: swills
Reviewed by: allanjude, kevans
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25933
As we are moving away from portsnap,
let's not recommend it in the manual page.
Reviewed by: bcr (manpages), mat (portmgr)
Differential Revision: https://reviews.freebsd.org/D25847
Update the ng_iface documentation and hooks to reflect the fact that the
node currently only supports IPv4 and v6 packets.
Reviewed by: Lutz Donnerhacke
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25862
- In the initial description of si_addr, do not claim that it is
always the faulting instruction.
- For si_addr, document that it is generally set to the PC for
synchronous signals, but that it can be set to the the address of
the faulting memory reference for some signals including SIGSEGV and
SIGBUS. In particular, while SIGSEGV generally sets si_addr to the
faulting memory reference, SIGBUS can vary. On some platforms, some
SIGBUS signals set si_addr to the PC and other SIGBUS signals set
si_addr to the faulting address depending on the specific hardware
exception.
- For si_trapno, synchronous signals should set this to some value.
Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25777
For purposes of handling hardware error reported via NMIs I need a way to
escape NMI context, being too restrictive to do something significant.
To do it this change introduces new swi_sched() flag SWI_FROMNMI, making
it careful about used KPIs. On platforms allowing IPI sending from NMI
context (x86 for now) it immediately wakes clk_intr_event via new IPI_SWI,
otherwise it works just like SWI_DELAY. To handle the delayed SWIs this
patch calls clk_intr_event on every hardclock() tick.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D25754
Currently, force_depend() from rc.subr(8) does not support depending on
scripts outside of /etc/rc.d (like /usr/local/etc/rc.d). The /etc/rc.d path
is hard-coded into force_depend().
MFC after: 1 week
Allow TLS records to be decrypted in the kernel after being received
by a NIC. At a high level this is somewhat similar to software KTLS
for the transmit path except in reverse. Protocols enqueue mbufs
containing encrypted TLS records (or portions of records) into the
tail of a socket buffer and the KTLS layer decrypts those records
before returning them to userland applications. However, there is an
important difference:
- In the transmit case, the socket buffer is always a single "record"
holding a chain of mbufs. Not-yet-encrypted mbufs are marked not
ready (M_NOTREADY) and released to protocols for transmit by marking
mbufs ready once their data is encrypted.
- In the receive case, incoming (encrypted) data appended to the
socket buffer is still a single stream of data from the protocol,
but decrypted TLS records are stored as separate records in the
socket buffer and read individually via recvmsg().
Initially I tried to make this work by marking incoming mbufs as
M_NOTREADY, but there didn't seemed to be a non-gross way to deal with
picking a portion of the mbuf chain and turning it into a new record
in the socket buffer after decrypting the TLS record it contained
(along with prepending a control message). Also, such mbufs would
also need to be "pinned" in some way while they are being decrypted
such that a concurrent sbcut() wouldn't free them out from under the
thread performing decryption.
As such, I settled on the following solution:
- Socket buffers now contain an additional chain of mbufs (sb_mtls,
sb_mtlstail, and sb_tlscc) containing encrypted mbufs appended by
the protocol layer. These mbufs are still marked M_NOTREADY, but
soreceive*() generally don't know about them (except that they will
block waiting for data to be decrypted for a blocking read).
- Each time a new mbuf is appended to this TLS mbuf chain, the socket
buffer peeks at the TLS record header at the head of the chain to
determine the encrypted record's length. If enough data is queued
for the TLS record, the socket is placed on a per-CPU TLS workqueue
(reusing the existing KTLS workqueues and worker threads).
- The worker thread loops over the TLS mbuf chain decrypting records
until it runs out of data. Each record is detached from the TLS
mbuf chain while it is being decrypted to keep the mbufs "pinned".
However, a new sb_dtlscc field tracks the character count of the
detached record and sbcut()/sbdrop() is updated to account for the
detached record. After the record is decrypted, the worker thread
first checks to see if sbcut() dropped the record. If so, it is
freed (can happen when a socket is closed with pending data).
Otherwise, the header and trailer are stripped from the original
mbufs, a control message is created holding the decrypted TLS
header, and the decrypted TLS record is appended to the "normal"
socket buffer chain.
(Side note: the SBCHECK() infrastucture was very useful as I was
able to add assertions there about the TLS chain that caught several
bugs during development.)
Tested by: rmacklem (various versions)
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24628
AVL trees, red-black trees, and others. Weak AVL (wavl) trees are a
recently discovered member of that class. This change replaces
red-black rebalancing with weak AVL rebalancing in the RB tree macros.
Wavl trees sit between AVL and red-black trees in terms of how
strictly balance is enforced. They have the stricter balance of AVL
trees as the tree is built - a wavl tree is an AVL tree until the
first deletion. Once removals start, wavl trees are lazier about
rebalancing than AVL trees, so that removals can be fast, but the
balance of the tree can decay to that of a red-black tree. Subsequent
insertions can push balance back toward the stricter AVL conditions.
Removing a node from a wavl tree never requires more than two
rotations, which is better than either red-black or AVL
trees. Inserting a node into a wavl tree never requires more than two
rotations, which matches red-black and AVL trees. The only
disadvantage of wavl trees to red-black trees is that more insertions
are likely to adjust the tree a bit. That's the cost of keeping the
tree more balanced.
Testing has shown that for the cases where red-black trees do worst,
wavl trees better balance leads to faster lookups, so that if lookups
outnumber insertions by a nontrivial amount, lookup time saved exceeds
the extra cost of balancing.
Reviewed by: alc, gbe, markj
Tested by: pho
Discussed with: emaste
Differential Revision: https://reviews.freebsd.org/D25480
Document that iwm(4) currently doesn't support 802.11n and 802.11ac.
PR: 247874
Submitted by: Charles Ross <cwr at sdf dot org>
Reviewed by: brueffer, markj
Approved by: brueffer
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25666
These routines are similar to crypto_getreq() and crypto_freereq() but
operate on caller-supplied storage instead of allocating crypto
requests from a UMA zone.
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D25691
initializations.
Relax some overly perscriptive rules against declarations: they may be at the
start of any blocks, even if things aren't super complicated. Allow more
initializations (those that call simple functions, like accessor functions for
newbus are fine). Allow the common idiom of declaring the loop variable in a for
loop.
This tries to codify what common exceptions are today, as well as give
some guidance on when it's best to do these things.
Reviewed by: tsoome, kp, markm, allanjude, jiles, cem, rpokala
(earlier versions: seanc, melifaro, bapt, pjd, bz, pstef, arichards,
jhibits, vangyzen, jmallet, ian, glebius, jhb, dab, adrian,
sef, gnn)
Differential Revision: https://reviews.freebsd.org/D25312
Note: date not bumped because "content" was not changed, just inserted some
missing words.
PR: 248001
Submitted by: Jose Luis Duran <jlduran@gmail.com>
MFC after: 2 weeks
Sponsored by: Klara Inc.
The EIP-97 is a packet processing module found on the ESPRESSObin. This
commit adds a crypto(9) driver for the crypto and hash engine in this
device. An initial skeleton driver that could attach and submit
requests was written by loos and others at Netgate, and the driver was
finished by me.
Support for separate AAD and output buffers will be added in a separate
commit, to simplify merging to stable/12 (where those features don't
exist).
Reviewed by: gnn, jhb
Feedback from: andrew, cem, manu
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC (Netgate)
Differential Revision: https://reviews.freebsd.org/D25417
Optionally, alert you if the contents change from the previous backup
PR: 86388
Submitted by: Rob Fairbanks <rob.fx907@gmail.com>, Miroslav Lachman <000.fbsd@quip.cz> (Original Version)
MFC after: 4 weeks
Relnotes: yes
Sponsored by: Klara Inc.
Event: July 2020 Bugathon
Differential Revision: https://reviews.freebsd.org/D25628
With this change, a kernel compiled with "options SCTP_SUPPORT" and
without "options SCTP" supports dynamic loading of the SCTP stack.
Currently sctp.ko cannot be unloaded since some prerequisite teardown
logic is not yet implemented. Attempts to unload the module will return
EOPNOTSUPP.
Discussed with: tuexen
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21997
This fixes Linux gettyname(3), with caveats (see PR).
PR: kern/240767
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25558
dumpon has accepted device names without the prefix ever since r291207.
Since dumpon and savecore are always paired, they ought to accept the same
arguments. Prior to this change, specifying 'dumpdev="da3"' in
/etc/rc.conf, for example, would result in dumpon working just fine but
savecore complaining that "Dump device does not exist".
PR: 247618
Reviewed by: cem, bcr
MFC after: 2 weeks
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D25500
Expand the mentioned RFC in the SEE ALSO section
and reference RFC1701 and RFC1702.
PR: 240250
Reviewed by: bcr (mentor)
Approved by: bcr (mentor)
Obtained from: OpenBSD
MFC after: 7 days
Differential Revision: https://reviews.freebsd.org/D25504
This mode was added in r362496. Rename it to make the meaning more
clear.
PR: 247306
Suggested by: rpokala
Submitted by: Ali Abdallah <ali.abdallah@suse.com>
MFC with: r362496
Document that RISC-V supports multiple page sizes: 4K, 2M, and 1G.
RISC-V's long double is always 128-bits wide, therefore quad precision.
Mention __riscv_float_abi_soft, which can be used to differentiate between
riscv64 and riscv64sf in userland code.
MFC after: 3 days
This permits requests to provide the AAD in a separate side buffer
instead of as a region in the crypto request input buffer. This is
useful when the main data buffer might not contain the full AAD
(e.g. for TLS or IPsec with ESN).
Unlike separate IVs which are constrained in size and stored in an
array in struct cryptop, separate AAD is provided by the caller
setting a new crp_aad pointer to the buffer. The caller must ensure
the pointer remains valid and the buffer contents static until the
request is completed (e.g. when the callback routine is invoked).
As with separate output buffers, not all drivers support this feature.
Consumers must request use of this feature via a new session flag.
To aid in driver testing, kern.crypto.cryptodev_separate_aad can be
set to force /dev/crypto requests to use a separate AAD buffer.
Discussed with: cem
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25288
There are cases when gif_interfaces cannot be replaced
with cloned_interfaces, such as tunnels with external IPv6 addresses
and internal IPv4 or vice versa. Such configuration requires
extra invocation of ifconfig(8) and supported with gif_interfaces only.
Fix manual page and provide some examples.
MFC after: 1 week
X-MFC-With: 362502
This is in preparation for enabling a loadable SCTP stack. Analogous to
IPSEC/IPSEC_SUPPORT, the SCTP_SUPPORT kernel option must be configured
in order to support a loadable SCTP implementation.
Discussed with: tuexen
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
The arguments for VFS_CHECKEXP() were changed by r362158.
Also, the numsecflavors and secflavors arguments were not documented,
so add these as well.
This is a content change.
the debug messages. While here, clean up some variable naming.
Reviewed by: bcr (manpages), emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25230
pthread_get_name_np() and pthread_set_name_np().
This re-applies r361770 after compatibility fixes.
Reviewed by: antoine, jkim, markj
Tested by: antoine (exp-run)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25117
applications, which often depend on this being the case. There's a new
sysctl, compat.linux.default_openfiles, to control this behaviour.
Reviewed by: kevans, emaste, bcr (manpages)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25177
Use this in GELI to print out a different message when accelerated
software such as AESNI is used vs plain software crypto.
While here, simplify the logic in GELI a bit for determing which type
of crypto driver was chosen the first time by examining the
capabilities of the matched driver after a single call to
crypto_newsession rather than making separate calls with different
flags.
Reviewed by: delphij
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25126
As timeout(9) was removed and all consumers were converted to
callout(9), reference it instead for the description of sbt, pr,
and flags arguments.
Reviewed by: trasz
Differential Revision: https://reviews.freebsd.org/D25165
Add xref to all SIM devices we currently have (including a rough indication
which ones are likely to fail).
Update to include all the CAM options.
Fix a few igor nits while I'm here.
for pthread_get_name_np() and pthread_set_name_np(), to be
compatible with Linux.
PR: 238404
Proposed and reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25117
which happens on some laptops after returning to legacy multiplexing mode
at initialization stage.
PR: 242542
Reported by: Felix Palmen <felix@palmen-it.de>
MFC after: 1 week
Some crypto consumers such as GELI and KTLS for file-backed sendfile
need to store their output in a separate buffer from the input.
Currently these consumers copy the contents of the input buffer into
the output buffer and queue an in-place crypto operation on the output
buffer. Using a separate output buffer avoids this copy.
- Create a new 'struct crypto_buffer' describing a crypto buffer
containing a type and type-specific fields. crp_ilen is gone,
instead buffers that use a flat kernel buffer have a cb_buf_len
field for their length. The length of other buffer types is
inferred from the backing store (e.g. uio_resid for a uio).
Requests now have two such structures: crp_buf for the input buffer,
and crp_obuf for the output buffer.
- Consumers now use helper functions (crypto_use_*,
e.g. crypto_use_mbuf()) to configure the input buffer. If an output
buffer is not configured, the request still modifies the input
buffer in-place. A consumer uses a second set of helper functions
(crypto_use_output_*) to configure an output buffer.
- Consumers must request support for separate output buffers when
creating a crypto session via the CSP_F_SEPARATE_OUTPUT flag and are
only permitted to queue a request with a separate output buffer on
sessions with this flag set. Existing drivers already reject
sessions with unknown flags, so this permits drivers to be modified
to support this extension without requiring all drivers to change.
- Several data-related functions now have matching versions that
operate on an explicit buffer (e.g. crypto_apply_buf,
crypto_contiguous_subsegment_buf, bus_dma_load_crp_buf).
- Most of the existing data-related functions operate on the input
buffer. However crypto_copyback always writes to the output buffer
if a request uses a separate output buffer.
- For the regions in input/output buffers, the following conventions
are followed:
- AAD and IV are always present in input only and their
fields are offsets into the input buffer.
- payload is always present in both buffers. If a request uses a
separate output buffer, it must set a new crp_payload_start_output
field to the offset of the payload in the output buffer.
- digest is in the input buffer for verify operations, and in the
output buffer for compute operations. crp_digest_start is relative
to the appropriate buffer.
- Add a crypto buffer cursor abstraction. This is a more general form
of some bits in the cryptosoft driver that tried to always use uio's.
However, compared to the original code, this avoids rewalking the uio
iovec array for requests with multiple vectors. It also avoids
allocate an iovec array for mbufs and populating it by instead walking
the mbuf chain directly.
- Update the cryptosoft(4) driver to support separate output buffers
making use of the cursor abstraction.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24545
Add a 'native_blocksize' member to 'struct enc_xform' that ciphers can
use if they support a partial final block. This is particular useful
for stream ciphers, but can also apply to other ciphers. cryptosoft
will only pass in native blocks to the encrypt and decrypt hooks. For
the final partial block, 'struct enc_xform' now has new
encrypt_last/decrypt_last hooks which accept the length of the final
block. The multi_block methods are also retired.
Mark AES-ICM (AES-CTR) as a stream cipher. This has some interesting
effects on IPsec in that FreeBSD can now properly receive all packets
sent by Linux when using AES-CTR, but FreeBSD can no longer
interoperate with OpenBSD and older verisons of FreeBSD which assume
AES-CTR packets have a payload padded to a 16-byte boundary. Kornel
has offered to work on a patch to add a compatiblity sysctl to enforce
additional padding for AES-CTR in esp_output to permit compatibility
with OpenBSD and older versions of FreeBSD.
AES-XTS continues to use a block size of a single AES block length.
It is possible to adjust it to support partial final blocks by
implementing cipher text stealing via encrypt_last/decrypt_last hooks,
but I have not done so.
Reviewed by: cem (earlier version)
Tested by: Kornel Dulęba <mindal@semihalf.com> (AES-CTR with IPsec)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24906
The flush is needed to prevent cross-process ret2spec, which is not handled
on kernel entry if IBPB is enabled but SMEP is present.
While there, add i386 RSB flush.
Reported by: Anthony Steinhauser <asteinhauser@google.com>
Reviewed by: markj, Anthony Steinhauser
Discussed with: philip
admbugs: 961
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This man page contains stat utilities that are available in
the base system. This is a better approach than looking them
up via "apropos stat" or similar commands.
Thanks to Daniel Ebdrup Jensen for writing the original page
and incorporating the feedback given.
Submitted by: Daniel Ebdrup Jensen
Reviewed by: 0mp, allanjude, brueffer, bcr
Approved by: bcr
MFC after: 3 days
Relnotes: yes (new stats(7) man page)
Differential Revision: https://reviews.freebsd.org/D24417
Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults. It's just
an older incarnation of the now-more-common strlcpy().
Reviewed by: jhb
MFC after: i² days
Differential Revision: yes (see 2/2)
There are no in-kernel consumers.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24775
It no longer has any in-kernel consumers via OCF. smbfs still uses
single DES directly, so sys/crypto/des remains for that use case.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24773
It no longer has any in-kernel consumers.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24772
There are no longer any in-kernel consumers. The software
implementation was also a non-functional stub.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24771
They no longer have any in-tree consumers. Note that these are a
different from MD5-HMAC and SHA1-HMAC and were only used with IPsec.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24770
This was removed from IPsec in r286100 and no longer has any in-tree
consumers.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24769
It no longer has any in-tree consumers.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24768
Although a few drivers supported this algorithm, there were never any
in-kernel consumers. cryptosoft and cryptodev never supported it,
and there was not a software xform auth_hash for it.
Reviewed by: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24767
pf by default does not do per-table address accounting unless the
"counters" keyword is specified in the corresponding pf.conf table
definition. Yet, we always allocate 12 per-CPU counters per table. For
large tables this carries a lot of overhead, so only allocate counters
when they will actually be used.
A further enhancement might be to use a dedicated UMA zone to allocate
counter arrays for table entries, since close to half of the structure
size comes from counter pointers. A related issue is the cost of
zeroing counters, since counter_u64_zero() calls smp_rendezvous() on
some architectures.
Reported by: loos, Jim Pingle <jimp@netgate.com>
Reviewed by: kp
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC (Netgate)
Differential Revision: https://reviews.freebsd.org/D24803
It can be dangerous and there is no need for it in the kernel.
Inspired by Kees Cook's change in Linux, and later OpenBSD.
Reviewed by: cem, gordon, philip
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24760
for custom vendor-specific changes to FreeBSD's
default settings.
While here, fix a typo: perfomance -> performance
PR: 245404
Submitted by: Jose Luis Duran
With the removal of in-tree consumers of DES, Triple DES, and
MD5-HMAC, the only algorithm this driver still supports is SHA1-HMAC.
This is not very useful as a standalone algorithm (IPsec AH-only with
SHA1 would be the only user).
This driver has also not been kept up to date with the original driver
in OpenBSD which supports a few more cards and AES-CBC on newer cards.
The newest card currently supported by this driver was released in
2005.
Reviewed by: cem
MFC after: 1 week
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24691
source that works or is the new location on the
same page.
Submitted by: alfix86_gmail.com
Approved by: bcr
Differential Revision: https://reviews.freebsd.org/D23769
Save and restore (also known as suspend and resume) permits a snapshot
to be taken of a guest's state that can later be resumed. In the
current implementation, bhyve(8) creates a UNIX domain socket that is
used by bhyvectl(8) to send a request to save a snapshot (and
optionally exit after the snapshot has been taken). A snapshot
currently consists of two files: the first holds a copy of guest RAM,
and the second file holds other guest state such as vCPU register
values and device model state.
To resume a guest, bhyve(8) must be started with a matching pair of
command line arguments to instantiate the same set of device models as
well as a pointer to the saved snapshot.
While the current implementation is useful for several uses cases, it
has a few limitations. The file format for saving the guest state is
tied to the ABI of internal bhyve structures and is not
self-describing (in that it does not communicate the set of device
models present in the system). In addition, the state saved for some
device models closely matches the internal data structures which might
prove a challenge for compatibility of snapshot files across a range
of bhyve versions. The file format also does not currently support
versioning of individual chunks of state. As a result, the current
file format is not a fixed binary format and future revisions to save
and restore will break binary compatiblity of snapshot files. The
goal is to move to a more flexible format that adds versioning,
etc. and at that point to commit to providing a reasonable level of
compatibility. As a result, the current implementation is not enabled
by default. It can be enabled via the WITH_BHYVE_SNAPSHOT=yes option
for userland builds, and the kernel option BHYVE_SHAPSHOT.
Submitted by: Mihai Tiganus, Flavius Anton, Darius Mihai
Submitted by: Elena Mihailescu, Mihai Carabas, Sergiu Weisz
Relnotes: yes
Sponsored by: University Politehnica of Bucharest
Sponsored by: Matthew Grooms (student scholarships)
Sponsored by: iXsystems
Differential Revision: https://reviews.freebsd.org/D19495
- Abbreviated month name in .Dd
- position of HISTORY section
- alphabetical ordering within SEE ALSO section
- adding .Ed before .Sh DESCRIPTION
- remove trailing whitespaces
- Line break after a sentence stop
- Use BSD OS macros instead of hardcoded strings
No .Dd bumps as there was no actual content change made
in any of these pages.
Submitted by: Gordon Bergling gbergling_gmail.com
Approved by: bcr
Differential Revision: https://reviews.freebsd.org/D24591
o Shrink sglist(9) functions to work with multipage mbufs down from
four functions to two.
o Don't use 'struct mbuf_ext_pgs *' as argument, use struct mbuf.
o Rename to something matching _epg.
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D24598