Unmapped mbufs allow sendfile to carry multiple pages of data in a
single mbuf, without mapping those pages. It is a requirement for
Netflix's in-kernel TLS, and provides a 5-10% CPU savings on heavy web
serving workloads when used by sendfile, due to effectively
compressing socket buffers by an order of magnitude, and hence
reducing cache misses.
For this new external mbuf buffer type (EXT_PGS), the ext_buf pointer
now points to a struct mbuf_ext_pgs structure instead of a data
buffer. This structure contains an array of physical addresses (this
reduces cache misses compared to an earlier version that stored an
array of vm_page_t pointers). It also stores additional fields needed
for in-kernel TLS such as the TLS header and trailer data that are
currently unused. To more easily detect these mbufs, the M_NOMAP flag
is set in m_flags in addition to M_EXT.
Various functions like m_copydata() have been updated to safely access
packet contents (using uiomove_fromphys()), to make things like BPF
safe.
NIC drivers advertise support for unmapped mbufs on transmit via a new
IFCAP_NOMAP capability. This capability can be toggled via the new
'nomap' and '-nomap' ifconfig(8) commands. For NIC drivers that only
transmit packet contents via DMA and use bus_dma, adding the
capability to if_capabilities and if_capenable should be all that is
required.
If a NIC does not support unmapped mbufs, they are converted to a
chain of mapped mbufs (using sf_bufs to provide the mapping) in
ip_output or ip6_output. If an unmapped mbuf requires software
checksums, it is also converted to a chain of mapped mbufs before
computing the checksum.
Submitted by: gallatin (earlier version)
Reviewed by: gallatin, hselasky, rrs
Discussed with: ae, kp (firewalls)
Relnotes: yes
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20616
The format to use depends on hardware configuration (synthesis-time),
so make it compile-time kernel option.
Extended format allows DMA engine to operate with 64-bit memory addresses.
Sponsored by: DARPA, AFRL
NANDFS has been broken for years. Remove it. The NAND drivers that
remain are for ancient parts that are no longer relevant. They are
polled, have terrible performance and just for ancient arm
hardware. NAND parts have evolved significantly from this early work
and little to none of it would be relevant should someone need to
update to support raw nand. This code has been off by default for
years and has violated the vnode protocol leading to panics since it
was committed.
Numerous posts to arch@ and other locations have found no actual users
for this software.
Relnotes: Yes
No Objection From: arch@
Differential Revision: https://reviews.freebsd.org/D20745
had been done years ago. I did. All this time we've only compiled a LINT
kernel for TARGET_ARCH=arm. Now separate LINT-V5 and LINT-V7 configs are
generated and built.
There are two new files in arm/conf, NOTES.armv5 and NOTES.armv7, containing
some of what used to be in the arm NOTES file. That file now contains only
the bits that are common to v5 and v7.
The makeLINT.mk file now creates the LINT-V5 and LINT-V7 files by concatening
sys/conf/NOTES, arm/conf/NOTES, and arm/conf/NOTES.armv{5,7} in that order.
This adds ACPI device path on devinfo(8) output and
show value of _UPC(usb port capabilities), _PLD (physical location of device)
when hw.usb.debug >= 1 .
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D20630
rename the source to gsb_crc32.c.
This is a prerequisite of unifying kernel zlib instances.
PR: 229763
Submitted by: Yoshihiro Ota <ota at j.email.ne.jp>
Differential Revision: https://reviews.freebsd.org/D20193
The pwm and pwmbus interfaces were nearly identical, this merges them into a
single pwmbus interface. The pwmbus driver now implements the pwmbus
interface by simply passing all calls through to its parent (the hardware
driver). The channel_count method moves from pwm to pwmbus, and the
get_bus method is deleted (just no longer needed).
The net effect is that the interface for doing pwm stuff is now the same
regardless of whether you're a child of pwmbus, or some random driver
elsewhere in the hierarchy that is bypassing the pwmbus layer and is talking
directly to the hardware driver via cross-hierarchy connections established
using fdt data.
The pwmc driver is now a child of pwmbus, instead of being its sibling
(that's why the get_bus method is no longer needed; pwmc now gets the
device_t of the bus using device_get_parent()).
This was lost in r335910 for some reason.
This also fixes a META_MODE rebuild issue in some modules [1].
MFC after: 2 weeks
Reported by: npn [1]
Sponsored by: DellEMC
The A3700 has a different GPIO controller and thus, do not use the old (and
shared) code for Marvell.
The pinctrl driver, also part of the controller, is not supported yet (but
the implementation should be straightforward).
Sponsored by: Rubicon Communications, LLC (Netgate)
The gp register is intended to used by the linker as another means of
performing relaxations, and should point to the small data section (.sdata).
Currently gp is being used as the pcpu pointer within the kernel, but the more
appropriate choice for this is the tp register, which is unused.
Swap existing usage of gp with tp within the kernel, and set up gp properly
at boot with the value of __global_pointer$ for all harts.
Additionally, remove some cases of accessing tp from the PCB, as it is not
part of the per-thread state. The user's tp and gp should be tracked only
through the trapframe.
Reviewed by: markj, jhb
Approved by: markj (mentor)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D19893
Apply a linker script when linking i386 kernel modules to apply padding
to a set_pcpu or set_vnet section. The padding value is kind-of random
and is used to catch modules not compiled with the linker-script, so
possibly still having problems leading to kernel panics.
This is needed as the code generated on certain architectures for
non-simple-types, e.g., an array can generate an absolute relocation
on the edge (just outside) the section and thus will not be properly
relocated. Adding the padding to the end of the section will ensure
that even absolute relocations of complex types will be inside the
section, if they are the last object in there and hence relocation will
work properly and avoid panics such as observed with carp.ko or ipsec.ko.
There is a rather lengthy discussion of various options to apply in
the mentioned PRs and their depends/blocks, and the review.
There seems no best solution working across multiple toolchains and
multiple version of them, so I took the liberty of taking one,
as currently our users (and our CI system) are hitting this on
just i386 and we need some solution. I wish we would have a proper
fix rather than another "hack".
Also backout r340009 which manually, temporarily fixed CARP before 12.0-R
"by chance" after a lead-up of various other link-elf.c and related fixes.
PR: 230857,238012
With suggestions from: arichardson (originally last year)
Tested by: lwhsu
Event: Waterloo Hackathon 2019
Reported by: lwhsu, olivier
MFC after: 6 weeks
Differential Revision: https://reviews.freebsd.org/D17512
Add a CAM-Newbus SDIO support module. This works provides a newbus
infrastructure for device drivers wanting to use SDIO. On the lower end
while it is connected by newbus to SDHCI, it talks CAM using the MMCCAM
framework to get to it.
This also duplicates the usbdevs framework to equally create sdiodev
header files with #defines for "vendors" and "products".
Submitted by: kibab (initial work, see https://reviews.freebsd.org/D12467)
Reviewed by: kibab, imp (comments on earlier version)
MFC after: 6 weeks
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19749
After our migration (of certain architectures) to lld the kernel is built
with a unique build-ID. Make it available via a sysctl and uname(1) to
allow the user to identify their running kernel.
Submitted by: Ali Mashtizadeh <ali_mashtizadeh.com>
MFC after: 2 weeks
Relnotes: Yes
Event: Waterloo Hackathon 2019
Differential Revision: https://reviews.freebsd.org/D20326
install -> ${INSTALL}
mtree -> ${MTREE_CMD}
services_mkdb -> ${SERVICES_MKDB_CMD}
cap_mkdb -> ${CAP_MKDB_CMD}
pwd_mkdb -> ${PWD_MKDB_CMD}
kldxref -> ${KLDXREF_CMD}
If you do custom FreeBSD builds you may want to override those
in some cases.
Sponsored by: Sippy Software, Inc.
I introduced an obvious compiler error in r346282, so this change fixes
that.
Unfortunately, RANDOM_LOADABLE isn't covered by our existing tinderbox, and
it seems like there were existing latent linking problems. I believe these
were introduced on accident in r338324 during reduction of the boolean
expression(s) adjacent to randomdev.c and hash.c. It seems the
RANDOM_LOADABLE build breakage has gone unnoticed for nine months.
This change correctly annotates randomdev.c and hash.c with !random_loadable
to match the pre-r338324 logic; and additionally updates the HWRNG drivers
in MD 'files.*', which depend on random_device symbols, with
!random_loadable (it is invalid for the kernel to depend on symbols from a
module).
(The expression for both randomdev.c and hash.c was the same, prior to
r338324: "optional random random_yarrow | random !random_yarrow
!random_loadable". I.e., "random && (yarrow || !loadable)." When Yarrow
was removed ("yarrow := False"), the expression was incorrectly reduced to
"optional random" when it should have retained "random && !loadable".)
Additionally, I discovered that virtio_random was missing a MODULE_DEPEND on
random_device, which breaks kld load/link of the driver on RANDOM_LOADABLE
kernels. Address that issue as well.
PR: 238223
Reported by: Eir Nym <eirnym AT gmail.com>
Reviewed by: delphij, markm
Approved by: secteam(delphij)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20466
With the reorg r348175, we now look at modified before it is
set. Rearrange things so that we can set include_metadata to either
yes, no or if-modified. This should fix the -R flag that was broken in
r348175, which broke WITH_REPRODUCIBLE_BUILD for kernels.
Feedback From: emaste@
Differential Revision: https://reviews.freebsd.org/D20480
cpufunc, in terms of __builtin_ffs and the like, for arm32 v6 and v7
architectures, and use those, rather than the simple libkern
implementations, in building arm32 kernels.
Reviewed by: manu
Approved by: kib, markj (mentors)
Tested by: iz-rpi03_hs-karlsruhe.de, mikael.urankar_gmail.com, ian
Differential Revision: https://reviews.freebsd.org/D20412
code. The primary client of this is probably going to be ZFS encryption.
Reviewed by: jhb, cem
Sponsored by: iXsystems Inc, Kithrup Enterprises
Differential Revision: https://reviews.freebsd.org/D19298
This takes the SPCR code currently in uart_cpu_arm64.c, moves it into
a new uart_cpu_acpi.c (with some associated refactoring), and uses it
from both arm64 and x86.
An SPCR serial port address AccessWidth field value of 0 ("reserved")
is now treated as 1 ("byte access") in order to work around a buggy
SPCR table on Amazon EC2 i3.metal instances.
Reviewed by: manu, Greg V
MFC after: 3 days
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D20357
Add a clock driver for clock that can either be used in integer mode
with one N factor and one M divider or in fractional mode where the
output frequency is chosen between two predifined output.
Add -v to print TYPE REVISION BRANCH RELEASE VERSION RELDATE variables
Add -V var to print var's value
Both of these in ${var}="${val}" format suitable for
eval $(sh newvers.sh -v)
in shell scripts / makefiles.
Add -c to print the copyright / license comment text only.
Document these, and remove soon-to-be obsolete comment.
Minor code motion as well bunded here to put functions after
VARS_ONLY and command line argument parsing.
Differential Revision: https://reviews.freebsd.org/D19849
FDT data is sometimes used to configure usb devices which are hardwired into
an embedded system. Because the devices are instantiated by the usb
enumeration process rather than by ofwbus iterating through the fdt data, it
is somewhat difficult for a usb driver to locate fdt data that belongs to
it. In the past, various ad-hoc methods have been used, which can lead to
errors such applying configuration that should apply only to a hardwired
device onto a similar device attached by the user at runtime. For example,
if the user adds an ethernet device that uses the same driver as the builtin
ethernet, both devices might end up with the same MAC address.
These changes add a new usb_fdt_get_node() helper function that a driver can
use to locate FDT data that belongs to a single unique instance of the
device. This function locates the proper FDT data using the mechanism
detailed in the standard "usb-device.txt" binding document [1].
There is also a new usb_fdt_get_mac_addr() function, used to retrieve the
mac address for a given device instance from the fdt data. It uses
usb_fdt_get_node() to locate the right node in the FDT data, and attempts to
obtain the mac-address or local-mac-address property (in that order, the
same as linux does it).
The existing if_smsc driver is modified to use the new functions, both as an
example and for testing the new functions. Rpi and rpi2 boards use this
driver and provide the mac address via the fdt data.
[1] https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/usb/usb-device.txt
Differential Revision: https://reviews.freebsd.org/D20262
cpufunc, in terms of __builtin_ffs and the like, for arm64
architectures, and use those, rather than the simple libkern
implementations, in building arm64 kernels.
Tested by: greg_unrelenting.technology (earlier version)
Reviewed by: alc
Approved by: kib (mentor)
Differential Revision: https://reviews.freebsd.org/D20250