Once all CPUs are online, determine if they all support LSE atomics and
set lse_supported to indicate this. For now the atomic(9)
implementations are still always inlined, though it would be preferable
to create out-of-line functions to avoid text bloat. This was not done
here since big.little systems exist in which some CPUs implement LSE
while others do not, and ifunc resolution must occur well before this
scenario can be detected. It does seem unlikely that FreeBSD will
ever run on such platforms, however, so converting atomic(9) to use
ifuncs is probably a good next step.
Add a LSE_ATOMICS arm64 kernel configuration option to unconditionally
select LSE-based atomic(9) implementations when the target system is
known.
Reviewed by: andrew, kib
MFC after: 1 month
Sponsored by: The FreeBSD Foundation, Amazon (hardware)
Differential Revision: https://reviews.freebsd.org/D23325
fonts. As a workaround, remove the static. vt is default on powerpc, but there's
a few old macs that still fail with vt. sc is used as a work arouond for those
machines, and the kernel fails to build w/o it.
Use ${SRCTOP} instead of /usr/share.
Prefer to depend on option sc_dflt_fnt instead of sc.
gc the 4 otherwise identical instances in the tree.
Platforms that don't need this won't included it.
Fix the old-style build by using ${SRCTOP} instead of a weird
construct that only works for new-style build.
Simplify the building of keymap files by using macros
Move atkbdmap.h in files.x86
This has been broken since r296899 which removed the implicit
dependency on /usr/share.
Now that armv5 is gone, we no longer need multiple LINT files. Kill
the odd-ball support here. From now on, we just have LINT built from
notes like all the other platforms. Keep the removal of LINT-V5/7
to remove stale files for a while still..
The Parallel Port SCSI adapter was interesting for 100MB ZIP drives, but is no
longer used or maintained. Remove it from the tree.
The Parallel Port microsequencer (microseq.9) is now mostly unused in the tree,
but remains. PPI still refrences it, but doesn't use its full functionality.
Relnotes: Yes
Reviewed by: rgrimes@, Ihor Antonov
Discussed on: arch@
Differential Revision: https://reviews.freebsd.org/D23389
the ! operator should have been a ~ instead:
Merge r357348 from the clang 10.0.0 import branch:
Disable new clang 10.0.0 warnings about converting the result of
shift operations to a boolean in tpm(4):
sys/dev/tpm/tpm_crb.c:301:32: error: converting the result of '<<' to a boolean; did you mean '(1 << (0)) != 0'? [-Werror,-Wint-in-bool-context]
WR4(sc, TPM_CRB_CTRL_CANCEL, !TPM_CRB_CTRL_CANCEL_CMD);
^
sys/dev/tpm/tpm_crb.c:73:34: note: expanded from macro 'TPM_CRB_CTRL_CANCEL_CMD'
#define TPM_CRB_CTRL_CANCEL_CMD BIT(0)
^
sys/dev/tpm/tpm20.h:60:19: note: expanded from macro 'BIT'
#define BIT(x) (1 << (x))
^
Such warnings can be useful in C++ contexts, but not so much in kernel
drivers, where this type of bit twiddling is commonplace. So disable
it for this case.
Noticed by: cem
MFC after: 3 days
Disable new clang 10.0.0 warnings about converting the result of shift
operations to a boolean in tpm(4):
sys/dev/tpm/tpm_crb.c:301:32: error: converting the result of '<<' to a boolean; did you mean '(1 << (0)) != 0'? [-Werror,-Wint-in-bool-context]
WR4(sc, TPM_CRB_CTRL_CANCEL, !TPM_CRB_CTRL_CANCEL_CMD);
^
sys/dev/tpm/tpm_crb.c:73:34: note: expanded from macro 'TPM_CRB_CTRL_CANCEL_CMD'
#define TPM_CRB_CTRL_CANCEL_CMD BIT(0)
^
sys/dev/tpm/tpm20.h:60:19: note: expanded from macro 'BIT'
#define BIT(x) (1 << (x))
^
Such warnings can be useful in C++ contexts, but not so much in kernel
drivers, where this type of bit twiddling is commonplace. So disable it
for this case.
MFC after: 3 days
operations to a boolean in tpm(4):
sys/dev/tpm/tpm_crb.c:301:32: error: converting the result of '<<' to a boolean; did you mean '(1 << (0)) != 0'? [-Werror,-Wint-in-bool-context]
WR4(sc, TPM_CRB_CTRL_CANCEL, !TPM_CRB_CTRL_CANCEL_CMD);
^
sys/dev/tpm/tpm_crb.c:73:34: note: expanded from macro 'TPM_CRB_CTRL_CANCEL_CMD'
#define TPM_CRB_CTRL_CANCEL_CMD BIT(0)
^
sys/dev/tpm/tpm20.h:60:19: note: expanded from macro 'BIT'
#define BIT(x) (1 << (x))
^
Such warnings can be useful in C++ contexts, but not so much in kernel
drivers, where this type of bit twiddling is commonplace. So disable it
for this case.
MFC after: 3 days
This is in the same family of algorithms as Epoch/QSBR/RCU/PARSEC but is
a unique algorithm. This has 3x the performance of epoch in a write heavy
workload with less than half of the read side cost. The memory overhead
is significantly lessened by limiting the free-to-use latency. A synthetic
test uses 1/20th of the memory vs Epoch. There is significant further
discussion in the comments and code review.
This code should be considered experimental. I will write a man page after
it has settled. After further validation the VM will begin using this
feature to permit lockless page lookups.
Both markj and cperciva tested on arm64 at large core counts to verify
fences on weaker ordering architectures. I will commit a stress testing
tool in a follow-up.
Reviewed by: mmacy, markj, rlibby, hselasky
Discussed with: sbahara
Differential Revision: https://reviews.freebsd.org/D22586
MAC is also almost universally a default; every GENERIC includes it, and
it's std.armv[67]. mips is again the oddball here with it only being
included in ERL/OCTEON1.
The only module currently working around this one is mac_veriexec, but it
looks like nothing it builds actually uses the MAC definition. Downstream
consumers enabling MAC in mips using mac_veriexec may be advised to do
something differently here in config.mk.
For untied module builds, we'll generate opt_foo headers if they're included
in SRCS. However, options that would normally be represented in opt_global.h
aren't properly represented.
Start generating opt_global.h with #define VIMAGE for !mips since it's
almost universally a project default and right now kmods must hack it in
themselves in order to be properly compiled for the default kernel. For
example, ^/sys/modules/pf/Makefile
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D23345
Originally, hack.c was compiled into a shard object with just -shared
-nostdlib. This assumed that ${CC} did not require any additional
flags for ABIs, cross-building, etc.
When kern.post.mk was created in r89509 by reducing duplication in
kernel Makefile.<arch> files, the -shared flag was moved into a
HACK_EXTRA_FLAGS variable so that sparc64 could override it with
-Wl,-shared. The sparc64 hack was removed in r111650, but
HACK_EXTRA_FLAGS was left in place. Over time, we have started
support toolchains that require flags to support alternate ABIs on
MIPS and PowerPC and started (ab)using HACK_EXTRA_FLAGS to set only
those flags.
I need to fix risc-v to pass -mno-relax to the hack.c build for lld in
llvm 10, and the patches to support cross-build from non-FreeBSD hosts
need to include -target for clang in CFLAGS for hack.c. Rather than
adding more hacks into HACK_EXTRA_FLAGS, just use the full set of
CFLAGS with hack.c.
Reviewed by: kib, arichardson
MFC after: 1 month
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D23362
Intel Speed Shift is Intel's technology to control frequency in hardware,
with hints from software.
Let's get a working version of this in the tree and we can refine it from
here.
Submitted by: bwidawsk, scottph
Reviewed by: bcr (manpages), myself
Discussed with: jhb, kib (earlier versions)
With feedback from: Greg V, gallatin, freebsdnewbie AT freenet.de
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D18028
Redirect (and temporal) route expiration was broken a while ago.
This change brings route expiration back, with unified IPv4/IPv6 handling code.
It introduces net.inet.icmp.redirtimeout sysctl, allowing to set
an expiration time for redirected routes. It defaults to 10 minutes,
analogues with net.inet6.icmp6.redirtimeout.
Implementation uses separate file, route_temporal.c, as route.c is already
bloated with tons of different functions.
Internally, expiration is implemented as an per-rnh callout scheduled when
route with non-zero rt_expire time is added or rt_expire is changed.
It does not add any overhead when no temporal routes are present.
Callout traverses entire routing tree under wlock, scheduling expired routes
for deletion and calculating the next time it needs to be run. The rationale
for such implemention is the following: typically workloads requiring large
amount of routes have redirects turned off already, while the systems with
small amount of routes will not inhibit large overhead during tree traversal.
This changes also fixes netstat -rn display of route expiration time, which
has been broken since the conversion from kread() to sysctl.
Reviewed by: bz
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D23075
The main objective here is to make it easy to identify what needs to change
in order to use a different sysent generator than the current Lua-based one,
which may be used to MFC some of the changes that have happened so we can
avoid parallel accidents in stable branches, for instance.
As a secondary objective, it's now feasible to override the generator on a
per-Makefile basis if needed, so that one could refactor their Makefile to
use this while pinning generation to the legacy makesyscalls.sh. I don't
anticipate any consistent need for such a thing, but it's low-effort to
achieve.
Summary:
The CPLD is the communications medium between the CPU and the XMOS
"Xena" event coprocessor. It provides a mailbox communication feature,
along with dual-port RAM to be used between the CPU and XMOS. Also, it
provides basic board stats as well, such as PCIe presence, JTAG signals,
and CPU fan speed reporting (in revolutions per second). Only fan speed
reading is handled, as a sysctl.
Reviewed by: bdragon
Differential Revision: https://reviews.freebsd.org/D23136
r355473 vastly improved the readability and cleanliness of these Makefiles.
Every single one of them follows the same pattern and duplicates the exact
same logic.
Now that we have GENERATED/SRCS, split SRCS up into the two parameters we'll
use for ${MAKESYSCALLS} rather than assuming a specific ordering of SRCS and
include a common sysent.mk to handle the rest. This makes it less tedious to
make sweeping changes.
Some default values are provided for GENERATED/SYSENT_*; almost all of these
just use a 'syscalls.master' and 'syscalls.conf' in cwd, and they all use
effectively the same filenames with an arbitrary prefix. Most ABIs will be
able to get away with just setting GENERATED_PREFIX and including
^/sys/conf/sysent.mk, while others only need light additions. kern/Makefile
is the notable exception, as it doesn't take a SYSENT_CONF and the generated
files are spread out between ^/sys/kern and ^/sys/sys, but it otherwise fits
the pattern enough to use the common version.
Reviewed by: brooks, imp
Nice!: emaste
Differential Revision: https://reviews.freebsd.org/D23197
The utility here seems somewhat limited, but clang will attempt to generate
.eh_frame and actively fail in doing so. It is perhaps worth investigating
why it's being generated in the first place (GCC doesn't do so), but this
isn't a high priority.
lld on RISC-V is not yet able to handle undefined weak symbols for
non-PIC code in the code model (medany/medium) used by the RISC-V
kernel.
Both GCC and clang emit an auipc / addi pair of instructions to
generate an address relative to the current PC with a 31-bit offset.
Undefined weak symbols need to have an address of 0, but the kernel
runs with PC values much greater than 2^31, so there is no way to
construct a NULL pointer as a PC-relative value. The bfd linker
rewrites the instruction pair to use lui / addi with values of 0 to
force a NULL pointer address. (There are similar cases for 'ld'
becoming auipc / ld that bfd rewrites to lui / ld with an address of
0.)
To work around this, compile the kernel with -fPIE when using lld.
This does not make the kernel position-independent, but it does
force the compiler to indirect address lookups through GOT entries
(so auipc / ld against a GOT entry to fetch the address). This
adds extra memory indirections for global symbols, so should be
disabled once lld is finally fixed.
A few 'la' instructions in locore that depend on PC-relative
addressing to load physical addresses before paging is enabled have to
use auipc / addi and not indirect via GOT entries, so change those to
use 'lla' which always uses auipc / addi for both PIC and non-PIC.
Submitted by: jrtc27
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D23064
more consistent with other NUMA features as UMA_ZONE_FIRSTTOUCH and
UMA_ZONE_ROUNDROBIN. The system will now pick a select a default depending
on kernel configuration. API users need only specify one if they want to
override the default.
Remove the UMA_XDOMAIN and UMA_FIRSTTOUCH kernel options and key only off
of NUMA. XDOMAIN is now fast enough in all cases to enable whenever NUMA
is.
Reviewed by: markj
Discussed with: rlibby
Differential Revision: https://reviews.freebsd.org/D22831
This is a lock-based emulation of 64-bit atomics for kernel use, split off
from an earlier patch by jhibbits.
This is needed to unblock future improvements that reduce the need for
locking on 64-bit platforms by using atomic updates.
The implementation allows for future integration with userland atomic64,
but as that implies going through sysarch for every use, the current
status quo of userland doing its own locking may be for the best.
Submitted by: jhibbits (original patch), kevans (mips bits)
Reviewed by: jhibbits, jeff, kevans
Differential Revision: https://reviews.freebsd.org/D22976
The fdt attachment for this heavily relies on extres for clk work. This
unbreaks the build for mips XLPN32/XLP, which have pci/fdt but no need for
this fdt attachment.
An i2c bus can be divided into segments which can be selectively connected
and disconnected from the main bus. This is usually done to enable using
multiple slave devices having the same address, by isolating the devices
onto separate bus segments, only one of which is connected to the main bus
at once.
There are several types of i2c bus muxes, which break down into two general
categories...
- Muxes which are themselves i2c slaves. These devices respond to i2c
commands on their upstream bus, and based on those commands, connect
various downstream buses to the upstream. In newbus terms, they are both
a child of an iicbus and the parent of one or more iicbus instances.
- Muxes which are not i2c devices themselves. Such devices are part of the
i2c bus electrically, but in newbus terms their parent is some other
bus. The association with the upstream bus must be established by
separate metadata (such as FDT data).
In both cases, the mux driver has one or more iicbus child instances
representing the downstream buses. The mux driver implements the iicbus_if
interface, as if it were an iichb host bridge/i2c controller driver. It
services the IO requests sent to it by forwarding them to the iicbus
instance representing the upstream bus, after electrically connecting the
upstream bus to the downstream bus that hosts the i2c slave device which
made the IO request.
The net effect is automatic mux switching which is transparent to slaves on
the downstream buses. They just do i2c IO they way they normally do, and the
bus is electrically connected for the duration of the IO and then idled when
it is complete.
The existing iicbus_if callback() method is enhanced so that the parameter
passed to it can be a struct which contains a device_t for the requesting
bus and slave devices. This change is done by adding a flag that indicates
the extra values are present, and making the flags field the first field of
a new args struct. If the flag is set, the iichb or mux driver can recast
the pointer-to-flags into a pointer-to-struct and access the extra
fields. Thus abi compatibility with older drivers is retained (but a mux
cannot exist on the bus with the older iicbus driver in use.)
A new set of core support routines exists in iicbus.c. This code will help
implement mux drivers for any type of mux hardware by supplying all the
boilerplate code that forwards IO requests upstream. It also has code for
parsing metadata and instantiating the child iicbus instances based on it.
Two new hardware mux drivers are added. The ltc430x driver supports the
LTC4305/4306 mux chips which are controlled via i2c commands. The
iic_gpiomux driver supports any mux hardware which is controlled by
manipulating the state of one or more gpio pins. Test Plan
Tested locally using a variety of mux'd bus configurations involving both
ltc4305 and a homebrew gpio-controlled mux. Tested configurations included
cascaded muxes (unlikely in the real world, but useful to prove that 'it all
just works' in terms of the automatic switching and upstream forwarding of
IO requests).
This brings arm into line with how every other arch does it. For some
reason, only arm lacked a definition of a symbol named kernbase in its
locore.S file(s) for use in its ldscript.arm file. Needlessly different
means harder to maintain.
Using a common symbol name also eases work in progress on a script to help
generate arm and arm64 kernels packaged in various ways (like with a header
blob needed for a bootloader prepended to the kernel file).
symbols from the linked kernel.
The main thrust of this change is to generate a kernel that has the arm
"marker" symbols stripped. Marker symbols start with $a, $d, $t or $x, and
are emitted by the compiler to tell other toolchain components about the
locations of data embedded in the instruction stream (literal-pool
stuff). They are used for generating mixed-endian binaries (which we don't
support). The linked kernel has approximately 21,000 such symbols in it,
wasting space (500K in kernel.full, 190K in the final linked kernel), and
sometimes obscuring function names in stack tracebacks.
This change also simplifies the way the kernel is linked. Instead of using
sed to generate two different ldscript files to generate both an elf kernel
and a binary (elf headers stripped) kernel, we now use a single ldscript
that refers to a "text_start" symbol, and we provide the value for that
symbol using --defsym on the linker command line.
This driver configure the registers in the GRF according to the value
of the regulators for the platform.
Some IP can run with either 3.0V or 1.8V, if we don't configure them
correctly according to the external voltage used they will not work.
It's only done at boot time for now and might be needed at runtime for
IP like sdmmc.
Reviewed by: mmel
Tested On: RockPro64, Firefly-RK3399 (gonzo), AIO-3288 (mmel)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D22854
* Fix a couple of format errors.
* Add some extra compiler flags needed to force clang to build SPE code.
(These are temporary until the target triple is fixed)
To improve reliability of kernel modules after the clang switch, switch to
-fPIC when building for now.
This bypasses some limitations to the way clang and LLD handle relocations,
and is a more robustly tested compilation regime than the
"static shared object" mode that we were previously attempting to convince
the compiler stack to use.
The kernel linker was recently augmented to be able to handle this mode.
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D22798
other files.
Arm and mips systems need to replace the SYSTEM_LD variable because they
need to create intermediate files which are post-processed with objcopy to
create the final .TARGET file. Previously they did so by pasting the full
expansion of SYSTEM_LD with the output filename replaced. This means
changing SYSTEM_LD in kern.pre.mk means you need to chase down anything that
replaces it and figure out how it differs so you can paste your changes in
there too.
Now there is a SYSTEM_LD_BASECMD variable that holds the entire basic kernel
linker command without the input and output files. This will allow arm and
mips makefiles to create their custom versions by refering to
SYSTEM_LD_BASECMD, which then becomes the one place where you have to make
changes to the basic linker command args.
Differential Revision: https://reviews.freebsd.org/D22921
SYSTEM_LD variable. This avoids duplicating the contents of SYSTEM_LD
from kern.pre.mk just to add the -N flag to it. If the basic linker command
ever needs to be changed, this will be one less place that has to be found
and fixed.
Some testing by kp@ indicates that the -N flag may not be needed at all,
so a comment to that effect is also added, and the -N flag may be removed
in a followup commit.
Differential Revision: https://reviews.freebsd.org/D22920
The VM generation counter is a 128-bit value exposed by the BIOS via ACPI.
The value changes to another unique identifier whenever a VM is duplicated.
Additionally, ACPI provides notification events when such events occur.
The driver decodes the pointer to the UUID, exports the value to userspace
via OPAQUE sysctl blob, and forwards the ACPI notifications in the form of
an EVENTHANDLER invocation as well as userspace devctl events.
See design paper: https://go.microsoft.com/fwlink/p/?LinkID=260709
This is lame, but it's what we already do for the clang build. We take
misaligned pointers into network header structures in many places.
Reviewed by: ian
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D22876
The OpenCores I2C IP core can be found on any bus. Split out the PCI
bus specifics into their own file, only compiled on systems with PCI.
Reviewed by: kp
Sponsored by: Axiado
We use armv7/GENERIC for the RPI2 images. The original RPI2 is actually a
32-bit BCM2836, but v1.2 was upgraded to the 64-bit BCM2837. The project
continues to provide the RPI2 image as armv7, as it's the lowest common
denominator of the two. Historically, we've just kind of implicitly
acknowledged this by including some bcm2837 bits on a SOC_BCM2836 kernel
config -- this worked until r354875 added code that actually cared.
Acknowledge formally that BCM2837 is valid in arm32.
This name is inconsistent with the other BCM* SOC on !arm64 for two reasons:
1. It's a pre-existing option on arm64, and
2. the naming convention on arm/ should've arguably changed to include BRCM
#1 seems to be a convincing enough argument to maintain the existing name
for it.
array with a singleton.
Also, pccbb isa attachment is never going to happen, do disconnect it from the
build (will delete this in future commit). It would need to be updated as well,
but since this code is effectively dead code, remove it from the build instead.
Unfortunately, there are some limitations:
- memory aperture of his controller is only 16MiB, so it is nearly
unusable for graphic cards
- every attempt to generate type 1 config cycle always causes trap.
These config cycles are disabled now and we don't support cards
with PCIe switch.
- in some cases, attempt to do config cycle to (probably) not-yet ready
card also causes trap. This cannot be detected at runtime, but it seems
like very rare issue.
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D22724
We used to include the hisi version if soc_hisi_hi6220 was present,
include the altera version if dwmmc_altera was present and include
the rockchip version if soc_rockchip_rk3328 was present.
Now every version have it's own device directive.
The rockchip version isn't named dwmmc_rockchip because all other
rockchip driver are named rk_XXX.
MFC after: 1 month
These were obtained from the Chelsio Unified Wire v3.12.0.1 beta
release.
Note that the firmwares are not uuencoded any more.
MFH: 1 month
Sponsored by: Chelsio Communications
This change makes it possible to use OPAL console as a GDB debug port.
Similar to uart and uart_phyp debug ports, it has to be enabled by
setting the hw.uart.dbgport variable to the serial console node
of the device tree.
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D22649
Summary:
There's no need to use the fallback fls() and flsl() libkern functions
when the PowerISA includes instructions that already do the bulk of the
work. Take advantage of this through the GCC builtins __builtin_clz()
and __builtin_clzl().
Reviewed by: luporl
Differential Revision: https://reviews.freebsd.org/D22340
In some cases, like is locked bootstrap or device's inability to boot from
removable media, we cannot use standard boot sequence and is necessary to
boot kernel directly from U-Boot.
Discussed with: jhibbits
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D13861
Uses two GPIO pins as MDC (clock) and MDIO (bidirectional I/O), relies
on mii_bitbang.
Tested on SG-3200 where the PHY for one of the ports is wired independently
of the SoC MDIO bus.
Sponsored by: Rubicon Communications, LLC (Netgate)
ConnectX-6 DX.
Currently TLS v1.2 and v1.3 with AES 128/256 crypto over TCP/IP (v4
and v6) is supported.
A per PCI device UMA zone is used to manage the memory of the send
tags. To optimize performance some crypto contexts may be cached by
the UMA zone, until the UMA zone finishes the memory of the given send
tag.
An asynchronous task is used manage setup of the send tags towards the
firmware. Most importantly setting the AES 128/256 bit pre-shared keys
for the crypto context.
Updating the state of the AES crypto engine and encrypting data, is
all done in the fast path. Each send tag tracks the TCP sequence
number in order to detect non-contiguous blocks of data, which may
require a dump of prior unencrypted data, to restore the crypto state
prior to wire transmission.
Statistics counters have been added to count the amount of TLS data
transmitted in total, and the amount of TLS data which has been dumped
prior to transmission. When non-contiguous TCP sequence numbers are
detected, the software needs to dump the beginning of the current TLS
record up until the point of retransmission. All TLS counters utilize
the counter(9) API.
In order to enable hardware TLS offload the following sysctls must be set:
kern.ipc.mb_use_ext_pgs=1
kern.ipc.tls.ifnet.permitted=1
kern.ipc.tls.enable=1
Sponsored by: Mellanox Technologies
Interrupt based driver, implements SPI mode and clock configuration.
Tested on espressobin and SG-3200.
Sponsored by: Rubicon Communications, LLC (Netgate)
When the linker doesn't have this feature, add -mno-relax to CFLAGS
on RISC-V.
Define the feature for ld.bfd, but not lld. If lld gains relaxation
support in a newer version, we can enable it for those versions of lld
in bsd.linker.mk.
Reviewed by: mhorne
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22659
The hardware offload is primarily targeted for TLS v1.2 and v1.3,
using AES 128/256 bit pre-shared keys. This patch adds all the needed
hardware structures, capabilites and firmware commands.
Sponsored by: Mellanox Technologies
This controller is a bit tricky as the STOP condition must be indicated in
the last tranferred byte, some devices will not like the repeated start
behavior of this controller. A proper fix to this issue is in the works.
This driver works in polling mode, can be used early in the boot (required
in some cases).
Tested on espressobin/SG-1100 and the SG-3200.
Obtained from: pfSense
Sponsored by: Rubicon Communications, LLC (Netgate)
This makes it possible to retrieve per-connection statistical
information such as the receive window size, RTT, or goodput,
using a newly added TCP_STATS getsockopt(3) option, and extract
them using the stats_voistat_fetch(3) API.
See the net/tcprtt port for an example consumer of this API.
Compared to the existing TCP_INFO system, the main differences
are that this mechanism is easy to extend without breaking ABI,
and provides statistical information instead of raw "snapshots"
of values at a given point in time. stats(3) is more generic
and can be used in both userland and the kernel.
Reviewed by: thj
Tested by: thj
Obtained from: Netflix
Relnotes: yes
Sponsored by: Klara Inc, Netflix
Differential Revision: https://reviews.freebsd.org/D20655
After discussing with mmel@, it was clear this is insufficient to address
all the needs. mmel@ will commit his original patch, from
https://reviews.freebsd.org/D13861, and the additions needed from r354714
will be made afterward.
Requested by: mmel
Sponsored by: Juniper Networks, Inc.
r354290 removed arm.arm from universe, but arm.arm kernels were still
found and built during the kernel stage. r354934 tagged armv5 kernel
configs as NO_UNIVERSE, but LINT-V5 remained. Stop building it as well.
Leave the clean rule in place for now so folks don't end up with a stale
LINT-V5.
Reviewed by: imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D22560
This change makes it possible to use a POWER Hypervisor virtual
terminal device (phyp vty) as a GDB debug port.
Similar to the uart debug port, it has to be enabled by setting
the hw.uart_phyp.dbgport variable to the vty node of the device
tree.
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D22205
This adds support for ifnet (NIC) KTLS using Chelsio T6 adapters.
Unlike the TOE-based KTLS in r353328, NIC TLS works with non-TOE
connections.
NIC KTLS on T6 is not able to use the normal TSO (LSO) path to segment
the encrypted TLS frames output by the crypto engine. Instead, the
TOE is placed into a special setup to permit "dummy" connections to be
associated with regular sockets using KTLS. This permits using the
TOE to segment the encrypted TLS records. However, this approach does
have some limitations:
1) Regular TOE sockets cannot be used when the TOE is in this special
mode. One can use either TOE and TOE-based KTLS or NIC KTLS, but
not both at the same time.
2) In NIC KTLS mode, the TOE is only able to accept a per-connection
timestamp offset that varies in the upper 4 bits. Put another way,
only connections whose timestamp offset has the 28 lower bits
cleared can use NIC KTLS and generate correct timestamps. The
driver will refuse to enable NIC KTLS on connections with a
timestamp offset with any of the lower 28 bits set. To use NIC
KTLS, users can either disable TCP timestamps by setting the
net.inet.tcp.rfc1323 sysctl to 0, or apply a local patch to the
tcp_new_ts_offset() function to clear the lower 28 bits of the
generated offset.
3) Because the TCP segmentation relies on fields mirrored in a TCB in
the TOE, not all fields in a TCP packet can be sent in the TCP
segments generated from a TLS record. Specifically, for packets
containing TCP options other than timestamps, the driver will
inject an "empty" TCP packet holding the requested options (e.g. a
SACK scoreboard) along with the segments from the TLS record.
These empty TCP packets are counted by the
dev.cc.N.txq.M.kern_tls_options sysctls.
Unlike TOE TLS which is able to buffer encrypted TLS records in
on-card memory to handle retransmits, NIC KTLS must re-encrypt TLS
records for retransmit requests as well as non-retransmit requests
that do not include the start of a TLS record but do include the
trailer. The T6 NIC KTLS code tries to optimize some of the cases for
requests to transmit partial TLS records. In particular it attempts
to minimize sending "waste" bytes that have to be given as input to
the crypto engine but are not needed on the wire to satisfy mbufs sent
from the TCP stack down to the driver.
TCP packets for TLS requests are broken down into the following
classes (with associated counters):
- Mbufs that send an entire TLS record in full do not have any waste
bytes (dev.cc.N.txq.M.kern_tls_full).
- Mbufs that send a short TLS record that ends before the end of the
trailer (dev.cc.N.txq.M.kern_tls_short). For sockets using AES-CBC,
the encryption must always start at the beginning, so if the mbuf
starts at an offset into the TLS record, the offset bytes will be
"waste" bytes. For sockets using AES-GCM, the encryption can start
at the 16 byte block before the starting offset capping the waste at
15 bytes.
- Mbufs that send a partial TLS record that has a non-zero starting
offset but ends at the end of the trailer
(dev.cc.N.txq.M.kern_tls_partial). In order to compute the
authentication hash stored in the trailer, the entire TLS record
must be sent as input to the crypto engine, so the bytes before the
offset are always "waste" bytes.
In addition, other per-txq sysctls are provided:
- dev.cc.N.txq.M.kern_tls_cbc: Count of sockets sent via this txq
using AES-CBC.
- dev.cc.N.txq.M.kern_tls_gcm: Count of sockets sent via this txq
using AES-GCM.
- dev.cc.N.txq.M.kern_tls_fin: Count of empty FIN-only packets sent to
compensate for the TOE engine not being able to set FIN on the last
segment of a TLS record if the TLS record mbuf had FIN set.
- dev.cc.N.txq.M.kern_tls_records: Count of TLS records sent via this
txq including full, short, and partial records.
- dev.cc.N.txq.M.kern_tls_octets: Count of non-waste bytes (TLS header
and payload) sent for TLS record requests.
- dev.cc.N.txq.M.kern_tls_waste: Count of waste bytes sent for TLS
record requests.
To enable NIC KTLS with T6, set the following tunables prior to
loading the cxgbe(4) driver:
hw.cxgbe.config_file=kern_tls
hw.cxgbe.kern_tls=1
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D21962
This Makefile sets KERN_OPTS. This permits kernel module Makefiles to
use KERN_OPTS to control the value of variables such as SRCS that are
used by bsd.kmod.mk for KERN_OPTS values that honor WITH/WITHOUT
options for standalone builds.
Update the NetBSD Kernel Concurrency Sanitizer (KCSAN) runtime to work in
the FreeBSD kernel. It is a useful tool for finding data races between
threads executing on different CPUs.
This can be enabled by enabling KCSAN in the kernel config, or by using the
GENERIC-KCSAN amd64 kernel. It works on amd64 and arm64, however the later
needs a compiler change to allow -fsanitize=thread that KCSAN uses.
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D22315
We could maintain the static conversions for the !AArch64 Raspberry Pis, but
I'm not sure it's worth it -- we'll traverse the platform list exactly once
(of which there are only two for armv7), then every conversion there-after
traverses the memory map listing of which there are at-most two entries for
these boards: sdram and peripheral space.
Detecting this at runtime is necessary for the AArch64 SOC, though, because
of the distinct IO windows being otherwise not discernible just from support
compiled into the kernel. We currently select the correct window based on
/compatible in the FDT.
We also use a similar mechanism to describe the DMA restrictions- the RPi 4
can have up to 4GB of RAM while the DMA controller and mailbox mechanism can
technically, kind of, only access the lowest 1GB. See the comment in
bcm2835_vcbus.h for a fun description/clarification of this.
Differential Revision: https://reviews.freebsd.org/D22301
It's still disabled by default, but now it can be enabled with config(5) and
it will be build in LINT.
Reviewed by: imp
MFC after: 1 week
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D22383
The Supervisor Binary Interface (SBI) specification v0.2 is a backwards
incompatible update to the SBI call interface for kernels running in
supervisor mode. The goal of this update was to make it easier for new
and optional functionality to be added to the SBI.
SBI functions are now called by passing an "extension ID" and a
"function ID" which are passed in a7 and a6 respectively. SBI calls
will also return an error and value in the following struct:
struct sbi_ret {
long error;
long value;
}
This version introduces several new functions under the "base"
extension. It is expected that all SBI implementations >= 0.2 will
support this base set of functions, as they implement some essential
services such as obtaining the SBI version, CPU implementation info, and
extension probing.
Existing SBI functions have been designated as "legacy". For the time
being they will remain implemented, but it is expected that in the
future their functionality will be duplicated or replaced by new SBI
extensions. Each legacy function has been assigned its own extension ID,
and for now we simply probe and assert for their existence.
Compatibility with legacy SBI implementations (such as BBL) is
maintained by checking the output of sbi_get_spec_version(). This
function is guaranteed to succeed by the new spec, but will return an
error in legacy implementations. We use this as an indicator of whether
or not we can rely on the new SBI base extensions.
For further info on the Supervisor Binary Interface, see:
https://github.com/riscv/riscv-sbi-doc/blob/master/riscv-sbi.adoc
Reviewed by: kp, jhb
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D22326
This driver allows to usage of the paravirt SCSI controller
in VMware products like ESXi. The pvscsi driver provides a
substantial performance improvement in block devices versus
the emulated mpt and mps SCSI/SAS controllers.
Error handling in this driver has not been extensively tested
yet.
Submitted by: vbhakta@vmware.com
Relnotes: yes
Sponsored by: VMware, Panzura
Differential Revision: D18613
Summary:
Boot arm64 kernel using booti command from U-boot. booti can relocate initrd
image into higher ram addresses, therefore align the initrd load address to 1GiB
and create VA = PA map for it. Create L2 pagetable entries to copy the initrd
image into KVA.
(parts of the code in https://reviews.freebsd.org/D13861 was referred and used
as appropriate)
Submitted by: Siddharth Tuli <siddharthtuli_gmail.com>
Reviewed by: manu
Sponsored by: Juniper Networks, Inc
Differential Revision: https://reviews.freebsd.org/D22255
Fix wrong section ordering that was causing a ".got is not contiguous with
other relro sections" lld error. This also brings ldscript.powerpc and
ldscript.powerpcspe closer to ldscript.powerpc64.
Also, remove unnecessary text relocs from the ppc32 AIM trap code.
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D22349
- add support for fractional dividers
- allow to declare fixed and linked clock
MFC after: 3 weeks
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D22282
This kind of clock nodes represent temporary placeholder for clocks
defined later in boot process. Also, these are necessary to break
circular dependencies occasionally occurring in complex clock graphs.
MFC after: 3 weeks
These files already have 'device' lines that they require; adding a
dependency on SOC_* options is an extra restriction that adds extra
verbosity when future supported Broadcom-based SOC will also feature the
same compatible device.
Users wishing to not compile these devices in should remove the 'device'
lines from their config.
In the past, we would add symbolic links for MACHINE_CPUARCH when it differed
from MACHINE. This was for pc98 only, however. All other architectures didn't
need this and it was really due to pc98 pulling from i386 rather than something
more intrinsic. At the time, we had the split we did to mimic what NetBSD did
for its 68k ports where many different kernels were possible for the same
architecture. Since then, both projects have moved away from this convention to
having a more generic MACHINE for each architecture. FreeBSD's new arm64/aarch64
breaks this old notion and so was an exception to the rule. So, we no longer
need to create this link for any old machine or any new machine, delete it
entirely.
Differential Revision: https://reviews.freebsd.org/D22246
This involved several changes:
* Since lld does not like text relocations, replace SMP boot page text relocs
in booke/locore.S with position-independent math, and track the virtual base
in the SMP boot page header.
* As some SPRs are interpreted differently on clang due to the way it handles
platform-specific SPRs, switch m*dear and m*esr mnemonics out for regular
m*spr. Add both forms of SPR_DEAR to spr.h so the correct encoding is selected.
* Change some hardcoded 32 bit things in the boot page to be pointer-sized, and
fix alignment.
* Fix 64-bit build of booke/pmap.c when enabling pmap debugging.
Additionally, I took the opportunity to document how the SMP boot page works.
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D21999
r351049 bogusly deleted these lines from files.amd64 but failed to add them to
files.x86. Since this works on i386, add them to files.x86 rather than just
adding them back to files.amd64.
PR: 240734
Reported by: Michael Pro
The debug monitor register state is now stored in a struct and updated
when required. Currently there is only a kernel state, however a
per-process state will be added in a future change.
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D22128