Summary:
kexec-lite cannot currently handle multiple PT_LOAD segments. In some
cases the compiler generates multiple PT_LOAD segments for an unknown
reason, causing boot to fail from kexec-lite.
Submitted by: Brandon Bergren (older version)
Differential Revision: https://reviews.freebsd.org/D19574
We were doing so as a workaround for the problem addressed by r345593, so
it's no longer necessary.
Reviewed by: jhb
Discussed with: emaste
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19705
This allows for directives such as
makeoptions DTS+=/out/of/tree/myboard.dts
# in tree! Same rules applied as if this were in a dtb/ module
makeoptions DTS+=otherboard.dts
to be specified in config(5) and have these built/installed alongside th
kernel. The assumption that overlays live in an overlays/ directory is only
made for in-tree DTSO, but we still make the assumption that out-of-tree
arm64 DTS will be in vendored directories (for now).
This lowers the cost to hacking on an overlay or dts by being able to
quickly throw it in a custom config, especially if it doesn't fit one of the
current dtb/modules quite appropriately or it's not intended for commit
there.
The build/install targets were split out of dtb.mk to centralize the build
logic and leave out the all/realinstall/CLEANFILES additions... it was
believed that we didn't want to pollute the kernel build with these.
The build rules were converted to suffix rules at the suggestion of Ian to
clean things up a little bit in a world where we can have mixed
in-tree/out-of-tree DTS/DTSO specified.
Reviewed by: ian
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D19351
TMPFS_PAGES_MINRESERVED controls how much memory is reserved for the system
and not used by tmpfs.
On very small memory systems, the default value may be too high and this
prevents these small memory systems from using reroot, which is required
for them to install firmware updates.
Submitted by: Hiroki Mori <yamori813@yahoo.co.jp>
Reviewed by: mizhka
Differential Revision: https://reviews.freebsd.org/D13583
While geom_flashmap has always supported label names for its slices, it does
so by appending "s.labelname" to the provider device name, meaning you still
have to know the name and unit of the hardware device to use the labels.
These changes add support for device-independent geom_flashmap labels, using
the standard geom_label infrastructure. geom_flashmap now creates a softc
struct attached to its geom, and as it creates slices it stores the label
into an array in the softc. The new geom_label_flashmap uses those labels
when tasting a geom_flashmap provider.
Differential Revision: https://reviews.freebsd.org/D19535
TPM has a built-in RNG, with its own entropy source.
The driver was extended to harvest 16 random bytes from TPM every 10 seconds.
A new build option "TPM_HARVEST" was introduced - for now, however, it
is not enabled by default in the GENERIC config.
Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: markm, delphij
Approved by: secteam
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D19620
Attempting to build www/firefox on POWER9 resulted in a HMI exception being
thrown, a fatal trap currently. This is typically caused by timer facility
errors, but examination of the Hypervisor Maintenance Exception Register
(HMER) yielded only that an exception had recovered, with no information of
the actual exception cause.
When an HMI occurs, OPAL_HANDLE_HMI or OPAL_HANDLE_HMI2 must be called to
handle the exception at the firmware level. If the exception is handled, we
can continue.
This adds only the preliminary handler, enough to prevent package building
from panicking. An enhancement in the future is to use the flags returned
by OPAL_HANDLE_HMI2 to print more useful error messages, and log maintenance
events.
Reviewed by: luporl
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D19634
r345402 fixed the bug that led to the split of the ISA 3.0 HPT handling from
the existing manager. The cause of the bug was gcc moving the register
holding VPN to a different register (not r0), which triggered bizarre
behaviors. With the fix, things work, so they can be re-merged. No
performance lost with the merge.
This ensures files like genassym.o and awk/mfiles are generated before
descending into the modules build. It may also allow some module builds
to not recreate files that are already present in the KERNBUILDDIR.
This fixes a rare build race where genassym.o is missing and assym.inc
is empty.
More work is planned around this to reduce some redundant dependency
generation in modules.
PR: 233339
MFC after: 2 weeks
Reported by: markj
This makes it more consistent with other filesystems, which all end in "fs",
and more consistent with its mount helper, which is already named
"mount_fusefs".
Reviewed by: cem, rgrimes
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19649
The kernel build uses symlinks to make MD #includes like <machine/pcpu.h>
work. Debug info ends up referencing these symlinks in a relative path,
so debuggers generally don't know how to find the corresponding headers.
Address this by using -fdebug-prefix-map to map relative paths through
the symlinks to their absolute paths in the source tree. This is
consistent with how regular source file paths are defined in the
kernel's debug info.
Also map the current directory to an absolute path to the object
directory. This gives debuggers a chance to find auto-generated files
like vnode_if.c if the object directory is available.
Reviewed by: emaste, jhb (previous version)
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19633
Update NAT64LSN implementation:
o most of data structures and relations were modified to be able support
large number of translation states. Now each supported protocol can
use full ports range. Ports groups now are belongs to IPv4 alias
addresses, not hosts. Each ports group can keep several states chunks.
This is controlled with new `states_chunks` config option. States
chunks allow to have several translation states for single alias address
and port, but for different destination addresses.
o by default all hash tables now use jenkins hash.
o ConcurrencyKit and epoch(9) is used to make NAT64LSN lockless on fast path.
o one NAT64LSN instance now can be used to handle several IPv6 prefixes,
special prefix "::" value should be used for this purpose when instance
is created.
o due to modified internal data structures relations, the socket opcode
that does states listing was changed.
Obtained from: Yandex LLC
MFC after: 1 month
Sponsored by: Yandex LLC
o most of data structures and relations were modified to be able support
large number of translation states. Now each supported protocol can
use full ports range. Ports groups now are belongs to IPv4 alias
addresses, not hosts. Each ports group can keep several states chunks.
This is controlled with new `states_chunks` config option. States
chunks allow to have several translation states for single alias address
and port, but for different destination addresses.
o by default all hash tables now use jenkins hash.
o ConcurrencyKit and epoch(9) is used to make NAT64LSN lockless on fast path.
o one NAT64LSN instance now can be used to handle several IPv6 prefixes,
special prefix "::" value should be used for this purpose when instance
is created.
o due to modified internal data structures relations, the socket opcode
that does states listing was changed.
Obtained from: Yandex LLC
MFC after: 1 month
Sponsored by: Yandex LLC
CLAT is customer-side translator that algorithmically translates 1:1
private IPv4 addresses to global IPv6 addresses, and vice versa.
It is implemented as part of ipfw_nat64 kernel module. When module
is loaded or compiled into the kernel, it registers "nat64clat" external
action. External action named instance can be created using `create`
command and then used in ipfw rules. The create command accepts two
IPv6 prefixes `plat_prefix` and `clat_prefix`. If plat_prefix is ommitted,
IPv6 NAT64 Well-Known prefix 64:ff9b::/96 will be used.
# ipfw nat64clat CLAT create clat_prefix SRC_PFX plat_prefix DST_PFX
# ipfw add nat64clat CLAT ip4 from IPv4_PFX to any out
# ipfw add nat64clat CLAT ip6 from DST_PFX to SRC_PFX in
Obtained from: Yandex LLC
Submitted by: Boris N. Lytochkin
MFC after: 1 month
Relnotes: yes
Sponsored by: Yandex LLC
Firmware needed by petitboot, for example, GPU firmware, can be installed to
a partition in the flash filesystem. This driver exposes the full flash
given by the device tree, letting the user manage firmware, etc, from
FreeBSD.
To use the partitions provided by the flash module, the fdt_slicer module is
needed, but the module isn't needed for raw access, so there's no direct
dependency link in here.
MFC after: 2 weeks
The OPAL firmware only supports a finite number of in-flight asynchronous
operations. Rather than have each subsystem try to manage its own, use a
central management service to hand out tokens.
More work can be done to improve asynchronous behavior, such as funneling
things through a future OPAL heartbeat handler, but capabilities will be
added as needed.
Augment the existing consumers (i2c and sensors) to use this new API.
MFC after: 4 weeks
Marvell XHCI is in fact generic-xhci, so move the driver and
add the compatible string.
While here, get and enable the phy if the dtb provide one.
The xhci bindings state that phys should be in a 'phys' property but
Marvell DTS uses 'usb-phy', only add support for 'usb-phy' for now.
Sponsored-by: Rubicon Communications, LCC ("Netgate")
This is a "fake" phy that handle regulator, clocks and reset gpio.
Only clock and regulator is supported for now.
Sponsored-by: Rubicon Communications, LCC ("Netgate")
Embedded lzma decompression library becomes a module usable by other
consumers, in addition to geom_uzip.
Most important code changes are
- removal of XZ_DEC_SINGLE define, we need the code to work
with XZ_DEC_DYNALLOC;
- xz_crc32_init() call is removed from geom_uzip, xz module handles
initialization on its own.
xz is no longer embedded into geom_uzip, instead the depend line for
the module is provided, and corresponding kernel option is added to
each MIPS kernel config file using geom_uzip.
The commit also carries unrelated cleanup by removing excess "device geom_uzip"
in places which were missed in r344479.
Reviewed by: cem, hselasky, ray, slavash (previous versions)
Sponsored by: Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D19266
MFC after: 3 weeks
add gcov support and export results as files in debugfs
Reviewed by: hps@
MFC after: 1 week
Sponsored by: iX Systems
Differential Revision: https://reviews.freebsd.org/D19260
this one should also work on amd64 and sparc64.
LINT was broken in r312910 with the removal of pc98 support, by changing
the pathname in UKBD_DFLT_KEYBAP from a removed pc98 file to a nonexistent
file.
There are many bugs nearby. Some are:
- the error is not properly detected and handled by make(1), because
kbdcontrol(8) exits with status 0 after failing to find the keymap file
- UKBD_DFLT_KEYBAP is supposed to be MI, and is in MI NOTES to try enforce
this, but 5 out of 8 arches don't support it
- LINT seems to have been broken by this in only 7 out of 8 arches. mips
breaks test coverage instead, by killing this option in its MD NOTES.
arm kills ukbd but that is not enough to configure an unsupported option
used only by ukbd.
Add or fix options to control static and dynamic configuration. Keep
the default of scteken, but default to statically configuring all available
emulators (now 3 instead of 1).
The dumb emulator is almost usable. libedit and libreadline handle
dumb terminals perfectly for at least shell history. less(1) works
as well as possible except on exit. But curses programs make messes.
The dumb emulator has strange color support, with 2 dumb colors for
normal output but fancy colorization for the cursor, mouse pointer and
(with a non-dumb initial emulator) for low-level console output.
Using the sc emulator instead of the default of scteken fixes at least
the following bugs:
- NUL is a printing character in cons25 but not in teken
- teken doesn't support fixed colors for "reverse" video.
- The best versions of sc are about 10 times faster than scteken (for
printing to the frame buffer). This version is only about 5 times
faster.
Fix configuration features:
- make SC_DFLT_TERM (for setting the initial emulator) a normal option.
Add configuration features:
- negative options SC_NO_TERM_* for omitting emulators in the static config.
Modules for emulators might work, but I don't know of any
- vidcontrol -e shows the available emulators
- vidcontrol -E <emulator> sets the active emulator.
is easier to configure. It is MI, unlike some of the other syscons files
already in the MI list.
Move scvtb.c similarly. It is needed whenever sc is configured, and is
more MI than most of the files already in the MI list.
This only changes the combined list for arm64 and mips. These arches
already cannot build sc or even NOTES.
The data structure implements non-intersecting intervals over the [0,
UINT64_MAX] range, and supports fast insert, predicated clearing of
subrange, and lookup of an interval containing the specified address.
Internally it is a pctrie over the interval start addresses.
Implementation provides additional guarantees over the structure state
in case of memory allocation failures.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D18893
Per discussions on arch@ and elsewhere, the maintenance of this code
has moved to the drm-kmod and drm-legacy-kmod ports. Remove the i915
and radeon drivers from the tree.
Approved by: graphics team
Reviewed by: manu@, mmel@
Differential Revision: https://reviews.freebsd.org/D19196
Remove support for compiling drm2 as a module. This has transitioned
to the drm-kmod or drm-legacy-kmodw ports.
Approved by: graphics team
Reviewed by: manu@, mmel@
Differential Revision: https://reviews.freebsd.org/D19196
Retire the drm modules / drivers. These are now handled by the
drm-legacy-kmod port and/or the drm-kmod port. All future
development and maintanace will be handled there.
Approved by: graphics team
Reviewed by: manu@, mmel@
Differential Revision: https://reviews.freebsd.org/D19196
Ensure __bss_start is associated with the next section
in case orphan sections are placed directly after .sdata,
as has been seen to happen with LLD.
Submitted by: "J.R.T. Clarke" <jrtc4@cam.ac.uk>
Differential Revision: https://reviews.freebsd.org/D18429
This adds the CBC-MAC code to the kernel, but does not hook it up to
anything (that comes in the next commit).
https://tools.ietf.org/html/rfc3610 describes the algorithm.
Note that this is a software-only implementation, which means it is
fairly slow.
Sponsored by: iXsystems Inc
Differential Revision: https://reviews.freebsd.org/D18592
Without this fix, the usage of kernel coverage would lockup the system.
Thanks to Andrew for suggesting the final form of the fix.
PR: 235611
Reviewed by: andrew@, emaste@
Differential Revision: https://reviews.freebsd.org/D19135
Make every rockchip file depend on the multiple soc_rockchip options
While here make rk_i2c and rk_gpio depend on their device options.
Reported by: sbruno
Add new file arm64/acpica/acpi_iort.c to support the "IO Remapping
Table" (IORT). The table is specified in ARM document "ARM DEN 0049D"
titled "IO Remapping Table Platform Design Document". The IORT table
has information on the associations between PCI root complexes, SMMU
blocks and GIC ITS blocks in the system.
The changes are to parse and save the information in the IORT table.
The API to use this information is added to sys/dev/acpica/acpivar.h.
The acpi_iort.c also has code to check the GIC ITS nodes seen in the
IORT table with corresponding entries in MADT table (for validity)
and with entries in SRAT table (for proximity information).
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D18002
The XIVE (External Interrupt Virtualization Engine) is a new interrupt
controller present in IBM's POWER9 processor. It's a very powerful,
very complex device using queues and shared memory to improve interrupt
dispatch performance in a virtualized environment.
This yields a ~10% performance improvment over the XICS emulation mode,
measured in both buildworld, and 'dd' from nvme to /dev/null.
Currently, this only supports native access.
MFC after: 1 month
iflib is already a module, but it is unconditionally compiled into the
kernel. There are drivers which do not need iflib(4), and there are
situations where somebody might not want iflib in kernel because of
using the corresponding driver as module.
Reviewed by: marius
Discussed with: erj
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D19041
Effectively all i386 kernels now have two pmaps compiled in: one
managing PAE pagetables, and another non-PAE. The implementation is
selected at cold time depending on the CPU features. The vm_paddr_t is
always 64bit now. As result, nx bit can be used on all capable CPUs.
Option PAE only affects the bus_addr_t: it is still 32bit for non-PAE
configs, for drivers compatibility. Kernel layout, esp. max kernel
address, low memory PDEs and max user address (same as trampoline
start) are now same for PAE and for non-PAE regardless of the type of
page tables used.
Non-PAE kernel (when using PAE pagetables) can handle physical memory
up to 24G now, larger memory requires re-tuning the KVA consumers and
instead the code caps the maximum at 24G. Unfortunately, a lot of
drivers do not use busdma(9) properly so by default even 4G barrier is
not easy. There are two tunables added: hw.above4g_allow and
hw.above24g_allow, the first one is kept enabled for now to evaluate
the status on HEAD, second is only for dev use.
i386 now creates three freelists if there is any memory above 4G, to
allow proper bounce pages allocation. Also, VM_KMEM_SIZE_SCALE changed
from 3 to 1.
The PAE_TABLES kernel config option is retired.
In collaboarion with: pho
Discussed with: emaste
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D18894
This will allow multiple consumers of the coverage data to be compiled
into the kernel together. The only requirement is only one can be
registered at a given point in time, however it is expected they will
only register when the coverage data is needed.
A new kernel conflig option COVERAGE is added. This will allow kcov to
become a module that can be loaded as needed, or compiled into the
kernel.
While here clean up the #include style a little.
Reviewed by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D18955
o In vm_pager_bufferinit() create pbuf_zone and start accounting on how many
pbufs are we going to have set.
In various subsystems that are going to utilize pbufs create private zones
via call to pbuf_zsecond_create(). The latter calls uma_zsecond_create(),
and sets a limit on created zone. After startup preallocate pbufs according
to requirements of all pbuf zones.
Subsystems that used to have a private limit with old allocator now have
private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS cluster, FFS,
swap, vnode pager.
The following subsystems use shared pbuf zone: cam(4), nvme(4), physio(9),
aio(4). They should have their private limits, but changing that is out of
scope of this commit.
o Fetch tunable value of kern.nswbuf from init_param2() and while here move
NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that was holding only
this option.
Default values aren't touched by this commit, but they probably should be
reviewed wrt to modern hardware.
This change removes a tight bottleneck from sendfile(2) operation, that
uses pbufs in vnode pager. Other pagers also would benefit from faster
allocation.
Together with: gallatin
Tested by: pho
When building with KCOV enabled the compiler will insert function calls
to probes allowing us to trace the execution of the kernel from userspace.
These probes are on function entry (trace-pc) and on comparison operations
(trace-cmp).
Userspace can enable the use of these probes on a single kernel thread with
an ioctl interface. It can allocate space for the probe with KIOSETBUFSIZE,
then mmap the allocated buffer and enable tracing with KIOENABLE, with the
trace mode being passed in as the int argument. When complete KIODISABLE
is used to disable tracing.
The first item in the buffer is the number of trace event that have
happened. Userspace can write 0 to this to reset the tracing, and is
expected to do so on first use.
The format of the buffer depends on the trace mode. When in PC tracing just
the return address of the probe is stored. Under comparison tracing the
comparison type, the two arguments, and the return address are traced. The
former method uses on entry per trace event, while the later uses 4. As
such they are incompatible so only a single mode may be enabled.
KCOV is expected to help fuzzing the kernel, and while in development has
already found a number of issues. It is required for the syzkaller system
call fuzzer [1]. Other kernel fuzzers could also make use of it, either
with the current interface, or by extending it with new modes.
A man page is currently being worked on and is expected to be committed
soon, however having the code in the kernel now is useful for other
developers to use.
[1] https://github.com/google/syzkaller
Submitted by: Mitchell Horne <mhorne063@gmail.com> (Earlier version)
Reviewed by: kib
Testing by: tuexen
Sponsored by: DARPA, AFRL
Sponsored by: The FreeBSD Foundation (Mitchell Horne)
Differential Revision: https://reviews.freebsd.org/D14599
newvers.sh takes upwards of 4-5 seconds to complete on trees checked
out from github, due to searching the entire history for non-existent
git-svn metadata. Similarly, if one does not check out notes, we
again search the entire history for notes. That makes newvers.sh very
slow for many github users.
To fix this in a fair way, limit the history search to the last 10K
commits: if you're more than 10K commits out of sync, then you've
forked the project, and our SVN rev is no longer very important to you.
Due to how git implements --grep in conjunction with -n, --grep has been
removed for performance reasons (git does not seem to limit its search
to the -n limit in this case, and takes just as long as it did with no
limit).
Reviewed by: emaste, imp
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D18745