Move framebuffer.{c,h} to sys/boot/efi/loader and add the efifb
related metadata and pass it to the kernel
Reviewed by: imp, andrew
Differential Revision: https://reviews.freebsd.org/D12757
- allocate value for new AT_HWCAP2 auxiliary vector on all platforms.
- expand 'struct sysentvec' by new 'u_long *sv_hwcap2', in exactly
same way as for AT_HWCAP.
MFC after: 1 month
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D12699
HEAD. Enable VIMAGE in GENERIC kernels and some others (where GENERIC does
not exist) on HEAD.
Disable building LINT-VIMAGE with VIMAGE being default.
This should give it a lot more exposure in the run-up to 12 to help
us evaluate whether to keep it on by default or not.
We are also hoping to get better performance testing.
The feature can be disabled using nooptions.
Requested by: many
Reviewed by: kristof, emaste, hiren
X-MFC after: never
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D12639
All of the kernel dump implementations keep track of the current offset
("dumplo") within the dump device. However, except for textdumps, they
all write the dump sequentially, so we can reduce code duplication by
having the MI code keep track of the current offset. The new
dump_append() API can be used to write at the current offset.
This is needed to implement support for kernel dump compression in the
MI kernel dump code.
Also simplify dump_encrypted_write() somewhat: use dump_write() instead
of duplicating its bounds checks, and get rid of the redundant offset
tracking.
Reviewed by: cem
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11722
mapping. This uses the new common code shared with amd64.
The RTC should only be accessed via EFI. There is no locking around it as
the spec only has this as a requirement for the PC-AT CMOS device.
Reviewed by: kib, imp
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D12595
kernel. We can register callbacks to perform the required operation on the
saved registers before returning.
This is initially used to work around a bug in old versions of QEMU that
trigger such an exception when reading from an ID register when it should
load z zero value.
I expect this could be used with other exception types, e.g. to emulate
special register access from userland.
Sponsored by: DARPA, AFRL
A new 'u_long *sv_hwcap' field is added to 'struct sysentvec'. A
process ABI can set this field to point to a value holding a mask of
architecture-specific CPU feature flags. If an ABI does not wish to
supply AT_HWCAP to processes the field can be left as NULL.
The support code for AT_EHDRFLAGS was already present on all systems,
just the #define was not present. This is a step towards unifying the
AT_* constants across platforms.
Reviewed by: kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D12290
Enabled driver can be used on boards equipped with Marvell Armada 3700 SoC.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12256
Enabled driver can be used on boards equipped with Marvell Armada
3700/7k/8k SoCs.
Submitted by: Patryk Duda <pdk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12253
values. As not all assemblers understand the new ID_AA64MMFR2_EL1 register
add a macro to access it. This seems to be safe for older CPUs to read this
new register, with them returning zero.
Sponsored by: DARPA, AFRL
Marvell Armada 80x0/70x0 SoC family uses same RTC IP as
Armada 38x. This patch adds necessary files and enable driver in
GENERIC config.
Submitted by: Rafal Kozik <rk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Semihalf
Differential Revision: https://reviews.freebsd.org/D12200
compatible string to check if the board is compatible with a given quirk.
It's possible this will be moved later, however as it's currently only used
by the MP code put it there.
So far the only instance of a quirk is when the list of CPUs may be
incorrect. This can happen on virtual machines with a hard coded
devicetree, but where the user may then set the number of CPUs as an
argument. This is the case on the ARM models so include the model specific
compat strings for these, including the spelling mistake found in some of
the OpenplatformPkg dtb files.
Sponsored by: DARPA, AFRL
The full system memory barrier around a TLB invalidation is stricter than
required. It needs to wait on accesses to main memory, with just the weaker
store variant before the invalidate. As such use the dsb istst, tlbi, dlb
ish sequence already used in pmap.
The tlbi instruction in this sequence is also unnecessarily using a
broadcast invalidate when it just needs to invalidate the local CPUs TLB.
Switch to a non-broadcast variant of this instruction.
Sponsored by: DARPA, AFRL
This helps simplify the code in kern_shutdown.c and reduces the number
of globally visible functions.
No functional change intended.
Reviewed by: cem, def
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11603
dump_start() and dump_finish() are responsible for writing kernel dump
headers, optionally writing the key when encryption is enabled, and
initializing the initial offset into the dump device.
Also remove the unused dump_pad(), and make some functions static now that
they're only called from kern_shutdown.c.
No functional change intended.
Reviewed by: cem, def
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11584
and sc_irq_length to the softc to handle the base number of IRQs available,
make gicv3_get_nirqs return the number of available interrupt IDs, and
limit which CPUs we send interrupts to based on the numa domain.
The last point is only strictly needed on a dual socket ThunderX where we
are unable to send MSI/MSI-X interrupts between sockets.
Sponsored by: DARPA, AFRL
Previously, debug exceptions were only enabled on the boot CPU if
DDB was enabled in the dbg_monitor_init() function. APs also called
this function, but since mp_machdep.c doesn't include opt_ddb.h, the
APs ended up calling an empty stub defined in <machine/debug_monitor.h>
instead of the real function. Also, if DDB was not enabled in the kernel,
the boot CPU would not enable debug exceptions.
Fix this by adding a new dbg_init() function that always clears the OS
lock to enable debug exceptions which the boot CPU and the APs call.
This function also calls dbg_monitor_init() to enable hardware breakpoints
from DDB on all CPUs if DDB is enabled. Eventually base support for
hardware breakpoints/watchpoints will need to move out of the DDB-only
debug_monitor.c for use by userland debuggers.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D12001
Only fetch the VFP state from the CPU if the thread whose registers are
being requested is the current thread. If a stopped thread's registers
are being fetched by a debugger, the saved state in the PCB is already
valid.
Reviewed by: andrew
MFC after: 1 week
handle cases where they can only run on a single domain.
To allow all devices access to this set we need to move reading the domain
earlier in the boot as it was previously handled in the CPU driver, however
this is too late for the GICv3 ITS driver.
Sponsored by: DARPA, AFRL
multiple ITS devices, however we only want a single ITS device to be
configured on each CPU. To fix this only enable ITS when the node matches
the CPUs node.
Sponsored by: DARPA, AFRL
used to support the dual package ThunderX where we need to send MSI/MSI-X
interrupts to the same package as the device the interrupt came from.
Sponsored by: DARPA, AFRL
to demote it to 512 pages, then remove each of these. We can just remove
the l2 map directly. This is what the intel pmaps already do.
Sponsored by: DARPA, AFRL
were unneeded as we tell the tlb the pagetables are in cached memory. This
gives us a small, but statistically significant improvement over just
removing the PTE_SYNC cases.
While here remove PTE_SYNC, it's now unneeded.
Sponsored by: DARPA, AFRL
--Remove special-case handling of sparc64 bus_dmamap* functions.
Replace with a more generic mechanism that allows MD busdma
implementations to generate inline mapping functions by
defining WANT_INLINE_DMAMAP in <machine/bus_dma.h>. This
is currently useful for sparc64, x86, and arm64, which all
implement non-load dmamap operations as simple wrappers
around map objects which may be bus- or device-specific.
--Remove NULL-checked bus_dmamap macros. Implement the
equivalent NULL checks in the inlined x86 implementation.
For non-x86 platforms, these checks are a minor pessimization
as those platforms do not currently allow NULL maps. NULL
maps were originally allowed on arm64, which appears to have
been the motivation behind adding arm[64]-specific barriers
to bus_dma.h, but that support was removed in r299463.
--Simplify the internal interface used by the bus_dmamap_load*
variants and move it to bus_dma_internal.h
--Fix some drivers that directly include sys/bus_dma.h
despite the recommendations of bus_dma(9)
Reviewed by: kib (previous revision), marius
Differential Revision: https://reviews.freebsd.org/D10729
We set the shareability attributes in TCR_EL1 on boot. These tell the
hardware the pagetables are in cached memory so there is no need to flush
the entries from the cache to memory.
This has about 4.2% improvement in system time and 2.7% improvement in
user time for a buildkernel -j48 on a ThunderX.
Keep the old code for now to allow for further comparisons.
struct thread.
For all architectures, the syscall trap handlers have to allocate the
structure on the stack. The structure takes 88 bytes on 64bit arches
which is not negligible. Also, it cannot be easily found by other
code, which e.g. caused duplication of some members of the structure
to struct thread already. The change removes td_dbg_sc_code and
td_dbg_sc_nargs which were directly copied from syscall_args.
The structure is put into the copied on fork part of the struct thread
to make the syscall arguments information correct in the child after
fork.
This move will also allow several more uses shortly.
Reviewed by: jhb (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
X-Differential revision: https://reviews.freebsd.org/D11080
from machine/proc.h, consistently on all architectures.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
X-Differential revision: https://reviews.freebsd.org/D11080
no need to.
- Remove pmap_is_current(), pmap_[pte|l3]_valid_cacheable as there were only
used to know if we had to write back pages.
- In pmap_remove_pages(), don't bother invalidating each page in the TLB,
we're about to flush the whole TLB anyway.
This makes make world 8-9% faster on my hardware.
Reviewed by: andrew
This patch improves the boundary checks in busdma to allow more cases
using the regular page based kernel memory allocator. Especially in
the case of having a non-zero boundary in the parent DMA tag. For
example AMD64 based platforms set the PCI DMA tag boundary to
PCI_DMA_BOUNDARY, 4GB, which before this patch caused contiguous
memory allocations to be preferred when allocating more than PAGE_SIZE
bytes. Even if the required alignment was less than PAGE_SIZE bytes.
This patch also fixes the nsegments check for using kmem_alloc_attr()
when the maximum segment size is less than PAGE_SIZE bytes.
Updated some comments describing the code in question.
Differential Revision: https://reviews.freebsd.org/D10645
Reviewed by: kib, jhb, gallatin, scottl
MFC after: 1 week
Sponsored by: Mellanox Technologies
VM_MEMATTR_WRITE_COMBINING in the kernel. This fixes a bug where Xorg would
use write back cached memory for its graphics buffers. This would produce
artifacts on the screen as cachelines were written to memory.
MFC after: 1 week
Sponsored by: DARPA, AFRL
operate similarly, other than not needing the delayed invalidation.
It has been tested with artificial injection of vm_page_alloc failures
while running 'sort /dev/zero'.
Reviewed by: alc, kib
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D10574
kernel calls this directly so the event handler is not called, meaning
the computer fails to reboot.
Tested by: cognet
MFC after: 1 week
Sponsored by: DARPA, AFRL
in place. To do per-cpu stats, convert all fields that previously were
maintained in the vmmeters that sit in pcpus to counter(9).
- Since some vmmeter stats may be touched at very early stages of boot,
before we have set up UMA and we can do counter_u64_alloc(), provide an
early counter mechanism:
o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter.
o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter,
so that at early stages of boot, before counters are allocated we already
point to a counter that can be safely written to.
o For sparc64 that required a whole dummy pcpu[MAXCPU] array.
Further related changes:
- Don't include vmmeter.h into pcpu.h.
- vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit,
to match kernel representation.
- struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exclusion.
This is based on benno@'s 4-year old patch:
https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html
Reviewed by: kib, gallatin, marius, lidl
Differential Revision: https://reviews.freebsd.org/D10156
We need to set the Execute-never bits when mapping device memory as the
hardware may perform speculative instruction fetches.
Set the Privileged Execute-ever bit on userspace memory to stop the kernel
if it is tricked into executing it.
Reviewed by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D10382
from the kernel. Make use of this to restrict accessing userspace to just
the functions that explicitly handle crossing the user kernel boundary.
Reported by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D10371
pagetables. This sets both bits when entering an address we know shouldn't
be executed.
I expect we could mark all userspace pages as Privileged execute-never to
ensure the kernel doesn't branch to one of these addresses.
While here add the ARMv8.1 upper attributes.
Reviewed by: alc, kib (previous version)
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D10360
places possible in the kernel. This forces these functions to fail if
userspace is unable to access a given memory location, even if it is in
the user memory range.
This will simplify adding Privileged Access Never support later.
MFC after: 1 week
Sponsored by: DARPA, AFRL
Arm64 pmap interprets accessed writable ptes as modified, since
ARMv8.0 does not track Dirty Bit Modifier in hardware. If writable bit
is removed, page must be marked as dirty for MI VM.
This change is most important for COW, where fork caused losing
content of the dirty pages which were not yet scanned by pagedaemon.
Reviewed by: alc, andrew
Reported and tested by: Mark Millard <markmi@dsl-only.net>
PR: 217138, 217239
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
The MFC will include a compat definition of smp_no_rendevous_barrier()
that calls smp_no_rendezvous_barrier().
Reviewed by: gnn, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10313
GNU toolchain does not recognize LR as standard register alias,
but clang does. Use of #define will work on both. Place the
definition into central machine/asm.h instead of patching every
affected file, as requested by plaftorm maintainers.
Reviews by: andrew, emaste, imp
Differential Revision: https://reviews.freebsd.org/D10307
I fixed this in 1997, but the fix was over-engineered and fragile and
was broken in 2003 if not before. i386 parameters were copied to 8
other arches verbatim, mostly after they stopped working on i386, and
mostly without the large comment saying how the values were chosen on
i386. powerpc has a non-verbatim copy which just changes the uncritical
parameter and seems to add a sign extension bug to it.
Just treat negative offsets as offsets if they are no more negative than
-db_offset_max (default -64K), and remove all the broken parameters.
-64K is not very negative, but it is enough for frame and stack pointer
offsets since kernel stacks are small.
The over-engineering was mainly to go more negative than -64K for the
negative offset format, without affecting printing for more than a
single address.
Addresses in the top 64K of a (full 32-bit or 64-bit) address space
are now printed less well, but there aren't many interesting ones.
For arches that have many interesting ones very near the top (e.g.,
68k has interrupt vectors there), there would be no good limit for
the negative offset format and -64K is a good as anything.
matches static binaries.
Interpretation of the 'static' there is that the binary must not
specify an interpreter. In particular, shared objects are matched by
the brand if BI_CAN_EXEC_DYN is also set.
This improves precision of the brand matching, which should eliminate
surprises due to brand ordering.
Revert r315701.
Discussed with and tested by: ed (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
CloudABI executables are statically linked and don't have an
interpreter. Setting the interpreter path to NULL used to work
previously, but r314851 introduced code that checks the string
unconditionally. Running CloudABI executables now causes a null pointer
dereference.
Looking at the rest of imgact_elf.c, it seems various other codepaths
already leaned on the fact that the interpreter path is set. Let's just
go ahead and pick an obviously incorrect interpreter path to appease
imgact_elf.c.
MFC after: 1 week
interrupt arrives in fork_trampoline after sp_el0 was written we may then
switch to a new thread, enter userland so change this stack pointer, then
return to this code with the wrong value. This fixes this case by moving
the load of sp_el0 until after interrupts have been disabled.
Reported by: Mark Millard (markmi@dsl-only.net)
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D9593
we will import a newer version of the Linux code so the linuxkpi was not
used.
This is still missing 10G support, and multicast has not been tested.
Reviewed by: gnn
Obtained from: ABT Systems Ltd
Sponsored by: SoftIron Inc
Differential Revision: https://reviews.freebsd.org/D8549
On arm64 use atomics. Then, both arm and arm64 do not need a critical
section around update. Replace all cpus loop by CPU_FOREACH().
This brings arm and arm64 counter(9) implementation closer to current
amd64, but being more RISC-y, arm* version cannot avoid atomics.
Reported by: Alexandre Martins <alexandre.martins@stormshield.eu>
Reviewed by: andrew
Tested by: Alexandre Martins, andrew
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
The types are for the byte offset and page index in vm object. They
are similar to off_t, which is defined as 64bit MI integer. Using MI
definitions will allow to provide consistent MD values of vm
object-related maximum sizes.
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
- em(4) igb(4) and lem(4)
- deprecate the igb device from kernel configurations
- create a symbolic link in /boot/kernel from if_em.ko to if_igb.ko
Devices tested:
- 82574L
- I218-LM
- 82546GB
- 82579LM
- I350
- I217
Please report problems to freebsd-net@freebsd.org
Partial review from jhb and suggestions on how to *not* brick folks who
originally would have lost their igbX device.
Submitted by: mmacy@nextbsd.org
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Limelight Networks and Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D8299
virtio_pci was missing from the GENERIC arm64 configuration, while
other virtio devices are present. Adding it will allow us to boot
the GENERIC kernel on QEMU with virtio storage and networking.
In case where GICD_CTLR.DS is 1, the IGROUPR registers are RW in
non-secure state and has to be initialized to 1 for the
corresponding interrupts to be delivered as Group 1 interrupts.
Update gic_v3_dist_init() and gic_v3_redist_init() to initialize
GICD_IGROUPRn and GICR_IGROUPRn respectively to address this. The
registers can be set unconditionally since the writes are ignored
in non-secure state when GICD_CTLR.DS is 0.
This fixes the hang on boot seen when running qemu-system-aarch64
with machine virt,gic-version=3
Changes include modifications in kernel crash dump routines, dumpon(8) and
savecore(8). A new tool called decryptcore(8) was added.
A new DIOCSKERNELDUMP I/O control was added to send a kernel crash dump
configuration in the diocskerneldump_arg structure to the kernel.
The old DIOCSKERNELDUMP I/O control was renamed to DIOCSKERNELDUMP_FREEBSD11 for
backward ABI compatibility.
dumpon(8) generates an one-time random symmetric key and encrypts it using
an RSA public key in capability mode. Currently only AES-256-CBC is supported
but EKCD was designed to implement support for other algorithms in the future.
The public key is chosen using the -k flag. The dumpon rc(8) script can do this
automatically during startup using the dumppubkey rc.conf(5) variable. Once the
keys are calculated dumpon sends them to the kernel via DIOCSKERNELDUMP I/O
control.
When the kernel receives the DIOCSKERNELDUMP I/O control it generates a random
IV and sets up the key schedule for the specified algorithm. Each time the
kernel tries to write a crash dump to the dump device, the IV is replaced by
a SHA-256 hash of the previous value. This is intended to make a possible
differential cryptanalysis harder since it is possible to write multiple crash
dumps without reboot by repeating the following commands:
# sysctl debug.kdb.enter=1
db> call doadump(0)
db> continue
# savecore
A kernel dump key consists of an algorithm identifier, an IV and an encrypted
symmetric key. The kernel dump key size is included in a kernel dump header.
The size is an unsigned 32-bit integer and it is aligned to a block size.
The header structure has 512 bytes to match the block size so it was required to
make a panic string 4 bytes shorter to add a new field to the header structure.
If the kernel dump key size in the header is nonzero it is assumed that the
kernel dump key is placed after the first header on the dump device and the core
dump is encrypted.
Separate functions were implemented to write the kernel dump header and the
kernel dump key as they need to be unencrypted. The dump_write function encrypts
data if the kernel was compiled with the EKCD option. Encrypted kernel textdumps
are not supported due to the way they are constructed which makes it impossible
to use the CBC mode for encryption. It should be also noted that textdumps don't
contain sensitive data by design as a user decides what information should be
dumped.
savecore(8) writes the kernel dump key to a key.# file if its size in the header
is nonzero. # is the number of the current core dump.
decryptcore(8) decrypts the core dump using a private RSA key and the kernel
dump key. This is performed by a child process in capability mode.
If the decryption was not successful the parent process removes a partially
decrypted core dump.
Description on how to encrypt crash dumps was added to the decryptcore(8),
dumpon(8), rc.conf(5) and savecore(8) manual pages.
EKCD was tested on amd64 using bhyve and i386, mipsel and sparc64 using QEMU.
The feature still has to be tested on arm and arm64 as it wasn't possible to run
FreeBSD due to the problems with QEMU emulation and lack of hardware.
Designed by: def, pjd
Reviewed by: cem, oshogbo, pjd
Partial review: delphij, emaste, jhb, kib
Approved by: pjd (mentor)
Differential Revision: https://reviews.freebsd.org/D4712
don't need the extra debug facilities. Copied from the amd64
configuration of the same name.
Submitted by: Nikolai Lifanov
Reviewed by: emaste
MFC after: 2 weeks
contain a vm_page_t at the specified index. However, with this
change, vm_radix_remove() no longer panics. Instead, it returns NULL
if there is no vm_page_t at the specified index. Otherwise, it
returns the vm_page_t. The motivation for this change is that it
simplifies the use of radix tries in the amd64, arm64, and i386 pmap
implementations. Instead of performing a lookup before every remove,
the pmap can simply perform the remove.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D8708
Table to find the CPUs to find the CPUs to start. Currently we assume PSCI,
however this assumption is shared with the FDT code.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
will be used by the gicv2m and ITS ACPI drivers to only attach to the
correct parent.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Now that BCM283x source are buildable with SMP option it cam be moved to
GENERIC SMP config. SMP itself does not work on RPi3 yet due to lack of
PSCI monitor which is work in progress at the moment
FDT attachment to a new file. A separate ACPI attachment will then be added
to allow arm64 servers with ACPI to use it over FDT.
This should also help with merging this with the ofwpci driver, with
further work needed to remove restrictions this driver places on resource
allocation.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D7319
them. Previously this would walk past the end of the array and print
whatever happened to be after the trapframe struct.
MFC after: 1 week
Sponsored by: DARPA, AFRL
these show a 9-10% reduction in user and system time for a buildworld -j48.
Obtained from: ABT Systems Ltd
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
VFP code to store the old context, with lazy loading of the new context
when needed.
FPU_KERN_NOCTX is missing as this is unused in the crypto code this has
been tested with, and I am unsure on the requirements of the UEFI
Runtime Services.
Reviewed by: kib
Obtained from: ABT Systeems Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D8276
Policy for FreeBSD/arm64 kernel config is the same one as for x86
architectures: provide GENERIC kernel bootable on as many systems
as possible. Since there is no SMP support for RPi 3 yet new kernel
config was introduced: GENERIC-UP, which is effectively GENERIC with
SMP option disabled
not be sent to userspace, for example the future flag to tell when we are
using floating point in the kernel.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
- Rename SOC_BCM2837 to SOC_BRCM_BCM2837, put it to opt_soc.h
- do not use files.XXX files, just move required sources to
conf/files.arm64 and make them depend on soc_brcm_bcm2837
Suggested by: andrew
Using the device pager with /dev/kmem is not stable since KVA mappings
are transient, but the device pager caches the PA associated with a
given offset forever. Interestingly, mips' implementation of
memmap() already refused requests for /dev/kmem.
Note that kvm_read/kvm_write do not use mmap, but use read and write on
/dev/kmem, so this should not affect libkvm users.
Reviewed by: kib
MFC after: 2 months
RPI3 kernel config builds kernel compatible with latest upstream device
tree and firmware: https://github.com/raspberrypi/firmware/tree/master/boot
As of today it's 597c662a613df1144a6bc43e5f4505d83bd748ca
Default console is PL01x, so pi3-disable-bt dt overlay should be configured
in config.txt and stock U-Boot should be patched to use proper serial port.
Yet unsupported: SMP, VCHIQ, RNG driver. RNG requires some work due to
upstream device tree incompatibility.
Multiple people contributed to this work over time: db@, loos@, manu@
with no creative content. Include "lost" changes from git:
o Use /dev/efi instead of /dev/efidev
o Remove redundant NULL checks.
Submitted by: kib@, dim@, zbb@, emaste@
Keep resource state consistent with INTRNG state - if intr_activate_irq
fails - deactivate resource and propagate error to calling function
Reviewed by: mmel
to add actions that run when a TCP frame is sent or received on a TCP
session in the ESTABLISHED state. In the base tree, this functionality is
only used for the h_ertt module, which is used by the cc_cdg, cc_chd, cc_hd,
and cc_vegas congestion control modules.
Presently, we incur overhead to check for hooks each time a TCP frame is
sent or received on an ESTABLISHED TCP session.
This change adds a new compile-time option (TCP_HHOOK) to determine whether
to include the hhook(9) framework for TCP. To retain backwards
compatibility, I added the TCP_HHOOK option to every configuration file that
already defined "options INET". (Therefore, this patch introduces no
functional change. In order to see a functional difference, you need to
compile a custom kernel without the TCP_HHOOK option.) This change will
allow users to easily exclude this functionality from their kernel, should
they wish to do so.
Note that any users who use a custom kernel configuration and use one of the
congestion control modules listed above will need to add the TCP_HHOOK
option to their kernel configuration.
Reviewed by: rrs, lstewart, hiren (previous version), sjg (makefiles only)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D8185
PIC_SETUP_INTR implementation in GICv3 did not allow
for setting up interrupts without included FDT
description. GICv2m-like MSI interrupts, which map
MSI messages to SPI interrupt lines, may not have
a description in FDT. Add support for such interrupts
by setting the trigger and polarity to the appropriate
values for MSI (edge, high) and get the hardware
IRQ number from the corresponding ISRC.
Obtained from: Semihalf
Submitted by: Michal Stanek <mst@semihalf.com>
Sponsored by: Annapurna Labs
Reviewed by: wma
Differential Revision: https://reviews.freebsd.org/D7662
As this is evaluation hardware with only a few users, and there is a lack
of information add a warning when booting on this hardware.
Reported by: cognet
Obtained from: ABT Systems Ltd
MFC after: Instant
Sponsored by: The FreeBSD Foundation
Move PMAP_TS_REFERENCED_MAX out of the various pmap implementations and
into vm/pmap.h, and describe what its purpose is. Eliminate the archaic
"XXX" comment about its value. I don't believe that its exact value, e.g.,
5 versus 6, matters.
Update the arm64 and riscv pmap implementations of pmap_ts_referenced()
to opportunistically update the page's dirty field.
On amd64, use the PDE value already cached in a local variable rather than
dereferencing a pointer again and again.
Reviewed by: kib, markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D7836
the data cache to the point of unification. This is the point where the
two caches are unified to a single unified cache so cleaning past here
is just extra unneeded work.
This was noticed when investigating r305545.
Reported by: bz
Obtained from: ABT Systems Ltd
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
page is non-executable the contents of the i-cache are unimportant so this
call is just adding unneeded overhead when inserting pages.
While doing research using gem5 with an O3 pipeline and 1k/32k/1M iTLB/L1
iCache/L2 Bjoern Zeeb (bz@) observed a fairly high rate of calls into
arm64_icache_sync_range() from pmap_enter() along with a high number of
instruction fetches and iTLB/iCache hits.
Limiting the calls to arm64_icache_sync_range() to only executable pages,
we observe the iTLB and iCache Hit going down by about 43%. These numbers
are quite misleading when looked at alone as at the same time instructions
retired were reduced by 19.2% and instruction fetches were reduced by 38.8%.
Overall this reduced the runtime of the test program by 22.4%.
On Juno hardware, in steady-state, running the same test, using the cycle
count to determine runtime, we do see a reduction of up to 28.9% in runtime.
While these numbers certainly depend on the program executed, we expect an
overall performance improvement.
Reported by: bz
Obtained from: ABT Systems Ltd
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
This commit adds drivers for Alpine Cache Coherency Unit
and North Bridge Service whose task is to configure
the system fabric and enable cache coherency.
Obtained from: Semihalf
Submitted by: Michal Stanek <mst@semihalf.com>
Sponsored by: Annapurna Labs
Reviewed by: wma
Differential Revision: https://reviews.freebsd.org/D7565
survived multiple world and kernel builds, and of poudriere building full
package sets.
I have observed a 3% reduction in buildworld times with superpages enabled,
however further testing is needed to see if this is observed in other
workloads.
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Idle page zeroing has been disabled by default on all architectures since
r170816 and has some bugs that make it seemingly unusable. Specifically,
the idle-priority pagezero thread exacerbates contention for the free page
lock, and yields the CPU without releasing it in non-preemptive kernels. The
pagezero thread also does not behave correctly when superpage reservations
are enabled: its target is a function of v_free_count, which includes
reserved-but-free pages, but it is only able to zero pages belonging to the
physical memory allocator.
Reviewed by: alc, imp, kib
Differential Revision: https://reviews.freebsd.org/D7714
will allow drivers that manage the clock frequency to communicate this with
the reset of the kernel.
Reported by: jmcneill
MFC after: 1 week
Sponsored by: ABT Systems Ltd
* Pass the correct virtual address when demoting a superpage
* Use the correct l3 table after demoting a superpage
* Remove an invalid KASSERT hit demoting then promoting a superpage [1]
With this it is believed that superpages on arm64 is stable.
Reported by: [1] cognet
Obtained from: ABT Systems Ltd
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
there is a short period where functions that walk the kernel page table
without locking them may see an invalid entry. One solution would be to add
locking to these functions, however some may be called from locations where
we are unable to sleep.
Until a better solution can be found stop promoting pages in the kernel
pmap so these functions work as expected.
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
the l2 entry is a block type and not an l3 page.
While here fix the string to correct the level name and add a missing ')'.
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
to what the 32-bit arm code does, with the exception that it always assumes
the tag is non-coherent.
Tested by: jmcneill
Obtained from: ABT Systems Ltd
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
needed before enabling superpages on arm64. This code is based on the amd64
pmap with changes as needed to handle the differences between the two
architectures.
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
iterate over superpages. We don't yet create these, but soon will.
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
In all of these source files, the userspace pointer size corresponds
with the kernelspace pointer size, meaning that casting directly works.
As I'm planning on making 32-bit execution on 64-bit systems work as
well, use TO_PTR() here as well, so that the changes between source
files remain minimal.
finding the vm_page_t in pmap_extract_and_hold. Previously it would return
the vm_page_t of the first page in a block. This would cause issues when,
for example, fsck reads from a device into the middle of a superpage. In
this case the read call would write to the start of the block, and not to
the buffer passed in.
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
pagetable is supported more will be added soon to support removing
superpages.
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
physical address. This is required when either mapping is writeable.
While here remove an unneeded call to pmap_pde, we already have the pde
from earlier in the function.
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
- Read interrupt properties at bus enumeration time and store
it into global mapping table.
- At bus_activate_resource() time, given mapping entry is resolved and
connected to real interrupt source. A copy of mapping entry is attached
to given resource.
- At bus_setup_intr() time, mapping entry stored in resource is used
for delivery of requested interrupt configuration.
- For MSI/MSIX interrupts, mapping entry is created within
pci_alloc_msi()/pci_alloc_msix() call.
- For legacy PCI interrupts, mapping entry must be created within
pcib_route_interrupt() by pcib driver itself.
Reviewed by: nwhitehorn, andrew
Differential Revision: https://reviews.freebsd.org/D7493
Right now, userspace (fast) gettimeofday(2) on x86 only works for
RDTSC. For older machines, like Core2, where RDTSC is not C2/C3
invariant, and which fall to HPET hardware, this means that the call
has both the penalty of the syscall and of the uncached hw behind the
QPI or PCIe connection to the sought bridge. Nothing can me done
against the access latency, but the syscall overhead can be removed.
System already provides mappable /dev/hpetX devices, which gives
straight access to the HPET registers page.
Add yet another algorithm to the x86 'vdso' timehands. Libc is updated
to handle both RDTSC and HPET. For HPET, the index of the hpet device
to mmap is passed from kernel to userspace, index might be changed and
libc invalidates its mapping as needed.
Remove cpu_fill_vdso_timehands() KPI, instead require that
timecounters which can be used from userspace, to provide
tc_fill_vdso_timehands{,32}() methods. Merge i386 and amd64
libc/<arch>/sys/__vdso_gettc.c into one source file in the new
libc/x86/sys location. __vdso_gettc() internal interface is changed
to move timecounter algorithm detection into the MD code.
Measurements show that RDTSC even with the syscall overhead is faster
than userspace HPET access. But still, userspace HPET is three-four
times faster than syscall HPET on several Core2 and SandyBridge
machines.
Tested by: Howard Su <howard0su@gmail.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D7473
promote memory as I am not sure all the demotion cases are handled, however
it is useful to implement pmap_page_set_memattr. This is used, for example,
when mapping uncached memory for bus_dma(9).
pmap_page_set_memattr needs to demote the DMAP region as on ARM we need to
ensure all mappings to the same physical address have the same attributes.
Reviewed by: kib
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6987
CloudABI executables already provide support for passing in vDSOs. This
functionality is used by the emulator for OS X to inject system call
handlers. On FreeBSD, we could use it to optimize calls to
gettimeofday(), etc.
Though I don't have any plans to optimize any system calls right now,
let's go ahead and already pass in a vDSO. This will allow us to
simplify the executables, as the traditional "syscall" shims can be
removed entirely. It also means that we gain more flexibility with
regards to adding and removing system calls.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D7438
On all the other architectures, this function can also be called on the
currently running thread. In this case, we shouldn't fix up the address
in the PCB, but also patch up the register itself. Otherwise it will not
become active and will simply become overwritten by the next switch.
Reviewed by: imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D7437
as invalidation will have completed before the pmap_invalidate_* functions
have complete.
Discussed with: alc, kib
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
between ACPI and FDT. This will be needed on machines with both, e.g. the
SoftIron Overdrive 3000. The kernel will accept one or more comma separated
values of either 'acpi' or 'fdt'. Any other values are skipped.
To set it the user can either set it on the loader command line, or
in loader.conf e.g. in loader.conf:
kern.cfg.order=acpi,fdt
This will try using ACPI then FDT. If none of the selected options work the
kernel tries to use one to get the serial console, then panics.
Reviewed by: emaste (earlier version)
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D7274
inner-shareable memory accesses. There is no need for full system barriers.
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
only for now, but wouldn't be too difficult to add support for FDT.
Reviewed by: hselasky
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D7352
the ITS driver file. There is no need for other drivers to need to know
about these structures.
Obtained from: ABT Systems Ltd
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
It turns out that this value is not used within the system call code
under normal conditions, except when using tracing tools like ktrace.
If we forget to set this value, it is set to random garbage. This may
cause ktrace to hang indefinitely, making it impossible to kill.
Reported by: Michael Plass
PR: 210800
MFC before: 11.0-RELEASE
to INTRNG in r301565 with the old code no longer being built by default with
no reports of issues on any supported hardware.
Approved by: re (gjb)
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
uncategorised reason. We need to read the fault address register before
enabling interrupts as the interrupt handler may cause this register to
change.
Approved by: re (marius, kib)
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
threads, to make it less confusing and using modern kernel terms.
Rename the functions to reflect current use of the functions, instead
of the historic KSE conventions:
cpu_set_fork_handler -> cpu_fork_kthread_handler (for kthreads)
cpu_set_upcall -> cpu_copy_thread (for forks)
cpu_set_upcall_kse -> cpu_set_upcall (for new threads creation)
Reviewed by: jhb (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Approved by: re (hrs)
Differential revision: https://reviews.freebsd.org/D6731
framework has significantly changed the driver has moved to a new file.
While it shares some code with the existing driver this has been modified
to work better with the intrng framework.
This has been tested on the ThunderX servers in the netperf cluster and has
been used to boot them for other testing, including DTrace and hwpmc.
With this we can use intrng on all supported arm64 platforms I was able to
test on. It is expected we will move to intrng soon, and disable the old
arm64 interrupt framework.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6437
a non-zero ID. To do this we increment the cpuid of any CPUs with a smaller
devicetree ID by one to stop them conflicting with the boot CPU.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
this to be the case. This will mean we don't try and handle the cache in
bus_dmamap_sync when it is not needed.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6605
tested on the Pass 1.1 and 2.0 ThunderX machines in the Netperf cluster.
Reviewed by: jhb
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6453
needed in later changes where we may not be able to lock the pic list lock
to perform a lookup, e.g. from within interrupt context.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Allow to deallocate previously allocated ITS device along with
its interrupts. Interrupt numbers are being freed when the last
LPI number is no longer busy.
Reviewed by: wma
Obtained from: Semihalf
Sponsored by: Cavium
Differential Revision: https://reviews.freebsd.org/D6351
supports the Security Extensions or not. This bit is not the same as the CPU one.
Currently we are not checking for either before trying to write to the special
registers. This can lead to problems on hardware or simulators that do not
provide the security extensions. Add the missing checks. Their interactions with
the CPU flag is not entirely clear to me but using a macro will make it easier
to quickly adjust the condition once the CPU bits are sorted as well.
Reviewed by: br
Sponsored by: DARPA/AFRL
Differential Revision: https://reviews.freebsd.org/D6397