Commit Graph

498 Commits

Author SHA1 Message Date
kib
e2a14c603f Move struct syscall_args syscall arguments parameters container into
struct thread.

For all architectures, the syscall trap handlers have to allocate the
structure on the stack.  The structure takes 88 bytes on 64bit arches
which is not negligible.  Also, it cannot be easily found by other
code, which e.g. caused duplication of some members of the structure
to struct thread already.  The change removes td_dbg_sc_code and
td_dbg_sc_nargs which were directly copied from syscall_args.

The structure is put into the copied on fork part of the struct thread
to make the syscall arguments information correct in the child after
fork.

This move will also allow several more uses shortly.

Reviewed by:	jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
X-Differential revision:	https://reviews.freebsd.org/D11080
2017-06-12 21:03:23 +00:00
kib
7b6fe97487 Make struct syscall_args visible to userspace compilation environment
from machine/proc.h, consistently on all architectures.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
X-Differential revision:	https://reviews.freebsd.org/D11080
2017-06-12 20:53:44 +00:00
andrew
4fce328cae Allow the arm64 machine/vfp.h to be included without first including
machine/pcb.h. It he latter is only needed for struct pcb.
2017-06-09 15:47:14 +00:00
andrew
7f9edbc607 Store the read-only thread pointer when scheduling a new thread. This is
not currently set, however we may wish to set it later.
2017-06-09 15:37:17 +00:00
cognet
2fe3ed022a - Don't bother flushing the data cache for pages we're about to unmap, there's
no need to.
- Remove pmap_is_current(), pmap_[pte|l3]_valid_cacheable as there were only
used to know if we had to write back pages.
- In pmap_remove_pages(), don't bother invalidating each page in the TLB,
we're about to flush the whole TLB anyway.

This makes make world 8-9% faster on my hardware.

Reviewed by:	andrew
2017-06-02 14:17:14 +00:00
kib
66ff0fc9c0 Add COMPAT_FREEBSD11 on arm64, the arch is almost tier-1.
Discussed with:	andrew, emaste
Sponsored by:	The FreeBSD Foundation
2017-05-23 13:57:55 +00:00
hselasky
107bf62085 Avoid use of contiguous memory allocations in busdma when possible.
This patch improves the boundary checks in busdma to allow more cases
using the regular page based kernel memory allocator. Especially in
the case of having a non-zero boundary in the parent DMA tag. For
example AMD64 based platforms set the PCI DMA tag boundary to
PCI_DMA_BOUNDARY, 4GB, which before this patch caused contiguous
memory allocations to be preferred when allocating more than PAGE_SIZE
bytes. Even if the required alignment was less than PAGE_SIZE bytes.

This patch also fixes the nsegments check for using kmem_alloc_attr()
when the maximum segment size is less than PAGE_SIZE bytes.

Updated some comments describing the code in question.

Differential Revision:	https://reviews.freebsd.org/D10645
Reviewed by:		kib, jhb, gallatin, scottl
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-16 14:21:37 +00:00
andrew
e1e13d1ad1 Allocate a cacheline when reading or writing to write through memory. The
hardware will still write to memory, however following reads will be from
the cache.

MFC after:	1 week
Sponsored by:	DARPA, AFRL
2017-05-13 13:03:20 +00:00
andrew
bf70d7ed98 Add the VM_MEMATTR_WRITE_THROUGH memory type to arm64 and use it to support
VM_MEMATTR_WRITE_COMBINING in the kernel. This fixes a bug where Xorg would
use write back cached memory for its graphics buffers. This would produce
artifacts on the screen as cachelines were written to memory.

MFC after:	1 week
Sponsored by:	DARPA, AFRL
2017-05-13 13:01:15 +00:00
andrew
7a5edbcf70 Add reclaim_pv_chunk on arm64. This is based on the amd64 code so should
operate similarly, other than not needing the delayed invalidation.

It has been tested with artificial injection of vm_page_alloc failures
while running 'sort /dev/zero'.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D10574
2017-05-05 10:01:27 +00:00
andrew
17b89157e4 Print all virtual addresses in the show vtop ddb command. The results may
be different with PAN enabled.

MFC after:	1 week
Sponsored by:	DARPA, AFRL
2017-05-03 12:24:31 +00:00
andrew
de3e0757f9 Call the PSCI reset from cpu_reset on arm64. When rebooting from DDB the
kernel calls this directly so the event handler is not called, meaning
the computer fails to reboot.

Tested by:	cognet
MFC after:	1 week
Sponsored by:	DARPA, AFRL
2017-04-24 11:06:10 +00:00
andrew
acc61195fc Restrict the arm64 supervisor all instructions to only allow a zero
immediate value for system calls. We may wish to use other values in the
future for other purposes.

MFC after:	1 week
Sponsored by:	DARPA, AFRL
2017-04-20 15:53:20 +00:00
andrew
d8ea52fd1f Push loading curthread into assembly in the synchronous exception handlers.
This will help investigating the performance impact of moving parts of the
switch statement in do_el0_sync into assembly.

Sponsored by:	DARPA, AFRL
2017-04-20 13:56:30 +00:00
emaste
77758299a3 Remove trailing whitespace from r317061 2017-04-17 18:57:26 +00:00
glebius
21ead51d79 - Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter
in place.  To do per-cpu stats, convert all fields that previously were
  maintained in the vmmeters that sit in pcpus to counter(9).
- Since some vmmeter stats may be touched at very early stages of boot,
  before we have set up UMA and we can do counter_u64_alloc(), provide an
  early counter mechanism:
  o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter.
  o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter,
    so that at early stages of boot, before counters are allocated we already
    point to a counter that can be safely written to.
  o For sparc64 that required a whole dummy pcpu[MAXCPU] array.

Further related changes:
- Don't include vmmeter.h into pcpu.h.
- vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit,
  to match kernel representation.
- struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exclusion.

This is based on benno@'s 4-year old patch:
https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html

Reviewed by:	kib, gallatin, marius, lidl
Differential Revision:	https://reviews.freebsd.org/D10156
2017-04-17 17:34:47 +00:00
glebius
5763443023 All these files need sys/vmmeter.h, but now they got it implicitly
included via sys/pcpu.h.
2017-04-17 17:07:00 +00:00
andrew
ab6a57c4a3 Rather than checking if the top bit in a virtual address is a 0 or 1
compare against VM_MAXUSER_ADDRESS as we should have been doing.

Sponsored by:	DARPA, AFRL
2017-04-13 16:57:02 +00:00
andrew
5730f7ccdc Set the arm64 Execute-never bits in more places.
We need to set the Execute-never bits when mapping device memory as the
hardware may perform speculative instruction fetches.

Set the Privileged Execute-ever bit on userspace memory to stop the kernel
if it is tricked into executing it.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D10382
2017-04-13 15:03:03 +00:00
kan
de2c97b5fd Use proper fields to check for interrupt trigger mode. 2017-04-13 14:23:27 +00:00
andrew
d73bb06e63 In ARMv8.1 ARM has added a process state bit to disable access to userspace
from the kernel. Make use of this to restrict accessing userspace to just
the functions that explicitly handle crossing the user kernel boundary.

Reported by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D10371
2017-04-13 13:46:01 +00:00
andrew
a19f569aad Add SCTLR bits added in ARMv8.1 and ARMv8.2 and start to use them in the
early boot code.

Sponsored by:	DARPA, AFRL
2017-04-13 11:56:27 +00:00
andrew
e4d185768e Start to use the User and Privileged execute-never bits in the arm64
pagetables. This sets both bits when entering an address we know shouldn't
be executed.

I expect we could mark all userspace pages as Privileged execute-never to
ensure the kernel doesn't branch to one of these addresses.

While here add the ARMv8.1 upper attributes.

Reviewed by:	alc, kib (previous version)
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D10360
2017-04-12 16:28:40 +00:00
andrew
e10e5b0689 Use the unprivileged variant of the load and store instructions most
places possible in the kernel. This forces these functions to fail if
userspace is unable to access a given memory location, even if it is in
the user memory range.

This will simplify adding Privileged Access Never support later.

MFC after:	1 week
Sponsored by:	DARPA, AFRL
2017-04-12 12:34:27 +00:00
kib
26b779b185 Do not lose dirty bits for removing PROT_WRITE on arm64.
Arm64 pmap interprets accessed writable ptes as modified, since
ARMv8.0 does not track Dirty Bit Modifier in hardware. If writable bit
is removed, page must be marked as dirty for MI VM.

This change is most important for COW, where fork caused losing
content of the dirty pages which were not yet scanned by pagedaemon.

Reviewed by:	alc, andrew
Reported and tested by:	Mark Millard <markmi@dsl-only.net>
PR:	217138, 217239
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-04-10 15:32:26 +00:00
pkelsey
33064e92a2 Corrected misspelled versions of rendezvous.
The MFC will include a compat definition of smp_no_rendevous_barrier()
that calls smp_no_rendezvous_barrier().

Reviewed by:	gnn, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D10313
2017-04-09 02:00:03 +00:00
kan
139ec191bb Define 'lr' as x30 on aarch64
GNU toolchain does not recognize LR as standard register alias,
but clang does. Use of #define will work on both. Place the
definition into central machine/asm.h instead of patching every
affected file, as requested by plaftorm maintainers.

Reviews by: andrew, emaste, imp
Differential Revision:	https://reviews.freebsd.org/D10307
2017-04-07 22:58:28 +00:00
bde
254458ab34 Fix printing of negative offsets (typically from frame pointers) again.
I fixed this in 1997, but the fix was over-engineered and fragile and
was broken in 2003 if not before.  i386 parameters were copied to 8
other arches verbatim, mostly after they stopped working on i386, and
mostly without the large comment saying how the values were chosen on
i386.  powerpc has a non-verbatim copy which just changes the uncritical
parameter and seems to add a sign extension bug to it.

Just treat negative offsets as offsets if they are no more negative than
-db_offset_max (default -64K), and remove all the broken parameters.

-64K is not very negative, but it is enough for frame and stack pointer
offsets since kernel stacks are small.

The over-engineering was mainly to go more negative than -64K for the
negative offset format, without affecting printing for more than a
single address.

Addresses in the top 64K of a (full 32-bit or 64-bit) address space
are now printed less well, but there aren't many interesting ones.
For arches that have many interesting ones very near the top (e.g.,
68k has interrupt vectors there), there would be no good limit for
the negative offset format and -64K is a good as anything.
2017-03-26 18:46:35 +00:00
imp
6b8dbe92e8 Add 'device iic' to bring in userland I2C driver.
Submitted by: karl@
2017-03-24 22:33:03 +00:00
ed
63254ceea6 Stop providing the compat_3_brand.
As of r315860, the ELF image activator works fine for CloudABI without it.

Reviewed by:	kib
MFC after:	2 weeks
2017-03-23 14:12:21 +00:00
kib
da63ef1f60 Update r315753 with the proper flag name.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-22 22:28:13 +00:00
kib
a22b5a3135 Add a flag BI_BRAND_ONLY_STATIC to specify that the brand only
matches static binaries.

Interpretation of the 'static' there is that the binary must not
specify an interpreter.  In particular, shared objects are matched by
the brand if BI_CAN_EXEC_DYN is also set.

This improves precision of the brand matching, which should eliminate
surprises due to brand ordering.

Revert r315701.

Discussed with and tested by:	ed (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-22 22:23:01 +00:00
ed
fc95dfd2e5 Set the interpreter path to /nonexistent.
CloudABI executables are statically linked and don't have an
interpreter. Setting the interpreter path to NULL used to work
previously, but r314851 introduced code that checks the string
unconditionally. Running CloudABI executables now causes a null pointer
dereference.

Looking at the rest of imgact_elf.c, it seems various other codepaths
already leaned on the fact that the interpreter path is set. Let's just
go ahead and pick an obviously incorrect interpreter path to appease
imgact_elf.c.

MFC after:	1 week
2017-03-22 07:05:27 +00:00
andrew
d4532eaf62 If ofw_bus_msimap fails don't try to use the invalid MSI/MSI-X parent node.
Sponsored by:	ABT Systems Ltd
2017-03-16 17:49:37 +00:00
andrew
a03d613777 Load the new sp_el0 with interrupts disabled in fork_trampoline. If an
interrupt arrives in fork_trampoline after sp_el0 was written we may then
switch to a new thread, enter userland so change this stack pointer, then
return to this code with the wrong value. This fixes this case by moving
the load of sp_el0 until after interrupts have been disabled.

Reported by:	Mark Millard (markmi@dsl-only.net)
Sponsored by:	ABT Systems Ltd
Differential Revision:	https://reviews.freebsd.org/D9593
2017-02-15 14:56:47 +00:00
andrew
33bc722d77 Port the Linux AMX 10G network driver to FreeBSD as axgbe. It is unlikely
we will import a newer version of the Linux code so the linuxkpi was not
used.

This is still missing 10G support, and multicast has not been tested.

Reviewed by:	gnn
Obtained from:	ABT Systems Ltd
Sponsored by:	SoftIron Inc
Differential Revision:	https://reviews.freebsd.org/D8549
2017-02-15 13:56:04 +00:00
andrew
d431cd542b Push reading of ESR_EL1 to assembly. Among other uses this will allow us
to expose this to signal handlers, e.g. for the clang sanitizers.

Sponsored by:	DARPA, AFRL
2017-02-07 18:19:11 +00:00
andrew
3484d055de Remove arm64_tlb_flushID_SE, it's unused and may be wrong.
Sponsored by:	ABT Systems Ltd
2017-02-06 17:50:09 +00:00
kib
2318a84733 Update arm and arm64 counters MD bits.
On arm64 use atomics.  Then, both arm and arm64 do not need a critical
section around update.  Replace all cpus loop by CPU_FOREACH().
This brings arm and arm64 counter(9) implementation closer to current
amd64, but being more RISC-y, arm* version cannot avoid atomics.

Reported by:	Alexandre Martins <alexandre.martins@stormshield.eu>
Reviewed by:	andrew
Tested by:	Alexandre Martins, andrew
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-02-06 17:20:37 +00:00
kib
c24073c855 Define the vm_ooffset_t and vm_pindex_t types as machine-independend.
The types are for the byte offset and page index in vm object.  They
are similar to off_t, which is defined as 64bit MI integer.  Using MI
definitions will allow to provide consistent MD values of vm
object-related maximum sizes.

Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-02-04 12:26:38 +00:00
cognet
6a21064647 Implement atomic_fcmpset_* for arm and arm64. 2017-01-28 16:24:06 +00:00
bz
d0e035a60f Remove a static function declaration for a function not implemented.
Makes head code compile on 10.3 and cleanup is never wrong.

MFC after:	3 days
2017-01-23 16:40:20 +00:00
ed
be72efbdd4 Catch up with changes to structure member names.
Pointer/length pairs are now always named ${name} and ${name}_len.
2017-01-17 22:05:52 +00:00
sbruno
efab05d612 Migrate e1000 to the IFLIB framework:
- em(4) igb(4) and lem(4)
- deprecate the igb device from kernel configurations
- create a symbolic link in /boot/kernel from if_em.ko to if_igb.ko

Devices tested:
- 82574L
- I218-LM
- 82546GB
- 82579LM
- I350
- I217

Please report problems to freebsd-net@freebsd.org

Partial review from jhb and suggestions on how to *not* brick folks who
originally would have lost their igbX device.

Submitted by:	mmacy@nextbsd.org
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	Limelight Networks and Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8299
2017-01-10 03:23:22 +00:00
jchandra
66a7c34b9e Add virtio_pci to GENERIC arm64 conf
virtio_pci was missing from the GENERIC arm64 configuration, while
other virtio devices are present. Adding it will allow us to boot
the GENERIC kernel on QEMU with virtio storage and networking.
2016-12-18 11:15:31 +00:00
jchandra
a38cab8fcb Initialize GIC[DR]_IGROUPRn registers for GICv3
In case where GICD_CTLR.DS is 1, the IGROUPR registers are RW in
non-secure state and has to be initialized to 1 for the
corresponding interrupts to be delivered as Group 1 interrupts.

Update gic_v3_dist_init() and gic_v3_redist_init() to initialize
GICD_IGROUPRn and GICR_IGROUPRn respectively to address this. The
registers can be set unconditionally since the writes are ignored
in non-secure state when GICD_CTLR.DS is 0.

This fixes the hang on boot seen when running qemu-system-aarch64
with machine virt,gic-version=3
2016-12-18 08:31:01 +00:00
andrew
3b55e84275 Enable ACPI on arm64. It's not yet functional, but it will help keeping the
code building until it is ready.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2016-12-12 18:13:03 +00:00
def
f63c437216 Add support for encrypted kernel crash dumps.
Changes include modifications in kernel crash dump routines, dumpon(8) and
savecore(8). A new tool called decryptcore(8) was added.

A new DIOCSKERNELDUMP I/O control was added to send a kernel crash dump
configuration in the diocskerneldump_arg structure to the kernel.
The old DIOCSKERNELDUMP I/O control was renamed to DIOCSKERNELDUMP_FREEBSD11 for
backward ABI compatibility.

dumpon(8) generates an one-time random symmetric key and encrypts it using
an RSA public key in capability mode. Currently only AES-256-CBC is supported
but EKCD was designed to implement support for other algorithms in the future.
The public key is chosen using the -k flag. The dumpon rc(8) script can do this
automatically during startup using the dumppubkey rc.conf(5) variable.  Once the
keys are calculated dumpon sends them to the kernel via DIOCSKERNELDUMP I/O
control.

When the kernel receives the DIOCSKERNELDUMP I/O control it generates a random
IV and sets up the key schedule for the specified algorithm. Each time the
kernel tries to write a crash dump to the dump device, the IV is replaced by
a SHA-256 hash of the previous value. This is intended to make a possible
differential cryptanalysis harder since it is possible to write multiple crash
dumps without reboot by repeating the following commands:
# sysctl debug.kdb.enter=1
db> call doadump(0)
db> continue
# savecore

A kernel dump key consists of an algorithm identifier, an IV and an encrypted
symmetric key. The kernel dump key size is included in a kernel dump header.
The size is an unsigned 32-bit integer and it is aligned to a block size.
The header structure has 512 bytes to match the block size so it was required to
make a panic string 4 bytes shorter to add a new field to the header structure.
If the kernel dump key size in the header is nonzero it is assumed that the
kernel dump key is placed after the first header on the dump device and the core
dump is encrypted.

Separate functions were implemented to write the kernel dump header and the
kernel dump key as they need to be unencrypted. The dump_write function encrypts
data if the kernel was compiled with the EKCD option. Encrypted kernel textdumps
are not supported due to the way they are constructed which makes it impossible
to use the CBC mode for encryption. It should be also noted that textdumps don't
contain sensitive data by design as a user decides what information should be
dumped.

savecore(8) writes the kernel dump key to a key.# file if its size in the header
is nonzero. # is the number of the current core dump.

decryptcore(8) decrypts the core dump using a private RSA key and the kernel
dump key. This is performed by a child process in capability mode.
If the decryption was not successful the parent process removes a partially
decrypted core dump.

Description on how to encrypt crash dumps was added to the decryptcore(8),
dumpon(8), rc.conf(5) and savecore(8) manual pages.

EKCD was tested on amd64 using bhyve and i386, mipsel and sparc64 using QEMU.
The feature still has to be tested on arm and arm64 as it wasn't possible to run
FreeBSD due to the problems with QEMU emulation and lack of hardware.

Designed by:	def, pjd
Reviewed by:	cem, oshogbo, pjd
Partial review:	delphij, emaste, jhb, kib
Approved by:	pjd (mentor)
Differential Revision:	https://reviews.freebsd.org/D4712
2016-12-10 16:20:39 +00:00
gnn
6218b7c9ed This adds a configuration for arm64 users that track CURRENT but
don't need the extra debug facilities.  Copied from the amd64
configuration of the same name.

Submitted by: Nikolai Lifanov
Reviewed by: emaste
MFC after: 2 weeks
2016-12-10 10:00:27 +00:00
alc
7571ef95c1 Previously, vm_radix_remove() would panic if the radix trie didn't
contain a vm_page_t at the specified index.  However, with this
change, vm_radix_remove() no longer panics.  Instead, it returns NULL
if there is no vm_page_t at the specified index.  Otherwise, it
returns the vm_page_t.  The motivation for this change is that it
simplifies the use of radix tries in the amd64, arm64, and i386 pmap
implementations.  Instead of performing a lookup before every remove,
the pmap can simply perform the remove.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D8708
2016-12-08 04:29:29 +00:00