The change makes the user and kernel address spaces on i386
independent, giving each almost the full 4G of usable virtual addresses
except for one PDE at top used for trampoline and per-CPU trampoline
stacks, and system structures that must be always mapped, namely IDT,
GDT, common TSS and LDT, and process-private TSS and LDT if allocated.
By using a 1:1 mapping for the kernel text and data, it became
possible to eliminate the assembler part of locore.S that bootstraps
the initial page table and KPTmap. The code is rewritten in C and moved
into pmap_cold(). The comment in vmparam.h explains the KVA
layout.
There is no PCID mechanism available in protected mode, so each
kernel/user switch back and forth completely flushes the TLB, except
for the trampoline PTD region. The TLB invalidations for userspace
become trivial, because IPI handlers switch page tables. On the other
hand, context switches no longer need to reload %cr3.
copyout(9) was rewritten to use vm_fault_quick_hold(). An issue for
the new copyout(9) is compatibility with wiring user buffers around
sysctl handlers. This explains the two kinds of locks for copyout ptes
and the accounting of vslock() calls. vm_fault_quick_hold(), AKA the
slow path, is only tried after the 'fast path' has failed; the fast
path temporarily changes the mapping to userspace and copies the data
to/from a small per-cpu buffer in the trampoline. If a page fault
occurs during the copy, it is short-circuited by exception.s so that it
does not even reach C code.
The change was motivated by the need to implement the Meltdown
mitigation, but instead of KPTI the full split is done. The i386
architecture already shows sizing problems; in particular, it is
impossible to link clang and lld with debugging. I expect that the
issues due to the virtual address space limits will only get worse,
and the split gives the platform more life.
Tested by: pho
Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D14633
By popular demand, pkg now walks through the arguments passed and,
if it finds -y or --yes, accepts those as equivalent to the
ASSUME_ALWAYS_YES env var.
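The argument walk described above could look roughly like the following
sketch. It is not pkg's actual code: args_assume_yes() is a hypothetical
helper, and it only recognizes the two spellings as standalone arguments.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not from pkg): return true if any argument
 * after argv[0] is exactly "-y" or "--yes".  Combined short flags
 * such as "-fy" are deliberately not handled in this sketch.
 */
bool
args_assume_yes(int argc, char *const argv[])
{
	for (int i = 1; i < argc; i++) {
		if (strcmp(argv[i], "-y") == 0 ||
		    strcmp(argv[i], "--yes") == 0)
			return (true);
	}
	return (false);
}
```

A caller would then treat a true result the same way as a set
ASSUME_ALWAYS_YES environment variable.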
Requested by: many
MFC after: 1 week
Highlights:
- Passing "-" to -o will now cause output to go to stdout
- Path-based syntactic sugar for overlays is now accepted. This looks like:
	/dts-v1/;
	/plugin/;

	&{/soc} {
		sid: eeprom@1c14000 {
			compatible = "allwinner,sun8i-h3-sid";
			reg = <0x1c14000 0x400>;
			status = "okay";
		};
	};
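For the first highlight, the conventional way to honor "-" as an output
name can be sketched as below. This is illustrative C only, not the
tool's source; open_output() is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: pick the output stream for the -o argument.
 * A path of "-" selects stdout; any other path is opened for
 * writing.  Returns NULL if the file cannot be opened.
 */
FILE *
open_output(const char *path)
{
	if (strcmp(path, "-") == 0)
		return (stdout);
	return (fopen(path, "w"));
}
```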
MFC after: 3 days
A few glyphs were converted incorrectly:
U+00A6 broken bar - center
U+2022 bullet - center
U+2026 horizontal ellipsis - move to bottom of character cell
xdma(4) interface.
This allows us to switch between the Altera mSGDMA and SoftDMA engines used by
the atse(4) device.
This also makes the atse(4) driver 25% smaller.
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D9618
SoftDMA is a software implementation of a DMA engine built using the Altera
FIFO component.
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D9620
When we had both groff and mandoc in base, we decided to keep the roff(7)
manpage from groff. When removing groff, we forgot to install the mandoc version
instead.
This fixes it.
Reported by: trasz
MFC after: 1 week
When the tape position is inside the Early Warning area, the tape
drive will return a sense key of NO SENSE, and an ASC/ASCQ of
0x00,0x02, which means "End-of-partition/medium detected". If
this was in response to a control command like WRITE FILEMARKS,
we correctly translate this as informational status and return
0 from saerror().
Programmable Early Warning should be handled the same way, but
we weren't handling it that way. As a result, if a PEW status
(sense key of NO SENSE, ASC/ASCQ of 0x00,0x07, "Programmable early
warning detected") came back in response to a WRITE FILEMARKS,
we returned an error.
The impact of this was that if an application was writing to a
sa(4) device, and a PEW area was set (in the Device Configuration
Extension subpage -- mode page 0x10, subpage 1), and a filemark
needed to be written on close, we could wind up returning an error
to the user on close because of a "failure" to write the filemarks.
It actually isn't a failure, but rather just a status report from
the drive, and shouldn't be treated as a failure.
sys/cam/scsi/scsi_sa.c:
For control commands in saerror(), treat asc/ascq 0x00,0x07
the same as 0x00,{0-5} -- not an error. Return 0, since
the command actually did succeed.
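
The resulting rule for control commands can be sketched as a small
predicate. This is a simplified illustration, not the code in
scsi_sa.c: sense_is_informational() is a hypothetical name, and the
real saerror() looks at more state than these three values.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_KEY_NO_SENSE	0x00	/* sense key NO SENSE */

/*
 * Hypothetical predicate: for a control command such as
 * WRITE FILEMARKS, report whether the sense data is only a status
 * report.  With NO SENSE and ASC 0x00, ASCQ values 0x00-0x05 were
 * already treated as status; 0x07 (programmable early warning) is
 * now treated the same way.
 */
bool
sense_is_informational(uint8_t sense_key, uint8_t asc, uint8_t ascq)
{
	if (sense_key != SKETCH_KEY_NO_SENSE || asc != 0x00)
		return (false);
	return (ascq <= 0x05 || ascq == 0x07);
}
```

When the predicate is true, the error handler would return 0 instead
of an error.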
Reported by: Dr. Andreas Haakh <andreas@haakh.de>
Tested by: Dr. Andreas Haakh <andreas@haakh.de>
Sponsored by: Spectra Logic
MFC after: 3 days
given mbuf is considered as not matched.
If mbuf was consumed or freed during handling, we must return
IP_FW_DENY, since ipfw's pfil handler ipfw_check_packet() expects
IP_FW_DENY when mbuf pointer is NULL. This fixes KASSERT panics
when NAT64 is used with INVARIANTS. Also remove unused nomatch_final
field from struct nat64lsn_cfg.
Reported by: Justin Holcomb <justin at justinholcomb dot me>
Obtained from: Yandex LLC
MFC after: 1 week
Sponsored by: Yandex LLC
The miscellaneous x86 sysent->sv_setregs() implementations tried to
migrate PSL_T from the previous program to the newly executed one, but
they evaluated regs->tf_eflags after the whole regs structure was
bzeroed. Make this functional by saving the PSL_T value before zeroing.
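
The ordering fix can be illustrated with a stand-alone sketch. The
structure and function below are simplified stand-ins, not the kernel's
trapframe or setregs code, and SKETCH_FLAGS_USER is a placeholder for
whatever default flags the real code installs.

```c
#include <stdint.h>
#include <string.h>

#define PSL_T			0x00000100u	/* x86 EFLAGS trap flag */
#define SKETCH_FLAGS_USER	0x00000202u	/* placeholder defaults */

/* Simplified stand-in for the register frame. */
struct regs_sketch {
	uint32_t tf_eip;
	uint32_t tf_esp;
	uint32_t tf_eflags;
};

void
setregs_sketch(struct regs_sketch *regs, uint32_t entry, uint32_t stack)
{
	uint32_t saved_t;

	/* Read PSL_T first; after the memset it would always be 0. */
	saved_t = regs->tf_eflags & PSL_T;

	memset(regs, 0, sizeof(*regs));
	regs->tf_eip = entry;
	regs->tf_esp = stack;
	regs->tf_eflags = SKETCH_FLAGS_USER | saved_t;
}
```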
Note that if the debugger is not attached, executing the first
instruction in the new program with PSL_T set results in SIGTRAP, and
since all intercepted signals are reset to the default disposition on
exec(2), this means that a non-debugged process gets killed immediately
if PSL_T is inherited. In particular, since suid images drop
P_TRACED, an attempt to set PSL_T for execution of such a program would
kill the process.
Another issue with userspace PSL_T handling is that it is reset by
trap(). It is reasonable to clear PSL_T when entering the SIGTRAP
handler, to allow the signal to be handled without recursion or
delivery of a blocked fault. But it is not reasonable to return to
the normal flow with PSL_T cleared. This is too late to change, I
think.
Discussed with: bde, Ali Mashtizadeh
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
Differential revision: https://reviews.freebsd.org/D14995
This was inadvertently overriding the first found SYSDIR with the last
of /usr/src which could result in the wrong headers being used if not
building from /usr/src.
SYSDIR?= is not used here to avoid evaluating the exists() when unneeded.
Reported by: rgrimes, sjg, Mark Millard
Pointyhat to: bdrewery
Sponsored by: Dell EMC
"Terminus BSD Console" is a derivative of Terminus that is provided
by Mr. Dimitar Zhekov under the 2-clause BSD license for use by the
FreeBSD vt(4) console and other BSDs.
PR: 227409
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
In the pti-enabled pmap, the PCID allocation scheme assigns a temporal id
for the kernel page table, and the user page table's twin PCID is
calculated by setting the high bit in the kernel PCID. So the kernel AS
is mapped with a per-vmspace PCID, and we must completely shoot down all
mappings in KVA when switching contexts, so that the newly switched thread
sees all changes in KVA that occurred while it was not executing.
After all, KVA is the same for all threads.
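
The twin-PCID derivation can be sketched as follows. The sketch assumes
the architectural 12-bit PCID width, so "the high bit" is bit 11; the
macro and function names are hypothetical, not the pmap's identifiers.

```c
#include <stdint.h>

/* Assumption: PCIDs are 12 bits wide, so the high bit is 0x800. */
#define SKETCH_PCID_USER_BIT	0x800u

/*
 * Hypothetical helper: derive the user page table PCID from the
 * kernel page table PCID allocated for the same vmspace.
 */
uint32_t
user_pcid_sketch(uint32_t kern_pcid)
{
	return (kern_pcid | SKETCH_PCID_USER_BIT);
}
```

With this scheme the allocator only hands out values below 0x800, and
each allocation implicitly reserves its twin.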
Currently the pti context switch flushes the TLB entries for the user
part of the page table too. This is excessive. The same PCID
flushing algorithm that is used for the non-pti pmap works correctly for
the UVA mappings. The only shared TLB entries are for the pages from KVA
accessed by the kernel entry trampoline. All of them are static
except the per-thread TSS and LDT. For TSS and LDT, the lifetime of newly
allocated entries is the whole thread life, so it is fine as well. If it
were not fine, then explicit shootdowns for the current pmap of the newly
allocated LDT and TSS pages would be enough.
Also restore the constant value for the pm_pcid for the kernel_pmap.
Before, for the PTI pmap, pm_pcid was erroneously rolled the same as the
user pmap's pm_pcid, but it was not used.
Reviewed by: markj (previous version)
Discussed with: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D14961
After r331668, handling of the F_NOT flag is done in one place by the
print_instruction() function. Also remove the unused argument from the
print_ip[6]() functions.
MFC after: 1 week
Some BIOSes have trouble booting from GPT in non-UEFI mode. This is
commonly reported with Lenovo laptops, including my x220. As we do not
currently support booting FreeBSD/i386 via UEFI there's no reason to
prefer GPT.
The "vestigial swap partition" was added in r265017 to work around an
issue with loader's GPT support, so we should not need it when using
MBR.
We may want to make the same change to amd64, although the issue there is
mitigated by such systems booting via UEFI in the common case.
PR: 227422
Reviewed by: gjb
MFC after: 3 weeks
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Originally, on the VAX, exect() enabled tracing once the new executable
image was loaded. This was possible because tracing was controllable
from user space code by setting the PSL_T flag. The following
instruction was a system call that activated tracing (as all
instructions do) by copying PSL_T to PSL_TP (trace pending). The
first instruction of the new executable image would trigger a trace
fault.
This is not portable to all platforms and the behavior was replaced with
ptrace(PT_TRACE_ME, ...) since FreeBSD forked off of the CSRG repository.
Platforms either incorrectly call execve(), trigger trace faults inside
the original executable, or do not contain an implementation of this
function.
The exect() interface is deprecated or removed on NetBSD and OpenBSD.
Submitted by: Ali Mashtizadeh <ali@mashtizadeh.com>
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D14989
mdoc treats verbatim quotes in .Dl as string delimiters and does
not pass them to the rendered output. Use the special char \*q to specify
a double quote.
PR: 216755
MFC after: 3 days
To create hybrid boot media we want to specify a partition at a known location.
This extends the syntax of size partitions to include an optional offset that
can be absolute or relative. It also introduces validation to make sure that
this hasn't resulted in overlapping partitions. I haven't added this to the
file and process partition specifications yet but the mechanics are designed
such that if someone comes up with a good way of specifying the offset it
will be fairly easy to add in.
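
The overlap validation mentioned above can be sketched like this. It is
an illustration under stated assumptions, not the tool's code: the
structure and function names are hypothetical, every partition is
assumed to have a nonzero size, and start + size is assumed not to
overflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical partition record: extent in blocks. */
struct part_sketch {
	uint64_t start;		/* first block of the partition */
	uint64_t size;		/* number of blocks, assumed > 0 */
};

/*
 * Return true if any two partitions share at least one block.
 * Two half-open ranges [a, a+n) and [b, b+m) intersect exactly
 * when a < b+m and b < a+n.  The pairwise scan is O(n^2), which
 * is fine for the handful of partitions in an image.
 */
bool
parts_overlap(const struct part_sketch *p, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		for (size_t j = i + 1; j < n; j++) {
			if (p[i].start < p[j].start + p[j].size &&
			    p[j].start < p[i].start + p[i].size)
				return (true);
		}
	}
	return (false);
}
```

Running the check after all absolute and relative offsets have been
resolved keeps the parsing of each partition specification independent
of the others.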
Reviewed by: imp
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D14916
Add one extra lock initialization to iflib_register() that was missed
in the git<->phab conversion.
Split out flag manipulation from general context manipulation in iflib
To avoid blocking on the context lock in the swi thread and risk potential
deadlocks, this change protects lighter weight updates that only need to
be consistent with each other with their own lock.
Submitted by: Matthew Macy <mmacy@mattmacy.io>
Reviewed by: shurd
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D14967
Directory mtime will only change if a file is added or removed, not
modified. For /var/cron/tabs, this is fine because of how crontab(1) manages
it using temp files so all crontab(1) changes will trigger a reload of the
database.
For /etc/cron.d and /usr/local/etc/cron.d, this is not necessarily the case.
Instead of checking their mtime, we should descend into them and check mtime
on all jobs also.
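
A minimal sketch of the "descend and check" idea follows. It is not
cron's implementation: newest_mtime() is a hypothetical helper, it
looks only one level deep, and it silently skips entries whose path
does not fit the fixed buffer.

```c
#include <sys/stat.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: return the newest mtime among a directory
 * and the entries directly inside it, or (time_t)-1 if the
 * directory cannot be examined.
 */
time_t
newest_mtime(const char *dir)
{
	char path[1024];
	struct stat sb;
	struct dirent *de;
	time_t newest;
	DIR *d;
	int len;

	if (stat(dir, &sb) != 0)
		return ((time_t)-1);
	if ((d = opendir(dir)) == NULL)
		return ((time_t)-1);
	newest = sb.st_mtime;
	while ((de = readdir(d)) != NULL) {
		if (strcmp(de->d_name, ".") == 0 ||
		    strcmp(de->d_name, "..") == 0)
			continue;
		len = snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
		if (len < 0 || (size_t)len >= sizeof(path))
			continue;	/* path too long for the sketch */
		if (stat(path, &sb) == 0 && sb.st_mtime > newest)
			newest = sb.st_mtime;
	}
	closedir(d);
	return (newest);
}
```

Comparing this value against the one recorded at the last load detects
an in-place edit of a job file, which the directory's own mtime would
miss.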
Reported by: des
Reviewed by: bapt
MFC after: 1 week
The change adds a -t <name> option to zpool create and a -t option to zpool
import in its form with an old name and a new name. This allows
importing (or creating) a pool under a name that's different from its real,
permanent name without affecting that name. This is useful when working
with VM images or images of other physical systems if they happen to
have a ZFS pool with the same name as the host system.
The changes come from ZoL with some small tweaks.
The porting has been done by julian.
The change is being submitted to OpenZFS:
https://github.com/openzfs/openzfs/pull/600
Submitted by: julian
Reviewed by: smh
MFC after: 2 weeks
Sponsored by: Panzura (porting)
Differential Revision: https://reviews.freebsd.org/D14972