232974 Commits

Author SHA1 Message Date
Konstantin Belousov
a3c7cd11d2 Fix double-load of %cr3 and double-copy of the stack frame for the
kernel entry from userspace vm86.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-05-22 13:30:56 +00:00
Andrey V. Elsukov
67ad3c0bf9 Restore the ability to keep states after parent rule deletion.
This feature is disabled by default and was removed when dynamic states
implementation changed to be lockless. Now it is reimplemented with small
differences - when dyn_keep_states sysctl variable is enabled,
dyn_match_ipv[46]_state() function doesn't match child states of deleted
rule. And thus they are keept alive until expired. ipfw_dyn_lookup_state()
function does check that state was not orphaned, and if so, it returns
pointer to default_rule and its position in the rules map. The main visible
difference is that orphaned states still have the same rule number that
they have before parent rule deleted, because now a state has many fields
related to rule and changing them all atomically to point to default_rule
seems hard enough.

Reported by:	<lantw44 at gmail.com>
MFC after:	2 days
2018-05-22 13:28:05 +00:00
Konstantin Belousov
14f7050dba Enable IBRS when entering an interrupt handler from usermode.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-05-22 13:25:15 +00:00
Andrew Turner
5a00bf535c Only set realmem based on memory where the EXFLAG_NOALLOC is unset. This
will allow us to query the maps at any time without disturbing this value.

Obtained from:	ABT Systems Ltd
Sponsored by:	Turing Robotic Industries
2018-05-22 13:21:44 +00:00
Andrew Turner
89b5faf887 On ThunderX2 we need to be careful to only map the memory the firmware
lists in the EFI memory map. As such we need to reduce the mappings to
restrict them to not be the full 1G block. For now reduce this to a 2M
block, however this may be further restricted to be 4k page aligned as
other SoCs may require.

This allows ThunderX2 to boot reliably to userspace without performing
any speculative memory accesses to invalid physical memory.

Sponsored by:	DARPA, AFRL
2018-05-22 11:26:41 +00:00
Emmanuel Vadot
c6231a5f26 bus_dma(9): arm64 implementation notes
Indicate that BUS_DMA_COHERENT is supported for bus_dmamem_alloc and
bus_dmamem_create in the arm64 implementation.
2018-05-22 11:17:45 +00:00
Andrew Turner
9d0728e04e Stop using the DMAP region to map ACPI memory.
On some arm64 boards we need to access memory in ACPI tables that is not
mapped in the DMAP region. To handle this create the needed mappings in
pmap_mapbios in the KVA space.

Submitted by:	Michal Stanek (mst@semihalf.com)
Sponsored by:	Cavium
Differential Revision:	https://reviews.freebsd.org/D15059
2018-05-22 11:16:45 +00:00
Andrew Turner
79402150c1 Switch arm64 to use the same physmem code as 32-bit arm.
The main advantage of this is to allow us to exclude memory from being
used by the kernel. This may be from the memreserve property, or ranges
marked as no-map under the reserved-memory node.

More work is still needed to remove the physmap array. This is still used
for creating the DMAP region, however other patches need to be committed
before we can remove this.

Obtained from:	ABT Systems Ltd
Sponsored by:	Turing Robotic Industries
2018-05-22 11:07:04 +00:00
Konstantin Belousov
e95725feca Implement printf(3) family %m format string extension.
Reviewed by:	ed, dim (code only)
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-05-22 11:05:40 +00:00
Andrew Turner
66971d57e9 Allow the 32-bit arm physmem code to work on arm64.
This will help simplify the arm64 code and allow us to properly exclude
memory that should never be mapped.

Obtained from:	ABT Systems Ltd
Sponsored by:	Turing Robotic Industries
2018-05-22 10:31:06 +00:00
Andrew Turner
89ae4d7f7a Coalesce adjacent physical mappings.
This reduces the overhead when we have many small mappings, e.g. on some
EFI systems. This is to help use this code on arm64 where we may have a
large number of entries from the EFI firmware.

Obtained from:	ABT Systems Ltd
Sponsored by:	Turing Robotic Industries
Differential Revision:	https://reviews.freebsd.org/D15477
2018-05-22 10:14:20 +00:00
Roger Pau Monné
ffe4446b33 xen-blkback: do not use state 3 (XenbusStateInitialised)
Linux will not connect to a backend that's in state 3
(XenbusStateInitialised), it needs to be in state 2
(XenbusStateInitWait) for Linux to attempt to connect to the backend.

The protocol seems to suggest that the backend should indeed wait in
state 2 for the frontend to connect, which makes state 3 unusable for
disk backends.

Also make sure blkback will connect to the frontend if the frontend
reaches state 3 (XenbusStateInitialised) before blkback has processed
the results from the hotplug script (Submitted by Nathan Friess).

MFC after:	1 week
2018-05-22 08:51:16 +00:00
Mateusz Guzik
99ece3a9cd Reduce sdt-related branch-fest in mi_switch.
The code was evaluating flags before resorting to checking if dtrace is
enabled. This was inducing forward jumps in the common case.
2018-05-22 08:27:33 +00:00
Eitan Adler
dbcdf411a1 top(1): increase size of 'status' buffer
This corrects a warning issues by gcc9:
/srv/src/freebsd/head/usr.bin/top/machine.c:988:22: warning: '%5zu'
directive writing between 5 and 20 bytes into a
 region of size 15 [-Wformat-overflow=]
     sprintf(status, "?%5zu", state);
2018-05-22 07:56:58 +00:00
Mateusz Guzik
2466d12b09 sx: port over writer starvation prevention measures from rwlock
A constant stream of readers could completely starve writers and this is not
a hypothetical scenario.

The 'poll2_threads' test from the will-it-scale suite reliably starves writers
even with concurrency < 10 threads.

The problem was run into and diagnosed by dillon@backplane.com

There was next to no change in lock contention profile during -j 128 pkg build,
despite an sx lock being at the top.

Tested by:	pho
2018-05-22 07:20:22 +00:00
Mateusz Guzik
9feec7ef69 rw: decrease writer starvation
Writers waiting on readers to finish can set the RW_LOCK_WRITE_SPINNER
bit. This prevents most new readers from coming on. However, the last
reader to unlock also clears the bit which means new readers can sneak
in and the cycle starts over.

Change the code to keep the bit after last unlock.

Note that starvation potential is still there: no matter how many write
spinners are there, there is one bit. After the writer unlocks, the lock
is free to get raided by readers again. It is good enough for the time
being.

The real fix would include counting writers.

This runs into a caveat: the writer which set the bit may now be preempted.
In order to get rid of the problem all attempts to set the bit are preceeded
with critical_enter.

The bit gets cleared when the thread which set it goes to sleep. This way
an invariant holds that if the bit is set, someone is actively spinning and
will grab the lock soon. In particular this means that readers which find
the lock in this transient state can safely spin until the lock finds itself
an owner (i.e. they don't need to block nor speculate how long to spin
speculatively).

Tested by:	pho
2018-05-22 07:16:39 +00:00
Cy Schubert
c76af09019 Conform to Berne Convention.
MFC after:	3 days
2018-05-22 06:22:58 +00:00
Marcelo Araujo
92046bf113 Revert: r334016
Revert for now this change, it in somehow breaks init_pci.
2018-05-22 06:02:11 +00:00
Matt Macy
df58dad520 pmc: annotate locking for po_ssnext in pmc_owner 2018-05-22 05:49:40 +00:00
Marcelo Araujo
2d03aa5999 Include atkbdc header where there are declared the prototype functions
atkbdc_event and atkbdc_init.

MFC after:	4 weeks.
Sponsored by:	iXsystems Inc.
2018-05-22 05:21:53 +00:00
Matt Macy
137fd41bd9 fix i386 builds after r334005 and r334009
r334005: add pc_ibpb_set as it is now referenced by common code
(although presumably not needed on i386 since it has been there
since the first spectre mitigation work on amd64)

r334009: there is no amd64 rflags -> i386 eflags
2018-05-22 05:09:33 +00:00
Matt Macy
821a352a77 pmcstat: add option to not decode the leaf function in top mode
-I will allow the user to see the hot instruction in question
as opposed getting the name of the function
2018-05-22 04:45:46 +00:00
Marcelo Araujo
b5e3928d6d We must free the variable str.
Spotted by:	clang's static analyzer
Submitted by:	Tom Rix <trix_juniper.net>
Reviewed by:	grehan
MFC after:	4 weeks
Sponsored by:	iXsystems Inc.
Differential Revision:	https://reviews.freebsd.org/D10009
2018-05-22 04:08:08 +00:00
Justin Hibbits
1a3eaf6cc8 Add an IPMI attachment for PowerNV systems
IPMI access on PowerNV systems is done through the OPAL firmware.  This adds a
simple attachment for communicating with the FSP/BMC on these machines.  This
has been tested on a Talos POWER9 workstation, only in the bootup phase, noting
the successful attachment messages:

...
ipmi0: IPMI device rev. 0, firmware rev. 2.00, version 2.0, device support mask 0
ipmi0: Number of channels 2
...

The ipmi device has not been added to GENERIC64, but may be after further
testing.  It may also eventually be added to the ipmi module at that point.
2018-05-22 03:57:32 +00:00
Justin Hibbits
5272c9bd07 Add a comment explaining the need of a global temporary variable
cpu_xirr is used only as a temporary location for the OPAL call in
PIC_DISPATCH().

Requested by:	nwhitehorn
2018-05-22 03:24:16 +00:00
Justin Hibbits
9c6ba29de1 Basic OPAL sensor support for POWER9 platforms
Summary:
PowerNV architectures (in the test case POWER9) export sensors via the device
tree, which are accessed via OPAL calls.  This adds sysctl nodes for each
device in a generic fashion.  New sysctl nodes are:

dev.opal_sensor.N.sensor
dev.opal_sensor.N.sensor_min
dev.opal_sensor.N.sensor_max
dev.opal_sensor.N.type
dev.opal_sensor.N.label

These are rooted at a parent attachment under opal, called opalsens.  This does
not add support for the "sensor groups" defined in the device tree.

Reviewed by:	breno.leitao_gmail.com
Differential Revision: https://reviews.freebsd.org/D15362
2018-05-22 02:42:53 +00:00
Eitan Adler
bc875b45c5 top(1): unbreak build with gcc7; fix varargs
- use correct function for varargs argument
- allow build to complete with gcc7 at current WARNS

Reported by:	jhibbits, ian
2018-05-22 02:13:04 +00:00
John Baldwin
9e2154ff1c Cleanups related to debug exceptions on x86.
- Add constants for fields in DR6 and the reserved fields in DR7.  Use
  these constants instead of magic numbers in most places that use DR6
  and DR7.
- Refer to T_TRCTRAP as "debug exception" rather than a "trace trap"
  as it is not just for trace exceptions.
- Always read DR6 for debug exceptions and only clear TF in the flags
  register for user exceptions where DR6.BS is set.
- Clear DR6 before returning from a debug exception handler as
  recommended by the SDM dating all the way back to the 386.  This
  allows debuggers to determine the cause of each exception.  For
  kernel traps, clear DR6 in the T_TRCTRAP case and pass DR6 by value
  to other parts of the handler (namely, user_dbreg_trap()).  For user
  traps, wait until after trapsignal to clear DR6 so that userland
  debuggers can read DR6 via PT_GETDBREGS while the thread is stopped
  in trapsignal().

Reviewed by:	kib, rgrimes
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D15189
2018-05-22 00:45:00 +00:00
Jilles Tjoelker
dc0dbd74c4 sh: Split CNL syntax category to avoid a check on state[level].syntax
No functional change is intended.
2018-05-21 21:52:48 +00:00
Emmanuel Vadot
729ba386f0 devd: Always install devmatch.conf
It allows devd to run devmatch to find the correct driver based on pnp info.

No Objection from:    imp
2018-05-21 21:44:47 +00:00
Emmanuel Vadot
b091392eb8 aw_mmc: Correctly reset the mmc controller
Always disable FIFO access as we don't use it.
Rename some register bits so they are in sync with the register name.

While here add my copyright as I've probably wrote 70% of the code here.
2018-05-21 21:15:46 +00:00
Konstantin Belousov
3621ba1ede Add Intel Spec Store Bypass Disable control.
Speculative Store Bypass (SSB) is a speculative execution side channel
vulnerability identified by Jann Horn of Google Project Zero (GPZ) and
Ken Johnson of the Microsoft Security Response Center (MSRC)
https://bugs.chromium.org/p/project-zero/issues/detail?id=1528.
Updated Intel microcode introduces a MSR bit to disable SSB as a
mitigation for the vulnerability.

Introduce a sysctl hw.spec_store_bypass_disable to provide global
control over the SSBD bit, akin to the existing sysctl that controls
IBRS. The sysctl can be set to one of three values:
0: off
1: on
2: auto

Future work will enable applications to control SSBD on a per-process
basis (when it is not enabled globally).

SSBD bit detection and control was verified with prerelease microcode.

Security:	CVE-2018-3639
Tested by:	emaste (previous version, without updated microcode)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2018-05-21 21:08:19 +00:00
Konstantin Belousov
9be4bbbb21 Add definition for Intel Speculative Store Bypass Disable MSR bits
Security:	CVE-2018-3639
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2018-05-21 21:07:13 +00:00
Konstantin Belousov
2320153fcc Preserve other bits in IA32_SPEC_CTL MSR when changing the IBRS and
STIBP states.

Tested by:	emaste (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2018-05-21 21:05:55 +00:00
Andriy Gapon
a23a4e68fa uchcom: extend hardware support to version 0x30
This change adds support for a UBS<->RS232 adapter based on CH340 (or an
analogue) that I own.  The device seems to have a newer internal version
(0x30) and the existing code incorrectly configures line control for it
resulting in garbled transmission.  The changes are based on what I
learned in Linux drivers for the same hardware.

Additional changes:
- use UCHCOM_REG_LCR1 / UCHCOM_REG_LCR2 instead of explicit 0x18 and
  0x25
- use NULL instead of 0 where a pointer is expected

Reviewed by:	hselasky
MFC after:	3 weeks
Differential Revision: https://reviews.freebsd.org/D15498
2018-05-21 21:04:31 +00:00
Andriy Gapon
dad3e656eb uchcom: remove UCHCOM_REG_BREAK2 alias of UCHCOM_REG_LCR1
Also, add definitions for more bits of UCHCOM_REG_LCR1 as seen in the
Linux driver.  UCHCOM_LCR1_PARENB definition was different from that in
the Linux driver and clashed with newly added UCHCOM_LCR1_RX.  I took a
liberty to change UCHCOM_LCR1_PARENB to the Linux definition as it was
unused in the driver anyway.  This change should make
uchcom_cfg_set_break() easier to understand.

Approved by:	hselasky
MFC after:	2 weeks
2018-05-21 21:02:10 +00:00
Andriy Gapon
40e7b06492 uchcom: reject parity and double stop bits as unsupported
Reviewed by:	hselasky
MFC after:	2 weeks
2018-05-21 21:00:13 +00:00
Andriy Gapon
7acd73fd1e uchcom: add a hardware configuration tweak seen in Linux code
Reviewed by:	hselasky
MFC after:	2 weeks
2018-05-21 20:59:15 +00:00
Andriy Gapon
1d33c9a55f uchcom: add DPRINTF-s to aid debugging of the driver
Reviewed by:	hselasky
MFC after:	2 weeks
2018-05-21 20:58:06 +00:00
Andriy Gapon
d759c295c1 uchcom: report detected product based on USB product ID
Product IDs are specified in vendor documents.  The previously used
device ID is not.  This is a cosmetic change.  No functionality depends
on those IDs.

Reviewed by:	hselasky
MFC after:	2 weeks
2018-05-21 20:57:14 +00:00
Jean-Sébastien Pédron
b554075d14 teken: Rename the "Set Cursor Style" sequence to match vt100.net docs
This fixes inconsistencies with the rest of the `sequences` file.

No functional changes.

Requested by:	ed
2018-05-21 20:35:16 +00:00
Andriy Gapon
27dca831a6 stop and restart kernel event timers in the suspend / resume cycle
I have a system that is very unstable after resuming from suspend-to-RAM
but only if HPET is used as the event timer.  The theory is that SMM
code / firmware could be enabling HPET for its own uses and unexpected
interrupts cause a trouble for it.  Originally I wanted to solve the
problem in hpet_suspend() method, but that was insufficient as the event
timer could get reprogrammed again.

So, it's better, for my case and in general, to stop the event timer(s)
before entering the hardware suspend.

MFC after:	4 weeks
Differential Revision: https://reviews.freebsd.org/D15413
2018-05-21 20:23:04 +00:00
Konstantin Belousov
5988464ec4 Fix grammar.
Submitted by:	alc
MFC after:	1 week
2018-05-21 19:15:05 +00:00
Konstantin Belousov
0a4b04a616 Add missed barrier for pm_gen/pm_active interaction.
When we issue shootdown IPIs, we first assign zero to pm_gens to
indicate the need to flush on the next context switch in case our IPI
misses the context, next we read pm_active. On context switch we set
our bit in pm_active, then we read pm_gen. It is crucial that both
threads see the memory in the program order, otherwise invalidation
thread might read pm_active bit as zero and the context switching
thread might read pm_gen as zero.

IA32 allows CPU for both reads to see zero. We must use the barriers
between write and read. The pm_active bit set is already locked, so
only the invalidation functions need it.

I never saw it in real life, or at least I do not have a good
reproduction case. I found this during code inspection when hunting
for the Xen TLB issue reported by cperciva.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15506
2018-05-21 18:41:16 +00:00
Edward Tomasz Napierala
733efc21c4 Add a somewhat ugly hack that makes OSX serial device node names
human-readable.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2018-05-21 17:33:52 +00:00
Edward Tomasz Napierala
c7ad27069e Fix typo.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2018-05-21 16:50:27 +00:00
Edward Tomasz Napierala
ac4a7f30d2 Improve description strings for USB device-mode serial ports.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2018-05-21 16:33:13 +00:00
Andrey V. Elsukov
4bb8a5b0c9 Remove check for matching the rulenum, ruleid and rule pointer from
dyn_lookup_ipv[46]_state_locked(). These checks are remnants of not
ready to be committed code, and they are there by accident.
Due to the race these checks can lead to creating of duplicate states
when concurrent threads in the same time will try to add state for two
packets of the same flow, but in reverse directions and matched by
different parent rules.

Reported by:	lev
MFC after:	3 days
2018-05-21 16:19:00 +00:00
Andrew Turner
78921ae879 Restrict the faulting addresses we call pmap_fault from to just those that
may fault due to superpage mappings being changed.

Sponsored by:	DARPA, AFRL
2018-05-21 16:14:53 +00:00
Matt Macy
f42a83f2a6 inpcb: revert deferred inpcb free pending further review 2018-05-21 16:13:43 +00:00