Commit Graph

127762 Commits

Author SHA1 Message Date
Michael Tuexen
f1903dc055 Wakeup the application when doing PD-API for unordered DATA chunks.
Work done with rrs@.

MFC after:		1 week
2019-07-22 18:11:35 +00:00
Ruslan Bukin
e8e90fef03 Remove unused header.
Sponsored by:	DARPA, AFRL
2019-07-22 16:50:37 +00:00
Ruslan Bukin
951e058411 o Add support for BERI IOMMU device
o Add experimental IOMMU support to xDMA framework

The BERI IOMMU device is part of the CHERI device-model project [1]. It
translates memory addresses for various BERI peripherals modelled in
software. It accepts the FreeBSD/mips64 page directory format and manages
the BERI TLB.

1. https://github.com/CTSRD-CHERI/device-model

Sponsored by:	DARPA, AFRL
2019-07-22 16:01:20 +00:00
Justin Hibbits
5db86748b5 powerpc64/mmu: Make moea64_pvo_enter() return if an entry already exists
Summary:
Instead of searching for a PVO entry before adding, take advantage of
the fact that RB_INSERT() returns NULL if it inserts, and the existing entry if
an entry exists, without inserting a new entry.  This saves an extra tree
traversal in the cases where the PVO does not exist.

Reviewed by:	luporl
Differential Revision: https://reviews.freebsd.org/D20944
2019-07-22 03:11:54 +00:00
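
The insert-or-return-existing contract described in the commit above can be sketched in plain C. This is only an illustration under stated assumptions: `struct node` and `tree_insert()` are hypothetical names, and the tree is a simple unbalanced binary search tree rather than the red-black tree that RB_INSERT() maintains. The only point is the return convention: NULL when the new node was inserted, the existing node otherwise, so the caller needs a single traversal.

```c
#include <stddef.h>

/* Hypothetical node type; the real code stores PVO entries. */
struct node {
	unsigned long key;
	struct node *left, *right;
};

/*
 * Insert n into the tree rooted at *root.
 * Returns NULL if n was inserted, or the existing node with the same
 * key (leaving the tree unchanged), mirroring the contract the commit
 * message describes for RB_INSERT().
 */
static struct node *
tree_insert(struct node **root, struct node *n)
{
	struct node **link = root;

	while (*link != NULL) {
		if (n->key == (*link)->key)
			return (*link);		/* already present */
		link = (n->key < (*link)->key) ?
		    &(*link)->left : &(*link)->right;
	}
	n->left = n->right = NULL;
	*link = n;				/* inserted */
	return (NULL);
}
```

A caller written against this contract does not look the key up first; it attempts the insert and treats a non-NULL result as "entry already exists".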
Konstantin Belousov
13ff4eb1fa Switch the rest of the refcount(9) functions to bool return type.
There are some explicit comparisons of refcount_release(9) result
with 0/1, which are fine.

Reviewed by:	markj, mjg
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21014
2019-07-21 20:16:48 +00:00
Ian Lepore
d4828bcfc7 Add support for setting the aging/frequency-offset register via sysctl.
The 2127 and 2129 chips support a frequency tuning value in the range of
-7 through +8 PPM; add a sysctl handler to read and set the value.
2019-07-21 17:14:39 +00:00
Alan Cox
f606d835de With the introduction of software dirty bit emulation for managed mappings,
we should test ATTR_SW_DBM, not ATTR_AP_RW, to determine whether to set
PGA_WRITEABLE.  In effect, we are currently setting PGA_WRITEABLE based on
whether the dirty bit is preset, not whether the mapping is writeable.
Correct this mistake.

Reviewed by:	markj
X-MFC with:	r350004
Differential Revision:	https://reviews.freebsd.org/D21013
2019-07-21 17:00:19 +00:00
Konstantin Belousov
1004039858 Fix userspace build after r350199.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-07-21 16:24:40 +00:00
Konstantin Belousov
f1cf2b9dcb Check and avoid overflow when incrementing fp->f_count in
fget_unlocked() and fhold().

On a sufficiently large machine, f_count can legitimately be very large,
e.g. malicious code can dup the same fd up to the per-process
file descriptor limit, and then fork as many times as it can.
On a smaller machine, I see
	kern.maxfilesperproc: 939132
	kern.maxprocperuid: 34203
which already overflows u_int.  Moreover, malicious code can create
transient references by sending fds over unix sockets.

I realized that this check was missing after reading
https://secfault-security.com/blog/FreeBSD-SA-1902.fd.html

Reviewed by:	markj (previous version), mjg
Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D20947
2019-07-21 15:07:12 +00:00
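
The arithmetic behind the overflow claim in the commit above can be checked with a short C sketch. The two limits are the values quoted in the message; `worst_case_refs()` and `count_acquire_checked()` are hypothetical helpers written for illustration, and the checked increment is deliberately non-atomic, unlike what a kernel reference counter would need.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Values quoted in the commit message. */
#define	MAXFILESPERPROC	939132ULL
#define	MAXPROCPERUID	34203ULL

/* Upper bound on references to one file: every process holds every slot. */
static uint64_t
worst_case_refs(void)
{
	return (MAXFILESPERPROC * MAXPROCPERUID);
}

/*
 * Hypothetical checked increment: refuse to bump the counter when it
 * is already at UINT_MAX instead of letting it wrap to 0.
 * Not atomic; a sketch of the check only.
 */
static bool
count_acquire_checked(unsigned int *count)
{
	if (*count == UINT_MAX)
		return (false);
	(*count)++;
	return (true);
}
```

The product is 32,121,131,796, well above the 4,294,967,295 that a 32-bit u_int can hold, which is why an unchecked increment could wrap.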
Alan Cox
1a4cb969d1 Introduce pmap_store(), and use it to replace pmap_load_store() in places
where the page table entry was previously invalid.  (Note that I did not
replace pmap_load_store() when it was followed by a TLB invalidation, even
if we are not using the return value from pmap_load_store().)

Correct an error in pmap_enter().  A test for determining when to set
PGA_WRITEABLE was always true, even if the mapping was read only.

In pmap_enter_l2(), when replacing an empty kernel page table page by a
superpage mapping, clear the old l2 entry and issue a TLB invalidation.  My
reading of the ARM architecture manual leads me to believe that the TLB
could hold an intermediate entry referencing the empty kernel page table
page even though it contains no valid mappings.

Replace a couple direct uses of atomic_clear_64() by the new
pmap_clear_bits().

In a couple comments, replace the term "paging-structure caches", which is
an Intel-specific term for the caches that hold intermediate entries in the
page table, with wording that is more consistent with the ARM architecture
manual.

Reviewed by:	markj
X-MFC after:	r350004
Differential Revision:	https://reviews.freebsd.org/D20998
2019-07-21 03:26:26 +00:00
Justin Hibbits
b982c7ee20 powerpc: Remove an unnecessary #ifdef guard from slb.c
slb.c is only compiled for powerpc64, so no need for the #ifdef in this block.
2019-07-21 03:19:54 +00:00
Ian Lepore
2444018f7d Rewrite the nxprtc chip init to extend battery life by using power-saving
features offered by the chips.

For 2127 and 2129 chips, fix the detection of when chip-init is needed.  The
chip config needs to be reset whenever power was lost, but the logic was
wrong for 212x chips (it only worked for 8523).  Now the "oscillator
stopped" bit rather than the power manager mode is used to detect startup
after powerfail.

For all chips, disable the clock output pin.

For chips that have a timestamp/tamper-monitor feature, turn off monitoring
of the timestamp trigger pin.

The 8523, 2127, and 2129 chips have a "power manager" feature that offers
several options.  We've been using the default mode which enables
everything.  Now the code sets the power manager options to

 - direct-switch (when Vdd < Vbat, without extra threshold check)
 - no battery monitor
 - no external powerfail monitor

This reduces the current draw while running on battery from 1930nA to 880nA,
which should roughly double the lifespan of the battery under load.

Because battery checking is a nice thing to have, the code now does a check
at startup, and then once a day after that, instead of checking continuously
(but only actually reporting at startup).  The battery check is now done by
setting the power manager back to default mode, sleeping briefly while it
makes a voltage measurement, then switching back to power-saving mode.
2019-07-20 21:10:27 +00:00
Mark Johnston
b16e57a6c9 Rename vm_page_{import,release}() to vm_page_zone_{import,release}().
I would like to use the name vm_page_release() for a different purpose,
and vm_page_{import,release}() are local to vm_page.c.

Reviewed by:	kib
MFC after:	1 week
2019-07-20 18:25:41 +00:00
Justin Hibbits
cafceaebea powerpc/SPE: Enable SPV bit for EFSCFD instruction emulation
EFSCFD (floating point single convert from double) emulation requires saving
the high word of the register, which uses SPE instructions.  Enable the SPE
to avoid an SPV Unavailable exception.

MFC after:	1 week
2019-07-20 18:22:01 +00:00
Emmanuel Vadot
ad6521c24f dtso: allwinner: Add an overlay for H3 i2c0
Most of the H3 boards don't enable i2c as it is unused.
Add an overlay so it's easier for users to use i2c devices.
2019-07-20 17:42:46 +00:00
John Baldwin
87c39157c6 Improve the precision of bhyve's vPIT.
Use 'struct bintime' instead of 'sbintime_t' to manage times in vPIT
to postpone rounding to final results rather than intermediate
results.  In tests performed by Joyent, this reduced the error measured
by Linux guests by 59 ppm.

Smart OS bug:	https://smartos.org/bugview/OS-6923
Submitted by:	Patrick Mooney
Reviewed by:	rgrimes
Obtained from:	SmartOS / Joyent
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20335
2019-07-20 15:59:49 +00:00
Emmanuel Vadot
d5fdfa2c8a arm64: Implement HWCAP
Add HWCAP support for arm64.
The defines are the same as in Linux, and a userland program can use
elf_aux_info to get the data.
We only save the common denominator for all cores in case the
big and little clusters have different support (this is known to
exist even if we don't support those SoCs in FreeBSD).

Differential Revision:	https://reviews.freebsd.org/D17137
2019-07-20 14:29:11 +00:00
Ganbold Tsagaankhuu
b24594e544 Add emmc support for Rockchip RK3399 SoC.
Tested on NanoPC-T4 board.

Reviewed by:	manu
Differential Revision:	https://reviews.freebsd.org/D20156
2019-07-20 02:53:06 +00:00
Ganbold Tsagaankhuu
ea01660f1c Add driver for Rockchip RK3399 eMMC PHY.
Tested on NanoPC-T4 board.

Reviewed by:	manu
Differential Revision:	https://reviews.freebsd.org/D20840
2019-07-20 02:03:31 +00:00
Konstantin Belousov
47c3450e50 Fix leak of memory and file refs with sendmsg(2) over unix domain sockets.
When sendmsg(2) successfully internalized one SCM_RIGHTS control
message, but failed to process some other control message later, both
file references and filedescent memory need to be freed. This was not
done; only the mbuf chain was freed.

Noted, test case written, reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21000
2019-07-19 20:51:39 +00:00
Doug Moore
312df2c1dd Define vm_map_entry_in_transition to handle an in-transition map
entry, combining code currently in vm_map_unwire and
vm_map_wire_locked into a single function, called by each of them for
entries in transition.

Discussed with: kib, markj
Reviewed by: alc
Approved by: kib, markj (mentors, implicit)
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D20833
2019-07-19 20:47:35 +00:00
Alexander Motin
89b35a5274 Add Accessible Max Address Configuration support to camcontrol.
AMA replaced HPA in the ACS-3 specification.  It allows limiting the size of
the disk, like HPA, but declares inaccessible data as indeterminate.  One
of its practical use cases is to under-provision SATA SSDs for better
reliability and performance.

While there, fix HPA Security detection/reporting.

MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2019-07-19 19:15:08 +00:00
Warner Losh
5e83c2ffaa Keep track of the number of commands that exhaust their retry limit.
While we print failure messages on the console, sometimes logs are lost or
overwhelmed. Keeping a count of how many times we've failed retriable commands
helps get a magnitude of the problem.
2019-07-19 18:39:24 +00:00
Warner Losh
c37fc318c4 Keep track of the number of retried commands.
Retried commands can indicate a performance degradation of an nvme drive. Keep
track of the number of retries and report it out via sysctl, just like number of
commands and interrupts.
2019-07-19 18:39:18 +00:00
Warner Losh
710becdd96 Remove pre-FreeBSD 7.0 compatibility. 2019-07-19 18:38:47 +00:00
Warner Losh
eec0e91e05 Add comments about KERN_OPT here. 2019-07-19 17:48:29 +00:00
Warner Losh
1071b50a65 Use sysctl + CTLRWTUN for hw.nvme.verbose_cmd_dump.
Also convert it to a bool. While the rest of the driver isn't yet bool clean,
this will help.

Reviewed by: cem@
Differential Revision: https://reviews.freebsd.org/D20988
2019-07-19 00:32:56 +00:00
Warner Losh
c75bdc044d Provide new tunable hw.nvme.verbose_cmd_dump
The nvme drive dumps only the most relevant details about a command when it
fails. However, there are times this is not sufficient (such as debugging weird
issues for a new drive with a vendor). Setting hw.nvme.verbose_cmd_dump=1
in loader.conf will enable more complete debugging information about each
command that fails.

Reviewed by: rpokala
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20988
2019-07-18 21:58:51 +00:00
Warner Losh
62d2cf1847 Provide macros to extract the sub-fields of the CAP_LO and CAP_HI registers.
These macros make places where we extract these easier to read. The shift and
mask stuff is also a bit tedious and error prone. Start with the CAP_LO and
CAP_HI registers since their scope is somewhat constrained. This is a style
change only, no functional changes.

Reviewed by: chuck
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20979
2019-07-18 15:41:10 +00:00
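
The shift-and-mask pattern that the commit above wraps in macros can be sketched as follows. The names and the bit layout here are assumptions made for illustration (an 8-bit field at bits 31:24 and a 16-bit field at bits 15:0 of a 32-bit register); they are not copied from the nvme driver headers.

```c
#include <stdint.h>

/* Hypothetical layout, for illustration only. */
#define	EX_CAP_LO_TO_SHIFT	24
#define	EX_CAP_LO_TO_MASK	0xffu
#define	EX_CAP_LO_MQES_SHIFT	0
#define	EX_CAP_LO_MQES_MASK	0xffffu

/* Extract a field: shift it down to bit 0, then mask off its neighbours. */
#define	EX_CAP_LO_TO(reg) \
	(((reg) >> EX_CAP_LO_TO_SHIFT) & EX_CAP_LO_TO_MASK)
#define	EX_CAP_LO_MQES(reg) \
	(((reg) >> EX_CAP_LO_MQES_SHIFT) & EX_CAP_LO_MQES_MASK)

/* Example use: read both fields out of one 32-bit register value. */
static void
ex_decode(uint32_t reg, uint32_t *to, uint32_t *mqes)
{
	*to = EX_CAP_LO_TO(reg);
	*mqes = EX_CAP_LO_MQES(reg);
}
```

Keeping the shift and mask next to each other under one name means each call site states which field it wants instead of repeating the bit arithmetic.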
Andrew Turner
f1fbf9c3b1 Rename arm64 macros in preparation for a script to generate them.
I have a script to generate most of the ID_AA64* macros from the Arm
XML source [1]. In preparation for using this we need to clean up the
macros to be in line with what the script will generate. This is the
first step: rename the macros to follow the names in said XML.

[1] https://developer.arm.com/architectures/cpu-architecture/a-profile/exploration-tools

MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D20976
2019-07-18 13:58:04 +00:00
Ian Lepore
18cd8a2df8 Fix a paste-o, set is212x = false for other chip types. Doh! 2019-07-18 01:37:00 +00:00
Ian Lepore
634a2d26fd Handle the PCF2127 RTC chip the same as PCF2129 when init'ing the chip.
This affects the detection of 24-hour vs AM/PM mode... the ampm bit is in a
different location on 2127 and 2129 chips compared to other nxp rtc chips.
I noticed the 2127 case wasn't being handled correctly when I accidentally
misconfiged my system by claiming my PCF2129 was a 2127.
2019-07-18 01:30:56 +00:00
Kirk McKusick
fdf34aa3a5 The error reported in FS-14-UFS-3 can only happen on UFS/FFS
filesystems that have block pointers that are out-of-range for their
filesystem. These out-of-range block pointers are corrected by
fsck(8) so are only encountered when an unchecked filesystem is
mounted.

A new "untrusted" flag has been added to the generic mount interface
that can be set when mounting media of unknown provenance or integrity.
For example, a daemon that automounts a filesystem on a flash drive
when it is plugged into a system.

This commit adds a test to UFS/FFS that validates all block numbers
before using them. Because checking for out-of-range blocks adds
unnecessary overhead to normal operation, the tests are only done
when the filesystem is mounted as an "untrusted" filesystem.

Reported by:  Christopher Krah, Thomas Barabosch, and Jan-Niclas Hilgert of Fraunhofer FKIE
Reported as:  FS-14-UFS-3: Out of bounds read in write-2 (ffs_alloccg)
Reviewed by:  kib
Sponsored by: Netflix
2019-07-17 22:07:43 +00:00
Kristof Provost
cd7795a5a4 riscv: Return vm_paddr_t in pmap_early_vtophys()
We can't use a u_int to compute the physical address in
pmap_early_vtophys(). Our int is 32-bit, but the physical address is
64-bit. This works fine if everything lives below 0x100000000, but as
soon as it doesn't, this breaks.

MFC after:	1 week
Sponsored by:	Axiado
2019-07-17 21:25:26 +00:00
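
The truncation described in the commit above is easy to reproduce in isolation. In this sketch `pa_as_u32()` and `pa_as_u64()` are hypothetical stand-ins for returning the address through a 32-bit versus a 64-bit type; the example addresses are chosen for illustration, not taken from any particular board.

```c
#include <stdint.h>

/* Returning a physical address through a 32-bit type drops bits 63:32. */
static uint32_t
pa_as_u32(uint64_t pa)
{
	return ((uint32_t)pa);
}

/* A 64-bit return type, as with vm_paddr_t on a 64-bit platform, keeps them. */
static uint64_t
pa_as_u64(uint64_t pa)
{
	return (pa);
}
```

An address such as 0x80200000 survives either path, while 0x180200000 comes back from the 32-bit path as 0x80200000, silently pointing at different memory.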
Warner Losh
204498d7c2 Remove now-obsolete comment. 2019-07-17 20:43:14 +00:00
Alan Somers
0122532ee0 F_READAHEAD: Fix r349248's overflow protection, broken by r349391
I accidentally broke the main point of r349248 when making stylistic changes
in r349391.  Restore the original behavior, and also fix an additional
overflow that was possible when uio->uio_resid was nearly SSIZE_MAX.

Reported by:	cem
Reviewed by:	bde
MFC after:	2 weeks
MFC-With:	349248
Sponsored by:	The FreeBSD Foundation
2019-07-17 17:01:07 +00:00
Mark Johnston
61f2f0bae6 Fix FASTTRAPIOC_GETINSTR.
This ioctl is used when a breakpoint is encountered while disassembling
a symbol in the target process.  Since only one DTrace consumer can
toggle or enumerate fasttrap probes from a given process at a time, this
ioctl does not appear to be used in practice.
2019-07-17 16:38:29 +00:00
Sean Bruno
fceeeec75f Add the ability to accept the default pin widget configuration to help
with various laptops using hdaa(4) sound devices.  We don't seem to know
the "correct" configurations for these devices and the defaults are far
superior, e.g. they work if you don't nuke the default configs.

PR:	200526
Differential Revision:	https://reviews.freebsd.org/D17772
2019-07-17 04:13:46 +00:00
Kirk McKusick
ba554157a3 Style.
No change intended.
2019-07-16 23:39:39 +00:00
Kirk McKusick
1fd136ec5e When a process attempts to allocate space on a full filesystem, a
filesystem full message is sent to the offending process or the
kernel log if the offending process cannot be identified.

To prevent an explosion of messages, the kernel ppsratecheck()
function is used to limit the messages to one per second. This
revision changes the variable that tracks the rate of these messages
from a systemwide limit to a per-filesystem limit by moving it from
a global variable to a variable in the ufsmount structure.

Suggested by: kib
Reviewed by:  kib
Sponsored by: Netflix
2019-07-16 23:12:27 +00:00
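
The move from a systemwide limit to a per-filesystem limit in the commit above can be illustrated with a small userspace sketch. It does not reproduce ppsratecheck(); `struct fs_ratelimit` and `rate_ok()` are hypothetical, and the caller passes the current time in whole seconds explicitly. The point is that the limiter state lives in a per-filesystem structure, so one noisy filesystem cannot suppress messages for another.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical per-filesystem limiter state. The commit keeps its
 * equivalent in the ufsmount structure instead of a global variable.
 */
struct fs_ratelimit {
	time_t	last_sec;	/* second in which a message was last counted */
	int	count;		/* messages allowed during that second */
};

/* Return true if a message may be emitted now, at most max_per_sec per second. */
static bool
rate_ok(struct fs_ratelimit *rl, time_t now, int max_per_sec)
{
	if (now != rl->last_sec) {
		/* A new second has started: reset the budget. */
		rl->last_sec = now;
		rl->count = 0;
	}
	if (rl->count >= max_per_sec)
		return (false);
	rl->count++;
	return (true);
}
```

With one zero-initialized `struct fs_ratelimit` per mounted filesystem, two full filesystems each get their own one-message-per-second budget.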
Warner Losh
dc9df3a59d Assume that the timeout value from the capacity is 1-based
Neither the 1.3 nor the 1.4 standard says this number is 1-based, but adding 1
costs little and cheaply copes with those NVMe drives that report '0' in this
field. This is consistent with what the Linux driver does as well.
2019-07-16 22:55:30 +00:00
Cy Schubert
caddc9e343 As of upstream fil.c CVS r1.53 (March 1, 2009), prior to the import of
ipfilter 5.1.2 into FreeBSD-10, the fix for, 2580062 from/to targets
should be able to use any interface name, moved frentry.fr_cksum to
prior to frentry.fr_func thereby making this code redundant. After
investigating whether this fix to move fr_cksum was correct and if it
broke anything, it has been determined that the fix is correct and this
code is redundant. We remove it here.

MFC after:	2 weeks
2019-07-16 19:00:42 +00:00
Cy Schubert
a422d59f7b Refactor, removing one compare.
This changes the return code; however, the caller only tests for 0 and != 0.
One might ask then, why multiple return codes when the caller only tests
for 0 and != 0? From what I can tell, Darren probably passed various
return codes for the sake of debugging. The debugging code is long gone;
however, we can still use the different return codes using DTrace FBT
traces. We can still determine why the compare failed by examining the
differences between the fr1 and fr2 frentry structs, which is a simple
test in DTrace. This allows reducing the number of tests, improving the
code while not affecting our ability to capture information for
diagnostic purposes.

MFC after:	1 week
2019-07-16 19:00:38 +00:00
Michael Tuexen
e4a5561e01 Fix compilation on platforms using gcc.
When compiling RACK on platforms using gcc, a warning that tcp_outflags
is defined but not used is issued and terminates compilation on PPC64,
for example. So don't indicate that tcp_outflags is used.

Reviewed by:		rrs@
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D20971
2019-07-16 17:54:20 +00:00
Eric van Gyzen
9d3ecb7e62 Adds signal number format to kern.corefile
Add format capability to core file names to include signal
that generated the core. This can help various validation workflows
where all cores should not be considered equally (SIGQUIT is often
intentional and not an error unlike SIGSEGV or SIGBUS)

Submitted by:	David Leimbach (leimy2k@gmail.com)
Reviewed by:	markj
MFC after:	1 week
Relnotes:	sysctl kern.corefile can now include the signal number
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20970
2019-07-16 15:51:09 +00:00
Mark Johnston
7af2abed6a Always use the software DBM bit for now.
r350004 added most of the machinery needed to support hardware DBM
management, but it did not intend to actually enable use of the hardware
DBM bit.

Reviewed by:	andrew
MFC with:	r350004
Sponsored by:	The FreeBSD Foundation
2019-07-16 15:41:09 +00:00
Mark Johnston
32e09b04ce Fix the arm64 page table entry attribute mask.
It did not include the DBM or contiguous bits.

Reported by:	andrew
Reviewed by:	andrew
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2019-07-16 15:38:01 +00:00
Mark Johnston
9da9cb48e9 Propagate attribute changes during demotion.
After r349117 and r349122, some mapping attribute changes do not trigger
superpage demotion. However, pmap_demote_l2() was not updated to ensure
that the replacement L3 entries carry any attribute changes that
occurred since promotion.

Reported and tested by:	manu
Reviewed by:	alc
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20965
2019-07-16 14:40:49 +00:00
Andriy Gapon
a70e114dc6 bge: check that the bus is a pci bus before using it as such
This fixes the following panic on powerpc:
  pci_get_vendor failed for pcib1 on bus ofwbus0, error = 2

PR:		238730
Reported by:	Dennis Clarke <dclarke@blastwave.org>
Tested by:	Dennis Clarke <dclarke@blastwave.org>
MFC after:	2 weeks
2019-07-16 08:36:49 +00:00
Justin Hibbits
8d00d89228 powerpc: Fix casueword(9) post-r349951
'=' asm constraint marks a variable as write-only.  Because of this, gcc
throws away the initialization of 'res', causing garbage to be returned if
the CAS was successful.  Use '+' to mark res as read/write, so that the
initialization stays in the generated asm.  Also, fix the reservation
clearing stwcx store index register in casueword32, and only do the dummy
store when needed; skip it if the real store has already succeeded.
2019-07-16 03:55:27 +00:00