137075 Commits

Author SHA1 Message Date
markj
a05393f041 Fix the turnstile_lock() KPI.
turnstile_{lock,unlock}() were added for use in epoch.  turnstile_lock()
returned NULL to indicate that the calling thread had lost a race and
the turnstile was no longer associated with the given lock, or the lock
owner.  However, reader-writer locks may not have a designated owner,
in which case turnstile_lock() would return NULL and
epoch_block_handler_preempt() would leak spinlocks as a result.

Apply a minimal fix: return the lock owner as a separate return value.

Reviewed by:	kib
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21048
2019-07-24 23:04:59 +00:00
markj
9cea16db08 Remove cap_random(3).
Now that we have a way to obtain entropy in capability mode
(getrandom(2)), libcap_random is obsolete.  Remove it.

Bump __FreeBSD_version in case anything happens to use it, though I've
found no consumers.

Reviewed by:	delphij, emaste, oshogbo
Relnotes:	yes
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21033
2019-07-24 22:50:43 +00:00
erj
9bf94a5091 iflib: fix dangling device softc pointer
Commit text by Jake:
If a driver's IFDI_ATTACH_PRE function fails, the iflib_device_register
function will free the ctx pointer. However, it does not reset the
device softc pointer to NULL.

This will result in memory corruption as a future access to the now
invalid pointer will corrupt memory that is later allocated on top of
the same memory location.

The iflib_device_deregister function correctly resets the softc pointer
by using device_set_softc().

This clears up the invalid dangling pointer and prevents memory
corruption that could lead to a panic or undefined behavior if the
device's driver failed to attach.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	erj@, gallatin@
MFC after:	1 week
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D21003
2019-07-24 21:43:41 +00:00
emaste
b7fc72efad enable ig4_acpi on aarch64
The already-listed APMC0D0F ID belongs to the Ampere eMAG aarch64
platform, but ACPI support was not even built on aarch64.

Submitted by:	Greg V <greg_unrelenting.technology>
Differential Revision:	https://reviews.freebsd.org/D21059
2019-07-24 21:26:17 +00:00
emaste
e6cea1ca6d pf: zero output buffer in pfioctl
Avoid potential structure padding leak.

Reported by:	Vlad Tsyrklevich <vlad@tsyrklevich.net>
Reviewed by:	kp
MFC after:	3 days
Security:	Potential kernel memory disclosure
Sponsored by:	The FreeBSD Foundation
2019-07-24 16:51:14 +00:00
krion
8d3bb29faf Allow set MTU more than 1500 bytes.
Submitted by:	Alexandr Fedorov <aleksandr.fedorov_itglobal_dot_com>
Approved by:	jhb, rgrimes
Sponsored by:	ITGlobal.com
Differential Revision:	https://reviews.freebsd.org/D19422
2019-07-24 16:10:20 +00:00
markj
66e6bd0562 Remove a redundant offset computation in elf_load_section().
With r344705 the offset is always zero.

Submitted by:	Wuyang Chung <wuyang.chung1@gmail.com>
2019-07-24 15:18:05 +00:00
tuexen
bd266b70fa Add a sysctl variable ts_offset_per_conn to change the computation
of the TCP TS offset from taking the IP addresses and the TCP port
numbers into account to a version just taking only the IP addresses
into account. This works around broken middleboxes or endpoints.
The default is to keep the behaviour, which is also the behaviour
recommended in RFC 7323.

Reported by:		devgs@ukr.net
Reviewed by:		rrs@
MFC after:		2 weeks
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D20980
2019-07-23 21:28:20 +00:00
emaste
feb038e4b4 mqueuefs: fix struct file leak
In some error cases we previously leaked a stuct file.

Submitted by:	mjg, markj
2019-07-23 20:59:36 +00:00
tuexen
04cac029e7 Don't hold a mutex while calling sbwait. This was found by syzkaller.
Submitted by:		rrs@
Reported by:		markj@
MFC after:		1 week
2019-07-23 18:31:07 +00:00
erj
67dd575813 ixgbe(4): Fix enabling/disabling and reconfiguration of queues
- Wrong order of casting and bit shift caused that enabling and disabling
  queues didn't work properly for queues number larger than 32. Use literals
  with right suffix instead.

- TX ring tail address was not updated during reinitiailzation of TX
  structures. It could block sending traffic.

- Also remove unused variables 'eims' and 'active_queues'.

Submitted by:	Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by:	erj@
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D20826
2019-07-23 18:14:32 +00:00
tuexen
49092d73d3 Fix a LOR in SCTP which was found by running syzkaller.
Submitted by:		rrs@
Reported by:		markj@
MFC after:		1 week
2019-07-23 18:07:36 +00:00
andrew
a68b656d72 As with r350241 use the new UL macro on the main register mask.
MFC after:	1 week
Sponsored by:	DARPA, AFRL
2019-07-23 14:52:46 +00:00
andrew
e52d3fea4b Ensure the arm64 ID register fields are 64 bit types.
Previously only some of the ID register fields were 64 bit. To allow
for a script to generate these mark them all 64 bit. To allow for their
use in assembly we need to use the UINT64_C macro via a new UL macro
to stop the lines from being too long.

MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D20977
2019-07-23 14:40:37 +00:00
ae
5163ded6d4 Eliminate rmlock from ipfw's BPF code.
After r343631 pfil hooks are invoked in net_epoch_preempt section,
this allows to avoid extra locking. Add NET_EPOCH_ASSER() assertion
to each ipfw_bpf_*tap*() call to require to be called from inside
epoch section.

Use NET_EPOCH_WAIT() in ipfw_clone_destroy() to wait until it becomes
safe to free() ifnet. And use on-stack ifnet pointer in each
ipfw_bpf_*tap*() call to avoid NULL pointer dereference in case when
V_*log_if global variable will become NULL during ipfw_bpf_*tap*() call.

Sponsored by:	Yandex LLC
2019-07-23 12:52:36 +00:00
mav
0bba6fe16f Make CAM ATA stack handle disk resizes.
While for ATA disks resize is even more rare situation than for SCSI, it
may happen in case of HPA or AMA being used.  Make ATA XPT report minor
IDENTIFY DATA change to upper layers with AC_GETDEV_CHANGED, and ada(4)
periph driver handle that event, recalculating all the disk properties and
signalling resize to GEOM.  Since ATA has no mechanism of UNIT ATTENTIONs,
like SCSI, it has no way to detect that something has changed.  That is why
this functionality depends on explicit reprobe via XPT_REPROBE_LUN call.

MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2019-07-23 02:11:14 +00:00
jhibbits
b5dc72b50a powerpc: Unbreak 64-bit pmap from 350206
oldpvo is never explicitly NULL'd by moea64_pvo_enter(), so don't check for
NULL to do anything, only check error.

PR:		239372
Reported by:	Francis Little
2019-07-22 22:59:50 +00:00
ian
a6d3a29aed Correct spelling, partion -> partition. 2019-07-22 22:41:44 +00:00
manu
dcad6efe66 arm: ti: Add a driver for ti,sysc bus
ti,sysc is a simple-bus like driver.
Add a driver for it so child nodes can attach.
2019-07-22 21:55:33 +00:00
manu
7df85047a4 arm: ti: Get the hwmods property from the parent node
Since the Linux 5.0 dts the ti,hwmods property is on the parent
ti.sysc node.
2019-07-22 21:53:58 +00:00
brooks
d803d715a9 ata_xpt: Use the correct union member when accessing valid.
In principle this should not matter as it's a union and they point to
the same memory location but based on the code above we should be
accessing .sata and not .ata.

Submitted by:	arichardson
Reviewed by:	scottl, imp
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D21002
2019-07-22 21:07:58 +00:00
asomers
fd3a7f64f4 [skip ci] Fix the comment for cache_purge(9)
This is a merge of r348738 from projects/fuse2

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-07-22 21:03:52 +00:00
tuexen
79ab72bc8e Wakeup the application when doing PD-API for unordered DATA chunks.
Work done with rrs@.

MFC after:		1 week
2019-07-22 18:11:35 +00:00
br
e767c4348a Remove unused header.
Sponsored by:	DARPA, AFRL
2019-07-22 16:50:37 +00:00
br
57b40e04b4 o Add support for BERI IOMMU device
o Add an experimental IOMMU support to xDMA framework

The BERI IOMMU device is the part of CHERI device-model project [1]. It
translates memory addresses for various BERI peripherals modelled in
software. It accepts FreeBSD/mips64 page directories format and manages
BERI TLB.

1. https://github.com/CTSRD-CHERI/device-model

Sponsored by:	DARPA, AFRL
2019-07-22 16:01:20 +00:00
jhibbits
a33be3befe powerpc64/mmu: Make moea64_pvo_enter() return if an entry already exists
Summary:
Instead of searching for a PVO entry before adding, take advantage of
the fact that RB_INSERT() returns NULL if it inserts, and the existing entry if
an entry exists, without inserting a new entry.  This saves an extra tree
traversal in the cases where the PVO does not exist.

Reviewed by:	luporl
Differential Revision: https://reviews.freebsd.org/D20944
2019-07-22 03:11:54 +00:00
kib
c8ac9961b7 Switch the rest of the refcount(9) functions to bool return type.
There are some explicit comparisions of refcount_release(9) result
with 0/1, which are fine.

Reviewed by:	markj, mjg
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21014
2019-07-21 20:16:48 +00:00
ian
ce19d4db5f Add support for setting the aging/frequency-offset register via sysctl.
The 2127 and 2129 chips support a frequency tuning value in the range of
-7 through +8 PPM; add a sysctl handler to read and set the value.
2019-07-21 17:14:39 +00:00
alc
cfe8b54602 With the introduction of software dirty bit emulation for managed mappings,
we should test ATTR_SW_DBM, not ATTR_AP_RW, to determine whether to set
PGA_WRITEABLE.  In effect, we are currently setting PGA_WRITEABLE based on
whether the dirty bit is preset, not whether the mapping is writeable.
Correct this mistake.

Reviewed by:	markj
X-MFC with:	r350004
Differential Revision:	https://reviews.freebsd.org/D21013
2019-07-21 17:00:19 +00:00
kib
a7656b7463 Fix userspace build after r350199.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-07-21 16:24:40 +00:00
kib
7d29da5483 Check and avoid overflow when incrementing fp->f_count in
fget_unlocked() and fhold().

On sufficiently large machine, f_count can be legitimately very large,
e.g. malicious code can dup same fd up to the per-process
filedescriptors limit, and then fork as much as it can.
On some smaller machine, I see
	kern.maxfilesperproc: 939132
	kern.maxprocperuid: 34203
which already overflows u_int.  More, the malicious code can create
transient references by sending fds over unix sockets.

I realized that this check is missed after reading
https://secfault-security.com/blog/FreeBSD-SA-1902.fd.html

Reviewed by:	markj (previous version), mjg
Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D20947
2019-07-21 15:07:12 +00:00
alc
19eb42c206 Introduce pmap_store(), and use it to replace pmap_load_store() in places
where the page table entry was previously invalid.  (Note that I did not
replace pmap_load_store() when it was followed by a TLB invalidation, even
if we are not using the return value from pmap_load_store().)

Correct an error in pmap_enter().  A test for determining when to set
PGA_WRITEABLE was always true, even if the mapping was read only.

In pmap_enter_l2(), when replacing an empty kernel page table page by a
superpage mapping, clear the old l2 entry and issue a TLB invalidation.  My
reading of the ARM architecture manual leads me to believe that the TLB
could hold an intermediate entry referencing the empty kernel page table
page even though it contains no valid mappings.

Replace a couple direct uses of atomic_clear_64() by the new
pmap_clear_bits().

In a couple comments, replace the term "paging-structure caches", which is
an Intel-specific term for the caches that hold intermediate entries in the
page table, with wording that is more consistent with the ARM architecture
manual.

Reviewed by:	markj
X-MFC after:	r350004
Differential Revision:	https://reviews.freebsd.org/D20998
2019-07-21 03:26:26 +00:00
jhibbits
095001610c powerpc: Remove an unnecessary #ifdef guard from slb.c
slb.c is only compiled for powerpc64, so no need for the #ifdef in this block.
2019-07-21 03:19:54 +00:00
ian
b0da751d90 Rewrite the nxprtc chip init to extend battery life by using power-saving
features offered by the chips.

For 2127 and 2129 chips, fix the detection of when chip-init is needed.  The
chip config needs to be reset whenever power was lost, but the logic was
wrong for 212x chips (it only worked for 8523).  Now the "oscillator
stopped" bit rather than the power manager mode is used to detect startup
after powerfail.

For all chips, disable the clock output pin.

For chips that have a timestamp/tamper-monitor feature, turn off monitoring
of the timestamp trigger pin.

The 8523, 2127, and 2129 chips have a "power manager" feature that offers
several options.  We've been using the default mode which enables
everything.  Now the code sets the power manager options to

 - direct-switch (when Vdd < Vbat, without extra threshold check)
 - no battery monitor
 - no external powerfail monitor

This reduces the current draw while running on battery from 1930nA to 880nA,
which should roughly double the lifespan of the battery under load.

Because battery checking is a nice thing to have, the code now does a check
at startup, and then once a day after that, instead of checking continuously
(but only actually reporting at startup).  The battery check is now done by
setting the power manager back to default mode, sleeping briefly while it
makes a voltage measurement, then switching back to power-saving mode.
2019-07-20 21:10:27 +00:00
markj
730c2462a2 Rename vm_page_{import,release}() to vm_page_zone_{import,release}().
I would like to use the name vm_page_release() for a different purpose,
and vm_page_{import,release}() are local to vm_page.c.

Reviewed by:	kib
MFC after:	1 week
2019-07-20 18:25:41 +00:00
jhibbits
8c805766f7 powerpc/SPE: Enable SPV bit for EFSCFD instruction emulation
EFSCFD (floating point single convert from double) emulation requires saving
the high word of the register, which uses SPE instructions.  Enable the SPE
to avoid an SPV Unavailable exception.

MFC after:	1 week
2019-07-20 18:22:01 +00:00
manu
c2c39bb79f dtso: allwinner: Add an overlay for H3 i2c0
Most of the H3 boards don't enable i2c as it is unused.
Add an overlay so it's easier for user to use i2c device.
2019-07-20 17:42:46 +00:00
jhb
7d8b1472d3 Improve the precision of bhyve's vPIT.
Use 'struct bintime' instead of 'sbintime_t' to manage times in vPIT
to postpone rounding to final results rather than intermediate
results.  In tests performed by Joyent, this reduced the error measured
by Linux guests by 59 ppm.

Smart OS bug:	https://smartos.org/bugview/OS-6923
Submitted by:	Patrick Mooney
Reviewed by:	rgrimes
Obtained from:	SmartOS / Joyent
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20335
2019-07-20 15:59:49 +00:00
manu
8d6dc914b6 arm64: Implement HWCAP
Add HWCAP support for arm64.
defines are the same as in Linux and a userland program can use
elf_aux_info to get the data.
We only save the common denominator for all cores in case the
big and little cluster have different support (this is known to
exists even if we don't support those SoCs in FreeBSD)

Differential Revision:	https://reviews.freebsd.org/D17137
2019-07-20 14:29:11 +00:00
ganbold
4684fc8cb7 Add emmc support for Rockchip RK3399 SoC.
Tested on NanoPC-T4 board.

Reviewed by:	manu
Differential Revision:	https://reviews.freebsd.org/D20156
2019-07-20 02:53:06 +00:00
ganbold
3da75da272 Add driver for Rockchip RK3399 eMMC PHY.
Tested on NanoPC-T4 board.

Reviewed by:	manu
Differential Revision:	https://reviews.freebsd.org/D20840
2019-07-20 02:03:31 +00:00
kib
ab87bfcd7a Fix leak of memory and file refs with sendmsg(2) over unix domain sockets.
When sendmsg(2) sucessfully internalized one SCM_RIGHTS control
message, but failed to process some other control message later, both
file references and filedescent memory needs to be freed. This was not
done, only mbuf chain was freed.

Noted, test case written, reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21000
2019-07-19 20:51:39 +00:00
dougm
b894f21a4a Define vm_map_entry_in_transition to handle an in-transition map
entry, combining code currently in vm_map_unwire and
vm_map_wire_locked into a single function, called by each of them for
entries in transition.

Discussed with: kib, markj
Reviewed by: alc
Approved by: kib, markj (mentors, implicit)
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D20833
2019-07-19 20:47:35 +00:00
mav
f4934e6568 Add Accessible Max Address Configuration support to camcontrol.
AMA replaced HPA in ACS-3 specification.  It allows to limit size of the
disk alike to HPA, but declares inaccessible data as indeterminate.  One
of its practical use cases is to under-provision SATA SSDs for better
reliability and performance.

While there, fix HPA Security detection/reporting.

MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2019-07-19 19:15:08 +00:00
imp
7f1c525519 Keep track of the number of commands that exhaust their retry limit.
While we print failure messages on the console, sometimes logs are lost or
overwhelmed. Keeping a count of how many times we've failed retriable commands
helps get a magnitude of the problem.
2019-07-19 18:39:24 +00:00
imp
64b34948ed Keep track of the number of retried commands.
Retried commands can indicate a performance degredation of an nvme drive. Keep
track of the number of retries and report it out via sysctl, just like number of
commands an interrupts.
2019-07-19 18:39:18 +00:00
imp
77c0809f26 Remove pre-FreeBSD 7.0 compatibility. 2019-07-19 18:38:47 +00:00
imp
4d2be36042 Add comments about KERN_OPT here. 2019-07-19 17:48:29 +00:00
imp
37af220308 Use sysctl + CTLRWTUN for hw.nvme.verbose_cmd_dump.
Also convert it to a bool. While the rest of the driver isn't yet bool clean,
this will help.

Reviewed by: cem@
Differential Revision: https://reviews.freebsd.org/D20988
2019-07-19 00:32:56 +00:00
imp
9f9b80b338 Provide new tunable hw.nvme.verbose_cmd_dump
The nvme drive dumps only the most relevant details about a command when it
fails. However, there are times this is not sufficient (such as debugging weird
issues for a new drive with a vendor). Setting hw.nvme.verbose_cmd_dump=1
in loader.conf will enable more complete debugging information about each
command that fails.

Reviewed by: rpokala
Sponsored by: Netflix
Differential Version: https://reviews.freebsd.org/D20988
2019-07-18 21:58:51 +00:00