Commit Graph

135242 Commits

Author SHA1 Message Date
kib
a32144e6e2 amd64: add defines and decode protection keys and SGX page faults reasons.
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D18893
2019-02-20 09:46:44 +00:00
kib
5fa757e527 Implement rangesets.
The data structure implements non-intersecting intervals over the [0,
UINT64_MAX] range, and supports fast insert, predicated clearing of
subrange, and lookup of an interval containing the specified address.
Internally it is a pctrie over the interval start addresses.

Implementation provides additional guarantees over the structure state
in case of memory allocation failures.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D18893
2019-02-20 09:38:19 +00:00
ganbold
fa1581bf5c Clarify notifications when battery capacity ratio
reaches warning and shutdown thresholds.
2019-02-20 07:10:38 +00:00
cem
8c28d663d3 Fuse: whitespace and style(9) cleanup
Take a pass through fixing some of the most egregious whitespace issues in
fs/fuse.  Also fix some style(9) warts while here.  Not 100% cleaned up, but
somewhat less painful to look at and edit.

No functional change.
2019-02-20 02:49:26 +00:00
cem
c3081ecaeb fuse: add descriptions for remaining sysctls
(Except reclaim revoked; I don't know what that goal of that one is.)
2019-02-20 02:48:59 +00:00
bde
bb4b140d4f Fix hangs in r341810 waiting for AP startup.
idle_td is dereferenced without thread-locking it to make its contents is
invariant, and was accessed without telling the compiler that its contents
is invariant.  Some compilers optimized accesses to the supposedly invariant
contents by moving the critical checks for changes outside of the loop that
waits for changes.  Fix this using atomic ops.

This bug only showed up for the following configuration: a Turion2
system, amd64 kernels, compiled by gcc, and SCHED_4BSD.  clang fails
to do the optimization with all CFLAGS that I tried, because it doesn't
fully optimize the '__asm __volatile' for cpu_spinwait() although this
asm has no memory clobber.  gcc only does the optimization with most
CFLAGS.  I mostly used -Os with all compilers.  i386 works because gcc
-m32 -Os only moves 1 or the 2 accesses outside of the loop.
Non-Turion2 systems and SCHED_ULE worked due to different timing (when
all APs start before the BP checks them outside of the loop).

Reviewed by:	kib
2019-02-20 02:40:38 +00:00
bde
06fa597aae Attempt to complete fixing programmable function keys for syscons.
The flag for the driver capability of supporting the fix is independent
of the flag for cons25 mode so that it can be managed independently, but
I forget to preserve it when resetting the terminal.
2019-02-20 02:14:41 +00:00
pjd
d3acecd0b7 Simplify the code. No functional changes.
Reviewed by:	rpokala
2019-02-20 00:25:45 +00:00
pjd
77f6a0e2a3 Simplify the code. 2019-02-19 23:53:33 +00:00
pjd
f93f055a0a Correct typo in the comment. 2019-02-19 23:44:00 +00:00
pjd
53653771cd Change assertion to log the incorrect io_type we've got. 2019-02-19 23:43:15 +00:00
pjd
4d519354de Grabage-collect no longer used variable. 2019-02-19 23:41:23 +00:00
pjd
c40618cd08 The way ZFS searches for its vdevs is the following: first it looks for
a vdev that has the same name as the one stored in metadata and that has
all VDEV labels in place. If it cannot find a GEOM provider with the given
name and all VDEV labels it will scan all GEOM providers for the best match
(the most VDEV labels available), but here the name is ignored.

In case the ZFS pool is created, eg. using GPT partition label:

	# zpool create tank /dev/gpt/tank

everything works, and on every import ZFS will pick /dev/gpt/tank and
not /dev/da0p4.

The problem occurs when da0p4 is extended and ZFS is unable to find all
VDEV labels in /dev/gpt/tank anymore (the VDEV labels stored at the end
of the partition are now somewhere else). In this case it will scan all
GEOM providers and will pick the first one with the best match, ie. da0p4.

Fix this problem by checking the VDEV/provider name even if we get the same
match. If the name is the same as the one we have in pool's metadata, prefer
this GEOM provider.

Reported by:	oshogbo, Michal Mroz <m.mroz@fudosecurity.com>
Tested by:	Michal Mroz <m.mroz@fudosecurity.com>
Obtained from:	Fudo Security
2019-02-19 23:35:55 +00:00
pjd
12a384e71d In the vdev_geom_open_by_path() function we assume that vdev path starts
with "/dev/". Make sure this is the case.
2019-02-19 23:22:39 +00:00
ed
4207ba4fed Place an upper bound on the number of iterations for REP.
Right now it's possible to invoke the REP escape sequence with a maximum
of tens of millions of iterations. In practice, there is never any need
to do this. Calling it more frequently than the number of cells in the
terminal hardly makes any sense. By placing a limit on it, we can
prevent users from exhausting resources in inside the terminal emulator.

As support for this escape sequence is not present in any of the stable
branches, there is no need to MFC.

Reported by:	https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=11255
2019-02-19 21:58:23 +00:00
ed
f64d2df6b5 Add missing __unused attributes to unused function arguments.
This fixes the userspace build of libteken.
2019-02-19 21:49:48 +00:00
markj
0a9182f1aa Limit the number of entries allocated for a REPORT_ZONES command.
The DIOCGETZONE ioctl can be used to fetch the zone list of an SMR
drive, and the caller specifies the number of entries it wants to fetch.
Clamp the caller's request to a sane limit so that a user cannot attempt
large allocations. Callers already need to invoke the ioctl multiple
times to fetch the full list in general, so there's no harm in limiting
the number of entries returned.

Fix style while here.

admbug:		807
Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by:	asomers, ken
Tested by:	ken
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19249
2019-02-19 21:33:02 +00:00
markj
467f20b505 Impose a limit on the number of GEOM_CTL arguments.
Otherwise a privileged user can trigger a memory allocation of
unbounded size, or an integer overflow in the subsequent
geom_alloc_copyin() call, leading to out-of-bounds accesses.

Hard-code a large limit to circumvent this problem.

admbug:		854
Reported by:	Anonymous of the Shellphish Grill Team
Reviewed by:	ae
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19251
2019-02-19 21:22:22 +00:00
imp
2fa70908ad Remove drm from LINT kernels
drm was accidentally left in the LINT kernels.

Pointy hat to: imp
2019-02-19 21:20:50 +00:00
thj
79ac8c17fb When dropping a fragment queue count the number of fragments in the queue
When dropping a fragment queue, account for the number of fragments in the
queue. This improves accounting between the number of fragments received and
the number of fragments dropped.

Reviewed by:	jtl, bz, transport
Approved by:	jtl (mentor), bz (mentor)
Differential Revision:	https://review.freebsd.org/D17521
2019-02-19 19:57:55 +00:00
imp
9f2be4d8b3 Add an UPDATING entry for the removal of drm and drm2
Also bump FreeBSD version to 1300013 since this series is a big
change.
2019-02-19 19:37:09 +00:00
imp
5087ddbd1d Remove the i915 and radeon drivers.
Per discussions on arch@ and elsewhere, the maintenance of this code
has moved to the drm-kmod and drm-legacy-kmod ports. Remove the i915
and radeon drivers from the tree.

Approved by: graphics team
Reviewed by: manu@, mmel@
Differential Revision: https://reviews.freebsd.org/D19196
2019-02-19 19:37:02 +00:00
imp
46572bf47b Remove drm2 modules.
Remove support for compiling drm2 as a module. This has transitioned
to the drm-kmod or drm-legacy-kmodw ports.

Approved by: graphics team
Reviewed by: manu@, mmel@
Differential Revision: https://reviews.freebsd.org/D19196
2019-02-19 19:36:56 +00:00
imp
f0d9f09626 Per discussions on arch@ and elsewhere, retire drm module / drives.
Retire the drm modules / drivers. These are now handled by the
drm-legacy-kmod port and/or the drm-kmod port. All future
development and maintanace will be handled there.

Approved by: graphics team
Reviewed by: manu@, mmel@
Differential Revision: https://reviews.freebsd.org/D19196
2019-02-19 19:36:43 +00:00
kib
7fc374eaa9 Provide convenience C wrappers for RDPKRU and WRPKRU instructions.
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D18893
2019-02-19 19:17:20 +00:00
kib
a96ba1cd3b Add definition for %cr4 PKRU enable bit.
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D18893
2019-02-19 19:13:48 +00:00
thj
5152bed271 Fix style after r340832
Reported by:	jhb
Reviewed by:	jhb, jtl
Approved by:	jtl (mentor)
MFC after:	3 days
Differential Revision:	https://reviews/freebsd.org/D18354
2019-02-19 19:04:52 +00:00
andrew
adb8ee39e1 Create a common function to handle freeing the kcov info struct.
Both places that may free the kcov info struct are identical. Create a new
common function to hold the code.

Sponsored by:	DARPA, AFRL
2019-02-19 17:03:34 +00:00
markj
60e27aa8fb Move a racy assertion in filt_pipewrite().
EVFILT_WRITE knotes for pipes live on the knlist for the other end of the
pipe.  Since they do not hold a reference on the corresponding file
structure, they may be removed from the knlist by pipeclose() while still
remaining active.  In this case, there is no knlist lock acquired before
filt_pipewrite() is called, so the assertion fails.

Fix the problem by first checking whether that end of the pipe has been
closed.  These checks are memory safe since the knote holds a reference
on one end of the pipe, and the pipe structure is not freed until both
ends are closed.  The checks are not racy since PIPE_EOF is never cleared
after being set, and pipe_present is never set back to PIPE_ACTIVE after
pipeclose() has been called.

PR:		235640
Reported and tested by:	pho
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19224
2019-02-19 15:46:43 +00:00
trasz
a15d207376 Work around the "nfscl: bad open cnt on server" assertion
that can happen when rerooting into NFSv4 rootfs with kernel
built with INVARIANTS.

I've talked to rmacklem@ (back in 2017), and while the root cause
is still unknown, the case guarded by assertion (nfscl_doclose()
being called from VOP_INACTIVE) is believed to be safe, and the
whole thing seems to run just fine.

Obtained from:	CheriBSD
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-02-19 12:45:37 +00:00
trasz
2e82f607c0 Bump the default kern.rpc.gss.client_max from 128 to 1024.
The old value resulted in bad performance, with high kernel
and gssd(8) load, with more than ~64 clients; it also triggered
crashes, which are to be fixed by a different patch.

PR:		235582
Discussed with:	rmacklem@
MFC after:	2 weeks
2019-02-19 11:07:02 +00:00
trasz
59ce7e9052 Add kern.rpc.gss.client_hash tunable, to make it possible to bump
it easily.  This can lower the load on gssd(8) on large NFS servers.

Submitted by:	Per Andersson <pa at chalmers dot se>
Reviewed by:	rmacklem@
MFC after:	2 weeks
Sponsored by:	Chalmers University of Technology
2019-02-19 10:17:49 +00:00
ian
d3f45de15b Add a compatible string to match recent changes in the upstream dts. 2019-02-18 19:50:53 +00:00
kib
f7760539ad amd64: cleanup pmap_init_pat().
The pmap_works variable is always true for amd64.  Remove it, the
branch in the initialization taken when false, and corresponding
sysctl.

Remove pat_table[] local array, work on pat_index[] directly.

Collapse whole initialization to not override already assigned values.

Add comment explaining the choice for PAT4 and PAT7.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
MFC note:	Leave the sysctl around
Differential revision:	https://reviews.freebsd.org/D19225
2019-02-18 16:02:00 +00:00
vmaffione
23b439c4c0 netmap: don't schedule kqueue notify task when kqueue is not used
This change adds a counter (kqueue_users) to keep track of how many
kqueue users are referencing a given struct nm_selinfo.
In this way, nm_os_selwakeup() can schedule the kevent notification
task only when kqueue is actually being used.
This is important to avoid wasting CPU in the common case where
kqueue is not used.

Reviewed by:	Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D19177
2019-02-18 14:21:41 +00:00
br
bbe25730d0 Avoid orphan sections between __bss_start and .(s)bss.
Ensure __bss_start is associated with the next section
in case orphan sections are placed directly after .sdata,
as has been seen to happen with LLD.

Submitted by:	"J.R.T. Clarke" <jrtc4@cam.ac.uk>
Differential Revision:	https://reviews.freebsd.org/D18429
2019-02-18 13:14:53 +00:00
oshogbo
04c97df2e9 libnv: fix revert
Reported by:	jenkins
2019-02-17 18:32:19 +00:00
oshogbo
8b0e5bff9d libnv: fix double free
In r343986 we introduced a double free. The structure was already
freed fixed in the r302966. This problem was introduced
because the GitHub version was out of sync with the FreeBSD one.

Submitted by:	Mindaugas Rasiukevicius <rmind@netbsd.org>
MFC with:	r343986
2019-02-17 18:26:27 +00:00
markj
0f8d8b1abc Remove a write-only variable orphaned by r340677. 2019-02-17 16:56:41 +00:00
markj
9a26ede14d Fix refcount leaks in the SGX Linux compat ioctl handler.
Some argument validation error paths would return without releasing the
file reference obtained at the beginning of the function.

While here, fix some style bugs and remove trivial debug prints.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19214
2019-02-17 16:43:44 +00:00
markj
88d649090e Remove a redundant flag variable.
Use the object pointer itself to determine whether the object is locked.
No functional change intended.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19215
2019-02-17 16:35:19 +00:00
ganbold
3c2ed743fc Add sysctl for setting battery charging current.
The charging current can be set using steps
from 0: 200mA to 13: 2800mA (200mA/step).
While there, fix battery charging current related
sensor descriptions.

Reviewed by:	manu
Differential Revision:	https://reviews.freebsd.org/D19212
2019-02-17 01:16:27 +00:00
jhibbits
975b89fd73 powerpc/booke: Fix 32-bit build
MFC after:	2 weeks
MFC with:	344202
2019-02-16 04:47:33 +00:00
jhibbits
963403a0e2 powerpc/booke: depessimize MAS register updates
We only need to isync before we actually use the MAS registers, so before and
after the TLB read/write/sync/search operations.

MFC after:	2 weeks
2019-02-16 04:38:34 +00:00
jhibbits
77359000c0 powerpc/booke: Use DMAP where possible for page copy and zeroing
This avoids several locks and pmap_kenter()'s, improving performance
marginally.

MFC after:	2 weeks
2019-02-16 04:16:10 +00:00
avos
965219e5bb GC ATA_REQUEST_TIMEOUT option remnants
It was removed from code in r249083 and from sys/conf/options in r249213.

PR:		222170
MFC after:	3 days
2019-02-16 01:48:38 +00:00
glebius
d889424078 For 32-bit machines rollback the default number of vnode pager pbufs
back to the lever before r343030.  For 64-bit machines reduce it slightly,
too.  Together with r343030 I bumped the limit up to the value we use at
Netflix to serve 100 Gbit/s of sendfile traffic, and it probably isn't a
good default.

Provide a loader tunable to change vnode pager pbufs count. Document it.
2019-02-15 23:36:22 +00:00
cem
d5f3fe6662 FUSE: Refresh cached file size when it changes (lookup)
The cached fvdat->filesize is indepedent of the (mostly unused)
cached_attrs, and we failed to update it when a cached (but perhaps
inactive) vnode was found during VOP_LOOKUP to have a different size than
cached.

As noted in the code comment, this can occur in distributed filesystems or
with other kinds of irregular file behavior (anything is possible in FUSE).

We do something similar in fuse_vnop_getattr already.

PR:		230258 (as reported in description; other issues explored in
			comments are not all resolved)
Reported by:	MooseFS FreeBSD Team <freebsd AT moosefs.com>
Submitted by:	Jakub Kruszona-Zawadzki <acid AT moosefs.com> (earlier version)
2019-02-15 22:55:13 +00:00
cem
bb3c594db5 FUSE: The FUSE design expects writethrough caching
At least prior to 7.23 (which adds FUSE_WRITEBACK_CACHE), the FUSE protocol
specifies only clean data to be cached.

Prior to this change, we implement and default to writeback caching.  This
is ok enough for local only filesystems without hardlinks, but violates the
general design contract with FUSE and breaks distributed filesystems or
concurrent access to hardlinks of the same inode.

In this change, add cache mode as an extension of cache enable/disable.  The
new modes are UC (was: cache disabled), WT (default), and WB (was: cache
enabled).

For now, WT caching is implemented as write-around, which meets the goal of
only caching clean data.  WT can be better than WA for workloads that
frequently read data that was recently written, but WA is trivial to
implement.  Note that this has no effect on O_WRONLY-opened files, which
were already coerced to write-around.

Refs:
  * https://sourceforge.net/p/fuse/mailman/message/8902254/
  * https://github.com/vgough/encfs/issues/315

PR:		230258 (inspired by)
2019-02-15 22:52:49 +00:00
cem
58697545cb FUSE: Only "dirty" cached file size when data is dirty
Most users of fuse_vnode_setsize() set the cached fvdat->filesize and update
the buf cache bounds as a result of either a read from the underlying FUSE
filesystem, or as part of a write-through type operation (like truncate =>
VOP_SETATTR).  In these cases, do not set the FN_SIZECHANGE flag, which
indicates that an inode's data is dirty (in particular, that the local buf
cache and fvdat->filesize have dirty extended data).

PR:		230258 (related)
2019-02-15 22:51:09 +00:00