Commit Graph

260859 Commits

Author SHA1 Message Date
dougm
e1145d7ea4 Modify swapon(8) to invoke BIO_DELETE to trim swap devices, either if
'-E' appears on the swapon command line, or if "trimonce" appears as
an fstab option.

Discussed at: BSDCAN
Tested by: markj
Reviewed by: markj
Approved by: markj (mentor)
Differential Revision:https://reviews.freebsd.org/D20599
2019-06-22 03:16:01 +00:00
vangyzen
e18b1f330a VirtIO SCSI: validate seg_max on attach
Until r349278, bhyve presented a seg_max to the guest that was too large.
Detect this case and clamp it to the virtqueue size.  Otherwise, we would
fail the "too many segments to enqueue" assertion in virtqueue_enqueue().

I hit this by running a guest with a MAXPHYS of 256 KB.

Reviewed by:	bryanv cem
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20703
2019-06-22 01:20:45 +00:00
mav
440ed42621 Make ELEMENT INDEX validation more strict.
SES specifications tell: "The Additional Element Status descriptors shall
be in the same order as the status elements in the Enclosure Status
diagnostic page".  It allows us to question ELEMENT INDEX that is lower
then values we already processed.  There are many SAS2 enclosures with
this kind of problem.

While there, add more specific error messages for cases when ELEMENT INDEX
is obviously wrong.  Also skip elements with INVALID bit set.

MFC after:	2 weeks
2019-06-22 01:06:41 +00:00
scottl
117e38cc92 Refactor xpt_getattr() to make it more readable. No outwardly
visible functional changes, though code flow was modified a bit
internally to lessen the need for goto jumps and chained if
conditionals.
2019-06-21 23:40:26 +00:00
mav
daf9d8f42e Fix individual_element_index when some type has 0 elements.
When some type has 0 elements, saved_individual_element_index was set
to -1 on second type bump, since individual_element_index was not
restored after the first.  To me it looks easier just to increment
saved_individual_element_index separately than think when to save it.

MFC after:	2 weeks
2019-06-21 23:29:16 +00:00
asomers
4bfbbd74a8 Reduce namespace pollution from r349233
Define __daddr_t in _types.h and use it in filio.h

Reported by:	ian, bde
Reviewed by:	ian, imp, cem
MFC after:	2 weeks
MFC-With:	349233
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20715
2019-06-21 21:50:14 +00:00
vangyzen
4c3575eb30 bhyve: Fix vtscsi maximum segment config
The seg_max value reported to the guest should be two less than the
host's maximum, in order to leave room for the request and the
response.  This is analogous to r347033 for virtio_block.

We hit the "too many segments to enqueue" assertion on OneFS because
we increase MAXPHYS to 256 KB.

Reviewed by:	bryanv
Discussed with:	cem jhb rgrimes
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20529
2019-06-21 18:57:33 +00:00
johalun
f814ce2d88 LinuxKPI: Additions to rcu list.
- Add rcu list functions.
- Make rcu hlist's foreach macro use rcu calls instead of the non-rcu macro.
- Bump FreeBSD version so we have a checkpoint for the vboxvideo drm driver.

Reviewed by:	hps
Approved by:	imp (mentor), hps
MFC after:	1 week
Differential Revision:	D20719
2019-06-21 18:48:07 +00:00
johalun
1eab0f965f LinuxKPI: Add atomic_long_sub macro.
Reviewed by:	imp (mentor), hps
Approved by:	imp (mentor), hps
MFC after:	1 week
Differential Revision:	D20718
2019-06-21 16:43:16 +00:00
ian
06e3fdca3d Add pwm to the armv7 GENERIC kernel, it's now used by TI and Allwinner. 2019-06-21 15:44:58 +00:00
ian
679b7ef604 Do some general cleanup and light wordsmithing.
Sort methods alphabetically.  Wrap long lines.  Start sentences on a new
line.  Remove contractions (not because it's a good idea, just to silence
igor).  Add some explanation of the units for the period and duty arguments
and the convention for channel numbers.
2019-06-21 15:12:17 +00:00
ian
2a1f073196 Catch up with recent changes in pwmbus(9). The pwm(9) and pwmbus(9)
interfaces were unified into pwmbus(9), and the PWMBUS_CHANNEL_MAX method
was renamed PWMBUS_CHANNEL_COUNT.  The pwmbus_attach_bus() function just
went away completely.  Also, fix a few typos such as s/is/if/.
2019-06-21 14:46:43 +00:00
ian
492e6d4bb2 Add support for the PWM(9) API. This allows configuring the pwm output using
pwm(9), but also maintains the historical sysctl config interface for
compatiblity with existing apps.  The two config systems are not compatible
with each other; if you use both interfaces to change configurations you're
likely to end up with incorrect output or none at all.
2019-06-21 14:24:33 +00:00
ian
bbc96aee44 Some mundane tweaks and cleanups to help de-clutter the diffs of some
upcoming functional changes.

Add an ofw_compat_data table for probing compat strings, and use it to add
PNP data.  Remove some stray semicolons at the end of macro definitions,
and add a PWM_LOCK_ASSERT macro to round out the usual suite.  Move the
device_t and driver_methods structs to the end of the file.  Tweak comments.
2019-06-21 14:01:02 +00:00
emaste
08e5a1f70b nandsim: correct test to avoid out-of-bounds access
Previously nandsim_chip_status returned EINVAL iff both of user-provided
chip->ctrl_num and chip->num were out of bounds.  If only one failed the
bounds check arbitrary memory would be read and returned.

The NAND framework is not built by default, nandsim is not intended for
production use (it is a simulator), and the nandsim device has root-only
permissions.

admbugs:	827
Reported by:	Daniel Hodson of elttam
MFC after:	3 days
Security:	kernel information leak or DoS
Sponsored by:	The FreeBSD Foundation
2019-06-21 13:42:40 +00:00
ae
c6d750cdc7 Add "tcpmss" opcode to match the TCP MSS value.
With this opcode it is possible to match TCP packets with specified
MSS option, whose value corresponds to configured in opcode value.
It is allowed to specify single value, range of values, or array of
specific values or ranges. E.g.

 # ipfw add deny log tcp from any to any tcpmss 0-500

Reviewed by:	melifaro,bcr
Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2019-06-21 10:54:51 +00:00
kp
5b2895d685 ip_output: pass PFIL_FWD in the slow path
If we take the slow path for forwarding we should still tell our
firewalls (hooked through pfil(9)) that we're forwarding. Pass the
ip_output() flags to ip_output_pfil() so it can set the PFIL_FWD flag
when we're forwarding.

MFC after:	1 week
Sponsored by:	Axiado
2019-06-21 07:58:08 +00:00
syrinx
ae701960e9 No need for each bsnmpd(1) module to open connection to syslog
bsnmpd(1) main does that early on init and the connection is available
to all loaded modules

Event:		Vienna Hackathon 2019
PR:		233431 , 221487
MFC after:	2 weeks
2019-06-21 07:45:58 +00:00
syrinx
711da9cba7 Unbreak snmp_pf(3) after the changes introduced in r338209
PR:		237011
Event:		Vienna Hackathon 2019
MFC after:	2 weeks
2019-06-21 07:29:02 +00:00
imp
4292f60c90 Mount and unmount devfs around calls to add packages.
pkg now uses /dev/null for some of its operations. NanoBSD's packaging
stuff didn't mount that for the chroot it ran in, so any config that
added packages would see the error:
	pkg: Cannot open /dev/null:No such file or directory
when trying to actually add those packages. It's easy enough for
nanobsd to mount /dev and it won't hurt anything that was already
working and may help things that weren't (like this). I moved the
mount/unmount pair to be in the right push/pop order from the
submitted patch.

PR: 238727
Submitted by: mike tancsa
Tested by: Karl Denninger
2019-06-21 03:49:36 +00:00
kevlo
76d6e9061c Correct function names. 2019-06-21 02:49:36 +00:00
cem
2535eb9e6b rc.d/motd: Update motd more robustly
Use appropriate fsyncs to persist the rewritten /etc/motd file, when a
rewrite is performed.

Reported by:	Jonathan Walton <jonathan AT isilon.com>
Reviewed by:	allanjude, vangyzen
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20701
2019-06-21 02:37:54 +00:00
cem
a7b7324d7c Fixup UPDATING text for r349253
Requested by:	delphij
2019-06-21 00:33:45 +00:00
cem
aae886fade sys: Remove DEV_RANDOM device option
Remove 'device random' from kernel configurations that reference it (most).
Replace perhaps mistaken 'nodevice random' in two MIPS configs with 'options
RANDOM_LOADABLE' instead.  Document removal in UPDATING; update NOTES and
random.4.

Reviewed by:	delphij, markm (previous version)
Approved by:	secteam(delphij)
Differential Revision:	https://reviews.freebsd.org/D19918
2019-06-21 00:16:30 +00:00
takawata
9e3326a413 Fix the case where no root hub object while host controller object exist in ACPI namespace.
Also you can disable ACPI support for USB by setting
debug.acpi.disabled="usb"

PR:	238711
2019-06-20 23:52:33 +00:00
asomers
927d2d494a fcntl: fix overflow when setting F_READAHEAD
VOP_READ and VOP_WRITE take the seqcount in blocks in a 16-bit field.
However, fcntl allows you to set the seqcount in bytes to any nonnegative
31-bit value. The result can be a 16-bit overflow, which will be
sign-extended in functions like ffs_read. Fix this by sanitizing the
argument in kern_fcntl. As a matter of policy, limit to IO_SEQMAX rather
than INT16_MAX.

Also, fifos have overloaded the f_seqcount field for a completely different
purpose ever since r238936.  Formalize that by using a union type.

Reviewed by:	cem
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20710
2019-06-20 23:07:20 +00:00
mav
c694768564 SPC-3 and up require some UAs to be returned as fixed.
MFC after:	2 weeks
2019-06-20 22:20:30 +00:00
brooks
57acd8427d Add PROT_MAX to the HISTORY section.
In the case of mmap(), add a HISTORY section.  Mention that mmap() and
mprotect()'s documentation predates an implementation.  The
implementation first saw wide use in 4.3-Reno, but there seems to be no
easy way to express that in mdoc so stick with 4.4BSD.

Reviewed by:	emaste
Requested by:	cem
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D20713
2019-06-20 21:52:30 +00:00
mav
63e9246efe Optimize xpt_getattr().
Do not allocate temporary buffer for attributes we are going to return
as-is, just make sure to NUL-terminate them.  Do not zero temporary 64KB
buffer for CDAI_TYPE_SCSI_DEVID, XPT tells us how much data it filled
and there are also length fields inside the returned data also.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-06-20 20:29:42 +00:00
np
f3c088d5f0 cxgbe/t4_tom: DDP_DEAD is a ddp flag and not a toepcb flag.
The driver was in effect setting TPF_ABORT_SHUTDOWN on the toepcb
instead of what was intended.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-06-20 20:06:19 +00:00
emaste
5460546183 Clarify vm_map_protect max_protection downgrade
As reported in review D20709 by brooks calling vm_map_protect to set a
new max_protection value downgrades existing mappings if necessary (as
opposed to returning an error).

Reported by:	brooks
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2019-06-20 18:30:19 +00:00
brooks
0165ccc850 Extend mmap/mprotect API to specify the max page protections.
A new macro PROT_MAX() alters a protection value so it can be OR'd with
a regular protection value to specify the maximum permissions.  If
present, these flags specify the maximum permissions.

While these flags are non-portable, they can be used in portable code
with simple ifdefs to expand PROT_MAX() to 0.

This change allows (e.g.) a region that must be writable during run-time
linking or JIT code generation to be made permanently read+execute after
writes are complete.  This complements W^X protections allowing more
precise control by the programmer.

This change alters mprotect argument checking and returns an error when
unhandled protection flags are set.  This differs from POSIX (in that
POSIX only specifies an error), but is the documented behavior on Linux
and more closely matches historical mmap behavior.

In addition to explicit setting of the maximum permissions, an
experimental sysctl vm.imply_prot_max causes mmap to assume that the
initial permissions requested should be the maximum when the sysctl is
set to 1.  PROT_NONE mappings are excluded from this for compatibility
with rtld and other consumers that use such mappings to reserve
address space before mapping contents into part of the reservation.  A
final version this is expected to provide per-binary and per-process
opt-in/out options and this sysctl will go away in its current form.
As such it is undocumented.

Reviewed by:	emaste, kib (prior version), markj
Additional suggestions from:	alc
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D18880
2019-06-20 18:24:16 +00:00
emaste
517712e7a8 Clarify that vm_map_protect cannot upgrade max_protection
It's implied by the man page's RETURN VALUES section, but be explicit in
the description that vm_map_protect can not set new protection bits that
are already in each entry's max_protection.

Reviewed by:	brooks
MFC After:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20709
2019-06-20 18:19:09 +00:00
asomers
ec929b8c9b VOP_REVOKE(9): update locking requirements per r143495
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20524
2019-06-20 16:36:20 +00:00
allanjude
a4407b3f4b top(1): Don't show the swap line when there are no swap devices
Submitted by:	antranigv@freebsd.am
Reviewed by:	bapt
MFC after:	1 month
Sponsored by:	Klara Systems
Differential Revision:	https://reviews.freebsd.org/D18928
2019-06-20 15:44:43 +00:00
asomers
11d2dfaa2f VOP_BMAP(9): fix typo in the copyright header
Reported by:	rgrimes
MFC after:	2 weeks
MFC-With:	349230
Sponsored by:	The FreeBSD Foundation
2019-06-20 14:40:36 +00:00
asomers
65ed7460c7 #include <sys/types.h> from sys/filio.h
This fixes world build after r349231

Reported by:	Jenkins
MFC after:	2 weeks
MFC-With:	349231
Sponsored by:	The FreeBSD Foundation
2019-06-20 14:35:28 +00:00
asomers
2672f1ee16 Add FIOBMAP2 ioctl
This ioctl exposes VOP_BMAP information to userland. It can be used by
programs like fragmentation analyzers and optimized cp implementations. But
I'm using it to test fusefs's VOP_BMAP implementation. The "2" in the name
distinguishes it from the similar but incompatible FIBMAP ioctls in NetBSD
and Linux.  FIOBMAP2 differs from FIBMAP in that it uses a 64-bit block
number instead of 32-bit, and it also returns runp and runb.

Reviewed by:	mckusick
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20705
2019-06-20 14:13:10 +00:00
asomers
2a58791758 Add a VOP_BMAP(9) man page
Reviewed by:	mckusick
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20704
2019-06-20 13:59:46 +00:00
antoine
2a96fe72ea Add head(1) to native-xtools so that it can be used in qemu-user jails 2019-06-20 13:24:58 +00:00
tuexen
8d4401b9b0 The variable names in the description of the port number usage is
inconsistent. This patch fixes that and improves the precision of
the description.
Thanks to Tom Marcoen for reporting the issue and providing an
initial patch, on which this change is based.

PR:			237723
Reviewed by:		bcr@
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D20708
2019-06-20 12:38:41 +00:00
lwhsu
9e1c4ac4c0 Finsh readding Big5 in r317204, which was reverting r315568. This commit
reverts r315569.

Reported by:	Ting-Wei Lan <lantw44 gmail com>
Discussed with:	kevlo
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2019-06-20 07:17:16 +00:00
mav
da627eba88 Add wakeup_any(), cheaper wakeup_one() for taskqueue(9).
wakeup_one() and underlying sleepq_signal() spend additional time trying
to be fair, waking thread with highest priority, sleeping longest time.
But in case of taskqueue there are many absolutely identical threads, and
any fairness between them is quite pointless.  It makes even worse, since
round-robin wakeups not only make previous CPU affinity in scheduler quite
useless, but also hide from user chance to see CPU bottlenecks, when
sequential workload with one request at a time looks evenly distributed
between multiple threads.

This change adds new SLEEPQ_UNFAIR flag to sleepq_signal(), making it wakeup
thread that went to sleep last, but no longer in context switch (to avoid
immediate spinning on the thread lock).  On top of that new wakeup_any()
function is added, equivalent to wakeup_one(), but setting the flag.
On top of that taskqueue(9) is switchied to wakeup_any() to wakeup its
threads.

As result, on 72-core Xeon v4 machine sequential ZFS write to 12 ZVOLs
with 16KB block size spend 34% less time in wakeup_any() and descendants
then it was spending in wakeup_one(), and total write throughput increased
by ~10% with the same as before CPU usage.

Reviewed by:	markj, mmacy
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D20669
2019-06-20 01:15:33 +00:00
markj
d82071d5c7 Group vm_page_activate()'s definition with other related functions.
No functional change intended.

MFC after:	3 days
2019-06-19 21:36:00 +00:00
mmacy
501ffa35a7 Tell loader to ignore newer features enabled on the root pool.
There are many new features in ZoF. Most, if not all, do not effect read only usage.
Encryption in particular is enabled at the pool level but used at the dataset level.
The loader obviously will not be able to boot if the boot dataset is encrypted, but
should not care if some other dataset in the root pool is encrypted.

Reviewed by:	allanjude
MFC after:	1 week
2019-06-19 21:10:13 +00:00
bdrewery
3e2dcc642a Follow-up r349065: Fix .TARGET flag ambiguity with PROGS which broke MK_TESTS.
X-MFC-With:	r349065
Sponsored by:	DellEMC
2019-06-19 19:19:37 +00:00
bcran
e2bf80752c efinet: Defer exclusively opening the network handles
Don't commit to exclusive access to the network device handle by
efinet until the loader has decided to load something through the
network. This allows for the possibility of other users of the
network device.

Submitted by:	scottph
Reviewed by:	tsoome, emaste
Tested by: 	tsoome, bcran
Differential Revision:	https://reviews.freebsd.org/D20642
2019-06-19 18:47:44 +00:00
markj
395e29f3c0 Make zlib encoding messages idempotent.
Otherwise duplicate messages can trigger a reinitialization of the
compression stream while the update thread is running.  Also ensure
that the stream is initialized before the update thread may attempt
to use it.

PR:		238333
Reviewed by:	cem, rgrimes
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20673
2019-06-19 16:09:20 +00:00
mav
6839a1dd36 Use sbuf_cat() in GEOM confxml generation.
When it comes to megabytes of text, difference between sbuf_printf() and
sbuf_cat() becomes substantial.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-06-19 15:36:02 +00:00
jtl
a21d319b29 Add the ability to limit how much the code will fragment the RACK send map
in response to SACKs. The default behavior is unchanged; however, the limit
can be activated by changing the new net.inet.tcp.rack.split_limit sysctl.

Submitted by:	Peter Lei <peterlei@netflix.com>
Reported by:	jtl
Reviewed by:	lstewart (earlier version)
Security:	CVE-2019-5599
2019-06-19 13:55:00 +00:00