(1) pmap_remove(), where it eliminates redundant TLB invalidations by
pmap_remove() and pmap_remove_l3(), and (2) pmap_enter_l2(), where it may
optimize the TLB invalidations by batching them.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D12725
Latter is undesired when including <sys/param.h> according to style(9)
Submitted by: Faraz Vahedi
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D20637
'-E' appears on the swapon command line, or if "trimonce" appears as
an fstab option.
Discussed at: BSDCAN
Tested by: markj
Reviewed by: markj
Approved by: markj (mentor)
Differential Revision:https://reviews.freebsd.org/D20599
Until r349278, bhyve presented a seg_max to the guest that was too large.
Detect this case and clamp it to the virtqueue size. Otherwise, we would
fail the "too many segments to enqueue" assertion in virtqueue_enqueue().
I hit this by running a guest with a MAXPHYS of 256 KB.
Reviewed by: bryanv cem
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20703
SES specifications tell: "The Additional Element Status descriptors shall
be in the same order as the status elements in the Enclosure Status
diagnostic page". It allows us to question ELEMENT INDEX that is lower
then values we already processed. There are many SAS2 enclosures with
this kind of problem.
While there, add more specific error messages for cases when ELEMENT INDEX
is obviously wrong. Also skip elements with INVALID bit set.
MFC after: 2 weeks
When some type has 0 elements, saved_individual_element_index was set
to -1 on second type bump, since individual_element_index was not
restored after the first. To me it looks easier just to increment
saved_individual_element_index separately than think when to save it.
MFC after: 2 weeks
Define __daddr_t in _types.h and use it in filio.h
Reported by: ian, bde
Reviewed by: ian, imp, cem
MFC after: 2 weeks
MFC-With: 349233
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20715
The seg_max value reported to the guest should be two less than the
host's maximum, in order to leave room for the request and the
response. This is analogous to r347033 for virtio_block.
We hit the "too many segments to enqueue" assertion on OneFS because
we increase MAXPHYS to 256 KB.
Reviewed by: bryanv
Discussed with: cem jhb rgrimes
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20529
- Add rcu list functions.
- Make rcu hlist's foreach macro use rcu calls instead of the non-rcu macro.
- Bump FreeBSD version so we have a checkpoint for the vboxvideo drm driver.
Reviewed by: hps
Approved by: imp (mentor), hps
MFC after: 1 week
Differential Revision: D20719
Sort methods alphabetically. Wrap long lines. Start sentences on a new
line. Remove contractions (not because it's a good idea, just to silence
igor). Add some explanation of the units for the period and duty arguments
and the convention for channel numbers.
interfaces were unified into pwmbus(9), and the PWMBUS_CHANNEL_MAX method
was renamed PWMBUS_CHANNEL_COUNT. The pwmbus_attach_bus() function just
went away completely. Also, fix a few typos such as s/is/if/.
pwm(9), but also maintains the historical sysctl config interface for
compatiblity with existing apps. The two config systems are not compatible
with each other; if you use both interfaces to change configurations you're
likely to end up with incorrect output or none at all.
upcoming functional changes.
Add an ofw_compat_data table for probing compat strings, and use it to add
PNP data. Remove some stray semicolons at the end of macro definitions,
and add a PWM_LOCK_ASSERT macro to round out the usual suite. Move the
device_t and driver_methods structs to the end of the file. Tweak comments.
Previously nandsim_chip_status returned EINVAL iff both of user-provided
chip->ctrl_num and chip->num were out of bounds. If only one failed the
bounds check arbitrary memory would be read and returned.
The NAND framework is not built by default, nandsim is not intended for
production use (it is a simulator), and the nandsim device has root-only
permissions.
admbugs: 827
Reported by: Daniel Hodson of elttam
MFC after: 3 days
Security: kernel information leak or DoS
Sponsored by: The FreeBSD Foundation
With this opcode it is possible to match TCP packets with specified
MSS option, whose value corresponds to configured in opcode value.
It is allowed to specify single value, range of values, or array of
specific values or ranges. E.g.
# ipfw add deny log tcp from any to any tcpmss 0-500
Reviewed by: melifaro,bcr
Obtained from: Yandex LLC
MFC after: 1 week
Sponsored by: Yandex LLC
If we take the slow path for forwarding we should still tell our
firewalls (hooked through pfil(9)) that we're forwarding. Pass the
ip_output() flags to ip_output_pfil() so it can set the PFIL_FWD flag
when we're forwarding.
MFC after: 1 week
Sponsored by: Axiado
bsnmpd(1) main does that early on init and the connection is available
to all loaded modules
Event: Vienna Hackathon 2019
PR: 233431 , 221487
MFC after: 2 weeks
pkg now uses /dev/null for some of its operations. NanoBSD's packaging
stuff didn't mount that for the chroot it ran in, so any config that
added packages would see the error:
pkg: Cannot open /dev/null:No such file or directory
when trying to actually add those packages. It's easy enough for
nanobsd to mount /dev and it won't hurt anything that was already
working and may help things that weren't (like this). I moved the
mount/unmount pair to be in the right push/pop order from the
submitted patch.
PR: 238727
Submitted by: mike tancsa
Tested by: Karl Denninger
Use appropriate fsyncs to persist the rewritten /etc/motd file, when a
rewrite is performed.
Reported by: Jonathan Walton <jonathan AT isilon.com>
Reviewed by: allanjude, vangyzen
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D20701
VOP_READ and VOP_WRITE take the seqcount in blocks in a 16-bit field.
However, fcntl allows you to set the seqcount in bytes to any nonnegative
31-bit value. The result can be a 16-bit overflow, which will be
sign-extended in functions like ffs_read. Fix this by sanitizing the
argument in kern_fcntl. As a matter of policy, limit to IO_SEQMAX rather
than INT16_MAX.
Also, fifos have overloaded the f_seqcount field for a completely different
purpose ever since r238936. Formalize that by using a union type.
Reviewed by: cem
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20710
In the case of mmap(), add a HISTORY section. Mention that mmap() and
mprotect()'s documentation predates an implementation. The
implementation first saw wide use in 4.3-Reno, but there seems to be no
easy way to express that in mdoc so stick with 4.4BSD.
Reviewed by: emaste
Requested by: cem
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D20713
Do not allocate temporary buffer for attributes we are going to return
as-is, just make sure to NUL-terminate them. Do not zero temporary 64KB
buffer for CDAI_TYPE_SCSI_DEVID, XPT tells us how much data it filled
and there are also length fields inside the returned data also.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
As reported in review D20709 by brooks calling vm_map_protect to set a
new max_protection value downgrades existing mappings if necessary (as
opposed to returning an error).
Reported by: brooks
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
A new macro PROT_MAX() alters a protection value so it can be OR'd with
a regular protection value to specify the maximum permissions. If
present, these flags specify the maximum permissions.
While these flags are non-portable, they can be used in portable code
with simple ifdefs to expand PROT_MAX() to 0.
This change allows (e.g.) a region that must be writable during run-time
linking or JIT code generation to be made permanently read+execute after
writes are complete. This complements W^X protections allowing more
precise control by the programmer.
This change alters mprotect argument checking and returns an error when
unhandled protection flags are set. This differs from POSIX (in that
POSIX only specifies an error), but is the documented behavior on Linux
and more closely matches historical mmap behavior.
In addition to explicit setting of the maximum permissions, an
experimental sysctl vm.imply_prot_max causes mmap to assume that the
initial permissions requested should be the maximum when the sysctl is
set to 1. PROT_NONE mappings are excluded from this for compatibility
with rtld and other consumers that use such mappings to reserve
address space before mapping contents into part of the reservation. A
final version this is expected to provide per-binary and per-process
opt-in/out options and this sysctl will go away in its current form.
As such it is undocumented.
Reviewed by: emaste, kib (prior version), markj
Additional suggestions from: alc
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D18880
It's implied by the man page's RETURN VALUES section, but be explicit in
the description that vm_map_protect can not set new protection bits that
are already in each entry's max_protection.
Reviewed by: brooks
MFC After: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20709
This ioctl exposes VOP_BMAP information to userland. It can be used by
programs like fragmentation analyzers and optimized cp implementations. But
I'm using it to test fusefs's VOP_BMAP implementation. The "2" in the name
distinguishes it from the similar but incompatible FIBMAP ioctls in NetBSD
and Linux. FIOBMAP2 differs from FIBMAP in that it uses a 64-bit block
number instead of 32-bit, and it also returns runp and runb.
Reviewed by: mckusick
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20705
inconsistent. This patch fixes that and improves the precision of
the description.
Thanks to Tom Marcoen for reporting the issue and providing an
initial patch, on which this change is based.
PR: 237723
Reviewed by: bcr@
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20708
wakeup_one() and underlying sleepq_signal() spend additional time trying
to be fair, waking thread with highest priority, sleeping longest time.
But in case of taskqueue there are many absolutely identical threads, and
any fairness between them is quite pointless. It makes even worse, since
round-robin wakeups not only make previous CPU affinity in scheduler quite
useless, but also hide from user chance to see CPU bottlenecks, when
sequential workload with one request at a time looks evenly distributed
between multiple threads.
This change adds new SLEEPQ_UNFAIR flag to sleepq_signal(), making it wakeup
thread that went to sleep last, but no longer in context switch (to avoid
immediate spinning on the thread lock). On top of that new wakeup_any()
function is added, equivalent to wakeup_one(), but setting the flag.
On top of that taskqueue(9) is switchied to wakeup_any() to wakeup its
threads.
As result, on 72-core Xeon v4 machine sequential ZFS write to 12 ZVOLs
with 16KB block size spend 34% less time in wakeup_any() and descendants
then it was spending in wakeup_one(), and total write throughput increased
by ~10% with the same as before CPU usage.
Reviewed by: markj, mmacy
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D20669
There are many new features in ZoF. Most, if not all, do not effect read only usage.
Encryption in particular is enabled at the pool level but used at the dataset level.
The loader obviously will not be able to boot if the boot dataset is encrypted, but
should not care if some other dataset in the root pool is encrypted.
Reviewed by: allanjude
MFC after: 1 week