This allows simulating a disk that responds slowly to I/O requests.
Reviewed by: markj, bcr, pjd (previous version)
Differential Revision: https://reviews.freebsd.org/D21052
Avoid potential structure padding leak. r350294 identified a leak via
static analysis; although there's no report of a leak with the
DIOCGETSRCNODES ioctl it's a good practice to zero the memory.
Suggested by: kp
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
The motivation for this change is to allow wrappers around shm to be written
that don't set CLOEXEC. kern_shm_open currently accepts O_CLOEXEC but sets
it unconditionally. kern_shm_open is used by the shm_open(2) syscall, which
is mandated by POSIX to set CLOEXEC, and CloudABI's sys_fd_create1().
Presumably O_CLOEXEC is intended in the latter caller, but it's unclear from
the context.
sys_shm_open() now unconditionally sets O_CLOEXEC to meet POSIX
requirements, and a comment has been dropped in to kern_fd_open() to explain
the situation and add a pointer to where O_CLOEXEC setting is maintained for
shm_open(2) correctness. CloudABI's sys_fd_create1() also unconditionally
sets O_CLOEXEC to match previous behavior.
This also has the side-effect of making flags correctly reflect the
O_CLOEXEC status on this fd for the rest of kern_shm_open(), but a
glance-over leads me to believe that it didn't really matter.
Reviewed by: kib, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D21119
mapping and then destroy one of the 4 KB page mappings so that there is a
potential trigger for repromotion. Currently, we destroy the first 4 KB
page mapping that falls within the (current) superpage mapping or the
virtual address range [sva, eva). However, I have found empirically that
destroying the last 4 KB mapping produces slightly better results,
specifically, more promotions and fewer failed promotion attempts.
Accordingly, this revision changes pmap_advise() to destroy the last 4 KB
page mapping. It also replaces some nearby uses of boolean_t with bool.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D21115
r350437 presents a merge conflict with r350115, which raised
__FreeBSD_version due to the addition of fusefs's intr/nointr mount options.
Sponsored by: The FreeBSD Foundation
It is assembled using "${CC} -x assembler-with-cpp", which by convention
(bsd.suffixes.mk) uses the .asm extension.
This is a portion of the review referenced below (D18344). That review
also renamed linux_support.s to .S, but that is a functional change
(using the compiler's integrated assembler instead of as) and will be
revisited separately.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18344
witness has long had a facility to "bless" designated lock pairs. Lock
order reversals between a pair of blessed locks are not reported upon.
We have a number of long-standing false positive LOR reports; start
marking well-understood LORs as blessed.
This change hides reports about UFS vnode locks and the UFS dirhash
lock, and UFS vnode locks and buffer locks, since those are the two that
I observe most often. In the long term it would be preferable to be
able to limit blessings to a specific site where a lock is acquired,
and/or extend witness to understand why some lock order reversals are
valid (for example, if code paths with conflicting lock orders are
serialized by a third lock), but in the meantime the false positives
frequently confuse users and generate bug reports.
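
As a rough illustration of the blessing facility described above, the
sketch below keeps a table of lock-name pairs and treats a reversal as
blessed if the two names match an entry in either order. The sk_ names
and the lock-name strings are placeholders chosen for this example, not
the identifiers used by witness(4).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pair table; the strings are illustrative placeholders. */
struct sk_blessed {
	const char *a;
	const char *b;
};

static const struct sk_blessed sk_blessed_list[] = {
	{ "ufs", "dirhash" },
	{ "ufs", "bufwait" },
};

/* Return true if (l1, l2) matches a blessed pair in either order. */
static bool
sk_is_blessed(const char *l1, const char *l2)
{
	size_t i, n;

	n = sizeof(sk_blessed_list) / sizeof(sk_blessed_list[0]);
	for (i = 0; i < n; i++) {
		const struct sk_blessed *b = &sk_blessed_list[i];

		if (strcmp(l1, b->a) == 0 && strcmp(l2, b->b) == 0)
			return (true);
		if (strcmp(l1, b->b) == 0 && strcmp(l2, b->a) == 0)
			return (true);
	}
	return (false);
}
```

A pair-based table like this cannot distinguish acquisition sites,
which is the limitation the message mentions.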
Reviewed by: cem, kib, mckusick
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21039
copy_file_range() operates on a pair of file descriptors; it requires
CAP_READ for the source descriptor and CAP_WRITE for the destination
descriptor.
Reviewed by: kevans, oshogbo
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21113
Attempt to mitigate the security risks around refcount overflows by
introducing a "saturated" state for the counter. Once a counter reaches
INT_MAX+1, subsequent acquire and release operations will blindly set
the counter value to INT_MAX + INT_MAX/2, ensuring that the protected
resource will not be freed; instead, it will merely be leaked.
The approach introduces a small race: if a refcount value reaches
INT_MAX+1, a subsequent release will cause the releasing thread to set
the counter to the saturation value after performing the decrement. If
in the intervening window INT_MAX refcount releases are performed by a
different thread, a use-after-free is possible. This is very difficult
to trigger in practice, and any situation where it could be triggered
would likely be vulnerable to reference count wraparound problems
to begin with. An alternative would be to use atomic_cmpset to acquire
and release references, but this would introduce a larger performance
penalty, particularly when the counter is contended.
Note that refcount_acquire_checked(9) maintains its previous behaviour;
code which must accurately track references should use it instead of
refcount_acquire(9).
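
The saturation scheme described above can be sketched in userland C
roughly as follows. This is a minimal illustration built on C11 atomics,
not the kernel's refcount(9) code; the sk_ and SK_ names are invented
for the example, and the saturation value follows the INT_MAX + INT_MAX/2
figure quoted in this message.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not the kernel implementation. */
#define SK_SATURATED(v)	((v) > (unsigned int)INT_MAX)
#define SK_SATURATION	((unsigned int)INT_MAX + (unsigned int)(INT_MAX / 2))

static void
sk_refcount_acquire(atomic_uint *count)
{
	unsigned int old;

	old = atomic_fetch_add(count, 1);
	/* Once past INT_MAX, pin the counter instead of letting it wrap. */
	if (SK_SATURATED(old))
		atomic_store(count, SK_SATURATION);
}

/* Return true only when the last reference has been dropped. */
static bool
sk_refcount_release(atomic_uint *count)
{
	unsigned int old;

	old = atomic_fetch_sub(count, 1);
	if (SK_SATURATED(old)) {
		/*
		 * The decrement above and this store are separate steps;
		 * that gap is the small race window the message describes.
		 */
		atomic_store(count, SK_SATURATION);
		return (false);
	}
	return (old == 1);
}
```

A saturated counter never reports the final release, so the protected
object is leaked rather than freed.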
Reviewed by: kib, mjg
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21089
The current implementation of gzipped a.out support was based
on a very old version of InfoZIP which ships with an ancient
modified version of zlib, and was removed from the GENERIC
kernel in 1999 when we moved to an ELF world.
PR: 205822
Reviewed by: imp, kib, emaste, Yoshihiro Ota <ota at j.email.ne.jp>
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D21099
Both of these functions atomically unwire a page, optionally attempt
to free the page, and enqueue or requeue the page. Add functions
vm_page_release() and vm_page_release_locked() to perform the same task.
The latter must be called with the page's object lock held.
As a side effect of this refactoring, the buffer cache will no longer
attempt to free mapped pages when completing direct I/O. This is
consistent with the handling of pages by sendfile(SF_NOCACHE).
Reviewed by: alc, kib
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20986
If we take a WnR permission fault on a managed, writeable and dirty
PTE, simply return success without calling the main fault handler. This
situation can occur if multiple threads simultaneously access a clean
writeable mapping and trigger WnR faults; losers of the race to mark the
PTE dirty would end up calling the main fault handler, which had no work
to do.
Reported by: alc
Reviewed by: alc
MFC with: r350004
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21097
This is a partial merge of 350144 from projects/fuse2
PR: 236466
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21095
Terasic DE10-Pro (an Intel Stratix 10 GX/SX FPGA Development Kit).
The Altera EMAC is an instance of Synopsys DesignWare Gigabit MAC.
This driver sets the correct clock range for the MDIO interface on the
Intel Stratix 10 platform.
This is required because this platform lacks support for a clock manager
device that could tell us the clock frequency of the ethernet clock
domain.
Sponsored by: DARPA, AFRL
ACTION_PTR() returns a pointer to the start of the rule's action section,
but a rule can carry several modifiers, such as O_LOG, O_TAG and O_ALTQ,
and only after them is the real action opcode stored.
The ipfw_get_action() function inspects the rule's action section, skips
all modifiers and returns the action opcode.
Use this function in ipfw_reset_eaction() and flush_nat_ptrs().
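
The modifier-skipping walk described above might look roughly like the
following standalone sketch. The instruction layout, the SK_ opcode
names and the NULL return for a rule without an action are simplifying
assumptions made for this example; they do not reproduce ipfw's real
structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes: three modifiers followed by two real actions. */
enum sk_opcode { SK_O_LOG, SK_O_TAG, SK_O_ALTQ, SK_O_ACCEPT, SK_O_DENY };

struct sk_insn {
	uint8_t opcode;
	uint8_t len;	/* instruction length in units of struct sk_insn */
};

/*
 * Walk the action section, skipping modifiers, and return the first
 * instruction that is a real action (NULL if none is found).
 */
static const struct sk_insn *
sk_get_action(const struct sk_insn *cmd, int remaining)
{
	while (remaining > 0) {
		switch (cmd->opcode) {
		case SK_O_LOG:
		case SK_O_TAG:
		case SK_O_ALTQ:
			break;		/* modifier: keep walking */
		default:
			return (cmd);	/* the real action opcode */
		}
		if (cmd->len == 0)
			return (NULL);	/* malformed: avoid looping forever */
		remaining -= cmd->len;
		cmd += cmd->len;
	}
	return (NULL);
}
```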
MFC after: 1 week
Sponsored by: Yandex LLC
r343275 introduced a performance optimisation to the copyin/copyout
routines by attempting to copy word-per-word rather than byte-per-byte
where possible.
This optimisation failed to account for cases where the buffer is longer
than XLEN_BYTES, but due to misalignment does not allow for any
word-sized copies. E.g. a 9 byte buffer (with XLEN_BYTES == 8) which is
misaligned by 2 bytes. The code nevertheless did a single full-word
copy, which meant we copied too much data. This potentially clobbered
other data.
This is most easily demonstrated by a simple `sysctl -a`.
Fix it by not assuming that we'll always have at least one full-word
copy to do, but instead checking the remaining length first.
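
The fix above amounts to checking the remaining length before every
full-word copy. The portable C sketch below illustrates that idea only:
the real routines are written in assembly, and this version aligns just
the source pointer. The sk_copy() name, SK_XLEN_BYTES and the returned
word count are inventions of this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_XLEN_BYTES	((size_t)8)	/* assumed word size for the sketch */

/* Copy len bytes and return how many full-word copies were performed. */
static size_t
sk_copy(unsigned char *dst, const unsigned char *src, size_t len)
{
	size_t words = 0;

	/* Byte copies until the source is word-aligned or data runs out. */
	while (len > 0 && ((uintptr_t)src % SK_XLEN_BYTES) != 0) {
		*dst++ = *src++;
		len--;
	}
	/* Word copies only while a whole word actually remains. */
	while (len >= SK_XLEN_BYTES) {
		memcpy(dst, src, SK_XLEN_BYTES);
		dst += SK_XLEN_BYTES;
		src += SK_XLEN_BYTES;
		len -= SK_XLEN_BYTES;
		words++;
	}
	/* Trailing bytes. */
	while (len > 0) {
		*dst++ = *src++;
		len--;
	}
	return (words);
}
```

With a 9-byte buffer misaligned by 2 bytes, six byte copies reach
alignment and only three bytes remain, so no word copy takes place.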
Reviewed by: markj@, mhorne@, br@ (previous version)
MFC after: 1 week
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D21100
Remove our (very partial) support for RFC2675 Jumbograms. They're not
used, not actually supported and not a good idea.
Reviewed by: thj@
Differential Revision: https://reviews.freebsd.org/D21086
After r343619 ipfw uses its own locking for packet flow. The PULLUP_LEN()
macro is used in ipfw_chk() to call m_pullup(). When m_pullup() fails, it
just returns via `goto pullup_failed`. There are two places where
PULLUP_LEN() is called with IPFW_PF_RLOCK() held.
Add a PULLUP_LEN_LOCKED() macro for use in these places so that the lock
is released when m_pullup() fails.
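
The pattern can be sketched with stand-ins as follows. Everything here
is hypothetical: the lock is modelled as a plain counter, sk_pullup()
merely compares lengths in place of m_pullup(), and the macro bodies
are an approximation for illustration rather than the ipfw source.

```c
#include <stdbool.h>
#include <stddef.h>

static int sk_lock_held;	/* stand-in for the read lock state */

#define SK_RLOCK()	((void)sk_lock_held++)
#define SK_RUNLOCK()	((void)sk_lock_held--)

/* Stand-in for m_pullup(): fails when fewer bytes exist than needed. */
static bool
sk_pullup(size_t have, size_t need)
{
	return (have >= need);
}

#define SK_PULLUP_LEN(have, need) do {				\
	if (!sk_pullup((have), (need)))				\
		goto pullup_failed;				\
} while (0)

/* Same check, but drop the lock before taking the failure path. */
#define SK_PULLUP_LEN_LOCKED(have, need) do {			\
	if (!sk_pullup((have), (need))) {			\
		SK_RUNLOCK();					\
		goto pullup_failed;				\
	}							\
} while (0)

static int
sk_chk(size_t pktlen)
{
	SK_PULLUP_LEN(pktlen, 20);		/* no lock held yet */

	SK_RLOCK();
	SK_PULLUP_LEN_LOCKED(pktlen, 40);	/* lock held here */
	SK_RUNLOCK();
	return (0);

pullup_failed:
	return (-1);
}
```

Using the unlocked macro in the locked region would take the failure
path with the lock still held.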
Sponsored by: Yandex LLC
In DTS files from Linux >= 5.0, the slave addresses are relative to the
parent node address rather than being full addresses.
Check both so the cpsw driver can find the phy id.
r350229 changed the code to look up the ti,hwmods property in the parent,
since that is where it lives in DTS files from Linux >= 5.0. Also allow
the property to be in the node itself so we can boot with an older DTB.
Reported by: "Dr. Rolf Jansen" <rj@obsigna.com>
DCTCP specific methods. Also fall through to NewReno for non-ECN-capable
TCP connections and improve the integer arithmetic.
Obtained from: Richard Scheffenegger
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20550
* Initialize the alpha parameter to a conservative value (like Linux).
* Improve handling of arithmetic.
* Improve the man page.
Obtained from: Richard Scheffenegger
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D20549
counter, and the final freeing of freed swap blocks, outside the
region where an object lock is held. Correct some style(9) and
spelling errors. Change a panic() to a KASSERT(). Change a boolean_t
to a bool.
Suggested by: alc
Reviewed by: alc
Approved by: kib, markj (mentors)
Differential Revision: https://reviews.freebsd.org/D21093
When a fusefs file system is mounted using the writeback cache, the cache
may still be bypassed by opening a file with O_DIRECT. When writing with
O_DIRECT, the cache must be invalidated for the affected portion of the
file. Fix some panics caused by inadvertently invalidating too much.
Sponsored by: The FreeBSD Foundation
v_inval_buf_range invalidates all buffers within a certain LBA range of a
file. It will be used by fusefs(5). This commit is a partial merge of
r346162, r346606, and r346756 from projects/fuse2.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21032
If not limited by the write_same_max_lba option, split the operation into
several chunks of 2^31 blocks each in a loop. For large disks this may take
a while, so setting write_same_max_lba may be useful to avoid timeouts.
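
The chunking loop described above can be sketched as below. This is an
illustration under assumptions, not CTL's code: the callback type, the
sk_ names and the counting callback are all invented for the example.

```c
#include <stdint.h>

#define SK_CHUNK_BLOCKS	(UINT64_C(1) << 31)

/* Hypothetical per-chunk writer: returns 0 on success, else an error. */
typedef int (*sk_write_fn)(uint64_t lba, uint64_t nblocks, void *arg);

/* Issue the request as a series of chunks of at most 2^31 blocks. */
static int
sk_write_same(uint64_t lba, uint64_t nblocks, sk_write_fn fn, void *arg)
{
	while (nblocks > 0) {
		uint64_t n;
		int error;

		n = nblocks < SK_CHUNK_BLOCKS ? nblocks : SK_CHUNK_BLOCKS;
		error = fn(lba, n, arg);
		if (error != 0)
			return (error);
		lba += n;
		nblocks -= n;
	}
	return (0);
}

/* Example callback that only records what it was asked to write. */
struct sk_stats {
	uint64_t chunks;
	uint64_t blocks;
};

static int
sk_count_chunk(uint64_t lba, uint64_t nblocks, void *arg)
{
	struct sk_stats *st = arg;

	(void)lba;
	st->chunks++;
	st->blocks += nblocks;
	return (0);
}
```

A request of 2^32 + 5 blocks is issued as two full chunks followed by
one 5-block chunk.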
While there, fix build with CAM_CTL_DEBUG.
MFC after: 2 weeks
Nothing uses these anymore. They were for super small armv4 boards without
uboot. We removed armv4 support before 13.0, but neglected to garbage collect
this at the same time. Today, both flavors of armv5 kernels (mv and ralink) boot
via uboot which has its own compression scheme for boards that need it.
Note: OLDFILES has not been updated because installkernel will move the whole
directory out of the way before installing the new kernel.
Differential Revision: https://reviews.freebsd.org/D21072
Substitute driver-defined IS_P2ALIGNED() with EFX_IS_P2ALIGNED()
defined in libefx.
Add a type argument and cast both the value and the alignment to that type.
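
A power-of-two alignment check with an explicit type argument could be
written along the lines below. SK_IS_P2ALIGNED() is a hypothetical
stand-in written for this example; the actual EFX_IS_P2ALIGNED()
definition in libefx may differ.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical macro: cast value and alignment to one caller-chosen
 * type so the mask arithmetic is done in a single, known width.
 * The alignment must be a power of two.
 */
#define SK_IS_P2ALIGNED(type, value, align)				\
	((((type)(value)) & ((type)(align) - 1)) == 0)
```

For example, SK_IS_P2ALIGNED(uint64_t, addr, 4096) performs the check
in 64 bits whatever the platform's pointer width is.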
Reported by: Andrea Valsania <andrea.valsania at answervad.it>
Reviewed by: philip
Sponsored by: Solarflare Communications, Inc.
MFC after: 2 days
Differential Revision: https://reviews.freebsd.org/D21076