The last usage of this function was removed in e3b1c847a4.
There are no in-tree consumers of kernel_vmount().
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D32607
They are unused today and cannot be safely used in the face of unlocked
lookup, in which pages may be busied without the object lock held.
Obtained from: jeff (object_concurrency patches)
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D32948
- Modify vm_page_busy_sleep() and vm_page_busy_sleep_unlocked() to take
a VM_ALLOC_* flag indicating whether to sleep on shared-busy, and fix
up callers.
- Modify vm_page_busy_sleep() to return a status indicating whether the
object lock was dropped, and fix up callers.
- Convert callers of vm_page_sleep_if_busy() to use vm_page_busy_sleep()
instead.
- Remove vm_page_sleep_if_(x)busy().
No functional change intended.
Obtained from: jeff (object_concurrency patches)
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D32947
NOTE: HEADS UP read the note below if your kernel config is not including GENERIC!!
This patch does a bit of cleanup on TCP congestion control modules. There were some rather
interesting surprises that one could get i.e. where you use a socket option to change
from one CC (say cc_cubic) to another CC (say cc_vegas) and you could in theory get
a memory failure and end up on cc_newreno. This is not what one would expect. The
new code fixes this by requiring a cc_data_sz() function so we can malloc with M_WAITOK
and pass in to the init function preallocated memory. The CC init is expected in this
case *not* to fail but if it does and a module does break the
"no fail with memory given" contract we do fall back to the CC that was in place at the time.
This also fixes up a set of common newreno utilities that can be shared amongst other
CC modules instead of the other CC modules reaching into newreno and executing
what they think is a "common and understood" function. Lets put these functions in
cc.c and that way we have a common place that is easily findable by future developers or
bug fixers. This also allows newreno to evolve and grow support for its features i.e. ABE
and HYSTART++ without having to dance through hoops for other CC modules, instead
both newreno and the other modules just call into the common functions if they desire
that behavior or roll there own if that makes more sense.
Note: This commit changes the kernel configuration!! If you are not using GENERIC in
some form you must add a CC module option (one of CC_NEWRENO, CC_VEGAS, CC_CUBIC,
CC_CDG, CC_CHD, CC_DCTCP, CC_HTCP, CC_HD). You can have more than one defined
as well if you desire. Note that if you create a kernel configuration that does not
define a congestion control module and includes INET or INET6 the kernel compile will
break. Also you need to define a default, generic adds 'options CC_DEFAULT=\"newreno\"
but you can specify any string that represents the name of the CC module (same names
that show up in the CC module list under net.inet.tcp.cc). If you fail to add the
options CC_DEFAULT in your kernel configuration the kernel build will also break.
Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
RELNOTES:YES
Differential Revision: https://reviews.freebsd.org/D32693
Commit f0c9847a6c added the ioflag and cred arguments to
VOP_ALLOCATE() for NFSv4.2 server support. This patch updates
the man page for these arguments.
Reviewed by: khng, gbe
Differential Revision: https://reviews.freebsd.org/D32898
Document the new allocator variants and flesh out the description of
some details of the page allocator interface.
Reviewed by: kib, alc
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32035
Eliminate the nested loops and re-implement following a suggestion from
rlibby.
Add some simple regression tests.
Reviewed by: rlibby, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32472
This function was renamed to kern_reboot() in 2010, but the man page has
failed to keep in sync. Bring it up to date on the rename, add the
shutdown hooks to the synopsis, and document the (obvious) fact that
kern_reboot() does not return.
Fix an outdated reference to the old name in kern_reboot(), and leave a
reference to the man page so future readers might find it before any
large changes.
Reviewed by: imp, markj
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32085
These allow one to non-destructively iterate over the set or clear bits
in a bitset. The motivation is that we have several code fragments
which iterate over a CPU set like this:
while ((cpu = CPU_FFS(&cpus)) != 0) {
cpu--;
CPU_CLR(cpu, &cpus);
<do something>;
}
This is slow since CPU_FFS begins the search at the beginning of the
bitset each time. On amd64 and arm64, CPU sets have size 256, so there
are four limbs in the bitset and we do a lot of unnecessary scanning.
A second problem is that this is destructive, so code which needs to
preserve the original set has to make a copy. In particular, we have
quite a few functions which take a cpuset_t parameter by value, meaning
that each call has to copy the 32 byte cpuset_t.
The new macros address both problems.
Reviewed by: cem, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32028
Generialize bus specific property accessors. Those functions allow driver code
to access device specific information.
Currently there is only support for FDT and ACPI buses.
Reviewed by: manu, mw
Sponsored by: Semihalf
Differential revision: https://reviews.freebsd.org/D31597
Implement lock_spin()/unlock_spin() lock class methods, moving the
assertion to _sleep() instead. Change assertions in callout(9) to
allow spin locks for both regular and C_DIRECT_EXEC cases. In case of
C_DIRECT_EXEC callouts spin locks are the only locks allowed actually.
As the first use case allow taskqueue_enqueue_timeout() use on fast
task queues. It actually becomes more efficient due to avoided extra
context switches in callout(9) thanks to C_DIRECT_EXEC.
MFC after: 2 weeks
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D31778
pmap_extract_and_hold() returns a vm_page_t instead of a physical page
address.
Sponsored by: The FreeBSD Foundation
Reviewed by: alc
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D31691
rmsr.r_offset now is set to rqsr.r_offset plus the number of bytes
zeroed before hitting the end-of-file. After this change rmsr.r_offset
no longer contains the EOF when the requested operation range is
completely beyond the end-of-file. Instead in such case rmsr.r_offset is
equal to rqsr.r_offset. Callers can obtain the number of bytes zeroed
by subtracting rqsr.r_offset from rmsr.r_offset.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31677
rmacklem@ spotted two things in the system call:
- Upon returning from a successful operation, vop_stddeallocate can
update rmsr.r_offset to a value greater than file size. This behavior,
although being harmless, can be confusing.
- The EINVAL return value for rqsr.r_offset + rqsr.r_len > OFF_MAX is
undocumented.
This commit has the following changes:
- vop_stddeallocate and shm_deallocate to bound the the affected area
further by the file size.
- The EINVAL case for rqsr.r_offset + rqsr.r_len > OFF_MAX is
documented.
- The fspacectl(2), vn_deallocate(9) and VOP_DEALLOCATE(9)'s return
len is explicitly documented the be the value 0, and the return offset
is restricted to be the smallest of off + len and current file size
suggested by kib@. This semantic allows callers to interact better
with potential file size growth after the call.
Sponsored by: The FreeBSD Foundation
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D31604
Introduce m_get3() which is similar to m_get2(), but can allocate up to
MJUM16BYTES bytes (m_get2() can only allocate up to MJUMPAGESIZE).
This simplifies the bpf improvement in f13da24715.
Suggested by: glebius
Differential Revision: https://reviews.freebsd.org/D31455
The sysctl man page cautions against negative-sense boolean sysctls
(foobar_disable), but it gets lost at the end of a large paragraph.
Move it to a separate paragraph in an attempt to make it more clear.
This man page could use a more holistic review and edit pass. This
change is simple and straightforward and I hope provides a small but
immediate benefit.
This gives any given domain a chance to indicate that it's not actually
supported on the current system. If dom_probe isn't supplied, we assume
the domain is universally applicable as most of them are. Keeping
fully-initialized and registered domains around that physically can't
work on a large majority of FreeBSD deployments is sub-optimal and leads
to errors that aren't consistent with the reality of why the socket
can't be created (e.g. ESOCKTNOSUPPORT) because such scenario has to be
caught upon pru_attach, at which point kicking back the more-appropriate
EAFNOSUPPORT would seem weird.
The initial consumer of this will be hvsock, which is only available on
HyperV guests.
Reviewed by: cem (earlier version), bcr (manpages)
Differential Revision: https://reviews.freebsd.org/D25062
The addition of ioflag allows callers passing
IO_SYNC/IO_DATASYNC/IO_DIRECT down to the file system implementation.
The vop_stddeallocate fallback implementation is updated to pass the
ioflag to the file system implementation. vn_deallocate(9) internally is
also changed to pass ioflag to the VOP_DEALLOCATE call.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D31500
This includes a style fix around ioflag checking as well.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib, bcr
Differential Revision: https://reviews.freebsd.org/D31505
Add more manual pages which were not spotted previously in 0a0f748641
Ideally to be MFH'ed with:
8539518055 - Remove manpages from OLD_FILES
8b487b8292 - Fix bsd.subdir.mk-related issues after 0a0f748641f6043a6721 - ObsoleteFiles.inc: Remove manpages from OLD_FILES
0a0f748641 - man: Build manpages for all architectures
There is at least one pending issue when building with -DNO_ROOT.
Reported by: ceri@
MFH: 4 weeks
Discussed with: wosch
Differential Revision: https://reviews.freebsd.org/D31018
fspacectl(2) is a system call to provide space management support to
userspace applications. VOP_DEALLOCATE(9) is a VOP call to perform the
deallocation. vn_deallocate(9) is a public KPI for kmods' use.
The purpose of proposing a new system call, a KPI and a VOP call is to
allow bhyve or other hypervisor monitors to emulate the behavior of SCSI
UNMAP/NVMe DEALLOCATE on a plain file.
fspacectl(2) comprises of cmd and flags parameters to specify the
space management operation to be performed. Currently cmd has to be
SPACECTL_DEALLOC, and flags has to be 0.
fo_fspacectl is added to fileops.
VOP_DEALLOCATE(9) is added as a new VOP call. A trivial implementation
of VOP_DEALLOCATE(9) is provided.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D28347
This KPI is created in addition to the existing vnode_pager_setsize(9)
KPI. The KPI is intended for file systems that are able to turn a range
of file into sparse range, also known as hole-punching.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D27194
Comments on a pending kvmclock driver suggested adding a
malloc_aligned() to complement malloc_domainset_aligned(); add it now,
and document both.
Reviewed by: imp, kib, allanjude (manpages)
Differential Revision: https://reviews.freebsd.org/D31004
g_alloc_event will allocate storage for an opaque event. g_post_event_ep
can use memory returned by g_alloc_event to send an event from a context
that might not be able to allocate the event. Occasionally, we can
alloate memory when we create an object, but not while we're destroy
it. This allows one to allocate at creation time memory to use when
destorying the object.
Reviewed by: jhb
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30544
Platforms may either silently handle unaligned accesses or return an
error. Atomicity is not guaranteed in this case, however.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31282
Refine mistakes from adaptaton of NetBSD's hardclock man page to
FreeBSD:
o clarify what usermode means
o clarify how often hardclock is called
o remove Xr callout(9) since that's done elsewhere
Reviewed by: mav@
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30982
Update the stathz description to reflect reality. profhz is the only
thing we should deprecate. Add some implementation notes that describe
the optimizations made to date.
Discusssed with: emaste
Reviewed by: kib (prior), jhb (prior), gbe
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30815
Now that the upper layers all go through a layer to tie into these
information functions that translates an sbuf into char * and len. The
current interface suffers issues of what to do in cases of truncation,
etc. Instead, migrate all these functions to using struct sbuf and these
issues go away. The caller is also in charge of any memory allocation
and/or expansion that's needed during this process.
Create a bus_generic_child_{pnpinfo,location} and make it default. It
just returns success. This is for those busses that have no information
for these items. Migrate the now-empty routines to using this as
appropriate.
Document these new interfaces with man pages, and oversight from before.
Reviewed by: jhb, bcr
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29937
Document aspects of system time keeping. Hz is the nominal rate that we
interrupt the system and is known and the 'tick' period of 1 / hz.
hardclock is the routine that does various bits of timekeeping. stathz
and profhz are documented as historical relics that are deprecated
and replaced by hwpmc.4 and others.
Reviewed by: phk@, mav@ and gnn@ (previous version)
Obtained from: hardclock.9 from NetBSD (with FreeBSD adjustments)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30802
Codify our standard practice with $FreeBSD$
o New code only needs it if it might land in stable/12
o Old code should retain it until stable/12 is unsupported
o We'll do a bulk remove in the future: don't do it proactively.
o Give advice about how to tag files derived from other files
in the tree.
Reviewed by: bcr, allanjude,ceri
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30789