Commit Graph

832 Commits

Author SHA1 Message Date
Mark Johnston
e72f7ed43e buf: Dynamically allocate per-CPU buffer queues
To reduce static bloat.  No functional change intended.

PR:		269572
Reviewed by:	mjg, kib, emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D39808
2023-04-26 10:09:31 -04:00
Mark Johnston
bcd8cd859e buf: Make buf_daemon_shutdown() a no-op after a panic
As in commit 9d7cc536e2, there is no need to do anything in this
context.

MFC after:	1 week
2023-03-01 10:15:54 -05:00
Mark Johnston
9d7cc536e2 buf: Make bufspace_daemon_shutdown() a no-op after a panic
This function doesn't need to do anything in that context, and calling
wakeup() can lead to recursive panics.

Discussed with:	mhorne
MFC after:	1 week
2023-02-23 21:56:36 -05:00
Konstantin Belousov
020e8a4d06 allocbuf(): convert direct panic() calls to KASSERT()s
Also do minor style adjustments.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38549
2023-02-14 00:28:42 +02:00
Mitchell Horne
1029dab634 mi_switch(): clean up switch types and their usage
Overall, this is a non-functional change, except for kernels built with
SCHED_STATS. However, the switch types are useful for communicating the
intent of the caller.

1. Ensure that every caller provides a type. In most cases, we upgrade
   the basic yield to sched_relinquish() aka SWT_RELINQUISH.
2. The case of sched_bind() is distinct, so add a new switch type SWT_BIND.
3. Remove the two unused types, SWT_PREEMPT and SWT_SLEEPQTIMO.
4. Remove SWT_NONE altogether and assert that callers always provide
   a type flag.
5. Reference the mi_switch(9) man page in the comments, as these flags
   will be documented there.

Reviewed by:	kib, markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D38184
2023-02-09 12:01:32 -04:00
Mateusz Guzik
83286682f8 vfs: whack mips remnant
This reverts commit 8ffa01a061.
2022-11-09 00:31:50 +00:00
Dimitry Andric
a387bd1b6a Adjust function definition in vfs_bio.c to avoid clang 15 warnings
With clang 15, the following -Werror warning is produced:

    sys/kern/vfs_bio.c:3430:11: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
    buf_daemon()
              ^
               void

This is because buf_daemon() is declared with a (void) argument list,
but defined with an empty argument list. Make the definition match the
declaration.

MFC after:	3 days
2022-07-26 19:59:57 +02:00
Mitchell Horne
c84c5e00ac ddb: annotate some commands with DB_CMD_MEMSAFE
This is not completely exhaustive, but covers a large majority of
commands in the tree.

Reviewed by:	markj
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D35583
2022-07-18 22:06:09 +00:00
Chuck Silvers
5bd21cbbd1 vfs: fix vfs_bio_clrbuf() for PAGE_SIZE > block size
Calculate the desired page valid mask using math that will not
overflow the types used.

Sponsored by:	Netflix

Reviewed by:	mckusick, kib, markj
Differential Revision:	https://reviews.freebsd.org/D34837
2022-06-21 17:58:52 -07:00
Konstantin Belousov
1fb00c8f10 buf_alloc(): Stop using LK_NOWAIT, use LK_NOWITNESS
Despite the buffer taken from cache or free list, it still can be
locked, due to 'lockless lookup' in getblkx() potentially operating on
the freed buffers.  The lock is transient, but prevents the use of
LK_NOWAIT there for the goal of neutralizing WITNESS.

Just use LK_NOWITNESS.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-03-06 10:29:31 -05:00
Mitchell Horne
5a8fceb3bd boottrace: trace annotations for startup and shutdown
Add trace events for execution of SYSINITs (both static and dynamically
loaded), and to the various steps in the shutdown/panic/reboot paths.

Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
X-NetApp-PR:	#23
Differential Revision:	https://reviews.freebsd.org/D30187
2022-02-21 20:15:57 -04:00
Konstantin Belousov
c02780b78c Add GB_NOWITNESS flag
It prevents WITNESS from recording the lock order for the buffer lock
acquired by getblkx().

Reviewed by:	mckusick
Discussed with:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34073
2022-02-01 06:54:50 +02:00
Konstantin Belousov
5875b94c74 buf_alloc(): lock the buffer with LK_NOWAIT
The buffer must not be accessed by any other thread, it is freshly
allocated.  As such, LK_NOWAIT should be nop but also it prevents
recording the order between the buffer lock and any other locks we might
own in the call to getnewbuf().  In particular, if we own FFS snap lock,
it should avoid triggering false positive warning.

Reviewed by:	markj, mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34072
2022-01-31 04:46:21 +02:00
Konstantin Belousov
531f8cfea0 Use dedicated lock name for pbufs
Also remove a pointer to array variable, use array address directly.

Reviewed by:	markj, mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34072
2022-01-31 04:46:14 +02:00
Alexander Motin
b7ff445ffa Reduce bufdaemon/bufspacedaemon shutdown time.
Before this change bufdaemon and bufspacedaemon threads used
kthread_shutdown() to stop activity on system shutdown.  The problem is
that kthread_shutdown() has no idea about the wait channel and lock used
by specific thread to wake them up reliably.  As result, up to 9 threads
could consume up to 9 seconds to shutdown for no good reason.

This change introduces specific shutdown functions, knowing how to
properly wake up specific threads, reducing wait for those threads on
shutdown/reboot from average 4 seconds to effectively zero.

MFC after:	2 weeks
Reviewed by:	kib, markj
Differential Revision:  https://reviews.freebsd.org/D33936
2022-01-18 19:26:16 -05:00
Alexander Motin
e76c010899 Fix inverse sleep logic in buf_daemon().
Before commit 3cec5c77d6 buf_daemon() went to longer 1s sleep if
numdirtybuffers <= lodirtybuffers.  After that commit new condition
!BIT_EMPTY(BUF_DOMAINS, &bdlodirty) got opposite -- true when one
or more more domains is above lodirtybuffers.  As result, on freshly
booted system with no dirty buffers buf_daemon() wakes up 10 times
per second and probably only 1 time per second when there is actual
work to do.

MFC after:	1 week
Reviewed by:	kib, markj
Tested by:	pho
Differential revision:	https://reviews.freebsd.org/D33890
2022-01-15 19:32:36 -05:00
Konstantin Belousov
a5c2d59ed3 Expand comment explaining reasons for automatic swapoff on shutdown
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33167
2021-12-03 10:42:21 +02:00
Gordon Bergling
b6f4818a7e vfs: Fix a typo in a sysctl description
- s/dependecies/dependencies/

MFC after:	3 days
2021-11-30 07:28:40 +01:00
Konstantin Belousov
08bb51f8d6 shutdown: unmount filesystems after swapoff
Swap on file requires operational underlying mount, otherwise
swapoff_all() is guaranteed to panic due to the default strategy VOP for
reclaimed vnodes.

Reported and tested by:	peterj
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33147
2021-11-29 18:38:02 +02:00
Wuyang Chung
8587d75255 Correct the name of the second parameter of biowait to wmesg
This parameter is passed directly to msleep, and the name of the msleep
parameter is wmesg. Make them match.

Pull Request: https://github.com/freebsd/freebsd-src/pull/557
2021-11-18 23:26:33 -07:00
Konstantin Belousov
a7b4a54d2c getblk(): do not require devvp vnodes to be locked
Reported and tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32761
2021-11-13 01:00:24 +02:00
Kirk McKusick
dfd704b7fb Allow biodone() to be used as a completion routine.
An ordered series of BIO_READ and BIO_WRITE operations are
typically done as:

	while (work to do) {
		setup bp for I/O
		g_io_request(bp, consumer);
		biowait(bp);
	}

Here you need to have biodone() called at the completion of
the I/O to set the BIO_DONE flag and awaken the biowait(). The
obvious way to do this would be to set bio_done = biodone, but
biodone() will only take the desired action if bio_done == NULL.
The relevant code at the end of biodone() is:

	done = bp->bio_done;
	if (done == NULL) {
		mtxp = mtx_pool_find(mtxpool_sleep, bp);
		mtx_lock(mtxp);
		bp->bio_flags |= BIO_DONE;
		wakeup(bp);
		mtx_unlock(mtxp);
	} else
		done(bp);

This code would infinitely recurse if biodone() is specified as the
routine to use at completion. So before this change, a wrapper done
function had to be written:

static void
g_io_done(struct bio *bp)
{

	bp->bio_done = NULL;
	biodone(bp);
	bp->bio_done = g_io_done;
}

This commit changes

	if (done == NULL)

to

	if (done == NULL || done == biodone)

which eliminates the need for the wrapper function.

Reviewed by:  kib
Sponsored by: Netflix
2021-10-23 14:11:57 -07:00
Mark Johnston
a4667e09e6 Convert vm_page_alloc() callers to use vm_page_alloc_noobj().
Remove page zeroing code from consumers and stop specifying
VM_ALLOC_NOOBJ.  In a few places, also convert an allocation loop to
simply use VM_ALLOC_WAITOK.

Similarly, convert vm_page_alloc_domain() callers.

Note that callers are now responsible for assigning the pindex.

Reviewed by:	alc, hselasky, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31986
2021-10-19 21:22:56 -04:00
Konstantin Belousov
197a4f29f3 buffer pager: allow get_blksize method to return error
Reported and reviewed by:	asomers
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31998
2021-09-17 20:29:55 +03:00
Mark Johnston
8978608832 amd64: Populate the KMSAN shadow maps and integrate with the VM
- During boot, allocate PDP pages for the shadow maps.  The region above
  KERNBASE is currently not shadowed.
- Create a dummy shadow for the vm page array.  For now, this array is
  not protected by the shadow map to help reduce kernel memory usage.
- Grow shadows when growing the kernel map.
- Increase the default kernel stack size when KMSAN is enabled.  As with
  KASAN, sanitizer instrumentation appears to create stack frames large
  enough that the default value is not sufficient.
- Disable UMA's use of the direct map when KMSAN is configured.  KMSAN
  cannot validate the direct map.
- Disable unmapped I/O when KMSAN configured.
- Lower the limit on paging buffers when KMSAN is configured.  Each
  buffer has a static MAXPHYS-sized allocation of KVA, which in turn
  eats 2*MAXPHYS of space in the shadow map.

Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31295
2021-08-10 21:27:53 -04:00
Mark Johnston
6faf45b34b amd64: Implement a KASAN shadow map
The idea behind KASAN is to use a region of memory to track the validity
of buffers in the kernel map.  This region is the shadow map.  The
compiler inserts calls to the KASAN runtime for every emitted load
and store, and the runtime uses the shadow map to decide whether the
access is valid.  Various kernel allocators call kasan_mark() to update
the shadow map.

Since the shadow map tracks only accesses to the kernel map, accesses to
other kernel maps are not validated by KASAN.  UMA_MD_SMALL_ALLOC is
disabled when KASAN is configured to reduce usage of the direct map.
Currently we have no mechanism to completely eliminate uses of the
direct map, so KASAN's coverage is not comprehensive.

The shadow map uses one byte per eight bytes in the kernel map.  In
pmap_bootstrap() we create an initial set of page tables for the kernel
and preloaded data.

When pmap_growkernel() is called, we call kasan_shadow_map() to extend
the shadow map.  kasan_shadow_map() uses pmap_kasan_enter() to allocate
memory for the shadow region and map it.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29417
2021-04-13 17:42:20 -04:00
Mark Johnston
369706a6f8 buf: Fix the dirtybufthresh check
dirtybufthresh is a watermark, slightly below the high watermark for
dirty buffers.  When a delayed write is issued, the dirtying thread will
start flushing buffers if the dirtybufthresh watermark is reached.  This
helps ensure that the high watermark is not reached, otherwise
performance will degrade as clustering and other optimizations are
disabled (see buf_dirty_count_severe()).

When the buffer cache was partitioned into "domains", the dirtybufthresh
threshold checks were not updated.  Fix this.

Reported by:	Shrikanth R Kamath <kshrikanth@juniper.net>
Reviewed by:	rlibby, mckusick, kib, bdrewery
Sponsored by:	Juniper Networks, Inc., Klara, Inc.
Fixes:		3cec5c77d6
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D28901
2021-02-25 10:04:44 -05:00
Konstantin Belousov
bf0db19339 buf SU hooks: track buf_start() calls with B_IOSTARTED flag
and only call buf_complete() if previously started.  Some error paths,
like CoW failire, might skip buf_start() and do bufdone(), which itself
call buf_complete().

Various SU handle_written_XXX() functions check that io was started
and incomplete parts of the buffer data reverted before restoring them.
This is a useful invariant that B_IO_STARTED on buffer layer allows to
keep instead of changing check and panic into check and return.

Reported by:	pho
Reviewed by:	chs, mckusick
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundations
2021-02-12 03:02:19 +02:00
Bryan Drewery
c926114f2f Fix getblk() with GB_NOCREAT returning false-negatives.
It is possible for a buf to be reassigned between the dirty and clean
lists while gbincore_unlocked() looks in each list.  Avoid creating
a buffer in that case and fallback to a locked lookup.

This fixes a regression from r363482.

More discussion on potential improvements to the clean and dirty lists
handling is in the review.

Reviewed by:	cem, kib, markj, vangyzen, rlibby
Reported by:	Suraj.Raju at dell.com
Submitted by:	Suraj.Raju at dell.com, cem, [based on both]
MFC after:	2 weeks
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D28375
2021-01-28 11:24:24 -08:00
Konstantin Belousov
cd85379104 Make MAXPHYS tunable. Bump MAXPHYS to 1M.
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible.  Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*).  Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys.  Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight.  Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by:	imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27225
2020-11-28 12:12:51 +00:00
Brooks Davis
44ca4575ea vmapbuf: don't smuggle address or length in buf
Instead, add arguments to vmapbuf.  Since this argument is
always a pointer use a type of void * and cast to vm_offset_t in
vmapbuf.  (In CheriBSD we've altered vm_fault_quick_hold_pages to
take a pointer and check its bounds.)

In no other situtation does b_data contain a user pointer and vmapbuf
replaces b_data with the actual mapping.

Suggested by:	jhb
Reviewed by:	imp, jhb
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26784
2020-10-21 16:00:15 +00:00
Bryan Drewery
c2c6fb90e0 Use unlocked page lookup for inmem() to avoid object lock contention
Reviewed By:	kib, markj
Submitted by:	mlaier
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D26653
2020-10-09 23:49:42 +00:00
Bryan Drewery
9ceba22462 Revert r366340.
CR wasn't finished and it breaks the build.
2020-10-01 20:08:27 +00:00
Bryan Drewery
2398cd1103 Use unlocked page lookup for inmem() to avoid object lock contention
Reviewed By:	kib, markj
Sponsored by:	Dell EMC Isilon
Submitted by:	mlaier
Differential Revision:	https://reviews.freebsd.org/D26597
2020-10-01 19:17:03 +00:00
Mateusz Guzik
6fed89b179 kern: clean up empty lines in .c and .h files 2020-09-01 22:12:32 +00:00
Mateusz Guzik
7ad2a82da2 vfs: drop the error parameter from vn_isdisk, introduce vn_isdisk_error
Most consumers pass NULL.
2020-08-19 02:51:17 +00:00
Conrad Meyer
9da903e5d3 Unlocked getblk: Fix new false-positive assertion
A free buf's lock may be held (temporarily) due to unlocked lookup, so
buf_alloc() must acquire it without LK_NOWAIT.  The unlocked getblk path
should unlock it promptly once it realizes the identity does not match
the buffer it was searching for.

Reported by:	gallatin
Reviewed by:	kib
Tested by:	pho
X-MFC-With:	r363482
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25914
2020-08-02 16:34:27 +00:00
Conrad Meyer
d6a75d39e9 getblk: Remove a non-sensical LK_NOWAIT | LK_SLEEPFAIL
No functional change.

LK_SLEEPFAIL implies a behavior that is only possible if the lock operation can
sleep.  LK_NOWAIT prevents the lock operation from sleeping.

Discussed with:	kib
2020-07-31 00:13:40 +00:00
Conrad Meyer
59d13f6154 getblk: Avoid sleeping on wrong buf in lockless path
If the buffer identity changed during lookup, sleeping could introduce a
lock order reversal.  Since we do not know if the identity changed until we
get the lock, we must try-lock (LK_NOWAIT) only.  EINTR and ERESTART error
handling becomes irrelevant, as we no longer sleep.

Reported by:	kib
Reviewed by:	kib
X-MFC-With:	r363482
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25898
2020-07-31 00:07:01 +00:00
Conrad Meyer
81dc6c2c61 Use gbincore_unlocked for unprotected incore()
Reviewed by:	markj
Sponsored by:	Isilon
Differential Revision:	https://reviews.freebsd.org/D25790
2020-07-24 17:34:44 +00:00
Conrad Meyer
68ee1dda06 Add unlocked/SMR fast path to getblk()
Convert the bufobj tries to an SMR zone/PCTRIE and add a gbincore_unlocked()
API wrapping this functionality.  Use it for a fast path in getblkx(),
falling back to locked lookup if we raced a thread changing the buf's
identity.

Reported by:	Attilio
Reviewed by:	kib, markj
Testing:	pho (in progress)
Sponsored by:	Isilon
Differential Revision:	https://reviews.freebsd.org/D25782
2020-07-24 17:34:04 +00:00
Mateusz Guzik
422f38d8ea vfs: fix trivial whitespace issues which don't interefere with blame
.. even without the -w switch
2020-07-10 09:01:36 +00:00
Chuck Silvers
d79ff54b5c This commit enables a UFS filesystem to do a forcible unmount when
the underlying media fails or becomes inaccessible. For example
when a USB flash memory card hosting a UFS filesystem is unplugged.

The strategy for handling disk I/O errors when soft updates are
enabled is to stop writing to the disk of the affected file system
but continue to accept I/O requests and report that all future
writes by the file system to that disk actually succeed. Then
initiate an asynchronous forced unmount of the affected file system.

There are two cases for disk I/O errors:

   - ENXIO, which means that this disk is gone and the lower layers
     of the storage stack already guarantee that no future I/O to
     this disk will succeed.

   - EIO (or most other errors), which means that this particular
     I/O request has failed but subsequent I/O requests to this
     disk might still succeed.

For ENXIO, we can just clear the error and continue, because we
know that the file system cannot affect the on-disk state after we
see this error. For EIO or other errors, we arrange for the geom_vfs
layer to reject all future I/O requests with ENXIO just like is
done when the geom_vfs is orphaned. In both cases, the file system
code can just clear the error and proceed with the forcible unmount.

This new treatment of I/O errors is needed for writes of any buffer
that is involved in a dependency. Most dependencies are described
by a structure attached to the buffer's b_dep field. But some are
created and processed as a result of the completion of the dependencies
attached to the buffer.

Clearing of some dependencies require a read. For example if there
is a dependency that requires an inode to be written, the disk block
containing that inode must be read, the updated inode copied into
place in that buffer, and the buffer then written back to disk.

Often the needed buffer is already in memory and can be used. But
if it needs to be read from the disk, the read will fail, so we
fabricate a buffer full of zeroes and pretend that the read succeeded.
This zero'ed buffer can be updated and written back to disk.

The only case where a buffer full of zeros causes the code to do
the wrong thing is when reading an inode buffer containing an inode
that still has an inode dependency in memory that will reinitialize
the effective link count (i_effnlink) based on the actual link count
(i_nlink) that we read. To handle this case we now store the i_nlink
value that we wrote in the inode dependency so that it can be
restored into the zero'ed buffer thus keeping the tracking of the
inode link count consistent.

Because applications depend on knowing when an attempt to write
their data to stable storage has failed, the fsync(2) and msync(2)
system calls need to return errors if data fails to be written to
stable storage. So these operations return ENXIO for every call
made on files in a file system where we have otherwise been ignoring
I/O errors.

Coauthered by: mckusick
Reviewed by:   kib
Tested by:     Peter Holm
Approved by:   mckusick (mentor)
Sponsored by:  Netflix
Differential Revision:  https://reviews.freebsd.org/D24088
2020-05-25 23:47:31 +00:00
Konstantin Belousov
bbca4bd7cd buffer pager: skip bogus pages.
We cannot validate bogus page by reading a buffer.

PR:	244713
Reviewed by:	glebius, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D24038
2020-03-30 21:42:46 +00:00
Konstantin Belousov
695e0701a0 buffer pager: deref ucred immediately after read.
Ucred is passed to bread(9) so that non-local filesystems use proper
credentials.  But, since clean buffer might be cached unless
buf_pager_relbuf is not enabled, it makes credentials to have extra
reference until buffer is reclaimed.  Ucred reference would prevent
jail from destroying if creds are jailed.

Dereferencing the read credentials on the valid buffer avoid that, and
should be fine because the buffer is valid and does not need re-read.

PR:	238032
Reported by:	bz
Reproduced and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D23775
2020-03-05 15:52:34 +00:00
Jeff Roberson
6be21eb778 Provide a lock free alternative to resolve bogus pages. This is not likely
to be much of a perf win, just a nice code simplification.

Reviewed by:	markj, kib
Differential Revision:	https://reviews.freebsd.org/D23866
2020-02-28 21:42:48 +00:00
Jeff Roberson
7aaf252c96 Convert a few triviail consumers to the new unlocked grab API.
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D23847
2020-02-28 20:34:30 +00:00
Mark Johnston
c99d0c5801 Add a blocking counter KPI.
refcount(9) was recently extended to support waiting on a refcount to
drop to zero, as this was needed for a lockless VM object
paging-in-progress counter.  However, this adds overhead to all uses of
refcount(9) and doesn't really match traditional refcounting semantics:
once a counter has dropped to zero, the protected object may be freed at
any point and it is not safe to dereference the counter.

This change removes that extension and instead adds a new set of KPIs,
blockcount_*, for use by VM object PIP and busy.

Reviewed by:	jeff, kib, mjg
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23723
2020-02-28 16:05:18 +00:00
Mateusz Guzik
f1fa1ba3d0 Fix up various vnode-related asserts which did not dump the used vnode 2020-02-03 14:25:32 +00:00
Mateusz Guzik
3ff65f71cb Remove duplicated empty lines from kern/*.c
No functional changes.
2020-01-30 20:05:05 +00:00