This eliminates a bunch of vnode overhead (approx 1-2 % speed
improvement) and gives us more control over the access to the storage
device.
Access counts on the underlying device are not correctly tracked and
therefore it is possible to read-only mount the same disk device multiple
times:
syv# mount -p
/dev/md0 /var ufs rw 2 2
/dev/ad0 /mnt ufs ro 1 1
/dev/ad0 /mnt2 ufs ro 1 1
/dev/ad0 /mnt3 ufs ro 1 1
Since UFS/FFS is not a synchrousely consistent filesystem (ie: it caches
things in RAM) this is not possible with read-write mounts, and the system
will correctly reject this.
Details:
Add a geom consumer and a bufobj pointer to ufsmount.
Eliminate the vnode argument from softdep_disk_prewrite().
Pick the vnode out of bp->b_vp for now. Eventually we
should find it through bp->b_bufobj->b_private.
In the mountcode, use g_vfs_open() once we have used
VOP_ACCESS() to check permissions.
When upgrading and downgrading between r/o and r/w do the
right thing with GEOM access counts. Remove all the
workarounds for not being able to do this with VOP_OPEN().
If we are the root mount, drop the exclusive access count
until we upgrade to r/w. This allows fsck of the root
filesystem and the MNT_RELOAD to work correctly.
Set bo_private to the GEOM consumer on the device bufobj.
Change the ffs_ops->strategy function to call g_vfs_strategy()
In ufs_strategy() directly call the strategy on the disk
bufobj. Same in rawread.
In ffs_fsync() we will no longer see VCHR device nodes, so
remove code which synced the filesystem mounted on it, in
case we came there. I'm not sure this code made sense in
the first place since we would have taken the specfs route
on such a vnode.
Redo the highly bogus readblock() function in the snapshot
code to something slightly less bogus: Constructing an uio
and using physio was really quite a detour. Instead just
fill in a bio and ship it down.
At some point later the syncer will unlearn about vnodes and the filesystems
method called by the syncer will know enough about what's in bo_private to
do the right thing.
[1] Ok, I know, but I couldn't resist the pun.
buf->b-dev.
Put a bio between the buf passed to dev_strategy() and the device driver
strategy routine in order to not clobber fields in the buf.
Assert copyright on vfs_bio.c and update copyright message to canonical
text. There is no legal difference between John Dysons two-clause
abbreviated BSD license and the canonical text.
IF INVARIANTS is defined, and in the rare case that we have
allocated some objects from the slab and at least one initializer
on at least one of those objects failed, and we need to fail the
allocation and push the uninitialized items back into the slab
caches -- in that scenario, we would fail to [re]set the
bucket cache's ub_bucket item references to NULL, which would
eventually trigger a KASSERT.
an inordinate amount of synchronous console output that is fairly
undesirable on slower serial console. It's easily hit by accident
when frobbing other sysctls late at night.
the device is suspended or shutting down. This will need to be rethought
slightly if we implement suspend/resume support within vr(4).
This appears to fix the vr_shutdown() panic on SMP machines.
My theory here is there's a race somewhere during vr_detach() with
vr_intr() in the SMP case which was sometimes being triggered,
although quite why this was happening is unclear (vr_stop() also
explicitly disables interrupts by writing to the IMR register).
MFC-to-RELENG_5* candidate.
PR: kern/62889
Tested by: seb at struchtrup dot com
MFC after: 10 days
that was greater than 4G. I originally used the same values as i386 in
order to save opening a new PML4 page slot, but in the day of gigabytes
of memory, worrying about a 4K page seems futile. Moving from 8 to 32G
moves the page to a different index, it doesn't increase the number of
pages used.
Give ffs it's own bufobj->bo_ops vector and create a private strategy
routine, (currently misnamed for forwards compatibility), which is
just a copy of the generic bufstrategy routine except we call
softdep_disk_prewrite() directly instead of through the buf_prewrite()
indirection.
Teach UFS about the need for softdep_disk_prewrite() and call the
function directly in FFS.
Remove buf_prewrite() from the default bufstrategy() and from the
global bio_ops method vector.
We keep si_bsize_phys around for now as that is the simplest way to pull
the number out of disk device drivers in devfs_open(). The correct solution
would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably mooth
when filesystems sit on GEOM, so don't bother for now.
is ffs_copyonwrite() and the only place it can be called from is FFS which
would never want to call another filesystems copyonwrite method, should one
exist, so there is no reason why anything generic should know about this.
on UltraSPARC workstations. The driver is based on OpenBSD's SBus
cs4231 driver and heavily modified to incorporate into sound(4)
infrastructure. Due to the lack of APCDMA documentation, the DMA
code of SBus cs4231 came from OpenBSD's driver.
The driver runs without Giant lock and supports both SBus and EBus
based CS4231 audio controller. Special thanks to marius for providing
feedbacks during the driver writing. His feedback made it possible
to write hiccup free playback code under high system loads.
Approved by: jake (mentor)
Reviewed by: marius (initial version)
Tested by: marius, kwm, Julian C. Dunn(jdunn AT opentrend DOT net)
and release of the global page queues lock required to make the call.
Remove GIANT_REQUIRED from vm_hold_free_pages(). All of its VM operations
are properly synchronized.
count to prevent sockets from being garbage collected during
socket-specific system calls. This is the same approach used in
most VFS-specific system calls, as well as generic file descriptor
system calls such as read() and write().
To do this, add a utility function getsock(), which is logically
identical to getvnode() used for the same purpose in VFS. Unlike
fgetsock(), it returns with the file reference count elevated, but
no bump of the socket reference count. Replace matching calls to
fputsock() with fdrop().
This change is made to all socket system calls other than
sendfile() and accept(), but the approach should be applicable to
those system calls also.
This shaves about four mutex operations off of each of these
system calls, including send() and recv() variants, adding about
1% to pps on minimal UDP packets for UP using netblast, and 4% on
SMP.
Reviewed by: pjd
Extend it with a strategy method.
Add bufstrategy() which do the usual VOP_SPECSTRATEGY/VOP_STRATEGY
song and dance.
Rename ibwrite to bufwrite().
Move the two NFS buf_ops to more sensible places, add bufstrategy
to them.
Add inlines for bwrite() and bstrategy() which calls through
buf->b_bufobj->b_ops->b_{write,strategy}().
Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
trigger a socket creation race some some kind). Checking for non-NULL socket
and credential is not a bad idea anyway. Unfortunatly too late for the
release.
Reported & tested by: Gilbert Cao
MFC after: 2 weeks
vm_page_sleep_if_busy(). (The motivation being to transition
synchronization of the vm_page's PG_BUSY flag from the global page queues
lock to the per-object lock.)
two loops in agp_generic_bind_memory(). As an intended side-effect, all
of the calls to vm_page_wakeup() are now performed with the containing
vm object lock held.
that indicates that the caller does not want a page with its busy flag set.
In many places, the global page queues lock is acquired and released just
to clear the busy flag on a just allocated page. Both the allocation of
the page and the clearing of the busy flag occur while the containing vm
object is locked. So, the busy flag might as well never be set.
This flag gets set whenever the thread posts an event on the GEOM
event queue, and if the flag is set when the thread is prepared
to return to userland from the kernel, g_waitidle() will be called
to make sure that the posted events have completed.
This can replace an insufficient number of g_waitidle() calls in
various other places, and has the advantage of being failsafe: Any
system call which does a VOP_OPEN()/VOP_CLOSE will now correctly
wait for any geom events it posted as part of spoils or tastes.
Assert that topology and Giant is not held in g_waitidle().