instead of a vnode for it.
The vnode_pager does not and should not have any interest in what
the filesystem uses for backend.
(vfs_cluster doesn't use the backing store argument.)
Use this in all the places where sleeping with the lock held is not
an issue.
The distinction will become significant once we finalize the exact
lock-type to use for this kind of case.
return EINVAL rather than setting error, and don't free sops
unconditionally. The first change was merged accidentally as part of
the larger set of changes to introduce MAC labels and access control,
and potentially lead to continued processing of a request even after
it was determined to be invalid. The second change was due to changes
in the semaphore code since the original work was performed.
Pointed out by: truckman
to be modified and extended without breaking the user space ABI:
Use _kernel variants on _ds structures for System V sempahores, message
queues, and shared memory. When interfacing with userspace, export
only the _ds subsets of the _kernel data structures. A lot of search
and replace.
Define the message structure in the _KERNEL portion of msg.h so that it
can be used by other kernel consumers, but not exposed to user space.
Submitted by: Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net>
Obtained from: TrustedBSD Project
Sponsored by: DARPA, SPAWAR, McAfee Research
changes associated with adding System V IPC support. This will prevent
old modules from being used with the new kernel, and new modules from
being used with the old kernel.
includes the latter, but also declares variables which are defined
in kern/subr_param.c).
Change som VM parameters from quad_t to unsigned long. They refer to
quantities (size limits for text, heap and stack segments) which must
necessarily be smaller than the size of the address space, so long is
adequate on all platforms.
MFC after: 1 week
Change the spelling of the "catch" option to be consistent with the new
options. Implement the "no wait" option. An implementation of the "CPU
private" for i386 will be committed at a later date.
This generates a KTR event for each critical section entered and exited.
It would be desirable to also log the filename and line number of the
source entering or exiting the critical section, but this requires
hacking up the critical section API, so I've not done that yet.
because this call is only needed to wake threads that slept when they
discovered a dead object connected to a vnode. To eliminate unnecessary
calls to wakeup() by vnode_pager_dealloc(), introduce a new flag,
OBJ_DISCONNECTWNT.
Reviewed by: tegge@
priority. The sleep queues don't get updated when the priority of
threads changes, so sleepq_signal() might not always wakeup the
highest priority thread. Updating the queues when thread priorities
change cannot be easily done due to lock orders, so instead we do an
O(n) walk of the queue for a sleepq_signal() operation instead of O(1).
On the other hand, adding a thread to a sleep queue now goes from O(n)
to O(1) so it ends up as an even tradeoff. The correctness here with
regards to priorities is actually fairly important. msleep() gives
interactive threads their priority "boost" after they are placed on the
queue, but before this fix that "boost" wasn't used to determine the
highest priority thread that sleepq_signal() awoke.
- Fix up some comments.
Inspired by: ups, bde
- Tweak the updating of the ithread name in ithread_update() so that the
'+' and '*' characters for device names that were too short only get
added at the end after as many device names as possible were fit into
the allocated space. Prior to this, some long devices would result
in '+' chars showing up between two different devices rather than at the
end.
- Eliminate an initialized but unused variable.
- Eliminate an unnecessary call to clear the page's PG_BUSY flag. (The
call to vm_page_rename() already clears the page's PG_BUSY flag through
its call to vm_page_remove().)
motherboard, in practice the changes resulted in many false positives for
heavy network loads, etc. resulting in poor performance. Also, the
motherboard referenced in the 1.109 log has other problems and simply does
not seem to work with the APIC enabled even with the changes in 1.109. The
correct fix for that board seems to be to not use the APIC at all. One
thing kept from 1.109 is that throttled interrupts are now effectively
polled on every clock tick rather than just 10 times per second.
MFC after: 1 month
Tested by: Shunsuke SHINOMIYA shino at fornext dot org
need for most calls to vm_page_busy(). Specifically, most calls to
vm_page_busy() occur immediately prior to a call to vm_page_remove().
In such cases, the containing vm object is locked across both calls.
Consequently, the setting of the vm page's PG_BUSY flag is not even
visible to other threads that are following the synchronization
protocol.
This change (1) eliminates the calls to vm_page_busy() that
immediately precede a call to vm_page_remove() or functions, such as
vm_page_free() and vm_page_rename(), that call it and (2) relaxes the
requirement in vm_page_remove() that the vm page's PG_BUSY flag is
set. Now, the vm page's PG_BUSY flag is set only when the vm object
lock is released while the vm page is still in transition. Typically,
this is when it is undergoing I/O.
control the number of lines per page rather than a constant. The variable
can be examined and changed in ddb as '$lines'. Setting the variable to
0 will effectively turn off paging.
- Change db_putchar() to force out pending whitespace before outputting
newlines and carriage returns so that one can rub out content on the
current line via '\r \r' type strings.
- Change the simple pager to rub out the --More-- prompt explicitly when
the routine exits.
- Add some aliases to the simple pager to make it more compatible with
more(1): 'e' and 'j' do a single line. 'd' does half a page, and
'f' does a full page.
MFC after: 1 month
Inspired by: kris
This is magic and no other operating system do so (i.e. Solaris, Tru64,
Linux, AIX, HP-UX, Irix, MacOS X, NetBSD).
Discussed on: current@
Reported by: S³awek ¯ak <zaks@prioris.mini.pw.edu.pl>
for modules linked into the kernel or loaded very early, panics will
result otherwise, as the CV code it calls will panic due to its use
of a mutex before it is initialized.
outside of the nice threshold due to a recently awoken thread with a
lower nice value. This further reduces the amount of time a positively
niced thread gets while running in conjunction with a workload that has
many short sleeps (ie buildworld).
check for TD_ON_RUNQ() no longer means the thread is really on a run-
queue. I suspect this state should be re-evaluated as it must mean
something else now. This fixes ULE+KSE+PREEMPTION on UP x86.
At some point later the syncer will unlearn about vnodes and the filesystems
method called by the syncer will know enough about what's in bo_private to
do the right thing.
[1] Ok, I know, but I couldn't resist the pun.
buf->b-dev.
Put a bio between the buf passed to dev_strategy() and the device driver
strategy routine in order to not clobber fields in the buf.
Assert copyright on vfs_bio.c and update copyright message to canonical
text. There is no legal difference between John Dysons two-clause
abbreviated BSD license and the canonical text.
an inordinate amount of synchronous console output that is fairly
undesirable on slower serial console. It's easily hit by accident
when frobbing other sysctls late at night.
Give ffs it's own bufobj->bo_ops vector and create a private strategy
routine, (currently misnamed for forwards compatibility), which is
just a copy of the generic bufstrategy routine except we call
softdep_disk_prewrite() directly instead of through the buf_prewrite()
indirection.
Teach UFS about the need for softdep_disk_prewrite() and call the
function directly in FFS.
Remove buf_prewrite() from the default bufstrategy() and from the
global bio_ops method vector.
We keep si_bsize_phys around for now as that is the simplest way to pull
the number out of disk device drivers in devfs_open(). The correct solution
would be to do an ioctl(DIOCGSECTORSIZE), but the point is probably mooth
when filesystems sit on GEOM, so don't bother for now.
and release of the global page queues lock required to make the call.
Remove GIANT_REQUIRED from vm_hold_free_pages(). All of its VM operations
are properly synchronized.
count to prevent sockets from being garbage collected during
socket-specific system calls. This is the same approach used in
most VFS-specific system calls, as well as generic file descriptor
system calls such as read() and write().
To do this, add a utility function getsock(), which is logically
identical to getvnode() used for the same purpose in VFS. Unlike
fgetsock(), it returns with the file reference count elevated, but
no bump of the socket reference count. Replace matching calls to
fputsock() with fdrop().
This change is made to all socket system calls other than
sendfile() and accept(), but the approach should be applicable to
those system calls also.
This shaves about four mutex operations off of each of these
system calls, including send() and recv() variants, adding about
1% to pps on minimal UDP packets for UP using netblast, and 4% on
SMP.
Reviewed by: pjd
Extend it with a strategy method.
Add bufstrategy() which do the usual VOP_SPECSTRATEGY/VOP_STRATEGY
song and dance.
Rename ibwrite to bufwrite().
Move the two NFS buf_ops to more sensible places, add bufstrategy
to them.
Add inlines for bwrite() and bstrategy() which calls through
buf->b_bufobj->b_ops->b_{write,strategy}().
Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
This flag gets set whenever the thread posts an event on the GEOM
event queue, and if the flag is set when the thread is prepared
to return to userland from the kernel, g_waitidle() will be called
to make sure that the posted events have completed.
This can replace an insufficient number of g_waitidle() calls in
various other places, and has the advantage of being failsafe: Any
system call which does a VOP_OPEN()/VOP_CLOSE will now correctly
wait for any geom events it posted as part of spoils or tastes.
Assert that topology and Giant is not held in g_waitidle().