to wake up any processes waiting via PIOCWAIT on process exit, and truss
needs to be more aware that a process may actually disappear while it's
waiting.
Reviewed by: Paul Saab <ps@yahoo-inc.com>
1) Fastpath deletions. When a file is being deleted, check to see if it
was so recently created that its inode has not yet been written to
disk. If so, the delete can proceed to immediately free the inode.
2) Background writes: No file or block allocations can be done while the
bitmap is being written to disk. To avoid these stalls, the bitmap is
copied to another buffer which is written thus leaving the original
available for futher allocations.
3) Link count tracking. Constantly track the difference in i_effnlink and
i_nlink so that inodes that have had no change other than i_effnlink
need not be written.
4) Identify buffers with rollback dependencies so that the buffer flushing
daemon can choose to skip over them.
it only on the buf_daemon process). The problem is that when the
syncer process starts running the worklist, it wants to delete
lots of files. It does this by VFS_VGET'ing the vnodes, clearing
the blocks in them and bdwrite'ing the buffer. It can process close
to a thousand files per second which generates a large number of
dirty buffers. So, giving it special priviledge at the buffer trough
leads to trouble as the buf_daemon does occationally need a free
buffer to proceed and if the syncer has used every last one up,
we are toast.
into vnode dirtyblkhd we append it to the list instead of prepend it to
the list in order to maintain a 'forward' locality of reference, which
is arguably better then 'reverse'. The original algorithm did things this
way to but at a huge time cost.
Enhance the append interlock for NFS writes to handle intr/soft mounts
better.
Fix the hysteresis for NFS async daemon I/O requests to reduce the
number of unnecessary context switches.
Modify handling of NFS mount options. Any given user option that is
too high now defaults to the kernel maximum for that option rather then
the kernel default for that option.
Reviewed by: Alfred Perlstein <bright@wintelcom.net>
the low level interrupt handler number should be used. Change
setup_apic_irq_mapping() to allocate low level interrupt handler X (Xintr${X})
for any ISA interrupt X mentioned in the MP table.
Remove an assumption in the driver for the system clock (clock.c) that
interrupts mentioned in the MP table as delivered to IOAPIC #0 intpin Y
is handled by low level interrupt handler Y (Xintr${Y}) but don't assume
that low level interrupt handler 0 (Xintr0) is used.
Don't allocate two low level interrupt handlers for the system clock.
Reviewed by: NOKUBI Hirotaka <hnokubi@yyy.or.jp>
hardpps() produced offset component. This is tested and behaved
stable with frequency offsets from -338.05 to +499.91 PPM.
Interestingly the machine I tested this on would fail if the clock
were slower than 14.3132 MHz whereas it was perfectly happy to run
at 16.384 MHz, in other words [-340PPM ... +14.4%]
Make pps_shift tweakable with sysctl.
login (or not if root)
then exit the shell
truss will get stuct in tsleep
I dont know if this is correct, but it fixes the problem and
according to the commends in pioctl.h, PF_ISUGID is set when we
want to ignore UID changes.
The code is checking for when PF_ISUGID is not set and since it
never is set, we always ignore UID changes.
Submitted by: Paul Saab <ps@yahoo-inc.com>
resulted in vastly optimistic offset values reported to userland
(typically a factor 40+ too small). Apart from that, the code had
two sign-bugs.
Apply the hardpps() phase with the right sign with a simply
scaling by integration interval. (This may be too stiff at
long integration intervals, see below).
Allow pps_shiftmax to be reduced again.
Before this, the phase lock in hardpps() were broken, but due to
two bugs mostly cancelling out, it would end up basically working
with a large stochastic component. Now it behaves as one would
expect: smooth and quiet.
It seems that pps_shiftmax above 7..9 somewhere makes the phaselock
too weak to hold onto random walk phase errors from a HP-105 OCXO,
which basically means that it is too weak for real-life use with
such integration times. This is yet to be resolved.
Submitted to: Prof. Dave "NTP" Mills.
Tested by: Terje Mathisen <Terje.Mathisen@hda.hydro.com>
because bsd.kmod.mk is usually out of sync with kernel source. However
bsd.kmod.mk has to be updated now because of the _KERNEL change so there
is no need to keep this (pre-repo copy) version around.
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.
to `register_t *'. This fixes bugs like misplacement of argc and argv
on the user stack on i386's with 64-bit longs. We still use longs to
represent "words" like argc and argv, and assume that they are on the
stack (and that there is stack). The suword() and fuword() families
should also use register_t.
register_t, so pointers to it must be passed around as `register_t *',
not as `int *'. The type mismatches were non-benign on alphas, but
the broken code is normally only configured by LINT.
fixee incoherency of pipe timestamps relative to file timestamps in
the usual case where getnanotime() is not used for the latter. (File
and pipe timestamps are still incoherent relative to real time unless
the vfs_timestamp_precision sysctl is set to 2 or 3).
NFSSERVER defined, useful for userland fileservers that want to
use a filehandle type interface to the filesystem.
Submitted by: Assar Westerlund assar@stacken.kth.se
PR: kern/15452
stressful situations. buf_daemon now makes a distinction between
being woken up and its sleep timing out, and as a consequence is now
much better able to dynamically tune itself to its environment.
Reviewed by: Alfred Perlstein <bright@wintelcom.net>
differentiate between one of three different scenarios:
1. No init.
2. Path to init munged by an incorrect loader configuration.
3. Root file system not mounted.
Reviewed-by: billf
The variables "m_mclalloc_wid" and "m_mballoc_wid" were not in the
proper place. They should have been in uipc_mbuf.c and have been global,
not in mbuf.h and local per each file that uses mbuf.h.
Sorta bug fix:
In mbuf.h, the definitions of various things for KERNEL and not
KERNEL cases were very screwy. This fixes all of that which I could
find.
1. Data written beyond end of pipe buffer, causing kernel memory corruption.
- Check that space is still valid after obtaining the pipe lock.
- Defer the calculation of transfer size until the pipe
lock has been obtained.
- Update the pipe buffer pointers while holding the pipe lock.
2. Writes of size <= PIPE_BUF not always atomic.
- Allow an internal write to span two contiguous segments,
so writes of size <= PIPE_BUF can be kept atomic
when wrapping around from the end to the start of the
pipe buffer.
PR: 15235
Reviewed by: Matt Dillon <dillon@FreeBSD.org>
the kernel while the vnode_if.h header is a bunch of inlines to call the
code that is in the kernel. Generating the .h file on the fly is kinda
bogus because it has to match the one compiled into the kernel.
IMHO we should have kern/vnode_if.c and sys/vnode_if.h committed in the
tree but that's another battle.
means that running out of mbuf space isn't a panic anymore, and code
which runs out of network memory will sleep to wait for it.
Submitted by: Bosko Milekic <bmilekic@dsuper.net>
Reviewed by: green, wollman
madvise().
This feature prevents the update daemon from gratuitously flushing
dirty pages associated with a mapped file-backed region of memory. The
system pager will still page the memory as necessary and the VM system
will still be fully coherent with the filesystem. Modifications made
by other means to the same area of memory, for example by write(), are
unaffected. The feature works on a page-granularity basis.
MAP_NOSYNC allows one to use mmap() to share memory between processes
without incuring any significant filesystem overhead, putting it in
the same performance category as SysV Shared memory and anonymous memory.
Reviewed by: julian, alc, dg
* lockstatus() and VOP_ISLOCKED() gets a new process argument and a new
return value: LK_EXCLOTHER, when the lock is held exclusively by another
process.
* The ASSERT_VOP_(UN)LOCKED family is extended to use what this gives them
* Extend the vnode_if.src format to allow more exact specification than
locked/unlocked.
This commit should not do any semantic changes unless you are using
DEBUG_VFS_LOCKS.
Discussed with: grog, mch, peter, phk
Reviewed by: peter