Note: The jumbo mbuf cluster API has been MFC'd only recently and
never shipped in a release. Thus the API change does not violate
our stable branch guidelines with regard to API compatibility.
Requested by: glebius, gallatin
Sponsored by: TCP/IP Optimization Fundraise 2005
Approved by: re (scottl)
- Use a dedicated kthread to call acctwatch() periodically rather than
a callout from softclock().
- Validate new values for the kern.acct_chkfreq sysctl.
- Whitespace and include sorting.
Approved by: re (scottl)
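
The sysctl validation mentioned above can be sketched roughly as follows.
This is a userland model, not the kernel handler: the function name
set_acct_chkfreq(), the default of 15 seconds, and the "must be positive"
rule are assumptions made for illustration.

```c
#include <errno.h>

/* Assumed default polling interval in seconds (illustrative only). */
static int acct_chkfreq = 15;

/*
 * Hypothetical setter modelling a validating sysctl handler: reject
 * non-positive intervals and leave the current value untouched.
 */
static int
set_acct_chkfreq(int value)
{
	if (value <= 0)
		return (EINVAL);
	acct_chkfreq = value;
	return (0);
}
```

Rejecting the value before storing it means a bad write cannot leave the
polling thread with a zero or negative sleep interval.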
Cast VFS_STATFS() in vfs_domount() to (void) to indicate that ignoring the
return value is intentional: this is simply an attempt to pre-cache the
statfs state.
Found with: Coverity Prevent (tm)
Approved by: re (scottl)
When calling bioq_first() to see if a queue is empty in bioq_disksort(),
don't save the return value as we won't use it.
Noticed by: Coverity Prevent analysis tool
Approved by: re (scottl)
Reuse ktr_unused field in ktr_header structure as ktr_tid; populate
ktr_tid as part of gathering of ktr header data for new ktrace
records. The continued use of intptr_t is required for file layout
reasons, and cannot be changed to lwpid_t at this point.
Reviewed by: davidxu
Approved by: re (scottl)
- In pipe() return the error returned by pipe_create(), rather than
hardcoded ENFILE, which is incorrect. pipe_create() can fail due
to ENOMEM.
- Update manual page, describing ENOMEM return code.
Reviewed by: arch
Return EINVAL from lookup() if cn_nameiop is DELETE or RENAME and
the last component of the path name is "..". This keeps VOP_LOOKUP()
from locking vnodes in reverse order.
In kern_unlink(), remap EINVAL errors returned from namei() to EPERM
to match existing (and POSIX required) behaviour.
date: 2006/01/15 20:14:11; author: csjp; state: Exp; lines: +1 -1
vfs_busy can only return something useful if MNTK_UNMOUNT has been set.
Since we are using vfs_busy() on a freshly allocated mount structure, use
(void) to show that we do not care about the return value.
contains incorrect fractional second values (outside the range
0-999999).
Prior to this change users could create files with values outside
that range. Moreover, on 32-bit machines tv_usec offsets larger than
4.3s would result in an unnormalized AND wrong timestamp value,
due to overflow.
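
A minimal sketch of the range check and of the overflow described above.
The helper names are hypothetical, and the overflow model assumes the
problem arises from a microsecond-to-nanosecond multiply done in 32-bit
arithmetic, which the text does not state explicitly.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/time.h>

/* Accept only normalized fractional seconds: 0 <= tv_usec <= 999999. */
static bool
timeval_usec_valid(const struct timeval *tv)
{
	return (tv->tv_usec >= 0 && tv->tv_usec < 1000000);
}

/*
 * Model of a 32-bit usec -> nsec conversion. The product is reduced
 * modulo 2^32, so inputs above roughly 4.29 million microseconds
 * (about 4.3 s) wrap around and yield a wrong value.
 */
static uint32_t
usec_to_nsec32(uint32_t usec)
{
	return (usec * 1000u);
}
```

Checking the range up front rejects such inputs before any conversion
can wrap.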
struct mbuf *m_getjcl(int how, short type, int flags, int size)
void *m_cljget(struct mbuf *m, int how, int size)
For size both take MCLBYTES, MJUM4BYTES, MJUM9BYTES, MJUM16BYTES.
Tested by: glebius
Sponsored by: TCP/IP Optimization Fundraise 2005
In realloc(9), determine size of the original block based on
UMA_SLAB_MALLOC flag.
In some circumstances (I observed it when I was doing a lot of reallocs)
UMA_SLAB_MALLOC can be set even if us_keg != NULL.
If this is the case we have wonderful, silent data corruption, because less
data is copied to the newly allocated region than should be.
date: 2005/12/16 18:32:39; author: delphij; state: Exp; lines: +2 -0
In pipe_write(): when uiomove() fails, do not spin on it forever.
Submitted by: Kostik Belousov <kostikbel at gmail.com> on -current@
Message-ID: <20051216151016.GE84442@deviant.zoral.local>
Security: Local DoS
emulating architectures that allow this (Linux so far).
To preserve kernel modules ABI, unlike the version committed into the trunk,
which adds new flag field into Brandinfo structure for this purpose, this
one checks if brand field of Brandinfo matches ELFOSABI_LINUX.
PR: kern/87615
Submitted by: Marcin Koziej <creep@desk.pl>
kern_sig.c revision 1.319
sys_process.c revision 1.134
Avoid a kernel panic when attaching to a process that cannot be
stopped by the debugger, e.g., a process that is dumping core.
Call fill_kinfo_proc_only() instead of fill_kinfo_proc()
before calling fill_kinfo_thread(), because fill_kinfo_proc()
calls both fill_kinfo_proc_only() and fill_kinfo_thread().
This is a minor optimization and there should be no change
in functionality.
Leading whitespace cleanup.
Original commit messages:
Log:
Track all lock relationships instead of pruning direct relationships
if an indirect relationship exists (keep both A->B->C and A->C).
This allows witness_checkorder() to use isitmychild() instead of
the much more expensive isitmydescendant() to check for valid lock
ordering.
Don't do an expensive tree walk to update the w_level values when
the tree is updated. Only update the w_level values when using the
debugger to display the tree.
Nuke the experimental "witness_watch > 1" mode that only compared
w_level for the two locks. This information is no longer maintained
at run time, and the use of isitmychild() in witness_checkorder
should bring performance close enough to the acceptable level that
this hack is not needed.
Report witness data structure allocation statistics under the
debug.witness sysctl.
Reviewed by: jhb
MFC after: 30 days
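
The trade-off in the first paragraph of this log can be illustrated with
a simplified model. This is not the witness(4) code: the fixed-size
matrix, the NLOCKS bound, and the helper names order_add(),
is_my_child(), and is_my_descendant() are assumptions chosen to mirror
the terms used above.

```c
#include <stdbool.h>

#define NLOCKS 8	/* illustrative bound on distinct lock classes */

/* direct[p][k] is true once "p acquired before k" has been observed. */
static bool direct[NLOCKS][NLOCKS];

/* Record an observed order. Direct edges are never pruned. */
static void
order_add(int parent, int kid)
{
	direct[parent][kid] = true;
}

/* Cheap check: a single lookup of a directly recorded relationship. */
static bool
is_my_child(int parent, int kid)
{
	return (direct[parent][kid]);
}

/* Expensive check: walk the graph (assumed acyclic) for any path. */
static bool
is_my_descendant(int parent, int kid)
{
	if (direct[parent][kid])
		return (true);
	for (int i = 0; i < NLOCKS; i++)
		if (direct[parent][i] && is_my_descendant(i, kid))
			return (true);
	return (false);
}
```

With only A->B and B->C recorded, confirming A->C needs the walk. Once
A->C has been seen directly and is kept, the same question is answered
by one lookup.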
Log:
Relocate witness_levelall(), witness_leveldescendents(), and
witness_displaydescendants() so that they are protected by
"#ifdef DDB/#endif" to unbreak kernels not using "option DDB".
MFC after: 3 weeks
In watchdog_config enable the software watchdog iff the WD_ACTIVE flag
is set. When watchdogd(1) is terminated intentionally it clears the
bit, which should then disable it in the kernel.
PR: kern/74386
Submitted by: Alex Hoff <ahoff at sandvine dot com>
Approved by: rwatson (mentor)
osigpending, osigvec, osigblock, osigsetmask, osigsuspend, osigstack,
clock_gettime, clock_settime, and clock_getres.
Also correct the prototype for freebsd32_nanosleep in syscalls.master.
Calling setrlimit from 32-bit apps could potentially raise certain
limits beyond what a 32-bit process should be capable of, so we
must fix up the limits.
This is slightly different than HEAD to not change the ABI.
Move execve's access time update functionality into a
new vfs_mark_atime() function, and use the new function
for performing efficient atime updates in mmap().
When using m_dup(9) to copy more than MHLEN bytes of data, don't
create an mbuf chain that starts with a cluster containing just MHLEN
bytes. This happened because m_dup called m_get or m_getcl depending
on the amount of data to copy, but then always set the size available
in the first mbuf to MHLEN.
Approved by: jmg
Silence from: rwatson (mentor)
Significant refactoring of the accounting code to improve locking and VFS
happiness, as well as correct other bugs:
- Replace notion of current and saved accounting credential/vnode with a
single credential/vnode and an acct_suspended flag. This simplifies the
accounting logic substantially.
- Replace acct_mtx with acct_sx, a sleepable lock held exclusively during
reconfiguration and space polling, but shared during log entry
generation. This avoids holding a mutex over sleepable VFS operations.
- Hold the sx lock over the duration of the I/O so that the vnode I/O
cannot occur after vnode close, which could occur previously if
accounting was disabled as a process exited.
- Write the accounting log entry with Giant conditionally acquired based
on the file system where the log is stored. Previously, the accounting
code relied on the caller acquiring Giant.
- Acquire Giant conditionally in the accounting callout based on the file
system where the accounting log is stored. Run the callout MPSAFE.
- Expose acct_suspended via a read-only sysctl so it is possible to
programmatically determine whether accounting is suspended or not without
attempting to parse logs.
- Check both acct_vp and acct_suspended lock-free before entering the
accounting sx lock in acct().
- When accounting is disabled due to a VBAD vnode (i.e., forcible unmount),
generate a log message indicating accounting has been disabled.
- Correct a long-standing bug in how free space is calculated and compared
to the required space: generate and compare signed results, not unsigned
results, or negative free space will cause accounting to not be suspended
when required, or worse, incorrectly resumed once negative free space is
reached.
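
The signed versus unsigned comparison in the last item can be sketched
as follows. The function names and the percentage threshold parameter
are assumptions for illustration, not the accounting code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Signed comparison: negative free space (possible when the reserved
 * blocks are already in use) is correctly treated as below the
 * threshold.
 */
static bool
should_suspend_signed(int64_t bavail, int64_t blocks, int suspend_pct)
{
	return (bavail <= (int64_t)suspend_pct * blocks / 100);
}

/*
 * Unsigned comparison: a negative bavail is reinterpreted as a huge
 * positive count, so accounting is wrongly left running.
 */
static bool
should_suspend_unsigned(int64_t bavail, int64_t blocks, int suspend_pct)
{
	return ((uint64_t)bavail <=
	    (uint64_t)suspend_pct * (uint64_t)blocks / 100);
}
```

For example, with 1000 total blocks, a 2 percent threshold, and -10
blocks available, the signed version asks for suspension while the
unsigned version does not.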