the file descriptor reference, rather than paying additional lock
operations to acquire a socket reference from the file descriptor.
This will also help to ensure that file descriptor based socket
requests are not delivered to a socket after close. Most consumers
have already been converted to this model.
MFC after: 3 months
"fdinit() fails to initialize newfdp->fd_fd.fd_lastfile to -1. This breaks
fdcopy() which will incorrectly set newfdp->fd_freefile to 1 if no files are
open and the last file descriptor marked as unused for fdp was 0. This later
causes descriptor 0 to be unavailable in newfdp when the optimization is
enabled.
When the last file descriptor previously marked as used is nonzero and marked
as unused, fdunused() incorrectly sets fdp->fd_lastfile to fd - 1 due to
fd_last_used() returning (size - 1). This hides the problem that breaks the
optimization."
This allows us to keep the optimization, while un-breaking it.
This is a RELENG_6 candidate.
PR: kern/87208
MFC after: 1 week
Submitted by: tegge
really breaking things. Simple "close(0); dup(fd)" does not return descriptor
"0" in some cases. Further, this change also breaks some MAC interactions with
mac_execve_will_transition(). Under certain circumstances, fdcheckstd() can
be called in execve(2) causing an assertion that checks to make sure that
stdin, stdout and stderr reside at indexes 0, 1 and 2 in the process fd table
to fail, resulting in a kernel panic when INVARIANTS is on.
This should also kill the "dup(2) regression on 6.x" show stopper item on the
6.1-RELEASE TODO list.
This is a RELENG_6 candidate.
PR: kern/87208
Silence from: des
MFC after: 1 week
state about each open file, and identify the first process in the process
table that references the file. This is helpful in debugging leaks of
file descriptors.
MFC after: 1 week
with the file descriptor. When a file descriptor is closed as a result
of garbage collecting a UNIX domain socket, the file descriptor will
not have any associated thread, so the logic to identify advisory locks
held by that thread is not appropriate. Check the thread for NULL to
avoid this scenario. Expand an existing comment to say a bit more about
this.
MFC after: 1 week
- Prefer '_' to ' ', as it results in more easily parsed results in
memory monitoring tools such as vmstat.
- Remove punctuation that is incompatible with using memory type names
as file names, such as '/' characters.
- Disambiguate some collisions by adding subsystem prefixes to some
memory types.
- Generally prefer lower case to upper case.
- If the same type is defined in multiple architecture directories,
attempt to use the same name in additional cases.
Not all instances were caught in this change, so more work is required to
finish this conversion. Similar changes are required for UMA zone names.
- if minfd < fd_freefile (as is most often the case, since minfd is
usually 0), set it to fd_freefile.
- remove a call to fd_first_free() which duplicates work already done
by fdused().
This change results in a small but measurable speedup for processes
with large numbers (several thousands) of open files.
PR: kern/85176
Submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz>
MFC after: 3 weeks
opening a device, devfs_open needs the file descriptor to install its
own fileops. Failing to pass the file descriptor causes the vnode to
be returned with the regular vnops, which will cause a panic on the
first read or write because devfs_specops is not meant to support
those operations.
This bug caused a panic after exec'ing any set[ug]id program with
fds 0..2 closed (i.e., if any action had to be taken by fdcheckstd, we
would panic if the exec'd program ever tried to use any of those
descriptors).
Reviewed by: phk
Approved by: re (scottl)
structure in the struct pointed to by the 3rd argument for IPC_STAT and
get rid of the 4th argument. The old way returned a pointer into the
kernel array that the calling function would then access afterwards
without holding the appropriate locks and doing non-lock-safe things like
copyout() with the data anyways. This change removes that unsafeness and
resulting race conditions as well as simplifying the interface.
- Implement kern_foo wrappers for stat(), lstat(), fstat(), statfs(),
fstatfs(), and fhstatfs(). Use these wrappers to cut out a lot of
code duplication for freebsd4 and netbsd compatability system calls.
- Add a new lookup function kern_alternate_path() that looks up a filename
under an alternate prefix and determines which filename should be used.
This is basically a more general version of linux_emul_convpath() that
can be shared by all the ABIs thus allowing for further reduction of
code duplication.
which holds on to just the data structure and the mutex. (The
existing refcount (fd_refcnt) holds onto the open files in the
descriptor.)
The fd_holdcnt is protected by fdesc_mtx, fd_refcnt by FILEDESC_LOCK.
Add fdhold(struct proc *) which gets a hold on the filedescriptors of
the specified proc..
Add fddrop(struct filedesc *) which drops the fd_holdcnt and if zero
destroys the mutex and frees the memory.
Initialize the fd_holdcnt to one in fdinit(). Normal operations on
the filedesc structure will not change it.
In fdfree() use fddrop() to dispose of the mutex and structure. Hold
the FILEDESC_LOCK() until we have cleaned out the contents and carefully
set the fields to null values during cleanup.
Use fdhold()/fddrop() in mountcheckdirs() and sysctl_kern_file().
for ensuring that a process' filedesc is not shared with anybody.
Use it in the two places which previously had private implmentations.
This collects all fd_refcnt handling in kern_descrip.c
instead acquire it conditionally in closef() if it is required for
advisory locking. This removes Giant from the close() path of sockets
and pipes (and any other objects that don't acquire Giant in their
fo_close path, such as kqueues). Giant will still be acquired twice for
vnodes -- once for advisory lock teardown, and a second time in the
fo_close method. Both Poul-Henning and I believe that the advisory lock
teardown code can be moved into the vn_closefile path shortly.
This trims a percent or two off the cost of most non-vnode close
operations on SMP, but has a fairly minimal impact on UP where the cost
of a single mutex operation is pretty low.
Use this in all the places where sleeping with the lock held is not
an issue.
The distinction will become significant once we finalize the exact
lock-type to use for this kind of case.