When returning EIO from DEVFSIO_RADD ioctl, drop the exclusive rule
lock. Otherwise the system comes to a rather sudden and grinding
halt.
As the underlying devfs locking changes that lead to this bug have not
been merged to RELENG_5, this fix to those locking changes does not
need to be merged.
src/sys/fs/devfs/devfs_vnops.c 1.128
src/sys/kern/vfs_subr.c 1.652
This is a workaround for a complicated issue involving VFS cookies and devfs.
The PR and patch have the details. The ultimate fix requires architectural
changes and clarifications to the VFS API, but this will prevent the system
from panicking when someone does "ls /dev" while running in a shell under the
linuxulator.
PR: 88249
Submitted by: "Devon H. O'Dell" <dodell@ixsystems.com>
Original commit message:
Modified files:
sys/fs/nwfs nwfs_vnops.c
Log:
Update nwfs_lookup() to match the current cache_lookup() API.
cache_lookup() has returned a ref'ed and locked vnode since
vfs_cache.c:1.96, dated Tue Mar 29 12:59:06 2005 UTC. This change
is similar to the change made to smbfs_lookup() in smbfs_vnops.c:1.58.
Tested by: "Antony Mawer" ant AT mawer.org
MFC after: 2 weeks
Revision Changes Path
1.42 +11 -26 src/sys/fs/nwfs/nwfs_vnops.c
Approved by: re (scottl)
| Ensure the full value is written into inode variables.
|
| PR: 85503
| Submitted by: Dmitry Pryanishnikov <dmitry@atlantis.dp.ua>
|
| Revision Changes Path
| 1.89 +2 -2 src/sys/fs/msdosfs/msdosfs_denode.c
Approved by: re (scottl)
Second attempt at a work-around for fifo-related socket panics during
make -j with high levels of parallelism: acquire Giant in fifo I/O
routines.
Discussed with: ups
Approved by: re (scottl)
msdosfs_vfsops.c v1.146
bootsect.h v1.13
Remove checks for BOOTSIG[23] from FAT32 bootblocks.
There seems to be very little documentary evidence outside this
implementation to suggest a these checks are neccessary, and more
than one camera-formatted flash disk fails the check, but mounts
successfully on most other systems.
Approved by: re (scottl@)
Assert v_fifoinfo is non-NULL in fifo_close() in order to catch
non-conforming cases sooner.
Reported by: Peter Holm <peter at holm dot cc>
Approved by: re (scottl)
sys/fs/msdosfs/msdosfs_vfsops.c:1.145,
sys/fs/ntfs/ntfs_vfsops.c:1.79-1.80,
sys/fs/udf/udf_vfsops.c:1.34-1.35,
sys/gnu/fs/ext2fs/ext2_vfsops.c:1.152-1.153,
sys/gnu/fs/reiserfs/reiserfs_vfsops.c:1.2-1.3 (by ssouhlal):
*_mountfs() (if the filesystem mounts from a device) needs devvp to be
locked, so lock it.
Approved by: re (scottl)
Lock the read socket receive buffer when frobbing the sb_state flag on
that socket during open, not the write socket receive buffer.
Spotted by: ups
Approved by: re (scottl)
For reasons of consistency (and necessity), assert an exclusive vnode
lock on the fifo vnode in fifo_open(): we rely on the vnode lock to
serialize access to v_fifoinfo.
Approved by: re (scottl)
Assert that (vp) is locked in fifo_close(), since we rely on the
exclusive vnode lock to synchronize the reference counts on struct
fifoinfo.
Approved by: re (scottl)
The socket pointers in fifoinfo are not permitted to be NULL, so
don't check if they are, it just confuses the fifo code more.
Approved by: re (kensmith)
Handle a race condition where NULLFS vnode can be cleaned while threads
can still be asleep waiting for lowervp lock.
Tested by: kkenn
Discussed with: ssouhlal, jeffr
(this is an early MFC for inclusion in the upcoming 6.0-BETA5)
Approved by: re (scottl)
Use vput() instead of vrele() in null_reclaim() since the lower vnode
is locked.
MFC 1.89 (by kan):
Handle a race condition where NULLFS vnode can be cleaned while threads
can still be asleep waiting for lowervp lock.
Tested by: kkenn
Discussed with: ssouhlal, jeffr
(rev. 1.89 is an early MFC for inclusion in the upcoming 6.0-BETA5)
Approved by: re (scottl)
Trim down now (believed to be) unused fifo_ioctl() and
fifo_kqfilter() VOP implementations, since they in theory are used
only on open file descriptors, in which case the ioctls are via
fifo_ioctl_f() and kqueue requests are via fifo_kqfilter_f().
Generate warnings if they are entered for now. These printf()
calls should become panic() calls.
Annotate and re-implement fifo_ioctl_f(): don't arbitrarily
forward ioctls to the socket layer, only forward the ones we
explicitly support for fifos. In the case of FIONREAD, don't
forward the request to the write socket on a read-write fifo, or
the read result is overwritten. Annotate a nasty case for the
undefined POSIX O_RDWR on fifos, in which failure of the second
ioctl will result in the socket pair being in an inconsistent
state.
Assert copyright as I find myself rewriting non-trivial parts of
fifofs.
Approved by: re (scottl)
Annotate two issues:
1) fifo_kqfilter() is not actually ever used, it likely should be GC'd.
2) fifo_kqfilter_f() doesn't implement EVFILT_VNODE, so detecting events
on the underlying vnode for a fifo no longer works (it did in 4.x).
Likely, fifo_kqfilter_f() should forward the request to the VFS using
fp->f_vnode, which would work once fifo_kqfilter() was detached from
the vnode operation vector (removing the fifo override).
Discussed with: phk
Approved by: re (scottl)
As a result of kqueue locking work, socket buffer locks will always
be held when entering a kqueue filter for fifos via a socket buffer
event: as such, assert the lock unconditionally rather than acquiring
it conditionally.
Approved by: re (scottl)
Introduce no-op nosup fifo kqueue filter and detach routine, which are
used when a read filter is requested on a write-only fifo descriptor, or
a write filter is requested on a read-only fifo descriptor. This
permits the filters to be registered, but never raises the event, which
causes kqueue behavior for fifos to more closely match similar semantics
for poll and select, which permit testing for the condition even though
the condition will never be raised, and is consistent with POSIX's notion
that a fifo has identical semantics to a one-way IPC channel created
using pipe() on most operating systems.
The fifo regression test suite can now run to completion on HEAD without
errors.
Approved by: re (kensmith)
When a request is made to register a filter on a fifo that doesn't
apply to the fifo (i.e., not EVFILT_READ or EVFILT_WRITE), reject
it as EINVAL, not by returning 1 (EPERM).
Approved by: re (kensmith)
Remove DFLAG_SEEKABLE from fifo file descriptors: fifos are not seekable
according to POSIX, not to mention the fact that it doesn't make sense
(and hence isn't really implemented). This causes the fifo_misc
regression test to succeed.
Approved by: re (scottl)
Only poll the fifo for read events if the fifo is attached to a readable
file descriptor. Otherwise, the read end of a fifo might return that it
is writable (which it isn't).
Only poll the fifo for write events if the fifo attached to a writable
file descriptor. Otherwise, the write end of a fifo might return that
it is readable (which it isn't).
In the event that a file is FREAD|FWRITE (which is allowed by POSIX, but
has undefined behavior), we poll for both.
Approved by: re (kensmith)
After going to some trouble to identify only the write-related events
to poll the write socket for, the fifo polling code proceeded to poll
for the complete set of events. Use 'levents' instead of 'events' as
the argument to poll, and only poll the write socket if there is
interest in write events.
Approved by: re (kensmith)
Add an assertion that fifo_open() doesn't race against other threads
while sleeping to allocate fifo state: due to using the vnode lock to
serialize access to a fifo during open, it shouldn't happen (tm).
Approved by: re (kensmith)
Rather than reaching into the internals of the UNIX domain socket code
by calling uipc_connect2() to connect two socket endpoints to create a
fifo, call soconnect2().
Approved by: re (kensmith)
HEAD to RELENG_6: changes to introduce a credentialed version of the
clone event handler, and then changes to merge the regular and
credentialed versions into a single interface (along with updates to
existing consumers). With this merge, 6.x and 7.x are in sync.
First batch merges devfs_devs.c:1.37, devfs_vnops.c:1.115,
kern_conf.c:1.187, tty_pty.c:1.138, mac_vfs.c:1.109, mac_biba.c:1.36,
mac_lomac.c:1.36, mac_mls.c:1.73, mac_stub.c:1.53, mac_test.c:1.61,
conf.h:1.223, mac.h:1.68, mac_policy.h:1.67 from HEAD to RELENG_6:
When devfs cloning takes place, provide access to the credential of the
process that caused the clone event to take place for the device driver
creating the device. This allows cloned device drivers to adapt the
device node based on security aspects of the process, such as the uid,
gid, and MAC label.
- Add a cred reference to struct cdev, so that when a device node is
instantiated as a vnode, the cloning credential can be exposed to
MAC.
- Add make_dev_cred(), a version of make_dev() that additionally
accepts the credential to stick in the struct cdev. Implement it and
make_dev() in terms of a back-end make_dev_credv().
- Add a new event handler, dev_clone_cred, which can be registered to
receive the credential instead of dev_clone, if desired.
- Modify the MAC entry point mac_create_devfs_device() to accept an
optional credential pointer (may be NULL), so that MAC policies can
inspect and act on the label or other elements of the credential
when initializing the skeleton device protections.
- Modify tty_pty.c to register clone_dev_cred and invoke make_dev_cred(),
so that the pty clone credential is exposed to the MAC Framework.
While currently primarily focussed on MAC policies, this change is also
a prerequisite for changes to allow ptys to be instantiated with the UID
of the process looking up the pty. This requires further changes to the
pty driver -- in particular, to immediately recycle pty nodes on last
close so that the credential-related state can be recreated on next
lookup.
Submitted by: Andrew Reisse <andrew.reisse@sparta.com>
Obtained from: TrustedBSD Project
Sponsored by: SPAWAR, SPARTA
Second batch merges scsi_target.c:1.68, coda_fbsd.c:1.43,
firewirereg.h:1.38, fwdev.c:1.47, nmdm.c:1.36, snp.c:1.100, dsp.c:1.82,
mixer.c:1.45, vkbd.c:1.9, devfs_vnops.c:1.117, tty_pty.c:1.139,
tty_tty.c:1.57, bpf.c:1.156, if_tap.c:1.56, if_tun.c:1.153,
smb_dev.c:1.28, conf.h:1.224 from HEAD to RELENG_6:
Merge the dev_clone and dev_clone_cred event handlers into a single
event handler, dev_clone, which accepts a credential argument.
Implementors of the event can ignore it if they're not interested,
and most do. This avoids having multiple event handler types and
fall-back/precedence logic in devfs.
This changes the kernel API for /dev cloning, and may affect third
party packages containg cloning kernel modules.
Requested by: phk
These changes modifies the kernel device driver API for device cloning,
and might require minor modifications to third party device drivers that
make use of devfs cloning. It will not be merged to RELENG_5.
Approved by: re (scottl)
> devfs is not yet fully MPSAFE - for example, multiple concurrent devfs(8)
> processes can cause a panic when operating on rulesets.
Approved by: re (scottl)
This is good enough to be able to run a RELENG_4 gdb binary against
a RELENG_4 application, along with various other tools (eg: 4.x gcore).
We use this at work.
ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace,
procfs and core dumps.
procfs_*regs.c: vary the format of proc/XXX/*regs depending on the client
and target application.
procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their
sscanf fails. They expect an unsigned long.
imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps.
sys_process.c: handle 32 bit consumers debugging 32 bit targets. Note
that 64 bit consumers can still debug 32 bit targets.
IA64 has got stubs for ia32_reg.c.
Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't
implemented in the 32/64 wrapper yet. We also make a tiny patch to
gdb pacify it over conflicting formats of ld-elf.so.1.
Approved by: re
ioctl numbers in backwards compatability mode. eg: an IOC_IN ioctl with
a size of zero. Traditionally this was what you did before IOC_VOID
existed, and we had some established users of this in the tree, namely
procfs. Certain 3rd party drivers with binary userland components also
have this too.
This is necessary to have 4.x and 5.x binaries use these ioctl's. We
found this at work when trying to run 4.x binaries.
Approved by: re
To check a directory's in-use bitmap bit by bit, we use
a pointer to an 8 bit wide unsigned value.
The index used to dereference this pointer is calculated
by shifting the bit index right 3 bits. Then we do a
logical AND with the bit# represented by the lower 3
bits of the bit index.
This is an idiomatic way of iterating through a bit map
with simple bitwise operations.
This commit fixes the bug that we only checked bits
3:0 of each 8 bit chunk, because we only used bits 1:0
of the bit index for the bit# in the current 8 bit value.
This resulted in files not being returned by getdirentries(2).
Change the type of the bit map pointer from `char *' to
`u_int8_t *'.