expensive (!) 64bit multiply, divide, and comparison aren't necessary
(this came in originally from rev 1.19 to fix an overflow with large
sb_max or MCLBYTES).
The 64bit math in this function was measured in some kernel profiles as
being as much as 5-8% of the total overhead of the TCP/IP stack and
is eliminated with this commit. There is a harmless rounding error (of
about .4% with the standard values) introduced with this change,
however this is in the conservative direction (downward toward a
slightly smaller maximum socket buffer size).
MFC after: 3 days
make a series of modifications to the credential arguments relating
to file read and write operations to cliarfy which credential is
used for what:
- Change fo_read() and fo_write() to accept "active_cred" instead of
"cred", and change the semantics of consumers of fo_read() and
fo_write() to pass the active credential of the thread requesting
an operation rather than the cached file cred. The cached file
cred is still available in fo_read() and fo_write() consumers
via fp->f_cred. These changes largely in sys_generic.c.
For each implementation of fo_read() and fo_write(), update cred
usage to reflect this change and maintain current semantics:
- badfo_readwrite() unchanged
- kqueue_read/write() unchanged
pipe_read/write() now authorize MAC using active_cred rather
than td->td_ucred
- soo_read/write() unchanged
- vn_read/write() now authorize MAC using active_cred but
VOP_READ/WRITE() with fp->f_cred
Modify vn_rdwr() to accept two credential arguments instead of a
single credential: active_cred and file_cred. Use active_cred
for MAC authorization, and select a credential for use in
VOP_READ/WRITE() based on whether file_cred is NULL or not. If
file_cred is provided, authorize the VOP using that cred,
otherwise the active credential, matching current semantics.
Modify current vn_rdwr() consumers to pass a file_cred if used
in the context of a struct file, and to always pass active_cred.
When vn_rdwr() is used without a file_cred, pass NOCRED.
These changes should maintain current semantics for read/write,
but avoid a redundant passing of fp->f_cred, as well as making
it more clear what the origin of each credential is in file
descriptor read/write operations.
Follow-up commits will make similar changes to other file descriptor
operations, and modify the MAC framework to pass both credentials
to MAC policy modules so they can implement either semantic for
revocation.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
we can use the names _receive() and _send() for the receive() and send()
checks. Rename related constants, policy implementations, etc.
PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
MFC after:
type of the 'flags' argument m_getcl() was using anyway; m_extadd()
needed to be changed to accept an int instead of a short for 'flags.'
This makes things more consistent and also gives us more bits to
use for m_flags in the future (we have almost run out).
Requested by: sam (Sam Leffler)
during a label change resulting in an mmap removal. This is "fail stop"
behavior, which is preferred, although it offers slightly less
transparency.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
to do this made the following script hang:
#!/bin/sh
set -ex
extattrctl start /tmp
extattrctl initattr 64 /tmp/EA00
extattrctl enable /tmp user ea00 /tmp/EA00
extattrctl showattr /tmp/EA00
if the filesystem backing /tmp did not support EAs.
The real solution is probably to have the extattrctl syscall do the
unlocking rather than depend on the filesystem to do it. Considering
that extattrctl is going to be made obsolete anyway, this has dogwash
priority.
Sponsored by: DARPA & NAI Labs.
there is a global lock over the undo structures because of the way
they are managed.
Switch to using SLIST instead of rolling our own linked list.
Fix several races where a permission check was done before a
copyin/copyout, if the copy happened to fault it may have been
possible to race for access to a semaphore set that one shouldn't
have access to.
Requested by: rwatson
Tested by: NetBSD regression suite.
entire subsystem, we could move to per-message queue locks, however
the messages themselves seem to come from a global pool and to avoid
over-locking this code (locking individual queues, then the global
pool) I've opted to just do it this way.
Requested by: rwatson
Tested by: NetBSD's regression suite.
as part of the TrustedBSD MAC framework. Instrument the creation
and destruction of pipes, as well as relevant operations, with
necessary calls to the MAC framework. Note that the locking
here is probably not quite right yet, but fixes will be forthcoming.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
the jail check and the MAC socket labeling in socreate(). This handles
socket creation using a cached credential better (such as in the NFS
client code when rebuilding a socket following a disconnect: the new
socket should be created using the nfsmount cached cred, not the cred
of the thread causing the socket to be rebuilt).
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
'options MAC') as long as IO_NOMACCHECK is not set in the IO flags.
If IO_NOMACCHECK is set, bypass MAC checks in vn_rdwr(). This allows
vn_rdwr() to be used as a utility function inside of file systems
where MAC checks have already been performed, or where the operation
is being done on behalf of the kernel not the user.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI LAbs
enabled and the kernel provides the MAC registration and entry point
service. Declare a dependency on that module service for any
MAC module registered using mac_policy.h. For now, hard code the
version as 1, but once we've come up with a versioning policy, we'll
move to a #define of some sort. In the mean time, this will prevent
loading a MAC module when 'options MAC' isn't present, which (due to
a bug in the kernel linker) can result if the MAC module is preloaded
via loader.conf.
This particular evil recommended by: peter
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI LAbs
thus hiting EIO at the end of file. This is believed to be a feature
(not a bug) of vn_rdwr(), so we turn it off by supplying aresid param.
Reviewed by: rwatson, dg
(I skipped those in contrib/, gnu/ and crypto/)
While I was at it, fixed a lot more found by ispell that I
could identify with certainty to be errors. All of these
were in comments or text, not in actual code.
Suggested by: bde
MFC after: 3 days
- Make getvfsbyname() take a struct xvfsconf *.
- Convert several consumers of getvfsbyname() to use struct xvfsconf.
- Correct the getvfsbyname.3 manpage.
- Create a new vfs.conflist sysctl to dump all the struct xvfsconf in the
kernel, and rewrite getvfsbyname() to use this instead of the weird
existing API.
- Convert some {set,get,end}vfsent() consumers to use the new vfs.conflist
sysctl.
- Convert a vfsload() call in nfsiod.c to kldload() and remove the useless
vfsisloadable() and endvfsent() calls.
- Add a warning printf() in vfs_sysctl() to tell people they are using
an old userland.
After these changes, it's possible to modify struct vfsconf without
breaking the binary compatibility. Please note that these changes don't
break this compatibility either.
When bp will have updated mount_smbfs(8) with the patch I sent him, there
will be no more consumers of the {set,get,end}vfsent(), vfsisloadable()
and vfsload() API, and I will promptly delete it.
sysctl_sysctl_next() to skip this sysctl. The sysctl is
still available, but doesn't appear in a "sysctl -a".
This is especially useful when you want to deprecate a sysctl,
and add a warning into it to warn users that they are using
an old interface. Without this flag, the warning would get
echoed when running "sysctl -a" (which happens at boot).
appologize to those of you who may have been seeing crashes in
code that uses sendfile(2) or other types of external buffers
with mbufs.
Pointed out by, and provided trace:
Niels Chr. Bank-Pedersen <ncbp at bank-pedersen.dk>
VOP wrapper is called from within file systems so can result in odd
loopback effects when MAC enforcement is use with the active (as
opposed to saved) credential. These checks will be moved elsewhere.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
I'm not sure what happenned to the original setting of the P_CONTINUED
flag. it appears to have been lost in the paper shuffling...
Submitted by: David Xu <bsddiy@yahoo.com>
argument, not the 'type' argument. As a result of the buf, the
MAC label on some packet header mbufs might not be set in mbufs
allocated using m_getcl(), resulting in a page fault.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
vnode operations. This permits the rights of the user (typically root)
used to turn on accounting to be used when writing out accounting entries,
rather than the credentials of the process generating the accounting
record. This fixes accounting in a number of environments, including
file systems that offer revocation support, MAC environments, some
securelevel scenarios, and in some NFS environments.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
the initproc credential from the proc0 credential. Otherwise, the
proc0 credential is used instead of initproc's credentil when authorizing
start_init() activities prior to initproc hitting userland for the
first time. This could result in the incorrect credential being used
to authorize mounting of the root file system, which could in turn cause
problems for NFS when used in combination with uid/gid ipfw rules, or
with MAC.
Discussed with: julian
to the address of the user's aiocb rather than the kernel's aiocb. (In other
words, prior to this change, the ident field returned by kevent(2) on
completion of an AIO was effectively garbage.)
Submitted by: Romer Gil <rgil@cs.rice.edu>
cninit. This allows a console driver to replace the existing console
by calling cninit again, eg during the device probe. Otherwise the
multiple console code sends output to both, which is unfortunate if
they're using the same hardware.
about calls to SYSCTL_OUT() made with locks held if the buffer has not
been pre-wired. SYSCTL_OUT() should not be called while holding locks,
but if this is not possible, the buffer should be wired by calling
sysctl_wire_old_buffer() before grabbing any locks.
that LIO_READ and LIO_WRITE were requests for kevent()-based
notification of completion. Modify _aio_aqueue() to recognize LIO_READ
and LIO_WRITE.
Notes: (1) The patch provided by the PR perpetuates a second bug in this
code, a direct access to user-space memory. This change fixes that bug
as well. (2) This change is to code that implements a deprecated interface.
It should probably be removed after an MFC.
PR: kern/39556
investigate the problem described below.
I am seeing some strange livelock on recent -current sources with
a slow box under heavy load, which disappears with this change.
This might suggest some kind of problem (either insufficient locking,
or mishandling of priorities) in the poll_idle thread.