Print a warning if a requested interface name is longer than
IFNAMSIZ.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
WARNING: You need to backup and restore the _unencrypted_ contents
WARNING: of your GBDE disks when you take this update!
Sponsored by: DARPA & NAI Labs.
from all low-level bus space support functions. There's no need
to actually force the read/write to be accepted by the platform
before we can do anything else. We still have the mf instruction
there, which forces ordering. This too is not required given the
semantices of the bus space I/O functions, but it's not at all
clear to me if there are any poorly written device drivers that
depend on the strict ordering by the processor. The motto here is
to take small steps...
o Properly set the pointer to the counter for each interrupt and
update the intrnames table.
o Remove Alpha cruft from intrcnt.h.
o Create INTRNAME_LEN as the single entity that defines the width
of the names in the intrnames table (incl. terminatinf '\0').
missed. This bug has been present since the vn_start_write() and
vn_finished_write() calls were first added in revision 1.159. When
the case is triggered, any attempts to create snapshots on the
filesystem will deadlock and also prevent further write activity
on that filesystem.
This guarantees that loads and stores emitted before the fence are
made visible before the IPI becomes pended.
Remove the mf.a instruction after initiating the IPI. There's no
guarantee that the IPI becomes pended prior to subsequent reads or
writes. Even if there was a guarantee, it would mostly be without
any benefit.
to conform to 1003.1-2001. Make it possible for applications to actually
tell whether or not asynchronous I/O is supported.
Since FreeBSD's aio implementation works on all descriptor types, don't
call down into file or vnode ops when [f]pathconf() is asked about
_PC_ASYNC_IO; this avoids the need for every file and vnode op to know about
it.
Implement new sysconf keys. Change the implenentation of
_SC_ASYNCHRONOUS_IO in preparation for the next set of changes.
Move some limits which had been in <sys/syslimits.h> to <limits.h> where
they belong. They had only ever been in syslimits.h to provide for the
kernel implementation of the CTL_USER MIB branch, which went away with
newsysctl years ago. (There is a #error in <sys/syslimits.h> which I
will downgrade in the next commit.)
for sparc64 from trap #9 to trap #65. This is one of the ABI "blessed"
system call vectors and is different from any other system that we might
want to emulate, making the emulation easier by reducing the number of
code paths that need to be shared. Compatibility with old applications
is provided with COMPAT_FREEBSD4.
Add defines for a few special traps that we may need to implement for
compatibility with 32bit applications, and add comments on which vectors
are used for what in other systems, and which are available.
Pass magic flags to trap() for deprecated or unimplemented system call
vectors so they will deliver SIGSYS instead of SIGILL.
This piggy backs nicely with the recent sigaction(2) system call number
change, and provided the rules are followed for upgrading past it, this
change should not be noticed.
mac_enforce_system toggle, rather than several separate toggles.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
permit MAC policies to augment the security protections on sysctl()
operations. This is not really a wonderful entry point, as we
only have access to the MIB of the target sysctl entry, rather than
the more useful entry name, but this is sufficient for policies
like Biba that wish to use their notions of privilege or integrity
to prevent inappropriate sysctl modification. Affects MAC kernels
only. Since SYSCTL_LOCK isn't in sysctl.h, just kern_sysctl.c,
we can't assert the SYSCTL subsystem lockin the MAC Framework.
Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
permits MAC modules to augment system security decisions regarding
the reboot() system call, if MAC is compiled into the kernel.
Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
mac_check_system_swapon(), to reflect the fact that the primary
object of this change is the running kernel as a whole, rather
than just the vnode. We'll drop additional checks of this
class into the same check namespace, including reboot(),
sysctl(), et al.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
"refreshing" the label on the vnode before use, just get the label
right from inception. For single-label file systems, set the label
in the generic VFS getnewvnode() code; for multi-label file systems,
leave the labeling up to the file system. With UFS1/2, this means
reading the extended attribute during vfs_vget() as the inode is
pulled off disk, rather than hitting the extended attributes
frequently during operations later, improving performance. This
also corrects sematics for shared vnode locks, which were not
previously present in the system. This chances the cache
coherrency properties WRT out-of-band access to label data, but in
an acceptable form. With UFS1, there is a small race condition
during automatic extended attribute start -- this is not present
with UFS2, and occurs because EAs aren't available at vnode
inception. We'll introduce a work around for this shortly.
Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
This is not quite the set of information I would want, but the tree where
I have the "correct" version is messed up with conflicts.
Sponsored by: DARPA & NAI Labs.
- Make DDB use %y instead of %z.
- Teach GCC about %y.
- Implement support for the C99 %z format modifier.
Approved by: re@
Reviewed by: peter
Tested on: i386, sparc64
handling clean and functional as 5.x evolves. This allows some of the
nasty bandaids in the 5.x codepaths to be unwound.
Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an
anti-foot-shooting measure in place, 5.x folks need this for a while) and
finish encapsulating the older stuff under COMPAT_43. Since the ancient
stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *'
to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn
is supposed to take), add a compile time check to prevent foot shooting
there too. Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc.
Tested on: i386, alpha, ia64. Compiled on sparc64 (a few days ago).
Approved by: re
Try INT 15H/E820H first, then fall back to the old compatibility
method (INT 12H).
This is a workaround for newer machines which have broken INT 12H BIOS
service implementation.
Reviewed by: -current ML
MFC after: 3 days
seem to have all the prerequisites already.
Call g_waitidle() as the first thing in vfs_mountroot() so that we have
it out of the way before we even decide if we should call .._ask() or
.._try().
Call the g_dev_print() function to provide better guidance for the
root-mount prompt.
streaming cache. This bug could have the potential to cause data
corruption on systems with Psycho U2P bridges (Sabre bridges have no
streaming cache).
However, due to the usual driver architecture, it is believed that
corruption did occur only in rare cases (if at all).
trap types and signals to send. Rearrange KASSERTs to better handle faults
early before curthread is setup, or in the case that it gets corrupted or
set to 0.
does not require Giant.
This means that we may miss panics on a class of mutex programming bugs,
but only if running with a Chernobyl setting of debug-flags.
Spotted by: Pete Carah <pete@ns.altadena.net>
long doubles at the moment (printf truncates them to doubles).
However, long doubles to appear to work to the ranges listed in this
commit on both -stable (4.5) and -current. There may be some slight
rounding issues with long doubles, but that's an orthogonal issue to
these constants.
I've had this in my local tree for 3 months, and in my company's local
tree for 15 months with no ill effects.
Obtained from: NetBSD
Not likely to like it: bde
check for and/or report I/O errors. The result is that a VFS_SYNC
or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in
the presence of a hard error writing a disk sector or in a filesystem
full condition. This patch ensures that I/O errors will always be
checked and returned. This patch also ensures that every call to
VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes
appropriate action when an error is returned.
Sponsored by: DARPA & NAI Labs.
so that there is ony one copy of it. Fix that one copy
so that KSEs with no mailbox in a KSE program are not a cause
of page faults (this can legitmatly happen).
Submitted by: (parts) davidxu
Quoting luigi:
In order to make the userland code fully 64-bit clean it may
be necessary to commit other changes that may or may not cause
a minor change in the ABI.
Reviewed by: luigi
they may be statically linked into the kernel. Note that statically
linked modules, unlike dynamically linked modules, get INVARIANTS,
so if there are INVARIANTS failures, you'll bump into them rather
than not. Add the options to NOTES.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
Add code to free KSEs and KSEGRPs on exit.
Sort KSE prototypes in proc.h.
Add the missing kse_exit() syscall.
ksetest now does not leak KSEs and KSEGRPS.
Submitted by: (parts) davidxu
extra function calls. Refactor uma_zalloc_internal into seperate functions
for finding the most appropriate slab, filling buckets, allocating single
items, and pulling items off of slabs. This makes the code significantly
cleaner.
- This also fixes the "Returning an empty bucket." panic that a few people
have seen.
Tested On: alpha, x86
pages are 4KB.
o As a second order fix, don't assume we have enough space
after the bootinfo block left in a page to hold the memory
map.
o A third order fix as that we removed the assumption that a
bootinfo block fits in a single 8KB page.
PR: ia64/39415
submitted by: Espen Skoglund <esk@ira.uka.de>
held. This avoids a lock order reversal when destroying zones.
Unfortunately, this also means that the free checks are not done before
the destructor is called.
Reported by: phk
to the primary local IP address when doing a TCP connect(). The
tcp_connect() code was relying on in_pcbconnect (actually in_pcbladdr)
modifying the passed-in sockaddr, and I failed to notice this in
the recent change that added in_pcbconnect_setup(). As a result,
tcp_connect() was ending up using the unmodified sockaddr address
instead of the munged version.
There are two cases to handle: if in_pcbconnect_setup() succeeds,
then the PCB has already been updated with the correct destination
address as we pass it pointers to inp_faddr and inp_fport directly.
If in_pcbconnect_setup() fails due to an existing but dead connection,
then copy the destination address from the old connection.
This policy can be loaded dynamically, and assigns each process a
partition number, as well as permitting processes to operate outside
the partition. Processes contained in a partition can only "see"
processes inside the same partition, so it's a little like jail.
The partition of a user can be set using the label mechanisms in
login.conf. This sample policy is a good starting point for developers
wanting to learn about how to produce labeled policies, as it labels
only one kernel object, the process credential.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
This policy can be loaded dynamically, and assigns each process a
partition number, as well as permitting processes to operate outside
the partition. Processes contained in a partition can only "see"
processes inside the same partition, so it's a little like jail.
The partition of a user can be set using the label mechanisms in
login.conf. This sample policy is a good starting point for developers
wanting to learn about how to produce labeled policies, as it labels
only one kernel object, the process credential.
PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
- Add detach support to the driver so that you can kldunload the module.
Note that currently rc_detach() fails to detach a unit if any of its
child devices are open, thus a kldunload will fail if any of the tty
devices are currently open.
- sys/i386/isa/ic/cd180.h was moved to sys/dev/ic/cd180.h as part of
this change.
Requested by: rwatson
Tested by: rwatson
ones with one text and one data section.
The text and data rlimit checks still needs to be fixed to properly
accout for additional sections.
Reviewed by: peter (slightly different patch version)
and XPT_RESET_DEV.
In order to properly handle reset requests whether they originate in the
ATA layer (atacontrol reinit) or from the CAM layer (camcontrol reset)
ata_reinit does not cause the SIM to be deallocated anymore. The SIM
is now unconditionnally created for each ATAPI bus.
This change may cause existing bus ids to change on some setups.
Reviewed by: roberto
Approved by: sos
same size. Add some fields that previously overlapped with something else
or were missing.
- Make struct regs and struct mcontext (minus floating point) the same as
struct trapframe so converting between them is easy (null).
- Add space for saving floating point state to struct mcontext. This requires
that it be 64 byte aligned.
- Add assertions that none of these structures change size, as they are part
of the ABI.
- Remove some dead code in sendsig().
- Save and restore %gsr in struct trapframe. Remember to restore %fsr.
- Add some comments to exception.S.
to merge mac_te, since the SEBSD port of SELinux/FLASK provides a much
more mature Type Enforcement implementation. This changes the size
of the on-disk 'struct oldmac' EA labels, which may require regeneration.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
permitting policies to restrict access to memory mapping based on
the credential requesting the mapping, the target vnode, the
requested rights, or other policy considerations.
Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
perform authorization checks during swapon() events; policies
might choose to enforce protections based on the credential
requesting the swap configuration, the target of the swap operation,
or other factors such as internal policy state.
Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
to parse their own label elements (some cleanup to occur here in the
future to use the newly added kernel strsep()). Policies now
entirely encapsulate their notion of label in the policy module.
Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
trying to acquire it's proc lock since the proc lock may not have been
constructed yet.
- Split up the one big comment at the top of the loop and put the pieces
in the right order above the various checks.
Reported by: kris (1)
to use a modified notion of 'struct mac', and flesh out the new variation
system calls (almost identical to existing ones except that they permit
a pid to be specified for process label retrieval, and don't follow
symlinks). This generalizes the label API so that the framework is
now almost entirely policy-agnostic.
Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
on all label parsing occuring in userland, and knowledge of the loaded
policies in the user libraries. This revision of the API pushes that
parsing into the kernel, avoiding the need for shared library support
of policies in userland, permitting statically linked binaries (such
as ls, ps, and ifconfig) to use MAC labels. In these API revisions,
high level parsing of the MAC label is done in the MAC Framework,
and interpretation of label elements is delegated to the MAC policy
modules. This permits modules to export zero or more label elements
to user space if desired, and support them in the manner they want
and with the semantics they want. This is believed to be the final
revision of this interface: from the perspective of user applications,
the API has actually not changed, although the ABI has.
Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
__mac_set_link, based on __mac_get_proc() except with a pid,
and __mac_get_file(), __mac_set_file() except that they do
not follow symlinks. First in a series of commits to flesh
out the user API.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
the ffs_copyonwrite routine to avoid a deadlock between the syncer
daemon trying to sync out a snapshot vnode and the bufdaemon
trying to write out a buffer containing the snapshot inode.
With any luck this will be the last snapshot race condition.
Sponsored by: DARPA & NAI Labs.
a full filesystem. Previously, if the allocation failed, we had to
fsync the file before rolling back any partial allocation of indirect
blocks. Most block allocation requests only need to allocate a single
data block and if that allocation fails, there is nothing to unroll.
So, before doing the fsync, we check to see if any rollback will
really be necessary. If none is necessary, then we simply return.
This update eliminates the flurry of disk activity that got triggered
whenever a filesystem would run out of space.
Sponsored by: DARPA & NAI Labs.
locks the mount point directory while waiting for vfs_busy to clear.
Meanwhile the unmount which holds the vfs_busy lock tried to lock
the mount point vnode. The fix is to observe that it is safe for the
unmount to remove the vnode from the mount point without locking it.
The lookup will wait for the unmount to complete, then recheck the
mount point when the vfs_busy lock clears.
Sponsored by: DARPA & NAI Labs.
that works in the new threaded kernel. It was commented out of
the disksort routine earlier this year for the reasons given in
kern/subr_disklabel.c (which is where this code used to reside
before it moved to kern/subr_disk.c):
----------------------------
revision 1.65
date: 2002/04/22 06:53:20; author: phk; state: Exp; lines: +5 -0
Comment out Kirks io-request priority hack until we can do this in a
civilized way which doesn't cause grief.
The problem is that it is not generally safe to cast a "struct bio
*" to a "struct buf *". Things like ccd, vinum, ata-raid and GEOM
constructs bio's which are not entrails of a struct buf.
Also, curthread may or may not have anything to do with the I/O request
at hand.
The correct solution can either be to tag struct bio's with a
priority derived from the requesting threads nice and have disksort
act on this field, this wouldn't address the "silly-seek syndrome"
where two equal processes bang the diskheads from one edge to the
other of the disk repeatedly.
Alternatively, and probably better: a sleep should be introduced
either at the time the I/O is requested or at the time it is completed
where we can be sure to sleep in the right thread.
The sleep also needs to be in constant timeunits, 1/hz can be practicaly
any sub-second size, at high HZ the current code practically doesn't
do anything.
----------------------------
As suggested in this comment, it is no longer located in the disk sort
routine, but rather now resides in spec_strategy where the disk operations
are being queued by the thread that is associated with the process that
is really requesting the I/O. At that point, the disk queues are not
visible, so the I/O for positively niced processes is always slowed
down whether or not there is other activity on the disk.
On the issue of scaling HZ, I believe that the current scheme is
better than using a fixed quantum of time. As machines and I/O
subsystems get faster, the resolution on the clock also rises.
So, ten years from now we will be slowing things down for shorter
periods of time, but the proportional effect on the system will
be about the same as it is today. So, I view this as a feature
rather than a drawback. Hence this patch sticks with using HZ.
Sponsored by: DARPA & NAI Labs.
Reviewed by: Poul-Henning Kamp <phk@critter.freebsd.dk>