Commit Graph

5608 Commits

Author SHA1 Message Date
Maxime Henrion
4578a2e652 - Rename the DDB specific %z printf format to %y.
- Make DDB use %y instead of %z.
- Teach GCC about %y.
- Implement support for the C99 %z format modifier.

Approved by:	re@
Reviewed by:	peter
Tested on:	i386, sparc64
2002-10-25 19:41:32 +00:00
Peter Wemm
23eeeff7be Split 4.x and 5.x signal handling so that we can keep 4.x signal
handling clean and functional as 5.x evolves.  This allows some of the
nasty bandaids in the 5.x codepaths to be unwound.

Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an
anti-foot-shooting measure in place, 5.x folks need this for a while) and
finish encapsulating the older stuff under COMPAT_43.  Since the ancient
stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *'
to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn
is supposed to take), add a compile time check to prevent foot shooting
there too.  Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc.

Tested on: i386, alpha, ia64.  Compiled on sparc64 (a few days ago).
Approved by: re
2002-10-25 19:10:58 +00:00
Poul-Henning Kamp
df6b615a42 #include <geom/geom.h> to get proper prototypes. Contrary to my fears we
seem to have all the prerequisites already.

Call g_waitidle() as the first thing in vfs_mountroot() so that we have
it out of the way before we even decide if we should call .._ask() or
.._try().

Call the g_dev_print() function to provide better guidance for the
root-mount prompt.
2002-10-25 18:44:42 +00:00
David Xu
ddc4f28155 suspend thread only when it can be interrupted. 2002-10-25 13:12:36 +00:00
David Xu
0cf609706f let thread_schedule_upcall() handle idle kse. 2002-10-25 12:50:31 +00:00
Poul-Henning Kamp
fa669ab7b8 Disable the kernacc() check in mtx_validate() until such time that kernacc
does not require Giant.

This means that we may miss panics on a class of mutex programming bugs,
but only if running with a Chernobyl setting of debug-flags.

Spotted by:	Pete Carah <pete@ns.altadena.net>
2002-10-25 08:40:20 +00:00
Poul-Henning Kamp
0d6dc414b4 In vrele() we can actually have a VCHR with v_rdev == NULL if we
came from the bottom of addaliasu().  Don't panic.
2002-10-25 07:58:25 +00:00
Julian Elischer
de4723f6e8 fix style-o 2002-10-25 07:17:07 +00:00
Julian Elischer
9d10277721 More work on the interaction between suspending and sleeping threads.
Also clean up some code used with 'single-threading'.

Reviewed by:	davidxu
2002-10-25 07:11:12 +00:00
Kirk McKusick
9ab73fd11a Within ufs, the ffs_sync and ffs_fsync functions did not always
check for and/or report I/O errors. The result is that a VFS_SYNC
or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in
the presence of a hard error writing a disk sector or in a filesystem
full condition. This patch ensures that I/O errors will always be
checked and returned.  This patch also ensures that every call to
VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes
appropriate action when an error is returned.

Sponsored by:   DARPA & NAI Labs.
2002-10-25 00:20:37 +00:00
David Xu
4c40dcd4d7 fix typo. 2002-10-25 00:13:46 +00:00
Julian Elischer
1434d3fe6f Extract out KSE specific code from machine specific code
so that there is ony one copy of it. Fix that one copy
so that KSEs with no mailbox in a KSE program are not a cause
of page faults (this can legitmatly happen).

Submitted by:	(parts) davidxu
2002-10-24 23:09:48 +00:00
Poul-Henning Kamp
a2fb4feded Fix the spechash lock order reversal by keeping an updated sum
of v_usecount in the dev_t which vcount() can return without
locking any vnodes.

Seen by:	jhb
2002-10-24 19:38:56 +00:00
Poul-Henning Kamp
7c0c26b4c4 Make sure GEOM has stopped rattling the disks before we try to mount
the root filesystem, this may be implicated in the PC98 issue.
2002-10-24 19:26:08 +00:00
Poul-Henning Kamp
1f59664b68 Don't try to be cute and save a call/return by implementing a degenerate
vrele() inline.
2002-10-24 17:55:49 +00:00
David Xu
33862f40b0 respect TDF_SINTR, also for SINGLE_NO_EXIT threading mode, if a thread
was already suspended, do nothing.
2002-10-24 14:43:48 +00:00
David Xu
9991db0cb5 don't forget to remove kse from idle queue. 2002-10-24 09:16:46 +00:00
Julian Elischer
5c8329ed6c Move thread related code from kern_proc.c to kern_thread.c.
Add code to free KSEs and KSEGRPs on exit.
Sort KSE prototypes in proc.h.
Add the missing kse_exit() syscall.

ksetest now does not leak KSEs and KSEGRPS.

Submitted by:	(parts) davidxu
2002-10-24 08:46:34 +00:00
Dag-Erling Smørgrav
f2c1ea8152 Whitespace cleanup. 2002-10-23 10:26:54 +00:00
Alexander Kabaev
96725dd01a Handle binaries with arbitrary number PT_LOAD sections, not only
ones with one text and one data section.

The text and data rlimit checks still needs to be fixed to properly
accout for additional sections.

Reviewed by:	peter (slightly different patch version)
2002-10-23 01:57:39 +00:00
John Baldwin
12f65109c8 Don't dereference the 'x' pointer if it is NULL, instead skip the
assignment.  The netsmb code likes to call these functions with a NULL
x argument a lot.

Reported by:	Vallo Kallaste <kalts@estpak.ee>
2002-10-22 18:44:59 +00:00
Robert Drehmel
d08926b1f6 Change the `mutex_prof' structure to use three variables contained
in an anonymous structure as counters, instead of an array with
preprocessor-defined names for indices.  Remove the associated XXX-
comment.
2002-10-22 16:06:28 +00:00
Robert Watson
1cbfd977fd Introduce MAC_CHECK_VNODE_SWAPON, which permits MAC policies to
perform authorization checks during swapon() events; policies
might choose to enforce protections based on the credential
requesting the swap configuration, the target of the swap operation,
or other factors such as internal policy state.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-22 15:53:43 +00:00
Robert Watson
2789e47e2c Missed in previous merge: export sizeof(struct oldmac) rather than
sizeof(struct mac).

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-22 15:33:33 +00:00
Robert Watson
f7b951a8e0 Support the new MAC user API in kernel: modify existing system calls
to use a modified notion of 'struct mac', and flesh out the new variation
system calls (almost identical to existing ones except that they permit
a pid to be specified for process label retrieval, and don't follow
symlinks).  This generalizes the label API so that the framework is
now almost entirely policy-agnostic.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-22 14:29:47 +00:00
Robert Watson
5cb559a5e0 Regen. 2002-10-22 14:23:52 +00:00
Robert Watson
aad1cdc852 Flesh out prototypes for __mac_get_pid, __mac_get_link, and
__mac_set_link, based on __mac_get_proc() except with a pid,
and __mac_get_file(), __mac_set_file() except that they do
not follow symlinks.  First in a series of commits to flesh
out the user API.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-22 14:22:24 +00:00
David Xu
81fd489272 detect idle kse correctly. 2002-10-22 02:27:19 +00:00
Kirk McKusick
9e4b381a54 This update removes a race between unmount and lookup. The lookup
locks the mount point directory while waiting for vfs_busy to clear.
Meanwhile the unmount which holds the vfs_busy lock tried to lock
the mount point vnode. The fix is to observe that it is safe for the
unmount to remove the vnode from the mount point without locking it.
The lookup will wait for the unmount to complete, then recheck the
mount point when the vfs_busy lock clears.

Sponsored by:	DARPA & NAI Labs.
2002-10-22 01:06:44 +00:00
Kirk McKusick
e03486d198 This checkin reimplements the io-request priority hack in a way
that works in the new threaded kernel. It was commented out of
the disksort routine earlier this year for the reasons given in
kern/subr_disklabel.c (which is where this code used to reside
before it moved to kern/subr_disk.c):

----------------------------
revision 1.65
date: 2002/04/22 06:53:20;  author: phk;  state: Exp;  lines: +5 -0
Comment out Kirks io-request priority hack until we can do this in a
civilized way which doesn't cause grief.

The problem is that it is not generally safe to cast a "struct bio
*" to a "struct buf *".  Things like ccd, vinum, ata-raid and GEOM
constructs bio's which are not entrails of a struct buf.

Also, curthread may or may not have anything to do with the I/O request
at hand.

The correct solution can either be to tag struct bio's with a
priority derived from the requesting threads nice and have disksort
act on this field, this wouldn't address the "silly-seek syndrome"
where two equal processes bang the diskheads from one edge to the
other of the disk repeatedly.

Alternatively, and probably better: a sleep should be introduced
either at the time the I/O is requested or at the time it is completed
where we can be sure to sleep in the right thread.

The sleep also needs to be in constant timeunits, 1/hz can be practicaly
any sub-second size, at high HZ the current code practically doesn't
do anything.
----------------------------

As suggested in this comment, it is no longer located in the disk sort
routine, but rather now resides in spec_strategy where the disk operations
are being queued by the thread that is associated with the process that
is really requesting the I/O. At that point, the disk queues are not
visible, so the I/O for positively niced processes is always slowed
down whether or not there is other activity on the disk.

On the issue of scaling HZ, I believe that the current scheme is
better than using a fixed quantum of time. As machines and I/O
subsystems get faster, the resolution on the clock also rises.
So, ten years from now we will be slowing things down for shorter
periods of time, but the proportional effect on the system will
be about the same as it is today. So, I view this as a feature
rather than a drawback. Hence this patch sticks with using HZ.

Sponsored by:	DARPA & NAI Labs.
Reviewed by:	Poul-Henning Kamp <phk@critter.freebsd.dk>
2002-10-22 00:59:49 +00:00
Poul-Henning Kamp
c177d125bf GEOM does not (and shall not) propagate flags like D_MEMDISK, so we will
revert to checking the name to determine if our root device is a ramdisk,
md(4) specifically to determine if we should attempt the root-mount RW

Sponsored by:	DARPA & NAI Labs.
2002-10-21 20:09:59 +00:00
Dag-Erling Smørgrav
6d0369001a Reduce the overhead of the mutex statistics gathering code, try to produce
shorter lines in the report, and clean up some minor style issues.
2002-10-21 18:48:28 +00:00
Olivier Houchard
e3bf3aea25 One #include <sys/sysctl.h> should be enough.
Approved by:	mux (mentor)
2002-10-21 18:40:40 +00:00
Brooks Davis
29e1b85f97 Use if_printf(ifp, "blah") instead of
printf("%s%d: blah", ifp->if_name, ifp->if_xname).
2002-10-21 02:51:56 +00:00
Thomas Moestl
5775150869 Fix the calculations of the length of the unread message buffer
contents. The code was subtracting two unsigned ints, stored the
result in a log and expected it to be the same as of a signed
subtraction; this does only work on platforms where int and long
have the same size (due to overflows).
Instead, cast to long before the subtraction; the numbers are
guaranteed to be small enough so that there will be no overflows
because of that.
2002-10-20 23:13:05 +00:00
Poul-Henning Kamp
962414a120 We have memset() and memcpy() in the kernel now, so we don't need to
#define them to bzero and bcopy.

Spotted by:	FlexeLint
2002-10-20 22:33:42 +00:00
Julian Elischer
2f030624b1 Add an actual implementation of kse_wakeup()
Submitted by:	Davidxu
2002-10-20 21:08:47 +00:00
Thomas Moestl
e381d2455b Add kernel dump support, based on the ia64 version (which was committed
as sparc64/sparc64/dump_machdep.c a while back).
Other than ia64 (which uses ELF), sparc64 uses a homegrown format for
the dumps (headers are required because the physical address and size of
the tsb must be noted, and because physical memory may be discontiguous);
ELF would not offer any advantages here.

Reviewed by:	jake
2002-10-20 17:03:15 +00:00
Poul-Henning Kamp
ab33958276 #unifdef the code for checking blessed lock collisions until we need it.
Spotted by:	DARPA & NAI Labs.
2002-10-20 08:48:39 +00:00
Robert Watson
a13c67da35 If MAC_MAX_POLICIES isn't defined, don't try to define it, just let the
compile fail.  MAC_MAX_POLICIES should always be defined, or we have
bigger problems at hand.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-20 03:41:09 +00:00
Peter Wemm
8556393bb2 Stake a claim on 418 (__xstat), 419 (__xfstat), 420 (__xlstat) 2002-10-19 22:25:31 +00:00
Peter Wemm
c8447553b5 Grab 416/417 real estate before I get burned while testing again.
This is for the not-quite-ready signal/fpu abi stuff.  It may not see
the light of day, but I'm certainly not going to be able to validate it
when getting shot in the foot due to syscall number conflicts.
2002-10-19 22:09:23 +00:00
Robert Watson
b614dd131a Add a new 'NOMACCHECK' flag to namei() NDINIT flags, which permits the
caller to indicate that MAC checks are not required for the lookup.
Similar to IO_NOMACCHECK for vn_rdwr(), this indicates that the caller
has already performed all required protections and that this is an
internally generated operation.  This will be used by the NFS server
code, as we don't currently enforce MAC protections against requests
delivered via NFS.

While here, add NOCROSSMOUNT to PARAMASK; apparently this was used at
one point for name lookup flag checking, but isn't any longer or it
would have triggered from the NFS server code passing it to indicate
that mountpoints shouldn't be crossed in lookups.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 21:25:51 +00:00
Robert Watson
3ab93f0958 Regen from addition of execve_mac placeholder. 2002-10-19 21:15:10 +00:00
Robert Watson
bc5245d94c Add a placeholder for the execve_mac() system call, similar to SELinux's
execve_secure() system call, which permits a process to pass in a label
for a label change during exec.  This permits SELinux to change the
label for the resulting exec without a race following a manual label
change on the process.  Because this interface uses our general purpose
MAC label abstraction, we call it execve_mac(), and wrap our port of
SELinux's execve_secure() around it with appropriate sid mappings.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 21:06:57 +00:00
Robert Watson
89c61753a0 Drop in the MAC check for file creation as part of open().
Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 20:56:44 +00:00
Robert Watson
9aeffb2b28 Make sure to clear the 'registered' flag for MAC policies when they
unregister.  Under some obscure (perhaps demented) circumstances,
this can result in a panic if a policy is unregistered, and then someone
foolishly unregisters it again.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 20:30:12 +00:00
Robert Watson
7587203c2f Hook up most of the MAC entry points relating to file/directory/node
creation, deletion, and rename.  There are one or two other stray
cases I'll catch in follow-up commits (such as unix domain socket
creation); this permits MAC policy modules to limit the ability to
perform these operations based on existing UNIX credential / vnode
attributes, extended attributes, and security labels.  In the rename
case using MAC, we now have to lock the from directory and file
vnodes for the MAC check, but this is done only in the MAC case,
and the locks are immediately released so that the remainder of the
rename implementation remains the same.  Because the create check
takes a vattr to know object type information, we now initialize
additional fields in the VATTR passed to VOP_SYMLINK() in the MAC
case.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 20:25:57 +00:00
Marcel Moolenaar
1aeb23cdfa Add two hooks to signal module load and module unload to MD code.
The primary reason for this is to allow MD code to process machine
specific attributes, segments or sections in the ELF file and
update machine specific state accordingly. An immediate use of this
is in the ia64 port where unwind information is updated to allow
debugging and tracing in/across modules. Note that this commit
does not add the functionality to the ia64 port. See revision 1.9
of ia64/ia64/elf_machdep.c.

Validated on: alpha, i386, ia64
2002-10-19 19:16:03 +00:00
Marcel Moolenaar
c143d6c24a Reduce code duplication by moving the common actions in
link_elf_init(), link_elf_link_preload_finish() and
link_elf_load_file() to link_elf_link_common_finish().
Since link_elf_init() did initializations as a side-effect
of doing the common actions, keep the initialization in
that function. Consequently, link_elf_add_gdb() is now also
called to insert the very first link_map() (ie the kernel).
2002-10-19 18:59:33 +00:00