4253 Commits

Author SHA1 Message Date
Doug Rabson
e913ca22e2 Move setregs() out from under the PROC_LOCK so that it can use functions
like suword() which may trap.
2001-10-10 20:04:57 +00:00
Robert Watson
8a7d8cc675 - Combine kern.ps_showallprocs and kern.ipc.showallsockets into
a single kern.security.seeotheruids_permitted, described as:
  "Unprivileged processes may see subjects/objects with different real uid"
  NOTE: kern.ps_showallprocs exists in -STABLE, and therefore there is
  an API change.  kern.ipc.showallsockets does not.
- Check kern.security.seeotheruids_permitted in cr_cansee().
- Replace visibility calls to socheckuid() with cr_cansee() (retain
  the change to socheckuid() in ipfw, where it is used for rule-matching).
- Remove prison_unpcb() and make use of cr_cansee() against the UNIX
  domain socket credential instead of comparing root vnodes for the
  UDS and the process.  This allows multiple jails to share the same
  chroot() and not see each other's UNIX domain sockets.
- Remove unused socheckproc().

Now that cr_cansee() is used universally for socket visibility, a variety
of policies are more consistently enforced, including uid-based
restrictions and jail-based restrictions.  This also better-supports
the introduction of additional MAC models.

Reviewed by:	ps, billf
Obtained from:	TrustedBSD Project
2001-10-09 21:40:30 +00:00
John Baldwin
8688bb9383 proces -> process in a comment. 2001-10-09 17:25:30 +00:00
Robert Watson
32d186043b o Recent addition of (p1==p2) exception in p_candebug() permitted
processes to attach debugging to themselves even though the
  global kern_unprivileged_procdebug_permitted policy might disallow
  this.
o Move the kern_unprivileged_procdebug_permitted check above the
  (p1==p2) check.

Reviewed by:	des
2001-10-09 16:56:29 +00:00
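The check ordering described in the commit above can be sketched in plain C. This is only an illustration under stated assumptions, not the kernel code: struct proc_sketch, its privileged field, the policy variable, and p_candebug_sketch() are hypothetical stand-ins, and the remaining checks of the real function are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a process; not the kernel's struct proc. */
struct proc_sketch {
	int privileged;		/* nonzero if the process has superuser rights */
};

/* Hypothetical stand-in for the global policy knob named in the commit. */
static int unprivileged_procdebug_permitted = 1;

/* Returns 0 if p1 may debug p2, or EPERM if the policy forbids it. */
int
p_candebug_sketch(const struct proc_sketch *p1, const struct proc_sketch *p2)
{
	assert(p1 != NULL && p2 != NULL);

	/* The global policy is consulted first ... */
	if (!unprivileged_procdebug_permitted && !p1->privileged)
		return (EPERM);

	/* ... so the self-attach exception can no longer bypass it. */
	if (p1 == p2)
		return (0);

	/* Further credential checks are omitted from this sketch. */
	return (0);
}
```

With the policy disabled, an unprivileged process is refused even when it targets itself, which is the behavior the reordering is meant to restore.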
John Baldwin
74e4502e62 Replace 'curproc' with 'td->td_proc'. 2001-10-08 21:05:46 +00:00
Matthew Dillon
917efbaaba WS Cleanup 2001-10-08 19:51:13 +00:00
Dag-Erling Smørgrav
3da3249106 Dissociate ptrace from procfs.
Until now, the ptrace syscall was implemented as a wrapper that called
various functions in procfs depending on which ptrace operation was
requested.  Most of these functions were themselves wrappers around
procfs_{read,write}_{,db,fp}regs(), with only some extra error checks,
which weren't necessary in the ptrace case anyway.

This commit moves procfs_rwmem() from procfs_mem.c into sys_process.c
(renaming it to proc_rwmem() in the process), and implements ptrace()
directly in terms of procfs_{read,write}_{,db,fp}regs() instead of
having it fake up a struct uio and then call procfs_do{,db,fp}regs().

It also moves the prototypes for procfs_{read,write}_{,db,fp}regs()
and proc_rwmem() from proc.h to ptrace.h, and marks all procfs files
except procfs_machdep.c as "optional procfs" instead of "standard".
2001-10-07 20:08:42 +00:00
Dag-Erling Smørgrav
23fad5b6c9 Always succeed if the target process is the same as the requesting process. 2001-10-07 20:06:03 +00:00
Ian Dowse
80f42b555d Fix a typo in do_sigaction() where sa_sigaction and sa_handler were
confused. Since sa_sigaction and sa_handler alias each other in a
union, the bug was completely harmless. This had been fixed as part
of the SIGCHLD changes in revision 1.125, but it was reverted when
they were backed out in revision 1.126.
2001-10-07 16:11:37 +00:00
Robert Watson
c175d2226f o Introduce an 'options REGRESSION'-dependent sysctl namespace,
'regression.*'.
o Add 'regression.securelevel_nonmonotonic', conditional on 'options
  REGRESSION', which allows the securelevel to be lowered for the purposes
  of efficient regression testing of securelevel policy decisions.
  Regression tests for securelevels will be committed shortly.

NOTE: 'options REGRESSION' should never be used on production machines, as
it permits violation of system invariants so as to improve the ability to
effectively test edge cases, and improve testing efficiency.
2001-10-07 03:51:22 +00:00
Marcel Moolenaar
49ead724c6 Fix breakage caused by previous commit. The lkmnosys and lkmressys
syscalls are of type NODEF but not in a way that fits the given
definition of that type. The exact difference of lkmressys and
lkmnosys is unclear, which makes it all the more confusing. A
reevaluation of what we have and what we really need is in order.

Spotted by: Maxime Henrion <mux@qualys.com>
Pointy hat: marcel
2001-10-07 00:16:31 +00:00
Matthew Dillon
845bd795c9 vinvalbuf() was only waiting for write-I/O to complete. It really has to
wait for both read AND write I/O to complete.  Only NFS calls vinvalbuf()
on an active vnode (when the server indicates that the file is stale), so
this bug fix only affects NFS clients.

MFC after:	3 days
2001-10-05 20:10:32 +00:00
John Baldwin
43150722c9 The aio kthreads start off with a root credential just like all other
kthreads, so don't malloc a ucred just so we can create a duplicate of the
one we already have.
2001-10-05 17:55:11 +00:00
Paul Saab
4787fd37af Only allow users to see their own socket connections if
kern.ipc.showallsockets is set to 0.

Submitted by:	billf (with modifications by me)
Inspired by:	Dave McKay (aka pm aka Packet Magnet)
Reviewed by:	peter
MFC after:	2 weeks
2001-10-05 07:06:32 +00:00
Dag-Erling Smørgrav
50f74e92b8 Final style(9) commit: placement of opening brace; a continuation indent I
missed in the previous commit; a line that exceeded 80 characters.  No
functional changes, but the object file's md5 checksum changes because some
lines have been displaced.
2001-10-04 16:35:44 +00:00
Dag-Erling Smørgrav
8a8d4e459c More style(9) fixes: no spaces between function name and parameter list;
some indentation fixes (particularly continuation lines).

Reviewed by:	md5(1)
2001-10-04 16:29:45 +00:00
Dag-Erling Smørgrav
c5799337ea This file had a mixture of "return foo;" and "return (foo);"; standardize
on "return (foo);" as mandated by style(9).

Reviewed by:	md5(1)
2001-10-04 16:09:22 +00:00
David Malone
2bc21ed985 Hopefully improve control message passing over Unix domain sockets.
1) Allow the sending of more than one control message at a time
over a unix domain socket. This should cover the PR 29499.

2) This requires that unp_{ex,in}ternalize and unp_scan understand
mbufs with more than one control message at a time.

3) Internalize and externalize used to work on the mbuf in-place.
This made life quite complicated and the code for sizeof(int) <
sizeof(file *) could end up doing the wrong thing. The patch always
creates a new mbuf/cluster now. This resulted in the change of the
prototype for the domain externalise function.

4) You can now send SCM_TIMESTAMP messages.

5) Always use CMSG_DATA(cm) to determine where the data starts
in unp_{ex,in}ternalize. It was using ((struct cmsghdr *)cm + 1)
in some places, which gives the wrong alignment on the alpha.
(NetBSD made this fix some time ago).

This results in an ABI change for descriptor passing and creds
passing on the alpha. (Probably on the IA64 and Sparc ports too).

6) Fix userland programs to use CMSG_* macros too.

7) Be more careful about freeing mbufs containing (file *)s.
This is made possible by the prototype change of externalise.

PR:		29499
MFC after:	6 weeks
2001-10-04 13:11:48 +00:00
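Items 5 and 6 of the commit above concern locating control-message data through the CMSG_* macros rather than by pointer arithmetic on the header. A minimal userland sketch of that usage follows. The helper names pack_int_cmsg() and unpack_int_cmsg() are hypothetical, and the buffer handed in is assumed to be suitably aligned for a struct cmsghdr.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Store one int (for example a descriptor number) in a control buffer.
 * Returns the number of bytes used, or 0 if the buffer is too small.
 */
size_t
pack_int_cmsg(void *buf, size_t buflen, int value)
{
	struct cmsghdr *cm = buf;

	assert(buf != NULL);
	if (buflen < CMSG_SPACE(sizeof(value)))
		return (0);
	memset(buf, 0, CMSG_SPACE(sizeof(value)));
	cm->cmsg_len = CMSG_LEN(sizeof(value));
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_RIGHTS;
	/* CMSG_DATA() accounts for any padding after the header. */
	memcpy(CMSG_DATA(cm), &value, sizeof(value));
	return (CMSG_SPACE(sizeof(value)));
}

/* Read the int back, again locating the payload with CMSG_DATA(). */
int
unpack_int_cmsg(void *buf)
{
	struct cmsghdr *cm = buf;
	int value;

	assert(buf != NULL);
	memcpy(&value, CMSG_DATA(cm), sizeof(value));
	return (value);
}
```

On platforms where the header size is already a multiple of the required alignment, (cm + 1) and CMSG_DATA(cm) happen to coincide, which is likely why the mismatch only showed up on some architectures.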
David Malone
59bdd40568 Allow sbcreatecontrol to make cluster sized control messages. 2001-10-04 12:59:53 +00:00
John Baldwin
0479e3d339 Move the ap boot spin lock earlier in the lock order before the sio(4)
lock since we occasionally call printf() while holding the ap boot lock
which can call down into the sio(4) driver if using a serial console.
2001-10-01 22:50:30 +00:00
Robert Watson
c6ab2f6b4e o Complete the migration from suser error checking in the following form
in vfs_syscalls.c:

    if (mp->mnt_stat.f_owner != p->p_ucred->cr_uid &&
        (error = suser_td(td)) != 0) {
            unwrap_lots_of_stuff();
            return (error);
    }

  to:

    if (mp->mnt_stat.f_owner != p->p_ucred->cr_uid) {
            error = suser_td(td);
            if (error) {
                unwrap_lots_of_stuff();
                return (error);
            }
    }

  This makes the code more readable when complex clauses are in use,
  and minimizes conflicts for large outstanding patchsets modifying the
  kernel authorization code (of which I have several), especially where
  existing authorization and context code are combined in the same if()
  conditional.

Obtained from:	TrustedBSD Project
2001-10-01 20:01:07 +00:00
Matthew Dillon
b5810bab2d After extensive testing it has been determined that adding complexity
to avoid removing higher level directory vnodes from the namecache has
no perceivable effect and will be removed.  This is especially true
when vmiodirenable is turned on, which it is by default now.  ( vmiodirenable
makes a huge difference in directory caching ).  The vfs.vmiodirenable and
vfs.nameileafonly sysctls have been left in to allow further testing, but
I expect to rip out vfs.nameileafonly soon too.

I have also determined through testing that the real problem with numvnodes
getting too large is due to the VM Page cache preventing the vnode from
being reclaimed.  The directory stuff made only a tiny dent relative
to Poul's original code, enough so that some tests succeeded.  But tests
with several million small files show that the bigger problem is the VM Page
cache.  This will have to be addressed by a future commit.

MFC after:	3 days
2001-10-01 04:33:35 +00:00
Jonathan Lemon
1a6fc8ef63 When FREE()ing kqueue related structures, charge them to the correct bucket.
Submitted by: iedowse
Forgotten by: jlemon
2001-09-30 17:00:56 +00:00
Bosko Milekic
70a61707f6 Re-enable mbtypes statistics in the mbuf allocator. I disabled these
when I changed the allocator bits. This implements per-CPU mbtypes
stats by keeping net number of decrements/increments of a given mbtype
per-CPU and then summing all of the per-CPU mbtypes to produce the total
net number of allocated mbufs of the given mbtype.
Counters are carefully balanced to avoid/prevent underflows/overflows.

mbtypes stats are re-enabled with the idea that we may occasionally
(although very rarely) observe slight inconsistencies in the stat
reporting. Most of the time, we should be fine, though.

Also make appropriate modifications to netstat(1) and systat(1) to do
the necessary reporting.

Submitted by: Jiangyi Liu <jyliu@163.net>
2001-09-30 01:58:39 +00:00
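The per-CPU accounting described in the commit above can be illustrated with a small sketch. It assumes fixed array sizes and omits all locking; the names NCPU_SKETCH, NTYPES_SKETCH, and the mb_*_sketch() functions are hypothetical and do not come from the allocator itself.

```c
#include <assert.h>

#define NCPU_SKETCH	4	/* hypothetical number of CPUs */
#define NTYPES_SKETCH	3	/* hypothetical number of mbuf types */

/* Net count (allocations minus frees) of each type, per CPU. */
static long mbtypes_pcpu[NCPU_SKETCH][NTYPES_SKETCH];

/* Record an allocation of the given type on the given CPU. */
void
mb_alloc_note_sketch(int cpu, int type)
{
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	assert(type >= 0 && type < NTYPES_SKETCH);
	mbtypes_pcpu[cpu][type]++;
}

/* Record a free; the per-CPU value may legitimately go negative. */
void
mb_free_note_sketch(int cpu, int type)
{
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	assert(type >= 0 && type < NTYPES_SKETCH);
	mbtypes_pcpu[cpu][type]--;
}

/* The reported total is the sum of the per-CPU net counts. */
long
mbtypes_total_sketch(int type)
{
	long total = 0;
	int cpu;

	assert(type >= 0 && type < NTYPES_SKETCH);
	for (cpu = 0; cpu < NCPU_SKETCH; cpu++)
		total += mbtypes_pcpu[cpu][type];
	return (total);
}
```

An mbuf allocated on one CPU and freed on another leaves +1 and -1 in two different slots, so only the sum is meaningful. Reading the slots without synchronization is one plausible source of the slight inconsistencies the commit message mentions.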
Jonathan Lemon
0217f5c71e Have EVFILT_TIMERS allocate their callouts via malloc() instead of using
the static callout list allocated by the system.

Change malloc type from M_TEMP to M_KQUEUE to better track memory.

Add a kern.kq_calloutmax to globally limit the amount of kernel memory
that can be allocated by callouts.

Submitted by: iedowse  (items 1, 2)
2001-09-29 17:48:39 +00:00
Dag-Erling Smørgrav
5b6db47748 Add a couple of API functions I need for my pseudofs WIP. Documentation
will follow when I've decided whether to keep this API or ditch it in
favor of something slightly more subtle.
2001-09-29 00:32:46 +00:00
Marcel Moolenaar
4166877345 Make the NODEF type usable. A syscall of type NODEF will only
have its entry in the syscall table added. Nothing else is
done. This differs from type NOPROTO in that NOPROTO adds a
definition to syscall.h besides adding a sysent. A syscall can
now have multiple entries without conflict. Note that the
argssize is fixed and depends on the syscall name.
2001-09-28 01:21:57 +00:00
Robert Watson
87fce2bb96 o When performing a securelevel check as part of securelevel_ge() or
securelevel_gt(), determine first if a local securelevel exists --
  if so, perform the check based on imax(local, global).  Otherwise,
  simply use the global value.
o Note: even though local securelevels might lag below the global one,
  if the global value is updated to higher than local values, maximum
  will still be used, making the global dominant even if there is local
  lag.

Obtained from:	TrustedBSD Project
2001-09-26 20:41:48 +00:00
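The imax(local, global) rule in the commit above can be sketched as follows. The structures and function names are hypothetical stand-ins rather than the kernel's own, and the sketch returns a plain boolean instead of whatever error convention the real securelevel_gt() and securelevel_ge() use.

```c
#include <assert.h>	/* used by the accompanying checks only */
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's jail or credential types. */
struct jail_sketch {
	int securelevel;		/* local (per-jail) securelevel */
};

struct cred_sketch {
	struct jail_sketch *jail;	/* NULL when the subject is not jailed */
};

static int global_securelevel = -1;

static int
imax_sketch(int a, int b)
{
	return (a > b ? a : b);
}

/* Effective level: max(local, global) when jailed, else the global value. */
int
effective_securelevel_sketch(const struct cred_sketch *cr)
{
	if (cr != NULL && cr->jail != NULL)
		return (imax_sketch(cr->jail->securelevel, global_securelevel));
	return (global_securelevel);
}

/* Nonzero when the effective securelevel is strictly greater than level. */
int
securelevel_gt_sketch(const struct cred_sketch *cr, int level)
{
	return (effective_securelevel_sketch(cr) > level);
}

/* Nonzero when the effective securelevel is at least level. */
int
securelevel_ge_sketch(const struct cred_sketch *cr, int level)
{
	return (effective_securelevel_sketch(cr) >= level);
}
```

Taking the maximum means a jail whose local value lags behind a raised global securelevel is still held to the global one.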
Robert Watson
8a528812a0 o Modify kern.securelevel MIB entry to return a local securelevel, if
one is present in the current jail, otherwise, to return the global
  securelevel.
o If the securelevel is being updated, require that it be greater than
  the maximum of local and global, if a local securelevel exists,
  otherwise, just maximum of the global.  If there is a local
  securelevel, update the local one instead of the global one.
o Note: this does allow local securelevels to lag behind the global one
  as long as the local one is not updated following a global increase.

Obtained from:	TrustedBSD Project
2001-09-26 20:39:48 +00:00
Robert Watson
567931c8f6 o Initialize per-jail securelevel from global securelevel as part of
jail creation.

Obtained from:	TrustedBSD Project
2001-09-26 20:37:15 +00:00
Robert Watson
d501d04b9e o Modify static settime() to accept the proc * for the process requesting
a time change, and callers so that they provide td->td_proc.
o Modify settime() to use securelevel_gt() for securelevel checking.

Obtained from:	TrustedBSD Project
2001-09-26 19:53:57 +00:00
Robert Watson
c2f413af19 o Modify sysctl access control check to use securelevel_gt(), and
clarify sysctl access control logic.

Obtained from:	TrustedBSD Project
2001-09-26 19:51:25 +00:00
Matthew Dillon
46cad5761c Enable vmiodirenable by default. Remove incorrect comment from sysctl.conf.
MFC after:	1 week
2001-09-26 19:35:04 +00:00
Matthew Dillon
3418ebebfe Make uio_yield() a global. Call uio_yield() between chunks
in vn_rdwr_inchunks(), allowing other processes to gain an exclusive
lock on the vnode.  Specifically: directory scanning, to avoid a race to the
root directory, and multiple child processes coring simultaneously so they
can figure out that some other core'ing child has an exclusive adv lock and
just exit instead.

This completely fixes performance problems when large programs core.  You
can have hundreds of copies (forked children) of the same binary core all
at once and not notice.

MFC after:	3 days
2001-09-26 06:54:32 +00:00
Paul Saab
88b1d98f31 Lock the vnode while truncating the corefile. This fixes a panic
with softupdates dangling deps.

Submitted by:	peter
MFC:		ASAP :)
2001-09-26 01:24:07 +00:00
John Baldwin
21377ce065 Remove superfluous parens after de-macroizing. 2001-09-26 00:05:18 +00:00
Robert Watson
75bc5b3f22 o So, when <dd> e-mailed me and said that the comment was inverted
for securelevel_ge() and securelevel_gt(), I was a little surprised,
  but fixed it.  Turns out that it was the code that was inverted, during
  a whitespace cleanup in my commit tree.  This commit inverts the
  checks, and restores the comment.
2001-09-25 21:08:33 +00:00
John Baldwin
dde96c9933 Since we no longer inline any debugging code in the mutex operations, move
all the debugging code into the function versions of the mutex operations
in kern_mutex.c.  This reduced the __mtx_* macros to simply wrappers of
the _{get,rel}_lock_* macros, so the __mtx_* macros were also abolished in
favor of just calling the _{get,rel}_lock_* macros.  The tangled hairy mass
of macros calling macros is at least a bit more sane now.
2001-09-22 21:19:55 +00:00
Robert Watson
b4799065ef o vpaccess() -> vn_access() -- Peter reminds me that there is already
a convention for vnop helper routines of this sort.

Submitted by:	Mr Wemm <peter>
2001-09-22 03:07:41 +00:00
John Baldwin
ed01445d8f Use the passed in thread to selrecord() instead of curthread. 2001-09-21 22:46:54 +00:00
John Baldwin
456ca585db Use the passed in thread pointer instead of curthread in calls to
selrecord() in ptcpoll().  The pre-KSE code used the passed in proc pointer
rather than curproc, and an earlier seltrue() call uses the passed in
thread and not curthread.
2001-09-21 22:22:25 +00:00
John Baldwin
fea2ab833e The P_SELECT flag was moved from p->p_flag to td->td_flags, but p_flag
was locked by the proc lock and td_flags is locked by the sched_lock.
The places that read, set, and cleared TDF_SELECT weren't updated, so they
read and modified td_flags w/o holding the sched_lock, meaning that they
could corrupt the per-thread flags field.  As an immediate band-aid,
grab sched_lock while reading and manipulating td_flags in relation to
TDF_SELECT.  This will probably be cleaned up some later on.
2001-09-21 22:06:22 +00:00
John Baldwin
e649bcb506 Remove unneeded proc variables and fix comments. 2001-09-21 21:54:45 +00:00
Robert Watson
a90a3f2882 o Part two of eaccess(2) commit, rebuilt system call code.
Obtained from:	TrustedBSD Project
2001-09-21 21:34:06 +00:00
Robert Watson
9c94f7731e o Introduce eaccess(2), a version of access(2) that uses the effective
credentials rather than the real credentials.  This is useful for
  implementing GUI's which need to modify icons based on access rights,
  but where use of open(2) is too expensive, use of stat(2) doesn't
  reflect the file system's real protection model, and use of
  access() suffers from real/effective credential confusion.  This
  implementation provides the same semantics as the call of the same
  name on SCO OpenServer.  Note: using this call improperly can
  leave you subject to some of the same races present in the
  access(2) call.
o To implement this, break out the basic logic of access(2) into
  vpaccess(), which accepts a passed credential to perform the
  invocation of VOP_ACCESS().  Add eaccess(2) to invoke vpaccess(),
  and modify access(2) to use vpaccess().

Obtained from:	TrustedBSD Project
2001-09-21 21:33:22 +00:00
John Baldwin
278da5113f Remove a bogus comment. "atomic" doesn't mean that the operation is done
as a physical atomic operation.  That would require the code to use the
atomic API, which it does not.  Instead, the operation is made pseudo
atomic (hence the quotes) by use of the lock to protect clearing all of the
flags in question.
2001-09-21 19:26:57 +00:00
John Baldwin
21832b1ec0 GC some #if 0'd code. 2001-09-21 19:21:18 +00:00
John Baldwin
3226cbf43b Whitespace and spelling fixes. 2001-09-21 19:16:12 +00:00
Michael Reifenberger
896de692f8 Make msgseg, msgssz (->msgmax) and msgmni TUNABLE. 2001-09-21 09:25:17 +00:00
Peter Wemm
1114d18594 Add a pointer to kenv(1). 2001-09-21 02:25:53 +00:00
Jonathan Lemon
57ea1fa07f Revert last commit. The same functionality can be obtained through the
'kenv' command, which I obviously was unaware of.
2001-09-21 02:09:01 +00:00
Robert Watson
94088977c9 o Rename u_cansee() to cr_cansee(), making the name more comprehensible
in the face of a rename of ucred to cred, and possibly generally.

Obtained from:	TrustedBSD Project
2001-09-20 21:45:31 +00:00
Jonathan Lemon
e492f03505 Add a sysctl MIB 'kern.env', that dumps the contents of the kernel
environment from the loader, as well as the kernel's compiled in static
hints.
2001-09-20 20:09:37 +00:00
Peter Wemm
fbd7a9dd97 decrement the dumping variable after use so we can call it several times
if needed.
2001-09-20 06:08:53 +00:00
John Baldwin
a44f918bf9 Fix a bug in propagate priority: the kse group pointer wasn't being
updated in the loop so the new thread always seemed to have the same
priority as the original thread and no actual priorities were changed.
2001-09-19 22:52:59 +00:00
Robert Watson
288b789333 o Clarification of securelevel_{ge,gt} comment.
Submitted by:	dd
2001-09-19 14:09:13 +00:00
Peter Wemm
66f769fe39 Add missing ; in last commit
Pointy-hat-to: jhb
2001-09-19 02:53:59 +00:00
Peter Wemm
98cdde71e7 Regenerate 2001-09-18 23:33:33 +00:00
Peter Wemm
eb25edbda3 Cleanup and split of nfs client and server code.
This builds on the top of several repo-copies.
2001-09-18 23:32:09 +00:00
John Baldwin
9ef3a9855d Use a 'p' variable instead of repetitively indirecting td->td_proc for
signal things that are still per-process and won't be per-thread.
2001-09-18 23:27:06 +00:00
John Baldwin
8cc06751dd Don't initialize proc0's mutex twice. It is already done earlier on in the
MD startup code.
2001-09-18 22:09:47 +00:00
Robert Watson
3ca719f12e o Introduce two new calls, securelevel_gt() and securelevel_ge(), which
abstract the securelevel implementation details from the checking
  code.  The call in -CURRENT accepts a struct ucred--in -STABLE, it
  will accept struct proc.  This facilitates the upcoming commit of
  per-jail securelevel support.  The calls will also generate a
  kernel printf if the calls are made with NULL ucred/proc pointers:
  generally speaking, there are few instances of this, and they should
  be fixed.
o Update p_candebug() to use securelevel_gt(); future updates to the
  remainder of the kernel tree will be committed soon.

Obtained from:	TrustedBSD Project
2001-09-18 21:03:53 +00:00
Mark Peek
796ed2a6d0 Set debug information on the process being traced, not the current (debugger)
process. This should allow gdb to function correctly on post-KSE kernels.
2001-09-18 19:06:11 +00:00
Jonathan Lemon
6a494eeb34 Change p into ke->ke_proc, this was hidden behind INVARIANTS. 2001-09-18 03:36:21 +00:00
Peter Wemm
d2718e479a Fix a fatal type mismatch (char *static_env; vs char static_env[]).
Submitted by:	bde
2001-09-17 21:27:41 +00:00
Julian Elischer
fdd4e5c652 Replace line accidentally deleted during KSE additions.
Symptom.. Stopped program unable to be restarted if it was stopped
while already sleeping.
2001-09-17 20:42:25 +00:00
Robert Watson
9844fbc3b5 o Correct authorization check in CANSIGIO(), which suffered from incorrect
transcription during the (pcred,ucred) merge; this was not used for
  the kill() system call, so does not affect direct explicit process
  signalling.

Pointed out by:	fenner
2001-09-15 22:34:46 +00:00
Peter Wemm
b711616825 In the devfs case, have initproc attempt the easy cases of mounting /dev.
This works if /dev exists, or if / is read/write (nfsroot).  If it is
too hard, leave it up to init -d (which will probably fail if /dev does
not exist, but there isn't much else we can do short of making a union
mount on /).

This means we get a proper /dev if you boot a 5.x kernel on a 4.x world,
which I happen to do often (the ramdisks on our install netboot servers
have 4.x userland worlds on them).
2001-09-15 11:15:22 +00:00
Doug Rabson
de1792cbb8 The ia64 kernel is now linked dynamically so parse its _DYNAMIC structure. 2001-09-15 11:02:10 +00:00
John Baldwin
bce9841972 Fix locking on td_flags for TDF_DEADLKTREAT. If the comments in the code
are true that curthread can change during this function, then this flag
needs to become a KSE flag, not a thread flag.
2001-09-13 22:33:37 +00:00
Michael Reifenberger
d528be2bf3 PR: kern/29698 (part)
Reviewed by:	audit
Implement SEM_STAT (like IPC_STAT but treats semid as sema-index).
The linuxerator will need it.
2001-09-13 21:06:41 +00:00
Michael Reifenberger
b3a4bc4247 PR: kern/29698 (part)
Reviewed by:	audit
Add tunables for the sem* and shm* syscontrols for tuning on boottime
until they become dynamic.
SAP R/3 doesn't like the compiled in defaults.
2001-09-13 20:20:09 +00:00
Julian Elischer
9dbea9237c If an incoming struct proc could have been NULL before, then don't
automatically change the code to add

struct proc *p = td->td_proc;

because now 'td' is probably capable of being NULL too.
I expect to see more of this kind of error during the 'weeding'
process. It's too easy to make. (junior hacker project.. look for these :-)

Submitted by:	mark Peek <mp@freebsd.org>
2001-09-12 20:26:57 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Peter Wemm
8ee6d9e90f Fix the kern.module_path issue that required the trailing '/' character
on each module path component.  Fix a one-byte buffer overflow at the
same time that got highlighted in the process.
2001-09-12 00:50:23 +00:00
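The two fixes in the commit above (no mandatory trailing '/' and no write past the end of the buffer) can be illustrated with a short userland sketch. The function join_module_path_sketch() is hypothetical and only shows the general technique, not the kernel linker's actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Join a module path component and a file name, adding a '/' only when
 * the component does not already end in one.  Returns 0 on success or
 * -1 if the result (including the terminating NUL) would not fit.
 */
int
join_module_path_sketch(char *out, size_t outlen, const char *dir,
    const char *name)
{
	size_t dlen;
	const char *sep;
	int n;

	assert(out != NULL && dir != NULL && name != NULL);
	dlen = strlen(dir);
	sep = (dlen > 0 && dir[dlen - 1] == '/') ? "" : "/";

	/* snprintf() never writes more than outlen bytes, NUL included. */
	n = snprintf(out, outlen, "%s%s%s", dir, sep, name);
	return ((n >= 0 && (size_t)n < outlen) ? 0 : -1);
}
```

Checking the return value against the buffer size catches truncation, including the off-by-one case where the string fits but its terminator does not.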
Dima Dorfman
34d2276e63 Correct a debugging message. 2001-09-11 12:20:24 +00:00
Peter Wemm
505222d35f Implement the long-awaited module->file cache database. A userland
tool (kldxref(8)) keeps a cache of what modules and versions are inside
what .ko files.  I have tested this on both Alpha and i386.

Submitted by:	bp
2001-09-11 01:09:24 +00:00
John Baldwin
04b5a9bbd6 - Axe holding_giant as it is not used now anyways and was ok'd by
dillon in an earlier e-mail.
- We don't need to test the console right before we vfprintf() the panicstr
  message.  The printing of the panic message is a fine console test by
  itself and doesn't make useful messages scroll off the screen or tick
  developers off in quite the same way.

Requested by:	jlemon, imp, bmilekic, chris, gsutter, jake (2)
2001-09-10 21:04:49 +00:00
Peter Wemm
b03a0c9e5e Fix a warning on alpha (real problem) and make pstat -t work as a bonus.
'struct tty' was out of sync in user and kernel due to dev_t/udev_t
mixups.  This takes advantage of the fact that dev_t changes type in
userland, so it isn't too pretty.
2001-09-10 12:05:47 +00:00
Dima Dorfman
b40832162b Make the 'nsops' variable in 'semop' unsigned. This prevents an
overflow if uap->nsops (which is already unsigned) is over INT_MAX;
consequently, the bounds check below becomes valid.  Previously, if a
value over INT_MAX was passed in uap->nsops, the bounds check wouldn't
catch it, and the value would be used to compute copyin()'s third
argument.

Obtained from:	NetBSD
2001-09-10 11:36:08 +00:00
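The signed/unsigned problem described in the commit above can be reproduced in a few lines of userland C. The limit SEMOPM_SKETCH and both function names are hypothetical. Converting an out-of-range unsigned value to int is implementation-defined, and the sketch assumes the common behavior where it wraps to a negative number.

```c
#include <assert.h>	/* used by the accompanying checks only */
#include <limits.h>

#define SEMOPM_SKETCH	100	/* hypothetical per-call operation limit */

/* Old shape: the count is held in an int before the bounds check. */
int
nsops_ok_signed_sketch(unsigned int requested)
{
	int nsops = (int)requested;	/* values above INT_MAX go negative */

	return (nsops <= SEMOPM_SKETCH);
}

/* Fixed shape: the count stays unsigned, so large values are rejected. */
int
nsops_ok_unsigned_sketch(unsigned int requested)
{
	unsigned int nsops = requested;

	return (nsops <= SEMOPM_SKETCH);
}
```

A request of INT_MAX + 1 slips through the signed check because a negative number is below the limit, while the unsigned check rejects it before the value can be used to size a copy.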
Kris Kennaway
bf61e26696 Fix some signed/unsigned integer confusion, and add bounds checking of
arguments to some functions.

Obtained from:	NetBSD
Reviewed by:	peter
MFC after:	2 weeks
2001-09-10 11:28:07 +00:00
Peter Wemm
ed6c38886e Fix a warning. l_name is managed by us and is malloc/free'ed.
It is the userland declaration of l_name that is inconvenient for us.
2001-09-10 07:53:04 +00:00
Peter Wemm
e414d9aad7 Add on UPAGES to ki_rssize since it is there as result of the process
and can be swapped out with the process.
2001-09-10 07:29:32 +00:00
Peter Wemm
eb30c1c0b9 Rip some well duplicated code out of cpu_wait() and cpu_exit() and move
it to the MI area.  KSE touched cpu_wait() which had the same change
replicated five ways for each platform.  Now it can just do it once.
The only MD parts seemed to be dealing with fpu state cleanup and things
like vm86 cleanup on x86.  The rest was identical.

XXX: ia64 and powerpc did not have cpu_throw(), so I've put a functional
stub in place.

Reviewed by:	jake, tmm, dillon
2001-09-10 04:28:58 +00:00
Matthew Dillon
06ae1e91c4 This brings in a Yahoo coredump patch from Paul, with additional mods by
me (addition of vn_rdwr_inchunks).  The problem Yahoo is solving is that
if you have large process images core dumping, or you have a large number of
forked processes all core dumping at the same time, the original coredump code
would leave the vnode locked throughout.  This can cause the directory vnode
to get locked up, which can cause the parent directory vnode to get locked
up, and so on all the way to the root node, locking the entire machine up
for extremely long periods of time.

This patch solves the problem in two ways.  First it uses an advisory
non-blocking lock to abort multiple processes trying to core to the same
file.  Second (my contribution) it chunks up the writes and uses bwillwrite()
to avoid holding the vnode locked while blocking in the buffer cache.

Submitted by:	ps
Reviewed by:	dillon
MFC after:	2 weeks
2001-09-08 20:02:33 +00:00
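The chunking idea from the commit above can be sketched in userland terms. This is not vn_rdwr_inchunks() itself: the callback type, the chunk size CHUNK_SKETCH, and the counting callback are all hypothetical. Each callback invocation stands in for one lock, write, unlock cycle, so that nothing is held across the whole transfer.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_SKETCH	4096	/* hypothetical chunk size in bytes */

/* Performs I/O on one chunk; returns 0 on success or an error number. */
typedef int (*chunk_io_fn)(const char *base, size_t len, void *ctx);

/* Example callback state: counts chunks and bytes instead of doing I/O. */
struct chunk_stats {
	size_t chunks;
	size_t bytes;
};

int
count_chunk(const char *base, size_t len, void *ctx)
{
	struct chunk_stats *s = ctx;

	(void)base;
	s->chunks++;
	s->bytes += len;
	return (0);
}

/* Split one large transfer into bounded pieces, stopping on error. */
int
rdwr_inchunks_sketch(const char *base, size_t len, chunk_io_fn io, void *ctx)
{
	int error = 0;

	assert(io != NULL);
	while (len > 0 && error == 0) {
		size_t n = len < CHUNK_SKETCH ? len : CHUNK_SKETCH;

		/* In the kernel, the vnode lock would be held only here. */
		error = io(base, n, ctx);
		base += n;
		len -= n;
	}
	return (error);
}
```

Between iterations other waiters get a chance to acquire the lock, which is the property the commit relies on to keep a large core dump from stalling directory lookups.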
John Baldwin
df53e91c18 Call sendsig() with the proc lock held and return with it held. 2001-09-06 22:20:41 +00:00
Peter Wemm
fc8b64e494 Sigh. Dig up text from a signature in a 1994 Usenet post I made and redo
the ..uhh... ``console test'' to avoid another 50 emails about GPL issues.
2001-09-05 23:51:06 +00:00
David E. O'Brien
faf73940c6 Fix the definition generation code from rev 1.15 that generates non-style(9)
compliant structure definitions.
2001-09-05 01:27:53 +00:00
Ian Dowse
7476f7e87d Fix a memory leak in __getcwd() that can occur after a filesystem
has been forcibly unmounted. If the filesystem root vnode is reached
and it has no associated mountpoint (vp->v_mount == NULL), __getcwd
would return without freeing 'buf'. Add the missing free() call.

PR:		kern/30306
Submitted by:	Mike Potanin <potanin@mccme.ru>
MFC after:	1 week
2001-09-04 19:03:47 +00:00
Peter Wemm
c92c4c8f79 Unindent a if (1) { that was left behind in the last commit.
(commits were separated to not obscure the real change)
2001-09-03 04:39:38 +00:00
Peter Wemm
00dda5e82b Argh. Make the ia64 kernel work in all situations. For some reason,
and I still don't know why, this was not failing on the non-kse kernel.
It certainly should have since things were using linker_kernel_file
unconditionally.  This has highlighted a different problem though that
means that trying to do a kldload on a non-dynamic kernel will implode.
2001-09-03 04:37:55 +00:00
David E. O'Brien
6533ba2e33 Match the declaration in net/netisr.h.
Submitted by:	gcc 3.0.1
2001-09-03 03:24:31 +00:00
Peter Wemm
772121fd11 The !RESTARTABLE_PANICS code has some loose ends. 2001-09-02 12:24:38 +00:00
Peter Wemm
ef4181d98e For ia64, set the default elf brand to be FreeBSD. This is temporarily
necessary only for as long as we're using a linux toolchain.
2001-09-02 12:23:08 +00:00
John Baldwin
e342cd279f Use sched_lock to protect rtp_to_pri() and pri_to_rtp() when needed. 2001-09-02 01:05:36 +00:00
John Baldwin
51b4eed974 Protect pri_to_rtp() with sched_lock when needed. 2001-09-02 00:52:11 +00:00
Chris D. Faulhaber
dbb14f9874 In the case of ACL_OTHER and undefined ACL entry id's, set
ae_id to ACL_UNDEFINED_ID instead of 0.

Reviewed by:	rwatson
2001-09-01 23:16:02 +00:00
John Baldwin
da3abba462 Remove #if 0'd remnants of the old idle page zeroing. 2001-09-01 20:17:43 +00:00
Matthew Dillon
c8b8bac3ed Regenerate syscalls 2001-09-01 19:37:41 +00:00
Matthew Dillon
257d198890 Synchronize syscalls.master(s) with recent Giant pushdown work 2001-09-01 19:36:48 +00:00
Matthew Dillon
ad2edad94e Giant Pushdown:
read() pread() readv() write () pwrite() writev() ioctl() select ()
    poll() openbsd_poll()
2001-09-01 19:34:23 +00:00
Matthew Dillon
835a82ee2d Giant Pushdown. Saved the worst P4 tree breakage for last.
reboot() getpriority() setpriority() rtprio() osetrlimit() ogetrlimit()
    setrlimit() getrlimit() getrusage() getpid() getppid() getpgrp()
    getpgid() getsid() getgid() getegid() getgroups() setsid() setpgid()
    setuid() seteuid() setgid() setegid() setgroups() setreuid() setregid()
    setresuid() setresgid() getresuid() getresgid () __setugid() getlogin()
    setlogin() modnext() modfnext() modstat() modfind() kldload() kldunload()
    kldfind() kldnext() kldstat() kldfirstmod() kldsym() getdtablesize()
    dup2() dup() fcntl() close() ofstat() fstat() nfsstat() fpathconf()
    flock()
2001-09-01 19:04:37 +00:00
Matthew Dillon
fb99ab8811 Giant Pushdown
clock_gettime() clock_settime() nanosleep() settimeofday()
adjtime() getitimer() setitimer() __sysctl() ogetkerninfo()
sigaction() osigaction() sigpending() osigpending() osigvec()
osigblock() osigsetmask() sigsuspend() osigsuspend() osigstack()
sigaltstack() kill() okillpg() trapsignal() nosys()
2001-09-01 18:19:21 +00:00
Matthew Dillon
6f1e8c186f Pushdown Giant for: profil(), ntp_adjtime(), ogethostname(),
osethostname(), ogethostid(), osethostid()
2001-09-01 05:47:58 +00:00
Matthew Dillon
234216ef98 Giant pushdown sys_exit(), [o]wait(), wait4() 2001-09-01 04:37:34 +00:00
Matthew Dillon
f708f4d189 Giant Pushdown ACL syscalls:
__acl_get_file, __acl_set_file, __acl_get_fd, __acl_set_fd,
	__acl_delete_file, __acl_delete_fd, __acl_aclcheck_file,
	__acl_aclcheck_fd
2001-09-01 04:33:22 +00:00
Matthew Dillon
f7b200fd2f regenerate syscalls 2001-09-01 03:56:12 +00:00
Matthew Dillon
918c3b1361 Make yield() MPSAFE.
Synchronize syscalls.master with all MPSAFE changes to date.  The
regenerated syscall files follow in the next commit because yield() will
panic if it is out of sync with syscalls.master.
2001-09-01 03:54:09 +00:00
Matthew Dillon
116734c4d1 Pushdown Giant for acct(), kqueue(), kevent(), execve(), fork(),
vfork(), rfork(), jail().
2001-09-01 03:04:31 +00:00
Matthew Dillon
2afac34da3 Make various posix4 system calls MPSAFE (will fixup syscalls.master later)
sched_setparam()
    sched_getparam()
    sched_setscheduler()
    sched_getscheduler()
    sched_yield()
    sched_get_priority_max()
    sched_get_priority_min()
    sched_rr_get_interval()
2001-08-31 22:34:40 +00:00
Robert Watson
93f4fd1cb6 o Screw over users of the kern.{security.,}suser_permitted sysctl again,
by renaming it to kern.security.suser_enabled.  This makes the name
  consistent with other use: "permitted" now refers to a specific right
  or privilege, whereas "enabled" refers to a feature.  As this hasn't
  been MFC'd, and using this destroys a running system currently, I believe
  the user base of the sysctl will not be too unhappy.
o While I'm at it, un-staticize and export the supporting variable, as it
  will be used by kern_cap.c shortly.

Obtained from:	TrustedBSD Project
2001-08-31 21:44:12 +00:00
Matthew Dillon
df9987602f Giant pushdown syscalls in kern/uipc_syscalls.c. Affected calls:
recvmsg(), sendmsg(), recvfrom(), accept(), getpeername(), getsockname(),
socket(), connect(), accept(), send(), recv(), bind(), setsockopt(), listen(),
sendto(), shutdown(), socketpair(), sendfile()
2001-08-31 00:37:34 +00:00
Matthew Dillon
b6a4b4f9ae Giant Pushdown: sysv shm, sem, and msg calls. 2001-08-31 00:02:18 +00:00
Matthew Dillon
356861db03 Remove the MPSAFE keyword from the parser for syscalls.master.
Instead introduce the [M] prefix to existing keywords.  e.g.
MSTD is the MP SAFE version of STD.  This is preparatory for a
massive Giant lock pushdown.  The old MPSAFE keyword made
syscalls.master too messy.

Begin MP-safe procedures with the comment:
/*
 * MPSAFE
 */
This comment means that the procedure may be called without
Giant held (The procedure itself may still need to obtain
Giant temporarily to do its thing).

sv_prepsyscall() is now MP SAFE and assumed to be MP SAFE
sv_transtrap() is now MP SAFE and assumed to be MP SAFE

ktrsyscall() and ktrsysret() are now MP SAFE (Giant Pushdown)
trapsignal() is now MP SAFE (Giant Pushdown)

Places which used to do the if (mtx_owned(&Giant)) mtx_unlock(&Giant)
test in syscall[2]() in */*/trap.c now do not.  Instead they
explicitly unlock Giant if they previously obtained it, and then
assert that it is no longer held to catch broken system calls.

Rebuild syscall tables.
2001-08-30 18:50:57 +00:00
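The dispatch rule described in this commit can be modelled in userland. What follows is only a minimal sketch, not the code from */*/trap.c: the `toy_` names, the flag-based lock, and the single-integer argument are all hypothetical. It shows the dispatcher taking the big lock only for entries that are not marked MP safe, releasing it only if it took it, and then asserting that nothing left the lock held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for Giant: a flag instead of a real mutex. */
struct toy_lock { int held; };
static struct toy_lock toy_giant;

static void toy_giant_lock(void)   { assert(toy_giant.held == 0); toy_giant.held = 1; }
static void toy_giant_unlock(void) { assert(toy_giant.held == 1); toy_giant.held = 0; }

/* Hypothetical syscall table entry: mpsafe plays the role of the [M] prefix. */
struct toy_sysent {
	bool mpsafe;
	int (*call)(int);
};

/* An MP-safe call runs without the big lock. */
static int toy_mpsafe_call(int arg) { return (arg); }

/* A legacy call expects the dispatcher to hold the big lock for it. */
static int toy_legacy_call(int arg)
{
	assert(toy_giant.held == 1);
	return (arg + 1);
}

static int toy_syscall(const struct toy_sysent *se, int arg)
{
	bool locked = false;
	int ret;

	if (!se->mpsafe) {
		toy_giant_lock();
		locked = true;
	}
	ret = se->call(arg);
	/* Unlock only if this dispatcher obtained the lock itself. */
	if (locked)
		toy_giant_unlock();
	/* Catch a broken call that returned with the lock still held. */
	assert(toy_giant.held == 0);
	return (ret);
}
```

Tracking `locked` locally, rather than testing ownership after the call, is what lets the final assertion flag a call that acquired the lock and forgot to drop it.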
Andrey A. Chernov
c8e7634357 advlock: simplify overflow checks 2001-08-29 18:53:53 +00:00
Andrey A. Chernov
63347f1e8f lseek: simplify overflow checks 2001-08-29 18:35:53 +00:00
Robert Watson
3c4543e046 o Reduce gratuitous whitespace difference from Darwin. 2001-08-29 17:18:04 +00:00
Peter Wemm
df55753880 Fix the ogetkerninfo() syscall handling of sizes for
KINFO_BSDI_SYSINFO.  This supposedly fixes Netscape 3.0.4 (bsdi binary)
on -current.  (and is also applicable to RELENG_4)

PR:		25476
Submitted by:	Philipp Mergenthaler <un1i@rz.uni-karlsruhe.de>
2001-08-29 11:47:53 +00:00
Brian Somers
546a92c4d4 OR M_WAITOK with M_ZERO in malloc()s args for clarity. 2001-08-28 23:58:32 +00:00
Robert Watson
7fd6a9596d o Improve the style of a number of routines and comments in kern_prot.c,
with regard to redundancy, formatting, and style(9).

Submitted by:	bde
2001-08-28 16:35:33 +00:00
Robert Watson
4bcbade869 Fix typos in recent comments.
Submitted by:	dd
2001-08-28 05:16:19 +00:00
Robert Watson
3b243b7292 Generally improve documentation of kern_prot.c:
o Add comments for:
  - kern.security.suser_permitted
  - p_cansee()
  - p_cansignal()
  - p_cansched()
  - kern.security.unprivileged_procdebug_permitted
  - p_candebug()

Update copyright.

Obtained from:	TrustedBSD
2001-08-27 16:01:52 +00:00
Peter Wemm
0f7289022b If a file has been completely unlinked, stop automatically syncing the
file.  ffs will discard any pending dirty pages when it is closed,
so we may as well not waste time trying to clean them.  This doesn't
stop other things from writing it out, eg: pageout, fsync(2) etc.
2001-08-27 06:09:56 +00:00
Andrey A. Chernov
c4778eed9f Cosmetique & style fixes from bde 2001-08-26 10:23:49 +00:00
Peter Wemm
268bdb43f9 Optionize UPAGES for the i386. As part of this I split some of the low
level implementation stuff out of machine/globaldata.h to avoid exposing
UPAGES to lots more places.  The end result is that we can double
the kernel stack size with 'options UPAGES=4' etc.

This is mainly being done for the benefit of a MFC to RELENG_4 at some
point.  -current doesn't really need this so much since each interrupt
runs on its own kstack.
2001-08-25 02:20:02 +00:00
Bosko Milekic
76dcbd6f9f Force a commit on kern_mutex.c to explain reason for last commit but while
I'm at it also add a comment in mtx_validate() explaining the purpose
of the last change.

Basically, this fixes booting kernels compiled with MUTEX_DEBUG. What used
to happen is before we setidt from init386() [still using BTX idt], we
called mtx_init() on several mutex locks, notably Giant and some others.
This is a problem for MUTEX_DEBUG because it enables mtx_validate() which
calls kernacc(), some of which in turn requires Giant.
Fix by calling kernacc() from mtx_validate() only if (!cold).
2001-08-24 23:00:59 +00:00
Bosko Milekic
ab07087e16 *** empty log message *** 2001-08-24 22:53:45 +00:00
John Baldwin
6385dec00e Style nits:
- Don't use punctuation or newlines in panic messages.
- Remove excess blank lines.

Requested and partially submitted by:	bde
2001-08-24 17:46:58 +00:00
Peter Pentchev
ccdbd10cb7 Prevent passing a null pointer as a filename to vn_open(),
if for some reason expand_name() failed to build a core file name.

PR:		29931
Submitted by:	Foldi Tamas <crow@kapu.hu>
Reviewed by:	dd, -arch
MFC after:	1 month
2001-08-24 15:49:30 +00:00
Andrey A. Chernov
dc6e1079e6 Remove extra check, unneeded now 2001-08-24 10:20:26 +00:00
Robert Watson
670f6b2fc6 o Clarify comments in vaccess_acl_posix1e() ACL evaluation routine so
as to improve readability and accuracy.

Obtained from:	TrustedBSD Project
2001-08-24 01:41:42 +00:00
John Baldwin
b0b7cb508c Use witness_upgrade/downgrade for sx_try_upgrade/downgrade. 2001-08-23 22:51:22 +00:00
John Baldwin
c19fe5e261 Add witness_upgrade() and witness_downgrade() for handling upgrades and
downgrades of shared/exclusive locks.
2001-08-23 22:47:05 +00:00
John Baldwin
d7c4536a55 Convert some KASSERT()'s into if (foo) panic() because they are testing
how locks are managed by the rest of the kernel, not verifying the internal
integrity of witness itself.
2001-08-23 22:44:47 +00:00
John Baldwin
1432aa0c5e Add a new kernel option RESTARTABLE_PANICS. If this option is present,
then one can restart from a panic by resetting the panicstr variable to
NULL.  This commit conditionalizes the previously committed functionality
on this variable.  It also removes the __dead2 attribute from the panic()
function so that when one continues from a panic() the behavior will
be predictable.
2001-08-23 20:32:21 +00:00
John Baldwin
e2870579fa Clear the sx_xholder pointer when downgrading an exclusive lock. 2001-08-23 17:57:37 +00:00
Andrey A. Chernov
5d97bedb22 vn_stat(): if va_size (u_quad_t) > OFF_MAX, return EOVERFLOW, don't copy it
blindly to st_size
2001-08-23 17:56:48 +00:00
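The range check described above can be sketched in portable C. This is an illustration under assumptions, not the vn_stat() code: `sketch_size_to_off` and `SKETCH_OFF_MAX` are hypothetical names, with `uint64_t` standing in for u_quad_t and `int64_t` for off_t. The comparison is done in the unsigned type, so a size above the signed maximum is rejected before it can be copied and turn negative.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for OFF_MAX, assuming a 64-bit signed off_t. */
#define SKETCH_OFF_MAX INT64_MAX

/*
 * Copy an unsigned file size into a signed offset field.
 * Returns 0 on success, or EOVERFLOW (leaving *st_size untouched)
 * when the size cannot be represented.
 */
static int
sketch_size_to_off(uint64_t va_size, int64_t *st_size)
{
	/* Compare as unsigned: a signed comparison could never be true. */
	if (va_size > (uint64_t)SKETCH_OFF_MAX)
		return (EOVERFLOW);
	*st_size = (int64_t)va_size;
	return (0);
}
```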
Andrey A. Chernov
6fb9fbceab Add yet another check for SEEK_END overflow 2001-08-23 17:09:23 +00:00
Andrey A. Chernov
db106eff39 lseek: fix check for vattr.va_size overflow. The check suggested by bde
simply does not work with unsigned types.
2001-08-23 17:01:25 +00:00
Andrey A. Chernov
62be011ebd Oops, fix my broken handling of new l_len<0 case 2001-08-23 16:00:27 +00:00
Andrey A. Chernov
f510e1c2ec Originally BSD returned EINVAL for l_len < 0, but POSIX now requires
it to be supported, so implement POSIX l_len < 0 handling.
2001-08-23 15:40:30 +00:00
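For reference, POSIX defines a negative l_len as locking the bytes from l_start + l_len through l_start - 1. The sketch below illustrates that rule together with an EOVERFLOW check on the end offset; it is not the kernel's advisory-lock code. `sketch_lock_range` is a hypothetical helper, `int64_t` stands in for off_t, `start` is assumed to be already resolved to an absolute offset, and a `last` of -1 is used here to mean "to end of file".

```c
#include <errno.h>
#include <stdint.h>

/*
 * Turn (start, len) into an inclusive byte range [*first, *last].
 * Returns 0 on success; on error the outputs are left untouched.
 */
static int
sketch_lock_range(int64_t start, int64_t len, int64_t *first, int64_t *last)
{
	if (start < 0)
		return (EINVAL);
	if (len < 0) {
		/* Negative length: the range ends just before start. */
		if (start + len < 0)	/* cannot overflow, since start >= 0 */
			return (EINVAL);
		*first = start + len;
		*last = start - 1;
	} else if (len == 0) {
		/* Zero length: lock from start to end of file. */
		*first = start;
		*last = -1;
	} else {
		/* Check start + len - 1 without overflowing. */
		if (len - 1 > INT64_MAX - start)
			return (EOVERFLOW);
		*first = start;
		*last = start + len - 1;
	}
	return (0);
}
```

With start = 100 and len = -10, the locked range is bytes 90 through 99.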
Andrey A. Chernov
6d24c65d96 Cosmetique: correct English in comments
Pointed by:	bde
2001-08-23 14:41:39 +00:00
Andrey A. Chernov
b82f5b624c Cosmetique: move <sys/*> into one group, separate include families by
blank line
2001-08-23 13:51:17 +00:00
Andrey A. Chernov
b44af710d3 Move <machine/*> after <sys/*>
Pointed by:	bde
2001-08-23 13:21:17 +00:00
Andrey A. Chernov
4b207d9868 Move <machine/*> after <sys/*>
Add missing fdrop() before EOVERFLOW

Pointed by:	bde
2001-08-23 13:19:32 +00:00
Andrey A. Chernov
69cc1d0d7f Detect off_t EOVERFLOW in start/end offset calculations for adv. lock,
as POSIX requires.
2001-08-23 07:42:40 +00:00
Thomas Moestl
040ef07af8 Regenerate from syscalls.master using the new makesyscalls.sh revision. 2001-08-22 23:27:20 +00:00
Thomas Moestl
a4189a088b Add padding before each element of the syscall argument structures in
sysproto.h in addition to the existing padding afterwards.
This is needed to support big-endian architectures like sparc64.

Reviewed by:	bde
Tested on alpha by:	jhb
2001-08-22 23:22:47 +00:00
Alexander Langer
b8c526df70 Fix a simple typo I just happened to find. 2001-08-22 19:12:24 +00:00
Matthew Dillon
0cf5e0ebd6 Remove the code that limited the buffer_map to 1/2 the size of the
kernel_map.  maxbcache takes care of this now and the 1/2 limit can
interfere with testing.

Suggested by: bde
2001-08-22 18:10:37 +00:00