Commit Graph

4132 Commits

Author SHA1 Message Date
Matthew Dillon
b5810bab2d After extensive testing it has been determined that adding complexity
to avoid removing higher level directory vnodes from the namecache has
no perceivable effect and will be removed.  This is especially true
when vmiodirenable is turned on, which it is by default now.  ( vmiodirenable
makes a huge difference in directory caching ).  The vfs.vmiodirenable and
vfs.nameileafonly sysctls have been left in to allow further testing, but
I expect to rip out vfs.nameileafonly soon too.

I have also determined through testing that the real problem with numvnodes
getting too large is due to the VM Page cache preventing the vnode from
being reclaimed.  The directory stuff made only a tiny dent relative
to Poul's original code, enough so that some tests succeeded.  But tests
with several million small files show that the bigger problem is the VM Page
cache.  This will have to be addressed by a future commit.

MFC after:	3 days
2001-10-01 04:33:35 +00:00
Jonathan Lemon
1a6fc8ef63 When FREE()ing kqueue related structures, charge them to the correct bucket.
Submitted by: iedowse
Forgotten by: jlemon
2001-09-30 17:00:56 +00:00
Bosko Milekic
70a61707f6 Re-enable mbtypes statistics in the mbuf allocator. I disabled these
when I changed the allocator bits. This implements per-CPU mbtypes
stats by keeping net number of decrements/increments of a given mbtype
per-CPU and then summing all of the per-CPU mbtypes to produce the total
net number of allocated mbufs of the given mbtype.
Counters are carefully balanced to avoid/prevent underflows/overflows.

mbtypes stats are re-enabled with the idea that we may occasionally
(although very rarely) observe slight inconsistencies in the stat
reporting. Most of the time, we should be fine, though.

Also make appropriate modifications to netstat(1) and systat(1) to do
the necessary reporting.

Submitted by: Jiangyi Liu <jyliu@163.net>
2001-09-30 01:58:39 +00:00
Jonathan Lemon
0217f5c71e Have EVFILT_TIMERS allocate their callouts via malloc() instead of using
the static callout list allocated by the system.

Change malloc type from M_TEMP to M_KQUEUE to better track memory.

Add a kern.kq_calloutmax to globally limit the amount of kernel memory
that can be allocated by callouts.

Submitted by: iedowse  (items 1, 2)
2001-09-29 17:48:39 +00:00
Dag-Erling Smørgrav
5b6db47748 Add a couple of API functions I need for my pseudofs WIP. Documentation
will follow when I've decided whether to keep this API or ditch it in
favor of something slightly more subtle.
2001-09-29 00:32:46 +00:00
Marcel Moolenaar
4166877345 Make the NODEF type usable. A syscall of type NODEF will only
have its entry in the syscall table added. Nothing else is
done. This differs from type NOPROTO in that NOPROTO adds a
definition to syscall.h besides adding a sysent. A syscall can
now have multiple entries without conflict. Note that the
argssize is fixed and depends on the syscall name.
2001-09-28 01:21:57 +00:00
Robert Watson
87fce2bb96 o When performing a securelevel check as part of securelevel_ge() or
securelevel_gt(), determine first if a local securelevel exists --
  if so, perform the check based on imax(local, global).  Otherwise,
  simply use the global value.
o Note: even though local securelevels might lag below the global one,
  if the global value is updated to higher than local values, maximum
  will still be used, making the global dominant even if there is local
  lag.

Obtained from:	TrustedBSD Project
2001-09-26 20:41:48 +00:00
Robert Watson
8a528812a0 o Modify kern.securelevel MIB entry to return a local securelevel, if
one is present in the current jail, otherwise, to return the global
  securelevel.
o If the securelevel is being updated, require that it be greater than
  the maximum of local and global, if a local securelevel exists,
  otherwise, just maximum of the global.  If there is a local
  securelevel, update the local one instead of the global one.
o Note: this does allow local securelevels to lag behind the global one
  as long as the local one is not updated following a global increase.

Obtained from:	TrustedBSD Project
2001-09-26 20:39:48 +00:00
Robert Watson
567931c8f6 o Initialize per-jail securelevel from global securelevel as part of
jail creation.

Obtained from:	TrustedBSD Project
2001-09-26 20:37:15 +00:00
Robert Watson
d501d04b9e o Modify static settime() to accept the proc * for the process requesting
a time change, and callers so that they provide td->td_proc.
o Modify settime() to use securevel_gt() for securelevel checking.

Obtained from:	TrustedBSD Project
2001-09-26 19:53:57 +00:00
Robert Watson
c2f413af19 o Modify sysctl access control check to use securelevel_gt(), and
clarify sysctl access control logic.

Obtained from:	TrustedBSD Project
2001-09-26 19:51:25 +00:00
Matthew Dillon
46cad5761c Enable vmiodirenable by default. Remove incorrect comment from sysctl.conf.
MFC after:	1 week
2001-09-26 19:35:04 +00:00
Matthew Dillon
3418ebebfe Make uio_yield() a global. Call uio_yield() between chunks
in vn_rdwr_inchunks(), allowing other processes to gain an exclusive
lock on the vnode.  Specifically: directory scanning, to avoid a race to the
root directory, and multiple child processes coring simultaniously so they
can figure out that some other core'ing child has an exclusive adv lock and
just exit instead.

This completely fixes performance problems when large programs core.  You
can have hundreds of copies (forked children) of the same binary core all
at once and not notice.

MFC after:	3 days
2001-09-26 06:54:32 +00:00
Paul Saab
88b1d98f31 Lock the vnode while truncating the corefile. This fixes a panic
with softupdates dangling deps.

Submitted by:	peter
MFC:		ASAP :)
2001-09-26 01:24:07 +00:00
John Baldwin
21377ce065 Remove superflous parens after de-macroizing. 2001-09-26 00:05:18 +00:00
Robert Watson
75bc5b3f22 o So, when <dd> e-mailed me and said that the comment was inverted
for securelevel_ge() and securelevel_gt(), I was a little surprised,
  but fixed it.  Turns out that it was the code that was inverted, during
  a whitespace cleanup in my commit tree.  This commit inverts the
  checks, and restores the comment.
2001-09-25 21:08:33 +00:00
John Baldwin
dde96c9933 Since we no longer inline any debugging code in the mutex operations, move
all the debugging code into the function versions of the mutex operations
in kern_mutex.c.  This reduced the __mtx_* macros to simply wrappers of
the _{get,rel}_lock_* macros, so the __mtx_* macros were also abolished in
favor of just calling the _{get,rel}_lock_* macros.  The tangled hairy mass
of macros calling macros is at least a bit more sane now.
2001-09-22 21:19:55 +00:00
Robert Watson
b4799065ef o vpaccess() -> vn_access() -- Peter reminds me that there is already
a convention for vnop helper routines of this sort.

Submitted by:	Mr Wemm <peter>
2001-09-22 03:07:41 +00:00
John Baldwin
ed01445d8f Use the passed in thread to selrecord() instead of curthread. 2001-09-21 22:46:54 +00:00
John Baldwin
456ca585db Use the passed in thread pointer instead of curthread in calls to
selrecord() in ptcpoll().  The pre-KSE code used the passed in proc pointer
rather than curproc, and an earlier seltrue() call uses the passed in
thread and not curthread.
2001-09-21 22:22:25 +00:00
John Baldwin
fea2ab833e The P_SELECT flag was moved from p->p_flag to td->td_flags, but p_flag
was locked by the proc lock and td_flags is locked by the sched_lock.
The places that read, set, and cleared TDF_SELECT weren't updated, so they
read and modified td_flags w/o holding the sched_lock, meaning that they
could corrupt the per-thread flags field.  As an immediate band-aid,
grab sched_lock while reading and manipulating td_flags in relation to
TDF_SELECT.  This will probably be cleaned up some later on.
2001-09-21 22:06:22 +00:00
John Baldwin
e649bcb506 Remove unneeded proc variables and fix comments. 2001-09-21 21:54:45 +00:00
Robert Watson
a90a3f2882 o Part two of eaccess(2) commit, rebuilt system call code.
Obtained from:	TrustedBSD Project
2001-09-21 21:34:06 +00:00
Robert Watson
9c94f7731e o Introduce eaccess(2), a version of access(2) that uses the effective
credentials rather than the real credentials.  This is useful for
  implementing GUI's which need to modify icons based on access rights,
  but where use of open(2) is too expensive, use of stat(2) doesn't
  reflect the file system's real protection model, and use of
  access() suffers from real/effective credential confusion.  This
  implementation provides the same semantics as the call of the same
  name on SCO OpenServer.  Note: using this call improperly can
  leave you subject to some of the same races present in the
  access(2) call.
o To implement this, break out the basic logic of access(2) into
  vpaccess(), which accepts a passed credential to perform the
  invocation of VOP_ACCESS().  Add eaccess(2) to invoke vpaccess(),
  and modify access(2) to use vpaccess().

Obtained from:	TrustedBSD Project
2001-09-21 21:33:22 +00:00
John Baldwin
278da5113f Remove a bogus comment. "atomic" doesn't mean that the operation is done
as a physical atomic operation.  That would require the code to use the
atomic API, which it does not.  Instead, the operation is made psuedo
atomic (hence the quotes) by use of the lock to protect clearing all of the
flags in question.
2001-09-21 19:26:57 +00:00
John Baldwin
21832b1ec0 GC some #if 0'd code. 2001-09-21 19:21:18 +00:00
John Baldwin
3226cbf43b Whitespace and spelling fixes. 2001-09-21 19:16:12 +00:00
Michael Reifenberger
896de692f8 Make msgseg, msgssz (->msgmax) and msgmni TUNABLE. 2001-09-21 09:25:17 +00:00
Peter Wemm
1114d18594 Add a pointer to kenv(1). 2001-09-21 02:25:53 +00:00
Jonathan Lemon
57ea1fa07f Revert last commit. The same functionality can be obtained through the
'kenv' command, which I obviously was unaware of.
2001-09-21 02:09:01 +00:00
Robert Watson
94088977c9 o Rename u_cansee() to cr_cansee(), making the name more comprehensible
in the face of a rename of ucred to cred, and possibly generally.

Obtained from:	TrustedBSD Project
2001-09-20 21:45:31 +00:00
Jonathan Lemon
e492f03505 Add a sysctl MIB 'kern.env', that dumps the contents of the kernel
environment from the loader, as well as the kernel's compiled in static
hints.
2001-09-20 20:09:37 +00:00
Peter Wemm
fbd7a9dd97 decrement the dumping variable after use so we can call it several times
if needed.
2001-09-20 06:08:53 +00:00
John Baldwin
a44f918bf9 Fix a bug in propagate priority: the kse group pointer wasn't being
updated in the loop so the new thread always seemd to have the same
priority as the original thread and no actual priorities were changed.
2001-09-19 22:52:59 +00:00
Robert Watson
288b789333 o Clarification of securelevel_{ge,gt} comment.
Submitted by:	dd
2001-09-19 14:09:13 +00:00
Peter Wemm
66f769fe39 Add missing ; in last commit
Pointy-hat-to: jhb
2001-09-19 02:53:59 +00:00
Peter Wemm
98cdde71e7 Regenerate 2001-09-18 23:33:33 +00:00
Peter Wemm
eb25edbda3 Cleanup and split of nfs client and server code.
This builds on the top of several repo-copies.
2001-09-18 23:32:09 +00:00
John Baldwin
9ef3a9855d Use a 'p' variable instead of repetitively indirecting td->td_proc for
signal things that are still per-process and won't be per-thread.
2001-09-18 23:27:06 +00:00
John Baldwin
8cc06751dd Don't initialize proc0's mutex twice. It is already done earlier on in the
MD startup code.
2001-09-18 22:09:47 +00:00
Robert Watson
3ca719f12e o Introduce two new calls, securelevel_gt() and securelevel_ge(), which
abstract the securelevel implementation details from the checking
  code.  The call in -CURRENT accepts a struct ucred--in -STABLE, it
  will accept struct proc.  This facilitates the upcoming commit of
  per-jail securelevel support.  The calls will also generate a
  kernel printf if the calls are made with NULL ucred/proc pointers:
  generally speaking, there are few instances of this, and they should
  be fixed.
o Update p_candebug() to use securelevel_gt(); future updates to the
  remainder of the kernel tree will be committed soon.

Obtained from:	TrustedBSD Project
2001-09-18 21:03:53 +00:00
Mark Peek
796ed2a6d0 Set debug information on the process being traced, not the current (debugger)
process. This should allow gdb to function correctly on post-KSE kernels.
2001-09-18 19:06:11 +00:00
Jonathan Lemon
6a494eeb34 Change p into ke->ke_proc, this was hidden behind INVARIANTS. 2001-09-18 03:36:21 +00:00
Peter Wemm
d2718e479a Fix a fatal type mismatch (char *static_env; vs char static_env[]).
Submitted by:	bde
2001-09-17 21:27:41 +00:00
Julian Elischer
fdd4e5c652 Replace line accidentally deleted during KSE additions.
Symptom.. Stopped program unable to be restarted if it was stopped
while already sleeping.
2001-09-17 20:42:25 +00:00
Robert Watson
9844fbc3b5 o Correct authorization check in CANSIGIO(), which suffered from incorrect
transcription during the (pcred,ucred) merge; this was not used for
  the kill() system call, so does not affect direct explicit process
  signalling.

Pointed out by:	fenner
2001-09-15 22:34:46 +00:00
Peter Wemm
b711616825 In the devfs case, have initproc attempt the easy cases of mounting /dev.
This works if /dev exists, or if / is read/write (nfsroot).  If it is
too hard, leave it up to init -d (which will probably fail if /dev does
not exist, but there isn't much else we can do short of making a union
mount on /).

This means we get a proper /dev if you boot a 5.x kernel on a 4.x world,
which I happen to do often (the ramdisks on our install netboot servers
have 4.x userland worlds on them).
2001-09-15 11:15:22 +00:00
Doug Rabson
de1792cbb8 The ia64 kernel is now linked dynamically so parse its _DYNAMIC structure. 2001-09-15 11:02:10 +00:00
John Baldwin
bce9841972 Fix locking on td_flags for TDF_DEADLKTREAT. If the comments in the code
are true that curthread can change during this function, then this flag
needs to become a KSE flag, not a thread flag.
2001-09-13 22:33:37 +00:00
Michael Reifenberger
d528be2bf3 PR: kern/29698 (part)
Reviewed by:	audit
Implement SEM_STAT (like IPC_STAT but treats semid as sema-index).
The linuxerator will need it.
2001-09-13 21:06:41 +00:00