Commit Graph

6621 Commits

Author SHA1 Message Date
Warner Losh
45cc9f5f4f Fix an extreme edge case in leap second handling. We need to call
ntp_update_second twice when we have a large step in case that step
goes across a scheduled leap second.  The only way this could happen
would be if we didn't call tc_windup over the end of day on the day of
a leap second, which would only happen if timeouts were delayed for
seconds.  While it is an edge case, it is an important one to get
right for my employer.

Sponsored by: Timing Solutions Corporation
2003-08-20 05:34:27 +00:00
Sam Leffler
c06eb4e293 Change instances of callout_init that specify MPSAFE behaviour to
use CALLOUT_MPSAFE instead of "1" for the second parameter.  This
does not change the behaviour; it just makes the intent more clear.
2003-08-19 17:51:11 +00:00
Poul-Henning Kamp
037c3d0fb0 It is not an error to have no devices in the kernel: Return the
generation number and start it from one instead of zero.
2003-08-17 12:06:19 +00:00
Bosko Milekic
b618bba486 Use constants less throughout the code and instead use the objsize
variable.  This makes changing the size of an mbuf or cluster for
testing/debugging/whatever purposes easier.

Submitted by: sam
2003-08-16 19:48:52 +00:00
Marcel Moolenaar
26502503e5 Further cleanup <machine/cpu.h> and <machine/md_var.h>: move the MI
prototypes of cpu_halt(), cpu_reset() and swi_vm() from md_var.h to
cpu.h. This affects db_command.c and kern_shutdown.c.

ia64: move all MD prototypes from cpu.h to md_var.h. This affects
madt.c, interrupt.c and mp_machdep.c. Remove is_physical_memory().
It's not used (vm_machdep.c).

alpha: the MD prototypes have been left in cpu.h with a comment
that they should be there. Moving them is left for later. It was
expected that the impact would be significant enough to be done in
a seperate commit.

powerpc: MD prototypes left in cpu.h. Comment added.

Suggested by: bde
Tested with: make universe (pc98 incomplete)
2003-08-16 16:57:57 +00:00
Poul-Henning Kamp
78a49a45bc Give timecounters a numeric quality field.
A timecounter will be selected when registered if its quality is
not negative and no less than the current timecounters.

Add a sysctl to report all available timecounters and their qualities.

Give the dummy timecounter a solid negative quality of minus a million.

Give the i8254 zero and the ACPI 1000.

The TSC gets 800, unless APM or SMP forces it negative.

Other timecounters default to zero quality and thereby retain current
selection behaviour.
2003-08-16 08:23:53 +00:00
John Baldwin
70fca4277e - Various style fixes in both code and comments.
- Update some stale comments.
- Sort a couple of includes.
- Only set 'newcpu' in updatepri() if we use it.
- No functional changes.

Obtained from:	bde (via an old diff I got a long time ago)
2003-08-15 21:29:06 +00:00
Marcel Moolenaar
1c843354aa Add or finish support for machine dependent ptrace requests. When we
check for permissions, do it for all requests, not the known requests.
Later when we actually service the request we deal with the invalid
requests we previously caught earlier.

This commit changes the behaviour of the ptrace(2) interface for
boundary cases such as an unknown request without proper permissions.
Previously we would return EINVAL. Now we return EBUSY or EPERM.

Platforms need to define __HAVE_PTRACE_MACHDEP when they have MD
requests. This makes the prototype of cpu_ptrace() visible and
introduces a call to this function for all requests greater or
equal to PT_FIRSTMACH.

Silence on: audit
2003-08-15 05:25:06 +00:00
John-Mark Gurney
fc8684cd46 if we got this far, we definately don't have an EBADF. Return a more
sane result of EPIPE.

Reported by:	nCircle dev team
MFC after:	3 day
2003-08-15 04:31:01 +00:00
Cameron Grant
828447e0ca add a read-only sysctl to display the number of entries in the fixed size
kobj global method table; also kassert that the table has not overflowed
when defining a new method.

there are indications that the table is being overflowed in certain
situations as we gain more kobj consumers- this will allow us to check
whether kobj is at fault.  symptoms would be incorrect methods being called.
2003-08-14 21:16:46 +00:00
Peter Grehan
eac100658a Update powerpc to use the (old thread,new thread) calling convention
for cpu_throw() and cpu_switch().
2003-08-14 03:56:24 +00:00
Alan Cox
77685ea594 - The vm_object pointer in pipe_buffer is unused. Remove it.
- Check for successful initialization of pipe_zone in pipeinit()
   rather than every call to pipe(2).
2003-08-13 20:01:38 +00:00
Warner Losh
06b4bf3e55 Expand inline the relevant parts of src/COPYRIGHT for Matt Dillon's
copyrighted files.

Approved by: Matt Dillon
2003-08-12 23:24:05 +00:00
Maxime Henrion
affd4332fd Remove extra space. 2003-08-12 20:34:31 +00:00
John Baldwin
e9911cf591 - Convert Alpha over to the new calling conventions for cpu_throw() and
cpu_switch() where both the old and new threads are passed in as
  arguments.  Only powerpc uses the old conventions now.
- Update comments in the Alpha swtch.s to reflect KSE changes.

Tested by:	obrien, marcel
2003-08-12 19:33:36 +00:00
Alan Cox
ad8204e3f5 Pipespace() no longer requires Giant. 2003-08-11 22:23:25 +00:00
Alexander Kabaev
660ebf0ef2 Drop Giant in recvit before returning an error to the caller to avoid
leaking the Giant on the syscall exit.
2003-08-11 19:37:11 +00:00
Bruce M Simpson
abd498aa71 Add the mlockall() and munlockall() system calls.
- All those diffs to syscalls.master for each architecture *are*
   necessary. This needed clarification; the stub code generation for
   mlockall() was disabled, which would prevent applications from
   linking to this API (suggested by mux)
 - Giant has been quoshed. It is no longer held by the code, as
   the required locking has been pushed down within vm_map.c.
 - Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES
   to express their intention explicitly.
 - Inspected at the vmstat, top and vm pager sysctl stats level.
   Paging-in activity is occurring correctly, using a test harness.
 - The RES size for a process may appear to be greater than its SIZE.
   This is believed to be due to mappings of the same shared library
   page being wired twice. Further exploration is needed.
 - Believed to back out of allocations and locks correctly
   (tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC).

PR:             kern/43426, standards/54223
Reviewed by:    jake, alc
Approved by:    jake (mentor)
MFC after:	2 weeks
2003-08-11 07:14:08 +00:00
Mike Silbersack
cebde06978 More pipe changes:
From alc:
Move pageable pipe memory to a seperate kernel submap to avoid awkward
vm map interlocking issues.  (Bad explanation provided by me.)

From me:
Rework pipespace accounting code to handle this new layout, and adjust
our default values to account for the fact that we now have a solid
limit on allocations.

Also, remove the "maxpipes" limit, as it no longer has a purpose.
(The limit on kva usage solves the problem of having two many pipes.)
2003-08-11 05:51:51 +00:00
Alan Cox
f9999c67be Use vm_page_hold() instead of vm_page_wire(). Otherwise, a multithreaded
application could cause a wired page to be freed.  In general,
vm_page_hold() should be preferred for ephemeral kernel mappings of pages
borrowed from a user-level address space.  (vm_page_wire() should really be
reserved for indefinite duration pinning by the "owner" of the page.)

Discussed with:	silby
Submitted by:	tegge
2003-08-11 00:17:44 +00:00
Jacques Vidrine
41b3077a6c panic() if we try to handle an out-of-range signal number in
psignal()/tdsignal().  The test was historically in psignal().  It was
changed into a KASSERT, and then later moved to tdsignal() when the
latter was introduced.

Reviewed by:	iedowse, jhb
2003-08-10 23:05:37 +00:00
Jacques Vidrine
007e25d95a Add or correct range checking of signal numbers in system calls and
ioctls.

In the particular case of ptrace(), this commit more-or-less reverts
revision 1.53 of sys_process.c, which appears to have been erroneous.

Reviewed by:	iedowse, jhb
2003-08-10 23:04:55 +00:00
Alan Cox
c6eb850aac Background: When proc_rwmem() wired and mapped a page, it also added
a reference to the containing object.  The purpose of the reference
being to prevent the destruction of the object and an attempt to free
the wired page.  (Wired pages can't be freed.)  Unfortunately, this
approach does not work.  Some operations, like fork(2) that call
vm_object_split(), can move the wired page to a difference object,
thereby making the reference pointless and opening the possibility
of the wired page being freed.

A solution is to use vm_page_hold() in place of vm_page_wire().  Held
pages can be freed.  They are moved to a special hold queue until the
hold is released.

Submitted by:	tegge
2003-08-09 18:01:19 +00:00
Alan Cox
9c62fce085 - Remove GIANT_REQUIRED from pipespace().
- Remove a duplicate initialization from pipe_create().
2003-08-08 22:38:15 +00:00
Daniel Eischen
ab908f5935 Copyin the thread mailbox flags from the correct location
in the mailbox.
2003-08-08 20:23:10 +00:00
John Baldwin
b2db7dc658 td_dupfd just needs to be less than 0, it does not have to hold the
negative value of the index of the new file, so just use -1.
2003-08-07 17:08:26 +00:00
Jacques Vidrine
01b9dc96e3 Update some argument-documenting comments to match reality.
Add an explicit range check to those same arguments to reduce risk of
cardiac arrest in future code readers.
2003-08-07 16:42:27 +00:00
John Baldwin
8b149b5131 Consistently use the BSD u_int and u_short instead of the SYSV uint and
ushort.  In most of these files, there was a mixture of both styles and
this change just makes them self-consistent.

Requested by:	bde (kern_ktrace.c)
2003-08-07 15:04:27 +00:00
John Baldwin
277576de43 The ktrace mutex does not need to be locked around the post of the ktrace
semaphore and doing so can lead to a possible reversal.  WITNESS would have
caught this if semaphores were used more often in the kernel.

Submitted by:	Ted Unangst <tedu@stanford.edu>, Dawson Engler
2003-08-07 13:58:13 +00:00
Alan Cox
f9b1de367e - Remove GIANT_REQUIRED from pipe_free_kmem().
- Remove the acquisition and release of Giant around pipe_kmem_free() and
   uma_zfree() in pipeclose().
2003-08-07 04:32:40 +00:00
Yaroslav Tykhiy
b81694ed13 If connect(2) has been interrupted by a signal and therefore the
connection is to be established asynchronously, behave as in the
case of non-blocking mode:

- keep the SS_ISCONNECTING bit set thus indicating that
  the connection establishment is in progress, which is the case
  (clearing the bit in this case was just a bug);

- return EALREADY, instead of the confusing and unreasonable
  EADDRINUSE, upon further connect(2) attempts on this socket
  until the connection is established (this also brings our
  connect(2) into accord with IEEE Std 1003.1.)
2003-08-06 14:04:47 +00:00
David Xu
75ea65e3a2 kse.h is not needed for these files. 2003-08-05 12:08:49 +00:00
David Xu
d3b5e418bc Introduce a thread mailbox flag TMF_NOUPCALL. On some architectures other
than i386 or AMD64, TP register points to thread mailbox, and they can not
atomically clear km_curthread in kse mailbox, in this case, thread retrieves
its thread pointer from TP register and sets flag TMF_NOUPCALL in its thread
mailbox to indicate a critical region.
2003-08-05 12:00:55 +00:00
Jeffrey Hsu
cc3426866c Make the second argument to sooptcopyout() constant in order to
simplify the upcoming PIM patches.

Submitted by:   Pavlin Radoslavov <pavlin@icir.org>
2003-08-05 00:27:54 +00:00
Ian Dowse
76bd23557b In the mknod(), mkfifo(), link(), symlink() and undelete() syscalls,
use vrele() instead of vput() on the parent directory vnode returned
by namei() in the case where it is equal to the target vnode. This
handles namei()'s somewhat strange (but documented) behaviour of
not locking either vnode when the two vnodes are equal and LOCKPARENT
but not LOCKLEAF is specified.

Note that since a vnode double-unlock is not currently fatal, these
coding errors were effectively harmless.

Spotted by:	Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
Reviewed by:	mckusick
2003-08-05 00:26:51 +00:00
David Malone
d2cce3d6e8 Do some minor Giant pushdown made possible by copyin, fget, fdrop,
malloc and mbuf allocation all not requiring Giant.

1) ostat, fstat and nfstat don't need Giant until they call fo_stat.
2) accept can copyin the address length without grabbing Giant.
3) sendit doesn't need Giant, so don't bother grabbing it until kern_sendit.
4) move Giant grabbing from each indivitual recv* syscall to recvit.
2003-08-04 21:28:57 +00:00
John Baldwin
bfe6598264 Adjust a comment to remove staleness and take slightly less implementation
specific perspective.
2003-08-04 20:35:13 +00:00
John Baldwin
139b7550d9 Set td_critnest to 1 when setting up a thread since it is a MI field with
MI values.  This ensures that td_critnest for a newly fork'd thread is
always valid.

Requested by:	bde (a long time ago)
2003-08-04 20:28:20 +00:00
John Baldwin
b35b737201 Insert cosmetic spaces.
Reported by:	kris
2003-08-04 19:24:25 +00:00
Robert Watson
60bdc14e90 Move more ACL logic from the UFS code (ufs_acl.c) to the central POSIX.1e
support routines in kern_acl.c:

- Define ACL_OVERRIDE_MASK and ACL_PRESERVE_MASK centrally in acl.h: the
  mode bits that are (and aren't) stored in the ACL.

- Add acl_posix1e_acl_to_mode(): given a POSIX.1e extended ACL, generate
  a compatibility mode (only the bits supported by the POSIX.1e ACL).

- acl_posix1e_newfilemode(): Given a requested creation mode and default
  ACL, calculate the mode for the new file system object (only the bits
  supported by the POSIX.1e ACL).

PR:		50148
Reported by:	Ritz, Bruno <bruno_ritz@gmx.ch>
Obtained from:	TrustedBSD Project
2003-08-04 02:13:05 +00:00
John Baldwin
80c09f69b0 Both 'c' an 'lines' are unused, the bogus init of lines was accidentally
left behind.
2003-08-02 17:35:00 +00:00
Alan Cox
884962ae4e Use kmem_alloc_nofault() rather than kmem_alloc_pageable() in proc_rwmem().
See revision 1.140 of kern/sys_pipe.c for a detailed rationale.

Submitted by:	tegge
2003-08-02 17:08:21 +00:00
Poul-Henning Kamp
4bfd22f25e Grab Giant in bufdonebio() since drivers may not hold it.
This only protects the "struct buf" consumers (ie: DEV_STRATEGY()),
but does not protect BIO_STRATEGY() users.
2003-08-02 09:45:10 +00:00
Poul-Henning Kamp
f7e56e489d Grab Giant in physio() since non-giant drivers are starting to appear. 2003-08-02 09:40:53 +00:00
Alan Cox
105660e8ba Eliminate an abuse of kmem_alloc_pageable() in bufinit()
by using VM_ALLOC_NOOBJ to allocate the bogus page.

Reviewed by:	tegge
2003-08-02 05:05:34 +00:00
Alan Cox
efd02757c2 Use kmem_alloc_nofault() rather than kmem_alloc_pageable() in sf_buf_init().
(See revision 1.140 of kern/sys_pipe.c for a detailed rationale.)

Submitted by:	tegge
2003-08-02 04:18:56 +00:00
David E. O'Brien
05a1bfa142 Fix kernel build -- 'c' was the unused var, not 'lines'. 2003-08-01 17:00:49 +00:00
Robert Watson
19c3e120f0 Attempt to simplify #ifdef logic for MAC_ALWAYS_LABEL_MBUF.
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-08-01 15:45:14 +00:00
Alan Cox
882d8469af Remove Giant from writev(2). Eliminate trivial style differences between
writev(2) and readv(2).
2003-08-01 02:21:54 +00:00
John Baldwin
4110951861 If a spin lock is held for too long and WITNESS is enabled, then call
witness_display_spinlock() to see if we can find out where the current
owner of the spin lock last acquired the lock.
2003-07-31 18:52:18 +00:00
John Baldwin
1beccae67c Add a new function to look for a spinlock's instance when it is held by
another thread.  We use the td_oncpu member of the other field to locate
it's associated CPU and then search the that CPU's list of spin locks
contained in its per-CPU data.  This is not always safe and may in fact
panic or just not work, but it is useful in at least one case.
2003-07-31 18:50:58 +00:00
John Baldwin
3f2a1b0656 Update the 'ps', 'show pci', and 'show ktr' ddb commands to use the new
pager callout instead of homerolling their own paging facility.
2003-07-31 17:29:42 +00:00
Peter Wemm
aeaead20b8 When ktracing context switches, make sure we record involuntary switches.
Otherwise, when we get a evicted from the cpu, there is no record of it.
This is not a default ktrace flag.
2003-07-31 01:36:24 +00:00
David Xu
1fc434dc9a Use correct signal when calling sigexit. 2003-07-30 23:11:37 +00:00
Pierre Beyssac
ae9fcf4c66 Remove test in pipe_write() which causes write(2) to return EAGAIN
on a non-blocking pipe in cases where select(2) returns the file
descriptor as ready for write. This in turns causes libc_r, for
one, to busy wait in such cases.

Note: it is a quick performance fix, a more complex fix might be
required in case this turns out to have unexpected side effects.

Reviewed by:	silby
MFC after:	3 days
2003-07-30 22:50:37 +00:00
John Baldwin
47b722c1af When complaining about a sleeping thread owning a mutex, display the
thread's pid to make debugging easier for people who don't want to have to
use the intended tool for these panics (witness).

Indirectly prodded by:	kris
2003-07-30 20:42:15 +00:00
Alan Cox
93b4c5b707 The introduction of vm object locking has caused witness to reveal
a long-standing mistake in the way a portion of a pipe's KVA is
allocated.  Specifically, kmem_alloc_pageable() is inappropriate
for use in the "direct" case because it allows a preceding vm map entry
and vm object to be extended to support the new KVA allocation.
However, the direct case KVA allocation should not have a backing
vm object.  This is corrected by using kmem_alloc_nofault().

Submitted by:	tegge (with the above explanation by me)
2003-07-30 18:55:04 +00:00
Alan Cox
fbe1bdddcc Revision 1.51 of vm/uma_core.c modified uma_large_free() to acquire Giant
when needed.  So, don't do it here.
2003-07-29 05:23:19 +00:00
Robert Watson
9080ff25cf Rename VOP_RMEXTATTR() to VOP_DELETEEXTATTR() for consistency with the
kernel ACL interfaces and system call names.

Break out UFS2 and FFS extattr delete and list vnode operations from
setextattr and getextattr to deleteextattr and listextattr, which
cleans up the implementations, and makes the results more readable,
and makes the APIs more clear.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-07-28 18:53:29 +00:00
Robert Watson
2e4a71cdb1 When exporting file descriptor data for threads invoking the
kern.file sysctl, don't return information about processes that
fail p_cansee(td, p).  This prevents sockstat and related
programs from seeing file descriptors owned by processes not
in the same jail as the thread, as well as having implications
for MAC, etc.

This is a partial solution: it permits an information leak about
the number of descriptors in the sizing calculation (but this is
not new information, you can also get it from kern.openfiles),
and doesn't attempt to mask file descriptors based on the
properties of the descriptor, only the process referencing it.
However, it provides most of what you want under most
circumstances, without complicating the locking.

PR:	54211
Based on a patch submitted by:	Pawel Jakub Dawidek <nick@garage.freebsd.pl>
2003-07-28 16:03:53 +00:00
Poul-Henning Kamp
cf7742997a Pass the file descriptor index down to vn_open.
If the method vector was replaced and we got the "special return code"
smile and trust that whatever happened below DTRT.
2003-07-27 20:09:13 +00:00
Poul-Henning Kamp
3ab6b09c53 Pass the fdidx argument from vn_open{_cred}() onto VOP_OPEN() 2003-07-27 20:05:36 +00:00
Poul-Henning Kamp
7c89f162bc Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout. 2003-07-27 17:04:56 +00:00
Poul-Henning Kamp
1b6c609507 Call the new argument "fdidx" that is more precise than "fd". 2003-07-27 17:03:20 +00:00
David Malone
e41cbeba6d Now that we can call kmem_malloc without Giant it should be safe
to do mbuf allocation without Giant, so remove the GIANT_REQUIRED
from mb_alloc in the M_TRYWAIT case.
2003-07-27 14:19:23 +00:00
Poul-Henning Kamp
a8d43c90af Add a "int fd" argument to VOP_OPEN() which in the future will
contain the filedescriptor number on opens from userland.

The index is used rather than a "struct file *" since it conveys a bit
more information, which may be useful to in particular fdescfs and /dev/fd/*

For now pass -1 all over the place.
2003-07-26 07:32:23 +00:00
Scott Long
c43cad1ac1 Guard against MLEN growing larger than a uint8_t due to MSIZE grwoing to a
value of 512 in LINT.  This keeps gcc from complaining.
2003-07-26 07:23:24 +00:00
Alan Cox
18e8d4e79c revision 1.51 of vm/uma_core.c modified uma_large_malloc() to acquire
Giant when needed.
2003-07-25 22:26:43 +00:00
Mike Makonnen
a6ca48085c The POSIX spec also requires that kern_sigtimedwait return
EINVAL if tv_nsec of the timeout is less than zero.
2003-07-24 17:07:17 +00:00
Peter Wemm
80611144e4 Initialize 'blocked' to NULL. I think this was a real problem, but I
am not sure about that.  The lack of -Werror and the inline noise hid
this for a while.
2003-07-23 20:29:13 +00:00
Poul-Henning Kamp
68f2d20b70 Revert stuff which accidentally ended up in the previous commit. 2003-07-22 10:36:36 +00:00
Poul-Henning Kamp
55d1d7034f Don't attempt to inline large functions mb_alloc() and mb_free(),
it more than doubles the text size of this file.

GCC has wisely ignored us on this previously
2003-07-22 10:24:41 +00:00
David Xu
432b45de08 Always deliver synchronous signal to UTS for SA threads. 2003-07-21 00:26:52 +00:00
Mike Makonnen
6022ec6737 Turn a KASSERT back into an EINVAL return value. So, next time someone
comes across it, it will turn into a core dump in userland instead of
a kernel panic. I had also inverted the sense of the test, so

Double pointy hat to:	mtm
2003-07-19 11:32:48 +00:00
Mike Silbersack
f8bf8e397b Three fixes:
- Make m_prepend use m_gethdr instead of m_get where
  appropriate

- Make m_copym use m_gethdr instead of m_get where
  appropriate

- Add a call to m_fixhdr in m_defrag; m_defrag can't
  deal with corrupted pkthdr.len counts.

MFC after:	3 days
2003-07-19 06:03:48 +00:00
Mike Makonnen
5c6edbec80 Remove a lock held across casuptr() that snuck in last commit. 2003-07-18 21:26:45 +00:00
Mike Makonnen
7df7f5c5ab Move the decision on whether to unset the contested
bit or not from lock to unlock time.

Suggested by:	jhb
2003-07-18 17:58:37 +00:00
Robert Drehmel
4e19fe1081 To avoid a kernel panic provoked by a NULL pointer dereference,
do not clear the `sb_sel' member of the sockbuf structure
while invalidating the receive sockbuf in sorflush(), called
from soshutdown().

The panic was reproduceable from user land by attaching a knote
with EVFILT_READ filters to a socket, disabling further reads
from it using shutdown(2), and then closing it.  knote_remove()
was called to remove all knotes from the socket file descriptor
by detaching each using its associated filterops' detach call-
back function, sordetach() in this case, which tried to remove
itself from the invalidated sockbuf's klist (sb_sel.si_note).

PR:	kern/54331
2003-07-17 23:49:10 +00:00
David Xu
3074d1b454 Fix sigwait to conform to POSIX.
When a signal is being delivered to process, first find a sigwait
thread to deliver, POSIX's argument is speed of delivering signal
to sigwait thread is faster than other ways. A signal in its wait
set will cause sigwait to return the signal number, a signal not
in its wait set but in not blocked by the thread also causes sigwait
to return, but sigwait returns EINTR, sigwait is oneshot operation,
only one signal can be delivered to its wait set, when a signal is
delivered to the sigwait thread, the thread's sigwait state is canceled.
2003-07-17 22:52:55 +00:00
David Xu
dd7da9aa28 o Refine kse_thr_interrupt to allow it to handle different commands.
o Remove TDF_NOSIGPOST.
o Add a member td_waitset to proc structure, it will be used for sigwait.

Tested by: deischen
2003-07-17 22:45:33 +00:00
Robert Drehmel
e76bad968c Correct six return statements which returned zero instead of
an appropriate error number after a failure condition.

In particular, three of the changed statements return ESRCH for a
failed pfind(), and in also three places a non-zero return
from p_cansee() will be passed back,

Also noticed by:	rwatson
2003-07-17 22:44:41 +00:00
Mike Makonnen
994599d782 Fix umtx locking, for libthr, in the kernel.
1. There was a race condition between a thread unlocking
   a umtx and the thread contesting it. If the unlocking
   thread won the race it may try to wakeup a thread that
   was not yet in msleep(). The contesting thread would then
   go to sleep to await a wakeup that would never come. It's
   not possible to close the race by using a lock because
   calls to casuptr() may have to fault a page in from swap.
   Instead, the race was closed by introducing a flag that
   the unlocking thread will set when waking up a thread.
   The contesting thread will check for this flag before
   going to sleep. For now the flag is kept in td_flags,
   but it may be better to use some other member or create
   a new one because of the possible performance/contention
   issues of having to own sched_lock. Thanks to jhb for
   pointing me in the right direction on this one.

2. Once a umtx was contested all future locks and unlocks
   were happening in the kernel, regardless of whether it
   was contested or not. To prevent this from happening,
   when a thread locks a umtx it checks the queue for that
   umtx and unsets the contested bit if there are no other
   threads waiting on it. Again, this is slightly more
   complicated than it needs to be because we can't hold
   a lock across casuptr(). So, the thread has to check
   the queue again after unseting the bit, and reset the
   contested bit if it finds that another thread has put
   itself on the queue in the mean time.

3. Remove the if... block for unlocking an uncontested
   umtx, and replace it with a KASSERT. The _only_ time
   a thread should be unlocking a umtx in the kernel is
   if it is contested.
2003-07-17 11:06:40 +00:00
Bosko Milekic
48719ca7c8 Change the style of the english used to print accounting enabled
and disabled.  This means no period at the end and changing
"Process accounting <foo>" to "Accounting <foo>".

Pointed out by: bde
2003-07-16 13:20:10 +00:00
Bosko Milekic
d2dbf5bc0b Log process accounting activation/deactivation.
Useful for some auditing purposes.

Submitted by: Christian S.J. Peron <maneo@bsdpro.com>
PR: kern/54529
2003-07-16 03:59:50 +00:00
Don Lewis
6ff1481d5c Rearrange the SYSINIT order to call lockmgr_init() earlier so that
the runtime lockmgr initialization code in lockinit() can be eliminated.

Reviewed by:	jhb
2003-07-16 01:00:39 +00:00
David Xu
af161f2232 If initial thread is still a bound thread, don't change its signal mask. 2003-07-15 14:04:38 +00:00
Hartmut Brandt
7e9024cdd9 Add a facility for devices, specifically network interfaces, that require
large to huge amounts of small or medium sized receive buffers. The problem
with these situations is that they eat up the available DMA address space
very quickly when using mbufs or even mbuf clusters. Additionally this
facility provides a direct mapping between 32-bit integers and these buffers.
This is needed for devices originally designed for 32-bit systems. Ususally
the virtual address of the buffer is used as a handle to find the buffer as
soon as it is returned by the card. This does not work for 64-bit machines
and hence this mapping is needed.
2003-07-15 08:59:38 +00:00
David Xu
4b7d5d84ee Rename thread_siginfo to cpu_thread_siginfo 2003-07-15 04:26:26 +00:00
Jeffrey Hsu
330841c763 Rev 1.121 meant to pass the value 1 to soalloc() to indicate waitok.
Reported by:	arr
2003-07-14 20:39:22 +00:00
Don Lewis
857d9c60d0 Extend the mutex pool implementation to permit the creation and use of
multiple mutex pools with different options and sizes.  Mutex pools can
be created with either the default sleep mutexes or with spin mutexes.
A dynamically created mutex pool can now be destroyed if it is no longer
needed.

Create two pools by default, one that matches the existing pool that
uses the MTX_NOWITNESS option that should be used for building higher
level locks, and a new pool with witness checking enabled.

Modify the users of the existing mutex pool to use the appropriate pool
in the new implementation.

Reviewed by:	jhb
2003-07-13 01:22:21 +00:00
Robert Drehmel
baf731e6ed Make the system call vector name of a process accessible to user
land applications by introducing the KERN_PROC_SV_NAME sysctl node,
which is searchable by PID.
2003-07-12 02:00:16 +00:00
David Xu
ffb2e92a98 If a thread is sending signal to its process, if the thread can handle
the signal itself, it should get it without looking for other threads.
2003-07-11 13:42:23 +00:00
Mike Silbersack
347194c172 Add init_param3() to subr_param. This function is called
immediately after the kernel map has been sized, and is
the optimal place for the autosizing of memory allocations
which occur within the kernel map to occur.

Suggested by:	bde
2003-07-11 00:01:03 +00:00
Peter Wemm
e95babf3a8 unifdef -DLAZY_SWITCH and start to tidy up the associated glue. 2003-07-10 01:02:59 +00:00
Mike Silbersack
ff56f15e26 A few minor changes:
- Use atomic ops to update the bigpipe count
- Make the bigpipe count sysctl readable
- Remove a duplicate comparison in an if statement
- Comment two SYSCTLs.
2003-07-09 21:59:48 +00:00
Mike Silbersack
41f16f8208 Pull in the entire kmem_map size calculation from kern_malloc, rather
than the shortcircuited version I had been using, which only worked
properly on i386 & amd64.

Also, change an autoscale constant to account for the more correct
kmem_map size.

Problem noticed by:     mux
2003-07-08 18:59:21 +00:00
Jeff Roberson
0c0a98b231 - When stealing a kse in kseq_move() ignore the current kseq's min nice
value.  We want to steal any thread, even one that is not given a slice
   on its current queue.
2003-07-08 06:19:40 +00:00
Mike Silbersack
289016f2d1 Put some concrete limits on pipe memory consumption:
- Limit the total number of pipes so that we do not
  exhaust all vm objects in the kernel map.  When
  this limit is reached, a ratelimited message will
  be printed to the console.

- Put a soft limit on the amount of memory consumable
  by pipes.  Once the limit has been reached, all new
  pipes will be limited to 4K in size, rather than the
  default of 16K.

- Put a limit on the number of pages that may be used
  for high speed page flipping in order to reduce the
  amount of wired memory.  Pipe writes that occur
  while this limit is exceeded will fall back to
  non-page flipping mode.

The above values are auto-tuned in subr_param.c and
are scaled to take into account both the size of
physical memory and the size of the kernel map.

These limits help to reduce the "kernel resources exhausted"
panics that could be caused by opening a large
number of pipes.  (Pipes alone are no longer able
to exhaust all resources, but other kernel memory hogs
in league with pipes may still be able to do so.)

PR:			53627
Ideas / comments from:	hsu, tjr, dillon@apollo.backplane.com
MFC after:		1 week
2003-07-08 04:02:31 +00:00
Jeff Roberson
0ec896fd28 - Clean up an unused variable.
Submitted by:	Steve Kargl <skg@routmask.apl.washington.edu>
2003-07-07 21:08:28 +00:00
Mike Makonnen
14b5ae1a98 Make the conditional, which decides what siglist to put a signal on,
more concise and improve the comment.

Submitted by: bde
2003-07-05 08:37:40 +00:00