6561 Commits

Author SHA1 Message Date
mux
43629d3ba9 Remove extra space. 2003-08-12 20:34:31 +00:00
jhb
1c016824f1 - Convert Alpha over to the new calling conventions for cpu_throw() and
  cpu_switch() where both the old and new threads are passed in as
  arguments.  Only powerpc uses the old conventions now.
- Update comments in the Alpha swtch.s to reflect KSE changes.

Tested by:	obrien, marcel
2003-08-12 19:33:36 +00:00
alc
23ea8b5c7a Pipespace() no longer requires Giant. 2003-08-11 22:23:25 +00:00
kan
91297961f6 Drop Giant in recvit before returning an error to the caller to avoid
leaking the Giant on the syscall exit.
2003-08-11 19:37:11 +00:00
bms
44aa51e3ae Add the mlockall() and munlockall() system calls.
- All those diffs to syscalls.master for each architecture *are*
   necessary. This needed clarification; the stub code generation for
   mlockall() was disabled, which would prevent applications from
   linking to this API (suggested by mux)
 - Giant has been quashed. It is no longer held by the code, as
   the required locking has been pushed down within vm_map.c.
 - Callers must specify VM_MAP_WIRE_HOLESOK or VM_MAP_WIRE_NOHOLES
   to express their intention explicitly.
 - Inspected at the vmstat, top and vm pager sysctl stats level.
   Paging-in activity is occurring correctly, using a test harness.
 - The RES size for a process may appear to be greater than its SIZE.
   This is believed to be due to mappings of the same shared library
   page being wired twice. Further exploration is needed.
 - Believed to back out of allocations and locks correctly
   (tested with WITNESS, MUTEX_PROFILING, INVARIANTS and DIAGNOSTIC).

PR:             kern/43426, standards/54223
Reviewed by:    jake, alc
Approved by:    jake (mentor)
MFC after:	2 weeks
2003-08-11 07:14:08 +00:00
silby
bd71f7b671 More pipe changes:
From alc:
Move pageable pipe memory to a separate kernel submap to avoid awkward
vm map interlocking issues.  (Bad explanation provided by me.)

From me:
Rework pipespace accounting code to handle this new layout, and adjust
our default values to account for the fact that we now have a solid
limit on allocations.

Also, remove the "maxpipes" limit, as it no longer has a purpose.
(The limit on kva usage solves the problem of having too many pipes.)
2003-08-11 05:51:51 +00:00
alc
1625d6386b Use vm_page_hold() instead of vm_page_wire(). Otherwise, a multithreaded
application could cause a wired page to be freed.  In general,
vm_page_hold() should be preferred for ephemeral kernel mappings of pages
borrowed from a user-level address space.  (vm_page_wire() should really be
reserved for indefinite duration pinning by the "owner" of the page.)

Discussed with:	silby
Submitted by:	tegge
2003-08-11 00:17:44 +00:00
nectar
78ff87db8b panic() if we try to handle an out-of-range signal number in
psignal()/tdsignal().  The test was historically in psignal().  It was
changed into a KASSERT, and then later moved to tdsignal() when the
latter was introduced.

Reviewed by:	iedowse, jhb
2003-08-10 23:05:37 +00:00
nectar
f5b9f87e77 Add or correct range checking of signal numbers in system calls and
ioctls.

In the particular case of ptrace(), this commit more-or-less reverts
revision 1.53 of sys_process.c, which appears to have been erroneous.

Reviewed by:	iedowse, jhb
2003-08-10 23:04:55 +00:00
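
The kind of range check this commit describes can be sketched in userland C. This is a minimal illustration, not the committed kernel code: SKETCH_MAXSIG and sketch_check_signal() are hypothetical names, and the bound of 128 is an assumption chosen for the example. Some interfaces give signal number 0 a special meaning (kill(2) uses it as a null signal); the sketch does not model that.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical upper bound for this sketch; the kernel's real limit
 * is defined in its own signal headers. */
#define SKETCH_MAXSIG 128

/*
 * Return 0 if sig lies in the deliverable range 1..SKETCH_MAXSIG,
 * EINVAL otherwise.  Rejecting the value up front keeps it from being
 * used later as an index into per-signal state.
 */
static int
sketch_check_signal(int sig)
{
	if (sig <= 0 || sig > SKETCH_MAXSIG)
		return (EINVAL);
	return (0);
}
```

A caller would run the check before doing any work with the signal number and return the error to the user unchanged.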
alc
c37c941215 Background: When proc_rwmem() wired and mapped a page, it also added
a reference to the containing object.  The purpose of the reference
was to prevent the destruction of the object and an attempt to free
the wired page.  (Wired pages can't be freed.)  Unfortunately, this
approach does not work.  Some operations, like fork(2) that call
vm_object_split(), can move the wired page to a different object,
thereby making the reference pointless and opening the possibility
of the wired page being freed.

A solution is to use vm_page_hold() in place of vm_page_wire().  Held
pages can be freed.  They are moved to a special hold queue until the
hold is released.

Submitted by:	tegge
2003-08-09 18:01:19 +00:00
alc
f5d5533b42 - Remove GIANT_REQUIRED from pipespace().
- Remove a duplicate initialization from pipe_create().
2003-08-08 22:38:15 +00:00
deischen
547619d0d3 Copyin the thread mailbox flags from the correct location
in the mailbox.
2003-08-08 20:23:10 +00:00
jhb
af302d132f td_dupfd just needs to be less than 0, it does not have to hold the
negative value of the index of the new file, so just use -1.
2003-08-07 17:08:26 +00:00
nectar
df9de6c5cd Update some argument-documenting comments to match reality.
Add an explicit range check to those same arguments to reduce risk of
cardiac arrest in future code readers.
2003-08-07 16:42:27 +00:00
jhb
37641f86f1 Consistently use the BSD u_int and u_short instead of the SYSV uint and
ushort.  In most of these files, there was a mixture of both styles and
this change just makes them self-consistent.

Requested by:	bde (kern_ktrace.c)
2003-08-07 15:04:27 +00:00
jhb
12f44bde5d The ktrace mutex does not need to be locked around the post of the ktrace
semaphore and doing so can lead to a possible lock order reversal.  WITNESS would have
caught this if semaphores were used more often in the kernel.

Submitted by:	Ted Unangst <tedu@stanford.edu>, Dawson Engler
2003-08-07 13:58:13 +00:00
alc
6178e0ad16 - Remove GIANT_REQUIRED from pipe_free_kmem().
- Remove the acquisition and release of Giant around pipe_kmem_free() and
   uma_zfree() in pipeclose().
2003-08-07 04:32:40 +00:00
yar
65e4901760 If connect(2) has been interrupted by a signal and therefore the
connection is to be established asynchronously, behave as in the
case of non-blocking mode:

- keep the SS_ISCONNECTING bit set thus indicating that
  the connection establishment is in progress, which is the case
  (clearing the bit in this case was just a bug);

- return EALREADY, instead of the confusing and unreasonable
  EADDRINUSE, upon further connect(2) attempts on this socket
  until the connection is established (this also brings our
  connect(2) into accord with IEEE Std 1003.1.)
2003-08-06 14:04:47 +00:00
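
The behaviour described in this commit can be modelled as a small state check. This is a userland sketch under stated assumptions, not the socket code itself: the SK_* flag values and both function names are hypothetical, and only the error codes named in the commit message or in IEEE Std 1003.1 (EINTR, EALREADY, EISCONN) are used.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values, used only by this sketch. */
#define SK_ISCONNECTED  0x1
#define SK_ISCONNECTING 0x2

/*
 * Model a connect(2) call interrupted by a signal: the call fails with
 * EINTR, but the "connecting" bit stays set because the connection
 * continues to be established asynchronously.
 */
static int
sketch_interrupt_connect(int *state)
{
	*state |= SK_ISCONNECTING;
	return (EINTR);
}

/*
 * Model the error a further connect(2) attempt reports for a given
 * socket state: EALREADY while the connection is in progress, EISCONN
 * once it is established, 0 if a new attempt may proceed.
 */
static int
sketch_reconnect_errno(int state)
{
	if (state & SK_ISCONNECTING)
		return (EALREADY);
	if (state & SK_ISCONNECTED)
		return (EISCONN);
	return (0);
}
```

Keeping the bit set is what lets the second call tell "still connecting" apart from "never started", which is why clearing it produced the misleading error.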
davidxu
69df6d1c3b kse.h is not needed for these files. 2003-08-05 12:08:49 +00:00
davidxu
93e075cf7a Introduce a thread mailbox flag TMF_NOUPCALL. On some architectures other
than i386 or AMD64, the TP register points to the thread mailbox, and they
cannot atomically clear km_curthread in the kse mailbox. In this case, the thread
retrieves its thread pointer from the TP register and sets the TMF_NOUPCALL flag in
its thread mailbox to indicate a critical region.
2003-08-05 12:00:55 +00:00
hsu
fb82c18f66 Make the second argument to sooptcopyout() constant in order to
simplify the upcoming PIM patches.

Submitted by:   Pavlin Radoslavov <pavlin@icir.org>
2003-08-05 00:27:54 +00:00
iedowse
7bf5fa9caf In the mknod(), mkfifo(), link(), symlink() and undelete() syscalls,
use vrele() instead of vput() on the parent directory vnode returned
by namei() in the case where it is equal to the target vnode. This
handles namei()'s somewhat strange (but documented) behaviour of
not locking either vnode when the two vnodes are equal and LOCKPARENT
but not LOCKLEAF is specified.

Note that since a vnode double-unlock is not currently fatal, these
coding errors were effectively harmless.

Spotted by:	Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
Reviewed by:	mckusick
2003-08-05 00:26:51 +00:00
dwmalone
cb188056e6 Do some minor Giant pushdown made possible by copyin, fget, fdrop,
malloc and mbuf allocation all not requiring Giant.

1) ostat, fstat and nfstat don't need Giant until they call fo_stat.
2) accept can copyin the address length without grabbing Giant.
3) sendit doesn't need Giant, so don't bother grabbing it until kern_sendit.
4) move Giant grabbing from each individual recv* syscall to recvit.
2003-08-04 21:28:57 +00:00
jhb
e71dfc3b00 Adjust a comment to remove staleness and take a slightly less
implementation-specific perspective.
2003-08-04 20:35:13 +00:00
jhb
52adb98aef Set td_critnest to 1 when setting up a thread since it is a MI field with
MI values.  This ensures that td_critnest for a newly fork'd thread is
always valid.

Requested by:	bde (a long time ago)
2003-08-04 20:28:20 +00:00
jhb
a69166c61f Insert cosmetic spaces.
Reported by:	kris
2003-08-04 19:24:25 +00:00
rwatson
543a037619 Move more ACL logic from the UFS code (ufs_acl.c) to the central POSIX.1e
support routines in kern_acl.c:

- Define ACL_OVERRIDE_MASK and ACL_PRESERVE_MASK centrally in acl.h: the
  mode bits that are (and aren't) stored in the ACL.

- Add acl_posix1e_acl_to_mode(): given a POSIX.1e extended ACL, generate
  a compatibility mode (only the bits supported by the POSIX.1e ACL).

- acl_posix1e_newfilemode(): Given a requested creation mode and default
  ACL, calculate the mode for the new file system object (only the bits
  supported by the POSIX.1e ACL).

PR:		50148
Reported by:	Ritz, Bruno <bruno_ritz@gmx.ch>
Obtained from:	TrustedBSD Project
2003-08-04 02:13:05 +00:00
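
Deriving a compatibility mode from a POSIX.1e ACL can be sketched as follows. This is a simplified userland model, not the code in kern_acl.c: struct sketch_acl and sketch_acl_to_mode() are hypothetical, each entry is reduced to a single rwx value from 0 to 7, and the rule that the mask entry (when present) supplies the group bits is assumed from the usual POSIX.1e convention rather than taken from the commit message.

```c
#include <assert.h>

/*
 * Hypothetical, simplified ACL for this sketch: one rwx value (0-7)
 * per required entry.  A negative mask means "no mask entry present".
 */
struct sketch_acl {
	int user_obj;	/* permissions of the owning user */
	int group_obj;	/* permissions of the owning group */
	int other;	/* permissions of everyone else */
	int mask;	/* mask entry, or -1 if absent */
};

/*
 * Build the nine permission bits of a file mode from the ACL.  The
 * group bits come from the mask entry when one exists, otherwise from
 * the owning group entry (assumed POSIX.1e convention).
 */
static unsigned
sketch_acl_to_mode(const struct sketch_acl *acl)
{
	int group = (acl->mask >= 0) ? acl->mask : acl->group_obj;

	return ((unsigned)(((acl->user_obj & 7) << 6) |
	    ((group & 7) << 3) | (acl->other & 7)));
}
```

Only the nine permission bits are produced here; bits such as setuid or sticky are not represented in the ACL and would have to be carried over from the existing mode.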
jhb
f0ef0df712 Both 'c' and 'lines' are unused, the bogus init of lines was accidentally
left behind.
2003-08-02 17:35:00 +00:00
alc
15ec2b9212 Use kmem_alloc_nofault() rather than kmem_alloc_pageable() in proc_rwmem().
See revision 1.140 of kern/sys_pipe.c for a detailed rationale.

Submitted by:	tegge
2003-08-02 17:08:21 +00:00
phk
adb4818b64 Grab Giant in bufdonebio() since drivers may not hold it.
This only protects the "struct buf" consumers (ie: DEV_STRATEGY()),
but does not protect BIO_STRATEGY() users.
2003-08-02 09:45:10 +00:00
phk
e1e146913d Grab Giant in physio() since non-giant drivers are starting to appear. 2003-08-02 09:40:53 +00:00
alc
507ad47156 Eliminate an abuse of kmem_alloc_pageable() in bufinit()
by using VM_ALLOC_NOOBJ to allocate the bogus page.

Reviewed by:	tegge
2003-08-02 05:05:34 +00:00
alc
4d05c167d2 Use kmem_alloc_nofault() rather than kmem_alloc_pageable() in sf_buf_init().
(See revision 1.140 of kern/sys_pipe.c for a detailed rationale.)

Submitted by:	tegge
2003-08-02 04:18:56 +00:00
obrien
1c53f0726f Fix kernel build -- 'c' was the unused var, not 'lines'. 2003-08-01 17:00:49 +00:00
rwatson
23fd91f044 Attempt to simplify #ifdef logic for MAC_ALWAYS_LABEL_MBUF.
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-08-01 15:45:14 +00:00
alc
7199d3e24f Remove Giant from writev(2). Eliminate trivial style differences between
writev(2) and readv(2).
2003-08-01 02:21:54 +00:00
jhb
937519b3ea If a spin lock is held for too long and WITNESS is enabled, then call
witness_display_spinlock() to see if we can find out where the current
owner of the spin lock last acquired the lock.
2003-07-31 18:52:18 +00:00
jhb
a3b9c0d553 Add a new function to look for a spinlock's instance when it is held by
another thread.  We use the td_oncpu member of the other thread to locate
its associated CPU and then search that CPU's list of spin locks
contained in its per-CPU data.  This is not always safe and may in fact
panic or just not work, but it is useful in at least one case.
2003-07-31 18:50:58 +00:00
jhb
bc9db472d8 Update the 'ps', 'show pci', and 'show ktr' ddb commands to use the new
pager callout instead of homerolling their own paging facility.
2003-07-31 17:29:42 +00:00
peter
8dd9d4012a When ktracing context switches, make sure we record involuntary switches.
Otherwise, when we get evicted from the cpu, there is no record of it.
This is not a default ktrace flag.
2003-07-31 01:36:24 +00:00
davidxu
176657958f Use correct signal when calling sigexit. 2003-07-30 23:11:37 +00:00
pb
edb5fbc5cc Remove test in pipe_write() which causes write(2) to return EAGAIN
on a non-blocking pipe in cases where select(2) returns the file
descriptor as ready for write. This in turn causes libc_r, for
one, to busy wait in such cases.

Note: it is a quick performance fix, a more complex fix might be
required in case this turns out to have unexpected side effects.

Reviewed by:	silby
MFC after:	3 days
2003-07-30 22:50:37 +00:00
jhb
97e378fb00 When complaining about a sleeping thread owning a mutex, display the
thread's pid to make debugging easier for people who don't want to have to
use the intended tool for these panics (witness).

Indirectly prodded by:	kris
2003-07-30 20:42:15 +00:00
alc
fc6d1980cc The introduction of vm object locking has caused witness to reveal
a long-standing mistake in the way a portion of a pipe's KVA is
allocated.  Specifically, kmem_alloc_pageable() is inappropriate
for use in the "direct" case because it allows a preceding vm map entry
and vm object to be extended to support the new KVA allocation.
However, the direct case KVA allocation should not have a backing
vm object.  This is corrected by using kmem_alloc_nofault().

Submitted by:	tegge (with the above explanation by me)
2003-07-30 18:55:04 +00:00
alc
bbf702f5b5 Revision 1.51 of vm/uma_core.c modified uma_large_free() to acquire Giant
when needed.  So, don't do it here.
2003-07-29 05:23:19 +00:00
rwatson
d2f7ae9f88 Rename VOP_RMEXTATTR() to VOP_DELETEEXTATTR() for consistency with the
kernel ACL interfaces and system call names.

Break out UFS2 and FFS extattr delete and list vnode operations from
setextattr and getextattr to deleteextattr and listextattr, which
cleans up the implementations, and makes the results more readable,
and makes the APIs more clear.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-07-28 18:53:29 +00:00
rwatson
9bfbf98f8a When exporting file descriptor data for threads invoking the
kern.file sysctl, don't return information about processes that
fail p_cansee(td, p).  This prevents sockstat and related
programs from seeing file descriptors owned by processes not
in the same jail as the thread, as well as having implications
for MAC, etc.

This is a partial solution: it permits an information leak about
the number of descriptors in the sizing calculation (but this is
not new information, you can also get it from kern.openfiles),
and doesn't attempt to mask file descriptors based on the
properties of the descriptor, only the process referencing it.
However, it provides most of what you want under most
circumstances, without complicating the locking.

PR:	54211
Based on a patch submitted by:	Pawel Jakub Dawidek <nick@garage.freebsd.pl>
2003-07-28 16:03:53 +00:00
phk
e457974b5d Pass the file descriptor index down to vn_open.
If the method vector was replaced and we got the "special return code",
smile and trust that whatever happened below DTRT.
2003-07-27 20:09:13 +00:00
phk
b80d7fd8a0 Pass the fdidx argument from vn_open{_cred}() onto VOP_OPEN() 2003-07-27 20:05:36 +00:00
phk
d4d7ca154a Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout. 2003-07-27 17:04:56 +00:00