Commit Graph

5355 Commits

Author SHA1 Message Date
Poul-Henning Kamp
c7143e7150 Oops, broke the build there. Uninline biodone() now that it is non-trivial.
Introduce biowait() function.  Currently there is a race condition and the
mitigation is a timeout/retry.  It is not obvious what kind of locking (if any)
is suitable for BIO_DONE, since the majority of users take are of this
themselves, and only a few places actually rely on the wakeup.

Sponsored by: DARPA & NAI Labs.
2002-09-13 11:28:31 +00:00
Don Lewis
28b325aa60 Drop the proc lock while calling fdcheckstd() which may block to allocate
memory.

Reviewed by:	jhb
2002-09-13 09:31:56 +00:00
Bruce Evans
f70de49661 Fixed style bugs in resource_list_add_next(). 2002-09-12 13:45:38 +00:00
Andrew R. Reiter
b4dcc46af5 - Fix two obvious locking bugs; 1) returning with lock held when it needed
to be dropped, 2) attempting to lock acct_mtx while already holding it.
  Sorry to those who experienced pain.
- Added two comments referring to two areas in which acct_mtx is held over
  vnode operations that might sleep.  Patch in the works for this.
2002-09-12 05:00:32 +00:00
John Baldwin
c9e7d28e26 - Change utrace ktrace events to malloc the work buffer before getting a
request structure.
- Re-optimize the case of utrace being disabled by doing an explicit
  KTRPOINT check instead of relying on the one in ktr_getrequest() so that
  we don't waste time on a malloc in the non-tracing case.
- Change utrace() to return an error if the copyin() fails.  Before it
  would just ignore the request but still return success.  This last is
  a change in behavior and can be backed out if necessary.
2002-09-11 21:00:56 +00:00
John Baldwin
1d3ab18279 Remove support for synchronous ktrace requests now that none exist anymore.
They were an ugly, gross hack.
2002-09-11 20:58:10 +00:00
John Baldwin
b92584a689 - Change ktrace genio events to only copy up to ktr_geniosize bytes of a
transfer to a malloc'd buffer and use that bufer for the ktrace event.
  This means that genio ktrace events no longer need to be synchronous.
- Now that ktr_buffer isn't overloaded to sometimes point to a cached uio
  pointer for genio requests and always points to a malloc'd buffer if not
  NULL, free the buffer in ktr_freerequest() instead of in
  ktr_writerequest().  This closes a memory leak for ktrace events that
  used a malloc'd buffer that had their vnode ripped out from under them
  while they were on the todo list.

Suggested by:	bde (1, in principle)
2002-09-11 20:56:05 +00:00
John Baldwin
12301fc3c7 - Add a kern.ktrace sysctl node.
- Rename kern.ktrace_request_pool tunable/sysctl to
  kern.ktrace.request_pool.
- Add a variable to control the max amount of data to log for genio events.
  This variable is tunable via the tunable/sysctl kern.ktrace.genio_size
  and defaults to one page.
2002-09-11 20:49:55 +00:00
John Baldwin
4b3aac3d4e Change namei and syscall ktrace events to malloc work buffers before
obtaining a ktr_request structure from the free pool so we can avoid
starving other threads of ktr_request structures.
2002-09-11 20:46:50 +00:00
Julian Elischer
85e40eaf26 Indentation does not make a block.. need curly braces too.
Submitted by: Eagle-eyes evans <bde@freebsd.org>
2002-09-11 18:15:26 +00:00
Julian Elischer
71fad9fdee Completely redo thread states.
Reviewed by:	davidxu@freebsd.org
2002-09-11 08:13:56 +00:00
Bruce Evans
527eee2d40 Include <vm/uma.h> instead of depending on namespace pollution in
<sys/malloc.h>.

Sorted includes as much as possible.  Removed banal comment(s) attached to
includes.
2002-09-11 07:13:28 +00:00
Warner Losh
74014b7f0a Clarify the return value from child_present. 2002-09-11 04:22:10 +00:00
Andrew R. Reiter
4f39d5d511 - Lock down the accounting code globals with a subsystem mutex.
Reviewed by:	jhb, mdodd
2002-09-11 04:10:41 +00:00
Bruce Evans
e5d6cd0c98 Include <sys/malloc.h> instead of depending on namespace pollution 2
layers deep in <sys/proc.h> or <sys/vnode.h>.

Removed unused includes.  Sorted includes.
2002-09-10 11:57:02 +00:00
Bruce Evans
d3a7b5e70e vfs_syscalls.c:
Changed rename(2) to follow the letter of the POSIX spec.  POSIX
requires rename() to have no effect if its args "resolve to the same
existing file".  I think "file" can only reasonably be read as referring
to the inode, although the rationale and "resolve" seem to say that
sameness is at the level of (resolved) directory entries.

ext2fs_vnops.c, ufs_vnops.c:
Replaced code that gave the historical BSD behaviour of removing one
link name by checks that this code is now unreachable.  This fixes
some races.  All vnodes needed to be unlocked for the removal, and
locking at another level using something like IN_RENAME was not even
attempted, so it was possible for rename(x, y) to return with both x
and y removed even without any unlink(2) syscalls (one process can
remove x using rename(x, y) and another process can remove y using
rename(y, x)).

Prodded by:	alfred
MFC after:	8 weeks
PR:		42617
2002-09-10 11:09:13 +00:00
Robert Watson
c0f3990523 Add security.mac.mmap_revocation, a flag indicating whether we
should revoke access to memory maps on a process label change.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-09-09 17:12:24 +00:00
Robert Watson
1614003510 Minor code sync to MAC tree: push Giant locking up from
mac_cred_mmapped_drop_perms() to the caller.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-09-09 17:10:16 +00:00
Poul-Henning Kamp
5ea98f59b9 Fix a logic bug in the struct dev_t allocation code.
Spotted by:	Neelkanth Natu <neelnatu@yahoo.com>
2002-09-08 15:15:12 +00:00
Jake Burkholder
c0d676c068 Make this driver work a whole lot better.
- Get the initial mode from the prom settings and don't clobber the mode
  on open.
- Copy output into an internal ring buffer instead of accessing the tty
  outq directly in the interrupt handler.  This fixes a problem where
  garbage would show up in the output stream.
- Reset the console port completely and reprogram all the parameters
  before enabling it.  This fixes seemingly random hangs on startup
  when using a fast interrupt handler.
- Add minimal locking in place of spls.
- Remove dead code and minor cleanups.
2002-09-08 04:45:16 +00:00
Peter Wemm
d0ca7c29dc Do not blow up when we walk off the end of the brands list.
Found by:	kris, jake
2002-09-08 02:17:44 +00:00
Peter Wemm
a9f9df5daf Tidy up some loose ends that bde pointed out. caddr_t bad, ok?
Move fill_kinfo_proc to before we copy the results instead of after
the copy and too late.

There is still more to do here.
2002-09-07 22:31:44 +00:00
Peter Wemm
1ed8cb4870 Remove bogus fill_kinfo_proc() before ptrace_set_pc(). There was no need
for this.

Submitted by:	bde
2002-09-07 22:18:19 +00:00
Peter Wemm
99a17113cd The true value of how the kernel was configured for KSTACK_PAGES was not
available at module compile time.  Do not #include the bogus
opt_kstack_pages.h at this point and instead refer to the variables that
are also exported via sysctl.
2002-09-07 22:15:47 +00:00
Peter Wemm
b9f009b08d Make UAREA_PAGES and KSTACK_PAGES visible to userland via sysctl, like
PS_STRINGS and USRSTACK is.  This is necessary in order to decode a.out
core dumps.  kern_proc.c was already referring to both of these values
but was missing the #include "opt_kstack_pages.h".  Make the sysctl
variables visible so that certain kld modules can see how their parent
kernel was configured.
2002-09-07 22:11:45 +00:00
Julian Elischer
c0698d32ce fix braino..
was clearing part of wrong thread structure..
2002-09-07 12:58:44 +00:00
Julian Elischer
9b0e281b69 fix misplaced schedlock
Submitted by:	davidxu@freebsd.org
2002-09-07 01:48:53 +00:00
Peter Wemm
710ded3ac5 Collect the a.out coredump code into the calling functions.
XXX why does pecoff dump in a.out format?
2002-09-07 01:23:51 +00:00
Robert Watson
6f22742b25 Minor spelling tweak: assume "his" is actually "This". 2002-09-06 13:22:44 +00:00
Julian Elischer
1faf202ea9 Use UMA as a complex object allocator.
The process allocator now caches and hands out complete process structures
*including substructures* .

i.e. it get's the process structure with the first thread (and soon KSE)
already allocated and attached, all in one hit.

For the average non threaded program (non KSE that is) the allocated thread and its stack remain attached to the process, even when the process is
unused and in the process cache. This saves having to allocate and attach it
later, effectively bringing us (hopefully) close to the efficiency
of pre-KSE systems where these were a single structure.

Reviewed by:	davidxu@freebsd.org, peter@freebsd.org
2002-09-06 07:00:37 +00:00
David Xu
65c17e749b Remove extra ';' 2002-09-06 00:18:52 +00:00
Poul-Henning Kamp
e1657bbb97 Introduce the VOP_OPENEXTATTR() and VOP_CLOSEEXTATTR() methods.
Together these two implement a simple transcation style grouping for
modifications of extended attributes on a vnode.

VOP_CLOSEEXTATTR() takes a boolean "commit" argument, which determines
if the aggregate changes are attempted written or not.  A commit will
fail if any of the VOP_SETEXTATTR() calls since the VOP_OPENEXTATTR()
have failed to meet their objective or if the flush to disk fails.

The default operations for these two VOP's is to return EOPNOTSUPP.

This API may still be subject to change.

Sponsored by:   DARPA & NAI Labs
2002-09-05 20:56:14 +00:00
Poul-Henning Kamp
f8b663614d Fix an inherited style bug: compare with NOCRED instead of NULL.
Sponsored by:	DARPA & NAI Labs.
2002-09-05 20:46:19 +00:00
Poul-Henning Kamp
c1a925a637 Introduce new extattr_check_cred() function which implements the canonical
crential washing for extended attributes.

Sponsored by:	DARPA & NAI Labs.
2002-09-05 20:38:57 +00:00
Mitsuru IWASAKI
2894f9d0a7 Add debug.rman_debug sysctl MIB and loader tunable instead of broken
RMAN_DEBUG option.
This would be useful for debugging resource manager code.
2002-09-05 11:45:02 +00:00
Poul-Henning Kamp
32c6c4780a Fix a format buglet.
Spotted by:	iedowse
2002-09-05 11:42:03 +00:00
David Xu
1279572a92 s/SGNL/SIG/
s/SNGL/SINGLE/
s/SNGLE/SINGLE/

Fix abbreviation for P_STOPPED_* etc flags, in original code they were
inconsistent and difficult to distinguish between them.

Approved by: julian (mentor)
2002-09-05 07:30:18 +00:00
Bruce Evans
b656366b46 Include <sys/malloc.h> instead of depending on namespace pollution 2
layers deep in <sys/proc.h> or <sys/vnode.h>.

Removed unused includes.

Fixed some printf format errors (1 fatal on i386's; 1 fatal on alphas;
1 not fatal on any supported machine).
2002-09-05 07:02:43 +00:00
Ian Dowse
012e544f12 Split up ptrace() into a wrapper that does the copying to and from
user space and a kern_ptrace() implementation. Use the kern_*()
version in the Linux emulation code to remove more stack gap uses.

Approved by:	des
2002-09-05 01:02:50 +00:00
Poul-Henning Kamp
b336df68d1 Under DIAGNOSTIC, complain if a timeout(9) routine took more than 1msec. 2002-09-04 20:05:00 +00:00
Poul-Henning Kamp
e46eeb89b9 Do not employ timecounter hardware if our hz does not support their
correct rewinding.
2002-09-04 19:32:18 +00:00
Poul-Henning Kamp
e7fa55af89 Give up on calling tc_ticktock() from a timeout, we have timeout
functions which run for several milliseconds at a time and getting
in queue behind one or more of those makes us miss our rewind.

Instead call it from hardclock() like we used to do, but retain the
prescaler so we still cope with high HZ values.
2002-09-04 10:15:19 +00:00
Matthew Dillon
21c2d0479c Alright, fix the problems with the elf loader for the Alpha. It turns
out that there is no easy way to discern the difference between a text
segment and a data segment through the read-only OR execute attribute
in the elf segment header, so revert the algorithm to what it was before.

Neither can we account for multiple data load segments in the vmspace
structure (at least not without more work), due to assumptions obreak()
makes in regards to the data start and data size fields.

Retain RLIMIT_VMEM checking by using a local variable to track the
total bytes of data being loaded.

Reviewed by:	peter
X-MFC after:	ASAP
2002-09-04 04:42:12 +00:00
Peter Wemm
9782ecbab0 Make the text segment locating heuristics from rev 1.121 more reliable
so that it works on the Alpha.  This defines the segment that the entry
point exists in as 'text' and any others (usually one) as data.

Submitted by: tmm
Tested on: i386, alpha
2002-09-03 21:18:17 +00:00
John Baldwin
5fc3031366 - Change falloc() to acquire an fd from the process table last so that
it can do it w/o needing to hold the filelist_lock sx lock.
- fdalloc() doesn't need Giant to call free() anymore.  It also doesn't
  need to drop and reacquire the filedesc lock around free() now as a
  result.
- Try to make the code that copies fd tables when extending the fd table in
  fdalloc() a bit more readable by performing assignments in separate
  statements.  This is still a bit ugly though.
- Use max() instead of an if statement so to figure out the starting point
  in the search-for-a-free-fd loop in fdalloc() so it reads better next to
  the min() in the previous line.
- Don't grow nfiles in steps up to the size needed if we dup2() to some
  really large number.  Go ahead and double 'nfiles' in a loop prior
  to doing the malloc().
- malloc() doesn't need Giant now.
- Use malloc() and free() instead of MALLOC() and FREE() in fdalloc().
- Check to see if the size we are going to grow to is too big, not if the
  current size of the fd table is too big in the loop in fdalloc().  This
  means if we are out of space or if dup2() requests too high of a fd,
  then we will return an error before we go off and try to allocate some
  huge table and copy the existing table into it.
- Move all of the logic for dup'ing a file descriptor into do_dup() instead
  of putting some of it in do_dup() and duplicating other parts in four
  different places.  This makes dup(), dup2(), and fcntl(F_DUPFD) basically
  wrappers of do_dup now.  fcntl() still has an extra check since it uses
  a different error return value in one case then the other functions.
- Add a KASSERT() for an assertion that may not always be true where the
  fdcheckstd() function assumes that falloc() returns the fd requested and
  not some other fd.  I think that the assertion is always true because we
  are always single-threaded when we get to this point, but if one was
  using rfork() and another process sharing the fd table were playing with
  the fd table, there might could be a problem.
- To handle the problem of a file descriptor we are dup()'ing being closed
  out from under us in dup() in general, do_dup() now obtains a reference
  on the file in question before calling fdalloc().  If after the call to
  fdalloc() the file for the fd we are dup'ing is a different file, then
  we drop our reference on the original file and return EBADF.  This
  race was only handled in the dup2() case before and would just retry
  the operation.  The error return allows the user to know they are being
  stupid since they have a locking bug in their app instead of dup'ing
  some other descriptor and returning it to them.

Tested on:	i386, alpha, sparc64
2002-09-03 20:16:31 +00:00
John Baldwin
0d975d6341 Add some KASSERT()'s to ensure that we don't perform spin mutex ops on
sleep mutexes and vice versa.  WITNESS normally should catch this but
not everyone uses WITNESS so this is a fallback to catch nasty but easy
to do bugs.
2002-09-03 18:25:16 +00:00
David Xu
35c32a76f9 In the kernel code, we have the tsleep() call with the PCATCH argument.
PCATCH means 'if we get a signal, interrupt me!" and tsleep returns
either EINTR or ERESTART depending on the circumstances.  ERESTART is
"special" because it causes the system call to fail, but right as it
returns back to userland it tells the trap handler to move %eip back a
bit so that userland will immediately re-run the syscall.
This is a syscall restart. It only works for things like read() etc where
nothing has changed yet. Note that *userland* is tricked into restarting
the syscall by the kernel. The kernel doesn't actually do the restart. It
is deadly for things like select, poll, nanosleep etc where it might cause
the elapsed time to be reset and start again from scratch.  So those
syscalls do this to prevent userland rerunning the syscall:
  if (error == ERESTART) error = EINTR;

Fake "signals" like SIGTSTP from ^Z etc do not normally invoke userland
signal handlers. But, in -current, the PCATCH *is* being triggered and
tsleep is returning ERESTART, and the syscall is aborted even though no
userland signal handler was run.
That is the fault here.  We're triggering the PCATCH in cases that we
shouldn't.  ie: it is being triggered on *any* signal processing, rather
than the case where the signal is posted to userland.
	--- Peter

The work of psignal() is a patchwork of special case required by the process
debugging and job-control facilities...
	--- Kirk McKusick
	"The design and impelementation of the 4.4BSD Operating system"
	Page 105

in STABLE source, when psignal is posting a STOP signal to sleeping
process and the signal action of the process is SIG_DFL, system will
directly change the process state from SSLEEP to SSTOP, and when
SIGCONT is posted to the stopped process, if it finds that the process
is still on sleep queue, the process state will be restored to SSLEEP,
and won't wakeup the process.

this commit mimics the behaviour in STABLE source tree.

Reviewed by: Jon Mini, Tim Robbins, Peter Wemm
Approved by: julian@freebsd.org (mentor)
2002-09-03 12:56:01 +00:00
Ian Dowse
48b52b7a32 Split up __getcwd so that kernel callers of the internal version
can specify whether the buffer is in user or system space.
2002-09-02 22:40:30 +00:00
Ian Dowse
49c2ff159f Split fcntl() into a wrapper and a kernel-callable kern_fcntl()
implementation. The wrapper is responsible for copying additional
structure arguments (struct flock) to and from userland.
2002-09-02 22:24:14 +00:00
Matthew Dillon
05ef87980a Grammer cleanup 2002-09-02 17:27:30 +00:00