5330 Commits

Author SHA1 Message Date
rwatson
241e77818a Minor spelling tweak: assume "his" is actually "This". 2002-09-06 13:22:44 +00:00
julian
4446570abf Use UMA as a complex object allocator.
The process allocator now caches and hands out complete process structures
*including substructures* .

i.e. it get's the process structure with the first thread (and soon KSE)
already allocated and attached, all in one hit.

For the average non threaded program (non KSE that is) the allocated thread and its stack remain attached to the process, even when the process is
unused and in the process cache. This saves having to allocate and attach it
later, effectively bringing us (hopefully) close to the efficiency
of pre-KSE systems where these were a single structure.

Reviewed by:	davidxu@freebsd.org, peter@freebsd.org
2002-09-06 07:00:37 +00:00
davidxu
e3c3155c8c Remove extra ';' 2002-09-06 00:18:52 +00:00
phk
aa2987768b Introduce the VOP_OPENEXTATTR() and VOP_CLOSEEXTATTR() methods.
Together these two implement a simple transcation style grouping for
modifications of extended attributes on a vnode.

VOP_CLOSEEXTATTR() takes a boolean "commit" argument, which determines
if the aggregate changes are attempted written or not.  A commit will
fail if any of the VOP_SETEXTATTR() calls since the VOP_OPENEXTATTR()
have failed to meet their objective or if the flush to disk fails.

The default operations for these two VOP's is to return EOPNOTSUPP.

This API may still be subject to change.

Sponsored by:   DARPA & NAI Labs
2002-09-05 20:56:14 +00:00
phk
3303b3f624 Fix an inherited style bug: compare with NOCRED instead of NULL.
Sponsored by:	DARPA & NAI Labs.
2002-09-05 20:46:19 +00:00
phk
55be95d161 Introduce new extattr_check_cred() function which implements the canonical
crential washing for extended attributes.

Sponsored by:	DARPA & NAI Labs.
2002-09-05 20:38:57 +00:00
iwasaki
9a172ee34e Add debug.rman_debug sysctl MIB and loader tunable instead of broken
RMAN_DEBUG option.
This would be useful for debugging resource manager code.
2002-09-05 11:45:02 +00:00
phk
d5001c9818 Fix a format buglet.
Spotted by:	iedowse
2002-09-05 11:42:03 +00:00
davidxu
b1d94c37f7 s/SGNL/SIG/
s/SNGL/SINGLE/
s/SNGLE/SINGLE/

Fix abbreviation for P_STOPPED_* etc flags, in original code they were
inconsistent and difficult to distinguish between them.

Approved by: julian (mentor)
2002-09-05 07:30:18 +00:00
bde
725b1916bd Include <sys/malloc.h> instead of depending on namespace pollution 2
layers deep in <sys/proc.h> or <sys/vnode.h>.

Removed unused includes.

Fixed some printf format errors (1 fatal on i386's; 1 fatal on alphas;
1 not fatal on any supported machine).
2002-09-05 07:02:43 +00:00
iedowse
0fc3eadf20 Split up ptrace() into a wrapper that does the copying to and from
user space and a kern_ptrace() implementation. Use the kern_*()
version in the Linux emulation code to remove more stack gap uses.

Approved by:	des
2002-09-05 01:02:50 +00:00
phk
b1f33fc74e Under DIAGNOSTIC, complain if a timeout(9) routine took more than 1msec. 2002-09-04 20:05:00 +00:00
phk
d608e476ab Do not employ timecounter hardware if our hz does not support their
correct rewinding.
2002-09-04 19:32:18 +00:00
phk
8ceeefb3da Give up on calling tc_ticktock() from a timeout, we have timeout
functions which run for several milliseconds at a time and getting
in queue behind one or more of those makes us miss our rewind.

Instead call it from hardclock() like we used to do, but retain the
prescaler so we still cope with high HZ values.
2002-09-04 10:15:19 +00:00
dillon
469a54660c Alright, fix the problems with the elf loader for the Alpha. It turns
out that there is no easy way to discern the difference between a text
segment and a data segment through the read-only OR execute attribute
in the elf segment header, so revert the algorithm to what it was before.

Neither can we account for multiple data load segments in the vmspace
structure (at least not without more work), due to assumptions obreak()
makes in regards to the data start and data size fields.

Retain RLIMIT_VMEM checking by using a local variable to track the
total bytes of data being loaded.

Reviewed by:	peter
X-MFC after:	ASAP
2002-09-04 04:42:12 +00:00
peter
89f4f91595 Make the text segment locating heuristics from rev 1.121 more reliable
so that it works on the Alpha.  This defines the segment that the entry
point exists in as 'text' and any others (usually one) as data.

Submitted by: tmm
Tested on: i386, alpha
2002-09-03 21:18:17 +00:00
jhb
b0aee047fb - Change falloc() to acquire an fd from the process table last so that
it can do it w/o needing to hold the filelist_lock sx lock.
- fdalloc() doesn't need Giant to call free() anymore.  It also doesn't
  need to drop and reacquire the filedesc lock around free() now as a
  result.
- Try to make the code that copies fd tables when extending the fd table in
  fdalloc() a bit more readable by performing assignments in separate
  statements.  This is still a bit ugly though.
- Use max() instead of an if statement so to figure out the starting point
  in the search-for-a-free-fd loop in fdalloc() so it reads better next to
  the min() in the previous line.
- Don't grow nfiles in steps up to the size needed if we dup2() to some
  really large number.  Go ahead and double 'nfiles' in a loop prior
  to doing the malloc().
- malloc() doesn't need Giant now.
- Use malloc() and free() instead of MALLOC() and FREE() in fdalloc().
- Check to see if the size we are going to grow to is too big, not if the
  current size of the fd table is too big in the loop in fdalloc().  This
  means if we are out of space or if dup2() requests too high of a fd,
  then we will return an error before we go off and try to allocate some
  huge table and copy the existing table into it.
- Move all of the logic for dup'ing a file descriptor into do_dup() instead
  of putting some of it in do_dup() and duplicating other parts in four
  different places.  This makes dup(), dup2(), and fcntl(F_DUPFD) basically
  wrappers of do_dup now.  fcntl() still has an extra check since it uses
  a different error return value in one case then the other functions.
- Add a KASSERT() for an assertion that may not always be true where the
  fdcheckstd() function assumes that falloc() returns the fd requested and
  not some other fd.  I think that the assertion is always true because we
  are always single-threaded when we get to this point, but if one was
  using rfork() and another process sharing the fd table were playing with
  the fd table, there might could be a problem.
- To handle the problem of a file descriptor we are dup()'ing being closed
  out from under us in dup() in general, do_dup() now obtains a reference
  on the file in question before calling fdalloc().  If after the call to
  fdalloc() the file for the fd we are dup'ing is a different file, then
  we drop our reference on the original file and return EBADF.  This
  race was only handled in the dup2() case before and would just retry
  the operation.  The error return allows the user to know they are being
  stupid since they have a locking bug in their app instead of dup'ing
  some other descriptor and returning it to them.

Tested on:	i386, alpha, sparc64
2002-09-03 20:16:31 +00:00
jhb
d8e689eb09 Add some KASSERT()'s to ensure that we don't perform spin mutex ops on
sleep mutexes and vice versa.  WITNESS normally should catch this but
not everyone uses WITNESS so this is a fallback to catch nasty but easy
to do bugs.
2002-09-03 18:25:16 +00:00
davidxu
de678b0952 In the kernel code, we have the tsleep() call with the PCATCH argument.
PCATCH means 'if we get a signal, interrupt me!" and tsleep returns
either EINTR or ERESTART depending on the circumstances.  ERESTART is
"special" because it causes the system call to fail, but right as it
returns back to userland it tells the trap handler to move %eip back a
bit so that userland will immediately re-run the syscall.
This is a syscall restart. It only works for things like read() etc where
nothing has changed yet. Note that *userland* is tricked into restarting
the syscall by the kernel. The kernel doesn't actually do the restart. It
is deadly for things like select, poll, nanosleep etc where it might cause
the elapsed time to be reset and start again from scratch.  So those
syscalls do this to prevent userland rerunning the syscall:
  if (error == ERESTART) error = EINTR;

Fake "signals" like SIGTSTP from ^Z etc do not normally invoke userland
signal handlers. But, in -current, the PCATCH *is* being triggered and
tsleep is returning ERESTART, and the syscall is aborted even though no
userland signal handler was run.
That is the fault here.  We're triggering the PCATCH in cases that we
shouldn't.  ie: it is being triggered on *any* signal processing, rather
than the case where the signal is posted to userland.
	--- Peter

The work of psignal() is a patchwork of special case required by the process
debugging and job-control facilities...
	--- Kirk McKusick
	"The design and impelementation of the 4.4BSD Operating system"
	Page 105

in STABLE source, when psignal is posting a STOP signal to sleeping
process and the signal action of the process is SIG_DFL, system will
directly change the process state from SSLEEP to SSTOP, and when
SIGCONT is posted to the stopped process, if it finds that the process
is still on sleep queue, the process state will be restored to SSLEEP,
and won't wakeup the process.

this commit mimics the behaviour in STABLE source tree.

Reviewed by: Jon Mini, Tim Robbins, Peter Wemm
Approved by: julian@freebsd.org (mentor)
2002-09-03 12:56:01 +00:00
iedowse
a62f952615 Split up __getcwd so that kernel callers of the internal version
can specify whether the buffer is in user or system space.
2002-09-02 22:40:30 +00:00
iedowse
62f75e87a4 Split fcntl() into a wrapper and a kernel-callable kern_fcntl()
implementation. The wrapper is responsible for copying additional
structure arguments (struct flock) to and from userland.
2002-09-02 22:24:14 +00:00
dillon
49e348fa48 Grammer cleanup 2002-09-02 17:27:30 +00:00
davidxu
0946da5e4e fix bogus CTR3 message.
Reviewed by: julian@freebsd.org (mentor)
2002-09-02 07:55:06 +00:00
jake
40170d28fc Moved elf brand identification into a function. Fully identify the
brand early in the process of loading an elf file, so that we can
identify the sysentvec, and so that we do not continue if we do not
have a brand (and thus a sysentvec).  Use the values in the sysentvec
for the page size and vm ranges unconditionally, since they are all
filled in now.
2002-09-02 04:50:57 +00:00
alc
0f13b0caca o Synchronize updates to struct vm_page::cow with the page queues lock. 2002-09-02 04:04:12 +00:00
jake
72e807ca38 Fixed more indentation bugs. 2002-09-02 02:41:26 +00:00
jake
ce650f8c33 Added fields for VM_MIN_ADDRESS, PS_STRINGS and stack protections to
sysentvec.  Initialized all fields of all sysentvecs, which will allow
them to be used instead of constants in more places.  Provided stack
fixup routines for emulations that previously used the default.
2002-09-01 21:41:24 +00:00
iedowse
be17b12cb6 Split out a number of mostly VFS and signal related syscalls into
a kernel-internal kern_*() version and a wrapper that is called via
the syscall vector table. For paths and structure pointers, the
internal version either takes a uio_seg parameter or requires the
caller to copyin() the data to kernel memory as appropiate. This
will permit emulation layers to use these syscalls without having
to copy out translated arguments to the stack gap.

Discussed on:		-arch
Review/suggestions:	bde, jhb, peter, marcel
2002-09-01 20:37:28 +00:00
dillon
85479bded2 Implement data, text, and vmem limit checking in the elf loader and svr4
compat code.  Clean up accounting for multiple segments.  Part 1/2.

Submitted by:	Andrey Alekseyev <uitm@zenon.net> (with some modifications)
MFC after:	3 days
2002-08-30 18:09:46 +00:00
peter
c3bdd669c3 Change hw.physmem and hw.usermem to unsigned long like they used to be
in the original hardwired sysctl implementation.

The buf size calculator still overflows an integer on machines with large
KVA (eg: ia64) where the number of pages does not fit into an int.  Use
'long' there.

Change Maxmem and physmem and related variables to 'long', mostly for
completeness.  Machines are not likely to overflow 'int' pages in the
near term, but then again, 640K ought to be enough for anybody.  This
comes for free on 32 bit machines, so why not?
2002-08-30 04:04:37 +00:00
julian
b085f4e6d2 Rejig the code to figure out estcpu and work out how long a KSEGRP has been
idle. What was there before was surprisingly ALMOST correct.

Peter and I fried our brains on this for a couple of hours figuring out
what this actually means in the context of multiple threads.

Reviewed by:	peter@freebsd.org
2002-08-30 00:25:49 +00:00
peter
1755247959 Actually remove the a.out kld loader. While I am not 100% sure, I believe
it is broken.  It certainly has been suffering neglect.  It is not needed
because we never shipped a.out kld's and they never really worked right.
2002-08-29 23:04:05 +00:00
julian
a9901abd65 Fix crack-smoking code that was panicing on the quad xeon:
- If either of proc or kse are NULL during thread_exit(), then
          the kernel is going to fault because parts of the function
          assume they aren't NULL.  Instead, just assert they aren't NULL
          (as well as the kse group) and assume they are in all of the
          code.  It doesn't make sense for them to be NULL here anyways.
        - Move the PROC_UNLOCK(p) up above clearing td_proc, etc. since
          otherwise we will panic if the proc's lock is contested.

Submitted by:	jhb@freebsd.org
2002-08-29 19:49:53 +00:00
iwasaki
f3872b5db5 Add sanity check seeing if adjusted start address exceeds end address
after boundary and alignment adjustment.
2002-08-29 12:39:21 +00:00
jake
821a548da7 Renamed poorly named setregs to exec_setregs. Moved its prototype to
imgact.h with the other exec support functions.
2002-08-29 06:17:48 +00:00
jake
ef3b17314a Don't require that sysentvec.sv_szsigcode be non-NULL. 2002-08-29 01:28:27 +00:00
jake
725e38b285 Unrot SPARSE_MAPPING code (vm_map_pageable -> vm_map_wire). 2002-08-29 01:16:14 +00:00
peter
a225758933 updatepri() works on a ksegrp (where the scheduling parameters are), so
directly give it the ksegrp instead of the thread.  The only thing it used
to use in the thread was the ksegrp.

Reviewed by:	julian
2002-08-28 23:45:15 +00:00
archie
bf4ebb4609 accept(2) on a socket that has been shutdown(2) normally returns
ECONNABORTED. Make this happen in the non-blocking case as well.
The previous behavior was to return EAGAIN, which (a) is not
consistent with the blocking case and (b) causes the application
to think the socket is still valid.

PR:		bin/42100
Reviewed by:	freebsd-net
MFC after:	3 days
2002-08-28 20:56:01 +00:00
bde
5f882ab3d1 Include <sys/lockmgr.h> for the definitions of the locking interfaces that
are implemented here instead of depending on namespace pollution in
<sys/lock.h>.  Fixed nearby include messes (1 disordered include and 1
unused include).
2002-08-27 09:59:47 +00:00
iedowse
7a9fd7b468 Add a new KTR type KTR_CONTENTION, and use it in the mutex code to
log the start and end of periods during which mtx_lock() is waiting
to acquire a sleep mutex. The log message includes the file and
line of both the waiter and the holder.

Reviewed by:	jhb, jake
2002-08-26 18:39:38 +00:00
iedowse
3643fd9d4d Add WITNESS_FILE() and WITNESS_LINE(), which allow users of witness
to print out the file and line from the lock object. These will be
used shortly by CTR() calls in the mutex code.

Reviewed by:	jhb, jake
2002-08-26 18:31:26 +00:00
julian
db3b659129 move the assert to cover more cases 2002-08-26 05:02:56 +00:00
jake
0209c6a78a Fixed most indentation bugs. 2002-08-25 22:36:52 +00:00
jake
ee427569ae Fixed placement of operators. Wrapped long lines. 2002-08-25 20:48:45 +00:00
charnier
7dd9d47059 Replace various spelling with FALLTHROUGH which is lint()able 2002-08-25 13:23:09 +00:00
jake
883caafe6b Fixed white space around operators, casts and reserved words.
Reviewed by:	md5
2002-08-24 22:55:16 +00:00
jake
4616366bb2 return x; -> return (x);
return(x); -> return (x);

Reviewed by:	md5
2002-08-24 22:01:40 +00:00
marcel
c558abdba7 Work around a GCC optimization bug on ia64: In link_elf_symbol_values(),
a pointer to a symbol is given and we have to find the containing symbol
table. We do this by bounds checking. For some strange reason (ie I
haven't found the root cause) the first test succeeded for said symbol,
implying that the symbol came from the .dynsym table. In reality however
the symbol actually resided in the .symtab table. Needless to say that
all that was returned was junk.

The upper bounds check was: (symptr - baseptr) < symtab_size
This has been rewritten to: symptr < (baseptr + symtab_size)

As a side-effect, slightly more optimal (and still correct :-) code can
be generated on ia64.
2002-08-24 05:01:33 +00:00
peter
6bc0c8ebd0 Move the TAILQ_INIT(&td->td_selq) before the retry: label. Otherwise in
some circumstances when we get a select collision, we can end up with
cases where we do not clear some sip->si_thread on the way out, leading to
page faults in selwakeup().  This should solve the problem where postfix
can crash the kernel during select collisions.

Reviewed by: alfred
2002-08-23 22:43:28 +00:00