Commit Graph

3536 Commits

Author SHA1 Message Date
Luigi Rizzo
5fe86675f0 Preserve alignment of first mbuf in m_copypacket.
This is useful when doing copies of packet where some leading
space has been preallocated to insert protocol headers.
Note that there are in fact almost no users of m_copypacket.

MFC candidate.
2001-02-20 08:23:41 +00:00
John Baldwin
5813dc03bd - Don't call clear_resched() in userret(), instead, clear the resched flag
in mi_switch() just before calling cpu_switch() so that the first switch
  after a resched request will satisfy the request.
- While I'm at it, move a few things into mi_switch() and out of
  cpu_switch(), specifically set the p_oncpu and p_lastcpu members of
  proc in mi_switch(), and handle the sched_lock state change across a
  context switch in mi_switch().
- Since cpu_switch() no longer handles the sched_lock state change, we
  have to setup an initial state for sched_lock in fork_exit() before we
  release it.
2001-02-20 05:26:15 +00:00
Bruce Evans
0ad74739ac Removed all traces of T_ASTFLT (except for gaps where it was). It became
unused except in dead code when ast() was split off from trap().
2001-02-19 15:47:38 +00:00
Bruce Evans
d2ef4060d7 Fixed a longstanding latency bug in signal delivery. When a signal
is sent to a process, psignal() needs to schedule an AST for the
process if the process is runnable, not just if it is current, so that
pending signals get checked for on the next return of the process to
user mode.  This wasn't practical until recently because the AST flag
was per-cpu so setting it for a non-current process would usually just
cause a bogus AST for the current process.

For non-current processes looping in user mode, it took accidental
(?) magic to deliver signals at all.  Signals were usually delivered
late as a side effect of rescheduling (need_resched() sets astpending,
etc.).  In pre-SMPng, delivery was delayed by at most 1 quantum (the
need_resched() call in roundrobin() is certain to occur within 1
quantum for looping processes).  In -current, things are complicated
by normal interrupt handlers being threads.  Missing handling of the
complications makes roundrobin() a bogus no-op, but preemptive
scheduling sort of works anyway due to even larger bogons elsewhere.
2001-02-19 09:40:58 +00:00
Bruce Evans
866546105a Changed the aston() family to operate on a specified process instead of
always on curproc.  This is needed to implement signal delivery properly
(see a future log message for kern_sig.c).

Debogotified the definition of aston().  aston() was defined in terms
of signotify() (perhaps because only the latter already operated on
a specified process), but aston() is the primitive.

Similar changes are needed in the ia64 versions of cpu.h and trap.c.
I didn't make them because the ia64 is missing the prerequisite changes
to make astpending and need_resched per-process and those changes are
too large to make without testing.
2001-02-19 04:15:59 +00:00
Brian Feldman
c0511d3b58 Switch to using a struct xucred instead of a struct xucred when not
actually in the kernel.  This structure is a different size than
what is currently in -CURRENT, but should hopefully be the last time
any application breakage is caused there.  As soon as any major
inconveniences are removed, the definition of the in-kernel struct
ucred should be conditionalized upon defined(_KERNEL).

This also changes struct export_args to remove dependency on the
constantly-changing struct ucred, as well as limiting the bounds
of the size fields to the correct size.  This means: a) mountd and
friends won't break all the time, b) mountd and friends won't crash
the kernel all the time if they don't know what they're doing wrt
actual struct export_args layout.

Reviewed by:	bde
2001-02-18 13:30:20 +00:00
Jeroen Ruigrok van der Werven
d7d97eb0aa Preceed/preceeding are not english words. Use precede and preceding. 2001-02-18 10:43:53 +00:00
Bruce Evans
a25f057175 Added a dummy lookup vop. Specfs was broken by removing its dummy
lookup vop so that it defaulted to using vop_eopnotsupp for strange
lookups like the ones for open("/dev/null/", ...) and stat("/dev/null/",
...).  This mainly caused the wrong errno to be returned by vfs syscalls
(EOPNOTSUPP is not in POSIX, and is not documented in connection with
specfs in open.2 and is not documented in stat.2 at all).  Also, lookup
vops are apparently required to set *ap->a_vpp to NULL on error, but
vop_eopnotsupp is too broken to do this.
2001-02-18 02:22:58 +00:00
Jonathan Lemon
9bfd6482c8 Fix tab breakage from last commit.
Spotted by: bde
2001-02-17 19:40:22 +00:00
Jonathan Lemon
c3d7bcdfc9 Introduce copyinfrom and copyinstrfrom, which can copy data from either
user or kernel space.  This will allow layering of os-compat (e.g.: linux)
system calls.  Apply the changes to mount.
2001-02-16 14:31:49 +00:00
Jonathan Lemon
608a3ce62a Extend kqueue down to the device layer.
Backwards compatible approach suggested by: peter
2001-02-15 16:34:11 +00:00
Robert Watson
661702ab20 o Fix spellign in a comment: s/referernce/reference/ 2001-02-14 06:53:57 +00:00
Bosko Milekic
fffd12bd72 Implement m_getm() which will perform an "all or nothing" mbuf + cluster
allocation, as required.

If m_getm() receives NULL as a first argument, then it allocates `len'
(second argument) bytes worth of mbufs + clusters and returns the chain
only if it was able to allocate everything.
If the first argument is non-NULL, then it should be an existing mbuf
chain (e.g. pre-allocated mbuf sitting on a ring, on some list, etc.) and
so it will allocate `len' bytes worth of clusters and mbufs, as needed,
and append them to the tail of the passed in chain, only if it was able
to allocate everything requested.

If allocation fails, only what was allocated by the routine will be freed,
and NULL will be returned.

Also, get rid of existing m_getm() in netncp code and replace calls to it
to calls to this new generic code.

Heavily Reviewed by: bp
2001-02-14 05:13:04 +00:00
Jonathan Lemon
2fd7d53d36 Return ECONNABORTED from accept if connection is closed while on the
listen queue, as well as the current behavior of a zero-length sockaddr.

Obtained from: KAME
Reviewed by: -net
2001-02-14 02:09:11 +00:00
Robert Watson
d941d4752c o Export the nextpid variable via SYSCTL as kern.lastpid, decreasing by
one the number of variables needed for top and other setgid kmem
  utilities that could only be accessed via /dev/kmem previously.

Submitted by:	Thomas Moestl <tmoestl@gmx.net>
Reviewed by:	freebsd-audit
2001-02-12 17:59:01 +00:00
Bosko Milekic
2786342687 Change all instances of CURPROC' and CURTHD' to `curproc,' in order
to stay consistent.

Requested by: bde
2001-02-12 03:15:43 +00:00
Jake Burkholder
d5a08a6065 Implement a unified run queue and adjust priority levels accordingly.
- All processes go into the same array of queues, with different
  scheduling classes using different portions of the array.  This
  allows user processes to have their priorities propogated up into
  interrupt thread range if need be.
- I chose 64 run queues as an arbitrary number that is greater than
  32.  We used to have 4 separate arrays of 32 queues each, so this
  may not be optimal.  The new run queue code was written with this
  in mind; changing the number of run queues only requires changing
  constants in runq.h and adjusting the priority levels.
- The new run queue code takes the run queue as a parameter.  This
  is intended to be used to create per-cpu run queues.  Implement
  wrappers for compatibility with the old interface which pass in
  the global run queue structure.
- Group the priority level, user priority, native priority (before
  propogation) and the scheduling class into a struct priority.
- Change any hard coded priority levels that I found to use
  symbolic constants (TTIPRI and TTOPRI).
- Remove the curpriority global variable and use that of curproc.
  This was used to detect when a process' priority had lowered and
  it should yield.  We now effectively yield on every interrupt.
- Activate propogate_priority().  It should now have the desired
  effect without needing to also propogate the scheduling class.
- Temporarily comment out the call to vm_page_zero_idle() in the
  idle loop.  It interfered with propogate_priority() because
  the idle process needed to do a non-blocking acquire of Giant
  and then other processes would try to propogate their priority
  onto it.  The idle process should not do anything except idle.
  vm_page_zero_idle() will return in the form of an idle priority
  kernel thread which is woken up at apprioriate times by the vm
  system.
- Update struct kinfo_proc to the new priority interface.  Deliberately
  change its size by adjusting the spare fields.  It remained the same
  size, but the layout has changed, so userland processes that use it
  would parse the data incorrectly.  The size constraint should really
  be changed to an arbitrary version number.  Also add a debug.sizeof
  sysctl node for struct kinfo_proc.
2001-02-12 00:20:08 +00:00
Mark Murray
d888fc4e73 RIP <machine/lock.h>.
Some things needed bits of <i386/include/lock.h> - cy.c now has its
own (only) copy of the COM_(UN)LOCK() macros, and IMASK_(UN)LOCK()
has been moved to <i386/include/apic.h> (AKA <machine/apic.h>).
Reviewed by:	jhb
2001-02-11 10:44:09 +00:00
Bosko Milekic
122a814af5 Long awaited style fixup in mbuf code. Get rid of K&R style prototyping
and function argument declarations. Make sure that functions that are
supposed to return a pointer return NULL in case of failure. Don't cast
NULL. Finally, get rid of annoying `register' uses.
2001-02-11 05:02:06 +00:00
Bosko Milekic
5746a1d866 - Place back STR string declarations for lock/unlock strings used for KTR_LOCK
tracing in order to avoid duplication.
- Insert some tracepoints back into the mutex acq/rel code, thus ensuring
  that we can trace all lock acq/rel's again.
- All CURPROC != NULL checks are MPASS()es (under MUTEX_DEBUG) because they
  signify a serious mutex corruption.
- Change up some KASSERT()s to MPASS()es, and vice-versa, depending on the
  type of problem we're debugging (INVARIANTS is used here to check that
  the API is being used properly whereas MUTEX_DEBUG is used to ensure that
  something general isn't happening that will have bad impact on mutex
  locks).

Reminded by: jhb, jake, asmodai
2001-02-11 02:54:16 +00:00
Jake Burkholder
3cbe75a414 Clear the reschedule flag after finding it set in userret(). This
used to be in cpu_switch(), but I don't see any difference between
doing it here.
2001-02-10 20:33:35 +00:00
Jake Burkholder
c11f93b3e7 Acquire sched_lock around need_resched() in roundrobin() to satisfy
assertions that it is held.  Since roundrobin() is a timeout there's
no possible way that it could be called with sched_lock held.
2001-02-10 19:07:32 +00:00
John Baldwin
142ba5f3d7 - Make astpending and need_resched process attributes rather than CPU
attributes.  This is needed for AST's to be properly posted in a preemptive
  kernel.  They are backed by two new flags in p_sflag: PS_ASTPENDING and
  PS_NEEDRESCHED.  They are still accesssed by their old macros:
  aston(), astoff(), etc.  For completeness, an astpending() macro has been
  added to check for a pending AST, and clear_resched() has been added to
  clear need_resched().
- Rename syscall2() on the x86 back to syscall() to be consistent with
  other architectures.
2001-02-10 02:20:34 +00:00
John Baldwin
c75e5182ce Unify the two sleep lock order lists to enforce the process lock ->
uidinfo lock locking order.
2001-02-09 20:52:02 +00:00
John Baldwin
c3a6f33758 Revert the previous revision for two reasons:
- I can't seem to reproduce the warning I got from WITNESS anymore.
- The fix was wrong.  Since a uidinfo struct is a member of proc, it
  makes sense for the locking order to be such that you are allowed to
  hold proc and then grab the uidinfo lock.
2001-02-09 20:51:11 +00:00
John Baldwin
1aa97cdea7 Work around some sizeof(long) != sizeof(int) bogons. 2001-02-09 19:02:39 +00:00
John Baldwin
062d8ff5a0 - Catch up to the new swi API changes:
- Use swi_* function names.
  - Use void * to hold cookies to handlers instead of struct intrhand *.
- In sio.c, use 'driver_name' instead of "sio" as the name of the driver
  lock to minimize diffs with cy(4).
2001-02-09 17:46:35 +00:00
John Baldwin
b4151f7101 - Move struct ithd to sys/interrupt.h.
- Add a set of MI helper functions for interrupt threads:
  - ithread_create() creates a new interrupt thread
  - ithread_destroy() destroys an interrupt thread
  - ithread_add_handler() attaches a new handler to an interrupt thread
  - ithread_remove_handler() detaches a handler from an interrupt thread
- Rename sinthand_add() and sched_swi() to swi_add() and swi_sched()
  respectively so that they live in a consistent namespace.
- struct intrhand is no longer a public type.  It would be private to
  kern_intr.c but the current implementation of fast interrupts on the
  alpha requires the type to be exported.  However, all handlers should
  be treated as void * cookies in the way that new-bus treats them.  This
  includes references to software interrupt handlers.
2001-02-09 17:42:43 +00:00
John Baldwin
8ad802d82c Release the proc lock around crfree() and uifree() in wait1(). It leads to
a lock order violation, and since p is already a zombie at this point,
I'm not sure that we even need all the locking currently in wait1().
2001-02-09 16:43:18 +00:00
John Baldwin
635962afdf Proc locking. 2001-02-09 16:27:41 +00:00
John Baldwin
929604ec9b Move the initailization of the proc lock for proc0 very early into the MD
startup code.
2001-02-09 16:25:16 +00:00
John Baldwin
a91fe908db Woops, remove an obsolete reference to gd_cpu_lockid. 2001-02-09 16:13:57 +00:00
John Baldwin
e910ba59fc - Change the 'witness_list' ddb command to 'show mutexes'. Note that this
will only display sleep mutexes held by the current process.
- Clean up some nits in the witness_display() function and add a ddb
  command 'show witness' that dumps the hierarchy and order lists to the
  console.
- Use queue(3) macros where appropriate.
- Resort the spin lock order list so that "com" is before "sched_lock".
  Also, add appropriate #ifdef's around SMP and i386-specific mutexes.
- Add two new mutexes used to protect the ithread lists and tables to the
  order list.

Requested by:	bde (1)
2001-02-09 15:19:41 +00:00
John Baldwin
cd85c9e17c Change the ktr ddb commands to be show commands. The commands are now as
follows:
 - show ktr_first	display the first entry
 - show ktr_next	display the next entry
 - show ktr		display the entire buffer

The /v modifiers continue to work as described previously.

Requested by:	bde
2001-02-09 15:07:30 +00:00
John Baldwin
7ecfc090c0 - Point out that we don't lock anything during the idle setup because
only the boot processor should be running in the comments.
- Initialize curproc to point to each CPU's respective idleproc if their
  curproc is NULL.
- Keep track of the number of context switches performed by idleproc.
2001-02-09 14:59:43 +00:00
Peter Wemm
2bd5ac330f poll(2) array limits (take 2) - after some input from bde. 2001-02-09 08:10:22 +00:00
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
John Baldwin
5dbc7fe2d7 Don't bother with acquiring/releasing Giant around kmem_malloc() and
kmem_free() for now.  Kmem_malloc() and kmem_free() now have appropriate
assertions in place, and these checks aren't feasible until more of the
networking code is locked down.  Also, the extra assertions here should
already be caught by the WITNESS code as lock order violations should
mutex operations on Giant be reintroduced here later.
2001-02-08 00:27:38 +00:00
John Baldwin
297c46b68c Don't enable interrupts for a kernel breakpoint or trace trap. Otherwise,
this negates the explicit disabling of interrupts when entering the
debugger in Debugger().
2001-02-08 00:10:07 +00:00
Peter Wemm
89b716473e The code I picked up from NetBSD in '97 had a nasty bug. It limited
the index of the pollfd array to the number of fd's currently open, not
the maximum number of fd's.  ie: if you had 0,1,2 open, you could not
use pollfd slots higher than 20.  The specs say we only have to support
OPEN_MAX [64] entries but we allow way more than that.
2001-02-07 23:28:01 +00:00
Jeroen Ruigrok van der Werven
2fa72ea7d4 Fix typo: compatability -> compatibility.
Compatability is not an existing english word.
2001-02-06 12:05:58 +00:00
Jeroen Ruigrok van der Werven
1a6e52d0e9 Fix typo: seperate -> separate.
Seperate does not exist in the english language.
2001-02-06 11:21:58 +00:00
Jeroen Ruigrok van der Werven
f09deb6962 Fix typo: wierd -> weird.
There is no such thing as wierd in the english language.
2001-02-06 09:25:10 +00:00
Jeroen Ruigrok van der Werven
ba091d9673 Fix typo: teh -> the. 2001-02-06 09:18:39 +00:00
Brian Feldman
a02f31364e It is _DEFINITELY_ not okay to change shmseg on a running system. 2001-02-04 20:10:32 +00:00
Poul-Henning Kamp
37d4006626 Another round of the <sys/queue.h> FOREACH transmogriffer.
Created with:   sed(1)
Reviewed by:    md5(1)
2001-02-04 16:08:18 +00:00
Poul-Henning Kamp
fc2ffbe604 Mechanical change to use <sys/queue.h> macro API instead of
fondling implementation details.

Created with: sed(1)
Reviewed by: md5(1)
2001-02-04 13:13:25 +00:00
Matthew Dillon
4e71e795a1 This commit represents work mainly submitted by Tor and slightly modified
by myself.  It solves a serious vm_map corruption problem that can occur
with the buffer cache when block sizes > 64K are used.  This code has been
heavily tested in -stable but only tested somewhat on -current.  An MFC
will occur in a few days.  My additions include the vm_map_simplify_entry()
and minor buffer cache boundry case fix.

Make the buffer cache use a system map for buffer cache KVM rather then a
normal map.

Ensure that VM objects are not allocated for system maps.  There were cases
where a buffer map could wind up with a backing VM object -- normally
harmless, but this could also result in the buffer cache blocking in places
where it assumes no blocking will occur, possibly resulting in corrupted
maps.

Fix a minor boundry case in the buffer cache size limit is reached that
could result in non-optimal code.

Add vm_map_simplify_entry() calls to prevent 'creeping proliferation'
of vm_map_entry's in the buffer cache's vm_map.  Previously only a simple
linear optimization was made.  (The buffer vm_map typically has only a
handful of vm_map_entry's.  This stabilizes it at that level permanently).

PR: 20609
Submitted by: (Tor Egge) tegge
2001-02-04 06:19:28 +00:00
Brian Somers
115867175a KASSERT that the minor number passed to make_dev() is valid. 2001-02-02 03:32:11 +00:00
Boris Popov
f3f1af390d Properly lock new vnode.
Reminded by:	tegge
2001-01-31 04:54:23 +00:00