8525 Commits

Author SHA1 Message Date
jeff
6eaf0bed4f - Fix the case where we're not preempting but there is already a newtd
as this happens via thread_switchout().  I don't particularly like the
   structure of the code here.  We twice call out to thread code when
   a thread is voluntarily switching.  Once to thread_switchout() and once
   to slot_fill(), while sched_4BSD does even more work which is redundant
   to select another thread to use our remaining slice.  This should be
   simplified in the future, but for now I'm only going to fix the bug not
   the bad design.
2005-06-07 02:59:16 +00:00
dwhite
1d894721d3 Make "show msgbuf" use the pager instead of blasting the whole thing out.
MFC after:	3 days
2005-06-06 22:18:32 +00:00
davidxu
c2895a92bd Fix a bug relavant to debugging, a masked signal unexpectedly interrupts
a sleeping thread when process is being debugged.

PR: GNU/77818
Tested by: Sean C. Farley <sean-freebsd at farley org>
2005-06-06 05:13:10 +00:00
gallatin
c6980c2b7a Allow sends sent from non page-aligned userspace addresses to be
considered for zero-copy sends.

Reviewed by: alc
Submitted by: Romer Gil at Rice University
2005-06-05 17:13:23 +00:00
alc
981752ea4e Eliminate an unused field from struct aio_liojob. 2005-06-05 05:41:48 +00:00
marius
c74fc16e2d After some input from bde@ and rereading the datasheet use a MTX_SPIN
mutex instead of a MTX_DEF one in order to defer preemption while
reading the date and time registers. If we don't manage to read them
within the time slot where we are guaranteed that no updates occur we
might actually read them during an update in which case the output is
undefined.
2005-06-04 23:24:50 +00:00
alc
369cab6800 Eliminate the original method of requesting notification of aio_read(2) and
aio_write(2) completion through kevent(2).  This method does not work on
64-bit architectures.  It was deprecated in FreeBSD 4.4.  See revisions
1.87 and 1.70.2.7.

Change aio_physwakeup() to call psignal(9) directly rather than indirectly
through a timeout(9).  Discussed with: bde

Correct a bug introduced in revision 1.65 that could result in premature
delivery of a signal if an lio_listio(2) consisted of a mixture of
direct/raw and queued I/O operations.  Observed by: tegge

Eliminate a field from struct kaioinfo that is now unused.

Reviewed by: tegge
2005-06-04 19:16:33 +00:00
jeff
d33221f20a - It's 2005 already, I've been working on this for three years. 2005-06-04 09:24:15 +00:00
jeff
c720bf6f50 - Don't SLOT_USE() in the preempt case, sched_add() has already taken the
slot for us.  Previously, we would take two slots on every preempt, and
   setrunqueue() would fix it up for us in the non threaded case.  The
   threaded case was simply broken.
 - Clean up flags, prototypes, comments.
2005-06-04 09:23:28 +00:00
ps
bac0ce72d5 Wrap copyin/copyout for kevent so the 32bit wrapper does not have
to malloc nchanges * sizeof(struct kevent) AND/OR nevents *
sizeof(struct kevent) on every syscall.

Glanced at by:	peter, jmg
Obtained from:	Yahoo!
MFC after:	2 weeks
2005-06-03 23:15:01 +00:00
alc
652afa11cf Synchronize access to the per process aiocb lists in many of the functions. 2005-06-03 05:27:20 +00:00
alc
685bbca37b In aio_waitcomplete() correct two cases of using an aiocb after freeing it. 2005-06-02 23:14:38 +00:00
alc
47a9b57f58 Giant is no longer required in kern_setrlimit(); remove its acquisition and
release.

Reviewed by: jhb
2005-06-01 17:52:51 +00:00
kensmith
3a7e275ce6 This patch addresses a standards violation issue. The standards say a
file's access time should be updated when it gets executed.  A while
ago the mechanism used to exec was changed to use a more mmap based
mechanism and this behavior was broken as a side-effect of that.

A new vnode flag is added that gets set when the file gets executed,
and the VOP_SETATTR() vnode operation gets called.  The underlying
filesystem is expected to handle it based on its own semantics, some
filesystems don't support access time at all.  Those that do should
handle it in a way that does not block, does not generate I/O if possible,
etc.  In particular vn_start_write() has not been called.  The UFS code
handles it the same way as it would normally handle the access time if
a file was read - the IN_ACCESS flag gets set in the inode but no other
action happens at this point.  The actual time update will happen later
during a sync (which handles all the necessary locking).

Got me into this:	cperciva
Discussed with:		a lot with bde, a little with kan
Showed patches to:	phk, jeffr, standards@, arch@
Minor discussion on:	arch@
2005-05-31 19:39:52 +00:00
alc
09a2a99469 Synchronize access to aio_freeproc with a mutex. Eliminate related spl
calls.

Reduce the scope of Giant in aio_daemon().
2005-05-30 22:26:34 +00:00
alc
0a10c2b5cd Use the proc mtx to prevent simultaneous changes to p_aioinfo. 2005-05-30 19:33:33 +00:00
alc
3ecc8d1129 Eliminate unnecessary calls to wakeup(); no one sleeps on &aio_freeproc.
Eliminate an unused flag, AIOP_SCHED; it's cleared but never set.
2005-05-30 18:02:00 +00:00
rwatson
5010364761 Rebuild generated system call definition files following the addition of
the audit event field to the syscalls.master file format.

Submitted by:	wsalamon
Obtained from:	TrustedBSD Project
2005-05-30 15:20:21 +00:00
rwatson
370e72b242 Introduce a new field in the syscalls.master file format to hold the
audit event identifier associated with each system call, which will
be stored by makesyscalls.sh in the sy_auevent field of struct sysent.
For now, default the audit identifier on all system calls to AUE_NULL,
but in the near future, other BSM event identifiers will be used.  The
mapping of system calls to event identifiers is many:one due to
multiple system calls that map to the same end functionality across
compatibility wrappers, ABI wrappers, etc.

Submitted by:	wsalamon
Obtained from:	TrustedBSD Project
2005-05-30 15:09:18 +00:00
jeff
33b78c31e9 - Add bufobj_wrefl() to add a write ref to a bufobj that is already locked. 2005-05-30 07:01:18 +00:00
jkoshy
ad86ac4ba4 Kernel hooks to support PMC sampling modes.
Reviewed by:	alc
2005-05-30 06:29:29 +00:00
alc
f570134192 Eliminate aio_activeproc; it's unused. 2005-05-30 05:25:10 +00:00
alc
404d37a14d Eliminate aio_bufjobs; it's unused. 2005-05-29 21:29:15 +00:00
rwatson
aaf5c1d3e8 Normalize white space in syscalls.master: try to use tabs before system
call types.
2005-05-29 20:20:16 +00:00
rwatson
7035f9f56a Kernel malloc layers malloc_type allocation over one of two underlying
allocators: a set of power-of-two UMA zones for small allocations, and the
VM page allocator for large allocations.  In order to maintain unified
statistics for specific malloc types, kernel malloc maintains a separate
per-type statistics pool, which can be monitored using vmstat -m.  Prior
to this commit, each pool of per-type statistics was protected using a
per-type mutex associated with the malloc type.

This change modifies kernel malloc to maintain per-CPU statistics pools
for each malloc type, and protects writing those statistics using critical
sections.  It also moves to unsynchronized reads of per-CPU statistics
when generating coalesced statistics.  To do this, several changes are
implemented:

- In the previous world order, the statistics memory was allocated by
  the owner of the malloc type structure, allocated statically using
  MALLOC_DEFINE().  This embedded the definition of the malloc_type
  structure into all kernel modules.  Move to a model in which a pointer
  within struct malloc_type points at a UMA-allocated
  malloc_type_internal data structure owned and maintained by
  kern_malloc.c, and not part of the exported ABI/API to the rest of
  the kernel.  For the purposes of easing a possible MFC, re-use an
  existing pointer in 'struct malloc_type', and maintain the current
  malloc_type structure size, as well as layout with respect to the
  fields reused outside of the malloc subsystem (such as ks_shortdesc).
  There are several unused fields as a result of no longer requiring
  the mutex in malloc_type.

- Struct malloc_type_internal contains an array of malloc_type_stats,
  of size MAXCPU.  The structure defined above avoids hard-coding a
  kernel compile-time value of MAXCPU into kernel modules that interact
  with malloc.

- When accessing per-cpu statistics for a malloc type, surround read -
  modify - update requests with critical_enter()/critical_exit() in
  order to avoid races during write.  The per-CPU fields are written
  only from the CPU that owns them.

- Per-CPU stats now maintained "allocated" and "freed" counters for
  number of allocations/frees and bytes allocated/freed, since there is
  no longer a coherent global notion of the totals.  When coalescing
  malloc stats, accept a slight race between reading stats across CPUs,
  and avoid showing the user a negative allocation count for the type
  in the event of a race.  The global high watermark is no longer
  maintained for a malloc type, as there is no global notion of the
  number of allocations.

- While tearing up the sysctl() path, also switch to using sbufs.  The
  current "export as text" sysctl format is retained with the same
  syntax.  We may want to change this in the future to export more
  per-CPU information, such as how allocations and frees are balanced
  across CPUs.

This change results in a substantial speedup of kernel malloc and free
paths on SMP, as critical sections (where usable) out-perform mutexes
due to avoiding atomic/bus-locked operations.  There is also a minor
improvement on UP due to the slightly lower cost of critical sections
there.  The cost of the change to this approach is the loss of a
continuous notion of total allocations that can be exploited to track
per-type high watermarks, as well as increased complexity when
monitoring statistics.

Due to carefully avoiding changing the ABI, as well as hardening the ABI
against future changes, it is not necessary to recompile kernel modules
for this change.  However, MFC'ing this change to RELENG_5 will require
also MFC'ing optimizations for soft critical sections, which may modify
exposed kernel ABIs.  The internal malloc API is changed, and
modifications to vmstat in order to restore "vmstat -m" on core dumps will
follow shortly.

Several improvements from:		bde
Statistics approach discussed with:	ups
Tested by:				scottl, others
2005-05-29 13:38:07 +00:00
pjd
58d2b4c193 Fix panic when module is compiled in and it is loaded from loader.conf.
Only panic is fixed, module will be still listed in kldstat(8) output.
Not sure what is correct fix, because adding unloading code in case of
failure to linker_init_kernel_modules() doesn't work.
2005-05-28 23:20:05 +00:00
gad
2add4b872d Change the way options are parsed on the `#!'-line of a shell-script. Instead
of having the kernel parse that line and add an entry to the argument list for
each 'separate word' it finds, have it add only one entry which holds all
the words found on that line.  The old behavior is useful in some situations,
but it does not match the way any other operating system will parse that line.

This has been discussed in the thread "Bug in #! processing - One More Time"
on the freebsd-arch mailing list (starting back on Feb 24, 2005).  The first
few messages in that thread provide the background in much detail.

PR:		16393
Reviewed by:	freebsd-arch
2005-05-28 22:42:41 +00:00
pjd
7543a23525 Prevent loading modules with are compiled into the kernel.
PR:		kern/48759
Submitted by:	Pawe³ Ma³achowski <pawmal@unia.3lo.lublin.pl>
Patch from:	demon
MFC after:	2 weeks
2005-05-28 22:29:44 +00:00
rwatson
067b94d2d9 Regenerate from syscalls.master. 2005-05-28 14:35:43 +00:00
rwatson
c0001c0613 Mark ntp_gettime() as MSTD, since its system call path will acquire
Giant if required.
2005-05-28 14:35:05 +00:00
rwatson
f1dfea9d61 Explicitly acquire Giant around the ntp_gettime() and assert it in the
sysctl path.  While this code is close to MPSAFE, it may require some
additional locking.  Mark ntp_gettime1() as GIANT_REQUIRED for now.

Suggested by:	phk
2005-05-28 14:34:41 +00:00
rwatson
fb931ae00a Regenerate for updated syscalls.master. 2005-05-28 13:24:05 +00:00
rwatson
ceb26b4c48 Mark the following compatability system calls as MCOMPAT or MCOMPAT4 based
on the their simply wrapping MPSAFE implementations of existing MPSAFE
system calls:

  getfsstat()
  lseek()
  stat()
  lstat()
  truncate()
  ftruncate()
  statfs()
  fstatfs()

Note that ogetdirentries() is not marked MPSAFE because it does not share
the MPSAFE implementation used for getdirentries(), and requires separate
locking to be implemented.
2005-05-28 13:23:42 +00:00
rwatson
c060f4b949 Regenerate from syscalls.master. 2005-05-28 13:13:01 +00:00
rwatson
0439e13c01 Mark quotactl() as MSTD. 2005-05-28 13:12:04 +00:00
rwatson
527c640ad3 Acquire Giant explicitly in quotactl() so that the syscalls.master
entry can become MSTD.
2005-05-28 13:11:35 +00:00
rwatson
ff36d1a493 Regenerate from updated syscalls.master. 2005-05-28 13:09:56 +00:00
rwatson
35ffa17830 Mark kenv(2) as MPSAFE, since it appears to be properly locked down. 2005-05-28 13:09:41 +00:00
rwatson
ea08d61a73 Regenerate system call tables from syscalls.master. 2005-05-28 13:08:26 +00:00
rwatson
acb673063c Also mark the COMPAT4 version of fhstatfs() as MPSAFE. 2005-05-28 13:07:43 +00:00
rwatson
fa7cf37c72 Mark fhopen(), fhstat(), and fhstatfs() as MSTD, since they now
acquire Giant themselves.
2005-05-28 12:59:33 +00:00
rwatson
66d882141f Acquire Giant explicitly in fhopen(), fhstat(), and kern_fhstatfs(),
so that we can start to eliminate the presence of non-MPSAFE system
call entries in syscalls.master.
2005-05-28 12:58:54 +00:00
pjd
ac435fbb13 Remove (now) unused argument 'td' from cvtstatfs(). 2005-05-27 19:23:48 +00:00
pjd
788f75ddb2 Sync locking in freebsd4_getfsstat() with getfsstat().
Giant is probably also needed in kern_fhstatfs().
2005-05-27 19:21:08 +00:00
pjd
2fc56b12a9 Use consistent style in functions I want to modify in the near future. 2005-05-27 19:15:46 +00:00
rwatson
ac1a365e2d In the current world order, each socket has two mutexes: a mutex
that protects socket and receive socket buffer state, and a second
mutex to protect send socket buffer state.  In some places, the
mutex shared between the socket and receive socket buffer will be
acquired twice, once by each layer, resulting in some
inconsistency, but providing the abstraction benefit of being able
to more easily separate the two mutexes in the future if desired.

When transitioning a socket to the SS_ISDISCONNECTING or
SS_ISDISCONNECTED states, grab the socket/receive socket buffer lock
once rather than grabbing it as the socket lock, modifying socket
state, then grabbing a second time as the receive lock in order to
modify the socket buffer state to indicate no further data can be
read.  This change is believed to close a race between the change in
socket state and the change in socket buffer state, which for a
remotely initiated close on a UNIX domain socket, resulted in
soreceive() returning ENOTCONN rather than an EOF condition.

A similar race still exists in the case of send, however, and is
harder to fix as the socket and send socket buffer mutexes are not
the same, and we would like to avoid holding combinations of socket
mutexes over sb_upcall until we've finished clarifying the locking
protocol for upcalls.

This change has the side affect of reducing the number of mutex
operations to initiate disconnect or perform disconnect on a
socket by two.

PR:		78824
Rerported by:	Marc Olzheim <marcolz@stack.nl>
MFC after:	2 weeks
2005-05-27 17:16:43 +00:00
davidxu
5a8d3af0d6 Remove thread_upcall_check, it was used to avoid race bug in earlier
day's sleep queue code, today the bug no longer exists.
please see 04/25/2004 freebsd-threads@ mailing list archive.
2005-05-27 15:57:27 +00:00
davidxu
3fbc6983fa Remove sleep queue hack, it is no longer needed with current sleep queue.
Actually, it causes process to hang when it is being debugged.

PR: gnu/77818
2005-05-27 04:27:22 +00:00
jmg
07e93041c6 make stat return an zero'd struct, and be a FIFO again... This is only
to fix libc_r since it requires stat to close fd's, and so commented in
the code...

PR:		threads/75795
Reviewed by:	ps
MFC after:	1 week
2005-05-24 23:42:50 +00:00
cognet
9bcd47137c Don't set the default of kern.fallback_elf_brand to FreeBSD for arm, as
binutils now do the job for us
2005-05-24 22:21:44 +00:00