Commit Graph

8920 Commits

Author SHA1 Message Date
phk
4bbae65b4a Make sbuf_copyin() return the number of bytes copied on success.
Submitted by:	"Wojciech A. Koszek" <dunstan@freebsd.czest.pl>
2005-12-23 11:49:53 +00:00
scottl
90a17769ed Create the taskqueue_fast handler with INTR_MPSAFE so that it doesn't run
with Giant.

MFC After: 3 days
2005-12-23 06:18:33 +00:00
jhb
cb0d490ebe Tweak how the MD code calls the fooclock() methods some. Instead of
passing a pointer to an opaque clockframe structure and requiring the
MD code to supply CLKF_FOO() macros to extract needed values out of the
opaque structure, just pass the needed values directly.  In practice this
means passing the pair (usermode, pc) to hardclock() and profclock() and
passing the boolean (usermode) to hardclock_cpu() and hardclock_process().
Other details:
- Axe clockframe and CLKF_FOO() macros on all architectures.  Basically,
  all the archs were taking a trapframe and converting it into a clockframe
  one way or another.  Now they can just extract the PC and usermode values
  directly out of the trapframe and pass it to fooclock().
- Renamed hardclock_process() to hardclock_cpu() as the latter is more
  accurate.
- On Alpha, we now run profclock() at hz (profhz == hz) rather than at
  the slower stathz.
- On Alpha, for the TurboLaser machines that don't have an 8254
  timecounter, call hardclock() directly.  This removes an extra
  conditional check from every clock interrupt on Alpha on the BSP.
  There is probably room for even further pruning here by changing Alpha
  to use the simplified timecounter we use on x86 with the lapic timer
  since we don't get interrupts from the 8254 on Alpha anyway.
- On x86, clkintr() shouldn't ever be called now unless using_lapic_timer
  is false, so add a KASSERT() to that affect and remove a condition
  to slightly optimize the non-lapic case.
- Change prototypeof  arm_handler_execute() so that it's first arg is a
  trapframe pointer rather than a void pointer for clarity.
- Use KCOUNT macro in profclock() to lookup the kernel profiling bucket.

Tested on:	alpha, amd64, arm, i386, ia64, sparc64
Reviewed by:	bde (mostly)
2005-12-22 22:16:09 +00:00
alc
09b6655974 Maintain the vnode lock throughout elfN_load_file() rather than releasing
it and reacquiring it in vrele().  Consequently, there is no reason to
increase the reference count on the vm object caching the file's pages.
Reviewed by: tegge

Eliminate unused parameters to elfN_load_file().
2005-12-21 18:58:40 +00:00
alc
4bc5d218ff Eliminate an unneeded (vm_prot_t) parameter from two functions. Eliminate
unnecessary uses of a local variable.

Reviewed by: tegge
2005-12-20 23:42:18 +00:00
pjd
b98bf8f1d5 Reduce Giant scope a bit, as fdrop() is believed to be MPSAFE.
The purpose of this change is consistency (not performance improvement:)),
as it was hard to tell if fdrop() is MPSAFE or not when I saw it sometimes
under the Giant and sometimes without it.

Glanced at by:	ssouhlal, kan
2005-12-20 00:49:59 +00:00
pjd
d4461845a5 vfs_mount_alloc() always returns 0, but what we really want is newly
allocated 'struct mount *' pointer, so simplify code a bit and return
the pointer directly.

Reviewed by:	ssouhlal
2005-12-20 00:43:51 +00:00
pjd
eb0bfcfd9a Use 'td' instead of 'curthread'. 2005-12-19 16:27:13 +00:00
davidxu
9dda14459d Fix a bug in slice calculation code, current code uses hz but
sched_clock() is called by state clock.

Submitted by: taku at tackymt dot homeip dot net
2005-12-19 08:26:09 +00:00
njl
035fda7990 Remove the KTR for hardclock completely. It seems to not be useful.
Requested by:	jhb
2005-12-18 18:11:55 +00:00
njl
731935d9f8 Restore KTR_CRITICAL but conditionally compile it in as KTR_SCHED.
Requested by:	scottl, jhb
2005-12-18 18:10:57 +00:00
marcel
0a081d09f4 Make our ELF64 type definitions match standards. In particular this
means:
o  Remove Elf64_Quarter,
o  Redefine Elf64_Half to be 16-bit,
o  Redefine Elf64_Word to be 32-bit,
o  Add Elf64_Xword and Elf64_Sxword for 64-bit entities,
o  Use Elf_Size in MI code to abstract the difference between
   Elf32_Word and Elf64_Word.
o  Add Elf_Ssize as the signed counterpart of Elf_Size.

MFC after: 2 weeks
2005-12-18 04:52:37 +00:00
alc
8f7e8790b1 Correct a long-standing problem in elfN_map_insert(): In order to copy a
page to user space, the user space mapping must allow write access.

In collaboration with: tegge@
MFC after: 3 weeks
2005-12-17 19:40:47 +00:00
njl
4c2aff8681 Clean up unused or poorly utilized KTR values. Remove KTR_FS, KTR_KGDB,
and KTR_IO as they were never used.  Remove KTR_CLK since it was only
used for hardclock firing and use KTR_INTR there instead.  Remove
KTR_CRITICAL since it was only used for crit enter/exit and use
KTR_CONTENTION instead.
2005-12-17 03:57:10 +00:00
jhb
ce80df24ac - Use uintfptr_t rather than int for the kernel profiling index (though it
really should be a fptrdiff_t if we had that) in profclock().
- Don't try to profile kernel pc's that are >= the kernel lowpc to avoid
  underflows when computing a profiling index.
- Use the PC_TO_I() macro to compute the kernel profiling index rather than
  doing it inline.

Discussed with:	bde
2005-12-16 22:11:52 +00:00
jhb
60c3b40e9e Change the addupc_*() functions to use the uintfptr_t type for pc rather
than uintptr_t as that is technically more correct.
2005-12-16 22:08:32 +00:00
alc
8df8bb9f23 Style: The second argument to vm_map_find() should be NULL instead of 0. 2005-12-16 19:14:25 +00:00
alc
f69d4d5fa8 Use sf_buf_alloc() instead of vm_map_find() on exec_map to create the
ephemeral mappings that are used as the source for three copy
operations from kernel space to user space.  There are two reasons for
making this change: (1) Under heavy load exec_map can fill up causing
vm_map_find() to fail.  When it fails, the nascent process is aborted
(SIGABRT).  Whereas, this reimplementation using sf_buf_alloc()
sleeps.  (2) Although it is possible to sleep on vm_map_find()'s
failure until address space becomes available (see kmem_alloc_wait()),
using sf_buf_alloc() is faster.  Furthermore, the reimplementation
uses a CPU private mapping, avoiding a TLB shootdown on
multiprocessors.

Problem uncovered by: kris@
Reviewed by: tegge@
MFC after: 3 weeks
2005-12-16 18:34:14 +00:00
delphij
4ea00e0984 In pipe_write(): when uiomove() fails, do not spin on it forever.
Submitted by:	Kostik Belousov <kostikbel at gmail.com> on -current@
Message-ID:	<20051216151016.GE84442@deviant.zoral.local>
MFC After:	3 weeks
2005-12-16 18:32:39 +00:00
davidxu
2673c91f24 Replace selwakeuppri with selwakeup, let scheduler figure out
appropriate thread priority.
2005-12-16 15:01:16 +00:00
emaste
a7aeead21d When using m_dup(9) to copy more than MHLEN bytes of data, don't create an
mbuf chain that starts with a cluster containing just MHLEN bytes.  This
happened because m_dup called m_get or m_getcl depending on the amount of
data to copy, but then always set the size available in the first mbuf to
MHLEN.

Submitted by:	Matt Koivisto <mkoivisto at sandvine dot com>
Approved by:	jmg
Silence from:	rwatson (mentor)
2005-12-14 23:34:26 +00:00
mux
b29e3549b8 Fix a bunch of SYSCTL_INT() that should have been SYSCTL_ULONG() to
match the type of the variable they are exporting.

Spotted by:	Thomas Hurst <tom@hur.st>
MFC after:	3 days
2005-12-14 22:27:48 +00:00
des
5d3c44687b Eradicate caddr_t from the VFS API. 2005-12-14 00:49:52 +00:00
jhb
c69212d7ad Add a new 'show lock' command to ddb. If the argument has a valid lock
class, then it displays various information about the lock and calls a
new function pointer in lock_class (lc_ddb_show) to dump class-specific
information about the lock as well (such as the owner of a mutex or
xlock'ed sx lock).  This is easier than staring at hex dumps of locks to
figure out who owns the lock, etc.  Note that extending lock_class doesn't
affect the ABI for any kernel modules as the only code that deals with
lock_class structures directly is kern_mutex.c, kern_sx.c, and witness.

MFC after:	1 week
2005-12-13 23:14:35 +00:00
davidxu
a7ea81a09f Stop fiddling thread priority with msleep, eliminating unnecessary
context switching. This improves performance about 30% on UP machine.
2005-12-12 05:04:56 +00:00
rodrigc
4377e3b906 Contributions from XFS for FreeBSD project:
- Implement cv_wait_unlock() method which has semantics compatible
  with the sv_wait() method in IRIX.  For cv_wait_unlock(), the lock
  must be held before entering the function, but is not held when the
  function is exited.

- Implement the existing cv_wait() function in terms of cv_wait_unlock().

Submitted by:	kan
Feedback from:	jhb, trhodes, Christoph Hellwig <hch at infradead dot org>
2005-12-12 00:02:22 +00:00
alc
a5d0ac5faf Remove unneeded calls to pmap_remove_all(). The given page is not mapped.
Reviewed by: tegge
2005-12-11 22:06:57 +00:00
andre
5751cf9192 Hide the 4k mbuf clusters if the normal clusters are defined to be
4k already.

This unbreaks tinderbox.

Submitted by:	ru
2005-12-10 15:21:04 +00:00
davidxu
c7a54e3a64 Fix compiling warning on 64 bits system. 2005-12-09 13:16:48 +00:00
davidxu
52e3a6cedd Add a sysctl to force a process to sigexit if a trap signal is
being hold by current thread or ignored by current process,
otherwise, it is very possible the thread will enter an infinite loop
and lead to an administrator's nightmare.
2005-12-09 08:29:29 +00:00
davidxu
1628e16677 Register itimers_event_hook as a kernel event handler, so I don't
have to duplicate code to call it in exec() and exit1().
2005-12-09 05:43:26 +00:00
davidxu
459600a8d3 Comment out mqfs_create_link. Inline some small functions. 2005-12-09 02:38:29 +00:00
davidxu
c0c32b144f Now SIGCHLD is always queued. 2005-12-09 02:27:55 +00:00
davidxu
2ee31f310b Cleanup sigqueue sysctl. 2005-12-09 02:26:44 +00:00
andre
143b5d29e0 Add an API for jumbo mbuf cluster allocation and also provide
4k clusters in addition to 9k and 16k ones.

 struct mbuf *m_getjcl(int how, short type, int flags, int size)
 void *m_cljget(struct mbuf *m, int how, int size)

m_getjcl() returns an mbuf with a cluster of the specified size attached
like m_getcl() does for 2k clusters.

m_cljget() is different from m_clget() as it can allocate clusters
without attaching them to an mbuf.  In that case the return value
is the pointer to the cluster of the requested size.  If an mbuf was
specified, it gets the cluster attached to it and the return value
can be safely ignored.

For size both take MCLBYTES, MJUM4BYTES, MJUM9BYTES, MJUM16BYTES.

Reviewed by:	glebius
Tested by:	glebius
Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-12-08 13:13:06 +00:00
rodrigc
89bb16053b In devfs_first(), set mp->mnt_opt to a valid empty list of mount options
instead of leaving it NULL.  This eliminates a kernel panic
when trying to do a mount -o update of /dev.

Noticed by:	cjsp
Reviewed by:	phk
2005-12-08 04:27:53 +00:00
rodrigc
d4df430592 Add "errmsg" to list of global mount options. 2005-12-08 04:09:29 +00:00
rodrigc
5a03a98174 Changes imported from XFS for FreeBSD project:
- add fields to struct buf (needed by XFS)
    - 3 private fields: b_fsprivate1, b_fsprivate2, b_fsprivate3
    - b_pin_count, count of pinned buffer

- add new B_MANAGED flag
- add breada() function to initiate asynchronous I/O on read-ahead blocks.
- add bufdone_finish(), bpin(), bunpin_wait() functions

Patches provided by:	kan
Reviewed by:		phk
Silence on:		arch@
2005-12-07 03:39:08 +00:00
alc
71b52e0fd2 Reduce the scope of the page queues lock in exec_map_first_page(). The vm
object lock is sufficient for reading a page's PG_BUSY and busy flags.

MFC after: 1 week
2005-12-06 07:39:36 +00:00
davidxu
34bbe012ae o Turn on MPSAFE flag for mqueuefs.
o Reuse si_mqd field in siginfo_t, this also gives userland
  information about which descriptor is notified.
2005-12-06 06:22:12 +00:00
davidxu
2d6ec412df Fix a lock leak in childproc_continued(). 2005-12-06 05:30:13 +00:00
jhb
7e42aad088 Tweak witness handling of lock object to shave 2 pointers off of each
lock object (and thus off of each mutex and sx lock):
- Rename the all_locks list to pending_locks and only put locks initialized
  before SI_SUB_WITNESS on the list so that the SI_SUB_WITNESS can add them
  to witness once it starts up.
- Now that pending_locks is only used during early startup, change it from
  a TAILQ to an STAILQ.  This removes a pointer from the STAILQ_ENTRY in
  struct lock_object.
- Since the pending_locks list is only used during the single-threaded
  early boot it no longer needs to be protected by a mutex, so remove
  all_mtx.
- Since the lo_list member of struct lock_object is now only used during
  early boot before witness is running, collapse lo_list and lo_witness
  into a union.  This shaves the second pointer off of struct lock_object.
- Axe lock_cur_cnt and lock_max_cnt.

With these changes, struct mtx shrinks from 36 to 28 bytes on 32-bit
platforms and from 72 to 56 bytes on 64-bit platforms.  Note that this
commit will completely and utterly destroy the kernel ABI, so no MFC.

Tested on:	alpha, amd64, i386, sparc64
2005-12-05 20:45:24 +00:00
davidxu
1e351478d4 After reading some documents, I realized SIGEV_NONE != NULL, also
fix code in mqueue_send_notification to handle SIGEV_NONE.
2005-12-05 04:41:32 +00:00
davidxu
d4b584de18 Handle SIGEV_NONE, if notification is SIGEV_NONE, error status and
return status will be set, but no notification will be registered.
Increase hard limit of maxmsg to 100, so posixtestsuite ports can run.
2005-12-05 03:23:27 +00:00
ru
522e9c2b7b Fix -Wundef. 2005-12-04 02:12:43 +00:00
rodrigc
fa8afa1a00 Add "rdonly" to global_opts, and parse it in vfs_donmount().
Requested by:	rwatson
2005-12-03 12:04:20 +00:00
rodrigc
16d338ecbb - Add "rw" mount option to global_opts.
- In vfs_donmount(), parse "ro", "noro", and "rw", in order to set or
  unset the MNT_RDONLY filesystem flag.
2005-12-03 01:26:27 +00:00
davidxu
e184004d54 1. Cleanup including.
2. Set configuration value for CTL_P1003_1B_MESSAGE_PASSING.
2005-12-02 14:09:32 +00:00
davidxu
a8411b92b7 1. Check if message priority is less than MQ_PRIO_MAX.
2. Use getnanotime instead of getnanouptime.
3. Don't free message in _mqueue_send, mqueue_send will free it.
2005-12-02 08:23:49 +00:00
davidxu
739ca77c48 1. Set timer configuration values for sysconf().
2. Set overrun limit to INT_MAX, report ERANGE error if overrun will be
   greater than INT_MAX.
2005-12-01 07:56:15 +00:00