Commit Graph

7069 Commits

Author SHA1 Message Date
peter
c59ea86c9f Regen for mpsafe kse_create() 2004-03-13 22:32:17 +00:00
peter
1cb95fd2b7 Push Giant down a little further:
- no longer serialize on Giant for thread_single*() and family in fork,
  exit and exec
- thread_wait() is mpsafe, assert no Giant
- reduce scope of Giant in exit to not cover thread_wait and just do
  vm_waitproc().
- assert that thread_single() family are not called with Giant
- remove the DROP/PICKUP_GIANT macros from thread_single() family
- assert that thread_suspend_check() s not called with Giant
- remove manual drop_giant hack in thread_suspend_check since we know it
  isn't held.
- remove the DROP/PICKUP_GIANT macros from thread_suspend_check() family
- mark kse_create() mpsafe
2004-03-13 22:31:39 +00:00
rwatson
6a1193a64c Add annotations to mtx_lock(&Giant) in kern_select() and poll() that
we always grab Giant, even if we're actually only polling objects that
don't require giant.  Once socket locking is merged, there will be
strong motivation to fix this.
2004-03-13 05:58:57 +00:00
bde
58d250bc29 Align the offset in vn_rdwr_inchunks() so that at most the first and
the last chunk are misaligned relative to a MAXBSIZE byte boundary.
vn_rdwr_inchunks() is used mainly for elf core dumps, and elf sections
are usually perfectly misaligned relative to MAXBSIZE, and chunking
prevents the file system from doing much realigning.

This gives a surprisingly large speedup for core dumps -- from 50 to
13 seconds for a 512MB core dump here.  The pessimization was mostly
from an interaction of the misalignment with IO_DIRECT.  It increased
the number of i/o's for each chunk by a factor of 5 (3 writes and 2
read-before-writes instead of 1 write).
2004-03-13 02:56:27 +00:00
trhodes
dfcfecd6e4 These are changes to allow to use the Intel C/C++ compiler (lang/icc)
to build the kernel. It doesn't affect the operation if gcc.

Most of the changes are just adding __INTEL_COMPILER to #ifdef's, as
icc v8 may define __GNUC__ some parts may look strange but are
necessary.

Additional changes:
 - in_cksum.[ch]:
   * use a generic C version instead of the assembly version in the !gcc
     case (ASM code breaks with the optimizations icc does)
     -> no bad checksums with an icc compiled kernel
     Help from:		andre, grehan, das
     Stolen from: 	alpha version via ppc version
     The entire checksum code should IMHO be replaced with the DragonFly
     version (because it isn't guaranteed future revisions of gcc will
     include similar optimizations) as in:
        ---snip---
          Revision  Changes    Path
          1.12      +1 -0      src/sys/conf/files.i386
          1.4       +142 -558  src/sys/i386/i386/in_cksum.c
          1.5       +33 -69    src/sys/i386/include/in_cksum.h
          1.5       +2 -0      src/sys/netinet/igmp.c
          1.6       +0 -1      src/sys/netinet/in.h
          1.6       +2 -0      src/sys/netinet/ip_icmp.c

          1.4       +3 -4      src/contrib/ipfilter/ip_compat.h
          1.3       +1 -2      src/sbin/natd/icmp.c
          1.4       +0 -1      src/sbin/natd/natd.c
          1.48      +1 -0      src/sys/conf/files
          1.2       +0 -1      src/sys/conf/files.amd64
          1.13      +0 -1      src/sys/conf/files.i386
          1.5       +0 -1      src/sys/conf/files.pc98
          1.7       +1 -1      src/sys/contrib/ipfilter/netinet/fil.c
          1.10      +2 -3      src/sys/contrib/ipfilter/netinet/ip_compat.h
          1.10      +1 -1      src/sys/contrib/ipfilter/netinet/ip_fil.c
          1.7       +1 -1      src/sys/dev/netif/txp/if_txp.c
          1.7       +1 -1      src/sys/net/ip_mroute/ip_mroute.c
          1.7       +1 -2      src/sys/net/ipfw/ip_fw2.c
          1.6       +1 -2      src/sys/netinet/igmp.c
          1.4       +158 -116  src/sys/netinet/in_cksum.c
          1.6       +1 -1      src/sys/netinet/ip_gre.c
          1.7       +1 -2      src/sys/netinet/ip_icmp.c
          1.10      +1 -1      src/sys/netinet/ip_input.c
          1.10      +1 -2      src/sys/netinet/ip_output.c
          1.13      +1 -2      src/sys/netinet/tcp_input.c
          1.9       +1 -2      src/sys/netinet/tcp_output.c
          1.10      +1 -1      src/sys/netinet/tcp_subr.c
          1.10      +1 -1      src/sys/netinet/tcp_syncache.c
          1.9       +1 -2      src/sys/netinet/udp_usrreq.c

          1.5       +1 -2      src/sys/netinet6/ipsec.c
          1.5       +1 -2      src/sys/netproto/ipsec/ipsec.c
          1.5       +1 -1      src/sys/netproto/ipsec/ipsec_input.c
          1.4       +1 -2      src/sys/netproto/ipsec/ipsec_output.c

          and finally remove
            sys/i386/i386        in_cksum.c
            sys/i386/include     in_cksum.h
        ---snip---
 - endian.h:
   * DTRT in C++ mode
 - quad.h:
   * we don't use gcc v1 anymore, remove support for it
   Suggested by:	bde (long ago)
 - assym.h:
   * avoid zero-length arrays (remove dependency on a gcc specific
     feature)
     This change changes the contents of the object file, but as it's
     only used to generate some values for a header, and the generator
     knows how to handle this, there's no impact in the gcc case.
   Explained by:	bde
   Submitted by:	Marius Strobl <marius@alchemy.franken.de>
 - aicasm.c:
   * minor change to teach it about the way icc spells "-nostdinc"
   Not approved by:	gibbs (no reply to my mail)
 - bump __FreeBSD_version (lang/icc needs to know about the changes)

Incarnations of this patch survive gcc compiles since a loooong time,
I use it on my desktop. An icc compiled kernel works since Nov. 2003
(exceptions: snd_* if used as modules), it survives a build of the
entire ports collection with icc.

Parts of this commit contains suggestions or submissions from
Marius Strobl <marius@alchemy.franken.de>.

Reviewed by:	-arch
Submitted by:	netchild
2004-03-12 21:45:33 +00:00
ru
50a11e8dfd Do what the execve(2) manpage says and enforce what a Strictly
Conforming POSIX application should do by disallowing the argv
argument to be NULL.

PR:		kern/33738
Submitted by:	Marc Olzheim, Serge van den Boom
OK'ed by:	nectar
2004-03-12 21:06:20 +00:00
kensmith
2cb0d83a6c This is a temporary fix to solve a regression issue on sparc64 that
is caused by the way sparc64 registers its CPUs.  Nate will work on
a real fix shortly.

Approved by:	njl
2004-03-12 20:35:21 +00:00
jhb
c754b5af47 - Remove old sleep queues.
- Remove sleepqueue argument from sleepq_set_timeout() since it is not
  used.
2004-03-12 19:06:18 +00:00
jhb
6103cfbeb5 Fixup a comment. 2004-03-12 19:05:46 +00:00
des
a77c8e8035 Replace a manual check of a VMIO candidate with vn_canvmio(). This
silences an annoying warning in getblk() when VMIO'ing on a directory
vnode, which can happen when vfs.vmiodirenable is 1.

Bring the warning message in line with reality at the same time.

Submitted by:	hmp
2004-03-12 12:02:12 +00:00
phk
5c532f7fd4 When I was a kid my work table was one cluttered mess an cleaning it up
were a rather overwhelming task.  I soon learned that if you don't know
where you're going to store something, at least try to pile it next to
something slightly related in the hope that a pattern emerges.

Apply the same principle to the ffs/snapshot/softupdates code which have
leaked into specfs:  Add yet a buf-quasi-method and call it from the
only two places I can see it can make a difference and implement the
magic in ffs_softdep.c where it belongs.

It's not pretty, but at least it's one less layer violated.
2004-03-11 18:50:33 +00:00
phk
2a5e157787 Properly vector all bwrite() and BUF_WRITE() calls through the same path
and s/BUF_WRITE()/bwrite()/ since it now does the same as bwrite().
2004-03-11 18:02:36 +00:00
phk
9ba3cede82 Remove unused mnt_reservedvnlist field. 2004-03-11 16:59:57 +00:00
phk
eeb7579130 Remove unused second arg to vfinddev().
Don't call addaliasu() on VBLK nodes.
2004-03-11 16:33:11 +00:00
phk
7ad97e57ad Correctly account for extra bits in unit numbers when looking for
next free unit.
2004-03-11 14:11:02 +00:00
phk
fdd216910f Add clone_setup() function rather than rely on lazy initialization.
Requested by:	rwatson
2004-03-11 12:58:55 +00:00
jmg
cf1b8bdb72 make sure we had the filedesc lock when calling fdinit when RFCFDG is set
on call to rfork.

Submitted by:	Brian Buchanan
Semi-Reviewed by: rwatson
2004-03-10 00:27:36 +00:00
njl
89565a7301 Hook CPUs up to newbus. CPUs will ultimately be a bus driver so that
multiple CPU-specific drivers can attach.  This is a work in progress
so children aren't supported yet.

Help from:	jhb
2004-03-09 03:37:21 +00:00
rwatson
8ff4e76430 Mark loadaverage callout as CALLOUT_MPSAFE.
Reviewed by:	jhb
2004-03-08 22:01:19 +00:00
pjd
b2d1dcd936 Add two new sysctls:
- security.bsd.hardlink_check_uid, when set, means, that unprivileged
		users are not permitted to create hard links to files not
		owned by them,
	- security.bsd.hardlink_check_gid, when set, means, that unprivileged
		users are not permitted to create hard links to files owned
		by group they don't belong to.

OK'ed by:	rwatson
2004-03-08 20:37:25 +00:00
peter
836666b0d7 Move a vref call outside of proc locks and Giant. By virtue of the fact
that we (p1) are currently running, we hold a reference on p_textvp which
means the vnode cannot go away.  p2 cannot run yet (and hence cannot exit)
so this should be safe to do at this point.  As a bonus, it removes a
block of under-Giant code that was there to support the vref.
2004-03-08 00:32:34 +00:00
alc
6b2e0639b2 Remove GIANT_REQUIRED from vunmapbuf(). 2004-03-07 00:37:18 +00:00
alc
94f567f9bb Giant is not required for vm_thread_new_altkstack(). 2004-03-07 00:06:32 +00:00
kan
e795b7939d Always call vn_finished_write after vn_start_write was called. All
occurences of 'goto done' after vn_start_write invocation were cleaning
up incompletely.
2004-03-06 04:09:54 +00:00
peter
8ac8c686e1 Add a missing part of jhb's previous commit. It looks like he had a
patch chunk rejected that he missed.  This would manifest as a lock
assertion panic at boot (Giant not locked in kern_fork.c).

Obtained from:  jhb
2004-03-06 00:44:59 +00:00
jhb
2642ed4029 kthread_exit() no longer requires Giant, so don't force callers to acquire
Giant just to call kthread_exit().

Requested by:	many
2004-03-05 22:42:17 +00:00
jhb
4e1bd1e348 - Push down Giant in exit() and wait().
- Push Giant down a bit in coredump() and call coredump() with the proc
  lock already held rather than unlocking it only to turn around and
  relock it.

Requested by:	peter
2004-03-05 22:39:53 +00:00
jhb
445ca63264 Lock Giant around the single threading code in exec() to satisfy an
assertion in the single threading code.
2004-03-05 22:38:26 +00:00
jhb
af72c48e5f - Grab a share lock of the proctree lock while looking for a pid due to the
process group and session dereferences.  Also, check that p_pgrp and
  p_sesssion are NULL before dereferencing them.
- Push down Giant in fork1().

Requested by:	peter
2004-03-05 22:37:32 +00:00
truckman
367b608998 Undo the merger of mlock()/vslock and munlock()/vsunlock() and the
introduction of kern_mlock() and kern_munlock() in
        src/sys/kern/kern_sysctl.c      1.150
        src/sys/vm/vm_extern.h          1.69
        src/sys/vm/vm_glue.c            1.190
        src/sys/vm/vm_mmap.c            1.179
because different resource limits are appropriate for transient and
"permanent" page wiring requests.

Retain the kern_mlock() and kern_munlock() API in the revived
vslock() and vsunlock() functions.

Combine the best parts of each of the original sets of implementations
with further code cleanup.  Make the mclock() and vslock()
implementations as similar as possible.

Retain the RLIMIT_MEMLOCK check in mlock().  Move the most strigent
test, which can return EAGAIN, last so that requests that have no
hope of ever being satisfied will not be retried unnecessarily.

Disable the test that can return EAGAIN in the vslock() implementation
because it will cause the sysctl code to wedge.

Tested by:	Cy Schubert <Cy.Schubert AT komquats.com>
2004-03-05 22:03:11 +00:00
rwatson
702f89fc5d The roundrobin callout from sched_4bsd is MPSAFE, so set up the
callout as MPSAFE to avoid grabbing Giant.

Reviewed by:	jhb
2004-03-05 19:27:04 +00:00
rwatson
e2aad13d33 Put "failed to set signal flags properly for ast()" check under
DIAGNOSTIC instead of INVARIANTS.  INVARIANTS is intended for tests
that don't substantially change code flow or behavior (passive), but
this test required locking both the proc lock and scheduler lock
in order to execute.  It also appears to be a very advisory diagnostic
as opposed to an invariant violation.

Following discussion with:	bde
2004-03-05 17:35:28 +00:00
phk
e215fa23b4 Just because the timecounter reads the same value on two samples
after each other doesn't mean that nothing happened.
2004-03-04 14:14:23 +00:00
bde
9c7937d9cf Fixed some style bugs (mainly English usage errors in comments). 2004-03-04 09:56:29 +00:00
bde
a3f844af36 Fixed some style bugs (mainly misplaced comments, and totally disordered
declarations in acct_process()).
2004-03-04 09:47:09 +00:00
rwatson
48d4fe5ea4 Remove unneeded label 'done2' from socket(). We now grab Giant
only around socreate(), and don't need it for file descriptor
accesses.

Submitted by:	sam
2004-03-04 01:57:48 +00:00
des
e6b61c95ad Use different dummy wait channels to avoid panic in msleep().
Reviewed by:	jhb
2004-03-03 23:03:18 +00:00
jhb
286e504b8f Always assert that the passed in lock is the same as the saved lock in the
sleep queue now that the one abnormal case has been fixed.
2004-03-02 15:02:08 +00:00
jhb
93c4123deb Correct handling of PDROP in msleep() to just skip the mtx_lock() rather
than clear the lock pointer so that sleepq_add() still gets the correct
lock pointer and doesn't bogusly trip an assertion.
2004-03-02 14:58:33 +00:00
jhb
86e3aa5b6c Check for TDF_SINTR before calling sleepq_abort() as there is a narrow
race in between sleepq_add() and sleepq_catch_signals() in that setting
td_wchan and TDF_SINTR is not atomic to sched_lock but only to the sleepq
lock.  This band-aid will stop assertion failures, but there is perhaps a
larger problem with the sleepq_add/sleepq_catch_signals race that I am not
sure how to solve.  For the signals case the race is harmless because we
always call cursig() after setting TDF_SINTR.  However, KSE doesn't do
anything in sleepq_catch_signals() to check that this race was lost, so I
am unsure if this race is harmful for this specific abort.
2004-03-01 23:07:58 +00:00
rwatson
b0b5f961bd Rename dup_sockaddr() to sodupsockaddr() for consistency with other
functions in kern_socket.c.

Rename the "canwait" field to "mflags" and pass M_WAITOK and M_NOWAIT
in from the caller context rather than "1" or "0".

Correct mflags pass into mac_init_socket() from previous commit to not
include M_ZERO.

Submitted by:	sam
2004-03-01 03:14:23 +00:00
scottl
48b0575f79 Convert the other use of flags to mflags in soalloc(). 2004-03-01 01:14:28 +00:00
rwatson
94d29f7426 Modify soalloc() API so that it accepts a malloc flags argument rather
than a "waitok" argument.  Callers now passing M_WAITOK or M_NOWAIT
rather than 0 or 1.  This simplifies the soalloc() logic, and also
makes the waiting behavior of soalloc() more clear in the calling
context.

Submitted by:	sam
2004-02-29 17:54:05 +00:00
phk
fdda2333fa Loudly announce WITNESS and DIAGNOSTIC options and warn about reduced
performance.
2004-02-29 16:56:54 +00:00
phk
c45bc64148 Make sure to disable the watchdog if we cannot honour the timeout. 2004-02-28 22:01:19 +00:00
phk
1ea4f5b08e Rename the WATCHDOG option to SW_WATCHDOG and make it use the
generic watchdoc(9) interface.

Make watchdogd(8) perform as watchdog(8) as well, and make it
possible to specify a check command to run, timeout and sleep
periods.

Update watchdog(4) to talk about the generic interface and add
new watchdog(8) page.
2004-02-28 20:56:35 +00:00
jhb
d25301c858 Switch the sleep/wakeup and condition variable implementations to use the
sleep queue interface:
- Sleep queues attempt to merge some of the benefits of both sleep queues
  and condition variables.  Having sleep qeueus in a hash table avoids
  having to allocate a queue head for each wait channel.  Thus, struct cv
  has shrunk down to just a single char * pointer now.  However, the
  hash table does not hold threads directly, but queue heads.  This means
  that once you have located a queue in the hash bucket, you no longer have
  to walk the rest of the hash chain looking for threads.  Instead, you have
  a list of all the threads sleeping on that wait channel.
- Outside of the sleepq code and the sleep/cv code the kernel no longer
  differentiates between cv's and sleep/wakeup.  For example, calls to
  abortsleep() and cv_abort() are replaced with a call to sleepq_abort().
  Thus, the TDF_CVWAITQ flag is removed.  Also, calls to unsleep() and
  cv_waitq_remove() have been replaced with calls to sleepq_remove().
- The sched_sleep() function no longer accepts a priority argument as
  sleep's no longer inherently bump the priority.  Instead, this is soley
  a propery of msleep() which explicitly calls sched_prio() before
  blocking.
- The TDF_ONSLEEPQ flag has been dropped as it was never used.  The
  associated TDF_SET_ONSLEEPQ and TDF_CLR_ON_SLEEPQ macros have also been
  dropped and replaced with a single explicit clearing of td_wchan.
  TD_SET_ONSLEEPQ() would really have only made sense if it had taken
  the wait channel and message as arguments anyway.  Now that that only
  happens in one place, a macro would be overkill.
2004-02-27 18:52:44 +00:00
jhb
d76d631711 Drop sched_lock around the wakeup of the parent process after setting
the process state to zombie when a process exits to avoid a lock order
reversal with the sleepqueue locks.  This appears to be the only place
that we call wakeup() with sched_lock held.
2004-02-27 18:39:09 +00:00
jhb
d07a9130c6 Add an implementation of a generic sleep queue abstraction that is used
to queue threads sleeping on a wait channel similar to how turnstiles are
used to queue threads waiting for a lock.  This subsystem will be used as
the backend for sleep/wakeup and condition variables initially.  Eventually
it will also be used to replace the ithread-specific iwait thread
inhibitor.

Sleep queues are also not locked by sched_lock, so this splits sched_lock
up a bit further increasing concurrency within the scheduler.  Sleep queues
also natively support timeouts on sleeps and interruptible sleeps allowing
for the reduction of a lot of duplicated code between the sleep/wakeup and
condition variable implementations.  For more details on the sleep queue
implementation, check the comments in sys/sleepqueue.h and
kern/subr_sleepqueue.c.
2004-02-27 18:33:09 +00:00
des
51a4ce2dff Add sysctl_move_oid() which reparents an existing OID. 2004-02-27 17:13:23 +00:00