Commit Graph

6147 Commits

Author SHA1 Message Date
jeff
6e01278555 - Mark signals which may be delivered to any thread in the process with
SA_PROC.  Signals without this flag should be directed to a particular
   thread if this is possible.
2003-03-31 22:12:09 +00:00
jeff
4a3718fb25 - Change trapsignal() to accept a thread and not a proc.
- Change all consumers to pass in a thread.

Right now this does not cause any functional changes but it will be important
later when signals can be delivered to specific threads.
2003-03-31 22:02:38 +00:00
alc
3333da28bf Recent changes to uipc_cow.c have eliminated the need for some sf_buf-
related variables to be global.  Make them either local to sf_buf_init() or
static.
2003-03-31 06:25:42 +00:00
phk
bb8188c895 retire the "busy" field in bioqueues, it's served it's purpose. 2003-03-30 10:16:31 +00:00
phk
a0fbf93755 Preparation commit before I start on the bioqueue lockdown:
Collect all the bits of bioqueue handing in subr_disk.c, vfs_bio.c is big
enough as it is and disksort already lives in subr_disk.c.
2003-03-30 08:51:23 +00:00
jeff
83e8b19361 - We are not guaranteed that read ahead blocks are not in memory already.
Check for B_DELWRI as well as B_CACHED before issuing io on a buffer.  This
   is especially important since we are changing the b_iocmd.
2003-03-30 02:57:32 +00:00
alc
6f59be774d Pass the vm_page's address to sf_buf_alloc(); map the vm_page as part
of sf_buf_alloc() instead of expecting sf_buf_alloc()'s caller to map it.

The ultimate reason for this change is to enable two optimizations:
(1) that there never be more than one sf_buf mapping a vm_page at a time
and (2) 64-bit architectures can transparently use their 1-1 virtual
to physical mapping (e.g., "K0SEG") avoiding the overhead of pmap_qenter()
and pmap_qremove().
2003-03-29 06:14:14 +00:00
silby
7d7faf316e Add the m_defrag routine, as discussed on committers@. This
incarnation should address the concerns of all in the discussion,
and keeps statistics which show how much it is used.

MFC after:	2 weeks
2003-03-29 05:48:36 +00:00
jhb
4fcebd533b Check for the PS_NEEDSIGCHK flag in the right flags field. 2003-03-28 18:08:57 +00:00
silby
430664f150 Allow m_dup_pkthdr to accept mbufs with attached clusters as
targets.

Submitted by:	bmilekic
2003-03-28 05:57:48 +00:00
iedowse
b399d5ecbd Add a checksum to the kernel message buffer, and update it every
time a character is written. Use this at boot time to reject the
existing buffer contents if they are corrupt. This fixes a problem
seen on some hardware (especially laptops) where the message buffer
gets partially corrupted during a short power cycle or reset, but
the msgbuf structure is left intact so it gets reused, resulting
in random junk and control characters appearing in dmesg and
/var/log/messages.

PR:		kern/28497
2003-03-28 02:50:10 +00:00
tegge
ede5ebede7 Add support for reading directly from file to userland buffer when the
O_DIRECT descriptor status flag is set and both offset and length is a
multiple of the physical media sector size.
2003-03-26 23:40:42 +00:00
tegge
5e14826743 Adjust the number of vnodes scanned by vlrureclaim() according to the
size of the vnode list.
2003-03-26 22:15:58 +00:00
rwatson
84af8bf695 Permit debug.malloc.failure_rate to be specified using a tunable so
that the feature can be enabled during the boot process.  Note the
continued limitation that FreeBSD fails so rapidly with this setting
enabled that it's hard to narrow down particular failures for
correction; we really need per-malloc type failure rates.
2003-03-26 20:44:29 +00:00
rwatson
68d9c43724 Add a new kernel option, MALLOC_MAKE_FAILURES, which compiles
in a debugging feature causing M_NOWAIT allocations to fail at
a specified rate.  This can be useful for detecting poor
handling of M_NOWAIT: the most frequent problems I've bumped
into are unconditional deference of the pointer even though
it's NULL, and hangs as a result of a lost event where memory
for the event couldn't be allocated.  Two sysctls are added:

debug.malloc.failure_rate

  How often to generate a failure: if set to 0 (default), this
  feature is disabled.  Otherwise, the frequency of failures --
  I've been using 10 (one in ten mallocs fails), but other
  popular settings might be much lower or much higher.

debug.malloc.failure_count

  Number of times a coerced malloc failure has occurred as a
  result of this feature.  Useful for tracking what might have
  happened and whether failures are being generated.

Useful possible additions: tying failure rate to malloc type,
printfs indicating the thread that experienced the coerced
failure.

Reviewed by:	jeffr, jhb
2003-03-26 20:18:40 +00:00
tegge
d9da9de257 fp->f_offset doesn't need any protection when it isn't accessed. 2003-03-26 19:21:12 +00:00
rwatson
e5680de54a Modify the mac_init_ipq() MAC Framework entry point to accept an
additional flags argument to indicate blocking disposition, and
pass in M_NOWAIT from the IP reassembly code to indicate that
blocking is not OK when labeling a new IP fragment reassembly
queue.  This should eliminate some of the WITNESS warnings that
have started popping up since fine-grained IP stack locking
started going in; if memory allocation fails, the creation of
the fragment queue will be aborted.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-03-26 15:12:03 +00:00
jhb
671aa92ea0 Remove extraneous check. We are not going to return from copyin/out on
the stack of a thread A but actually be thread B instead of thread A.
2003-03-25 20:13:24 +00:00
mdodd
fe36e1c847 Give print_child a default method. 2003-03-25 04:32:52 +00:00
jake
783ae539c3 - Add vm_paddr_t, a physical address type. This is required for systems
where physical addresses larger than virtual addresses, such as i386s
  with PAE.
- Use this to represent physical addresses in the MI vm system and in the
  i386 pmap code.  This also changes the paddr parameter to d_mmap_t.
- Fix printf formats to handle physical addresses >4G in the i386 memory
  detection code, and due to kvtop returning vm_paddr_t instead of u_long.

Note that this is a name change only; vm_paddr_t is still the same as
vm_offset_t on all currently supported platforms.

Sponsored by:	DARPA, Network Associates Laboratories
Discussed with:	re, phk (cdevsw change)
2003-03-25 00:07:06 +00:00
jhb
98a481610a Replace the at_fork, at_exec, and at_exit functions with the slightly more
flexible process_fork, process_exec, and process_exit eventhandlers.  This
reduces code duplication and also means that I don't have to go duplicate
the eventhandler locking three more times for each of at_fork, at_exec, and
at_exit.

Reviewed by:	phk, jake, almost complete silence on arch@
2003-03-24 21:15:35 +00:00
jhb
966c72c345 - Remove witness_dead and just use witness_watch instead. If witness_watch
is set to 0, it now has the same affect as setting witness_dead used to
  have.
- Added a sysctl handler that allows root to change witness_watch from a
  non-zero value to zero to disable witness at runtime.  Note that you
  can't turn witness back on once it is off.  You can only turn it off as
  a one-way switch.
- Added a comment describing the possible values of witness_watch.
2003-03-24 21:03:53 +00:00
mux
cfd612e2d7 Remove a trailing semicolon in SCHED_QUANTUM definition.
Luckily this didn't cause any bugs.

Spotted by:	Samy Al Bahra <samy@kerneled.com>
2003-03-24 15:16:21 +00:00
cognet
501da8bd5f s/discriptors/descriptors/ 2003-03-23 19:41:34 +00:00
tjr
9785758af0 Remove unused mtx_lock_giant(), mtx_unlock_giant(), related globals
and sysctls.
2003-03-23 11:26:11 +00:00
yar
f9968b4d9f We shouldn't assert that a vode is locked in vop_lock_post()
if VOP_LOCK() has failed.

Reviewed by:	jeff
2003-03-22 13:21:54 +00:00
jhb
300b98684d Use td_ucred of curthread instead of p_ucred of curproc. This required
changing sem_perm() and sem_hasopen() to take a thread instead of a proc
for the first argument.
2003-03-20 21:12:31 +00:00
phk
afab808bc4 Backout the getcwd changes, a more comprehensive effort will be needed. 2003-03-20 10:40:45 +00:00
davidxu
8e88e8da05 Adjust code for userland preemptive. Userland can set a quantum in
kse_mailbox to schedule an upcall, this is useful for userland timeout
routine, for example pthread_cond_timedwait().

Also extract upcall scheduling code from kse_reassign and create
a new function called thread_switchout to include these code.

Reviewed by: julain
2003-03-19 05:49:38 +00:00
des
d0bee7242c Unregisterize, ansify. 2003-03-19 00:49:40 +00:00
des
e97206db4c Whitespace cleanup. 2003-03-19 00:33:38 +00:00
jake
ce68d8fa22 long != int. Use SYSCTL_UINT for kern.devstat.generation. Fixes booting
on sparc64.
2003-03-18 23:32:27 +00:00
gallatin
d18eb82da5 Fix a race condition in socow_setup(): The page must be wired before
sf_buf_alloc() is called, as sf_buf_alloc() may sleep.  If it does sleep,
the page might be reclaimed before wiring occurs.

Reported by: alc
2003-03-18 18:27:33 +00:00
phk
eef257a93e If devstat_new_entry() is passed a unit number of -1 assume that
the devstat is for an "interior" GEOM node and register using the
name argument as a geom identity pointer.  Do not put these devstat
structures on the list returned by the sysctl.

This gives us the ability to tell the two kinds of nodes apart and
leave the current "strictly physical" view of devstat intact without
modifications, yet be able to use devstat for both kinds of devices.

It also saves us bloating struct devstat with another 48 bytes of
space for the name.  At least for now.

Reviewed by:    ken
2003-03-18 09:30:31 +00:00
phk
45ffc6d110 Make devstat fully Giant agnostic:
Add a mutex and protect the allocation and traversal of the list with it.

When we allocate a page for devstat use we drop the mutex and use
M_WAITOK this is not nice, but under the given circumstances the
best we can do.

In the sysctl handler for returning the devstat entries we do not want to
hold the mutex across copyout(9) calls, so we keep a very careful eye on
the devstat_generation count, and abandon with EBUSY if it changes under
our feet.

Specifically test for BIO_WRITE, rather than default non-read,non-deletes
as write.  Make the default be DEVSTAT_NO_DATA.

Add atomic increments of the sequence[01] fields so applications using the
mmap'ed view stand a chance of detecting updates in progress.

Reviewed by:    ken
2003-03-18 09:20:20 +00:00
phk
e059b79437 Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a newsted include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.
2003-03-18 08:45:25 +00:00
phk
6c4e57c44b Make devstat_new_entry() take a const void * rather than const char *
argument, GEOM nodes are not identified by ascii string.
2003-03-18 07:52:59 +00:00
jeff
c0f79e52d6 - Unlock the target bp and not the pager buf bp in a failure case in
cluster_wbuild().  This was causing strange panics that were widely
   reported on current@.

Big Pointy Hat to:	jeff
2003-03-17 18:38:49 +00:00
phk
90f136c592 (This commit certainly increases the need for a wash&clean of vfs_cache.c,
but I decided that it was important for this patch to not bit-rot, and
since it is mainly moving code around, the total amount of entropy is
epsilon /phk)

This is a patch to move the common parts of linux_getcwd() back into
kern/vfs_cache.c so that the standard FreeBSD libc getcwd() can use it's
extended functionality.  The linux syscall linux_getcwd() in
compat/linux/linux_getcwd.c has been rewritten to use it too.  It should
be possible to simplify libc's getcwd() after this.  No doubt this code
needs some cleaning up, since I've left in the sysctl variables I used
for debugging.

PR:	48169
Submitted by:	James Whitwell <abacau@yahoo.com.au>
2003-03-17 12:21:08 +00:00
phk
1ff2d4dcb1 Add a #define for the device name of the mmap device for devstat.
Constify the geom identification pointer.
2003-03-16 23:20:05 +00:00
alc
4641d3d127 Pass the sf buf to MEXTADD() as the optional argument. This permits
the simplification of socow_iodone() and sf_buf_free(); they don't
have to reverse engineer the sf buf from the data's address.
2003-03-16 07:19:12 +00:00
phk
1b534a022f One devstat_start_transaction_bio() is enough. 2003-03-15 22:20:38 +00:00
phk
f432014308 Run a revision of the devstat interface:
Kernel:

Change statistics to use the *uptime() timescale (ie: relative to
boottime) rather than the UTC aligned timescale.  This makes the
device statistics code oblivious to clock steps.

Change timestamps to bintime format, they are cheaper.

Remove the "busy_count", and replace it with two counter fields:
"start_count" and "end_count", which are updated in the down and
up paths respectively.  This removes the locking constraint on
devstat.

Add a timestamp argument to devstat_start_transaction(), this will
normally be a timestamp set by the *_bio() function in bp->bio_t0.
Use this field to calculate duration of I/O operations.

Add two timestamp arguments to devstat_end_transaction(), one is
the current time, a NULL pointer means "take timestamp yourself",
the other is the timestamp of when this transaction started (see
above).

Change calculation of busy_time to operate on "the salami principle":
Only when we are idle, which we can determine by the start+end
counts being identical, do we update the "busy_from" field in the
down path.  In the up path we accumulate the timeslice in busy_time
and update busy_from.

Change the byte_* and num_* fields into two arrays: bytes[] and
operations[].

Userland:

Change the misleading "busy_time" name to be called "snap_time" and
make the time long double since that is what most users need anyway,
fill it using clock_gettime(CLOCK_MONOTONIC) to put it on the same
timescale as the kernel fields.

Change devstat_compute_etime() to operate on struct bintime.

Remove the version 2 legacy interface: the change to bintime makes
compatibility far too expensive.

Fix a bug in systat's "vm" page where boot relative busy times would
be bogus.

Bump __FreeBSD_version to 500107

Review & Collaboration by:	ken
2003-03-15 21:59:06 +00:00
phk
4a623e073c Add a devstat_start_transaction_bio() to match the
devstat_end_transaction_bio() we already have.

For now it just calls devstat_start_transaction(), but that will change
shortly.
2003-03-15 10:33:32 +00:00
davidxu
c2573f692d Export current time when returning from never blocked syscall. 2003-03-14 03:52:16 +00:00
jhb
e90ccce535 Trim some trailing whitespace. 2003-03-13 23:07:09 +00:00
jhb
a21ccbbf8c Add a new userland-visible ktrace flag KTR_DROP and an internal ktrace flag
KTRFAC_DROP to track instances when ktrace events are dropped due to the
request pool being exhausted.  When a thread tries to post a ktrace event
and is unable to due to no available ktrace request objects, it sets
KTRFAC_DROP in its process' p_traceflag field.  The next trace event to
successfully post from that process will set the KTR_DROP flag in the
header of the request going out and clear KTRFAC_DROP.

The KTR_DROP flag is the high bit in the type field of the ktr_header
structure.  Older kdump binaries will simply complain about an unknown type
when seeing an entry with KTR_DROP set.  Note that KTR_DROP being set on a
record in a ktrace file does not tell you anything except that at least one
event from this process was dropped prior to this event.  The user has no
way of knowing what types of events were dropped nor how many were dropped.

Requested by:	phk
2003-03-13 18:31:15 +00:00
jhb
f02ef38080 - Cache a reference to the credential of the thread that starts a ktrace in
struct proc as p_tracecred alongside the current cache of the vnode in
  p_tracep.  This credential is then used for all later ktrace operations on
  this file rather than using the credential of the current thread at the
  time of each ktrace event.
- Now that we have multiple ktrace-related items in struct proc that are
  pointers, rename p_tracep to p_tracevp to make it less ambiguous.

Requested by:	rwatson (1)
2003-03-13 18:24:22 +00:00
iedowse
db8dd4828e In m_dup_pkthdr(), convert the supplied `how' argument into malloc
flags when passing it into m_tag_copy_chain(), as m_tag* functions
use malloc, not mbuf flags.
2003-03-13 09:02:19 +00:00
jeff
459181e3ed - Add a lock for protecting against msleep(bp, ...) wakeup(bp) races.
- Create a new function bdone() which sets B_DONE and calls wakup(bp). This
   is suitable for use as b_iodone for buf consumers who are not going
   through the buf cache.
 - Create a new function bwait() which waits for the buf to be done at a set
   priority and with a specific wmesg.
 - Replace several cases where the above functionality was implemented
   without locking with the new functions.
2003-03-13 07:31:45 +00:00