Commit Graph

3665 Commits

Author SHA1 Message Date
John Baldwin
486b8ac04a Don't explicitly zero p_intr_nesting_level and p_aioinfo in fork. 2001-03-28 03:14:14 +00:00
John Baldwin
0006681fe6 Switch from save/disable/restore_intr() to critical_enter/exit(). 2001-03-28 03:06:10 +00:00
John Baldwin
b944b9033a Catch up to the mtx_saveintr -> mtx_savecrit change. 2001-03-28 02:46:21 +00:00
John Baldwin
35a472461a Use mtx_intr_enable() on sched_lock to ensure child processes always start
with interrupts enabled rather than calling the no-longer MI function
enable_intr().  This is bogus anyways and in theory shouldn't even be
needed.
2001-03-28 02:44:11 +00:00
John Baldwin
6283b7d01b - Switch from using save/disable/restore_intr to using critical_enter/exit
and change the u_int mtx_saveintr member of struct mtx to a critical_t
  mtx_savecrit.
- On the alpha we no longer need a custom _get_spin_lock() macro to avoid
  an extra PAL call, so remove it.
- Partially fix using mutexes with WITNESS in modules.  Change all the
  _mtx_{un,}lock_{spin,}_flags() macros to accept explicit file and line
  parameters and rename them to use a prefix of two underscores.  Inside
  of kern_mutex.c, generate wrapper functions for
  _mtx_{un,}lock_{spin,}_flags() (only using a prefix of one underscore)
  that are called from modules.  The macros mtx_{un,}lock_{spin,}_flags()
  are mapped to the __mtx_* macros inside of the kernel to inline the
  usual case of mutex operations and map to the internal _mtx_* functions
  in the module case so that modules will use WITNESS and KTR logging if
  the kernel is compiled with support for it.
2001-03-28 02:40:47 +00:00
Paul Saab
6b8b8c7fdc Last commit was broken.. It always prints '[CTRL-C to abort]'.
Move duplicate code for printing the status of the dump and checking
for abort into a separate function.

Pointy hat to:	me
2001-03-28 01:37:29 +00:00
David Malone
d1cadeb02a Don't leak the memory we've just malloced if we can't find the
process we're looking for. (I don't think this can currently
happen, but it depends how the function is called).

PR:		25932
Submitted by:	David Xu <davidx@viasoft.com.cn>
2001-03-27 20:49:51 +00:00
Yaroslav Tykhiy
ae5fa19aa9 Make cblock_alloc_cblocks() spell its own name
correctly in its warning message.

PR: kern/7693
2001-03-27 10:21:26 +00:00
Kenneth D. Merry
3393f8daa3 Rewrite of the CAM error recovery code.
Some of the major changes include:

	- The SCSI error handling portion of cam_periph_error() has
	  been broken out into a number of subfunctions to better
	  modularize the code that handles the hierarchy of SCSI errors.
	  As a result, the code is now much easier to read.

	- String handling and error printing has been significantly
	  revamped.  We now use sbufs to do string formatting instead
	  of using printfs (for the kernel) and snprintf/strncat (for
	  userland) as before.

	  There is a new catchall error printing routine,
	  cam_error_print() and its string-based counterpart,
	  cam_error_string() that allow the kernel and userland
	  applications to pass in a CCB and have errors printed out
	  properly, whether or not they're SCSI errors.  Among other
	  things, this helped eliminate a fair amount of duplicate code
	  in camcontrol.

	  We now print out more information than before, including
	  the CAM status and SCSI status and the error recovery action
	  taken to remedy the problem.

	- sbufs are now available in userland, via libsbuf.  This
	  change was necessary since most of the error printing code
	  is shared between libcam and the kernel.

	- A new transfer settings interface is included in this checkin.
	  This code is #ifdef'ed out, and is primarily intended to aid
	  discussion with HBA driver authors on the final form the
	  interface should take.  There is example code in the ahc(4)
	  driver that implements the HBA driver side of the new
	  interface.  The new transfer settings code won't be enabled
	  until we're ready to switch all HBA drivers over to the new
	  interface.

src/Makefile.inc1,
lib/Makefile:		Add libsbuf.  It must be built before libcam,
			since libcam uses sbuf routines.

libcam/Makefile:	libcam now depends on libsbuf.

libsbuf/Makefile:	Add a makefile for libsbuf.  This pulls in the
			sbuf sources from sys/kern.

bsd.libnames.mk:	Add LIBSBUF.

camcontrol/Makefile:	Add -lsbuf.  Since camcontrol is statically
			linked, we can't depend on the dynamic linker
			to pull in libsbuf.

camcontrol.c:		Use cam_error_print() instead of checking for
			CAM_SCSI_STATUS_ERROR on every failed CCB.

sbuf.9:			Change the prototypes for sbuf_cat() and
			sbuf_cpy() so that the source string is now a
			const char *.  This is more in line wth the
			standard system string functions, and helps
			eliminate warnings when dealing with a const
			source buffer.

			Fix a typo.

cam.c:			Add description strings for the various CAM
			error status values, as well as routines to
			look up those strings.

			Add new cam_error_string() and
			cam_error_print() routines for userland and
			the kernel.

cam.h:			Add a new CAM flag, CAM_RETRY_SELTO.

			Add enumerated types for the various options
			available with cam_error_print() and
			cam_error_string().

cam_ccb.h:		Add new transfer negotiation structures/types.

			Change inq_len in the ccb_getdev structure to
			be "reserved".  This field has never been
			filled in, and will be removed when we next
			bump the CAM version.

cam_debug.h:		Fix typo.

cam_periph.c:		Modularize cam_periph_error().  The SCSI error
			handling part of cam_periph_error() is now
			in camperiphscsistatuserror() and
			camperiphscsisenseerror().

			In cam_periph_lock(), increase the reference
			count on the periph while we wait for our lock
			attempt to succeed so that the periph won't go
			away while we're sleeping.

cam_xpt.c:		Add new transfer negotiation code.  (ifdefed
			out)

			Add a new function, xpt_path_string().  This
			is a string/sbuf analog to xpt_print_path().

scsi_all.c:		Revamp string handing and error printing code.
			We now use sbufs for much of the string
			formatting code.  More of that code is shared
			between userland the kernel.

scsi_all.h:		Get rid of SS_TURSTART, it wasn't terribly
			useful in the first place.

			Add a new error action, SS_REQSENSE.  (Send a
			request sense and then retry the command.)
			This is useful when the controller hasn't
			performed autosense for some reason.

			Change the default actions around a bit.

scsi_cd.c,
scsi_da.c,
scsi_pt.c,
scsi_ses.c:		SF_RETRY_SELTO -> CAM_RETRY_SELTO.  Selection
			timeouts shouldn't be covered by a sense flag.

scsi_pass.[ch]:		SF_RETRY_SELTO -> CAM_RETRY_SELTO.

			Get rid of the last vestiges of a read/write
			interface.

libkern/bsearch.c,
sys/libkern.h,
conf/files:		Add bsearch.c, which is needed for some of the
			new table lookup routines.

aic7xxx_freebsd.c:	Define AHC_NEW_TRAN_SETTINGS if
			CAM_NEW_TRAN_CODE is defined.

sbuf.h,
subr_sbuf.c:		Add the appropriate #ifdefs so sbufs can
			compile and run in userland.

			Change sbuf_printf() to use vsnprintf()
			instead of kvprintf(), which is only available
			in the kernel.

			Change the source string for sbuf_cpy() and
			sbuf_cat() to be a const char *.

			Add __BEGIN_DECLS and __END_DECLS around
			function prototypes since they're now exported
			to userland.

kdump/mkioctls:		Include stdio.h before cam.h since cam.h now
			includes a function with a FILE * argument.

Submitted by:	gibbs (mostly)
Reviewed by:	jdp, marcel (libsbuf makefile changes)
Reviewed by:	des (sbuf changes)
Reviewed by:	ken
2001-03-27 05:45:52 +00:00
Boris Popov
602ef63172 Previous commit broke interlock locking for !LK_RETRY case. 2001-03-26 12:45:35 +00:00
Poul-Henning Kamp
f83880518b Send the remains (such as I have located) of "block major numbers" to
the bit-bucket.
2001-03-26 12:41:29 +00:00
Boris Popov
71d8277b51 Prevent race condition by using msleep() instead of mtx_unlock()/tsleep().
Reviewed by:	alfred
2001-03-26 03:10:07 +00:00
Bosko Milekic
2ba1a89559 Move the atomic() mbstat.m_drops incrementing to the MGET(HDR) and
MCLGET macros in order to avoid incrementing the drop count twice.
Otherwise, in some cases, we may increment m_drops once in m_mballoc()
for example, and increment it again in m_mballoc_wait() if the
wait fails.
2001-03-24 23:47:52 +00:00
John Baldwin
1f723035c8 Use (..., "%s", foo) instead of (..., foo) to avoid a warning about a
non-constant format string when calling kthread_create() to create an
ithread.
2001-03-24 06:26:47 +00:00
Peter Wemm
32e479705a This is kind of a hack, but it should work. Currently, world is broken
because libc/rpc/key_call.c references uname(), and ps/print.c also
defines uname(), and ps is linked statically.  This leads to a symbol
clash.  The userland uname(3) kinda sucked anyway as the hostname
etc was too short.  And since the libc rpc interface now uses
the utsname.nodename which gets truncated, I was tempted into doing
something about it.  Create a new userland uname function, called
__xuname() which takes an extra argument that allows you to change
the size of the fields.  uname() becomes a static inline function
in sys/utsname.h that passes the extra argument in.  struct utsname
has its field members expanded by default now in userland.
We still provide a 'uname' externally linkable function for things
that either think that they ``know'' the utsname format and assume
32 character strings and bypass the include file, or objects that
are linked against old libcs.  ie: just about every plausible
case that I can think of is covered.  Should we ever change the
default lengths again, a libc major bump should not be required
as the size is now passed to the function.

XXX the uname(2) in the kernel is for FreeBSD 1.1 binary compatability!
All the uname(3) functions that are exported to userland are actually
implemented in libc with sysctl.  uname(1) uses sysctl directly and
does not call uname(3).

PR:		bin/4688
2001-03-24 04:40:49 +00:00
John Baldwin
bae3a80b16 Just use the proc lock to protect read accesses to p_pptr rather than the
more expensive proctree lock.
2001-03-24 04:00:01 +00:00
John Baldwin
8d2725181a Protect p_wmesg and p_wchan with sched_lock while checking for deadlocks
with other byte range file locks.
2001-03-24 03:57:44 +00:00
Alfred Perlstein
ec4dff5e50 replace calls to non-existant bail() subroutine with calls to
the die() builtin function.
2001-03-23 11:48:50 +00:00
Boris Popov
a91f68bca6 o Actually extract version of interface and store it along with the name.
o Add new parameter to the modlist_lookup() function to perform lookups
  with strict version matching.

o Collapse duplicate code to function(s).
2001-03-22 08:58:45 +00:00
Boris Popov
303b15f193 Slightly reorganize code in the linker_load_dependancies() function to make
codepath more straightforward.
2001-03-22 07:55:33 +00:00
Boris Popov
804f27299d Remove support for old way of handling module dependencies.
Approved by:	peter
2001-03-22 07:14:42 +00:00
Poul-Henning Kamp
71d033119f Make the pseudo-driver for "/dev/fd/*" handle fd's larger than 255.
PR:	25936
2001-03-20 13:26:13 +00:00
Poul-Henning Kamp
15b6f00fd1 Add a KASSERT on unit2minor() so that we catch it if people try to pass
us unit numbers which doesn't fit in 24 bits.
2001-03-20 13:24:24 +00:00
Bruce Evans
0abc15fd0b Fixed breakage of access() in rev.1.164. Wrong credentials were used for
the final path component.
2001-03-20 09:38:05 +00:00
Peter Wemm
439fea92c2 Use the same API as the example code.
Allow the initial hash value to be passed in, as the examples do.
Incrementally hash in the dvp->v_id (using the official api) rather than
add it.  This seems to help power-of-two predictable filename trees
where the filenames repeat on a power-of-two cycle and the directory trees
have power-of-two components in it.  The simple add then mask was causing
things like 12000+ entry collision chains while most other entries have
between 0 and 3 entries each.  This way seems to improve things.
2001-03-20 02:10:18 +00:00
Robert Watson
231b9e916a o Rename "namespace" argument to "attrnamespace" as namespace is a C++
reserved word.  Part 2 of syscalls.master commit to catch rebuilt
  files.

Submitted by:	jkh
Obtained from:	TrustedBSD Project
2001-03-19 05:48:58 +00:00
Robert Watson
3063207147 o Rename "namespace" argument to "attrnamespace" as namespace is a C++
reserved word.

Submitted by:	jkh
Obtained from:	TrustedBSD Project
2001-03-19 05:44:15 +00:00
Bosko Milekic
9612101e4c Fix a couple of things in the internal mbuf allocation interface:
- Make sure that m_mballoc() really doesn't allow over nmbufs mbufs to
  be allocated from mb_map. In the case where nmbufs-reserved space is not
  an exact multiple of PAGE_SIZE (which it should be, but anyway...), we
  hold nmbufs as an absolute maximum which need not ever be reached.

- Clean up m_clalloc(); make it more consistent in the sense that the first
  argument `ncl' really means "the number of clusters ensured to be allocated"
  and not "the number of pages worth of clusters to be allocated," as was
  previously the case. This also makes it consistent with m_mballoc() as well
  as the comment that preceeds it.

Reviewed by: jlemon
2001-03-17 23:23:24 +00:00
Peter Wemm
6eb39ac8fc Use a generic implementation of the Fowler/Noll/Vo hash (FNV hash).
Make the name cache hash as well as the nfsnode hash use it.

As a special tweak, create an unsigned version of register_t.  This allows
us to use a special tweak for the 64 bit versions that significantly
speeds up the i386 version (ie: int64 XOR int64 is slower than int64
XOR int32).

The code layout is a little strange for the string function, but I was
able to get between 5 to 10% improvement over the original version I
started with. The layout affects gcc code generation choices and this way
was fastest on x86 and alpha.

Note that 'CPUTYPE=p3' etc makes a fair difference to this.  It is
around 45% faster with -march=pentiumpro on a p6 cpu.
2001-03-17 09:31:06 +00:00
Jonathan Lemon
4d286823c5 When doing a recv(.. MSG_WAITALL) for a message which is larger than
the socket buffer size, the receive is done in sections.  After completing
a read, call pru_rcvd on the underlying protocol before blocking again.

This allows the the protocol to take appropriate action, such as
sending a TCP window update to the peer, if the window happened to
close because the socket buffer was filled.  If the protocol is not
notified, a TCP transfer may stall until the remote end sends a window
probe.
2001-03-16 22:37:06 +00:00
Peter Wemm
50e2347e68 Kill the 4MB kernel limit dead. [I hope :-)].
For UP, we were using $tmp_stk as a stack from the data section.  If the
kernel text section grew beyond ~3MB, the data section would be pushed
beyond the temporary 4MB P==V mapping.  This would cause the trampoline
up to high memory to fault.  The hack workaround I did was to use all of
the page table pages that we already have while preparing the initial
P==V mapping, instead of just the first one.
For SMP, the AP bootstrap process suffered the same sort of problem and
got the same treatment.

MFC candidate - this breaks on 4.x just the same..

Thanks to:	Richard Todd <rmtodd@ichotolot.servalan.com>
2001-03-15 05:10:06 +00:00
Peter Wemm
6fe01250f4 Jake essentially rewrote this. It is not by any stretch of the
imagination a derivative of what I did before.
2001-03-15 05:02:08 +00:00
Peter Wemm
043cc5a602 Regenerate after rwatson's commit to syscalls.master (rev 1.85) 2001-03-15 04:43:57 +00:00
Robert Watson
70f3685105 o Change the API and ABI of the Extended Attribute kernel interfaces to
introduce a new argument, "namespace", rather than relying on a first-
  character namespace indicator.  This is in line with more recent
  thinking on EA interfaces on various mailing lists, including the
  posix1e, Linux acl-devel, and trustedbsd-discuss forums.  Two namespaces
  are defined by default, EXTATTR_NAMESPACE_SYSTEM and
  EXTATTR_NAMESPACE_USER, where the primary distinction lies in the
  access control model: user EAs are accessible based on the normal
  MAC and DAC file/directory protections, and system attributes are
  limited to kernel-originated or appropriately privileged userland
  requests.

o These API changes occur at several levels: the namespace argument is
  introduced in the extattr_{get,set}_file() system call interfaces,
  at the vnode operation level in the vop_{get,set}extattr() interfaces,
  and in the UFS extended attribute implementation.  Changes are also
  introduced in the VFS extattrctl() interface (system call, VFS,
  and UFS implementation), where the arguments are modified to include
  a namespace field, as well as modified to advoid direct access to
  userspace variables from below the VFS layer (in the style of recent
  changes to mount by adrian@FreeBSD.org).  This required some cleanup
  and bug fixing regarding VFS locks and the VFS interface, as a vnode
  pointer may now be optionally submitted to the VFS_EXTATTRCTL()
  call.  Updated documentation for the VFS interface will be committed
  shortly.

o In the near future, the auto-starting feature will be updated to
  search two sub-directories to the ".attribute" directory in appropriate
  file systems: "user" and "system" to locate attributes intended for
  those namespaces, as the single filename is no longer sufficient
  to indicate what namespace the attribute is intended for.  Until this
  is committed, all attributes auto-started by UFS will be placed in
  the EXTATTR_NAMESPACE_SYSTEM namespace.

o The default POSIX.1e attribute names for ACLs and Capabilities have
  been updated to no longer include the '$' in their filename.  As such,
  if you're using these features, you'll need to rename the attribute
  backing files to the same names without '$' symbols in front.

o Note that these changes will require changes in userland, which will
  be committed shortly.  These include modifications to the extended
  attribute utilities, as well as to libutil for new namespace
  string conversion routines.  Once the matching userland changes are
  committed, a buildworld is recommended to update all the necessary
  include files and verify that the kernel and userland environments
  are in sync.  Note: If you do not use extended attributes (most people
  won't), upgrading is not imperative although since the system call
  API has changed, the new userland extended attribute code will no longer
  compile with old include files.

o Couple of minor cleanups while I'm there: make more code compilation
  conditional on FFS_EXTATTR, which should recover a bit of space on
  kernels running without EA's, as well as update copyright dates.

Obtained from:	TrustedBSD Project
2001-03-15 02:54:29 +00:00
Søren Schmidt
b417a1a8c8 Dont call device close and ioctl functions if device has disappeared.
Reviewed by: phk
2001-03-13 08:45:05 +00:00
Dag-Erling Smørgrav
9cbd039343 Assert that the process we're trying to enqueue isn't already there. 2001-03-11 18:57:30 +00:00
Alan Cox
136446540a When aio_read/write() is used on a raw device, physical buffers are
used for up to "vfs.aio.max_buf_aio" of the requests.  If a request
size is MAXPHYS, but the request base isn't page aligned, vmapbuf()
will map the end of the user space buffer into the start of the kva
allocated for the next physical buffer.  Don't use a physical buffer
in this case.  (This change addresses problem report 25617.)

When an aio_read/write() on a raw device has completed, timeout() is
used to schedule a signal to the process.  Thus, the reporting is
delayed up to 10 ms (assuming hz is 100).  The process might have
terminated in the meantime, causing a trap 12 when attempting to
deliver the signal.  Thus, the timeout must be cancelled when removing
the job.

aio jobs in state JOBST_JOBQGLOBAL should be removed from the
kaio_jobqueue list during process rundown.

During process rundown, some aio jobs might move from one list to a
different list that has already been "emptied", causing the rundown to
be incomplete.  Retry the rundown.

A call to BUF_KERNPROC() is needed after obtaining a physical buffer
to disassociate the lock from the running process since it can return
to userland without releasing that lock.

PR:		25617
Submitted by:	tegge
2001-03-10 22:47:57 +00:00
Alfred Perlstein
9708152c20 Don't call malloc with M_WAITOK while holding a mutex. 2001-03-09 18:40:34 +00:00
Jonathan Lemon
c0647e0d07 Push the test for a disconnected socket when accept()ing down to the
protocol layer.  Not all protocols behave identically.  This fixes the
brokenness observed with unix-domain sockets (and postfix)
2001-03-09 08:16:40 +00:00
John Baldwin
5db078a9be Fix mtx_legal2block. The only time that it is bad to block on a mutex is
if we hold a spin mutex, since we can trivially get into deadlocks if we
start switching out of processes that hold spinlocks.  Checking to see if
interrupts were disabled was a sort of cheap way of doing this since most
of the time interrupts were only disabled when holding a spin lock.  At
least on the i386.  To fix this properly, use a per-process counter
p_spinlocks that counts the number of spin locks currently held, and
instead of checking to see if interrupts are disabled in the witness code,
check to see if we hold any spin locks.  Since child processes always
start up with the sched lock magically held in fork_exit(), we initialize
p_spinlocks to 1 for child processes.  Note that proc0 doesn't go through
fork_exit(), so it starts with no spin locks held.

Consulting from:	cp
2001-03-09 07:24:17 +00:00
Alan Cox
c9a970a79f Use the kthread API to create and destroy AIO daemons.
Submitted by:	jhb
2001-03-09 06:27:01 +00:00
John Baldwin
3a3f608288 Add a new informative KASSERT to ensure that a process is in the SRUN state
before we return it to cpu_switch().
2001-03-09 03:59:50 +00:00
Bosko Milekic
4bde2ac539 Fix is a similar race condition as existed in the mbuf code. When we go
into an interruptable sleep and we increment a sleep count, we make sure
that we are the thread that will decrement the count when we wakeup.
Otherwise, what happens is that if we get interrupted (signal) and we
have to wake up, but before we get our mutex, some thread that wants
to wake us up detects that the count is non-zero and so enters wakeup_one(),
but there's nothing on the sleep queue and so we don't get woken up. The
thread will still decrement the sleep count, which is bad because we will
also decrement it again later (as we got interrupted) and are already off
the sleep queue.
2001-03-08 19:21:45 +00:00
David Malone
2239c07de9 Make the wait for sendfile buffers interruptable. Stops one process
consuming them all and then getting stuck.

Reviewed by:	dg
Reviewed by:	bmilekic
Observed by:	Andreas Persson <pap@garen.net>
2001-03-08 16:28:10 +00:00
Thomas Moestl
3a51557243 Make the SYSCTL_OUT handlers sysctl_old_user() and sysctl_old_kernel()
more robust. They would correctly return ENOMEM for the first time when
the buffer was exhausted, but subsequent calls in this case could cause
writes ouside of the buffer bounds.

Approved by:	rwatson
2001-03-08 01:20:43 +00:00
Kirk McKusick
589c7af992 Fixes to track snapshot copy-on-write checking in the specinfo
structure rather than assuming that the device vnode would reside
in the FFS filesystem (which is obviously a broken assumption with
the device filesystem).
2001-03-07 07:09:55 +00:00
Kirk McKusick
393d77ffad Bitch more loudly when someone botches changes to kinfo_proc
in the hopes that they will actually *read* the comment above
it and *follow* the instructions so as to cause all the rest
of us less a lot less grief.
2001-03-07 06:52:12 +00:00
John Baldwin
5641ae5dc3 - Don't hold the proc lock across VREF and the fd* functions to avoid lock
order reversals.
- Add some preliminary locking in the !RF_PROC case.
- Protect p_estcpu with sched_lock.
2001-03-07 05:21:47 +00:00
John Baldwin
f227364a17 - Release Giant a bit earlier on syscall exit.
- Don't try to grab Giant before postsig() in userret() as it is no longer
  needed.
- Don't grab Giant before psignal() in ast() but get the proc lock instead.
2001-03-07 03:53:39 +00:00
John Baldwin
19eb87d22a Grab the process lock while calling psignal and before calling psignal. 2001-03-07 03:37:06 +00:00