freebsd-nq

Author	SHA1	Message	Date
John Baldwin	d8f831678d	Reparent a kernel thread to init during kthread_exit() so that the zombie can be reaped.	2000-10-19 19:53:44 +00:00
Robert Watson	47460a23a0	o Introduce new VOP_ACCESS() flag VADMIN, allowing file systems to perform "administrative" authorization checks. In most cases, the VADMIN test checks to make sure the credential effective uid is the same as the file owner. o Modify vaccess() to set VADMIN as an available right if the uid is appropriate. o Modify references to uid-based access control operations such that they now always invoke VOP_ACCESS() instead of using hard-coded policy checks. o This allows alternative UFS policies to be implemented by replacing only ufs_access() (such as mandatory system policies). o VOP_ACCESS() requires the caller to hold an exclusive vnode lock on the vnode: I believe that new invocations of VOP_ACCESS() are always called with the lock held. o Some direct checks of the uid remain, largely associated with the QUOTA and SUIDDIR code. Reviewed by: eivind Obtained from: TrustedBSD Project	2000-10-19 07:53:59 +00:00
John Baldwin	dc13e6dfbb	Axe the idle_event eventhandler, and add a MD cpu_idle function used for things such as halting CPU's, idling CPU's, etc. Discussed with: msmith	2000-10-19 07:47:16 +00:00
Peter Wemm	5d391f75d6	EVENTHANDLER_INVOKE() takes two arguments.	2000-10-18 17:56:06 +00:00
John Baldwin	86bc23af90	Don't needlessly pass the diagnostic counter to the idle_event event handlers.	2000-10-18 08:10:25 +00:00
Matthew N. Dodd	0cb53e2487	Add new bus method 'GET_RESOURCE_LIST' and appropriate generic implementation. Add bus_generic_rl_{get,set,delete,release,alloc}_resource() functions which provide generic operations for devices using resource list style resource management. This should simplify a number of bus drivers. Further commits to follow.	2000-10-18 05:15:40 +00:00
John Baldwin	3650b37578	- Wrap the sanity checks for staying in the idle loop for absurdly long amounts of time in #ifdef DIAGNOSTIC - Call vm_page_zero_idle() during the idle loop.	2000-10-17 23:12:37 +00:00
Warner Losh	85d693f9d8	Implement resource alignment as discussed in arch@ a long time ago. This was implemented by Shigeru YAMAMOTO-san and Jonathan Chen. I've cleaned them up somewhat and they seem to work well enough to boot current (but given current's state it can be hard to tell). Doug Rabson also reviewed the design and signed off on it.	2000-10-17 22:08:03 +00:00
Nick Hibma	d686268728	Put the header section in the header file not the c file. Submitted by: Jonathan Chen <jon@spock.org> PR: 21982	2000-10-15 15:19:35 +00:00
Poul-Henning Kamp	db7e3af111	Remove unneeded #include <machine/clock.h>	2000-10-15 14:19:01 +00:00
Bosko Milekic	181d2a1564	Add nmbcnt sysctl and make it tunable at boottime; nmbcnt is the number of ext_buf counters that are possibly allocatable. Do this because: (i) It will make it easier to influence EXT_COUNTERS for if_sk, if_ti (or similar) users where the driver allocates its own ext_bufs and where it is important for the mbuf system to take it into account when reserving necessary space for counters. (ii) Facilitate some percentile calculation for netstat(1)	2000-10-15 06:24:07 +00:00
John W. De Boskey	2ec40c9aac	Remove the signal value check from the PT_STEP codepath. It can cause a bogus failure. Reviewed by: Sean Eric Fagan <sef@kithrup.com> and no other response to the review request.	2000-10-14 03:56:01 +00:00
Peter Wemm	ac5f943c37	savectx() is now used exclusively by the crash dump system. Move the i386 specific gunk (copy %cr3 to the pcb) from the MI dumpsys() to the MD savectx().	2000-10-13 22:03:29 +00:00
Paul Saab	16a011f973	Do not allocate a callout for all crashdumps, not just when you panic.	2000-10-13 21:49:19 +00:00
Robert Watson	ab024bb02e	o Simplify capability types away from an array of ints to a single u_int64_t flag field, bounding the number of capabilities at 64, but substantially cleaning up capability logic (there are currently 43 defined capabilities). o Heads up to anyone actually using capabilities: the constant assignments for various capabilities have been redone, so any persistent binary capability stores (i.e., '$posix1e.cap' EA backing files) must be recreated. If you have one of these, you'll know about it, so if you have no idea what this means, don't worry. o Update libposix1e to reflect this new definition, fixing the exposed functions that directly manipulate the flags fields. Obtained from: TrustedBSD Project	2000-10-13 17:12:58 +00:00
Jason Evans	9722d88fba	For lockmgr mutex protection, use an array of mutexes that are allocated and initialized during boot. This avoids bloating sizeof(struct lock). As a side effect, it is no longer necessary to enforce the assumption that lockinit()/lockdestroy() calls are paired, so the LK_VALID flag has been removed. Idea taken from: BSD/OS.	2000-10-12 22:37:28 +00:00
Doug Rabson	63c47a5ca0	Add a gross hack for ia64 to allocate the backing store for a new program.	2000-10-12 14:24:03 +00:00
Eivind Eklund	7eb9fca557	Blow away the v_specmountpoint define, replacing it with what it was defined as (rdev->si_mountpoint)	2000-10-09 17:31:39 +00:00
Jason Evans	39df86086f	Do not call lockdestroy() for v_vnlock, which may point to a lock in a deeper vfs stacking layer. Submitted by: bp	2000-10-06 08:04:48 +00:00
John Baldwin	ca29467e9a	Correct a warning where the r_debug_state() dummy function used to trigger a breakpoint in the kernel didn't use the proper argument list. To avoid having to include the userland link.h header everywhere that sys/linker.h is used, make r_debug_state() a static function in link_elf.c as well.	2000-10-06 05:20:02 +00:00
John Baldwin	6c56727456	- Change fast interrupts on x86 to push a full interrupt frame and to return through doreti to handle ast's. This is necessary for the clock interrupts to work properly. - Change the clock interrupts on the x86 to be fast instead of threaded. This is needed because both hardclock() and statclock() need to run in the context of the current process, not in a separate thread context. - Kill the prevproc hack as it is no longer needed. - We really need Giant when we call psignal(), but we don't want to block during the clock interrupt. Instead, use two p_flag's in the proc struct to mark the current process as having a pending SIGVTALRM or a SIGPROF and let them be delivered during ast() when hardclock() has finished running. - Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyways. It was broken on the x86 if it was turned on since cpl is gone. Its only use was to bogusly run softclock() directly during hardclock() rather than scheduling an SWI. - Remove the COM_LOCK simplelock and replace it with a clock_lock spin mutex. Since the spin mutex already handles disabling/restoring interrupts appropriately, this also lets us axe all the *_intr() fu. - Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use temporary fast interrupts for the APIC trial. - Add two new process flags P_ALRMPEND and P_PROFPEND to mark the pending signals in hardclock() that are to be delivered in ast(). Submitted by: jakeb (making statclock safe in a fast interrupt) Submitted by: cp (concept of delaying signals until ast())	2000-10-06 02:20:21 +00:00
John Baldwin	a91b7dc11b	Various whitespace cleanups after the SMPng commit, which jumbled things around a bit in the trap handling code.	2000-10-06 01:55:07 +00:00
John Baldwin	0e2aab1237	Don't treat a kernel stack fault the same as a general protect fault or a segment not present fault in the non-vm86 case.	2000-10-06 01:50:43 +00:00
John Baldwin	1931cf940a	- Heavyweight interrupt threads on the alpha for device I/O interrupts. - Make softinterrupts (SWI's) almost completely MI, and divorce them completely from the x86 hardware interrupt code. - The ihandlers array is now gone. Instead, there is a MI shandlers array that just contains SWI handlers. - Most of the former machine/ipl.h files have moved to a new sys/ipl.h. - Stub out all the spl*() functions on all architectures. Submitted by: dfr	2000-10-05 23:09:57 +00:00
Eivind Eklund	a863c0fb2f	Style fixes based on comments by bde	2000-10-05 18:22:46 +00:00
Doug Rabson	c9b004775d	Add a workaround for statically linked kernels.	2000-10-04 17:40:24 +00:00
Jason Evans	a18b1f1d4d	Convert lockmgr locks from using simple locks to using mutexes. Add lockdestroy() and appropriate invocations, which corresponds to lockinit() and must be called to clean up after a lockmgr lock is no longer needed.	2000-10-04 01:29:17 +00:00
Boris Popov	f8be809e0f	Move the KASSERTs which check the value of v_usecount to after vnode locking, so they will not produce false alarms.	2000-10-02 09:57:06 +00:00
Mike Smith	aa998012c7	Treat %X the same as %x (not entirely correct, but close enough).	2000-10-02 07:13:10 +00:00
Bosko Milekic	7d03271452	Big mbuf subsystem diff #1 : incorporate mutexes and fix things up somewhat to accommodate the changes. Here's a list of things that have changed (I may have left out a few); for a relatively complete list, see http://people.freebsd.org/~bmilekic/mtx_journal * Remove old (once useful) mcluster code for MCLBYTES > PAGE_SIZE which nobody uses anymore. It was great while it lasted, but now we're moving onto bigger and better things (Approved by: wollman). * Practically re-wrote the allocation macros in sys/sys/mbuf.h to accommodate new allocations which grab the necessary lock. * Make sure that necessary mbstat variables are manipulated with corresponding atomic() routines. * Changed the "wait" routines, cleaned it up, made one routine that does the job. * Generalized MWAKEUP() macro. Got rid of m_retry and m_retryhdr, as they are now included in the generalized "wait" routines. * Sleep routines now use msleep(). * Free lists have locks. * etc... probably other stuff I'm missing... Things to look out for and work on later: * find a better way to (dynamically) adjust EXT_COUNTERS * move necessity to recurse on a lock from drain routines by providing lock-free lower-level version of MFREE() (and possibly m_free()?). * checkout include of mutex.h in sys/sys/mbuf.h - probably violating general philosophy here. The code has been reviewed quite a bit, but problems may arise... please, don't panic! Send me Emails: bmilekic@freebsd.org Reviewed by: jlemon, cp, alfred, others?	2000-09-30 06:30:39 +00:00
Doug Rabson	918c9eec57	Add ia64 support.	2000-09-29 13:36:47 +00:00
Doug Rabson	ff2d7ae543	Don't support dynamic linking on ia64 for now - the tools can't cope.	2000-09-29 13:34:04 +00:00
Doug Rabson	b99353b99e	Change the conditional so that we only build this on i386 instead of trying to build it on all non-alpha arches.	2000-09-29 13:32:24 +00:00
Jonathan Lemon	d5aa12349f	Check so_error in filt_so{read|write} in order to detect UDP errors. PR: 21601	2000-09-28 04:41:22 +00:00
Kirk McKusick	02a1e48f02	Do the right thing if bdevvp is called twice for the same device. Obtained from: Poul-Henning Kamp <phk@freebsd.org>	2000-09-27 18:03:17 +00:00
Alan Cox	b92bb032d8	aio_qphysio: Eliminate one instance of an out-of-range check that is performed twice. Eliminate initialization that is already performed by _aio_aqueue. aio_physwakeup: Eliminate redundant synchronization that is already performed by bufdone.	2000-09-26 06:35:22 +00:00
Takanori Watanabe	b9a22da4cf	Make size of dynamic loader argument variable to support various executable file format. Reviewed by: peter	2000-09-26 05:09:21 +00:00
Boris Popov	67e871664b	Add a lock structure to vnode structure. Previously it was either allocated separately (nfs, cd9660 etc) or kept as a first element of structure referenced by v_data pointer(ffs). Such organization leads to known problems with stacked filesystems. From this point vop_nolock() functions maintain only interlock lock. vop_stdlock() functions maintain built-in v_lock structure using lockmgr(). vop_sharedlock() is compatible with vop_stdunlock(), but maintains a shared lock on vnode. If filesystem wishes to export lockmgr compatible lock, it can put an address of this lock to v_vnlock field. This indicates that the upper filesystem can take advantage of it and use single lock structure for entire (or part) of stack of vnodes. This field shouldn't be examined or modified by VFS code except for initialization purposes. Reviewed in general by: mckusick	2000-09-25 15:24:04 +00:00
John Baldwin	fd2802cfe0	Add a KASSERT() to catch instances where the mutex that we pass in to msleep() is recursed. Suggested by: cp	2000-09-24 00:33:51 +00:00
Paul Saab	92b123a002	Move MAXCPU from machine/smp.h to machine/param.h to fix breakage with !SMP kernels. Also, replace NCPUS with MAXCPU since they are redundant.	2000-09-23 12:18:06 +00:00
Jason Evans	9a02e8c68f	Don't #include <sys/proc.h>, since machine/mutex.h does it now.	2000-09-23 00:01:37 +00:00
Paul Saab	7321545f26	Remove the NCPU, NAPIC, NBUS, NINTR config options. Make NAPIC, NBUS, NINTR dynamic and set NCPU to a maximum of 16 under SMP. Reviewed by: peter	2000-09-22 23:40:10 +00:00
Robert Watson	100d2c187c	o Introduce vn_extattr_rm(), a helper function in the style of vn_extattr_get() and vn_extattr_set(). vn_extattr_rm() removes the specified extended attribute from a vnode, authorizing the change as the kernel (NULL cred). Obtained from: TrustedBSD Project	2000-09-22 22:33:13 +00:00
Eivind Eklund	453aaa0dff	Style fixes: * Add lots of comments * Convert a couple of assertions to KASSERT() * Minimal whitespace & misapplied {} fixes * Convert #if 0 to #if COMPILING_LINT for code we presently do not support, but want to keep available. Reviewed by: adrian, markm	2000-09-22 12:22:36 +00:00
Eivind Eklund	bba25953af	Staticize addalias()	2000-09-22 11:54:48 +00:00
Mike Smith	bdf3d8b954	Create an event (idle_event) which is invoked every time around the idle loop. Machine-dependent code can elect to eg. take power-saving actions when this event is invoked.	2000-09-22 03:19:24 +00:00
Mike Smith	6595c331f3	Make the EVENTHANDLER mechanism MP-safe. Events can now be invoked without holding the Giant lock.	2000-09-22 03:17:35 +00:00
Robert Watson	988ee790d4	o Change locking rules for VOP_GETACL() to indicate that vnode locks must be held when retrieving ACLs from vnodes. This is required for EA-based UFS ACL implementations. o Update vacl_get_acl() so that it does appropriate vnode locking. o Remove static from M_ACL malloc define so that it is accessible for consumers of ACLs other than in kern_acl.c Obtained from: TrustedBSD Project	2000-09-21 18:43:32 +00:00
Alfred Perlstein	21a9039725	comment vfs_export functions, requested by: eivind	2000-09-21 15:55:55 +00:00
Don Lewis	eabc23efb3	Remove unneeded #include that was a remnant of an earlier version of my uidinfo patch. Found by: phk	2000-09-21 09:04:17 +00:00
Robert Watson	e084835893	o Add additional comment describing vaccess() behavior. Requested by: eivind Reviewed by: eivind, adrian	2000-09-20 17:18:12 +00:00
Peter Wemm	6413a4bc9d	Fully initialize msqids[]. This could lead to ENOSPC and other strange stuff. PR: 21085 Submitted by: Marcin Cieslak <saper@SYSTEM.PL>	2000-09-19 22:59:22 +00:00
Poul-Henning Kamp	b0d17ba69e	Rename lminor() to dev2unit(). This function gives a linear unit number which hides the 'hole' in the minor bits. Introduce unit2minor() to do the reverse operation. Fix some make_dev() calls which didn't use UID_* or GID_* macros. Kill the v_hashchain alias macro, it hides the real relationship. Introduce experimental SI_CHEAPCLONE flag and set it on cloned bpfs.	2000-09-19 10:28:44 +00:00
Paul Saab	b429049a5d	Add new line character to debugging printf's.	2000-09-18 17:03:03 +00:00
Matthew N. Dodd	098b8a1eb0	Initialize 'hints_loaded' to 0. This allows static hints to work properly.	2000-09-17 23:57:52 +00:00
Bruce Evans	33510ef17a	Unpessimized CURSIG(). The fast path through CURSIG() was broken in the 128-bit sigset_t changes by moving conditionally (rarely) executed code to the beginning where it is always executed, and since this code now involves 3 128-bit operations, the pessimization was relatively large. This change speeds up lmbench's pipe latency benchmark by 3.5%. Fixed style bugs in CURSIG().	2000-09-17 15:12:04 +00:00
Bruce Evans	fbbeeb6cd6	Uninlined CURSIG() and unpolluted <sys/signalvar.h>. CURSIG() had become very bloated, first with 128-bit sigset_t's, then with locking in the SMP case, then with locking in all cases. The space bloat was probably also time bloat, partly because the fast path through CURSIG() was pessimized by the sigset_t changes. This change speeds up lmbench's pipe-based latency benchmark by 4% on a Celeron. <sys/signalvar.h> had become very polluted to support the bloat.	2000-09-17 14:28:33 +00:00
Bruce Evans	621dbe43df	Added used include of <sys/mutex.h> (don't depend on pollution in <sys/signalvar.h>).	2000-09-17 12:20:49 +00:00
Boris Popov	3ff1a2f43e	Add new flag PDIRUNLOCK to the component.cn_flags which should be set by filesystem lookup() routine if it unlocks parent directory. This flag should be carefully tracked by filesystems if they want to work properly with nullfs and other stacked filesystems. VFS takes advantage of this flag to perform semantically correct usage of vrele() instead of vput() if parent directory already unlocked. If filesystem fails to track this flag then previous codepath in VFS left unchanged. Convert UFS code to set PDIRUNLOCK flag if necessary. Other filesystems will be changed after some period of testing. Reviewed in general by: mckusick, dillon, adrian Obtained from: NetBSD	2000-09-17 07:26:42 +00:00
Poul-Henning Kamp	c866ec47e3	Make LINT compile.	2000-09-16 18:55:05 +00:00
Poul-Henning Kamp	fc87418be0	Turn dkcksum() into an __inline function. Change its type to u_int_16_t.	2000-09-16 13:43:00 +00:00
John Baldwin	4cc6117e1d	Remove some commented out cruft.	2000-09-15 23:00:46 +00:00
John Baldwin	7ab37af1ed	- Add a new process flag P_NOLOAD that marks a process that should be ignored during load average calculations. - Set this flag for the idle processes and the softinterrupt process.	2000-09-15 22:00:23 +00:00
John Baldwin	f6a0af8015	Idle processes are always runnable, so let them stay at SRUN.	2000-09-15 19:49:48 +00:00
John Baldwin	db72809d24	Release Giant before starting up init. Submitted by: jake	2000-09-15 19:25:29 +00:00
Don Lewis	42fd51cedc	Enforce process limit policy in one place to keep proccnt from diverging from reality.	2000-09-14 23:07:39 +00:00
John Baldwin	606f8eb27a	Remove the mtx_t, witness_t, and witness_blessed_t types. Instead, just use struct mtx, struct witness, and struct witness_blessed. Requested by: bde	2000-09-14 20:15:16 +00:00
Jonathan Lemon	6f451c99b3	Pipes are not writeable while a direct write is in progress. However, the kqueue filter got the sense of the test reversed, so fix it. Spotted by: Michael Elkins <me@sigpipe.org>	2000-09-14 20:10:19 +00:00
Eivind Eklund	98d39ed48b	Add function comments for functions missing them	2000-09-14 19:13:59 +00:00
Eivind Eklund	1d95078aaf	Blow away COMPAT_43 support for mount	2000-09-14 18:11:44 +00:00
Eivind Eklund	cb144e905c	GC vax-only code	2000-09-14 16:51:47 +00:00
John Baldwin	9a94c9c5c3	- Remove the inthand2_t type and use the equivalent driver_intr_t type from newbus for referencing device interrupt handlers. - Move the 'struct intrec' type which describes interrupt sources into sys/interrupt.h instead of making it just be a x86 structure. - Don't create 'ithd' and 'intrec' typedefs, instead, just use 'struct ithd' and 'struct intrec' - Move the code to translate new-bus interrupt flags into an interrupt thread priority out of the x86 nexus code and into a MI ithread_priority() function in sys/kern/kern_intr.c. - Remove now-unneeded x86-specific headers from sys/dev/ata/ata-all.c and sys/pci/pci_compat.c.	2000-09-13 18:33:25 +00:00
Bruce Evans	9c15b3c143	Fixed hang on booting with -d. mtx_enter() was called on an uninitialized lock. The quick fix in trap.c was not quite the version tested and had no effect; back it out.	2000-09-13 12:40:43 +00:00
Boris Popov	6413416817	Unlock current directory when calling VFS_ROOT() because underlying filesystem may hold the lock. Otherwise unavoidable deadlock will occur. This shouldn't have any side effects as long as we hold vfs lock. Obtained from: NetBSD	2000-09-13 08:57:56 +00:00
John Baldwin	77044cb6d9	Clean up process accounting some more. Unfortunately, it is still not quite right on i386 as the CPU who runs statclock() doesn't have a valid clockframe to calculate statistics with.	2000-09-12 18:57:59 +00:00
Bruce Evans	bbbb2579b4	Quick fix for hang on booting with -d. mtx_enter() was called before curproc was initialized. curproc == NULL was interpreted as matching the process holding Giant... Just skip mtx_enter() and mtx_exit() in trap() if (curproc == NULL && cold) (&& cold for safety).	2000-09-12 18:41:56 +00:00
Boris Popov	9ff5ce6baf	Add three new VOPs: VOP_CREATEVOBJECT, VOP_DESTROYVOBJECT and VOP_GETVOBJECT. They will be used by nullfs and other stacked filesystems to support full cache coherency. Reviewed in general by: mckusick, dillon	2000-09-12 09:49:08 +00:00
John Baldwin	4a6404dfc1	Fix some printf format string warnings due to sizeof(int) != sizeof(long) on the alpha.	2000-09-11 23:55:10 +00:00
Poul-Henning Kamp	5ef2707e6e	Prevent multiple make_dev() calls on the same dev_t and similar bogosities. A couple of new warnings may be emitted during boot if drivers DTWT. Tested by: George Cox <gjvc@gjvc.com>	2000-09-11 17:15:33 +00:00
John Baldwin	b162b45509	When doing statistics for statclock on other CPU's, use the other CPUs' idleproc pointers instead of our own for comparisons. Submitted by: tegge	2000-09-11 04:10:29 +00:00
John Baldwin	a93a7807b2	aio processes need to have the Giant mutex before doing work. Submitted by: tegge	2000-09-11 04:06:48 +00:00
Jason Evans	69ef67f983	Add malloc_mtx to protect malloc and friends, so that they're thread-safe. Reviewed by: peter	2000-09-11 02:32:30 +00:00
Jake Burkholder	817bf5d4a6	Rename tsleep to msleep and add a mutex argument, which is released before sleeping and re-acquired before msleep returns. A compatibility cpp macro has been provided for tsleep to avoid changing all occurrences of it in the kernel. Remove an assertion that the Giant mutex be held before calling tsleep or asleep. This is intended to serve the same purpose as condition variables, but does not preclude their addition in the future. Approved by: jasone Obtained from: BSD/OS	2000-09-11 00:20:02 +00:00
Jason Evans	62820f25f5	Allow interrupt threads to run during shutdown. This should fix the "dirty buffers during shutdown" problem introduced by the SMPng commit. Submitted by: tegge, cg	2000-09-10 23:06:50 +00:00
Doug Rabson	36240ea5bf	Move the include of <sys/systm.h> so that KTR gets a declaration for snprintf().	2000-09-10 13:54:52 +00:00
Doug Rabson	4eb38057ea	Fix printf warnings in CTRx calls.	2000-09-10 13:34:35 +00:00
Doug Rabson	21ac8e0b77	Move the include of <sys/systm.h> so that KTR gets a declaration for snprintf().	2000-09-10 13:33:31 +00:00
Poul-Henning Kamp	8925e63cd3	Updates to the ntp pll from John Hay. Submitted by: jhay	2000-09-10 09:13:34 +00:00
Boris Popov	67b23794b1	Change variable naming to be consistent with the rest of VFS code. Reduce number of indirections by using already fetched values.	2000-09-10 03:46:12 +00:00
Jason Evans	28d4c2dde3	Back out the addition of malloc_mtx. It was incompletely conceived, and will be done correctly in the future.	2000-09-10 01:54:15 +00:00
Jason Evans	5340642a2e	Style cleanups. No functional changes.	2000-09-09 23:18:48 +00:00
Jason Evans	46bf3fe5a6	Add file and line arguments to WITNESS_ENTER() and WITNESS_EXIT, since __FILE__ and __LINE__ don't get expanded usefully in inline functions. Add const to all witness*() arguments that are filenames.	2000-09-09 22:43:22 +00:00
Jason Evans	9360d3ebdd	Add a mutex to the malloc interfaces so that it can safely be called without owning the Giant lock.	2000-09-09 22:27:35 +00:00
Poul-Henning Kamp	8d25eb2c3a	Add code to devname(3) so it can find the names of devices which were not present when dev_mkdb(8) was run. First the dev_mkdb(8) database is searched, this caters for non-DEVFS cases where people have renamed a device. If that fails we ask the kernel using sysctl kern.devname if the device driver has put a name in the dev_t. This covers DEVFS cloned devices. If that also fails we format a string which isn't entirely useless.	2000-09-09 11:39:59 +00:00
Jason Evans	12473b76dc	Rename mtx_enter(), mtx_try_enter(), and mtx_exit() and wrap them with cpp macros that expand to pass filename and line number information. This is necessary since we're using inline functions instead of macros now. Add const to the filename pointers passed throughout the mtx and witness code.	2000-09-08 21:48:06 +00:00
John Baldwin	1baab78f9e	Remove an unneeded extern declaration of cp_time.	2000-09-08 20:18:29 +00:00
Jake Burkholder	4ef34f39ec	Really fix USER_LDT. (Don't use currentldt as an L-value.)	2000-09-08 03:36:09 +00:00
Jason Evans	0384fff8c5	Major update to the way synchronization is done in the kernel. Highlights include: * Mutual exclusion is used instead of spl(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.) * Per-CPU idle processes. * Interrupts are run in their own separate kernel threads and can be preempted (i386 only). Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh	2000-09-07 01:33:02 +00:00
Jason Evans	62ae6c89ad	Add KTR, a facility that logs kernel events in order to facilitate debugging. Acquired from: BSDi (BSD/OS) Submitted by: dfr, grog, jake, jhb	2000-09-07 01:29:44 +00:00
Don Lewis	c5930ee45a	Change the calls to panic() in uifree(), chgproccnt(), and chgsbsize() to printf(). Any errors detected are not likely to be fatal, so it should be safe to let things keep running.	2000-09-06 19:00:19 +00:00
Alfred Perlstein	34b94e8b82	Accept filter maintenance Update copyrights. Introduce a new sysctl node: net.inet.accf Although acceptfilters need refcounting to be properly (safely) unloaded as a temporary hack allow them to be unloaded if the sysctl net.inet.accf.unloadable is set, this is really for developers who want to work on their own filters. A near complete re-write of the accf_http filter: 1) Parse check if the request is HTTP/1.0 or HTTP/1.1 if not dump to the application. Because of the performance implications of this there is a sysctl 'net.inet.accf.http.parsehttpversion' that when set to non-zero parses the HTTP version. The default is to parse the version. 2) Check if a socket has filled and dump to the listener 3) optimize the way that mbuf boundaries are handled using some voodoo 4) even though you'd expect accept filters to only be used on TCP connections that don't use m_nextpkt I've fixed the accept filter for socket connections that use this. This rewrite of accf_http should allow someone to use them and maintain full HTTP compliance as long as net.inet.accf.http.parsehttpversion is set.	2000-09-06 18:49:13 +00:00
Peter Wemm	fc611b0634	Do not panic on an uninitialized VOP_xxx() call. This was meant as a sanity check, but it is too easy to run into, eg: making an ACL syscall when no filesystems have the ACL implementation enabled. The original reason for the panic was that the VOP_ vector had not been assigned and therefore could not be passed down the stack.. and there was no point passing it down since nothing implemented it anyway. vop_defaultop entries could not pass it on because it had a zero (unknown) vector that was indistinguishable from another unknown VOP vector. Anyway, we can do something reasonable in this case, we shouldn't need to panic here as there is a reasonable recovery option (return EOPNOTSUPP and don't pass it down the stack). Requested by: rwatson	2000-09-06 17:51:54 +00:00
Robert Watson	728783c27a	o Synchronize vaccess() capability access control checks with TrustedBSD tree. Obtained from: TrustedBSD Project	2000-09-06 12:18:24 +00:00
David E. O'Brien	6b6821c771	The kernel is now known as `kernel.ko' and it and its matching modules live in ``/boot/kernel/''. Submitted by: Hisashi Hiramoto <hiramoto@phys.chs.nihon-u.ac.jp>	2000-09-06 06:22:20 +00:00
Boris Popov	9548091b84	Ignore ELF files with 'interpreter' section because KLDs don't contain it. Reviewed by: peter	2000-09-06 02:21:43 +00:00
Don Lewis	f535380cb6	Remove uidinfo hash table lookup and maintenance out of chgproccnt() and chgsbsize(), which are called rather frequently and may be called from an interrupt context in the case of chgsbsize(). Instead, do the hash table lookup and maintenance when credentials are changed, which is a lot less frequent. Add pointers to the uidinfo structures to the ucred and pcred structures for fast access. Pass a pointer to the credential to chgproccnt() and chgsbsize() instead of passing the uid. Add a reference count to the uidinfo structure and use it to decide when to free the structure rather than freeing the structure when the resource consumption drops to zero. Move the resource tracking code from kern_proc.c to kern_resource.c. Move some duplicate code sequences in kern_prot.c to separate helper functions. Change KASSERTs in this code to unconditional tests and calls to panic().	2000-09-05 22:11:13 +00:00
Poul-Henning Kamp	64dc16df4a	Move extern declaration of dead_vnodeop_p to a .h file. Remove race condition in vn_isdisk().	2000-09-05 21:09:56 +00:00
Robert Watson	e81c5f4307	o vn_extattr_set() will now call appropriate vn_start_write() and vn_finished_write() if IO_NODELOCKED is not set. Obtained from: TrustedBSD Project	2000-09-05 03:15:02 +00:00
Robert Watson	b4d0de586d	o Remove commented out code which modified return values from extattr_{get,set} syscalls in the face of partial reads or writes. Obtained from: TrustedBSD Project	2000-09-05 02:13:14 +00:00
Peter Wemm	58da4602af	When we are picking the next available unit number, specifically say what we picked. Otherwise it is anybody's guess as to where the device ended up.	2000-09-05 00:30:46 +00:00
Poul-Henning Kamp	97804a5c99	Update the NTP kernel PLL code to the 2000-08-29 version of Dave Mills nanokernel. The FreeBSD private mode hardpps Type 2 PLL has been removed.	2000-09-04 08:19:32 +00:00
Alan Cox	b70158bae1	Make filt_aio() check the jobstate for JOBST_JOBBFINISHED (in addition to JOBST_JOBFINISHED) in case the aio_read() or aio_write() was performed via the high-performance physio method, i.e., aio_qphysio().	2000-09-04 07:56:32 +00:00
Peter Wemm	82acbcf57b	kern_shutdown.c was more ANSI-C than K&R - remove the remnants of K&R support with extreme prejudice.	2000-09-03 06:44:53 +00:00
Peter Wemm	87de370376	gcc knows that savectx() is potentially a setjmp style dual-return function which may lead to stack lossage and clobbered variables. This isn't the case here, but there is no way to tell gcc that. Work around this in a kinda bizarre way, but it shuts gcc up.	2000-09-03 06:35:04 +00:00
Poul-Henning Kamp	db90128160	Avoid the modules madness I inadvertently introduced by making the cloning infrastructure standard in kern_conf. Modules are now the same with or without devfs support. If you need to detect if devfs is present, in modules or elsewhere, check the integer variable "devfs_present". This happily removes an ugly hack from kern/vfs_conf.c. This forces a rename of the eventhandler and the standard clone helper function. Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include like <sys/queue.h> Remove all #includes of opt_devfs.h they no longer matter.	2000-09-02 19:17:34 +00:00
Don Lewis	8577117cc8	access() shouldn't diddle with the contents of a potentially shared credential. Create a temporary copy of the current credential and modify the copy. Submitted by: tegge	2000-09-02 12:31:55 +00:00
Brian Feldman	468ddc8a44	Casts are needed to subtract u_longs. Submitted by: tor	2000-08-31 22:21:33 +00:00
Robert Watson	c52396e365	o p_cansee() wasn't setting privused when suser() was required to override kern.ps_showallprocs. Apparently got lost in the merge process from the capability patches. Now fixed. Submitted by: jdp Obtained from: TrustedBSD Project	2000-08-31 15:55:17 +00:00
Brian Feldman	b6240737d5	Fix hangs caused by overzealous code removal. Thanks, Nickolay, for figuring out this is the problem. Submitted by: Nickolay Dudorov <nnd@mail.nsk.ru>	2000-08-31 11:31:58 +00:00
Mike Smith	3e755f76d1	Make it possible to pass boot()'s flags to shutdown_nice() so that the kernel can instigate an orderly shutdown but still determine the form of that shutdown. Make it possible eg. to cleanly shutdown and power off the system under ACPI when the power button is pressed.	2000-08-31 00:08:50 +00:00
Robert Watson	387d2c036b	o Centralize inter-process access control, introducing: int p_can(p1, p2, operation, privused) which allows specification of subject process, object process, inter-process operation, and an optional call-by-reference privused flag, allowing the caller to determine if privilege was required for the call to succeed. This allows jail, kern.ps_showallprocs and regular credential-based interaction checks to occur in one block of code. Possible operations are P_CAN_SEE, P_CAN_SCHED, P_CAN_KILL, and P_CAN_DEBUG. p_can currently breaks out as a wrapper to a series of static function checks in kern_prot, which should not be invoked directly. o Commented out capabilities entries are included for some checks. o Update most inter-process authorization to make use of p_can() instead of manual checks, PRISON_CHECK(), P_TRESPASS(), and kern.ps_showallprocs. o Modify suser{,_xxx} to use const arguments, as it no longer modifies process flags due to the disabling of ASU. o Modify some checks/errors in procfs so that ENOENT is returned instead of ESRCH, further improving concealment of processes that should not be visible to other processes. Also introduce new access checks to improve hiding of processes for procfs_lookup(), procfs_getattr(), procfs_readdir(). Correct a bug reported by bp concerning not handling the CREATE case in procfs_lookup(). Remove volatile flag in procfs that caused apparently spurious qualifier warnings (approved by bde). o Add comment noting that ktrace() has not been updated, as its access control checks are different from ptrace(), whereas they should probably be the same. Further discussion should happen on this topic. Reviewed by: bde, green, phk, freebsd-security, others Approved by: bde Obtained from: TrustedBSD Project	2000-08-30 04:49:09 +00:00
Robert Watson	c6fac29aff	o Disable flagging of ASU in suser_xxx() authorization check. For the time being, the ASU accounting flag will no longer be available, but may be reinstituted in the future once authorization have been redone. As it is, the kernel went through contortions in access control to avoid calling suser, which always set the flag. This will also allow suser to accept const struct *{cred, proc} arguments. Reviewed by: bde, green, phk, freebsd-security, others Approved by: bde Obtained from: TrustedBSD Project	2000-08-30 04:35:32 +00:00
Brian Feldman	343079d9b2	Remove an extraneous setting of sb_hiwat.	2000-08-30 00:09:57 +00:00
Robert Watson	012c643d3e	o Restructure vaccess() so as to check for DAC permission to modify the object before falling back on privilege. Make vaccess() accept an additional optional argument, privused, to determine whether privilege was required for vaccess() to return 0. Add commented out capability checks for reference. Rename some variables to make it more clear which modes/uids/etc are associated with the object, and which with the access mode. o Update file system use of vaccess() to pass NULL as the optional privused argument. Once additional patches are applied, suser() will no longer set ASU, so privused will permit passing of privilege information up the stack to the caller. Reviewed by: bde, green, phk, -security, others Obtained from: TrustedBSD Project	2000-08-29 14:45:49 +00:00
Brian Feldman	6aef685fbb	Remove any possibility of hiwat-related race conditions by changing the chgsbsize() call to use a "subject" pointer (&sb.sb_hiwat) and a u_long target to set it to. The whole thing is splnet(). This fixes a problem that jdp has been able to provoke.	2000-08-29 11:28:06 +00:00
Doug Rabson	f80e454726	Add kobj_class_compile_static() to allow classes to be initialised statically (i.e. without calling malloc). This allows kobj to be used very early in the boot sequence.	2000-08-28 21:11:12 +00:00
Doug Rabson	a4f9b116e3	* Remove a bogus call to kobj_init() from make_device(). * Add a non-empty implementation of root_print_child().	2000-08-28 21:08:12 +00:00
Marcel Moolenaar	3f4ab6537f	Regen: fix prototypes for {o\|}{g\|s}etrlimit.	2000-08-28 07:56:38 +00:00
Marcel Moolenaar	ae51d56ce1	Fix prototypes for {o\|}{g\|s}etrlimit. A recent change in the Linuxulator caused this bug to trigger.	2000-08-28 07:50:44 +00:00
Alfred Perlstein	c58b821e4c	new sysctl 'kern.openfiles' (exports nfiles to userland)	2000-08-26 23:49:44 +00:00
Robert Watson	877dd71fc6	o Correct spelling of ufs_exttatr_find_attr -> ufs_extattr_find_attr o Add "const" qualifier to attrname argument of various calls to remove warnings Obtained from: TrustedBSD Project	2000-08-26 22:00:58 +00:00
Marcel Moolenaar	31c8f3f0af	Make this file compile again when COMPAT_43 has not been defined. This boils down to conditionally compile the old signal syscalls. We might want to extend the types in syscalls.master to make these syscalls conditionally on something more appropriate than COMPAT_43.	2000-08-26 02:27:01 +00:00
Peter Wemm	9579e8c145	m_mballoc_wait() had a spl/tsleep race. mbufs can be freed in interrupt context, which can cause a wakeup.. which can race with this.	2000-08-25 22:28:08 +00:00
Peter Wemm	12db06a04f	If the config program found a hints file and included it as a fallback, then treat it as such. This isn't perfect, but should do for things like GENERIC. When in fallback mode, they will be used if there are NO other hints.	2000-08-25 19:48:10 +00:00
Poul-Henning Kamp	d8cd1501f2	Dang, a _clone routine escaped #ifdef DEVFS containment.	2000-08-24 15:59:44 +00:00
Poul-Henning Kamp	a481b90b82	Fix panic when removing open device (found by bp@). Implement subdirs. Build the full "devicename" for cloning functions. Fix panic when deleted device goes away. Collapse devfs_dir and devfs_dirent structures. Add proper cloning to the /dev/fd* "device-"driver. Fix a bug in make_dev_alias() handling which made aliases appear multiple times. Use devfs_clone to implement getdiskbyname(). Make specfs maintain the stat(2) timestamps per dev_t.	2000-08-24 15:36:55 +00:00
Brian Feldman	0a7d171157	Revert the suser -> suser_xxx change made previously. It was right before.	2000-08-24 04:54:31 +00:00
Paul Saab	03f808c55a	Add a sysctl which hides all process except those that belong to the user asking for the process list. Reviewed by: peter	2000-08-23 21:41:25 +00:00
Poul-Henning Kamp	3f54a085a6	Remove all traces of Julian's DEVFS (incl from kern/subr_diskslice.c) Remove old DEVFS support fields from dev_t. Make uid, gid & mode members of dev_t and set them in make_dev(). Use correct uid, gid & mode in make_dev in disk minilayer. Add support for registering alias names for a dev_t using the new function make_dev_alias(). These will show up as symlinks in DEVFS. Use makedev() rather than make_dev() for MFS's magic devices to prevent DEVFS from noticing this abuse. Add a field for DEVFS inode number in dev_t. Add new DEVFS in fs/devfs. Add devfs cloning to: disk minilayer (ie: ad(4), sd(4), cd(4) etc etc) md(4), tun(4), bpf(4), fd(4) If DEVFS add -d flag to /sbin/init's args to make it mount devfs. Add commented out DEVFS to GENERIC	2000-08-20 21:34:39 +00:00
Poul-Henning Kamp	4fe6d43729	Fix typo in last commit.	2000-08-20 11:46:39 +00:00
Poul-Henning Kamp	e39c53eda5	Centralize the canonical vop_access user/group/other check in vaccess(). Discussed with: bde	2000-08-20 08:36:26 +00:00
David Malone	a5c4836d39	Replace the mbuf external reference counting code with something that should be better. The old code counted references to mbuf clusters by using the offset of the cluster from the start of memory allocated for mbufs and clusters as an index into an array of chars, which did the reference counting. If the external storage was not a cluster then reference counting had to be done by the code using that external storage. NetBSD's system of linked lists of mbufs was considered, but Alfred felt it would have locking issues when the kernel was made more SMP friendly. The system implemented uses a pool of unions to track external storage. The union contains an int for counting the references and a pointer for forming a free list. The reference counts are incremented and decremented atomically and so should be SMP friendly. This system can track reference counts for any sort of external storage. Access to the reference counting stuff is now through macros defined in mbuf.h, so it should be easier to make changes to the system in the future. The possibility of storing the reference count in one of the referencing mbufs was considered, but was rejected 'cos it would often leave extra mbufs allocated. Storing the reference count in the cluster was also considered, but because the external storage may not be a cluster this isn't an option. The size of the pool of reference counters is available in the stats provided by "netstat -m". PR: 19866 Submitted by: Bosko Milekic <bmilekic@dsuper.net> Reviewed by: alfred (glanced at by others on -net)	2000-08-19 08:32:59 +00:00
Poul-Henning Kamp	39f70682ae	Introduce vop_stdinactive() and make it the default if no vop_inactive is declared. Sort and prune a few vop_op[].	2000-08-18 10:01:02 +00:00
Brian Feldman	9b96968623	Fix a couple cases where p_trespass wasn't transitioned into place. Make RTP_SET (rtprio) only accessible to real root, not root in jails.	2000-08-16 23:28:54 +00:00
Peter Wemm	37b087a645	Clean up some low level bootstrap code: - stop using the evil 'struct trapframe' argument for mi_startup() (formerly main()). There are much better ways of doing it. - do not use prepare_usermode() - setregs() in execve() will do it all for us as long as the p_md.md_regs pointer is set. (which is now done in machdep.c rather than init_main.c. The Alpha port did it this way all along and is much cleaner). - collect all the magic %cr0 etc register settings into one place and have the AP's call that instead of using magic numbers (!!) that keep changing over and over again. - Make it safe to call kthread_create() earlier, including during the device probe sequence. It doesn't need the callback mechanism that NetBSD's version uses. - kthreads created this way are root-less as they exist before the root filesystem is mounted. init(1) is set up so that it acquires the root pointers prior to running. If other kthreads want filesystem access we can make this code more generic. - set all threads start times once we have decided what time it is. - init uses a trampoline rather than the evil prepare_usermode() hack. - kern_descrip.c has a couple of tweaks to deal with forking when there is no rootdir or cwd etc. - adjust the early SYSINIT() sequence so that a few prerequisites are in place. eg: make sure the run queue is initialized before doing forks. With this, the USB code can easily create a kthread to do the device tree discovery. (I have tested it, it works nicely). There are still some open issues before this is truly useful. - tsleep() does not like working before the clock is running. It sort-of tries to spin wait, but it can do more useful things now. - stopping a kthread in kld code at unload time is "interesting" but we have a solution for that. The Alpha code needs no changes for this. It already uses pretty much the same strategies, but a little cleaner.	2000-08-11 09:05:12 +00:00
Tor Egge	4428d39d63	Don't skip IOAPIC id conflict detection when only one pci bus is present. PR: 20312 Reviewed by: Steve Roome <steve@sse0691.bri.hp.com>	2000-08-10 17:33:24 +00:00
Tor Egge	3c2498c0d3	Don't set flags on the mount structure before all permission checks have been done. Don't allow multiple mount operations with MNT_UPDATE at the same time on the same mount point. When the first mount operation completed, MNT_UPDATE was cleared in the mount structure, causing the second to complete as if it was a no-update mount operation with the following bad side effects: - mount structure inserted multiple times onto the mountlist - vp->v_mountedhere incorrectly set, causing next namei operation walking into the mountpoint to crash with a locking against myself panic. Plug a vnode leak in case vinvalbuf fails.	2000-08-09 01:57:11 +00:00
Robert Watson	e6a9ab52db	o Introduce vn_extattr_{get,set}, wrapper routines for VOP_GETEXTATTR and VOP_SETEXTATTR to simplify calling from in-kernel consumers, such as capability code. Both accept a vnode (optionally locked, with ioflg to indicate that), attribute name, and a buffer + buffer length in UIO_SYSSPACE. Both authorize the call as a kernel request, with cred set to NULL for the actual VOP_ calls. Obtained from: TrustedBSD Project	2000-08-08 17:15:32 +00:00
Jonathan Lemon	a114459191	Make the kqueue socket read filter honor the SO_RCVLOWAT value. Spotted by: "Steve M." <stevem@redlinenetworks.com>	2000-08-07 17:52:08 +00:00
Jonathan Lemon	ad91b6a280	Fix bug with timeout; previously, when attempting to poll the kqueue by passing a zero-valued timeout, the code would always sleep for one tick. Change code to avoid calling tsleep if we have no intention of sleeping. Bring in bugfix from sys_select.c, r1.60 which also applies here. Modify error handling slightly; passing in an invalid fd will now result in EBADF returned in the eventlist, while an attempt to change a knote which does not exist will result in ENOENT being returned. Previously such attempts would fail silently without notification. Pointed out by: nicolas.leonard@animaths.com Rick Reed (rr@yahoo-inc.com)	2000-08-07 16:45:42 +00:00
Paul Saab	c206a8609e	Change the behavior of isa_nmi to log an error message instead of panicking and return a status so that we can decide whether to drop into DDB or panic. If the status from isa_nmi is true, panic the kernel based on machdep.panic_on_nmi, otherwise if DDB is enabled, drop to DDB based on machdep.ddb_on_nmi. Reviewed by: peter, phk	2000-08-06 14:17:21 +00:00
Tor Egge	e666f57c3e	Be more verbose when changing APIC ID on an IO APIC. Don't allow cpu entries in the MP table to contain APIC IDs out of range. Don't write outside array boundaries if an IO APIC entry in the MP table contains an APIC ID out of range. Assign APIC IDs for all IO APICs according to section 3.6.6 in the Intel MP spec: - If the current APIC ID on an IO APIC doesn't conflict with other IO APICs or CPUs, that APIC ID should be used. The copy of the MP table must be updated if the corresponding APIC ID in the MP table is different. - If the current APIC ID was in conflict with other units, the corresponding APIC ID specified in the MP table is checked for conflict. - If a conflict is still found then fall back to using a new unique ID. The copy of the MP table must be updated. - IDs out of range are considered to be in conflict. During these operations, the IO_TO_ID array cannot be used, since any conflict would have caused information loss. The array is then corrected, since all APIC ID conflicts should have been resolved. PR: 20312, 18919	2000-08-06 00:04:03 +00:00
Jeffrey Hsu	51b86781c0	Modify to use fixed STAILQ_LAST(). Reviewed by: dfr	2000-08-03 16:37:46 +00:00
Peter Wemm	2c7f8b4ebd	Fix self referential dependencies. eg: uhub was packaged along with usb, all in usb.ko. uhub depends on usb. The bug was that the preload processing only adds a module to the list once its internal dependencies are resolved... Since it was not "seeing" the internal usb module it believed that uhub had a missing dependency.	2000-08-02 21:08:53 +00:00
Peter Wemm	af4b2d2d1c	Fix the SYSINIT() bubble sort. This was fixed in kern_linker.c already.	2000-08-02 21:05:21 +00:00
Jonathan Lemon	1dfd47607b	Back out rev 1.12; it's not clear that this is the right thing to do, and in any event, it wasn't done correctly in the first place.	2000-08-01 04:27:50 +00:00
Luoqi Chen	3fb50adb4c	Handle write page faults (both write only or read-modify-write) as MI vm write-only faults. This would allow write-only mmapped regions to function correctly.	2000-07-31 14:47:14 +00:00
Alfred Perlstein	9ad48853de	mbstat should be a read-only sysctl. Submitted by: Bosko Milekic <bmilekic@dsuper.net>	2000-07-31 09:24:32 +00:00
Paul Saab	030f7b3faa	Remove unnecessary call to splnet when setting an accept filter since we are already at splnet.	2000-07-31 08:23:43 +00:00
Peter Wemm	3a285cc807	Regen. (Fix SYS_exit)	2000-07-29 10:07:38 +00:00
Peter Wemm	4e0f152bbe	Sigh. Fix SYS_exit problems. I misunderstood the significance of these trailing options.	2000-07-29 10:05:25 +00:00
Paul Saab	0e461cb7e2	Remove this file in case of further confusion.	2000-07-29 04:09:07 +00:00
Peter Wemm	69065e880a	Regenerate with makesyscalls.sh	2000-07-29 00:21:50 +00:00
Peter Wemm	ac2b067b9a	Change the 'exit()' system call to 'sys_exit()'. This avoids overlapping gcc's internal exit() prototypes and the (futile) hackery that we did to try and avoid warnings. main() was renamed for similar reasons. Remove an exit related hack from makesyscalls.sh.	2000-07-29 00:16:28 +00:00
Peter Wemm	5dec52bada	Fix the #ifdef VFS_AIO to not compile a whole bunch of unused stuff in the !VFS_AIO case. Lots of things have hooks into here (kqueue, exit(), sockets, etc), I elected to keep the external interfaces the same rather than spread more #ifdefs around the kernel.	2000-07-28 23:10:10 +00:00
Peter Wemm	f7ce4efc8a	Fix a const related warning.	2000-07-28 22:41:56 +00:00
Peter Wemm	93e8459a02	Fix some style nits. Fix(?) some compile warnings regarding const handling.	2000-07-28 22:40:04 +00:00
Peter Wemm	c828c7b784	Fix warnings - make kevent args in comment match those in syscalls.master. Deal with consts.	2000-07-28 22:32:25 +00:00
Peter Wemm	b31ae1adc5	Fix a warning that has been annoying me for some time: "kern/sys_generic.c:358: warning: cast discards qualifiers from pointer target type" The idea for using the uintptr_t intermediate cast for de-constifying a pointer was hinted at by bde some time ago.	2000-07-28 22:17:42 +00:00
Robert Watson	fc3345a4a7	o Modify extattr_{set,get}() syscalls so that partial reads and writes with an error condition such as EINTR, EWOULDBLOCK, and ERESTART, are reported to the application, not silently concealed. This behavior was copied from the {read,write}v() syscalls, and is appropriate there but not here. o Correct a bug in extattr_delete() wherein the LOCKLEAF flag is passed to the wrong argument in namei(), resulting in some unexpected errors during name resolution, and passing in an unlocked vnode. Obtained from: TrustedBSD Project	2000-07-28 19:52:38 +00:00
Jonathan Lemon	ab2adc20f2	Have kevent() automatically restart if interrupted by a signal. If this is not desired, then the user can register an EV_SIGNAL filter to explicitly catch a signal event. Change requested by: jayanth, ps, peter "Why is kevent non-restartable after a signal?"	2000-07-27 23:06:14 +00:00
Brian Feldman	3c89e357f0	Distinguish between whether ktraceing was enabled before an IO operation or after it. If the ktrace operation was enabled while the process was blocked doing IO, the race would allow it to pass down invalid (uninitialized) data and panic later down the call stack.	2000-07-27 03:45:18 +00:00
Robert Watson	3ce7b7aa84	o Lock vnode before calling extattr_* VOP's, and modify vnode spec to allow for that. o Remember to call NDFREE() if exiting as a result of a failed vn_start_write() when snapshotting. Reviewed by: mckusick Obtained from: TrustedBSD Project	2000-07-26 20:29:20 +00:00
Kirk McKusick	54e53ebda7	Now that buffer locks can be recursive, we need to delete the panics that complain about them. Obtained from: Brian Fundakowski Feldman <green@FreeBSD.org>	2000-07-25 18:28:46 +00:00
Kirk McKusick	aec3bbe11c	Do not need vrele(nd.ni_vp) as that is done by NDFREE(&nd, 0); Submitted by: Peter Holm <pho@freebsd.org>	2000-07-25 05:38:54 +00:00
Robert Watson	e2e45aa8a0	o Add missing function return types from capability syscall call stubs, fix compiler warning. Submitted by: jake	2000-07-25 03:37:36 +00:00
Kirk McKusick	9b97113391	This patch corrects the first round of panics and hangs reported with the new snapshot code. Update addaliasu to correctly implement the semantics of the old checkalias function. When a device vnode first comes into existence, check to see if an anonymous vnode for the same device was created at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than creating a new vnode for the device. This corrects a problem which caused the kernel to panic when taking a snapshot of the root filesystem. Change the calling convention of vn_write_suspend_wait() to be the same as vn_start_write(). Split out softdep_flushworklist() from softdep_flushfiles() so that it can be used to clear the work queue when suspending filesystem operations. Access to buffers becomes recursive so that snapshots can recursively traverse their indirect blocks using ffs_copyonwrite() when checking for the need for copy on write when flushing one of their own indirect blocks. This eliminates a deadlock between the syncer daemon and a process taking a snapshot. Ensure that softdep_process_worklist() can never block because of a snapshot being taken. This eliminates a problem with buffer starvation. Cleanup change in ffs_sync() which did not synchronously wait when MNT_WAIT was specified. The result was an unclean filesystem panic when doing forcible unmount with heavy filesystem I/O in progress. Return a zero'ed block when reading a block that was not in use at the time that a snapshot was taken. Normally, these blocks should never be read. However, the readahead code will occasionally read them which can cause unexpected behavior. Clean up the debugging code that ensures that no blocks be written on a filesystem while it is suspended. Snapshots must explicitly label the blocks that they are writing during the suspension so that they do not cause a `write on suspended filesystem' panic. Reorganize ffs_copyonwrite() to eliminate a deadlock and also to prevent a race condition that would permit the same block to be copied twice. This change eliminates an unexpected soft updates inconsistency in fsck caused by the double allocation. Use bqrelse rather than brelse for buffers that will be needed soon again by the snapshot code. This improves snapshot performance.	2000-07-24 05:28:33 +00:00
Brian Feldman	55af4c7d94	Using an atomic operation here won't help if nobody else uses them (for this). Use the simple_lock() on v_interlock like elsewhere.	2000-07-23 22:19:49 +00:00
Brian Feldman	25ead03462	Solve the problem where it is possible to get the kernel stuck in a loop down in pmap_init_pt(). A subtraction causes the number of pages to become negative, that was assigned to an unsigned variable, and there is a lot of iteration. The bug is due to the ELF image activator not properly checking for its files being the correct size as specified by the ELF header. The solution is to check that the header doesn't ask for part of a file when that part of the file doesn't exist. Make sure to set VEXEC at the proper times to make the executables immutable (remove race conditions). Also, the ELF format specifies header entries that allow embedding of other executables (hence how ld-elf.so.1 gets loaded, but not the same as loading shared libraries), so those executables need to be set VEXEC, too, so they're immutable. Reviewed by: peter	2000-07-23 06:49:46 +00:00
Alfred Perlstein	f408896444	only allow accept filter modifications on listening sockets Submitted by: ps	2000-07-20 12:17:17 +00:00
Alfred Perlstein	85f5e7f098	disallow unload until we do proper refcounting	2000-07-20 12:12:41 +00:00
Jonathan Lemon	2ba03123c5	Fix a bug which would cause some knotes to get lost when two kqueues were being used in a process at the same time. Test case provided by: Chris Peiffer <peifferc@CS.Stanford.EDU>	2000-07-18 21:41:47 +00:00
Jonathan Lemon	a8e65b915e	Simplify kqueue API slightly. Discussed on: -arch	2000-07-18 19:31:52 +00:00
Peter Wemm	f03c9f90d1	Patch up some bogons in the resource_find() vs resource_find_hard() interfaces. The original resource_find() returned a pointer to an internal resource table entry. resource_find_hard() dereferences the actual passed in value (oops!) - effectively trashing random memory due to the pointer being passed in with a random initial value. Submitted by: bde	2000-07-18 06:08:27 +00:00
Andrzej Bialecki	bd3cdc3105	These patches implement dynamic sysctls. It's possible now to add and remove sysctl oids at will during runtime - they don't rely on linker sets. Also, the node oids can be referenced by more than one kernel user, which means that it's possible to create partially overlapping trees. Add sysctl contexts to help programmers manage multiple dynamic oids in convenient way. Please see the manpages for detailed discussion, and example module for typical use. This work is based on ideas and code snippets coming from many people, among them: Arun Sharma, Jonathan Lemon, Doug Rabson, Brian Feldman, Kelly Yancey, Poul-Henning Kamp and others. I'd like to specially thank Brian Feldman for detailed review and style fixes. PR: kern/16928 Reviewed by: dfr, green, phk	2000-07-15 10:26:04 +00:00
Alfred Perlstein	af0e6bcdf0	Make mbstat.m_mtypes separate and viewable via sysctl, also expand the size from short to ulong Submitted by: Ian Dowse <iedowse@maths.tcd.ie> PR: kern/19809	2000-07-15 06:02:48 +00:00
Paul Saab	88f675ba30	Change the way NMI's are handled. Before, if DDB was enabled and a NMI occurred, you could type continue in DDB and the kernel would not attempt to detect what type of NMI was received. Now we check for the type of NMI first and then go to DDB if it is enabled. This will solve the problem with having DDB enabled and getting an NMI due to some possibly bad error and being able to continue the operation of the kernel when you really want to panic and know what happened. Submitted by: jhb	2000-07-14 11:49:44 +00:00
Robert Watson	e8483a05a6	o Commit two of two, introducing __cap_{get,set}_{fd,file} syscalls to modify capability sets on files. Obtained from: TrustedBSD Project	2000-07-13 20:38:52 +00:00
Robert Watson	92eebb8a9b	o Introduce syscall prototypes, stubs for __cap_{get,set}_{fd,file}, syscalls to manage capability sets on files. First of two commits. Obtained from: TrustedBSD Project	2000-07-13 20:31:24 +00:00
John Baldwin	9c386f6b7d	For infinite timeouts, set both the tv_sec and tv_usec fields to zero in poll() and select(). Noticed by: Wesley Morgan <morganw@chemicals.tacorp.com>	2000-07-13 02:12:25 +00:00
John Baldwin	4da144c091	Fix a very obscure bug in select() and poll() where the timeout would never expire if poll() or select() was called before the system had been in multiuser for 1 second. This was caused by only checking to see if tv_sec was zero rather than checking both tv_sec and tv_usec.	2000-07-12 22:46:40 +00:00
Jun-ichiro itojun Hagino	f38211642f	remove m_pulldown statistics, which is highly experimental and does not belong to *bsd-merged tree	2000-07-12 16:39:13 +00:00
Kirk McKusick	f2a2857bb3	Add snapshots to the fast filesystem. Most of the changes support the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed. Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timeliness are not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).	2000-07-11 22:07:57 +00:00
Boris Popov	2ff087318a	Correct SYSINIT execution order in the case when KLD contains more than one SYSINIT with the same 'subsystem' id and different 'order' id. Reviewed by: peter	2000-07-09 23:58:56 +00:00
Brian Feldman	7ceba2d755	Remove two micro-pessimizations I made. Bruce is teaching me well :) KTRPOINT(p, KTR_GENIO) is more uncommon than error == 0, so it should be first in the && statement.	2000-07-07 22:11:37 +00:00
Brian Feldman	9d1cfdce2a	Change that &@!$# UIO_READ to be UIO_WRITE. I tested the ktrace stuff, but somehow... pass the pointy hat, again!	2000-07-07 21:52:15 +00:00
Boris Popov	3660ebc2c0	Fix support for more than 256 simultaneous mounts. Theoretical limit is 2^16 mounts per fs type. Reported by: Troy Arie Cobb <tcobb@staff.circle.net> via phk Reviewed by: bde	2000-07-07 14:01:08 +00:00
John Baldwin	9701cd40b4	Support for unsigned integer and long sysctl variables. Update the SYSCTL_LONG macro to be consistent with other integer sysctl variables and require an initial value instead of assuming 0. Update several sysctl variables to use the unsigned types. PR: 15251 Submitted by: Kelly Yancey <kbyanc@posi.net>	2000-07-05 07:46:41 +00:00
Warner Losh	5d10777c46	End two weeks of on and off debugging. Fix the crash on the Nth insertion of a CF card, for random values of N > 1. With these fixes, I've been able to do 100 insert/remove of the cards w/o a crash with lots of system activity going on that in the past would help trigger the crash. The problem: FreeBSD creates dev_t's on the fly as they are needed and never destroys them. These dev_t's point to a struct disk that is used for housekeeping on the disk. When a device goes away, the struct disk pointer becomes a dangling pointer. Sometimes when the device comes back, the pointer will point to the new struct disk (in which case the insertion will work). Other times it won't (especially if any length of time has passed, since it is dependent on memory returned from malloc). The Fix: There is one of these dev_t's that is always correct. The device for the WHOLE_DISK_SLICE is always right. It gets set at create_disk() time. So, the fix is to spend a little CPU time and lookup the WHOLE_DISK_SLICE dev_t and use the si_disk from that in preference to the one that's in the device asking to do the I/O. In addition, we change the test of si_disk == NULL meaning that the dev needed to inherit properties from the pdev to dev->si_disk != pdev->si_disk. This test is a little stronger than the previous test, but can sometimes be fooled into not inheriting. However, the results of this fooling are that the old values will be used, which will generally always be the same as before. si_drv[12] are the only values that are copied that might pose a problem. They tend to change as the si_disk field would change, so it is a hole, but it is a small hole. One could correctly argue that one should replace much of this code with something much much better. I would be on the pro side of that argument. Reviewed by: phk (who also ported the original patch to current) Sponsored by: Timing Solutions	2000-07-05 06:01:33 +00:00
Jun-ichiro itojun Hagino	686cdd19b1	sync with kame tree as of july00. tons of bug fixes/improvements. API changes: - additional IPv6 ioctls - IPsec PF_KEY API was changed, it is mandatory to upgrade setkey(8). (also syntax change)	2000-07-04 16:35:15 +00:00
Poul-Henning Kamp	77978ab8bc	Previous commit changing SYSCTL_HANDLER_ARGS violated KNF. Pointed out by: bde	2000-07-04 11:25:35 +00:00
Kirk McKusick	c904bbbdd8	Simplify and rationalise the management of the vnode free list (preparing the code to add snapshots).	2000-07-04 04:32:40 +00:00
Kirk McKusick	e6796b67d9	Move the truncation code out of vn_open and into the open system call after the acquisition of any advisory locks. This fix corrects a case in which a process tries to open a file with a non-blocking exclusive lock. Even if it fails to get the lock it would still truncate the file even though its open failed. With this change, the truncation is done only after the lock is successfully acquired. Obtained from: BSD/OS	2000-07-04 03:34:11 +00:00
Kirk McKusick	3764219663	If a buffer flush fails when trying to reclaim a vnode, it is too late to save the vnode, so just toss any remaining unwritten buffers rather than leaving them lying around to make trouble in the future.	2000-07-04 03:23:29 +00:00
Kirk McKusick	bdbd3ff7cf	Update tags directive to reflect the new location of soft updates and the reorganization of the eisa directory.	2000-07-04 00:18:43 +00:00
Poul-Henning Kamp	3275cf7379	Make the two calls from kern/* into softupdates #ifdef SOFTUPDATES, that is way cleaner than using the softupdates_stub stunt, which should be killed when convenient. Discussed with: mckusick	2000-07-03 13:26:54 +00:00
Poul-Henning Kamp	9282307a5d	Add device_set_softc() which does the obvious. Not objected to by: dfr	2000-07-03 13:06:29 +00:00
Poul-Henning Kamp	82d9ae4e32	Style police catches up with rev 1.26 of src/sys/sys/sysctl.h: Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grok our sources: -sysctl_vm_zone SYSCTL_HANDLER_ARGS +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)	2000-07-03 09:35:31 +00:00
Chris Costello	d41c16130b	Instead of just blindly setting -rw-rw-rw-: o Set access mode to -r--r--r-- if SS_CANTRCVMORE is set and the receive buffer is empty. o Set access mode to --w--w--w- if SS_CANTSENDMORE is set. Discussed with: alfred	2000-07-02 23:56:45 +00:00
Chris Costello	417779230b	Report -rw-rw-rw file access modes in soo_stat. Reviewed by: alfred	2000-07-02 19:31:00 +00:00
Brian Feldman	42ebfbf227	Modify ktrace's general I/O tracing, ktrgenio(), to use a struct uio * instead of a struct iovec * array and int len. Get rid of stupidly trying to allocate all of the memory and copyin()ing the entire iovec[], and instead just do the proper VOP_WRITE() in ktrwrite() using a copy of the struct uio that the syscall originally used. This solves the DoS which could easily be performed; to work around the DoS, one could also remove "options KTRACE" from the kernel. This is a very strong MFC candidate for 4.1. Found by: art@OpenBSD.org	2000-07-02 08:08:09 +00:00
Brian S. Dean	c6d3f3bfc1	Fix my own style bugs (use of spaces instead of tabs for indentation). This is a style-only change.	2000-07-01 02:40:13 +00:00
Archie Cobbs	6c66bbed1a	Move the securelevel check before loading KLD's into linker_load_file(), instead of requiring every caller of linker_load_file() to perform the check itself. This avoids netgraph loading KLD's when securelevel > 0, not to mention any future code that may call linker_load_file(). Reviewed by: dfr	2000-06-29 17:57:04 +00:00
Boris Popov	5badeabaca	Move #ifdef to the right place.	2000-06-29 09:26:26 +00:00
Boris Popov	99063cf89e	If kernel compiled with INVARIANTS: On unload, remove references from freelist to memory type defined by module. Print a warning if module defines and allocates its own memory type, but didn't free it all on unload. Reviewed by: peter	2000-06-29 03:41:30 +00:00
Chris Costello	0e8363eca9	Report a file type (S_IFIFO) in kqueue_stat().	2000-06-28 19:16:27 +00:00
Alfred Perlstein	1a61fa5e0d	don't panic the system when fpathconf is called on an unsupported filetype.	2000-06-27 23:08:36 +00:00
Alfred Perlstein	35b1da8080	remove crufty exec stuff, perl is in the base system make it work with warnings on (there was some harmless use of uninitialized variables) make it work with 'use strict' Approved by: peter	2000-06-27 19:09:55 +00:00
Poul-Henning Kamp	a8b1f9d2c9	Move prtactive to vfs from ufs. It is used all over the place.	2000-06-27 07:46:22 +00:00
Neil Blakey-Milner	47fdd692c6	Add sysctl descriptions to a few sysctls. Simply "documentation". PR: kern/8015 Submitted by: Stefan Eggers <seggers@semyam.dinoco.de>	2000-06-26 13:52:31 +00:00
Peter Wemm	ce365ee318	Some changes and fixes from Bruce: Use strtoul(), not strtol() in the hints decoder so that 'flags 0xa0ffa0ff' is not truncated to 0x7fffffff. Use a stack buffer instead of a static 100 byte bss buffer. Use \0 for the NUL character. Remove some ``excessive'' parens.	2000-06-26 09:53:37 +00:00
Jonathan Lemon	cb5ad9d362	Fix stupid braino in last commit, initialize `vp' before we test vp->v_tag. Spotted by: dillon	2000-06-25 18:10:45 +00:00
Mark Murray	b6e67f5c7d	Remove no-longer-relevant comment.	2000-06-25 10:14:06 +00:00
Mark Murray	4eeb4f04c3	Forgot this earlier; delete the old /dev/random driver, bring in the header for the new. Reviewed by: dfr	2000-06-25 09:35:40 +00:00
Dima Ruban	1a432a2f54	Fix typo (inT -> int)	2000-06-23 07:10:34 +00:00
Alfred Perlstein	c636255150	fix races in the uidinfo subsystem, several problems existed: 1) while allocating a uidinfo struct malloc is called with M_WAITOK, it's possible that while asleep another process by the same user could have woken up earlier and inserted an entry into the uid hash table. Having redundant entries causes inconsistencies that we can't handle. fix: do a non-waiting malloc, and if that fails then do a blocking malloc, after waking up check that no one else has inserted an entry for us already. 2) Because many checks for sbsize were done as "test then set" in a non atomic manner it was possible to exceed the limits put up via races. fix: instead of querying the count then setting, we just attempt to set the count and leave it up to the function to return success or failure. 3) The uidinfo code was inlining and repeating, lookups and insertions and deletions needed to be in their own functions for clarity. Reviewed by: green	2000-06-22 22:27:16 +00:00
Jonathan Lemon	c8bea19ee3	Add a hack to fail registration of kq events on a non-ufs filesystem, as support for those is non-existent at the moment.	2000-06-22 18:41:07 +00:00
Jonathan Lemon	d2693dbbc4	Add code so that the udata field is preserved across a TRACK event. When re-adding an event, do not reset the event state. If the event was pending, it will remain pending. This allows the user to change the udata field after the event was registered, while not losing any events which have already occurred. Reported by: jmg	2000-06-22 18:39:31 +00:00
Neil Blakey-Milner	445572c1ed	Add 'kern.disks', a sysctl which returns the list of disks from disk_enumerate(), space delimited. This allows non-root users to get a list of disks and will simplify libdisk's Disk_Names(). Reviewed by: phk	2000-06-22 11:44:43 +00:00
Alfred Perlstein	a79b71281c	return of the accept filter part II accept filters are now loadable as well as able to be compiled into the kernel. two accept filters are provided, one that returns sockets when data arrives the other when an http request is completed (doesn't work with 0.9 requests) Reviewed by: jmg	2000-06-20 01:09:23 +00:00
Alfred Perlstein	a72fda7154	backout accept optimizations. Requested by: jmg, dcs, jdp, nate	2000-06-18 08:49:13 +00:00
Poul-Henning Kamp	7c50d77218	Revert part of my bioops change which implemented panic(8).	2000-06-16 14:32:13 +00:00
Poul-Henning Kamp	a2e7a027a7	Virtualizes & untangles the bioops operations vector. Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@	2000-06-16 08:48:51 +00:00
Robert Watson	625cc84808	Second of two commits adding capability manipulation syscalls for processes. Obtained from: TrustedBSD Project	2000-06-15 23:27:18 +00:00
Robert Watson	b09b66abf6	Introduce syscalls for process capability manipulation. Currently backs onto already committed stubs. Commit one of two. Reviewed by: Damned if I can remember. Many people. Obtained from: TrustedBSD Project	2000-06-15 23:08:17 +00:00
Poul-Henning Kamp	4bd02a5609	Add disk_enumerate() for finding names of disks. Vinum and libh will need this RSN. Remove a pointless warning in the root device locating code. Remove the "wd" compatibility name from the "ad" driver. WARNING: If you have not updated to use /dev/wd* in your /etc/fstab and modern bootblocks, it would be a very good idea to do so BEFORE you upgrade your kernel.	2000-06-15 20:30:53 +00:00
Alfred Perlstein	8f4e4aa5f1	add socketoptions DELAYACCEPT and HTTPACCEPT which will not allow an accept() until the incoming connection has either data waiting or what looks like a HTTP request header already in the socketbuffer. This ought to reduce the context switch time and overhead for processing requests. The initial idea and code for HTTPACCEPT came from Yahoo engineers and has been cleaned up and a more lightweight DELAYACCEPT for non-http servers has been added Reviewed by: silence on hackers.	2000-06-15 18:18:43 +00:00
Peter Wemm	7d02379e48	As a bit of a gross hack, allow earlier access to both the static and dynamic hints. This allows the resource_XXX_value() calls to work before malloc() has started. This gets the serial console working as well as a few other things.	2000-06-15 09:57:20 +00:00
Peter Wemm	690f8fc4c3	Fix a stray debug output. change if (1 || bootverbose) to if (bootverbose)	2000-06-15 04:12:17 +00:00
Bruce Evans	8e8cac5555	sys/malloc.h: Order the SYSINIT() for MALLOC_DEFINE() correctly so that malloc() doesn't have to waste time initializing itself. The (SI_SUB_KMEM, SI_ORDER_ANY) order was shared with syscons' SYSINIT() for scmeminit(), and scmeminit() calls malloc(), so malloc() initialization was not always complete on the first call to malloc(). kern/kern_malloc.c: - Removed self-initialization in malloc(). - Removed half-baked sanity check in free(). Trust MALLOC_DEFINE().	2000-06-14 18:31:42 +00:00
Peter Wemm	f71c01cc52	Borrow phk's axe and apply the next stage of config(8)'s evolution. Use Warner Losh's "hint" driver to decode ascii strings to fill the resource table at boot time. config(8) no longer generates an ioconf.c table - ie: the configuration no longer has to be compiled into the kernel. You can reconfigure your isa devices with the likes of this at loader(8) time: set hint.ed.0.port=0x320 userconfig will be rewritten to use this style interface one day and will move to /boot/userconfig.4th or something like that. It is still possible to statically compile in a set of hints into a kernel if you do not wish to use loader(8). See the "hints" directive in GENERIC as an example. All device wiring has been moved out of config(8). There is a set of helper scripts (see i386/conf/gethints.pl, and the same for alpha and pc98) that extract the 'at isa? port foo irq bar' from the old files and produces a hints file. If you install this file as /boot/device.hints (and update /boot/defaults/loader.conf - You can do a build/install in sys/boot) then loader will load it automatically for you. You can also compile in the hints directly with: hints "device.hints" as well. There are a few things that I'm not too happy with yet. Under this scheme, things like LINT would no longer be useful as "documentation" of settings. I have renamed this file to 'NOTES' and stored the example hints strings in it. However... this is not something that config(8) understands, so there is a script that extracts the build-specific data from the documentation file (NOTES) to produce a LINT that can be config'ed and built. A stack of man4 pages will need updating. :-/ Also, since there is no longer a difference between 'device' and 'pseudo-device' I collapsed the two together, and the resulting 'device' takes a 'number of units' for devices that still have it statically allocated. eg: 'device fe 4' will compile the fe driver with NFE set to 4. You can then set hints for 4 units (0 - 3). Also note that 'device fe0' will be interpreted as "zero units of 'fe'" which would be bad, so there is a config warning for this. This is only needed for old drivers that still have static limits on numbers of units. All the statically limited drivers that I could find were marked. Please exercise EXTREME CAUTION when transitioning! Moral support by: phk, msmith, dfr, asmodai, imp, and others	2000-06-13 22:28:50 +00:00
Jeroen Ruigrok van der Werven	3b43fd626a	Fix panic by moving the prp == 0 check up the order of sanity checks. Submitted by: Bart Thate <freebsd@1st.dudi.org> on -current Approved by: rwatson	2000-06-13 15:44:04 +00:00
Alfred Perlstein	8757e5bbc5	unstatic getfp() so that other subsystems can use it. make sendfile() use it. Approved by: dg	2000-06-12 18:06:12 +00:00
Bruce Evans	0477138dad	Fixed allocation of unit numbers. Allocate the amount of space actually required (rounded up a little) instead of twice the previous amount (or a fixed amount for the first allocation). The bug caused memory corruption when a new unit number for a devclass was more than about twice the previous maximum one (or more than 3 for the first one), so it corrupted memory (which happened to be the atkbdc port resource list) in the reporter's configuration with sio unit numbers { 0, 25, 1, 2, ... }. Reviewed by: dfr Reported by: Leonid Lukiyanets <stalwar78@hotmail.com>	2000-06-11 07:19:20 +00:00
Poul-Henning Kamp	c27f4d3c50	fix a typo	2000-06-10 19:21:20 +00:00
Peter Wemm	53cc6add2a	Unused include: #include "pty.h"	2000-06-10 07:12:40 +00:00
Jonathan Lemon	d36cb22369	malloc(..., M_WAITOK) will not return NULL, so remove the error handling for this case (which was slightly broken anyway) Fix up some whitespace problems while I'm here too. Submitted by: alfred (in a slightly different form)	2000-06-10 01:51:18 +00:00
Robert Watson	e812e4917d	Dammit. Trimmed an extra sysctl when I moved kern.suser_permitted from kern_mib.c to kern_prot.c. This commit should restore it, as well as fix the resulting build problems. Submitted by: asmodai	2000-06-07 18:54:41 +00:00
Robert Watson	a996141f6e	Introduce additional POSIX.1e-related stubs o options CAPABILITIES o kern/kern_cap.c -- syscall stubs returning ENOSYS syscalls.master changes to follow Obtained from: TrustedBSD Project	2000-06-07 04:53:49 +00:00
Robert Watson	579f4eb4cd	o bde suggested moving the SYSCTL from kern_mib to the more appropriate kern_prot, which cleans up some namespace issues o Don't need a special handler to limit un-setting, as suser is used to protect suser_permitted, making it one-way by definition. Suggested by: bde	2000-06-05 18:30:55 +00:00
Robert Watson	0309554711	o Introduce kern.suser_permitted, a sysctl that disables the suser_xxx() returning anything but EPERM. o suser is enabled by default; once disabled, cannot be reenabled o To be used in alternative security models where uid0 does not connote additional privileges o Should be noted that uid0 still has some additional powers as it owns many important files and executables, so suffers from the same fundamental security flaws as securelevels. This is fixed with MAC integrity protection code (in progress) o Not safe for consumption unless you are really sure you don't want things like shutdown to work, et al :-) Obtained from: TrustedBSD Project	2000-06-05 14:53:55 +00:00
Robert Watson	7cadc2663e	o Modify jail to limit creation of sockets to UNIX domain sockets, TCP/IP (v4) sockets, and routing sockets. Previously, interaction with IPv6 was not well-defined, and might be inappropriate for some environments. Similarly, sysctl MIB entries providing interface information also give out only addresses from those protocol domains. For the time being, this functionality is enabled by default, and toggleable using the sysctl variable jail.socket_unixiproute_only. In the future, protocol domains will be able to determine whether or not they are ``jail aware''. o Further limitations on process use of getpriority() and setpriority() by jailed processes. Addresses problem described in kern/17878. Reviewed by: phk, jmg	2000-06-04 04:28:31 +00:00
Bruce Evans	f47f0edde4	Use "nm | awk ..." instead of genassym(1) to generate symbol value headers. Symbol values are now represented using array sizes (4 arrays per symbol so that 16-bit machines can represent 64-bit values) instead of being raw binary values. Reviewed by: marcel	2000-06-02 09:27:48 +00:00
Mike Smith	c3c50c4e3a	Further fixes for multiple-IO-APIC systems from Tor Egge: Further experimentation showed that some Dell 2450 machines with the prevention kludge installed still got T_RESERVED traps. CPU interrupt vector 0x7A was observed to be triggered. This might have been the bitwise OR of two different vectors sent from each of the IOAPICs at the same time. IOAPIC #0: 0x68 --> irq 8: RTC timer interrupt IOAPIC #1: 0x32 --> irq 18: scsi host adapter or network interface ---- 0x7a --> T_RESERVED Both IOAPICs had ID 0. Appendix B.3 in the MP spec indicates that the operating system is responsible for assigning unique IDs to the IOAPICs. The enclosed patch programs the IOAPIC IDs according to the IOAPIC entries in the MP table. Submitted by: tegge	2000-05-31 21:37:28 +00:00
Matthew Dillon	8b03c8ed5e	This is a cleanup patch to Peter's new OBJT_PHYS VM object type and sysv shared memory support for it. It implements a new PG_UNMANAGED flag that has slightly different characteristics from PG_FICTICIOUS. A new sysctl, kern.ipc.shm_use_phys has been added to enable the use of physically-backed sysv shared memory rather than swap-backed. Physically backed shm segments are not tracked with PV entries, allowing programs which use a large shm segment as a rendezvous point to operate without eating an insane amount of KVM in the PV entry management. Read: Oracle. Peter's OBJT_PHYS object will also allow us to eventually implement page-table sharing and/or 4MB physical page support for such segments. We're half way there.	2000-05-29 22:40:54 +00:00
Doug Rabson	ca2e05343b	Add taskqueue system for easy-to-use SWIs among other things. Reviewed by: arch	2000-05-28 15:45:30 +00:00
Søren Schmidt	d5f65fcbd7	If devclass_alloc_unit() is called with a wired unit #, and this is busy, only search upwards for a free slot to use.. This broke unit numbering on ATA systems where PCI attached controllers come before the mainboard ones... Reviewed by: dfr	2000-05-26 13:59:05 +00:00
Jake Burkholder	e39756439c	Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others	2000-05-26 02:09:24 +00:00
Jake Burkholder	740a1973a6	Change the way that the queue(3) structures are declared; don't assume that the type argument to _HEAD and _ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd	2000-05-23 20:41:01 +00:00
Mike Smith	b38f58db69	Make a trip to Pointy-Hats-R-Us and actually include the header that defines ROOTDEVNAME. Submitted by: "Jeffrey S. Sharp" <jss@subatomix.com>	2000-05-22 17:25:47 +00:00
David E. O'Brien	d4af7a50dc	Sort the sys includes.	2000-05-22 17:09:13 +00:00
Brian Feldman	a274d19ba2	Back out NOTE_EXIT status reporting pending discussion.	2000-05-21 16:27:41 +00:00
Peter Wemm	24488c7498	Provide a temporary undocumented option: SHM_PHYS_BACKED. This will become sysctl and/or flags controlled later. It's mainly here for an easy place to test the physical memory backed objects.	2000-05-21 13:52:13 +00:00
Brian Feldman	a24b514d72	Put the wait(2) exit status in "data" for NOTE_EXIT kevents.	2000-05-17 01:16:11 +00:00
Jeroen Ruigrok van der Werven	01f76720fb	Fix the rootmount code for now. This function will probably be rewritten/renamed to devpp. Submitted by: Assar Westerlund <assar@sics.se> on -current Confirmed to work: Steinar Haug <sthaug@nethelp.no>, Manfred Antar <mantar@pacbell.net> Reviewed by: phk	2000-05-14 07:43:12 +00:00
Jeroen Ruigrok van der Werven	37d90a44af	Fix comment typo. Submitted by: nrahlstr	2000-05-12 16:06:49 +00:00
Chris Costello	040fac0bbd	Include the UID and GID values filled in by socreate() into socket->so_cred for stat() calls. Reviewed by: phk	2000-05-11 22:08:57 +00:00
Chris Costello	12861d58db	Include UID and GID information for stat() calls using the values filled into the file descriptor data by falloc(). Reviewed by: phk	2000-05-11 22:08:20 +00:00
Bruce Evans	9114579d7a	Regenerated (fixed the calculation of sy_nargs in sysent tables).	2000-05-09 21:52:02 +00:00
Bruce Evans	6b972e0bdd	Fixed the calculation of sy_nargs in sysent tables. We attempted to do this in awk using the hack of counting args of type off_t twice and args of all other types once. This is too simple to work. It gave benignly wrong results on alphas (off_t shouldn't be counted twice) and for svr4_sys_mmap64() on i386's (off64_t should be counted twice). It gave fatally wrong results for i386's with 64-bit longs (longs should be counted twice). The correct value for sy_nargs is easier to determine from the size of the args struct anyway, except for complications to make the generated code almost readable. Improved formatting of sysent tables by lining up the comments where possible.	2000-05-09 21:18:30 +00:00
Poul-Henning Kamp	192c06ea1b	Change the "bdev-whiner" to whine when open is attempted and extend the deadline a month.	2000-05-09 18:53:57 +00:00
Matthew Dillon	d2ba455c2c	Some ioctl routines assume that the ioctl buffer is aligned, but a char[] declaration makes no such guarantee. A union is used to force alignment of the char buffer.	2000-05-09 17:43:21 +00:00
Bruce Evans	4aee570d90	Regenerated (fixed the type of mmap()'s padding arg).	2000-05-09 08:35:51 +00:00
Bruce Evans	aa4b7eae22	Fixed the declaration of mmap(). The crufty padding arg had the wrong type. This gave an inconsistent amount of crufty padding on i386's with 64-bit longs (8 bytes instead of 4). On alphas it gives a consistent amount of crufty padding (8 bytes) in addition to the 4 bytes of normal padding caused by passing int args as register_t's. Fixed the args struct tag for the NOPROTO syscalls (netbsd_lchown() and netbsd_msync()). The tag is currently unused for NOPROTO syscalls, so the bug has no effect, but it will be used even in the NOPROTO case to calculate sy_nargs correctly.	2000-05-09 08:31:06 +00:00
Peter Wemm	0e59fec6d8	Make issetugid return correctly. It was returning -1 with errno == 1 if it was set?id! Submitted by: Valentin Nechayev <netch@segfault.kiev.ua>	2000-05-09 00:58:34 +00:00
Greg Lehey	72cc7e2dce	Correct a couple of typos.	2000-05-07 05:09:45 +00:00
Poul-Henning Kamp	ad7ba3d455	Remove devstat_end_transaction_buf() everybody uses devstat_end_transaction_bio() now.	2000-05-06 06:59:08 +00:00
Poul-Henning Kamp	9626b608de	Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>. <sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bde's teachings on the subject of nested includes. Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.h> unless they need caching of data. Still a few bogus uses of struct buf to track down. Repocopy by: peter	2000-05-05 09:59:14 +00:00
Jonathan Lemon	b4b03426ca	Fix one bug where the kn_head list could be manipulated without spl() protection in the case of a copyout error. Add missing spl calls around the initial activation call that is done when the kevent is added. Add two KASSERT macros to help catch errors in the future.	2000-05-04 20:19:17 +00:00
Paul Richards	8651b9ec1b	If BUS_DEBUG is defined then create a sysctl, debug.bus_debug, that is used to control whether the debug messages are output at runtime. It defaults to on so that if you define BUS_DEBUG in your kernel then you get all the debugging info when you boot. It's very useful for disabling all the debugging info when you're developing a loadable device driver and you're doing lots of loads and unloads but don't always want to see all the debugging info.	2000-05-03 17:45:04 +00:00
Paul Richards	c0151c49d2	Replace all the ifdef debugging spaghetti with a single ifdef and a macro so that it is easier to read the flow of the code.	2000-05-03 00:20:36 +00:00
Peter Wemm	365c5db0a7	Add $FreeBSD$	2000-05-01 20:32:07 +00:00
Poul-Henning Kamp	017ef345bc	Give struct bio its own call back mechanism.	2000-05-01 13:36:25 +00:00
Peter Wemm	ab063af911	Move the MSG* and SEM* options to opt_sysvipc.h Remove evil allocation macros from machdep.c (why was that there???) and use malloc() instead. Move parameters out of param.h and into the code itself. Move a bunch of internal definitions from public sys/*.h headers (without #ifdef _KERNEL even) into the code itself. I had hoped to make some of this more dynamic, but the cost of doing wakeups on all sleeping processes on old arrays was too frightening. The other possibility is to initialize on the first use, and allow dynamic sysctl changes to parameters right until that point. That would allow /etc/rc.sysctl to change SEM* and MSG* defaults as we presently do with SHM*, but without the nightmare of changing a running system.	2000-05-01 13:33:56 +00:00
Peter Wemm	2553c04ce2	Regenerate (removed semconfig)	2000-05-01 11:14:08 +00:00
Peter Wemm	b423446cc0	Remove the undocumented, flawed, broken-as-designed semconfig() syscall.	2000-05-01 11:13:41 +00:00
Peter Wemm	39e4c0c888	Remove undocumented broken-as-designed semconfig() syscall.	2000-05-01 11:11:44 +00:00
Andrey A. Chernov	051f60b976	Move t_timeout initializing to ttyregister Pointed-by: bde	2000-05-01 10:51:54 +00:00
Doug Rabson	4b4a49fda5	* Move the driver_t::refs field to kobj_t to replace kobj_t::instances. * Back out a couple of workarounds for the confusion between kobj_t::instances and driver_t::refs.	2000-05-01 10:45:15 +00:00
Andrey A. Chernov	ef4de1ad38	Since ptys are allocated dynamically, there is no need to keep their t_timeout across close, so move t_timeout initializing to ptcopen	2000-05-01 10:24:21 +00:00
Andrey A. Chernov	4eaed34ba0	Set t_timeout to its default sysctl value only once in ttyopen Initialize t_timeout to -1 for this reason Pointed-by: bde	2000-05-01 09:05:03 +00:00
Poul-Henning Kamp	2c9b67a8df	Remove unneeded #include <vm/vm_zone.h> Generated by: src/tools/tools/kerninclude	2000-04-30 18:52:11 +00:00
Brian Feldman	226f14bc83	Change the scheduler to actually respect the PUSER barrier. It's been wrong for many years that negative niceness would lower the priority of a process below PUSER, and once below PUSER, there were conditionals in the code that are required to test for whether a process was in the kernel which would break. The breakage could (and did) cause lock-ups, basically nothing else but the least nice program being able to run in some conditions. The algorithm which adjusts the priority now subtracts PRIO_MIN to do things properly, and the ESTCPULIM() algorithm was updated to use PRIO_TOTAL (PRIO_MAX - PRIO_MIN) to calculate the estcpu. NICE_WEIGHT is now 1 to accommodate the full range of priorities better (a -20 process with full CPU time has the priority of a +0 process with no CPU time). There are now 20 queues (exactly; 80 priorities) for use in user processes' scheduling, and PUSER has been lowered to 48 to accomplish this. This means, to the user, that things will be scheduled more correctly (noticeable), there is no lock-up anymore WRT a niced -20 process never releasing the CPU time for other processes. In this fair system, tsleep()ed < PUSER processes now will get the proper higher priority than priority >= PUSER user processes. The detective work of this was done by me, along with part of the solution. Luoqi Chen has provided most of the solution, and really helped me understand what was happening better, to boot :) Submitted by: luoqi Concept reviewed by: bde	2000-04-30 18:33:43 +00:00
Andrey A. Chernov	c1d0c3a89d	Add sysctl variable to set initial drainwait timeout on ttyopen, default to 5 minutes	2000-04-30 16:00:53 +00:00
Poul-Henning Kamp	95bdaa0ee8	Hmm, diff/patch still doesn't like me. Missed one s/biowait/bufwait/g	2000-04-30 06:16:03 +00:00
Poul-Henning Kamp	87150cb06d	s/biowait/bufwait/g Prodded by: several.	2000-04-29 16:25:22 +00:00
Poul-Henning Kamp	c1462ad325	Remove a leftover dysonism.	2000-04-29 16:14:10 +00:00
Poul-Henning Kamp	eb95c536ad	Remove unneeded #include <sys/kernel.h>	2000-04-29 15:36:14 +00:00
Peter Wemm	eb2d8c2e8a	The newer module dependency code exposes an apparent bug in the bus/driver/kobj system. I am not 100% sure that this is the correct fix, but it is harmless and does seem to solve the problem. At worst, it could cause a tiny memory leak at unload time - this is better than a free(NULL) and subsequent panic. I'm waiting for comments from Doug about this. This may yet be backed out and fixed differently. The change itself is to increment the reference count on drivers in one case where it appears to have been missed. When everything is unloaded, kobj_class_free() was being called twice in some cases, and panicking the second time.	2000-04-29 13:24:35 +00:00
Peter Wemm	54823af256	First round implementation of a fine grain enhanced module to module version dependency system. This isn't quite finished, but it is at a useful stage to do a functional checkpoint. Highlights: - version and dependency metadata is gathered via linker sets, so things are handled the same for static kernels and code built to live in a kld. - The dependencies are at module level (versus at file level). - Dependencies determine kld symbol search order - this means that you cannot link against symbols in another file unless you depend on it. This is so that you cannot accidentally unload the target out from underneath the ones referencing it. - It is flexible enough that we can put tags in #include files and macros so that we can get decent hooks for enforcing recompiles on incompatible ABI changes. eg: if we change struct proc, we could force a recompile for all kld's that reference the proc struct. - Tangled dependency references at boot time are sorted. Files are relocated once all their dependencies are already relocated. Caveats: - Loader support is incomplete, but has been worked on separately. - Actual enforcement of the version number tags is not active yet - just the module dependencies are live. The actual structure of versioning hasn't been agreed on yet. (eg: major.minor, or whatever) - There is some backwards compatibility for old modules without metadata but I'm not sure how good it is. This is based on work originally done by Boris Popov (bp@freebsd.org), but I'm not sure he'd recognize much of it now. Don't blame him. :-) Also, ideas have been borrowed from Mike Smith.	2000-04-29 13:19:31 +00:00

3466 Commits