Commit Graph

3427 Commits

Author SHA1 Message Date
Alfred Perlstein
3a4d365463 Add reserved lkmressys keyword. I swear, this script will die the
next time I need to hack on it.
2000-12-01 08:47:54 +00:00
Alfred Perlstein
1dc8643099 implement the NOSTD syscall type; this creates the syscall args but sticks
an lkmnosys entry into the sysent table so that SYSCALL_MODULE() works
2000-12-01 07:40:20 +00:00
Alfred Perlstein
c5a86b0ab9 Translate alfred to english.
Submitted by: bde
2000-12-01 06:59:18 +00:00
Jake Burkholder
1512b5d6ab Use an mp-safe callout for endtsleep. 2000-12-01 04:55:52 +00:00
John Baldwin
2191340786 Use msleep() instead of mtx_exit()/tsleep() so that we release the lock and
go to sleep as an "atomic" operation.
2000-12-01 03:43:33 +00:00
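For illustration, the before/after pattern this commit describes, as a minimal sketch against the mutex API names of the time (mtx_enter()/mtx_exit()); the lock, wait channel, priority and wmesg values are placeholders:

        /* Old: the mutex is dropped before tsleep(), leaving a window
         * in which a wakeup can be missed. */
        mtx_exit(&foo_mtx, MTX_DEF);
        error = tsleep(&foo_cond, PRIBIO, "foowt", hz);

        /* New: msleep() releases the mutex and goes to sleep atomically,
         * then reacquires it before returning. */
        error = msleep(&foo_cond, &foo_mtx, PRIBIO, "foowt", hz);
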
John Baldwin
472fd56ea5 Don't update p_stat in exit1() to SZOMB until after releasing the allproc
lock.  Otherwise, if we block on the backing mutex while releasing the
allproc lock, then when we resume, we will be at SRUN, and we will stay
that way all the way through cpu_exit.  As a result, our parent will never
harvest us.
2000-12-01 03:42:17 +00:00
Jake Burkholder
96fde7da19 Use msleep instead of mtx_exit; tsleep; mtx_enter, which is not safe. 2000-12-01 02:18:38 +00:00
John Baldwin
6936206ebd Split the WITNESS and MUTEX_DEBUG options apart so that WITNESS does not
depend on MUTEX_DEBUG.  The MUTEX_DEBUG option turns on extra assertions
and checks to verify that mutexes themselves are implemented properly.
The WITNESS option uses extra checks and diagnostics to verify that other
code is using mutexes properly.
2000-12-01 00:10:59 +00:00
Robert Watson
cf64863a1e o Add a comment to exec_check_permissions() to indicate that the
passed vnode must be locked; this is the case because of calls
  to VOP_GETATTR(), VOP_ACCESS(), and VOP_OPEN().  This becomes
  more of an issue when VOP_ACCESS() gets a bit more complicated,
  which it does when you introduce ACL, Capability, and MAC
  support.

Obtained from:	TrustedBSD Project
2000-11-30 21:06:05 +00:00
Alfred Perlstein
c6ab5768aa only call bwillwrite() to stall on IO when dealing with VNODEs; otherwise
we will stall on non-disk IO for things like fifos and sockets
2000-11-30 20:23:14 +00:00
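A hedged sketch of the check described above as it might sit in a write path; fp, DTYPE_VNODE and the surrounding call are illustrative, not the actual diff:

        /* Throttle against the buffer cache only for vnode-backed I/O;
         * fifos, sockets and pipes should not stall in bwillwrite(). */
        if (fp->f_type == DTYPE_VNODE)
                bwillwrite();
        error = fo_write(fp, &auio, fp->f_cred, flags, p);
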
Alfred Perlstein
237710275e This is a fix for a problem described in PR kern/19572. It was
recently discussed at -hackers. The problem is a null-pointer
    dereference that happens in kern/vfs_lookup.c when accessing ".."
    with a v_mount entry for the current directory vnode of NULL. This
    happens when a volume is forcibly unmounted, and the vnode for a
    working directory in the mounted volume is cleared.

PR: 23191
Submitted by: Thomas Moestl <tmoestl@gmx.net>
2000-11-30 20:04:44 +00:00
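A minimal sketch of the kind of guard this fix implies in the ".." handling of kern/vfs_lookup.c; the variable name dp (current directory vnode) and the error path are assumptions:

        /* A forced unmount can leave the working directory's vnode with
         * v_mount == NULL; fail the lookup instead of dereferencing it. */
        if (dp->v_mount == NULL) {
                error = ENOENT;
                goto bad;
        }
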
Alfred Perlstein
1baf4aabbc use an opportunistic locking strategy with the uidinfo structures to avoid
locking the global hash on each uifree()

make struct uidinfo only visible to the kernel

make uihold() a function rather than a macro to reduce bloat

swap the order of a spl/mutex to maintain consistency
2000-11-30 19:15:22 +00:00
Alfred Perlstein
5c3f70d7c0 make crfree into a function rather than a macro to avoid bloat because of
the mutex acquire/release

reorder struct ucred
2000-11-30 19:09:48 +00:00
Kirk McKusick
6d984dfa6a Get rid of a bogus mtx_exit (it was attempting to release an
already released mutex).

Submitted by:	"Chris Knight" <chris@aims.com.au>
2000-11-30 19:09:29 +00:00
Marcel Moolenaar
d034d459da Don't use p->p_sigstk.ss_flags to keep state of whether the
process is on the alternate stack or not. For compatibility
with sigstack(2), the state is still updated if needed.

We now determine whether the process is on the alternate
stack by looking at its stack pointer. This allows a process
to siglongjmp from a signal handler on the alternate stack
to the place of the sigsetjmp on the normal stack. When
maintaining state, this would have invalidated the state
information, causing a subsequent signal to be delivered
on the normal stack instead of the alternate stack.

PR: 22286
2000-11-30 05:23:49 +00:00
John Baldwin
1bd0eefb4c Fix up priority propagation:
- Use a better test for determining when a process is running.
- Convert some checks to assertions.
- Remove unnecessary tests.
- Save the priority before acquiring a mutex rather than in msleep(9).
2000-11-30 00:51:16 +00:00
John Baldwin
86327ad8a4 Set p_mtxname when blocking on a mutex and clear it when waking up. 2000-11-29 20:17:15 +00:00
John Baldwin
62ca2477d8 Save a copy of p_mtxname in e_mtxname when creating an eproc. 2000-11-29 20:14:50 +00:00
John Baldwin
f404050e44 Use an atomic operation with an appropriate memory barrier when releasing
a contested sleep mutex in the case that at least two processes are blocked
on the contested mutex.
2000-11-29 18:41:19 +00:00
John Baldwin
8f838cb563 The sched_lock mutex goes after the sio mutex in the locking order since
a software interrupt can be scheduled in the sio interrupt handler while
the sio mutex is held.
2000-11-29 18:38:14 +00:00
John Baldwin
bbc7a98a31 Save the line number and filename of the last mtx_enter operation for
spin locks.  We already do this for sleep locks.
2000-11-29 18:37:01 +00:00
John Baldwin
e2979dcc85 Don't drop Giant and the passed in mutex incorrectly in the
cold || panicstr case.  Do drop the passed in mutex in that case if
PDROP is specified.
2000-11-29 18:32:50 +00:00
John Baldwin
2bcc63c545 Only print out APIC info on an SMP system during a panic if APIC_IO is
defined.
2000-11-29 01:33:15 +00:00
John Baldwin
8d9888d37a Don't wait forever for CPUs to stop or restart. Instead, give up after a
timeout.  If DIAGNOSTIC is turned on, then display a message to the console
with a map of which CPUs failed to stop or restart.  This gives an SMP box
at least a fighting chance of getting into DDB if one of the other CPUs has
interrupts disabled.
2000-11-28 23:52:36 +00:00
Jordan K. Hubbard
7022a92395 Kernel support for erase2 character.
Submitted by:	Rui Pedro Mendes Salgueiro <rps@mat.uc.pt>
2000-11-28 20:03:23 +00:00
Matthew N. Dodd
46aa504e42 Alter the return value and arguments of the GET_RESOURCE_LIST bus method.
Alter consumers of this method to conform to the new convention.
Minor cosmetic adjustments to bus.h.

This isn't of concern as this interface isn't in use yet.
2000-11-28 06:49:15 +00:00
Jake Burkholder
4f55983606 Use callout_reset instead of timeout(9). Most callouts are statically
allocated; 2 have been added to struct proc for setitimer and sleep.

Reviewed by:	jhb, jlemon
2000-11-27 22:52:31 +00:00
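The migration described above, sketched for the sleep timeout; the callout field name p_slpcallout matches the commit's description of callouts embedded in struct proc but is otherwise an assumption:

        /* Old: timeout(9) consumes a callout slot on every call. */
        timeout(endtsleep, (void *)p, timo);

        /* New: reuse the callout statically embedded in struct proc. */
        callout_reset(&p->p_slpcallout, timo, endtsleep, (void *)p);
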
John Baldwin
91b7c97713 Drop Giant around the mi_switch() call in yield().
Submitted by:	tegge
2000-11-27 18:48:13 +00:00
Alfred Perlstein
1e5d626ad9 ucred system overhaul:
1) mpsafe (protect the refcount with a mutex).
2) reduce duplicated code by removing the inlined crdup() from crcopy()
   and make crcopy() call crdup().
3) use M_ZERO flag when allocating initial structs instead of calling bzero
   after allocation.
4) expand the size of the refcount from a u_short to a u_int; by using
   shorts we might have an overflow.

Glanced at by: jake
2000-11-27 00:09:16 +00:00
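A minimal sketch of item 1 (mutex-protected refcount), using the mtx_enter()/mtx_exit() names of the era; the cr_mtx field name and the M_CRED malloc type are assumptions:

        void
        crhold(struct ucred *cr)
        {
                mtx_enter(&cr->cr_mtx, MTX_DEF);
                cr->cr_ref++;
                mtx_exit(&cr->cr_mtx, MTX_DEF);
        }

        void
        crfree(struct ucred *cr)
        {
                mtx_enter(&cr->cr_mtx, MTX_DEF);
                if (--cr->cr_ref == 0) {
                        mtx_exit(&cr->cr_mtx, MTX_DEF);
                        mtx_destroy(&cr->cr_mtx);
                        FREE(cr, M_CRED);
                        return;
                }
                mtx_exit(&cr->cr_mtx, MTX_DEF);
        }
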
Alfred Perlstein
0931dcefb3 Move the #define of _KERN_MUTEX_C_ so that it's before any system headers
are included.  System headers can include sys/mutex.h and then certain
macros do not get defined.

Reviewed by: jake
2000-11-26 21:14:17 +00:00
Poul-Henning Kamp
a52585d77e Simplify the tprintf() API.
Lose the special <sys/tprintf.h> #include file.
2000-11-26 20:35:21 +00:00
Poul-Henning Kamp
4d88c4598f Make log(-1, ...) do what addlog(...) did.
Replace all uses of addlog(...) with log(-1, ...)

Remove bogus "register" keywords in subr_prf.c

Make log() return void.
2000-11-26 19:34:06 +00:00
Poul-Henning Kamp
cb7e609a3c Make diskerr() always log with printf. 2000-11-26 19:29:15 +00:00
Jake Burkholder
a5d5c61c12 Add uidinfo hash and uidinfo struct to the witness order list. 2000-11-26 15:05:46 +00:00
Alfred Perlstein
9c19bcddf0 Make uidinfo subsystem mpsafe
use a mutex lock when looking up/deleting entries on the hashlist
use a mutex lock on each uidinfo when updating fields

make uifree() a void function rather than 'int' since no one cares

allocate uidinfo structs with the M_ZERO flag and don't explicitly initialize
them

Assisted by: eivind, jhb, jakeb
2000-11-26 12:08:17 +00:00
Jonathan Lemon
e82ac18e52 Revert the last commit to the callout interface, and add a flag to
callout_init() indicating whether the callout is safe or not.  Update
the callers of callout_init() to reflect the new interface.

Okayed by: Jake
2000-11-25 06:22:16 +00:00
Jake Burkholder
249849e0b9 - Rename callout_reset to _callout_reset and add a flags argument.
- Add macros callout_reset, which does the obvious, and
  mp_callout_reset, which passes the CALLOUT_MPSAFE flag.
2000-11-25 03:34:49 +00:00
Jake Burkholder
553629ebc9 Protect the following with a lockmgr lock:
allproc
	zombproc
	pidhashtbl
	proc.p_list
	proc.p_hash
	nextpid

Reviewed by:	jhb
Obtained from:	BSD/OS and netbsd
2000-11-22 07:42:04 +00:00
John Baldwin
0959cc6680 Ahem, fix the disclaimer portion of the copyright so it disclaims the
voices in my head.  You can sue the voices in Bill Paul's head all you
want.

Noticed by:	jhb
2000-11-21 21:10:15 +00:00
Jonathan Lemon
4a476efa51 Protect p_wchan with sched_lock in selwakeup(). 2000-11-21 20:22:34 +00:00
Alan Cox
c6fa9f78d2 Provide a new interface for the user of aio_read() and aio_write() to request
a kevent upon completion of the I/O.  Specifically, introduce a new type
of sigevent notification, SIGEV_EVENT.  If sigev_notify is SIGEV_EVENT,
then sigev_notify_kqueue names the kqueue that should receive the event
and sigev_value contains the "void *" that is copied into the kevent's udata
field.

In contrast to the existing interface, this one: 1) works on
the Alpha 2) avoids the extra copyin() call for the kevent because all
of the information needed is in the sigevent and 3) could be
applied to request a single kevent upon completion of an entire lio_listio().

Reviewed by:	jlemon
2000-11-21 19:36:36 +00:00
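A userland sketch of the new notification path; SIGEV_EVENT, sigev_notify_kqueue and sigev_value come from the commit message, while the fd/buffer names and the sival_ptr member spelling are assumptions:

        struct aiocb cb;
        struct kevent ev;
        int kq = kqueue();

        bzero(&cb, sizeof(cb));
        cb.aio_fildes = fd;
        cb.aio_buf = buf;
        cb.aio_nbytes = sizeof(buf);
        cb.aio_sigevent.sigev_notify = SIGEV_EVENT;     /* deliver a kevent */
        cb.aio_sigevent.sigev_notify_kqueue = kq;       /* target kqueue */
        cb.aio_sigevent.sigev_value.sival_ptr = &cb;    /* lands in ev.udata */

        aio_read(&cb);
        kevent(kq, NULL, 0, &ev, 1, NULL);              /* ev.udata == &cb */
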
Alfred Perlstein
830fedd28f Accept filters broke kernels compiled without options INET.
Make accept filters conditional on INET support to fix.

Pointed out by: bde
Tested and assisted by: Stephen J. Kiernan <sab@vegamuse.org>
2000-11-20 01:35:25 +00:00
Robert Watson
7f112b0489 o Export cp_time ("CPU time statistics") using SYSCTL_OPAQUE.
This removes a reason that systat requires setgid kmem.  More to
  come.
2000-11-20 00:44:58 +00:00
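A sketch of what such an export looks like; the OID location and the "LU" format string are assumptions, cp_time itself being the traditional long cp_time[CPUSTATES] array:

        /* Read-only export so systat(1) can use sysctl(3) rather than
         * reading kernel memory via /dev/kmem. */
        SYSCTL_OPAQUE(_kern, OID_AUTO, cp_time, CTLFLAG_RD,
            &cp_time, sizeof(cp_time), "LU", "CPU time statistics");
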
Robert Watson
aa5429970c o Export nchstats ("VFS cache effectiveness statistics") using
SYSCTL_OPAQUE.  This removes a reason that systat requires
  setgid kmem.  More to come.
2000-11-20 00:41:11 +00:00
David Malone
32af0d74f0 Make sbcompress use the new M_WRITABLE macro. Previously sbcompress
could not compress into clusters. This could result in lots of
wasted clusters while receiving small packets from an interface
that uses clusters for all its packets.

Patch is partially from BSDi (limiting the size of the copy) and
based on a patch for 4.1 by Ian Dowse <iedowse@maths.tcd.ie> and
myself.

Reviewed by:	bmilekic
Obtained From:	BSDi
Submitted by:	iedowse
2000-11-19 22:22:47 +00:00
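The essence of the change, sketched as a generic append path rather than the actual sbcompress() diff; M_WRITABLE() and M_TRAILINGSPACE() are the standard mbuf macros, n and m are placeholders for the previous and incoming mbufs:

        /* Append into the previous mbuf only if its storage is writable,
         * i.e. not read-only and not a shared external buffer.  Unshared
         * clusters now qualify instead of being refused outright. */
        if (n != NULL && M_WRITABLE(n) &&
            M_TRAILINGSPACE(n) >= m->m_len) {
                bcopy(mtod(m, caddr_t), mtod(n, caddr_t) + n->m_len,
                    (u_int)m->m_len);
                n->m_len += m->m_len;
        }
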
Jake Burkholder
fa2fbc3dac - Protect the callout wheel with a separate spin mutex, callout_lock.
- Use the mutex in hardclock to ensure no races between it and
  softclock.
- Make softclock be INTR_MPSAFE and provide a flag,
  CALLOUT_MPSAFE, which specifies that a callout handler does not
  need Giant.  There is still no way to set this flag when
  registering a callout.

Reviewed by:	-smp@, jlemon
2000-11-19 06:02:32 +00:00
Matthew Dillon
936524aa02 Implement a low-memory deadlock solution.
Removed most of the hacks that were trying to deal with low-memory
    situations prior to now.

    The new code is based on the concept that I/O must be able to function in
    a low memory situation.  All major modules related to I/O (except
    networking) have been adjusted to allow allocation out of the system
    reserve memory pool.  These modules now detect a low memory situation but
    rather than block, they instead continue to operate, then return resources
    to the memory pool instead of caching them or leaving them wired.

    Code has been added to stall in a low-memory situation prior to a vnode
    being locked.

    Thus situations where a process blocks in a low-memory condition while
    holding a locked vnode have been reduced to near nothing.  Not only will
    I/O continue to operate, but many prior deadlock conditions simply no
    longer exist.

Implement a number of VFS/BIO fixes

	(found by Ian): in biodone(), bogus-page replacement code, the loop
        was not properly incrementing loop variables prior to a continue
        statement.  We do not believe this code can be hit anyway but we
        aren't taking any chances.  We'll turn the whole section into a
        panic (as it already is in brelse()) after the release is rolled.

	In biodone(), the foff calculation was incorrectly
        clamped to the iosize, causing the wrong foff to be calculated
        for pages in the case of an I/O error or biodone() called without
        initiating I/O.  The problem always caused a panic before.  Now it
        doesn't.  The problem is mainly an issue with NFS.

	Fixed casts for ~PAGE_MASK.  This code worked properly before only
        because the calculations use signed arithmetic.  Better to properly
        extend PAGE_MASK first before inverting it for the 64 bit masking
        op.

	In brelse(), the bogus_page fixup code was improperly throwing
        away the original contents of 'm' when it did the j-loop to
        fix the bogus pages.  The result was that it would potentially
        invalidate parts of the *WRONG* page(!), leading to corruption.

	There may still be cases where a background bitmap write is
        being duplicated, causing potential corruption.  We have identified
        a potentially serious bug related to this but the fix is still TBD.
        So instead this patch contains a KASSERT to detect the problem
  	and panic the machine rather than continue to corrupt the filesystem.
	The problem does not occur very often; it is very hard to
	reproduce, and it may or may not be the cause of the corruption
	people have reported.

Review by: (VFS/BIO: mckusick, Ian Dowse <iedowse@maths.tcd.ie>)
Testing by: (VM/Deadlock) Paul Saab <ps@yahoo-inc.com>
2000-11-18 23:06:26 +00:00
Matthew Dillon
279d722604 This patchset fixes a large number of file descriptor race conditions.
Pre-rfork code assumed inherent locking of a process's file descriptor
    array.  However, with the advent of rfork() the file descriptor table
    could be shared between processes.  This patch closes over a dozen
    serious race conditions related to one thread manipulating the table
    (e.g. closing or dup()ing a descriptor) while another is blocked in
    an open(), close(), fcntl(), read(), write(), etc...

PR: kern/11629
Discussed with: Alexander Viro <viro@math.psu.edu>
2000-11-18 21:01:04 +00:00
John Baldwin
b6b55e27a4 Release sched_lock very briefly to give interrupts a chance to fire if we
are in softclock() for a long time.  The old code already did an
splx()/splhigh() pair here, I just missed adding in the equivalent mutex
operations on sched_lock earlier.
2000-11-18 00:21:00 +00:00
Tor Egge
e5c5b82950 Don't attempt to cluster write buffers where the VMIO flag isn't set. 2000-11-17 23:40:08 +00:00
Jake Burkholder
7da6f97772 - Split the run queue and sleep queue linkage, so that a process
may block on a mutex while on the sleep queue without corrupting
it.
- Move dropping of Giant to after the acquire of sched_lock.

Tested by:	John Hay <jhay@icomtek.csir.co.za>
		jhb
2000-11-17 18:09:18 +00:00
John Baldwin
cb799bfef9 The recent changes to msleep() and mawait() resulted in timeout() and
untimeout() not being called with Giant in those functions.  For now,
use the sched_lock to protect the callout wheel in softclock() and in
the various timeout and callout functions.

Noticed by:	tegge
2000-11-16 21:20:52 +00:00
John Baldwin
20cdcc5b73 Don't release and acquire Giant in mi_switch(). Instead, release and
acquire Giant as needed in functions that call mi_switch().  The releases
need to be done outside of the sched_lock to avoid potential deadlocks
from trying to acquire Giant while interrupts are disabled.

Submitted by:	witness
2000-11-16 02:16:44 +00:00
John Baldwin
92c79c7e3e Argh, add in a missing release of the sched_lock. 2000-11-16 01:16:54 +00:00
John Baldwin
95de685572 CURSIG() calls functions that acquire sleep mutexes, so it is not a good
idea to be holding the sched_lock while we are calling it.  As such,
release sched_lock before calling CURSIG() in msleep() and mawait() and
reacquire it after CURSIG() returns.

Submitted by:	witness
2000-11-16 01:07:19 +00:00
John Baldwin
b84988521c - Rename await() to mawait(). mawait() is to await() as msleep() is to
tsleep().  Namely, mawait() takes an extra argument which is a mutex
  to drop when going to sleep.  Just as with msleep(), if the priority
  argument includes the PDROP flag, then the mutex will be dropped and will
  not be reacquired when the process wakes up.
- Add in a backwards compatible macro await() that passes in NULL as the
  mutex argument to mawait().
2000-11-15 22:39:35 +00:00
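A short illustration of the msleep()/PDROP convention that mawait() now mirrors; the lock, channel and wmesg names are placeholders:

        /* Normal case: foo_mtx is released while asleep and held again
         * by the time msleep() returns. */
        error = msleep(&foo_cond, &foo_mtx, PRIBIO, "foowt", hz);

        /* PDROP: foo_mtx is released for the sleep and NOT reacquired,
         * so the caller must not touch it afterwards. */
        error = msleep(&foo_cond, &foo_mtx, PRIBIO | PDROP, "foowt", hz);
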
John Baldwin
3ae4dd935b - Replace a KASSERT() that knew too much about mutex internals with a
mtx_assert() that ensures the mutex we release during msleep() is both
  not recursed and owned by the current process.
2000-11-15 22:30:48 +00:00
John Baldwin
f33a072eb9 - Convert references from tsleep() -> msleep()
- Fix a buglet in a comment above await()
2000-11-15 22:27:38 +00:00
John Baldwin
9c36c934a1 Include the right headers to get the DDB #define and the db_active variable. 2000-11-15 22:08:16 +00:00
John Baldwin
896c2303d4 - Replace some instances of sched_ithd with sched_swi in KTR tracepoints.
- Assert that Giant is not owned during the main loop of sithd_loop().
2000-11-15 22:05:23 +00:00
John Baldwin
59f857e4ea Declare the 'witness_spin_check' properly as a per-CPU variable in the
non-SMP case.
2000-11-15 22:02:05 +00:00
John Baldwin
ecbd8e3710 Don't perform witness checks in witness_enter() during a panic. 2000-11-15 22:00:31 +00:00
John Baldwin
22f1b34223 Make ktr_verbose a bit more useful:
- On SMP systems display the cpu number with each message
- If ktr_verbose > 1, then include the filename and line number with each
  trace message
2000-11-15 21:51:53 +00:00
Kirk McKusick
324d6bacc3 Bug fix for revision 1.14 on the replacement of CIRCLEQ with TAILQ.
Submitted by:	Warner Losh <imp@village.org>
2000-11-15 20:07:16 +00:00
Kirk McKusick
a077f63555 In preparation for deprecating CIRCLEQ macros in favor of TAILQ
macros which provide the same functionality and are a bit more
efficient, convert use of CIRCLEQ's in resource manager to TAILQ's.

Approved by:	Garrett Wollman <wollman@khavrinen.lcs.mit.edu>
2000-11-14 20:46:02 +00:00
David Greenman
866746b6a6 Fixed a certain panic on IO error in sendfile(): Page must be set PG_BUSY
before calling vm_page_free() on it.
2000-11-12 14:51:15 +00:00
Bosko Milekic
e778918123 * Have m_pulldown() use the new M_WRITABLE() macro in order to determine
whether the given ext_buf is shared.

* Have the sf_bufs be setup with the mbuf subsystem using MEXTADD() with the
two new arguments.

Note: m_pulldown() is somewhat crotchy; the added comment explains the
situation.

Reviewed by: jlemon
2000-11-11 23:04:15 +00:00
Robert Watson
7f73938e96 o Fix a mis-transcription of sef's -STABLE protection fixes--only root
could debug processes after the commit that introduced the typo.
  Security is good, but security is not always the same as turning things
  off :-).

PR:		kern/22711
Obtained from:	brooks@one-eyed-alien.net
2000-11-10 23:57:48 +00:00
John Baldwin
20af769e69 Don't overwrite the filename for KTR_EXTEND with "../../kern/kern_ktr.c". 2000-11-10 22:30:44 +00:00
John Baldwin
9842fc8dda Axe some unused variables. 2000-11-10 21:54:19 +00:00
John Baldwin
bf619f9506 Fix SMP kernel compiles by #include'ing machine/globals.h to get the
cpuid variable.
2000-11-10 21:52:04 +00:00
John Baldwin
0fe4e534b1 Minor whitespace nit in a comment. 2000-11-10 21:21:20 +00:00
John Baldwin
b5d09a79b5 Ignore the INTR_MPSAFE flag when calculating the priority of an interrupt
thread.
2000-11-10 21:19:14 +00:00
Mike Smith
edcb5775ec Implement a trivial but effective interface for obtaining the kernel's
device tree and resource manager contents.  This is the kernel side of
the upcoming libdevinfo, which will expose this information to userspace
applications in a trivial fashion.

Remove the now-obsolete DEVICE_SYSCTLS code.
2000-11-09 10:21:23 +00:00
Marcel Moolenaar
806d7daafe Make MINSIGSTKSZ machine dependent, and have the sigaltstack
syscall compare against a variable sv_minsigstksz in struct
sysentvec so as to properly take the size of the machine- and
ABI-dependent struct sigframe into account.

The SVR4 and iBCS2 modules continue to have a minsigstksz of
8192 to preserve behavior. The real values (if different) are
not known at this time. Other ABI modules use the real
values.

The native MINSIGSTKSZ is now defined as follows:

Arch		MINSIGSTKSZ
----		-----------
alpha		    4096
i386		    2048
ia64		   12288

Reviewed by: mjacob
Suggested by: bde
2000-11-09 08:25:48 +00:00
John Baldwin
d8f03321bd - Remove much of the inlining of the KTR tracepoints into a ktr_tracepoint()
function declared in kern_ktr.c.  The only inline checks left are the
  checks that compare KTR_COMPILE with the supplied mask and thus should
  be optimized away into either nothing or a direct call to ktr_tracepoint().
- Move several KTR-related options to opt_ktr.h now that they are only
  needed by kern_ktr.c and not by ktr.h.
- Add in the ktr_verbose functionality if KTR_EXTEND is turned on.  If the
  global variable 'ktr_verbose' is non-zero, then KTR messages will be
  dumped to the console.  This variable can be set by either kernel code
  or via the 'debug.ktr_verbose' sysctl.  It defaults to off unless the
  KTR_VERBOSE kernel option is specified in which case it defaults to on.
  This can be useful when the machine locks up spinning in a loop with
  interrupts disabled as you might be able to see what it is doing when it
  locks up.

Requested by:	phk
2000-11-07 01:49:48 +00:00
John Baldwin
a924ab9741 Minor nit: missed ithd_loop -> sithd_loop in the KTR tracepoints. 2000-11-07 00:45:18 +00:00
David E. O'Brien
00910f2882 ELF kernels should use an ELF sysvec. This allows us to move a.out
specific files to those platforms that actually support a.out.
2000-11-05 10:41:35 +00:00
Bosko Milekic
fe27eea9d1 Change the sf_bufs wakeups to be wakeup_one(), because we don't want to
wakeup all of the sleeping threads when we free only one buffer. This
avoids us having to needlessly try again (and fail, and go back to
sleep) for all the threads sleeping. We will now only wakeup the
thread we know will succeed.

Reviewed by: green
2000-11-04 21:55:25 +00:00
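The one-line pattern described above, sketched with assumed sf_buf freelist names:

        /* Return one sf_buf to the freelist... */
        SLIST_INSERT_HEAD(&sf_freelist, sf, free_list);
        /* ...and wake exactly one waiter.  wakeup() would wake them all,
         * and all but one would fail the retry and go back to sleep. */
        wakeup_one(&sf_freelist);
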
Bosko Milekic
0eecc42758 Setup and put to use the mutex lock for sf_freelist, the sendfile(2) bufs
freelist. Should now be thread-friendly, in part.

Note: More work is needed in uipc_syscalls.c, but it will have to wait until
the socket locking issues are at least 80% implemented and committed.
2000-11-04 07:16:08 +00:00
Tor Egge
a2d1480cf8 Clear the VFREE flag when the vnode is removed from the free list in
getnewvnode().  Otherwise routines called from VOP_INACTIVE() might
attempt to remove the vnode from a free list the vnode isn't on,
causing corruption.
PR:		18012
2000-11-02 21:42:54 +00:00
Poul-Henning Kamp
1d7e3e42e7 Take VBLK devices further out of their misery.
This should fix the panic I introduced in my previous commit on this topic.
2000-11-02 21:14:13 +00:00
Eivind Eklund
e3c4036b18 Give vop_mmap an untimely death. The opportunity to give it a timely
death timed out in 1996.
2000-11-01 17:57:24 +00:00
Poul-Henning Kamp
a16d0eb2d7 Deprecate devsw->d_bmaj entirely.
This removes support for booting current kernels with very old bootblocks.

Device driver writers: Please remove initializations for the d_bmaj
field in your cdevsw{}.
2000-10-31 10:58:14 +00:00
Jordan K. Hubbard
e7c2b5a51d Add a new ioctl for doing virgin disklabels.
Submitted by:	dillon
2000-10-31 07:05:40 +00:00
Robert Watson
cb1f0db9db o Deny access to System V IPC from within jail by default, as in the
current implementation, jail neither virtualizes the Sys V IPC namespace,
  nor provides inter-jail protections on IPC objects.
o Support for System V IPC can be enabled by setting jail.sysvipc_allowed=1
  using sysctl.
o This is not the "real fix" which involves virtualizing the System V
  IPC namespace, but prevents processes within jail from influencing those
  outside of jail when not approved by the administrator.

Reported by:	Paulo Fragoso <paulo@nlink.com.br>
2000-10-31 01:34:00 +00:00
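A sketch of how such a knob is typically wired up; the variable name follows the jail.sysvipc_allowed sysctl named above, and the placement of the EPERM check is an assumption:

        int jail_sysvipc_allowed = 0;
        SYSCTL_INT(_jail, OID_AUTO, sysvipc_allowed, CTLFLAG_RW,
            &jail_sysvipc_allowed, 0,
            "Processes in jail can use System V IPC primitives");

        /* At the top of each System V IPC syscall: */
        if (p->p_prison != NULL && !jail_sysvipc_allowed)
                return (EPERM);
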
Robert Watson
c087a04f6a o Tighten up rules for which processes can't debug which other processes
in the p_candebug() function.  Synchronize with sef's CHECKIO()
  macro from the old procfs, which seems to be a good source of security
  checks.

Obtained from:	TrustedBSD Project
2000-10-30 20:30:03 +00:00
Kenneth D. Merry
2906da29dc Write support for the cd(4) driver.
This allows writing to DVD-RAM, PD and similar drives that probe as CD
devices.  Note that these are randomly writeable devices, not
sequential-only devices like CD-R drives, which are supported by cdrecord.

Add a new flag value for dsopen(), DSO_COMPATLABEL.  The cd(4) driver now
uses this flag instead of the DSO_NOLABELS flag.  The DSO_NOLABELS flag always
used a "fake" disklabel for the entire disk, provided by the caller.

With the DSO_COMPATLABEL flag, dsopen() will first search the media for a
label, and if it finds a label, it will use that label.  Otherwise it will
use the fake disklabel provided by the caller.  This provides backwards
compatibility, since we will still have labels for ISO9660 media.

It also provides new functionality, since you can now have a regular BSD
disklabel on read-only media, or on writeable media (e.g. DVD-RAM).

Bruce and I both think that we should eventually (in a few years) get
away from using disklabels for ISO9660 media, and just use the whole disk
device (/dev/cd0).  At that point disklabel handling in the cd(4) driver
could follow the "normal" model, as used in the da(4) driver.

Also, clean up the path in a couple of places in cdregister().  (Thanks to
Nick Hibma for catching that bug.)

Reviewed by:	bde
2000-10-30 07:03:00 +00:00
Alan Cox
39b2b25fa0 _aio_aqueue(): Change kevent registration to use its own struct file pointer.
Otherwise, aio_read() and aio_write() on sockets are broken if a kevent is
 registered.  (The code after kevent registration for handling sockets assumes
 that the struct file pointer "fp" still refers to the socket, not the kqueue.)
2000-10-29 21:38:28 +00:00
Poul-Henning Kamp
fe4e324374 Allow all users to access the dev -> devname sysctl. 2000-10-29 19:50:06 +00:00
Poul-Henning Kamp
da936bf80a Remove unneeded <stddef.h> #includes. 2000-10-29 16:57:42 +00:00
Poul-Henning Kamp
cf9fa8e725 Move suser() and suser_xxx() prototypes and a related #define from
<sys/proc.h> to <sys/systm.h>.

Correctly document the #includes needed in the manpage.

Add one now needed #include of <sys/systm.h>.
Remove the consequent 48 unused #includes of <sys/proc.h>.
2000-10-29 16:06:56 +00:00
Poul-Henning Kamp
53ce36d17a Remove unneeded #include <sys/proc.h> lines. 2000-10-29 13:57:19 +00:00
Don Lewis
19c34d1596 Nuke a bit of dead code. 2000-10-29 01:00:36 +00:00
Alan Cox
4a71feb71c Add missing call to knote_fdclose() in setugidsafety() and fdcloseexec().
Reviewed by:	jlemon
2000-10-28 20:27:32 +00:00
Poul-Henning Kamp
46aa3347cb Convert all users of fldoff() to offsetof(). fldoff() is bad
because it only takes a struct tag which makes it impossible to
use unions, typedefs etc.

Define __offsetof() in <machine/ansi.h>

Define offsetof() in terms of __offsetof() in <stddef.h> and <sys/types.h>

Remove myriad of local offsetof() definitions.

Remove includes of <stddef.h> in kernel code.

NB: Kernelcode should *never* include from /usr/include !

Make <sys/queue.h> include <machine/ansi.h> to avoid polluting the API.

Deprecate <struct.h> with a warning.  The warning turns into an error on
01-12-2000 and the file gets removed entirely on 01-01-2001.

Partial reviews by:   various.
Significant brucifications by:  bde
2000-10-27 11:45:49 +00:00
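A sketch of the layering described above; the exact __offsetof() body shown is the conventional pointer-arithmetic form and is an assumption:

        /* <machine/ansi.h> */
        #define __offsetof(type, field) ((size_t)(&((type *)0)->field))

        /* <stddef.h> and <sys/types.h> */
        #define offsetof(type, field)   __offsetof(type, field)

        /* Unlike fldoff(), which took a bare struct tag, this works on
         * typedefs and unions as well: */
        typedef struct foo { int a; char b; } foo_t;
        size_t off = offsetof(foo_t, b);
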
John Baldwin
a5a96a1978 - Use MUTEX_DECLARE() and MTX_COLD for the WITNESS code's internal mutex so
it can function before malloc(9) is up and running.
- Add two new options WITNESS_DDB and WITNESS_SKIPSPIN.  If WITNESS_SKIPSPIN
  is enabled, then spin mutexes are ignored by the WITNESS code.  If
  WITNESS_DDB is turned on and DDB is compiled into the kernel, then the
  kernel will drop into DDB when either a lock hierarchy violation occurs
  or mutexes are held when going to sleep.
- Add some new sysctls:
  debug.witness_ddb is a read-write sysctl that corresponds to WITNESS_DDB.
     The kernel option merely changes the default value to on at boot.
  debug.witness_skipspin is a read-only sysctl that one can use to determine
     if the kernel was compiled with WITNESS_SKIPSPIN.
- Wipe out the BSD/OS-specific lock order lists.  We get to build our own
  lists now as we add mutexes to the kernel.
2000-10-27 02:59:30 +00:00
Andrew Gallatin
810bfc8ea1 unstaticize change_ruid() because it is needed by osf1_setuid() 2000-10-26 15:49:35 +00:00
John Baldwin
8088699f79 - Overhaul the software interrupt code to use interrupt threads for each
type of software interrupt.  Roughly, what used to be a bit in spending
  now maps to a swi thread.  Each thread can have multiple handlers, just
  like a hardware interrupt thread.
- Instead of using a bitmask of pending interrupts, we schedule the specific
  software interrupt thread to run, so spending, NSWI, and the shandlers
  array are no longer needed.  We can now have an arbitrary number of
  software interrupt threads.  When you register a software interrupt
  thread via sinthand_add(), you get back a struct intrhand that you pass
  to sched_swi() when you wish to schedule your swi thread to run.
- Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit
  more intuitive.  Also, prefix all the members of struct intrhand with
  'ih_'.
- Make swi_net() a MI function since there is now no point in it being
  MD.

Submitted by:	cp
2000-10-25 05:19:40 +00:00
John Baldwin
3127162743 Quiet some warnings. 2000-10-25 04:37:54 +00:00
John Baldwin
d543796f86 - Make the eventhandler_mutex mutex a private variable in
subr_eventhandler.c
- Move the extra #include's in sys/eventhandler.h to be protected by
  the #ifndef SYS_EVENTHANDLER/#endif
2000-10-25 00:01:39 +00:00
Warner Losh
bbfe025461 Cleanup the rman_make_alignment_flags function to be much clearer and shorter
than the prior version.
2000-10-22 04:48:11 +00:00
John Baldwin
b67a3e6e85 Propagate the 'const'ness of mutex descriptions to the witness code to
quiet warnings.
2000-10-20 22:45:01 +00:00
John Baldwin
78f0da0373 Actually enable the witness code if the WITNESS kernel option is enabled. 2000-10-20 21:58:11 +00:00
John Baldwin
f5271ebc2f Doh. Fix a 64-bit-ism by using uintptr_t for a temporary lock variable
instead of int.
2000-10-20 20:24:40 +00:00
Poul-Henning Kamp
1921a06d6a Introduce the M_ZERO flag to malloc(9)
Instead of:

        foo = malloc(sizeof(foo), M_WAIT);
        bzero(foo, sizeof(foo));

You can now (and please do) use:

        foo = malloc(sizeof(foo), M_WAIT | M_ZERO);

In the future this will enable us to do idle-time pre-zeroing of
malloc-space.
2000-10-20 17:54:55 +00:00
John Baldwin
35e0e5b311 Catch up to moving headers:
- machine/ipl.h -> sys/ipl.h
- machine/mutex.h -> sys/mutex.h
2000-10-20 07:58:15 +00:00
John Baldwin
700bfa750f - GC some #if 0'd code regarding the non-existent safepri variable.
- Don't dink with the witness state of Giant unless we actually own it
  during mi_switch().
2000-10-20 07:52:10 +00:00
John Baldwin
eec258d257 - machine/mutex.h -> sys/mutex.h
- Use MUTEX_DECLARE() and MTX_COLD for the malloc_mtx mutex
2000-10-20 07:29:16 +00:00
John Baldwin
d8881ca31d - machine/mutex.h -> sys/mutex.h
- The initial lock_mtx mutex used in the lockmgr code is initialized very
  early, so use MUTEX_DECLARE() and MTX_COLD.
2000-10-20 07:28:00 +00:00
John Baldwin
36412d79b4 - Make the mutex code almost completely machine independent. This greatly
reduces the maintenance load for the mutex code.  The only MD portions
  of the mutex code are in machine/mutex.h now, which include the assembly
  macros for handling mutexes as well as optionally overriding the mutex
  micro-operations.  For example, we use optimized micro-ops on the x86
  platform #ifndef I386_CPU.
- Change the behavior of the SMP_DEBUG kernel option.  In the new code,
  mtx_assert() only depends on INVARIANTS, allowing other kernel developers
  to have working mutex assertions without having to include all of the
  mutex debugging code.  The SMP_DEBUG kernel option has been renamed to
  MUTEX_DEBUG and now just controls extra mutex debugging code.
- Abolish the ugly mtx_f hack.  Instead, we dynamically allocate
  separate mtx_debug structures on the fly in mtx_init, except for mutexes
  that are initiated very early in the boot process.   These mutexes
  are declared using a special MUTEX_DECLARE() macro, and use a new
  flag MTX_COLD when calling mtx_init.  This is still somewhat hackish,
  but it is less evil than the mtx_f filler struct, and the mtx struct is
  now the same size with and without mutex debugging code.
- Add some micro-micro-operation macros for doing the actual atomic
  operations on the mutex mtx_lock field to make it easier for other archs
  to override/optimize mutex ops if needed.  These new tiny ops also clean
  up the code in some places by replacing long atomic operation function
  calls that spanned 2-3 lines with a short 1-line macro call.
- Don't call mi_switch() from mtx_enter_hard() when we block while trying
  to obtain a sleep mutex.  Calling mi_switch() would bogusly release
  Giant before switching to the next process.  Instead, inline most of the
  code from mi_switch() in the mtx_enter_hard() function.  Note that when
  we finally kill Giant we can back this out and go back to calling
  mi_switch().
2000-10-20 07:26:37 +00:00
John Baldwin
d8f831678d Reparent a kernel thread to init during kthread_exit() so that the zombie
can be reaped.
2000-10-19 19:53:44 +00:00
Robert Watson
47460a23a0 o Introduce new VOP_ACCESS() flag VADMIN, allowing file systems to perform
"administrative" authorization checks.  In most cases, the VADMIN test
  checks to make sure the credential effective uid is the same as the file
  owner.
o Modify vaccess() to set VADMIN as an available right if the uid is
  appropriate.
o Modify references to uid-based access control operations such that they
  now always invoke VOP_ACCESS() instead of using hard-coded policy checks.
o This allows alternative UFS policies to be implemented by replacing only
  ufs_access() (such as mandatory system policies).
o VOP_ACCESS() requires the caller to hold an exclusive vnode lock on the
  vnode: I believe that new invocations of VOP_ACCESS() are always called
  with the lock held.
o Some direct checks of the uid remain, largely associated with the QUOTA
  and SUIDDIR code.

Reviewed by:	eivind
Obtained from:	TrustedBSD Project
2000-10-19 07:53:59 +00:00
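A condensed sketch of the convention described above; the variable names (file_uid, granted) are simplifications of the real vaccess() code:

        /* In vaccess(): the file owner is granted VADMIN. */
        if (cred->cr_uid == file_uid)
                granted |= VADMIN;

        /* In a caller, instead of a hard-coded owner/suser() test for an
         * administrative operation such as chmod: */
        error = VOP_ACCESS(vp, VADMIN, cred, p);
        if (error)
                return (error);
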
John Baldwin
dc13e6dfbb Axe the idle_event eventhandler, and add a MD cpu_idle function used
for things such as halting CPU's, idling CPU's, etc.

Discussed with:	msmith
2000-10-19 07:47:16 +00:00
Peter Wemm
5d391f75d6 EVENTHANDLER_INVOKE() takes two arguments. 2000-10-18 17:56:06 +00:00
John Baldwin
86bc23af90 Don't needlessly pass the diagnostic counter to the idle_event event
handlers.
2000-10-18 08:10:25 +00:00
Matthew N. Dodd
0cb53e2487 Add new bus method 'GET_RESOURCE_LIST' and appropriate generic
implementation.

Add bus_generic_rl_{get,set,delete,release,alloc}_resource() functions
which provide generic operations for devices using resource list style
resource management.

This should simplify a number of bus drivers.  Further commits to follow.
2000-10-18 05:15:40 +00:00
John Baldwin
3650b37578 - Wrap the sanity checks for staying in the idle loop for absurdly long
amounts of time in #ifdef DIAGNOSTIC
- Call vm_page_zero_idle() during the idle loop.
2000-10-17 23:12:37 +00:00
Warner Losh
85d693f9d8 Implement resource alignment as discussed in arch@ a long time ago.
This was implemented by Shigeru YAMAMOTO-san and Jonathan Chen.  I've
cleaned them up somewhat and they seem to work well enough to boot
current (but given current's state it can be hard to tell).  Doug
Rabson also reviewed the design and signed off on it.
2000-10-17 22:08:03 +00:00
Nick Hibma
d686268728 Put the header section in the header file not the c file.
Submitted by:	Jonathan Chen <jon@spock.org>
PR:		21982
2000-10-15 15:19:35 +00:00
Poul-Henning Kamp
db7e3af111 Remove unneeded #include <machine/clock.h> 2000-10-15 14:19:01 +00:00
Bosko Milekic
181d2a1564 Add nmbcnt sysctl and make it tunable at boottime; nmbcnt is the
number of ext_buf counters that are possibly allocatable.

Do this because:

  (i) It will make it easier to influence EXT_COUNTERS for if_sk,
      if_ti (or similar) users where the driver allocates its own
      ext_bufs and where it is important for the mbuf system to take
      it into account when reserving necessary space for counters.

  (ii) Facilitate some percentile calculation for netstat(1)
2000-10-15 06:24:07 +00:00
John W. De Boskey
2ec40c9aac Remove the signal value check from the PT_STEP codepath. It
can cause a bogus failure.

Reviewed by:    Sean Eric Fagan <sef@kithrup.com>
                and no other response to the review request.
2000-10-14 03:56:01 +00:00
Peter Wemm
ac5f943c37 savectx() is now used exclusively by the crash dump system. Move the
i386 specific gunk (copy %cr3 to the pcb) from the MI dumpsys() to the
MD savectx().
2000-10-13 22:03:29 +00:00
Paul Saab
16a011f973 Do not allocate a callout for all crashdumps, not just when you panic. 2000-10-13 21:49:19 +00:00
Robert Watson
ab024bb02e o Simplify capability types away from an array of ints to a single
u_int64_t flag field, bounding the number of capabilities at 64,
  but substantially cleaning up capability logic (there are currently
  43 defined capabilities).

o Heads up to anyone actually using capabilities: the constant
  assignments for various capabilities have been redone, so any
  persistent binary capability stores (i.e., '$posix1e.cap' EA
  backing files) must be recreated.  If you have one of these,
  you'll know about it, so if you have no idea what this means,
  don't worry.

o Update libposix1e to reflect this new definition, fixing the
  exposed functions that directly manipulate the flags fields.

Obtained from:	TrustedBSD Project
2000-10-13 17:12:58 +00:00
Jason Evans
9722d88fba For lockmgr mutex protection, use an array of mutexes that are allocated
and initialized during boot.  This avoids bloating sizeof(struct lock).
As a side effect, it is no longer necessary to enforce the assumption that
lockinit()/lockdestroy() calls are paired, so the LK_VALID flag has been
removed.

Idea taken from:	BSD/OS.
2000-10-12 22:37:28 +00:00
Doug Rabson
63c47a5ca0 Add a gross hack for ia64 to allocate the backing store for a new program. 2000-10-12 14:24:03 +00:00
Eivind Eklund
7eb9fca557 Blow away the v_specmountpoint define, replacing it with what it was
defined as (rdev->si_mountpoint)
2000-10-09 17:31:39 +00:00
Jason Evans
39df86086f Do not call lockdestroy() for v_vnlock, which may point to a lock in a
deeper vfs stacking layer.

Submitted by:	bp
2000-10-06 08:04:48 +00:00
John Baldwin
ca29467e9a Correct a warning where the r_debug_state() dummy function used to trigger
a breakpoint in the kernel didn't use the proper argument list.  To avoid
having to include the userland link.h header everywhere that sys/linker.h
is used, make r_debug_state() a static function in link_elf.c as well.
2000-10-06 05:20:02 +00:00
John Baldwin
6c56727456 - Change fast interrupts on x86 to push a full interrupt frame and to
return through doreti to handle ast's.  This is necessary for the
  clock interrupts to work properly.
- Change the clock interrupts on the x86 to be fast instead of threaded.
  This is needed because both hardclock() and statclock() need to run in
  the context of the current process, not in a separate thread context.
- Kill the prevproc hack as it is no longer needed.
- We really need Giant when we call psignal(), but we don't want to block
  during the clock interrupt.  Instead, use two p_flag's in the proc struct
  to mark the current process as having a pending SIGVTALRM or a SIGPROF
  and let them be delivered during ast() when hardclock() has finished
  running.
- Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyways.  It was
  broken on the x86 if it was turned on since cpl is gone.  Its only use
  was to bogusly run softclock() directly during hardclock() rather than
  scheduling an SWI.
- Remove the COM_LOCK simplelock and replace it with a clock_lock spin
  mutex.  Since the spin mutex already handles disabling/restoring
  interrupts appropriately, this also lets us axe all the *_intr() fu.
- Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use
  temporary fast interrupts for the APIC trial.
- Add two new process flags P_ALRMPEND and P_PROFPEND to mark the pending
  signals in hardclock() that are to be delivered in ast().

Submitted by:	jakeb (making statclock safe in a fast interrupt)
Submitted by:	cp (concept of delaying signals until ast())
2000-10-06 02:20:21 +00:00
John Baldwin
a91b7dc11b Various whitespace cleanups after the SMPng commit, which jumbled things
around a bit in the trap handling code.
2000-10-06 01:55:07 +00:00
John Baldwin
0e2aab1237 Don't treat a kernel stack fault the same as a general protection fault or
a segment not present fault in the non-vm86 case.
2000-10-06 01:50:43 +00:00
John Baldwin
1931cf940a - Heavyweight interrupt threads on the alpha for device I/O interrupts.
- Make softinterrupts (SWI's) almost completely MI, and divorce them
  completely from the x86 hardware interrupt code.
  - The ihandlers array is now gone.  Instead, there is a MI shandlers array
    that just contains SWI handlers.
  - Most of the former machine/ipl.h files have moved to a new sys/ipl.h.
- Stub out all the spl*() functions on all architectures.

Submitted by:	dfr
2000-10-05 23:09:57 +00:00
Eivind Eklund
a863c0fb2f Style fixes based on comments by bde 2000-10-05 18:22:46 +00:00
Doug Rabson
c9b004775d Add a workaround for statically linked kernels. 2000-10-04 17:40:24 +00:00
Jason Evans
a18b1f1d4d Convert lockmgr locks from using simple locks to using mutexes.
Add lockdestroy() and appropriate invocations, which corresponds to
lockinit() and must be called to clean up after a lockmgr lock is no
longer needed.
2000-10-04 01:29:17 +00:00
Boris Popov
f8be809e0f Move KASSERTs which check the value of v_usecount to after vnode locking, so
they will not produce false alarms.
2000-10-02 09:57:06 +00:00
Mike Smith
aa998012c7 Treat %X the same as %x (not entirely correct, but close enough). 2000-10-02 07:13:10 +00:00
Bosko Milekic
7d03271452 Big mbuf subsystem diff #1: incorporate mutexes and fix things up somewhat
to accommodate the changes.

 Here's a list of things that have changed (I may have left out a few); for a
 relatively complete list, see http://people.freebsd.org/~bmilekic/mtx_journal

   * Remove old (once useful) mcluster code for MCLBYTES > PAGE_SIZE which
     nobody uses anymore. It was great while it lasted, but now we're moving
     onto bigger and better things (Approved by: wollman).

   * Practically rewrote the allocation macros in sys/sys/mbuf.h to accommodate
     new allocations which grab the necessary lock.

   * Make sure that necessary mbstat variables are manipulated with
     corresponding atomic() routines.

   * Changed the "wait" routines, cleaned it up, made one routine that does
     the job.

   * Generalized MWAKEUP() macro. Got rid of m_retry and m_retryhdr, as they
     are now included in the generalized "wait" routines.

   * Sleep routines now use msleep().

   * Free lists have locks.

   * etc... probably other stuff I'm missing...

  Things to look out for and work on later:

   * find a better way to (dynamically) adjust EXT_COUNTERS

   * move necessity to recurse on a lock from drain routines by providing
     lock-free lower-level version of MFREE() (and possibly m_free()?).

   * checkout include of mutex.h in sys/sys/mbuf.h - probably violating
     general philosophy here.

   The code has been reviewed quite a bit, but problems may arise... please,
   don't panic! Send me Emails: bmilekic@freebsd.org

Reviewed by: jlemon, cp, alfred, others?
2000-09-30 06:30:39 +00:00
Doug Rabson
918c9eec57 Add ia64 support. 2000-09-29 13:36:47 +00:00
Doug Rabson
ff2d7ae543 Don't support dynamic linking on ia64 for now - the tools can't cope. 2000-09-29 13:34:04 +00:00
Doug Rabson
b99353b99e Change the conditional so that we only build this on i386 instead of
trying to build it on all non-alpha arches.
2000-09-29 13:32:24 +00:00
Jonathan Lemon
d5aa12349f Check so_error in filt_so{read|write} in order to detect UDP errors.
PR: 21601
2000-09-28 04:41:22 +00:00
Kirk McKusick
02a1e48f02 Do the right thing if bdevvp is called twice for the same device.
Obtained from:	Poul-Henning Kamp <phk@freebsd.org>
2000-09-27 18:03:17 +00:00
Alan Cox
b92bb032d8 aio_qphysio: Eliminate one instance of an out-of-range check that is
performed twice.  Eliminate initialization that is already performed
 by _aio_aqueue.

aio_physwakeup: Eliminate redundant synchronization that is already
 performed by bufdone.
2000-09-26 06:35:22 +00:00
Takanori Watanabe
b9a22da4cf Make the size of the dynamic loader argument variable to support
various executable file formats.

Reviewed by:	peter
2000-09-26 05:09:21 +00:00
Boris Popov
67e871664b Add a lock structure to vnode structure. Previously it was either allocated
separately (nfs, cd9660 etc) or kept as the first element of the structure
referenced by the v_data pointer (ffs). Such an organization leads to known problems
with stacked filesystems.

From this point vop_no*lock*() functions maintain only interlock lock.
vop_std*lock*() functions maintain built-in v_lock structure using lockmgr().
vop_sharedlock() is compatible with vop_stdunlock(), but maintains a shared
lock on vnode.

If a filesystem wishes to export a lockmgr-compatible lock, it can put the address
of this lock in the v_vnlock field. This indicates that the upper filesystem
can take advantage of it and use single lock structure for entire (or part)
of stack of vnodes. This field shouldn't be examined or modified by VFS code
except for initialization purposes.

Reviewed in general by:	mckusick
2000-09-25 15:24:04 +00:00
John Baldwin
fd2802cfe0 Add a KASSERT() to catch instances where the mutex that we pass in to
msleep() is recursed.

Suggested by:	cp
2000-09-24 00:33:51 +00:00
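A sketch of the kind of assertion described above, expressed with mtx_assert() flags rather than by poking at mutex internals (in the spirit of the mtx_assert() commit further up); the exact flag names are partly an assumption:

        /* msleep() can only release a mutex held exactly once by the
         * current process; a recursed mutex would only have its recursion
         * count dropped and would never actually be released. */
        if (mtx != NULL)
                mtx_assert(mtx, MA_OWNED | MA_NOTRECURSED);
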