freebsd-nq

Author	SHA1	Message	Date
David Xu	450c38d016	Set kse mailbox pointer to NULL when P_KSES is turned off.	2003-01-04 05:59:25 +00:00
Julian Elischer	a98c9b8604	White space fixes	2003-01-03 20:55:52 +00:00
Julian Elischer	03ea472080	Make an explicit flag to indicate that a KSE has a reason to upcall, and use that flag when there is a kse_wakeup() call. It will probably be used with signal delivery as well eventually. Submitted by: davidxu@	2003-01-03 20:41:49 +00:00
Julian Elischer	3f5f24287f	Don't need to set retvals to 0 in the non error case. They are set to a good default anyhow. Submitted by: davidxu@	2003-01-03 19:38:54 +00:00
Poul-Henning Kamp	862702306b	Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since all BUF_STRATEGY did in the first place was call VOP_STRATEGY.	2003-01-03 06:32:15 +00:00
Poul-Henning Kamp	e2a3ea1c45	Remove unused second argument from DEV_STRATEGY().	2003-01-03 05:57:35 +00:00
Andrew Gallatin	1f88bad30a	o Introduce a new external mbuf type, EXT_EXTREF. o Allow callers of m_extadd() to allocate their own reference m_ext.ref_cnt pointer, rather than having the mbuf system allocate it with a malloc() in the critical path. This speeds m_extadd() up, and also simplifies locking (malloc() may need Giant). A driver or subsystem wishing to use its own ref counter must initialize m_ext.ref_cnt to point to its ref counter prior to calling m_extadd(), and it must use EXT_EXTREF as its external type. Eg: m->m_ext.ref_cnt = my_ref_cnt_ptr; m_extadd(.....,EXT_EXTREF); Reviewed by: bosko	2003-01-02 21:16:50 +00:00
Alan Cox	49bf855d20	Lock the vm object when performing back-to-back vm_object_clear_flag() and vm_object_set_flag().	2003-01-02 18:32:13 +00:00
David Xu	42f67bd752	Adjust code for Julian's last commit. use td_mailbox to detect if a syscall is from UTS kernel.	2003-01-02 02:48:03 +00:00
Jens Schweikhardt	9d5abbddbf	Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup, especially in troff files.	2003-01-01 18:49:04 +00:00
Warner Losh	62c8b32c71	Use 0600 for permissions for /dev/devctl until it is cloneable. Use UID_ROOT and GID_WHEEL rather than 0. Prompted by: rwatson	2003-01-01 03:43:58 +00:00
Alfred Perlstein	13438f6823	When compiling the kernel do not implicitly include filedesc.h from proc.h; this was causing filedesc work to be very painful. In order to make this work split out sigio definitions to their own header (sigio.h) which is included from proc.h for the time being.	2003-01-01 01:56:19 +00:00
Alfred Perlstein	c522c1bf4b	fdcopy() only needs a filedesc pointer.	2003-01-01 01:19:31 +00:00
Alfred Perlstein	03282e6e3d	purge 'register'.	2003-01-01 01:05:54 +00:00
Alfred Perlstein	c7f1c11b20	Since fdshare() and fdinit() only operate on filedescs, make them take pointers to filedesc structures instead of threads. This makes it more clear that they do not do any voodoo with the thread/proc or anything other than the filedesc passed in or returned. Remove some XXX KSE's as this resolves the issue.	2003-01-01 01:01:14 +00:00
Alfred Perlstein	59c97598d3	fdinit() does not need to lock the filedesc it is creating as no one besides itself has access until the function returns.	2003-01-01 00:35:46 +00:00
Sam Leffler	addea9d4d7	o reduce the overhead of calling ppsratecheck by using ticks instead of calling getmicrouptime (but maintain the struct timeval-based calling convention for compatibility) o eliminate the use of timersub in ratecheck Note that flood ping tests indicate ppsratecheck is inaccurate (but on the conservative side) with this revised implementation. If more accuracy is needed we'll have to introduce an alternate interface or increase the overhead. Reviewed by: silby, dillon, bde	2002-12-31 18:22:12 +00:00
Jens Schweikhardt	d64ada501a	Fix typos, mostly s/ an / a / where appropriate and a few s/an/and/ Add FreeBSD Id tag where missing.	2002-12-30 21:18:15 +00:00
Sam Leffler	9967cafc49	Correct mbuf packet header propagation. Previously, packet headers were sometimes propagated using M_COPY_PKTHDR which actually did something between a "move" and a "copy" operation. This is replaced by M_MOVE_PKTHDR (which copies the pkthdr contents and "removes" it from the source mbuf) and m_dup_pkthdr which copies the packet header contents including any m_tag chain. This corrects numerous problems whereby mbuf tags could be lost during packet manipulations. These changes also introduce arguments to m_tag_copy and m_tag_copy_chain to specify if the tag copy work should potentially block. This introduces an incompatibility with openbsd which we may want to revisit. Note that move/dup of packet headers does not handle target mbufs that have a cluster bound to them. We may want to support this; for now we watch for it with an assert. Finally, M_COPYFLAGS was updated to include M_FIRSTFRAG|M_LASTFRAG. Supported by: Vernier Networks Reviewed by: Robert Watson <rwatson@FreeBSD.org>	2002-12-30 20:22:40 +00:00
Robert Watson	3c67c23bcf	Implement new ACL system calls which do not follow symbolic links: __acl_get_link(), __acl_set_link(), __acl_delete_link(), and __acl_aclcheck_link(), with almost identical implementations to the existing __acl_*_file() variants on these calls. Update copyright. Obtained from: TrustedBSD Project	2002-12-29 20:28:44 +00:00
Robert Watson	6f123c35a0	Regen from syscalls.master:1.139	2002-12-29 20:26:41 +00:00
Robert Watson	b1f4acd8ac	Add definitions for four new system calls: __acl_get_link() Retrieve an ACL by name without following symbolic links. __acl_set_link() Set an ACL by name without following symbolic links. __acl_delete_link() Delete an ACL by name without following symbolic links. __acl_aclcheck_link() Check an ACL against a file by name without following symbolic links. These calls are similar in spirit to lstat(), lchown(), lchmod(), etc, and will be used under similar circumstances. Obtained from: TrustedBSD Project	2002-12-29 20:25:54 +00:00
Ian Dowse	6a1b2a22ef	Add a new vnode flag VI_DOINGINACT to indicate that a VOP_INACTIVE call is in progress on the vnode. When vput() or vrele() sees a 1->0 reference count transition, it now returns without any further action if this flag is set. This flag is necessary to avoid recursion into VOP_INACTIVE if the filesystem inactive routine causes the reference count to increase and then drop back to zero. It is also used to guarantee that an unlocked vnode will not be recycled while blocked in VOP_INACTIVE(). There are at least two cases where the recursion can occur: one is that the softupdates code called by ufs_inactive() via ffs_truncate() can call vput() on the vnode. This has been reported by many people as "lockmgr: draining against myself" panics. The other case is that nfs_inactive() can call vget() and then vrele() on the vnode to clean up a sillyrename file. Reviewed by: mckusick (an older version of the patch)	2002-12-29 18:30:49 +00:00
Poul-Henning Kamp	371400cf2e	Use a timeout of one second while we wait for the vnode washer, this prevents a potential race and makes the system a little bit less jerky under extreme loads.	2002-12-29 11:18:25 +00:00
Poul-Henning Kamp	851a87ea1a	Vnodes pull in 800-900 bytes these days, all things counted, so we need to treat desiredvnodes much more like a limit than as a vague concept. On a 2GB RAM machine where desired vnodes is 130k, we run out of kmem_map space when we hit about 190k vnodes. If we wake up the vnode washer in getnewvnode(), sleep until it is done, so that it has a chance to offer us a washed vnode. If we don't sleep here we'll just race ahead and allocate yet another vnode which will never get freed. In the vnodewasher, instead of doing 10 vnodes per mountpoint per rotation, do 10% of the vnodes distributed evenly across the mountpoints.	2002-12-29 10:39:05 +00:00
Alan Cox	a28cc55e5b	Reduce the number of times that we acquire and release the page queues lock by making vm_page_rename()'s caller, rather than vm_page_rename(), responsible for acquiring it.	2002-12-29 07:17:06 +00:00
Jake Burkholder	24fbeaf9c3	Don't put a newline in KTR traces.	2002-12-28 23:22:22 +00:00
Jake Burkholder	dcc4093c7a	Add a tunable kern.smp.disabled for explicitly disabling smp on an smp kernel.	2002-12-28 23:21:13 +00:00
Poul-Henning Kamp	9f16282798	KASSERT that vop_revoke() gets a VCHR.	2002-12-28 22:27:14 +00:00
Poul-Henning Kamp	f53c6e5c9a	Remove unused cdevsw_ALLOCSTART macro.	2002-12-28 21:47:43 +00:00
Poul-Henning Kamp	7068a01c6f	Remove cdevsw_add calls, they are deprecated.	2002-12-28 21:39:46 +00:00
Matthew Dillon	45587e2514	Abstract-out the constants for the sequential heuristic. No operational changes. MFC after: 1 day	2002-12-28 20:28:10 +00:00
Julian Elischer	93a7aa79d6	Add code to ddb to allow backtracing an arbitrary thread. (show thread {address}) Remove the IDLE kse state and replace it with a change in the way threads share KSEs. Every KSE now has a thread, which is considered its "owner"; however, a KSE may also be lent to other threads in the same group to allow completion of in-kernel work. In this case the owner remains the same and the KSE will revert to the owner when the other work has been completed. All creation of upcalls etc. is now done from kse_reassign(), which in turn is called from mi_switch or thread_exit(). This means that special code can be removed from msleep() and cv_wait(). kse_release() does not leave a KSE with no thread any more but converts the existing thread into the KSE's owner, and sets it up for doing an upcall. It is just inhibited from being scheduled until there is some reason to do an upcall. Remove all trace of the kse_idle queue since it is no longer needed. "Idle" KSEs are now on the loanable queue.	2002-12-28 01:23:07 +00:00
Robert Watson	f0bc12ee8d	Improve consistency between devfs and MAKEDEV: use UID_ROOT and GID_WHEEL instead of UID_BIN and GID_BIN for /dev/fd/* entries. Submitted by: kris	2002-12-27 16:54:44 +00:00
Alfred Perlstein	5590e7fdf0	Lock filedesc while performing a range check on the file descriptor. Reviewed by: alc	2002-12-27 08:39:42 +00:00
Alan Cox	d746789347	Hold the page queues lock when calling vm_page_flag_clear().	2002-12-27 06:52:32 +00:00
Jeffrey Hsu	6f782c4636	Ensure that the made-up inode number for a Unix domain socket is persistent.	2002-12-25 07:59:39 +00:00
Robert Watson	79191eca57	Flush vop_refreshlabel() definition, since it is no longer used. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-12-24 19:47:13 +00:00
Poul-Henning Kamp	a7010ee2f4	White-space changes.	2002-12-24 09:44:51 +00:00
Jeffrey Hsu	956b0b653c	SMP locking for radix nodes.	2002-12-24 03:03:39 +00:00
Poul-Henning Kamp	08c7670a8b	Move the declaration of the socket fileops from socketvar.h to file.h. This allows us to use the new typedefs and removes the need for a number of forward struct declarations in socketvar.h	2002-12-23 22:46:47 +00:00
Poul-Henning Kamp	f3a682116c	Detediousficate declaration of fileops array members by introducing typedefs for them.	2002-12-23 21:53:20 +00:00
Poul-Henning Kamp	6ce9c72c30	s/sokqfilter/soo_kqfilter/ for consistency with the naming of all other socket/file operations.	2002-12-23 21:37:28 +00:00
Alan Cox	0cb6c00463	- Hold the kernel_object's lock around vm_page_alloc(kernel_object,...). - Hold the page queues lock around vm_page_wakeup().	2002-12-23 20:10:47 +00:00
Jake Burkholder	c3c2862df4	- Add a spin lock to single thread cache invalidation and tlb flush ipis, which allows ipis to be sent outside of Giant. - Remove the ap boot mutex, which is unused.	2002-12-22 20:50:23 +00:00
Kris Kennaway	4ef3d7a27b	Enforce correct ordering of the filedesc structure and pipe mutex, because WITNESS can get the order wrong if it guesses based on first use. Reviewed by: jhb, alfred	2002-12-22 16:32:34 +00:00
Jeffrey Hsu	b30a244c34	SMP locking for ifnet list.	2002-12-22 05:35:03 +00:00
Marcel Moolenaar	551d79e177	Fix multiple registration of the elf_legacy_coredump sysctl variable. The duplication is caused by the fact that imgact_elf.c is included by both imgact_elf32.c and imgact_elf64.c and both are compiled by default on ia64. Consequently, we have two separate copies of the elf_legacy_coredump variable due to them being declared static, and two entries for the same sysctl in the linker set, both referencing the unique copy of the elf_legacy_coredump variable. Since the second sysctl cannot be registered, one of the elf_legacy_coredump variables can not be tuned (if ordering still holds, it's the ELF64 related one). The only solution is to create two different sysctl variables, just like the elf<32|64>_trace sysctl variables. This unfortunately is a (user) interface change, but unavoidable. Thus, on ELF32 platforms the sysctl variable is called elf32_legacy_coredump and on ELF64 platforms it is called elf64_legacy_coredump. Platforms that have both ELF formats have both sysctl variables. These variables should probably be retired sooner rather than later.	2002-12-21 01:15:39 +00:00
Sam Leffler	91974ce10b	add generic rate limiting support from netbsd; ratelimit is purely time based, ppsratecheck is for controlling packets/second Obtained from: netbsd	2002-12-20 23:54:47 +00:00
Alan Cox	2952e1fb58	Extend the scope of the page queues lock in vm_pgmoveco().	2002-12-20 21:18:29 +00:00
Maxime Henrion	894db7b01f	Don't forget to destroy the mutex if an error occurs in the jail() system call. Submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>	2002-12-20 14:32:20 +00:00
Alan Cox	ee113343eb	Hold the page queues lock when performing vm_page_busy().	2002-12-18 20:16:22 +00:00
Poul-Henning Kamp	4d99ef8d55	Indent properly.	2002-12-17 19:31:26 +00:00
Poul-Henning Kamp	126c7e29fe	Remove unused variable cn_devfsdev.	2002-12-17 19:30:50 +00:00
Poul-Henning Kamp	d321df47c3	Don't cast a pointer to (intptr_t) and then on to (int) when we cannot be sure that (int) is large enough. Instead cast only to (intptr_t) and cast the switch/case values to (intptr_t) as well.	2002-12-17 19:13:03 +00:00
Matthew Dillon	fa7dd9c5bc	Change the way ELF coredumps are handled. Instead of unconditionally skipping read-only pages, which can result in valuable non-text-related data not getting dumped, the ELF loader and the dynamic loader now mark read-only text pages NOCORE and the coredump code only checks (primarily) for complete inaccessibility of the page or NOCORE being set. Certain applications which map large amounts of read-only data will produce much larger cores. A new sysctl has been added, debug.elf_legacy_coredump, which will revert to the old behavior. This commit represents collaborative work by all parties involved. The PR contains a program demonstrating the problem. PR: kern/45994 Submitted by: "Peter Edwards" <pmedwards@eircom.net>, Archie Cobbs <archie@dellroad.org> Reviewed by: jdp, dillon MFC after: 7 days	2002-12-16 19:24:43 +00:00
Robert Drehmel	0adb6d7a49	Remove the hto(be|le)[slq] and (be|le)toh[slq] macros defined in _KERNEL scope from "src/sys/sys/mchain.h". Replace each occurrence of the above in _KERNEL scope with the appropriate macro from the set of hto(be|le)(16|32|64) and (be|le)toh(16|32|64) from "src/sys/sys/endian.h". Tested by: tjr Requested by: comment marked with XXX	2002-12-16 16:20:06 +00:00
Matthew Dillon	72e7f3ddc2	Regenerate system calls (swapoff added)	2002-12-15 19:19:15 +00:00
Matthew Dillon	92da00bb24	This is David Schultz's swapoff code which I am finally able to commit. This should be considered highly experimental for the moment. Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU> MFC after: 3 weeks	2002-12-15 19:17:57 +00:00
Matthew Dillon	389d2b6e21	Fix a refcount race with the vmspace structure. In order to prevent resource starvation we clean-up as much of the vmspace structure as we can when the last process using it exits. The rest of the structure is cleaned up when it is reaped. But since exit1() decrements the ref count it is possible for a double-free to occur if someone else, such as the process swapout code, references and then dereferences the structure. Additionally, the final cleanup of the structure should not occur until the last process referencing it is reaped. This commit solves the problem by introducing a secondary reference count, called 'vm_exitingcnt'. The normal reference count is decremented on exit and vm_exitingcnt is incremented. vm_exitingcnt is decremented when the process is reaped. When both vm_exitingcnt and vm_refcnt are 0, the structure is freed for real. MFC after: 3 weeks	2002-12-15 18:50:04 +00:00
Maxim Konovalov	9f59c468f3	o Clear a high bit of ipc_perm.seq so msgget(3) never returns a negative message queue id. PR: kern/46122 Submitted by: Vladimir B.Grebenschikov <vova@sw.ru> MFC after: 2 weeks	2002-12-15 09:41:46 +00:00
Alan Cox	475e8011ab	Perform vm_object_lock() and vm_object_unlock() around vm_object_page_remove().	2002-12-15 05:41:56 +00:00
Alfred Perlstein	f97182acf8	unwrap lines made short enough by SCARGS removal	2002-12-14 08:18:06 +00:00
Alfred Perlstein	b80521fee5	remove syscallarg(). Suggested by: peter	2002-12-14 02:07:32 +00:00
Alfred Perlstein	d1e405c5ce	SCARGS removal take II.	2002-12-14 01:56:26 +00:00
Kirk McKusick	0f5f789c0d	The buffer daemon cannot skip over buffers owned by locked inodes as they may be the only viable ones to flush. Thus it will now wait for an inode lock if the other alternatives will result in rollbacks (and immediate redirtying of the buffer). If only buffers with rollbacks are available, one will be flushed, but then the buffer daemon will wait briefly before proceeding. Failing to wait briefly effectively deadlocks a uniprocessor since every other process writing to that filesystem will wait for the buffer daemon to clean up which takes close enough to forever to feel like a deadlock. Reported by: Archie Cobbs <archie@dellroad.org> Sponsored by: DARPA & NAI Labs. Approved by: re	2002-12-14 01:35:30 +00:00
Alfred Perlstein	bc9e75d7ca	Backout removal SCARGS, the code freeze is only "selectively" over.	2002-12-13 22:41:47 +00:00
Alfred Perlstein	0bbe7292e1	Remove SCARGS. Reviewed by: md5	2002-12-13 22:27:25 +00:00
Tim J. Robbins	9d0fffd3ca	Drop filedesc lock and acquire Giant around calls to malloc() and free(). These call uma_large_malloc() and uma_large_free() which require Giant. Fixes panic when descriptor table is larger than KMEM_ZMAX bytes noticed by kkenn. Reviewed by: jhb	2002-12-13 09:59:40 +00:00
Julian Elischer	696058c3c5	Unbreak the KSE code. Keep track of zombie threads using the Per-CPU storage during the context switch. Rearrange thread cleanups to avoid problems with Giant. Clean threads when freed or when recycled. Approved by: re (jhb)	2002-12-10 02:33:45 +00:00
Robert Watson	990b4b2dc5	Remove dm_root entry from struct devfs_mount. It's never set, and is unused. Replace it with a dm_mount back-pointer to the struct mount that the devfs_mount is associated with. Export that pointer to MAC Framework entry points, where all current policies don't use the pointer. This permits the SEBSD port of SELinux's FLASK/TE to compile out-of-the-box on 5.0-CURRENT with full file system labeling support. Approved by: re (murray) Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-12-09 03:44:28 +00:00
Alan Cox	2e29a1f21f	To avoid lock order reversals in getnewvnode(), the call to uma_zfree() must be delayed until the vnode interlock is released. Reported by: kris@ Approved by: re (jhb)	2002-12-08 05:06:50 +00:00
Giorgos Keramidas	0c920c0de8	Fix typo in comment. It's SYSINIT, not SYSINT. Approved by: re (murray)	2002-11-30 22:15:30 +00:00
Kirk McKusick	c6964d3bc9	Remove a race condition / deadlock from snapshots. When converting from individual vnode locks to the snapshot lock, be sure to pass any waiting processes along to the new lock as well. This transfer is done by a new function in the lock manager, transferlockers(from_lock, to_lock); Thanks to Lamont Granquist <lamont@scriptkiddie.org> for his help in pounding on snapshots beyond all reason and finding this deadlock. Sponsored by: DARPA & NAI Labs.	2002-11-30 19:00:51 +00:00
Warner Losh	304f10ce4a	devd kernel improvements: 1) Record all device events when devctl is enabled, rather than just when devd has devctl open. This is necessary to prevent races between when a device arrives, and when devd starts. 2) Add hw.bus.devctl_disable to disable devctl, this can also be set as a tunable. 3) Fix async support. Reset nonblocking and async_td in open. remove async flags. 4) Free all memory when devctl is disabled. Approved by: re (blanket)	2002-11-30 00:49:43 +00:00
Alan Cox	fdff30d256	Use pmap_remove_all() instead of pmap_remove() before freeing the page in vm_pgmoveco(); the page may have more than one mapping. Hold the page queues lock when calling pmap_remove_all(). Approved by: re (blanket)	2002-11-28 08:44:26 +00:00
Robert Drehmel	f85a961930	Do not set a variable (vp->p_pollinfo) to NULL if we know it already has that value. Approved by: re	2002-11-27 16:45:54 +00:00
Maxim Konovalov	8819f45b51	Small SO_RCVTIMEO and SO_SNDTIMEO values are mistakenly taken to be zero. PR: kern/32827 Submitted by: Hartmut Brandt <brandt@fokus.gmd.de> Approved by: re (jhb) MFC after: 2 weeks	2002-11-27 13:34:04 +00:00
Tim J. Robbins	fef82663b8	o Initialise each mbuf's m_len to 0 in m_getm(); mb_put_mem() depends on this. o Update the `cur' pointer in the cluster loop in m_getm() to avoid incorrect truncation and leaked mbufs. Reviewed by: bmilekic Approved by: re	2002-11-27 04:26:00 +00:00
Warner Losh	647501a046	Make the rman_{get,set}_* macros into real functions. The macros create an ABI that encodes offsets and sizes of structures into client drivers. The functions isolate the ABI from changes to the resource structure. Since these are used very rarely (once at startup), the speed penalty will be down in the noise. Also, add r_rid to the structure so that clients can save the 'rid' of the resource in the struct resource, plus accessor functions. Future additions to newbus will make use of this to present a simplified interface for resource specification. Approved by: re (jhb) Reviewed by: jhb, jake	2002-11-27 03:55:22 +00:00
Bill Fenner	8b5f8b061a	Don't hold acct_mtx over limcopy(), since it's unnecessary and limcopy() can sleep. Approved by: re	2002-11-26 18:04:12 +00:00
Sam Leffler	c8f43965d6	correct function names in KASSERT's for 2 m_tag routines Submitted by: rwatson Approved by: re	2002-11-26 17:59:16 +00:00
Robert Drehmel	d1989db545	To avoid sleeping with all sorts of resources acquired (the reported problem was a locked directory vnode), do not give the process a chance to sleep in state "stopevent" (depends on the S_EXEC bit being set in p_stops) until most resources have been released again. Approved by: re	2002-11-26 17:30:55 +00:00
John Baldwin	04f4a16448	If the file descriptors passed into do_dup() are negative, return EBADF instead of panicking. Also, perform some of the simpler sanity checks on the fds before acquiring the filedesc lock. Approved by: re Reported by: Dan Nelson <dan@emsphone.com> and others	2002-11-26 17:22:15 +00:00
Robert Watson	4d10c0ce5f	Un-staticize mac_cred_mmapped_drop_perms() so that it may be used by policy modules making use of downgrades in the MAC AST event. This is required by the mac_lomac port of LOMAC to the MAC Framework. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-26 17:11:57 +00:00
Alan Cox	2d21129db2	Acquire and release the page queues lock around pmap_remove_pages() because it updates several of vm_page's fields.	2002-11-25 04:37:44 +00:00
Alan Cox	178949e021	Hold the page queues/flags lock when calling vm_page_set_validclean(). Approved by: re	2002-11-23 19:10:31 +00:00
Maxime Henrion	b19d9defef	Under certain circumstances, we were calling kmem_free() from i386 cpu_thread_exit(). This resulted in a panic with WITNESS since we need to hold Giant to call kmem_free(), and we weren't holding it anymore in cpu_thread_exit(). We now do this from a new MD function, cpu_thread_dtor(), called by thread_dtor(). Approved by: re@ Suggested by: jhb	2002-11-22 23:57:02 +00:00
Jeff Roberson	79acfc497b	- Add the new sched_pctcpu() function to the sched_* api. - Provide a routine in sched_4bsd to add this functionality. - Use sched_pctcpu() in kern_proc, which is the one place outside of sched_4bsd where the old pctcpu value was accessed directly. Approved by: re	2002-11-21 09:30:55 +00:00
Jeff Roberson	06439a04a1	- Move scheduler specific macros and defines out of proc.h Approved by: re	2002-11-21 09:14:13 +00:00
Jeff Roberson	148302c9c9	- Move FSCALE back to kern_sync. This is not scheduler specific. - Create a new callout for lbolt and move it out of schedcpu(). This is not scheduler specific either. Approved by: re	2002-11-21 08:57:08 +00:00
Jeff Roberson	de028f5a4a	- Implement a mechanism for allowing schedulers to place scheduler dependent data in the scheduler independent structures (proc, ksegrp, kse, thread). - Implement unused stubs for this mechanism in sched_4bsd. Approved by: re Reviewed by: luigi, trb Tested on: x86, alpha	2002-11-21 01:22:38 +00:00
Robert Watson	2555374c4f	Introduce p_label, extensible security label storage for the MAC framework in struct proc. While the process label is actually stored in the struct ucred pointed to by p_ucred, there is a need for transient storage that may be used when asynchronous (deferred) updates need to be performed on the "real" label for locking reasons. Unlike other label storage, this label has no locking semantics, relying on policies to provide their own protection for the label contents, meaning that a policy leaf mutex may be used, avoiding lock order issues. This permits policies that act based on historical process behavior (such as audit policies, the MAC Framework port of LOMAC, etc) to update process properties even when many existing locks are held without violating the lock order. No currently committed policies implement use of this label storage. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-20 15:41:25 +00:00
Robert Watson	a3df768b04	Merge kld access control checks from the MAC tree: these access control checks permit policy modules to augment the system policy for permitting kld operations. This permits policies to limit access to kld operations based on credential (and other) properties, as well as to perform checks on the kld being loaded (integrity, etc). Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-19 22:12:42 +00:00
Robert Watson	293d2d2261	We leaked a process lock reference in the event an RFTHREAD process leader wasn't exiting during a fork; instead, do remember to release the lock avoiding lock order reversals and recursion panic. Reported by: "Joel M. Baldwin" <qumqats@outel.org>	2002-11-18 14:23:21 +00:00
David Xu	bfd8325073	Make sure only update wall clock at upcall time, slightly reformat code in kse_release().	2002-11-18 12:28:15 +00:00
Alfred Perlstein	ec63e12a03	During shutdown explain what the numbers following the 'syncing disks' message mean, specifically, 'buffers remaining...'.	2002-11-18 02:41:03 +00:00
David Xu	8798d4f9c8	1. Support versioning and wall clock in kse mailbox, also add rusage time in thread mailbox. 2. Minor change for thread limit code in thread_user_enter(), fix typo in kse_release() last I committed. Reviewed by: deischen, mini	2002-11-18 01:59:31 +00:00
Julian Elischer	904f1b77cc	include smp.h. it is required by some code that was commented out until david's last commit.	2002-11-17 23:26:42 +00:00
David Xu	fdc5ecd24f	1. Add sysctls to control KSE resource allocation: kern.threads.max_threads_per_proc and kern.threads.max_groups_per_proc. 2. Temporarily stop a borrower thread from stashing itself as the owner thread's spare thread in thread_exit(). There is a race between the owner thread and the borrower thread: an owner thread may allocate a spare thread like this: if (td->td_standin == NULL) td->td_standin = thread_alloc(); but thread_alloc() can block the thread, and a borrower thread could then stash itself as the owner's spare thread in thread_exit(). After the owner is resumed, the result is a thread leak in the kernel. A double check in the owner can avoid the race, but it may be ugly and not worth doing.	2002-11-17 11:47:03 +00:00
David Xu	db9b0729fc	Rework last exiting thread in kse_release(), wait a signal and then schedule an upcall and call thread_exit().	2002-11-17 10:12:00 +00:00
Jeff Roberson	a9a088823e	- Release the imgp vnode prior to freeing exec_map resources to avoid deadlock.	2002-11-17 09:33:00 +00:00
Alfred Perlstein	f51c1e897d	Rework the sysconf(3) interaction with aio: sysconf.c: Use 'break' rather than 'goto yesno' in sysconf.c so that we report a '0' return value from the kernel sysctl. vfs_aio.c: Make aio reset its configuration parameters to -1 after unloading instead of 0. posix4_mib.c: Initialize the aio configuration parameters to -1 to indicate that it is not loaded. Add a facility (p31b_iscfg()) to determine if a posix4 facility has been initialized to avoid having to re-order the SYSINITs. Use p31b_iscfg() to determine if aio has had a chance to run yet which is likely if it is compiled into the kernel and avoid spamming its values. Introduce a macro P31B_VALID() instead of doing the same comparison over and over. posix4.h: Prototype p31b_iscfg().	2002-11-17 04:15:34 +00:00
Alan Cox	4fec79bef8	Now that pmap_remove_all() is exported by our pmap implementations use it directly.	2002-11-16 07:44:25 +00:00
Alfred Perlstein	86d52125a2	Export the values for _SC_AIO_MAX and _SC_AIO_PRIO_DELTA_MAX via the p1003b sysctl interface.	2002-11-16 06:38:07 +00:00
Daniel Eischen	f3ec9000e9	Regenerate after adding system calls.	2002-11-16 06:36:56 +00:00
Daniel Eischen	2be05b70c9	Add getcontext, setcontext, and swapcontext as system calls. Previously these were libc functions but were requested to be made into system calls for atomicity and to coalesce what might be two entrances into the kernel (signal mask setting and floating point trap) into one. A few style nits and comments from bde are also included. Tested on alpha by: gallatin	2002-11-16 06:35:53 +00:00
Alfred Perlstein	c844abc920	Call 'p31b_setcfg(CTL_P1003_1B_AIO_LISTIO_MAX, AIO_LISTIO_MAX)' when AIO is initialized so that sysconf() gives correct results. Reported by: Craig Rodrigues <rodrigc@attbi.com>	2002-11-16 04:22:55 +00:00
Alfred Perlstein	b565fb9e6f	headers should not really include "opt_foo.h" (in this case opt_posix.h). remove it from the header and add it to the files that require it.	2002-11-15 22:55:06 +00:00
David Xu	1d2c5bd519	Return EWOULDBLOCK for last thread in kse_release(). Requested by: archie	2002-11-15 00:53:59 +00:00
Thomas Moestl	01ee43955c	Make the msg_size, msg_bufx and msg_bufr members of struct msgbuf signed, since they describe a ring buffer and signed arithmetic is performed on them. This avoids some evilish casts. Since this changes all but two members of this structure, style(9) those remaining ones, too. Requested by: bde Reviewed by: bde (earlier version)	2002-11-14 16:11:12 +00:00
David Xu	ca161eb6e9	In kse_release(), check if current thread is bound and current kse mailbox was already initialized, also prevent last thread from exiting unless we figure out how to safely support null thread proc.	2002-11-14 06:06:45 +00:00
Robert Watson	a96acd1ace	Introduce a condition variable to avoid returning EBUSY when the MAC policy list is busy during a load or unload attempt. We assert no locks held during the cv wait, meaning we should be fairly deadlock-safe. Because of the cv model and busy count, it's possible for a cv waiter waiting for exclusive access to the policy list to be starved by active and long-lived access control/labeling events. For now, we accept that as a necessary tradeoff. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-13 15:47:09 +00:00
Maxime Henrion	2bb95458bd	Add support for the C99 %t format modifier.	2002-11-13 15:15:59 +00:00
Robert Watson	63b6f478ec	Garbage collect mac_create_devfs_vnode() -- it hasn't been used since we brought in the new cache and locking model for vnode labels. We now rely on mac_associate_devfs_vnode(). Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-12 04:20:36 +00:00
John Baldwin	d2b28e078a	Correct an assertion in the code to traverse the list of locks to find an earlier acquired lock with the same witness as the lock currently being acquired. If we had released several earlier acquired locks after acquiring enough locks to require another lock_list_entry bucket in the lock list, then subsequent lock_list_entry buckets could contain only one lock instance in which case i would be zero. Reported by: Joel M. Baldwin <qumqats@outel.org>	2002-11-11 16:36:20 +00:00
Robert Watson	2d43d24ed4	Garbage collect definition of M_MACOPVEC -- we no longer perform a dynamic mapping of an operation vector into an operation structure, rather, we rely on C99 sparse structure initialization. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-11 14:15:58 +00:00
Alan Cox	d154fb4fe6	When prot is VM_PROT_NONE, call pmap_page_protect() directly rather than indirectly through vm_page_protect(). The one remaining page flag that is updated by vm_page_protect() is already being updated by our various pmap implementations. Note: A later commit will similarly change the VM_PROT_READ case and eliminate vm_page_protect().	2002-11-10 07:12:04 +00:00
Alfred Perlstein	29f194457c	Fix instances of macros with improperly parenthesized arguments. Verified by: md5	2002-11-09 12:55:07 +00:00
Robert Watson	6d7bdc8def	Assign value of NULL to imgp->execlabel when imgp is initialized in the ELF code. Missed in earlier merge from the MAC tree. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-08 20:49:50 +00:00
Robert Watson	52378b8acd	To reduce per-return overhead of userret(), call into mac_thread_userret() only if PS_MACPEND is set in the process AST mask. This avoids the cost of the entry point in the common case, but requires policies interested in the userret event to set the flag (protected by the scheduler lock) if they do want the event. Since all the policies that we're working with which use mac_thread_userret() use the entry point only selectively to perform operations deferred for locking reasons, this maintains the desired semantics. Approved by: re Requested by: bde Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-08 19:00:17 +00:00
Robert Watson	9fa3506ecd	Add an explicit execlabel argument to exec-related MAC policy entry points, rather than relying on policies to grub around in the image activator instance structure. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-08 18:04:00 +00:00
Thomas Moestl	0fca57b8b8	Move the definitions of the hw.physmem, hw.usermem and hw.availpages sysctls to MI code; this reduces code duplication and makes all of them available on sparc64, and the latter two on powerpc. The semantics by the i386 and pc98 hw.availpages is slightly changed: previously, holes between ranges of available pages would be included, while they are excluded now. The new behaviour should be more correct and brings i386 in line with the other architectures. Move physmem to vm/vm_init.c, where this variable is used in MI code.	2002-11-07 23:57:17 +00:00
John Baldwin	6274bdda4c	- Use %j to print intmax_t values. - Cast more daddr_t values to intmax_t when printing to quiet warnings.	2002-11-07 22:41:08 +00:00
John Baldwin	d0e938f4f1	Use %z to quiet a warning.	2002-11-07 22:38:04 +00:00
Maxime Henrion	a7a00d0546	- Fix a bunch of casts to long which were truncating off_t's. - Remove the comments which were justifying this by the fact that we don't have %q in the kernel, this was probably right back in time, but we now have %q, and we even have better to print those types (%j).	2002-11-07 21:56:05 +00:00
Maxime Henrion	b65d1ba9dd	- Use a better definition for MNAMELEN which doesn't require to have one #ifdef per architecture. - Change a space to a tab after a nearby #define. Obtained from: bde	2002-11-07 21:15:02 +00:00
Robert Watson	f8f750c53e	Do a bit more work in the aio code to simulate the credential environment of the original AIO request: save and restore the active thread credential as well as using the file credential, since MAC (and some other bits of the system) rely on the thread credential instead of/as well as the file credential. In brief: cache td->td_ucred when the AIO operation is queued, temporarily set and restore the kernel thread credential, and release the credential when done. Similar to ktrace credential management. Reviewed by: alc Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-07 20:46:37 +00:00
Kelly Yancey	04ac9b97b5	Spotted a couple of places where the socket buffer's counters were being manipulated directly (rather than using sballoc()/sbfree()); update them to tweak the new sb_ctl field too. Sponsored by: NTT Multimedia Communications Labs	2002-11-05 18:52:25 +00:00
Kelly Yancey	247a32f22a	Fix filt_soread() to properly flag a kevent when a 0-byte datagram is received. Verified by: dougb, Manfred Antar <null@pozo.com> Sponsored by: NTT Multimedia Communications Labs	2002-11-05 18:48:46 +00:00
Robert Watson	0c93266b9c	Correct merge-o: disable the right execve() variation if !MAC	2002-11-05 18:04:50 +00:00
Robert Watson	670cb89bf4	Bring in two sets of changes: (1) Permit userland applications to request a change of label atomic with an execve() via mac_execve(). This is required for the SEBSD port of SELinux/FLASK. Attempts to invoke this without MAC compiled in result in ENOSYS, as with all other MAC system calls. Complexity, if desired, is present in policy modules, rather than the framework. (2) Permit policies to have access to both the label of the vnode being executed as well as the interpreter if it's a shell script or related UNIX nonsense. Because we can't hold both vnode locks at the same time, cache the interpreter label. SEBSD relies on this because it supports secure transitioning via shell script executables. Other policies might want to take both labels into account during an integrity or confidentiality decision at execve()-time. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-05 17:51:56 +00:00
Robert Watson	051c41caf1	Regen.	2002-11-05 17:48:04 +00:00
Robert Watson	21bb9ea225	Flesh out the definition of __mac_execve(): per earlier discussion, it's essentially execve() with an optional MAC label argument. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-05 17:47:08 +00:00
Robert Watson	4443e9ff4a	Assert that appropriate vnodes are locked in mac_execve_will_transition(). Allow transitioning to be twiddled off using the process and fs enforcement flags, although at some point this should probably be its own flag. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-05 15:11:33 +00:00
Robert Watson	ccafe7eb35	Hook up the mac_will_execve_transition() and mac_execve_transition() entrypoints, #ifdef MAC. The supporting logic already existed in kern_mac.c, so no change there. This permits MAC policies to cause a process label change as the result of executing a binary -- typically, as a result of executing a specially labeled binary. For example, the SEBSD port of SELinux/FLASK uses this functionality to implement TE type transitions on processes using transitioning binaries, in a manner similar to setuid. Policies not implementing a notion of transition (all the ones in the tree right now) require no changes, since the old label data is copied to the new label via mac_create_cred() even if a transition does occur. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-05 14:57:49 +00:00
Giorgos Keramidas	5f9ae8e026	Typo in comment: commmand -> command Reviewed by: jhb	2002-11-05 14:54:07 +00:00
Robert Watson	450ffb4427	Remove reference to struct execve_args from struct imgact, which describes an image activation instance. Instead, make use of the existing fname structure entry, and introduce two new entries, userspace_argv, and userspace_envv. With the addition of mac_execve(), this divorces the image structure from the specifics of the execve() system call, removes a redundant pointer, etc. No semantic change from current behavior, but it means that the structure doesn't depend on syscalls.master-generated includes. There seems to be some redundant initialization of imgact entries, which I have maintained, but which could probably use some cleaning up at some point. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-05 01:59:56 +00:00
Robert Watson	e5e820fd1f	Permit MAC policies to instrument the access control decisions for system accounting configuration and for nfsd server thread attach. Policies might use this to protect the integrity or confidentiality of accounting data, limit the ability to turn on or off accounting, as well as to prevent inappropriately labeled threads from becoming nfs server threads. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-04 15:13:36 +00:00
Robert Watson	3da87a65c7	Remove mac_cache_fslabel_in_vnode sysctl -- with the new VFS/MAC construction, labels are always cached. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-04 14:55:14 +00:00
Robert Watson	6201265be7	License clarification and wording changes: NAI has approved removal of clause three, and NAI Labs now goes by the name Network Associates Laboratories.	2002-11-04 01:42:39 +00:00
Robert Watson	4b8d5f2d97	Introduce mac_check_system_settime(), a MAC check allowing policies to augment the system policy for changing the system time. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-03 02:39:42 +00:00
Robert Watson	01ce3b5661	Regen from yesterday's system call placeholder rename.	2002-11-02 23:54:36 +00:00
Alan Cox	151113a946	Catch up with the removal of the vm page buckets spin mutex.	2002-11-02 22:42:18 +00:00
Alan Cox	5ee0a409fc	Revert the change in revision 1.77 of kern/uipc_socket2.c. It is causing a panic because the socket's state isn't as expected by sofree(). Discussed with: dillon, fenner	2002-11-02 05:14:31 +00:00
Kelly Yancey	47baac87a6	Update the st_size reported via stat(2) to accurately reflect the amount of data available to read for non-TCP sockets. Reviewed by: -net, -arch Sponsored by: NTT Multimedia Communications Labs MFC after: 2 weeks	2002-11-01 21:31:13 +00:00
Kelly Yancey	e0f640e82d	Track the number of non-data characters stored in socket buffers so that the data value returned by kevent()'s EVFILT_READ filter on non-TCP sockets accurately reflects the amount of data that can be read from the sockets by applications. PR: 30634 Reviewed by: -net, -arch Sponsored by: NTT Multimedia Communications Labs MFC after: 2 weeks	2002-11-01 21:27:59 +00:00
Robert Watson	6cedb451fb	Rename __execve_mac() to __mac_execve() for increased consistency with other MAC system calls. Requested by: various (phk, gordont, jake, ...)	2002-11-01 21:00:02 +00:00
Robert Watson	e686e5ae91	Add MAC checks for various kenv() operations: dump, get, set, unset, permitting MAC policies to limit access to the kernel environment. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-11-01 20:46:53 +00:00
Poul-Henning Kamp	1fb14a47a1	Introduce malloc_last_fail() which returns the number of seconds since malloc(9) failed last time. This is intended to help code adjust memory usage to the current circumstances. A typical use could be: if (malloc_last_fail() < 60) reduce_cache_by_one();	2002-11-01 18:58:12 +00:00
Poul-Henning Kamp	38b0884cc3	Introduce a "time_uptime" global variable which holds the time since boot in seconds.	2002-11-01 18:52:20 +00:00
David Xu	adac9400a7	KSE-enabled processes only.	2002-10-31 08:00:51 +00:00
Robert Watson	5c8dd34218	Move to C99 sparse structure initialization for the mac_policy_ops structure definition, rather than using an operation vector we translate into the structure. Originally, we used a vector for two reasons: (1) We wanted to define the structure sparsely, which wasn't supported by the C compiler for structures. For a policy with five entry points, you don't want to have to stick in a few hundred NULL function pointers. (2) We thought it would improve ABI compatibility allowing modules to work with kernels that had a superset of the entry points defined in the module, even if the kernel had changed its entry point set. Both of these no longer apply: (1) C99 gives us a way to sparsely define a static structure. (2) The ABI problems existed anyway, due to enumeration numbers, argument changes, and semantic mismatches. Since the going rule for FreeBSD is that you really need your modules to pretty closely match your kernel, it's not worth the complexity. This submit eliminates the operation vector, dynamic allocation of the operation structure, copying of the vector to the structure, and redoes the vectors in each policy to direct structure definitions. One enormous benefit of this change is that we now get decent type checking on policy entry point implementation arguments. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-30 18:48:51 +00:00
Robert Watson	b914de36c0	While 'mode_t' seemed like a good idea for the access mode argument for MAC access() and open() checks, the argument actually has an int type where it becomes available. Switch to using 'int' for the mode argument throughout the MAC Framework and policy modules. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-30 17:56:57 +00:00
David Xu	8db2431f61	Check NULL thread mailbox pointer.	2002-10-30 05:09:29 +00:00
David Xu	7b290dd008	Style fixes.	2002-10-30 03:01:28 +00:00
David Xu	37fcb8bcc8	Don't forget to set syscall result.	2002-10-30 02:39:10 +00:00
David Xu	34e80e027d	Add an actual implementation of kse_thr_interrupt()	2002-10-30 02:28:41 +00:00
Robert Watson	9a1b076af2	Minor comment typo fix. Submitted by: Wayne Morrison <tewok@tislabs.com>	2002-10-29 20:51:44 +00:00
David Malone	6bd34a1e6f	The syscall names are string constants, so make them consts.	2002-10-29 15:47:06 +00:00
Robert Watson	6151efaa54	Trim extraneous #else and #endif MAC comments per style(9).	2002-10-28 21:17:53 +00:00
Robert Watson	8b3a843438	An inappropriate ASSERT slipped in during the recent merge of the reboot checking; remove.	2002-10-28 18:53:53 +00:00
David Xu	72465621ff	Close a race window in kse_create(): signal delivered after SIGPENDING call but before we call kse_link().	2002-10-28 07:37:06 +00:00
Ian Dowse	4e08ccb2ff	Fix a case in kern_rename() where a vn_finished_write() call was missed. This bug has been present since the vn_start_write() and vn_finished_write() calls were first added in revision 1.159. When the case is triggered, any attempts to create snapshots on the filesystem will deadlock and also prevent further write activity on that filesystem.	2002-10-27 23:23:51 +00:00
Garrett Wollman	c7047e5204	Change the way support for asynchronous I/O is indicated to applications to conform to 1003.1-2001. Make it possible for applications to actually tell whether or not asynchronous I/O is supported. Since FreeBSD's aio implementation works on all descriptor types, don't call down into file or vnode ops when [f]pathconf() is asked about _PC_ASYNC_IO; this avoids the need for every file and vnode op to know about it.	2002-10-27 18:07:41 +00:00
Robert Watson	9e913ebd0a	Centrally manage enforcement of {reboot,swapon,sysctl} using the mac_enforce_system toggle, rather than several separate toggles. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-27 15:50:49 +00:00
Robert Watson	d3fc69ee6a	Implement mac_check_system_sysctl(), a MAC Framework entry point to permit MAC policies to augment the security protections on sysctl() operations. This is not really a wonderful entry point, as we only have access to the MIB of the target sysctl entry, rather than the more useful entry name, but this is sufficient for policies like Biba that wish to use their notions of privilege or integrity to prevent inappropriate sysctl modification. Affects MAC kernels only. Since SYSCTL_LOCK isn't in sysctl.h, just kern_sysctl.c, we can't assert the SYSCTL subsystem lock in the MAC Framework. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-27 07:12:34 +00:00
Robert Watson	a2ecb9b790	Hook up mac_check_system_reboot(), a MAC Framework entry point that permits MAC modules to augment system security decisions regarding the reboot() system call, if MAC is compiled into the kernel. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-27 07:03:29 +00:00
Robert Watson	03ce2c0c9b	Merge from MAC tree: rename mac_check_vnode_swapon() to mac_check_system_swapon(), to reflect the fact that the primary object of this change is the running kernel as a whole, rather than just the vnode. We'll drop additional checks of this class into the same check namespace, including reboot(), sysctl(), et al. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-27 06:54:06 +00:00
Maxime Henrion	5b8ee62bc2	Fix a style nit.	2002-10-26 18:19:46 +00:00
Robert Watson	763bbd2f4f	Slightly change the semantics of vnode labels for MAC: rather than "refreshing" the label on the vnode before use, just get the label right from inception. For single-label file systems, set the label in the generic VFS getnewvnode() code; for multi-label file systems, leave the labeling up to the file system. With UFS1/2, this means reading the extended attribute during vfs_vget() as the inode is pulled off disk, rather than hitting the extended attributes frequently during operations later, improving performance. This also corrects semantics for shared vnode locks, which were not previously present in the system. This changes the cache coherency properties WRT out-of-band access to label data, but in an acceptable form. With UFS1, there is a small race condition during automatic extended attribute start -- this is not present with UFS2, and occurs because EAs aren't available at vnode inception. We'll introduce a work around for this shortly. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-26 14:38:24 +00:00
Julian Elischer	053effc60e	Back out David's last commit. The suspension code needs to be called for non-KSE processes too.	2002-10-26 04:44:17 +00:00
David Xu	3139ada54c	Move suspension checking code from userret() into thread_userret().	2002-10-26 02:56:51 +00:00
David Xu	56a6a23ea6	Backout revision 1.48.	2002-10-26 01:26:36 +00:00
Robert Watson	a67fe518a1	Comment describing the semantics of mac_late. Trim trailing whitespace. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-25 20:45:27 +00:00
Maxime Henrion	4578a2e652	- Rename the DDB specific %z printf format to %y. - Make DDB use %y instead of %z. - Teach GCC about %y. - Implement support for the C99 %z format modifier. Approved by: re@ Reviewed by: peter Tested on: i386, sparc64	2002-10-25 19:41:32 +00:00
Peter Wemm	23eeeff7be	Split 4.x and 5.x signal handling so that we can keep 4.x signal handling clean and functional as 5.x evolves. This allows some of the nasty bandaids in the 5.x codepaths to be unwound. Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an anti-foot-shooting measure in place, 5.x folks need this for a while) and finish encapsulating the older stuff under COMPAT_43. Since the ancient stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *' to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn is supposed to take), add a compile time check to prevent foot shooting there too. Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc. Tested on: i386, alpha, ia64. Compiled on sparc64 (a few days ago). Approved by: re	2002-10-25 19:10:58 +00:00
Poul-Henning Kamp	df6b615a42	#include <geom/geom.h> to get proper prototypes. Contrary to my fears we seem to have all the prerequisites already. Call g_waitidle() as the first thing in vfs_mountroot() so that we have it out of the way before we even decide if we should call .._ask() or .._try(). Call the g_dev_print() function to provide better guidance for the root-mount prompt.	2002-10-25 18:44:42 +00:00
David Xu	ddc4f28155	suspend thread only when it can be interrupted.	2002-10-25 13:12:36 +00:00
David Xu	0cf609706f	let thread_schedule_upcall() handle idle kse.	2002-10-25 12:50:31 +00:00
Poul-Henning Kamp	fa669ab7b8	Disable the kernacc() check in mtx_validate() until such time that kernacc does not require Giant. This means that we may miss panics on a class of mutex programming bugs, but only if running with a Chernobyl setting of debug-flags. Spotted by: Pete Carah <pete@ns.altadena.net>	2002-10-25 08:40:20 +00:00
Poul-Henning Kamp	0d6dc414b4	In vrele() we can actually have a VCHR with v_rdev == NULL if we came from the bottom of addaliasu(). Don't panic.	2002-10-25 07:58:25 +00:00
Julian Elischer	de4723f6e8	fix style-o	2002-10-25 07:17:07 +00:00
Julian Elischer	9d10277721	More work on the interaction between suspending and sleeping threads. Also clean up some code used with 'single-threading'. Reviewed by: davidxu	2002-10-25 07:11:12 +00:00
Kirk McKusick	9ab73fd11a	Within ufs, the ffs_sync and ffs_fsync functions did not always check for and/or report I/O errors. The result is that a VFS_SYNC or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in the presence of a hard error writing a disk sector or in a filesystem full condition. This patch ensures that I/O errors will always be checked and returned. This patch also ensures that every call to VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes appropriate action when an error is returned. Sponsored by: DARPA & NAI Labs.	2002-10-25 00:20:37 +00:00
David Xu	4c40dcd4d7	fix typo.	2002-10-25 00:13:46 +00:00
Julian Elischer	1434d3fe6f	Extract out KSE specific code from machine specific code so that there is only one copy of it. Fix that one copy so that KSEs with no mailbox in a KSE program are not a cause of page faults (this can legitimately happen). Submitted by: (parts) davidxu	2002-10-24 23:09:48 +00:00
Poul-Henning Kamp	a2fb4feded	Fix the spechash lock order reversal by keeping an updated sum of v_usecount in the dev_t which vcount() can return without locking any vnodes. Seen by: jhb	2002-10-24 19:38:56 +00:00
Poul-Henning Kamp	7c0c26b4c4	Make sure GEOM has stopped rattling the disks before we try to mount the root filesystem, this may be implicated in the PC98 issue.	2002-10-24 19:26:08 +00:00
Poul-Henning Kamp	1f59664b68	Don't try to be cute and save a call/return by implementing a degenerate vrele() inline.	2002-10-24 17:55:49 +00:00
David Xu	33862f40b0	respect TDF_SINTR, also for SINGLE_NO_EXIT threading mode, if a thread was already suspended, do nothing.	2002-10-24 14:43:48 +00:00
David Xu	9991db0cb5	don't forget to remove kse from idle queue.	2002-10-24 09:16:46 +00:00
Julian Elischer	5c8329ed6c	Move thread related code from kern_proc.c to kern_thread.c. Add code to free KSEs and KSEGRPs on exit. Sort KSE prototypes in proc.h. Add the missing kse_exit() syscall. ksetest now does not leak KSEs and KSEGRPS. Submitted by: (parts) davidxu	2002-10-24 08:46:34 +00:00
Dag-Erling Smørgrav	f2c1ea8152	Whitespace cleanup.	2002-10-23 10:26:54 +00:00
Alexander Kabaev	96725dd01a	Handle binaries with an arbitrary number of PT_LOAD sections, not only ones with one text and one data section. The text and data rlimit checks still need to be fixed to properly account for additional sections. Reviewed by: peter (slightly different patch version)	2002-10-23 01:57:39 +00:00
John Baldwin	12f65109c8	Don't dereference the 'x' pointer if it is NULL, instead skip the assignment. The netsmb code likes to call these functions with a NULL x argument a lot. Reported by: Vallo Kallaste <kalts@estpak.ee>	2002-10-22 18:44:59 +00:00
Robert Drehmel	d08926b1f6	Change the `mutex_prof' structure to use three variables contained in an anonymous structure as counters, instead of an array with preprocessor-defined names for indices. Remove the associated XXX- comment.	2002-10-22 16:06:28 +00:00
Robert Watson	1cbfd977fd	Introduce MAC_CHECK_VNODE_SWAPON, which permits MAC policies to perform authorization checks during swapon() events; policies might choose to enforce protections based on the credential requesting the swap configuration, the target of the swap operation, or other factors such as internal policy state. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-22 15:53:43 +00:00
Robert Watson	2789e47e2c	Missed in previous merge: export sizeof(struct oldmac) rather than sizeof(struct mac). Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-22 15:33:33 +00:00
Robert Watson	f7b951a8e0	Support the new MAC user API in kernel: modify existing system calls to use a modified notion of 'struct mac', and flesh out the new variation system calls (almost identical to existing ones except that they permit a pid to be specified for process label retrieval, and don't follow symlinks). This generalizes the label API so that the framework is now almost entirely policy-agnostic. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-22 14:29:47 +00:00
Robert Watson	5cb559a5e0	Regen.	2002-10-22 14:23:52 +00:00
Robert Watson	aad1cdc852	Flesh out prototypes for __mac_get_pid, __mac_get_link, and __mac_set_link, based on __mac_get_proc() except with a pid, and __mac_get_file(), __mac_set_file() except that they do not follow symlinks. First in a series of commits to flesh out the user API. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-22 14:22:24 +00:00
David Xu	81fd489272	detect idle kse correctly.	2002-10-22 02:27:19 +00:00
Kirk McKusick	9e4b381a54	This update removes a race between unmount and lookup. The lookup locks the mount point directory while waiting for vfs_busy to clear. Meanwhile the unmount which holds the vfs_busy lock tried to lock the mount point vnode. The fix is to observe that it is safe for the unmount to remove the vnode from the mount point without locking it. The lookup will wait for the unmount to complete, then recheck the mount point when the vfs_busy lock clears. Sponsored by: DARPA & NAI Labs.	2002-10-22 01:06:44 +00:00
Kirk McKusick	e03486d198	This checkin reimplements the io-request priority hack in a way that works in the new threaded kernel. It was commented out of the disksort routine earlier this year for the reasons given in kern/subr_disklabel.c (which is where this code used to reside before it moved to kern/subr_disk.c): ---------------------------- revision 1.65 date: 2002/04/22 06:53:20; author: phk; state: Exp; lines: +5 -0 Comment out Kirks io-request priority hack until we can do this in a civilized way which doesn't cause grief. The problem is that it is not generally safe to cast a "struct bio *" to a "struct buf *". Things like ccd, vinum, ata-raid and GEOM constructs bio's which are not entrails of a struct buf. Also, curthread may or may not have anything to do with the I/O request at hand. The correct solution can either be to tag struct bio's with a priority derived from the requesting threads nice and have disksort act on this field, this wouldn't address the "silly-seek syndrome" where two equal processes bang the diskheads from one edge to the other of the disk repeatedly. Alternatively, and probably better: a sleep should be introduced either at the time the I/O is requested or at the time it is completed where we can be sure to sleep in the right thread. The sleep also needs to be in constant timeunits, 1/hz can be practically any sub-second size, at high HZ the current code practically doesn't do anything. ---------------------------- As suggested in this comment, it is no longer located in the disk sort routine, but rather now resides in spec_strategy where the disk operations are being queued by the thread that is associated with the process that is really requesting the I/O. At that point, the disk queues are not visible, so the I/O for positively niced processes is always slowed down whether or not there is other activity on the disk. On the issue of scaling HZ, I believe that the current scheme is better than using a fixed quantum of time. As machines and I/O subsystems get faster, the resolution on the clock also rises. So, ten years from now we will be slowing things down for shorter periods of time, but the proportional effect on the system will be about the same as it is today. So, I view this as a feature rather than a drawback. Hence this patch sticks with using HZ. Sponsored by: DARPA & NAI Labs. Reviewed by: Poul-Henning Kamp <phk@critter.freebsd.dk>	2002-10-22 00:59:49 +00:00
Poul-Henning Kamp	c177d125bf	GEOM does not (and shall not) propagate flags like D_MEMDISK, so we will revert to checking the name to determine if our root device is a ramdisk, md(4) specifically to determine if we should attempt the root-mount RW Sponsored by: DARPA & NAI Labs.	2002-10-21 20:09:59 +00:00
Dag-Erling Smørgrav	6d0369001a	Reduce the overhead of the mutex statistics gathering code, try to produce shorter lines in the report, and clean up some minor style issues.	2002-10-21 18:48:28 +00:00
Olivier Houchard	e3bf3aea25	One #include <sys/sysctl.h> should be enough. Approved by: mux (mentor)	2002-10-21 18:40:40 +00:00
Brooks Davis	29e1b85f97	Use if_printf(ifp, "blah") instead of printf("%s%d: blah", ifp->if_name, ifp->if_xname).	2002-10-21 02:51:56 +00:00
Thomas Moestl	5775150869	Fix the calculations of the length of the unread message buffer contents. The code was subtracting two unsigned ints, stored the result in a long and expected it to be the same as of a signed subtraction; this does only work on platforms where int and long have the same size (due to overflows). Instead, cast to long before the subtraction; the numbers are guaranteed to be small enough so that there will be no overflows because of that.	2002-10-20 23:13:05 +00:00
Poul-Henning Kamp	962414a120	We have memset() and memcpy() in the kernel now, so we don't need to #define them to bzero and bcopy. Spotted by: FlexeLint	2002-10-20 22:33:42 +00:00
Julian Elischer	2f030624b1	Add an actual implementation of kse_wakeup() Submitted by: Davidxu	2002-10-20 21:08:47 +00:00
Thomas Moestl	e381d2455b	Add kernel dump support, based on the ia64 version (which was committed as sparc64/sparc64/dump_machdep.c a while back). Other than ia64 (which uses ELF), sparc64 uses a homegrown format for the dumps (headers are required because the physical address and size of the tsb must be noted, and because physical memory may be discontiguous); ELF would not offer any advantages here. Reviewed by: jake	2002-10-20 17:03:15 +00:00
Poul-Henning Kamp	ab33958276	#unifdef the code for checking blessed lock collisions until we need it. Spotted by: DARPA & NAI Labs.	2002-10-20 08:48:39 +00:00
Robert Watson	a13c67da35	If MAC_MAX_POLICIES isn't defined, don't try to define it, just let the compile fail. MAC_MAX_POLICIES should always be defined, or we have bigger problems at hand. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-20 03:41:09 +00:00
Peter Wemm	8556393bb2	Stake a claim on 418 (__xstat), 419 (__xfstat), 420 (__xlstat)	2002-10-19 22:25:31 +00:00
Peter Wemm	c8447553b5	Grab 416/417 real estate before I get burned while testing again. This is for the not-quite-ready signal/fpu abi stuff. It may not see the light of day, but I'm certainly not going to be able to validate it when getting shot in the foot due to syscall number conflicts.	2002-10-19 22:09:23 +00:00
Robert Watson	b614dd131a	Add a new 'NOMACCHECK' flag to namei() NDINIT flags, which permits the caller to indicate that MAC checks are not required for the lookup. Similar to IO_NOMACCHECK for vn_rdwr(), this indicates that the caller has already performed all required protections and that this is an internally generated operation. This will be used by the NFS server code, as we don't currently enforce MAC protections against requests delivered via NFS. While here, add NOCROSSMOUNT to PARAMASK; apparently this was used at one point for name lookup flag checking, but isn't any longer or it would have triggered from the NFS server code passing it to indicate that mountpoints shouldn't be crossed in lookups. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-19 21:25:51 +00:00
Robert Watson	3ab93f0958	Regen from addition of execve_mac placeholder.	2002-10-19 21:15:10 +00:00
Robert Watson	bc5245d94c	Add a placeholder for the execve_mac() system call, similar to SELinux's execve_secure() system call, which permits a process to pass in a label for a label change during exec. This permits SELinux to change the label for the resulting exec without a race following a manual label change on the process. Because this interface uses our general purpose MAC label abstraction, we call it execve_mac(), and wrap our port of SELinux's execve_secure() around it with appropriate sid mappings. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-19 21:06:57 +00:00
Robert Watson	89c61753a0	Drop in the MAC check for file creation as part of open(). Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-19 20:56:44 +00:00
Robert Watson	9aeffb2b28	Make sure to clear the 'registered' flag for MAC policies when they unregister. Under some obscure (perhaps demented) circumstances, this can result in a panic if a policy is unregistered, and then someone foolishly unregisters it again. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-19 20:30:12 +00:00
Robert Watson	7587203c2f	Hook up most of the MAC entry points relating to file/directory/node creation, deletion, and rename. There are one or two other stray cases I'll catch in follow-up commits (such as unix domain socket creation); this permits MAC policy modules to limit the ability to perform these operations based on existing UNIX credential / vnode attributes, extended attributes, and security labels. In the rename case using MAC, we now have to lock the from directory and file vnodes for the MAC check, but this is done only in the MAC case, and the locks are immediately released so that the remainder of the rename implementation remains the same. Because the create check takes a vattr to know object type information, we now initialize additional fields in the VATTR passed to VOP_SYMLINK() in the MAC case. Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-19 20:25:57 +00:00
Marcel Moolenaar	1aeb23cdfa	Add two hooks to signal module load and module unload to MD code. The primary reason for this is to allow MD code to process machine specific attributes, segments or sections in the ELF file and update machine specific state accordingly. An immediate use of this is in the ia64 port where unwind information is updated to allow debugging and tracing in/across modules. Note that this commit does not add the functionality to the ia64 port. See revision 1.9 of ia64/ia64/elf_machdep.c. Validated on: alpha, i386, ia64	2002-10-19 19:16:03 +00:00
Marcel Moolenaar	c143d6c24a	Reduce code duplication by moving the common actions in link_elf_init(), link_elf_link_preload_finish() and link_elf_load_file() to link_elf_link_common_finish(). Since link_elf_init() did initializations as a side-effect of doing the common actions, keep the initialization in that function. Consequently, link_elf_add_gdb() is now also called to insert the very first link_map() (ie the kernel).	2002-10-19 18:59:33 +00:00
Marcel Moolenaar	1720979bc5	Non-functional change in preparation of the next commit: Move link_elf_add_gdb(), link_elf_delete_gdb() and link_elf_error() near the top of the file. The *_gdb() functions are moved inside the #ifdef DDB already present there.	2002-10-19 18:43:37 +00:00
Marcel Moolenaar	f5b07e11ad	In link_elf_load_file(), when SPARSE_MAPPING is defined and we cannot allocate ef->object, we freed ef before bailing out with an error. This is wrong because ef=lf and when we have an error and lf is non-NULL (which holds if we try to alloc ef->object), we free lf and thus ef as part of the bailing-out.	2002-10-19 05:01:54 +00:00
Alfred Perlstein	871de19fab	Don't leak memory in semop(2). (Fix a bug I introduced in rev 1.55.) Detective work by: jake	2002-10-19 02:07:35 +00:00
John Baldwin	6222047300	Do not lock the process when calling fdfree() (this would have recursed on a non-recursive lock, the proc lock, before) since we don't need it to change p_fd.	2002-10-18 17:45:41 +00:00
John Baldwin	6d345e2a45	fdfree() clears p_fd for us, no need to do it again.	2002-10-18 17:44:39 +00:00
John Baldwin	4562d72638	Don't lock the proc lock to clear p_fd. p_fd isn't protected by the proc lock.	2002-10-18 17:42:28 +00:00
Kirk McKusick	3a096f6c09	Have lockinit() initialize the debugging fields of a lock when DEBUG_LOCKS is defined. Sponsored by: DARPA & NAI Labs.	2002-10-18 01:34:10 +00:00
Kirk McKusick	bc7bdd50c1	When the number of dirty buffers rises too high, the buf_daemon runs to help clean up. After selecting a potential buffer to write, this patch has it acquire a lock on the vnode that owns the buffer before trying to write it. The vnode lock is necessary to avoid a race with some other process holding the vnode locked and trying to flush its dirty buffers. In particular, if the vnode in question is a snapshot file, then the race can lead to a deadlock. To avoid slowing down the buf_daemon, it does a non-blocking lock request when trying to lock the vnode. If it fails to get the lock it skips over the buffer and continues down its queue looking for buffers to flush. Sponsored by: DARPA & NAI Labs.	2002-10-18 01:29:59 +00:00
Maxim Sobolev	2e307eb8c9	Separate fields reported by disk_err() with spaces, so that output doesn't look cryptic. MFC after: 1 week	2002-10-17 23:48:29 +00:00
Robert Drehmel	bb8992b32c	Instead of (sizeof(source_buffer) - 1) bytes, copy at most (sizeof(destination_buffer) - 1) bytes into the destination buffer. This was not harmful because they currently both provide space for (MAXCOMLEN + 1) bytes.	2002-10-17 21:02:02 +00:00
Robert Drehmel	e80fb43467	Use strlcpy() instead of strncpy() to copy NUL terminated strings for safety and consistency.	2002-10-17 20:03:38 +00:00
Sam Leffler	3b132a615f	fix kldload error return when a module is rejected because it's statically linked in the kernel. When this condition is detected deep in the linker internals the EEXIST error code that's returned is stomped on and instead an ENOEXEC code is returned. This makes apps like sysinstall bitch.	2002-10-17 17:28:57 +00:00
Robert Drehmel	55c8556834	- Allocate only enough space for a temporary buffer to hold the path including the terminating NUL character from `struct sockaddr_un' rather than SOCK_MAXADDRLEN bytes. - Use strlcpy() instead of strncpy() to copy strings.	2002-10-17 15:52:42 +00:00
Bosko Milekic	a91db09ec0	Fix a fairly subtle bug in mbuf_init() where the reference counter contiguous space was being allocated from the clust_map instead of the mbuf_map as the comments indicated. This resulted in some address space wastage in mbuf_map. Submitted by: Rohit Jalan <rohjal@yahoo.co.in>	2002-10-16 19:59:08 +00:00
John Baldwin	5c0cc63c40	Add a missing PROC_UNLOCK in ptrace() for the PT_IO case. PR: kern/44065 Submitted by: Mark Kettenis <kettenis@chello.nl>	2002-10-16 16:28:33 +00:00
John Baldwin	bf3e55aa2c	Many style and whitespace fixes. Submitted by: bde (mostly)	2002-10-16 15:45:37 +00:00
John Baldwin	18d9bd8f65	Sort includes a bit. Submitted by: bde	2002-10-16 15:14:31 +00:00
Poul-Henning Kamp	c3053131ca	Be consistent about functions being static. Spotted by: FlexeLint	2002-10-16 10:42:13 +00:00
Sam Leffler	5d84645305	Replace aux mbufs with packet tags: o instead of a list of mbufs use a list of m_tag structures a la openbsd o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit ABI/module number cookie o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and use this in defining openbsd-compatible m_tag_find and m_tag_get routines o rewrite KAME use of aux mbufs in terms of packet tags o eliminate the most heavily used aux mbufs by adding an additional struct inpcb parameter to ip_output and ip6_output to allow the IPsec code to locate the security policy to apply to outbound packets o bump __FreeBSD_version so code can be conditionalized o fixup ipfilter's call to ip_output based on __FreeBSD_version Reviewed by: julian, luigi (silent), -arch, -net, darren Approved by: julian, silence from everyone else Obtained from: openbsd (mostly) MFC after: 1 month	2002-10-16 01:54:46 +00:00
Poul-Henning Kamp	7c61d7858c	Plug a memory-leak. "I think you're right" by: jake	2002-10-15 18:58:38 +00:00
Poul-Henning Kamp	9736c8f03a	Use ; not , as statement separator in PDEBUG() macro. Ignoring a NULL dev in device_set_ivars() sounds wrong, KASSERT it to non-NULL instead. Do the same for device_get_ivars() for reasons of symmetry, though it probably would have yielded a panic anyway, this gives more precise diagnostics. Absentmindedly nodded OK to by: jhb	2002-10-15 18:56:13 +00:00
John Baldwin	7fd1f2b8bc	Argh. Put back setting of P_ADVLOCK for the F_WRLCK case that was accidentally lost in the previous revision. Submitted by: bde Pointy hat to: jhb	2002-10-15 18:10:13 +00:00
Marcel Moolenaar	47f750125b	Fix kernel module loading on ia64. Cross-module function calls were improperly relocated due to faulty logic in lookup_fdesc() in elf_machdep.c. The symbol index (symidx) was bogusly used for load modules other than the one the relocation applied to. This resulted in bogus bindings and consequently runtime failures. The fix is to use the symbol index only for the module being relocated and to use the symbol name for look-ups in the modules in the dependent list. As such, we need a function to return the symbol name given the linker file and symbol index.	2002-10-15 05:40:07 +00:00
Peter Wemm	803cc8aa8f	Restore pointer that was removed in 1.128. This wasn't a merge-o.	2002-10-15 01:36:45 +00:00
John Baldwin	c65440644e	- Add a new global mutex 'ppeers_lock' to protect the p_peers list of processes forked with RFTHREAD. - Use a goto to a label for common code when exiting from fork1() in case of an error. - Move the RFTHREAD linkage setup code later in fork since the ppeers_lock cannot be locked while holding a proc lock. Handle the race of a task leader exiting and killing its peers while a peer is forking a new child. In that case, go ahead and let the peer process proceed normally as the parent is about to kill it. However, the task leader may have already gone to sleep to wait for the peers to die, so the new child process may not receive a SIGKILL from the task leader. Rather than try to destruct the new child process, just go ahead and send it a SIGKILL directly and add it to the p_peers list. This ensures that the task leader will wait until both the peer process doing the fork() and the new child process have received their KILL signals and exited. Discussed with: truckman (earlier versions)	2002-10-15 00:14:32 +00:00
John Baldwin	60a6965a88	Remove the leaderp variable and just access p_leader directly. The p_leader field is not protected by the proc lock but is only set during fork1() by the parent process and never changes.	2002-10-15 00:03:40 +00:00
Alfred Perlstein	8ced1eb281	Remove a KASSERT I added in 1.73 to catch uninitialized pipes. It must be removed because it is done without the pipe being locked via pipelock() and therefore is vulnerable to races with pipespace() erroneously triggering it by temporarily zero'ing out the structure backing the pipe. It looks as if this assertion is not needed because all manipulation of the data changed by pipespace() _is_ protected by pipelock(). Reported by: kris, mckusick	2002-10-14 21:15:04 +00:00
Julian Elischer	24c5baae53	Did you ever notice how stupid bugs show up much clearer when you see them in a commit message?	2002-10-14 20:43:02 +00:00
Julian Elischer	1f955e2d48	Tidy up the scheduler's code for changing the priority of a thread. Logically pretty much a NOP.	2002-10-14 20:34:31 +00:00
Kirk McKusick	a6b9f47b31	When scanning the freelist looking for candidate vnodes to recycle, be sure to exit the loop with vp == NULL if no candidates are found. Formerly, this bug would cause the last vnode inspected to be used, even if it was not available. The result was a panic "vn_finished_write: neg cnt". Sponsored by: DARPA & NAI Labs.	2002-10-14 19:54:39 +00:00
Kirk McKusick	e04a020067	Unconditionally reset vp->v_vnlock back to the default in the vclean() function (e.g., vp->v_vnlock = &vp->v_lock) rather than requiring filesystems that use alternate locks to do so in their vop_reclaim functions. This change is a further cleanup of the vop_stdlock interface. Submitted by: Poul-Henning Kamp <phk@critter.freebsd.dk> Sponsored by: DARPA & NAI Labs.	2002-10-14 19:44:51 +00:00
Poul-Henning Kamp	64b023f4bd	Populate more fields of the disklabel for PC98. Submitted by: Kawanobe Koh <kawanobe@st.rim.or.jp>	2002-10-14 14:22:29 +00:00
Kirk McKusick	a5b65058d5	Regularize the vop_stdlock'ing protocol across all the filesystems that use it. Specifically, vop_stdlock uses the lock pointed to by vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to reference vp->v_lock. Filesystems that wish to use the default do not need to allocate a lock at the front of their node structure (as some still did) or do a lockinit. They can simply start using vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks, but still use the vop_stdlock functions (such as nullfs) can simply replace vp->v_vnlock with a pointer to the lock that they wish to have used for the vnode. Such filesystems are responsible for setting the vp->v_vnlock back to the default in their vop_reclaim routine (e.g., vp->v_vnlock = &vp->v_lock). In theory, this set of changes cleans up the existing filesystem lock interface and should have no function change to the existing locking scheme. Sponsored by: DARPA & NAI Labs.	2002-10-14 03:20:36 +00:00
Alan Cox	4d752b01b4	Eliminate the unnecessary clearing of flag bits that are already clear in lio_listio(2).	2002-10-14 01:21:37 +00:00
Mike Barcroft	eeea998c3c	Update a sysctl to use _POSIX_VERSION from <sys/unistd.h>, instead of the kernel option _KPOSIX_VERSION.	2002-10-13 14:26:29 +00:00
Mike Barcroft	9e020cdab9	Include <sys/_posix.h> directly instead of depending on <sys/proc.h> to include <sys/signal.h> to include <sys/_posix.h>.	2002-10-13 11:54:16 +00:00
Alfred Perlstein	1e31f88689	whitespace fixes.	2002-10-12 22:26:41 +00:00
Jeff Roberson	b43179fbe8	- Create a new scheduler api that is defined in sys/sched.h - Begin moving scheduler specific functionality into sched_4bsd.c - Replace direct manipulation of scheduler data with hooks provided by the new api. - Remove KSE specific state modifications and single runq assumptions from kern_switch.c Reviewed by: -arch	2002-10-12 05:32:24 +00:00
Peter Wemm	d2575b9651	Register the machine check private state spinlock on ia64.	2002-10-12 00:33:36 +00:00
John Baldwin	e1b1aa3bc2	- Move the 'done1' label down below the unlock of the proc lock and move the locking of the proc lock after the goto to done1 to avoid locking the lock in an error case just so we can turn around and unlock it. - Move the exec_setregs() stuff out from under the proc lock and after the p_args stuff. This allows exec_setregs() to be able to sleep or write things out to userland, etc. which ia64 does. Tested by: peter	2002-10-11 21:04:01 +00:00
John Baldwin	8559443093	Fix %z to always print values as signed like it is supposed to. Reviewed by: bde Tested on: i386 in ddb	2002-10-11 17:54:55 +00:00
Mike Barcroft	2b7f24d210	Change iov_base's type from `char *' to the standard `void *'. All uses of iov_base which assume its type is `char *' (in order to do pointer arithmetic) have been updated to cast iov_base to `char *'.	2002-10-11 14:58:34 +00:00
Poul-Henning Kamp	2e07db0b0a	Remove an unused variable.	2002-10-11 10:36:22 +00:00
Kirk McKusick	192e439ed4	When considering a vnode for reuse in getnewvnode, we call vcanrecycle to check a free vnode's availability. If it is available, vcanrecycle returns an error code of zero and the vnode in question locked. The getnewvnode routine then used to call vn_start_write with the V_NOWAIT flag. If the filesystem was suspended while taking a snapshot, the vn_start_write would fail but getnewvnode would fail to unlock the vnode, instead leaving it locked on the freelist. The result would be that the vnode would be locked forever and would eventually hang the system with a race to the root when it was attempted to recycle it. This fix moves the vn_start_write check into vcanrecycle where it will properly unlock the vnode if it is unavailable for recycling due to filesystem suspension. Sponsored by: DARPA & NAI Labs.	2002-10-11 01:04:14 +00:00
Robert Watson	2dba710ddb	Incremental style improvements: more consistently avoid assignments in conditionals; remove some excess vertical whitespace; remove a bug in the return handling of the delete_vp() case for MAC. Spotted by: bde	2002-10-10 13:59:58 +00:00
Robert Watson	16c26e60ef	Regen from syntax fix to syscalls.master.	2002-10-10 04:08:11 +00:00
Robert Watson	3c4aba09e3	Fix what looks like a merge-o from a conflict in the last commit to syscalls.master.	2002-10-10 04:02:49 +00:00
Robert Watson	b101411be1	Explore new heights in alphabetization for _file and _fd variations on the extended attribute system calls.	2002-10-10 00:32:08 +00:00
Peter Wemm	0d66d36f44	Add a pointer to the alternate syscall tables on 64 bit platforms.	2002-10-09 22:04:09 +00:00
Robert Watson	6f90723cad	Implement extattr_{delete,get,set}_link() system calls: extended attribute operations that do not follow links. Sync to MAC tree. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-09 21:48:22 +00:00
Robert Watson	233d463548	Regen.	2002-10-09 21:47:29 +00:00
Robert Watson	8b10835c35	Flesh out the extattr_{delete,get,set}_link() system calls: variations on the _file() theme that do not follow symlinks. Sync to MAC tree. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-09 21:47:04 +00:00
John Baldwin	5715307f74	- Move p_cpulimit to struct proc from struct plimit and protect it with sched_lock. This means that we no longer access p_limit in mi_switch() and the p_limit pointer can be protected by the proc lock. - Remove PRS_ZOMBIE check from CPU limit test in mi_switch(). PRS_ZOMBIE processes don't call mi_switch(), and even if they did there is no longer the danger of p_limit being NULL (which is what the original zombie check was added for). - When we bump the current process's soft CPU limit in ast(), just bump the private p_cpulimit instead of the shared rlimit. This fixes an XXX for some value of fix. There is still a (probably benign) bug in that this code doesn't check that the new soft limit exceeds the hard limit. Inspired by: bde (2)	2002-10-09 17:17:24 +00:00
Julian Elischer	48bfcddd94	Round out the facility for a 'bound' thread to loan out its KSE in specific situations. The owner thread must be blocked, and the borrower can not proceed back to user space with the borrowed KSE. The borrower will return the KSE on the next context switch where the owner wants it back. This removes a lot of possible race conditions and deadlocks. It is conceivable that the borrower should inherit the priority of the owner too; that's another discussion and would be simple to do. Also, as part of this, the "preallocated spare thread" is attached to the thread doing a syscall rather than the KSE. This removes the need to lock the scheduler when we want to access it, as it's now "at hand". DDB now shows a lot more info for threaded processes though it may need some optimisation to squeeze it all back into 80 chars again. (possible JKH project) Upcalls are now "bound" threads, but "KSE Lending" now means that other completing syscalls can be completed using that KSE before the upcall finally makes it back to the UTS. (getting threads OUT OF THE KERNEL is one of the highest priorities in the KSE system.) The upcall when it happens will present all the completed syscalls to the KSE for selection.	2002-10-09 02:33:36 +00:00
Warner Losh	0b294f891d	Introducing /dev/devctl. This device reports events in the configuration device hierarchy. Device arrival, departure and not matched are presently reported. This will be the basis for devd, which I still need to polish a little more before I commit it. If you don't use /dev/devctl, it will be a noop.	2002-10-07 23:17:44 +00:00
Warner Losh	c17fdbe3a9	Two minor bugfixes: o Allow the bus_debug variable to be set via the bus.debug tunable. o Return pnpinfo and location info via the devinfo interface to userland. devinfo(8) needs to be updated to print it.	2002-10-07 23:15:40 +00:00
Ian Dowse	197b023b1b	Add back a fdrop() call at the end of kern_open() that got lost in revision 1.218. This bug caused a "struct file" reference to be leaked if VOP_ADVLOCK(), vn_start_write(), or mac_check_vnode_write() failed during the open operation. PR: kern/43739 Reported by: Arne Woerner <woerner@mediabase-gmbh.de>	2002-10-07 20:49:22 +00:00
Warner Losh	0a1d3ef9b8	Add wrappers around the newly created bus_child_pnpinfo_str and bus_child_location_str.	2002-10-07 07:08:00 +00:00
Warner Losh	d71dec96bf	Minor string handling cleanup that I've had in my tree for a while: Don't use snprintf where strlcpy() will do the job. Also, a NUL is '\0' not 0 in our style (C doesn't care), so spell it like. Remove useless {} and () in the general area of this change.	2002-10-07 06:50:35 +00:00
Warner Losh	da7b83f9ea	Don't need to NUL terminate after snprintf	2002-10-07 06:26:17 +00:00
Warner Losh	3d9841b4eb	Add two interfaces to allow for busses to report the pnpinfo for devices as well as their location on the bus.	2002-10-07 05:06:38 +00:00
Alfred Perlstein	c814aa3fdb	disable debug output by default.	2002-10-07 04:13:21 +00:00
Robert Watson	b371c939ce	Integrate mac_check_socket_send() and mac_check_socket_receive() checks from the MAC tree: allow policies to perform access control for the ability of a process to send and receive data via a socket. At some point, we might also pass in additional address information if an explicit address is requested on send. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-06 14:39:15 +00:00
Robert Watson	e183f80e54	Sync from MAC tree: break out the single mmap entry point into separate entry points for each occasion: mac_check_vnode_mmap() (check at initial mapping); mac_check_vnode_mprotect() (check at mapping protection change); mac_check_vnode_mmap_downgrade() (determine if a mapping downgrade should take place following subject relabel). Implement mmap() and mprotect() entry points for labeled vnode policies. These entry points are currently not hooked up to the VM system in the base tree. These changes improve the consistency of the access control interface and offer more flexibility regarding limiting access to vnode mmapping. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-06 02:46:26 +00:00
Robert Watson	83985c267e	Modify label allocation semantics for sockets: pass in soalloc's malloc flags so that we can call malloc with M_NOWAIT if necessary, avoiding potential sleeps while holding mutexes in the TCP syncache code. Similar to the existing support for mbuf label allocation: if we can't allocate all the necessary label store in each policy, we back out the label allocation and fail the socket creation. Sync from MAC tree. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-05 21:23:47 +00:00
Robert Watson	b497ca81d6	Make sure that the accounting credential is saved along with the vp when accounting is suspended--otherwise when accounting is restored, we may incorrectly assume the credential is valid. Panics experienced by: juli	2002-10-05 20:05:23 +00:00
Robert Watson	74e62b1b75	Integrate a devfs/MAC fix from the MAC tree: avoid a race condition during devfs VOP symlink creation by introducing a new entry point to determine the label of the devfs_dirent prior to allocation of a vnode for the symlink. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-05 18:40:10 +00:00
Robert Watson	0a69419678	Merge support for mac_check_vnode_link(), a MAC framework/policy entry point that instruments the creation of hard links. Policy implementations to follow. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-05 18:11:36 +00:00
Robert Watson	56c1541237	While the MAC API has supported the ability to handle M_NOWAIT passed to mbuf label initialization, that functionality was never merged to the main tree. Go ahead and merge that functionality now. Note that this requires policy modules to accept the case where the label element may be destroyed even if init has not succeeded on it (in the event that policy failed the init). This will shortly also apply to sockets. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-05 17:44:49 +00:00
Robert Watson	87807196f8	Rearrange object and label init/destroy functions to match the order used in mac_policy.h and elsewhere. Sort order is basically "by operation category", then "alphabetically by object". Sync to MAC tree. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-05 17:38:45 +00:00
Robert Watson	a931e345a9	Sync to MAC tree: use 'flag' instead of 'how' for mac_init_mbuf(); remove a slightly less than useful comment.	2002-10-05 17:18:43 +00:00
Brian Feldman	dab3d85fd7	Don't allow dev_stdclone(9) to accept minors larger than the system is able to handle (0xffffff).	2002-10-05 17:10:28 +00:00
Robert Watson	69bbb5b1c7	Another big diff, little functional change: move label internalization, externalization, and cred label life cycle events to entirely above devfs and vnode events. Sync from MAC tree. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-05 16:57:16 +00:00
Robert Watson	08bcdc586e	Move all object label init/destroy routines to the head of the entry points to better match the entry point ordering in mac_policy.h. Big diff, no functional change; merge from the MAC tree. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-05 16:54:59 +00:00
Robert Watson	ea599aa018	Synch from TrustedBSD MAC tree: - If a policy isn't registered when a policy module unloads, silently succeed. - Hold the policy list lock across more of the validity tests to avoid races. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-05 16:46:03 +00:00
Poul-Henning Kamp	3bd6561289	NB: This commit does NOT make GEOM the default in FreeBSD NB: But it will enable it in all kernels not having options "NO_GEOM" Put the GEOM related options into the intended order. Add "options NO_GEOM" to all kernel configs apart from NOTES. In some order of controlled fashion, the NO_GEOM options will be removed, architecture by architecture in the coming days. There are currently three known issues which may force people to need the NO_GEOM option: boot0cfg/fdisk: Tries to update the MBR while it is being used to control slices. GEOM does not allow this as a direct operation. SCSI floppy drives: Apparently the scsi-da driver returns "EBUSY" if no media is inserted. This is wrong, it should return ENXIO. PC98: It is unclear if GEOM correctly recognizes all variants of PC98 disklabels. (Help Wanted! I have neither docs nor HW) These issues are all being worked. Sponsored by: DARPA & NAI Labs.	2002-10-05 16:35:33 +00:00
Robert Watson	226b96fb6d	Cosmetic line wrap synchronization.	2002-10-05 16:33:46 +00:00
Robert Watson	b2f0927ad6	Push the debugging object label counters into security.mac.debug.counters rather than directly under security.mac.debug. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-05 16:30:53 +00:00
Robert Watson	96adb90996	Begin another merge from the TrustedBSD MAC branch: - Change mpo_init_foo(obj, label) and mpo_destroy_foo(obj, label) policy entry points to mpo_init_foo_label(label) and mpo_destroy_foo_label(label). This will permit the use of the same entry points for holding temporary type-specific label during internalization and externalization, as well as for caching purposes. - Because of this, break out mpo_{init,destroy}_socket() and mpo_{init,destroy}_mount() into separate entry points for socket main/peer labels and mount main/fs labels. - Since the prototype for label initialization is the same across almost all entry points, implement these entry points using common implementations for Biba, MLS, and Test, reducing the number of almost identical looking functions. This simplifies policy implementation, as well as preparing us for the merge of the new flexible userland API for managing labels on objects. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-05 15:10:00 +00:00
Maxim Sobolev	790a8088d0	Fix problem introduced in rev.1.406, which can cause already unlocked mutex being unlocked again causing system panic.	2002-10-05 12:56:10 +00:00
Brian Somers	52ae0b7fb5	If dsgetlabel() returns a label with a size of zero in diskdumpconf(), treat it as an invalid partition. This fixes a bug where ``dumpon <device>'' will configure the dump device at a random offset on the disk if <device> isn't a valid partition. Reviewed by: phk	2002-10-05 11:24:21 +00:00
Juli Mallett	0d29446006	Put an easy-to-miss assignment into the proper place. It was stray in the middle of a block of code, with no clear assignment. While here, move one nearby assignment out of declaration.	2002-10-05 04:49:46 +00:00
Juli Mallett	ecafb24b41	Remove bogus duplicate assignment of local variables.	2002-10-05 04:35:59 +00:00
Poul-Henning Kamp	c5f9218b48	Add the new function "sbuf_done()" which returns non-zero if the sbuf is finished. This allows sbufs to be used for request/response scenarios without needing additional communication flags. Sponsored by: DARPA & NAI Labs.	2002-10-04 09:58:17 +00:00
Peter Wemm	c281972e61	Add some unspeakable hackery to the tree under #ifdef __ia64__ to work around limitations in the ia64 kernel stack handling code. Basically preallocate a bunch of threads (and hence kstacks) while contigmalloc() still works, and never free them back to the general memory pool. After the system has been running for a while, contigmalloc() eventually fails at a critical moment and panics the system.	2002-10-04 01:31:39 +00:00
Don Lewis	cb81d3ca4d	hashinit() calls MALLOC(), so release the filedesc lock in knote_attach() before calling hashinit() and relock afterwards, taking care to see that we don't lose a race.	2002-10-03 06:03:26 +00:00
Juli Mallett	a723033a4d	XXX Add a check for p->p_limit being NULL before dereferencing it. This is totally bogus but will hide the occurrences of access of 0xbc(NULL) which people have run into lately. This is not a proper fix, just a bandaid, until the cause of this happening is tracked down and fixed. Reviewed by: rwatson	2002-10-03 04:09:00 +00:00
Don Lewis	91e97a8266	In an SMP environment post-Giant it is no longer safe to blindly dereference the struct sigio pointer without any locking. Change fgetown() to take a reference to the pointer instead of a copy of the pointer and call SIGIO_LOCK() before copying the pointer and dereferencing it. Reviewed by: rwatson	2002-10-03 02:13:00 +00:00
David Xu	5da2b58aeb	set ke_bound to NULL when kse owner thread becomes runnable. Reviewed by: julian (mentor)	2002-10-03 01:22:05 +00:00
Julian Elischer	4162f2fe92	Whitespace fix only	2002-10-02 23:12:01 +00:00
John Baldwin	551cf4e150	Rename the mutex thread and process states to use a more generic 'LOCK' name instead. (e.g., SLOCK instead of SMTX, TD_ON_LOCK() instead of TD_ON_MUTEX()) Eventually a turnstile abstraction will be added that will be shared with mutexes and other types of locks. SLOCK/TDI_LOCK will be used internally by the turnstile code and will not be specific to mutexes. Making the change now ensures that turnstiles can be dropped in at a later date without affecting the ABI of userland applications.	2002-10-02 20:31:47 +00:00
Juli Mallett	289e1e23d1	Access td->td_kse inside sched_lock. Submitted by: julian	2002-10-02 18:25:09 +00:00
Archie Cobbs	36a8dac10d	Let kse_wakeup() take a KSE mailbox pointer argument. Reviewed by: julian	2002-10-02 16:48:16 +00:00
Juli Mallett	bc7b9f1dba	De-obfuscate local use of members of 'struct thread', for which we have local variables, and group assignment.	2002-10-02 16:39:39 +00:00
Poul-Henning Kamp	c56c20f13d	Absorb <sys/bus_private.h> into kern/subr_bus.c to prevent misunderstandings. Suggested by: bde Approved by: dfr	2002-10-02 09:34:29 +00:00
Poul-Henning Kamp	8c5d013757	Fix mis-indentation. Spotted by: FlexeLint	2002-10-02 09:09:25 +00:00
Scott Long	316ec49abd	Some kernel threads try to do significant work, and the default KSTACK_PAGES doesn't give them enough stack to do much before blowing away the pcb. This adds MI and MD code to allow the allocation of an alternate kstack whose size can be specified when calling kthread_create. Passing the value 0 prevents the alternate kstack from being created. Note that the ia64 MD code is missing for now, and PowerPC was only partially written due to the pmap.c being incomplete there. Though this patch does not modify anything to make use of the alternate kstack, acpi and usb are good candidates. Reviewed by: jake, peter, jhb	2002-10-02 07:44:29 +00:00
Robert Watson	92dbb82a47	Add a new MAC entry point, mac_thread_userret(td), which permits policy modules to perform MAC-related events when a thread returns to user space. This is required for policies that have floating process labels, as it's not always possible to acquire the process lock at arbitrary points in the stack during system call processing; process labels might represent traditional authentication data, process history information, or other data. LOMAC will use this entry point to perform the process label update prior to the thread returning to userspace, when plugged into the MAC framework. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-02 02:42:38 +00:00
Juli Mallett	1d9c56964d	Back out kernel support for reliable signal queues. Requested by: rwatson, phk, and many others	2002-10-01 17:15:53 +00:00
John Baldwin	feb2449610	Minor style nits in a comment.	2002-10-01 15:49:32 +00:00
Poul-Henning Kamp	8d3574c7a4	Fix some harmless mis-indents. Spotted by: FlexeLint	2002-10-01 15:48:31 +00:00
Poul-Henning Kamp	328048bc56	Remember to include "opt_devfs.h" so we get any relevant changes to NDEVFSINO before we include devfs.h. Spotted by: FlexeLint	2002-10-01 15:24:35 +00:00
John Baldwin	6cae6dacd5	Various style fixups. Submitted by: bde (mostly)	2002-10-01 14:16:50 +00:00
John Baldwin	f6ccde8308	Actually clear PS_XCPU in ast() when we handle it. Submitted by: bde Pointy hat to: jhb	2002-10-01 14:13:13 +00:00
John Baldwin	1d56414515	- Adjust comment noting that handling of CPU limit exhaustion is done in ast(). - Actually set KEF_ASTPENDING so ast() is called. I think this is buggy for a process with multiple KSE's in that PS_XCPU is not a KSE event, it's a process-wide event. IMO there really should probably be two ASTPENDING flags, one for per-process, and one for per-KSE. Submitted by: bde	2002-10-01 14:10:08 +00:00
Poul-Henning Kamp	fa15abd8a6	Don't #error if we are lint.	2002-10-01 13:15:11 +00:00
Poul-Henning Kamp	3bb24c35f2	Split MBR and PC98 on-disk sliceformats out from disklabel.h, step 1: Peter had repocopied sys/disklabel.h to sys/diskpc98.h and sys/diskmbr.h. These two new copies are still intact copies of disklabel.h and therefore protected by #ifndef _SYS_DISKLABEL_H_ so #including them in programs which already include <sys/disklabel.h> is currently a no-op. This commit adds a number of such #includes. Once I have verified that I have fixed all the places which need fixing, I will commit the updated versions of the three #include files. Sponsored by: DARPA & NAI Labs.	2002-10-01 07:24:55 +00:00
Robert Watson	1aa37f5392	Improve locking of pipe mutexes in the context of MAC: (1) Where previously the pipe mutex was selectively grabbed during pipe_ioctl(), now always grab it and then release it if not needed. This protects the call to mac_check_pipe_ioctl() to make sure the label remains consistent. (Note: it looks like sigio locking may be incorrect for fgetown() since we call it not-by-reference and sigio locking assumes call by reference). (2) In pipe_stat(), lock the pipe if MAC is compiled in so that the call to mac_check_pipe_stat() gets a locked pipe to protect label consistency. We still release the lock before returning actual stat() data, risking inconsistency, but apparently our pipe locking model accepts that risk. (3) In various pipe MAC authorization checks, assert that the pipe lock is held. (4) Grab the lock when performing a pipe relabel operation, and assert it a little deeper in the stack. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-01 04:30:19 +00:00
Robert Watson	6be0c25e4e	Push 'security.mac.debug_label_fallback' behind options MAC_DEBUG. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-10-01 03:24:20 +00:00
Juli Mallett	7bf2a42fd5	Until I find a way to release arbitrary locks held when sending signals (there really should not be some), use the M_NOWAIT flag to malloc(9), and panic(9) if malloc(9) fails.	2002-10-01 03:19:49 +00:00
Robert Watson	d0bd8ced91	Regen.	2002-10-01 02:37:35 +00:00
Robert Watson	4499985ef2	Reserve system call numbers for the following system calls: __mac_get_pid Retrieve MAC label of a process by pid Similar to __mac_get_proc() except that the target process of the operation is explicitly specified rather than assuming curthread. __mac_get_link Retrieve MAC label of a path with NOFOLLOW __mac_set_link Set MAC label of a path with NOFOLLOW extattr_set_link Set EAs on a path with NOFOLLOW extattr_get_link Retrieve EAs on a path with NOFOLLOW extattr_delete_link Delete EAs on a path with NOFOLLOW These calls are similar to __mac_get_file(), __mac_set_file(), extattr_set_file(), extattr_get_file(), and extattr_delete_file(), except that they do not follow symlinks. The distinction between these calls is similar to lchown() vs chown(). Implementations to follow. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-01 02:35:59 +00:00
Juli Mallett	a88b260a86	Back out code changes that snuck into the previous forced commit.	2002-10-01 00:16:17 +00:00
Juli Mallett	226e1171e1	(Forced commit, to clarify previous commit of ksiginfo/signal queue code.) I've added a structure, kernel-private, to represent a pending or in-delivery signal, called `ksiginfo'. It is roughly analogous to the basic information that is exported by the POSIX interface 'siginfo_t', but more basic. I've added functions to allocate these structures, and further to wrap all signal operations using them. Once the operations are wrapped, I've added a TailQ (see queue(3)) of these structures to 'struct proc', and all pending signals are in that TailQ. When a signal is being delivered, it is dequeued from the list. Once I finish the spreading of ksiginfo throughout the tree, the dequeued structure will be delivered to the process in question, whereas currently and normally, the signal number is what is used.	2002-10-01 00:07:28 +00:00
John Baldwin	dc183990ca	- Add a new per-process flag PS_XCPU to indicate that at least one thread has exceeded its CPU time limit. - In mi_switch(), set PS_XCPU when the CPU time limit is exceeded. - Perform actual CPU time limit exceeded work in ast() when PS_XCPU is set. Requested by: many	2002-09-30 21:13:54 +00:00
John Baldwin	f4cd8f9ff4	Change p_cpulimit to be in seconds instead of microseconds. Since p_runtime now is a bintime, it is no longer an optimization to store p_cpulimit as microseconds. Suggested by: phk	2002-09-30 21:08:38 +00:00
Robert Watson	0626774f08	Move vnode MAC label initialization to after the release of the vnode interlock in getnewvnode() to avoid possible sleeps while holding the mutex. Note that the warning from Witness is a slight false positive since we know there will be no contention on the interlock since we haven't made the vnode available for use yet, but the theory is not a bad one. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-09-30 20:51:48 +00:00
Robert Watson	c031391bd5	Add tunables for the existing sysctl twiddles for pipe and vm enforcement so they can be disabled prior to kernel start. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-09-30 20:50:00 +00:00
Juli Mallett	1226f694e6	First half of implementation of ksiginfo, signal queues, and such. This gets signals operating based on a TailQ, and is good enough to run X11, GNOME, and do job control. There are some intricate parts which could be more refined to match the sigset_t versions, but those require further evaluation of directions in which our signal system can expand and contract to fit our needs. After this has been in the tree for a while, I will make in kernel API changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo more robustly, such that we can actually pass information with our (queued) signals to the userland. That will also result in using a struct ksiginfo pointer, rather than a signal number, in a lot of kern_sig.c, to refer to an individual pending signal queue member, but right now there is no defined behaviour for such. CODAFS is unfinished in this regard because the logic is unclear in some places. Sponsored by: New Gold Technology Reviewed by: bde, tjr, jake [an older version, logic similar]	2002-09-30 20:20:22 +00:00
Poul-Henning Kamp	50c2233141	Plug memory leaks. Detected by: FlexeLint Approved by: jhb	2002-09-30 19:19:47 +00:00
Julian Elischer	2735483034	uh, commit all of the patch	2002-09-29 23:28:58 +00:00
Julian Elischer	e081731767	commit the version I actually tested.. Submitted by: davidxu	2002-09-29 23:23:25 +00:00
Julian Elischer	9eb1fdea37	Implement basic KSE loaning. This stops a thread that is blocked in BOUND mode from stopping another thread from completing a syscall, and this allows it to release its resources etc. Probably more related commits to follow (at least one I know of) Initial concept by: julian, dillon Submitted by: davidxu	2002-09-29 23:04:34 +00:00
David E. O'Brien	21b68415cd	Fix style nit where conditionally compiled code was unconditionalized, but style(9) was consulted. Submitted by: bde	2002-09-29 04:47:41 +00:00
Julian Elischer	0cd3964f6d	lock proc while calling psignal (plus related cleanups) Submitted by: davidxu	2002-09-29 02:48:37 +00:00

6083 Commits