freebsd-nq

Author	SHA1	Message	Date
Robert Watson	db42a33d81	o Introduce group subset test, which limits the ability of a process to debug another process based on their respective {effective, additional, saved, real} gids. p1 is only permitted to debug p2 if its effective gids (egid + additional groups) are a strict superset of the gids of p2. This properly implements the security test previously incorrectly implemented in kern_ktrace.c, and is consistent with the kernel security policy (although might be slightly confusing for those more familiar with the userland policy). o Restructure p_candebug() logic so that various results are generated comparing uids, gids, credential changes, and then composed in a single check before testing for privilege. These tests encapsulate the "BSD" inter-process debugging policy. Other non-BSD checks remain separate. Additional comments are added. Submitted by: tmm, rwatson Obtained from: TrustedBSD Project Reviewed by: petef, tmm, rwatson	2001-11-02 18:44:50 +00:00
Poul-Henning Kamp	bad699770a	Add empty shell for nmount syscall (take 2!)	2001-11-02 18:35:54 +00:00
Poul-Henning Kamp	06d133c475	Add nmount() stub function and regenerate the syscall-glue which should not need to check in generated files.	2001-11-02 17:59:23 +00:00
Poul-Henning Kamp	c60693dbd3	Reserve 378 for the new mount syscall Maxime Henrion <mux@qualys.com> is working on. (This is to get us more than 32 mountoptions).	2001-11-02 17:58:26 +00:00
Warner Losh	89bbe0cd1e	Don't hide the failure to allocate device behind boot verbose. It is still telling us of real problems so should remain until it stops doing that. Submitted by: OGAWA Takaya <t-ogawa@triaez.kaisei.org>	2001-11-02 17:33:06 +00:00
Jonathan Lemon	198475ebeb	+ Fix another possible vn_close race, in the same fashion as r1.95. + Check that the cached vnode type != VBAD before calling devsw(), this can happen if the vnode has been revoked.	2001-11-02 17:04:32 +00:00
Robert Watson	5fab7614f4	o Add a comment to p_candebug() noting that the P_INEXEC check should really be moved elsewhere: p_candebug() encapsulates the security policy decision, whereas the P_INEXEC check has to do with "correctness" regarding race conditions, rather than security policy. Example: even if no security protections were enforced (the "uids are advisory" model), removing P_INEXEC could result in incorrect operation due to races on credential evaluation and modification during execve(). Obtained from: TrustedBSD Project	2001-11-02 16:41:06 +00:00
Robert Watson	bb51af2816	Merge from POSIX.1e Capabilities development tree: o Reorder and synchronize #include's, including moving "opt_cap.h" to above system includes. o Introduce #ifdef'd kern.security.capabilities sysctl tree, including kern.security.capabilities.enabled, which defaults to 0. The rest of the file remains stubs for the time being. Obtained from: TrustedBSD Project	2001-11-02 15:22:32 +00:00
Robert Watson	bcc0dc3dc7	Merge from POSIX.1e Capabilities development tree: o POSIX.1e capabilities authorize overriding of VEXEC for VDIR based on CAP_DAC_READ_SEARCH, but of !VDIR based on CAP_DAC_EXECUTE. Add appropriate conditionals to vaccess() to take that into account. o Synchronization cap_check_xxx() -> cap_check() change. Obtained from: TrustedBSD Project	2001-11-02 15:16:59 +00:00
Robert Watson	4df571b101	o Capabilities cap_check() interface revised to remove _xxx, so rename in p_cansched(). Also, replace '0' with 'NULL' for the ucred * pointer. Obtained from: TrustedBSD Project	2001-11-02 15:08:08 +00:00
Robert Watson	a76789e7df	o Since kern_acl.c uses #ifdef CAPABILITIES to control capability-specific semantics, #include "opt_cap.h". Obtained from: TrustedBSD Project	2001-11-02 14:53:04 +00:00
Poul-Henning Kamp	8dd72bc887	#ifdef KTRACE a variable to silence a warning. Submitted by: Maxime "mux" Henrion <mux@qualys.com>	2001-11-02 09:55:01 +00:00
Poul-Henning Kamp	a2d7281c5a	Turn the symlinks around, instead of ad0s1 -> ad0s1c, make it ad0s1c -> ad0s1. Requested by: peter	2001-11-02 09:16:25 +00:00
Robert Watson	6d8785434f	o Update copyright dates. o Add reference to TrustedBSD Project in license header. o Update dated comments, including comment in extattr.h claiming that no file systems support extended attributes. o Improve comment consistency.	2001-11-01 21:37:07 +00:00
Robert Watson	fc5d29ef7d	o Move suser() calls in kern/ to using suser_xxx() with an explicit credential selection, rather than reference via a thread or process pointer. This is part of a gradual migration to suser() accepting a struct ucred instead of a struct proc, simplifying the reference and locking semantics of suser(). Obtained from: TrustedBSD Project	2001-11-01 20:56:57 +00:00
Mitsuru IWASAKI	f9390180fe	Some fixes for the recent apm module changes. - The apm loadable module can now announce its existence to other kernel components (e.g. i386/isa/clock.c:startrtclock()'s TCS hack). - Exchange the priority of SI_SUB_CPU and SI_SUB_KLD for the above purpose. - Add a simple arbitration mechanism for APM vs. ACPI. This prevents the kernel from enabling both of them. - Remove obsolete `#ifdef DEV_APM' related code. - Add an abstracted interface for power management operations. Public apm(4) functions, such as apm_suspend(), should be replaced by the new interfaces. Currently only power_pm_suspend (successor of apm_suspend) is implemented. Reviewed by: peter, arch@ and audit@	2001-11-01 16:34:07 +00:00
Josef Karthauser	0c5d0f0eff	Tidy up the variable declarations and switch on warnings and strict. Reviewed by: diffing the generated files from before and after the change.	2001-11-01 12:46:08 +00:00
Andrey A. Chernov	82849b4dfe	Add new interface function int devclass_find_free_unit(devclass_t dc, int unit); which returns the first free unit in the given class, starting from 'unit'.	2001-11-01 05:07:28 +00:00
Marcel Moolenaar	1245202150	Don't remove the tentative declaration. It's the only one... Pointy hat: marcel (self-sponsoring)	2001-10-31 20:43:38 +00:00
Marcel Moolenaar	8b3e7871bc	Make smp_started volatile in sys/smp.h and remove the volatile declaration in subr_smp.c. This solves a compile problem with gcc 3.0.1 (ia64 cross-build). Reviewed: jhb	2001-10-31 09:03:05 +00:00
Brian Feldman	bb9fe9dd9e	Add the sysctl "kern.function_list", which currently exports all function symbols in the kernel in a list of C strings, with an extra nul-termination at the end. This sysctl requires addition of a new linker operation. Now, linker_file_t's need to respond to "each_function_name" to export their function symbols. Note that the sysctl doesn't currently allow distinguishing multiple symbols with the same name from different modules, but could quite easily without a change to the linker operation. This will be a nicety to have when it can be used. Obtained from: NAI Labs CBOSS project Funded by: DARPA	2001-10-30 15:21:45 +00:00
Brian Feldman	08d68dda08	Also, machine/profile.h should be necessary for the function prototype of kmupetext().	2001-10-30 15:10:16 +00:00
Brian Feldman	f99502a4d4	Use kmupetext() for ELF KLDs to allow for increased text segment size. Obtained from: NAI Labs CBOSS project Funded by: DARPA	2001-10-30 15:08:51 +00:00
Brian Feldman	4a44bd4b4a	Add kmupetext(), a function that expands the range of memory covered by the profiler on a running system. This is not done sparsely, as memory is cheaper than processor speed and each gprof mcount() and mexitcount() operation is already very expensive. Obtained from: NAI Labs CBOSS project Funded by: DARPA	2001-10-30 15:04:57 +00:00
Julian Elischer	48810023a3	Use the thread we have instead of finding another that may be the wrong one.	2001-10-30 07:15:46 +00:00
David Malone	12396bdca7	When scanning for control messages, don't process the data mbufs. This could cause hangs if a unix domain socket was closed with data still to be read from it. Tested by: Andrea Campi <andrea@webcom.it>	2001-10-29 20:04:03 +00:00
Matthew Dillon	434d21ccbf	Make ttyprintf() of tv_sec value type agnostic.	2001-10-29 01:23:28 +00:00
Andrey A. Chernov	e9c044bd9e	1) In devclass_alloc_unit(), skip duplicated wired devices (i.e. with fixed number) instead of allocating the next free unit for them. If someone needs a fixed place, he must specify it correctly. "Allocating next" is especially bad because it leads to double device detection and, as a result, to the "repeat make_dev panic". This can happen if the same device is present somewhere on the PCI bus, in hints and in ACPI. Making it present in only one place is not always possible: "sc", for example, can't be removed from hints, as that results in no console at all. 2) In make_device(), detect when devclass_add_device() fails, free dev and return, i.e. add the missing error checking. This part is needed to finish the fix in 1), but must be done this way in any case, with the old variant too.	2001-10-28 23:32:35 +00:00
Matthew Dillon	0e9fe2127c	Adjust printfs to be time_t agnostic.	2001-10-28 22:53:45 +00:00
Poul-Henning Kamp	4e13006747	Fix a problem in the disk related hack where device nodes for a physically non-existent disk in a legacy /dev on a DEVFS system would panic the system if stat(2)'ed. Do not whine about anonymous device nodes not having a si_devsw, they're not supposed to.	2001-10-28 09:39:28 +00:00
Michael Reifenberger	491dec936c	Introduce [IPC|SHM]_[INFO|STAT] to shmctl to make `/compat/linux/usr/bin/ipcs -m` happy.	2001-10-28 09:29:10 +00:00
Matthew Dillon	4ffa210b94	syncdelay, filedelay, dirdelay, metadelay are ints, not time_t's, and can also be made static.	2001-10-27 19:58:56 +00:00
Poul-Henning Kamp	4e4a76633b	Nudge the axe a bit closer to cdevsw[]: Make it a panic to repeat make_dev() or destroy_dev(), this check should maybe be neutered when -current goes -stable. Whine if devsw() is called on anon dev_t's in a devfs system. Make a hack to avoid our lazy-eval disk code triggering the above whine. Fix the multiple make_dev() in disk code by making ${disk}${unit}s${slice} an alias/symlink to ${disk}${unit}s${slice}c	2001-10-27 17:44:21 +00:00
Dag-Erling Smørgrav	9ca45e813c	Add a P_INEXEC flag that indicates that the process has called execve() and it has not yet returned. Use this flag to deny debugging requests while the process is execve()ing, and close once and for all any race conditions that might occur between execve() and various debugging interfaces. Reviewed by: jhb, rwatson	2001-10-27 11:11:25 +00:00
Robert Watson	48be932ac0	o Update copyright dates. Obtained from: TrustedBSD Project	2001-10-27 05:46:43 +00:00
Robert Watson	fdba6d3a1e	o Improve style(9) compliance following KSE modifications. In particular, strip the space from '( struct thread *...', wrap long lines. o Remove an unneeded comment on the topic of no lock being required as part of the NDINIT() in __acl_get_file(), as it's really not required there. Obtained from: TrustedBSD Project	2001-10-27 05:45:42 +00:00
Matthew Dillon	d23f5958bc	Add mtx_lock_giant() and mtx_unlock_giant() wrappers for sysctl management of Giant during the Giant unwinding phase, and start work on instrumenting Giant for the file and proc mutexes. These wrappers allow developers to turn on and off Giant around various subsystems. DEVELOPERS SHOULD NEVER TURN OFF GIANT AROUND A SUBSYSTEM JUST BECAUSE THE SYSCTL EXISTS! General developers should only consider turning on Giant for a subsystem whose default is off (to help track down bugs). Only developers working on particular subsystems who know what they are doing should consider turning off Giant. These wrappers will greatly improve our ability to unwind Giant and test the kernel on a (mostly) subsystem by subsystem basis. They allow Giant unwinding developers (GUDs) to emplace appropriate subsystem and structural mutexes in the main tree and then request that the larger community test the work by turning off Giant around the subsystem(s), without the larger community having to mess around with patches. These wrappers also allow GUDs to boot into a (more likely to be) working system in the midst of their unwinding work and to test that work under more controlled circumstances. There is a master sysctl, kern.giant.all, which defaults to 0 (off). If turned on it overrides ALL other kern.giant sysctls and forces Giant to be turned on for all wrapped subsystems. If turned off then Giant around individual subsystems is controlled by various other kern.giant.XXX sysctls. Code which overlaps multiple subsystems must have all related subsystem Giant sysctls turned off in order to run without Giant.	2001-10-26 20:48:04 +00:00
John Baldwin	282873e2c0	- Change the taskqueue locking to protect the necessary parts of a task while it is on a queue with the queue lock and remove the per-task locks. - Remove TASK_DESTROY now that it is no longer needed. - Go back to inlining TASK_INIT now that it is short again. Inspired by: dfr	2001-10-26 18:46:48 +00:00
Poul-Henning Kamp	5f7806ab69	Make cdevsw[] static.	2001-10-26 15:31:22 +00:00
John Baldwin	8e2e767b1f	Add a per-thread ucred reference for syscalls and synchronous traps from userland. The per thread ucred reference is immutable and thus needs no locks to be read. However, until all the proc locking associated with writes to p_ucred are completed, it is still not safe to use the per-thread reference. Tested on: x86 (SMP), alpha, sparc64	2001-10-26 08:12:54 +00:00
John Baldwin	1de1c550b1	Add locking to taskqueues. There is one mutex per task, one mutex per queue, and a mutex to protect the global list of taskqueues. The only visible change is that a TASK_DESTROY() macro has been added to mirror the TASK_INIT() macro to destroy a task before it is free'd. Submitted by: Andrew Reiter <awr@watson.org>	2001-10-26 06:32:21 +00:00
John Baldwin	40c6d2be16	Use msleep() to avoid lost wakeup's instead of doing an ineffective splhigh() before the mtx_unlock and tsleep(). The splhigh() was probably correct in the original code using simplelocks but is not correct in 5.0-current. Noticed by: Andrew Reiter <awr@FreeBSD.org>	2001-10-26 06:09:01 +00:00
Matthew Dillon	245df27cee	Implement kern.maxvnodes. Adjusting kern.maxvnodes now actually has a real effect. Optimize vfs_msync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. Improves looping case by 500%. Optimize ffs_sync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. This makes a couple of assumptions, which I believe are ok, with regard to vnode stability when the mount list mutex is held. Improves looping case by 500%. (more optimization work is needed on top of these fixes) MFC after: 1 week	2001-10-26 00:08:05 +00:00
Matthew Dillon	f92dcd3e4a	Add missing TAILQ_INSERT_TAIL's which somehow didn't get committed with the recent vnode cleanup.	2001-10-25 23:13:56 +00:00
Matthew Dillon	f02098e59c	In cluster_rbuild(), 'size' had better match buf->b_bcount and buf->b_bufsize or the cluster will not be properly merged. Dup the code from cluster_wbuild() and add some printf()s to see if bad cases are present. MFC after: 2 weeks	2001-10-25 22:49:48 +00:00
John Baldwin	5a08b84f83	Fix an inverted test case. Success of getenv() is determined by a return value of !NULL rather than NULL. Submitted by: luigi Pointy hat to: jhb	2001-10-25 17:22:31 +00:00
Jonathan Lemon	18bfd58110	cnclose() can potentially race against itself. To avoid vn_close() races, NULL-out cnd_vp before calling the latter, as it may block. Submitted by: dillon	2001-10-25 04:51:37 +00:00
Jonathan Lemon	7ce26133ea	Force FWRITE on when opening the console, so that the flags passed to vn_close match those from vn_open. This fixes the panic some people were seeing about "vrele: missed vn_close".	2001-10-25 00:14:16 +00:00
John Baldwin	882bcf5879	Document the requirements and nature of the logical CPU IDs. It isn't very strict and leaves much up to the platform so that it can define a convenient mapping. Requested by: mjacob	2001-10-24 22:15:38 +00:00
Matthew Dillon	a06fe5111e	unwind v_writecount in fhopen() if we are unable to allocate the descriptor. MFC after: 3 days	2001-10-24 18:32:17 +00:00
John Baldwin	781a35df6b	Fix this to actually compile in the !INVARIANTS case. Reported by: Maxime Henrion <mux@qualys.com>	2001-10-24 14:18:33 +00:00
Robert Drehmel	9a024fc559	Use vm_offset_t instead of caddr_t to fix a warning and remove two casts.	2001-10-24 14:15:28 +00:00
Matthew Dillon	79deba82cd	Fix ktrace enablement/disablement races that can result in a vnode ref count panic. Bug noticed by: ps Reviewed by: ps MFC after: 1 day	2001-10-24 01:05:39 +00:00
John Baldwin	4e5e677bc0	Change the sx(9) assertion API to use a sx_assert() function similar to mtx_assert(9) rather than several SX_ASSERT_* macros.	2001-10-23 22:39:11 +00:00
John Baldwin	21cbf0cc8b	- Change getenv_quad() to return an int instead of a quad_t since it returns a success/failure code rather than the actual value. - Add getenv_string() which copies a string from the environment to another string and returns true on success.	2001-10-23 22:34:36 +00:00
Jonathan Lemon	991f976036	Implement multiple low-level console support.	2001-10-23 20:25:50 +00:00
Robert Watson	fc2749a40c	o vn_open() fails to call VOP_CLOSE() if vfs_object_create fails. Ideally all successful calls to VOP_OPEN() might be reflected in a call to VOP_CLOSE(). For now, simply add a comment reflecting this problem; this should be fixed at some point.	2001-10-23 19:09:01 +00:00
John Baldwin	ac9a258074	Assert that Giant is not held in mi_switch() unless the process state is SMTX or SRUN.	2001-10-23 17:52:49 +00:00
Matthew Dillon	4f467cb8c1	Fix incorrect double-termination of vm_object. When a vm_object is terminated and flushes pending dirty pages it is possible for the object to be ref'd (0->1) and then deref'd (1->0) during termination. We do not terminate the object a second time. Document vop_stdgetvobject() to explicitly allow it to be called without the vnode interlock held (for upcoming sync_msync() and ffs_sync() performance optimizations) MFC after: 3 days	2001-10-23 01:23:41 +00:00
Matthew Dillon	c72ccd014d	Change the vnode list under the mount point from a LIST to a TAILQ in preparation for an implementation of limiting code for kern.maxvnodes. MFC after: 3 days	2001-10-23 01:21:29 +00:00
Poul-Henning Kamp	5015bb7f85	disk_clone() was a bit too eager to please: "md0s1ec" is not a valid device. Noticed by: Chad David <davidc@acns.ab.ca>	2001-10-22 10:18:45 +00:00
Dag-Erling Smørgrav	7c62990641	Move procfs_* from procfs_machdep.c into sys_process.c, and rename them to proc_* in the process; procfs_machdep.c is no longer needed. Run-tested on i386, build-tested on Alpha, untested on other platforms.	2001-10-21 23:57:24 +00:00
Dag-Erling Smørgrav	45fb069ac9	Convert textvp_fullpath() into the more generic vn_fullpath() which takes a struct thread * and a struct vnode * instead of a struct proc *. Temporarily add a textvp_fullpath macro for compatibility.	2001-10-21 15:52:51 +00:00
Matthew Dillon	5eb13f768c	Documentation MFC after: 1 day	2001-10-21 06:26:55 +00:00
Matthew Dillon	57601bcb5d	Syntax cleanup and documentation, no operational changes. MFC after: 1 day	2001-10-21 06:12:06 +00:00
Ian Dowse	72ec63a53d	Introduce some jitter to the timing of the samples that determine the system load average. Previously, the load average measurement was susceptible to synchronisation with processes that run at regular intervals such as the system bufdaemon process. Each interval is now chosen at random within the range of 4 to 6 seconds. This large variation is chosen so that over the shorter 5-minute load average timescale there is a good dispersion of samples across the 5-second sample period (the time to perform 60 5-second samples now has a standard deviation of approx 4.5 seconds).	2001-10-20 16:07:17 +00:00
Ian Dowse	0eb6ce3169	Move the code that computes the system load average from vm_meter.c to kern_synch.c in preparation for adding some jitter to the inter-sample time. Note that the "vm.loadavg" sysctl still lives in vm_meter.c which isn't the right place, but it is appropriate for the current (bad) name of that sysctl. Suggested by: jhb (some time ago) Reviewed by: bde	2001-10-20 13:10:43 +00:00
John Baldwin	7ada587697	The mtx_init() and sx_init() functions bzero'd locks before handing them off to witness_init(), making the check for double initialization of a lock by testing the LO_INITIALIZED flag moot. Work around this by checking the LO_INITIALIZED flag ourselves before we bzero the lock structure.	2001-10-20 01:22:42 +00:00
Peter Wemm	259ed91740	Add a sysctl for preventing the sync() in panic() recovery. This can be so dangerous it isn't funny. eg: if you panic inside NFS or softdep, and then try and sync you run into held locks and cause either deadlocks, recursive panics or other interesting chaos. Default is unchanged.	2001-10-19 23:32:03 +00:00
Jonathan Lemon	7e7c3f3f33	Add dev_named(dev, name), which is similar in spirit to devtoname(). This function returns success if the device is known by either 'name' or any of its aliases.	2001-10-17 18:47:12 +00:00
Matthew Dillon	2210e5d9fa	fix minor bug in kern.minvnodes sysctl. Use OID_AUTO.	2001-10-16 23:08:09 +00:00
Robert Watson	ab323a7d45	o Update init_sysent.c and friends for allocation of afs_syscall.	2001-10-13 13:30:21 +00:00
Robert Watson	b55abfd929	o Reserve system call 377 for afs_syscall; by reserving a system call number, portable OpenAFS applications don't have to attempt to determine what system call number was dynamically allocated. No system call prototype or implementation is defined. Requested by: Tom Maher <tardis@watson.org>	2001-10-13 13:19:34 +00:00
Poul-Henning Kamp	ce9d2b59b2	Regenerate syscall stuff. Remove syscall-hide.h	2001-10-13 09:18:28 +00:00
Poul-Henning Kamp	5ab1bfacb1	Don't generate <sys/syscalls-hide.h> it has never had any users anywhere in the source tree.	2001-10-13 09:17:49 +00:00
Peter Pentchev	88fbb423d4	Remove the panic when trying to register a sysctl with an oid too high. This stops panics on unloading modules which define their own sysctl sets. However, this also removes the protection against somebody actually defining a static sysctl with an oid in the range of the dynamic ones, which would break badly if there is already a dynamic sysctl with the requested oid. Apparently, the algorithm for removing sysctl sets needs a bit more work. For the present, the panic I introduced only leads to Bad Things (tm). Submitted by: many users of -current :( Pointy hat to: roam (myself) for not testing rev. 1.112 enough.	2001-10-12 09:16:36 +00:00
John Baldwin	a2f2b3afcd	- Catch up to the new ucred API. - Add proc locking to the jail() syscall. This mostly involved shuffling a few things around so that blockable things like malloc and copyin were performed before acquiring the lock and checking the existing ucred and then updating the ucred as one "atomic" change under the proc lock.	2001-10-11 23:39:43 +00:00
John Baldwin	bd78cece5d	Change the kernel's ucred API as follows: - crhold() returns a reference to the ucred whose refcount it bumps. - crcopy() now simply copies the credentials from one credential to another and has no return value. - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.	2001-10-11 23:38:17 +00:00
John Baldwin	698166ca55	Whitespace fixes.	2001-10-11 22:49:27 +00:00
John Baldwin	6a90c862d3	Rework some code to be a bit simpler by inverting a few tests and using else clauses instead of goto's.	2001-10-11 22:48:37 +00:00
John Baldwin	61d80e90a9	Add missing includes of sys/ktr.h.	2001-10-11 17:53:43 +00:00
John Baldwin	7106ca0d1a	Add missing includes of sys/lock.h.	2001-10-11 17:52:20 +00:00
Michael Reifenberger	91a701cd13	Fix SysV Semaphore Handling. Updated by peter following KSE and Giant pushdown. I've been running with this patch for two weeks with no ill side effects. PR: kern/12014: Fix SysV Semaphore handling Submitted by: Peter Jeremy <peter.jeremy@alcatel.com.au>	2001-10-11 08:15:14 +00:00
Paul Saab	cbc89bfbfe	Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader tunable. Reviewed by: peter MFC after: 2 weeks	2001-10-10 23:06:54 +00:00
John Baldwin	f21fc12736	Add a temporary hack that will go away with the ucred API update to bzero the duplicated mutex before initializing it to avoid triggering the check for init'ing an already initialized mutex.	2001-10-10 20:45:40 +00:00
John Baldwin	6a40eccec3	Malloc mutexes pre-zero'd as random garbage (including 0xdeadcode) may trigger the check to make sure we don't initialize a mutex twice.	2001-10-10 20:43:50 +00:00
Doug Rabson	e913ca22e2	Move setregs() out from under the PROC_LOCK so that it can use functions like suword() which may trap.	2001-10-10 20:04:57 +00:00
Robert Watson	8a7d8cc675	- Combine kern.ps_showallprocs and kern.ipc.showallsockets into a single kern.security.seeotheruids_permitted, described as: "Unprivileged processes may see subjects/objects with different real uid" NOTE: kern.ps_showallprocs exists in -STABLE, and therefore there is an API change. kern.ipc.showallsockets does not. - Check kern.security.seeotheruids_permitted in cr_cansee(). - Replace visibility calls to socheckuid() with cr_cansee() (retain the change to socheckuid() in ipfw, where it is used for rule-matching). - Remove prison_unpcb() and make use of cr_cansee() against the UNIX domain socket credential instead of comparing root vnodes for the UDS and the process. This allows multiple jails to share the same chroot() and not see each other's UNIX domain sockets. - Remove unused socheckproc(). Now that cr_cansee() is used universally for socket visibility, a variety of policies are more consistently enforced, including uid-based restrictions and jail-based restrictions. This also better-supports the introduction of additional MAC models. Reviewed by: ps, billf Obtained from: TrustedBSD Project	2001-10-09 21:40:30 +00:00
John Baldwin	8688bb9383	proces -> process in a comment.	2001-10-09 17:25:30 +00:00
Robert Watson	32d186043b	o Recent addition of (p1==p2) exception in p_candebug() permitted processes to attach debugging to themselves even though the global kern_unprivileged_procdebug_permitted policy might disallow this. o Move the kern_unprivileged_procdebug_permitted check above the (p1==p2) check. Reviewed by: des	2001-10-09 16:56:29 +00:00
John Baldwin	74e4502e62	Replace 'curproc' with 'td->td_proc'.	2001-10-08 21:05:46 +00:00
Matthew Dillon	917efbaaba	WS Cleanup	2001-10-08 19:51:13 +00:00
Dag-Erling Smørgrav	3da3249106	Dissociate ptrace from procfs. Until now, the ptrace syscall was implemented as a wrapper that called various functions in procfs depending on which ptrace operation was requested. Most of these functions were themselves wrappers around procfs_{read,write}_{,db,fp}regs(), with only some extra error checks, which weren't necessary in the ptrace case anyway. This commit moves procfs_rwmem() from procfs_mem.c into sys_process.c (renaming it to proc_rwmem() in the process), and implements ptrace() directly in terms of procfs_{read,write}_{,db,fp}regs() instead of having it fake up a struct uio and then call procfs_do{,db,fp}regs(). It also moves the prototypes for procfs_{read,write}_{,db,fp}regs() and proc_rwmem() from proc.h to ptrace.h, and marks all procfs files except procfs_machdep.c as "optional procfs" instead of "standard".	2001-10-07 20:08:42 +00:00
Dag-Erling Smørgrav	23fad5b6c9	Always succeed if the target process is the same as the requesting process.	2001-10-07 20:06:03 +00:00
Ian Dowse	80f42b555d	Fix a typo in do_sigaction() where sa_sigaction and sa_handler were confused. Since sa_sigaction and sa_handler alias each other in a union, the bug was completely harmless. This had been fixed as part of the SIGCHLD changes in revision 1.125, but it was reverted when they were backed out in revision 1.126.	2001-10-07 16:11:37 +00:00
Robert Watson	c175d2226f	o Introduce an 'options REGRESSION'-dependent sysctl namespace, 'regression.*'. o Add 'regression.securelevel_nonmonotonic', conditional on 'options REGRESSION', which allows the securelevel to be lowered for the purposes of efficient regression testing of securelevel policy decisions. Regression tests for securelevels will be committed shortly. NOTE: 'options REGRESSION' should never be used on production machines, as it permits violation of system invariants so as to improve the ability to effectively test edge cases, and improve testing efficiency.	2001-10-07 03:51:22 +00:00
Marcel Moolenaar	49ead724c6	Fix breakage caused by previous commit. The lkmnosys and lkmressys syscalls are of type NODEF but not in a way that fits the given definition of that type. The exact difference of lkmressys and lkmnosys is unclear, which makes it all the more confusing. A reevaluation of what we have and what we really need is in order. Spotted by: Maxime Henrion <mux@qualys.com> Pointy hat: marcel	2001-10-07 00:16:31 +00:00
Matthew Dillon	845bd795c9	vinvalbuf() was only waiting for write-I/O to complete. It really has to wait for both read AND write I/O to complete. Only NFS calls vinvalbuf() on an active vnode (when the server indicates that the file is stale), so this bug fix only affects NFS clients. MFC after: 3 days	2001-10-05 20:10:32 +00:00
John Baldwin	43150722c9	The aio kthreads start off with a root credential just like all other kthreads, so don't malloc a ucred just so we can create a duplicate of the one we already have.	2001-10-05 17:55:11 +00:00
Paul Saab	4787fd37af	Only allow users to see their own socket connections if kern.ipc.showallsockets is set to 0. Submitted by: billf (with modifications by me) Inspired by: Dave McKay (aka pm aka Packet Magnet) Reviewed by: peter MFC after: 2 weeks	2001-10-05 07:06:32 +00:00
Dag-Erling Smørgrav	50f74e92b8	Final style(9) commit: placement of opening brace; a continuation indent I missed in the previous commit; a line that exceeded 80 characters. No functional changes, but the object file's md5 checksum changes because some lines have been displaced.	2001-10-04 16:35:44 +00:00
Dag-Erling Smørgrav	8a8d4e459c	More style(9) fixes: no spaces between function name and parameter list; some indentation fixes (particularly continuation lines). Reviewed by: md5(1)	2001-10-04 16:29:45 +00:00
Dag-Erling Smørgrav	c5799337ea	This file had a mixture of "return foo;" and "return (foo);"; standardize on "return (foo);" as mandated by style(9). Reviewed by: md5(1)	2001-10-04 16:09:22 +00:00
David Malone	2bc21ed985	Hopefully improve control message passing over Unix domain sockets. 1) Allow the sending of more than one control message at a time over a unix domain socket. This should cover the PR 29499. 2) This requires that unp_{ex,in}ternalize and unp_scan understand mbufs with more than one control message at a time. 3) Internalize and externalize used to work on the mbuf in-place. This made life quite complicated and the code for sizeof(int) < sizeof(file *) could end up doing the wrong thing. The patch always creates a new mbuf/cluster now. This resulted in the change of the prototype for the domain externalise function. 4) You can now send SCM_TIMESTAMP messages. 5) Always use CMSG_DATA(cm) to determine where the data starts in unp_{ex,in}ternalize. It was using ((struct cmsghdr *)cm + 1) in some places, which gives the wrong alignment on the alpha. (NetBSD made this fix some time ago). This results in an ABI change for descriptor passing and creds passing on the alpha. (Probably on the IA64 and Sparc ports too). 6) Fix userland programs to use CMSG_* macros too. 7) Be more careful about freeing mbufs containing (file *)s. This is made possible by the prototype change of externalise. PR: 29499 MFC after: 6 weeks	2001-10-04 13:11:48 +00:00
David Malone	59bdd40568	Allow sbcreatecontrol to make cluster sized control messages.	2001-10-04 12:59:53 +00:00
John Baldwin	0479e3d339	Move the ap boot spin lock earlier in the lock order before the sio(4) lock since we occasionally call printf() while holding the ap boot lock which can call down into the sio(4) driver if using a serial console.	2001-10-01 22:50:30 +00:00
Robert Watson	c6ab2f6b4e	o Complete the migration from suser error checking in the following form in vfs_syscalls.c: if (mp->mnt_stat.f_owner != p->p_ucred->cr_uid && (error = suser_td(td)) != 0) { unwrap_lots_of_stuff(); return (error); } to: if (mp->mnt_stat.f_owner != p->p_ucred->cr_uid) { error = suser_td(td); if (error) { unwrap_lots_of_stuff(); return (error); } } This makes the code more readable when complex clauses are in use, and minimizes conflicts for large outstanding patchsets modifying the kernel authorization code (of which I have several), especially where existing authorization and context code are combined in the same if() conditional. Obtained from: TrustedBSD Project	2001-10-01 20:01:07 +00:00
Matthew Dillon	b5810bab2d	After extensive testing it has been determined that adding complexity to avoid removing higher level directory vnodes from the namecache has no perceivable effect and will be removed. This is especially true when vmiodirenable is turned on, which it is by default now. (vmiodirenable makes a huge difference in directory caching.) The vfs.vmiodirenable and vfs.nameileafonly sysctls have been left in to allow further testing, but I expect to rip out vfs.nameileafonly soon too. I have also determined through testing that the real problem with numvnodes getting too large is due to the VM Page cache preventing the vnode from being reclaimed. The directory stuff made only a tiny dent relative to Poul's original code, enough so that some tests succeeded. But tests with several million small files show that the bigger problem is the VM Page cache. This will have to be addressed by a future commit. MFC after: 3 days	2001-10-01 04:33:35 +00:00
Jonathan Lemon	1a6fc8ef63	When FREE()ing kqueue related structures, charge them to the correct bucket. Submitted by: iedowse Forgotten by: jlemon	2001-09-30 17:00:56 +00:00
Bosko Milekic	70a61707f6	Re-enable mbtypes statistics in the mbuf allocator. I disabled these when I changed the allocator bits. This implements per-CPU mbtypes stats by keeping net number of decrements/increments of a given mbtype per-CPU and then summing all of the per-CPU mbtypes to produce the total net number of allocated mbufs of the given mbtype. Counters are carefully balanced to avoid/prevent underflows/overflows. mbtypes stats are re-enabled with the idea that we may occasionally (although very rarely) observe slight inconsistencies in the stat reporting. Most of the time, we should be fine, though. Also make appropriate modifications to netstat(1) and systat(1) to do the necessary reporting. Submitted by: Jiangyi Liu <jyliu@163.net>	2001-09-30 01:58:39 +00:00
Jonathan Lemon	0217f5c71e	Have EVFILT_TIMERS allocate their callouts via malloc() instead of using the static callout list allocated by the system. Change malloc type from M_TEMP to M_KQUEUE to better track memory. Add a kern.kq_calloutmax to globally limit the amount of kernel memory that can be allocated by callouts. Submitted by: iedowse (items 1, 2)	2001-09-29 17:48:39 +00:00
Dag-Erling Smørgrav	5b6db47748	Add a couple of API functions I need for my pseudofs WIP. Documentation will follow when I've decided whether to keep this API or ditch it in favor of something slightly more subtle.	2001-09-29 00:32:46 +00:00
Marcel Moolenaar	4166877345	Make the NODEF type usable. A syscall of type NODEF will only have its entry in the syscall table added. Nothing else is done. This differs from type NOPROTO in that NOPROTO adds a definition to syscall.h besides adding a sysent. A syscall can now have multiple entries without conflict. Note that the argssize is fixed and depends on the syscall name.	2001-09-28 01:21:57 +00:00
Robert Watson	87fce2bb96	o When performing a securelevel check as part of securelevel_ge() or securelevel_gt(), determine first if a local securelevel exists -- if so, perform the check based on imax(local, global). Otherwise, simply use the global value. o Note: even though local securelevels might lag below the global one, if the global value is updated to higher than local values, maximum will still be used, making the global dominant even if there is local lag. Obtained from: TrustedBSD Project	2001-09-26 20:41:48 +00:00
Robert Watson	8a528812a0	o Modify kern.securelevel MIB entry to return a local securelevel, if one is present in the current jail, otherwise, to return the global securelevel. o If the securelevel is being updated, require that it be greater than the maximum of local and global, if a local securelevel exists, otherwise, just maximum of the global. If there is a local securelevel, update the local one instead of the global one. o Note: this does allow local securelevels to lag behind the global one as long as the local one is not updated following a global increase. Obtained from: TrustedBSD Project	2001-09-26 20:39:48 +00:00
Robert Watson	567931c8f6	o Initialize per-jail securelevel from global securelevel as part of jail creation. Obtained from: TrustedBSD Project	2001-09-26 20:37:15 +00:00
Robert Watson	d501d04b9e	o Modify static settime() to accept the proc * for the process requesting a time change, and callers so that they provide td->td_proc. o Modify settime() to use securelevel_gt() for securelevel checking. Obtained from: TrustedBSD Project	2001-09-26 19:53:57 +00:00
Robert Watson	c2f413af19	o Modify sysctl access control check to use securelevel_gt(), and clarify sysctl access control logic. Obtained from: TrustedBSD Project	2001-09-26 19:51:25 +00:00
Matthew Dillon	46cad5761c	Enable vmiodirenable by default. Remove incorrect comment from sysctl.conf. MFC after: 1 week	2001-09-26 19:35:04 +00:00
Matthew Dillon	3418ebebfe	Make uio_yield() a global. Call uio_yield() between chunks in vn_rdwr_inchunks(), allowing other processes to gain an exclusive lock on the vnode. Specifically: directory scanning, to avoid a race to the root directory, and multiple child processes coring simultaneously so they can figure out that some other core'ing child has an exclusive adv lock and just exit instead. This completely fixes performance problems when large programs core. You can have hundreds of copies (forked children) of the same binary core all at once and not notice. MFC after: 3 days	2001-09-26 06:54:32 +00:00
Paul Saab	88b1d98f31	Lock the vnode while truncating the corefile. This fixes a panic with softupdates dangling deps. Submitted by: peter MFC: ASAP :)	2001-09-26 01:24:07 +00:00
John Baldwin	21377ce065	Remove superfluous parens after de-macroizing.	2001-09-26 00:05:18 +00:00
Robert Watson	75bc5b3f22	o So, when <dd> e-mailed me and said that the comment was inverted for securelevel_ge() and securelevel_gt(), I was a little surprised, but fixed it. Turns out that it was the code that was inverted, during a whitespace cleanup in my commit tree. This commit inverts the checks, and restores the comment.	2001-09-25 21:08:33 +00:00
John Baldwin	dde96c9933	Since we no longer inline any debugging code in the mutex operations, move all the debugging code into the function versions of the mutex operations in kern_mutex.c. This reduced the __mtx_* macros to simply wrappers of the _{get,rel}_lock_* macros, so the __mtx_* macros were also abolished in favor of just calling the _{get,rel}_lock_* macros. The tangled hairy mass of macros calling macros is at least a bit more sane now.	2001-09-22 21:19:55 +00:00
Robert Watson	b4799065ef	o vpaccess() -> vn_access() -- Peter reminds me that there is already a convention for vnop helper routines of this sort. Submitted by: Mr Wemm <peter>	2001-09-22 03:07:41 +00:00
John Baldwin	ed01445d8f	Use the passed in thread to selrecord() instead of curthread.	2001-09-21 22:46:54 +00:00
John Baldwin	456ca585db	Use the passed in thread pointer instead of curthread in calls to selrecord() in ptcpoll(). The pre-KSE code used the passed in proc pointer rather than curproc, and an earlier seltrue() call uses the passed in thread and not curthread.	2001-09-21 22:22:25 +00:00
John Baldwin	fea2ab833e	The P_SELECT flag was moved from p->p_flag to td->td_flags, but p_flag was locked by the proc lock and td_flags is locked by the sched_lock. The places that read, set, and cleared TDF_SELECT weren't updated, so they read and modified td_flags w/o holding the sched_lock, meaning that they could corrupt the per-thread flags field. As an immediate band-aid, grab sched_lock while reading and manipulating td_flags in relation to TDF_SELECT. This will probably be cleaned up some later on.	2001-09-21 22:06:22 +00:00
John Baldwin	e649bcb506	Remove unneeded proc variables and fix comments.	2001-09-21 21:54:45 +00:00
Robert Watson	a90a3f2882	o Part two of eaccess(2) commit, rebuilt system call code. Obtained from: TrustedBSD Project	2001-09-21 21:34:06 +00:00
Robert Watson	9c94f7731e	o Introduce eaccess(2), a version of access(2) that uses the effective credentials rather than the real credentials. This is useful for implementing GUI's which need to modify icons based on access rights, but where use of open(2) is too expensive, use of stat(2) doesn't reflect the file system's real protection model, and use of access() suffers from real/effective credential confusion. This implementation provides the same semantics as the call of the same name on SCO OpenServer. Note: using this call improperly can leave you subject to some of the same races present in the access(2) call. o To implement this, break out the basic logic of access(2) into vpaccess(), which accepts a passed credential to perform the invocation of VOP_ACCESS(). Add eaccess(2) to invoke vpaccess(), and modify access(2) to use vpaccess(). Obtained from: TrustedBSD Project	2001-09-21 21:33:22 +00:00
John Baldwin	278da5113f	Remove a bogus comment. "atomic" doesn't mean that the operation is done as a physical atomic operation. That would require the code to use the atomic API, which it does not. Instead, the operation is made pseudo-atomic (hence the quotes) by use of the lock to protect clearing all of the flags in question.	2001-09-21 19:26:57 +00:00
John Baldwin	21832b1ec0	GC some #if 0'd code.	2001-09-21 19:21:18 +00:00
John Baldwin	3226cbf43b	Whitespace and spelling fixes.	2001-09-21 19:16:12 +00:00
Michael Reifenberger	896de692f8	Make msgseg, msgssz (->msgmax) and msgmni TUNABLE.	2001-09-21 09:25:17 +00:00
Peter Wemm	1114d18594	Add a pointer to kenv(1).	2001-09-21 02:25:53 +00:00
Jonathan Lemon	57ea1fa07f	Revert last commit. The same functionality can be obtained through the 'kenv' command, which I obviously was unaware of.	2001-09-21 02:09:01 +00:00
Robert Watson	94088977c9	o Rename u_cansee() to cr_cansee(), making the name more comprehensible in the face of a rename of ucred to cred, and possibly generally. Obtained from: TrustedBSD Project	2001-09-20 21:45:31 +00:00
Jonathan Lemon	e492f03505	Add a sysctl MIB 'kern.env', that dumps the contents of the kernel environment from the loader, as well as the kernel's compiled in static hints.	2001-09-20 20:09:37 +00:00
Peter Wemm	fbd7a9dd97	decrement the dumping variable after use so we can call it several times if needed.	2001-09-20 06:08:53 +00:00
John Baldwin	a44f918bf9	Fix a bug in propagate priority: the kse group pointer wasn't being updated in the loop so the new thread always seemed to have the same priority as the original thread and no actual priorities were changed.	2001-09-19 22:52:59 +00:00
Robert Watson	288b789333	o Clarification of securelevel_{ge,gt} comment. Submitted by: dd	2001-09-19 14:09:13 +00:00
Peter Wemm	66f769fe39	Add missing ; in last commit Pointy-hat-to: jhb	2001-09-19 02:53:59 +00:00
Peter Wemm	98cdde71e7	Regenerate	2001-09-18 23:33:33 +00:00
Peter Wemm	eb25edbda3	Cleanup and split of nfs client and server code. This builds on the top of several repo-copies.	2001-09-18 23:32:09 +00:00
John Baldwin	9ef3a9855d	Use a 'p' variable instead of repetitively indirecting td->td_proc for signal things that are still per-process and won't be per-thread.	2001-09-18 23:27:06 +00:00
John Baldwin	8cc06751dd	Don't initialize proc0's mutex twice. It is already done earlier on in the MD startup code.	2001-09-18 22:09:47 +00:00
Robert Watson	3ca719f12e	o Introduce two new calls, securelevel_gt() and securelevel_ge(), which abstract the securelevel implementation details from the checking code. The call in -CURRENT accepts a struct ucred--in -STABLE, it will accept struct proc. This facilitates the upcoming commit of per-jail securelevel support. The calls will also generate a kernel printf if the calls are made with NULL ucred/proc pointers: generally speaking, there are few instances of this, and they should be fixed. o Update p_candebug() to use securelevel_gt(); future updates to the remainder of the kernel tree will be committed soon. Obtained from: TrustedBSD Project	2001-09-18 21:03:53 +00:00
Mark Peek	796ed2a6d0	Set debug information on the process being traced, not the current (debugger) process. This should allow gdb to function correctly on post-KSE kernels.	2001-09-18 19:06:11 +00:00
Jonathan Lemon	6a494eeb34	Change p into ke->ke_proc, this was hidden behind INVARIANTS.	2001-09-18 03:36:21 +00:00

4339 Commits