freebsd-skq

Author	SHA1	Message	Date
John Baldwin	1de1c550b1	Add locking to taskqueues. There is one mutex per task, one mutex per queue, and a mutex to protect the global list of taskqueues. The only visible change is that a TASK_DESTROY() macro has been added to mirror the TASK_INIT() macro to destroy a task before it is free'd. Submitted by: Andrew Reiter <awr@watson.org>	2001-10-26 06:32:21 +00:00
John Baldwin	40c6d2be16	Use msleep() to avoid lost wakeup's instead of doing an ineffective splhigh() before the mtx_unlock and tsleep(). The splhigh() was probably correct in the original code using simplelocks but is not correct in 5.0-current. Noticed by: Andrew Reiter <awr@FreeBSD.org>	2001-10-26 06:09:01 +00:00
Matthew Dillon	245df27cee	Implement kern.maxvnodes. adjusting kern.maxvnodes now actually has a real effect. Optimize vfs_msync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. Improves looping case by 500%. Optimize ffs_sync(). Avoid having to continually drop and re-obtain mutexes when scanning the vnode list. This makes a couple of assumptions, which I believe are ok, in regards to vnode stability when the mount list mutex is held. Improves looping case by 500%. (more optimization work is needed on top of these fixes) MFC after: 1 week	2001-10-26 00:08:05 +00:00
Matthew Dillon	f92dcd3e4a	Add missing TAILQ_INSERT_TAIL's which somehow didn't get committed with the recent vnode cleanup.	2001-10-25 23:13:56 +00:00
Matthew Dillon	f02098e59c	In cluster_rbuild(), 'size' had better match buf->b_bcount and buf->b_bufsize or the cluster will not be properly merged. Dup the code from cluster_wbuild() and add some printf()s to see if bad cases are present. MFC after: 2 weeks	2001-10-25 22:49:48 +00:00
John Baldwin	5a08b84f83	Fix an inverted test case. Success of getenv() is determined by a return value of !NULL rather than NULL. Submitted by: luigi Pointy hat to: jhb	2001-10-25 17:22:31 +00:00
Jonathan Lemon	18bfd58110	cnclose() can potentially race against itself. To avoid vn_close() races, NULL-out cnd_vp before calling the latter, as it may block. Submitted by: dillon	2001-10-25 04:51:37 +00:00
Jonathan Lemon	7ce26133ea	Force FWRITE on when opening the console, so that the flags passed to vn_close match those from vn_open. This fixes the panic some people were seeing about "vrele: missed vn_close".	2001-10-25 00:14:16 +00:00
John Baldwin	882bcf5879	Document the requirements and nature of the logical CPU IDs. It isn't very strict and leaves much up to the platform so that it can define a convenient mapping. Requested by: mjacob	2001-10-24 22:15:38 +00:00
Matthew Dillon	a06fe5111e	unwind v_writecount in fhopen() if we are unable to allocate the descriptor. MFC after: 3 days	2001-10-24 18:32:17 +00:00
John Baldwin	781a35df6b	Fix this to actually compile in the !INVARIANTS case. Reported by: Maxime Henrion <mux@qualys.com>	2001-10-24 14:18:33 +00:00
Robert Drehmel	9a024fc559	Use vm_offset_t instead of caddr_t to fix a warning and remove two casts.	2001-10-24 14:15:28 +00:00
Matthew Dillon	79deba82cd	Fix ktrace enablement/disablement races that can result in a vnode ref count panic. Bug noticed by: ps Reviewed by: ps MFC after: 1 day	2001-10-24 01:05:39 +00:00
John Baldwin	4e5e677bc0	Change the sx(9) assertion API to use a sx_assert() function similar to mtx_assert(9) rather than several SX_ASSERT_* macros.	2001-10-23 22:39:11 +00:00
John Baldwin	21cbf0cc8b	- Change getenv_quad() to return an int instead of a quad_t since it returns a success/failure code rather than the actual value. - Add getenv_string() which copies a string from the environment to another string and returns true on success.	2001-10-23 22:34:36 +00:00
Jonathan Lemon	991f976036	Implement multiple low-level console support.	2001-10-23 20:25:50 +00:00
Robert Watson	fc2749a40c	o vn_open() fails to call VOP_CLOSE() if vfs_object_create fails. Ideally all successful calls to VOP_OPEN() might be reflected in a call to VOP_CLOSE(). For now, simply add a comment reflecting this problem; this should be fixed at some point.	2001-10-23 19:09:01 +00:00
John Baldwin	ac9a258074	Assert that Giant is not held in mi_switch() unless the process state is SMTX or SRUN.	2001-10-23 17:52:49 +00:00
Matthew Dillon	4f467cb8c1	Fix incorrect double-termination of vm_object. When a vm_object is terminated and flushes pending dirty pages it is possible for the object to be ref'd (0->1) and then deref'd (1->0) during termination. We do not terminate the object a second time. Document vop_stdgetvobject() to explicitly allow it to be called without the vnode interlock held (for upcoming sync_msync() and ffs_sync() performance optimizations) MFC after: 3 days	2001-10-23 01:23:41 +00:00
Matthew Dillon	c72ccd014d	Change the vnode list under the mount point from a LIST to a TAILQ in preparation for an implementation of limiting code for kern.maxvnodes. MFC after: 3 days	2001-10-23 01:21:29 +00:00
Poul-Henning Kamp	5015bb7f85	disk_clone() was a bit too eager to please: "md0s1ec" is not a valid device. Noticed by: Chad David <davidc@acns.ab.ca>	2001-10-22 10:18:45 +00:00
Dag-Erling Smørgrav	7c62990641	Move procfs_* from procfs_machdep.c into sys_process.c, and rename them to proc_* in the process; procfs_machdep.c is no longer needed. Run-tested on i386, build-tested on Alpha, untested on other platforms.	2001-10-21 23:57:24 +00:00
Dag-Erling Smørgrav	45fb069ac9	Convert textvp_fullpath() into the more generic vn_fullpath() which takes a struct thread * and a struct vnode * instead of a struct proc *. Temporarily add a textvp_fullpath macro for compatibility.	2001-10-21 15:52:51 +00:00
Matthew Dillon	5eb13f768c	Documentation MFC after: 1 day	2001-10-21 06:26:55 +00:00
Matthew Dillon	57601bcb5d	Syntax cleanup and documentation, no operational changes. MFC after: 1 day	2001-10-21 06:12:06 +00:00
Ian Dowse	72ec63a53d	Introduce some jitter to the timing of the samples that determine the system load average. Previously, the load average measurement was susceptible to synchronisation with processes that run at regular intervals such as the system bufdaemon process. Each interval is now chosen at random within the range of 4 to 6 seconds. This large variation is chosen so that over the shorter 5-minute load average timescale there is a good dispersion of samples across the 5-second sample period (the time to perform 60 5-second samples now has a standard deviation of approx 4.5 seconds).	2001-10-20 16:07:17 +00:00
Ian Dowse	0eb6ce3169	Move the code that computes the system load average from vm_meter.c to kern_synch.c in preparation for adding some jitter to the inter-sample time. Note that the "vm.loadavg" sysctl still lives in vm_meter.c which isn't the right place, but it is appropriate for the current (bad) name of that sysctl. Suggested by: jhb (some time ago) Reviewed by: bde	2001-10-20 13:10:43 +00:00
John Baldwin	7ada587697	The mtx_init() and sx_init() functions bzero'd locks before handing them off to witness_init(), making the check for double initialization of a lock by testing the LO_INITIALIZED flag moot. Work around this by checking the LO_INITIALIZED flag ourselves before we bzero the lock structure.	2001-10-20 01:22:42 +00:00
Peter Wemm	259ed91740	Add a sysctl for preventing the sync() in panic() recovery. This can be so dangerous it isn't funny. eg: if you panic inside NFS or softdep, and then try and sync you run into held locks and cause either deadlocks, recursive panics or other interesting chaos. Default is unchanged.	2001-10-19 23:32:03 +00:00
Jonathan Lemon	7e7c3f3f33	Add dev_named(dev, name), which is similar in spirit to devtoname(). This function returns success if the device is known by either 'name' or any of its aliases.	2001-10-17 18:47:12 +00:00
Matthew Dillon	2210e5d9fa	fix minor bug in kern.minvnodes sysctl. Use OID_AUTO.	2001-10-16 23:08:09 +00:00
Robert Watson	ab323a7d45	o Update init_sysent.c and friends for allocation of afs_syscall.	2001-10-13 13:30:21 +00:00
Robert Watson	b55abfd929	o Reserve system call 377 for afs_syscall; by reserving a system call number, portable OpenAFS applications don't have to attempt to determine what system call number was dynamically allocated. No system call prototype or implementation is defined. Requested by: Tom Maher <tardis@watson.org>	2001-10-13 13:19:34 +00:00
Poul-Henning Kamp	ce9d2b59b2	Regenerate syscall stuff. Remove syscall-hide.h	2001-10-13 09:18:28 +00:00
Poul-Henning Kamp	5ab1bfacb1	Don't generate <sys/syscalls-hide.h> it has never had any users anywhere in the source tree.	2001-10-13 09:17:49 +00:00
Peter Pentchev	88fbb423d4	Remove the panic when trying to register a sysctl with an oid too high. This stops panics on unloading modules which define their own sysctl sets. However, this also removes the protection against somebody actually defining a static sysctl with an oid in the range of the dynamic ones, which would break badly if there is already a dynamic sysctl with the requested oid. Apparently, the algorithm for removing sysctl sets needs a bit more work. For the present, the panic I introduced only leads to Bad Things (tm). Submitted by: many users of -current :( Pointy hat to: roam (myself) for not testing rev. 1.112 enough.	2001-10-12 09:16:36 +00:00
John Baldwin	a2f2b3afcd	- Catch up to the new ucred API. - Add proc locking to the jail() syscall. This mostly involved shuffling a few things around so that blockable things like malloc and copyin were performed before acquiring the lock and checking the existing ucred and then updating the ucred as one "atomic" change under the proc lock.	2001-10-11 23:39:43 +00:00
John Baldwin	bd78cece5d	Change the kernel's ucred API as follows: - crhold() returns a reference to the ucred whose refcount it bumps. - crcopy() now simply copies the credentials from one credential to another and has no return value. - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.	2001-10-11 23:38:17 +00:00
John Baldwin	698166ca55	Whitespace fixes.	2001-10-11 22:49:27 +00:00
John Baldwin	6a90c862d3	Rework some code to be a bit simpler by inverting a few tests and using else clauses instead of goto's.	2001-10-11 22:48:37 +00:00
John Baldwin	61d80e90a9	Add missing includes of sys/ktr.h.	2001-10-11 17:53:43 +00:00
John Baldwin	7106ca0d1a	Add missing includes of sys/lock.h.	2001-10-11 17:52:20 +00:00
Michael Reifenberger	91a701cd13	Fix SysV Semaphore Handling. Updated by peter following KSE and Giant pushdown. I've been running with this patch for two weeks with no ill side effects. PR: kern/12014: Fix SysV Semaphore handling Submitted by: Peter Jeremy <peter.jeremy@alcatel.com.au>	2001-10-11 08:15:14 +00:00
Paul Saab	cbc89bfbfe	Make MAXTSIZ, DFLDSIZ, MAXDSIZ, DFLSSIZ, MAXSSIZ, SGROWSIZ loader tunable. Reviewed by: peter MFC after: 2 weeks	2001-10-10 23:06:54 +00:00
John Baldwin	f21fc12736	Add a temporary hack that will go away with the ucred API update to bzero the duplicated mutex before initializing it to avoid triggering the check for init'ing an already initialized mutex.	2001-10-10 20:45:40 +00:00
John Baldwin	6a40eccec3	Malloc mutexes pre-zero'd as random garbage (including 0xdeadcode) may trigger the check to make sure we don't initialize a mutex twice.	2001-10-10 20:43:50 +00:00
Doug Rabson	e913ca22e2	Move setregs() out from under the PROC_LOCK so that it can use functions like suword() which may trap.	2001-10-10 20:04:57 +00:00
Robert Watson	8a7d8cc675	- Combine kern.ps_showallprocs and kern.ipc.showallsockets into a single kern.security.seeotheruids_permitted, described as: "Unprivileged processes may see subjects/objects with different real uid" NOTE: kern.ps_showallprocs exists in -STABLE, and therefore there is an API change. kern.ipc.showallsockets does not. - Check kern.security.seeotheruids_permitted in cr_cansee(). - Replace visibility calls to socheckuid() with cr_cansee() (retain the change to socheckuid() in ipfw, where it is used for rule-matching). - Remove prison_unpcb() and make use of cr_cansee() against the UNIX domain socket credential instead of comparing root vnodes for the UDS and the process. This allows multiple jails to share the same chroot() and not see each other's UNIX domain sockets. - Remove unused socheckproc(). Now that cr_cansee() is used universally for socket visibility, a variety of policies are more consistently enforced, including uid-based restrictions and jail-based restrictions. This also better-supports the introduction of additional MAC models. Reviewed by: ps, billf Obtained from: TrustedBSD Project	2001-10-09 21:40:30 +00:00
John Baldwin	8688bb9383	proces -> process in a comment.	2001-10-09 17:25:30 +00:00
Robert Watson	32d186043b	o Recent addition of (p1==p2) exception in p_candebug() permitted processes to attach debugging to themselves even though the global kern_unprivileged_procdebug_permitted policy might disallow this. o Move the kern_unprivileged_procdebug_permitted check above the (p1==p2) check. Reviewed by: des	2001-10-09 16:56:29 +00:00
John Baldwin	74e4502e62	Replace 'curproc' with 'td->td_proc'.	2001-10-08 21:05:46 +00:00
Matthew Dillon	917efbaaba	WS Cleanup	2001-10-08 19:51:13 +00:00
Dag-Erling Smørgrav	3da3249106	Dissociate ptrace from procfs. Until now, the ptrace syscall was implemented as a wrapper that called various functions in procfs depending on which ptrace operation was requested. Most of these functions were themselves wrappers around procfs_{read,write}_{,db,fp}regs(), with only some extra error checks, which weren't necessary in the ptrace case anyway. This commit moves procfs_rwmem() from procfs_mem.c into sys_process.c (renaming it to proc_rwmem() in the process), and implements ptrace() directly in terms of procfs_{read,write}_{,db,fp}regs() instead of having it fake up a struct uio and then call procfs_do{,db,fp}regs(). It also moves the prototypes for procfs_{read,write}_{,db,fp}regs() and proc_rwmem() from proc.h to ptrace.h, and marks all procfs files except procfs_machdep.c as "optional procfs" instead of "standard".	2001-10-07 20:08:42 +00:00
Dag-Erling Smørgrav	23fad5b6c9	Always succeed if the target process is the same as the requesting process.	2001-10-07 20:06:03 +00:00
Ian Dowse	80f42b555d	Fix a typo in do_sigaction() where sa_sigaction and sa_handler were confused. Since sa_sigaction and sa_handler alias each other in a union, the bug was completely harmless. This had been fixed as part of the SIGCHLD changes in revision 1.125, but it was reverted when they were backed out in revision 1.126.	2001-10-07 16:11:37 +00:00
Robert Watson	c175d2226f	o Introduce an 'options REGRESSION'-dependent sysctl namespace, 'regression.*'. o Add 'regression.securelevel_nonmonotonic', conditional on 'options REGRESSION', which allows the securelevel to be lowered for the purposes of efficient regression testing of securelevel policy decisions. Regression tests for securelevels will be committed shortly. NOTE: 'options REGRESSION' should never be used on production machines, as it permits violation of system invariants so as to improve the ability to effectively test edge cases, and improve testing efficiency.	2001-10-07 03:51:22 +00:00
Marcel Moolenaar	49ead724c6	Fix breakage caused by previous commit. The lkmnosys and lkmressys syscalls are of type NODEF but not in a way that fits the given definition of that type. The exact difference of lkmressys and lkmnosys is unclear, which makes it all the more confusing. A reevaluation of what we have and what we really need is in order. Spotted by: Maxime Henrion <mux@qualys.com> Pointy hat: marcel	2001-10-07 00:16:31 +00:00
Matthew Dillon	845bd795c9	vinvalbuf() was only waiting for write-I/O to complete. It really has to wait for both read AND write I/O to complete. Only NFS calls vinvalbuf() on an active vnode (when the server indicates that the file is stale), so this bug fix only affects NFS clients. MFC after: 3 days	2001-10-05 20:10:32 +00:00
John Baldwin	43150722c9	The aio kthreads start off with a root credential just like all other kthreads, so don't malloc a ucred just so we can create a duplicate of the one we already have.	2001-10-05 17:55:11 +00:00
Paul Saab	4787fd37af	Only allow users to see their own socket connections if kern.ipc.showallsockets is set to 0. Submitted by: billf (with modifications by me) Inspired by: Dave McKay (aka pm aka Packet Magnet) Reviewed by: peter MFC after: 2 weeks	2001-10-05 07:06:32 +00:00
Dag-Erling Smørgrav	50f74e92b8	Final style(9) commit: placement of opening brace; a continuation indent I missed in the previous commit; a line that exceeded 80 characters. No functional changes, but the object file's md5 checksum changes because some lines have been displaced.	2001-10-04 16:35:44 +00:00
Dag-Erling Smørgrav	8a8d4e459c	More style(9) fixes: no spaces between function name and parameter list; some indentation fixes (particularly continuation lines). Reviewed by: md5(1)	2001-10-04 16:29:45 +00:00
Dag-Erling Smørgrav	c5799337ea	This file had a mixture of "return foo;" and "return (foo);"; standardize on "return (foo);" as mandated by style(9). Reviewed by: md5(1)	2001-10-04 16:09:22 +00:00
David Malone	2bc21ed985	Hopefully improve control message passing over Unix domain sockets. 1) Allow the sending of more than one control message at a time over a unix domain socket. This should cover the PR 29499. 2) This requires that unp_{ex,in}ternalize and unp_scan understand mbufs with more than one control message at a time. 3) Internalize and externalize used to work on the mbuf in-place. This made life quite complicated and the code for sizeof(int) < sizeof(file *) could end up doing the wrong thing. The patch always creates a new mbuf/cluster now. This resulted in the change of the prototype for the domain externalise function. 4) You can now send SCM_TIMESTAMP messages. 5) Always use CMSG_DATA(cm) to determine the start of the data in unp_{ex,in}ternalize. It was using ((struct cmsghdr *)cm + 1) in some places, which gives the wrong alignment on the alpha. (NetBSD made this fix some time ago). This results in an ABI change for descriptor passing and creds passing on the alpha. (Probably on the IA64 and Sparc ports too). 6) Fix userland programs to use CMSG_* macros too. 7) Be more careful about freeing mbufs containing (file *)s. This is made possible by the prototype change of externalise. PR: 29499 MFC after: 6 weeks	2001-10-04 13:11:48 +00:00
David Malone	59bdd40568	Allow sbcreatecontrol to make cluster sized control messages.	2001-10-04 12:59:53 +00:00
John Baldwin	0479e3d339	Move the ap boot spin lock earlier in the lock order before the sio(4) lock since we occasionally call printf() while holding the ap boot lock which can call down into the sio(4) driver if using a serial console.	2001-10-01 22:50:30 +00:00
Robert Watson	c6ab2f6b4e	o Complete the migration from suser error checking in the following form in vfs_syscalls.c: if (mp->mnt_stat.f_owner != p->p_ucred->cr_uid && (error = suser_td(td)) != 0) { unwrap_lots_of_stuff(); return (error); } to: if (mp->mnt_stat.f_owner != p->p_ucred->cr_uid) { error = suser_td(td); if (error) { unwrap_lots_of_stuff(); return (error); } } This makes the code more readable when complex clauses are in use, and minimizes conflicts for large outstanding patchsets modifying the kernel authorization code (of which I have several), especially where existing authorization and context code are combined in the same if() conditional. Obtained from: TrustedBSD Project	2001-10-01 20:01:07 +00:00
Matthew Dillon	b5810bab2d	After extensive testing it has been determined that adding complexity to avoid removing higher level directory vnodes from the namecache has no perceivable effect and will be removed. This is especially true when vmiodirenable is turned on, which it is by default now. ( vmiodirenable makes a huge difference in directory caching ). The vfs.vmiodirenable and vfs.nameileafonly sysctls have been left in to allow further testing, but I expect to rip out vfs.nameileafonly soon too. I have also determined through testing that the real problem with numvnodes getting too large is due to the VM Page cache preventing the vnode from being reclaimed. The directory stuff made only a tiny dent relative to Poul's original code, enough so that some tests succeeded. But tests with several million small files show that the bigger problem is the VM Page cache. This will have to be addressed by a future commit. MFC after: 3 days	2001-10-01 04:33:35 +00:00
Jonathan Lemon	1a6fc8ef63	When FREE()ing kqueue related structures, charge them to the correct bucket. Submitted by: iedowse Forgotten by: jlemon	2001-09-30 17:00:56 +00:00
Bosko Milekic	70a61707f6	Re-enable mbtypes statistics in the mbuf allocator. I disabled these when I changed the allocator bits. This implements per-CPU mbtypes stats by keeping net number of decrements/increments of a given mbtype per-CPU and then summing all of the per-CPU mbtypes to produce the total net number of allocated mbufs of the given mbtype. Counters are carefully balanced to avoid/prevent underflows/overflows. mbtypes stats are re-enabled with the idea that we may occasionally (although very rarely) observe slight inconsistencies in the stat reporting. Most of the time, we should be fine, though. Also make appropriate modifications to netstat(1) and systat(1) to do the necessary reporting. Submitted by: Jiangyi Liu <jyliu@163.net>	2001-09-30 01:58:39 +00:00
Jonathan Lemon	0217f5c71e	Have EVFILT_TIMERS allocate their callouts via malloc() instead of using the static callout list allocated by the system. Change malloc type from M_TEMP to M_KQUEUE to better track memory. Add a kern.kq_calloutmax to globally limit the amount of kernel memory that can be allocated by callouts. Submitted by: iedowse (items 1, 2)	2001-09-29 17:48:39 +00:00
Dag-Erling Smørgrav	5b6db47748	Add a couple of API functions I need for my pseudofs WIP. Documentation will follow when I've decided whether to keep this API or ditch it in favor of something slightly more subtle.	2001-09-29 00:32:46 +00:00
Marcel Moolenaar	4166877345	Make the NODEF type usable. A syscall of type NODEF will only have its entry in the syscall table added. Nothing else is done. This differs from type NOPROTO in that NOPROTO adds a definition to syscall.h besides adding a sysent. A syscall can now have multiple entries without conflict. Note that the argssize is fixed and depends on the syscall name.	2001-09-28 01:21:57 +00:00
Robert Watson	87fce2bb96	o When performing a securelevel check as part of securelevel_ge() or securelevel_gt(), determine first if a local securelevel exists -- if so, perform the check based on imax(local, global). Otherwise, simply use the global value. o Note: even though local securelevels might lag below the global one, if the global value is updated to higher than local values, maximum will still be used, making the global dominant even if there is local lag. Obtained from: TrustedBSD Project	2001-09-26 20:41:48 +00:00
Robert Watson	8a528812a0	o Modify kern.securelevel MIB entry to return a local securelevel, if one is present in the current jail, otherwise, to return the global securelevel. o If the securelevel is being updated, require that it be greater than the maximum of local and global, if a local securelevel exists, otherwise, just maximum of the global. If there is a local securelevel, update the local one instead of the global one. o Note: this does allow local securelevels to lag behind the global one as long as the local one is not updated following a global increase. Obtained from: TrustedBSD Project	2001-09-26 20:39:48 +00:00
Robert Watson	567931c8f6	o Initialize per-jail securelevel from global securelevel as part of jail creation. Obtained from: TrustedBSD Project	2001-09-26 20:37:15 +00:00
Robert Watson	d501d04b9e	o Modify static settime() to accept the proc * for the process requesting a time change, and callers so that they provide td->td_proc. o Modify settime() to use securelevel_gt() for securelevel checking. Obtained from: TrustedBSD Project	2001-09-26 19:53:57 +00:00
Robert Watson	c2f413af19	o Modify sysctl access control check to use securelevel_gt(), and clarify sysctl access control logic. Obtained from: TrustedBSD Project	2001-09-26 19:51:25 +00:00
Matthew Dillon	46cad5761c	Enable vmiodirenable by default. Remove incorrect comment from sysctl.conf. MFC after: 1 week	2001-09-26 19:35:04 +00:00
Matthew Dillon	3418ebebfe	Make uio_yield() a global. Call uio_yield() between chunks in vn_rdwr_inchunks(), allowing other processes to gain an exclusive lock on the vnode. Specifically: directory scanning, to avoid a race to the root directory, and multiple child processes coring simultaneously so they can figure out that some other core'ing child has an exclusive adv lock and just exit instead. This completely fixes performance problems when large programs core. You can have hundreds of copies (forked children) of the same binary core all at once and not notice. MFC after: 3 days	2001-09-26 06:54:32 +00:00
Paul Saab	88b1d98f31	Lock the vnode while truncating the corefile. This fixes a panic with softupdates dangling deps. Submitted by: peter MFC: ASAP :)	2001-09-26 01:24:07 +00:00
John Baldwin	21377ce065	Remove superfluous parens after de-macroizing.	2001-09-26 00:05:18 +00:00
Robert Watson	75bc5b3f22	o So, when <dd> e-mailed me and said that the comment was inverted for securelevel_ge() and securelevel_gt(), I was a little surprised, but fixed it. Turns out that it was the code that was inverted, during a whitespace cleanup in my commit tree. This commit inverts the checks, and restores the comment.	2001-09-25 21:08:33 +00:00
John Baldwin	dde96c9933	Since we no longer inline any debugging code in the mutex operations, move all the debugging code into the function versions of the mutex operations in kern_mutex.c. This reduced the __mtx_* macros to simply wrappers of the _{get,rel}_lock_* macros, so the __mtx_* macros were also abolished in favor of just calling the _{get,rel}_lock_* macros. The tangled hairy mass of macros calling macros is at least a bit more sane now.	2001-09-22 21:19:55 +00:00
Robert Watson	b4799065ef	o vpaccess() -> vn_access() -- Peter reminds me that there is already a convention for vnop helper routines of this sort. Submitted by: Mr Wemm <peter>	2001-09-22 03:07:41 +00:00
John Baldwin	ed01445d8f	Use the passed in thread to selrecord() instead of curthread.	2001-09-21 22:46:54 +00:00
John Baldwin	456ca585db	Use the passed in thread pointer instead of curthread in calls to selrecord() in ptcpoll(). The pre-KSE code used the passed in proc pointer rather than curproc, and an earlier seltrue() call uses the passed in thread and not curthread.	2001-09-21 22:22:25 +00:00
John Baldwin	fea2ab833e	The P_SELECT flag was moved from p->p_flag to td->td_flags, but p_flag was locked by the proc lock and td_flags is locked by the sched_lock. The places that read, set, and cleared TDF_SELECT weren't updated, so they read and modified td_flags w/o holding the sched_lock, meaning that they could corrupt the per-thread flags field. As an immediate band-aid, grab sched_lock while reading and manipulating td_flags in relation to TDF_SELECT. This will probably be cleaned up some later on.	2001-09-21 22:06:22 +00:00
John Baldwin	e649bcb506	Remove unneeded proc variables and fix comments.	2001-09-21 21:54:45 +00:00
Robert Watson	a90a3f2882	o Part two of eaccess(2) commit, rebuilt system call code. Obtained from: TrustedBSD Project	2001-09-21 21:34:06 +00:00
Robert Watson	9c94f7731e	o Introduce eaccess(2), a version of access(2) that uses the effective credentials rather than the real credentials. This is useful for implementing GUI's which need to modify icons based on access rights, but where use of open(2) is too expensive, use of stat(2) doesn't reflect the file system's real protection model, and use of access() suffers from real/effective credential confusion. This implementation provides the same semantics as the call of the same name on SCO OpenServer. Note: using this call improperly can leave you subject to some of the same races present in the access(2) call. o To implement this, break out the basic logic of access(2) into vpaccess(), which accepts a passed credential to perform the invocation of VOP_ACCESS(). Add eaccess(2) to invoke vpaccess(), and modify access(2) to use vpaccess(). Obtained from: TrustedBSD Project	2001-09-21 21:33:22 +00:00
John Baldwin	278da5113f	Remove a bogus comment. "atomic" doesn't mean that the operation is done as a physical atomic operation. That would require the code to use the atomic API, which it does not. Instead, the operation is made pseudo atomic (hence the quotes) by use of the lock to protect clearing all of the flags in question.	2001-09-21 19:26:57 +00:00
John Baldwin	21832b1ec0	GC some #if 0'd code.	2001-09-21 19:21:18 +00:00
John Baldwin	3226cbf43b	Whitespace and spelling fixes.	2001-09-21 19:16:12 +00:00
Michael Reifenberger	896de692f8	Make msgseg, msgssz (->msgmax) and msgmni TUNABLE.	2001-09-21 09:25:17 +00:00
Peter Wemm	1114d18594	Add a pointer to kenv(1).	2001-09-21 02:25:53 +00:00
Jonathan Lemon	57ea1fa07f	Revert last commit. The same functionality can be obtained through the 'kenv' command, which I obviously was unaware of.	2001-09-21 02:09:01 +00:00
Robert Watson	94088977c9	o Rename u_cansee() to cr_cansee(), making the name more comprehensible in the face of a rename of ucred to cred, and possibly generally. Obtained from: TrustedBSD Project	2001-09-20 21:45:31 +00:00
Jonathan Lemon	e492f03505	Add a sysctl MIB 'kern.env', that dumps the contents of the kernel environment from the loader, as well as the kernel's compiled in static hints.	2001-09-20 20:09:37 +00:00
Peter Wemm	fbd7a9dd97	decrement the dumping variable after use so we can call it several times if needed.	2001-09-20 06:08:53 +00:00
John Baldwin	a44f918bf9	Fix a bug in propagate priority: the kse group pointer wasn't being updated in the loop so the new thread always seemed to have the same priority as the original thread and no actual priorities were changed.	2001-09-19 22:52:59 +00:00
Robert Watson	288b789333	o Clarification of securelevel_{ge,gt} comment. Submitted by: dd	2001-09-19 14:09:13 +00:00
Peter Wemm	66f769fe39	Add missing ; in last commit Pointy-hat-to: jhb	2001-09-19 02:53:59 +00:00
Peter Wemm	98cdde71e7	Regenerate	2001-09-18 23:33:33 +00:00
Peter Wemm	eb25edbda3	Cleanup and split of nfs client and server code. This builds on the top of several repo-copies.	2001-09-18 23:32:09 +00:00
John Baldwin	9ef3a9855d	Use a 'p' variable instead of repetitively indirecting td->td_proc for signal things that are still per-process and won't be per-thread.	2001-09-18 23:27:06 +00:00
John Baldwin	8cc06751dd	Don't initialize proc0's mutex twice. It is already done earlier on in the MD startup code.	2001-09-18 22:09:47 +00:00
Robert Watson	3ca719f12e	o Introduce two new calls, securelevel_gt() and securelevel_ge(), which abstract the securelevel implementation details from the checking code. The call in -CURRENT accepts a struct ucred--in -STABLE, it will accept struct proc. This facilitates the upcoming commit of per-jail securelevel support. The calls will also generate a kernel printf if the calls are made with NULL ucred/proc pointers: generally speaking, there are few instances of this, and they should be fixed. o Update p_candebug() to use securelevel_gt(); future updates to the remainder of the kernel tree will be committed soon. Obtained from: TrustedBSD Project	2001-09-18 21:03:53 +00:00
Mark Peek	796ed2a6d0	Set debug information on the process being traced, not the current (debugger) process. This should allow gdb to function correctly on post-KSE kernels.	2001-09-18 19:06:11 +00:00
Jonathan Lemon	6a494eeb34	Change p into ke->ke_proc, this was hidden behind INVARIANTS.	2001-09-18 03:36:21 +00:00
Peter Wemm	d2718e479a	Fix a fatal type mismatch (char *static_env; vs char static_env[]). Submitted by: bde	2001-09-17 21:27:41 +00:00
Julian Elischer	fdd4e5c652	Replace line accidentally deleted during KSE additions. Symptom.. Stopped program unable to be restarted if it was stopped while already sleeping.	2001-09-17 20:42:25 +00:00
Robert Watson	9844fbc3b5	o Correct authorization check in CANSIGIO(), which suffered from incorrect transcription during the (pcred,ucred) merge; this was not used for the kill() system call, so does not affect direct explicit process signalling. Pointed out by: fenner	2001-09-15 22:34:46 +00:00
Peter Wemm	b711616825	In the devfs case, have initproc attempt the easy cases of mounting /dev. This works if /dev exists, or if / is read/write (nfsroot). If it is too hard, leave it up to init -d (which will probably fail if /dev does not exist, but there isn't much else we can do short of making a union mount on /). This means we get a proper /dev if you boot a 5.x kernel on a 4.x world, which I happen to do often (the ramdisks on our install netboot servers have 4.x userland worlds on them).	2001-09-15 11:15:22 +00:00
Doug Rabson	de1792cbb8	The ia64 kernel is now linked dynamically so parse its _DYNAMIC structure.	2001-09-15 11:02:10 +00:00
John Baldwin	bce9841972	Fix locking on td_flags for TDF_DEADLKTREAT. If the comments in the code are true that curthread can change during this function, then this flag needs to become a KSE flag, not a thread flag.	2001-09-13 22:33:37 +00:00
Michael Reifenberger	d528be2bf3	PR: kern/29698 (part) Reviewed by: audit Implement SEM_STAT (like IPC_STAT but treats semid as sema-index). The linuxerator will need it.	2001-09-13 21:06:41 +00:00
Michael Reifenberger	b3a4bc4247	PR: kern/29698 (part) Reviewed by: audit Add tunables for the sem* and shm* syscontrols for tuning on boottime until they become dynamic. SAP R/3 doesn't like the compiled in defaults.	2001-09-13 20:20:09 +00:00
Julian Elischer	9dbea9237c	If an incoming struct proc could have been NULL before, then don't automatically change the code to add struct proc *p = td->td_proc; because now 'td' is probably capable of being NULL too. I expect to see more of this kind of error during the 'weeding' process. It's too easy to make. (junior hacker project.. look for these :-) Submitted by: Mark Peek <mp@freebsd.org>	2001-09-12 20:26:57 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2. Note: ALL MODULES MUST BE RECOMPILED. Make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Peter Wemm	8ee6d9e90f	Fix the kern.module_path issue that required the trailing '/' character on each module path component. Fix a one-byte buffer overflow at the same time that got highlighted in the process.	2001-09-12 00:50:23 +00:00
Dima Dorfman	34d2276e63	Correct a debugging message.	2001-09-11 12:20:24 +00:00
Peter Wemm	505222d35f	Implement the long-awaited module->file cache database. A userland tool (kldxref(8)) keeps a cache of what modules and versions are inside what .ko files. I have tested this on both Alpha and i386. Submitted by: bp	2001-09-11 01:09:24 +00:00
John Baldwin	04b5a9bbd6	- Axe holding_giant as it is not used now anyways and was ok'd by dillon in an earlier e-mail. - We don't need to test the console right before we vfprintf() the panicstr message. The printing of the panic message is a fine console test by itself and doesn't make useful messages scroll off the screen or tick developers off in quite the same way. Requested by: jlemon, imp, bmilekic, chris, gsutter, jake (2)	2001-09-10 21:04:49 +00:00
Peter Wemm	b03a0c9e5e	Fix a warning on alpha (real problem) and make pstat -t work as a bonus. 'struct tty' was out of sync in user and kernel due to dev_t/udev_t mixups. This takes advantage of the fact that dev_t changes type in userland, so it isn't too pretty.	2001-09-10 12:05:47 +00:00
Dima Dorfman	b40832162b	Make the `nsops' variable in `semop' unsigned. This prevents an overflow if uap->nsops (which is already unsigned) is over INT_MAX; consequently, the bounds check below becomes valid. Previously, if a value over INT_MAX was passed in uap->nsops, the bounds check wouldn't catch it, and the value would be used to compute copyin()'s third argument. Obtained from: NetBSD	2001-09-10 11:36:08 +00:00
Kris Kennaway	bf61e26696	Fix some signed/unsigned integer confusion, and add bounds checking of arguments to some functions. Obtained from: NetBSD Reviewed by: peter MFC after: 2 weeks	2001-09-10 11:28:07 +00:00
Peter Wemm	ed6c38886e	Fix a warning. l_name is managed by us and is malloc/free'ed. It is the userland declaration of l_name that is inconvenient for us.	2001-09-10 07:53:04 +00:00
Peter Wemm	e414d9aad7	Add on UPAGES to ki_rssize since it is there as result of the process and can be swapped out with the process.	2001-09-10 07:29:32 +00:00
Peter Wemm	eb30c1c0b9	Rip some well duplicated code out of cpu_wait() and cpu_exit() and move it to the MI area. KSE touched cpu_wait() which had the same change replicated five ways for each platform. Now it can just do it once. The only MD parts seemed to be dealing with fpu state cleanup and things like vm86 cleanup on x86. The rest was identical. XXX: ia64 and powerpc did not have cpu_throw(), so I've put a functional stub in place. Reviewed by: jake, tmm, dillon	2001-09-10 04:28:58 +00:00
Matthew Dillon	06ae1e91c4	This brings in a Yahoo coredump patch from Paul, with additional mods by me (addition of vn_rdwr_inchunks). The problem Yahoo is solving is that if you have large process images core dumping, or you have a large number of forked processes all core dumping at the same time, the original coredump code would leave the vnode locked throughout. This can cause the directory vnode to get locked up, which can cause the parent directory vnode to get locked up, and so on all the way to the root node, locking the entire machine up for extremely long periods of time. This patch solves the problem in two ways. First it uses an advisory non-blocking lock to abort multiple processes trying to core to the same file. Second (my contribution) it chunks up the writes and uses bwillwrite() to avoid holding the vnode locked while blocking in the buffer cache. Submitted by: ps Reviewed by: dillon MFC after: 2 weeks	2001-09-08 20:02:33 +00:00
John Baldwin	df53e91c18	Call sendsig() with the proc lock held and return with it held.	2001-09-06 22:20:41 +00:00
Peter Wemm	fc8b64e494	Sigh. Dig up text from a signature in a 1994 Usenet post I made and redo the ..uhh... ``console test'' to avoid another 50 emails about GPL issues.	2001-09-05 23:51:06 +00:00
David E. O'Brien	faf73940c6	Fix the definition generation code from rev 1.15 that generates non-style(9) compliant structure definitions.	2001-09-05 01:27:53 +00:00
Ian Dowse	7476f7e87d	Fix a memory leak in __getcwd() that can occur after a filesystem has been forcibly unmounted. If the filesystem root vnode is reached and it has no associated mountpoint (vp->v_mount == NULL), __getcwd would return without freeing 'buf'. Add the missing free() call. PR: kern/30306 Submitted by: Mike Potanin <potanin@mccme.ru> MFC after: 1 week	2001-09-04 19:03:47 +00:00
Peter Wemm	c92c4c8f79	Unindent an if (1) { that was left behind in the last commit. (commits were separated to not obscure the real change)	2001-09-03 04:39:38 +00:00
Peter Wemm	00dda5e82b	Argh. Make the ia64 kernel work in all situations. For some reason, and I still don't know why, this was not failing on the non-kse kernel. It certainly should have since things were using linker_kernel_file unconditionally. This has highlighted a different problem though that means that trying to do a kldload on a non-dynamic kernel will implode.	2001-09-03 04:37:55 +00:00
David E. O'Brien	6533ba2e33	Match the declaration in net/netisr.h. Submitted by: gcc 3.0.1	2001-09-03 03:24:31 +00:00
Peter Wemm	772121fd11	The !RESTARTABLE_PANICS code has some loose ends.	2001-09-02 12:24:38 +00:00
Peter Wemm	ef4181d98e	For ia64, set the default elf brand to be FreeBSD. This is temporarily necessary only for as long as we're using a linux toolchain.	2001-09-02 12:23:08 +00:00
John Baldwin	e342cd279f	Use sched_lock to protect rtp_to_pri() and pri_to_rtp() when needed.	2001-09-02 01:05:36 +00:00
John Baldwin	51b4eed974	Protect pri_to_rtp() with sched_lock when needed.	2001-09-02 00:52:11 +00:00
Chris D. Faulhaber	dbb14f9874	In the case of ACL_OTHER and undefined ACL entry id's, set ae_id to ACL_UNDEFINED_ID instead of 0. Reviewed by: rwatson	2001-09-01 23:16:02 +00:00
John Baldwin	da3abba462	Remove #if 0'd remnants of the old idle page zeroing.	2001-09-01 20:17:43 +00:00
Matthew Dillon	c8b8bac3ed	Regenerate syscalls	2001-09-01 19:37:41 +00:00
Matthew Dillon	257d198890	Synchronize syscalls.master(s) with recent Giant pushdown work	2001-09-01 19:36:48 +00:00
Matthew Dillon	ad2edad94e	Giant Pushdown: read() pread() readv() write() pwrite() writev() ioctl() select() poll() openbsd_poll()	2001-09-01 19:34:23 +00:00
Matthew Dillon	835a82ee2d	Giant Pushdown. Saved the worst P4 tree breakage for last. reboot() getpriority() setpriority() rtprio() osetrlimit() ogetrlimit() setrlimit() getrlimit() getrusage() getpid() getppid() getpgrp() getpgid() getsid() getgid() getegid() getgroups() setsid() setpgid() setuid() seteuid() setgid() setegid() setgroups() setreuid() setregid() setresuid() setresgid() getresuid() getresgid() __setugid() getlogin() setlogin() modnext() modfnext() modstat() modfind() kldload() kldunload() kldfind() kldnext() kldstat() kldfirstmod() kldsym() getdtablesize() dup2() dup() fcntl() close() ofstat() fstat() nfsstat() fpathconf() flock()	2001-09-01 19:04:37 +00:00
Matthew Dillon	fb99ab8811	Giant Pushdown clock_gettime() clock_settime() nanosleep() settimeofday() adjtime() getitimer() setitimer() __sysctl() ogetkerninfo() sigaction() osigaction() sigpending() osigpending() osigvec() osigblock() osigsetmask() sigsuspend() osigsuspend() osigstack() sigaltstack() kill() okillpg() trapsignal() nosys()	2001-09-01 18:19:21 +00:00
Matthew Dillon	6f1e8c186f	Pushdown Giant for: profil(), ntp_adjtime(), ogethostname(), osethostname(), ogethostid(), osethostid()	2001-09-01 05:47:58 +00:00
Matthew Dillon	234216ef98	Giant pushdown sys_exit(), [o]wait(), wait4()	2001-09-01 04:37:34 +00:00
Matthew Dillon	f708f4d189	Giant Pushdown ACL syscalls: __acl_get_file, __acl_set_file, __acl_get_fd, __acl_set_fd, __acl_delete_file, __acl_delete_fd, __acl_aclcheck_file, __acl_aclcheck_fd	2001-09-01 04:33:22 +00:00
Matthew Dillon	f7b200fd2f	regenerate syscalls	2001-09-01 03:56:12 +00:00
Matthew Dillon	918c3b1361	Make yield() MPSAFE. Synchronize syscalls.master with all MPSAFE changes to date. Synchronize new syscall generation follows because yield() will panic if it is out of sync with syscalls.master.	2001-09-01 03:54:09 +00:00
Matthew Dillon	116734c4d1	Pushdown Giant for acct(), kqueue(), kevent(), execve(), fork(), vfork(), rfork(), jail().	2001-09-01 03:04:31 +00:00
Matthew Dillon	2afac34da3	Make various posix4 system calls MPSAFE (will fixup syscalls.master later) sched_setparam() sched_getparam() sched_setscheduler() sched_getscheduler() sched_yield() sched_get_priority_max() sched_get_priority_min() sched_rr_get_interval()	2001-08-31 22:34:40 +00:00
Robert Watson	93f4fd1cb6	o Screw over users of the kern.{security.,}suser_permitted sysctl again, by renaming it to kern.security.suser_enabled. This makes the name consistent with other use: "permitted" now refers to a specific right or privilege, whereas "enabled" refers to a feature. As this hasn't been MFC'd, and using this destroys a running system currently, I believe the user base of the sysctl will not be too unhappy. o While I'm at it, un-staticize and export the supporting variable, as it will be used by kern_cap.c shortly. Obtained from: TrustedBSD Project	2001-08-31 21:44:12 +00:00
Matthew Dillon	df9987602f	Giant pushdown syscalls in kern/uipc_syscalls.c. Affected calls: recvmsg(), sendmsg(), recvfrom(), accept(), getpeername(), getsockname(), socket(), connect(), accept(), send(), recv(), bind(), setsockopt(), listen(), sendto(), shutdown(), socketpair(), sendfile()	2001-08-31 00:37:34 +00:00
Matthew Dillon	b6a4b4f9ae	Giant Pushdown: sysv shm, sem, and msg calls.	2001-08-31 00:02:18 +00:00
Matthew Dillon	356861db03	Remove the MPSAFE keyword from the parser for syscalls.master. Instead introduce the [M] prefix to existing keywords. e.g. MSTD is the MP SAFE version of STD. This is preparatory for a massive Giant lock pushdown. The old MPSAFE keyword made syscalls.master too messy. Begin comments MP-Safe procedures with the comment: /* * MPSAFE */ This comment means that the procedure may be called without Giant held (The procedure itself may still need to obtain Giant temporarily to do its thing). sv_prepsyscall() is now MP SAFE and assumed to be MP SAFE sv_transtrap() is now MP SAFE and assumed to be MP SAFE ktrsyscall() and ktrsysret() are now MP SAFE (Giant Pushdown) trapsignal() is now MP SAFE (Giant Pushdown) Places which used to do the if (mtx_owned(&Giant)) mtx_unlock(&Giant) test in syscall[2]() in */*/trap.c now do not. Instead they explicitly unlock Giant if they previously obtained it, and then assert that it is no longer held to catch broken system calls. Rebuild syscall tables.	2001-08-30 18:50:57 +00:00
Andrey A. Chernov	c8e7634357	advlock: simplify overflow checks	2001-08-29 18:53:53 +00:00
Andrey A. Chernov	63347f1e8f	lseek: simplify overflow checks	2001-08-29 18:35:53 +00:00
Robert Watson	3c4543e046	o Reduce gratuitous whitespace difference from Darwin.	2001-08-29 17:18:04 +00:00
Peter Wemm	df55753880	Fix the ogetkerninfo() syscall handling of sizes for KINFO_BSDI_SYSINFO. This supposedly fixes Netscape 3.0.4 (bsdi binary) on -current. (and is also applicable to RELENG_4) PR: 25476 Submitted by: Philipp Mergenthaler <un1i@rz.uni-karlsruhe.de>	2001-08-29 11:47:53 +00:00
Brian Somers	546a92c4d4	OR M_WAITOK with M_ZERO in malloc()s args for clarity.	2001-08-28 23:58:32 +00:00
Robert Watson	7fd6a9596d	o Improve the style of a number of routines and comments in kern_prot.c, with regards to redundancy, formatting, and style(9). Submitted by: bde	2001-08-28 16:35:33 +00:00
Robert Watson	4bcbade869	Fix typos in recent comments. Submitted by: dd	2001-08-28 05:16:19 +00:00
Robert Watson	3b243b7292	Generally improve documentation of kern_prot.c: o Add comments for: - kern.security.suser_permitted - p_cansee() - p_cansignal() - p_cansched() - kern.security.unprivileged_procdebug_permitted - p_candebug() Update copyright. Obtained from: TrustedBSD	2001-08-27 16:01:52 +00:00
Peter Wemm	0f7289022b	If a file has been completely unlinked, stop automatically syncing the file. ffs will discard any pending dirty pages when it is closed, so we may as well not waste time trying to clean them. This doesn't stop other things from writing it out, eg: pageout, fsync(2) etc.	2001-08-27 06:09:56 +00:00
Andrey A. Chernov	c4778eed9f	Cosmetique & style fixes from bde	2001-08-26 10:23:49 +00:00
Peter Wemm	268bdb43f9	Optionize UPAGES for the i386. As part of this I split some of the low level implementation stuff out of machine/globaldata.h to avoid exposing UPAGES to lots more places. The end result is that we can double the kernel stack size with 'options UPAGES=4' etc. This is mainly being done for the benefit of a MFC to RELENG_4 at some point. -current doesn't really need this so much since each interrupt runs on its own kstack.	2001-08-25 02:20:02 +00:00
Bosko Milekic	76dcbd6f9f	Force a commit on kern_mutex.c to explain reason for last commit but while I'm at it also add a comment in mtx_validate() explaining the purpose of the last change. Basically, this fixes booting kernels compiled with MUTEX_DEBUG. What used to happen is before we setidt from init386() [still using BTX idt], we called mtx_init() on several mutex locks, notably Giant and some others. This is a problem for MUTEX_DEBUG because it enables mtx_validate() which calls kernacc(), some of which in turn requires Giant. Fix by calling kernacc() from mtx_validate() only if (!cold).	2001-08-24 23:00:59 +00:00
Bosko Milekic	ab07087e16	* empty log message *	2001-08-24 22:53:45 +00:00
John Baldwin	6385dec00e	Style nits: - Don't use punctuation or newlines in panic messages. - Remove excess blank lines. Requested and partially submitted by: bde	2001-08-24 17:46:58 +00:00
Peter Pentchev	ccdbd10cb7	Prevent passing a null pointer as a filename to vn_open(), if for some reason expand_name() failed to build a core file name. PR: 29931 Submitted by: Foldi Tamas <crow@kapu.hu> Reviewed by: dd, -arch MFC after: 1 month	2001-08-24 15:49:30 +00:00
Andrey A. Chernov	dc6e1079e6	Remove extra check, unneeded now	2001-08-24 10:20:26 +00:00
Robert Watson	670f6b2fc6	o Clarify comments in vaccess_acl_posix1e() ACL evaluation routine so as to improve readability and accuracy. Obtained from: TrustedBSD Project	2001-08-24 01:41:42 +00:00
John Baldwin	b0b7cb508c	Use witness_upgrade/downgrade for sx_try_upgrade/downgrade.	2001-08-23 22:51:22 +00:00
John Baldwin	c19fe5e261	Add witness_upgrade() and witness_downgrade() for handling upgrades and downgrades of shared/exclusive locks.	2001-08-23 22:47:05 +00:00
John Baldwin	d7c4536a55	Convert some KASSERT()'s into if (foo) panic() because they are testing how locks are managed by the rest of the kernel, not verifying the internal integrity of witness itself.	2001-08-23 22:44:47 +00:00
John Baldwin	1432aa0c5e	Add a new kernel option RESTARTABLE_PANICS. If this option is present, then one can restart from a panic by resetting the panicstr variable to NULL. This commit conditionalizes the previously committed functionality on this variable. It also removes the __dead2 attribute from the panic() function so that when one continues from a panic() the behavior will be predictable.	2001-08-23 20:32:21 +00:00
John Baldwin	e2870579fa	Clear the sx_xholder pointer when downgrading an exclusive lock.	2001-08-23 17:57:37 +00:00
Andrey A. Chernov	5d97bedb22	vn_stat(): if va_size (u_quad_t) > OFF_MAX, return EOVERFLOW, don't copy it blindly to st_size	2001-08-23 17:56:48 +00:00
Andrey A. Chernov	6fb9fbceab	Add yet another check for SEEK_END overflow	2001-08-23 17:09:23 +00:00
Andrey A. Chernov	db106eff39	lseek: fix check for vattr.va_size overflow. The check suggested by bde simply does not work with unsigned types.	2001-08-23 17:01:25 +00:00
Andrey A. Chernov	62be011ebd	Oops, fix my broken handling of new l_len<0 case	2001-08-23 16:00:27 +00:00
Andrey A. Chernov	f510e1c2ec	Originally BSD returned EINVAL for l_len < 0, but now POSIX wants it too, so implement POSIX l_len < 0 handling.	2001-08-23 15:40:30 +00:00
Andrey A. Chernov	6d24c65d96	Cosmetique: correct English in comments Pointed by: bde	2001-08-23 14:41:39 +00:00
Andrey A. Chernov	b82f5b624c	Cosmetique: more <sys/*> into one group, separate include families by blank line	2001-08-23 13:51:17 +00:00
Andrey A. Chernov	b44af710d3	Move <machine/> after <sys/> Pointed by: bde	2001-08-23 13:21:17 +00:00
Andrey A. Chernov	4b207d9868	Move <machine/> after <sys/> Add missing fdrop() before EOVERFLOW Pointed by: bde	2001-08-23 13:19:32 +00:00
Andrey A. Chernov	69cc1d0d7f	Detect off_t EOVERFLOW of start/end offsets calculations for adv. lock, as POSIX requires.	2001-08-23 07:42:40 +00:00
Thomas Moestl	040ef07af8	Regenerate from syscalls.master using the new makesyscalls.sh revision.	2001-08-22 23:27:20 +00:00
Thomas Moestl	a4189a088b	Add padding before each element of the syscall argument structures in sysproto.h in addition to the existing padding afterwards. This is needed to support big-endian architectures like sparc64. Reviewed by: bde Tested on alpha by: jhb	2001-08-22 23:22:47 +00:00
Alexander Langer	b8c526df70	Fix a simple typo I just happened to find.	2001-08-22 19:12:24 +00:00
Matthew Dillon	0cf5e0ebd6	Remove the code that limited the buffer_map to 1/2 the size of the kernel_map. maxbcache takes care of this now and the 1/2 limit can interfere with testing. Suggested by: bde	2001-08-22 18:10:37 +00:00
Matthew Dillon	219d632c15	Move most of the kernel submap initialization code, including the timeout callwheel and buffer cache, out of the platform specific areas and into the machine independent area. i386 and alpha adjusted here. Other cpus can be fixed piecemeal. Reviewed by: freebsd-smp, jake	2001-08-22 04:07:27 +00:00
John Baldwin	61e9650010	Clear db_active in boot() so that one can call the boot function (as well as use the panic command) w/o having to manually clear db_active first to avoid the db_error() in mi_switch().	2001-08-21 23:29:40 +00:00
John Baldwin	b285782b29	Release the sched_lock before bombing out in mi_switch() via db_error(). This makes things slightly easier if you call a function that calls mi_switch() as it keeps the locking before and after closer.	2001-08-21 23:10:37 +00:00
John Baldwin	1a5333c37c	Allow one to restart from a panic in DDB by clearing the panicstr variable to NULL. Note that since panic() is marked with __dead2, this has somewhat unpredictable results at best.	2001-08-21 22:55:20 +00:00
Andrey A. Chernov	383f169d4a	Make lseek() POSIXed: for non character special files 1) handle off_t overflow with EOVERFLOW 2) handle negative offsets with EINVAL Reviewed by: arch discussion	2001-08-21 21:20:42 +00:00
John Baldwin	161778121a	Add a hook to mi_switch() to abort via db_error() if we attempt to perform a context switch from DDB. Consulting from: bde	2001-08-21 20:09:05 +00:00
John Baldwin	91a4536f22	- Fix a bug in the previous workaround for the tsleep/endtsleep race. callout_stop() would fail in two cases: 1) The timeout was currently executing, and 2) The timeout had already executed. We only needed to work around the race for 1). We caught some instances of 2) via the PS_TIMEOUT flag, however, if endtsleep() fired after the process had been woken up but before it had resumed execution, PS_TIMEOUT would not be set, but callout_stop() would fail, so we would block the process until endtsleep() resumed it. Except that endtsleep() had already run and couldn't resume it. This adds a new flag PS_TIMOFAIL to indicate the case of 2) when PS_TIMEOUT isn't set. - Implement this race fix for condition variables as well. Tested by: sos	2001-08-21 18:42:45 +00:00
Peter Wemm	e8ebc08f80	Make COMPAT_43 optional again. XXX we need COMPAT_FBSD3 etc for this stuff.	2001-08-21 02:32:59 +00:00
Ian Dowse	8774836bf8	Avoid sleeping while holding a mutex in dounmount(). This problem has existed for a long time, but I made it worse a few months ago by adding calls to VFS_ROOT() and checkdirs() in revision 1.179. Also, remove the LK_REENABLE flag in the lockmgr() call; this flag has been ignored by the lockmgr code for 4 years. This was the only remaining mention of it apart from its definition. Reviewed by: jhb	2001-08-20 19:16:31 +00:00
Matthew Dillon	e1616f3a7b	Conditionalize VM_SWZONE_SIZE_MAX and VM_BCACHE_SIZE_MAX so MD sections that don't define these constants don't break.	2001-08-20 16:29:13 +00:00
Dima Dorfman	fcd7e67061	Sync the default module search path with the one in sys/boot/common/module.c. PR: 21405 Submitted by: Makoto MATSUSHITA <matusita@jp.FreeBSD.org>	2001-08-20 01:12:28 +00:00
Matthew Dillon	2f9e4e8025	Limit the amount of KVM reserved for the buffer cache and for swap-meta information. The default limits only affect machines with > 1GB of ram and can be overridden with two new kernel conf variables VM_SWZONE_SIZE_MAX and VM_BCACHE_SIZE_MAX, or with loader variables kern.maxswzone and kern.maxbcache. This has the effect of leaving more KVM available for sizing NMBCLUSTERS and 'maxusers' and should avoid tripups where a sysad adds memory to a machine and then sees the kernel panic on boot due to running out of KVM. Also change the default swap-meta auto-sizing calculation to allocate half of what it was previously allocating. The prior defaults were way too high. Note that we cannot afford to run out of swap-meta structures so we still stay somewhat conservative here.	2001-08-20 00:41:12 +00:00
Julian Elischer	a8cfc0ee40	Forgot to remove this un-needed test. (M_WAITOK won't fail) I vaguely remember someone once proving it COULD return NULL.. was that changed? Reminded by: BDE MFC after: 2 weeks	2001-08-19 04:30:13 +00:00
Julian Elischer	ad4ff09012	fix typo Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	2001-08-18 17:43:29 +00:00
Mark Peek	29b7fbd17f	Unbreak linux compatibility by providing the correct length of the buffer. Reported by: "Pierre Y. Dampure" <pierre.dampure@westmarsh.com>, "Niels Chr. Bank-Pedersen" <ncbp@bank-pedersen.dk> Pointy hat to: mp	2001-08-18 04:24:30 +00:00
Julian Elischer	8f364875fe	Don't allocate a 400 byte buffer on the stack, nor 800 bytes of structures. MFC after: 2 weeks	2001-08-18 02:53:50 +00:00
Dima Dorfman	0c1bb4fbf1	Implement a LOCAL_PEERCRED socket option which returns a `struct xucred` with the credentials of the connected peer. Obviously this only works (and makes sense) on SOCK_STREAM sockets. This works for both the connect(2) and listen(2) callers. There is precise documentation of the semantics in unix(4). Reviewed by: dwmalone (eyeballed)	2001-08-17 22:01:18 +00:00
Peter Wemm	0ecd57ad0b	Fix part of another problem that bde pointed out. This is different to what bde suggested though.	2001-08-16 23:43:24 +00:00
Peter Wemm	5a66a2532b	Remove redundant null-termination. The buffer is already explicitly zeroed, and we intentionally leave -1 on the strncpy length to leave the original \0. Submitted by: bde	2001-08-16 20:18:43 +00:00
Peter Wemm	a75a0c55f4	Don't explicitly null-terminate. The buffer we are copying into is already zeroed, and we explicitly leave the last byte untouched. Submitted by: bde	2001-08-16 20:16:20 +00:00
Mark Peek	911c2be00b	Reduce stack allocation (stack-fast?). elf_load_file() => 352 to 52 bytes exec_elf_imgact() => 1072 to 48 bytes elf_corehdr() => 396 to 8 bytes Reviewed by: julian	2001-08-16 16:14:26 +00:00
Peter Wemm	77330eeba7	Use the backwards compatibility mechanisms so that ps/top etc don't have unnecessary breakage. While here, use explicit sizes for the string fields so that we don't have unintentional changes again in the future when key tunables change. This still is not quite right, but a june userland is happy with a -current kernel with these tweaks.	2001-08-16 08:41:15 +00:00
Peter Wemm	6eef6816a8	Use explicit sizes for the prpsinfo command length string so that we don't have any more unexpected changes in core dumps. This gets us back to the original core dump layout from a few days ago.	2001-08-16 08:35:51 +00:00
Bruce Evans	a572c95c3b	Don't dump on the label sector or below. This avoids clobbering the label if the dump device overlaps the label (which is a slight misconfiguration). Dump routines don't use dscheck(), so the normal write protection of the label doesn't help. Reduced some nearby overflow bugs. In disk_dumpcheck(), there was (fatal but fail-safe) overflow on i386's with 4GB of memory, at least if Maxmem was the top page (can this happen?). The fix assumes that the sector size divides PAGE_SIZE (dump routines already assume this). In setdumpdev(), the corresponding overflow occurred with only about 2GB of memory on all machines with 32-bit ints. This allowed setdumpdev() to succeed when it shouldn't have, but then disk_dumpcheck() failed safe later. Except in old versions of FreeBSD like RELENG_3 where there is no disk_dumpcheck(). PR: 28164 (label clobbering part) MFC after: 1 week	2001-08-15 11:35:45 +00:00
Jason Evans	54db32e945	Implement kernel semaphores. Reviewed by: jhb	2001-08-14 22:13:14 +00:00
Jason Evans	d55229b72e	Add sx_try_upgrade() and sx_downgrade(). Submitted by: Alexander Kabaev <ak03@gte.com>	2001-08-13 21:25:30 +00:00
John Baldwin	3f085c228e	If we've panic'd already, then just bail in lockmgr rather than blocking or possibly panic'ing again.	2001-08-10 23:29:15 +00:00
Bill Paul	c214e6636e	Fix some of the GDB linkage setup. The l_name member of the gdb linkage structure is always free()ed yet only sometimes malloc()ed. In particular, it was simply set to point to l_filename from a linker_file_t in link_elf_link_preload_finish(). The l_filename had been malloc()ed inside the kern_linker.c module and was being free()ed twice: once by link_elf_unload_file() and again by linker_file_unload(), leading to a panic. How to duplicate the problem: - Pre-load a kernel module from the loader, i.e. if_sis.ko - Boot system - Attempt to unload module with kldunload if_sis - Bewm The problem here is that the case where the module was loaded with kldload after system boot would work correctly, so this bug went unnoticed until I stubbed my toe on it just now. (Also, you can only trip this bug if you compile a kernel with options DDB, but that's the default now.) Fix: remember to malloc() a separate copy of the module name for the l_name member of the gdb linkage structure in three places where the linkage structure can be initialized.	2001-08-10 23:15:13 +00:00
John Baldwin	688ebe120c	- Close races with signals and other AST's being triggered while we are in the process of exiting the kernel. The ast() function now loops as long as the PS_ASTPENDING or PS_NEEDRESCHED flags are set. It returns with preemption disabled so that any further AST's that arrive via an interrupt will be delayed until the low-level MD code returns to user mode. - Use u_int's to store the tick counts for profiling purposes so that we do not need sched_lock just to read p_sticks. This also closes a problem where the call to addupc_task() could screw up the arithmetic due to non-atomic reads of p_sticks. - Axe need_proftick(), aston(), astoff(), astpending(), need_resched(), clear_resched(), and resched_wanted() in favor of direct bit operations on p_sflag. - Fix up locking with sched_lock some. In addupc_intr(), use sched_lock to ensure pr_addr and pr_ticks are updated atomically with setting PS_OWEUPC. In ast() we clear pr_ticks atomically with clearing PS_OWEUPC. We also do not grab the lock just to test a flag. - Simplify the handling of Giant in ast() slightly. Reviewed by: bde (mostly)	2001-08-10 22:53:32 +00:00
John Baldwin	827dcaf663	Make witness compile w/o DDB. Reported by: wpaul	2001-08-10 22:33:59 +00:00
Ian Dowse	a9a8ba3d71	Arbitrarily limit to 64k the number of bytes that can be read at a time using the ogetdirentries() compatibility syscall. This is a hack to ensure that ridiculous values don't get passed to MALLOC(). Reviewed by: kris	2001-08-10 22:14:18 +00:00
John Baldwin	8791b43513	Work around a race between msleep() and endtsleep() where it was possible for endtsleep() to be executing when msleep() resumed, for endtsleep() to spin on sched_lock long enough for the other process to loop on msleep() and sleep again resulting in endtsleep() waking up the "wrong" msleep. Obtained from: BSD/OS	2001-08-10 21:08:56 +00:00
John Baldwin	a45982d2ea	Change callout_stop() to return an integer. If callout_stop() succeeds in removing the callout entry, return 1. If callout_stop() fails to remove the callout entry because it is currently executing or has already been executed, then the function returns 0. The idea was obtained from BSD/OS, however, BSD/OS changed untimeout(), and I've just changed callout_stop() to be more conservative. Obtained from: BSD/OS	2001-08-10 21:06:59 +00:00
John Baldwin	4d33620270	Style nit: convert a couple of if (p_wchan) tests to if (p_wchan != NULL).	2001-08-10 20:56:25 +00:00
John Baldwin	c4a448100c	- Remove asleep(), await(), and M_ASLEEP. - Callers of asleep() and await() have been converted to calling tsleep(). The only caller outside of M_ASLEEP was the ata driver, which called both asleep() and await() with spl-raised, so there was no need for the asleep() and await() pair. M_ASLEEP was unused. Reviewed by: jasone, peter	2001-08-10 06:45:43 +00:00
John Baldwin	8ec48c6dbf	- Remove asleep(), await(), and M_ASLEEP. - Callers of asleep() and await() have been converted to calling tsleep(). The only caller outside of M_ASLEEP was the ata driver, which called both asleep() and await() with spl-raised, so there was no need for the asleep() and await() pair. M_ASLEEP was unused. Reviewed by: jasone, peter	2001-08-10 06:37:05 +00:00
John Baldwin	ab32297d8d	Axe spl's obsoleted by the callout mutex.	2001-08-10 01:36:25 +00:00
Peter Wemm	99ab2d5dca	* empty log message *	2001-08-09 01:21:58 +00:00
Peter Wemm	2aca0c28d3	Zap 'ptrace(PT_READ_U, ...)' and 'ptrace(PT_WRITE_U, ...)' since they are a really nasty interface that should have been killed long ago when 'ptrace(PT_[SG]ETREGS' etc came along. The entity that they operate on (struct user) will not be around much longer since it is part-per-process and part-per-thread in a post-KSE world. gdb does not actually use this except for the obscure 'info udot' command which does a hexdump of as much of the child's 'struct user' as it can get. It carries its own #defines so it doesn't break compiles.	2001-08-08 05:25:15 +00:00
Brian Feldman	bcc92693d4	Previously, the ELF linker would always just store the pointer to a filename passed in via the module loader functions in the GDB "sharedlibrary" support structures. This isn't good, since the pointer would become stale in almost every case (not the pre-loaded case, of course). Change this to a malloc()ed copy of the string and finally fix the reason that gdb -k's "sharedlibrary" command stopped working. Obtained from: LOMAC/FreeBSD (cf. NAI Labs)	2001-08-06 14:21:57 +00:00
Chris Costello	c30d4da338	Remove the fildesc_clone() function and its associated unnecessary code. It didn't implement the proper /dev/fd functionality (which would be to include in the directory listing /dev/fd/n if the process has fd n open) anyway. Anything needing access to /dev/fd/n where n > 2 can use the optional fdescfs module, which implements this properly and does not cause any trouble with devfs. Discussed with: phk	2001-08-06 05:56:33 +00:00
Thomas Moestl	12543b2e98	Export the tk_nin and tk_nout variables (number of tty input/output characters) as sysctls (kern.tty_nin and kern.tty_nout).	2001-08-04 18:09:24 +00:00
Thomas Moestl	938a4e5c0c	Export the head structure for the device statistics STAILQ in sys/devicestat.h, so that the queue can be walked in crashdumps using libkvm.	2001-08-04 18:02:47 +00:00
John Baldwin	c9c1406f76	Add KTR_INTR tracepoints for when clock interrupts are triggered.	2001-08-03 20:54:41 +00:00
Robert Watson	fd6aaf7fe1	Anton kindly pointed out (and fixed) a bug in the Jail handling of the bind() call on IPv4 sockets: Currently, if one tries to bind a socket using INADDR_LOOPBACK inside a jail, it will fail because prison_ip() does not take this possibility into account. On the other hand, when one tries to connect(), for example, to localhost, prison_remote_ip() will silently convert INADDR_LOOPBACK to the jail's IP address. Therefore, it is desirable to make bind() to do this implicit conversion as well. Apart from this, the patch also replaces 0x7f000001 in prison_remote_ip() to a more correct INADDR_LOOPBACK. This is a 4.4-RELEASE "during the freeze, thanks" MFC candidate. Submitted by: Anton Berezin <tobez@FreeBSD.org> Discussed with at some point: phk MFC after: 3 days	2001-08-03 18:21:06 +00:00
Bosko Milekic	ba3e88262e	Rename mb_init() mbuf subsystem initialization routine to mbuf_init(), in order to avoid namespace collision with subr_mchain.c's mb_init(). This wasn't "fatal" as the mbuf initialization routine mb_init() was local to subr_mbuf.c which in turn didn't pull in subr_mchain.c's mb_init() declaration, but it should definitely be changed now before it creates headache.	2001-08-03 05:05:32 +00:00
Jake Burkholder	f74250ca46	Remove some code that appears to have endian problems with INVARIANTS. This is #if BIG_ENDIAN, but is only necessary if malloc types are shorts, not struct malloc_type * like they are now.	2001-08-03 03:31:45 +00:00
John Baldwin	b39bc3e160	Use 'p' instead of the potentially more expensive 'curproc' inside of mi_switch().	2001-08-02 22:15:31 +00:00
Warner Losh	c7021493ba	Make the fmt arguments to make_dev and make_dev_alias const char *. Approved on IRC as long as it didn't cause a large number of warnings by: phk MFC After: 700 hours	2001-08-02 20:35:35 +00:00
Peter Wemm	aa7a4dae6d	Temporarily back out kern_sig.c rev 1.125 and kern_exit.c rev 1.131. This panicked one of my machines one time too many :-( and there is no sign of a solution in the pipeline. The deltas are still easily available in cvs. The problem is that if the parent has been swapped out, the child process cannot grope around in the parent's UPAGES to see the sigact[] array or it will fault. This probably is a showstopper for this implementation anyway.	2001-08-01 20:35:24 +00:00
Bosko Milekic	bb6f838c79	Move CPU_ABSENT() macro to smp.h, where it belongs anyway. It will be defined to 0 in the non-SMP case, which very much makes sense as it permits its usage in per-CPU initialization loops (for an example, check out subr_mbuf.c). Further, on a UP system, make mb_alloc always use the first per-CPU container, regardless of cpuid (i.e. remove reliability on cpuid in the UP case). Requested by: alfred	2001-08-01 00:54:00 +00:00
John Baldwin	36c2e9feb4	Apply the cluebat to myself and undo the await() -> mawait() rename. The asleep() and await() functions split the functionality of msleep() up into two halves. Only the asleep() half (which is what puts the process on the sleep queue) actually needs the lock usually passed to msleep() held to prevent lost wakeups. await() does not need the lock held, so the lock can be released prior to calling await() and does not need to be passed in to the await() function. Typical usage of these functions would be as follows: mtx_lock(&foo_mtx); ... do stuff ... asleep(&foo_cond, PRIxx, "foowt", hz); ... mtx_unlock(&foo_mtx); ... await(-1, -1); Inspired by: dillon on the couch at Usenix	2001-07-31 22:06:56 +00:00
John Baldwin	e9121d0663	Add a safety belt to mawait() for the (cold || panicstr) case identical to the one in msleep() such that we return immediately rather than blocking. Submitted by: peter Prodded by: sheldonh	2001-07-31 20:57:57 +00:00
John Baldwin	5cb0fbe47e	If we have already panic'd then don't bother enforcing mutex asserts as things are pretty much shot already and all panic'ing does is hurt our chances of getting a dump. Inspired by: sheldonh	2001-07-31 17:45:50 +00:00

4399 Commits