freebsd-dev

Author	SHA1	Message	Date
Jeff Roberson	a57decdf32	- In sysctl_kern_file skip fdps with negative lastfiles. This can happen if there are no files open. Accounting for these can eventually return a negative value for olenp causing sysctl to crash with a bad malloc. Reported by: Pawel Worach <pawel.worach@gmail.com>	2008-01-03 01:26:59 +00:00
Jeff Roberson	397c19d175	Remove explicit locking of struct file. - Introduce a finit() which is used to initialize the fields of struct file in such a way that the ops vector is only valid after the data, type, and flags are valid. - Protect f_flag and f_count with atomic operations. - Remove the global list of all files and associated accounting. - Rewrite the unp garbage collection such that it no longer requires the global list of all files and instead uses a list of all unp sockets. - Mark sockets in the accept queue so we don't incorrectly gc them. Tested by: kris, pho	2007-12-30 01:42:15 +00:00
Robert Watson	cc43c38c87	Add two new sysctls in support of the forthcoming procstat(1) to support its -f and -v arguments: kern.proc.filedesc - dump file descriptor information for a process, if debugging is permitted, including socket addresses, open flags, file offsets, file paths, etc. kern.proc.vmmap - dump virtual memory mapping information for a process, if debugging is permitted, including layout and information on underlying objects, such as the type of object and path. These provide a superset of the information historically available through the now-deprecated procfs(4), and are intended to be exported in an ABI-robust form.	2007-12-02 10:10:27 +00:00
Robert Watson	0bf686c125	Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminating quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)	2007-08-06 14:26:03 +00:00
Jeff Roberson	f6c1ecca50	- Use explicit locking in the various fcntl case statements so that we can acquire shared filedescriptor locks in the appropriate cases. - Remove Giant from calls that issue ioctls. The ioctl path has been mpsafe for some time now. - Only acquire giant for VOP_ADVLOCK when the filesystem requires giant. advlock is now mpsafe. Reviewed by: rwatson Approved by: re	2007-07-03 21:26:06 +00:00
Robert Watson	7251b7863c	Rather than passing SUSER_RUID into priv_check_cred() to specify when a privilege is checked against the real uid rather than the effective uid, instead decide which uid to use in priv_check_cred() based on the privilege passed in. We use the real uid for PRIV_MAXFILES, PRIV_MAXPROC, and PRIV_PROC_LIMIT. Remove the definition of SUSER_RUID; there are now no flags defined for priv_check_cred(). Obtained from: TrustedBSD Project	2007-06-16 23:41:43 +00:00
Konstantin Belousov	9e223287c0	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from a file descriptor index to a pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
Konstantin Belousov	5c76452f8f	Mark the filedescriptor table entries with VOP_OPEN being performed for them as UF_OPENING. Disable closing of those entries. This should fix the crashes caused by devfs_open() (and fifo_open()) dereferencing struct file * by index, while the filedescriptor is closed by a parallel thread. Idea by: tegge Reviewed by: tegge (previous version of patch) Tested by: Peter Holm Approved by: re (kensmith) MFC after: 3 weeks	2007-05-04 14:23:29 +00:00
John Baldwin	06e043fb20	Avoid a lot of code duplication by using kern_open() to open /dev/null in fdcheckstd() instead of a stripped down version of kern_open()'s code. MFC after: 1 week Reviewed by: cperciva	2007-04-26 18:01:19 +00:00
Robert Watson	5e3f7694b1	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are important for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
John Baldwin	3076ca6720	Just use 'fdrop()' instead of 'FILE_LOCK(); fdrop_locked()' in dupfdopen(). While I'm at it, move the second fdrop() out from under the filedesc lock.	2007-03-15 21:19:21 +00:00
Robert Watson	873fbcd776	Further system call comment cleanup: - Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde) - Remove extra blank lines in some cases. - Add extra blank lines in some cases. - Remove no-op comments consisting solely of the function name, the word "syscall", or the system call name. - Add punctuation. - Re-wrap some comments.	2007-03-05 13:10:58 +00:00
Robert Watson	0c14ff0eb5	Remove 'MPSAFE' annotations from the comments above most system calls: all system calls now enter without Giant held, and then in some cases, acquire Giant explicitly. Remove a number of other MPSAFE annotations in the credential code and tweak one or two other adjacent comments.	2007-03-04 22:36:48 +00:00
Robert Watson	780a98ad1f	Catch up file descriptor printing function in DDB to the addition of kqueues and POSIX message queues.	2007-02-15 10:55:43 +00:00
Robert Watson	442f65e958	Break file descriptor printing logic out of db_show_files() into db_print_file(), and add a new "show file <ptr>" DDB command, which can be used to print out file descriptors referenced in stack traces.	2007-02-15 10:50:48 +00:00
Xin LI	4f506694bb	Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form.	2007-01-17 14:58:53 +00:00
John Baldwin	9ae328fc8f	- Close a race between enumerating UNIX domain socket pcb structures via sysctl and socket teardown by adding a reference count to the UNIX domain pcb object and fixing the sysctl that enumerates unpcbs to grab a reference on each unpcb while it builds the list to copy out to userland. - Close a race between UNIX domain pcb garbage collection (unp_gc()) and file descriptor teardown (fdrop()) by adding a new garbage collection flag FWAIT. unp_gc() sets FWAIT while it walks the message buffers in a UNIX domain socket looking for nested file descriptor references and clears the flag when it is finished. fdrop() checks to see if the flag is set on a file descriptor whose refcount just dropped to 0 and waits for unp_gc() to clear the flag before completely destroying the file descriptor. MFC after: 1 week Reviewed by: rwatson Submitted by: ups Hopefully makes the panics go away: mx1	2007-01-05 19:59:46 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
John-Mark Gurney	aeab19b21f	Return EBADF instead of successfully attaching (and then panicking) when an fd is dying. Convinced by: jhb PR: 103127	2006-09-24 02:29:53 +00:00
John Baldwin	b04aff773e	Add a comment to explain what fdclose() does and what its purpose is since the subtlety eluded me when I looked at it last week.	2006-07-21 20:24:00 +00:00
John Baldwin	c1cccebe8b	Add a kern_close() so that the ABIs can close a file descriptor w/o having to populate a close_args struct and change some of the places that do.	2006-07-08 20:03:39 +00:00
Pawel Jakub Dawidek	0bd645ae0c	Compress direct cr_ruid comparison and jailed() call to suser_cred(9). Reviewed by: rwatson	2006-06-27 11:32:08 +00:00
Robert Watson	197b35d717	Mark fgetsock() and fputsock() as deprecated: callers should rely on the file descriptor reference, rather than paying additional lock operations to acquire a socket reference from the file descriptor. This will also help to ensure that file descriptor based socket requests are not delivered to a socket after close. Most consumers have already been converted to this model. MFC after: 3 months	2006-04-01 11:09:54 +00:00
Christian S.J. Peron	2ed4894a26	Restore fd optimization with a few minor tweaks, to quote tegge: "fdinit() fails to initialize newfdp->fd_fd.fd_lastfile to -1. This breaks fdcopy() which will incorrectly set newfdp->fd_freefile to 1 if no files are open and the last file descriptor marked as unused for fdp was 0. This later causes descriptor 0 to be unavailable in newfdp when the optimization is enabled. When the last file descriptor previously marked as used is nonzero and marked as unused, fdunused() incorrectly sets fdp->fd_lastfile to fd - 1 due to fd_last_used() returning (size - 1). This hides the problem that breaks the optimization." This allows us to keep the optimization, while un-breaking it. This is a RELENG_6 candidate. PR: kern/87208 MFC after: 1 week Submitted by: tegge	2006-03-20 00:13:47 +00:00
Christian S.J. Peron	30bacc08e0	Back out fd optimization introduced in revision 1.280 as it appears to be really breaking things. Simple "close(0); dup(fd)" does not return descriptor "0" in some cases. Further, this change also breaks some MAC interactions with mac_execve_will_transition(). Under certain circumstances, fdcheckstd() can be called in execve(2) causing an assertion that checks to make sure that stdin, stdout and stderr reside at indexes 0, 1 and 2 in the process fd table to fail, resulting in a kernel panic when INVARIANTS is on. This should also kill the "dup(2) regression on 6.x" show stopper item on the 6.1-RELEASE TODO list. This is a RELENG_6 candidate. PR: kern/87208 Silence from: des MFC after: 1 week	2006-03-18 23:27:21 +00:00
Wayne Salamon	a750d0b2a2	Add auditing of arguments to the close() and fstat() system calls. Much more argument auditing yet to come, for remaining system calls in this file. Obtained from: TrustedBSD Project Approved by: rwatson (mentor)	2006-02-05 23:57:32 +00:00
John Baldwin	38f63f7e47	Return EBADF rather than EINVAL for FWRITE failure as per POSIX. MFC after: 1 week	2006-01-06 16:30:30 +00:00
David Xu	b2f92ef96b	Last step to make mq_notify conform to the POSIX standard: if the process has successfully attached a notification request to the message queue via a queue descriptor, file closing should remove the attachment.	2005-11-30 05:12:03 +00:00
Robert Watson	742be7821c	Add the f_msgcount field to the set of struct file fields printed in show files. MFC after: 1 week	2005-11-10 13:26:29 +00:00
Robert Watson	2be165c93e	Expand set of details printed for each file descriptor to include its garbage collection flags. Reformat generally to make this fit and leave some room for future expansion. MFC after: 1 week	2005-11-10 11:35:59 +00:00
Robert Watson	b4e507aafa	Add a DDB "show files" command to list the current open file list, some state about each open file, and identify the first process in the process table that references the file. This is helpful in debugging leaks of file descriptors. MFC after: 1 week	2005-11-10 10:42:50 +00:00
Robert Watson	f8a9ed1fa7	Fix typo in recent comment tweak. Submitted by: jkim MFC after: 1 week	2005-11-09 22:02:02 +00:00
Robert Watson	923633b4b5	In closef(), remove the assumption that there is a thread associated with the file descriptor. When a file descriptor is closed as a result of garbage collecting a UNIX domain socket, the file descriptor will not have any associated thread, so the logic to identify advisory locks held by that thread is not appropriate. Check the thread for NULL to avoid this scenario. Expand an existing comment to say a bit more about this. MFC after: 1 week	2005-11-09 20:54:25 +00:00
John Baldwin	68a17869c1	Push down Giant into fdfree() and remove it from two of the callers. Other callers such as some rfork() cases weren't locking Giant anyway. Reviewed by: csjp MFC after: 1 week	2005-11-01 17:13:05 +00:00
Robert Watson	5bb84bc84b	Normalize a significant number of kernel malloc type names: - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat. - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters. - Disambiguate some collisions by adding subsystem prefixes to some memory types. - Generally prefer lower case to upper case. - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases. Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.	2005-10-31 15:41:29 +00:00
Roman Kurakin	826cf005ed	Use FILEDESC_UNLOCK(fdp) after FILE_UNLOCK(p), not before to avoid LOR. Slightly discussed on current@. LOR #055 MFC after: 14 days	2005-10-04 16:27:54 +00:00
Dag-Erling Smørgrav	d09dfa2bfd	Two minor optimizations of fdalloc(): - if minfd < fd_freefile (as is most often the case, since minfd is usually 0), set it to fd_freefile. - remove a call to fd_first_free() which duplicates work already done by fdused(). This change results in a small but measurable speedup for processes with large numbers (several thousands) of open files. PR: kern/85176 Submitted by: Divacky Roman <xdivac02@stud.fit.vutbr.cz> MFC after: 3 weeks	2005-08-26 11:16:39 +00:00
Dima Dorfman	1ee6b74603	Fix fdcheckstd to pass the file descriptor along through vn_open. When opening a device, devfs_open needs the file descriptor to install its own fileops. Failing to pass the file descriptor causes the vnode to be returned with the regular vnops, which will cause a panic on the first read or write because devfs_specops is not meant to support those operations. This bug caused a panic after exec'ing any set[ug]id program with fds 0..2 closed (i.e., if any action had to be taken by fdcheckstd, we would panic if the exec'd program ever tried to use any of those descriptors). Reviewed by: phk Approved by: re (scottl)	2005-06-25 03:34:49 +00:00
Jeff Roberson	6de925e58b	- Use NAMEI to pick up Giant if we need it in fdcheckstd().	2005-05-03 10:52:22 +00:00
Giorgos Keramidas	0a11e99990	Remove redundant initialization that is repeated in the for() loop right below it. Approved by: jhb	2005-03-08 16:57:20 +00:00
Giorgos Keramidas	46da8bf8fb	Typo & grammar fixes in comments.	2005-03-08 00:58:50 +00:00
Poul-Henning Kamp	44dc16a986	Make some file/filedesc related functions static	2005-02-10 12:27:58 +00:00
John Baldwin	76951d21d1	- Tweak kern_msgctl() to return a copy of the requested message queue id structure in the struct pointed to by the 3rd argument for IPC_STAT and get rid of the 4th argument. The old way returned a pointer into the kernel array that the calling function would then access afterwards without holding the appropriate locks and doing non-lock-safe things like copyout() with the data anyway. This change removes that unsafeness and resulting race conditions as well as simplifying the interface. - Implement kern_foo wrappers for stat(), lstat(), fstat(), statfs(), fstatfs(), and fhstatfs(). Use these wrappers to cut out a lot of code duplication for freebsd4 and netbsd compatibility system calls. - Add a new lookup function kern_alternate_path() that looks up a filename under an alternate prefix and determines which filename should be used. This is basically a more general version of linux_emul_convpath() that can be shared by all the ABIs thus allowing for further reduction of code duplication.	2005-02-07 18:44:55 +00:00
Poul-Henning Kamp	8516dd18e1	Don't use VOP_GETVOBJECT, use vp->v_object directly.	2005-01-25 00:40:01 +00:00
Jeff Roberson	66ca1b4878	- Use VFS_LOCK_GIANT() in place of mtx_lock(&giant), etc. Sponsored By: Isilon Systems, Inc.	2005-01-24 10:19:31 +00:00
Warner Losh	9454b2d864	/* -> /*- for copyright notices, minor format tweaks as necessary	2005-01-06 23:35:40 +00:00
Poul-Henning Kamp	662d80dc23	Fix a deadlock I introduced this morning. Mostly from: tegge	2004-12-14 20:48:40 +00:00
Poul-Henning Kamp	d986dbb448	Add a new kind of reference count (fd_holdcnt) to struct filedesc which holds on to just the data structure and the mutex. (The existing refcount (fd_refcnt) holds onto the open files in the descriptor.) The fd_holdcnt is protected by fdesc_mtx, fd_refcnt by FILEDESC_LOCK. Add fdhold(struct proc *) which gets a hold on the filedescriptors of the specified proc. Add fddrop(struct filedesc *) which drops the fd_holdcnt and if zero destroys the mutex and frees the memory. Initialize the fd_holdcnt to one in fdinit(). Normal operations on the filedesc structure will not change it. In fdfree() use fddrop() to dispose of the mutex and structure. Hold the FILEDESC_LOCK() until we have cleaned out the contents and carefully set the fields to null values during cleanup. Use fdhold()/fddrop() in mountcheckdirs() and sysctl_kern_file().	2004-12-14 09:09:51 +00:00
Poul-Henning Kamp	30abaa53df	Make fdesc_mtx private to kern_descrip.c now that the flock has come home.	2004-12-14 08:44:51 +00:00
Poul-Henning Kamp	12b18fdab4	Move the checkdirs() function from vfs_mount.c to kern_descrip.c and call it mountcheckdirs().	2004-12-14 08:23:18 +00:00
Poul-Henning Kamp	c113083c5a	Add new function fdunshare() which encapsulates the necessary light magic for ensuring that a process' filedesc is not shared with anybody. Use it in the two places which previously had private implementations. This collects all fd_refcnt handling in kern_descrip.c	2004-12-14 07:20:03 +00:00
Poul-Henning Kamp	9722743b9a	Sort and wash #includes.	2004-12-03 21:29:25 +00:00
Poul-Henning Kamp	355be4eeda	Drop ffree() as a separate function and incorporate the only place used.	2004-12-02 12:17:27 +00:00
Poul-Henning Kamp	20ddb405f8	Style polishing. Use grepable functions Other minor nitpickings.	2004-12-02 11:56:13 +00:00
Poul-Henning Kamp	d672e07541	We already have a lock initialization function, use that for fdesc_mtx also. Polish badfo stuff.	2004-12-01 09:42:35 +00:00
Poul-Henning Kamp	010b1e3fdc	Collect the stuff for the /dev/fd/{%d,std{in,out,err}} pseudo-device driver at the bottom of the file.	2004-12-01 09:29:31 +00:00
Poul-Henning Kamp	e4643c730a	"nfiles" is a bad name for a global variable. Call it "openfiles" instead as this is more correct and matches the sysctl variable.	2004-12-01 09:22:26 +00:00
Poul-Henning Kamp	cc2f51ef32	Style: move data to top of file.	2004-12-01 08:06:27 +00:00
Robert Watson	1a1238a112	Don't acquire Giant before calling closef() in close() (and elsewhere); instead acquire it conditionally in closef() if it is required for advisory locking. This removes Giant from the close() path of sockets and pipes (and any other objects that don't acquire Giant in their fo_close path, such as kqueues). Giant will still be acquired twice for vnodes -- once for advisory lock teardown, and a second time in the fo_close method. Both Poul-Henning and I believe that the advisory lock teardown code can be moved into the vn_closefile path shortly. This trims a percent or two off the cost of most non-vnode close operations on SMP, but has a fairly minimal impact on UP where the cost of a single mutex operation is pretty low.	2004-11-28 14:37:17 +00:00
Poul-Henning Kamp	f0775d7c7a	Fix LOR. Solution pointed out by: jhb	2004-11-26 06:14:04 +00:00
David Schultz	c17ff94938	Neither of the arguments to closef() can be NULL anymore, so don't check for that.	2004-11-21 11:06:24 +00:00
Poul-Henning Kamp	dc99052535	Move a FILEDESC_UNLOCK up to maintain correct nesting of FILEDESC/FILE locking.	2004-11-16 09:12:03 +00:00
Poul-Henning Kamp	970d8904d6	Make FILE_LOCK and FILEDESC_LOCK nest properly by postponing the release of FILEDESC_LOCK a few more lines.	2004-11-15 16:10:55 +00:00
Poul-Henning Kamp	2e4fed7c56	Move #define up.	2004-11-14 09:21:01 +00:00
Poul-Henning Kamp	124e4c3be8	Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST. Use this in all the places where sleeping with the lock held is not an issue. The distinction will become significant once we finalize the exact lock-type to use for this kind of case.	2004-11-13 11:53:02 +00:00
Poul-Henning Kamp	598b7ec86b	Use more intuitive pointer for fdinit() and fdcopy(). Change fdcopy() to take unlocked filedesc.	2004-11-08 12:43:23 +00:00
Poul-Henning Kamp	ef11fbd7c4	Introduce fdclose() which will clean an entry in a filedesc. Replace homerolled versions with call to fdclose(). Make fdunused() static to kern_descrip.c	2004-11-07 22:16:07 +00:00
Poul-Henning Kamp	2f5a40aa3f	Move fdinit() related stuff from .h to .c	2004-11-07 15:34:45 +00:00
Poul-Henning Kamp	8ec21e3a68	Allow fdinit() to be called with a NULL fdp argument so we can use it when setting up init. Make fdinit() lock the fdp argument as needed.	2004-11-07 12:39:28 +00:00
Poul-Henning Kamp	3b19b5af3a	When we open /dev/null for stdin/out/err for safety reasons, do it right: we should preserve f_data and f_ops if they are already set.	2004-11-06 23:36:09 +00:00
Robert Watson	81158452be	Push acquisition of the accept mutex out of sofree() into the caller (sorele()/sotryfree()): - This permits the caller to acquire the accept mutex before the socket mutex, avoiding sofree() having to drop the socket mutex and re-order, which could lead to races permitting more than one thread to enter sofree() after a socket is ready to be free'd. - This also covers clearing of the so_pcb weak socket reference from the protocol to the socket, preventing races in clearing and evaluation of the reference such that sofree() might be called more than once on the same socket. This appears to close a race I was able to easily trigger by repeatedly opening and resetting TCP connections to a host, in which the tcp_close() code called as a result of the RST raced with the close() of the accepted socket in the user process resulting in simultaneous attempts to de-allocate the same socket. The new locking increases the overhead for operations that may potentially free the socket, so we will want to revise the synchronization strategy here as we normalize the reference counting model for sockets. The use of the accept mutex in freeing of sockets that are not listen sockets is primarily motivated by the potential need to remove the socket from the incomplete connection queue on its parent (listen) socket, so cleaning up the reference model here may allow us to substantially weaken the synchronization requirements. RELENG_5_3 candidate. MFC after: 3 days Reviewed by: dwhite Discussed with: gnn, dwhite, green Reported by: Marc UBM Bocklet <ubm at u-boot-man dot de> Reported by: Vlad <marchenko at gmail dot com>	2004-10-18 22:19:43 +00:00
Julian Elischer	c233d032d2	Another case where we need to guard against a partially constructed process. Submitted by: Stephan Uphoff ( ups at tree.com ) MFC after: 3 days	2004-10-04 06:45:48 +00:00
Robert Watson	16239786ca	Remove GIANT_REQUIRED from setugidsafety() as knote_fdclose() no longer requires Giant.	2004-08-19 14:59:51 +00:00
Brian Feldman	8912c44d9f	Add the missing knote_fdclose().	2004-08-16 03:09:01 +00:00
John-Mark Gurney	ad3b9257c2	Add locking to the kqueue subsystem. This also makes the kqueue subsystem a more complete subsystem, and removes the knowledge of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using its filter ops. Currently, it uses the MTX_DUPOK even though it is not always safe to acquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases). Reviewed by: green, rwatson (both earlier versions)	2004-08-15 06:24:42 +00:00
Robert Watson	b223d06425	We're not yet ready to assert !Giant in kern_fcntl(), as it's called with Giant from ABI wrappers such as Linux emulation. Foot shoot off: phk	2004-08-07 14:09:02 +00:00
Robert Watson	a0a819747c	Avoid acquiring Giant for some common light-weight or already MPSAFE fcntl() operations, including: F_DUPFD (dup() alias), F_GETFD (retrieve close-on-exec flag), F_SETFD (set close-on-exec flag), F_GETFL (retrieve file descriptor flags). For the remaining fcntl() operations, do acquire Giant, especially where we call into fo_ioctl() as a result. We're not yet ready to push Giant into fo_ioctl(). Once we do, this can all become quite a bit prettier.	2004-08-06 22:00:55 +00:00
Robert Watson	0be8ad5fbc	Assert Giant in the following file descriptor-related functions (function: reason): fdfree(): VFS; setugidsafety(): KQueue; fdcheckstd(): VFS; _fgetvp(): VFS; fgetsock(): conditional assertion based on debug.mpsafenet.	2004-08-04 18:35:33 +00:00
Robert Watson	a6719c82b1	Push Giant acquisition down into fo_stat() from most callers. Acquire Giant conditional on debug.mpsafenet in the socket soo_stat() routine, unconditionally in vn_statfile() for VFS, and otherwise don't acquire Giant. Accept an unlocked read in kqueue_stat(), and cryptof_stat() is a no-op. Don't acquire Giant in fstat() system call. Note: in fdescfs, fo_stat() is called while holding Giant due to the VFS stack sitting on top, and therefore there will still be Giant recursion in this case.	2004-07-22 20:40:23 +00:00
Robert Watson	1c1ce9253f	Push acquisition of Giant from fdrop_closed() into fo_close() so that individual file object implementations can optionally acquire Giant if they require it: - soo_close(): depends on debug.mpsafenet - pipe_close(): Giant not acquired - kqueue_close(): Giant required - vn_close(): Giant required - cryptof_close(): Giant required (conservative) Notes: Giant is still acquired in close() even when closing MPSAFE objects due to kqueue requiring Giant in the calling closef() code. Microbenchmarks indicate that this removal of Giant cuts 3%-3% off of pipe create/destroy pairs from user space with SMP compiled into the kernel. The cryptodev and opencrypto code appears MPSAFE, but I'm unable to test it extensively and so have left Giant over fo_close(). It can probably be removed given some testing and review.	2004-07-22 18:35:43 +00:00
Christian S.J. Peron	ed6c545cf0	In addition to the real user ID check, do an explicit jail check to ensure that the caller is not prison root. The intention is to fix file descriptor creation so that prison root can not use the last remaining file descriptors. This privilege should be reserved for non-jailed root users. Approved by: bmilekic (mentor)	2004-07-14 19:04:31 +00:00
Poul-Henning Kamp	a769355f9b	Explicitly initialize f_data and f_vnode to NULL. Report f_vnode to userland in struct xfile.	2004-06-19 11:40:08 +00:00
Poul-Henning Kamp	89c9c53da0	Do the dreaded s/dev_t/struct cdev */ Bump __FreeBSD_version accordingly.	2004-06-16 09:47:26 +00:00
Robert Watson	395a08c904	Extend coverage of SOCK_LOCK(so) to include so_count, the socket reference count: - Assert SOCK_LOCK(so) macros that directly manipulate so_count: soref(), sorele(). - Assert SOCK_LOCK(so) in macros/functions that rely on the state of so_count: sofree(), sotryfree(). - Acquire SOCK_LOCK(so) before calling these functions or macros in various contexts in the stack, both at the socket and protocol layers. - In some cases, perform soisdisconnected() before sotryfree(), as this could result in frobbing of a non-present socket if sotryfree() actually frees the socket. - Note that sofree()/sotryfree() will release the socket lock even if they don't free the socket. Submitted by: sam Sponsored by: FreeBSD Foundation Obtained from: BSD/OS	2004-06-12 20:47:32 +00:00
Poul-Henning Kamp	1930e303cf	Deorbit COMPAT_SUNOS. We inherited this from the sparc32 port of BSD4.4-Lite1. We have neither a sparc32 port nor a SunOS4.x compatibility desire these days.	2004-06-11 11:16:26 +00:00
Robert Watson	63732dce22	Push the VOP_ADVLOCK() call to release advisory locks on vnode file descriptors out of fdrop_locked() and into vn_closefile(). This removes all knowledge of vnodes from fdrop_locked(), since the lock behavior was specific to vnodes. This also removes the specific requirement for Giant in fdrop_locked(), it's now only required by code that it calls into. Add GIANT_REQUIRED to vn_closefile() since VFS requires Giant.	2004-06-01 18:03:20 +00:00
Warner Losh	7f8a436ff2	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999. Approved by: core	2004-04-05 21:03:37 +00:00
Robert Watson	a1288c786e	Conditionally assert Giant in fputsock() based on the value of debug.mpsafenet.	2004-03-29 00:33:02 +00:00
Don Lewis	47934cef8f	Split the mlock() kernel code into two parts, mlock(), which unpacks the syscall arguments and does the suser() permission check, and kern_mlock(), which does the resource limit checking and calls vm_map_wire(). Split munlock() in a similar way. Enable the RLIMIT_MEMLOCK checking code in kern_mlock(). Replace calls to vslock() and vsunlock() in the sysctl code with calls to kern_mlock() and kern_munlock() so that the sysctl code will obey the wired memory limits. Nuke the vslock() and vsunlock() implementations, which are no longer used. Add a member to struct sysctl_req to track the amount of memory that is wired to handle the request. Modify sysctl_wire_old_buffer() to return an error if its call to kern_mlock() fails. Only wire the minimum of the length specified in the sysctl request and the length specified in its argument list. It is recommended that sysctl handlers that use sysctl_wire_old_buffer() should specify reasonable estimates for the amount of data they want to return so that only the minimum amount of memory is wired no matter what length has been specified by the request. Modify the callers of sysctl_wire_old_buffer() to look for the error return. Modify sysctl_old_user to obey the wired buffer length and clean up its implementation. Reviewed by: bms	2004-02-26 00:27:04 +00:00
Poul-Henning Kamp	dc08ffec87	Device megapatch 4/6: Introduce d_version field in struct cdevsw, this must always be initialized to D_VERSION. Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.	2004-02-21 21:10:55 +00:00
Dag-Erling Smørgrav	44f4b94b38	Don't bother storing a result when all you need are the side effects.	2004-02-16 18:38:46 +00:00
David Malone	a82294d01c	In fdcheckstd the descriptor table should never be shared, so just KASSERT this rather than trying to deal with what happens when file descriptors change out from under us.	2004-02-15 21:14:48 +00:00
John Baldwin	91d5354a2c	Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64	2004-02-04 21:52:57 +00:00
Dag-Erling Smørgrav	a6d4491c71	Restore correct semantics for F_DUPFD fcntl. This should fix the errors people have been getting with configure scripts.	2004-01-17 00:59:04 +00:00
Dag-Erling Smørgrav	56a9fc0e93	WITNESS won't let us hold two filedesc locks at the same time, so juggle fdp and newfdp around a bit.	2004-01-16 21:54:56 +00:00
Dag-Erling Smørgrav	ddce426f69	Remove two KASSERTs which were overly paranoid.	2004-01-16 08:45:56 +00:00
Dag-Erling Smørgrav	12d568c2b1	Take care to drop locks when calling malloc()	2004-01-15 18:50:11 +00:00
Dag-Erling Smørgrav	a2fe44e8cf	New file descriptor allocation code, derived from similar code introduced in OpenBSD by Niels Provos. The patch introduces a bitmap of allocated file descriptors which is used to locate available descriptors when a new one is needed. It also moves the task of growing the file descriptor table out of fdalloc(), reducing complexity in both fdalloc() and do_dup(). Debts of gratitude are owed to tjr@ (who provided the original patch on which this work is based), grog@ (for the gdb(4) man page) and rwatson@ (for assistance with pxeboot(8)).	2004-01-15 10:15:04 +00:00
Dag-Erling Smørgrav	c9de31f55f	Mechanical whitespace cleanup.	2004-01-11 19:39:14 +00:00
Alan Cox	0e88a71798	Remove long dead code, specifically, code related to munmapfd(). (See also vm/vm_mmap.c revision 1.173.)	2004-01-11 06:59:21 +00:00
David Malone	70ad6c2190	Plug a leak of open files that happens when you exec a suid program with one of std{in,out,err} open. This helps with the file descriptor leaks reported on -current. This should probably be merged into 5.2. Reviewed by: ru Tested by: Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net>	2003-12-28 19:27:14 +00:00
David Malone	e1419c08e2	falloc allocates a file structure and adds it to the file descriptor table, acquiring the necessary locks as it works. It usually returns two references to the new descriptor: one in the descriptor table and one via a pointer argument. As falloc releases the FILEDESC lock before returning, there is a potential for a process to close the reference in the file descriptor table before falloc's caller gets to use the file. I don't think this can happen in practice at the moment, because Giant indirectly protects closes. To stop the file being completely closed in this situation, this change makes falloc set the refcount to two when both references are returned. This makes life easier for several of falloc's callers, because the first thing they previously did was grab an extra reference on the file. Reviewed by: iedowse Idea run past: jhb	2003-10-19 20:41:07 +00:00
Robert Watson	c142b0fcfe	Remove the global variable 'cmask', which was used to initialize the fd_cmask field in the file descriptor structure for the first process indirectly from CMASK, and when an fd structure is initialized before being filled in, and instead just use CMASK. This appears to be an artifact left over from the initial integration of quotas into BSD. Suggested by: peter	2003-10-02 03:57:59 +00:00
David Malone	d2cce3d6e8	Do some minor Giant pushdown made possible by copyin, fget, fdrop, malloc and mbuf allocation all not requiring Giant. 1) ostat, fstat and nfstat don't need Giant until they call fo_stat. 2) accept can copyin the address length without grabbing Giant. 3) sendit doesn't need Giant, so don't bother grabbing it until kern_sendit. 4) move Giant grabbing from each individual recv* syscall to recvit.	2003-08-04 21:28:57 +00:00
Alan Cox	fbe1bdddcc	Revision 1.51 of vm/uma_core.c modified uma_large_free() to acquire Giant when needed. So, don't do it here.	2003-07-29 05:23:19 +00:00
Robert Watson	2e4a71cdb1	When exporting file descriptor data for threads invoking the kern.file sysctl, don't return information about processes that fail p_cansee(td, p). This prevents sockstat and related programs from seeing file descriptors owned by processes not in the same jail as the thread, as well as having implications for MAC, etc. This is a partial solution: it permits an information leak about the number of descriptors in the sizing calculation (but this is not new information, you can also get it from kern.openfiles), and doesn't attempt to mask file descriptors based on the properties of the descriptor, only the process referencing it. However, it provides most of what you want under most circumstances, without complicating the locking. PR: 54211 Based on a patch submitted by: Pawel Jakub Dawidek <nick@garage.freebsd.pl>	2003-07-28 16:03:53 +00:00
Poul-Henning Kamp	7c89f162bc	Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout.	2003-07-27 17:04:56 +00:00
Alan Cox	18e8d4e79c	revision 1.51 of vm/uma_core.c modified uma_large_malloc() to acquire Giant when needed.	2003-07-25 22:26:43 +00:00
Don Lewis	857d9c60d0	Extend the mutex pool implementation to permit the creation and use of multiple mutex pools with different options and sizes. Mutex pools can be created with either the default sleep mutexes or with spin mutexes. A dynamically created mutex pool can now be destroyed if it is no longer needed. Create two pools by default, one that matches the existing pool that uses the MTX_NOWITNESS option that should be used for building higher level locks, and a new pool with witness checking enabled. Modify the users of the existing mutex pool to use the appropriate pool in the new implementation. Reviewed by: jhb	2003-07-13 01:22:21 +00:00
Poul-Henning Kamp	1226914c17	Use the f_vnode field to tell which file descriptors have a vnode.	2003-07-04 12:20:27 +00:00
Poul-Henning Kamp	3b6d965263	Add a f_vnode field to struct file. Several of the subtypes have an associated vnode which is used for stuff like the f*() functions. By giving the vnode a separate field, a number of checks for the specific subtype can be replaced simply with a check for f_vnode != NULL, and we can later free f_data up to subtype specific use. At this point in time, f_data still points to the vnode, so any code I might have overlooked will still work.	2003-06-22 08:41:43 +00:00
Poul-Henning Kamp	eaaca5deee	Don't (re)initialize f_gcflag to zero. Move initialization of DTYPE_VNODE specific field f_seqcount into the DTYPE_VNODE specific code.	2003-06-20 08:02:30 +00:00
Alfred Perlstein	bab88630ba	Unlock the struct file lock before acquiring Giant, otherwise we can deadlock because of lock order reversals. This was not caught because Witness ignores pool mutexes right now. Diagnosis and help: truckman Noticed by: pho	2003-06-19 18:13:07 +00:00
Mike Silbersack	4d7dfc31b8	Add a rate limited message reporting when kern.maxfiles is exceeded, reporting who did it. Also, fix a style bug introduced in the previous change. MFC after: 1 week	2003-06-19 04:07:12 +00:00
Mike Silbersack	438f085b2f	Reserve the last 5% of file descriptors for root use. This should allow systems to fail more gracefully when a file descriptor exhaustion situation occurs. Original patch by: David G. Andersen <dga@lcs.mit.edu> PR: 45353 MFC after: 1 week	2003-06-18 18:57:58 +00:00
Poul-Henning Kamp	7c2d2efd58	Initialize struct fileops with C99 sparse initialization.	2003-06-18 18:16:40 +00:00
David E. O'Brien	677b542ea2	Use __FBSDID().	2003-06-11 00:56:59 +00:00
Tor Egge	ad05d58087	Add tracking of process leaders sharing a file descriptor table and allow a file descriptor table to be shared between multiple process leaders. PR: 50923	2003-06-02 16:05:32 +00:00
Poul-Henning Kamp	90471005e1	Remove needless return Found by: FlexeLint	2003-05-31 20:16:44 +00:00
Robert Watson	c1dca9ab07	VOP_PATHCONF() requires a vnode lock; this patch adds locking to fpathconf(). The lock is held for direct calls to VOP_PATHCONF() in pathconf() already. Approved by: re (jhb) Pointed out by: DEBUG_VFS_LOCKS	2003-05-15 21:13:08 +00:00
Mark Murray	51da11a27a	Fix some easy, global, lint warnings. In most cases, this means making some local variables static. In a couple of cases, this means removing an unused variable.	2003-04-30 12:57:40 +00:00
Alexander Kabaev	104a9b7e3e	Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h> Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>	2003-04-29 13:36:06 +00:00
Poul-Henning Kamp	7ac40f5f59	Gigacommit to improve device-driver source compatibility between branches: Initialize struct cdevsw using C99 sparse initialization and remove all initializations to default values. This patch is automatically generated and has been tested by compiling LINT with all the fields in struct cdevsw in reverse order on alpha, sparc64 and i386. Approved by: re(scottl)	2003-03-03 12:15:54 +00:00
Tor Egge	c6faf3bf1d	Remove unneeded code added in revision 1.188.	2003-03-01 17:18:28 +00:00
Scott Long	3303c14b57	Don't NULL out p_fd until after closefd() has been called. This isn't totally correct, but it has caused breakage for too long. I welcome someone with more fd fu to fix it correctly.	2003-02-24 05:46:55 +00:00
Mike Makonnen	750a91d8b1	Remove a comment which hasn't been true since rev. 1.158 Approved by: jhb, markm (mentor)(implicit)	2003-02-22 05:59:48 +00:00
Warner Losh	a163d034fa	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
Tor Egge	218a01e062	Avoid file lock leakage when linuxthreads port or rfork is used: - Mark the process leader as having an advisory lock - Check if process leader is marked as having advisory lock when closing file - Check that file is still open after lock has been obtained - Don't allow file descriptor table sharing between processes with different leaders PR: 10265 Reviewed by: alfred	2003-02-15 22:43:05 +00:00
Alfred Perlstein	e7d6662f1b	Do not allow kqueues to be passed via unix domain sockets.	2003-02-15 06:04:55 +00:00
Alfred Perlstein	edf6699ae6	Fix LOR with PROC/filedesc. Introduce fdesc_mtx that will be used as a barrier between free'ing filedesc structures. Basically if you want to access another process's filedesc, you want to hold this mutex over the entire operation.	2003-02-15 05:52:56 +00:00
Alfred Perlstein	42e1b74af2	Don't lock FILEDESC under PROC. The locking here needs to be revisited, but this ought to get rid of the LOR messages that people are complaining about for now. I imagine either I or someone else interested with smp will eventually clear this up.	2003-02-11 07:20:52 +00:00
Poul-Henning Kamp	4af0d0c21f	NODEVFS cleanup: remove #ifdefs	2003-01-30 12:35:40 +00:00
Jeffrey Hsu	a448a15bc1	Add missing SMP file locks around read-modify-write operations on the flag field. Reviewed by: rwatson	2003-01-21 20:20:48 +00:00
Alfred Perlstein	44956c9863	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
Poul-Henning Kamp	7e760e148a	Originally when DEVFS was added, a global variable "devfs_present" was used to control code which was conditional on DEVFS' presence since this avoided the need for large-scale source pollution with #include "opt_geom.h" Now that we approach making DEVFS standard, replace these tests with an #ifdef to facilitate mechanical removal once DEVFS becomes non-optional. No functional change by this commit.	2003-01-19 11:03:07 +00:00
Matthew Dillon	48e3128b34	Bow to the whining masses and change a union back into void *. Retain removal of unnecessary casts and throw in some minor cleanups to see if anyone complains, just for the hell of it.	2003-01-13 00:33:17 +00:00
Matthew Dillon	cd72f2180b	Change struct file f_data to un_data, a union of the correct struct pointer types, and remove a huge number of casts from code using it. Change struct xfile xf_data to xun_data (ABI is still compatible). If we need to add a #define for f_data and xf_data we can, but I don't think it will be necessary. There are no operational changes in this commit.	2003-01-12 01:37:13 +00:00
Jacques Vidrine	f0c093284d	Correct file descriptor leaks in lseek and do_dup. The leak in lseek was introduced in vfs_syscalls.c revision 1.218. The leak in do_dup was introduced in kern_descrip.c revision 1.158. Submitted by: iedowse	2003-01-06 13:19:05 +00:00
Alfred Perlstein	c522c1bf4b	fdcopy() only needs a filedesc pointer.	2003-01-01 01:19:31 +00:00
Alfred Perlstein	03282e6e3d	purge 'register'.	2003-01-01 01:05:54 +00:00
Alfred Perlstein	c7f1c11b20	Since fdshare() and fdinit() only operate on filedescs, make them take pointers to filedesc structures instead of threads. This makes it more clear that they do not do any voodoo with the thread/proc or anything other than the filedesc passed in or returned. Remove some XXX KSE's as this resolves the issue.	2003-01-01 01:01:14 +00:00
Alfred Perlstein	59c97598d3	fdinit() does not need to lock the filedesc it is creating as no one besides itself has access until the function returns.	2003-01-01 00:35:46 +00:00
Robert Watson	f0bc12ee8d	Improve consistency between devfs and MAKEDEV: use UID_ROOT and GID_WHEEL instead of UID_BIN and GID_BIN for /dev/fd/* entries. Submitted by: kris	2002-12-27 16:54:44 +00:00
Poul-Henning Kamp	a7010ee2f4	White-space changes.	2002-12-24 09:44:51 +00:00
Poul-Henning Kamp	f3a682116c	Detediousficate declaration of fileops array members by introducing typedefs for them.	2002-12-23 21:53:20 +00:00
Tim J. Robbins	9d0fffd3ca	Drop filedesc lock and acquire Giant around calls to malloc() and free(). These call uma_large_malloc() and uma_large_free() which require Giant. Fixes panic when descriptor table is larger than KMEM_ZMAX bytes noticed by kkenn. Reviewed by: jhb	2002-12-13 09:59:40 +00:00
John Baldwin	04f4a16448	If the file descriptors passed into do_dup() are negative, return EBADF instead of panicking. Also, perform some of the simpler sanity checks on the fds before acquiring the filedesc lock. Approved by: re Reported by: Dan Nelson <dan@emsphone.com> and others	2002-11-26 17:22:15 +00:00
Garrett Wollman	c7047e5204	Change the way support for asynchronous I/O is indicated to applications to conform to 1003.1-2001. Make it possible for applications to actually tell whether or not asynchronous I/O is supported. Since FreeBSD's aio implementation works on all descriptor types, don't call down into file or vnode ops when [f]pathconf() is asked about _PC_ASYNC_IO; this avoids the need for every file and vnode op to know about it.	2002-10-27 18:07:41 +00:00
John Baldwin	4562d72638	Don't lock the proc lock to clear p_fd. p_fd isn't protected by the proc lock.	2002-10-18 17:42:28 +00:00
John Baldwin	bf3e55aa2c	Many style and whitespace fixes. Submitted by: bde (mostly)	2002-10-16 15:45:37 +00:00
John Baldwin	18d9bd8f65	Sort includes a bit. Submitted by: bde	2002-10-16 15:14:31 +00:00
John Baldwin	7fd1f2b8bc	Argh. Put back setting of P_ADVLOCK for the F_WRLCK case that was accidentally lost in the previous revision. Submitted by: bde Pointy hat to: jhb	2002-10-15 18:10:13 +00:00
John Baldwin	60a6965a88	Remove the leaderp variable and just access p_leader directly. The p_leader field is not protected by the proc lock but is only set during fork1() by the parent process and never changes.	2002-10-15 00:03:40 +00:00
Don Lewis	91e97a8266	In an SMP environment post-Giant it is no longer safe to blindly dereference the struct sigio pointer without any locking. Change fgetown() to take a reference to the pointer instead of a copy of the pointer and call SIGIO_LOCK() before copying the pointer and dereferencing it. Reviewed by: rwatson	2002-10-03 02:13:00 +00:00
Thomas Moestl	dde1c2c0d6	fcntl(..., F_SETLKW, ...) takes a pointer to a struct flock just like F_SETLK does, so it also needs this structure copied in in fcntl() before calling kern_fcntl().	2002-09-16 01:05:15 +00:00
Nate Lawson	06be2aaa83	Remove all use of vnode->v_tag, replacing with appropriate substitutes. v_tag is now const char * and should only be used for debugging. Additionally: 1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK 2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP. Suggested by: phk Reviewed by: bde, rwatson (earlier version)	2002-09-14 09:02:28 +00:00
Thomas Moestl	4e115a85ab	Fix fcntl(..., F_GETOWN, ...) and fcntl(..., F_SETOWN, ...) on sparc64 by not passing a pointer to a register_t or intptr_t when the code in the lower layers expects one to an int.	2002-09-13 15:15:16 +00:00
John Baldwin	5fc3031366	- Change falloc() to acquire an fd from the process table last so that it can do it w/o needing to hold the filelist_lock sx lock. - fdalloc() doesn't need Giant to call free() anymore. It also doesn't need to drop and reacquire the filedesc lock around free() now as a result. - Try to make the code that copies fd tables when extending the fd table in fdalloc() a bit more readable by performing assignments in separate statements. This is still a bit ugly though. - Use max() instead of an if statement to figure out the starting point in the search-for-a-free-fd loop in fdalloc() so it reads better next to the min() in the previous line. - Don't grow nfiles in steps up to the size needed if we dup2() to some really large number. Go ahead and double 'nfiles' in a loop prior to doing the malloc(). - malloc() doesn't need Giant now. - Use malloc() and free() instead of MALLOC() and FREE() in fdalloc(). - Check to see if the size we are going to grow to is too big, not if the current size of the fd table is too big in the loop in fdalloc(). This means if we are out of space or if dup2() requests too high of a fd, then we will return an error before we go off and try to allocate some huge table and copy the existing table into it. - Move all of the logic for dup'ing a file descriptor into do_dup() instead of putting some of it in do_dup() and duplicating other parts in four different places. This makes dup(), dup2(), and fcntl(F_DUPFD) basically wrappers of do_dup now. fcntl() still has an extra check since it uses a different error return value in one case than the other functions. - Add a KASSERT() for an assertion that may not always be true where the fdcheckstd() function assumes that falloc() returns the fd requested and not some other fd.
I think that the assertion is always true because we are always single-threaded when we get to this point, but if one was using rfork() and another process sharing the fd table were playing with the fd table, there could be a problem. - To handle the problem of a file descriptor we are dup()'ing being closed out from under us in dup() in general, do_dup() now obtains a reference on the file in question before calling fdalloc(). If after the call to fdalloc() the file for the fd we are dup'ing is a different file, then we drop our reference on the original file and return EBADF. This race was only handled in the dup2() case before and would just retry the operation. The error return allows the user to know they are being stupid since they have a locking bug in their app instead of dup'ing some other descriptor and returning it to them. Tested on: i386, alpha, sparc64	2002-09-03 20:16:31 +00:00
Ian Dowse	49c2ff159f	Split fcntl() into a wrapper and a kernel-callable kern_fcntl() implementation. The wrapper is responsible for copying additional structure arguments (struct flock) to and from userland.	2002-09-02 22:24:14 +00:00
Philippe Charnier	93b0017f88	Replace various spellings with FALLTHROUGH, which is lint()able	2002-08-25 13:23:09 +00:00
Robert Watson	d49fa1ca6e	In continuation of early fileop credential changes, modify fo_ioctl() to accept an 'active_cred' argument reflecting the credential of the thread initiating the ioctl operation. - Change fo_ioctl() to accept active_cred; change consumers of the fo_ioctl() interface to generally pass active_cred from td->td_ucred. - In fifofs, initialize filetmp.f_cred to ap->a_cred so that the invocations of soo_ioctl() are provided access to the calling f_cred. Pass ap->a_td->td_ucred as the active_cred, but note that this is required because we don't yet distinguish file_cred and active_cred in invoking VOP's. - Update kqueue_ioctl() for its new argument. - Update pipe_ioctl() for its new argument, pass active_cred rather than td_ucred to MAC for authorization. - Update soo_ioctl() for its new argument. - Update vn_ioctl() for its new argument, use active_cred rather than td->td_ucred to authorize VOP_IOCTL() and the associated VOP_GETATTR(). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-17 02:36:16 +00:00
Robert Watson	ea6027a8e1	Make similar changes to fo_stat() and fo_poll() as made earlier to fo_read() and fo_write(): explicitly use the cred argument to fo_poll() as "active_cred" using the passed file descriptor's f_cred reference to provide access to the file credential. Add an active_cred argument to fo_stat() so that implementers have access to the active credential as well as the file credential. Generally modify callers of fo_stat() to pass in td->td_ucred rather than fp->f_cred, which was redundantly provided via the fp argument. This set of modifications also permits threads to perform these operations on behalf of another thread without modifying their credential. Trickle this change down into fo_stat/poll() implementations: - badfo_poll(), badfo_stat(): modify/add arguments. - kqueue_poll(), kqueue_stat(): modify arguments. - pipe_poll(), pipe_stat(): modify/add arguments, pass active_cred to MAC checks rather than td->td_ucred. - soo_poll(), soo_stat(): modify/add arguments, pass fp->f_cred rather than cred to pru_sopoll() to maintain current semantics. - sopoll(): modify arguments. - vn_poll(), vn_statfile(): modify/add arguments, pass new arguments to vn_stat(). Pass active_cred to MAC and fp->f_cred to VOP_POLL() to maintain current semantics. - vn_close(): rename cred to file_cred to reflect reality while I'm here. - vn_stat(): Add active_cred and file_cred arguments to vn_stat() and consumers so that this distinction is maintained at the VFS as well as 'struct file' layer. Pass active_cred instead of td->td_ucred to MAC and to VOP_GETATTR() to maintain current semantics. - fifofs: modify the creation of a "filetemp" so that the file credential is properly initialized and can be used in the socket code if desired. Pass ap->a_td->td_ucred as the active credential to soo_poll(). If we teach the vnop interface about the distinction between file and active credentials, we would use the active credential here.
Note that current inconsistent passing of active_cred vs. file_cred to VOP's is maintained. It's not clear why GETATTR would be authorized using active_cred while POLL would be authorized using file_cred at the file system level. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-16 12:52:03 +00:00
Robert Watson	9ca435893b	In order to better support flexible and extensible access control, make a series of modifications to the credential arguments relating to file read and write operations to clarify which credential is used for what: - Change fo_read() and fo_write() to accept "active_cred" instead of "cred", and change the semantics of consumers of fo_read() and fo_write() to pass the active credential of the thread requesting an operation rather than the cached file cred. The cached file cred is still available in fo_read() and fo_write() consumers via fp->f_cred. These changes are largely in sys_generic.c. For each implementation of fo_read() and fo_write(), update cred usage to reflect this change and maintain current semantics: - badfo_readwrite() unchanged - kqueue_read/write() unchanged - pipe_read/write() now authorize MAC using active_cred rather than td->td_ucred - soo_read/write() unchanged - vn_read/write() now authorize MAC using active_cred but VOP_READ/WRITE() with fp->f_cred Modify vn_rdwr() to accept two credential arguments instead of a single credential: active_cred and file_cred. Use active_cred for MAC authorization, and select a credential for use in VOP_READ/WRITE() based on whether file_cred is NULL or not. If file_cred is provided, authorize the VOP using that cred, otherwise the active credential, matching current semantics. Modify current vn_rdwr() consumers to pass a file_cred if used in the context of a struct file, and to always pass active_cred. When vn_rdwr() is used without a file_cred, pass NOCRED. These changes should maintain current semantics for read/write, but avoid a redundant passing of fp->f_cred, as well as making it more clear what the origin of each credential is in file descriptor read/write operations. Follow-up commits will make similar changes to other file descriptor operations, and modify the MAC framework to pass both credentials to MAC policy modules so they can implement either semantic for revocation.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-15 20:55:08 +00:00
Dag-Erling Smørgrav	aefe27a25c	Have the kern.file sysctl export xfiles rather than files. The truth is out there! Sponsored by: DARPA, NAI Labs	2002-07-31 12:26:52 +00:00
Don Lewis	5c38b6dbce	Wire the sysctl output buffer before grabbing any locks to prevent SYSCTL_OUT() from blocking while locks are held. This should only be done when it would be inconvenient to make a temporary copy of the data and defer calling SYSCTL_OUT() until after the locks are released.	2002-07-28 19:59:31 +00:00
John Baldwin	3d3f20cbe6	Preallocate a struct file as the first thing in falloc() before we lock the filelist_lock and check nfiles. This closes a race where we had to unlock the filedesc to re-lock the filelist_lock. Reported by: David Xu Reviewed by: bde (mostly)	2002-07-17 02:48:43 +00:00
Alfred Perlstein	7f05b0353a	More caddr_t removal, make fo_ioctl take a void * instead of a caddr_t.	2002-06-29 01:50:25 +00:00
Seigo Tanimura	4cc20ab1f0	Back out my last commit of locking down a socket, it conflicts with hsu's work. Requested by: hsu	2002-05-31 11:52:35 +00:00
Seigo Tanimura	243917fe3b	Lock down a socket, milestone 1. o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket. o Determine the lock strategy for each member in struct socket. o Lock down the following members: - so_count - so_options - so_linger - so_state o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket: - sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup() Reviewed by: alfred	2002-05-20 05:41:09 +00:00
Tom Rhodes	d394511de3	More s/file system/filesystem/g	2002-05-16 21:28:32 +00:00
Alfred Perlstein	e649887b1e	Make funsetown() take a 'struct sigio **' so that the locking can be done internally. Ensure that no one can fsetown() to a dying process/pgrp. We need to check the process for P_WEXIT to see if it's exiting. Process groups are already safe because there is no such thing as a pgrp zombie, therefore the proctree lock completely protects the pgrp from having sigio structures associated with it after it runs funsetownlst. Add sigio lock to witness list under proctree and allproc, but over proc and pgrp. Seigo Tanimura helped with this.	2002-05-06 19:31:28 +00:00
Seigo Tanimura	6041fa0a60	As malloc(9) and free(9) are now Giant-free, remove the Giant lock across malloc(9) and free(9) of a pgrp or a session.	2002-05-03 07:46:59 +00:00
Seigo Tanimura	c8d8a686e4	Fix the lock order reversal between the sigio lock and a process/pgrp lock in funsetownlst() by locking the sigio lock across funsetownlst().	2002-05-03 05:32:25 +00:00
Alfred Perlstein	f132072368	Redo the sigio locking. Turn the sigio sx into a mutex. Sigio lock is really only needed to protect interrupts from dereferencing the sigio pointer in an object when the sigio itself is being destroyed. In order to do this in the most unintrusive manner change pgsigio's sigio * argument into a **, that way we can lock internally to the function.	2002-05-01 20:44:46 +00:00
Jeroen Ruigrok van der Werven	1cf1a725ff	Fix indentation which I did wrong in a previous commit. Submitted by: bde	2002-04-29 08:18:06 +00:00
Seigo Tanimura	d48d4b2501	Add a global sx sigio_lock to protect the pointer to the sigio object of a socket. This avoids lock order reversal caused by locking a process in pgsigio(). sowakeup() and the callers of it (sowwakeup, soisconnected, etc.) now require sigio_lock to be locked. Provide sowwakeup_locked(), soisconnected_locked(), and so on in case where we have to modify a socket and wake up a process atomically.	2002-04-27 08:24:29 +00:00
Alfred Perlstein	ea5b39d029	Don't FILEDESC_LOCK around calls to falloc().	2002-04-22 20:09:11 +00:00
Seigo Tanimura	1c2451c24d	Push down Giant for setpgid(), setsid() and aio_daemon(). Giant protects only malloc(9) and free(9).	2002-04-20 12:02:52 +00:00
Jacques Vidrine	e983a3762b	When exec'ing a set[ug]id program, make sure that the stdio file descriptors (0, 1, 2) are allocated by opening /dev/null for any which are not already open. Reviewed by: alfred, phk MFC after: 2 days	2002-04-19 00:45:29 +00:00
John Baldwin	ba626c1db2	Lock proctree_lock instead of pgrpsess_lock.	2002-04-16 17:11:34 +00:00
Jeroen Ruigrok van der Werven	bcbf4411d6	Use the correct macros for F_SETFD/F_GETFD instead of magic numbers. Reflect that fact in the manual page. PR: 12723 Submitted by: Peter Jeremy <peter.jeremy@alcatel.com.au> Approved by: bde MFC after: 2 weeks	2002-04-13 10:16:53 +00:00
John Baldwin	6008862bc2	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
Seigo Tanimura	5cf4bcebbf	The description of fd_mtx is "filedesc structure."	2002-03-29 11:26:05 +00:00
Jeff Roberson	c897b81311	Remove references to vm_zone.h and switch over to the new uma API. Also, remove maxsockets. If you look carefully you'll notice that the old zone allocator never honored this anyway.	2002-03-20 04:09:59 +00:00
Alfred Perlstein	4d77a549fe	Remove __P.	2002-03-19 21:25:46 +00:00
Jeff Roberson	8355f576a9	This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator. Reviewed by: arch@	2002-03-19 09:11:49 +00:00
Alfred Perlstein	4a950215ef	Close a race when vfs_syscalls.c:checkdirs() runs. To do this protect the filedesc pointer in the proc with PROC_LOCK in both checkdirs() and kern_descrip.c:fdfree().	2002-03-19 04:30:04 +00:00
Alfred Perlstein	628abf6c69	Giant pushdown for read/write/pread/pwrite syscalls. kern/kern_descrip.c: Aquire Giant in fdrop_locked when file refcount hits zero, this removes the requirement for the caller to own Giant for the most part. kern/kern_ktrace.c: Aquire Giant in ktrgenio, simplifies locking in upper read/write syscalls. kern/vfs_bio.c: Aquire Giant in bwillwrite if needed. kern/sys_generic.c Giant pushdown, remove Giant for: read, pread, write and pwrite. readv and writev aren't done yet because of the possible malloc calls for iov to uio processing. kern/sys_socket.c Grab giant in the socket fo_read/write functions. kern/vfs_vnops.c Grab giant in the vnode fo_read/write functions.	2002-03-15 08:03:46 +00:00
John Baldwin	a854ed9893	Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.	2002-02-27 18:32:23 +00:00
Seigo Tanimura	f591779bb5	Lock struct pgrp, session and sigio. New locks are: - pgrpsess_lock which locks the whole pgrps and sessions, - pg_mtx which protects the pgrp members, and - s_mtx which protects the session members. Please refer to sys/proc.h for the coverage of these locks. Changes on the pgrp/session interface: - pgfind() needs the pgrpsess_lock held. - The caller of enterpgrp() is responsible to allocate a new pgrp and session. - Call enterthispgrp() in order to enter an existing pgrp. - pgsignal() requires a pgrp lock held. Reviewed by: jhb, alfred Tested on: cvsup.jp.FreeBSD.org (which is a quad-CPU machine running -current)	2002-02-23 11:12:57 +00:00
Peter Wemm	1037bbb195	Fix broken Giant locking protocol introduced in rev 1.114. You cannot unlock Giant if it is not locked in the first place. This make the nfstat(2) syscall (#278) a nice panic(2) implementation.	2002-02-08 09:16:57 +00:00
Alfred Perlstein	3865fa138b	Remove bogus assertion in dup2 that can lead to panics when kernel threads race for a file slot. dup2(2) incorrectly assumes that if it needs to grow the ofiles array that it will get what it wants. This assertion was valid before we allowed shared filedescriptor tables but is now incorrect. The assertion can trigger superfolous panics if the thread doing a dup2 looses a race with another thread while possibly blocked in the MALLOC call in fdalloc. Another thread may grab the slot we are requesting which makes fdalloc return something other than what we asked for, this will triggering the bogus assertion. MFC after: 2 weeks Reviewed by: phk	2002-02-01 19:25:36 +00:00
Alfred Perlstein	2b39743941	Avoid lock order reversal filedesc/Giant when calling FREE() in fdalloc by unlocking the filedesc before calling FREE(). Submitted by: bde	2002-02-01 19:19:54 +00:00
Alfred Perlstein	eb20931127	Attempt to fixup select(2) and poll(2), this should fix some races with other threads as well as speed up the interfaces. To fix the race and accomplish the speedup, remove selholddrop and pollholddrop. The entire concept is somewhat bogus because holding the individual struct file pointers offers us no guarantees that another thread context won't close it on us thereby removing our access to our own reference. Selholddrop and pollholddrop also would do multiple locks and unlocks of mutexes _per-file_ in the fd arrays to be scanned, this needed to be sped up. Instead of using selholddrop and pollholddrop, simply hold the filedesc lock over the selscan and pollscan functions. This should protect us against close(2)'s on the files as reduce the multiple lock/unlock pairs per fd into a single lock over the filedesc.	2002-01-29 22:54:19 +00:00
Alfred Perlstein	5980a85f08	Backout 1.120, EINVAL isn't a proper error return when the passed fd is negative, the 'pointer' referred to by the manpage is actually the struct file's f_offset field. Pointed out by: bde	2002-01-29 17:12:10 +00:00
Alfred Perlstein	095f670d4e	in fget() return EINVAL when the descriptor requested is negative.	2002-01-23 08:40:35 +00:00
Alfred Perlstein	767567d3c2	use mutex pools for "struct file" locking. fix indentation of FILE_LOCK/UNLOCK macros while I'm here.	2002-01-20 22:58:08 +00:00
Alfred Perlstein	74aac58b52	Push down Giant in dup(2) and dup2(2), Giant is only needed when calling closef() in the case of dup2(2) duping over a descriptor and when fdalloc must grow or free a filedesc.	2002-01-15 00:58:40 +00:00
Alfred Perlstein	a4db49537b	Replace ffind_* with fget calls. Make fget MPsafe. Make fgetvp and fgetsock use the fget subsystem to reduce code bloat. Push giant down in fpathconf().	2002-01-14 00:13:45 +00:00
Alfred Perlstein	ba868b0da2	Comment fdrop and fdrop_locked functions.	2002-01-13 12:58:14 +00:00
Alfred Perlstein	c2824dd49b	Implement ffind_hold using ffind_lock. Recommended by: jhb	2002-01-13 12:57:02 +00:00
Alfred Perlstein	426da3bcfb	SMP Lock struct file, filedesc and the global file list. Seigo Tanimura (tanimura) posted the initial delta. I've polished it quite a bit reducing the need for locking and adapting it for KSE. Locks: 1 mutex in each filedesc protects all the fields. protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked. 1 mutex in each struct file protects the refcount fields. doesn't protect anything else. the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container. could likely be made to use a pool mutex. 1 sx lock for the global filelist. struct file * fhold(struct file *fp); /* increments reference count on a file */ struct file * fhold_locked(struct file *fp); /* like fhold but expects file to locked */ struct file * ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */ struct file * ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */ I still have to smp-safe the fget cruft, I'll get to that asap.	2002-01-13 11:58:06 +00:00
Jonathan Lemon	2b846bd3a5	When removing kqueue descriptors from the descriptor table during a fork, update fd_freefile and fd_lastfile as well, to keep things in sync. Pointed out by: Debbie Chu <dchu@juniper.net>	2001-12-14 19:02:57 +00:00
Matthew Dillon	b1e4abd246	Give struct socket structures a ref counting interface similar to vnodes. This will hopefully serve as a base from which we can expand the MP code. We currently do not attempt to obtain any mutex or SX locks, but the door is open to add them when we nail down exactly how that part of it is going to work.	2001-11-17 03:07:11 +00:00
Matthew Dillon	b064d43d8f	remove holdfp() Replace uses of holdfp() with fget() or fgetvp() calls as appropriate introduce fget(), fget_read(), fget_write() - these functions will take a thread and file descriptor and return a file pointer with its ref count bumped. introduce fgetvp(), fgetvp_read(), fgetvp_write() - these functions will take a thread and file descriptor and return a vref()'d vnode. _read() requires that the file pointer be FREAD, _write that it be FWRITE. This continues the cleanup of struct filedesc and struct file access routines which, when are all through with it, will allow us to then make the API calls MP safe and be able to move Giant down into the fo_* functions.	2001-11-14 06:30:36 +00:00
John Baldwin	bd78cece5d	Change the kernel's ucred API as follows: - crhold() returns a reference to the ucred whose refcount it bumps. - crcopy() now simply copies the credentials from one credential to another and has no return value. - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.	2001-10-11 23:38:17 +00:00
Jonathan Lemon	1a6fc8ef63	When FREE()ing kqueue related structures, charge them to the correct bucket. Submitted by: iedowse Forgotten by: jlemon	2001-09-30 17:00:56 +00:00
Julian Elischer	9dbea9237c	If an incoming struct proc could have been NULL before, tehn don't automatically change the code to add struct proc *p = td->td_proc; because now 'td' is probably capable of being NULL too. I expect to see more of this kind of error during the 'weeding' process. It's too easy to make. (junior hacker project.. look for these :-) Submitted by: mark Peek <mp@freebsd.org>	2001-09-12 20:26:57 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Matthew Dillon	835a82ee2d	Giant Pushdown. Saved the worst P4 tree breakage for last. reboot() getpriority() setpriority() rtprio() osetrlimit() ogetrlimit() setrlimit() getrlimit() getrusage() getpid() getppid() getpgrp() getpgid() getsid() getgid() getegid() getgroups() setsid() setpgid() setuid() seteuid() setgid() setegid() setgroups() setreuid() setregid() setresuid() setresgid() getresuid() getresgid () __setugid() getlogin() setlogin() modnext() modfnext() modstat() modfind() kldload() kldunload() kldfind() kldnext() kldstat() kldfirstmod() kldsym() getdtablesize() dup2() dup() fcntl() close() ofstat() fstat() nfsstat() fpathconf() flock()	2001-09-01 19:04:37 +00:00
Andrey A. Chernov	c8e7634357	advlock: simplify overflow checks	2001-08-29 18:53:53 +00:00
Andrey A. Chernov	4b207d9868	Move <machine/> after <sys/> Add missing fdrop() before EOVERFLOW Pointed by: bde	2001-08-23 13:19:32 +00:00
Andrey A. Chernov	69cc1d0d7f	Detect off_t EOVERFLOW of start/end offsets calculations for adv. lock, as POSIX require.	2001-08-23 07:42:40 +00:00
Chris Costello	c30d4da338	Remove the fildesc_clone() function and its associated unnecessary code. It didn't implement the proper /dev/fd functionality (which would be to include in the directory listing /dev/fd/n if the process has fd n open) anyway. Anything needing access to /dev/fd/n where n > 2 can use the optional fdescfs module, which implements this properly and does not cause any trouble with devfs. Discussed with: phk	2001-08-06 05:56:33 +00:00
Robert Watson	b1fc0ec1a7	o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing original macro that pointed. p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restruction many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparision. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done. o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account. Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit	2001-05-25 16:59:11 +00:00
Mark Murray	fb919e4d5a	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h form kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
John Baldwin	33a9ed9d0e	Change the pfind() and zpfind() functions to lock the process that they find before releasing the allproc lock and returning. Reviewed by: -smp, dfr, jake	2001-04-24 00:51:53 +00:00
Poul-Henning Kamp	f83880518b	Send the remains (such as I have located) of "block major numbers" to the bit-bucket.	2001-03-26 12:41:29 +00:00
Poul-Henning Kamp	71d033119f	Make the pseudo-driver for "/dev/fd/*" handle fd's larger than 255. PR: 25936	2001-03-20 13:26:13 +00:00
Jonathan Lemon	608a3ce62a	Extend kqueue down to the device layer. Backwards compatible approach suggested by: peter	2001-02-15 16:34:11 +00:00
David Malone	7cc0979fd6	Convert more malloc+bzero to malloc+M_ZERO. Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>	2000-12-08 21:51:06 +00:00
Matthew Dillon	279d722604	This patchset fixes a large number of file descriptor race conditions. Pre-rfork code assumed inherent locking of a process's file descriptor array. However, with the advent of rfork() the file descriptor table could be shared between processes. This patch closes over a dozen serious race conditions related to one thread manipulating the table (e.g. closing or dup()ing a descriptor) while another is blocked in an open(), close(), fcntl(), read(), write(), etc... PR: kern/11629 Discussed with: Alexander Viro <viro@math.psu.edu>	2000-11-18 21:01:04 +00:00
Alan Cox	4a71feb71c	Add missing call to knote_fdclose() in setugidsafety() and fdcloseexec(). Reviewed by: jlemon	2000-10-28 20:27:32 +00:00
Poul-Henning Kamp	db90128160	Avoid the modules madness I inadvertently introduced by making the cloning infrastructure standard in kern_conf. Modules are now the same with or without devfs support. If you need to detect if devfs is present, in modules or elsewhere, check the integer variable "devfs_present". This happily removes an ugly hack from kern/vfs_conf.c. This forces a rename of the eventhandler and the standard clone helper function. Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include like <sys/queue.h> Remove all #includes of opt_devfs.h they no longer matter.	2000-09-02 19:17:34 +00:00
Alfred Perlstein	c58b821e4c	new sysctl 'kern.openfiles' (exports nfiles to userland)	2000-08-26 23:49:44 +00:00
Poul-Henning Kamp	d8cd1501f2	Dang, a _clone routine escaped #ifdef DEVFS containment.	2000-08-24 15:59:44 +00:00
Poul-Henning Kamp	a481b90b82	Fix panic when removing open device (found by bp@) Implement subdirs. Build the full "devicename" for cloning functions. Fix panic when deleted device goes away. Collaps devfs_dir and devfs_dirent structures. Add proper cloning to the /dev/fd* "device-"driver. Fix a bug in make_dev_alias() handling which made aliases appear multiple times. Use devfs_clone to implement getdiskbyname() Make specfs maintain the stat(2) timestamps per dev_t	2000-08-24 15:36:55 +00:00
Peter Wemm	37b087a645	Clean up some low level bootstrap code: - stop using the evil 'struct trapframe' argument for mi_startup() (formerly main()). There are much better ways of doing it. - do not use prepare_usermode() - setregs() in execve() will do it all for us as long as the p_md.md_regs pointer is set. (which is now done in machdep.c rather than init_main.c. The Alpha port did it this way all along and is much cleaner). - collect all the magic %cr0 etc register settings into one place and have the AP's call that instead of using magic numbers (!!) that keep changing over and over again. - Make it safe to call kthread_create() earlier, including during the device probe sequence. It doesn't need the callback mechanism that NetBSD's version uses. - kthreads created this way are root-less as they exist before the root filesystem is mounted. init(1) is set up so that it aquires the root pointers prior to running. If other kthreads want filesystem acccess we can make this code more generic. - set all threads start times once we have decided what time it is. - init uses a trampoline rather than the evil prepare_usermode() hack. - kern_descrip.c has a couple of tweaks to deal with forking when there is no rootdir or cwd etc. - adjust the early SYSINIT() sequence so that a few prereqisites are in place. eg: make sure the run queue is initialized before doing forks. With this, the USB code can easily create a kthread to do the device tree discovery. (I have tested it, it works nicely). There are still some open issues before this is truely useful. - tsleep() does not like working before the clock is running. It sort-of tries to spin wait, but it can do more useful things now. - stopping a kthread in kld code at unload time is "interesting" but we have a solution for that. The Alpha code needs no changes for this. It already uses pretty much the same strategies, but a little cleaner.	2000-08-11 09:05:12 +00:00
Poul-Henning Kamp	77978ab8bc	Previous commit changing SYSCTL_HANDLER_ARGS violated KNF. Pointed out by: bde	2000-07-04 11:25:35 +00:00
Poul-Henning Kamp	82d9ae4e32	Style police catches up with rev 1.26 of src/sys/sys/sysctl.h: Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our sources: -sysctl_vm_zone SYSCTL_HANDLER_ARGS +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)	2000-07-03 09:35:31 +00:00
Alfred Perlstein	1a61fa5e0d	don't panic the system when fpathconv is called on an unsupported filetype.	2000-06-27 23:08:36 +00:00
Jake Burkholder	e39756439c	Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others	2000-05-26 02:09:24 +00:00
Jake Burkholder	740a1973a6	Change the way that the queue(3) structures are declared; don't assume that the type argument to _HEAD and _ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd	2000-05-23 20:41:01 +00:00
Jonathan Lemon	cb679c385e	Introduce kqueue() and kevent(), a kernel event notification facility.	2000-04-16 18:53:38 +00:00
Warner Losh	27e2c03a27	Fix the style bugs in the style bugs fix. The style bug fix made the new function inconsistant with the rest of this file. The spelling and grammer fixes were good and remain.	2000-01-21 06:57:52 +00:00
Brian Feldman	bd9079fa6c	Fix style bugs in the last commit.	2000-01-21 02:52:54 +00:00
Warner Losh	7001be49f8	bdeize last commit: o Remove opt_dontuse.h and ifdef PROCFS Subitted by: bde, peter	2000-01-20 17:03:53 +00:00
Warner Losh	5e2664428c	When we are execing a setugid program, and we have a procfs filesystem file open in one of the special file descriptors (0, 1, or 2), close it before completing the exec. Submitted by: nergal@idea.avet.com.pl Constructive comments: deraadt@openbsd.org, sef, peter, jkh	2000-01-20 07:12:52 +00:00
Bruce Evans	f85bdfcc66	Removed unused includes. Rumoved unused compatibility cruft for dup(). Using it today would just break dup() on fd's >= 64. Fixed some style bugs.	1999-12-26 14:07:43 +00:00
Matthew Dillon	151f7a5d8a	Only bother converting the stat structure if we intend to return it, when no error occurs. PR: kern/14966 Reviewed by: dillon@freebsd.org Submitted by: Kelly Yancey kbyanc@posi.net	1999-11-18 08:08:28 +00:00
Peter Wemm	2c77a71d3d	Remove cdevsw_add() - the necessary make_dev() calls appear to be there already.	1999-11-18 06:34:47 +00:00
Poul-Henning Kamp	2e3c8fcbd0	This is a partial commit of the patch from PR 14914: Alot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY structures for list operations. This patch makes all list operations in sys/kern use the queue(3) macros, rather than directly accessing the *Q_{HEAD,ENTRY} structures. This batch of changes compile to the same object files. Reviewed by: phk Submitted by: Jake Burkholder <jake@checker.org> PR: 14914	1999-11-16 10:56:05 +00:00
Peter Wemm	cf87559cab	Use fo_stat() rather than duplicating knowledge of file type internals in here for stat(2) and friends. Update the badops entries accordingly.	1999-11-08 03:27:14 +00:00
Brian Feldman	d91e41c8c9	Fix the advisory file locking by restoring previous ordering in closef()/ fdrop(). This only showed up when a file descriptor was duplicated and then closed once, where the lock would be released on the first close().	1999-11-07 05:58:38 +00:00
Peter Wemm	d1f088dab5	Trim unused options (or #ifdef for undoc options). Submitted by: phk	1999-10-11 15:19:12 +00:00
Poul-Henning Kamp	d6a0e38a1b	Remove five now unused fields from struct cdevsw. They should never have been there in the first place. A GENERIC kernel shrinks almost 1k. Add a slightly different safetybelt under nostop for tty drivers. Add some missing FreeBSD tags	1999-09-25 18:24:47 +00:00
Poul-Henning Kamp	2fe5bd8bb8	Fix a hole in jail(2). Noticed by: Alexander Bezroutchko <abb@zenon.net>	1999-09-25 14:14:21 +00:00
Brian Feldman	13ccadd4b0	This is what was "fdfix2.patch," a fix for fd sharing. It's pretty far-reaching in fd-land, so you'll want to consult the code for changes. The biggest change is that now, you don't use fp->f_ops->fo_foo(fp, bar) but instead fo_foo(fp, bar), which increments and decrements the fp refcount upon entry and exit. Two new calls, fhold() and fdrop(), are provided. Each does what it seems like it should, and if fdrop() brings the refcount to zero, the fd is freed as well. Thanks to peter ("to hell with it, it looks ok to me.") for his review. Thanks to msmith for keeping me from putting locks everywhere :) Reviewed by: peter	1999-09-19 17:00:25 +00:00
Peter Wemm	c3aac50f28	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
Poul-Henning Kamp	9dcbe2404a	Convert DEVFS hooks in (most) drivers to make_dev(). Diskslice/label code not yet handled. Vinum, i4b, alpha, pc98 not dealt with (left to respective Maintainers) Add the correct hook for devfs to kern_conf.c The net result of this excercise is that a lot less files depends on DEVFS, and devtoname() gets more sensible output in many cases. A few drivers had minor additional cleanups performed relating to cdevsw registration. A few drivers don't register a cdevsw{} anymore, but only use make_dev().	1999-08-23 20:59:21 +00:00
Brian Feldman	e32c66c539	Fix fd race conditions (during shared fd table usage.) Badfileops is now used in f_ops in place of NULL, and modifications to the files are more carefully ordered. f_ops should also be set to &badfileops upon "close" of a file. This does not fix other problems mentioned in this PR than the first one. PR: 11629 Reviewed by: peter	1999-08-04 18:53:50 +00:00
Mike Smith	79fc0bf4a0	From the submitter: - this causes POSIX locking to use the thread group leader (p->p_leader) as the locking thread for all advisory locks. In non-kernel-threaded code p->p_leader == p, so this will have no effect. This results in (more) correct POSIX threaded flock-ing semantics. It also prevents the leader from exiting before any of the children. (so that p->p_leader will never be stale) in exit1(). We have been running this patch for over a month now in our lab under load and at customer sites. Submitted by: John Plevyak <jplevyak@inktomi.com>	1999-06-07 20:37:29 +00:00
Poul-Henning Kamp	2447bec829	Simplify cdevsw registration. The cdevsw_add() function now finds the major number(s) in the struct cdevsw passed to it. cdevsw_add_generic() is no longer needed, cdevsw_add() does the same thing. cdevsw_add() will print an message if the d_maj field looks bogus. Remove nblkdev and nchrdev variables. Most places they were used bogusly. Instead check a dev_t for validity by seeing if devsw() or bdevsw() returns NULL. Move bdevsw() and devsw() functions to kern/kern_conf.c Bump __FreeBSD_version to 400006 This commit removes: 72 bogus makedev() calls 26 bogus SYSINIT functions if_xe.c bogusly accessed cdevsw[], author/maintainer please fix. I4b and vinum not changed. Patches emailed to authors. LINT probably broken until they catch up.	1999-05-31 11:29:30 +00:00
Poul-Henning Kamp	4e2f199e0c	This commit should be a extensive NO-OP: Reformat and initialize correctly all "struct cdevsw". Initialize the d_maj and d_bmaj fields. The d_reset field was not removed, although it is never used. I used a program to do most of this, so all the files now use the same consistent format. Please keep it that way. Vinum and i4b not modified, patches emailed to respective authors.	1999-05-30 16:53:49 +00:00
Poul-Henning Kamp	bfbb9ce670	Divorce "dev_t" from the "major|minor" bitmap, which is now called udev_t in the kernel but still called dev_t in userland. Provide functions to manipulate both types: major() umajor() minor() uminor() makedev() umakedev() dev2udev() udev2dev() For now they're functions, they will become in-line functions after one of the next two steps in this process. Return major/minor/makedev to macro-hood for userland. Register a name in cdevsw[] for the "filedescriptor" driver. In the kernel the udev_t appears in places where we have the major/minor number combination, (ie: a potential device: we may not have the driver nor the device), like in inodes, vattr, cdevsw registration and so on, whereas the dev_t appears where we carry around a reference to a actual device. In the future the cdevsw and the aliased-from vnode will be hung directly from the dev_t, along with up to two softc pointers for the device driver and a few houskeeping bits. This will essentially replace the current "alias" check code (same buck, bigger bang). A little stunt has been provided to try to catch places where the wrong type is being used (dev_t vs udev_t), if you see something not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if it makes a difference. If it does, please try to track it down (many hands make light work) or at least try to reproduce it as simply as possible, and describe how to do that. Without DEVT_FASCIST I belive this patch is a no-op. Stylistic/posixoid comments about the userland view of the <sys/*.h> files welcome now, from userland they now contain the end result. Next planned step: make all dev_t's refer to the same devsw[] which means convert BLK's to CHR's at the perimeter of the vnodes and other places where they enter the game (bootdev, mknod, sysctl).	1999-05-11 19:55:07 +00:00
Bill Fumerola	3d177f465a	Add sysctl descriptions to many SYSCTL_XXXs PR: kern/11197 Submitted by: Adrian Chadd <adrian@FreeBSD.org> Reviewed by: billf(spelling/style/minor nits) Looked at by: bde(style)	1999-05-03 23:57:32 +00:00
Dmitrij Tejblum	604359cf9b	s/static foo_devsw_installed = 0;/static int foo_devsw_installed;/. (Edited automatically)	1999-04-28 10:54:24 +00:00
Eivind Eklund	5526d2d920	Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT as discussed on -hackers. Introduce 'KASSERT(assertion, ("panic message", args))' for simple check + panic. Reviewed by: msmith	1999-01-08 17:31:30 +00:00
Don Lewis	62d6ce3af2	I got another batch of suggestions for cosmetic changes from bde.	1998-11-11 10:56:07 +00:00
Don Lewis	831d27a9f5	Installed the second patch attached to kern/7899 with some changes suggested by bde, a few other tweaks to get the patch to apply cleanly again and some improvements to the comments. This change closes some fairly minor security holes associated with F_SETOWN, fixes a few bugs, and removes some limitations that F_SETOWN had on tty devices. For more details, see the description on the PR. Because this patch increases the size of the proc and pgrp structures, it is necessary to re-install the includes and recompile libkvm, the vinum lkm, fstat, gcore, gdb, ipfilter, ps, top, and w. PR: kern/7899 Reviewed by: bde, elvind	1998-11-11 10:04:13 +00:00
Bruce Evans	d974cf4dda	Fixed printf format errors.	1998-07-29 17:38:14 +00:00
Bruce Evans	1ede4662be	Cast longs to intptr_t before casting them to pointers. Fixed bitrot in pseudo-declaration of `struct fcntl_args'. fcntl() is now broken in some cases when ints are larger than longs.	1998-07-15 06:10:16 +00:00
Doug Rabson	2ef49ddfcb	64bit fixes: p->p_retval is a register_t[] not an int[].	1998-06-10 10:27:43 +00:00
John Dyson	1f56217280	Fix the futimes/undelete/utrace conflict with other BSD's. Note that the only common usage of utrace (the possible problem with this commit) is with malloc, so this should be a real problem. Add the various NetBSD syscalls that allow full emulation of their development environment.	1998-05-11 03:55:28 +00:00
John Dyson	9f24f214c3	Make the rootdir handling more consistent. Now, processes always have a root vnode associated with them, and no special checks for the null case are needed. Submitted by: terry@freebsd.org	1998-02-15 04:17:09 +00:00
Eivind Eklund	0b08f5f737	Back out DIAGNOSTIC changes.	1998-02-06 12:14:30 +00:00
Eivind Eklund	47cfdb166d	Turn DIAGNOSTIC into a new-style option.	1998-02-04 22:34:03 +00:00
Eivind Eklund	7b778b5e61	Make all file-system (MFS, FFS, NFS, LFS, DEVFS) related option new-style. This introduce an xxxFS_BOOT for each of the rootable filesystems. (Presently not required, but encouraged to allow a smooth move of option *FS to opt_dontuse.h later.) LFS is temporarily disabled, and will be re-enabled tomorrow.	1998-01-24 02:54:56 +00:00
Eivind Eklund	5591b823d1	Make COMPAT_43 and COMPAT_SUNOS new-style options.	1997-12-16 17:40:42 +00:00
John Dyson	fd3bf77574	Fix and complete the AIO syscalls. There are some performance enhancements coming up soon, but the code is functional. Docs will be forthcoming.	1997-11-29 01:33:10 +00:00
Bruce Evans	a3c78a768e	Fixed a missing conversion of retval to p_retval in disabled code. Fixed overflow of FFLAGS() in fcntl(F_SETFL, ...). This was not a security hole, but gave wrong results for silly flags values. E.g., it made fcntl(F_SETFL, -1) equivalent to fcntl(F_SETFL, 0). POSIX requires ignoring the open mode bits in fcntl() (even if they would be invalid for open()).	1997-11-23 12:24:59 +00:00
Bruce Evans	d826c47904	Fixed duplicate definitions of M_FILE (one static).	1997-11-23 10:43:49 +00:00
Poul-Henning Kamp	cb226aaa62	Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead. This fixes a boatload of compiler warnings, and removes a lot of cruft from the sources. I have not removed the /ARGSUSED/, they will require some looking at. libkvm, ps and other userland struct proc frobbing programs will need to be recompiled.	1997-11-06 19:29:57 +00:00
Poul-Henning Kamp	a1c995b626	Last major round (Unless Bruce thinks of something :-) of malloc changes. Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde	1997-10-12 20:26:33 +00:00
Poul-Henning Kamp	55166637cd	Distribute and staticize a lot of the malloc M_* types. Substantial input from: bde	1997-10-11 18:31:40 +00:00
Peter Wemm	51338ea83c	Various select -> poll changes	1997-09-14 02:52:18 +00:00
Bruce Evans	32545fd108	Removed some stale comments. Fixed a gratuitous ANSIism.	1997-08-26 00:09:44 +00:00
Bruce Evans	9dd8309d56	Removed support for OLD_PIPE. <sys/stat.h> is now missing the hack that supported nameless pipes being indistinguishable from fifos. We're not going back.	1997-04-09 16:53:45 +00:00
Peter Wemm	6875d25465	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.	1997-02-22 09:48:43 +00:00
Jordan K. Hubbard	1130b656e5	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.	1997-01-14 07:20:47 +00:00
John Dyson	8b612c4b4a	This commit is the embodiment of some VFS read clustering improvements. Firstly, now our read-ahead clustering is on a file descriptor basis and not on a per-vnode basis. This will allow multiple processes reading the same file to take advantage of read-ahead clustering. Secondly, there previously was a problem with large reads still using the ramp-up algorithm. Of course, that was bogus, and now we read the entire "chunk" off of the disk in one operation. The read-ahead clustering algorithm should use less CPU than the previous also (I hope :-)). NOTE: THAT LKMS MUST BE REBUILT!!!	1996-12-29 02:45:28 +00:00
Bruce Evans	b8b6f5017f	Fixed nonexistent checking of lock types for F_GETLK. Found by: NIST-PCTS	1996-12-19 19:59:51 +00:00
Bruce Evans	bb65f5a1cc	Fixed lseek() on named pipes. It always succeeded but should always fail. Broke locking on named pipes in the same way as locking on non-vnodes (wrong errno). This will be fixed later. The fix involves negative logic. Named pipes are now distinguished from other types of files with vnodes, and there is additional code to handle vnodes and named pipes in the same way only where that makes sense (not for lseek, locking or TIOCSCTTY).	1996-12-19 19:42:37 +00:00
Bruce Evans	efebc4ab84	Fixed bitrot in the read-only attribute: - kern.maxfilesperproc was read-only (and thus essentially useless). Removed unused #includes. Strength-reduced used #includes.	1996-09-28 16:33:21 +00:00
Sujal Patel	de71b88098	Fix fdavail() so that it correctly pays attention to the rlimit. Fixes unp_externalize panic which occurs when a process is at its ulimit for file descriptors and tries to receive a file descriptor from another process. Reviewed by: wollman	1996-08-15 16:33:32 +00:00
Bill Paul	8a095c52ed	Add a couple of #ifdef DEVFS/#endif clauses to silence the following compiler warnings which occur if you don't have 'options DEVFS' in your kernel config file: ../../kern/kern_descrip.c: In function `fildesc_drvinit': ../../kern/kern_descrip.c:1103: warning: unused variable `fd' ../../kern/kern_descrip.c: At top level: ../../kern/kern_descrip.c:1095: warning: `devfs_token_stdin' defined but not used ../../kern/kern_descrip.c:1096: warning: `devfs_token_stdout' defined but not used ../../kern/kern_descrip.c:1097: warning: `devfs_token_stderr' defined but not used ../../kern/kern_descrip.c:1098: warning: `devfs_token_fildesc' defined but not used	1996-06-17 16:54:03 +00:00
Gary Palmer	c23670e294	Clean up -Wunused warnings. Reviewed by: bde	1996-06-12 05:11:41 +00:00
Bruce Evans	8fb3332429	Fixed the unit numbers of the devfs `fd' devices. Made the devfs `fd' devices bug for bug compatible with the ones created by MAKEDEV: - ownership is bin.bin, not root.wheel, except for std. The devfsext interface doesn't seem to allow specifying the ownership of /devfs/fd, so it's still incompatible. - std aren't links to fd/[0-2].	1996-03-27 19:19:58 +00:00
Jeffrey Hsu	4b50ceef3b	Merge in Lite2: LIST replacement for f_filef, f_fileb, and filehead. Did not accept change of second argument to ioctl from int to u_long. Reviewed by: davidg & bde	1996-03-11 02:17:30 +00:00
Peter Wemm	dabee6fecc	kern_descrip.c: add fdshare()/fdcopy() kern_fork.c: add the tiny bit of code for rfork operation. kern/sysv_: shmfork() takes one less arg, it was never used. sys/shm.h: drop "isvfork" arg from shmfork() prototype sys/param.h: declare rfork args.. (this is where OpenBSD put it..) sys/filedesc.h: protos for fdshare/fdcopy. vm/vm_mmap.c: add minherit code, add rounding to mmap() type args where it makes sense. vm/: drop unused isvfork arg. Note: this rfork() implementation copies the address space mappings, it does not connect the mappings together. ie: once the two processes have split, the pages may be shared, but the address space is not. If one does a mmap() etc, it does not appear in the other. This makes it not useful for pthreads, but it is useful in its own right for having light-weight threads in a static shared address space. Obtained from: Original by Ron Minnich, extended by OpenBSD	1996-02-23 18:49:25 +00:00
John Dyson	2834ceec7c	Improve the performance for pipe(2) again. Also include some fixes for previous version of new pipes from Bruce Evans. This new version: Supports more properly the semantics of select (BDE). Supports "OLD_PIPE" correctly (kern_descrip.c, BDE). Eliminates incorrect EPIPE returns (bash 'pipe broken' messages.) Much faster yet, currently tuned relatively conservatively -- but now gives approx 50% more perf than the new pipes code did originally. (That was about 50% more perf than the original BSD pipe code.) Known bugs outstanding: No support for async io (SIGIO). Will be included soon. Next to do: Merge support for FIFOs. Submitted by: bde	1996-02-04 19:56:35 +00:00
John Dyson	f982721359	Enable the new fast pipe code. The old pipes can be used with the "OLD_PIPE" config option.	1996-01-28 23:41:40 +00:00
Poul-Henning Kamp	87b6de2b76	A Major staticize sweep. Generates a couple of warnings that I'll deal with later. A number of unused vars removed. A number of unused procs removed or #ifdefed.	1995-12-14 08:32:45 +00:00
Poul-Henning Kamp	d2f265fab8	Julian forgot to make the *devsw structures static.	1995-12-08 23:23:00 +00:00
Julian Elischer	87f6c6625d	Pass 3 of the great devsw changes most devsw referenced functions are now static, as they are in the same file as their devsw structure. I've also added DEVFS support for nearly every device in the system, however many of the devices have 'incorrect' names under DEVFS because I couldn't quickly work out the correct naming conventions. (but devfs won't be coming on line for a month or so anyhow so that doesn't matter) If you "OWN" a device which would normally have an entry in /dev then search for the devfs_add_devsw() entries and munge to make them right.. check out similar devices to see what I might have done in them if you can't see what's going on.. for a laugh compare conf.c conf.h before and after... :) I have not done DEVFS entries for any DISKSLICE devices yet as that will be a much more complicated job.. (pass 5 :) pass 4 will be to make the devsw tables of type (cdevsw * ) rather than (cdevsw) seems to work here.. complaints to the usual places.. :)	1995-12-08 11:19:42 +00:00
David Greenman	efeaf95a41	Untangled the vm.h include file spaghetti.	1995-12-07 12:48:31 +00:00
Bruce Evans	4cb03b1b55	Include <vm/vm.h> or <vm/vm_page.h> explicitly to avoid breaking when vnode_if.h doesn't include vm stuff.	1995-12-05 21:51:45 +00:00
Poul-Henning Kamp	946bb7a268	A major sweep over the sysctl stuff. Move a lot of variables home to their own code (In good time before xmas :-) Introduce the string description of format. Add a couple more functions to poke into these marvels, while I try to decide what the correct interface should look like. Next is adding vars on the fly, and sysctl looking at them too. Removed a tiny bit of defunct and #ifdefed notused code in swapgeneric.	1995-12-04 16:48:58 +00:00
Bruce Evans	98d938220c	Completed function declarations and/or added prototypes.	1995-12-02 18:58:56 +00:00
Julian Elischer	7198bf4725	If you're going to mechanically replicate something in 50 files it's best to not have a (compiles cleanly) typo in it! (sigh)	1995-11-29 14:41:20 +00:00
Julian Elischer	53ac6efbd8	OK, that's it.. That's EVERY SINGLE driver that has an entry in conf.c.. my next trick will be to define cdevsw[] and bdevsw[] as empty arrays and remove all those DAMNED defines as well.. Each of these drivers has a SYSINIT linker set entry that comes in very early.. and asks the driver to add its own entry to the two devsw[] tables. some slight reworking of the commits from yesterday (added the SYSINIT stuff and some usually wrong but token DEVFS entries to all these devices). BTW does anyone know where the 'ata' entries in conf.c actually reside? seems we don't actually have a 'ataopen() etc... If you want to add a new device in conf.c please make sure I know so I can keep it up to date too.. as before, this is all dependent on #if defined(JREMOD) (and #ifdef DEVFS in parts)	1995-11-29 10:49:16 +00:00
Poul-Henning Kamp	18e6fe0250	Add new-style sysctl for KERN_FILE here.	1995-11-14 08:58:35 +00:00
Bruce Evans	d2d3e8751c	Included <sys/sysproto.h> to get central declarations for syscall args structs and prototypes for syscalls. Ifdefed duplicated decentralized declarations of args structs. It's convenient to have this visible but they are hard to maintain. Some are already different from the central declarations. 4.4lite2 puts them in comments in the function headers but I wanted to avoid the large changes for that.	1995-11-12 06:43:28 +00:00
David Greenman	079cc25b11	Killed a few gratuitous #include's.	1995-10-21 08:38:13 +00:00
Steven Wallace	ad7507e248	Remove prototype definitions from <sys/systm.h>. Prototypes are located in <sys/sysproto.h>. Add appropriate #include <sys/sysproto.h> to files that needed protos from systm.h. Add structure definitions to appropriate files that relied on sys/systm.h, right before system call definition, as in the rest of the kernel source. In kern_prot.c, instead of using the dummy structure "args", create individual dummy structures named <syscall>_args. This makes life easier for prototype generation.	1995-10-08 00:06:22 +00:00
Rodney W. Grimes	9b2e535452	Remove trailing whitespace.	1995-05-30 08:16:23 +00:00
Bruce Evans	3aa12267a5	Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) that I didn't notice when I fixed "all" such warnings before.	1995-03-28 07:58:53 +00:00
Guido van Rooij	e6373c9ec0	Implement maxprocperuid and maxfilesperproc. They are tunable via sysctl(8). The initial value of maxprocperuid is maxproc-1, that of maxfilesperproc is maxfiles (until maxfile will disappear) Now it is at least possible to prohibit one user opening maxfiles -Guido Submitted by: Obtained from:	1995-02-20 19:42:42 +00:00
Bruce Evans	20989d2d64	Obtained from: my fix for 1.1.5 Remove compatibility hack so that dup(fd) isn't interpreted as dup2(fd & 0x3f, random_junk_on_stack_fd) when (fd & 0x3f) != 0.	1994-12-12 12:27:39 +00:00
Poul-Henning Kamp	797f2d22f0	All of this is cosmetic. prototypes, #includes, printfs and so on. Makes GCC a lot more silent.	1994-10-02 17:35:40 +00:00
Poul-Henning Kamp	bb56ec4a05	While in the real world, I had a bad case of being swapped out for a lot of cycles. While waiting there I added a lot of the extra ()'s I have, (I have never used LISP to any extent). So I compiled the kernel with -Wall and shut up a lot of "suggest you add ()'s", removed a bunch of unused var's and added a couple of declarations here and there. Having a lap-top is highly recommended. My kernel still runs, yell at me if your kernel breaks.	1994-09-25 19:34:02 +00:00
David Greenman	b36a2ba1ce	munmapfd() was being called with one too few params - bug introduced during my initial kernel port.	1994-09-02 10:17:30 +00:00
David Greenman	3c4dd3568f	Added $Id$	1994-08-02 07:55:43 +00:00
Rodney W. Grimes	26f9a76710	The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch. Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman	1994-05-25 09:21:21 +00:00
Rodney W. Grimes	df8bae1de4	BSD 4.4 Lite Kernel Sources	1994-05-24 10:09:53 +00:00

615 Commits