freebsd-dev

Author	SHA1	Message	Date
Poul-Henning Kamp	7f21497282	Add missing break.	2004-11-16 06:57:52 +00:00
Poul-Henning Kamp	f661e9a0bc	Straighten the ioctl function out to have only one exit point.	2004-11-15 21:51:28 +00:00
Poul-Henning Kamp	ef11fbd7c4	Introduce fdclose() which will clean an entry in a filedesc. Replace homerolled versions with call to fdclose(). Make fdunused() static to kern_descrip.c	2004-11-07 22:16:07 +00:00
Mike Silbersack	5173e8f567	Major enhancements to pipe memory usage: - pipespace is now able to resize non-empty pipes; this allows for many more resizing opportunities - Backing is no longer pre-allocated for the reverse direction of pipes. This direction is rarely (if ever) used, so this cuts the amount of map space allocated to a pipe in half. - Pipe growth is now much more dynamic; a pipe will now grow when the total amount of data it contains and the size of the write are larger than the size of pipe. Previously, only individual writes greater than the size of the pipe would cause growth. - In low memory situations, pipes will now shrink during both read and write operations, where possible. Once the memory shortage ends, the growth code will cause these pipes to grow back to an appropriate size. - If the full PIPE_SIZE allocation fails when a new pipe is created, the allocation will be retried with SMALL_PIPE_SIZE. This helps to deal with the situation of a fragmented map after a low memory period has ended. - Minor documentation + code changes to support the above. In total, these changes increase the total number of pipes that can be allocated simultaneously, drastically reducing the chances that pipe allocation will fail. Performance appears unchanged due to dynamic resizing.	2004-08-16 01:27:24 +00:00
John-Mark Gurney	ad3b9257c2	Add locking to the kqueue subsystem. This also makes the kqueue subsystem a more complete subsystem, and removes the knowledge of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using its filter ops. Currently, it uses the MTX_DUPOK even though it is not always safe to acquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases). Reviewed by: green, rwatson (both earlier versions)	2004-08-15 06:24:42 +00:00
Mike Silbersack	e10ecdea88	Standardize pipe locking, ensuring that everything is locked via pipelock(), not via a mixture of mutexes and pipelock(). Additionally, add a few KASSERTS, and change some statements that should have been KASSERTS into KASSERTS. As a result of these cleanups, some segments of code have become significantly shorter and/or easier to read.	2004-08-03 02:59:15 +00:00
Brian Feldman	b23f72e98a	* Add a "how" argument to uma_zone constructors and initialization functions so that they know whether the allocation is supposed to be able to sleep or not. * Allow uma_zone constructors and initialization functions to return either success or error. Almost all of the ones in the tree currently return success unconditionally, but mbuf is a notable exception: the packet zone constructor wants to be able to fail if it cannot suballocate an mbuf cluster, and the mbuf allocators want to be able to fail in general in a MAC kernel if the MAC mbuf initializer fails. This fixes the panics people are seeing when they run out of memory for mbuf clusters. * Allow debug.nosleepwithlocks on WITNESS to be disabled, without changing the default. Both bmilekic and jeff have reviewed the changes made to make failable zone allocations work.	2004-08-02 00:18:36 +00:00
Robert Watson	46b25cb5f6	Don't perform pipe endpoint locking during pipe_create(), as the pipe can't yet be referenced by other threads. In microbenchmarks, this appears to reduce the cost of pipe();close();close() on UP by 10%, and SMP by 7%. The vast majority of the cost of allocating a pipe remains VM magic. Suggested by: silby	2004-07-23 14:11:04 +00:00
Mike Silbersack	eb3d2c61b4	Fix a minor error in pipe_stat - st_size was always reported as 0 when direct writes kicked in. Whether this affected any applications is unknown.	2004-07-20 07:06:43 +00:00
Alan Cox	e3b19536fb	Revise the direct or optimized case to use uiomove_fromphys() by the reader instead of ephemeral mappings using pmap_qenter() by the writer. The writer is still, however, responsible for wiring the pages, just not mapping them. Consequently, the allocation of KVA for the direct case is unnecessary. Remove it and the sysctls limiting it, i.e., kern.ipc.maxpipekvawired and kern.ipc.amountpipekvawired. The number of temporarily wired pages is still, however, limited by kern.ipc.maxpipekva. Note: On platforms lacking a direct virtual-to-physical mapping, uiomove_fromphys() uses sf_bufs to cache ephemeral mappings. Thus, the number of available sf_bufs can influence the performance of pipes on platforms such as i386. Surprisingly, I saw the greatest gain from this change on such a machine: lmbench's pipe bandwidth result increased from ~1050MB/s to ~1850MB/s on my 2.4GHz, 400MHz FSB P4 Xeon.	2004-03-27 19:50:23 +00:00
Robert Watson	049ffe98a8	Assert pipe mutex in pipeselwakeup(), as we manipulate pipe_state in a non-atomic manner. It appears to always be called with the mutex (good).	2004-02-26 00:18:22 +00:00
Robert Watson	094bdd260c	Update comment regarding MAC labels: we no longer pass endpoints into the MAC Framework, just the pipe pair. GC 'hadpeer' used in pipedestroy(), which is no longer needed as we check pipe_present flags on the pair.	2004-02-25 23:30:56 +00:00
Brian Feldman	240160d48b	Correct some major SMP-harmful problems in the pipe implementation. First of all, PIPE_EOF is not checked pervasively after everything that can drop the pipe mutex and msleep(), so fix. Additionally, though it might not harm anything, pipelock() and pipeunlock() are not used consistently. Third, the kqueue support functions do not use the pipe mutex correctly. Last, but absolutely not least, is a race: if pipe_busy is not set on the closing side of the pipe, the other side that is trying to write to that will crash BECAUSE PIPE_EOF IS NOT SET! Unconditionally set PIPE_EOF, and get rid of all the lockups/crashes I have seen trying to build ports.	2004-02-22 23:00:14 +00:00
Robert Watson	4f638130c3	Don't dec/inc the amountpipes counter every time we resize a pipe -- instead, just dec/inc in the ctor/dtor. For now, increment/decrement in two's, since we're now performing the operation once per pair, not once per pipe. Not really any measurable performance change in my micro-benchmarks, but doing less work is good, especially when it comes to atomic operations. Suggested by: alc	2004-02-03 04:55:24 +00:00
Robert Watson	9a830ddc54	Catch instances of (pipe == NULL) that were obsoleted with recent changes to jointly allocated pipe pairs. Replace these checks with pipe_present checks. This avoids a NULL pointer dereference when a pipe is half-closed. Submitted by: Peter Edwards <peter.edwards@openet-telecom.com>	2004-02-03 02:50:51 +00:00
Robert Watson	4795b82c13	Coalesce pipe allocations and frees. Previously, the pipe code would allocate two 'struct pipe's from the pipe zone, and malloc a mutex. - Create a new "struct pipepair" object holding the two 'struct pipe' instances, struct mutex, and struct label reference. Pipe structures now have a back-pointer to the pipe pair, and a 'pipe_present' flag to indicate whether the half has been closed. - Perform mutex init/destroy in zone init/destroy, avoiding reallocating the mutex for each pipe. Perform most pipe structure setup in zone constructor. - VM memory mappings for pageable buffers are still done outside of the UMA zone. - Change MAC API to speak 'struct pipepair' instead of 'struct pipe', update many policies. MAC labels are also handled outside of the UMA zone for now. Label-only policy modules don't have to be recompiled, but if a module is recompiled, its pipe entry points will need to be updated. If a module actually reached into the pipe structures (unlikely), that would also need to be modified. These changes substantially simplify failure handling in the pipe code as there are many fewer possible failure modes. On half-close, pipes no longer free the 'struct pipe' for the closed half until a full-close takes place. However, VM mapped buffers are still released on half-close. Some code refactoring is now possible to clean up some of the back references, etc; this patch attempts not to change the structure of most of the pipe implementation, only allocation/free code paths, so as to avoid introducing bugs (hopefully). This cuts about 8%-9% off the cost of sequential pipe allocation and free in system call tests on UP and SMP in my micro-benchmarks. May or may not make a difference in macro-benchmarks, but doing less work is good. Reviewed by: juli, tjr Testing help: dwhite, fenestro, scottl, et al	2004-02-01 05:56:51 +00:00
Robert Watson	26518e8d8c	Fix an error in a KASSERT string: it's pipe_free_kmem(), not pipespace(), that contains this KASSERT.	2004-01-31 23:03:22 +00:00
Dag-Erling Smørgrav	a2fe44e8cf	New file descriptor allocation code, derived from similar code introduced in OpenBSD by Niels Provos. The patch introduces a bitmap of allocated file descriptors which is used to locate available descriptors when a new one is needed. It also moves the task of growing the file descriptor table out of fdalloc(), reducing complexity in both fdalloc() and do_dup(). Debts of gratitude are owed to tjr@ (who provided the original patch on which this work is based), grog@ (for the gdb(4) man page) and rwatson@ (for assistance with pxeboot(8)).	2004-01-15 10:15:04 +00:00
Dag-Erling Smørgrav	ac34dc4e79	Back out 1.160, which was committed by mistake.	2004-01-11 20:08:57 +00:00
Dag-Erling Smørgrav	0e5dfade00	Mechanical whitespace cleanup.	2004-01-11 19:54:45 +00:00
Dag-Erling Smørgrav	012b5531f4	Mechanical whitespace cleanup + minor style nits.	2004-01-11 19:43:14 +00:00
Mike Silbersack	69fba1650a	Fix the maxpipekva warning message so that it points to the correct sysctl, and shorten the message. Noticed by: bde	2003-12-28 01:19:58 +00:00
Seigo Tanimura	512824f8f7	- Implement selwakeuppri() which allows raising the priority of a thread being woken up. The thread woken up can run at a priority as high as after tsleep(). - Replace selwakeup()s with selwakeuppri()s and pass appropriate priorities. - Add cv_broadcastpri() which raises the priority of the broadcast threads. Used by selwakeuppri() if collision occurs. Not objected in: -arch, -current	2003-11-09 09:17:26 +00:00
Alan Cox	3b2c54e7bc	- Delay the allocation of memory for the pipe mutex until we need it. This avoids the need to free said memory in various error cases along the way.	2003-11-06 05:58:26 +00:00
Alan Cox	fc17df5264	- Simplify pipespace() by eliminating the explicit creation of vm objects. Instead, let the vm objects be lazily instantiated at fault time. This results in the allocation of fewer vm objects and vm map entries due to aggregation in the vm system.	2003-11-06 05:08:12 +00:00
Robert Watson	730ecf8254	Unlock pipe mutex when failing MAC pipe ioctl access control check. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-11-03 17:58:23 +00:00
Mike Silbersack	184dcdc7c8	Change all SYSCTLS which are readonly and have a related TUNABLE from CTLFLAG_RD to CTLFLAG_RDTUN so that sysctl(8) can provide more useful error messages.	2003-10-21 18:28:36 +00:00
David Malone	e1419c08e2	falloc allocates a file structure and adds it to the file descriptor table, acquiring the necessary locks as it works. It usually returns two references to the new descriptor: one in the descriptor table and one via a pointer argument. As falloc releases the FILEDESC lock before returning, there is a potential for a process to close the reference in the file descriptor table before falloc's caller gets to use the file. I don't think this can happen in practice at the moment, because Giant indirectly protects closes. To stop the file being completely closed in this situation, this change makes falloc set the refcount to two when both references are returned. This makes life easier for several of falloc's callers, because the first thing they previously did was grab an extra reference on the file. Reviewed by: iedowse Idea run past: jhb	2003-10-19 20:41:07 +00:00
John-Mark Gurney	9e5de980c6	fix a problem referencing free'd memory. This is only a problem for kqueue write events on a socket and you regularly create tons of pipes which overwrites the structure causing a panic when removing the knote from the list. If the peer has gone away (and it's a write knote), then don't bother trying to remove the knote from the list. Submitted by: Brian Buchanan and myself Obtained from: nCircle	2003-10-12 07:06:02 +00:00
Alan Cox	27d203eab3	pipe_build_write_buffer() only requires read access of the page that it obtains from pmap_extract_and_hold().	2003-09-12 07:13:15 +00:00
Alan Cox	03be99d20c	Use pmap_extract_and_hold() in pipe_build_write_buffer(). Consequently, pipe_build_write_buffer() no longer requires Giant on entry. Reviewed by: tegge	2003-09-08 04:58:32 +00:00
Alan Cox	603d3d4a44	Giant is no longer required by pipe_destroy_write_buffer(). Reduce unnecessary white space from pipe_destroy_write_buffer().	2003-09-06 21:02:10 +00:00
John-Mark Gurney	fc8684cd46	If we got this far, we definitely don't have an EBADF. Return a more sane result of EPIPE. Reported by: nCircle dev team MFC after: 3 day	2003-08-15 04:31:01 +00:00
Alan Cox	77685ea594	- The vm_object pointer in pipe_buffer is unused. Remove it. - Check for successful initialization of pipe_zone in pipeinit() rather than every call to pipe(2).	2003-08-13 20:01:38 +00:00
Alan Cox	ad8204e3f5	Pipespace() no longer requires Giant.	2003-08-11 22:23:25 +00:00
Mike Silbersack	cebde06978	More pipe changes: From alc: Move pageable pipe memory to a separate kernel submap to avoid awkward vm map interlocking issues. (Bad explanation provided by me.) From me: Rework pipespace accounting code to handle this new layout, and adjust our default values to account for the fact that we now have a solid limit on allocations. Also, remove the "maxpipes" limit, as it no longer has a purpose. (The limit on kva usage solves the problem of having too many pipes.)	2003-08-11 05:51:51 +00:00
Alan Cox	f9999c67be	Use vm_page_hold() instead of vm_page_wire(). Otherwise, a multithreaded application could cause a wired page to be freed. In general, vm_page_hold() should be preferred for ephemeral kernel mappings of pages borrowed from a user-level address space. (vm_page_wire() should really be reserved for indefinite duration pinning by the "owner" of the page.) Discussed with: silby Submitted by: tegge	2003-08-11 00:17:44 +00:00
Alan Cox	9c62fce085	- Remove GIANT_REQUIRED from pipespace(). - Remove a duplicate initialization from pipe_create().	2003-08-08 22:38:15 +00:00
Alan Cox	f9b1de367e	- Remove GIANT_REQUIRED from pipe_free_kmem(). - Remove the acquisition and release of Giant around pipe_kmem_free() and uma_zfree() in pipeclose().	2003-08-07 04:32:40 +00:00
Pierre Beyssac	ae9fcf4c66	Remove test in pipe_write() which causes write(2) to return EAGAIN on a non-blocking pipe in cases where select(2) returns the file descriptor as ready for write. This in turn causes libc_r, for one, to busy wait in such cases. Note: it is a quick performance fix, a more complex fix might be required in case this turns out to have unexpected side effects. Reviewed by: silby MFC after: 3 days	2003-07-30 22:50:37 +00:00
Alan Cox	93b4c5b707	The introduction of vm object locking has caused witness to reveal a long-standing mistake in the way a portion of a pipe's KVA is allocated. Specifically, kmem_alloc_pageable() is inappropriate for use in the "direct" case because it allows a preceding vm map entry and vm object to be extended to support the new KVA allocation. However, the direct case KVA allocation should not have a backing vm object. This is corrected by using kmem_alloc_nofault(). Submitted by: tegge (with the above explanation by me)	2003-07-30 18:55:04 +00:00
Mike Silbersack	ff56f15e26	A few minor changes: - Use atomic ops to update the bigpipe count - Make the bigpipe count sysctl readable - Remove a duplicate comparison in an if statement - Comment two SYSCTLs.	2003-07-09 21:59:48 +00:00
Mike Silbersack	289016f2d1	Put some concrete limits on pipe memory consumption: - Limit the total number of pipes so that we do not exhaust all vm objects in the kernel map. When this limit is reached, a ratelimited message will be printed to the console. - Put a soft limit on the amount of memory consumable by pipes. Once the limit has been reached, all new pipes will be limited to 4K in size, rather than the default of 16K. - Put a limit on the number of pages that may be used for high speed page flipping in order to reduce the amount of wired memory. Pipe writes that occur while this limit is exceeded will fall back to non-page flipping mode. The above values are auto-tuned in subr_param.c and are scaled to take into account both the size of physical memory and the size of the kernel map. These limits help to reduce the "kernel resources exhausted" panics that could be caused by opening a large number of pipes. (Pipes alone are no longer able to exhaust all resources, but other kernel memory hogs in league with pipes may still be able to do so.) PR: 53627 Ideas / comments from: hsu, tjr, dillon@apollo.backplane.com MFC after: 1 week	2003-07-08 04:02:31 +00:00
Poul-Henning Kamp	7c2d2efd58	Initialize struct fileops with C99 sparse initialization.	2003-06-18 18:16:40 +00:00
David E. O'Brien	677b542ea2	Use __FBSDID().	2003-06-11 00:56:59 +00:00
Maxime Henrion	0ca5dc1c3e	style(9).	2003-06-09 21:57:48 +00:00
Jeffrey Hsu	c31548c820	Need to hold the same SMP lock for (knote) list traversal as for list manipulation. This lock also protects read-modify-write operations on the pipe_state field.	2003-04-02 15:24:50 +00:00
Jake Burkholder	227f9a1c58	- Add vm_paddr_t, a physical address type. This is required for systems where physical addresses are larger than virtual addresses, such as i386s with PAE. - Use this to represent physical addresses in the MI vm system and in the i386 pmap code. This also changes the paddr parameter to d_mmap_t. - Fix printf formats to handle physical addresses >4G in the i386 memory detection code, and due to kvtop returning vm_paddr_t instead of u_long. Note that this is a name change only; vm_paddr_t is still the same as vm_offset_t on all currently supported platforms. Sponsored by: DARPA, Network Associates Laboratories Discussed with: re, phk (cdevsw change)	2003-03-25 00:07:06 +00:00
Warner Losh	a163d034fa	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
Alfred Perlstein	e7d6662f1b	Do not allow kqueues to be passed via unix domain sockets.	2003-02-15 06:04:55 +00:00
Alan Cox	2bd63062b5	Use atomic ops to update amountpipekva. Amountpipekva represents the total kernel virtual address space used by all pipes. It is, thus, outside the scope of any individual pipe lock.	2003-02-13 19:39:54 +00:00
Alfred Perlstein	44956c9863	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
Matthew Dillon	48e3128b34	Bow to the whining masses and change a union back into void *. Retain removal of unnecessary casts and throw in some minor cleanups to see if anyone complains, just for the hell of it.	2003-01-13 00:33:17 +00:00
Matthew Dillon	cd72f2180b	Change struct file f_data to un_data, a union of the correct struct pointer types, and remove a huge number of casts from code using it. Change struct xfile xf_data to xun_data (ABI is still compatible). If we need to add a #define for f_data and xf_data we can, but I don't think it will be necessary. There are no operational changes in this commit.	2003-01-12 01:37:13 +00:00
Poul-Henning Kamp	a7010ee2f4	White-space changes.	2002-12-24 09:44:51 +00:00
Poul-Henning Kamp	f3a682116c	Detediousficate declaration of fileops array members by introducing typedefs for them.	2002-12-23 21:53:20 +00:00
Alfred Perlstein	8ced1eb281	Remove a KASSERT I added in 1.73 to catch uninitialized pipes. It must be removed because it is done without the pipe being locked via pipelock() and therefore is vulnerable to races with pipespace() erroneously triggering it by temporarily zero'ing out the structure backing the pipe. It looks as if this assertion is not needed because all manipulation of the data changed by pipespace() _is_ protected by pipelock(). Reported by: kris, mckusick	2002-10-14 21:15:04 +00:00
Alfred Perlstein	1e31f88689	whitespace fixes.	2002-10-12 22:26:41 +00:00
Mike Barcroft	2b7f24d210	Change iov_base's type from `char *' to the standard `void *'. All uses of iov_base which assume its type is `char *' (in order to do pointer arithmetic) have been updated to cast iov_base to `char *'.	2002-10-11 14:58:34 +00:00
Don Lewis	91e97a8266	In an SMP environment post-Giant it is no longer safe to blindly dereference the struct sigio pointer without any locking. Change fgetown() to take a reference to the pointer instead of a copy of the pointer and call SIGIO_LOCK() before copying the pointer and dereferencing it. Reviewed by: rwatson	2002-10-03 02:13:00 +00:00
Robert Watson	1aa37f5392	Improve locking of pipe mutexes in the context of MAC: (1) Where previously the pipe mutex was selectively grabbed during pipe_ioctl(), now always grab it and then release it if not needed. This protects the call to mac_check_pipe_ioctl() to make sure the label remains consistent. (Note: it looks like sigio locking may be incorrect for fgetown() since we call it not-by-reference and sigio locking assumes call by reference). (2) In pipe_stat(), lock the pipe if MAC is compiled in so that the call to mac_check_pipe_stat() gets a locked pipe to protect label consistency. We still release the lock before returning actual stat() data, risking inconsistency, but apparently our pipe locking model accepts that risk. (3) In various pipe MAC authorization checks, assert that the pipe lock is held. (4) Grab the lock when performing a pipe relabel operation, and assert it a little deeper in the stack. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-01 04:30:19 +00:00
Poul-Henning Kamp	37c841831f	Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512	2002-09-28 17:15:38 +00:00
Archie Cobbs	55f7c614fd	Don't use "NULL" when "0" is really meant.	2002-08-21 23:39:52 +00:00
Robert Watson	c024c3eeb1	Break out mac_check_pipe_op() into component check entry points: mac_check_pipe_poll(), mac_check_pipe_read(), mac_check_pipe_stat(), and mac_check_pipe_write(). This improves consistency with other access control entry points and permits security modules to only control the object methods that they are interested in, avoiding switch statements. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-19 16:59:37 +00:00
Robert Watson	d49fa1ca6e	In continuation of early fileop credential changes, modify fo_ioctl() to accept an 'active_cred' argument reflecting the credential of the thread initiating the ioctl operation. - Change fo_ioctl() to accept active_cred; change consumers of the fo_ioctl() interface to generally pass active_cred from td->td_ucred. - In fifofs, initialize filetmp.f_cred to ap->a_cred so that the invocations of soo_ioctl() are provided access to the calling f_cred. Pass ap->a_td->td_ucred as the active_cred, but note that this is required because we don't yet distinguish file_cred and active_cred in invoking VOP's. - Update kqueue_ioctl() for its new argument. - Update pipe_ioctl() for its new argument, pass active_cred rather than td_ucred to MAC for authorization. - Update soo_ioctl() for its new argument. - Update vn_ioctl() for its new argument, use active_cred rather than td->td_ucred to authorize VOP_IOCTL() and the associated VOP_GETATTR(). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-17 02:36:16 +00:00
Robert Watson	49cde51dfd	Correct white space nits that crept in during my recent merges of trustedbsd_mac material.	2002-08-16 14:12:40 +00:00
Robert Watson	ea6027a8e1	Make similar changes to fo_stat() and fo_poll() as made earlier to fo_read() and fo_write(): explicitly use the cred argument to fo_poll() as "active_cred" using the passed file descriptor's f_cred reference to provide access to the file credential. Add an active_cred argument to fo_stat() so that implementers have access to the active credential as well as the file credential. Generally modify callers of fo_stat() to pass in td->td_ucred rather than fp->f_cred, which was redundantly provided via the fp argument. This set of modifications also permits threads to perform these operations on behalf of another thread without modifying their credential. Trickle this change down into fo_stat/poll() implementations: - badfo_poll(), badfo_stat(): modify/add arguments. - kqueue_poll(), kqueue_stat(): modify arguments. - pipe_poll(), pipe_stat(): modify/add arguments, pass active_cred to MAC checks rather than td->td_ucred. - soo_poll(), soo_stat(): modify/add arguments, pass fp->f_cred rather than cred to pru_sopoll() to maintain current semantics. - sopoll(): modify arguments. - vn_poll(), vn_statfile(): modify/add arguments, pass new arguments to vn_stat(). Pass active_cred to MAC and fp->f_cred to VOP_POLL() to maintain current semantics. - vn_close(): rename cred to file_cred to reflect reality while I'm here. - vn_stat(): Add active_cred and file_cred arguments to vn_stat() and consumers so that this distinction is maintained at the VFS as well as 'struct file' layer. Pass active_cred instead of td->td_ucred to MAC and to VOP_GETATTR() to maintain current semantics. - fifofs: modify the creation of a "filetemp" so that the file credential is properly initialized and can be used in the socket code if desired. Pass ap->a_td->td_ucred as the active credential to soo_poll(). If we teach the vnop interface about the distinction between file and active credentials, we would use the active credential here. Note that current inconsistent passing of active_cred vs. file_cred to VOP's is maintained. It's not clear why GETATTR would be authorized using active_cred while POLL would be authorized using file_cred at the file system level. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-16 12:52:03 +00:00
Robert Watson	9ca435893b	In order to better support flexible and extensible access control, make a series of modifications to the credential arguments relating to file read and write operations to clarify which credential is used for what: - Change fo_read() and fo_write() to accept "active_cred" instead of "cred", and change the semantics of consumers of fo_read() and fo_write() to pass the active credential of the thread requesting an operation rather than the cached file cred. The cached file cred is still available in fo_read() and fo_write() consumers via fp->f_cred. These changes are largely in sys_generic.c. For each implementation of fo_read() and fo_write(), update cred usage to reflect this change and maintain current semantics: - badfo_readwrite() unchanged - kqueue_read/write() unchanged - pipe_read/write() now authorize MAC using active_cred rather than td->td_ucred - soo_read/write() unchanged - vn_read/write() now authorize MAC using active_cred but VOP_READ/WRITE() with fp->f_cred Modify vn_rdwr() to accept two credential arguments instead of a single credential: active_cred and file_cred. Use active_cred for MAC authorization, and select a credential for use in VOP_READ/WRITE() based on whether file_cred is NULL or not. If file_cred is provided, authorize the VOP using that cred, otherwise the active credential, matching current semantics. Modify current vn_rdwr() consumers to pass a file_cred if used in the context of a struct file, and to always pass active_cred. When vn_rdwr() is used without a file_cred, pass NOCRED. These changes should maintain current semantics for read/write, but avoid a redundant passing of fp->f_cred, as well as making it more clear what the origin of each credential is in file descriptor read/write operations. Follow-up commits will make similar changes to other file descriptor operations, and modify the MAC framework to pass both credentials to MAC policy modules so they can implement either semantic for revocation. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-15 20:55:08 +00:00
Robert Watson	925860774d	Introduce support for labeling and access control of pipe objects as part of the TrustedBSD MAC framework. Instrument the creation and destruction of pipes, as well as relevant operations, with necessary calls to the MAC framework. Note that the locking here is probably not quite right yet, but fixes will be forthcoming. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-13 02:47:13 +00:00
Dag-Erling Smørgrav	ea4c8f8ca1	Check the far end before registering an EVFILT_WRITE filter on a pipe.	2002-08-05 15:03:03 +00:00
Alfred Perlstein	1a5a641600	Remove unneeded caddr_t casts.	2002-07-22 19:05:44 +00:00
Alan Cox	b416fa1041	o Lock accesses to the page queues. o Add a comment explaining why hoisting the page queue lock outside of a particular loop is not possible.	2002-07-13 04:09:45 +00:00
Alfred Perlstein	7f05b0353a	More caddr_t removal, make fo_ioctl take a void * instead of a caddr_t.	2002-06-29 01:50:25 +00:00
Alfred Perlstein	52545a237b	document that the pipe fo_stat routine doesn't need locks because it's a read operation. Requested by: rwatson	2002-06-28 22:35:12 +00:00
Alfred Perlstein	e649887b1e	Make funsetown() take a 'struct sigio **' so that the locking can be done internally. Ensure that no one can fsetown() to a dying process/pgrp. We need to check the process for P_WEXIT to see if it's exiting. Process groups are already safe because there is no such thing as a pgrp zombie, therefore the proctree lock completely protects the pgrp from having sigio structures associated with it after it runs funsetownlst. Add sigio lock to witness list under proctree and allproc, but over proc and pgrp. Seigo Tanimura helped with this.	2002-05-06 19:31:28 +00:00
Alfred Perlstein	f132072368	Redo the sigio locking. Turn the sigio sx into a mutex. Sigio lock is really only needed to protect interrupts from dereferencing the sigio pointer in an object when the sigio itself is being destroyed. In order to do this in the most unintrusive manner change pgsigio's sigio * argument into a **, that way we can lock internally to the function.	2002-05-01 20:44:46 +00:00
Thomas Moestl	8db523989f	Use pmap_extract() instead of pmap_kextract() to retrieve the physical address associated with a user virtual address in pipe_build_write_buffer(). Reviewed by: alc	2002-04-13 20:09:06 +00:00
Thomas Moestl	de67a4bd91	Back out the last revision - it does not work correctly when one of the pages in question is not in the top-level vm object, but in one of the shadow ones. Pointed out by: alc Pointy hat to: tmm	2002-04-13 00:03:07 +00:00
Thomas Moestl	60f2606a7d	Do not use pmap_kextract() to find out the physical address belonging to a user virtual address; while this happens to work on some architectures, it can't on sparc64, since user and kernel virtual address spaces overlap there (the distinction between them is done via separate address space identifiers). Instead, look up the page in the vm_map of the process in question. Reviewed by: jake	2002-04-12 19:38:41 +00:00
John Baldwin	6008862bc2	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
Alan Cox	cd430164f1	Allow recursion on the pipe mutex because filt_piperead() and filt_pipewrite() can be called both with and without the pipe mutex held. (For example, if called by pipeselwakeup(), it is held. Whereas, if called by kqueue_scan(), it is not.) Reviewed by: alfred	2002-03-27 21:47:50 +00:00
Alfred Perlstein	db51256707	When "cloning" a pipe's buffer bcopy the data after dropping the pipe's lock as the data may be paged out and cause a fault.	2002-03-22 16:09:22 +00:00
Jeff Roberson	c897b81311	Remove references to vm_zone.h and switch over to the new uma API. Also, remove maxsockets. If you look carefully you'll notice that the old zone allocator never honored this anyway.	2002-03-20 04:09:59 +00:00
Jeff Roberson	8355f576a9	This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator. Reviewed by: arch@	2002-03-19 09:11:49 +00:00
Alfred Perlstein	3b018f572d	Bug fixes: Missed a place where the pipe sleep lock was needed in order to safely grab Giant, fix it and add an assertion to make sure this doesn't happen again. Fix typos in the PIPE_GET_GIANT/PIPE_DROP_GIANT that could cause the wrong mutex to get passed to PIPE_LOCK/PIPE_UNLOCK. Fix a location where the wrong pipe was being passed to PIPE_GET_GIANT/PIPE_DROP_GIANT.	2002-03-15 07:18:09 +00:00
Alfred Perlstein	be4af4b723	Don't deref NULL mutex pointer when pipeclose()'ing a pipe that is not fully instantiated. Revert the logic in pipeclose so that we don't have the entire function pretty much under a single if() statement, instead invert the test and just return if it fails. Submitted (in different form) by: bde Don't use pool mutexes for pipes. We can not use pool mutexes because we will need to grab the select lock while holding a pipe lock which is not allowed because you may not acquire additional mutexes when holding a pool mutex. Instead malloc(9) space for the mutex that is shared between the pipes.	2002-03-09 22:06:31 +00:00
Seigo Tanimura	996abba928	Track the number of wired pages to avoid unwiring unwired pages. Reviewed by: alfred	2002-03-05 00:51:03 +00:00
Alfred Perlstein	9f01374de5	kill __P.	2002-02-27 18:51:53 +00:00
Alfred Perlstein	566c1313a3	add assertions in the places where giant is required to catch when the pipe is locked and shouldn't be. initialize pipe->pipe_mtxp to NULL when creating pipes in order not to trip the above assertions. swap pipe lock with giant around calls to pipe_destroy_write_buffer() pipe_destroy_write_buffer issue noticed by: jhb	2002-02-27 18:49:58 +00:00
Alfred Perlstein	21dbcfd500	Fix a NULL deref panic in pipe_write, we can't blindly lock pipe->pipe_peer->pipe_mtxp because it may be NULL, so lock the passed in pipe's mutex instead.	2002-02-27 17:23:16 +00:00
Alfred Perlstein	ffddaaeeeb	MPsafe fixes: use SYSINIT to initialize pipe_zone. use PIPE_LOCK to protect kevent ops.	2002-02-27 11:27:48 +00:00
Alfred Perlstein	f81b04d96c	First rev at making pipe(2) pipe's MPsafe. Both ends of the pipe share a pool_mutex, this makes allocation and deadlock avoidance easy. Remove some un-needed FILE_LOCK ops while I'm here. There are some issues wrt to select and the f{s,g}etown code that we'll have to deal with, I think we may also need to move the calls to vfs_timestamp outside of the sections covered by PIPE_LOCK.	2002-02-27 07:35:59 +00:00
Alfred Perlstein	426da3bcfb	SMP Lock struct file, filedesc and the global file list. Seigo Tanimura (tanimura) posted the initial delta. I've polished it quite a bit reducing the need for locking and adapting it for KSE. Locks: 1 mutex in each filedesc protects all the fields. protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked. 1 mutex in each struct file protects the refcount fields. doesn't protect anything else. the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container. could likely be made to use a pool mutex. 1 sx lock for the global filelist. struct file * fhold(struct file *fp); /* increments reference count on a file */ struct file * fhold_locked(struct file *fp); /* like fhold but expects file to locked */ struct file * ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */ struct file * ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */ I still have to smp-safe the fget cruft, I'll get to that asap.	2002-01-13 11:58:06 +00:00
Maxim Sobolev	783c41d432	Make kevents on pipes work as described in the manpage - when the last reader/writer disconnects, ensure that anybody who is waiting for the kevent on the other end of the pipe gets EV_EOF. MFC after: 2 weeks	2001-11-19 09:25:30 +00:00
John Baldwin	ed01445d8f	Use the passed in thread to selrecord() instead of curthread.	2001-09-21 22:46:54 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Matthew Dillon	7b9673fa28	cleanup: GIANT macros, rename DEPRECIATE to DEPRECATE Move p_giant_optional to proc zero'd section Remove (old) XXX zfree comment in pipe code	2001-07-04 17:11:03 +00:00
Matthew Dillon	0cddd8f023	With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.	2001-07-04 16:20:28 +00:00
Jonathan Lemon	7b748f0a21	Correctly hook up the write kqfilter to pipes. Submitted by: Niels Provos <provos@citi.umich.edu>	2001-06-15 20:45:01 +00:00
Matthew Dillon	1b3e974a71	The pipe_write() code was locking the pipe without busying it first in certain cases, and a close() by another process could potentially rip the pipe out from under the (blocked) locking operation. Reported-by: Alexander Viro <viro@math.psu.edu>	2001-06-04 04:04:45 +00:00


230 Commits