freebsd-nq

Author	SHA1	Message	Date
Julian Elischer	e602ba25fd	Part 1 of KSE-III The ability to schedule multiple threads per process (on one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools) Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands) NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals.	2002-06-29 17:26:22 +00:00
Alfred Perlstein	c33c825169	Implement SO_NOSIGPIPE option for sockets. This allows one to request that an EPIPE error return not generate SIGPIPE on sockets. Submitted by: lioux Inspired by: Darwin	2002-06-20 18:52:54 +00:00
Poul-Henning Kamp	c4bacc1871	Remove the compat bits for the mis-aligned struct disklabel on alpha, people got three times longer than I promised. Sponsored by: DARPA & NAI Labs.	2002-06-19 08:37:02 +00:00
Kelly Yancey	9ae6d334da	Make nselcol, the number of select collisions since boot, unsigned as negative collisions simply don't make sense. PR: (one small part of) 19720 Approved by: alfred	2002-06-12 02:08:18 +00:00
John Baldwin	60a9bb197d	Catch up to changes in ktrace API.	2002-06-07 05:37:18 +00:00
Alan Cox	82641acd17	o Correct an error made in revision 1.65: In readv(), if uap->iovcnt is out-of-range, drop the file reference before returning. (This error also exists in the RELENG_4 branch.) o Eliminate the acquisition and release of Giant in readv() now that malloc() and free() are callable without Giant.	2002-05-09 02:30:41 +00:00
Poul-Henning Kamp	0b5d880d39	As promised make the hack for sizeof(struct disklabel) on alpha annoying. Run make world (or recompile whatever program whines) to get rid of warning. Compat bits will be removed entirely in about two weeks.	2002-05-02 21:53:39 +00:00
John Baldwin	6008862bc2	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
Poul-Henning Kamp	f67ad03a25	Delete the bogus d_boot[01] fields from struct disklabel. This shrinks the size 4 bytes on alpha, down to the same 276 bytes as all other platforms. Construct a hack to make old ioctls work on new kernels. Once world is recompiled only the new and correct sysctls will be used. This hack will become annoying around 1st of may to make people rebuild their worlds and it will be gone before 5.0.	2002-04-04 20:34:48 +00:00
Alfred Perlstein	4d77a549fe	Remove __P.	2002-03-19 21:25:46 +00:00
Alfred Perlstein	628abf6c69	Giant pushdown for read/write/pread/pwrite syscalls. kern/kern_descrip.c: Acquire Giant in fdrop_locked when file refcount hits zero, this removes the requirement for the caller to own Giant for the most part. kern/kern_ktrace.c: Acquire Giant in ktrgenio, simplifies locking in upper read/write syscalls. kern/vfs_bio.c: Acquire Giant in bwillwrite if needed. kern/sys_generic.c Giant pushdown, remove Giant for: read, pread, write and pwrite. readv and writev aren't done yet because of the possible malloc calls for iov to uio processing. kern/sys_socket.c Grab giant in the socket fo_read/write functions. kern/vfs_vnops.c Grab giant in the vnode fo_read/write functions.	2002-03-15 08:03:46 +00:00
Alfred Perlstein	85f190e4d1	Fixes to make select/poll mpsafe. Problem: selwakeup required calling pfind which would cause lock order reversals with the allproc_lock and the per-process filedesc lock. Solution: Instead of recording the pid of the select()'ing process into the selinfo structure, actually record a pointer to the thread. To avoid dereferencing a bad address all the selinfo structures that are in use by a thread are kept in a list hung off the thread (protected by sellock). When a selwakeup occurs the selinfo is removed from that thread's list, it is also removed on the way out of select or poll where the thread will traverse its list removing all the selinfos from its own list. Problem: Previously the PROC_LOCK was used to provide the mutual exclusion needed to ensure proper locking, this couldn't work because there was a single condvar used for select and poll and condvars can only be used with a single mutex. Solution: Introduce a global mutex 'sellock' which is used to provide mutual exclusion when recording events to wait on as well as performing notification when an event occurs. Interesting note: schedlock is required to manipulate the per-thread TDF_SELECT flag, however if given its own field it would not need schedlock, also because TDF_SELECT is only manipulated under sellock one doesn't actually use schedlock for synchronization, only to protect against corruption. Proc locks are no longer used in select/poll. Portions contributed by: davidc	2002-03-14 01:32:30 +00:00
Alfred Perlstein	bbbb04ce62	Remove __P	2002-03-09 22:44:37 +00:00
Alfred Perlstein	4658f926c0	Remove unused variables in select(2) from previous delta. Pointed out by: bde	2002-01-30 19:48:25 +00:00
Alfred Perlstein	eb20931127	Attempt to fixup select(2) and poll(2), this should fix some races with other threads as well as speed up the interfaces. To fix the race and accomplish the speedup, remove selholddrop and pollholddrop. The entire concept is somewhat bogus because holding the individual struct file pointers offers us no guarantees that another thread context won't close it on us thereby removing our access to our own reference. Selholddrop and pollholddrop also would do multiple locks and unlocks of mutexes _per-file_ in the fd arrays to be scanned, this needed to be sped up. Instead of using selholddrop and pollholddrop, simply hold the filedesc lock over the selscan and pollscan functions. This should protect us against close(2)'s on the files as well as reduce the multiple lock/unlock pairs per fd into a single lock over the filedesc.	2002-01-29 22:54:19 +00:00
Alfred Perlstein	97fa4397d3	make pread use fget_read instead of holdfp.	2002-01-23 08:22:59 +00:00
Alfred Perlstein	aa11a498ff	undo a bit of the Giant pushdown. fdrop isn't SMP safe as it may call into the file's close routine which definitely is not SMP safe right now, so we hold Giant over calls to fdrop now.	2002-01-19 01:03:54 +00:00
Alfred Perlstein	b5c93a560d	Fix giant handling in pwrite(2), I forgot to release it when finishing the syscall.	2002-01-16 21:33:41 +00:00
Alfred Perlstein	a4db49537b	Replace ffind_* with fget calls. Make fget MPsafe. Make fgetvp and fgetsock use the fget subsystem to reduce code bloat. Push giant down in fpathconf().	2002-01-14 00:13:45 +00:00
Alfred Perlstein	426da3bcfb	SMP Lock struct file, filedesc and the global file list. Seigo Tanimura (tanimura) posted the initial delta. I've polished it quite a bit reducing the need for locking and adapting it for KSE. Locks: 1 mutex in each filedesc protects all the fields. protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked. 1 mutex in each struct file protects the refcount fields. doesn't protect anything else. the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container. could likely be made to use a pool mutex. 1 sx lock for the global filelist. struct file * fhold(struct file *fp); /* increments reference count on a file */ struct file * fhold_locked(struct file *fp); /* like fhold but expects file to locked */ struct file * ffind_hold(struct thread *, int fd); /* finds the struct file in thread, adds one reference and returns it unlocked */ struct file * ffind_lock(struct thread *, int fd); /* ffind_hold, but returns file locked */ I still have to smp-safe the fget cruft, I'll get to that asap.	2002-01-13 11:58:06 +00:00
Matthew Dillon	b064d43d8f	remove holdfp() Replace uses of holdfp() with fget() or fgetvp() calls as appropriate introduce fget(), fget_read(), fget_write() - these functions will take a thread and file descriptor and return a file pointer with its ref count bumped. introduce fgetvp(), fgetvp_read(), fgetvp_write() - these functions will take a thread and file descriptor and return a vref()'d vnode. _read() requires that the file pointer be FREAD, _write that it be FWRITE. This continues the cleanup of struct filedesc and struct file access routines which, when we are all through with it, will allow us to then make the API calls MP safe and be able to move Giant down into the fo_* functions.	2001-11-14 06:30:36 +00:00
John Baldwin	fea2ab833e	The P_SELECT flag was moved from p->p_flag to td->td_flags, but p_flag was locked by the proc lock and td_flags is locked by the sched_lock. The places that read, set, and cleared TDF_SELECT weren't updated, so they read and modified td_flags w/o holding the sched_lock, meaning that they could corrupt the per-thread flags field. As an immediate band-aid, grab sched_lock while reading and manipulating td_flags in relation to TDF_SELECT. This will probably be cleaned up some later on.	2001-09-21 22:06:22 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Matthew Dillon	ad2edad94e	Giant Pushdown: read() pread() readv() write () pwrite() writev() ioctl() select () poll() openbsd_poll()	2001-09-01 19:34:23 +00:00
Seigo Tanimura	1b36970495	Back out scanning file descriptors with holding a process lock. selrecord() requires allproc sx in pfind(), resulting in lock order reversal between allproc and a process lock.	2001-05-15 10:19:57 +00:00
Seigo Tanimura	265fc98f36	- Convert msleep(9) in select(2) and poll(2) to cv_wait(9). - Since polling should not involve sleeping, keep holding a process lock upon scanning file descriptors. - Hold a reference to every file descriptor prior to entering polling loop in order to avoid lock order reversal between lockmgr and p_mtx upon calling fdrop() in fo_poll(). (NOTE: this work has not been done for netncp and netsmb yet because a socket itself has no reference counts.) Reviewed by: jhb	2001-05-14 05:26:48 +00:00
John Baldwin	33a9ed9d0e	Change the pfind() and zpfind() functions to lock the process that they find before releasing the allproc lock and returning. Reviewed by: -smp, dfr, jake	2001-04-24 00:51:53 +00:00
John Baldwin	19eb87d22a	Grab the process lock while calling psignal and before calling psignal.	2001-03-07 03:37:06 +00:00
Jonathan Lemon	ea0237ed11	Correctly declare variables as u_int rather than doing typecasts. Kill some register declarations while I'm here. Submitted by: bde (1)	2001-02-27 15:11:31 +00:00
Jonathan Lemon	0b7088c4d0	Cast nfds to u_int before range checking it in order to catch negative values. PR: 25393	2001-02-27 00:50:20 +00:00
Peter Wemm	2bd5ac330f	poll(2) array limits (take 2) - after some input from bde.	2001-02-09 08:10:22 +00:00
Bosko Milekic	9ed346bab0	Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarly, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)	2001-02-09 06:11:45 +00:00
Peter Wemm	89b716473e	The code I picked up from NetBSD in '97 had a nasty bug. It limited the index of the pollfd array to the number of fd's currently open, not the maximum number of fd's. ie: if you had 0,1,2 open, you could not use pollfd slots higher than 20. The specs say we only have to support OPEN_MAX [64] entries but we allow way more than that.	2001-02-07 23:28:01 +00:00
John Baldwin	e04ac2fe6b	- Catch up to proc flag changes. - Add proc locking for selwakeup() and selrecord().	2001-01-24 11:12:37 +00:00
Garrett Wollman	0a2c3d48c6	select() DKI is now in <sys/selinfo.h>.	2001-01-09 04:33:49 +00:00
Matthew Dillon	a41ce5d30b	Only call bwillwrite() for vnodes. Do not penalize devices or pipes.	2000-12-07 23:45:57 +00:00
Matthew Dillon	9440653d07	Add necessary bwillwrite() in writev() entry point. Deal with excessive dirty buffers when msync() syncs non-contiguous dirty buffers by checking for the case in UFS before checking for clusterability.	2000-12-06 20:55:09 +00:00
Alfred Perlstein	c6ab5768aa	only call bwillwrite() to stall on IO when dealing with VNODEs otherwise we will stall on non-disk IO for things like fifos and sockets	2000-11-30 20:23:14 +00:00
Jonathan Lemon	4a476efa51	Protect p_wchan with sched_lock in selwakeup().	2000-11-21 20:22:34 +00:00
Matthew Dillon	279d722604	This patchset fixes a large number of file descriptor race conditions. Pre-rfork code assumed inherent locking of a process's file descriptor array. However, with the advent of rfork() the file descriptor table could be shared between processes. This patch closes over a dozen serious race conditions related to one thread manipulating the table (e.g. closing or dup()ing a descriptor) while another is blocked in an open(), close(), fcntl(), read(), write(), etc... PR: kern/11629 Discussed with: Alexander Viro <viro@math.psu.edu>	2000-11-18 21:01:04 +00:00
Peter Wemm	b31ae1adc5	Fix a warning that has been annoying me for some time: "kern/sys_generic.c:358: warning: cast discards qualifiers from pointer target type" The idea for using the uintptr_t intermediate cast for de-constifying a pointer was hinted at by bde some time ago.	2000-07-28 22:17:42 +00:00
Brian Feldman	3c89e357f0	Distinguish between whether ktraceing was enabled before an IO operation or after it. If the ktrace operation was enabled while the process was blocked doing IO, the race would allow it to pass down invalid (uninitialized) data and panic later down the call stack.	2000-07-27 03:45:18 +00:00
John Baldwin	9c386f6b7d	For infinite timeouts, set both the tv_sec and tv_usec fields to zero in poll() and select(). Noticed by: Wesley Morgan <morganw@chemicals.tacorp.com>	2000-07-13 02:12:25 +00:00
John Baldwin	4da144c091	Fix a very obscure bug in select() and poll() where the timeout would never expire if poll() or select() was called before the system had been in multiuser for 1 second. This was caused by only checking to see if tv_sec was zero rather than checking both tv_sec and tv_usec.	2000-07-12 22:46:40 +00:00
Brian Feldman	7ceba2d755	Remove two micro-pessimizations I made. Bruce is teaching me well :) KTRPOINT(p, KTR_GENIO) is more uncommon than error == 0, so it should be first in the && statement.	2000-07-07 22:11:37 +00:00
Brian Feldman	42ebfbf227	Modify ktrace's general I/O tracing, ktrgenio(), to use a struct uio * instead of a struct iovec * array and int len. Get rid of stupidly trying to allocate all of the memory and copyin()ing the entire iovec[], and instead just do the proper VOP_WRITE() in ktrwrite() using a copy of the struct uio that the syscall originally used. This solves the DoS which could easily be performed; to work around the DoS, one could also remove "options KTRACE" from the kernel. This is a very strong MFC candidate for 4.1. Found by: art@OpenBSD.org	2000-07-02 08:08:09 +00:00
Alfred Perlstein	8757e5bbc5	unstatic getfp() so that other subsystems can use it. make sendfile() use it. Approved by: dg	2000-06-12 18:06:12 +00:00
Matthew Dillon	d2ba455c2c	Some ioctl routines assume that the ioctl buffer is aligned, but a char[] declaration makes no such guarantee. A union is used to force alignment of the char buffer.	2000-05-09 17:43:21 +00:00
Peter Wemm	f082218c18	Fix select(2) for the Alpha. (!!) It was never returning true for fd's in the range of 32-63, 96-127 etc. The first problem was the FD_*() macros were shifting a 32 bit integer "1" left by more than 32 bits. The same problem happened in selscan(). ffs() also takes an int argument and causes failure. For cases where int == long (ie: the usual case for x86, but not always as gcc can have long being a 64 bit quantity) ffs() could be used. Reported by: Marian Stagarescu <marian@bile.skycache.com> Reviewed by: dfr, gallatin (sys/types.h only) Approved by: jkh	2000-02-20 13:36:26 +00:00
Jason Evans	bfbbc4aa44	Add aio_waitcomplete(). Make aio work correctly for socket descriptors. Make gratuitous style(9) fixes (me, not the submitter) to make the aio code more readable. PR: kern/12053 Submitted by: Chris Sedore <cmsedore@maxwell.syr.edu>	2000-01-14 02:53:29 +00:00


103 Commits