freebsd-skq

Author	SHA1	Message	Date
dwmalone	fb925d8493	Make the wait for sendfile buffers interruptable. Stops one process consuming them all and then getting stuck. Reviewed by: dg Reviewed by: bmilekic Observed by: Andreas Persson <pap@garen.net>	2001-03-08 16:28:10 +00:00
jhb	9cd254601b	Grab the process lock while calling psignal and before calling psignal.	2001-03-07 03:37:06 +00:00
jlemon	9377320bfd	Return ECONNABORTED from accept if connection is closed while on the listen queue, as well as the current behavior of a zero-length sockaddr. Obtained from: KAME Reviewed by: -net	2001-02-14 02:09:11 +00:00
bmilekic	f364d4ac36	Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)	2001-02-09 06:11:45 +00:00
phk	23f02b4722	Fix the <sys/queue.h> abuse. Submitted by: Dima Dorfman <dima@unixfreak.org> Reviewed by: /sbin/md5	2001-01-02 11:51:55 +00:00
phk	75f809e410	Add an XXX about a <sys/queue.h> transgression which needs cleaned up.	2001-01-02 10:34:09 +00:00
bmilekic	4b6a7bddad	* Rename M_WAIT mbuf subsystem flag to M_TRYWAIT. This is because calls with M_WAIT (now M_TRYWAIT) may not wait forever when nothing is available for allocation, and may end up returning NULL. Hopefully we now communicate more of the right thing to developers and make it very clear that it's necessary to check whether calls with M_(TRY)WAIT also resulted in a failed allocation. M_TRYWAIT basically means "try harder, block if necessary, but don't necessarily wait forever." The time spent blocking is tunable with the kern.ipc.mbuf_wait sysctl. M_WAIT is now deprecated but still defined for the next little while. * Fix a typo in a comment in mbuf.h * Fix some code that was actually passing the mbuf subsystem's M_WAIT to malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the value of the M_WAIT flag, this could have became a big problem.	2000-12-21 21:44:31 +00:00
dwmalone	dd75d1d73b	Convert more malloc+bzero to malloc+M_ZERO. Submitted by: josh@zipperup.org Submitted by: Robert Drehmel <robd@gmx.net>	2000-12-08 21:51:06 +00:00
dg	bf7a8acc8f	Changed second argument in a call to sf_buf_free() to be NULL instead of PAGE_SIZE to match the prototype better. The argument is ignored, so this is just to silence the compile-time warning. Pointed out by: jhb	2000-12-03 01:35:46 +00:00
bmilekic	e91e9907ef	Make sure to free the sf_buf if we've allocated it but fail to allocate an mbuf (ENOBUFS) before returning so that we don't leak sf_bufs in the case where we're out of mbufs. Submitted by: David Greenman (dg)	2000-12-02 00:40:57 +00:00
dillon	15a44d16ca	This patchset fixes a large number of file descriptor race conditions. Pre-rfork code assumed inherent locking of a process's file descriptor array. However, with the advent of rfork() the file descriptor table could be shared between processes. This patch closes over a dozen serious race conditions related to one thread manipulating the table (e.g. closing or dup()ing a descriptor) while another is blocked in an open(), close(), fcntl(), read(), write(), etc... PR: kern/11629 Discussed with: Alexander Viro <viro@math.psu.edu>	2000-11-18 21:01:04 +00:00
dg	b529eb36e4	Fixed a certain panic on IO error in sendfile(): Page must be set PG_BUSY before calling vm_page_free() on it.	2000-11-12 14:51:15 +00:00
bmilekic	4ddcfae4ca	* Have m_pulldown() use the new M_WRITABLE() macro in order to determine whether the given ext_buf is shared. * Have the sf_bufs be setup with the mbuf subsystem using MEXTADD() with the two new arguments. Note: m_pulldown() is somewhat crotchy; the added comment explains the situation. Reviewed by: jlemon	2000-11-11 23:04:15 +00:00
bmilekic	51dc2fe61b	Change the sf_bufs wakeups to be wakeup_one(), because we don't want to wakeup all of the sleeping threads when we free only one buffer. This avoids us having to needlessly try again (and fail, and go back to sleep) for all the threads sleeping. We will now only wakeup the thread we know will succeed. Reviewed by: green	2000-11-04 21:55:25 +00:00
bmilekic	8b319d105d	Setup and put to use the mutex lock for sf_freelist, the sendfile(2) bufs freelist. Should now be thread-friendly, in part. Note: More work is needed in uipc_syscalls.c, but it will have to wait until the socket locking issues are at least 80% implemented and committed.	2000-11-04 07:16:08 +00:00
bp	a7bc78c86d	Add three new VOPs: VOP_CREATEVOBJECT, VOP_DESTROYVOBJECT and VOP_GETVOBJECT. They will be used by nullfs and other stacked filesystems to support full cache coherency. Reviewed in general by: mckusick, dillon	2000-09-12 09:49:08 +00:00
dwmalone	df0e25bf6c	Replace the mbuf external reference counting code with something that should be better. The old code counted references to mbuf clusters by using the offset of the cluster from the start of memory allocated for mbufs and clusters as an index into an array of chars, which did the reference counting. If the external storage was not a cluster then reference counting had to be done by the code using that external storage. NetBSD's system of linked lists of mbufs was cosidered, but Alfred felt it would have locking issues when the kernel was made more SMP friendly. The system implimented uses a pool of unions to track external storage. The union contains an int for counting the references and a pointer for forming a free list. The reference counts are incremented and decremented atomically and so should be SMP friendly. This system can track reference counts for any sort of external storage. Access to the reference counting stuff is now through macros defined in mbuf.h, so it should be easier to make changes to the system in the future. The possibility of storing the reference count in one of the referencing mbufs was considered, but was rejected 'cos it would often leave extra mbufs allocated. Storing the reference count in the cluster was also considered, but because the external storage may not be a cluster this isn't an option. The size of the pool of reference counters is available in the stats provided by "netstat -m". PR: 19866 Submitted by: Bosko Milekic <bmilekic@dsuper.net> Reviewed by: alfred (glanced at by others on -net)	2000-08-19 08:32:59 +00:00
green	9707bc34b0	Modify ktrace's general I/O tracing, ktrgenio(), to use a struct uio * instead of a struct iovec * array and int len. Get rid of stupidly trying to allocate all of the memory and copyin()ing the entire iovec[], and instead just do the proper VOP_WRITE() in ktrwrite() using a copy of the struct uio that the syscall originally used. This solves the DoS which could easily be performed; to work around the DoS, one could also remove "options KTRACE" from the kernel. This is a very strong MFC candidate for 4.1. Found by: art@OpenBSD.org	2000-07-02 08:08:09 +00:00
alfred	e7947cbed1	unstatic getfp() so that other subsystems can use it. make sendfile() use it. Approved by: dg	2000-06-12 18:06:12 +00:00
jake	961b97d434	Back out the previous change to the queue(3) interface. It was not discussed and should probably not happen. Requested by: msmith and others	2000-05-26 02:09:24 +00:00
jake	d93fbc9916	Change the way that the queue(3) structures are declared; don't assume that the type argument to _HEAD and _ENTRY is a struct. Suggested by: phk Reviewed by: phk Approved by: mdodd	2000-05-23 20:41:01 +00:00
jlemon	c41c876463	Introduce kqueue() and kevent(), a kernel event notification facility.	2000-04-16 18:53:38 +00:00
green	56a46611e1	This is Bosko Milekic's mbuf allocation waiting code. Basically, this means that running out of mbuf space isn't a panic anymore, and code which runs out of network memory will sleep to wait for it. Submitted by: Bosko Milekic <bmilekic@dsuper.net> Reviewed by: green, wollman	1999-12-12 05:52:51 +00:00
phk	2431275ac4	General clean-up of socket.h and associated sources to synchronise up with NetBSD and the Single Unix Specification v2. This updates some structures with other, almost equivalent types and effort is under way to get the whole more consistent. Also removes a double definition of INET6 and some other clean-ups. Reviewed by: green, bde, phk Some part obtained from: NetBSD, SUSv2 specification	1999-11-24 20:49:04 +00:00
phk	8fca18de89	This is a partial commit of the patch from PR 14914: Alot of the code in sys/kern directly accesses the Q_HEAD and Q_ENTRY structures for list operations. This patch makes all list operations in sys/kern use the queue(3) macros, rather than directly accessing the *Q_{HEAD,ENTRY} structures. This batch of changes compile to the same object files. Reviewed by: phk Submitted by: Jake Burkholder <jake@checker.org> PR: 14914	1999-11-16 10:56:05 +00:00
phk	8e3c3eafed	useracc() the prequel: Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ\|WRITE} rather than B_{READ\|WRITE} as argument.	1999-10-29 18:09:36 +00:00
green	e8104eadd5	Add a missing spl lowering. Submitted by: Ville-Pertti Keinonen <will@iki.fi>	1999-10-14 05:16:16 +00:00
peter	787140aa42	Trim unused options (or #ifdef for undoc options). Submitted by: phk	1999-10-11 15:19:12 +00:00
guido	4a4f1cc758	Plug a potential filedescriptor leak. This will probably almost never be triggered. Reviewed by: David Greenman	1999-09-30 19:13:17 +00:00
green	140cb4ff83	This is what was "fdfix2.patch," a fix for fd sharing. It's pretty far-reaching in fd-land, so you'll want to consult the code for changes. The biggest change is that now, you don't use fp->f_ops->fo_foo(fp, bar) but instead fo_foo(fp, bar), which increments and decrements the fp refcount upon entry and exit. Two new calls, fhold() and fdrop(), are provided. Each does what it seems like it should, and if fdrop() brings the refcount to zero, the fd is freed as well. Thanks to peter ("to hell with it, it looks ok to me.") for his review. Thanks to msmith for keeping me from putting locks everywhere :) Reviewed by: peter	1999-09-19 17:00:25 +00:00
peter	3b842d34e8	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
green	c03366a55d	Fix fd race conditions (during shared fd table usage.) Badfileops is now used in f_ops in place of NULL, and modifications to the files are more carefully ordered. f_ops should also be set to &badfileops upon "close" of a file. This does not fix other problems mentioned in this PR than the first one. PR: 11629 Reviewed by: peter	1999-08-04 18:53:50 +00:00
dillon	a40e0249d4	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile	1999-01-27 21:50:00 +00:00
fenner	479ab8882b	Don't free the socket address if soaccept() / pru_accept() doesn't return one.	1999-01-25 16:53:53 +00:00
dillon	f8af5b2105	Addendum: The original code that the last commit 'fixed' actually did not have a bug in it, but the last commit did make it more readable so we are keeping it.	1999-01-24 03:49:58 +00:00
dillon	8b297fc8f8	There was a situation where sendfile() might attempt to initiate I/O on a PG_BUSY page, due to a bug in its sequencing of a conditional.	1999-01-24 01:15:58 +00:00
dillon	1cb49fc563	Fixed a potential bug ( but maybe not ), where sendfile() clears PG_BUSY on a page without testing for waiters. Also collapsed busy wait into new vm_page_sleep_busy() inline ( see vm/vm_page.h )	1999-01-21 09:00:26 +00:00
dillon	df24433bbe	This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues. Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>	1999-01-21 08:29:12 +00:00
archie	60d13c7a9d	The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static and local variables, goto labels, and functions declared but not defined.	1998-12-07 21:58:50 +00:00
dg	7fd65a112e	Fixed broken code in sendfile(2) when using file offsets.	1998-12-03 12:35:47 +00:00
truckman	0b3bd2def8	We can't call fsetown() from sonewconn() because sonewconn() is be called from an interrupt context and fsetown() wants to peek at curproc, call malloc(..., M_WAITOK), and fiddle with various unprotected data structures. The fix is to move the code that duplicates the F_SETOWN/FIOSETOWN state of the original socket to the new socket from sonewconn() to accept1(), since accept1() runs in the correct context. Deferring this until the process calls accept() is harmless since the process can't do anything useful with SIGIO on the new socket until it has the descriptor for that socket. One could make the case for not bothering to duplicate the F_SETOWN/FIOSETOWN state and requiring the process to explicitly make the fcntl() or ioctl() call on the new socket, but this would be incompatible with the previous implementation and might break programs which rely on the old semantics. This bug was discovered by Andrew Gallatin <gallatin@cs.duke.edu>.	1998-11-23 00:45:39 +00:00
dg	939b432d02	Closed a very narrow and rare race condition that involved net interrupts, bio interrupts, and a truncated file that along with the precise alignment of the planets could result in a page being freed multiple times or a just-freed page being put onto the inactive queue.	1998-11-18 09:00:47 +00:00
dg	386e180e55	In sendfile(2), check against sb_lowat when filling the socket buffer, rather than 0.	1998-11-15 16:55:09 +00:00
dg	c160184873	Fixed a couple of nits in sendfile(2): clear PG_ZERO before unbusying the page, and use passed-in "p" rather than curproc in uio struct.	1998-11-14 23:36:17 +00:00
dg	bbf3dfad88	Added support for non-blocking sockets to sendfile(2).	1998-11-06 19:16:30 +00:00
dg	b178f74f12	Implemented zero-copy TCP/IP extensions via sendfile(2) - send a file to a stream socket. sendfile(2) is similar to implementations in HP-UX, Linux, and other systems, but the API is more extensive and addresses many of the complaints that the Apache Group and others have had with those other implementations. Thanks to Marc Slemko of the Apache Group for helping me work out the best API for this. Anyway, this has the "net" result of speeding up sends of files over TCP/IP sockets by about 10X (that is to say, uses 1/10th of the CPU cycles) when compared to a traditional read/write loop.	1998-11-05 14:28:26 +00:00
wollman	a76fb5eefa	Yow! Completely change the way socket options are handled, eliminating another specialized mbuf type in the process. Also clean up some of the cruft surrounding IPFW, multicast routing, RSVP, and other ill-explored corners.	1998-08-23 03:07:17 +00:00
dfr	a29ddc94fc	64bit fixes: don't cast p->p_retval to an int*.	1998-06-10 10:30:23 +00:00
phk	34d62002ae	Fix a minor mbuf leak created by the previous change. Reviewed by: phk Submitted by: pb@fasterix.freenix.org (Pierre Beyssac)	1998-04-14 06:24:43 +00:00
phk	ead640e967	setsockopt() transports user option data in an mbuf. if the user data is greater than MLEN, setsockopt is unable to pass it onto the protocol handler. Allocate a cluster in such case. PR: 2575 Reviewed by: phk Submitted by: Julian Assange proff@iq.org	1998-04-11 20:31:46 +00:00

1 2

87 Commits