freebsd-skq

Author	SHA1	Message	Date
peter	8d081cadd7	Plug a mbuf leak in tcp_usr_send(). pru_send() routines are expected to either enqueue or free their mbuf chains, but tcp_usr_send() was dropping them on the floor if the tcpcb/inpcb has been torn down in the middle of a send/write attempt. This has been responsible for a wide variety of mbuf leak patterns, ranging from slow gradual leakage to rather rapid exhaustion. This has been a problem since before 2.2 was branched and appears to have been fixed in rev 1.16 and lost in 1.23/1.28. Thanks to Jayanth Vijayaraghavan <jayanth@yahoo-inc.com> for checking (extensively) into this on a live production 2.2.x system and that it was the actual cause of the leak and looks like it fixes it. The machine in question was loosing (from memory) about 150 mbufs per hour under load and a change similar to this stopped it. (Don't blame Jayanth for this patch though) An alternative approach to this would be to recheck SS_CANTSENDMORE etc inside the splnet() right before calling pru_send() after all the potential sleeps, interrupts and delays have happened. However, this would mean exposing knowledge of the tcp stack's reset handling and removal of the pcb to the generic code. There are other things that call pru_send() directly though. Problem originally noted by: John Plevyak <jplevyak@inktomi.com>	1999-06-04 02:27:06 +00:00
ache	c872d87caf	Realy fix overflow on SO_*TIMEO Submitted by: bde	1999-05-21 15:54:40 +00:00
billf	dd35516544	Add sysctl descriptions to many SYSCTL_XXXs PR: kern/11197 Submitted by: Adrian Chadd <adrian@FreeBSD.org> Reviewed by: billf(spelling/style/minor nits) Looked at by: bde(style)	1999-05-03 23:57:32 +00:00
ache	cfd84c8eac	Lite2 bugfixes merge: so_linger is in seconds, not in 1/HZ range checking in SO_*TIMEO was wrong PR: 11252	1999-04-24 18:22:34 +00:00
dfr	22ceb237f0	* Change sysctl from using linker_set to construct its tree using SLISTs. This makes it possible to change the sysctl tree at runtime. * Change KLD to find and register any sysctl nodes contained in the loaded file and to unregister them when the file is unloaded. Reviewed by: Archie Cobbs <archie@whistle.com>, Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)	1999-02-16 10:49:55 +00:00
fenner	20e9c3a9ba	Fix the port of the NetBSD 19990120-accept fix. I misread a piece of code when examining their fix, which caused my code (in rev 1.52) to: - panic("soaccept: !NOFDREF") - fatal trap 12, with tracebacks going thru soclose and soaccept	1999-02-02 07:23:28 +00:00
dillon	a40e0249d4	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile	1999-01-27 21:50:00 +00:00
fenner	56bddd51c2	Port NetBSD's 19990120-accept bug fix. This works around the race condition where select(2) can return that a listening socket has a connected socket queued, the connection is broken, and the user calls accept(2), which then blocks because there are no connections queued. Reviewed by: wollman Obtained from: NetBSD (ftp://ftp.NetBSD.ORG/pub/NetBSD/misc/security/patches/19990120-accept)	1999-01-25 16:58:56 +00:00
fenner	46541593cd	Also consider the space left in the socket buffer when deciding whether to set PRUS_MORETOCOME.	1999-01-20 17:45:22 +00:00
fenner	505f7489c7	Add a flag, passed to pru_send routines, PRUS_MORETOCOME. This flag means that there is more data to be put into the socket buffer. Use it in TCP to reduce the interaction between mbuf sizes and the Nagle algorithm. Based on: "Justin C. Walker" <justin@apple.com>'s description of Apple's fix for this problem.	1999-01-20 17:32:01 +00:00
eivind	89e1199534	KNFize, by bde.	1999-01-10 01:58:29 +00:00
eivind	a8dc66f457	Split DIAGNOSTIC -> DIAGNOSTIC, INVARIANTS, and INVARIANT_SUPPORT as discussed on -hackers. Introduce 'KASSERT(assertion, ("panic message", args))' for simple check + panic. Reviewed by: msmith	1999-01-08 17:31:30 +00:00
archie	60d13c7a9d	The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static and local variables, goto labels, and functions declared but not defined.	1998-12-07 21:58:50 +00:00
truckman	de184682fa	Installed the second patch attached to kern/7899 with some changes suggested by bde, a few other tweaks to get the patch to apply cleanly again and some improvements to the comments. This change closes some fairly minor security holes associated with F_SETOWN, fixes a few bugs, and removes some limitations that F_SETOWN had on tty devices. For more details, see the description on the PR. Because this patch increases the size of the proc and pgrp structures, it is necessary to re-install the includes and recompile libkvm, the vinum lkm, fstat, gcore, gdb, ipfilter, ps, top, and w. PR: kern/7899 Reviewed by: bde, elvind	1998-11-11 10:04:13 +00:00
wollman	bae4326618	Bow to tradition and correctly implement the bogus-but-hallowed semantics of getsockopt never telling how much it might have copied if only the buffer were big enough.	1998-08-31 18:07:23 +00:00
wollman	639269dc95	Correctly set the return length regardless of the relative size of the user's buffer. Simplify the logic a bit. (Can we have a version of min() for size_t?)	1998-08-31 15:34:55 +00:00
wollman	a76fb5eefa	Yow! Completely change the way socket options are handled, eliminating another specialized mbuf type in the process. Also clean up some of the cruft surrounding IPFW, multicast routing, RSVP, and other ill-explored corners.	1998-08-23 03:07:17 +00:00
fenner	c45b137a7b	Undo rev 1.41 until we get more details about why it makes some systems fail.	1998-07-18 18:48:45 +00:00
fenner	bcc119fe0b	Introduce (fairly hacky) workaround for odd TCP behavior with application writes of size (100,208]+N*MCLBYTES. The bug: sosend() hands each mbuf off to the protocol output routine as soon as it has copied it, in the hopes of increasing parallelism (see http://www.kohala.com/~rstevens/vanj.88jul20.txt ). This works well for TCP as long as the first mbuf handed off is at least the MSS. However, when doing small writes (between MHLEN and MINCLSIZE), the transaction is split into 2 small MBUF's and each is individually handed off to TCP. TCP assumes that the first small mbuf is the whole transaction, so sends a small packet. When the second small mbuf arrives, Nagle prevents TCP from sending it so it must wait for a (potentially delayed) ACK. This sends throughput down the toilet. The workaround: Set the "atomic" flag when we're doing small writes. The "atomic" flag has two meanings: 1. Copy all of the data into a chain of mbufs before handing off to the protocol. 2. Leave room for a datagram header in said mbuf chain. TCP wants the first but doesn't want the second. However, the second simply results in some memory wastage (but is why the workaround is a hack and not a fix). The real fix: The real fix for this problem is to introduce something like a "requested transfer size" variable in the socket->protocol interface. sosend() would then accumulate an mbuf chain until it exceeded the "requested transfer size". TCP could set it to the TCP MSS (note that the current interface causes strange TCP behaviors when the MSS > MCLBYTES; nobody notices because MCLBYTES > ethernet's MTU).	1998-07-06 19:27:14 +00:00
wollman	bbc4497ada	Convert socket structures to be type-stable and add a version number. Define a parameter which indicates the maximum number of sockets in a system, and use this to size the zone allocators used for sockets and for certain PCBs. Convert PF_LOCAL PCB structures to be type-stable and add a version number. Define an external format for infomation about socket structures and use it in several places. Define a mechanism to get all PF_LOCAL and PF_INET PCB lists through sysctl(3) without blocking network interrupts for an unreasonable length of time. This probably still has some bugs and/or race conditions, but it seems to work well enough on my machines. It is now possible for `netstat' to get almost all of its information via the sysctl(3) interface rather than reading kmem (changes to follow).	1998-05-15 20:11:40 +00:00
bde	cd450d6714	Moved some #includes from <sys/param.h> nearer to where they are actually used.	1998-03-28 10:33:27 +00:00
guido	406aea3e09	Make sure that you can only bind a more specific address when it is done by the same uid. Obtained from: OpenBSD	1998-03-01 19:39:29 +00:00
fenner	b318e8e53a	Revert sosend() to its behavior from 4.3-Tahoe and before: if so_error is set, clear it before returning it. The behavior introduced in 4.3-Reno (to not clear so_error) causes potentially transient errors (e.g. ECONNREFUSED if the other end hasn't opened its socket yet) to be permanent on connected datagram sockets that are only used for writing. (soreceive() clears so_error before returning it, as does getsockopt(...,SO_ERROR,...).) Submitted by: Van Jacobson <van@ee.lbl.gov>, via a comment in the vat sources.	1998-02-19 19:38:20 +00:00
eivind	4547a09753	Back out DIAGNOSTIC changes.	1998-02-06 12:14:30 +00:00
eivind	c552a9a1c3	Turn DIAGNOSTIC into a new-style option.	1998-02-04 22:34:03 +00:00
jkh	b820ab9845	MF22: MSG_EOR bug fix. Submitted by: wollman	1997-11-09 05:07:40 +00:00
phk	36e7a51ea1	Last major round (Unless Bruce thinks of somthing :-) of malloc changes. Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde	1997-10-12 20:26:33 +00:00
phk	4022dffbc1	While booting diskless we have no proc pointer.	1997-10-04 18:21:15 +00:00
peter	0fc35eb0c2	Extend select backend for sockets to work with a poll interface (more detail is passed back and forwards). This mostly came from NetBSD, except that our interfaces have changed a lot and this funciton is in a different part of the kernel. Obtained from: NetBSD	1997-09-14 02:34:14 +00:00
bde	6ffb8bf9af	Removed unused #includes.	1997-09-02 20:06:59 +00:00
bde	6be005551f	#include <machine/limits.h> explicitly in the few places that it is required.	1997-08-21 20:33:42 +00:00
wollman	4542c1cf5d	Fix all areas of the system (or at least all those in LINT) to avoid storing socket addresses in mbufs. (Socket buffers are the one exception.) A number of kernel APIs needed to get fixed in order to make this happen. Also, fix three protocol families which kept PCBs in mbufs to not malloc them instead. Delete some old compatibility cruft while we're at it, and add some new routines in the in_cksum family.	1997-08-16 19:16:27 +00:00
peter	4e31107244	Don't accept insane values for SO_(SND\|RCV)BUF, and the low water marks. Specifically, don't allow a value < 1 for any of them (it doesn't make sense), and don't let the low water mark be greater than the corresponding high water mark. Pre-Approved by: wollman Obtained from: NetBSD	1997-06-27 15:28:54 +00:00
wollman	6afbf203bd	The long-awaited mega-massive-network-code- cleanup. Part I. This commit includes the following changes: 1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility glue for them is deleted, and the kernel will panic on boot if any are compiled in. 2) Certain protocol entry points are modified to take a process structure, so they they can easily tell whether or not it is possible to sleep, and also to access credentials. 3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt() call. Protocols should use the process pointer they are now passed. 4) The PF_LOCAL and PF_ROUTE families have been updated to use the new style, as has the `raw' skeleton family. 5) PF_LOCAL sockets now obey the process's umask when creating a socket in the filesystem. As a result, LINT is now broken. I'm hoping that some enterprising hacker with a bit more time will either make the broken bits work (should be easy for netipx) or dike them out.	1997-04-27 20:01:29 +00:00
bde	0d3591bdbd	Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined. Fixed everything that depended on getting fcntl.h stuff from the wrong place. Most things don't depend on file.h stuff at all.	1997-03-23 03:37:54 +00:00
wollman	a6e3c2a3ce	Create a new branch of the kernel MIB, kern.ipc, to store all of the configurables and instrumentation related to inter-process communication mechanisms. Some variables, like mbuf statistics, are instrumented here for the first time. For mbuf statistics: also keep track of m_copym() and m_pullup() failures, and provide for the user's inspection the compiled-in values of MSIZE, MHLEN, MCLBYTES, and MINCLSIZE.	1997-02-24 20:30:58 +00:00
peter	94b6d72794	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.	1997-02-22 09:48:43 +00:00
jkh	808a36ef65	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.	1997-01-14 07:20:47 +00:00
dg	4ac029cc38	Check for error return from uiomove to prevent looping endlessly in soreceive(). Closes PR#2114. Submitted by: wpaul	1996-11-29 19:03:42 +00:00
pst	b51353f335	Increase robustness of FreeBSD against high-rate connection attempt denial of service attacks. Reviewed by: bde,wollman,olah Inspired by: vjs@sgi.com	1996-10-07 04:32:42 +00:00
wollman	36ac1a0263	Modify the kernel to use the new pr_usrreqs interface rather than the old pr_usrreq mechanism which was poorly designed and error-prone. This commit renames pr_usrreq to pr_ousrreq so that old code which depended on it would break in an obvious manner. This commit also implements the new interface for TCP, although the old function is left as an example (#ifdef'ed out). This commit ALSO fixes a longstanding bug in the TCP timer processing (introduced by davidg on 1995/04/12) which caused timer processing on a TCB to always stop after a single timer had expired (because it misinterpreted the return value from tcp_usrreq() to indicate that the TCB had been deleted). Finally, some code related to polling has been deleted from if.c because it is not relevant t -current and doesn't look at all like my current code.	1996-07-11 16:32:50 +00:00
wollman	9ea36adbec	Make it possible to return more than one piece of control information (PR #1178). Define a new SO_TIMESTAMP socket option for datagram sockets to return packet-arrival timestamps as control information (PR #1179). Submitted by: Louis Mamakos <loiue@TransSys.com>	1996-05-09 20:15:26 +00:00
dg	f248f3aa32	Fix for PR #1146 : the "next" pointer must be cached before calling soabort since the struct containing it may be freed.	1996-04-16 03:50:08 +00:00
dg	a30b0a83b7	Changed socket code to use 4.4BSD queue macros. This includes removing the obsolete soqinsque and soqremque functions as well as collapsing so_q0len and so_qlen into a single queue length of unaccepted connections. Now the queue of unaccepted & complete connections is checked directly for queued sockets. The new code should be functionally equivilent to the old while being substantially faster - especially in cases where large numbers of connections are often queued for accept (e.g. http).	1996-03-11 15:37:44 +00:00
wollman	5c25078715	Kill XNS. While we're at it, fix socreate() to take a process argument. (This was supposed to get committed days ago...)	1996-02-13 18:16:31 +00:00
wollman	88a3e24de1	Define a new socket option, SO_PRIVSTATE. Getting it returns the state of the SS_PRIV flag in so_state; setting it always clears same.	1996-02-07 16:19:19 +00:00
bde	d7acdbc572	Nuked ambiguous sleep message strings: old: new: netcls[] = "netcls" "soclos" netcon[] = "netcon" "accept", "connec" netio[] = "netio" "sblock", "sbwait"	1995-12-14 22:51:13 +00:00
wollman	cfbb3f260a	Make somaxconn (maximum backlog in a listen(2) request) and sb_max (maximum size of a socket buffer) tunable. Permit callers of listen(2) to specify a negative backlog, which is translated into somaxconn. Previously, a negative backlog was silently translated into 0.	1995-11-03 18:33:46 +00:00
bde	1919b096ca	Remove extra arg from one of the calls to (*pr_usrreq)().	1995-08-25 20:27:46 +00:00
rgrimes	c86f0c7a71	Remove trailing whitespace.	1995-05-30 08:16:23 +00:00
wollman	60c4643ea3	getsockopt(s, SOL_SOCKET, SO_SNDTIMEO, ...) would construct the returned timeval incorrectly, truncating the usec part. Obtained from: Stevens vol. 2 p. 548	1995-02-16 01:07:43 +00:00
wollman	04d65ef905	Merge in the socket-level support for Transaction TCP.	1995-02-07 02:01:16 +00:00
dg	90bd3af0d9	Use M_NOWAIT instead of M_KERNEL for socket allocations; it is apparantly possible for certain socket operations to occur during interrupt context. Submitted by: John Dyson	1995-02-06 02:22:12 +00:00
dg	f58a2f6357	Calling semantics for kmem_malloc() have been changed...and the third argument is now more than just a single flag. (kern_malloc.c) Used new M_KERNEL value for socket allocations that previous were "M_NOWAIT". Note that this will change when we clean up the M_ namespace mess. Submitted by: John Dyson	1995-02-02 08:49:08 +00:00
phk	c3e4945541	All of this is cosmetic. prototypes, #includes, printfs and so on. Makes GCC a lot more silent.	1994-10-02 17:35:40 +00:00
dg	8d205697aa	Added $Id$	1994-08-02 07:55:43 +00:00
dg	2f931bd070	Changed mbuf allocation policy to get a cluster if size > MINCLSIZE. Makes a BIG difference in socket performance.	1994-05-29 07:48:17 +00:00
rgrimes	2469c867a1	The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch. Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman	1994-05-25 09:21:21 +00:00
rgrimes	8fb65ce818	BSD 4.4 Lite Kernel Sources	1994-05-24 10:09:53 +00:00

1 2 3 4

159 Commits