freebsd-skq

Author	SHA1	Message	Date
jlemon	3d67b56283	One possible code path for syncache_respond() is: syncache_respond(A), ip_output(), ip_input(), tcp_input(), syncache_badack(B) Which winds up deleting a different entry from the syncache. Handle this by not utilizing the next entry in the timer chain until after syncache_respond() completes. The case of A == B should not be possible. Problem found by: Don Bowman <don@sandvine.com>	2002-06-28 19:12:38 +00:00
dfr	e59e678096	Fix warning. Reviewed by: luigi	2002-06-28 08:36:26 +00:00
luigi	a9ab854862	The new ipfw code. This code makes use of a variable-size kernel representation of rules (exactly the same concept as BPF instructions, as used in BSDI's firewall), which makes firewall operation a lot faster, and the code more readable and easier to extend and debug. The interface with the rest of the system is unchanged, as witnessed by this commit. The only extra kernel files that I am touching are if_fw.h and ip_dummynet.c, which is quite tied to ipfw. In userland I only had to touch those programs which manipulate the internal representation of firewall rules. The code is almost entirely new (and I believe I have written the vast majority of those sections which were taken from the former ip_fw.c), so rather than modifying the old ip_fw.c I decided to create a new file, sys/netinet/ip_fw2.c . Same for the user interface, which is in sbin/ipfw/ipfw2.c (it still compiles to /sbin/ipfw). The old files are still there, and will be removed in due time. I have not renamed the header file because it would have required a one-line change to a number of kernel files. In terms of user interface, the new "ipfw" is supposed to accept the old syntax for ipfw rules (and produce the same output with "ipfw show"). Only a couple of the old options (out of some 30 of them) have not been implemented, but they will be soon. On the other hand, the new code has some very powerful extensions. First, you can put "or" connectives between match fields (and soon also between options), and write things like ipfw add allow ip from { 1.2.3.4/27 or 5.6.7.8/30 } 10-23,25,1024-3000 to any This should make rulesets slightly more compact (and lines longer!), by condensing 2 or more of the old rules into single ones.
Also, as an example of how easily the rules can be extended, I have implemented an 'address set' match pattern, where you can specify an IP address in a format like this: 10.20.30.0/26{18,44,33,22,9} which will match the set of hosts listed in braces belonging to the subnet 10.20.30.0/26 . The match is done using a bitmap, so it is essentially a constant time operation requiring a handful of CPU instructions (and a very small amount of memory -- for a full /24 subnet, the instruction only consumes 40 bytes). Again, in this commit I have focused on functionality and tried to minimize changes to the other parts of the system. Some performance improvement can be achieved with minor changes to the interface of ip_fw_chk_t. This will be done later when this code is settled. The code is meant to compile unmodified on RELENG_4 (once the PACKET_TAG_* changes have been merged); for this reason you will see #ifdef __FreeBSD_version in a couple of places. This should minimize errors when (hopefully soon) it will be time to do the MFC.	2002-06-27 23:02:18 +00:00
mux	5347bbeaf2	Warning fixes for 64 bits platforms. With this last fix, I can build a GENERIC sparc64 kernel with -Werror. Reviewed by: luigi	2002-06-27 11:02:06 +00:00
luigi	c576432afc	Just a comment on some additional consistency checks that could be added here.	2002-06-26 21:00:53 +00:00
ken	0d3a835f3f	At long last, commit the zero copy sockets code. MAKEDEV: Add MAKEDEV glue for the ti(4) device nodes. ti.4: Update the ti(4) man page to include information on the TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options, and also include information about the new character device interface and the associated ioctls. man9/Makefile: Add jumbo.9 and zero_copy.9 man pages and associated links. jumbo.9: New man page describing the jumbo buffer allocator interface and operation. zero_copy.9: New man page describing the general characteristics of the zero copy send and receive code, and what an application author should do to take advantage of the zero copy functionality. NOTES: Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS, TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT. conf/files: Add uipc_jumbo.c and uipc_cow.c. conf/options: Add the 5 options mentioned above. kern_subr.c: Receive side zero copy implementation. This takes "disposable" pages attached to an mbuf, gives them to a user process, and then recycles the user's page. This is only active when ZERO_COPY_SOCKETS is turned on and the kern.ipc.zero_copy.receive sysctl variable is set to 1. uipc_cow.c: Send side zero copy functions. Takes a page written by the user and maps it copy on write and assigns it kernel virtual address space. Removes copy on write mapping once the buffer has been freed by the network stack. uipc_jumbo.c: Jumbo disposable page allocator code. This allocates (optionally) disposable pages for network drivers that want to give the user the option of doing zero copy receive. uipc_socket.c: Add kern.ipc.zero_copy.{send,receive} sysctls that are enabled if ZERO_COPY_SOCKETS is turned on. Add zero copy send support to sosend() -- pages get mapped into the kernel instead of getting copied if they meet size and alignment restrictions. uipc_syscalls.c:Un-staticize some of the sf* functions so that they can be used elsewhere. 
(uipc_cow.c) if_media.c: In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid calling malloc() with M_WAITOK. Return an error if the M_NOWAIT malloc fails. The ti(4) driver and the wi(4) driver, at least, call this with a mutex held. This causes witness warnings for 'ifconfig -a' with a wi(4) or ti(4) board in the system. (I've only verified for ti(4)). ip_output.c: Fragment large datagrams so that each segment contains a multiple of PAGE_SIZE amount of data plus headers. This allows the receiver to potentially do page flipping on receives. if_ti.c: Add zero copy receive support to the ti(4) driver. If TI_PRIVATE_JUMBOS is not defined, it now uses the jumbo(9) buffer allocator for jumbo receive buffers. Add a new character device interface for the ti(4) driver for the new debugging interface. This allows (a patched version of) gdb to talk to the Tigon board and debug the firmware. There are also a few additional debugging ioctls available through this interface. Add header splitting support to the ti(4) driver. Tweak some of the default interrupt coalescing parameters to more useful defaults. Add hooks for supporting transmit flow control, but leave it turned off with a comment describing why it is turned off. if_tireg.h: Change the firmware rev to 12.4.11, since we're really at 12.4.11 plus fixes from 12.4.13. Add defines needed for debugging. Remove the ti_stats structure, it is now defined in sys/tiio.h. ti_fw.h: 12.4.11 firmware. ti_fw2.h: 12.4.11 firmware, plus selected fixes from 12.4.13, and my header splitting patches. Revision 12.4.13 doesn't handle 10/100 negotiation properly. (This firmware is the same as what was in the tree previously, with the addition of header splitting support.) sys/jumbo.h: Jumbo buffer allocator interface. sys/mbuf.h: Add a new external mbuf type, EXT_DISPOSABLE, to indicate that the payload buffer can be thrown away / flipped to a userland process. socketvar.h: Add prototype for socow_setup. 
tiio.h: ioctl interface to the character portion of the ti(4) driver, plus associated structure/type definitions. uio.h: Change prototype for uiomoveco() so that we'll know whether the source page is disposable. ufs_readwrite.c: Update for new prototype of uiomoveco(). vm_fault.c: In vm_fault(), check to see whether we need to do a page based copy on write fault. vm_object.c: Add a new function, vm_object_allocate_wait(). This does the same thing that vm_object_allocate() does, except that it gives the caller the opportunity to specify whether it should wait on the uma_zalloc() of the object structure. This allows vm objects to be allocated while holding a mutex. (Without generating WITNESS warnings.) vm_object_allocate() is implemented as a call to vm_object_allocate_wait() with the malloc flag set to M_WAITOK. vm_object.h: Add prototype for vm_object_allocate_wait(). vm_page.c: Add page-based copy on write setup, clear and fault routines. vm_page.h: Add page based COW function prototypes and variable in the vm_page structure. Many thanks to Drew Gallatin, who wrote the zero copy send and receive code, and to all the other folks who have tested and reviewed this code over the years.	2002-06-26 03:37:47 +00:00
hsu	deb4c976c7	Avoid unlocking the inp twice if badport_bandlim() returns -1. Reported by: jlemon	2002-06-24 22:25:00 +00:00
hsu	f13c72f301	Style bug: fix 4 space indentations that should have been tabs. Submitted by: jlemon	2002-06-24 16:47:02 +00:00
luigi	fbbaa9503a	Slightly restructure the #ifdef INET6 sections to make the code more readable. Remove the six "register" attributes from variables in tcp_output(), the compiler surely knows well how to allocate them.	2002-06-23 21:25:36 +00:00
luigi	ebcf841898	Move two global variables to automatic variables within the only function where they are used (they are used with TCPDEBUG only).	2002-06-23 21:22:56 +00:00
luigi	e49d2528a1	Move some global variables in more appropriate places. Add XXX comments to mark places which need to be taken care of if we want to remove this part of the kernel from Giant. Add a comment on a potential performance problem with ip_forward()	2002-06-23 20:48:26 +00:00
luigi	21d4ca5fd2	fix bad indentation and whitespace resulting from cut&paste	2002-06-23 09:15:43 +00:00
luigi	d781bb8585	fix indentation of a comment	2002-06-23 09:14:24 +00:00
luigi	b2109fbe30	fix a typo in a comment	2002-06-23 09:13:46 +00:00
luigi	085f8ffcb0	Remove ip_fw_fwd_addr (forgotten in previous commit) remove some extra whitespace.	2002-06-23 09:03:42 +00:00
luigi	5259888148	Remove (almost all) global variables that were used to hold packet forwarding state ("annotations") during ip processing. The code is considerably cleaner now. The variables removed by this change are: ip_divert_cookie used by divert sockets ip_fw_fwd_addr used for transparent ip redirection last_pkt used by dynamic pipes in dummynet Removal of the first two has been done by carrying the annotations into volatile structs prepended to the mbuf chains, and adding appropriate code to add/remove annotations in the routines which make use of them, i.e. ip_input(), ip_output(), tcp_input(), bdg_forward(), ether_demux(), ether_output_frame(), div_output(). In passing, remove a bug in divert handling of fragmented packets. Now it is the fragment at offset 0 which sets the divert status of the whole packet, whereas formerly it was the last incoming fragment to decide. Removal of last_pkt required a change in the interface of ip_fw_chk() and dummynet_io(). In passing, use the same mechanism for dummynet annotations and for divert/forward annotations. option IPFIREWALL_FORWARD is effectively useless, the code to implement it is very small and is now in by default to avoid the obfuscation of conditionally compiled code. NOTES: * there is at least one global variable left, sro_fwd, in ip_output(). I am not sure if/how this can be removed. * I have deliberately avoided gratuitous style changes in this commit to avoid cluttering the diffs. Minor style cleanup will likely be necessary * this commit only focused on the IP layer. I am sure there is a number of global variables used in the TCP and maybe UDP stack. * despite the number of files touched, there are absolutely no API's or data structures changed by this commit (except the interfaces of ip_fw_chk() and dummynet_io(), which are internal anyways), so an MFC is quite safe and unintrusive (and desirable, given the improved readability of the code). MFC after: 10 days	2002-06-22 11:51:02 +00:00
hsu	3710c0eed0	Fix logic which resulted in missing a call to INP_UNLOCK(). Submitted by: jlemon, mux	2002-06-21 22:54:16 +00:00
hsu	80cef86a8d	TCP notify functions can change the pcb list.	2002-06-21 22:52:48 +00:00
peter	c7fdf6d30b	Solve the 'unregistered netisr 18' information notice with a sledgehammer. Register the ISR early, but do not actually kick off the timer until we see some activity. This still saves us from running the arp timers on a system with no network cards.	2002-06-20 01:27:40 +00:00
tanimura	cb3347e926	Remove so*_locked(), which were backed out by mistake.	2002-06-18 07:42:02 +00:00
hsu	abda76de0b	Notify functions can destroy the pcb, so they have to return an indication of whether this happened so the calling function knows whether or not to unlock the pcb. Submitted by: Jennifer Yang (yangjihui@yahoo.com) Bug reported by: Sid Carter (sidcarter@symonds.net)	2002-06-14 08:35:21 +00:00
silby	86950bb1f4	Re-commit w/fix: Ensure that the syn cache's syn-ack packets contain the same ip_tos, ip_ttl, and DF bits as all other tcp packets. PR: 39141 MFC after: 2 weeks This time, make sure that ipv4 specific code (aka all of the above) is only run in the ipv4 case.	2002-06-14 03:08:05 +00:00
silby	0bbc7c9dc5	Back out ip_tos/ip_ttl/DF "fix", it just panic'd my box. :) Pointy-hat to: silby	2002-06-14 02:43:20 +00:00
silby	acb0745bfa	Ensure that the syn cache's syn-ack packets contain the same ip_tos, ip_ttl, and DF bits as all other tcp packets. PR: 39141 MFC after: 2 weeks	2002-06-14 02:36:34 +00:00
hsu	c78cdaf83b	Because we're holding an exclusive write lock on the head, references to the new inp cannot leak out even though it has been placed on the head list.	2002-06-13 23:14:58 +00:00
hsu	c580ba61b3	The UDP head was unlocked too early in one unicast case. Submitted by: bug reported by arr	2002-06-12 15:21:41 +00:00
hsu	b67cb93fe3	Fix logic which resulted in missing a call to INP_UNLOCK().	2002-06-12 03:11:06 +00:00
hsu	ab949ac863	Fix typo where INP_INFO_RLOCK should be INP_INFO_RUNLOCK. Submitted by: tegge, jlemon Prefer LIST_FOREACH macro. Submitted by: jlemon	2002-06-12 03:08:08 +00:00
hsu	d1834ccc3b	Remember to initialize the control block head mutex.	2002-06-11 10:58:57 +00:00
hsu	f140f41dad	Fix typo. Submitted by: Kyunghwan Kim <redjade@atropos.snu.ac.kr>	2002-06-11 10:56:49 +00:00
hsu	439384bfd7	Every array elt is initialized in the following loop, so remove unnecessary M_ZERO.	2002-06-10 23:48:37 +00:00
hsu	cd25d4648f	Lock up inpcb. Submitted by: Jennifer Yang <yangjihui@yahoo.com>	2002-06-10 20:05:46 +00:00
tanimura	e6fa9b9e92	Back out my last commit of locking down a socket, it conflicts with hsu's work. Requested by: hsu	2002-05-31 11:52:35 +00:00
wollman	5f28f6025e	Avoid unintentional trigraph.	2002-05-30 20:53:45 +00:00
arr	37981f345c	- Change the newly turned INVARIANTS #ifdef blocks (they were changed from DIAGNOSTIC yesterday) into KASSERT()'s as these help to increase code readability.	2002-05-21 18:52:24 +00:00
arr	f20545d47c	- Turn a few DIAGNOSTIC into INVARIANTS since they are really sanity checks.	2002-05-20 22:05:13 +00:00
arr	56aea61cc9	- Turn a DIAGNOSTIC into an INVARIANTS since it's a sanity check. Use proper ``if'' statement style.	2002-05-20 22:04:19 +00:00
arr	6fe64080f2	- Turn a #ifdef DIAGNOSTIC to #ifdef INVARIANTS as the code from this line through the #endif is really a sanity check. Reviewed by: jake	2002-05-20 21:50:39 +00:00
tanimura	92d8381dd5	Lock down a socket, milestone 1. o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket. o Determine the lock strategy for each member in struct socket. o Lock down the following members: - so_count - so_options - so_linger - so_state o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket: - sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup() Reviewed by: alfred	2002-05-20 05:41:09 +00:00
kbyanc	134bb77c23	Reset token-ring source routing control field on receipt of ethernet frame without source routing information. This restores the behaviour in this scenario to that of prior to my last commit.	2002-05-15 01:03:32 +00:00
rwatson	be8339f00b	Modify the arguments to syncache_socket() to include the mbuf (m) that results in the syncache entry being turned into a socket. While it's not used in the main tree, this is required in the MAC tree so that labels can be propagated from the mbuf to the socket. This is also useful if you're doing things like transparent IP connection hijacking and you want to use the syncache/cookie mechanism, but we won't go there. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-05-14 18:57:55 +00:00
luigi	2afce45ffc	Add ipfw hooks to ether_demux() and ether_output_frame(). Ipfw processing of frames at layer 2 can be enabled by the sysctl variable net.link.ether.ipfw=1 Consider this feature experimental, because right now, the firewall is invoked in the places indicated below, and controlled by the sysctl variables listed on the right. As a consequence, a packet can be filtered from 1 to 4 times depending on the path it follows, which might make a ruleset a bit hard to follow. I will add an ipfw option to tell if we want a given rule to apply to ether_demux() and ether_output_frame(), but we have run out of flags in the struct ip_fw so i need to think a bit on how to implement this. to upper layers \| \| +----------->-----------+ ^ V [ip_input] [ip_output] net.inet.ip.fw.enable=1 \| \| ^ V [ether_demux] [ether_output_frame] net.link.ether.ipfw=1 \| \| +->- [bdg_forward]-->---+ net.link.ether.bridge_ipfw=1 ^ V \| \| to devices	2002-05-13 10:37:19 +00:00
luigi	f172dc1bd4	Remove custom definitions (IP_FW_TCPF_SYN etc.) of TCP header flags which are the same as the original ones (TH_SYN etc.)	2002-05-13 10:21:13 +00:00
luigi	320493f9eb	Add code to match MAC header fields (at the moment supported on bridged packets only, soon to come also for packets on ordinary ether_input() and ether_output() paths). The syntax is ipfw add <action> MAC dst src type where dst and src can be "any" or a MAC address optionally followed by a mask, e.g. 10:20:30:40:50 10:20:30:40:50/32 10:20:30:40:50&ff:ff:ff:f0:ff:0f and type can be a single ethernet type, a range, or a type followed by a mask (values are always in hexadecimal) e.g. 0800 0800-0806 0800/8 0800&03ff Note, I am still uncertain on what is the best format for inputting these values, having the values in hexadecimal is convenient in most cases but can be confusing sometimes. Suggestions welcome. Implement suggestion from PR 37778 to allow "not me" on destination and source IP. The code in the PR was slightly wrong and interfered with the normal handling of IP addresses. This version hopefully is correct. Minor cleanup of the code, in some places moving the indentation to 4 spaces because the code was becoming too deep. Eventually, in a separate commit, I will move the whole file to 4 space indent.	2002-05-12 20:43:50 +00:00
dd	9cc64ca23c	s/demon/daemon/	2002-05-12 00:22:38 +00:00
mike	3ef853a60c	Remove some duplicate types that should have been removed as part of the rearranging in the previous revision. Pointy hat to: cvs update (merging), mike (for not noticing)	2002-05-11 23:28:51 +00:00
luigi	23cf222c81	Cleanup the interface to ip_fw_chk, two of the input arguments were totally useless and have been removed. ip_input.c, ip_output.c: Properly initialize the "ip" pointer in case the firewall does an m_pullup() on the packet. Remove some debugging code forgotten long ago. ip_fw.[ch], bridge.c: Prepare the grounds for matching MAC header fields in bridged packets, so we can have 'etherfw' functionality without a lot of kernel and userland bloat.	2002-05-09 10:34:57 +00:00
kbyanc	cc607e6c2d	Move ISO88025 source routing information into sockaddr_dl's sdl_data field. This returns the sdl_data field to a variable-length field. More importantly, this prevents an easily reproducible data-corruption bug when the interface name plus the hardware address exceed the sdl_data field's original 12 byte limit. However, token-ring interfaces may still overflow the new sdl_data field's 46 byte limit if the interface name exceeds 6 characters (since 6 characters for interface name plus 6 for hardware address plus 34 for source routing = the size of sdl_data). Further refinements could overcome this limitation but would break binary compatibility; this commit only addresses fixing the bug for commonly occurring cases without breaking binary compatibility with the intention that the functionality can be MFC'ed to -stable. See message ID's (both sent to -arch): 20020421013332.F87395-100000@gateway.posi.net 20020430181359.G11009-300000@gateway.posi.net for a more thorough description of the bug addressed and how to reproduce it. Approved by: silence on -arch and -net Sponsored by: NTT Multimedia Communications Labs MFC after: 1 week	2002-05-07 22:14:06 +00:00
ume	0dc033806b	Revised MLD-related definitions - Used mld_xxx and MLD_xxx instead of mld6_xxx and MLD6_xxx according to the official definitions in rfc2292bis (macro definitions for backward compatibility were provided) - Changed the first member of mld_hdr{} from mld_hdr to mld_icmp6_hdr to avoid name space conflict in C++ This change makes ports/net/pchar compilable again under -CURRENT. Obtained from: KAME	2002-05-06 16:28:25 +00:00
luigi	a03098c406	Indentation and comments cleanup, no functional change. MFC after: 3 days	2002-05-05 21:27:47 +00:00
alfred	798c53d495	Redo the sigio locking. Turn the sigio sx into a mutex. Sigio lock is really only needed to protect interrupts from dereferencing the sigio pointer in an object when the sigio itself is being destroyed. In order to do this in the most unintrusive manner change pgsigio's sigio * argument into a **, that way we can lock internally to the function.	2002-05-01 20:44:46 +00:00
alfred	21257e117d	Fix some edge cases where bad string handling could occur. Submitted by: ps	2002-05-01 08:29:41 +00:00
alfred	f34e021666	cleanup: fix line wraps, add some comments, fix macro definitions, fix for(;;) loops.	2002-05-01 08:08:24 +00:00
cjc	6b0c9026c6	Enlighten those who read the FINE POINTS of the documentation a bit more on how ipfw(8) deals with tiny fragments. While we're at it, add a quick log message to even let people know we dropped a packet. (Note that the second FINE POINT is somewhat redundant given the first, but since the code is there, leave the docs for it.) MFC after: 1 day	2002-05-01 06:29:16 +00:00
tanimura	89ec521d91	Revert the change of #includes in sys/filedesc.h and sys/socketvar.h. Requested by: bde Since locking sigio_lock is usually followed by calling pgsigio(), move the declaration of sigio_lock and the definitions of SIGIO_*() to sys/signalvar.h. While I am here, sort include files alphabetically, where possible.	2002-04-30 01:54:54 +00:00
tanimura	dbb4756491	Add a global sx sigio_lock to protect the pointer to the sigio object of a socket. This avoids lock order reversal caused by locking a process in pgsigio(). sowakeup() and the callers of it (sowwakeup, soisconnected, etc.) now require sigio_lock to be locked. Provide sowwakeup_locked(), soisconnected_locked(), and so on in case where we have to modify a socket and wake up a process atomically.	2002-04-27 08:24:29 +00:00
mike	491520a810	Rearrange <netinet/in.h> so that it is easier to conditionalize sections for various standards. Conditionalize sections for various standards. Use standards conforming spelling for types in the sockaddr_in structure.	2002-04-24 01:26:11 +00:00
mike	39f7a31d80	Add sa_family_t type to <sys/_types.h> and typedefs to <netinet/in.h> and <sys/socket.h>. Previously, sa_family_t was only typedef'd in <sys/socket.h>.	2002-04-20 02:24:35 +00:00
suz	553226e8e1	just merged cosmetic changes from KAME to ease sync between KAME and FreeBSD. (based on freebsd4-snap-20020128) Reviewed by: ume MFC after: 1 week	2002-04-19 04:46:24 +00:00
suz	7a2a62c14d	initialize local variable explicitly Reviewed by: ume Obtained from: Fujitsu guys MFC after: 1 week	2002-04-11 02:14:21 +00:00
silby	c7389be7ba	Remove some ISN generation code which has been unused since the syncache went in. MFC after: 3 days	2002-04-10 22:12:01 +00:00
silby	5c10a8af24	Totally nuke IPPORT_USERRESERVED, it is no longer used anywhere, update remaining comments to reflect new ephemeral port range. Reminded by: Maxim Konovalov <maxim@macomnet.ru> MFC after: 3 days	2002-04-10 19:30:58 +00:00
mike	4100d7ad0f	Unconditionalize the definition of INET_ADDRSTRLEN and INET6_ADDRSTRLEN. Doing this helps expose bogus redefinitions in 3rd party software.	2002-04-10 11:59:02 +00:00
brian	c321804c50	Remove the code that masks an EEXIST returned from rtinit() when calling ioctl(SIOC[AS]IFADDR). This allows the following: ifconfig xx0 inet 1.2.3.1 netmask 0xffffff00 ifconfig xx0 inet 1.2.3.17 netmask 0xfffffff0 alias ifconfig xx0 inet 1.2.3.25 netmask 0xfffffff8 alias ifconfig xx0 inet 1.2.3.26 netmask 0xffffffff alias but would (given the above) reject this: ifconfig xx0 inet 1.2.3.27 netmask 0xfffffff8 alias due to the conflicting netmasks. I would assert that it's wrong to mask the EEXIST returned from rtinit() as in the above scenario, the deletion of the 1.2.3.25 address will leave the 1.2.3.27 address as unroutable as it was in the first place. Offered for review on: -arch, -net Discussed with: stephen macmanus <stephenm@bayarea.net> MFC after: 3 weeks	2002-04-10 01:42:44 +00:00
brian	2eb3cb5cca	Don't add host routes for interface addresses of 0.0.0.0/8 -> 0.255.255.255. This change allows bootp to work with more than one interface, at the expense of some rather ``wrong'' looking code. I plan to MFC this in place of luigi's recent #ifdef BOOTP stuff that was committed to this file in -stable, as that's slightly more wrong than this is. Offered for review on: -arch, -net MFC after: 2 weeks	2002-04-10 01:42:32 +00:00
jhb	6615797e53	Change the first argument of prison_xinpcb() to be a thread pointer instead of a proc pointer so that prison_xinpcb() can use td_ucred.	2002-04-09 20:04:10 +00:00
silby	5339bdcf65	Update comments to reflect the recent ephemeral port range change. Noticed by: ru MFC After: 1 day	2002-04-09 18:01:26 +00:00
mdodd	23ff620fb4	Retire this copy; it now lives in sys/net/fddi.h.	2002-04-05 19:24:38 +00:00
jhb	db9aa81e23	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
jhb	dc2e474f79	Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag. Discussed on: smp@	2002-04-01 21:31:13 +00:00
mike	beecc37c73	o Implement <sys/_types.h>, a new header for storing types that are MI, not required to be a fixed size, and used in multiple headers. This will grow in time, as more things move here from <sys/types.h> and <machine/ansi.h>. o Add missing type definitions (uint16_t and uint32_t) to <arpa/inet.h> and <netinet/in.h>. o Reduce pollution in <sys/types.h> by using `#if _FOO_T_DECLARED' widgets to avoid including <sys/stdint.h>. o Add some missing type definitions to <unistd.h> and note the ones that still need to be added. o Make use of <sys/_types.h> primitives in <grp.h> and <sys/types.h>. Reviewed by: bde	2002-04-01 08:12:25 +00:00
bde	867fc1ed1c	Fixed some style bugs in the removal of __P(()). Continuation lines were not outdented to preserve non-KNF lining up of code with parentheses. Switch to KNF formatting.	2002-03-24 10:19:10 +00:00
rwatson	afe2b1f929	Merge from TrustedBSD MAC branch: Move the network code from using cr_cansee() to check whether a socket is visible to a requesting credential to using a new function, cr_canseesocket(), which accepts a subject credential and object socket. Implement cr_canseesocket() so that it does a prison check, a uid check, and add a comment where shortly a MAC hook will go. This will allow MAC policies to separately instrument the visibility of sockets from the visibility of processes. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-03-22 19:57:41 +00:00
ru	cb4688c90e	Prevent icmp_reflect() from calling ip_output() with a NULL route pointer which will then result in the allocated route's reference count never being decremented. Just flood ping the localhost and watch refcnt of the 127.0.0.1 route with netstat(1). Submitted by: jayanth Back out ip_output.c,v 1.143 and ip_mroute.c,v 1.69 that allowed ip_output() to be called with a NULL route pointer. The previous paragraph shows why this was a bad idea in the first place. MFC after: 0 days	2002-03-22 16:45:54 +00:00
silby	c260993d3b	Change the ephemeral port range from 1024-5000 to 49152-65535. This increases the number of concurrent outgoing connections from ~4000 to ~16000. Other OSes (Solaris, OS X, NetBSD) and many other NAT products have already made this change without ill effects, so we should not run into any problems. MFC after: 1 week	2002-03-22 03:28:11 +00:00
orion	10ea87ba4b	Send periodic ARP requests when ARP entries for hosts we are sending to are about to expire. This prevents high packet rate flows from experiencing packet drops at the sender following ARP cache entry timeout. PR: kern/25517 Reviewed by: luigi MFC after: 7 days	2002-03-20 15:56:36 +00:00
jeff	0a59f1223c	Switch vm_zone.h with uma.h. Change over to uma interfaces.	2002-03-20 05:48:55 +00:00
alfred	357e37e023	Remove __P.	2002-03-19 21:25:46 +00:00
jeff	2923687da3	This is the first part of the new kernel memory allocator. This replaces malloc(9) and vm_zone with a slab like allocator. Reviewed by: arch@	2002-03-19 09:11:49 +00:00
rwatson	60d6d81252	NAI DBA update	2002-03-14 16:53:39 +00:00
mike	b9910027dd	o Add INET_ADDRSTRLEN and INET6_ADDRSTRLEN defines to <arpa/inet.h> for POSIX.1-2001 conformance. o Add magic to <netinet/in.h> and <netinet6/in6.h> to prevent redefining INET_ADDRSTRLEN and INET6_ADDRSTRLEN. o Add a note about missing typedefs in <arpa/inet.h>.	2002-03-10 06:42:27 +00:00
mike	b8cc0d1207	o Don't require long long support in bswap64() functions. o In i386's <machine/endian.h>, macros have some advantages over inlines, so change some inlines to macros. o In i386's <machine/endian.h>, ungarbage collect word_swap_int() (previously __uint16_swap_uint32), it has some uses on i386's with PDP endianness. Submitted by: bde o Move a comment up in <machine/endian.h> that was accidentally moved down a few revisions ago. o Reenable userland's use of optimized inline-asm versions of byteorder(3) functions. o Fix ordering of prototypes vs. redefinition of byteorder(3) functions, so that the non-GCC (libc asm) case has proper prototypes. o Add proper prototypes for byteorder(3) functions in <sys/param.h>. o Prevent redundant duplicate prototypes by making use of the _BYTEORDER_PROTOTYPED define. o Move the bswap16(), bswap32(), bswap64() C functions into MD space for platforms in which asm versions don't exist. This significantly reduces the complexity of some things at the cost of duplicate code. Reviewed by: bde	2002-03-09 21:02:16 +00:00
ume	3d5b174433	- Set inc_isipv6 in tcp6_usr_connect(). - When making a pcb from a sync cache, do not forget to copy inc_isipv6. Obtained from: KAME MFC After: 1 week	2002-02-28 17:11:10 +00:00
jhb	3706cd3509	Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.	2002-02-27 18:32:23 +00:00
cjc	822f4e8381	Change the wording of the inline comments from the previous commit. Objection from: ru	2002-02-27 13:52:06 +00:00
alfred	943268c4b5	More IPV6 const fixes.	2002-02-27 05:11:50 +00:00
dd	c8a6bd9922	Introduce a version field to `struct xucred' in place of one of the spares (the size of the field was changed from u_short to u_int to reflect what it really ends up being). Accordingly, change users of xucred to set and check this field as appropriate. In the kernel, this is being done inside the new cru2x() routine which takes a `struct ucred' and fills out a `struct xucred' according to the former. This also has the pleasant side effect of removing some duplicate code. Reviewed by: rwatson	2002-02-27 04:45:37 +00:00
brooks	b1c3d8a603	Staticize an extern that no one else used.	2002-02-26 18:24:00 +00:00
jedgar	ecdaec0ea7	Enforce inbound IPsec SPD Reviewed by: fenner	2002-02-26 02:11:13 +00:00
alfred	96af38570e	Document what inpcb->inp_vflag is for. Submitted by: Marco Molteni <molter@tin.it>	2002-02-25 09:41:43 +00:00
cjc	8b28692f71	The TCP code did not do sufficient checks on whether incoming packets were destined for a broadcast IP address. All TCP packets with a broadcast destination must be ignored. The system only ignored packets that were _link-layer_ broadcasts or multicast. We need to check the IP address too since it is quite possible for a broadcast IP address to come in with a unicast link-layer address. Note that the check existed prior to CSRG revision 7.35, but was removed. This commit effectively backs out that nine-year-old change. PR: misc/35022	2002-02-25 08:29:21 +00:00
luigi	565d5dddb5	BUGFIX: make use of the pointer to the target of skipto rules, so that after the first time we can follow the pointer instead of having to scan the list. This was the intended behaviour from day one. PR: 34639 MFC-after: 3 days	2002-02-20 17:15:57 +00:00
jlemon	cc3e7eecb1	When expanding a syncache entry into a socket, inherit the socket options from the current listen socket instead of the cached (and possibly stale) TCB pointer.	2002-02-20 16:47:11 +00:00
mike	bcee06d42c	o Move NTOHL() and associated macros into <sys/param.h>. These are deprecated in favor of the POSIX-defined lowercase variants. o Change all occurrences of NTOHL() and associated macros in the source tree to use the lowercase function variants. o Add missing license bits to sparc64's <machine/endian.h>. Approved by: jake o Clean up <machine/endian.h> files. o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>. o Remove prototypes for non-existent bswapXX() functions. o Include <machine/endian.h> in <arpa/inet.h> to define the POSIX-required ntohl() family of functions. o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>, and <sys/param.h>. o Prepend underscores to the ntohl() family to help deal with complexities associated with having MD (asm and inline) versions, and having to prevent exposure of these functions in other headers that happen to make use of endian-specific defines. o Create weak aliases to the canonical function name to help deal with third-party software forgetting to include an appropriate header. o Remove some now unneeded pollution from <sys/types.h>. o Add missing <arpa/inet.h> includes in userland. Tested on: alpha, i386 Reviewed by: bde, jake, tmm	2002-02-18 20:35:27 +00:00
ru	d2e47c3d20	Moved the 127/8 check below so that IPF redirects have a chance of working. MFC after: 1 day	2002-02-15 12:19:03 +00:00
jlemon	04bdc3812f	When a duplicate SYN arrives which matches an entry in the syncache, update our lazy reference to the inpcb structure, as it may have changed. Found by: dima	2002-02-12 02:03:50 +00:00
dd	336de67dc7	Silence unused variable warning in the !KLD_MODULE case. Submitted by: archie	2002-02-10 22:22:05 +00:00
julian	b5eb64d6f0	Pre-KSE/M3 commit. this is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embedded there but remove all the code that assumes that in preparation for the next commit which will actually move it out. Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,	2002-02-07 20:58:47 +00:00
ume	0c5d24bcd2	In tcp_respond(), correctly reset returned IPv6 header. This is essential when the original packet contains an IPv6 extension header. Obtained from: KAME MFC after: 1 week	2002-02-04 17:37:06 +00:00
markm	8110fb90cc	WARNS=n and lint(1) silencer. Declare an array of (const) strings as const char.	2002-02-03 11:57:32 +00:00
cjc	ffde9fe98f	The ipfw(8) 'tee' action simply hasn't worked on incoming packets for some time. _All_ packets, regardless of destination, were accepted by the machine as if addressed to it. Jump back to 'pass' processing for a teed packet instead of falling through as if it was ours. PR: kern/31130 Reviewed by: -net, luigi MFC after: 2 weeks	2002-01-26 10:14:08 +00:00
jlemon	a7b593e6c9	The ENDPTS_EQ macro was comparing one of the fports to itself. Fix. Submitted by: emy@boostworks.com	2002-01-22 17:54:28 +00:00
ume	590d306747	- Check the address family of the destination cached in a PCB. - Clear the cached destination before getting another cached route. Otherwise, garbage in the padding space (which might be filled in if it was used for IPv4) could annoy rtalloc. Obtained from: KAME	2002-01-21 20:04:22 +00:00
ru	8d3eaf171b	RFC1122 requires that addresses of the form { 127, <any> } MUST NOT appear outside a host. PR: 30792, 33996 Obtained from: ip_input.c MFC after: 1 week	2002-01-21 13:59:42 +00:00
ru	6f5f8c6c2c	Fix a panic condition in icmp_reflect() introduced in rev. 1.61. (We should be able to handle locally originated IP packets, and these do not have m_pkthdr.rcvif set.) PR: kern/32806, kern/33766 Reviewed by: luigi Fix tested by: Maxim Konovalov <maxim@macomnet.ru>, Erwin Lansing <erwin@lansing.dk>	2002-01-11 12:13:57 +00:00
msmith	ea9c5a8d4c	Initialise the intrq_present fields at runtime, not link time. This allows us to load protocols at runtime, and avoids the use of common variables. Also fix the ip6_intrq assignment so that it works at all.	2002-01-08 10:34:03 +00:00
cjc	14705316d2	Fix a missing "ipfw:" in a syslog message. MFC after: 1 day	2002-01-07 07:12:09 +00:00
fenner	1a8ac98fc3	Pre-calculate the checksum for multicast packets sourced on a multicast router. This is overkill; it should be possible to delay to hardware interfaces and only pre-calculate when forwarding to a tunnel.	2002-01-05 18:23:53 +00:00
rwatson	46f317e07b	o Spelling fix in comment: tcp_ouput -> tcp_output	2002-01-04 17:21:27 +00:00
yar	11da1a2ed8	Don't reveal a router in the IPSTEALTH mode through IP options. The following steps are involved: a) the IP options related to routing (LSRR and SSRR) are processed as though the router were a host, b) the other IP options are processed as usual only if the packet is destined for the router; otherwise they are ignored. PR: kern/23123 Discussed in: freebsd-hackers	2001-12-29 09:24:18 +00:00
julian	f6dd852457	Fix ipfw fwd so that it acts as the docs say when forwarding an incoming packet to another machine. Obtained from: Vicor Production tree MFC after: 3 weeks	2001-12-28 21:21:57 +00:00
yar	ca1cc6602b	Implement matching IP precedence in ipfw(4). Submitted by: Igor Timkin <ivt@gamma.ru>	2001-12-21 18:43:02 +00:00
jlemon	dcae5ce4e7	Remove a change that snuck in from my private tree.	2001-12-21 05:07:39 +00:00
jlemon	87be243fa6	If syncookies are disabled (net.inet.tcp.syncookies) then use the faster arc4random() routine to generate ISNs instead of creating them with MD5(). Suggested by: silby	2001-12-21 04:41:08 +00:00
jlemon	ba290916ff	When storing an int value in a void *, use intptr_t as the cast type (instead of int) to keep the 64 bit platforms happy.	2001-12-19 15:57:43 +00:00
yar	25850c205d	Don't try to free a NULL route when doing IPFIREWALL_FORWARD. An old route will be NULL at that point if a packet were initially routed to an interface (using the IP_ROUTETOIF flag.) Submitted by: Igor Timkin <ivt@gamma.ru>	2001-12-19 14:54:13 +00:00
jlemon	d0b486460f	Extend the SYN DoS defense by adding syncookies to the syncache. All TCP ISNs that are sent out are valid cookies, which allows entries in the syncache to be dropped and still have the ACK accepted later. As all entries pass through the syncache, there is no sudden switchover from cache -> cookies when the cache is full; instead, syncache entries simply have a reduced lifetime. More details may be found in the "Resisting DoS attacks with a SYN cache" paper in the Usenix BSDCon 2002 conference proceedings. Sponsored by: DARPA, NAI Labs	2001-12-19 06:12:14 +00:00
ru	642a135b45	Fixed the bug in transparent TCP proxying with the "encode_ip_hdr" option -- TcpAliasOut() did not catch the IP header length change. Submitted by: Stepachev Andrey <aka50@mail.ru>	2001-12-18 16:13:45 +00:00
rwatson	5014778ff3	o Add IPOPT_ESO for the 'Extended Security' IP option (RFC1108) Obtained from: TrustedBSD Project	2001-12-14 19:37:32 +00:00
rwatson	56387dfef2	o Add definition for IPOPT_CIPSO, the commercial security IP option number. Submitted by: Ilmar S. Habibulin <ilmar@watson.org> Obtained from: TrustedBSD Project	2001-12-14 19:34:42 +00:00
jlemon	12f48c6901	whitespace and style fixes recovered from -stable.	2001-12-14 19:34:11 +00:00
jlemon	441bffc79d	minor style and whitespace fixes.	2001-12-14 19:33:29 +00:00
jlemon	0a6314db1d	whitespace fixes.	2001-12-14 19:32:47 +00:00
jlemon	2fde22e293	minor whitespace fixes.	2001-12-14 19:32:00 +00:00
silby	1b6efabb90	Reduce the local network slowstart flightsize from infinity to 4 packets. Now that we've increased the size of our send / receive buffers, bursting an entire window onto the network may cause congestion. As a result, we will slow start beginning with a flightsize of 4 packets. Problem reported by: Thomas Zenker <thz@Lennartz-electronic.de> MFC after: 3 days	2001-12-14 18:26:52 +00:00
jlemon	3c2732d720	Undo one of my last minute changes; move sc_iss up earlier so it is initialized in case we take the T/TCP path.	2001-12-13 04:05:26 +00:00
jlemon	776e8594bd	Fix up tabs from cut&n&paste.	2001-12-13 04:02:31 +00:00
jlemon	ec4b51f883	Fix up tabs in comments.	2001-12-13 04:02:09 +00:00
jlemon	37e5dc6ec1	Minor style fixes.	2001-12-13 04:01:23 +00:00
jlemon	f3ff850b00	Minor style fix.	2001-12-13 04:01:01 +00:00
obrien	7fd9a6a23a	Update to C99, s/__FUNCTION__/__func__/, also don't use ANSI string concatenation.	2001-12-10 08:09:49 +00:00
rwatson	02fc34fde9	o Our current userland boot code (due to rc.conf and rc.network) always enables TCP keepalives using the net.inet.tcp.always_keepalive by default. Synchronize the kernel default with the userland default.	2001-12-07 17:01:28 +00:00
ru	3dd3844f57	Fixed remotely exploitable DoS in arpresolve(). Easily exploitable by flood pinging the target host over an interface with the IFF_NOARP flag set (all you need to know is the target host's MAC address). MFC after: 0 days	2001-12-05 18:13:34 +00:00
rwatson	b5de442911	o Introduce pr_mtx into struct prison, providing protection for the mutable contents of struct prison (hostname, securelevel, refcount, pr_linux, ...) o Generally introduce mtx_lock()/mtx_unlock() calls throughout kern/ so as to enforce these protections, in particular, in kern_mib.c protection sysctl access to the hostname and securelevel, as well as kern_prot.c access to the securelevel for access control purposes. o Rewrite linux emulator abstractions for accessing per-jail linux mib entries (osname, osrelease, osversion) so that they don't return a pointer to the text in the struct linux_prison, rather, a copy to an array passed into the calls. Likewise, update linprocfs to use these primitives. o Update in_pcb.c to always use prison_getip() rather than directly accessing struct prison. Reviewed by: jhb	2001-12-03 16:12:27 +00:00
dillon	f97547e246	Fix a bug with transmitter restart after receiving a 0 window. The receiver was not sending an immediate ack with delayed acks turned on when the input buffer is drained, preventing the transmitter from restarting immediately. Propagate the TCP_NODELAY option to accept()ed sockets. (Helps tbench and is a good idea anyway). Some cleanup. Identify additional issues in comments. MFC after: 1 day	2001-12-02 08:49:29 +00:00
ru	5fcff41f8a	Allow for ip_output() to be called with a NULL route pointer. This fixes a panic I introduced yesterday in ip_icmp.c,v 1.64.	2001-12-01 13:48:16 +00:00
mike	20cacce16c	o Stop abusing MD headers with non-MD types. o Hide nonstandard functions and types in <netinet/in.h> when _POSIX_SOURCE is defined. o Add some missing types (required by POSIX.1-200x) to <netinet/in.h>. o Restore vendor ID from Rev 1.1 in <netinet/in.h> and make use of new __FBSDID() macro. o Fix some miscellaneous issues in <arpa/inet.h>. o Correct final argument for the inet_ntop() function (POSIX.1-200x). o Get rid of the namespace pollution from <sys/types.h> in <arpa/inet.h>. Reviewed by: fenner Partially submitted by: bde	2001-12-01 03:43:01 +00:00
dillon	cbc4eaa756	The transmit burst limit for newreno completely breaks TCP's performance if the receive side is using delayed acks. Temporarily remove it. MFC after: 0 days	2001-11-30 21:33:39 +00:00
brian	0c6aed3bcb	During SIOCAIFADDR, if in_ifinit() fails and we've already added an interface address, blow the address away again before returning the error. In in_ifinit(), if we get an error from rtinit() and we've also got a destination address, return the error rather than masking EEXISTS. Failing to create a host route when configuring an interface should be treated as an error.	2001-11-30 14:00:55 +00:00
ru	cfe5212a8b	- Make ip_rtaddr() global, and use it to look up the correct source address in icmp_reflect(). - Two new "struct icmpstat" members: icps_badaddr and icps_noroute. PR: kern/31575 Obtained from: BSD/OS MFC after: 1 week	2001-11-30 10:40:28 +00:00
dd	2c2a10067f	ipfw_modevent(): Don't use an unnatural block to define a variable (fcp) that's already defined in the outer block and isn't used anywhere else. This silences -Wunused. Reviewed by: md5(1)	2001-11-27 20:32:47 +00:00
dd	371e36e76a	Remove debugging printfs that weren't conditional on any debugging options in handling MOD_{UN,}LOAD (they weren't very useful, anyway).	2001-11-27 20:28:48 +00:00
dd	aeec5e4265	In icmp_reflect(): If the packet was not addressed to us and was received on an interface without an IP address, try to find a non-loopback AF_INET address to use. If that fails, drop it. Previously, we used the address at the top of the in_ifaddrhead list, which didn't make much sense, and would cause a panic if there were no AF_INET addresses configured on the system. PR: 29337, 30524 Reviewed by: ru, jlemon Obtained from: NetBSD	2001-11-27 19:58:09 +00:00
rwatson	0696a32b7b	Add include of net/route.h, as structures moved around due to the syncache rely on 'struct route' being defined. This fixes the LINT build some.	2001-11-27 17:36:39 +00:00
tanimura	f951178b75	Clear a new syncache entry first, followed by filling in values. This fixes route breakage due to uncleared garbage on my box.	2001-11-27 11:55:28 +00:00
ru	1274247e0e	When servicing an internal FTP server, punch ipfirewall(4) holes for passive mode data connections (PASV/EPSV -> 227/229). Well, the actual punching happens a bit later, when the aliasing link becomes fully specified. Prodded by: Danny Carroll <dannycarroll@hotmail.com> MFC after: 1 week	2001-11-27 10:50:23 +00:00
ru	34f496f589	Restore the ability to use IP_FW_ADD with setsockopt(2) that got broken in revision 1.86. This broke natd(8)'s -punch_fw option. Reported by: Daniel Rock <D.Rock@t-online.de>, setantae <setantae@submonkey.net>	2001-11-26 10:05:58 +00:00
bde	d2d81413e2	Fixed a buffer overrun. In my kernel configuration, tcp_syncache happens to be followed by nfsnodehashtbl, so bzeroing callouts beyond the end of tcp_syncache soon caused a null pointer panic when nfsnodehashtbl was accessed.	2001-11-23 12:31:27 +00:00
jlemon	a3c1c9fdb4	Introduce a syncache, which enables FreeBSD to withstand a SYN flood DoS in an improved fashion over the existing code. Reviewed by: silby (in a previous iteration) Sponsored by: DARPA, NAI Labs	2001-11-22 04:50:44 +00:00
jlemon	c41580e9ad	Move initialization of snd_recover into tcp_sendseqinit().	2001-11-21 18:45:51 +00:00
dillon	86ed17d675	Give struct socket structures a ref counting interface similar to vnodes. This will hopefully serve as a base from which we can expand the MP code. We currently do not attempt to obtain any mutex or SX locks, but the door is open to add them when we nail down exactly how that part of it is going to work.	2001-11-17 03:07:11 +00:00
rwatson	8cf42b482a	o Replace reference to 'struct proc' with 'struct thread' in 'struct sysctl_req', which describes in-progress sysctl requests. This permits sysctl handlers to have access to the current thread, permitting work on implementing td->td_ucred, migration of suser() to using struct thread to derive the appropriate ucred, and allowing struct thread to be passed down to other code, such as network code where td is not currently available (and curproc is used). o Note: netncp and netsmb are not updated to reflect this change, as they are not currently KSE-adapted. Reviewed by: julian Obtained from: TrustedBSD Project	2001-11-08 02:13:18 +00:00
arr	9ed45cbd11	- Fixes non-zero'd out sin_zero field problem so that the padding is used as it is supposed to be. Inspired by: PR #31704 Approved by: jdp Reviewed by: jhb, -net@	2001-11-06 00:48:01 +00:00
phk	b66cb8c56d	3.5 years ago Wollman wrote: "[...] and removes the hostcache code from standard kernels---the code that depends on it is not going to happen any time soon, I'm afraid." Time to clean up.	2001-11-05 21:25:02 +00:00
luigi	f565e0a1df	MFS: sync the ipfw/dummynet/bridge code with the one recently merged into stable (mostly , but not only, formatting and comments changes).	2001-11-04 22:56:25 +00:00
luigi	0c9b62266a	s/FREE/free/	2001-11-04 17:35:31 +00:00
brian	876314d445	cmott@scientech.com -> cm@linktel.net Requested by: Charles Mott <cmott@scientech.com>	2001-11-03 11:34:09 +00:00
wpaul	08ca13c8db	Fix a (long standing?) bug in ip_output(): if ip_insertoptions() is called and ip_output() encounters an error and bails (i.e. host unreachable), we will leak an mbuf. This is because the code calls m_freem(m0) after jumping to the bad: label at the end of the function, when it should be calling m_freem(m). (m0 is the original mbuf list _without_ the options mbuf prepended.) Obtained from: NetBSD	2001-10-30 18:15:48 +00:00
des	3554d69eb7	Make sure the netmask always has an address family. This fixes Linux ifconfig, which expects the address returned by the SIOCGIFNETMASK ioctl to have a valid sa_family. Similar changes may be necessary for IPv6. While we're here, get rid of an unnecessary temp variable. MFC after: 2 weeks	2001-10-30 15:57:20 +00:00
jlemon	20820bb50e	When dropping a packet because there is no room in the queue (which itself is somewhat bogus), update the statistics to indicate something was dropped. PR: 13740	2001-10-30 14:58:27 +00:00
joe	0dc5f6f003	A few more style changes picked up whilst working on an MFC to -stable.	2001-10-29 15:09:07 +00:00
joe	f4296b73c0	Fix some whitespace, and a comment that I missed in the last commit.	2001-10-29 14:08:51 +00:00
joe	c78b92c237	Clean up the style of this header file.	2001-10-29 04:41:28 +00:00
dillon	981dfd6cd9	fix int argument used in printf w/ %ld (cast to long)	2001-10-29 02:19:19 +00:00
jlemon	db827296e4	Don't use the ip_timestamp structure to access timestamp options, as the compiler may cause an unaligned access to be generated in some cases. PR: 30982	2001-10-25 06:27:51 +00:00
jlemon	669cd5c6d7	If we are bridging, fall back to using any inet address in the system, irrespective of receive interface, as a last resort. Submitted by: ru	2001-10-25 06:14:21 +00:00
jlemon	0ecfb417cf	Relocate the KASSERT for a null recvif to a location where it will actually do some good. Pointed out by: ru	2001-10-25 05:56:30 +00:00
ume	44216e0fa0	restore the data of the ip header when extended udp header and data checksum is calculated. this caused some trouble in the code which the ip header is not modified. for example, inbound policy lookup failed. Obtained from: KAME MFC after: 1 week	2001-10-22 12:43:30 +00:00
jlemon	a3a164e488	Only examine inet addresses of the interface. This was broken in r1.83, with the result that the system would reply to an ARP request of 0.0.0.0	2001-10-20 05:14:06 +00:00
ru	ecb4d3d05f	Pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2. Have sys/net/route.c:rtrequest1(), which takes ``rt_addrinfo *'' as the argument. Pass rt_addrinfo all the way down to rtrequest1 and ifa->ifa_rtrequest. 3rd argument of ifa->ifa_rtrequest is now ``rt_addrinfo *'' instead of ``sockaddr *'' (almost no one is using it anyways). Benefit: the following command now works. Previously we needed two route(8) invocations, "add" then "change". # route add -inet6 default ::1 -ifp gif0 Remove unsafe typecast in rtrequest(), from ``rtentry *'' to ``sockaddr *''. It was introduced by 4.3BSD-Reno and never corrected. Obtained from: BSD/OS, NetBSD MFC after: 1 month PR: kern/28360	2001-10-17 18:07:05 +00:00
fjoe	8ef8a1b13f	bring in ARP support for variable length link level addresses Reviewed by: jdp Approved by: jdp Obtained from: NetBSD MFC after: 6 weeks	2001-10-14 20:17:53 +00:00
rwatson	f51eaee62f	- Combine kern.ps_showallprocs and kern.ipc.showallsockets into a single kern.security.seeotheruids_permitted, described as: "Unprivileged processes may see subjects/objects with different real uid" NOTE: kern.ps_showallprocs exists in -STABLE, and therefore there is an API change. kern.ipc.showallsockets does not. - Check kern.security.seeotheruids_permitted in cr_cansee(). - Replace visibility calls to socheckuid() with cr_cansee() (retain the change to socheckuid() in ipfw, where it is used for rule-matching). - Remove prison_unpcb() and make use of cr_cansee() against the UNIX domain socket credential instead of comparing root vnodes for the UDS and the process. This allows multiple jails to share the same chroot() and not see each other's UNIX domain sockets. - Remove unused socheckproc(). Now that cr_cansee() is used universally for socket visibility, a variety of policies are more consistently enforced, including uid-based restrictions and jail-based restrictions. This also better-supports the introduction of additional MAC models. Reviewed by: ps, billf Obtained from: TrustedBSD Project	2001-10-09 21:40:30 +00:00
jayanth	3c25260058	Add a flag TF_LASTIDLE, that forces a previously idle connection to send all its data, especially when the data is less than one MSS. This fixes an issue where the stack was delaying the sending of data, even though there was enough window to send all the data and the sending of data was emptying the socket buffer. Problem found by Yoshihiro Tsuchiya (tsuchiya@flab.fujitsu.co.jp) Submitted by: Jayanth Vijayaraghavan	2001-10-05 21:33:38 +00:00
ps	38383190d5	Only allow users to see their own socket connections if kern.ipc.showallsockets is set to 0. Submitted by: billf (with modifications by me) Inspired by: Dave McKay (aka pm aka Packet Magnet) Reviewed by: peter MFC after: 2 weeks	2001-10-05 07:06:32 +00:00
ps	d0afbb304a	Make it so dummynet and bridge can be loaded as modules. Submitted by: billf	2001-10-05 05:45:27 +00:00
jlemon	6bc13e1485	in_ifinit apparently can be used to rewrite an ip address; recalculate the correct hash bucket for the entry. Submitted by: iedowse (with some munging by me)	2001-10-01 18:07:08 +00:00
luigi	b607d229d2	Fix a problem with unnumbered rules introduced in latest commit. Reported by: des	2001-10-01 17:35:54 +00:00
ru	623da62a5a	mdoc(7) police: Use the new .In macro for #include statements.	2001-10-01 16:09:29 +00:00
dillon	384d1b2861	Add __FBSDID's to libalias	2001-09-30 21:03:33 +00:00
jlemon	d8102a69ad	Nuke unused (and incorrect) #define of INADDR_HMASK. Spotted by: ru	2001-09-29 14:59:20 +00:00
jlemon	fc9b0a1530	Make the INADDR_TO_IFP macro use the IP address hash lookup instead of walking the entire list of IP addresses. Pointed out by: bfumerola	2001-09-29 06:16:02 +00:00
jlemon	3164f24b55	Add a hash table that contains the list of internet addresses, and use this in place of the in_ifaddr list when appropriate. This improves performance on hosts which have a large number of IP aliases.	2001-09-29 04:34:11 +00:00
jlemon	17d77e9346	Centralize satosin(), sintosa() and ifatoia() macros in <netinet/in.h> Remove local definitions.	2001-09-29 03:23:44 +00:00
luigi	0fb106cc3f	Two main changes here: + implement "limit" rules, which permit to limit the number of sessions between certain host pairs (according to masks). These are a special type of stateful rules, which might be of interest in some cases. See the ipfw manpage for details. + merge the list pointers and ipfw rule descriptors in the kernel, so the code is smaller, faster and more readable. This patch basically consists in replacing "foo->rule->bar" with "rule->bar" all over the place. I have been willing to do this for ages! MFC after: 1 week	2001-09-27 23:44:27 +00:00
luigi	af2cc9a068	Remove unused (and duplicate) struct ip_opts which is never used, not referenced in Stevens, and does not compile with g++. There is an equivalent structure, struct ipoption in ip_var.h which is actually used in various parts of the kernel, and also referenced in Stevens. Bill Fenner also says: ... if you want the trivia, struct ip_opts was introduced in in.h SCCS revision 7.9, on 6/28/1990, by Mike Karels. struct ipoption was introduced in ip_var.h SCCS revision 6.5, on 9/16/1985, by... Mike Karels. MFC-after: 3 days	2001-09-27 11:53:22 +00:00
brooks	b9f9861d89	Include sys/proc.h for the definition of securelevel_ge(). Submitted by: LINT	2001-09-26 21:53:20 +00:00
rwatson	823d828036	o Modify IPFW and DUMMYNET administrative setsockopt() calls to use securelevel_gt() to check the securelevel, rather than direct access to the securelevel variable. Obtained from: TrustedBSD Project	2001-09-26 19:58:29 +00:00
brooks	74063dd723	Make faith loadable, unloadable, and clonable.	2001-09-25 18:40:52 +00:00
luigi	fc8e0b7bdd	Fix a null pointer dereference introduced in the last commit, plus remove a useless assignment and move a comment. Submitted by: Thomas Moestl	2001-09-24 05:24:19 +00:00
ru	7de7d2144f	Fixed the bug that prevented communication with FTP servers behind NAT in extended passive mode if the server's public IP address was different from the main NAT address. This caused a wrong aliasing link to be created that did not route the incoming packets back to the original IP address of the server. natd -v -n pub0 -redirect_address localFTP publicFTP Note that even if localFTP == publicFTP, one still needs to supply the -redirect_address directive. It is needed as a helper because extended passive mode's 229 reply does not contain the IP address. MFC after: 1 week	2001-09-21 14:38:36 +00:00
rwatson	7a4775391d	o Rename u_cansee() to cr_cansee(), making the name more comprehensible in the face of a rename of ucred to cred, and possibly generally. Obtained from: TrustedBSD Project	2001-09-20 21:45:31 +00:00
luigi	571d41f160	A bunch of minor changes to the code (see below) for readability, code size and speed. No new functionality added (yet) apart from a bugfix. MFC will occur in due time and probably in stages. BUGFIX: fix a problem in old code which prevented reallocation of the hash table for dynamic rules (there is a PR on this). OTHER CHANGES: minor changes to the internal struct for static and dynamic rules. Requires rebuild of ipfw binary. Add comments to show how data structures are linked together. (It probably makes no sense to keep the chain pointers separate from actual rule descriptors. They will be hopefully merged soon. keep a (sysctl-readable) counter for the number of static rules, to speed up IP_FW_GET operations initial support for a "grace time" for expired connections, so we can set timeouts for closing connections to much shorter times. merge zero_entry() and resetlog_entry(), they use basically the same code. clean up and reduce replication of code for removing rules, both for readability and code size. introduce a separate lifetime for dynamic UDP rules. fix a problem in old code which prevented reallocation of the hash table for dynamic rules (PR ...) restructure dynamic rule descriptors introduce some local variables to avoid multiple dereferencing of pointer chains (reduces code size and hopefully increases speed).	2001-09-20 13:52:49 +00:00
sumikawa	31af69645f	Fixed comment: ipip_input -> mroute_encapcheck. Reported by: bde	2001-09-20 07:59:45 +00:00
sumikawa	aa9b71c68d	Removed ipip_input(). No code calls it anymore due to ip_encap.c's encapsulation support.	2001-09-18 14:52:20 +00:00
julian	5596676e6c	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
julian	9ade8e4044	Remove some un-needed code that was accidentally included in the 2nd previous KAME patch. Submitted by: SUMIKAWA Munechika <sumikawa@ebina.hitachi.co.jp>	2001-09-07 07:24:28 +00:00
julian	3cc9960fd1	Patches from KAME to remove usage of Varargs in existing IPV4 code. For now they will still have some in the developing stuff (IPv6) Submitted by: Keiichi SHIMA / <keiichi@iij.ad.jp> Obtained from: KAME	2001-09-07 07:19:12 +00:00
jlemon	f729fe0a4a	Wrap array accesses in macros, which also happen to be lvalues: ifnet_addrs[i - 1] -> ifaddr_byindex(i) ifindex2ifnet[i] -> ifnet_byindex(i) This is intended to ease the conversion to SMPng.	2001-09-06 02:40:43 +00:00
alfred	7ffd260ead	Fix sysctl comment field, s/the the/then the Pointed out by: ru	2001-09-04 15:25:23 +00:00
alfred	3cb94e5158	Allow disabling of "arp moved" messages. Submitted by: Stephen Hurd <deuce@lordlegacy.org>	2001-09-03 21:53:15 +00:00
julian	4a8dc7084c	I really hope this is the right answer. call ip_input directly but take the offset off the packet first if it's an IPV4 packet encapsulated.	2001-09-03 21:07:31 +00:00
julian	70318b8e97	Call ip_input() instead of ipip_input() when decoding encapsulated ipv4 packets. (allows line to compile again)	2001-09-03 20:55:35 +00:00
julian	34824b62c0	One caller of rip_input failed to be converted in the last commit.	2001-09-03 20:40:35 +00:00
julian	071f86f9f1	Patches from Keiichi SHIMA <keiichi@iij.ad.jp> to make ip use the standard protosw structure again. Obtained from: Well, KAME I guess.	2001-09-03 20:03:55 +00:00
jayanth	77d67fb568	when newreno is turned on, if dupacks = 1 or dupacks = 2 and new data is acknowledged, reset the dupacks to 0. The problem was spotted when a connection had its send buffer full because the congestion window was only 1 MSS and was not being incremented because dupacks was not reset to 0. Obtained from: Yahoo!	2001-08-29 23:54:13 +00:00
jesper	0d6191f027	When net.inet.tcp.icmp_may_rst is enabled, report ECONNREFUSED not ENETRESET to the application as a RST would, this way we're compatible with most applications. MFC candidate. Submitted by: Scott Renfro <scott@renfro.org> Reviewed by: Mike Silbersack <silby@silby.com>	2001-08-27 22:10:07 +00:00
billf	01b240a5a7	the IP_FW_GET code in ip_fw_ctl() sizes a buffer to hold information about rules and dynamic rules. it later fills this buffer with these rules. it also takes the opportunity to compare the expiration of the dynamic rules with the current time and either marks them for deletion or simply charges the countdown. unfortunately it does this all (the sizing, the buffer copying, and the expiration GC) with no spl protection whatsoever. it was possible for the dynamic rule(s) to be ripped out from under the request before it had completed, resulting in corrupt memory dereferencing. Reviewed by: ps MFC before: 4.4-RELEASE, hopefully.	2001-08-26 10:09:47 +00:00
dd	6ea3a08d37	Correct a typo in a comment: FIN_WAIT2 -> FIN_WAIT_2 PR: 29970 Submitted by: Joseph Mallett <jmallett@xMach.org>	2001-08-23 22:34:29 +00:00
silby	58e247fcc4	Much delayed but now present: RFC 1948 style sequence numbers In order to ensure security and functionality, RFC 1948 style initial sequence number generation has been implemented. Barring any major cryptographic breakthroughs, this algorithm should be unbreakable. In addition, the problems with TIME_WAIT recycling which affect our currently used algorithm are not present. Reviewed by: jesper	2001-08-22 00:58:16 +00:00
ru	cf9d9a36e7	Added TFTP support. Submitted by: Joe Clarke <marcus@marcuscom.com> MFC after: 2 weeks	2001-08-21 16:25:38 +00:00
ru	4d0fae19b5	Close the "IRC DCC" security breach reported recently on Bugtraq. Submitted by: Makoto MATSUSHITA <matusita@jp.FreeBSD.org>	2001-08-21 11:21:08 +00:00
brian	bf0ff75162	Make the copyright consistent. Previously approved by: Charles Mott <cmott@scientech.com>	2001-08-20 22:57:33 +00:00
brian	600042995a	Handle snprintf() returning -1 MFC after: 2 weeks	2001-08-20 12:06:42 +00:00
julian	de6d7f13db	Make the protoswitch definitions checkable in the same way that cdevsw entries have been for a long time. Discover that we now have two versions of the same structure. I will shoot one of them shortly when I figure out why someone thinks they need it. (And I can prove they don't) (netinet/ipprotosw.h should GO AWAY)	2001-08-10 23:17:22 +00:00
ru	4345758876	mdoc(7) police: Avoid using parenthesis enclosure macros (.Pq and .Po/.Pc) with plain text. Not only this slows down the mdoc(7) processing significantly, but it also has an undesired (in this case) effect of disabling hyphenation within the entire enclosed block.	2001-08-07 15:48:51 +00:00
ume	215c0c107e	When running application joined multicast address, removing network card, and kill application. imo_membership[].inm_ifp refer interface pointer after removing interface. When kill application, release socket, and imo_membership. imo_membership use already not exist interface pointer. Then, kernel panic. PR: 29345 Submitted by: Inoue Yuichi <inoue@nd.net.fujitsu.co.jp> Obtained from: KAME MFC after: 3 days	2001-08-04 17:10:14 +00:00
dcs	908ed6c780	MFS: Avoid dropping fragments in the absence of an interface address. Noticed by: fenner Submitted by: iedowse Not committed to current by: iedowse ;-)	2001-08-03 17:36:06 +00:00
peter	3ed3578ff8	Fix a warning.	2001-07-27 00:04:39 +00:00
peter	b8da0cdbc4	Patch up some style(9) stuff in tcp_new_isn()	2001-07-27 00:03:49 +00:00
peter	3feb3ed786	s/OpemBSD/OpenBSD/	2001-07-27 00:01:48 +00:00
ume	e8ae8d1bf4	move ipsec security policy allocation into in_pcballoc, before making pcbs available to the outside world. otherwise, we will see inpcb without ipsec security policy attached (-> panic() in ipsec.c). Obtained from: KAME MFC after: 3 days	2001-07-26 19:19:49 +00:00
fenner	8396f6f2b1	Somewhat modernize ip_mroute.c: - Use sysctl to export stats - Use ip_encap.c's encapsulation support - Update lkm to kld (is 6 years a record for a broken module?) - Remove some unused cruft	2001-07-25 20:15:49 +00:00
ru	513055859b	Avoid a NULL pointer dereference introduced in rev. 1.129. Problem noticed by: bde, gcc(1) Panic caught by: mjacob Patch tested by: mjacob	2001-07-23 16:50:01 +00:00
ru	82aace0e06	Backout non-functional changes from revision 1.128. Not objected to by: dcs	2001-07-19 07:10:30 +00:00
dcs	4e8adbcead	Skip the route checking in the case of multicast packets with known interfaces. Reviewed by: people at that channel Approved by: silence on -net	2001-07-17 18:47:48 +00:00
ru	1a2a5935ee	Backout damage to the INADDR_TO_IFP() macro in revision 1.7. This macro was supposed to only match local IP addresses of interfaces, and all consumers of this macro assume this as well. (See IP_MULTICAST_IF and IP_ADD_MEMBERSHIP socket options in the ip(4) manpage.) This fixes a major security breach in IPFW-based firewalls where the `me' keyword would match the other end of a P2P link. PR: kern/28567	2001-07-17 10:30:21 +00:00
obrien	c5393097b3	Bump net.inet.tcp.sendspace to 32k and net.inet.tcp.recvspace to 65k. This should help us in naive benchmark "tests". It seems a wide number of people think 32k buffers would not cause major issues, and is in fact in use by many other OS's at this time. The receive buffers can be bumped higher as buffers are hardly used and several research papers indicate that receive buffers rarely use much space at all. Submitted by: Leo Bicknell <bicknell@ufp.org> <20010713101107.B9559@ussenterprise.ufp.org> Agreed to in principle by: dillon (at the 32k level)	2001-07-13 18:38:04 +00:00
ru	317b7d8e37	mdoc(7) police: removed HISTORY info from the .Os call.	2001-07-10 13:41:46 +00:00
silby	2be73222cb	Temporary feature: Runtime tuneable tcp initial sequence number generation scheme. Users may now select between the currently used OpenBSD algorithm and the older random positive increment method. While the OpenBSD algorithm is more secure, it also breaks TIME_WAIT handling; this is causing trouble for an increasing number of folks. To switch between generation schemes, one sets the sysctl net.inet.tcp.tcp_seq_genscheme. 0 = random positive increments, 1 = the OpenBSD algorithm. 1 is still the default. Once a secure _and_ compatible algorithm is implemented, this sysctl will be removed. Reviewed by: jlemon Tested by: numerous subscribers of -net	2001-07-08 02:20:47 +00:00
brooks	e7b9bc714f	gif(4) and stf(4) modernization: - Remove gif dependencies from stf. - Make gif and stf into modules - Make gif cloneable. PR: kern/27983 Reviewed by: ru, ume Obtained from: NetBSD MFC after: 1 week	2001-07-02 21:02:09 +00:00
cjc	a00bbf94c2	While in there fixing a fragment logging bug, fix it so we log fragments "right." Log fragment information tcpdump(8)-style, Jul 1 19:38:45 bubbles /boot/kernel/kernel: ipfw: 1000 Accept ICMP:8.0 192.168.64.60 192.168.64.20 in via ep0 (frag 53113:1480@0+) That is, instead of the old, ... Fragment = <offset/8> Do, ... (frag <IP ID>:<data len>@<offset>[+]) PR: kern/23446 Approved by: ru MFC after: 1 week	2001-07-02 15:50:31 +00:00
ru	9a1f6416f4	Backout CSRG revision 7.22 to this file (if in_losing notices an RTF_DYNAMIC route, it got freed twice). I am not sure what was the actual problem in 1992, but the current behavior is memory leak if PCB holds a reference to a dynamically created/modified routing table entry. (rt_refcnt>0 and we don't call rtfree().) My test bed was: 1. Set net.inet.tcp.msl to a low value (for test purposes), e.g., 5 seconds, to speed up the transition of TCP connection to a "closed" state. 2. Add a network route which causes ICMP redirect from the gateway. 3. ping(8) host H that matches this route; this creates RTF_DYNAMIC RTF_HOST route to H. (I was forced to use ICMP to cause gateway to generate ICMP host redirect, because gateway in question is a 4.2-STABLE system vulnerable to a problem that was fixed later in ip_icmp.c,v 1.39.2.6, and TCP packets with DF bit set were triggering this bug.) 4. telnet(1) to H 5. Block access to H with ipfw(8) 6. Send something in telnet(1) session; this causes EPERM, followed by an in_losing() call in a few seconds. 7. Delete ipfw(8) rule blocking access to H, and wait for TCP connection moving to a CLOSED state; PCB is freed. 8. Delete host route to H. 9. Watch with netstat(1) that `rttrash' increased. 10. Repeat steps 3-9, and watch `rttrash' increases. PR: kern/25421 MFC after: 2 weeks	2001-06-29 12:07:29 +00:00
ru	61d088ba8d	Fixed the brain-o in rev. 1.10: the logic check was reversed. Reported by: Bernd Fuerwitt <bf@fuerwitt.de>	2001-06-27 14:11:25 +00:00
ru	e2738b93f2	Bring in fix from NetBSD's revision 1.16: Pass the correct destination address for the route-to-gateway case. PR: kern/10607 MFC after: 2 weeks	2001-06-26 09:00:50 +00:00
dwmalone	db54f212f8	Allow getcred sysctl to work in jailed root processes. Processes can only do getcred calls for sockets which were created in the same jail. This should allow the ident to work in a reasonable way within jails. PR: 28107 Approved by: des, rwatson	2001-06-24 12:18:27 +00:00
jlemon	e071c16669	Replace bzero() of struct ip with explicit zeroing of structure members, which is faster.	2001-06-23 17:44:27 +00:00
ru	f8e11dde26	Add netstat(1) knob to reset net.inet.{ip\|icmp\|tcp\|udp\|igmp}.stats. For example, ``netstat -s -p ip -z'' will show and reset IP stats. PR: bin/17338	2001-06-23 17:17:59 +00:00
silby	f41767543e	Eliminate the allocation of a tcp template structure for each connection. The information contained in a tcptemp can be reconstructed from a tcpcb when needed. Previously, tcp templates required the allocation of one mbuf per connection. On large systems, this change should free up a large number of mbufs. Reviewed by: bmilekic, jlemon, ru MFC after: 2 weeks	2001-06-23 03:21:46 +00:00
sumikawa	845436d272	- Renumber KAME local ICMP types and NDP options numbers because they are duplicated by newly defined types/options in RFC3121 - We have no backward compatibility issue. There are no apps in our distribution which use the above types/options. Obtained from: KAME MFC after: 2 weeks	2001-06-21 07:08:43 +00:00
ume	7ffe6c47e5	made sure to use the correct sa_len for rtalloc(). sizeof(ro_dst) is not necessarily the correct one. this change would also fix the recent path MTU discovery problem for the destination of an incoming TCP connection. Submitted by: JINMEI Tatuya <jinmei@kame.net> Obtained from: KAME MFC after: 2 weeks	2001-06-20 12:32:48 +00:00
jlemon	3d3ee69a37	Do not perform arp send/resolve on an interface marked NOARP. PR: 25006 MFC after: 2 weeks	2001-06-15 21:00:32 +00:00
peter	89d8e7c754	Fix a stack of KAME netinet6/in6.h warnings: 592: warning: `struct mbuf' declared inside parameter list 595: warning: `struct ifnet' declared inside parameter list	2001-06-15 00:37:27 +00:00
ume	832f8d2249	Sync with recent KAME. This work was based on kame-20010528-freebsd43-snap.tgz and some critical problems after the snap was out were fixed. There are many many changes since last KAME merge. TODO: - The definitions of SADB_* in sys/net/pfkeyv2.h are still different from RFC2407/IANA assignment because of binary compatibility issue. It should be fixed under 5-CURRENT. - ip6po_m member of struct ip6_pktopts is no longer used. But, it is still there because of binary compatibility issue. It should be removed under 5-CURRENT. Reviewed by: itojun Obtained from: KAME MFC after: 3 weeks	2001-06-11 12:39:29 +00:00
jesper	ce21e1d449	Make the default value of net.inet.ip.maxfragpackets and net.inet6.ip6.maxfragpackets dependent on nmbclusters, defaulting to nmbclusters / 4 Reviewed by: bde MFC after: 1 week	2001-06-10 11:04:10 +00:00
peter	4b91e2ecf0	"Fix" the previous initial attempt at fixing TUNABLE_INT(). This time around, use a common function for looking up and extracting the tunables from the kernel environment. This saves duplicating the same function over and over again. This way typically has an overhead of 8 bytes + the path string, versus about 26 bytes + the path string.	2001-06-08 05:24:21 +00:00
jlemon	bd2af8830f	Move IPFilter into contrib.	2001-06-07 05:13:35 +00:00
peter	c1df44ae51	Back out part of my previous commit. This was a last minute change and I botched testing. This is a perfect example of how NOT to do this sort of thing. :-(	2001-06-07 03:17:26 +00:00
peter	0732738ec4	Make the TUNABLE_() macros look and behave more consistently like the SYSCTL_() macros. TUNABLE_INT_DECL() was an odd name because it didn't actually declare the int, which is what the name suggests it would do.	2001-06-06 22:17:08 +00:00
jesper	9d59cfc3ee	Silby's take one on increasing FreeBSD's resistance to SYN floods: One way we can reduce the amount of traffic we send in response to a SYN flood is to eliminate the RST we send when removing a connection from the listen queue. Since we are being flooded, we can assume that the majority of connections in the queue are bogus. Our RST is unwanted by these hosts, just as our SYN-ACK was. Genuine connection attempts will result in hosts responding to our SYN-ACK with an ACK packet. We will automatically return a RST response to their ACK when it gets to us if the connection has been dropped, so the early RST doesn't serve the genuine class of connections much. In summary, we can reduce the number of packets we send by a factor of two without any loss in functionality by ensuring that RST packets are not sent when dropping a connection from the listen queue. Submitted by: Mike Silbersack <silby@silby.com> Reviewed by: jesper MFC after: 2 weeks	2001-06-06 19:41:51 +00:00
brian	91bbcb8b58	Add BSD-style copyright headers Approved by: Charles Mott <cmott@scientech.com>	2001-06-04 15:09:51 +00:00
brian	5a407d2957	Change to a standard BSD-style copyright Approved by: Atsushi Murai <amurai@spec.co.jp>	2001-06-04 14:52:17 +00:00
jesper	4ff715c022	Prevent denial of service using bogus fragmented IPv4 packets. An attacker sending a lot of bogus fragmented packets to the target (with different IPv4 identification field - ip_id), may be able to put the target machine into mbuf starvation state. By setting an upper limit on the number of reassembly queues we prevent this situation. This upper limit is controlled by the new sysctl net.inet.ip.maxfragpackets which defaults to 200, as the IPv6 case, this should be sufficient for most systems, but you might want to increase it if you have lots of TCP sessions. I'm working on making the default value dependent on nmbclusters. If you want old behaviour (no upper limit) set this sysctl to a negative value. If you don't want to accept any fragments (not recommended) set the sysctl to 0 (zero). Obtained from: NetBSD MFC after: 1 week	2001-06-03 23:33:23 +00:00
kris	e1524eb20c	Add ``options RANDOM_IP_ID'' which randomizes the ID field of IP packets. This closes a minor information leak which allows a remote observer to determine the rate at which the machine is generating packets, since the default behaviour is to increment a counter for each packet sent. Reviewed by: -net Obtained from: OpenBSD	2001-06-01 10:02:28 +00:00
obrien	538a64fd6b	Back out jesper's 2001/05/31 14:58:11 PDT commit. It does not compile.	2001-06-01 09:51:14 +00:00
jesper	70faf8712a	Prevent denial of service using bogus fragmented IPv4 packets. An attacker sending a lot of bogus fragmented packets to the target (with different IPv4 identification field - ip_id), may be able to put the target machine into mbuf starvation state. By setting an upper limit on the number of reassembly queues we prevent this situation. This upper limit is controlled by the new sysctl net.inet.ip.maxfragpackets which defaults to NMBCLUSTERS/4 If you want old behaviour (no upper limit) set this sysctl to a negative value. If you don't want to accept any fragments (not recommended) set the sysctl to 0 (zero) Obtained from: NetBSD (partially) MFC after: 1 week	2001-05-31 21:57:29 +00:00
jesper	7e194a2420	Disable rfc1323 and rfc1644 TCP extensions if we haven't got any response to our third SYN to work-around some broken terminal servers (most of which have hopefully been retired) that have bad VJ header compression code which trashes TCP segments containing unknown-to-them TCP options. PR: kern/1689 Submitted by: jesper Reviewed by: wollman MFC after: 2 weeks	2001-05-31 19:24:49 +00:00
ru	f478ecd8d3	Add an integer field to keep protocol-specific flags with links. For FTP control connection, keep the CRLF end-of-line termination status in there. Fixed the bug when the first FTP command in a session was ignored. PR: 24048 MFC after: 1 week	2001-05-30 14:24:35 +00:00
jesper	aa7ec52010	Inline TCP_REASS() in the single location where it's used, just as OpenBSD and NetBSD have done. No functional difference. MFC after: 2 weeks	2001-05-29 19:54:45 +00:00
jesper	02dca88184	properly delay acks in half-closed TCP connections PR: 24962 Submitted by: Tony Finch <dot@dotat.at> MFC after: 2 weeks	2001-05-29 19:51:45 +00:00
ru	82e492f616	In in_ifadown(), differentiate between whether the interface goes down or interface address is deleted. Only delete static routes in the latter case. Reported by: Alexander Leidinger <Alexander@leidinger.net>	2001-05-11 14:37:34 +00:00
markm	bcca5847d5	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
jesper	a1fab55459	Say goodbye to TCP_COMPAT_42 Reviewed by: wollman Requested by: wollman	2001-04-20 11:58:56 +00:00
kris	0c55f2e6da	Randomize the TCP initial sequence numbers more thoroughly. Obtained from: OpenBSD Reviewed by: jesper, peter, -developers	2001-04-17 18:08:01 +00:00
darrenr	df2a765614	fix security hole created by fragment cache	2001-04-06 15:52:28 +00:00
billf	4062f7d719	pipe/queue are the only consumers of flow_id, so only set it in those cases	2001-04-06 06:52:25 +00:00
jesper	3c2e206a41	MFC candidate. Change code from PRC_UNREACH_ADMIN_PROHIB to PRC_UNREACH_PORT for ICMP_UNREACH_PROTOCOL and ICMP_UNREACH_PORT And let TCP treat PRC_UNREACH_PORT like PRC_UNREACH_ADMIN_PROHIB This should fix the case where port unreachables for udp returned ENETRESET instead of ECONNREFUSED Problem found by: Bill Fenner <fenner@research.att.com> Reviewed by: jlemon	2001-03-28 14:13:19 +00:00
ru	25ef23ac1c	MAN[1-9] -> MAN.	2001-03-27 17:27:19 +00:00
yar	b3a36066df	Add a missing m_pullup() before a mtod() in in_arpinput(). PR: kern/22177 Reviewed by: wollman	2001-03-27 12:34:58 +00:00
simokawa	37504f69c9	Replace dyn_fin_lifetime with dyn_ack_lifetime for half-closed state. Half-closed state could last long for some connections and fin_lifetime (default 20sec) is too short for that. OK'ed by: luigi	2001-03-27 05:28:30 +00:00
phk	c47745e977	Send the remains (such as I have located) of "block major numbers" to the bit-bucket.	2001-03-26 12:41:29 +00:00
brian	8636c82fbe	Make header files conform to style(9). Reviewed by (): bde () alias_local.h only got a cursory glance.	2001-03-25 12:05:10 +00:00
brian	afd190c224	Remove an extraneous declaration.	2001-03-25 03:34:29 +00:00
ume	aabe84d0cb	IPv4 address is not unsigned int. This change introduces in_addr_t. PR: 9982 Adviced by: des Reviewed by: -alpha and -net (no objection) Obtained from: OpenBSD	2001-03-23 18:59:31 +00:00
brian	cdbf8e313d	Remove (non-protected) variable names from function prototypes.	2001-03-22 11:55:26 +00:00
paul	217aacd059	Only flush rules that have a rule number above that set by a new sysctl, net.inet.ip.fw.permanent_rules. This allows you to install rules that are persistent across flushes, which is very useful if you want a default set of rules that maintains your access to remote machines while you're reconfiguring the other rules. Reviewed by: Mark Murray <markm@FreeBSD.org>	2001-03-21 08:19:31 +00:00
des	9dc769bc1b	Axe TCP_RESTRICT_RST. It was never a particularly good idea except for a few very specific scenarios, and now that we have had net.inet.tcp.blackhole for quite some time there is really no reason to use it any more. (last of three commits)	2001-03-19 22:09:00 +00:00
ru	38387221cd	Invalidate cached forwarding route (ipforward_rt) whenever a new route is added to the routing table, otherwise we may end up using the wrong route when forwarding. PR: kern/10778 Reviewed by: silence on -net	2001-03-19 09:16:16 +00:00
ru	1387428744	Make sure the cached forwarding route (ipforward_rt) is still up before using it. Not checking this may have caused the wrong IP address to be used when processing certain IP options (see example below). This also caused the wrong route to be passed to ip_output() when forwarding, but fortunately ip_output() is smart enough to detect this. This example demonstrates the wrong behavior of the Record Route option observed with this bug. Host ``freebsd'' is acting as the gateway for the ``sysv''. 1. On the gateway, we add the route to the destination. The new route will use the primary address of the loopback interface, 127.0.0.1: : freebsd# route add 10.0.0.66 -iface lo0 -reject : add host 10.0.0.66: gateway lo0 2. From the client, we ping the destination. We see the correct replies. Please note that this also causes the relevant route on the ``freebsd'' gateway to be cached in ipforward_rt variable: : sysv# ping -snv 10.0.0.66 : PING 10.0.0.66: 56 data bytes : ICMP Host Unreachable from gateway 192.168.0.115 : ICMP Host Unreachable from gateway 192.168.0.115 : ICMP Host Unreachable from gateway 192.168.0.115 : : ----10.0.0.66 PING Statistics---- : 3 packets transmitted, 0 packets received, 100% packet loss 3. On the gateway, we delete the route to the destination, thus making the destination reachable through the `default' route: : freebsd# route delete 10.0.0.66 : delete host 10.0.0.66 4. From the client, we ping destination again, now with the RR option turned on. The surprise here is the 127.0.0.1 in the first reply. This is caused by the bug in ip_rtaddr() not checking the cached route is still up before use. The debug code also shows that the wrong (down) route is further passed to ip_output(). The latter detects that the route is down, and replaces the bogus route with the valid one, so we see the correct replies (192.168.0.115) on further probes: : sysv# ping -snRv 10.0.0.66 : PING 10.0.0.66: 56 data bytes : 64 bytes from 10.0.0.66: icmp_seq=0. time=10. ms : IP options: <record route> 127.0.0.1, 10.0.0.65, 10.0.0.66, : 192.168.0.65, 192.168.0.115, 192.168.0.120, : 0.0.0.0(Current), 0.0.0.0, 0.0.0.0 : 64 bytes from 10.0.0.66: icmp_seq=1. time=0. ms : IP options: <record route> 192.168.0.115, 10.0.0.65, 10.0.0.66, : 192.168.0.65, 192.168.0.115, 192.168.0.120, : 0.0.0.0(Current), 0.0.0.0, 0.0.0.0 : 64 bytes from 10.0.0.66: icmp_seq=2. time=0. ms : IP options: <record route> 192.168.0.115, 10.0.0.65, 10.0.0.66, : 192.168.0.65, 192.168.0.115, 192.168.0.120, : 0.0.0.0(Current), 0.0.0.0, 0.0.0.0 : : ----10.0.0.66 PING Statistics---- : 3 packets transmitted, 3 packets received, 0% packet loss : round-trip (ms) min/avg/max = 0/3/10	2001-03-18 13:04:07 +00:00
phk	fa534e660d	<sys/queue.h> makeover.	2001-03-16 20:00:53 +00:00
phk	a4a639f968	Fix a style(9) nit.	2001-03-16 19:36:23 +00:00
ru	e4b7d932a1	net/route.c: A route generated from an RTF_CLONING route had the RTF_WASCLONED flag set but did not have a reference to the parent route, as documented in the rtentry(9) manpage. This prevented such routes from being deleted when their parent route is deleted. Now, for example, if you delete an IP address from a network interface, all ARP entries that were cloned from this interface route are flushed. This also has an impact on netstat(1) output. Previously, dynamically created ARP cache entries (RTF_STATIC flag is unset) were displayed as part of the routing table display (-r). Now, they are only printed if the -a option is given. netinet/in.c, netinet/in_rmx.c: When address is removed from an interface, also delete all routes that point to this interface and address. Previously, for example, if you changed the address on an interface, outgoing IP datagrams might still use the old address. The only solution was to delete and re-add some routes. (The problem is easily observed with the route(8) command.) Note, that if the socket was already bound to the local address before this address is removed, new datagrams generated from this socket will still be sent from the old address. PR: kern/20785, kern/21914 Reviewed by: wollman (the idea)	2001-03-15 14:52:12 +00:00
ru	75b400ba6b	RFC768 (UDP) requires that "if the computed checksum is zero, it is transmitted as all ones". This got broken after introduction of delayed checksums as follows. Some guys (including Jonathan) think that it is allowed to transmit all ones in place of a zero checksum for TCP the same way as for UDP. (The discussion still takes place on -net.) Thus, the 0 -> 0xffff checksum fixup was first moved from udp_output() (see udp_usrreq.c, 1.64 -> 1.65) to in_cksum_skip() (see sys/i386/i386/in_cksum.c, 1.17 -> 1.18, INVERT expression). Besides that I disagree that it is valid for TCP, there was no real problem until in_cksum.c,v 1.20, where the in_cksum() was made just a special version of in_cksum_skip(). The side effect was that now every incoming IP datagram failed to pass the checksum test (in_cksum() returned 0xffff when it should actually return zero). It was fixed next day in revision 1.21, by removing the INVERT expression. The latter also broke the 0 -> 0xffff fixup for UDP checksums. Before this change: : tcpdump: listening on lo0 : 127.0.0.1.33005 > 127.0.0.1.33006: udp 0 (ttl 64, id 1) : 4500 001c 0001 0000 4011 7cce 7f00 0001 : 7f00 0001 80ed 80ee 0008 0000 After this change: : tcpdump: listening on lo0 : 127.0.0.1.33005 > 127.0.0.1.33006: udp 0 (ttl 64, id 1) : 4500 001c 0001 0000 4011 7cce 7f00 0001 : 7f00 0001 80ed 80ee 0008 ffff	2001-03-13 17:07:06 +00:00
ru	e7537660da	Count and show incoming UDP datagrams with no checksum.	2001-03-13 13:26:06 +00:00
phk	07e97d2a86	Correctly cleanup in case of failure to bind a pcb. PR: 25751 Submitted by: <unicorn@Forest.Od.UA>	2001-03-12 21:53:23 +00:00
jlemon	9b532c7054	Unbreak LINT. Pointed out by: phk	2001-03-12 02:57:42 +00:00
iedowse	1fa96ee9e3	In ip_output(), initialise `ia' in the case where the packet has come from a dummynet pipe. Without this, the code which increments the per-ifaddr stats can dereference an uninitialised pointer. This should make dummynet usable again. Reported by: "Dmitry A. Yanko" <fm@astral.ntu-kpi.kiev.ua> Reviewed by: luigi, joe	2001-03-11 17:50:19 +00:00
ru	5639e86bdd	Make it possible to use IP_TTL and IP_TOS setsockopt(2) options on certain types of SOCK_RAW sockets. Also, use the ip.ttl MIB variable instead of MAXTTL constant as the default time-to-live value for outgoing IP packets all over the place, as we already do this for TCP and UDP. Reviewed by: wollman	2001-03-09 12:22:51 +00:00
jlemon	50bffc6c06	Push the test for a disconnected socket when accept()ing down to the protocol layer. Not all protocols behave identically. This fixes the brokenness observed with unix-domain sockets (and postfix)	2001-03-09 08:16:40 +00:00
jlemon	e8c0cc0af2	The TCP sequence number used for sending a RST with the ipfw reset rule is already in host byte order, so do not swap it again. Reviewed by: bfumerola	2001-03-09 08:13:08 +00:00
iedowse	9852c67f7c	It was possible for ip_forward() to supply to icmp_error() an IP header with ip_len in network byte order. For certain values of ip_len, this could cause icmp_error() to write beyond the end of an mbuf, causing mbuf free-list corruption. This problem was observed during generation of ICMP redirects. We now make quite sure that the copy of the IP header kept for icmp_error() is stored in a non-shared mbuf header so that it will not be modified by ip_output(). Also: - Calculate the correct number of bytes that need to be retained for icmp_error(), instead of assuming that 64 is enough (it's not). - In icmp_error(), use m_copydata instead of bcopy() to copy from the supplied mbuf chain, in case the first 8 bytes of IP payload are not stored directly after the IP header. - Sanity-check ip_len in icmp_error(), and panic if it is less than sizeof(struct ip). Incoming packets with bad ip_len values are discarded in ip_input(), so this should only be triggered by bugs in the code, not by bad packets. This patch results from code and suggestions from Ruslan, Bosko, Jonathan Lemon and Matt Dillon, with important testing by Mike Tancsa, who could reproduce this problem at will. Reported by: Mike Tancsa <mike@sentex.net> Reviewed by: ru, bmilekic, jlemon, dillon	2001-03-08 19:03:26 +00:00
truckman	7b8b7b318e	Modify the comments to more closely resemble the English language.	2001-03-05 22:40:27 +00:00
truckman	6b923e6dc3	Move the loopback net check closer to the beginning of ip_input() so that it doesn't block packets whose destination address has been translated to the loopback net by ipnat. Add warning comments about the ip_checkinterface feature.	2001-03-05 08:45:05 +00:00
bmilekic	88ef993e5e	During a flood, we don't call rtfree(), but we remove the entry ourselves. However, if the RTF_DELCLONE and RTF_WASCLONED condition passes, but the ref count is > 1, we won't decrement the count at all. This could lead to route entries never being deleted. Here, we call rtfree() not only if the initial two conditions fail, but also if the ref count is > 1 (and we therefore don't immediately delete the route, but let rtfree() handle it). This is an urgent MFC candidate. Thanks go to Mike Silbersack for the fix, once again. :-) Submitted by: Mike Silbersack <silby@silby.com>	2001-03-04 21:28:40 +00:00
truckman	e6aaaa86e7	Disable interface checking for packets subject to "ipfw fwd". Chris Johnson <cjohnson@palomine.net> tested this fix in -stable.	2001-03-04 03:22:36 +00:00
truckman	3a29c2f4df	Disable interface checking when IP forwarding is engaged so that packets addressed to the interface on the other side of the box follow their historical path. Explicitly block packets sent to the loopback network sent from the outside, which is consistent with the behavior of the forwarding path between interfaces as implemented in in_canforward(). Always check the arrival interface when matching the packet destination against the interface broadcast addresses. This bug allowed TCP connections to be made to the broadcast address of an interface on the far side of the system because the M_BCAST flag was not set because the packet was unicast to the interface on the near side. This was broken when the directed broadcast code was removed from revision 1.32. If the directed broadcast code was still present, the destination would not have been recognized as local until the packet was forwarded to the output interface and ether_output() looped a copy back to ip_input() with M_BCAST set and the receive interface set to the output interface. Optimize the order of the tests. Reviewed by: jlemon	2001-03-04 01:39:19 +00:00
jlemon	021d152d84	Add a new sysctl net.inet.ip.check_interface, which will verify that an incoming packet arrives on an interface that has an address matching the packet's address. This is turned on by default.	2001-03-02 20:54:03 +00:00
phk	78a2aff290	Fix jails.	2001-02-28 09:38:48 +00:00
jlemon	dd84ad82bf	When iterating over our list of interface addresses in order to determine if an arriving packet belongs to us, also check that the packet arrived through the correct interface. Skip this check if the packet was locally generated.	2001-02-27 19:43:14 +00:00
billf	7a0c52088d	The TCP header-specific section suffered a little bit of bitrot recently: When we receive a fragmented TCP packet (other than the first) we can't extract header information (we don't have state to reference). In a rather inelegant fashion we just move on and assume a non-match. Recent additions to the TCP header-specific section of the code neglected to add the logic to the fragment code so in those cases the match was assumed to be positive and those parts of the rule (which should have resulted in a non-match/continue) were instead skipped (which means the processing of the rule continued even though it had already not matched). Fault can be spread out over Rich Steenbergen (tcpoptions) and myself (tcp{seq,ack,win}). rwatson sent me a patch that got me thinking about this whole situation (but what I'm committing / this description is mine so don't blame him).	2001-02-27 10:20:44 +00:00
jlemon	825b685ed9	Use more aggressive retransmit timeouts for the initial SYN packet. As we currently drop the connection after 4 retransmits + 2 ICMP errors, this allows initial connection attempts to be dropped much faster.	2001-02-26 21:33:55 +00:00

1633 Commits