freebsd-skq

Author	SHA1	Message	Date
bz	1021d43b56	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
julian	1dfc5c98a4	Add code to allow the system to handle multiple routing tables. This particular implementation is designed to be fully backwards compatible and to be MFC-able to 7.x (and 6.x) Currently the only protocol that can make use of the multiple tables is IPv4 Similar functionality exists in OpenBSD and Linux. From my notes: ----- One thing where FreeBSD has been falling behind, and which by chance I have some time to work on is "policy based routing", which allows different packet streams to be routed by more than just the destination address. Constraints: ------------ I want to make some form of this available in the 6.x tree (and by extension 7.x) , but FreeBSD in general needs it so I might as well do it in -current and back port the portions I need. One of the ways that this can be done is to have the ability to instantiate multiple kernel routing tables (which I will now refer to as "Forwarding Information Bases" or "FIBs" for political correctness reasons). Which FIB a particular packet uses to make the next hop decision can be decided by a number of mechanisms. The policies these mechanisms implement are the "Policies" referred to in "Policy based routing". One of the constraints I have if I try to back port this work to 6.x is that it must be implemented as a EXTENSION to the existing ABIs in 6.x so that third party applications do not need to be recompiled in timespan of the branch. This first version will not have some of the bells and whistles that will come with later versions. It will, for example, be limited to 16 tables in the first commit. Implementation method, Compatible version. (part 1) ------------------------------- For this reason I have implemented a "sufficient subset" of a multiple routing table solution in Perforce, and back-ported it to 6.x. (also in Perforce though not always caught up with what I have done in -current/P4). The subset allows a number of FIBs to be defined at compile time (8 is sufficient for my purposes in 6.x) and implements the changes needed to allow IPV4 to use them. I have not done the changes for ipv6 simply because I do not need it, and I do not have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it. Other protocol families are left untouched and should there be users with proprietary protocol families, they should continue to work and be oblivious to the existence of the extra FIBs. To understand how this is done, one must know that the current FIB code starts everything off with a single dimensional array of pointers to FIB head structures (One per protocol family), each of which in turn points to the trie of routes available to that family. The basic change in the ABI compatible version of the change is to extent that array to be a 2 dimensional array, so that instead of protocol family X looking at rt_tables[X] for the table it needs, it looks at rt_tables[Y][X] when for all protocol families except ipv4 Y is always 0. Code that is unaware of the change always just sees the first row of the table, which of course looks just like the one dimensional array that existed before. The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign() are all maintained, but refer only to the first row of the array, so that existing callers in proprietary protocols can continue to do the "right thing". Some new entry points are added, for the exclusive use of ipv4 code called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(), which have an extra argument which refers the code to the correct row. In addition, there are some new entry points (currently called rtalloc_fib() and friends) that check the Address family being looked up and call either rtalloc() (and friends) if the protocol is not IPv4 forcing the action to row 0 or to the appropriate row if it IS IPv4 (and that info is available). These are for calling from code that is not specific to any particular protocol. The way these are implemented would change in the non ABI preserving code to be added later. One feature of the first version of the code is that for ipv4, the interface routes show up automatically on all the FIBs, so that no matter what FIB you select you always have the basic direct attached hosts available to you. (rtinit() does this automatically). You CAN delete an interface route from one FIB should you want to but by default it's there. ARP information is also available in each FIB. It's assumed that the same machine would have the same MAC address, regardless of which FIB you are using to get to it. This brings us as to how the correct FIB is selected for an outgoing IPV4 packet. Firstly, all packets have a FIB associated with them. if nothing has been done to change it, it will be FIB 0. The FIB is changed in the following ways. Packets fall into one of a number of classes. 1/ locally generated packets, coming from a socket/PCB. Such packets select a FIB from a number associated with the socket/PCB. This in turn is inherited from the process, but can be changed by a socket option. The process in turn inherits it on fork. I have written a utility call setfib that acts a bit like nice.. setfib -3 ping target.example.com # will use fib 3 for ping. It is an obvious extension to make it a property of a jail but I have not done so. It can be achieved by combining the setfib and jail commands. 2/ packets received on an interface for forwarding. By default these packets would use table 0, (or possibly a number settable in a sysctl(not yet)). but prior to routing the firewall can inspect them (see below). (possibly in the future you may be able to associate a FIB with packets received on an interface.. An ifconfig arg, but not yet.) 3/ packets inspected by a packet classifier, which can arbitrarily associate a fib with it on a packet by packet basis. A fib assigned to a packet by a packet classifier (such as ipfw) would over-ride a fib associated by a more default source. (such as cases 1 or 2). 4/ a tcp listen socket associated with a fib will generate accept sockets that are associated with that same fib. 5/ Packets generated in response to some other packet (e.g. reset or icmp packets). These should use the FIB associated with the packet being reponded to. 6/ Packets generated during encapsulation. gif, tun and other tunnel interfaces will encapsulate using the FIB that was in effect withthe proces that set up the tunnel. thus setfib 1 ifconfig gif0 [tunnel instructions] will set the fib for the tunnel to use to be fib 1. Routing messages would be associated with their process, and thus select one FIB or another. messages from the kernel would be associated with the fib they refer to and would only be received by a routing socket associated with that fib. (not yet implemented) In addition Netstat has been edited to be able to cope with the fact that the array is now 2 dimensional. (It looks in system memory using libkvm (!)). Old versions of netstat see only the first FIB. In addition two sysctls are added to give: a) the number of FIBs compiled in (active) b) the default FIB of the calling process. Early testing experience: ------------------------- Basically our (IronPort's) appliance does this functionality already using ipfw fwd but that method has some drawbacks. For example, It can't fully simulate a routing table because it can't influence the socket's choice of local address when a connect() is done. Testing during the generating of these changes has been remarkably smooth so far. Multiple tables have co-existed with no notable side effects, and packets have been routes accordingly. ipfw has grown 2 new keywords: setfib N ip from anay to any count ip from any to any fib N In pf there seems to be a requirement to be able to give symbolic names to the fibs but I do not have that capacity. I am not sure if it is required. SCTP has interestingly enough built in support for this, called VRFs in Cisco parlance. it will be interesting to see how that handles it when it suddenly actually does something. Where to next: -------------------- After committing the ABI compatible version and MFCing it, I'd like to proceed in a forward direction in -current. this will result in some roto-tilling in the routing code. Firstly: the current code's idea of having a separate tree per protocol family, all of the same format, and pointed to by the 1 dimensional array is a bit silly. Especially when one considers that there is code that makes assumptions about every protocol having the same internal structures there. Some protocols don't WANT that sort of structure. (for example the whole idea of a netmask is foreign to appletalk). This needs to be made opaque to the external code. My suggested first change is to add routing method pointers to the 'domain' structure, along with information pointing the data. instead of having an array of pointers to uniform structures, there would be an array pointing to the 'domain' structures for each protocol address domain (protocol family), and the methods this reached would be called. The methods would have an argument that gives FIB number, but the protocol would be free to ignore it. When the ABI can be changed it raises the possibilty of the addition of a fib entry into the "struct route". Currently, the structure contains the sockaddr of the desination, and the resulting fib entry. To make this work fully, one could add a fib number so that given an address and a fib, one can find the third element, the fib entry. Interaction with the ARP layer/ LL layer would need to be revisited as well. Qing Li has been working on this already. This work was sponsored by Ironport Systems/Cisco Reviewed by: several including rwatson, bz and mlair (parts each) Obtained from: Ironport systems/Cisco	2008-05-09 23:03:00 +00:00
silby	f965c7bdc4	Add FBSDID to all files in netinet so that people can more easily include file version information in bug reports. Approved by: re (kensmith)	2007-10-07 20:44:24 +00:00
rwatson	23574c8673	Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminated quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)	2007-08-06 14:26:03 +00:00
rwatson	a62dbe240a	Replace references to NET_CALLOUT_MPSAFE with CALLOUT_MPSAFE, and remove definition of NET_CALLOUT_MPSAFE, which is no longer required now that debug.mpsafenet has been removed. The once over: bz Approved by: re (kensmith)	2007-07-28 07:31:30 +00:00
rwatson	a25f94b5ae	Move universally to ANSI C function declarations, with relatively consistent style(9)-ish layout.	2007-05-10 15:58:48 +00:00
bms	74244a8ee1	Diff reduction with NetBSD; use IN_LOCAL_GROUP() to check if an address is within the locally scoped multicast range 224.0.0.0/24.	2007-03-15 08:44:22 +00:00
bms	5ab23e568d	Purge an out-of-date comment.	2007-03-04 16:32:19 +00:00
bms	84976a55f2	Style: Move declaration of subsystem mutex to where other mutexes are in this file, and use macros for dealing with it.	2007-02-28 20:02:24 +00:00
bms	2f11af3e83	Unlock a mutex which should be unlocked before returning. MFC after: 1 week	2007-02-25 14:22:03 +00:00
bms	3e83ac6653	Make IPv6 multicast forwarding dynamically loadable from a GENERIC kernel. It is built in the same module as IPv4 multicast forwarding, i.e. ip_mroute.ko, if and only if IPv6 support is enabled for loadable modules. Export IPv6 forwarding structs to userland netstat(1) via sysctl(9).	2007-02-24 11:38:47 +00:00
bms	7fe6b5282a	Use MAXTTL. Obtained from: NetBSD	2007-02-10 23:15:28 +00:00
bms	17b1800be2	If the rendezvous point for a group is not specified, do not send IGMPMSG_WHOLEPKT notifications to the userland PIM routing daemon, as an optimization to mitigate the effects of high multicast forwarding load. This is an experimental change, therefore it must be explicitly enabled by setting the sysctl/tunable net.inet.pim.squelch_wholepkt to a non-zero value. The tunable may be set from the loader or from within the kernel environment when loading ip_mroute.ko as a module. Submitted by: edrt <edrt at citiz.net> See also: http://mailman.icsi.berkeley.edu/pipermail/xorp-users/2005-June/000639.html	2007-02-10 14:48:42 +00:00
bms	42ce6b97cf	Build PIM by default as part of the IPv4 multicast forwarding path. Make PIM dynamically loadable by using encap_attach_func(). PIM may now be loaded into a GENERIC kernel. Tested with: ports/net/pimdd && tcpreplay && wireshark Reviewed by: Pavlin Radoslavov	2007-02-10 13:59:13 +00:00
bms	ecb2edb6fa	Store the cached route in vifp in the normal send_packet() case. The VIFF_TUNNEL case no longer exists, therefore this field is free to use, and its use eliminates a static data member.	2007-02-08 23:05:08 +00:00
bms	51ca9a4740	Nuke the token bucket filter code. Attempting to request rate limiting by the token bucket filter will result in EINVAL being returned. If you want to rate-limit traffic in future, use ALTQ or dummynet; this isn't a general purpose QoS engine. Preserve the now unused fields in struct vif so as to avoid having to recompile netstat(1) and other tools. Reviewed by: Pavlin Radslavov, Bill Fenner	2007-02-08 22:58:01 +00:00
bms	6fc869225c	eliminate redundant macro MC_SEND()	2007-02-07 20:36:33 +00:00
bms	61cc2fad7d	Remove support for IPIP tunnels in IPv4 multicast forwarding. XORP has never used them; with mrouted, their functionality may be replaced by explicitly configuring gif(4) instances and specifying them with the 'phyint' keyword. Bump __FreeBSD_version to 700030, and update UPDATING. A doc update is forthcoming. Discussed on: net Reviewed by: fenner MFC after: 3 months	2007-02-07 16:04:13 +00:00
rwatson	10d0d9cf47	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
rwatson	7beaaf5cd2	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
bms	686e54733a	Push removal of mrouted down to the rest of the tree.	2006-09-29 15:45:11 +00:00
bms	03e16bd83a	Fix the IPv4 multicast routing detach path. On interface detach whilst the MROUTER is running, the system would panic as described in the PR. The fix in the PR is a good start, however, the other state associated with the multicast forwarding cache has to be freed in order to avoid leaking memory and other possible panics. More care and attention is needed in this area. PR: kern/82882 MFC after: 1 week	2006-09-28 12:21:08 +00:00
bms	c18ccfca54	Initialize the new members of struct ip_moptions as a defensive programming measure. Note that whilst these members are not used by the ip_output() path, we are passing an instance of struct ip_moptions here which is declared on the stack (which could be considered a bad thing). ip_output() does not consume struct ip_moptions, but in case it does in future, declare an in_multi vector on the stack too to behave more like ip_findmoptions() does.	2006-05-18 19:51:08 +00:00
andre	2fd19a7545	In ip_mdq() compute the TV_DELTA the correct way around. PR: kern/91851 Submitted by: SAKAI Hiroaki <sakai.hiroaki-at-jp.fujitsu.com> MFC after: 3 days	2006-01-24 17:09:12 +00:00
jhb	3acb3374d9	Use %t (ptrdiff_t modifier) to print a couple of pointer differences rather than casting them to int.	2005-12-15 21:57:32 +00:00
andre	a6a209f2cc	Consolidate all IP Options handling functions into ip_options.[ch] and include ip_options.h into all files making use of IP Options functions. From ip_input.c rev 1.306: ip_dooptions(struct mbuf m, int pass) save_rte(m, option, dst) ip_srcroute(m0) ip_stripoptions(m, mopt) From ip_output.c rev 1.249: ip_insertoptions(m, opt, phlen) ip_optcopy(ip, jp) ip_pcbopts(struct inpcb inp, int optname, struct mbuf *m) No functional changes in this commit. Discussed with: rwatson Sponsored by: TCP/IP Optimization Fundraise 2005	2005-11-18 20:12:40 +00:00
ru	dcace5669d	Use sparse initializers for "struct domain" and "struct protosw", so they are easier to follow for the human being.	2005-11-09 13:29:16 +00:00
andre	0df84f5a83	Retire MT_HEADER mbuf type and change its users to use MT_DATA. Having an additional MT_HEADER mbuf type is superfluous and redundant as nothing depends on it. It only adds a layer of confusion. The distinction between header mbuf's and data mbuf's is solely done through the m->m_flags M_PKTHDR flag. Non-native code is not changed in this commit. For compatibility MT_HEADER is mapped to MT_DATA. Sponsored by: TCP/IP Optimization Fundraise 2005	2005-11-02 13:46:32 +00:00
andre	c4178ac83e	Use monotonic 'time_uptime' instead of 'time_second' as timebase for timeouts.	2005-09-19 22:31:45 +00:00
imp	d1b7fc96b0	Add back missing copyright and license statement. This is identical to the statement in ip_mroute.h, as well as being the same as what OpenBSD has done with this file. It matches the copyright in NetBSD's 1.1 through 1.14 versions of the file as well, which they subsequently added back. It appears to have been lost in the 4.4-lite1 import for FreeBSD 2.0, but where and why I've not investigated further. OpenBSD had the same problem. NetBSD had a copyright notice until Multicast 3.5 was integrated verbatim back in 1995. This appears to be the version that made it into 4.4-lite1. Approved by: re (scottl) MFC after: 3 days	2005-06-23 18:42:58 +00:00
glebius	2df73116df	Use NET_CALLOUT_MPSAFE macro.	2005-03-01 12:01:17 +00:00
rwatson	ccb2845f23	When running with debug.mpsafenet=0, initialize IP multicast routing callouts as non-CALLOUT_MPSAFE. Otherwise, they may trigger an assertion regarding Giant if they enter other parts of the stack from the callout. MFC after: 3 days Reported by: Dikshie < dikshie at ppk dot itb dot ac dot id >	2004-10-07 14:13:35 +00:00
andre	2126402238	Apply error and success logic consistently to the function netisr_queue() and its users. netisr_queue() now returns (0) on success and ERRNO on failure. At the moment ENXIO (netisr queue not functional) and ENOBUFS (netisr queue full) are supported. Previously it would return (1) on success but the return value of IF_HANDOFF() was interpreted wrongly and (0) was actually returned on success. Due to this schednetisr() was never called to kick the scheduling of the isr. However this was masked by other normal packets coming through netisr_dispatch() causing the dequeueing of waiting packets. PR: kern/70988 Found by: MOROHOSHI Akihiko <moro@remus.dti.ne.jp> MFC after: 3 days	2004-08-27 18:33:08 +00:00
csjp	657b6f650c	When a prison is given the ability to create raw sockets (when the security.jail.allow_raw_sockets sysctl MIB is set to 1) where privileged access to jails is given out, it is possible for prison root to manipulate various network parameters which effect the host environment. This commit plugs a number of security holes associated with the use of raw sockets and prisons. This commit makes the following changes: - Add a comment to rtioctl warning developers that if they add any ioctl commands, they should use super-user checks where necessary, as it is possible for PRISON root to make it this far in execution. - Add super-user checks for the execution of the SIOCGETVIFCNT and SIOCGETSGCNT IP multicast ioctl commands. - Add a super-user check to rip_ctloutput(). If the calling cred is PRISON root, make sure the socket option name is IP_HDRINCL, otherwise deny the request. Although this patch corrects a number of security problems associated with raw sockets and prisons, the warning in jail(8) should still apply, and by default we should keep the default value of security.jail.allow_raw_sockets MIB to 0 (or disabled) until we are certain that we have tracked down all the problems. Looking forward, we will probably want to eliminate the references to curthread. This may be a MFC candidate for RELENG_5. Reviewed by: rwatson Approved by: bmilekic (mentor)	2004-08-21 17:38:57 +00:00
rwatson	87aa99bbbb	White space cleanup for netinet before branch: - Trailing tab/space cleanup - Remove spurious spaces between or before tabs This change avoids touching files that Andre likely has in his working set for PFIL hooks changes for IPFW/DUMMYNET. Approved by: re (scottl) Submitted by: Xin LI <delphij@frontfree.net>	2004-08-16 18:32:07 +00:00
dwmalone	5df13d37b2	Get rid of the RANDOM_IP_ID option and make it a sysctl. NetBSD have already done this, so I have styled the patch on their work: 1) introduce a ip_newid() static inline function that checks the sysctl and then decides if it should return a sequential or random IP ID. 2) named the sysctl net.inet.ip.random_id 3) IPv6 flow IDs and fragment IDs are now always random. Flow IDs and frag IDs are significantly less common in the IPv6 world (ie. rarely generated per-packet), so there should be smaller performance concerns. The sysctl defaults to 0 (sequential IP IDs). Reviewed by: andre, silby, mlaier, ume Based on: NetBSD MFC after: 2 months	2004-08-14 15:32:40 +00:00
hsu	4b4df6655a	Fix bug with tracking the previous element in a list. Found by: edrt@citiz.net Submitted by: pavlin@icir.org	2004-08-03 02:01:44 +00:00
phk	5c95d686a1	Do a pass over all modules in the kernel and make them return EOPNOTSUPP for unknown events. A number of modules return EINVAL in this instance, and I have left those alone for now and instead taught MOD_QUIESCE to accept this as "didn't do anything".	2004-07-15 08:26:07 +00:00
rwatson	758f90deb8	Reduce the number of unnecessary unlock-relocks on socket buffer mutexes associated with performing a wakeup on the socket buffer: - When performing an sbappend*() followed by a so[rw]wakeup(), explicitly acquire the socket buffer lock and use the _locked() variants of both calls. Note that the _locked() sowakeup() versions unlock the mutex on return. This is done in uipc_send(), divert_packet(), mroute socket_send(), raw_append(), tcp_reass(), tcp_input(), and udp_append(). - When the socket buffer lock is dropped before a sowakeup(), remove the explicit unlock and use the _locked() sowakeup() variant. This is done in soisdisconnecting(), soisdisconnected() when setting the can't send/ receive flags and dropping data, and in uipc_rcvd() which adjusting back-pressure on the sockets. For UNIX domain sockets running mpsafe with a contention-intensive SMP mysql benchmark, this results in a 1.6% query rate improvement due to reduce mutex costs.	2004-06-26 19:10:39 +00:00
rwatson	93baf0b01a	When asserting non-Giant locks in the network stack, also assert Giant if debug.mpsafenet=0, as any points that require synchronization in the SMPng world also required it in the Giant-world: - inpcb locks (including IPv6) - inpcbinfo locks (including IPv6) - dummynet subsystem lock - ipfw2 subsystem lock	2004-06-24 02:01:48 +00:00
rwatson	c2888023eb	IP multicast code no longer needs to acquire Giant before appending an mbuf onto a socket buffer. This is left over from debug.mpsafenet affecting the forwarding/bridging plane only.	2004-06-20 20:10:05 +00:00
phk	f43aa0c4bc	add missing #include <sys/module.h>	2004-05-30 20:27:19 +00:00
hsu	01e90a548b	To comply with the spec, do not copy the TOS from the outer IP header to the inner IP header of the PIM Register if this is a PIM Null-Register message. Submitted by: Pavlin Radoslavov <pavlin@icir.org>	2004-03-08 07:47:27 +00:00
sam	2a76f36902	o move mutex init/destroy logic to the module load/unload hooks; otherwise they are initialized twice when the code is statically configured in the kernel because the module load method gets invoked before the user application calls ip_mrouter_init o add a mutex to synchronize the module init/done operations; this sort of was done using the value of ip_mroute but X_ip_mrouter_done sets it to NULL very early on which can lead to a race against ip_mrouter_init--using the additional mutex means this is safe now o don't call ip_mrouter_reset from ip_mrouter_init; this now happens once at module load and X_ip_mrouter_done does the appropriate cleanup work to insure the data structures are in a consistent state so that a subsequent init operation inherits good state Reviewed by: juli	2003-12-20 18:32:48 +00:00
sam	354edc9c36	the sbappendaddr call in socket_send must be protected by Giant because it can happen from an MPSAFE callout Supported by: FreeBSD Foundation	2003-11-08 22:51:18 +00:00
brooks	f1e94c6f29	Replace the if_name and if_unit members of struct ifnet with new members if_xname, if_dname, and if_dunit. if_xname is the name of the interface and if_dname/unit are the driver name and instance. This change paves the way for interface renaming and enhanced pseudo device creation and configuration symantics. Approved By: re (in principle) Reviewed By: njl, imp Tested On: i386, amd64, sparc64 Obtained From: NetBSD (if_xname)	2003-10-31 18:32:15 +00:00
sam	16414b86ee	Potential fix for races shutting down callouts when unloading the module. Previously we grabbed the mutex used by the callouts, then stopped the callout with callout_stop, but if the callout was already active and blocked by the mutex then it would continue later and reference the mutex after it was destroyed. Instead stop the callout first then lock. Supported by: FreeBSD Foundation	2003-10-29 19:15:00 +00:00
sam	841ffbf14a	o restructure initialization code so data structures are setup when loaded as a module o cleanup data structures on module unload when no application has been started (i.e. kldload, kldunload w/o mrtd) o remove extraneous unlocks immediately prior to destroying them Supported by: FreeBSD Foundation	2003-10-24 00:09:18 +00:00
sam	0960ba26a2	Add locking. Special thanks to Pavlin Radoslavov <pavlin@icir.org> for testing and fixing numerous problems. Sponsored by: FreeBSD Foundation Reviewed by: Pavlin Radoslavov <pavlin@icir.org>	2003-09-06 04:53:43 +00:00
hsu	2b41ad87c9	Remove redundant bzero. Submitted by: Pavlin Radoslavov <pavlin@icir.org>	2003-08-24 08:27:57 +00:00
hsu	54fc4fe430	* Bug fix in bw_meter_process(): the periodically processed bins of bw_meter entries were processed up to one second ahead. After an unappropriate rescheduling of some of the bw_meter entries, the upcalls weren't delivered. * pim_register_prepare() uses the appropriate sw_csum flag to call ip_fragment() so the IP checksum is computed properly. * Modify pim_register_prepare() to take care of IP packets that don't need fragmentation. * Add-back in_delayed_cksum() to encap_send(), because it seems it should be there. Submitted by: Pavlin Radoslavov <pavlin@icir.org>	2003-08-19 17:22:51 +00:00
hsu	22b74d7669	1. Basic PIM kernel support Disabled by default. To enable it, the new "options PIM" must be added to the kernel configuration file (in addition to MROUTING): options MROUTING # Multicast routing options PIM # Protocol Independent Multicast 2. Add support for advanced multicast API setup/configuration and extensibility. 3. Add support for kernel-level PIM Register encapsulation. Disabled by default. Can be enabled by the advanced multicast API. 4. Implement a mechanism for "multicast bandwidth monitoring and upcalls". Submitted by: Pavlin Radoslavov <pavlin@icir.org>	2003-08-07 18:16:59 +00:00
hsu	e3e094a1f9	* makes mfc[MFCTBLSIZ] and vif[MAXVIFS] tables accessible via sysctl: - sysctlbyname("net.inet.ip.mfctable", ...) - sysctlbyname("net.inet.ip.viftable", ...) This change is needed so netstat can use sysctlbyname() to read the data from those tables. Otherwise, in some cases "netstat -g" may fail to report the multicast forwarding information (e.g., if we run a multicast router on PicoBSD). * Bug fix: when sending IGMPMSG_WRONGVIF upcall to the multicast routing daemon, set properly "im->im_vif" to the receiving incoming interface of the packet that triggered that upcall rather than to the expected incoming interface of that packet. * Bug fix: add missing increment of counter "mrtstat.mrts_upcalls" * Few formatting nits (e.g., replace extra spaces with TABs) Submitted by: Pavlin Radoslavov <pavlin@icir.org>	2003-08-05 17:01:33 +00:00
des	567ac2b268	Introduce an M_ASSERTPKTHDR() macro which performs the very common task of asserting that an mbuf has a packet header. Use it instead of hand- rolled versions wherever applicable. Submitted by: Hiten Pandya <hiten@unixdaemons.com>	2003-04-08 14:25:47 +00:00
jlemon	04e28d5a81	Update netisr handling; Each SWI now registers its queue, and all queue drain routines are done by swi_net, which allows for better queue control at some future point. Packets may also be directly dispatched to a netisr instead of queued, this may be of interest at some installations, but currently defaults to off. Reviewed by: hsu, silby, jayanth, sam Sponsored by: DARPA, NAI Labs	2003-03-04 23:19:55 +00:00
imp	cf874b345d	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
alfred	bf8e8a6e8f	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
luigi	60e892bf31	Massive cleanup of the ip_mroute code. No functional changes, but: + the mrouting module now should behave the same as the compiled-in version (it did not before, some of the rsvp code was not loaded properly); + netinet/ip_mroute.c is now truly optional; + removed some redundant/unused code; + changed many instances of '0' to NULL and INADDR_ANY as appropriate; + removed several static variables to make the code more SMP-friendly; + fixed some minor bugs in the mrouting code (mostly, incorrect return values from functions). This commit is also a prerequisite to the addition of support for PIM, which i would like to put in before DP2 (it does not change any of the existing APIs, anyways). Note, in the process we found out that some device drivers fail to properly handle changes in IFF_ALLMULTI, leading to interesting behaviour when a multicast router is started. This bug is not corrected by this commit, and will be fixed with a separate commit. Detailed changes: -------------------- netinet/ip_mroute.c all the above. conf/files make ip_mroute.c optional net/route.c fix mrt_ioctl hook netinet/ip_input.c fix ip_mforward hook, move rsvp_input() here together with other rsvp code, and a couple of indentation fixes. netinet/ip_output.c fix ip_mforward and ip_mcast_src hooks netinet/ip_var.h rsvp function hooks netinet/raw_ip.c hooks for mrouting and rsvp functions, plus interface cleanup. netinet/ip_mroute.h remove an unused and optional field from a struct Most of the code is from Pavlin Radoslavov and the XORP project Reviewed by: sam MFC after: 1 week	2002-11-15 22:53:53 +00:00
jhb	980dea1922	Cast a ptrdiff_t to an int to printf.	2002-11-08 14:52:26 +00:00
rwatson	709dbb6b2b	When a packet is multicast encapsulated, give labeled policies the opportunity to preserve the label. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-20 21:59:00 +00:00
sam	2a86be217a	Replace aux mbufs with packet tags: o instead of a list of mbufs use a list of m_tag structures a la openbsd o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit ABI/module number cookie o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and use this in defining openbsd-compatible m_tag_find and m_tag_get routines o rewrite KAME use of aux mbufs in terms of packet tags o eliminate the most heavily used aux mbufs by adding an additional struct inpcb parameter to ip_output and ip6_output to allow the IPsec code to locate the security policy to apply to outbound packets o bump __FreeBSD_version so code can be conditionalized o fixup ipfilter's call to ip_output based on __FreeBSD_version Reviewed by: julian, luigi (silent), -arch, -net, darren Approved by: julian, silence from everyone else Obtained from: openbsd (mostly) MFC after: 1 month	2002-10-16 01:54:46 +00:00
sobomax	b749867dfc	Since from now on encap_input() also catches IPPROTO_MOBILE and IPPROTO_GRE packets in addition to IPPROTO_IPV4 and IPPROTO_IPV6, explicitly specify IPPROTO_IPV4 or IPPROTO_IPV6 instead of -1 when calling encap_attach(). MFC after: 28 days (along with other if_gre changes)	2002-09-09 09:36:47 +00:00
charnier	7dd9d47059	Replace various spelling with FALLTHROUGH which is lint()able	2002-08-25 13:23:09 +00:00
luigi	c576432afc	Just a comment on some additional consistency checks that could be added here.	2002-06-26 21:00:53 +00:00
tanimura	e6fa9b9e92	Back out my lats commit of locking down a socket, it conflicts with hsu's work. Requested by: hsu	2002-05-31 11:52:35 +00:00
tanimura	92d8381dd5	Lock down a socket, milestone 1. o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket. o Determine the lock strategy for each members in struct socket. o Lock down the following members: - so_count - so_options - so_linger - so_state o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket: - sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup() Reviewed by: alfred	2002-05-20 05:41:09 +00:00
tanimura	89ec521d91	Revert the change of #includes in sys/filedesc.h and sys/socketvar.h. Requested by: bde Since locking sigio_lock is usually followed by calling pgsigio(), move the declaration of sigio_lock and the definitions of SIGIO_*() to sys/signalvar.h. While I am here, sort include files alphabetically, where possible.	2002-04-30 01:54:54 +00:00
bde	867fc1ed1c	Fixed some style bugs in the removal of __P(()). Continuation lines were not outdented to preserve non-KNF lining up of code with parentheses. Switch to KNF formatting.	2002-03-24 10:19:10 +00:00
ru	cb4688c90e	Prevent icmp_reflect() from calling ip_output() with a NULL route pointer which will then result in the allocated route's reference count never being decremented. Just flood ping the localhost and watch refcnt of the 127.0.0.1 route with netstat(1). Submitted by: jayanth Back out ip_output.c,v 1.143 and ip_mroute.c,v 1.69 that allowed ip_output() to be called with a NULL route pointer. The previous paragraph shows why this was a bad idea in the first place. MFC after: 0 days	2002-03-22 16:45:54 +00:00
alfred	357e37e023	Remove __P.	2002-03-19 21:25:46 +00:00
mike	bcee06d42c	o Move NTOHL() and associated macros into <sys/param.h>. These are deprecated in favor of the POSIX-defined lowercase variants. o Change all occurrences of NTOHL() and associated marcros in the source tree to use the lowercase function variants. o Add missing license bits to sparc64's <machine/endian.h>. Approved by: jake o Clean up <machine/endian.h> files. o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>. o Remove prototypes for non-existent bswapXX() functions. o Include <machine/endian.h> in <arpa/inet.h> to define the POSIX-required ntohl() family of functions. o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>, and <sys/param.h>. o Prepend underscores to the ntohl() family to help deal with complexities associated with having MD (asm and inline) versions, and having to prevent exposure of these functions in other headers that happen to make use of endian-specific defines. o Create weak aliases to the canonical function name to help deal with third-party software forgetting to include an appropriate header. o Remove some now unneeded pollution from <sys/types.h>. o Add missing <arpa/inet.h> includes in userland. Tested on: alpha, i386 Reviewed by: bde, jake, tmm	2002-02-18 20:35:27 +00:00
ru	5fcff41f8a	Allow for ip_output() to be called with a NULL route pointer. This fixes a panic I introduced yesterday in ip_icmp.c,v 1.64.	2001-12-01 13:48:16 +00:00
dillon	981dfd6cd9	fix int argument used in printf w/ %ld (cast to long)	2001-10-29 02:19:19 +00:00
sumikawa	31af69645f	Fixed comment: ipip_input -> mroute_encapcheck. Reported by: bde	2001-09-20 07:59:45 +00:00
sumikawa	aa9b71c68d	Removed ipip_input(). No codes calls it anymore due to ip_encap.c's encapsulation support.	2001-09-18 14:52:20 +00:00
julian	071f86f9f1	Patches from Keiichi SHIMA <keiichi@iij.ad.jp> to make ip use the standard protosw structure again. Obtained from: Well, KAME I guess.	2001-09-03 20:03:55 +00:00
fenner	8396f6f2b1	Somewhat modernize ip_mroute.c: - Use sysctl to export stats - Use ip_encap.c's encapsulation support - Update lkm to kld (is 6 years a record for a broken module?) - Remove some unused cruft	2001-07-25 20:15:49 +00:00
kris	e1524eb20c	Add ``options RANDOM_IP_ID'' which randomizes the ID field of IP packets. This closes a minor information leak which allows a remote observer to determine the rate at which the machine is generating packets, since the default behaviour is to increment a counter for each packet sent. Reviewed by: -net Obtained from: OpenBSD	2001-06-01 10:02:28 +00:00
asmodai	2f1d3e2cdf	Fix typo: seperate -> separate. Seperate does not exist in the english language.	2001-02-06 11:21:58 +00:00
jlemon	954e1d2ccd	Lock down the network interface queues. The queue mutex must be obtained before adding/removing packets from the queue. Also, the if_obytes and if_omcasts fields should only be manipulated under protection of the mutex. IF_ENQUEUE, IF_PREPEND, and IF_DEQUEUE perform all necessary locking on the queue. An IF_LOCK macro is provided, as well as the old (mutex-less) versions of the macros in the form _IF_ENQUEUE, _IF_QFULL, for code which needs them, but their use is discouraged. Two new macros are introduced: IF_DRAIN() to drain a queue, and IF_HANDOFF, which takes care of locking/enqueue, and also statistics updating/start if necessary.	2000-11-25 07:35:38 +00:00
kjc	0a7adf3296	change the evaluation order of the rsvp socket in rsvp_input() in favor of the new-style per-vif socket. this does not affect the behavior of the ISI rsvpd but allows another rsvp implementation (e.g., KOM rsvp) to take advantage of the new style for particular sockets while using the old style for others. in the future, rsvp supporn should be replaced by more generic router-alert support. PR: kern/20984 Submitted by: Martin Karsten <Martin.Karsten@KOM.tu-darmstadt.de> Reviewed by: kjc	2000-09-17 13:50:12 +00:00
ru	92269e49c4	Follow BSD/OS and NetBSD, keep the ip_id field in network order all the time. Requested by: wollman	2000-09-14 14:42:04 +00:00
ru	326b00612b	Fixed broken ICMP error generation, unified conversion of IP header fields between host and network byte order. The details: o icmp_error() now does not add IP header length. This fixes the problem when icmp_error() is called from ip_forward(). In this case the ip_len of the original IP datagram returned with ICMP error was wrong. o icmp_error() expects all three fields, ip_len, ip_id and ip_off in host byte order, so DTRT and convert these fields back to network byte order before sending a message. This fixes the problem described in PR 16240 and PR 20877 (ip_id field was returned in host byte order). o ip_ttl decrement operation in ip_forward() was moved down to make sure that it does not corrupt the copy of original IP datagram passed later to icmp_error(). o A copy of original IP datagram in ip_forward() was made a read-write, independent copy. This fixes the problem I first reported to Garrett Wollman and Bill Fenner and later put in audit trail of PR 16240: ip_output() (not always) converts fields of original datagram to network byte order, but because copy (mcopy) and its original (m) most likely share the same mbuf cluster, ip_output()'s manipulations on original also corrupted the copy. o ip_output() now expects all three fields, ip_len, ip_off and (what is significant) ip_id in host byte order. It was a headache for years that ip_id was handled differently. The only compatibility issue here is the raw IP socket interface with IP_HDRINCL socket option set and a non-zero ip_id field, but ip.4 manual page was unclear on whether in this case ip_id field should be in host or network byte order.	2000-09-01 12:33:03 +00:00
ken	1b9ed80d5c	Include machine/in_cksum.h to unbreak options MROUTING.	2000-05-08 23:56:30 +00:00
shin	50ba589c66	IPSEC support in the kernel. pr_input() routines prototype is also changed to support IPSEC and IPV6 chained protocol headers. Reviewed by: freebsd-arch, cvs-committers Obtained from: KAME project	1999-12-22 19:13:38 +00:00
peter	3b842d34e8	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
peter	73556bfee1	Add sufficient braces to keep egcs happy about potentially ambiguous if/else nesting.	1999-05-06 18:13:11 +00:00
fenner	c9e9dccbb7	Use dynamic memory allocation instead of mbuf's for multicast routing state. Note: this requires a recompilation of netstat (but netstat has been broken since rev 1.52 of ip_mroute.c anyway) Obtained from: Significantly based on Steve McCanne's <mccanne@cs.berkeley.edu> work for BSD/OS	1999-01-18 02:06:59 +00:00
eivind	e06f86cff9	Remove unused statics.	1999-01-12 12:16:50 +00:00
fenner	8532cc33d7	Add missing "break"s to allow multicast routing to work. Submitted by: Amancio Hasty <hasty@rah.star-gate.com>	1998-12-16 18:07:11 +00:00
archie	60d13c7a9d	The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static and local variables, goto labels, and functions declared but not defined.	1998-12-07 21:58:50 +00:00
wollman	a76fb5eefa	Yow! Completely change the way socket options are handled, eliminating another specialized mbuf type in the process. Also clean up some of the cruft surrounding IPFW, multicast routing, RSVP, and other ill-explored corners.	1998-08-23 03:07:17 +00:00
bde	08a3400100	Fixed printf format errors.	1998-08-17 01:05:25 +00:00
phk	cdd3d49d95	Byte count statistics of multicast vifs are invalid. The problem is caused by a wrong endianess in the sum. PR: 7115 Submitted by: Joao Carlos Mendes Luis <jonny@jonny.eng.br>	1998-06-30 10:56:31 +00:00
des	396b114475	Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108.	1998-04-17 22:37:19 +00:00
eivind	d7a6ab2803	Staticize.	1998-02-09 06:11:36 +00:00
eivind	4547a09753	Back out DIAGNOSTIC changes.	1998-02-06 12:14:30 +00:00
eivind	c552a9a1c3	Turn DIAGNOSTIC into a new-style option.	1998-02-04 22:34:03 +00:00
bde	fb826377ff	Removed unused #includes.	1997-10-28 15:59:26 +00:00
gibbs	a415512fd4	Update for new callout interface.	1997-09-21 22:02:25 +00:00

1 2 3 4

190 Commits