freebsd-skq

Author	SHA1	Message	Date
Bjoern A. Zeeb	55fd3bafdb	Style changes: compare pointer to NULL and move a }. MFC after: 6 weeks	2008-10-04 17:07:58 +00:00
Bjoern A. Zeeb	86d02c5c63	Cache so_cred as inp_cred in the inpcb. This means that inp_cred is always there, even after the socket has gone away. It also means that it is constant for the lifetime of the inp. Both facts lead to simpler code and possibly less locking. Suggested by: rwatson Reviewed by: rwatson MFC after: 6 weeks X-MFC Note: use a inp_pspare for inp_cred	2008-10-04 15:06:34 +00:00
Marko Zec	8b615593fc	Step 1.5 of importing the network stack virtualization infrastructure from the vimage project, as per plan established at devsummit 08/08: http://wiki.freebsd.org/Image/Notes200808DevSummit Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator macros, and CURVNET_SET() context setting macros, all currently resolving to NOPs. Prepare for virtualization of selected SYSCTL objects by introducing a family of SYSCTL_V_*() macros, currently resolving to their global counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT(). Move selected #defines from sys/sys/vimage.h to newly introduced header files specific to virtualized subsystems (sys/net/vnet.h, sys/netinet/vinet.h etc.). All the changes are verified to have zero functional impact at this point in time by doing MD5 comparison between pre- and post-change object files(*). (*) netipsec/keysock.c did not validate depending on compile time options. Implemented by: julian, bz, brooks, zec Reviewed by: julian, bz, brooks, kris, rwatson, ... Approved by: julian (mentor) Obtained from: //depot/projects/vimage-commit2/... X-MFC after: never Sponsored by: NLnet Foundation, The FreeBSD Foundation	2008-10-02 15:37:58 +00:00
Colin Percival	29a6d781af	Default to ignoring potentially evil IPv6 Neighbor Solicitation messages. Approved by: so (cperciva) Approved by: re (kensmith) Security: FreeBSD-SA-08:10.nd6 Thanks to: jinmei, bz	2008-10-02 00:32:59 +00:00
Robert Watson	4a0a13971e	When invoking the udp_send() from udp6_send() due to use of a v6-mapped IPv4 address, first drop the udbinfo and inpcb locks, which will otherwise be recursed. This leads to a potential minor race, but is preferable to a deadlock when acquiring a read lock after a write lock on the inpcb. MFC after: 3 days Reported by: Norbert Papke <fbsd-ml@scrapper.ca>, lioux	2008-09-22 06:44:03 +00:00
Bjoern A. Zeeb	9de45b2780	mld_timerresid() returns ms so instead of doing the maths in usec and then dividing down to ms, do the maths in ms. Obtained from: NetBSD mld6.c rev. 1.47 MFC after: 2 months	2008-09-10 19:42:13 +00:00
Simon L. B. Nielsen	59ca51adba	- Fix amd64 local privilege escalation. [08:07] - Fix nmount(2) local privilege escalation. [08:08] - Fix IPv6 remote kernel panics. [08:09] Fix for [08:07] is merge of r181823. Submitted by: kib [08:07], csjp [08:08], bz [08:09] Reviewed by: peter [08:07], jhb [08:07] Reviewed by: jinmei [08:09], rwatson [08:09] Approved by: re (SA blanket) Approved by: so (simon) Security: FreeBSD-SA-08:07.amd64 Security: FreeBSD-SA-08:08.nmount Security: FreeBSD-SA-08:09.icmp6	2008-09-03 19:09:47 +00:00
Bjoern A. Zeeb	bf0d5f8e16	Fix a bug, when a specially crafted ICMPV6 MLD packet could lead to an integer divide by zero panic in the kernel, if the kernel was run with hz<1000. Neither i386, pc98, amd64 or sparc64 are affected in the currently supported branches and default configuration. Submitted by: Miikka Saukko, Ossi Herrala and Jukka Taimisto from the CROSS project at Codenomicon Ltd. via CERT-FI. Reviewed by: bz, rwatson Security: CVE-2008-2464 MFC after: 8 hours	2008-09-03 08:13:58 +00:00
Robert Watson	7a0a0eecf0	In UDPv6, reduce scope of global udbinfo lock during append to last matching socket by dropping it before udp6_append(), and remove duplicate unlocks of udbinfo and inpcb in sysctl return path. MFC after: 3 days	2008-08-31 13:16:45 +00:00
Julian Elischer	5e5d5c6f17	another missed V_	2008-08-25 06:09:32 +00:00
Julian Elischer	5ed3800e41	Fix some of the formatting fixes.. It's amazing how some things stand out in a commit message.	2008-08-20 01:24:55 +00:00
Julian Elischer	ac957cd271	A bunch of formatting fixes brought to light by, or created by the Vimage commit a few days ago.	2008-08-20 01:05:56 +00:00
Bjoern A. Zeeb	f125044552	As part of step 1.5 of the vimage framework resolve conflicts with file local static globals which would be folded onto the same name with the V_ macros. Reviewed by: kris, brooks, simon	2008-08-18 13:16:19 +00:00
Bjoern A. Zeeb	603724d3ab	Commit step 1 of the vimage project, (network stack) virtualization work done by Marko Zec (zec@). This is the first in a series of commits over the course of the next few weeks. Mark all uses of global variables to be virtualized with a V_ prefix. Use macros to map them back to their global names for now, so this is a NOP change only. We hope to have caught at least 85-90% of what is needed so we do not invalidate a lot of outstanding patches again. Obtained from: //depot/projects/vimage-commit2/... Reviewed by: brooks, des, ed, mav, julian, jamie, kris, rwatson, zec, ... (various people I forgot, different versions) md5 (with a bit of help) Sponsored by: NLnet Foundation, The FreeBSD Foundation X-MFC after: never V_Commit_Message_Reviewed_By: more people than the patch	2008-08-17 23:27:27 +00:00
Bjoern A. Zeeb	48d48eb980	Fix a regression introduced in r179289 splitting up ip6_savecontrol() into v4-only vs. v6-only inp_flags processing. When ip6_savecontrol_v4() is called from ip6_savecontrol() we were not passing back the mp thus the information will be missing in userland. Instead of going with a * as suggested in the PR we are returning **mp now and passing in the v4only flag as a pointer argument. PR: kern/126349 Reviewed by: rwatson, dwmalone	2008-08-16 06:39:18 +00:00
Robert Watson	2209e8f159	Adopt the slightly weaker consistency locking approach used in IPv4 raw sockets for IPv6 raw sockets: separately lock the inpcb for determining the destination address for a connect()'d raw socket at the rip6_send() layer, and then re-acquire the inpcb lock in the rip6_output() layer to query other options on the socket. Previously, the global raw IP socket lock was used, which while correct and marginally more consistent, could add significantly to global raw IP socket lock contention. MFC after: 1 week	2008-07-30 09:26:27 +00:00
Robert Watson	ae89d5a389	When copying in and out current ICMPv6 filters on a raw IPv6 socket, lock the inpcb and use a local stack variable to copy to/from userspace so that sooptcopyin()/sooptcopyout() aren't called while holding an rwlock. While here, fix a bug in which a failed sooptcopyin() might lead to partially consistent ICMPv6 filters on the socket by not ignoring the error returned by sooptcopyin(). MFC after: 2 weeks	2008-07-29 19:37:16 +00:00
Robert Watson	2f1ff0cd80	Since we fail IPv6 raw socket allocation if inp->in6p_icmp6filt can't be allocated, there's no need to conditionize use and freeing of it later. MFC after: 1 week	2008-07-29 18:09:46 +00:00
Robert Watson	cc29ac7d22	Marginally decomplicate set/getsockopt code in ip6_output.c by simply using the passed arguments explicitly and unconditionally rather than testing them and calling panic(). The result is the same but easier to read. MFC after: 3 days	2008-07-29 09:31:03 +00:00
Alexander Motin	6c5bbf5ce1	Move inpcb lock higher to protect reads of some non-binding fields. It fixes nothing at this time, but is more correct.	2008-07-28 19:32:18 +00:00
Alexander Motin	b11e21ae80	According to in_pcb.h, protocol binding information has double locking. This allows accessing it during list traversal while holding only the global pcbinfo lock.	2008-07-27 20:30:34 +00:00
Bjoern A. Zeeb	078b704233	Pass the ucred along into in{,6}_pcblookup_local for upcoming prison checks. Reviewed by: rwatson	2008-07-10 13:31:11 +00:00
Bjoern A. Zeeb	cdcb11b92c	For consistency take lport as u_short in in{,6}_pcblookup_local. All callers either pass in an u_short or u_int16_t. Reviewed by: rwatson	2008-07-10 13:23:22 +00:00
Randall Stewart	fc14de76f4	1) Adds the rest of the VIMAGE change macros 2) Adds some __UserSpace__ on some of the common defines that the user space code needs 3) Fixes a bug when we send up data to a user that failed. We need to a) trim off the data chunk headers, if present, and b) make sure the frag bit is communicated properly for the msgs coming off the stream queues... i.e. we see if some of the msg has been taken. Obtained from: jeli contributed the VIMAGE changes on this pass Thanks Julian!	2008-07-09 16:45:30 +00:00
Bjoern A. Zeeb	a55b8b2068	Document required locking in in6_selectsrc() in case an inp is passed in by adding an assert. Requested by: rwatson Reviewed by: rwatson	2008-07-09 16:33:21 +00:00
Bjoern A. Zeeb	f2f877d38c	Change the parameters to in6_selectsrc(): - pass in the inp instead of both in6p_moptions and laddr. - pass in cred for upcoming prison checks. Reviewed by: rwatson	2008-07-08 18:41:36 +00:00
Robert Watson	963e491243	Use soreceive_dgram() and sosend_dgram() with UDPv6, as we do with UDPv4. Tested by: ps MFC after: 3 months	2008-07-08 10:15:23 +00:00
Robert Watson	65c577c01d	Drop read lock on udbinfo earlier during delivery to the last matching UDP socket for a datagram; the inpcb read lock is sufficient to provide inpcb stability during udp6_append(). MFC after: 1 month	2008-07-07 10:11:17 +00:00
Robert Watson	0ae76120da	Improve approximation of style(9) in raw socket code.	2008-07-05 18:03:39 +00:00
Robert Watson	4f7d1876d5	Introduce a new lock, hostname_mtx, and use it to synchronize access to global hostname and domainname variables. Where necessary, copy to or from a stack-local buffer before performing copyin() or copyout(). A few uses, such as in cd9660 and daemon_saver, remain under-synchronized and will require further updates. Correct a bug in which a failed copyin() of domainname would leave domainname potentially corrupted. MFC after: 3 weeks	2008-07-05 13:10:10 +00:00
Robert Watson	59dd72d040	Remove NETISR_MPSAFE, which allows specific netisr handlers to be directly dispatched without Giant, and add NETISR_FORCEQUEUE, which allows specific netisr handlers to always be dispatched via a queue (deferred). Mark the usb and if_ppp netisr handlers as NETISR_FORCEQUEUE, and explicitly acquire Giant in those handlers. Previously, any netisr handler not marked NETISR_MPSAFE would necessarily run deferred and with Giant acquired. This change removes Giant scaffolding from the netisr infrastructure, but NETISR_FORCEQUEUE allows non-MPSAFE handlers to continue to force deferred dispatch so as to avoid lock order reversals between their acquisition of Giant and any calling context. It is likely we will be able to remove NETISR_FORCEQUEUE once IFF_NEEDSGIANT is removed, as non-MPSAFE usb and if_ppp drivers will no longer be supported. Reviewed by: bz MFC after: 1 month X-MFC note: We can't remove NETISR_MPSAFE from stable/7 for KPI reasons, but the rest can go back.	2008-07-04 00:21:38 +00:00
Robert Watson	aaa37a7e4e	Remove GIANT_REQUIRED from IPv6 input, forward, and frag6 code. The frag6 code is believed to be MPSAFE, and leaving aside the IPv6 route cache in forwarding, Giant appears not to adequately synchronize the data structures in the input or forwarding paths.	2008-07-03 10:55:13 +00:00
Robert Watson	0a2fe17365	Set the IPv6 netisr handler as NETISR_MPSAFE on the basis that, despite there still being some well-known races in mld6 and nd6, running with Giant over the netisr handler provides little or no additional synchronization that might cause mld6 and nd6 to behave better.	2008-07-02 23:12:40 +00:00
Bjoern A. Zeeb	2d8bba43bd	Try to fix errors introduced in svn180085/cvs rev. 1.10: * Include ip6_var.h for ip6stat. * Use the correct name under ip6stat: `ip6s_cantforward' instead of its IPv4 counterpart. MFC after: 10 days	2008-06-29 07:34:21 +00:00
Alexander Kabaev	2ce7b410dc	Repair botched variable rename. Pointy hat to: julian	2008-06-29 04:33:45 +00:00
Julian Elischer	b3fb530c76	Oops, we've been incrementing the wrong cantforward variable. Obtained from: vimage tree	2008-06-29 00:25:16 +00:00
Julian Elischer	5f9a5768d2	Rename two vars so that they are different from the same vars in ipv4. They are static so it was not a problem 'per se' but it was confusing to the reader. Obtained from: vimage tree	2008-06-29 00:17:45 +00:00
Randall Stewart	b3f1ea41fd	- Macro-izes the packed declaration in all headers. - Vimage prep - these are major restructures to move all global variables to be accessed via a macro or two. The variables all go into a single structure. - Asconf address addition tweaks (add_or_del Interfaces) - Fix rwnd calculation to be more conservative. - Support SACK_IMMEDIATE flag to skip delayed sack by demand of peer. - Comment updates in the sack mapping calculations - Invariants panic added. - Pre-support for UDP tunneling (we can do this on MAC but will need added support from UDP to get a "pipe" of UDP packets in). - clear trace buffer sysctl added when local tracing on. Note the majority of this huge patch is all the vimage prep stuff :-)	2008-06-14 07:58:05 +00:00
Robert Watson	9622e84fcf	Employ read locks on UDP inpcbs, rather than write locks, when monitoring UDP connections using sysctls. In some cases, add previously missing locking of inpcbs, as inp_socket is followed, which also allows us to drop global locks more quickly. MFC after: 1 week	2008-05-29 08:27:14 +00:00
Bjoern A. Zeeb	9a38ba8101	Factor out the v4-only vs. the v6-only inp_flags processing in ip6_savecontrol in preparation for udp_append() to no longer need an WLOCK as we will no longer be modifying socket options. Requested by: rwatson Reviewed by: gnn MFC after: 10 days	2008-05-24 15:20:48 +00:00
Randall Stewart	c54a18d26b	- Adds support for the multi-asconf (From Kozuka-san) - Adds some prepwork (Not all yet) for vimage in particular support the delete the sctppcbinfo.xx structs. There is still a leak in here if it were to be called plus we still need the regrouping (From Me and Michael Tuexen) - Adds support for UDP tunneling. For BSD there is no socket yet setup so it's disabled, but major argument changes are in here to encompass the passing of the port number (zero when you don't have a udp tunnel, the default for BSD). Will add some hooks in UDP here shortly (discussed with Robert) that will allow easy tunneling. (Mainly from Peter Lei and Michael Tuexen with some BSD work from me :-D) - Some ease for windows, evidently leave is reserved by their compiler, so move label leave: -> out: MFC after: 1 week	2008-05-20 13:47:46 +00:00
Julian Elischer	8b07e49a00	Add code to allow the system to handle multiple routing tables. This particular implementation is designed to be fully backwards compatible and to be MFC-able to 7.x (and 6.x) Currently the only protocol that can make use of the multiple tables is IPv4 Similar functionality exists in OpenBSD and Linux. From my notes: ----- One thing where FreeBSD has been falling behind, and which by chance I have some time to work on is "policy based routing", which allows different packet streams to be routed by more than just the destination address. Constraints: ------------ I want to make some form of this available in the 6.x tree (and by extension 7.x) , but FreeBSD in general needs it so I might as well do it in -current and back port the portions I need. One of the ways that this can be done is to have the ability to instantiate multiple kernel routing tables (which I will now refer to as "Forwarding Information Bases" or "FIBs" for political correctness reasons). Which FIB a particular packet uses to make the next hop decision can be decided by a number of mechanisms. The policies these mechanisms implement are the "Policies" referred to in "Policy based routing". One of the constraints I have if I try to back port this work to 6.x is that it must be implemented as a EXTENSION to the existing ABIs in 6.x so that third party applications do not need to be recompiled in timespan of the branch. This first version will not have some of the bells and whistles that will come with later versions. It will, for example, be limited to 16 tables in the first commit. Implementation method, Compatible version. (part 1) ------------------------------- For this reason I have implemented a "sufficient subset" of a multiple routing table solution in Perforce, and back-ported it to 6.x. (also in Perforce though not always caught up with what I have done in -current/P4). 
The subset allows a number of FIBs to be defined at compile time (8 is sufficient for my purposes in 6.x) and implements the changes needed to allow IPV4 to use them. I have not done the changes for ipv6 simply because I do not need it, and I do not have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it. Other protocol families are left untouched and should there be users with proprietary protocol families, they should continue to work and be oblivious to the existence of the extra FIBs. To understand how this is done, one must know that the current FIB code starts everything off with a single dimensional array of pointers to FIB head structures (One per protocol family), each of which in turn points to the trie of routes available to that family. The basic change in the ABI compatible version of the change is to extend that array to be a 2 dimensional array, so that instead of protocol family X looking at rt_tables[X] for the table it needs, it looks at rt_tables[Y][X] when for all protocol families except ipv4 Y is always 0. Code that is unaware of the change always just sees the first row of the table, which of course looks just like the one dimensional array that existed before. The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign() are all maintained, but refer only to the first row of the array, so that existing callers in proprietary protocols can continue to do the "right thing". Some new entry points are added, for the exclusive use of ipv4 code called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(), which have an extra argument which refers the code to the correct row. In addition, there are some new entry points (currently called rtalloc_fib() and friends) that check the Address family being looked up and call either rtalloc() (and friends) if the protocol is not IPv4 forcing the action to row 0 or to the appropriate row if it IS IPv4 (and that info is available). 
These are for calling from code that is not specific to any particular protocol. The way these are implemented would change in the non ABI preserving code to be added later. One feature of the first version of the code is that for ipv4, the interface routes show up automatically on all the FIBs, so that no matter what FIB you select you always have the basic direct attached hosts available to you. (rtinit() does this automatically). You CAN delete an interface route from one FIB should you want to but by default it's there. ARP information is also available in each FIB. It's assumed that the same machine would have the same MAC address, regardless of which FIB you are using to get to it. This brings us to how the correct FIB is selected for an outgoing IPV4 packet. Firstly, all packets have a FIB associated with them. If nothing has been done to change it, it will be FIB 0. The FIB is changed in the following ways. Packets fall into one of a number of classes. 1/ locally generated packets, coming from a socket/PCB. Such packets select a FIB from a number associated with the socket/PCB. This in turn is inherited from the process, but can be changed by a socket option. The process in turn inherits it on fork. I have written a utility called setfib that acts a bit like nice.. setfib -3 ping target.example.com # will use fib 3 for ping. It is an obvious extension to make it a property of a jail but I have not done so. It can be achieved by combining the setfib and jail commands. 2/ packets received on an interface for forwarding. By default these packets would use table 0, (or possibly a number settable in a sysctl(not yet)). but prior to routing the firewall can inspect them (see below). (possibly in the future you may be able to associate a FIB with packets received on an interface.. An ifconfig arg, but not yet.) 3/ packets inspected by a packet classifier, which can arbitrarily associate a fib with it on a packet by packet basis. 
A fib assigned to a packet by a packet classifier (such as ipfw) would over-ride a fib associated by a more default source. (such as cases 1 or 2). 4/ a tcp listen socket associated with a fib will generate accept sockets that are associated with that same fib. 5/ Packets generated in response to some other packet (e.g. reset or icmp packets). These should use the FIB associated with the packet being responded to. 6/ Packets generated during encapsulation. gif, tun and other tunnel interfaces will encapsulate using the FIB that was in effect with the process that set up the tunnel. thus setfib 1 ifconfig gif0 [tunnel instructions] will set the fib for the tunnel to use to be fib 1. Routing messages would be associated with their process, and thus select one FIB or another. messages from the kernel would be associated with the fib they refer to and would only be received by a routing socket associated with that fib. (not yet implemented) In addition Netstat has been edited to be able to cope with the fact that the array is now 2 dimensional. (It looks in system memory using libkvm (!)). Old versions of netstat see only the first FIB. In addition two sysctls are added to give: a) the number of FIBs compiled in (active) b) the default FIB of the calling process. Early testing experience: ------------------------- Basically our (IronPort's) appliance does this functionality already using ipfw fwd but that method has some drawbacks. For example, It can't fully simulate a routing table because it can't influence the socket's choice of local address when a connect() is done. Testing during the generating of these changes has been remarkably smooth so far. Multiple tables have co-existed with no notable side effects, and packets have been routed accordingly. ipfw has grown 2 new keywords: setfib N ip from any to any count ip from any to any fib N In pf there seems to be a requirement to be able to give symbolic names to the fibs but I do not have that capacity. 
I am not sure if it is required. SCTP has interestingly enough built in support for this, called VRFs in Cisco parlance. It will be interesting to see how that handles it when it suddenly actually does something. Where to next: -------------------- After committing the ABI compatible version and MFCing it, I'd like to proceed in a forward direction in -current. This will result in some roto-tilling in the routing code. Firstly: the current code's idea of having a separate tree per protocol family, all of the same format, and pointed to by the 1 dimensional array is a bit silly. Especially when one considers that there is code that makes assumptions about every protocol having the same internal structures there. Some protocols don't WANT that sort of structure. (for example the whole idea of a netmask is foreign to appletalk). This needs to be made opaque to the external code. My suggested first change is to add routing method pointers to the 'domain' structure, along with information pointing the data. Instead of having an array of pointers to uniform structures, there would be an array pointing to the 'domain' structures for each protocol address domain (protocol family), and the methods this reached would be called. The methods would have an argument that gives FIB number, but the protocol would be free to ignore it. When the ABI can be changed it raises the possibility of the addition of a fib entry into the "struct route". Currently, the structure contains the sockaddr of the destination, and the resulting fib entry. To make this work fully, one could add a fib number so that given an address and a fib, one can find the third element, the fib entry. Interaction with the ARP layer/ LL layer would need to be revisited as well. Qing Li has been working on this already. This work was sponsored by Ironport Systems/Cisco Reviewed by: several including rwatson, bz and mlair (parts each) Obtained from: Ironport systems/Cisco	2008-05-09 23:03:00 +00:00
Robert Watson	c7bc5dc1f5	Acquire a read lock, rather than a write lock, on a UDPv6 inpcb when delivering to the socket or extracting socket details for monitoring purposes. MFC after: 3 months	2008-04-22 12:20:33 +00:00
Robert Watson	bb145f600c	In ICMPv6, read lock rather than write lock the inpcb on receive. MFC after: 3 months	2008-04-21 12:08:40 +00:00
Robert Watson	9ad11dd8a4	With IPv4 raw sockets, read lock rather than write lock the inpcb when receiving or transmitting. With IPv6 raw sockets, read lock rather than write lock the inpcb when receiving. Unfortunately, IPv6 source address selection appears to require a write lock on the inpcb for the time being. MFC after: 3 months	2008-04-21 12:06:41 +00:00
Robert Watson	8328afb791	When querying a local or remote address on an IPv6 socket, use only a read lock on the inpcb. MFC after: 3 months	2008-04-19 14:36:19 +00:00
Robert Watson	8501a69cc9	Convert pcbinfo and inpcb mutexes to rwlocks, and modify macros to explicitly select write locking for all use of the inpcb mutex. Update some pcbinfo lock assertions to assert locked rather than write-locked, although in practice almost all uses of the pcbinfo rwlock remain exclusive, and all instances of inpcb lock acquisition are exclusive. This change should introduce (ideally) little functional change. However, it lays the groundwork for significantly increased parallelism in the TCP/IP code. MFC after: 3 months Tested by: kris (superset of committed patch)	2008-04-17 21:38:18 +00:00
Randall Stewart	276ca5012c	- Have SCTP use the new pru_flush functionality PR: 122710 MFC after: 1 week	2008-04-14 18:12:37 +00:00
Qing Li	e440aed958	This patch provides the back end support for equal-cost multi-path (ECMP) for both IPv4 and IPv6. Previously, multipath route insertion was disallowed. For example, route add -net 192.103.54.0/24 10.9.44.1 route add -net 192.103.54.0/24 10.9.44.2 The second route insertion will trigger an error message of "add net 192.103.54.0/24: gateway 10.2.5.2: route already in table" Multiple default routes can also be inserted. Here is the netstat output: default 10.2.5.1 UGS 0 3074 bge0 => default 10.2.5.2 UGS 0 0 bge0 When multipath routes exist, the "route delete" command requires a specific gateway to be specified or else an error message would be displayed. For example, route delete default would fail and trigger the following error message: "route: writing to routing socket: No such process" "delete net default: not in table" On the other hand, route delete default 10.2.5.2 would be successful: "delete net default: gateway 10.2.5.2" One does not have to specify a gateway if there is only a single route for a particular destination. I need to perform more testing on address aliases and multiple interfaces that have the same IP prefixes. This patch as it stands today is not yet ready for prime time. Therefore, the ECMP code fragments are fully guarded by the RADIX_MPATH macro. Include the "options RADIX_MPATH" in the kernel configuration to enable this feature. Reviewed by: robert, sam, gnn, julian, kmacy	2008-04-13 05:45:14 +00:00
Robert Watson	f457d58098	In in_pcbnotifyall() and in6_pcbnotify(), use LIST_FOREACH_SAFE() and eliminate unnecessary local variable caching of the list head pointer, making the code a bit easier to read. MFC after: 3 weeks	2008-04-06 21:20:56 +00:00
Ruslan Ermilov	ea26d58729	Replaced the misleading uses of a historical artefact M_TRYWAIT with M_WAIT. Removed dead code that assumed that M_TRYWAIT can return NULL; it's not true since the advent of MBUMA. Reviewed by: arch There are ongoing disputes as to whether we want to switch to directly using UMA flags M_WAITOK/M_NOWAIT for mbuf(9) allocation.	2008-03-25 09:39:02 +00:00
Bjoern A. Zeeb	9e3bdede0f	Correct IPsec behaviour with a 'use' level in SP but no SA available. In that case return and continue processing the packet without IPsec. PR: 121384 MFC after: 5 days Reported by: Cyrus Rahman (crahman gmail.com) Tested by: Cyrus Rahman (crahman gmail.com) [slightly older version]	2008-03-14 16:38:11 +00:00
Bjoern A. Zeeb	8cfbd2995b	Correct reference counting on the SP for outgoing IPv6 IPsec connections. PR: 121374 Reported by: Cyrus Rahman (crahman gmail.com) Tested by: Cyrus Rahman (crahman gmail.com) MFC after: 5 days	2008-03-14 11:55:04 +00:00
Bjoern A. Zeeb	39d8cf90cb	#if 0 out a currently unused (and incomplete) function: ip6_ipsec_mtu(). No need to compile 'dead' code. I am leaving it in because we will have to review the concept and should use the common function in various places. MFC after: 5 days	2008-03-14 11:44:30 +00:00
Bjoern A. Zeeb	41aa71dd3e	Replace the function name in two identical printfs by __func__, __LINE__ so we can distinguish them when people report a problem. PR: 121373 MFC after: 5 days	2008-03-14 11:09:11 +00:00
Bjoern A. Zeeb	c26fe973a3	Rather than passing around a cached 'priv', pass in an ucred to ipsec_set_policy and do the privilege check only if needed. Try to assimilate both ip_ctloutput code blocks calling ipsec*_set_policy. Reviewed by: rwatson	2008-02-02 14:11:31 +00:00
Bjoern A. Zeeb	79ba395267	Replace the last susers calls in netinet6/ with privilege checks. Introduce a new privilege allowing to set certain IP header options (hop-by-hop, routing headers). Leave a few comments to be addressed later. Reviewed by: rwatson (older version, before addressing his comments)	2008-01-24 08:25:59 +00:00
Bjoern A. Zeeb	ab569b9c05	Correct the commented out debugging printf()s in REPLACE and NEXT macros. ip6_sprintf() needs a buffer as first argument these days. MFC after: 2 weeks	2008-01-20 10:08:15 +00:00
David E. O'Brien	9233d8f3ad	un-__P()	2008-01-08 19:08:58 +00:00
Robert Watson	8b953b3f9d	Fix leaking MAC labels for IPv6 inpcbs by adding missing MAC label destroy call; this transpired because the inpcb alloc path for IPv4/IPv6 is the same code, but IPv6 has a separate free path. The result was that as new IPv6 TCP connections were created, kernel memory would gradually leak. MFC after: 3 days Reported by: tanyong <tanyong at ercist dot iscas dot ac dot cn>, zhouzhouyi	2007-12-17 17:20:57 +00:00
David E. O'Brien	b48287a32a	Clean up VCS Ids.	2007-12-10 16:03:40 +00:00
Julian Elischer	dbec798a76	Remove more dup'd code MFC After: 1 week	2007-12-06 22:48:24 +00:00
Julian Elischer	90b3552e6e	remove duped code Reviewed By: gnn MFC after: 1 week	2007-12-06 22:44:24 +00:00
Mike Makonnen	016fb9d9c7	Instead of manually freeing the packet options structure (and not even doing a good job of it) in the copypktopts() function, just call ip6_clearpktopts() directly. Otherwise, the callers of this function would end up freeing the memory twice. Reviewed by: jinmei PR: kern/116360	2007-11-21 16:01:42 +00:00
Robert Watson	b9b0dac33b	Move towards more explicit support for various network protocol stacks in the TrustedBSD MAC Framework: - Add mac_atalk.c and add explicit entry point mac_netatalk_aarp_send() for AARP packet labeling, rather than using a generic link layer entry point. - Add mac_inet6.c and add explicit entry point mac_netinet6_nd6_send() for ND6 packet labeling, rather than using a generic link layer entry point. - Add explicit entry point mac_netinet_arp_send() for ARP packet labeling, and mac_netinet_igmp_send() for IGMP packet labeling, rather than using a generic link layer entry point. - Remove previous generic link layer entry point, mac_mbuf_create_linklayer() as it is no longer used. - Add implementations of new entry points to various policies, largely by replicating the existing link layer entry point for them; remove old link layer entry point implementation. - Make MAC_IFNET_LOCK(), MAC_IFNET_UNLOCK(), and mac_ifnet_mtx global to the MAC Framework rather than static to mac_net.c as it is now needed outside of mac_net.c. Obtained from: TrustedBSD Project	2007-10-28 15:55:23 +00:00
Robert Watson	8640764682	Rename 'mac_mbuf_create_from_firewall' to 'mac_netinet_firewall_send' as we move towards netinet as a pseudo-object for the MAC Framework. Rename 'mac_create_mbuf_linklayer' to 'mac_mbuf_create_linklayer' to reflect general object-first ordering preference. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer	2007-10-26 13:18:38 +00:00
Robert Watson	30d239bc4c	Merge first in a series of TrustedBSD MAC Framework KPI changes from Mac OS X Leopard--rationalize naming for entry points to the following general forms: mac_<object>_<method/action> mac_<object>_check_<method/action> The previous naming scheme was inconsistent and mostly reversed from the new scheme. Also, make object types more consistent and remove spaces from object types that contain multiple parts ("posix_sem" -> "posixsem") to make mechanical parsing easier. Introduce a new "netinet" object type for certain IPv4/IPv6-related methods. Also simplify, slightly, some entry point names. All MAC policy modules will need to be recompiled, and modules not updated as part of this commit will need to be modified to conform to the new KPI. Sponsored by: SPARTA (original patches against Mac OS X) Obtained from: TrustedBSD Project, Apple Computer	2007-10-24 19:04:04 +00:00
John Baldwin	21b415b212	Close a race when trying to lookup a gateway route in rt_check(). Specifically, if two threads were doing concurrent lookups and the existing gateway was marked down, the first thread would drop a reference on the gateway route and then unlock the "root" route while it tried to allocate a new route. The second thread could then also drop a reference on the same gateway route resulting in a reference underflow. Fix this by clearing the gateway route pointer after dropping the reference count but before dropping the lock. Secondly, in this same case, the second thread would overwrite the gateway route pointer w/o free'ing a reference to the route installed by the first thread. In practice this would probably just fix a lost reference that would result in a route never being freed. This fixes panics observed in rt_check() and rtexpunge(). MFC after: 1 week PR: kern/112490 Insight from: mehuljv at yahoo.com Reviewed by: ru (found the "not-setting it to NULL" part) Tested by: several	2007-10-22 19:01:26 +00:00
Randall Stewart	04ee05e815	- Incorrect error EAGAIN returned for invalid send on a locked stream (using EEOR mode). Changed to EINVAL (in sctp_output.c) - Static analysis comments added - fix in mobility code to return a value (static analysis found). - sctp6_notify function made visible instead of static (this is needed for Panda). Approved by: re@freebsd.org (B Mah)	2007-09-13 10:36:43 +00:00
Randall Stewart	851b7298b3	- send call has a reference to uio->uio_resid in the recent send code, but uio may be NULL on sendfile calls. Change to use sndlen variable. - EMSGSIZE is not being returned in non-blocking mode and needs a small tweak to look if the msg would ever fit when returning EWOULDBLOCK. - FWD-TSN has a bug in stream processing which could cause a panic. This is a follow on to the codenomicon fix. - PDAPI level 1 and 2 do not work unless the reader gets his returned buffer full. Fix so we can break out when at level 1 or 2. - Fix fast-handoff features to copy across properly on accepted sockets - Fix sctp_peeloff() system call when no true system call exists to screen arguments for errors. In cases where a real system call exists the system call itself does this. - Fix raddr leak in recent add-ip code change for bundled asconfs (even when non-bundled asconfs are received) - Make sure ipi_addr lock is held when walking global addr list. Need to change this lock type to a rwlock(). - Add don't wake flag on both input and output when the socket is closing. - When deleting an address verify the interface is correct before allowing the delete to process. This protects panda and unnumbered. - Clean up old sysctl stuff and get rid of the old Open/Net BSD structures. - Add a function to watch the ranges in the sysctl sets. - When appending in the reassembly queue, validate that the assoc has not gone to about to be freed. If so (in the middle) abort out. Note this especially effects MAC I think due to the lock/unlock they do (or with LOCK testing in place). - Netstat patch to get rid of warnings. - Make sure that no data gets queued to inactive/unconfirmed destinations. This especially effect CMT but also makes a impact on regular SCTP as well. - During init collision when we detect seq number out of sync we need to treat it like Case C and discard the cookie (no invarient needed here). - Atomic access to the random store. 
- When we declare a vtag good, we need to shove it into the time wait hash to prevent further use. When the tag is put into the assoc hash, we need to remove it from the twait hash (where it will surely be). This prevents duplicate tag assignments. - Move decr-ref count to better protect sysctl out of data. - ltrace error corrections in sctp6_usrreq.c - Add hook for interface up/down to be sent to us. - Make sysctl() exported structures independent of processor architecture. - Fix route and src addr cache clearing for delete address case. - Make sure address marked SCTP_DEL_IP_ADDRESS is never selected as src addr. - in icmp handling fixed so we actually look at the icmp codes to figure out what to do. - Modified mobility code. Reception of DELETE IP ADDRESS for a primary destination and SET PRIMARY for a new primary destination is used for retransmission trigger to the new primary destination. Also, in this case, destination of chunks in send_queue are changed to the new primary destination. - Fix so that we disallow sending by mbuf to ever have EEOR mode set upon it. Approved by: re@freebsd.org (B Mah)	2007-09-08 17:48:46 +00:00
Randall Stewart	ceaad40ae7	- Locking compatibility changes. This involves adding additional flags to many function calls. The flags only get used in BSD when we compile with lock testing. These flags allow apple to escape the "giant" lock it holds on the socket and have more fine-grained locking in the NKE. It also allows us to test (with witness) the locking used by apple via a compile switch (manually applied). Approved by: re@freebsd.org(B Mah)	2007-09-08 11:35:11 +00:00
Robert Watson	ce4d8529e3	Continue UDP/UDPv6 synchronization project: - Fix copyrights, comments in UDPv6. - Remove macro defines for in6pcb and udp6stat. - Consistently refer to inpcbs as 'inp' and not also 'in6p'. Reviewed by: gnn, jinmei, bz Approved by: re (bmah)	2007-09-08 08:18:24 +00:00
Randall Stewart	2afb3e849f	- During shutdown pending, when the last sack came in and the last message on the send stream was "null" but still there, a state we allow, we could get hung and not clean it up and wait for the shutdown guard timer to clear the association without a graceful close. Fix this so that that we properly clean up. - Added support for Multiple ASCONF per new RFC. We only (so far) accept input of these and cannot yet generate a multi-asconf. - Sysctl'd support for experimental Fast Handover feature. Always disabled unless sysctl or socket option changes to enable. - Error case in add-ip where the peer supports AUTH and ADD-IP but does NOT require AUTH of ASCONF/ASCONF-ACK. We need to ABORT in this case. - According to the Kyoto summit of socket api developers (Solaris, Linux, BSD). We need to have: o non-eeor mode messages be atomic - Fixed o Allow implicit setup of an assoc in 1-2-1 model if using the sctp_**() send calls - Fixed o Get rid of HAVE_XXX declarations - Done o add a sctp_pr_policy in hole in sndrcvinfo structure - Done o add a PR_SCTP_POLICY_VALID type flag - yet to-do in a future patch! - Optimize sctp6 calls to reuse code in sctp_usrreq. Also optimize when we close sending out the data and disabling Nagle. - Change key concatenation order to match the auth RFC - When sending OOTB shutdown_complete always do csum. - Don't send PKT-DROP to a PKT-DROP - For abort chunks just always checksums same for shutdown-complete. - inpcb_free front state had a bug where in queue data could wedge an assoc. We need to just abandon ones in front states (free_assoc). - If a peer sends us a 64k abort, we would try to assemble a response packet which may be larger than 64k. This then would be dropped by IP. Instead make a "minimum" size for us 64k-2k (we want at least 2k for our initack). If we receive such an init discard it early without all the processing. 
- When we peel off we must increment the tcb ref count to keep it from being freed from underneath us. - handling fwd-tsn had bugs that caused memory overwrites when given faulty data, fixed so can't happen and we also stop at the first bad stream no. - Fixed so comm-up generates the adaption indication. - peeloff did not get the hmac params copied. - fix it so we lock the addr list when doing src-addr selection (in future we need to use a multi-reader/one writer lock here) - During lowlevel output, we could end up with a _l_addr set to null if the iterator is calling the output routine. This means we would possibly crash when we gather the MTU info. Fix so we only do the gather where we have a src address cached. - we need to be sure to set abort flag on conn state when we receive an abort. - peeloff could leak a socket. Moved code so the close will find the socket if the peeloff fails (uipc_syscalls.c) Approved by: re@freebsd.org(Ken Smith)	2007-08-27 05:19:48 +00:00
Randall Stewart	c4739e2f47	- Fix address add handling to clear cached routes and source addresses when peer acks the add in case the routing table changes. - Fix sctp_lower_sosend to send shutdown chunk for mbuf send case when sndlen = 0 and sinfoflag = SCTP_EOF - Fix sctp_lower_sosend for SCTP_ABORT mbuf send case with null data, So that it does not send the "null" data mbuf out and cause it to get freed twice. - Fix so auto-asconf sysctl actually effect the socket's asconf state. - Do not allow SCTP_AUTO_ASCONF option to be used on subset bound sockets. - Memset bug in sctp_output.c (arguments were reversed) submitted found and reported by Dave Jones (davej@codemonkey.org.uk). - PD-API point needs to be invoked >= not just > to conform to socket api draft this fixes sctp_indata.c in the two places need to be >=. - move M_NOTIFICATION to use M_PROTO5. - PEER_ADDR_PARAMS did not fail properly if you specify an address that is not in the association with a valid assoc_id. This meant you got or set the stcb level values instead of the destination you thought you were going to get/set. Now validate if the stcb is non-null and the net is NULL that the sa_family is set and the address is unspecified otherwise return an error. - The thread based iterator could crash if associations were freed at the exact time it was running. rework the worker thread to use the increment/decrement to prevent this and no longer use the markers that the timer based iterator uses. - Fix the memleak in sctp_add_addr_to_vrf() for the case when it is detected that ifa is already pointing to a ifn. - Fix it so that if someone is so insane that they drop the send window below the minimal add mark, they still can send. - Changed all state for associations to use mask safe macro. - During front states in association freeing in sctp_inpcbfree, we had a locking problem where locks were not in place where they should have been. 
- Free association calls were not testing the return value in sctp_inpcb_free() properly... others should be cast void returns where we don't care about the return value. - If a reference count is held on an assoc, even from the "force free" we should not do the actual free.. but instead let the timer free it. - When we enter sctp_input(), if the SCTP_ASOC_ABOUT_TO_BE_FREED flag is set, we must NOT process the packet but handle it like ootb. This is because while freeing an assoc we release the locks to get all the higher order locks so we can purge all the hash tables. This leaves a hole if a packet comes in just at that point. Now sctp_common_input_processing() will call the ootb code in such a case. - Change MBUF M_NOTIFICATION to use M_PROTO5 (per Sam L). This makes it so we don't have a conflict (I think this is a Coverity change). We made this change AFTER some conversation and looking to make sure that M_PROTO5 does not have a problem between SCTP and the 802.11 stuff (which is the only other place it's used). - Fixed lock order reversal and missing atomic protection around locked_tcb during association lookup and the 1-2-1 model. - Added debug to source address selection. - V6 output must always do checksum even for loopback. - Remove more locks around inp that are not needed for an atomically added/subtracted ref count. - slight optimization in the way we zero the array in sctp_sack_check() - It was possible to respond to a ABORT() with bad checksum with a PKT-DROP. This led to a PKT-DROP/ABORT war. Add code to NOT send a PKT-DROP to any ABORT(). - Add an option for local logging (useful for macintosh or when you need better performance during debugging). Note no commands are here to get the log info, you must just use kgdb. - The timer code needs to be aware of if it needs to call sctp_sack_check() to slide the maps and adjust the cum-ack. This is because it may be out of sync cum-ack wise. - Added threshold management logging. 
- If the user picked just the right size, that just filled the send window minus one mtu, we would enter a forever loop not copying and at the same time not blocking. Change from < to <= solves this. - Sysctl added to control the fragment interleave level which defaults to 1. - My rwnd control was not being used to control the rwnd properly (we did not add and subtract to it :-() this is now fixed so we handle small messages (1 byte etc) better to bring our rwnd down more slowly. Approved by: re@freebsd.org (Bruce Mah)	2007-08-24 00:53:53 +00:00
Bjoern A. Zeeb	cc977adc71	Rename option IPSEC_FILTERGIF to IPSEC_FILTERTUNNEL. Also rename the related functions in a similar way. There are no functional changes. For a packet coming in with IPsec tunnel mode, the default is to only call into the firewall with the "outer" IP header and payload. With this option turned on, in addition to the "outer" parts, the "inner" IP header and payload are passed to the firewall too when going through ip_input() the second time. The option was never only related to a gif(4) tunnel within an IPsec tunnel and thus the name was very misleading. Discussed at: BSDCan 2007 Best new name suggested by: rwatson Reviewed by: rwatson Approved by: re (bmah)	2007-08-05 16:16:15 +00:00
Robert Watson	9e7a99e592	Continue effort to improve parity between UDPv4 and UDPv6: add a missing scope security check for the UDPv6 socket credential lookup service, allowing security policies to bound access to credential information. While not an immediate issue for Jail, which doesn't allow use of UDPv6, this may be relevant to other security policies that may wish to control ident lookups. While here, eliminate a very unlikely panic case, in which a socket in the process of being freed is inspected by the sysctl. Approved by: re (kensmith) Reviewed by: bz	2007-07-27 08:25:02 +00:00
Randall Stewart	1b649582bb	- take out a needless panic under invariants for sctp_output.c - Fix addrs's error checking of sctp_sendx(3) when addrcnt is less than SCTP_SMALL_IOVEC_SIZE - re-add back inpcb_bind local address check bypass capability - Fix it so sctp_opt_info is independent of assoc_id position. - Fix cookie life set to use MSEC_TO_TICKS() macro. - asconf changes o More comment changes/clarifications related to the old local address "not" list which is now an explicit restricted list. o Rename some functions for clarity: - sctp_add/del_local_addr_assoc to xxx_local_addr_restricted() - asconf related iterator functions to sctp_asconf_iterator_xxx() o Fix bug when the same address is deleted and added (and removed from the asconf queue) where the ifa is "freed" twice refcount wise, possibly freeing it completely. o Fix bug in output where the first ASCONF would not go out after the last address is changed (e.g. only goes out when retransmitted). o Fix bug where multiple ASCONFs can be bundled in the same packet with the and with the same serial numbers. o Fix asconf stcb iterator to not send ASCONF until after all work queue entries have been processed. o Change behavior so that when the last address is deleted (auto asconf on a bound all endpoint) no action is taken until an address is added; at that time, an ASCONF add+delete is sent (if the assoc is still up). o Fix local address counting so that address scoping is taken into account. o #ifdef SCTP_TIMER_BASED_ASCONF the old timer triggered sending of ASCONF (after an RTO). The default now is to send ASCONF immediately (except for the case of changing/deleting the last usable address). Approved by: re(ken smith)@freebsd.org	2007-07-24 20:06:02 +00:00
Robert Watson	8136d21ec0	Continue effort to align UDPv4 and UDPv6 implementations by merging udp6_output() from udp6_output.c to udp6_usrreq.c, matching the UDPv4 structure, and allowing us to remove udp6_output.c. Reviewed by: bz, gnn Approved by: re (bmah)	2007-07-23 07:58:58 +00:00
Randall Stewart	52be287ebb	- remove duplicate code from sctp_asconf.c - remove duplicate #include <sys/priv.h> that is not under #ifdef FreeBSD version to allow compile on 6.1 - static analysis changes per the cisco SA tool including: o some SA_IGNORE comments o some checks for NULL before unlock. o type corrections int -> size_t - Fix it so sctp_alloc_asoc takes a thread/proc argument. Without this we pass a NULL in to bind on implicit assoc setup and crash :-( Approved by: re@freebsd.org(Ken Smith)	2007-07-21 21:41:32 +00:00
Robert Watson	08af97b790	Attempt to improve feature parity between UDPv4 and UDPv6 by merging UDPv4 features to UDPv6: - Add MAC checks on delivery and MAC labeling on transmit. - Check for (and reject) datagrams with destination port 0. - For multicast delivery, check the source port only if the socket being considered as a destination has been connected. - Implement UDP blackholing based on net.inet.udp.blackhole. - Add a new ICMPv6 unreachable reply rate limiting category for failed delivery attempts and implement rate limiting for UDPv6 (submitted by bz). Approved by: re (kensmith) Reviewed by: bz	2007-07-19 22:34:25 +00:00
Bjoern A. Zeeb	8accf26fea	Restore behavior changed with rev. 1.46 and make IPV6_IPSEC_POLICY always visible again. This unbreaks some third party user space applications. PR: 114491 Reported by: sumikawa Reviewed by: sumikawa Approved by: re (hrs)	2007-07-19 09:16:40 +00:00
Randall Stewart	18e198d3a3	- added pre-checks to the bindx call. - use proper tick gathering macro instead of ticks directly. - Placed reasonable boundaries on sets that a user can do that are converted to ticks from ms. - Fix CMT_PF to always check to be sure CMT is on. - Fix ticks use of CMT_PF. - put back code to allow asconfs to be queued while INITs are in flight and before the assoc is established. - During window probes, an ack'd packet might be left with the window probe mark on it causing it to be retransmitted. Change so that the flight decrease macro clears the window_probe mark. - Additional logging flight size/reading and ASOC LOG. This is only enabled if you manually insert things into opt_sctp.h since its a set of debug code only. - Found an interesting SMP race in the way data was appended which could cause a reader to lose a part of a message, had to reorder when we marked the message was complete to after the data was appended. - bug in ADD-IP for the subset bound socket case when the peer has only one address - fix ASCONF implicit success/error handling case - proper support of jails in Freebsd 6> - copy out the timeval for the 64 bit sparc world on cookie-echo alignment error crashes without this). Approved by: re(Ken Smith)	2007-07-17 20:58:26 +00:00
Randall Stewart	b54d3a6c48	- Modular congestion control, with RFC2581 being the default. - CMT_PF states added (w/sysctl to turn the PF version on) - sctp_input.c had a missing incr of cookie case when the auth was bad. This meant a free was called without an increment to refcnt, added increment like rest of code. - There was a case, unlikely, when the scope of the destination changed (this is a TSNH case). In that case, it would not free the alloc'ed asoc (in sctp_input.c). - When listed addresses found a colliding cookie/Init, then the collided upon tcb was not unlocked in sctp_pcb.c - Add error checking on arguments of sctp_sendx(3) to prevent it from referencing a NULL pointer. - Fix an error return of sctp_sendx(3), it was returning ENOMEM not -1. - Get assoc id was changed to use the sanctified socket api method for getting a assoc id (PEER_ADDR_INFO instead of PEER_ADDR_PARAMS). - Fix it so a peeled off socket will get a proper error return if it tries to send to a different address than it is connected to. - Fix so that select_a_stream can avoid an endless loop that could hang a caller. - time_entered (state set time) was not being set in all cases to the time we went established. Approved by: re(ken smith)	2007-07-14 09:36:28 +00:00
Robert Watson	542a638396	General style, white space, and comment cleanup; move to ANSI C prototypes, don't use register, etc. Synchronize structure and layout to the IPv4 versions of these functions to a greater extent, making visual comparison easier. Remove now stale or incorrect comments. Enable full lock assertions, and correct one exception handling case where the wrong label was jumped to. Tested by: bz Approved by: re (bmah)	2007-07-09 17:47:04 +00:00
Xin LI	2a463222be	Space cleanup Approved by: re (rwatson)	2007-07-05 16:29:40 +00:00
Xin LI	1272577e22	ANSIfy[1] plus some style cleanup nearby. Discussed with: gnn, rwatson Submitted by: Karl Sjödahl - dunceor <dunceor gmail com> [1] Approved by: re (rwatson)	2007-07-05 16:23:49 +00:00
Peter Wemm	0273079097	Fix a stray splx() that caused a new warning. Approved by: re (rwatson)	2007-07-05 06:54:03 +00:00
Peter Wemm	edbb8b4600	Fix 'assignment used as truth value' warning Approved by: re (rwatson)	2007-07-05 06:27:15 +00:00
George V. Neville-Neil	d8c2182456	Remove a last, dangling, file from the Kame IPsec code. Approved by: re Spotted by: rwatson, bz	2007-07-04 01:03:48 +00:00
Max Laier	60ee384760	Link pf 4.1 to the build: - move ftp-proxy from libexec to usr.sbin - add tftp-proxy - new altq mtag link Approved by: re (kensmith)	2007-07-03 12:46:08 +00:00
George V. Neville-Neil	b2630c2934	Commit the change from FAST_IPSEC to IPSEC. The FAST_IPSEC option is now deprecated, as well as the KAME IPsec code. What was FAST_IPSEC is now IPSEC. Approved by: re Sponsored by: Secure Computing	2007-07-03 12:13:45 +00:00
George V. Neville-Neil	e66ff7fc8e	Removing old, dead, KAME IPsec files as part of the move to the new FAST_IPSEC based IPsec stack. Approved by: re Reviewed by: bz	2007-07-02 04:02:21 +00:00
George V. Neville-Neil	adb0e1681f	Follow on cleanup and removal of two unnecessary include files. Reviewed by: bz Approved by: re Supported by: Secure Computing	2007-07-01 12:31:01 +00:00
George V. Neville-Neil	2cb64cb272	Commit IPv6 support for FAST_IPSEC to the tree. This commit includes only the kernel files, the rest of the files will follow in a second commit. Reviewed by: bz Approved by: re Supported by: Secure Computing	2007-07-01 11:41:27 +00:00
Matt Jacob	0add0b912e	gcc4.2 somehow doesn't believe that finaldst can stay stable between where it's initialized and where it's checked twice such that the original destination address is saved. Make it happier and trim things down a bit.	2007-06-17 04:12:21 +00:00
Randall Stewart	e42a0f5e72	- For sctp_input/sctp6_input add announcement when a packet arrives (debug) - re-factor the packet drop in sctp_output a bit more, we don't need the trim after all, but the size calc is now corrected. - When an assoc is in the COOKIE-ECHO/COOKIE-WAIT state and the user closes, it should not matter if data is queued, the assoc should be purged. - In error leg a missing free_chunk when iph comes in NULL (should not happen but just in case).	2007-06-17 01:36:02 +00:00
Matt Jacob	37f878f56c	Garbage collect unused variables.	2007-06-15 22:56:12 +00:00
Randall Stewart	80fefe0a08	- Fix so ifn's are properly deleted when the ref count goes to 0. - Fix so VRF's will clean themselves up when no references are around. - Allow sctp_ifa to be passed into inpcb_bind, addr_mgmt_ep_sa to bypass normal validation checks. - turn auto-asconf off for subset bound sockets - Moves all logging to use KTR. This gets rid of most of the logging #ifdef's with a few exceptions reducing the number of config options for SCTP.	2007-06-14 22:59:04 +00:00
Robert Watson	c2259ba44f	Include priv.h to pick up suser(9) definitions, missed in an earlier commit. Warnings spotted by: kris	2007-06-13 22:42:43 +00:00
Bruce M Simpson	71498f308b	Import rewrite of IPv4 socket multicast layer to support source-specific and protocol-independent host mode multicast. The code is written to accommodate IPv6, IGMPv3 and MLDv2 with only a little additional work. This change only pertains to FreeBSD's use as a multicast end-station and does not concern multicast routing; for an IGMPv3/MLDv2 router implementation, consider the XORP project. The work is based on Wilbert de Graaf's IGMPv3 code drop for FreeBSD 4.6, which is available at: http://www.kloosterhof.com/wilbert/igmpv3.html Summary * IPv4 multicast socket processing is now moved out of ip_output.c into a new module, in_mcast.c. * The in_mcast.c module implements the IPv4 legacy any-source API in terms of the protocol-independent source-specific API. * Source filters are lazy allocated as the common case does not use them. They are part of per inpcb state and are covered by the inpcb lock. * struct ip_mreqn is now supported to allow applications to specify multicast joins by interface index in the legacy IPv4 any-source API. * In UDP, an incoming multicast datagram only requires that the source port matches the 4-tuple if the socket was already bound by source port. An unbound socket SHOULD be able to receive multicasts sent from an ephemeral source port. * The UDP socket multicast filter mode defaults to exclusive, that is, sources present in the per-socket list will be blocked from delivery. * The RFC 3678 userland functions have been added to libc: setsourcefilter, getsourcefilter, setipv4sourcefilter, getipv4sourcefilter. * Definitions for IGMPv3 are merged but not yet used. * struct sockaddr_storage is now referenced from <netinet/in.h>. It is therefore defined there if not already declared in the same way as for the C99 types. * The RFC 1724 hack (specify 0.0.0.0/8 addresses to IP_MULTICAST_IF which are then interpreted as interface indexes) is now deprecated. 
* A patch for the Rhyolite.com routed in the FreeBSD base system is available in the -net archives. This only affects individuals running RIPv1 or RIPv2 via point-to-point and/or unnumbered interfaces. * Make IPv6 detach path similar to IPv4's in code flow; functionally same. * Bump __FreeBSD_version to 700048; see UPDATING. This work was financially supported by another FreeBSD committer. Obtained from: p4://bms_netdev Submitted by: Wilbert de Graaf (original work) Reviewed by: rwatson (locking), silence from fenner, net@ (but with encouragement)	2007-06-12 16:24:56 +00:00
Randall Stewart	35918f8571	- Restructure so bindx functions are not done inline to socket option but are a separate call that can be re-used if needed. - 64 bit issues o re-arrange cookie so it is better 64 bit aligned o For wire level things we need the packed attribute.	2007-06-12 11:21:00 +00:00
Robert Watson	32f9753cfb	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
JINMEI Tatuya	5e9510e3b6	cleanup about the reassembly structures and routine: - removed unused structure members - fixed a minor bug that the ECN code point may not be restored correctly Approved by: ume (mentor) MFC after: 1 week	2007-06-04 06:06:35 +00:00
Randall Stewart	f4c93d2405	- fix initial pcb vrf setting when the initial vrf is not the default_vrf_id - Missing lock/unlock of inp added as well in the v6 side. - IFN hash table moves to sctppcbinfo since indexes are unique across systems (including different VRFs) this makes it easier to do ifn lookups.	2007-06-02 11:05:08 +00:00
JINMEI Tatuya	09a52a5532	fixed memory leak for IPv6 multicast membership information associated with interface addresses. Approved by: gnn (mentor) MFC after: 1 week	2007-06-02 08:02:36 +00:00
JINMEI Tatuya	99124467fc	simplified the fix in rev. 1.69 by replacing RT_REMREF+RT_UNLOCK with RTFREE_LOCKED. Approved by: gnn (mentor)	2007-06-02 07:27:02 +00:00
Randall Stewart	ad21a36485	- Take out the broken table-id concept. Panda Routers have a M-VRF concept that is NOT well thought out for a multi-homed transport protocol. So the useless table-id entries passed around need to be removed. - Add an event timer for the zero copy api. - Fix a bug in sctp_timer.c when searching for an alternate with the largest ssthresh (the compare was wrong).	2007-06-01 11:19:54 +00:00
Randall Stewart	207304d4b7	- Fixes so we won't try to start a timer when we hold a wq lock for the iterator. Panda uses a silly recursive lock they hold through the timer. - Add poor mans wireshark compile option.. - Allocate and start using SCTP_M_XXX for all SCTP_MALLOC() calls. - sysctl now will get back the refcnt for viewing by onlookers. Reviewed by: gnn	2007-05-29 09:29:03 +00:00
Randall Stewart	d61a0ae066	- fixed autoclose to not allow setting on 1-2-1 model. - bounded cookie-life to 1 second minimum in socket option set. - Delayed_ack_time becomes delayed_ack per new socket api document. - Improve port number selection, we now use low/high bounds and no chance of an endless loop. Only one call to random per bind as well. - fixes so set_peer_primary pre-screens addresses to be valid to this host. - maxseg did not allow setting on an assoc basis. We needed to thus track and use an association value instead of a inp value. - Fixed ep get of HB status to report back properly. - use settings flag to tell if assoc level hb is on off not the timer.. since the timer may still run if unconf address are present. - check for crazy ENABLE/DISABLE conditions. - set and get of pmtud (fixed path mtu) not always taking into account ovh. - Getting PMTU info on stcb only needs to return PMTUD_ENABLED if any net is doing PMTU discovery. - Panic or warning fixed to not do so when a valid ip frag is taking place. - sndrcvinfo appearing in both inp and stcb was full size, instead of the non-pad version. This saves about 92 bytes from each struct by carefully converting to use the smaller version. - one-2-one model get(maxseg) would always get ep value, never the tcb's value. - The delayed ack time could be under a tick, this fixes so it bounds it to at least 1 tick for platforms whose tick is more than a ms. - Fragment interleave level set to wrong default value. - Fragment interleave could not set level 0. - Deferred stream reset was broken due to a guard check and ntohl issue. - Found two lock order reversals and fixed. - Tighten up address checking, if the user gives an address the sa_len had better be set properly. - Get asoc by assoc-id would return a locked tcb when it was asked not to if the tcb was in the restart hash. - sysctl to dig down and get more association details Reviewed by: gnn	2007-05-28 11:17:24 +00:00
JINMEI Tatuya	6abdc89958	do not directly call rtfree() to meet an assumption in the callee. (this fix suppresses a warning message appearing in the boot time on IPv6-enabled systems) Approved by: gnn (mentor)	2007-05-25 06:44:00 +00:00
Olivier Houchard	d10f3ce07f	Force the alignment of the char arrays, as they are later cast to structs. gcc 4.2 doesn't do it by default, and that results in unaligned access on arm. Reviewed by: gnn, imp	2007-05-21 14:38:20 +00:00
JINMEI Tatuya	187069853c	- Disabled responding to NI queries from a global address by default as specified in RFC4620. A new flag for icmp6_nodeinfo was added to enable the feature. - Also cleaned up the code so that the semantics of the icmp6_nodeinfo flags is clearer (i.e., defined specific macro names instead of using hard-coded values). Approved by: gnn (mentor) MFC after: 1 week	2007-05-17 21:20:24 +00:00
Randall Stewart	3c503c28da	- Fixed 1-2-1 model to not worry about associd in sockopts - Fixed RTOinfo for bounding. - Fixed connect() to return ECONNREFUSED when an ABORT is received. - Added comments to direct Static Analysis not to look at some things it does not understand (comments are /* sa_ignore XXXXX */) - Bind when colliding was broken, missing not_found = 1 before checking to see if the port was in use caused endless bind loop. - Cookie life needs to be in milliseconds to conform to socket api. - Cookie life is not supposed to change if it's 0; on the assoc level set we changed it to 0, oops. - Two more static analysis issues identified by the cisco tool. Null checks needed. - An issue for sendfile(). Need to validate the correct input argument. - When sending failed due to a no route to host, we leaked the mbuf chain failing to call m_freem(). - Fix #ifdef issue for getting hash block len when HAVE_SHA2 is NOT defined Reviewed by: gnn	2007-05-17 12:16:24 +00:00
JINMEI Tatuya	7eefde2c0c	handle IPv6 router alert option contained in an incoming packet per option value so that unrecognized options are ignored as specified in RFC2711. (packets containing an MLD router alert option are passed to the upper layer as before). Approved by: gnn (mentor), ume (mentor)	2007-05-14 17:56:13 +00:00
Robert Watson	54d642bbe5	Reduce network stack oddness: implement .pru_sockaddr and .pru_peeraddr protocol entry points using functions named proto_getsockaddr and proto_getpeeraddr rather than proto_setsockaddr and proto_setpeeraddr. While it's true that sockaddrs are allocated and set, the net effect is to retrieve (get) the socket address or peer address from a socket, not set it, so align names to that intent.	2007-05-11 10:20:51 +00:00
Matt Jacob	b065259568	Need sys/cdefs.h for the macro FBSDID to work.	2007-05-09 23:19:55 +00:00
George V. Neville-Neil	559d3390d0	Integrate the Camellia Block Cipher. For more information see RFC 4132 and its bibliography. Submitted by: Tomoyuki Okazaki <okazaki at kick dot gr dot jp> MFC after: 1 month	2007-05-09 19:37:02 +00:00
Randall Stewart	ad81507eed	Two major items here: - All printf that was surrounded by #ifdef SCTP_DEBUG moves to a macro that does all of this. This removes all printfs from the code and makes the code more portable and easier to read. - Static Analysis (cisco) - found a few bugs, but mostly we add checks for NULL pointers and such to make the tool happy. We now pass the Cisco SA tools checks except for where it does not understand tailq/lists. We still need to look at the coverity tools output too (this is like the cisco SA tool) and see if it wants us to fix any other items. Hopefully this will be the last major churn in the code other than bug fixes.	2007-05-09 13:30:06 +00:00
George V. Neville-Neil	62c4e3f043	Reduce the default number of header options that the IPv6 protocol stack will process from 50 to 15. As this is a sysctl variable it can be tuned up or down at the user/administrator's whim. Submitted by: itojun MFC after: 1 day	2007-05-08 20:11:36 +00:00
Randall Stewart	b100636770	- Copyright change, cisco's silly tool wants it to say: "Copyright (c) 2001-2007, by Cisco Systems," instead of "Copyright (c) 2001-2007, Cisco Systems," - Also fix a few stragglers that were still in 2006.	2007-05-08 17:01:12 +00:00
Randall Stewart	b0552ae214	- Get rid of the sctp_inpcb_free() "magic numbers", now they are sensible defines that tell what you are directing the function to do.	2007-05-08 15:53:03 +00:00
Randall Stewart	6e55db5445	- Static analysis fixes for cisco's commit (this is equivalent to the coverity tool.. may even be the same one.. not sure). - A bug in the way sctp_abort() and friends were setting the IP_CLOSE flag.. and NOT passing the last argument as a (,1)... so that things would get freed..	2007-05-08 14:32:53 +00:00
Randall Stewart	17205ecc85	- More macros for OS compatibility - PR-SCTP would ignore FWD-TSN's above a rwnd's worth of TSN's (1 byte msgs).. this left the peer hopelessly out of sync.. or an attacker. So now we abort the assoc. - New IFN hash, also rename hashes to match addr/ifn now that the vrf has multiple. - Do not enable SCTP_PCB_FLAGS_RECVDATAIOEVNT per default as defined in the Socket API ID. - Export MTU information via sysctl. - Vrf's need table id's. This is default for BSD, but may be other things later when BSD fully supports VRFs. - Additional stream reset bug (caught by cisco dev-test). - Additional validations for the address in sending a message (socket api). -------- and ----- - Fix association notifications not to give the active open side false notifications. - Fix so sendfile and SENDALL will work properly (missing flag to say socket sender is done). - Fix Bug that prevented COOKIES from being retransmitted. - Break out connectx into helper sub-models so that iox routines can reuse the helpers. - When an address is added during system init (non-dynamic mode) make sure that the "defer use" flag is not set. It's compiling on XR now :-D Reviewed by: gnn	2007-05-08 00:21:05 +00:00
SUZUKI Shinsuke	8f34a8b84a	some minor modification to the previous commit to sys/netinet6/nd6.c and nd6_nbr.c. - added some clarification comments - removed some unnecessary code Obtained from: KAME MFC after: 1 week	2007-05-05 04:24:01 +00:00
SUZUKI Shinsuke	8d290a593f	fixed a memory leak in unresolved ND queue processing Obtained from: KAME MFC after: 1 week	2007-05-04 02:34:17 +00:00
Randall Stewart	d06c82f169	- Somehow the disable fragment option got lost. We could set/clear it but would not do it. Now we will. - Moved to latest socket api for extended sndrcv info struct. - Moved to support all new levels of fragment interleave (0-2). - Codenomicon security test updates - length checks and such. - Bug in stream reset (2 actually). - setpeerprimary could unlock a null pointer, fixed. - Added a flag in the pcb so netstat can see if we are listening easier. Obtained from: (some of the Listen changes from Weongyo Jeong)	2007-05-02 12:50:13 +00:00
Robert Watson	84ca8aa609	Remove unused pcbinfo arguments to in_setsockaddr() and in_setpeeraddr().	2007-05-01 16:31:02 +00:00
Robert Watson	712fc218a0	Rename some fields of struct inpcbinfo to have the ipi_ prefix, consistent with the naming of other structure field members, and reducing improper grep matches. Clean up and comment structure fields in structure definition.	2007-04-30 23:12:05 +00:00
George V. Neville-Neil	6486cbd7bb	Turn off route header processing for now due to issues pointed out by Philippe Biondi and Arnaud Ebalard. This is a temporary fix until more discussion can be had on the exact risks involved in allowing source routing in IPv6 Submitted by: itojun Reviewed by: jinmei MFC after: 1 day	2007-04-23 09:32:04 +00:00
Robert Watson	fea9ea0005	Teach netinet6 to use PRIV_NETINET_REUSEPORT.	2007-04-21 18:14:04 +00:00
Randall Stewart	c105859eee	- fix source address selection when picking an acceptable address - name change of prefered -> preferred - CMT fast recover code added. - Comment fixes in CMT. - We were not giving a reason of cant_start_asoc per socket api if we failed to get init/or/cookie to bring up an assoc. Change so we don't just give a generic "comm lost" but look at actual states of dying assoc. - change "crc32" arguments to "crc32c" to silence strict/noisy compiler warnings when crc32() is also declared - A few minor tweaks to get the portable stuff truely portable for sctp6_usrreq.c :-D - one-2-one style vrf match problem. - window recovery would leave chks marked for retran during window probes on the sent queue. This would then cause an out-of-order problem and assure that the flight size "problem" would occur. - Solves a flight size logging issue that caused rwnd overruns, flight size off as well as false retransmissions.g - Macroize the up and down of flight size. - Fix a ECNE bug in its counting. - The strict_sacks options was causing aborts when window probing was active, fix to make strict sacks a bit smarter about what the next unsent TSN is. - Fixes a one-2-one wakeup bug found by Martin Kulas. - If-defed out form, Andre's copy routines pending his commit of at least m_last().. need to adjust for 6.2 as well.. since m_last won't exist. Reviewed by: gnn	2007-04-14 09:44:09 +00:00
Robert Watson	949da0d8f8	Remove obsolete comment about privileges: SUSER_ALLOWJAIL is no longer set in this code.	2007-04-11 16:31:02 +00:00
Randall Stewart	bff64a4db3	- fixed several places where we did not release INP locks. - fixed a refcount bug in the new ifa structures. - use vrf's from default stcb or inp whenever possible. - Address limits raised to account for a full IP fragmented packet (1000 addresses). - flight size correcting updated to include one message only and to handle case where the peer does not cumack the next segment aka lists 1/1 in sack blocks.. - Various bad init/init-ack handling could cause a panic since we tried to unlock the destroyed mutex. Fixes so we properly exit when we need to destroy an assoc. (Found by Cisco DevTest team :D) - name rename in src-addr-selection from pass to sifa. - route structure typedef'd to allow different platforms and updated into sctp_os_bsd file. - Max retransmissions a chunk can be made added. Reviewed by: gnn	2007-04-03 11:15:32 +00:00
John Baldwin	4e7f640dfb	Optimize sx locks to use simple atomic operations for the common cases of obtaining and releasing shared and exclusive locks. The algorithms for manipulating the lock cookie are very similar to that of rwlocks. This patch also adds support for exclusive locks using the same algorithm as mutexes. A new sx_init_flags() function has been added so that optional flags can be specified to alter a given lock's behavior. The flags include SX_DUPOK, SX_NOWITNESS, SX_NOPROFILE, and SX_QUIET which are all identical in nature to the similar flags for mutexes. Adaptive spinning on select locks may be enabled by enabling the ADAPTIVE_SX kernel option. Only locks initialized with the SX_ADAPTIVESPIN flag via sx_init_flags() will adaptively spin. The common cases for sx_slock(), sx_sunlock(), sx_xlock(), and sx_xunlock() are now performed inline in non-debug kernels. As a result, <sys/sx.h> now requires <sys/lock.h> to be included prior to <sys/sx.h>. The new kernel option SX_NOINLINE can be used to disable the aforementioned inlining in non-debug kernels. The size of struct sx has changed, so the kernel ABI is probably greatly disturbed. MFC after: 1 month Submitted by: attilio Tested by: kris, pjd	2007-03-31 23:23:42 +00:00
Randall Stewart	5e54f665f0	- Found bug in min split point bundling which caused incorrect, non-bundlable fragmentation. - Added min residual to better control split points for both how big a msg must be as well as how much needs to be left over. - With our new algo in place, we need to implicitly set "end of msg" on the sp-> structure otherwise we end up with "hung" associations. - Room reserved up front in IP header by pushing IP header to back of mbuf. - Fix so FR's peg count of retransmissions needed. - Fix so an unlucky chunk that never gets across will kill the assoc via the kill timer and send an abort too. - Fix bug in sctp_input which can result in a crash. - Do not strip off IP options anymore. - Clean up sctp_calculate_rto(). - Get rid of unused sysctl. - Fixed so we discard all M-Cast - Fixed so port check done AFTER checksum - Fixed bug in fragmentation code that prevented us from fragmenting a small complete message when we needed to. - Window probes were not marked back to unsent and flight adjusted when a sack came in with no window change or accepting of the probe data. We now fix this with having a mark on the net and the chunk so we can clear it out when the sack arrives forcing it to retran just like it was "new" this improves the handling of window probes, which were dropped by the receiver. - Tighten AUTH protocol error checks during INIT/INIT-ACK exchange	2007-03-31 11:47:30 +00:00
Bruce M Simpson	ec002fee99	Implement reference counting for ifmultiaddr, in_multi, and in6_multi structures. Detect when ifnet instances are detached from the network stack and perform appropriate cleanup to prevent memory leaks. This has been implemented in such a way as to be backwards ABI compatible. Kernel consumers are changed to use if_delmulti_ifma(); in_delmulti() is unable to detect interface removal by design, as it performs searches on structures which are removed with the interface. With this architectural change, the panics FreeBSD users have experienced with carp and pfsync should be resolved. Obtained from: p4 branch bms_netdev Reviewed by: andre Sponsored by: Garance A Drosehn Idea from: NetBSD MFC after: 1 month	2007-03-20 00:36:10 +00:00
Randall Stewart	132dea7d5a	- errno -> becomes error in sctp_output.c and sctputil.c - SB_CLEAR macro defined and used for sb clearing. - Fix for CMT express_sack_handling did not do proper pseudo-cumack updates. - Get rid of extraneous function that was never used ip_2_ip6_hdr() - Fixed source address selection bug (initialization problem). - Source address selection debug added.	2007-03-19 06:53:02 +00:00
Randall Stewart	42551e993f	- Sysctl's move to separate file - moved away from ifn/ifa access to sctp_ifa/sctp_ifn built and managed by the add-ip code. - cleaned up add-ip code to use the iterator - made iterator be a thread, which enables auto-asconf now. - rewrote and cleaned up source address selection (also made it use new structures). - Fixed a couple of memory leaks. - DACK now settable as to how many packets to delay as well as time. - connectx() to latest socket API, new associd arg. - Fixed issue with revoking and losing potential to send when we inflate the flight size. We now inflate the cwnd too and deflate it later when the revoked chunk is sent or acked. - Got rid of some temp debug code - src addr selection moved to a common file (sctp_output.c) - Support for simple VRF's (we have support for multi-vrf via compile switch that is scrubbed from BSD but we won't need multi-vrf until we first get VRF :-D) - Rest of mib work for address information now done - Limit number of addresses in INIT/INIT-ACK to a #def (30). Reviewed by: gnn	2007-03-15 11:27:14 +00:00
Bruce M Simpson	00cf3f55fb	Add comments about common idioms for cleanup pass at a later date.	2007-02-28 21:58:37 +00:00
Bruce M Simpson	cd88c37218	Remove code which would never be used, vis-a-vis Quality-of-Service; the token bucket filter got killed in netinet, so it gets killed here too. Correct comments.	2007-02-28 20:32:25 +00:00
Bruce M Simpson	430fc8f211	Add a comment about a struct which needs to be global. Remove an unused global variable. Staticize variables which do not need to be global.	2007-02-28 20:29:20 +00:00
Bruce M Simpson	1291e2a0eb	Fix tinderbox. ip6_mrouter should be defined in raw_ip6.c as it is tested to determine if the userland socket is open; this, in turn, is used to determine if the module has been loaded. Tested with: LINT	2007-02-24 21:09:35 +00:00
Bruce M Simpson	6be2e366d6	Make IPv6 multicast forwarding dynamically loadable from a GENERIC kernel. It is built in the same module as IPv4 multicast forwarding, i.e. ip_mroute.ko, if and only if IPv6 support is enabled for loadable modules. Export IPv6 forwarding structs to userland netstat(1) via sysctl(9).	2007-02-24 11:38:47 +00:00
Robert Watson	afdb42748d	Rename two identically named log_in_vain variables: tcp_input.c's static log_in_vain to tcp_log_in_vain, and udp_usrreq's global log_in_vain to udp_log_in_vain. MFC after: 1 week	2007-02-20 10:20:03 +00:00
Randall Stewart	f42a358a6f	- Copyright updates (aka 2007) - ZONE get now also take a type cast so it does the cast like mtod does. - New macro SCTP_LIST_EMPTY, which in bsd is just LIST_EMPTY - Removal of const in some of the static hmac functions (not needed) - Store length changes to allow for new fields in auth - Auth code updated to current draft (this should be the RFC version we think). - use uint8_t instead of u_char in LOOPBACK address comparison - Some u_int32_t converted to uint32_t (in crc code) - A bug was found in the mib counts for ordered/unordered count, this was fixed (was referencing a freed mbuf). - SCTP_ASOCLOG_OF_TSNS added (code will probably disappear after my testing completes. It allows us to keep a small log on each assoc of the last 40 TSN's in/out and stream assignment. It is NOT in options and so is only good for private builds. - Some CMT changes in prep for Jana fixing his problem with reneging when CMT is enabled (Concurrent Multipath Transfer = CMT). - Some missing mib stats added. - Correction to number of open assoc's count in mib - Correction to os_bsd.h to get right sha2 macros - Add of special AUTH_04 flags so you can compile the code with the old format (in case the peer does not yet support the latest auth code). - Nonce sum was incorrectly being set in when ecn_nonce was NOT on. - LOR in listen with implicit bind found and fixed. - Moved away from using mbuf's for socket options to using just data pointers. The mbufs were used to harmonize NetBSD code since both Net and Open used this method. We have decided to move away from that and more conform to FreeBSD style (which makes more sense). - Very very nasty bug found in some of my "debug" code. The cookie_how collision case tracking had an endless loop in it if you got a second retransmission of a cookie collision case. This would lock up a CPU .. ugly.. 
- auth function goes to using size_t instead of int which conforms to socketapi better - Found the nasty bug that happens after 9 days of testing.. you get the data chunk, deliver it and due to the reference to a ch-> that every now and then has been deleted (depending on the postion in the mbuf) you have an invalid ch->ch.flags.. and thus you don't advance the stream sequence number.. so you block the stream permanently. The fix is to make local variables of these guys and set them up before you have any chance of trimming the mbuf. - style fix in sctp_util.h, not sure how this got bad maybe in the last patch? (aka it may not be in the real source). - Found interesting bug when using the extended snd/rcv info where we would get an error on receiving with this. Thats because it was NOT padded to the same size as the snd_rcv info. We increase (add the pad) so the two structs are the same size in sctp_uio.h - In sctp_usrreq.c one of the most common things we did for socket options was to cast the pointer and validate the size. This as been macro-ized to help make the code more readable. - in sctputil.c two things, the socketapi class found a missing flag type (the next msg is a notification) and a missing scope recovery was also fixed. Reviewed by: gnn	2007-02-12 23:24:31 +00:00
Bruce M Simpson	31a9460383	In the ICMP6 path to handle FQDN 'who-are-you' queries, check that the packet header mbuf is non-NULL before trying to create a duplicate of it. PR: 95957 Reviewed by: ume MFC after: 3 days	2007-02-10 12:25:19 +00:00
Bruce M Simpson	6ede684320	MFC after: 3 days	2007-02-05 11:05:41 +00:00
Hajimu UMEMOTO	c57086ced7	ng_iface requires neighbor cache as well. MFC after: 3 days	2007-02-03 09:34:36 +00:00
Bruce A. Mah	f234bea7d7	Revert nd6.c revs. 1.67, 1.68, 1.69, 1.70 in an attempt to unbreak IPv6 over point-to-point gif(4) tunnels. These revisions caused a host route to the destination of a point-to-point gif(4) interface to not get installed when the interface and destination addresses were assigned. This caused "no route to host" errors when trying to send traffic over the interface. The first packet arriving inbound over the tunnel, however, would cause the correct route to get installed, allowing subsequent outbound traffic to be routed correctly. gif(4) interfaces with prefix lengths of less than 128 bits (i.e. no explicit destination address assigned) were not affected by this bug. This bug fix is a possible candidate for a 6.2-RELEASE errata note. Approved by: jhay (original committer) Discussed with: jhay, JINMEI Tatuya MFC after: 3 days	2007-01-26 23:22:58 +00:00
Randall Stewart	93164cf98c	- most all includes (#include <>) migrate to the sctp_os_bsd.h file - Finally all splxx() are removed - Count error fixed in mapping array which might cause a wrong cumack generation. - Invariants around panic for case D + printf when no invariants. - one-to-one model race condition fixed by using a pre-formed connection and then completing the work so accept won't happen on a non-formed association. - Some additional paranoia checks in sctp_output. - Locks that were missing in the accept code. Approved by: gnn	2007-01-18 09:58:43 +00:00
Hajimu UMEMOTO	6a550ab34b	Avoid infinite loop if nicmp6 and nip6 are not on the same mbuf. NetBSD PR 34994+35333 MFC after: 3 days	2007-01-16 15:55:29 +00:00
Randall Stewart	44b7479ba2	- Macroizes the V6ONLY flag check. - Added a short time wait (not used yet) constant - Corrected the type of the crc32c table (it was unsigned long and really is a uint32_t - Got rid of the user of MHeaders until they are truely needed by lower layers. - Fixed an initialization problem in the readq structure (ordering was off). - Found yet another collision bug when the random number generator returns two numbers on one side (during a collision) that are the same. Also added some tracking of cookies that will go away when we know that we have the last collision bug gone. - Fixed an init bug for book_size_scale, that was causing Early FR code to run when it should not. - Fixed a flight size tracking bug that was associated with Early FR but due to above bug also effected all FR's - Fixed it so Max Burst also will apply to Fast Retransmit. - Fixed a bug in the temporary logging code that allowed a static log array overflow - hashinit_flags is now used. - Two last mcopym's were converted to the macro sctp_m_copym that has always been used by all other places - macro sctp_m_copym was converted to upper case. - We now validate sinfo_flags on input (we did not before). - Fixed a bug that prevented a user from sending data and immediately shuting down with one send operation. - Moved to use hashdestroy instead of free() in our macros. - Fixed an init problem in our timed_wait vtag where we did not fully initialize our time-wait blocks. - Timer stops were re-positioned. - A pcb cleanup method was added, however this probably will not be used in BSD.. unless we make module loadable protocols - I think this fixes the mysterious timer bug.. it was a ordering of locks problem in the way we did timers. It now conforms to the timeout(9) manual (except for the _drain part, we had to do this a different way due to locks). 
- Fixed error return code so we get either CONNREUSED or CONNRESET depending on where one is in progression - Purged an unused clone macro. - Fixed a read error code issue where we were NOT getting the proper error when the connection was reset. Approved by: gnn	2007-01-15 15:12:10 +00:00
Warner Losh	1c0ee39e74	Marked these as packed correctly	2007-01-12 07:20:25 +00:00
Randall Stewart	139bc87fda	a) macro-ization of all mbuf and random number access plus timers. This makes the code more portable and able to change out the mbuf or timer system used more easily ;-) b) removal of all use of pkt-hdr's until only the places we need them (before ip_output routines). c) remove a bunch of code not needed due to <b> aka worrying about pkthdr's :-) d) There was one last reorder problem it looks where if a restart occur's and we release and relock (at the point where we setup our alias vtag) we would end up possibly getting the wrong TSN in place. The code that fixed the TSN's just needed to be shifted around BEFORE the release of the lock.. also code that set the state (since this also could contribute). Approved by: gnn	2006-12-29 20:21:42 +00:00
Bjoern A. Zeeb	e521ae0c64	In ip6_sprintf print the addresses in a more common/readable format eliminating leading zeros like in :0001 -> :1. Reviewed by: mlaier	2006-12-16 14:15:31 +00:00
Randall Stewart	a5d547add3	1) Fixes on a number of different collision case LOR's. 2) Fix all "magic numbers" to be constants. 3) A collision case that would generate two associations to the same peer due to a missing lock is fixed. 4) Added tracking of where timers are stopped. Approved by: gnn	2006-12-14 17:02:55 +00:00
Bjoern A. Zeeb	1d54aa3ba9	MFp4: 92972, 98913 + one more change In ip6_sprintf no longer use and return one of eight static buffers for printing/logging ipv6 addresses. The caller now has to hand in a sufficiently large buffer as first argument.	2006-12-12 12:17:58 +00:00
Ruslan Ermilov	f9a047a1b7	- In nd6_rtrequest(), when caching an rtentry, don't forget to add a reference to it; otherwise, we could later access freed memory. This is believed to fix panics some users were observing when running route6d(8), and is similar to the fix in sys/netinet/if_ether.c,v 1.139 by glebius@. PR: kern/93910, kern/105437 Testing by: Wojciech Puchar (still ongoing) - Add rtentry locking to nd6_output() similar to rt_check(). MFC after: 4 days	2006-11-25 20:38:56 +00:00
Randall Stewart	03b0b02163	-Fixes first of all the getcred on IPv6 and V4. The copy's were incorrect and so was the locking. -A bug was also found that would create a race and panic when an abort arrived on a socket being read from. -Also fix the reader to get MSG_TRUNC when a partial delivery is aborted. -Also addresses a couple of coverity caught error path memory leaks and a couple of other valid complaints Approved by: gnn	2006-11-08 00:21:13 +00:00
Robert Watson	b96fbb37da	Convert three new suser(9) calls introduced between when the priv(9) patch was prepared and committed to priv(9) calls. Add XXX comments as, in each case, the semantics appear to differ from the TCP/UDP versions of the calls with respect to jail, and because cr_canseecred() is not used to validate the query. Obtained from: TrustedBSD Project	2006-11-06 14:54:06 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
Randall Stewart	50cec91936	Tons of fixes to get all the 64bit issues removed. This also moves two 16 bit int's to become 32 bit values so we do not have to use atomic_add_16. Most of the changes are %p, casts and other various nasties that were in the original code base. With this commit my machine will now do a build universe.. however I as yet have not tested on a 64bit machine .. it may not work :-(	2006-11-05 13:25:18 +00:00
Randall Stewart	73932c69b6	Oops... in my fix up of all the $FreeBSD:$-> $FreeBSD$ I inserted a few to the new files.. but I failed to add the #include <sys/cdefs.h> Which causes a compile error.. sorry about that... got it now :-) Approved by: gnn	2006-11-03 17:21:53 +00:00
Randall Stewart	f8829a4a40	Ok, here it is, we finally add SCTP to current. Note that this work is not just mine, but it is also the works of Peter Lei and Michael Tuexen. They both are my two key other developers working on the project.. and they need ata-boy's too: ** peterlei@cisco.com tuexen@fh-muenster.de ** I did do a make sysent which updated the syscall's and sysproto.. I hope that is correct... without it you don't build since we have new syscalls for SCTP :-0 So go out and look at the NOTES, add option SCTP (make sure inet and inet6 are present too) and play with SCTP. I will see about comitting some test tools I have after I figure out where I should place them. I also have a lib (libsctp.a) that adds some of the missing socketapi functions that I need to put into lib's.. I will talk to George about this :-) There may still be some 64 bit issues in here, none of us have a 64 bit processor to test with yet.. Michael may have a MAC but thats another beast too.. If you have a mac and want to use SCTP contact Michael he maintains a web site with a loadable module with this code :-) Reviewed by: gnn Approved by: gnn	2006-11-03 15:23:16 +00:00
Robert Watson	aed5570872	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
Hajimu UMEMOTO	e7e51bc3e1	Make net.inet6.ip6.auto_linklocal tunable. Someone may want to enable/disable auto_linklocal even in single user mode. Discussed with: re@, gnn@ MFC after: 3 days	2006-10-13 12:45:53 +00:00
Hajimu UMEMOTO	f5c04409eb	Revert the default value of net.inet6.ip6.auto_linklocal to 1. If ipv6_enable is not set to "YES", net.inet6.ip6.auto_linklocal is turned to 0 at boot. Discussed with: re@, gnn@ MFC after: 3 days	2006-10-13 12:41:36 +00:00
John Hay	ae0ddac700	Hopefully the last tweak in trying to make it possible to add ipv6 direct host routes without side effects. Submitted by: JINMEI Tatuya MFC after: 4 days	2006-10-02 19:15:10 +00:00
George V. Neville-Neil	90ce6fa1c8	Turn off automatic link local address if ipv6_enable is not set to YES in rc.conf Reviewed by: KAME core team, cperciva MFC after: 3 days	2006-10-02 10:13:30 +00:00
John Hay	584b68e792	A better fix is to check if it is a host route. Submitted by: ume MFC after: 5 days	2006-09-30 20:25:33 +00:00
John Hay	c482f11edb	My previous commit broke "route add -inet6 <network_addr> -interface gif0". Fix that by excluding point-to-point interfaces. MFC after: 5 days	2006-09-30 14:08:57 +00:00
Bruce M Simpson	910e1364b6	Nits. Submitted by: ru	2006-09-29 16:16:41 +00:00
Bruce M Simpson	2d20d32344	Push removal of mrouted down to the rest of the tree.	2006-09-29 15:45:11 +00:00
SUZUKI Shinsuke	831c32014e	fixed a bug that IPv6 packets arriving to stf are not accepted. (a regression introduced in in6.c Rev 1.61) PR: kern/103415 Submitted by: JINMEI Tatuya MFC after: 1 week	2006-09-22 01:42:22 +00:00
John Hay	f129892448	Make it possible to add an IPv6 host route to a host directly connected. Use something like this: route add -inet6 <dest_addr> <my_addr_on_that_interface> -interface -llinfo This is useful for wireless adhoc mesh networks. MFC after: 5 days	2006-09-16 06:24:28 +00:00
John Hay	1fcae350ae	All multicast listeners on a port should get one copy of the packet. This was broken during the locking changes.	2006-09-07 18:44:54 +00:00
Andre Oppermann	233dcce118	First step of TSO (TCP segmentation offload) support in our network stack. o add IFCAP_TSO[46] for drivers to announce this capability for IPv4 and IPv6 o add CSUM_TSO flag to mbuf pkthdr csum_flags field o add tso_segsz field to mbuf pkthdr o enhance ip_output() packet length check to allow for large TSO packets o extend tcp_maxmtu[46]() with a flag pointer to pass interface capabilities o adjust all callers of tcp_maxmtu[46]() accordingly Discussed on: -current, -net Sponsored by: TCP/IP Optimization Fundraise 2005	2006-09-06 21:51:59 +00:00
John Hay	80a684e083	Use net.inet6.ip6.redirect / ip6_sendredirects as part of the decision to generate icmp6 redirects. Now it is possible to switch redirects off. MFC after: 1 week	2006-09-05 19:20:42 +00:00
Brooks Davis	43bc7a9c62	With exception of the if_name() macro, all definitions in net_osdep.h were unused or already in if_var.h so add if_name() to if_var.h and remove net_osdep.h along with all references to it. Longer term we may want to kill off if_name() entirely since all modern BSDs have if_xname variables rendering it unnecessary.	2006-08-04 21:27:40 +00:00
Robert Watson	c9db0fad09	Align IPv6 socket locking with IPv4 locking: lock socket buffer explicitly and use _locked variants to avoid extra lock and unlock operations. Reviewed by: gnn MFC after: 1 week	2006-07-23 12:24:22 +00:00
George V. Neville-Neil	c6af35ee0e	The KAME project ceased work on IPv6 and IPSec in March of 2006. Remove the README file which warns against cosmetic or local only changes. FreeBSD committers should now feel free to work on the IPv6 and IPSec code without fetters. The KAME mailing lists still exist and it is always a good idea to ask questions about this code on the snap-users@kame.net mailing list. Reviewed by: rwatson, brooks	2006-07-22 02:32:32 +00:00
Robert Watson	a152f8a361	Change semantics of socket close and detach. Add a new protocol switch function, pru_close, to notify protocols that the file descriptor or other consumer of a socket is closing the socket. pru_abort is now a notification of close also, and no longer detaches. pru_detach is no longer used to notify of close, and will be called during socket tear-down by sofree() when all references to a socket evaporate after an earlier call to abort or close the socket. This means detach is now an unconditional teardown of a socket, whereas previously sockets could persist after detach of the protocol retained a reference. This facilitates sharing mutexes between layers of the network stack as the mutex is required during the checking and removal of references at the head of sofree(). With this change, pru_detach can now assume that the mutex will no longer be required by the socket layer after completion, whereas before this was not necessarily true. Reviewed by: gnn	2006-07-21 17:11:15 +00:00
Stephan Uphoff	d915b28015	Fix race conditions on enumerating pcb lists by moving the initialization ( and where appropriate the destruction) of the pcb mutex to the init/finit functions of the pcb zones. This allows locking of the pcb entries and race condition free comparison of the generation count. Rearrange locking a bit to avoid extra locking operation to update the generation count in in_pcballoc(). (in_pcballoc now returns the pcb locked) I am planning to convert pcb list handling from a type safe to a reference count model soon. ( As this allows really freeing the PCBs) Reviewed by: rwatson@, mohans@ MFC after: 1 week	2006-07-18 22:34:27 +00:00
Oleg Bulyzhin	6372145725	Complete timebase (time_second -> time_uptime) conversion. PR: kern/94249 Reviewed by: andre (few months ago) Approved by: glebius (mentor)	2006-07-05 23:37:21 +00:00
Yaroslav Tykhiy	4e6098c6a4	We needn't check "m" for NULL here because "off" should be within the mbuf chain. If we ever get a buggy caller, a bogus "off" should be caught by the sanity check at the function entry. Null "m" here means a very unusual condition of a totally broken mbuf chain (wrong m_pkthdr.len or whatever), so we can just page fault later. Found by: Coverity Prevent(tm) CID: 825	2006-06-30 18:25:07 +00:00
Yaroslav Tykhiy	4b97d7affd	There is a consensus that ifaddr.ifa_addr should never be NULL, except in places dealing with ifaddr creation or destruction; and in such special places incomplete ifaddrs should never be linked to system-wide data structures. Therefore we can eliminate all the superfluous checks for "ifa->ifa_addr != NULL" and get ready to the system crashing honestly instead of masking possible bugs. Suggested by: glebius, jhb, ru	2006-06-29 19:22:05 +00:00
Yaroslav Tykhiy	40e4360c10	Use queue(3) macros instead of accessing list/queue internals directly.	2006-06-29 16:56:07 +00:00
Bjoern A. Zeeb	421d8aa603	Use INPLOOKUP_WILDCARD instead of just 1 more consistently. OKed by: rwatson (some weeks ago)	2006-06-29 10:49:49 +00:00
Pawel Jakub Dawidek	5279398812	- Use suser_cred(9) instead of directly comparing cr_uid. - Compare pointer with NULL, instead of 0. Reviewed by: rwatson	2006-06-27 11:40:05 +00:00
Pawel Jakub Dawidek	835d4b8924	- Use suser_cred(9) instead of directly checking cr_uid. - Change the order of conditions to first verify that we actually need to check for privileges and then eventually check them. Reviewed by: rwatson	2006-06-27 11:35:53 +00:00
Robert Watson	1e0acb6801	Use suser_cred() instead of a direct comparison of cr_uid with 0 in rip6_output(). MFC after: 1 week	2006-06-25 13:54:59 +00:00
George V. Neville-Neil	a59af512d4	Fix spurious warnings from neighbor discovery when working with IPv6 over point to point tunnels (gif). PR: 93220 Submitted by: Jinmei Tatuya MFC after: 1 week	2006-06-08 00:31:17 +00:00
Seigo Tanimura	f8366b0334	Avoid spurious release of an rtentry.	2006-05-23 00:32:22 +00:00
Bjoern A. Zeeb	93e4f81d9f	In IN6_IS_ADDR_V4MAPPED case instead of returning directly set error and goto out so that locks will be dropped. Reviewed by: rwatson, gnn	2006-05-20 13:26:08 +00:00
Max Laier	656faadcb8	Remove ip6fw. Since ipfw has full functional IPv6 support now and - in contrast to ip6fw - is properly locked, it is time to retire ip6fw.	2006-05-12 20:39:23 +00:00
Bjoern A. Zeeb	1b34a059cb	Assert ip6_forward_rt protected by Giant adding GIANT_REQUIRED to functions not yet asserting it but working on global ip6_forward_rt route cache which is not locked and perhaps should go away in the future though cache hit/miss ratio wasn't bad. It's #if 0ed in frag6 because the code working on ip6_forward_rt is.	2006-05-04 18:41:08 +00:00
Robert Watson	20e3d71cdd	Break out socket access control and delivery logic from udp6_input() into its own function, udp6_append(). This mirrors a similar structure in udp_input() and udp_append(), and makes the whole thing a lot more readable. While here, add missing inpcb locking in UDP6 input path. Reviewed by: bz MFC after: 3 months	2006-05-01 21:39:48 +00:00
Robert Watson	8deea4a8f3	Move lock assertions to top of in6_pcbladdr(): we still want them to run even if we're going to return an argument-based error. Assert pcbinfo lock in in6_pcblookup_local(), in6_pcblookup_hash(), since they walk pcbinfo inpcb lists. Assert inpcb and pcbinfo locks in in6_pcbsetport(), since port reservations are changing. MFC after: 3 months	2006-04-25 12:09:58 +00:00
Robert Watson	04f2073775	Modify in6_pcbpurgeif0() to accept a pcbinfo structure rather than a pcb list head structure; this improves congruence to IPv4, and also allows in6_pcbpurgeif0() to lock the pcbinfo. Modify in6_pcbpurgeif0() to lock the pcbinfo before iterating the pcb list, use queue(9)'s LIST_FOREACH() for the iteration, and to lock individual inpcb's while manipulating them. MFC after: 3 months	2006-04-23 15:06:16 +00:00
Paul Saab	4f590175b7	Allow for nmbclusters and maxsockets to be increased via sysctl. An eventhandler is used to update all the various zones that depend on these values.	2006-04-21 09:25:40 +00:00
Robert Watson	086dafc15b	Mirror IPv4 pcb locking into in6_setsockaddr() and in6_setpeeraddr(): acquire inpcb lock when reading inpcb port+address in order to prevent races with other threads that may be changing them. MFC after: 3 months	2006-04-15 05:24:23 +00:00
Robert Watson	8511b981f6	Assert the inpcb lock in udp6_output(), as we dereference various fields. MFC after: 3 months	2006-04-12 03:34:22 +00:00
Robert Watson	dec8026073	Add comment to udp6_input() that locking is missing from multicast UDPv6 delivery. Lock the inpcb of the UDP connection being delivered to before processing IPSEC policy and other delivery activities. MFC after: 3 months	2006-04-12 03:32:54 +00:00
Robert Watson	5383103aa0	Add udbinfo locking in udp6_input() to protect lookups of the inpcb lists during UDPv6 receipt. MFC after: 3 months	2006-04-12 03:23:56 +00:00
Robert Watson	ff7425ced0	Don't use spl around call to in_pcballoc() in IPv6 raw socket support; all necessary synchronization appears present. MFC after: 3 months	2006-04-12 03:07:22 +00:00
Robert Watson	41ba156433	Remove one remaining use of spl in the IPv6 fragmentation code, as this code appears properly locked. MFC after: 3 months	2006-04-12 03:06:20 +00:00
Robert Watson	e3beea90c7	Add missing locking to udp6_getcred(), remove spl use. MFC after: 3 months	2006-04-12 03:03:47 +00:00
Robert Watson	4847772314	Remove spl use from IPv6 inpcb code. In various inpcb methods for IPv6 sockets, don't check of so_pcb is NULL, assert it isn't. MFC after: 3 months	2006-04-12 02:52:14 +00:00
SUZUKI Shinsuke	8447156ce0	ip6_mrouter_done(): use if_allmulti(0) for disabling the multicast promiscuous mode Obtained from: KAME MFC after: 2 days	2006-04-10 14:33:22 +00:00
Robert Watson	c60afb3f55	Fix assertion description: !=, not ==. Submitted by: pjd MFC after: 3 months	2006-04-09 16:33:41 +00:00
Robert Watson	14ba8add01	Update in_pcb-derived basic socket types following changes to pru_abort(), pru_detach(), and in_pcbdetach(): - Universally support and enforce the invariant that so_pcb is never NULL, converting dozens of unnecessary NULL checks into assertions, and eliminating dozens of unnecessary error handling cases in protocol code. - In some cases, eliminate unnecessary pcbinfo locking, as it is no longer required to ensure so_pcb != NULL. For example, in protocol shutdown methods, and in raw IP send. - Abort and detach protocol switch methods no longer return failures, nor attempt to free sockets, as the socket layer does this. - Invoke in_pcbfree() after in_pcbdetach() in order to free the detached in_pcb structure for a socket. MFC after: 3 months	2006-04-01 16:20:54 +00:00
Robert Watson	4c7c478d0f	Break out in_pcbdetach() into two functions: - in_pcbdetach(), which removes the link between an inpcb and its socket. - in_pcbfree(), which frees a detached pcb. Unlike the previous in_pcbdetach(), neither of these functions will attempt to conditionally free the socket, as they are responsible only for managing in_pcb memory. Mirror these changes into in6_pcbdetach() by breaking it into in6_pcbdetach() and in6_pcbfree(). While here, eliminate undesired checks for NULL inpcb pointers in sockets, as we will now have as an invariant that sockets will always have valid so_pcb pointers. MFC after: 3 months	2006-04-01 16:04:42 +00:00
Robert Watson	bc725eafc7	Change protocol switch method pru_detach() so that it returns void rather than an error. Detaches do not "fail", they either occur or the protocol flags SS_PROTOREF to take ownership of the socket. soclose() no longer looks at so_pcb to see if it's NULL, relying entirely on the protocol to decide whether it's time to free the socket or not using SS_PROTOREF. so_pcb is now entirely owned and managed by the protocol code. Likewise, no longer test so_pcb in other socket functions, such as soreceive(), which have no business digging into protocol internals. Protocol detach routines no longer try to free the socket on detach, this is performed in the socket code if the protocol permits it. In rts_detach(), no longer test for rp != NULL in detach, and likewise in other protocols that don't permit a NULL so_pcb, reduce the incidence of testing for it during detach. netinet and netinet6 are not fully updated to this change, which will be in an upcoming commit. In their current state they may leak memory or panic. MFC after: 3 months	2006-04-01 15:42:02 +00:00
Robert Watson	ac45e92ff2	Change protocol switch pru_abort() API so that it returns void rather than an int, as an error here is not meaningful. Modify soabort() to unconditionally free the socket on the return of pru_abort(), and modify most protocols to no longer conditionally free the socket, since the caller will do this. This commit likely leaves parts of netinet and netinet6 in a situation where they may panic or leak memory, as they are not fully updated by this commit. This will be corrected shortly in followup commits to these components. MFC after: 3 months	2006-04-01 15:15:05 +00:00
David Malone	fe12457335	This comment on various IPPORT_ defines was copied from in.h and probably never fully applied to IPv6. Over time it has become more stale, so replace it with something more up to date. Reviewed by: ume MFC after: 1 month	2006-03-28 12:51:22 +00:00
Robert Watson	85f1f481ab	Remove manual assignment of m_pkthdr from one mbuf to another in ipsec_copypkt(), as this is already handled by the call to M_MOVE_PKTHDR(), which also knows how to correctly handle MAC m_tags. This corrects a panic when running with MAC and KAME IPSEC. PR: kern/94599 Submitted by: zhouyi zhou <zhouyi04 at ios dot cn> Reviewed by: bz MFC after: 3 days	2006-03-28 10:16:38 +00:00
SUZUKI Shinsuke	31d4137bf3	fixed a memory leak when net.inet6.icmp6.nd6_maxqueuelen is greater than 1 Obtained from: KAME MFC after: 3 days	2006-03-24 16:20:12 +00:00
David Malone	fcd1001c63	Make net.inet.ip.portrange.reservedhigh and net.inet.ip.portrange.reservedlow apply to IPv6 as well as IPv4. We could have made new sysctls for IPv6, but that potentially makes things complicated for mapped addresses. This seems like the least confusing option and least likely to cause obscure problems in the future. This change makes the mac_portacl module useful with IPv6 apps. Reviewed by: ume MFC after: 1 month	2006-03-19 11:48:48 +00:00
SUZUKI Shinsuke	d3693a631e	implements section 2.2 of RFC4191, regarding the reserved preference value (10) Obtained from: KAME MFC after: 1 day	2006-03-19 06:38:39 +00:00
SUZUKI Shinsuke	e381ac4daa	updates net.inet6.ip6.kame_version as the proof of the latest KAME merge Reviewed by: KAME MFC after: 2 days	2006-03-19 02:11:42 +00:00
SUZUKI Shinsuke	2c112cdc6d	fixed a bug that an MLD report is not advertised when group-specific MLD query is received. PR: kern/93526 Obtained from: KAME MFC after: 1 day	2006-03-04 09:17:11 +00:00
Hajimu UMEMOTO	430683286b	avoided the use of purged address structure when an address became invalid in nd6_timer(). PR: kern/93170 Reported by: kris Submitted by: JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp> Confirmed by: kris Obtained from: KAME MFC after: 2 days	2006-02-12 15:37:08 +00:00
George V. Neville-Neil	f2b1bd14dc	Fix for an inappropriate bzero of the ICMPv6 stats. The code was zero'ing the wrong structure member but setting the correct one. Submitted by: James dot Juran at baesystems dot com Reviewed by: gnn MFC after: 1 week	2006-02-08 07:16:46 +00:00
Hajimu UMEMOTO	8c76311215	shut up strict-aliasing rules warning.	2006-02-05 09:52:40 +00:00
Hajimu UMEMOTO	92cb1c3210	make IPV6_V6ONLY socket option work for UDP as well. PR: ports/92620 Reported by: Kurt Miller <kurt__at__intricatesoftware.com> MFC after: 1 week	2006-02-02 11:46:05 +00:00
Christian S.J. Peron	604afec496	Somewhat re-factor the read/write locking mechanism associated with the packet filtering mechanisms to use the new rwlock(9) locking API: - Drop the variables stored in the pfil_head structure which were specific to conditions and the home rolled read/write locking mechanism. - Drop some includes which were used for condition variables - Drop the inline functions, and convert them to macros. Also, move these macros into pfil.h - Move pfil list locking macros into pfil.h as well - Rename ph_busy_count to ph_nhooks. This variable will represent the number of IN/OUT hooks registered with the pfil head structure - Define PFIL_HOOKED macro which evaluates to true if there are any hooks to be run by pfil_run_hooks - In the IP/IP6 stacks, change the ph_busy_count comparison to use the new PFIL_HOOKED macro. - Drop optimization in pfil_run_hooks which checks to see if there are any hooks to be run, and returns if not. This check is already performed by the IP stacks when they call: if (!PFIL_HOOKED(ph)) goto skip_hooks; - Drop in assertion which makes sure that the number of hooks never drops below 0 for good measure. This in theory should never happen, and if it does then there are problems somewhere - Drop special logic around PFIL_WAITOK because rw_wlock(9) does not sleep - Drop variables which support home rolled read/write locking mechanism from the IPFW firewall chain structure. - Swap out the read/write firewall chain lock internal to use the rwlock(9) API instead of our home rolled version - Convert the inlined functions to macros Reviewed by: mlaier, andre, glebius Thanks to: jhb for the new locking API	2006-02-02 03:13:16 +00:00
Gleb Smirnoff	25af0bb50e	Add some initial locking to gif(4). It doesn't cover the whole driver, however IPv4-in-IPv4 tunnels are now stable on SMP. Details: - Add per-softc mutex. - Hold the mutex on output. The main problem was the rtentry, placed in softc. It could be freed by ip_output(). Meanwhile, another thread being in in_gif_output() can read and write this rtentry. Reported by: many Tested by: Alexander Shiryaev <aixp mail.ru>	2006-01-30 08:39:09 +00:00
Hajimu UMEMOTO	411babc618	don't embed scope id before running packet filters. Reported by: YAMAMOTO Takashi <yamt__at__mwd.biglobe.ne.jp> Obtained from: NetBSD MFC after: 1 week	2006-01-25 08:17:02 +00:00
Robert Watson	9f8a02f168	Convert in6_cksum() to ANSI C function declaration. MFC after: 1 week	2006-01-22 01:17:57 +00:00
Robert Watson	fc4c825847	When storing the results of malloc() in a pointer to a pointer, check the pointer to a pointer for NULL, not the pointer for NULL. Noticed by: Coverity Prevent analysis tool MFC after: 3 days	2006-01-14 00:09:41 +00:00
Robert Watson	2ab392c630	In ipcomp6_input(), check 'md' not 'm' after a call to m_pulldown(): 'm' may be a stale pointer at this point, and we're interested in whether or not m_pulldown() failed. Noticed by: Coverity Prevent analysis tool MFC after: 3 days	2006-01-13 23:53:23 +00:00
SUZUKI Shinsuke	02ff33e2d0	added a note about the assumption for m->m_pkthdr.rcvif Obtained from: KAME MFC After: 1 day	2006-01-09 09:08:43 +00:00
Andrew Thompson	73ff045c57	Add RFC 3378 EtherIP support. This change makes it possible to add gif interfaces to bridges, which will then send and receive IP protocol 97 packets. Packets are Ethernet frames with an EtherIP header prepended. Obtained from: NetBSD MFC after: 2 weeks	2005-12-21 21:29:45 +00:00
SUZUKI Shinsuke	7014e0eb11	fixed a kernel crash at the initialization time of PIM-SM register interface MFC after: 2 days	2005-12-09 04:42:19 +00:00
Hajimu UMEMOTO	4a3df7fe7b	the response NS to a DAD NS was not sent correctly due to the invalid destination address. Submitted by: JINMEI Tatuya <jinmei__at__isl.rdc.toshiba.co.jp> MFC after: 1 day	2005-12-08 06:43:39 +00:00
SUZUKI Shinsuke	a829cf5765	fixed a kernel crash due to an improper removal of callout-timer (ToDo: similar fix is necessary for other NDP-related callout-timers in netinet6/nd6*.c) PR: kern/88725 MFC after: 1 month	2005-11-16 12:36:08 +00:00
Ruslan Ermilov	303989a2f3	Use sparse initializers for "struct domain" and "struct protosw", so they are easier to follow for the human being.	2005-11-09 13:29:16 +00:00
SUZUKI Shinsuke	797df30d75	statically configured IPv6 address is properly added/deleted now Obtained from: KAME Reported in: freebsd-net@freebsd MFC after: 1 day	2005-10-31 23:06:04 +00:00
SUZUKI Shinsuke	36dc24e61e	fixed a compilation failure on amd64/sparc64/ia64 Submitted by: max MFC after: 2 month	2005-10-22 05:07:16 +00:00
SUZUKI Shinsuke	200caaf0c0	nuked non-existing commands	2005-10-21 16:31:39 +00:00
SUZUKI Shinsuke	743eee666f	sync with KAME regarding NDP - introduced fine-grain-timer to manage ND-caches and IPv6 Multicast-Listeners - supports Router-Preference <draft-ietf-ipv6-router-selection-07.txt> - better prefix lifetime management - more spec-conformant DAD advertisement - updated RFC/internet-draft revisions Obtained from: KAME Reviewed by: ume, gnn MFC after: 2 month	2005-10-21 16:23:01 +00:00
SUZUKI Shinsuke	9c8aab3e0b	perform NUD on an IPv6-aware point-to-point interface Obtained from: KAME MFC after: 1 week	2005-10-21 15:59:00 +00:00
SUZUKI Shinsuke	4ecbe3316a	sync with KAME (renamed a macro IPV6_DADOUTPUT to IPV6_UNSPECSRC) Obtained from: KAME	2005-10-21 15:45:13 +00:00
SUZUKI Shinsuke	7aa5949375	sync with KAME (nuked unused code, use NULL to denote a NULL pointer) Obtained from: KAME Reviewed by: ume, gnn	2005-10-19 17:18:49 +00:00
SUZUKI Shinsuke	c1a049ac20	sync with KAME (removed an unnecessary non-standard macro) Obtained from: KAME Reviewed by: ume, gnn	2005-10-19 16:53:24 +00:00
SUZUKI Shinsuke	d28bde669a	sync with KAME regarding the following clarification in RFC3542: - disable IPv6 operation if DAD fails for some EUI-64 link-local addresses. - export get_hw_ifid() (and rename it) as a subroutine for this process. Obtained from: KAME Reviewed by: ume, gnn MFC after: 2 week	2005-10-19 16:43:57 +00:00
SUZUKI Shinsuke	a22adbc68c	sync with KAME (don't respond to NI_QTYPE_IPV4ADDR) Obtained from: KAME Reviewed by: ume, gnn	2005-10-19 16:27:33 +00:00
SUZUKI Shinsuke	5b27b04579	supported an ndp command suboption to disable IPv6 in the given interface Obtained from: KAME Reviewed by: ume, gnn MFC after: 2 week	2005-10-19 16:20:18 +00:00
SUZUKI Shinsuke	b9204379a1	added an ioctl option in kernel so that ndp/rtadvd can change some NDP-related kernel variables based on their configurations (RFC2461 p.43 6.2.1 mandates this for IPv6 routers) Obtained from: KAME Reviewed by: ume, gnn MFC after: 2 weeks	2005-10-19 15:05:42 +00:00
SUZUKI Shinsuke	2ce62dce17	sync with KAME in the following points: - fixed typos - improved some comment descriptions - use NULL, instead of 0, to denote a NULL pointer - avoid embedding a magic number in the code - use nd6log() instead of log() to record NDP-specific logs - nuked an unnecessary white space Obtained from: KAME MFC after: 1 day	2005-10-19 10:09:19 +00:00

972 Commits