freebsd-dev

Author	SHA1	Message	Date
Randall Stewart	2dad8a55be	- Remove extra comment for 7.0 (no GIANT here). - Remove unneeded WLOCK/UNLOCK of inp for getting TCB lock. - Fix panic that may occur when freeing an assoc that has partial delivery in progress (may dereference null socket pointer when queuing partial delivery aborted notification) - Some spacing and comment fixes. - Fix address add handling to clear cached routes and source addresses when peer acks the add in case the routing table changes. Approved by: re@freebsd.org (Bruce Mah)	2007-08-16 01:51:22 +00:00
Qing Li	8cb5ba02d8	Use the sequence number comparison macro to compare projected_offset against isn_offset to account for wrap around. Reviewed by: gnn, kmacy, silby Submitted by: yusheng.huang@bluecoat.com Approved by: re MFC: 3 days	2007-08-16 01:35:55 +00:00
Christian S.J. Peron	b244c8ad14	Over the past couple of years, there have been a number of reports relating the use of divert sockets to dead locks. A number of LORs have been reported between divert and a number of other network subsystems including: IPSEC, Pfil, multicast, ipfw and others. Other dead locks could occur because of recursive entry into the IP stack. This change should take care of most if not all of these issues. A summary of the changes follow: - We disallow multicast operations on divert sockets. It really doesn't make semantic sense to allow this, since typically you would set multicast parameters on multicast end points. NOTE: As a part of this change, we actually dis-allow multicast options on any socket that IS a divert socket OR IS NOT a SOCK_RAW or SOCK_DGRAM family - We check to see if there are any socket options that have been specified on the socket, and if there was (which is very un-common and also probably doesnt make sense to support) we duplicate the mbuf carrying the options. - We then drop the INP/INFO locks over the call to ip_output(). It should be noted that since we no longer support multicast operations on divert sockets and we have duplicated any socket options, we no longer need the reference to the pcb to be coherent. - Finally, we replaced the call to ip_input() to use netisr queuing. This should remove the recursive entry into the IP stack from divert. By dropping the locks over the call to ip_output() we eliminate all the lock ordering issues above. By switching over to netisr on the inbound path, we can no longer recursively enter the ip_input() code via divert. I have tested this change by using the following command: ipfwpcap -r 8000 - \| tcpdump -r - -nn -v This should exercise the input and re-injection (outbound) path, which is very similar to the work load performed by natd(8). Additionally, I have run some ospf daemons which have a heavy reliance on raw sockets and multicast. Approved by: re@ (kensmith) MFC after: 1 month LOR: 163 LOR: 181 LOR: 202 LOR: 203 Discussed with: julian, andre et al (on freebsd-net) In collaboration with: bms [1], rwatson [2] [1] bms helped out with the multicast decisions [2] rwatson submitted the original netisr patches and came up with some of the original ideas on how to combat this issue.	2007-08-06 22:06:36 +00:00
Randall Stewart	63981c2b40	- change number assignments for SHA225-512 (match artisync for bakeoff.. using the next sequential ones) - In cookie processing 1-2-1, we did not increment the stcb refcnt before releasing the tcb lock. We need to do this to keep the tcb from being freed by a abort or ?? unlikely but worth doing. Also get rid of unneed INP_WLOCK. - extra receive info included the rcvinfo which killed the padding/alignment. We now redefine all the fields properly so they both align properly both to 128 bytes. - A peeled off socket would not close without an error due to its misguided idea that sctp_disconnect() was not supported on it. This fixes it so it goes through the proper path. - When an assoc was being deleted after abort (via a timer) a small race condition exists where we might take a packet for the old assoc (since we are waiting for a cleanup timer). This state especially happens in mac. We now add a state in the asoc so these can properly handle the packet as OOTB. Approved by: re@freebsd.org(Ken Smith)	2007-08-06 15:46:46 +00:00
Robert Watson	0bf686c125	Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which previously conditionally acquired Giant based on debug.mpsafenet. As that has now been removed, they are no longer required. Removing them significantly simplifies error-handling in the socket layer, eliminated quite a bit of unwinding of locking in error cases. While here clean up the now unneeded opt_net.h, which previously was used for the NET_WITH_GIANT kernel option. Clean up some related gotos for consistency. Reviewed by: bz, csjp Tested by: kris Approved by: re (kensmith)	2007-08-06 14:26:03 +00:00
Bjoern A. Zeeb	cc977adc71	Rename option IPSEC_FILTERGIF to IPSEC_FILTERTUNNEL. Also rename the related functions in a similar way. There are no functional changes. For a packet coming in with IPsec tunnel mode, the default is to only call into the firewall with the "outer" IP header and payload. With this option turned on, in addition to the "outer" parts, the "inner" IP header and payload are passed to the firewall too when going through ip_input() the second time. The option was never only related to a gif(4) tunnel within an IPsec tunnel and thus the name was very misleading. Discussed at: BSDCan 2007 Best new name suggested by: rwatson Reviewed by: rwatson Approved by: re (bmah)	2007-08-05 16:16:15 +00:00
Peter Wemm	c4a184bdc4	Change TCPTV_MIN to be independent of HZ. While it was documented to be in ticks "for algorithm stability" when originally committed, it turns out that it has a significant impact in timing out connections. When we changed HZ from 100 to 1000, this had a big effect on reducing the time before dropping connections. To demonstrate, boot with kern.hz=100. ssh to a box on local ethernet and establish a reliable round-trip-time (ie: type a few commands). Then unplug the ethernet and press a key. Time how long it takes to drop the connection. The old behavior (with hz=100) caused the connection to typically drop between 90 and 110 seconds of getting no response. Now boot with kern.hz=1000 (default). The same test causes the ssh session to drop after just 9-10 seconds. This is a big deal on a wifi connection. With kern.hz=1000, change sysctl net.inet.tcp.rexmit_min from 3 to 30. Note how it behaves the same as when HZ was 100. Also, note that when booting with hz=100, net.inet.tcp.rexmit_min used to be 30. This commit changes TCPTV_MIN to be scaled with hz. rexmit_min should always be about 30. If you set hz to Really Slow(TM), there is a safety feature to prevent a value of 0 being used. This may be revised in the future, but for the time being, it restores the old, pre-hz=1000 behavior, which is significantly less annoying. As a workaround, to avoid rebooting or rebuilding a kernel, you can run "sysctl net.inet.tcp.rexmit_min=30" and add "net.inet.tcp.rexmit_min=30" to /etc/sysctl.conf. This is safe to run from 6.0 onwards. Approved by: re (rwatson) Reviewed by: andre, silby	2007-07-31 22:11:55 +00:00
Dag-Erling Smørgrav	218cbbea9a	Make tcpstates[] static, and make sure TCPSTATES is defined before <netinet/tcp_fsm.h> is included into any compilation unit that needs tcpstates[]. Also remove incorrect extern declarations and TCPDEBUG conditionals. This allows kernels both with and without TCPDEBUG to build, and unbreaks the tinderbox. Approved by: re (rwatson)	2007-07-30 11:06:42 +00:00
Bruce A. Mah	e251d2f4f6	Fix a typo in a log message: s/Reveived/Received/. Approved by: re (rwatson)	2007-07-29 20:13:22 +00:00
Matt Jacob	24face5416	Fix compilation problems- tcpstates is only available if TCPDEBUG is set. Approved by: re (in spirit)	2007-07-29 01:31:33 +00:00
Mike Silbersack	e3020cfd3c	Fix a panic introduced in rev 1.126. Approved by: re (rwatson)	2007-07-28 20:13:40 +00:00
Andre Oppermann	773673c133	Provide a sysctl to toggle reporting of TCP debug logging: sys.net.inet.tcp.log_debug = 1 It defaults to enabled for the moment and is to be turned off for the next release like other diagnostics from development branches. It is important to note that sysctl sys.net.inet.tcp.log_in_vain uses the same logging function as log_debug. Enabling of the former also causes the latter to engage, but not vice versa. Use consistent terminology in tcp log messages: "ignored" means a segment contains invalid flags/information and is dropped without changing state or issuing a reply. "rejected" means a segments contains invalid flags/information but is causing a reply (usually RST) and may cause a state change. Approved by: re (rwatson)	2007-07-28 12:20:39 +00:00
Andre Oppermann	cdaf208d09	o Move setting/resetting logic of syncache timer from macro SYNCACHE_TIMEOUT to new function syncache_timeout(). o Fix inverted timeout callout engagement logic to actually enable the timer for the bucket row. Before SYN\|ACK was not retransmitted. o Simplify SYN\|ACK retransmit timeout backoff calculation. o Improve logging of retransmit and timeout events. o Reset timeout when duplicate SYN arrives. o Add comments. o Rearrange SYN cookie statistics counting. Bug found by: silby Submitted by: silby (different version) Approved by: re (rwatson)	2007-07-28 12:02:05 +00:00
Andre Oppermann	19bc77c549	o Move all detailed checks for RST in LISTEN state from tcp_input() to syncache_rst(). o Fix tests for flag combinations of RST and SYN, ACK, FIN. Before a RST for a connection in syncache did not properly free the entry. o Add more detailed logging. Approved by: re (rwatson)	2007-07-28 11:51:44 +00:00
Robert Watson	c6b2899785	Replace references to NET_CALLOUT_MPSAFE with CALLOUT_MPSAFE, and remove definition of NET_CALLOUT_MPSAFE, which is no longer required now that debug.mpsafenet has been removed. The once over: bz Approved by: re (kensmith)	2007-07-28 07:31:30 +00:00
Mike Silbersack	c325962b47	Export the contents of the syncache to netstat. Approved by: re (kensmith) MFC after: 2 weeks	2007-07-27 00:57:06 +00:00
Andre Oppermann	564aab1fe6	Fix comments in tcp_do_segment(). Approved by: re (kensmith)	2007-07-25 18:48:24 +00:00
Randall Stewart	1b649582bb	- take out a needless panic under invariants for sctp_output.c - Fix addrs's error checking of sctp_sendx(3) when addrcnt is less than SCTP_SMALL_IOVEC_SIZE - re-add back inpcb_bind local address check bypass capability - Fix it so sctp_opt_info is independant of assoc_id postion. - Fix cookie life set to use MSEC_TO_TICKS() macro. - asconf changes o More comment changes/clarifications related to the old local address "not" list which is now an explicit restricted list. o Rename some functions for clarity: - sctp_add/del_local_addr_assoc to xxx_local_addr_restricted() - asconf related iterator functions to sctp_asconf_iterator_xxx() o Fix bug when the same address is deleted and added (and removed from the asconf queue) where the ifa is "freed" twice refcount wise, possibly freeing it completely. o Fix bug in output where the first ASCONF would not go out after the last address is changed (e.g. only goes out when retransmitted). o Fix bug where multiple ASCONFs can be bundled in the same packet with the and with the same serial numbers. o Fix asconf stcb iterator to not send ASCONF until after all work queue entries have been processed. o Change behavior so that when the last address is deleted (auto asconf on a bound all endpoint) no action is taken until an address is added; at that time, an ASCONF add+delete is sent (if the assoc is still up). o Fix local address counting so that address scoping is taken into account. o #ifdef SCTP_TIMER_BASED_ASCONF the old timer triggered sending of ASCONF (after an RTO). The default now is to send ASCONF immediately (except for the case of changing/deleting the last usable address). Approved by: re(ken smith)@freebsd.org	2007-07-24 20:06:02 +00:00
Randall Stewart	52be287ebb	- remove duplicate code from sctp_asconf.c - remove duplicate #include <sys/priv.h> that is not under #ifdef FreeBSD version to allow compile on 6.1 - static analysis changes per the cisco SA tool including: o some SA_IGNORE comments o some checks for NULL before unlock. o type corrections int -> size_t - Fix it so sctp_alloc_asoc takes a thread/proc argument. Without this we pass a NULL in to bind on implicit assoc setup and crash :-( Approved by: re@freebsd.org(Ken Smith)	2007-07-21 21:41:32 +00:00
Robert Watson	08af97b790	Attempt to improve feature parity between UDPv4 and UDPv6 by merging UDPv4 features to UDPv6: - Add MAC checks on delivery and MAC labeling on transmit. - Check for (and reject) datagrams with destination port 0. - For multicast delivery, check the source port only if the socket being considered as a destination has been connected. - Implement UDP blackholing based on net.inet.udp.blackhole. - Add a new ICMPv6 unreachable reply rate limiting category for failed delivery attempts and implement rate limiting for UDPv6 (submitted by bz). Approved by: re (kensmith) Reviewed by: bz	2007-07-19 22:34:25 +00:00
Randall Stewart	18e198d3a3	- added pre-checks to the bindx call. - use proper tick gathering macro instead of ticks directly. - Placed reasonable boundaries on sets that a user can do that are converted to ticks from ms. - Fix CMT_PF to always check to be sure CMT is on. - Fix ticks use of CMT_PF. - put back code to allow asconfs to be queued while INITs are in flight and before the assoc is established. - During window probes, an ack'd packet might be left with the window probe mark on it causing it to be retransmitted. Change so that the flight decrease macro clears the window_probe mark. - Additional logging flight size/reading and ASOC LOG. This is only enabled if you manually insert things into opt_sctp.h since its a set of debug code only. - Found an interesting SMP race in the way data was appended which could cause a reader to lose a part of a message, had to reorder when we marked the message was complete to after the data was appended. - bug in ADD-IP for the subset bound socket case when the peer has only one address - fix ASCONF implicit success/error handling case - proper support of jails in Freebsd 6> - copy out the timeval for the 64 bit sparc world on cookie-echo alignment error crashes without this). Approved by: re(Ken Smith)	2007-07-17 20:58:26 +00:00
Randall Stewart	b54d3a6c48	- Modular congestion control, with RFC2581 being the default. - CMT_PF states added (w/sysctl to turn the PF version on) - sctp_input.c had a missing incr of cookie case when the auth was bad. This meant a free was called without an increment to refcnt, added increment like rest of code. - There was a case, unlikely, when the scope of the destination changed (this is a TSNH case). In that case, it would not free the alloc'ed asoc (in sctp_input.c). - When listed addresses found a colliding cookie/Init, then the collided upon tcb was not unlocked in sctp_pcb.c - Add error checking on arguments of sctp_sendx(3) to prevent it from referencing a NULL pointer. - Fix an error return of sctp_sendx(3), it was returing ENOMEM not -1. - Get assoc id was changed to use the sanctified socket api method for getting a assoc id (PEER_ADDR_INFO instead of PEER_ADDR_PARAMS). - Fix it so a peeled off socket will get a proper error return if it trys to send to a different address then it is connected to. - Fix so that select_a_stream can avoid an endless loop that could hang a caller. - time_entered (state set time) was not being set in all cases to the time we went established. Approved by: re(ken smith)	2007-07-14 09:36:28 +00:00
Robert Watson	43bbb6aa10	Further cleanup of UDPv4: - Move udp_sendspace and udp_recvspace global variables and associated sysctls to the top of the file where most other such things are present. - Rename static variable 'blackhole' to 'udp_blackhole' and unstaticize so that we can add blackhole support for UDPv6 using the same MIB variable. - Move udp_append() above udp_input() to match the function order in udp6_usrreq.c. Approved by: re (kensmith)	2007-07-10 09:30:46 +00:00
Bruce M Simpson	d90b8675c2	Fix a regression in IPv4 multicast join path (IP_ADD_MEMBERSHIP). With the in_mcast.c code, if an interface for an IPv4 multicast join was not specified, and a route did not exist for the specified group in the unicast forwarding tables, the join would be rejected with the error EADDRNOTAVAIL. This change restores the old behaviour whereby if no interface is specified, and no route exists for the group destination, the IPv4 address list is walked to find a non-loopback, multicast-capable interface to satisfy the join request. This should resolve problems with starting multicast services during system boot or when a default forwarding entry does not exist. Approved by: re (rwatson)	2007-07-09 10:36:47 +00:00
Robert Watson	bd84d20457	Minor UDPv4 cleanup: capitalize comment, move statistics update after mbuf free to be consistent with other error handling, and release socket buffer lock before freeing mbufs and statistics updates rather than after. Approved by: re (kensmith)	2007-07-07 09:46:34 +00:00
Peter Wemm	477d44c467	Fix a second warning, introduced by my last "fix". I committed the wrong diff from the wrong machine. Pointy hat to: peter Approved by: re (rwatson - blanket, several days ago)	2007-07-05 06:04:46 +00:00
Peter Wemm	9fb5d4c064	Fix cast-qualifiers warning when INET6 is not present Approved by: re (rwatson)	2007-07-05 05:55:57 +00:00
Max Laier	60ee384760	Link pf 4.1 to the build: - move ftp-proxy from libexec to usr.sbin - add tftp-proxy - new altq mtag link Approved by: re (kensmith)	2007-07-03 12:46:08 +00:00
George V. Neville-Neil	b2630c2934	Commit the change from FAST_IPSEC to IPSEC. The FAST_IPSEC option is now deprecated, as well as the KAME IPsec code. What was FAST_IPSEC is now IPSEC. Approved by: re Sponsored by: Secure Computing	2007-07-03 12:13:45 +00:00
Randall Stewart	5bead43650	- Consolidate the code that free's chunks to actually also call the sctp_free_remote_address() function. - Assure that when we allocate a chunk the whoTo is NULL, also when we free it and place it into the cache we NULL it (that way the consolidation code will always work). - Fix a small race, when a empty data holder is left on the stream out queue, and both sides do a shutdown, the empty data holder would prevent us from sending a SHUTDOWN-ACK and at the same time we never would cleanup the empty holder (since nothing was ever in queue). We now add a utility function that a) cleans up empty holders and b) properly determines if there are still pending data chunks on the stream out wheel. Approved by: re@freebsd.org (Ken Smith)	2007-07-02 19:22:22 +00:00
Robert Watson	02dd4b5cbd	Continue pre-7.0 privilege cleanup: update suser(9) comments to be priv(9) comments. Approved by: re (bmah)	2007-07-02 15:44:30 +00:00
George V. Neville-Neil	0d29af67f2	Fix a dangling netinet6 to netipsec transition for SCTP include files. Approved by: re	2007-07-01 14:18:20 +00:00
George V. Neville-Neil	2cb64cb272	Commit IPv6 support for FAST_IPSEC to the tree. This commit includes only the kernel files, the rest of the files will follow in a second commit. Reviewed by: bz Approved by: re Supported by: Secure Computing	2007-07-01 11:41:27 +00:00
Randall Stewart	9ceab0faf0	- When a SCTP socket is closed, but the last data SACK is lost, we would incorrectly abort the association instead of retransmitting the SACK. Approved by: re@freebsd.org (Ken Smith)	2007-06-29 15:14:23 +00:00
Randall Stewart	97c76f10a0	- Update bindx address checking to properly screen out address per the socket api, adding port validation. We allow port 0 or the already bound port number and no others. Approved by: re@freebsd.org (Ken Smith)	2007-06-25 19:05:26 +00:00
Randall Stewart	a964e8de4c	- Fix type casts in calling sctp_m_getptr, it expects a int not an unsigned (returned by sizeof) also add cast to comparison check for size bounds. Approved by: re(bmah@freebsd.org)	2007-06-22 14:40:09 +00:00
Randall Stewart	671d309c7c	- Fix stream reset so it limits the number of streams that can be listed - Fix fwd-tsn to use proper accessor so it does not overrun mbufs - Fix stream reset error reporting to actually work (it has always been broken if the peer rejects a stream reset) - Some 64 bit friendly changes Approved by: re(bmah@freebsd.org)	2007-06-22 13:50:56 +00:00
Randall Stewart	ea1fbec59a	- Two more static analisys bugs found by cisco's tool on a subsequent run.	2007-06-18 22:36:52 +00:00
Randall Stewart	eacc51c5b6	- Fixes cstatic issues found by cisco sa tool (missing frees and such on error legs) - align sctp_sockstore to 64 bit boundary ..	2007-06-18 21:59:15 +00:00
Maxim Konovalov	d069a5d478	o Make ipfw set more robust -- now it is possible: - to show a specific set: ipfw set 3 show - to delete rules from the set: ipfw set 9 delete 100 200 300 - to flush the set: ipfw set 4 flush - to reset rules counters in the set: ipfw set 1 zero PR: kern/113388 Submitted by: Andrey V. Elsukov Approved by: re (kensmith) MFC after: 6 weeks	2007-06-18 17:52:37 +00:00
Randall Stewart	d95ddf0251	Add additional logging level mask for packet_logging too.	2007-06-18 13:57:37 +00:00
Randall Stewart	19d8ca2eaf	- The packet log needs to copy all of the buffer not to the end.	2007-06-17 23:43:37 +00:00
Randall Stewart	75298de2a0	Back out last change to inpcb_free. Turns out we need to hold off freeing if there is data pending ... someone might do send/close. Which means we want the data to go and then close it after startup. Added comments to the code as well to note that this is done for a reason.	2007-06-17 19:27:46 +00:00
Matt Jacob	cce418d3bf	Make gcc4.2 happy and zero save_ip for the unlikely (blackhole != 0) codepath.	2007-06-17 04:07:11 +00:00
Randall Stewart	e42a0f5e72	- For sctp_input/sctp6_input add announcment when a packet arrives (debug) - re-factor the packet drop in sctp_output a bit more, we don't need the trim after all, but the size calc is now corrected. - When a assoc is in the COOKIE-ECHO/COOKIE-WAIT state and the user closes, it should not matter if data is queued, the assoc should be purged. - In error leg a missing free_chunk when iph comes in NULL (should not happen but just in case).	2007-06-17 01:36:02 +00:00
Matt Jacob	27d65ef267	Replace incorrect local OFFSET_OF macro with the correct and generic offsetof macro.	2007-06-17 00:33:34 +00:00
Matt Jacob	fbdd20a1ae	Simplification to quiet a gcc4.2 warning. Just by setting match.s_addr to nonzero you fulfill the same function as the variable 'cmp'. so you might as well zero match and test against it later. Reviewed by: timeout on review request	2007-06-17 00:31:24 +00:00
Randall Stewart	ca2cc3feac	- Better handle sending large pkt-drops. We were not triming the data with m_adj if a large pkt arrived with a bad csum some systems can't handle you not triming the tail (think panda :-D)	2007-06-16 14:03:15 +00:00
Randall Stewart	48dabb921d	- Raise max range of sctp_logging sysctl so panda does not disallow us to turn on logging levels.	2007-06-16 03:28:18 +00:00
Randall Stewart	72fb6fdb41	- Matthew's changes to get inlines out, plus a few of my own to deal with the VRF inline function -> becomes a macro now. Submitted by: Matthew Jacobs	2007-06-16 00:33:47 +00:00

1 2 3 4 5 ...

2933 Commits