freebsd-nq

Author	SHA1	Message	Date
Will Andrews	15249f73e9	Fix CARP in backup mode by properly registering its hooks for INET and INET6 using ipproto_{un,}register() and the newly created ip6proto_{un,}register() so that it can again receive IPPROTO_CARP packets allowing its state machine to work. Reviewed by: bz Approved by: ken (mentor)	2010-09-06 21:06:06 +00:00
Will Andrews	e24fa11d3e	Fix static kernel builds with carp(4) by changing its SYSINIT order so that it is initialized after basic protocol initialization, which allows it to register via pf_proto_register(). Reviewed by: bz Approved by: ken (mentor)	2010-09-06 21:03:30 +00:00
Gleb Smirnoff	14a268a073	in_delayed_cksum() requires host byte order. Reported by: Alexander Levin <amindomao googlemail.com> MFC after: 1 week	2010-09-06 13:17:01 +00:00
Michael Tuexen	049640c1f0	Implement correct handling of address parameter and sendinfo for SCTP send calls. MFC after: 4 weeks.	2010-09-05 20:13:07 +00:00
Randall Stewart	52129fcd78	Fix some CLANG warnings. One clang warning is left due to the fact that it's bogus: nam->sa_family will not change from AF_INET6 to AF_INET (but clang thinks it does ;-D)	2010-09-05 13:41:45 +00:00
Bjoern A. Zeeb	42db1b87d6	In case of RADIX_MPATH do not leak the IN_IFADDR read lock on early return. MFC after: 3 days	2010-09-04 16:06:01 +00:00
Bjoern A. Zeeb	1b48d24533	MFp4 CH=183052 183053 183258: In protosw we define pr_protocol as short, while on the wire it is a uint8_t. That way we can have "internal" protocols like DIVERT, SEND or gaps for modules (PROTO_SPACER). Switch ipproto_{un,}register to accept a short protocol number(*) and do an upfront check for valid boundaries. With this we also consistently report EPROTONOSUPPORT for out of bounds protocols, as we did for proto == 0. This allows a caller to not error for this case, which is especially important if we want to automatically call these from domain handling. (*) the functions have been without any in-tree consumer since the initial introduction, so this is considered safe. Implement ip6proto_{un,}register() similarly to their legacy IP counterparts to allow modules to hook up dynamically. Reviewed by: philip, will MFC after: 1 week	2010-09-02 17:43:44 +00:00
Michael Tuexen	fc0487080a	Fix a bug which results in peer IPv4 addresses a.b.c.d with 224<=d<=239 incorrectly being detected as multicast addresses on little endian systems. MFC after: 2 weeks	2010-09-01 16:11:26 +00:00
Maxim Konovalov	5a47f206a1	o Some programs could send broadcast/multicast traffic to ipfw pseudo-interface. This leads to a panic due to uninitialized if_broadcastaddr address. Initialize it and implement ip_output() method to prevent mbuf leak later. ipfw pseudo-interface should never send anything therefore call panic(9) in if_start() method. PR: kern/149807 Submitted by: Dmitrij Tejblum MFC after: 2 weeks	2010-08-30 09:29:51 +00:00
Michael Tuexen	9c7635e18b	Fix the SCTP_WITH_NO_CSUM option when used in combination with interface supporting CRC offload. While at it, make use of the feature that the loopback interface provides CRC offloading. MFC after: 4 weeks	2010-08-29 18:50:30 +00:00
Michael Tuexen	e24ea413e0	Bugfix: Do not send a packet drop report in response to a received INIT-ACK with incorrect CRC.	2010-08-28 21:15:00 +00:00
Michael Tuexen	20083c2eb1	Fix the switching on/off of CMT using sysctl and socket option. Fix the switching on/off of PF and NR-SACKs using sysctl. Add minor improvement in handling malloc failures. Improve the address checks when sending. MFC after: 4 weeks	2010-08-28 17:59:51 +00:00
John Baldwin	98b9eb0db2	Simplify the tcp pcblist estimate logic slightly. MFC after: 3 days	2010-08-27 18:17:46 +00:00
Andre Oppermann	8502ec25dc	Use timestamp modulo comparison macro for automatic receive buffer scaling to correctly handle wrapping of ticks value. MFC after: 1 week	2010-08-27 12:34:53 +00:00
Ana Kukec	1db8d1f843	MFp4: anchie_soc2009 branch: Add kernel side support for Secure Neighbor Discovery (SeND), RFC 3971. The implementation consists of a kernel module that gets packets from the nd6 code, sends them to user space on a dedicated socket and reinjects them back for further processing. Hooks are used from nd6 code paths to divert relevant packets to the send implementation for processing in user space. The hooks are only triggered if the send module is loaded. In case no user space application is connected to the send socket, processing continues normally as if the module were not loaded. Unloading the module is not possible at this time due to missing nd6 locking. The native SeND socket is similar to a raw IPv6 socket but with its own, internal pseudo-protocol. Approved by: bz (mentor)	2010-08-19 11:31:03 +00:00
Andre Oppermann	c3f0bdc66b	If a TCP connection has been idle for one retransmit timeout or more it must reset its congestion window back to the initial window. RFC3390 has increased the initial window from 1 segment to up to 4 segments. The initial window increase of RFC3390 wasn't reflected into the restart window which remained at its original defaults of 4 segments for local and 1 segment for all other connections. Both values are controllable through sysctl net.inet.tcp.local_slowstart_flightsize and net.inet.tcp.slowstart_flightsize. The increase helps TCP's slow start algorithm to open up the congestion window much faster. Reviewed by: lstewart MFC after: 1 week	2010-08-18 18:05:54 +00:00
Andre Oppermann	b7d747ecec	Untangle the net.inet.tcp.log_in_vain and net.inet.tcp.log_debug sysctl's and remove any side effects. Both sysctl's share the same backend infrastructure and due to the way it was implemented enabling net.inet.tcp.log_in_vain would also cause log_debug output to be generated. This was surprising and eventually annoying to the user. The log output backend is kept the same but a little shim is inserted to properly separate log_in_vain and log_debug and to remove any side effects. PR: kern/137317 MFC after: 1 week	2010-08-18 17:39:47 +00:00
Bjoern A. Zeeb	2278f9927d	When calculating the expected memory size for userspace, also take the number of syncache entries into account for the surplus we add to account for a possible increase of records in the re-entry window. Discussed with: jhb, silby MFC after: 1 week	2010-08-18 09:28:12 +00:00
John Baldwin	c007b96a78	Ensure a minimum "slop" of 10 extra pcb structures when providing a memory size estimate to userland for pcb list sysctls. The previous behavior of a "slop" of n/8 does not work well for small values of n (e.g. no slop at all if you have less than 8 open UDP connections). Reviewed by: bz MFC after: 1 week	2010-08-17 16:41:16 +00:00
Andre Oppermann	e4e9266071	Fix the interaction between 'ICMP fragmentation needed' MTU updates, path MTU discovery and the tcp_minmss limiter for very small MTU's. When the MTU suggested by the gateway via ICMP, or if there isn't any the next smaller step from ip_next_mtu(), is lower than the floor enforced by net.inet.tcp.minmss (default 216) the value is ignored and the default MSS (512) is used instead. However the DF flag in the IP header is still set in tcp_output() preventing fragmentation by the gateway. Fix this by using tcp_minmss as the MSS and clear the DF flag if the suggested MTU is too low. This turns off path MTU discovery for the remainder of the session and allows fragmentation to be done by the gateway. Only MTU's smaller than 256 are affected. The smallest official MTU specified is for AX.25 packet radio at 256 octets. PR: kern/146628 Tested by: Matthew Luckie <mjl-at-luckie org nz> MFC after: 1 week	2010-08-15 13:25:18 +00:00
Andre Oppermann	0e678ed825	Initializing the new error variable to zero in syncache_socket() is not necessary. Noticed by: bz	2010-08-15 13:07:08 +00:00
Andre Oppermann	943044b01f	Add more logging points for failures in syncache_socket() to report when a new socket couldn't be created because one of in_pcbinshash(), in6_pcbconnect() or in_pcbconnect() failed. Logging is conditional on net.inet.tcp.log_debug being enabled. MFC after: 1 week	2010-08-15 09:30:13 +00:00
Andre Oppermann	153e5b57af	When using TSO and sending more than TCP_MAXWIN sendalot is set and we loop back to 'again'. If the remainder is less or equal to one full segment, the TSO flag was not cleared even though it isn't necessary anymore. Enabling the TSO flag on a segment that doesn't require any offloaded segmentation by the NIC may cause confusion in the driver or hardware. Reset the internal tso flag in tcp_output() on every iteration of sendalot. PR: kern/132832 Submitted by: Renaud Lienhart <renaud-at-vmware com> MFC after: 1 week	2010-08-14 21:41:33 +00:00
Andre Oppermann	40fe9eff47	Change the messages of the ICMP bad port bandwidth limiter from a kernel printf to a log output with the priority of LOG_NOTICE. This way the messages still show up in /var/log/messages but no longer spam the console every other second on busy servers that are port scanned: "Limiting open port RST response from 114 to 100 packets/sec" PR: kern/147352 Submitted by: Eugene Grosbein <eugen-at-eg sd rdtc ru> MFC after: 1 week	2010-08-14 21:04:27 +00:00
Andre Oppermann	bee4e5afa9	Disable TCP inflight limiter by default. It was experimental and interferes with the normal congestion control algorithms by instating a separate, possibly lower, ceiling for the amount of data that is in flight to the remote host. With high speed internet connections the inflight limit frequently has been estimated too low due to the noisy nature of the RTT measurements. This code gives way for the upcoming pluggable congestion control framework. It is the task of the congestion control algorithm to set the congestion window and amount of inflight data without external interference. Reviewed by: lstewart MFC after: 1 week Removal after: 1 month	2010-08-14 20:40:55 +00:00
Will Andrews	9963e8a52c	Unbreak LINT by moving all carp hooks to net/if.c / netinet/ip_carp.h, with the appropriate ifdefs. Reviewed by: bz Approved by: ken (mentor)	2010-08-11 20:18:19 +00:00
Will Andrews	54bfbd5153	Allow carp(4) to be loaded as a kernel module. Follow precedent set by bridge(4), lagg(4) etc. and make use of function pointers and pf_proto_register() to hook carp into the network stack. Currently, because of the uncertainty about whether the unload path is free of race condition panics, unloads are disallowed by default. Compiling with CARPMOD_CAN_UNLOAD in CFLAGS removes this anti foot shooting measure. This commit requires IP6PROTOSPACER, introduced in r211115. Reviewed by: bz, simon Approved by: ken (mentor) MFC after: 2 weeks	2010-08-11 00:51:50 +00:00
Xin LI	9fe5092de1	Address an edge condition that we found at work, where the carp(4) interface goes to issue LINK_UP, then LINK_DOWN, then LINK_UP at cold boot. This behavior is not observed when carp(4) interface is created slightly later, when the underlying interface is fully up. Before this change what happens at boot is roughly: - ifconfig creates em0 interface; - ifconfig clones a carp device using em0; (em0's link state is DOWN at this point) - carp state: INIT -> BACKUP [*] - carp state: BACKUP -> MASTER - [Some negotiation between em0 and switch] - em0 kicks up link state change event (em0's link state is now UP at this point) - do_link_state_change() -> carp_carpdev_state() - carp state: MASTER -> INIT (via carp_set_state(sc, INIT)) [+] - carp state: INIT -> BACKUP - carp state: BACKUP -> MASTER At the [*] stage, em0 did not receive any broadcast message from other nodes, and assumed our node is the master, thus carp(4) sets the link state to "UP" after becoming a master. At [+], the master status is forcibly set to "INIT", then an election is held, after which our node would actually become a master. We believe that at the [*] stage, the master status should remain as "INIT" since the underlying parent interface's link state is not up. Obtained from: iXsystems, Inc. Reported by: jpaetzel MFC after: 2 months	2010-08-08 07:04:27 +00:00
Ed Schouten	367698346b	Don't use struct timezone. The timezone structure acquired by gettimeofday() is not used at all. Just remove it.	2010-08-08 02:51:32 +00:00
Michael Tuexen	87a37484eb	Fix a bug where endpoints bound to wildcard addresses were using addresses not announced to the peer due to address scoping. MFC after: 3 weeks	2010-08-05 16:52:13 +00:00
Michael Tuexen	d2604d08d0	Cleanup code. MFC after: 2 weeks	2010-08-01 08:06:59 +00:00
Bjoern A. Zeeb	19291ab3de	Document the mandatory argument to the arptimer() and nd6_llinfo_timer() functions with a KASSERT(). Note: there is no need to return after panic. In the legacy IP case, only assign the arg after the check, in the IPv6 case, remove the extra checks for the table and interface as they have to be there unless we freed and forgot to cancel the timer. It doesn't matter anyway as we would panic on the NULL pointer deref immediately and the bug is elsewhere. This unifies the code of both address families to some extent. Reviewed by: rwatson MFC after: 6 days	2010-07-31 21:33:18 +00:00
Bjoern A. Zeeb	4579930d2e	MFp4 @181628: Free the rtentry after we disconnected it from the FIB and are counting it as rttrash. There might still be a chance we leak it from a different code path but there is nothing we can do about this here. Sponsored by: ISPsystem (in February) Reviewed by: julian (in February) MFC after: 2 weeks	2010-07-31 15:31:23 +00:00
Andre Oppermann	28a53f037a	Fix a bug in syncache where the initial CWND for new incoming connections was limited to one segment under the faulty assumption of a retransmit. Due to this the opportunity to initialize the increased congestion window according to RFC3390 was missed. Support for RFC3465 introduced in r187289 uncovered the bug as the ACK to SYN/ACK no longer caused snd_cwnd increase by MSS (actually, this increase shouldn't happen as it's explicitly forbidden by RFC3390, but it's another issue). Snd_cwnd remains really small (1*MSS + 1) and this causes really bad interaction with delayed acks on other side. The variable name sc_rxmits is a bit misleading as it counts all transmits, not just retransmits. Submitted by: Maxim Dounin <mdounin-at-mdounin-dot-ru> MFC after: 10 days	2010-07-30 21:45:53 +00:00
Randall Stewart	753358d725	Fix the comment block that has the nice table to really have the nice table :-) MFC after: 1 month	2010-07-29 12:01:59 +00:00
Randall Stewart	44fbe46280	PR SCTP Bugs. Basically a full sized frame of PR SCTP FWD-TSN's would not be sent and thus cause a stalled connection. The rwnd calculation was also off on the receiver side for PR-SCTP. MFC after: 1 month	2010-07-29 11:37:04 +00:00
Gleb Smirnoff	b9bff254af	Fix operation of "netgraph" action in conjunction with the net.inet.ip.fw.one_pass sysctl. The "ngtee" action is still broken. PR: kern/148885 Submitted by: Nickolay Dudorov <nnd mail.nsk.ru>	2010-07-27 14:26:34 +00:00
Michael Tuexen	74e906fa94	Fix a bug where the length of a FORWARD-TSN chunk was set incorrectly in the chunk. This resulted in malformed frames. Remove a duplicate assignment. MFC after: 2 weeks	2010-07-26 09:26:55 +00:00
Randall Stewart	8db924defb	Make sure that we report chunks if a socket still exists that were not sent. In either case carefully remove the data if it does not get taken by the reporting routines. MFC after: 2 weeks	2010-07-26 09:22:52 +00:00
Randall Stewart	6c065bbe06	When counting the number of chunks in the retransmission queue to validate the retran count, we need to include the chunks in the control send queue too. Otherwise the count will not match and you will get the invariant warning if invariants are on. MFC after: 2 weeks	2010-07-26 09:20:55 +00:00
Lawrence Stewart	79848522b5	- Move common code from the hook functions that fills in a packet node struct to a separate inline function. This further reduces duplicate code that didn't have a good reason to stay as it was. - Reorder the malloc of a pkt_node struct in the hook functions such that it only occurs if we managed to find a usable tcpcb associated with the packet. - Make the inp_locally_locked variable's type consistent with the prototype of siftr_siftdata(). Sponsored by: FreeBSD Foundation	2010-07-18 05:09:10 +00:00
Warner Losh	43e05a6523	machine/cpu.h isn't appropriate for this file, so remove it	2010-07-16 06:32:38 +00:00
Luigi Rizzo	71ad35a185	remove some conditional #ifdefs (no-op on FreeBSD); run the timer routine on cpu 0.	2010-07-15 14:43:12 +00:00
Luigi Rizzo	297151a0f3	whitespace fixes	2010-07-15 14:37:59 +00:00
Luigi Rizzo	e6fef96ef4	fix a comment and final empty line	2010-07-15 14:37:02 +00:00
Lawrence Stewart	adc5f0109d	The SIFTR DPCPU statistics struct was not being zeroed between enable/disable cycles so the values would accumulate rather than reset for each cycle. Sponsored by: FreeBSD Foundation	2010-07-13 08:23:46 +00:00
Lawrence Stewart	985147dec6	Catch up with the rename of DPCPU_SUM to DPCPU_VARSUM in r209978. Sponsored by: FreeBSD Foundation	2010-07-13 07:00:57 +00:00
Gleb Smirnoff	281b584e8e	Improve last commit: use bpf_mtap2() to avoid stack usage. Prodded by: julian	2010-07-09 11:27:33 +00:00
Gleb Smirnoff	a5f9fc17c2	Since r209216 bpf(4) searches for mbuf_tags(9) and thus will not work with a stub m_hdr instead of a full mbuf. PR: kern/148050	2010-07-08 13:07:40 +00:00
Randall Stewart	478fbccb67	This fixes a crash in SCTP. It was possible to have a large number of packets queued to a crashing process. In a specific case you may get 2 ABORT's back (from say two packets in flight). If the aborts happened to be processed at the same time it's possible to have one free the association while the other is trying to report all the outbound packets. When this occurred it could lead to a crash. MFC after: 3 days	2010-07-03 14:03:31 +00:00
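
Note on fc0487080a (peer IPv4 addresses a.b.c.d with 224<=d<=239 detected as multicast on little endian systems): the log does not state the cause. The sketch below assumes it was a class D test applied to an address still in network byte order, which is only a guess at the mechanism. The function names are hypothetical and are not taken from the SCTP sources.

```python
import socket
import struct


def is_multicast_host_order(addr: int) -> bool:
    """Class D test (224.0.0.0/4); only meaningful for a host byte order value."""
    return (addr & 0xF0000000) == 0xE0000000


def check_without_conversion(dotted: str) -> bool:
    # Hypothetical buggy path: a little-endian host reads the four wire
    # bytes as a native integer, so the last octet (d) lands in the top byte.
    (value,) = struct.unpack("<I", socket.inet_aton(dotted))
    return is_multicast_host_order(value)


def check_with_conversion(dotted: str) -> bool:
    # Hypothetical fixed path: interpret the wire bytes as big-endian
    # (the equivalent of ntohl()) before applying the class D test.
    (value,) = struct.unpack("!I", socket.inet_aton(dotted))
    return is_multicast_host_order(value)
```

Under that assumption, a unicast address such as 10.0.0.230 passes the unconverted check because its last octet falls in 224..239, while a real multicast address such as 224.0.0.1 fails it. Converting first gives the expected answer for both.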

3858 Commits