Commit Graph

4563 Commits

Author SHA1 Message Date
Luigi Rizzo
9df9e62789 nobody uses this file except the userspace ipfw code, but the cast
of a pointer to an integer needs a cast to prevent a warning for
size mismatch.

MFC after:	1 week
2012-07-31 08:04:49 +00:00
Michael Tuexen
77db9ed99e Fix the sctp_sockstore union such that userland programs don't depend
on INET and/or INET6 to be defined and in-tune with how the kernel
was compiled.

MFC after:	3 days
Discussed with:	rrs
2012-07-26 08:10:29 +00:00
Bjoern A. Zeeb
8253dcabe7 Fix a problem when CARP is enabled on the interface for IPv4
but not for IPv6.  The current checks in nd6_nbr.c along with the
old version will result in ifa being NULL and subsequently the
packet will be dropped.  This prevented NS/NA, from working and
with that IPv6.

Now return the ifa from the carp lookup function in two cases:
1) if the address matches, is a carp address, and we are MASTER
   (as before),
2) if the address matches but it is not a carp address at all (new).

Reported by:	Peter Wemm (new Y! FreeBSD cluster, eating our own dogfood)
Tested on:	New Y! FreeBSD cluster machines
Reviewed by:	glebius
2012-07-25 12:14:39 +00:00
Robert Watson
0989f56cff Update some stale comments regarding tcbinfo locking in the TCP input
path: read locks on tcbinfo are no longer used, so won't happen.  No
functional change.

MFC after:	3 days
2012-07-22 17:31:36 +00:00
Gleb Smirnoff
3c73180f55 Plug a reference leak: before doing 'goto again' we need to unref
ia->ia_ifa if there is any.

Submitted by:	Andrey Zonov <andrey zonov.org>
2012-07-18 08:58:30 +00:00
Gleb Smirnoff
b9abeb9d99 When traversing global in_ifaddr list in the IFP_TO_IA() macro, we need
to obtain IN_IFADDR_RLOCK().
2012-07-18 08:41:00 +00:00
Michael Tuexen
d07b2ac6c6 Fix a refcount bug when freeing an association.
While there: Change code to be consistent.
Discussed with rrs@.
MFC after: 3 days
2012-07-17 13:03:47 +00:00
Gleb Smirnoff
df0633a145 If ip_output() returns EMSGSIZE to tcp_output(), then the latter calls
tcp_mtudisc(), which in its turn may call tcp_output(). Under certain
conditions (must admit they are very special) an infinite recursion can
happen.

To avoid recursion we can pass struct route to ip_output() and obtain
correct mtu. This allows us not to use tcp_mtudisc() but call tcp_mss_update()
directly.

PR:		kern/155585
Submitted by:	Andrey Zonov <andrey zonov.org> (original version of patch)
2012-07-16 07:08:34 +00:00
Michael Tuexen
5e20b91dbe Changes which improve compilation if neither INET nor INET6 is defined.
MFC after: 3 days
2012-07-15 20:16:17 +00:00
Michael Tuexen
e0e00a4d0f #ifdef INET and INET6 consistently. This also fixes a bug, where
it was done wrong.

MFC after: 3 days
2012-07-15 11:04:49 +00:00
Michael Tuexen
ad83c8a5a5 Provide the correct notification type (SCTP_SEND_FAILED_EVENT)
for unsent messages.

MFC after: 3 days
2012-07-14 21:25:14 +00:00
Michael Tuexen
b5e0cd79c2 Use case for selecting the address family (as in other places).
MFC after: 3 days
2012-07-14 20:08:03 +00:00
Michael Tuexen
c9afdc3aa9 Use case for selecting the address family (as in other places).
MFC after: 3 days
2012-07-14 19:44:39 +00:00
Michael Tuexen
5db75fc724 Fix a bug introduced in r237715.
MFC after:i 3 days.
2012-07-09 10:59:39 +00:00
Hiroki Sato
8efbd296e0 Make ipfw0 logging pseudo-interface clonable. It can be created automatically
by $firewall_logif rc.conf(5) variable at boot time or manually by ifconfig(8)
after a boot.

Discussed on:	freebsd-ipfw@
2012-07-09 07:16:19 +00:00
Alexander V. Chernikov
30835e6d50 Finally fix lookup (account remaining '\0') and deletion
(provide valid key length for radix lookup).

Submitted by:    Ihor Kaharlichenko<madkinder at gmail.com>  (prev version)
Approved by:     kib(mentor)
MFC after:       3 days

Sponsored by:    Shtorm ISP
2012-07-08 21:13:04 +00:00
Michael Tuexen
1ebccfd0a2 Use consistent method to determine IPV4_OUTPUT/IPV6_OUTPUT.
MFC after: 3 days
2012-07-04 20:59:30 +00:00
Michael Tuexen
c76b949968 Use CSUM_SCTP_IPV6 for IPv6.
MFC after: 3 days
2012-07-04 20:29:16 +00:00
Gleb Smirnoff
bf9840512a When ip_output()/ip6_output() is supplied a struct route *ro argument,
it skips FLOWTABLE lookup. However, the non-NULL ro has dual meaning
here: it may be supplied to provide route, and it may be supplied to
store and return to caller the route that ip_output()/ip6_output()
finds. In the latter case skipping FLOWTABLE lookup is pessimisation.

The difference between struct route filled by FLOWTABLE and filled
by rtalloc() family is that the former doesn't hold a reference on
its rtentry. Reference is hold by flow entry, and it is about to
be released in future. Thus, route filled by FLOWTABLE shouldn't
be passed to RTFREE() macro.

- Introduce new flag for struct route/route_in6, that marks route
  not holding a reference on rtentry.
- Introduce new macro RO_RTFREE() that cleans up a struct route
  depending on its kind.
- All callers to ip_output()/ip6_output() that do supply non-NULL
  but empty route should use RO_RTFREE() to free results of
  lookup.
- ip_output()/ip6_output() now do FLOWTABLE lookup always when
  ro->ro_rt == NULL.

Tested by:	tuexen (SCTP part)
2012-07-04 07:37:53 +00:00
Michael Tuexen
e3d6ef0b03 Iniitialize a variable.
MFC after: 3 days
2012-07-03 21:41:19 +00:00
Mikolaj Golub
b7d61ba317 Don't check for ifp != NULL before KASSERT, as ifp may not be NULL here
(it is dereferenced below).

Discussed with:	jhb
MFC after:	1 week
2012-07-03 19:04:18 +00:00
Mikolaj Golub
3a288e901f Fix RTTVAR scale in net.inet.tcp.hostcache.list sysctl.
Reviewed by:	andre
MFC after:	3 days
2012-07-03 18:59:13 +00:00
Isabell Long
82cecbea69 - Make ipfw's sched rules case insensitive, for user-friendliness.
- Add a note to the ipfw(8) man page about the rules no longer being
case sensitive.
- Fix some typos in the man page.

PR:		docs/164772
Reviewed by:	bz
Approved by:	gabor (doc mentor, src committer)
MFC after:	2 weeks
2012-07-03 08:42:48 +00:00
Gleb Smirnoff
3df6468a2d Remove route caching from IP multicast routing code. There is no
reason to do that, and also, cached route never got unreferenced,
which meant a reference leak.

Reviewed by:	bms
2012-07-02 19:44:18 +00:00
Michael Tuexen
a8775ad93d Move common code parts to sctp_common_input_processing().
MFC after: 3 days
2012-07-02 16:44:09 +00:00
Michael Tuexen
d8d544892c Remove dead code (on FreeBSD) as suggested by glebius@.
MFC after: 3 days
2012-07-02 16:40:11 +00:00
Michael Tuexen
b1754ad17b Pass the src and dst address of a received packet explicitly around.
MFC after: 3 days
2012-06-28 16:01:08 +00:00
Michael Tuexen
6dc5aabcb7 Unify sctp_input() and sctp6_input().
MFC after: 3 days
2012-06-25 19:13:43 +00:00
Michael Tuexen
39803b8c58 Whitespace cleanup.
MFC after: 3 days
2012-06-25 17:15:09 +00:00
Michael Tuexen
20cc2188f3 Pass the packet length explicitly around.
MFC after: 3 days
2012-06-24 23:12:24 +00:00
Michael Tuexen
329204ff00 Remove redundant check.
MFC after: 3 days
2012-06-24 22:22:44 +00:00
Michael Tuexen
f938425253 Do packet logging in a consistent way.
MFC after: 3 days
2012-06-24 21:25:54 +00:00
Alexander V. Chernikov
0bfbe5f660 Fix interface matching by ipfw table
Submitted by:     Ihor Kaharlichenko <madkinder@gmail.com>
Tested by:        Ihor Kaharlichenko <madkinder@gmail.com>
Approved by:      kib(mentor)
MFC after:        3 days
2012-06-23 12:40:24 +00:00
Michael Tuexen
9f4ab62e92 Remove redundant #ifdef. Reported by gnn@.
MFC after: 3 days
2012-06-21 12:51:24 +00:00
Navdeep Parhar
09fe63205c - Updated TOE support in the kernel.
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
  These are available as t3_tom and t4_tom modules that augment cxgb(4)
  and cxgbe(4) respectively.  The cxgb/cxgbe drivers continue to work as
  usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs).  T4 iWARP in the
  works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload?  Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded?  Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by:	bz, gnn
Sponsored by:	Chelsio communications.
MFC after:	~3 months (after 9.1, and after ensuring MFC is feasible)
2012-06-19 07:34:13 +00:00
Michael Tuexen
5ad9e57b3f Add rate limitation for SCTP OOTB responses.
MFC after: 3 days
2012-06-18 17:11:24 +00:00
Michael Tuexen
285052f0aa Cleanup the UDP decapsulation code.
MFC after: 3 days
2012-06-18 17:09:39 +00:00
Michael Tuexen
f30ac43257 Pass flowid explicitly through the stack instead of taking it from
the mbuf chain at different places.
While there: Fix several bugs related to VRFs.

MFC after: 3 days
2012-06-14 06:54:48 +00:00
Joel Dahl
6ae742a549 mdoc: avoid nested displays. Fixes mandoc warnings. 2012-06-13 18:57:27 +00:00
Michael Tuexen
f584d74b4d Add a cmsg of type IP_TOS for UDP/IPv4 sockets to specify the TOS byte.
MFC after: 3 days
2012-06-12 14:56:08 +00:00
Michael Tuexen
3cca425b29 Add a IP_RECVTOS socket option to receive for received UDP/IPv4
packets a cmsg of type IP_RECVTOS which contains the TOS byte.
Much like IP_RECVTTL does for TTL. This allows to implement a
protocol on top of UDP and implementing ECN.

MFC after: 3 days
2012-06-12 14:02:38 +00:00
Michael Tuexen
c9e089587c Unify the sending of ABORT, SHUTDOWN-COMPLETE and ERROR chunks.
While there: Fix also some minor bugs and prepare for SCTP/DTLS.

MFC after: 3 days
2012-06-12 13:15:27 +00:00
Michael Tuexen
17267b32d8 Small cleanup.
MFC after: 3 days
2012-06-12 09:18:28 +00:00
Alexander V. Chernikov
5154b2dffc Validate IPv4 network mask being passed to ipfw kernel interface.
Incorrect mask can possibly be one of the reasons for kern/127209 existance.

Approved by:        kib(mentor)
MFC after:          3 days
2012-06-09 17:39:05 +00:00
Eitan Adler
3e0efd2ec4 Fix style nit: don't use leading zero for dates in .Dd
Prompted by:	brueffer
Approved by:	brueffer
MFC after:	3 days
2012-06-05 03:14:39 +00:00
Maksim Yevmenkin
77d396fd18 Plug more refcount leaks and possible NULL deref for interface
address list.

Submitted by:	scottl@
MFC after:	3 days
2012-06-04 18:43:51 +00:00
Michael Tuexen
2faa5be555 Remove code which is not needed.
MFC after: 3 days
2012-06-03 18:14:57 +00:00
Michael Tuexen
b82bd838f6 Use an existing function to get the source address.
MFC after: 3 days
2012-06-03 14:54:50 +00:00
Michael Tuexen
2566e071ec Honor sysctl for TTL.
MFC after: 3 days
2012-06-02 21:22:26 +00:00
Michael Tuexen
962cef4089 Don't request data from the IPv6 layer, which is not used.
MFC after: 3 days
2012-06-02 20:53:23 +00:00
Michael Tuexen
021416fb85 Remove an unused parameter.
MFC after: 3 days
2012-06-02 13:13:38 +00:00
Bjoern A. Zeeb
5fa2656e55 Make TCP LRO work properly with VIMAGE kernels rather than just panicing.
There's no VIMAGE context set there yet as this is before if_ethersubr.c.

MFC after:	3 days
X-MFC with:	r235981
2012-06-01 11:42:50 +00:00
Michael Tuexen
11ef11db47 Small cleanups. No functional change.
MFC after: 3 days
2012-06-01 08:26:50 +00:00
Michael Tuexen
a6cff10f2a Seperate SCTP checksum offloading for IPv4 and IPv6.
While there: remove some trainling whitespaces.

MFC after: 3 days
X-MFC with: 236170
2012-05-30 20:56:07 +00:00
Gleb Smirnoff
eaf151c49d Improve style(9) of bcopy() to and from mbuf tag.
Submitted by:	bde
2012-05-30 13:51:00 +00:00
Gleb Smirnoff
a856ddc665 After r228571 carp_output() expects carp_softc * pointer in the mtag.
Noticed by:	thompsa
2012-05-30 07:11:27 +00:00
Bjoern A. Zeeb
356ab07e2d It turns out that too many drivers are not only parsing the L2/3/4
headers for TSO but also for generic checksum offloading.  Ideally we
would only have one common function shared amongst all drivers, and
perhaps when updating them for IPv6 we should introduce that.
Eventually we should provide the meta information along with mbufs to
avoid (re-)parsing entirely.

To not break IPv6 (checksums and offload) and to be able to MFC the
changes without risking to hurt 3rd party drivers, duplicate the v4
framework, as other OSes have done as well.

Introduce interface capability flags for TX/RX checksum offload with
IPv6, to allow independent toggling (where possible).  Add CSUM_*_IPV6
flags for UDP/TCP over IPv6, and reserve further for SCTP, and IPv6
fragmentation.  Define CSUM_DELAY_DATA_IPV6 as we do for legacy IP and
add an alias for CSUM_DATA_VALID_IPV6.

This pretty much brings IPv6 handling in line with IPv4.
TSO is still handled in a different way and not via if_hwassist.

Update ifconfig to allow (un)setting of the new capability flags.
Update loopback to announce the new capabilities and if_hwassist flags.

Individual driver updates will have to follow, as will SCTP.

Reported by:	gallatin, dim, ..
Reviewed by:	gallatin (glanced at?)
MFC after:	3 days
X-MFC with:	r235961,235959,235958
2012-05-28 09:30:13 +00:00
Ed Maste
0615959a81 Add IPPROTO_MPLS (rfc4023) IP protocol definition
There are currently no in-tree consumers; I'm adding it now for use by
vendor code.  This matches the change OpenBSD made while implementing
MPLS in gif(4).
2012-05-27 16:16:28 +00:00
Bjoern A. Zeeb
cace7064fc Trim the extra $FreeBSD$ from the comment below the license. We use
the __FBSDID() macro on the file now instead.

MFC after:	3 days
2012-05-26 10:28:11 +00:00
Michael Tuexen
8d9638ab33 Get rid of SCTP specific code to avoid CRC32C computations on loopback.
Just just offloading.
MFC after: 3 days
2012-05-26 09:16:33 +00:00
Michael Tuexen
430061cf18 Undefine SCTP_PACKED before including sctp_uio.h, which doesn't
use it. Spotted by Irene Ruengeler.

MFC after: 3 days
2012-05-25 11:14:08 +00:00
Bjoern A. Zeeb
0cfdff24d2 MFp4 bz_ipv6_fast:
Properly protect the inp read access when handling the control code.
  In the past this was expensive but given the rlock it's not so much
  anymore.

  Spotted while:	optimizing udp6
  Discussed with:	rwatson (a few months ago)

  Sponsored by:	The FreeBSD Foundation
  Sponsored by:	iXsystems

Reviewed by:	gnn (as part of the whole)
MFC After:	3 days
2012-05-25 09:24:45 +00:00
Bjoern A. Zeeb
31bfc56ecd In case forwarding is turned on for a given address family, refuse to
queue the packet for LRO and tell the driver to directly pass it on.
This avoids re-assembly and later re-fragmentation problems when
forwarding.

It's not the best solution but the simplest and most effective for
the moment.

Should have been done:	ages ago
Discussed with and by:	many
MFC after:		3 days
2012-05-25 08:17:59 +00:00
Bjoern A. Zeeb
45747ba53c MFp4 bz_ipv6_fast:
Add code to handle pre-checked TCP checksums as indicated by mbuf
  flags to save the entire computation for validation if not needed.

  In the IPv6 TCP output path only compute the pseudo-header checksum,
  set the checksum offset in the mbuf field along the appropriate flag
  as done in IPv4.

  In tcp_respond() just initialize the IPv6 payload length to 0 as
  ip6_output() will properly set it.

  Sponsored by:	The FreeBSD Foundation
  Sponsored by:	iXsystems

Reviewed by:	gnn (as part of the whole)
MFC After:	3 days
2012-05-25 02:23:26 +00:00
Bjoern A. Zeeb
3a9391defb MFp4 bz_ipv6_fast:
Factor out the tcp_hc_getmtu() call.  As the comments say it
  applies to both v4 and v6, so only write it once making it easier
  to read the protocol family specifc code.

  Sponsored by:	The FreeBSD Foundation
  Sponsored by:	iXsystems

Reviewed by:	gnn (as part of the whole)
MFC After:	3 days
2012-05-25 01:13:39 +00:00
Bjoern A. Zeeb
62b5b6ecd0 MFp4 bz_ipv6_fast:
Significantly update tcp_lro for mostly two things:
  1) introduce basic support for IPv6 without extension headers.
  2) try hard to also get the incremental checksum updates right,
     especially also in the IPv4 case for the IP and TCP header.

  Move variables around for better locality, factor things out into
  functions, allow checksum updates to be compiled out, ...

  Leave a few comments on further things to look at in the future,
  though that is not the full list.

  Update drivers with appropriate #includes as needed for IPv6 data
  type in LRO.

  Sponsored by:	The FreeBSD Foundation
  Sponsored by:	iXsystems

Reviewed by:	gnn (as part of the whole)
MFC After:	3 days
2012-05-24 23:03:23 +00:00
Michael Tuexen
14ce0c0c80 Add sn_send_failed_event to sctp_notification.
MFC after: 3 days
2012-05-24 11:52:57 +00:00
Michael Tuexen
807aad636f Use consistent text at the begining of the files.
MFC after: 3 days
2012-05-23 11:26:28 +00:00
Marcel Moolenaar
157bb65afd Remove unused inclusion of curses.h 2012-05-19 05:14:24 +00:00
Michael Tuexen
635b6ff793 Use a default for max_burst of 4 and l2var of 2.
This was discussed with rrs@.

MFC after: 3 days
2012-05-17 18:12:24 +00:00
Michael Tuexen
8b42cfa708 Support SCTP_EOF also for 1-to-1 style sockets.
MFC after: 3 days
2012-05-17 17:02:47 +00:00
Bjoern A. Zeeb
27f190a3ca Switch to a standard 2 clause BSD license (from bsd-style-copyright).
Approved by:	Myricom Inc. (gallatin)
Approved by:	Intel Corporation (jfv)
2012-05-15 13:23:44 +00:00
Michael Tuexen
389b1b118c Support SCTP_REMOTE_ERROR notification.
MFC after: 3 days
2012-05-13 22:27:54 +00:00
Michael Tuexen
1edc9dbae5 Provide in the SCTP_SEND_FAILED and SCTP_SEND_FAILED_EVENT notifications
the correct ssf_error or ssfe_error as required by RFC 6458.

MFC after: 3 days
2012-05-13 19:32:49 +00:00
Michael Tuexen
4b1f78e1af Provide the error code in SCTP_PEER_ADDR_CHANGE notifications as
specified in RFC 6458.

MFC after: 3 days
2012-05-13 17:36:04 +00:00
Michael Tuexen
b342faf4bf Remove unused constants.
MFC after: 3 days
2012-05-13 17:17:13 +00:00
Michael Tuexen
410a3b1ef0 Use ECONNABORTED in cases where the ABORT was sent to the peer.
MFC after: 3 days
2012-05-13 16:56:16 +00:00
Michael Tuexen
58411b0821 Ensure the user can read COMM_LOST notifications on 1-to-1 style sockets.
MFC after: 3 days
2012-05-13 16:07:53 +00:00
Michael Tuexen
a2b42326b5 Provide in the association change notification the received ABORT chunk
if case of SCTP_COMM_LOST or SCTP_CANT_STR_ASSOC as required by RFC 6458.

MFC after: 3 days
2012-05-12 20:11:35 +00:00
Glen Barber
a3fb6da9ba General mdoc(7) and typo fixes.
PR:		167734
Submitted by:	Nobuyuki Koganemaru (kogane!jp.freebsd.org)
MFC after:	3 days
2012-05-11 20:06:46 +00:00
Michael Tuexen
ce228dab70 Fix a bug in the handling of association reset request.
MFC after: 3 days
2012-05-11 19:15:33 +00:00
Michael Tuexen
c5b5675d10 Only provide the supported features in the SCTP_ASSOC_CHANGE notif
if the state is SCTP_COMM_UP or SCTP_RESTART.
While there, do some cleanups.

MFC after: 3 days
2012-05-11 18:07:36 +00:00
Michael Tuexen
07d249d661 Remove a constant which is only used on non-FreeBSD platform.
(The actual code for the socket option handling has been #ifdefed
out forever...)

MFC after: 3 days.
2012-05-11 17:50:51 +00:00
Michael Tuexen
8c501e516b Address clang warnings.
MFC after: 3 days
2012-05-06 14:50:54 +00:00
Michael Tuexen
e06b67c756 Add support for the sac_info field in struct sctp_assoc_change
as required by RFC 6458.

MFC after: 3 days
2012-05-06 13:09:13 +00:00
Michael Tuexen
3f826ed2f3 Remove debug code.
MFC after: 3 days
2012-05-06 11:37:41 +00:00
Michael Tuexen
9935403ae1 Add support for SCTP_SEND_FAILED_EVENT as required by RFC 6458.
MFC after: 3 days
2012-05-06 11:02:53 +00:00
Michael Tuexen
d426064666 Provide the flags in the SCTP stream reconfig related notification
as specified in RFC 6525.

MFC after: 3 days
2012-05-05 21:41:16 +00:00
Michael Tuexen
f3ebe64c9e Honor SCTP_ENABLE_STREAM_RESET socket option when processing incoming
requests. Fix also the provided result in the response and use names
as specified in RFC 6525.

MFC after: 3 days
2012-05-05 20:07:33 +00:00
Michael Tuexen
f6faece611 Do error checking for the SCTP_RESET_STREAMS, SCTP_RESET_ASSOC,
and SCTP_ADD_STREAMS socket options as specified by RFC 6525.

MFC after: 3 days
2012-05-05 14:06:15 +00:00
Xin LI
74882260fa Add ToS definitions for DiffServ Codepoints as per RFC2474.
Obtained from:	OpenBSD
MFC after:	2 weeks
2012-05-04 21:00:32 +00:00
Michael Tuexen
cdedb5425c Add support for the SCTP_ENABLE_STREAM_RESET socket option to
getsockopt(). This improves the support of RFC 6525.

MFC after: 3 days
2012-05-04 17:18:02 +00:00
Michael Tuexen
88d935dad7 Add support for SCTP_STREAM_CHANGE_EVENT, SCTP_ASSOC_RESET_EVENT as
required by RFC 6525. This also fixes SCTP_STREAM_RESET_EVENT.

MFC after: 3 days
2012-05-04 15:49:08 +00:00
Michael Tuexen
7d44bb8b14 Call panic() only under INVARIANTS.
MFC after: 3 days
2012-05-04 10:06:08 +00:00
Michael Tuexen
cd3fd53188 Use SCTP_PRINTF() instead of printf() in all SCTP sources.
MFC after: 3 days
2012-05-04 09:27:00 +00:00
Michael Tuexen
254945d75a Fix another RFC 6458 issue. Spotted by Irene Ruengeler.
MFC after: 3 days
2012-05-03 10:26:33 +00:00
Alexander V. Chernikov
bdf942c3f0 Revert r234834 per luigi@ request.
Cleaner solution (e.g. adding another header) should be done here.

Original log:
  Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
  Remove ipfw/ip_fw_private.h header from non-ipfw code.

Requested by:      luigi
Approved by:       kib(mentor)
2012-05-03 08:56:43 +00:00
Alexander V. Chernikov
7bd5e9b143 Move several enums and structures required for L2 filtering from ip_fw_private.h to ip_fw.h.
Remove ipfw/ip_fw_private.h header from non-ipfw code.

Approved by:        ae(mentor)
MFC after:          2 weeks
2012-04-30 10:22:23 +00:00
Michael Tuexen
5fc8d746ee Add support for missing gauth_number_of_chunks field. This Bug was
found by Irene Ruengeler.

MFC after: 1 week
2012-04-30 08:44:21 +00:00
Michael Tuexen
ed72abc07d Whitespace changes.
MFC after: 3 days
2012-04-28 16:32:49 +00:00
Michael Tuexen
9a2e24aa4b Remove unused structure.
Reported by Irene Ruengeler.

MFC after: 3 days
2012-04-27 13:58:09 +00:00
Michael Tuexen
78f2804527 Fix a type in an SCTP AUTH related notification. Keep the old name
for backwards compatibility.
Spotted by Irene Ruengeler.

MFC after: 3 days
2012-04-26 11:07:15 +00:00
Michael Tuexen
d77145777a Use the flags defined in RFC 6525 in the stream reset event. 2012-04-23 20:45:31 +00:00
Michael Tuexen
07ee8fa69f Fix check used by stream reset related events.
MFC after: 3 days
2012-04-21 11:53:24 +00:00
Michael Tuexen
90eba9b693 Whitespace changes.
MFC after: 3 days
2012-04-19 15:30:15 +00:00
Michael Tuexen
74b2fab47a Use the same pattern for mbuf logging everywhere.
MFC after: 3 days
2012-04-19 13:11:17 +00:00
Michael Tuexen
953b6058cc Fix reported errno.
MFC after: 3 days
2012-04-19 12:47:18 +00:00
Michael Tuexen
921569e288 Fix a bug where we copy out more data from a mbuf chain that are
actually in it. This happens when SCTP receives an unknown chunk, which
requires the sending of an ERROR chunk, and there is no final padding but
the chunk is not 4-byte aligned.
Reported by yueting via rwatson@

MFC after: 3 days
2012-04-19 12:43:19 +00:00
Gleb Smirnoff
ef341ee1e3 When we receive an ICMP unreach need fragmentation datagram, we take
proposed MTU value from it and update the TCP host cache. Then
tcp_mss_update() is called on the corresponding tcpcb. It finds the
just allocated entry in the TCP host cache and updates MSS on the
tcpcb. And then we do a fast retransmit of what we have in the tcp
send buffer.

This sequence gets broken if the TCP host cache is exausted. In this
case allocation fails, and later called tcp_mss_update() finds nothing
in cache. The fast retransmit is done with not reduced MSS and is
immidiately replied by remote host with new ICMP datagrams and the
cycle repeats. This ping-pong can go up to wirespeed.

To fix this:
- tcp_mss_update() gets new parameter - mtuoffer, that is like
  offer, but needs to have min_protoh subtracted.
- tcp_mtudisc() as notification method renamed to tcp_mtudisc_notify().
- tcp_mtudisc() now accepts not a useless error argument, but proposed
  MTU value, that is passed to tcp_mss_update() as mtuoffer.

Reported by:	az
Reported by:	Andrey Zonov <andrey zonov.org>
Reviewed by:	andre (previous version of patch)
2012-04-16 13:49:03 +00:00
Michael Tuexen
4dca0ef478 Send always HBs when in PF state.
MFC after: 1 week
X-MFC with: r234296
2012-04-14 21:01:44 +00:00
Michael Tuexen
ca7567c923 Bugfix: Don't send HBs on path which are not idle.
MFC after: 1 week
2012-04-14 20:22:01 +00:00
Gleb Smirnoff
a9a2c40ced It is a logical error that in carp_multicast_cleanup()
we look at count of addresses on a particular vhid, we
should account number of addresses on cif.

To achieve this we need to run carp_attach() and
carp_detach() under appropriate cif lock.
2012-04-11 12:26:30 +00:00
Gleb Smirnoff
90b357f6ec M_DONTWAIT is a flag from historical mbuf(9)
allocator, not malloc(9) or uma(9) flag.
2012-04-10 06:52:39 +00:00
Gleb Smirnoff
9ca1fe0db8 CARP should be capable to run on if_bridge(4). Unfortunately,
this commit is not enough to enable CARP operation on
if_bridge(4), because the latter doesn't handle or even
initialize its ifp->if_link_state.

Reported by:	Alexander Lunev <sol289 gmail.com>
2012-04-10 05:42:48 +00:00
Michael Tuexen
17b611fb21 Remove duplicate condition in if statement.
Obtained from: brucec@
MFC after: 3 days
2012-04-06 09:03:02 +00:00
Gleb Smirnoff
2b9536028f Don't check malloc(M_WAITOK) results. 2012-03-31 11:20:48 +00:00
Randall Stewart
c4e848b770 Make stream our stream reset implementation
compliant to RFC6525.

MFC after:	1 month
2012-03-29 13:36:53 +00:00
Marko Zec
2454a7ca98 Permit tcpdrop in VNET jails.
Submitted by:	Miljenko Mikuc
MFC after:	3 days
2012-03-28 12:30:16 +00:00
Michael Tuexen
86e4703fa6 Honor the net.inet.udp.checksum sysctl when using SCTP/UDP/IPv4
encapsulation.
MFCing requires MFCing http://svn.freebsd.org/changeset/base/233554
MFC after: 2 weeks
2012-03-28 08:11:46 +00:00
Bjoern A. Zeeb
40b676bea6 Export the udp_cksum sysctl for upcoming SCTP work. Rather than always,
SCTP will only do IPv4 UDP checksum calculation as defined by the host
policy.  When tunneling SCTP always calculates the inner checksum already
so not doing the outer UDP can save cycles.

While here virtualize the variable.

Requested by:	tuexen
MFC after:	2 weeks
2012-03-27 15:14:29 +00:00
Alexander V. Chernikov
732d27b32d - Permit number of ipfw tables to be changed in runtime.
net.inet.ip.fw.tables_max is now read-write.

- Bump IPFW_TABLES_MAX to 65535
Default number of tables is still 128

- Remove IPFW_TABLES_MAX from ipfw(8) code.

Sponsored by Yandex LLC

Approved by:    kib(mentor)

MFC after:      2 weeks
2012-03-25 20:37:59 +00:00
Michael Tuexen
b5b6e5c237 Small cleanup of the code. No functional change (in FreeBSD kernel).
MFC after: 1 week.
2012-03-22 16:22:16 +00:00
Robert Millan
75410f02d2 Hide a few declarations from userland (including `struct inpcbgroup'). This
removes the dependency on <machine/param.h> which was introduced with SVN
rev 222748 (due to CACHE_LINE_SIZE).

Reviewed by:	bde
MFC after:	10 days
2012-03-17 21:51:39 +00:00
Michael Tuexen
dea47f3999 Clean up, no functional change.
MFC after: 3 days.
2012-03-15 14:22:05 +00:00
Michael Tuexen
99f293a20e Fix bugs which can result in a panic when an non-SCTP socket it
used with an sctp_ system-call which expects an SCTP socket.

MFC after: 3 days.
2012-03-15 14:13:38 +00:00
Alexander V. Chernikov
25a451a451 Fix VNET build broken by r232865.
Temporary remove the ability to assign different number of tables per VNET instance.
2012-03-12 15:41:36 +00:00
Randall Stewart
7081943d6f This fixes PR 165210. Basically we just
add in the netgraph interface to the list of
acceptable interfaces. A todo at the next
IETF code blitz, though is we need to review
why we screen interfaces, there was a reason ;-).

PR:		165210
MFC after:	1 week
2012-03-12 15:05:17 +00:00
Alexander V. Chernikov
f8bee51a69 - Add ipfw eXtended tables permitting radix to be used for any kind of keys.
- Add support for IPv6 and interface extended tables
- Make number of tables to be loader tunable in range 0..65534.
- Use IP_FW3 opcode for all new extended table cmds

No ABI changes are introduced. Old userland will see valid tables for
IPv4 tables and no entries otherwise. Flush works for any table.

IP_FW3 socket option is used to encapsulate all new opcodes:
 /* IP_FW3 header/opcodes */
 typedef struct _ip_fw3_opheader {
        uint16_t opcode;        /* Operation opcode */
        uint16_t reserved[3];   /* Align to 64-bit boundary */
 } ip_fw3_opheader;

New opcodes added:
 IP_FW_TABLE_XADD, IP_FW_TABLE_XDEL, IP_FW_TABLE_XGETSIZE, IP_FW_TABLE_XLIST

ipfw(8) table argument parsing behavior is changed:
 'ipfw table 999 add host' now assumes 'host' to be interface name instead of
 hostname.

New tunable:
 net.inet.ip.fw.tables_max controls number of table supported by ipfw in given
 VNET instance. 128 is still the default value.

New syntax:
ipfw add skipto tablearg ip from any to any via table(42) in
ipfw add skipto tablearg ip from any to any via table(4242) out

This is a bit hackish, special interface name '\1' is used to signal interface
table number is passed in p.glob field.

Sponsored by Yandex LLC

Reviewed by:    ae
Approved by:    ae (mentor)

MFC after:      4 weeks
2012-03-12 14:07:57 +00:00
Michael Tuexen
bd79f2ded1 Fix a warning reported by bz@
MFC after: 3 days.
2012-03-09 15:42:47 +00:00
Michael Tuexen
70a03e8834 Add support for stf interfaces.
MFC after: 3days.
2012-03-09 13:15:40 +00:00
Michael Tuexen
425d06a13d Fix a bug reported by Peter Holm which results in a crash:
Verify in sctp_peeloff() that the socket is a one-to-many
style SCTP socket.

MFC after: 3 days.
2012-03-09 13:12:33 +00:00
Marko Zec
1c7792950c Change SYSINIT priorities so that ip_mroute_modevent() is executed
before vnet_mroute_init(), since vnet_mroute_init() depends on mfchashsize
tunable to be set, and that is done in in ip_mroute_modevent().
Apparently I broke that ordering with r208744 almost 2 years ago...

PR:		kern/162201
Submitted by:	Stevan Markovic (mcafee.com)
MFC after:	3 days
2012-03-04 18:59:38 +00:00
Bjoern A. Zeeb
4ab8fc10d1 Correct typo in the RFC number for the constants based on IANA assignments
for IPv6 Neighbor Discovery Option types for "IPv6 Router Advertisement
Options for DNS Configuration".  It is RFC 6106.

MFC after:	3 days
2012-03-04 18:47:20 +00:00
Oleg Bulyzhin
f35f35c976 - Refresh dynamic tcp rule only if both sides answered keepalive packets.
- Remove some useless assignments.

MFC after:	1 month
2012-02-28 22:00:41 +00:00
Oleg Bulyzhin
8ac1bb0495 lookup_dyn_rule_locked(): style(9) cleanup
MFC after:	1 month
2012-02-28 21:53:39 +00:00
Kip Macy
a93cda789a When using flowtable llentrys can outlive the interface with which they're associated
at which the lle_tbl pointer points to freed memory and the llt_free pointer is no longer
valid.

Move the free pointer in to the llentry itself and update the initalization sites.

MFC after:	2 weeks
2012-02-23 18:21:37 +00:00
Andrey V. Elsukov
313157dead Don't use `m' after m_megapullup.
PR:		kern/165373
MFC after:	3 days
2012-02-22 04:19:33 +00:00
Michael Tuexen
310a027788 Remove two clang warnings.
MFC after: 1 month.
2012-02-18 16:06:15 +00:00
Bjoern A. Zeeb
9dba179d5e IFC @231845
Sponsored by:	Cisco Systems, Inc.
2012-02-17 00:27:48 +00:00
Bjoern A. Zeeb
d8951c8a2f Fix PAWS (Protect Against Wrapped Sequence numbers) in cases when
hz >> 1000 and thus getting outside the timestamp clock frequenceny of
1ms < x < 1s per tick as mandated by RFC1323, leading to connection
resets on idle connections.

Always use a granularity of 1ms using getmicrouptime() making all but
relevant callouts independent of hz.

Use getmicrouptime(), not getmicrotime() as the latter may make a jump
possibly breaking TCP nfsroot mounts having our timestamps move forward
for more than 24.8 days in a second without having been idle for that
long.

PR:		kern/61404
Reviewed by:	jhb, mav, rrs
Discussed with:	silby, lstewart
Sponsored by:	Sandvine Incorporated (originally in 2011)
MFC after:	6 weeks
2012-02-15 16:09:56 +00:00
Michael Tuexen
9b1246ca38 Fix a bug where the wrong protocol overhead was used. This can lead
to a deadlock of an association when an IPv6 socket was used to
communcate with IPv4 and an ICMPv4 fragmentation needed message
was received.
While there, simplify the code a bit.

MFC after: 3 days.
2012-02-14 12:00:34 +00:00
Gleb Smirnoff
afdbac98df Set vnet context in callouts and taskqueues.
PR:		164696
2012-02-08 13:39:38 +00:00
Gleb Smirnoff
23ccd3d976 Make the 'tcpwin' option of ipfw(8) accept ranges and lists.
Submitted by:	sem
2012-02-06 11:35:29 +00:00
Michael Tuexen
f1cf024599 Fix a typo which was already fixed by eadler in r227489. We missed
to integrate this fix in our code base, so it was removed in r227755.

MFC after: 3 days.
2012-02-06 10:47:12 +00:00
Gleb Smirnoff
9077f38738 Add new socket options: TCP_KEEPINIT, TCP_KEEPIDLE, TCP_KEEPINTVL and
TCP_KEEPCNT, that allow to control initial timeout, idle time, idle
re-send interval and idle send count on a per-socket basis.

Reviewed by:	andre, bz, lstewart
2012-02-05 16:53:02 +00:00
Bjoern A. Zeeb
9836132cd1 Document the fact that multi-FIB support for SCTP had been backed out
in r179783 as (ab)using the concept of VRFs for this had not worked.
At this point SCTP in FreeBSD does not support multi-FIB, neither for
IPv4 nor for IPv6.

Discussed with:	rrs
Sponsored by:	Cisco Systems, Inc.
2012-02-03 15:39:13 +00:00
Bjoern A. Zeeb
cebf3b1b6a Make ipfw verify_path6() multi-FIB aware.
The "fib" and "setfib" keywords implementations need no adjustments.

Sponsored by:	Cisco Systems, Inc.
2012-02-03 13:12:42 +00:00
Bjoern A. Zeeb
81d5d46b3c Add multi-FIB IPv6 support to the core network stack supplementing
the original IPv4 implementation from r178888:

- Use RT_DEFAULT_FIB in the IPv4 implementation where noticed.
- Use rt*fib() KPI with explicit RT_DEFAULT_FIB where applicable in
  the NFS code.
- Use the new in6_rt* KPI in TCP, gif(4), and the IPv6 network stack
  where applicable.
- Split in6_rtqtimo() and in6_mtutimo() as done in IPv4 and equally
  prevent multiple initializations of callouts in in6_inithead().
- Use wrapper functions where needed to preserve the current KPI to
  ease MFCs.  Use BURN_BRIDGES to indicate expected future cleanup.
- Fix (related) comments (both technical or style).
- Convert to rtinit() where applicable and only use custom loops where
  currently not possible otherwise.
- Multicast group, most neighbor discovery address actions and faith(4)
  are locked to the default FIB.  Individual IPv6 addresses will only
  appear in the default FIB, however redirect information and prefixes
  of connected subnets are automatically propagated to all FIBs by
  default (mimicking IPv4 behavior as closely as possible).

Sponsored by:	Cisco Systems, Inc.
2012-02-03 13:08:44 +00:00
Gleb Smirnoff
2512b096a2 o Provide functions carp_ifa_addroute()/carp_ifa_delroute()
to cleanup routes from a single ifa.
o Implement carp_addroute()/carp_delroute() via above functions.
o Call carp_ifa_delroute() in the carp_detach() to avoid
  junk routes left in routing table, in case if user
  removes an address in a MASTER state. [1]

Reported by:	az [1]
2012-02-01 11:07:41 +00:00
Luigi Rizzo
7dec311e83 a variable was erroneously declared as 32 bit instead of 64.
MFC after:	3 days
2012-01-27 13:26:25 +00:00
Gleb Smirnoff
3660e7b0da Remove unused variable. 2012-01-24 14:27:14 +00:00
Bjoern A. Zeeb
08adfbbf62 Make #error messages string-literals and remove punctuation.
Reported by:	bde (for ip_divert)
Reviewed by:	bde
MFC after:	3 days
2012-01-22 10:41:58 +00:00
Bjoern A. Zeeb
aa57e971bc Fix ip_divert handling of inet and inet6 and module building some more.
Properly sort the "carp" case in modules/Makefile after it was renamed.

Reported by:	bde (most)
Reviewed by:	bde
MFC after:	3 days
2012-01-22 02:16:31 +00:00
Bjoern A. Zeeb
83e521ec73 Clean up some #endif comments removing from short sections. Add #endif
comments to longer, also refining strange ones.

Properly use #ifdef rather than #if defined() where possible.  Four
#if defined(PCBGROUP) occurances (netinet and netinet6) were ignored to
avoid conflicts with eventually upcoming changes for RSS.

Reported by:	bde (most)
Reviewed by:	bde
MFC after:	3 days
2012-01-22 02:13:19 +00:00
Bjoern A. Zeeb
44e489690c Remove a superfluous INET6 check (no opt_inet6.h included anyway).
MFC after:	3 days
2012-01-20 17:18:54 +00:00
Michael Tuexen
c75fdd302b Fix a problem when using the CBAPI.
While there, remove an old comment which does not apply anymore.
2012-01-20 13:26:11 +00:00
Gleb Smirnoff
56cf9dc1f6 Drop support for SIOCSIFADDR, SIOCSIFNETMASK, SIOCSIFBRDADDR, SIOCSIFDSTADDR
ioctl commands.

PR:		163524
Reviewed by:	net
2012-01-16 09:53:24 +00:00
Michael Tuexen
86eef6072b Two cleanups. No functional change. 2012-01-15 13:35:55 +00:00
Michael Tuexen
8de4bcc1f7 Fix two bugs, which result in a panic when calling getsockopt()
using SCTP_RECVINFO or SCTP_NXTINFO.
Reported by Clement Lecigne and forwarded to us by zi@.

MFC after: 3 days.
2012-01-14 09:10:20 +00:00
Gleb Smirnoff
5c39f7bdeb Bunch of fixes to pfsync(4) module load/unload:
o Make the pfsync.ko actually usable. Before this change loading it
  didn't register protosw, so was a nop. However, a module /boot/kernel
  did confused users.
o Rewrite the way we are joining multicast group:
  - Move multicast initialization/destruction to separate functions.
  - Don't allocate memory if we aren't going to join a multicast group.
  - Use modern API for joining/leaving multicast group.
  - Now the utterly wrong pfsync_ifdetach() isn't needed.
o Move module initialization from SYSINIT(9) to moduledata_t method.
o Refuse to unload module, unless asked forcibly.
o Improve a bit some FreeBSD porting code:
  - Use separate malloc type.
  - Simplify swi sheduling.

This change is probably wrong from VIMAGE viewpoint, however pfsync
wasn't VIMAGE-correct before this change, too.

Glanced at by:	bz
2012-01-09 08:50:22 +00:00
Gleb Smirnoff
aa24ae3c5e Make it possible to use alternative source hardware address
in the ARP datagram generated by arprequest(). If caller doesn't
supply the address, then it is either picked from CARP or hardware
address of the interface is taken.

While here, make several minor fixes:

- Hold IF_ADDR_RLOCK(ifp) while traversing address list.
- Remove not true comment.
- Access internet address and mask via in_ifaddr fields,
  rather than ifaddr.
2012-01-08 17:25:15 +00:00
Gleb Smirnoff
d294536082 Provide IA_MASKSIN() macro similar to IA_SIN() and IA_DSTSIN(). 2012-01-08 17:20:29 +00:00
Gleb Smirnoff
94901d5e60 Move arprequest() declaration to if_ether.h. 2012-01-08 13:34:00 +00:00
Michael Tuexen
c58e60be43 Add an SCTP sysctl "blackhole", similar to the one for TCP.
If set to 1, no ABORT is sent back in response to an incoming
INIT. If set to 2, no ABORT is sent back in response to
an out of the blue packet. If set to 0 (the default), ABORTs
are sent.
Discussed with rrs@.

MFC after: 1 month.
2012-01-08 09:56:24 +00:00
Michael Tuexen
ab6174d587 Retire the SCTP sysctl "strict_init". We always perform the validation
and there is no reason to make is configuarable.
Discussed with rrs@.
2012-01-07 14:04:00 +00:00
Michael Tuexen
d68fdc4df6 Improve the handling of received INITs. Send an ABORT when
not accepting the connection. Also fix a crash, which
could happen when the user closed the socket.

MFC after: 1 month.
2012-01-07 13:03:33 +00:00
Eitan Adler
316efdb35f - Fix sysctl description
PR:		163623
Submitted by:	Eugene Grosbein <eugen@eg.sd.rdtc.ru>
Approved by:	bz
2012-01-07 00:11:36 +00:00
Michael Tuexen
aa1808b71f Use NULL instead of 0.
MFC after: 1 month.
2012-01-06 20:20:59 +00:00
Navdeep Parhar
db3cee5170 Always release the inp lock before returning from tcp_detach.
MFC after:	5 days
2012-01-06 18:29:40 +00:00
John Baldwin
0ff93ca524 Tweak the last fix to match what was actually tested.
Pointy hat to:	jhb
2012-01-06 12:49:01 +00:00
Sergey Kandaurov
4f7fca1ca1 Fix a typo.
X-MFC-with:	229665
2012-01-06 00:23:17 +00:00
John Baldwin
1e96ae8193 Remove the assertion from tcp_input() that rcv_nxt is always greater
than or equal to rcv_adv and fix tcp_twstart() to handle this case by
assuming the last window was zero rather than a negative value.

The code in tcp_input() already safely handled this case.  It can happen
due to delayed ACKs along with a remote sender that sends data beyond
the window we previously advertised.  If we have room in our socket buffer
for the extra data beyond the advertised window, we will accept it.
However, if the ACK for that segment is delayed, then we will not
effectively fixup rcv_adv to account for that extra data until the
next segment arrives and forces out an ACK.  When that next segment
arrives, rcv_nxt will be beyond rcv_adv.

Tested by:	pjd
MFC after:	1 week
2012-01-05 22:29:11 +00:00
John Baldwin
137f91e80f Convert all users of IF_ADDR_LOCK to use new locking macros that specify
either a read lock or write lock.

Reviewed by:	bz
MFC after:	2 weeks
2012-01-05 19:00:36 +00:00
John Baldwin
0f188ebb1d Use a helper variable to wrap a long line. 2012-01-04 13:29:26 +00:00
John Baldwin
56d6e1292d In the handling of the SIOC[DG]LIFADDR icotls in in_lifaddr_ioctl(), add
missing interface address list locking and grab a reference on the
matching interface address after dropping the lock while it is used to
avoid a potential use after free.

Reviewed by:	bz
MFC after:	1 week
2012-01-04 13:26:56 +00:00
John Baldwin
0823c29b81 Fix the SIOC[DG]LIFADDR ioctls in in_lifaddr_ioctl() to work with IPv4
interface address rather than IPv6.

Submitted by:	hrs
Reviewed by:	bz
MFC after:	1 week
2012-01-04 13:23:51 +00:00
John Baldwin
c6f4ea8062 When cancelling multicast timers on an interface, don't release the
reference on a group in the leaving state while iterating over the loop.
Instead, use the same approach used in igmp_ifdetach() and mld_ifdetach()
of placing the groups to free on pending release list and then releasing
the references after dropping the IF_ADDR_LOCK.  This closes an ugly race
where the code was dropping the lock in the middle of iterating over the
list.  It also fixes some additional potential use-after-free bugs since
the cancellation routine also applied other changes to the group after
dropping the reference.  Now those changes are performed before the
reference is dropped and the group is potentially freed.

Prodded to fix by:	glebius
Reviewed by:	bz
MFC after:	1 week
2012-01-03 20:34:52 +00:00
John Baldwin
f5b50e25ec Use TAILQ_FOREACH() instead of TAILQ_FOREACH_SAFE() for some loops that
do not modify the queues they iterate over.

Submitted by:	glebius
2012-01-03 16:22:29 +00:00
Bjoern A. Zeeb
762ad1d614 As I came by and noticed add a comment that inp locking is a bit optistic
(read: non-existent) here and should be fixed.
2012-01-02 09:18:58 +00:00
John Baldwin
c23de1f45b Defer the work of freeing IPv4 multicast options from a socket to an
asychronous task.  This avoids tearing down multicast state including
sending IGMP leave messages and reprogramming MAC filters while holding
the per-protocol global pcbinfo lock that is used in the receive path of
packet processing.

Reviewed by:	rwatson
MFC after:	1 month
2011-12-29 20:41:16 +00:00
John Baldwin
3b0b2840be Use queue(3) macros instead of home-rolled versions in several places in
the INET6 code.  This includes retiring the 'ndpr_next' and 'pfr_next'
macros.

Submitted by:	pluknet (earlier version)
Reviewed by:	pluknet
2011-12-29 18:25:18 +00:00
Gleb Smirnoff
9de96e891c Don't fallback to a CARP address in BACKUP state. 2011-12-29 15:59:14 +00:00
Michael Tuexen
60990c0c06 Address issues found by clang. While there, fix also some style
issues.

MFC after: 3 months.
2011-12-27 10:16:24 +00:00
Gleb Smirnoff
1c435c73a1 Use a better log message for master down event. 2011-12-22 18:48:21 +00:00
Gleb Smirnoff
7121247312 Provide ABI compatibility shim to enable configuring of addresses
with ifconfig(8) prior to r228571.

Requested by:	brooks
2011-12-21 12:39:08 +00:00
Gleb Smirnoff
f08535f872 Restore a feature that was present in 5.x and 6.x, and was cleared in
7.x, 8.x and 9.x with pf(4) imports: pfsync(4) should suppress CARP
preemption, while it is running its bulk update.

However, reimplement the feature in more elegant manner, that is
partially inspired by newer OpenBSD:

- Rename term "suppression" to "demotion", to match with OpenBSD.
- Keep a global demotion factor, that can be raised by several
  conditions, for now these are:
  - interface goes down
  - carp(4) has problems with ip_output() or ip6_output()
  - pfsync performs bulk update
- Unlike in OpenBSD the demotion factor isn't a counter, but
  is actual value added to advskew. The adjustment values for
  particular error conditions are also configurable, and their
  defaults are maximum advskew value, so a single failure bumps
  demotion to maximum. This is for POLA compatibility, and should
  satisfy most users.
- Demotion factor is a writable sysctl, so user can do
  foot shooting, if he desires to.
2011-12-20 13:53:31 +00:00
Michael Tuexen
7215cc1b74 Fix unused parameter warnings.
While there, fix some whitespace issues.

MFC after: 3 months.
2011-12-17 19:21:40 +00:00
Gleb Smirnoff
92ed4e1a24 Since size of struct in_aliasreq has just been changed in r228571,
and thus ifconfig(8) needs recompile, it is a good chance to make
parameter checks on SIOCAIFADDR arguments more strict.
2011-12-16 13:30:17 +00:00
Gleb Smirnoff
08b68b0e4c A major overhaul of the CARP implementation. The ip_carp.c was started
from scratch, copying needed functionality from the old implemenation
on demand, with a thorough review of all code. The main change is that
interface layer has been removed from the CARP. Now redundant addresses
are configured exactly on the interfaces, they run on.

The CARP configuration itself is, as before, configured and read via
SIOCSVH/SIOCGVH ioctls. A new prefix created with SIOCAIFADDR or
SIOCAIFADDR_IN6 may now be configured to a particular virtual host id,
which makes the prefix redundant.

ifconfig(8) semantics has been changed too: now one doesn't need
to clone carpXX interface, he/she should directly configure a vhid
on a Ethernet interface.

To supply vhid data from the kernel to an application the getifaddrs(8)
function had been changed to pass ifam_data with each address. [1]

The new implementation definitely closes all PRs related to carp(4)
being an interface, and may close several others. It also allows
to run a single redundant IP per interface.

Big thanks to Bjoern Zeeb for his help with inet6 part of patch, for
idea on using ifam_data and for several rounds of reviewing!

PR:		kern/117000, kern/126945, kern/126714, kern/120130, kern/117448
Reviewed by:	bz
Submitted by:	bz [1]
2011-12-16 12:16:56 +00:00
Gleb Smirnoff
55174c34ef Belatedly catch up with r151555. in_scrubprefix() also needs this fix. We
should compare not only addresses, but their masks, too, when searching
for matching prefix.
2011-12-13 06:56:43 +00:00
Michael Tuexen
972478a4c0 Fix a bug reported by Irene Ruengeler which resulted in not sending
out HEARTBEATs when requested by the user. The HEARTBEATs were only
queued, but not actually sent out.

MFC after: 2 months.
2011-12-10 10:52:54 +00:00
Gleb Smirnoff
f769e5b0fa Fix a very special case when SIOCAIFADDR supplies mask of 0.0.0.0,
don't overwrite the mask with autoguessing based on classes.
2011-12-06 20:55:20 +00:00
Michael Tuexen
a56569ba55 Remove debug code.
MFC after: 1 month.
2011-11-28 20:48:35 +00:00
Gleb Smirnoff
89b9325530 Fix one more fallout from r227791: do not overwrite trimmed sa_len
on the ia_sockmask when doing SIOCSIFNETMASK.

Reported by:	Stefan Bethke <stb lassitu.de>, gonzo
Pointy hat to:	glebius
2011-11-28 13:30:14 +00:00
Michael Tuexen
70acddf158 Fix a warning reported by arundel@.
Fix a bug where the parameter length of a supported address types
parameter is set to a wrong value if the kernel is built with
with either INET or INET6, but not both.

MFC after: 3 days.
2011-11-27 17:51:13 +00:00
Lawrence Stewart
a26fef3a21 Plug a TCP reassembly UMA zone leak introduced in r226113 by only using the
backup stack queue entry when the zone is exhausted, otherwise we leak a zone
allocation each time we plug a hole in the reassembly queue.

Reported by:	many on freebsd-stable@ (thread: "TCP Reassembly Issues")
Tested by:	many on freebsd-stable@ (thread: "TCP Reassembly Issues")
Reviewed by:	bz (very brief sanity check)
MFC after:	3 days
2011-11-27 02:32:08 +00:00
Gleb Smirnoff
c6e5c71116 Remove superfluous check: SIOCAIFADDR must have ifra_addr supplied. 2011-11-24 22:46:11 +00:00
Gleb Smirnoff
bd47ae58a6 Fix stupid typo in r227830.
PR:		162806
Pointy hat to:	glebius
2011-11-24 22:43:48 +00:00
Michael Tuexen
052230f978 Move up the address to the top of the sctp_udencaps structure
like in all other structures. This avoids alignment problems.

MFC after: 3 months.
2011-11-24 10:58:48 +00:00
Michael Tuexen
ec9925ed78 Move up the address to the top of the sctp_paddrthlds structure
like in all other structures. This avoids alignment problems.

MFC after: 3 days.
2011-11-24 10:54:30 +00:00