Commit Graph

137 Commits

Author SHA1 Message Date
Alexander V. Chernikov
670e8b3b8c Kill custom in_matroute() radix mathing function removing one rte mutex lock.
Initially in_matrote() in_clsroute() in their current state was introduced by
r4105 20 years ago. Instead of deleting inactive routes immediately, we kept them
in route table, setting RTPRF_OURS flag and some expire time. After that, either
GC came or RTPRF_OURS got removed on first-packet. It was a good solution
in that days (and probably another decade after that) to keep TCP metrics.
However, after moving metrics to TCP hostcache in r122922, most of in_rmx
functionality became unused. It might had been used for flushing icmp-originated
routes before rte mutexes/refcounting, but I'm not sure about that.

So it looks like this is nearly impossible to make GC do its work nowadays:

in_rtkill() ignores non-RTPRF_OURS routes.
route can only become RTPRF_OURS after dropping last reference via rtfree()
which calls in_clsroute(), which, it turn, ignores UP and non-RTF_DYNAMIC routes.

Dynamic routes can still be installed via received redirect, but they
have default lifetime (no specific rt_expire) and no one has another trie walker
to call RTFREE() on them.

So, the changelist:
* remove custom rnh_match / rnh_close matching function.
* remove all GC functions
* partially revert r256695 (proto3 is no more used inside kernel,
  it is not possible to use rt_expire from user point of view, proto3 support
  is not complete)
* Finish r241884 (similar to this commit) and remove remaining IPv6 parts

MFC after:	1 month
2014-11-11 02:52:40 +00:00
Alexander V. Chernikov
603eaf792b Renove faith(4) and faithd(8) from base. It looks like industry
have chosen different (and more traditional) stateless/statuful
NAT64 as translation mechanism. Last non-trivial commits to both
faith(4) and faithd(8) happened more than 12 years ago, so I assume
it is time to drop RFC3142 in FreeBSD.

No objections from:	net@
2014-11-09 21:33:01 +00:00
Adrian Chadd
9d3ddf4384 Add support for receiving and setting flowtype, flowid and RSS bucket
information as part of recvmsg().

This is primarily used for debugging/verification of the various
processing paths in the IP, PCB and driver layers.

Unfortunately the current implementation of the control message path
results in a ~10% or so drop in UDP frame throughput when it's used.

Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 01:45:39 +00:00
Adrian Chadd
8496de3825 Oops - somehow I missed the IP option numbers clashing with the multicast
numbers below.

Move them to a new set of non-clashing numbers.
2014-07-17 05:45:54 +00:00
Adrian Chadd
0a100a6f1e Implement the first stage of multi-bind listen sockets and RSS socket
awareness.

* Introduce IP_BINDMULTI - indicating that it's okay to bind multiple
  sockets on the same bind details.

  Although the PCB code has been taught about this (see below) this patch
  doesn't introduce the rest of the PCB changes necessary to distribute
  lookups among multiple PCB entries in the global wildcard table.

* Introduce IP_RSS_LISTEN_BUCKET - placing an listen socket into the
  given RSS bucket (and thus a single PCBGROUP hash.)

* Modify the PCB add path to be aware of IP_BINDMULTI:
  + Only allow further PCB entries to be added if the owner credentials
    and IP_BINDMULTI has been specified.  Ie, only allow further
    IP_BINDMULTI sockets to appear if the first bind() was IP_BINDMULTI.

* Teach the PCBGROUP code about IP_RSS_LISTE_BUCKET marked PCB entries.
  Instead of using the wildcard logic and hashing, these sockets are
  simply placed into the PCBGROUP and _not_ in the wildcard hash.

* When doing a PCBGROUP lookup, also do a wildcard match as well.
  This allows for an RSS bucket PCB entry to appear in a PCBGROUP
  rather than having to exist in the wildcard list.

Tested:

* TCP IPv4 server testing with igb(4)
* TCP IPv4 server testing with ix(4)

TODO:

* The pcbgroup lookup code duplicated the wildcard and wildcard-PCB
  logic.  This could be refactored into a single function.

* This doesn't yet work for IPv6 (The PCBGROUP code in netinet6/ doesn't
  yet know about this); nor does it yet fully work for UDP.
2014-07-10 03:10:56 +00:00
Adrian Chadd
7847796a93 Retire IP_RSSCPUID ; the right thing to do is query the RSS bucket;
map the bucket to an RSS queue, then map the queue to a CPU ID.
This way the bucket->queue and queue->CPU mapping can change
over time.

Introduce IP_RSSBUCKETID - which instead looks up the RSS bucket.
User applications can then map the RSS bucket to a CPU.
2014-06-26 04:12:41 +00:00
Jilles Tjoelker
ced01b33f4 netinet/in.h: Expose htonl(), htons(), ntohl() and ntohs() in strict POSIX
mode.

Put the htonl(), htons(), ntohl() and ntohs() declarations under
__POSIX_VISIBLE >= 200112. POSIX.1-2001 and newer require these to be
exposed from <netinet/in.h> (as well as <arpa/inet.h>).

Note that it may be unnecessary to check __POSIX_VISIBLE >= 200112 because
older versions of POSIX and the C standard do not define this header.
However, other places in the same file already perform the check.

PR:		188316
Submitted by:	Christian Neukirchen
2014-05-29 15:23:37 +00:00
Adrian Chadd
d804a08f3e Reserve IP_FLOWID, IP_FLOWTYPE, IP_RSSCPUID socket option IDs for
near-term future use.

These are intended to fetch the current flow id, flow hash type
(M_HASHTYPE_* from the sys/mbuf.h) and if RSS is enabled, the
RSS destined CPU ID for the receive path.
2014-05-17 00:09:12 +00:00
Kevin Lo
e06e816f67 Add support for UDP-Lite protocol (RFC 3828) to IPv4 and IPv6 stacks.
Tested with vlc and a test suite [1].

[1] http://www.erg.abdn.ac.uk/~gerrit/udp-lite/files/udplite_linux.tar.gz

Reviewed by:	jhb, glebius, adrian
2014-04-07 01:53:03 +00:00
John Baldwin
5b26ea5df3 Remove more constants related to static sysctl nodes. The MAXID constants
were primarily used to size the sysctl name list macros that were removed
in r254295.  A few other constants either did not have an associated
sysctl node, or the associated node used OID_AUTO instead.

PR:		ports/184525 (exp-run)
2014-02-25 18:44:33 +00:00
Bjoern A. Zeeb
415167d52b Add more (IPv6) related Internet Protocols:
- Host Identity Protocol (RFC5201)
- Shim6 Protocol (RFC5533)
- 2x experimentation and testing (RFC3692, RFC4727)

This does not indicate interest to implement/support these protocols,
but they are part of the "IPv6 Extension Header Types" [1] based on RFC7045
and might thus be needed by filtering and next header parsing
implementations.

References:	[1] http://www.iana.org/assignments/ipv6-parameters
Obtained from:	http://www.iana.org/assignments/protocol-numbers
MFC after:	1 week
2013-12-25 20:26:49 +00:00
John Baldwin
3380883230 Finish r254925 and remove the last remaining sysctl name list macro. The
one port that used it has been fixed to use the more portable
getprotoent(3) instead.
2013-10-23 13:22:50 +00:00
Kevin Lo
9768475eb0 - Add parentheses to all internet addresses
- All the casts to uint32_t should be to in_addr_t

Suggested by:	bde
Reviewed by:	bde
2013-10-19 18:13:32 +00:00
Kevin Lo
b298a86678 Treat INADDR_NONE as uint32_t.
Reviewed by:	glebius
2013-10-15 07:35:39 +00:00
John Baldwin
fd77bbb967 Remove most of the remaining sysctl name list macros. They were only
ever intended for use in sysctl(8) and it has not used them for many
years.

Reviewed by:	bde
Tested by:	exp-run by bdrewery
2013-08-26 18:16:05 +00:00
Colin Percival
76089c9511 Move IPPROTO_IPV6 from #ifdef __BSD_VISIBLE to #if __POSIX_VISIBLE >= 201112
since POSIX 2001 states that it shall be defined.

Reported by:	sbruno
Reviewed by:	jilles
MFC after:	1 week
2013-04-27 23:36:01 +00:00
Gleb Smirnoff
8f134647ca Switch the entire IPv4 stack to keep the IP packet header
in network byte order. Any host byte order processing is
done in local variables and host byte order values are
never[1] written to a packet.

  After this change a packet processed by the stack isn't
modified at all[2] except for TTL.

  After this change a network stack hacker doesn't need to
scratch his head trying to figure out what is the byte order
at the given place in the stack.

[1] One exception still remains. The raw sockets convert host
byte order before pass a packet to an application. Probably
this would remain for ages for compatibility.

[2] The ip_input() still subtructs header len from ip->ip_len,
but this is planned to be fixed soon.

Reviewed by:	luigi, Maxim Dounin <mdounin mdounin.ru>
Tested by:	ray, Olivier Cochard-Labbe <olivier cochard.me>
2012-10-22 21:09:03 +00:00
Michael Tuexen
3cca425b29 Add a IP_RECVTOS socket option to receive for received UDP/IPv4
packets a cmsg of type IP_RECVTOS which contains the TOS byte.
Much like IP_RECVTTL does for TTL. This allows to implement a
protocol on top of UDP and implementing ECN.

MFC after: 3 days
2012-06-12 14:02:38 +00:00
Ed Maste
0615959a81 Add IPPROTO_MPLS (rfc4023) IP protocol definition
There are currently no in-tree consumers; I'm adding it now for use by
vendor code.  This matches the change OpenBSD made while implementing
MPLS in gif(4).
2012-05-27 16:16:28 +00:00
Gleb Smirnoff
53883e0c24 Add support for IPv4 /31 prefixes, as described in RFC3021.
To run a /31 network, participating hosts MUST drop support
for directed broadcasts, and treat the first and last addresses
on subnet as unicast. The broadcast address for the prefix
should be the link local broadcast address, INADDR_BROADCAST.
2011-10-15 18:41:25 +00:00
Attilio Rao
109c1de8ba Make the RPC specific __rpc_inet_ntop() and __rpc_inet_pton() general
in the kernel (just as inet_ntoa() and inet_aton()) are and sync their
prototype accordingly with already mentioned functions.

Sponsored by:	Sandvine Incorporated
Reviewed by:	emaste, rstone
Approved by:	dfr
MFC after:	2 weeks
2010-09-24 15:01:45 +00:00
Ana Kukec
1db8d1f843 MFp4: anchie_soc2009 branch:
Add kernel side support for Secure Neighbor Discovery (SeND), RFC 3971.

The implementation consists of a kernel module that gets packets from
the nd6 code, sends them to user space on a dedicated socket and reinjects
them back for further processing.

Hooks are used from nd6 code paths to divert relevant packets to the
send implementation for processing in user space.  The hooks are only
triggered if the send module is loaded. In case no user space
application is connected to the send socket, processing continues
normaly as if the module would not be loaded. Unloading the module
is not possible at this time due to missing nd6 locking.

The native SeND socket is similar to a raw IPv6 socket but with its own,
internal pseudo-protocol.

Approved by:	bz (mentor)
2010-08-19 11:31:03 +00:00
Xin LI
b80d1bf60e Add definition of IPv6 mobility header's protocol number, as assigned by
IANA and defined in RFC 3775.

Obtained from:	KAME
2010-03-31 23:02:25 +00:00
Luigi Rizzo
830c6e2b97 bring in several cleanups tested in ipfw3-head branch, namely:
r201011
- move most of ng_ipfw.h into ip_fw_private.h, as this code is
  ipfw-specific. This removes a dependency on ng_ipfw.h from some files.

- move many equivalent definitions of direction (IN, OUT) for
  reinjected packets into ip_fw_private.h

- document the structure of the packet tags used for dummynet
  and netgraph;

r201049
- merge some common code to attach/detach hooks into
  a single function.

r201055
- remove some duplicated code in ip_fw_pfil. The input
  and output processing uses almost exactly the same code so
  there is no need to use two separate hooks.
  ip_fw_pfil.o goes from 2096 to 1382 bytes of .text

r201057 (see the svn log for full details)
- macros to make the conversion of ip_len and ip_off
  between host and network format more explicit

r201113 (the remaining parts)
- readability fixes -- put braces around some large for() blocks,
  localize variables so the compiler does not think they are uninitialized,
  do not insist on precise allocation size if we have more than we need.

r201119
- when doing a lookup, keys must be in big endian format because
  this is what the radix code expects (this fixes a bug in the
  recently-introduced 'lookup' option)

No ABI changes in this commit.

MFC after:	1 week
2009-12-28 10:47:04 +00:00
Luigi Rizzo
de9fc6bcd4 Add new sockopt names for ipfw and dummynet.
This commit is just grabbing entries for the new names
that will be used in the future, so you don't need to
rebuild anything now.

MFC after:	3 days
2009-12-02 10:36:41 +00:00
Attilio Rao
758801232c Move inet_aton() (specular to inet_ntoa(), already present in libkern)
into libkern in order to made it usable by other modules than alias_proxy.

Obtained from:	Sandvine Incorporated
Sponsored by:	Sandvine Incorporated
MFC:		1 week
2009-11-12 00:46:28 +00:00
Poul-Henning Kamp
2ac047d1fe Move the duplicate definition of struct sockaddr_storage to its own
include file, and include this where the previous duplicate definitions were.

Static program checkers like FlexeLint rightfully take a dim view of
duplicate definitions, even if they currently are identical.
2009-09-08 10:39:38 +00:00
Pawel Jakub Dawidek
f44270e764 - Rename IP_NONLOCALOK IP socket option to IP_BINDANY, to be more consistent
with OpenBSD (and BSD/OS originally). We can't easly do it SOL_SOCKET option
  as there is no more space for more SOL_SOCKET options, but this option also
  fits better as an IP socket option, it seems.
- Implement this functionality also for IPv6 and RAW IP sockets.
- Always compile it in (don't use additional kernel options).
- Remove sysctl to turn this functionality on and off.
- Introduce new privilege - PRIV_NETINET_BINDANY, which allows to use this
  functionality (currently only unjail root can use it).

Discussed with:	julian, adrian, jhb, rwatson, kmacy
2009-06-01 10:30:00 +00:00
David Schultz
b3c11b5b91 Namespace: Defining htonl() and friends here instead of arpa/inet.h is
a BSD extension.
2009-03-14 20:16:54 +00:00
Bruce M Simpson
d10910e6ce Merge IGMPv3 and Source-Specific Multicast (SSM) to the FreeBSD
IPv4 stack.

Diffs are minimized against p4.
PCS has been used for some protocol verification, more widespread
testing of recorded sources in Group-and-Source queries is needed.
sizeof(struct igmpstat) has changed.

__FreeBSD_version is bumped to 800070.
2009-03-09 17:53:05 +00:00
Bruce M Simpson
b554b6ca91 Add various defines/macros required by IGMPv3:
* MCAST_UNDEFINED state.
 * in_allhosts() macro (group is 224.0.0.1).
   This uses a const endian comparison.
 * IP_MAX_GROUP_SRC_FILTER, IP_MAX_SOCK_SRC_FILTER
   default resource limits.
2009-03-04 03:01:05 +00:00
Adrian Chadd
4f2e6bfdd8 Better comment what the socket option does. Thanks to Sam Leffler
for suggesting this.
2009-01-09 17:18:17 +00:00
Adrian Chadd
be9347e3fe Implement a new IP option (not compiled/enabled by default) to allow
applications to specify a non-local IP address when bind()'ing a socket
to a local endpoint.

This allows applications to spoof the client IP address of connections
if (obviously!) they somehow are able to receive the traffic normally
destined to said clients.

This patch doesn't include any changes to ipfw or the bridging code to
redirect the client traffic through the PCB checks so TCP gets a shot
at it. The normal behaviour is that packets with a non-local destination
IP address are not handled locally. This can be dealth with some IPFW hackery;
modifications to IPFW to make this less hacky will occur in subsequent
commmits.

Thanks to Julian Elischer and others at Ironport. This work was approved
and donated before Cisco acquired them.

Obtained from:	Julian Elischer and others
MFC after:	2 weeks
2009-01-09 16:02:19 +00:00
Bjoern A. Zeeb
4b79449e2f Rather than using hidden includes (with cicular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by:	brooks, gnn, des, zec, imp
Sponsored by:	The FreeBSD Foundation
2008-12-02 21:37:28 +00:00
Marko Zec
8b615593fc Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparision between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by:	julian, bz, brooks, zec
Reviewed by:	julian, bz, brooks, kris, rwatson, ...
Approved by:	julian (mentor)
Obtained from:	//depot/projects/vimage-commit2/...
X-MFC after:	never
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
2008-10-02 15:37:58 +00:00
George V. Neville-Neil
6b9ff6b7a7 Clean up the code that checks the types of address so that it is
done by understandable macros.

Fix the bug that prevented the system from responding on interfaces with
link local addresses assigned.

PR: 120958
Submitted by: James Snow <snow at teardrop.org>
MFC after: 2 weeks
2008-04-17 12:50:42 +00:00
Rui Paulo
1cf6e4f5ff Change the default port range for outgoing connections by introducing
IPPORT_EPHEMERALFIRST and IPPORT_EPHEMERALLAST with values
10000 and 65535 respectively.
The rationale behind is that it makes the attacker's life more
difficult if he/she wants to guess the ephemeral port range and
also lowers the probability of a port colision (described in
draft-ietf-tsvwg-port-randomization-01.txt).

While there, remove code duplication in in_pcbbind_setup().

Submitted by:	Fernando Gont <fernando at gont.com.ar>
Approved by:	njl (mentor)
Reviewed by:	silby, bms
Discussed on:	freebsd-net
2008-03-04 19:16:21 +00:00
Bruce M Simpson
71498f308b Import rewrite of IPv4 socket multicast layer to support source-specific
and protocol-independent host mode multicast. The code is written to
accomodate IPv6, IGMPv3 and MLDv2 with only a little additional work.

This change only pertains to FreeBSD's use as a multicast end-station and
does not concern multicast routing; for an IGMPv3/MLDv2 router
implementation, consider the XORP project.

The work is based on Wilbert de Graaf's IGMPv3 code drop for FreeBSD 4.6,
which is available at: http://www.kloosterhof.com/wilbert/igmpv3.html

Summary
 * IPv4 multicast socket processing is now moved out of ip_output.c
   into a new module, in_mcast.c.
 * The in_mcast.c module implements the IPv4 legacy any-source API in
   terms of the protocol-independent source-specific API.
 * Source filters are lazy allocated as the common case does not use them.
   They are part of per inpcb state and are covered by the inpcb lock.
 * struct ip_mreqn is now supported to allow applications to specify
   multicast joins by interface index in the legacy IPv4 any-source API.
 * In UDP, an incoming multicast datagram only requires that the source
   port matches the 4-tuple if the socket was already bound by source port.
   An unbound socket SHOULD be able to receive multicasts sent from an
   ephemeral source port.
 * The UDP socket multicast filter mode defaults to exclusive, that is,
   sources present in the per-socket list will be blocked from delivery.
 * The RFC 3678 userland functions have been added to libc: setsourcefilter,
   getsourcefilter, setipv4sourcefilter, getipv4sourcefilter.
 * Definitions for IGMPv3 are merged but not yet used.
 * struct sockaddr_storage is now referenced from <netinet/in.h>. It
   is therefore defined there if not already declared in the same way
   as for the C99 types.
 * The RFC 1724 hack (specify 0.0.0.0/8 addresses to IP_MULTICAST_IF
   which are then interpreted as interface indexes) is now deprecated.
 * A patch for the Rhyolite.com routed in the FreeBSD base system
   is available in the -net archives. This only affects individuals
   running RIPv1 or RIPv2 via point-to-point and/or unnumbered interfaces.
 * Make IPv6 detach path similar to IPv4's in code flow; functionally same.
 * Bump __FreeBSD_version to 700048; see UPDATING.

This work was financially supported by another FreeBSD committer.

Obtained from:  p4://bms_netdev
Submitted by:   Wilbert de Graaf (original work)
Reviewed by:    rwatson (locking), silence from fenner,
		net@ (but with encouragement)
2007-06-12 16:24:56 +00:00
Bruce M Simpson
ad3b9f70ed Add INADDR_ALLRPTS_GROUP define for 224.0.0.22 for future IGMPv3 support.
Obtained from:	OpenSolaris
2007-02-27 14:45:37 +00:00
Bruce M Simpson
1976bc4af7 Import macros IN_LINKLOCAL(), IN_PRIVATE(), IN_LOCAL_GROUP(), IN_ANY_LOCAL().
This is not a functional change.

IN_LINKLOCAL() tests if an address falls within the IPv4 link-local prefix.
IN_PRIVATE() tests if an address falls within an RFC 1918 private prefix.
IN_LOCAL_GROUP() tests if an address falls within the statically assigned
link-local multicast scope specified in RFC 2365.
IN_ANY_LOCAL() tests for either of IN_LINKLOCAL() or IN_LOCAL_GROUP().

As with the existing macros in the FreeBSD netinet stack, comparisons
are performed in host-byte order.

See also:	RFC 1918, RFC 2365, RFC 3927
Obtained from:	NetBSD (dyoung@)
MFC after:	2 weeks
2007-01-31 14:34:47 +00:00
Paolo Pisati
ff2f6fe80f Summer of Code 2005: improve libalias - part 2 of 2
With the second (and last) part of my previous Summer of Code work, we get:

-ipfw's in kernel nat

-redirect_* and LSNAT support

General information about nat syntax and some examples are available
in the ipfw (8) man page. The redirect and LSNAT syntax are identical
to natd, so please refer to natd (8) man page.

To enable in kernel nat in rc.conf, two options were added:

o firewall_nat_enable: equivalent to natd_enable

o firewall_nat_interface: equivalent to natd_interface

Remember to set net.inet.ip.fw.one_pass to 0, if you want the packet
to continue being checked by the firewall ruleset after being
(de)aliased.

NOTA BENE: due to some problems with libalias architecture, in kernel
nat won't work with TSO enabled nic, thus you have to disable TSO via
ifconfig (ifconfig foo0 -tso).

Approved by: glebius (mentor)
2006-12-29 21:59:17 +00:00
Bruce M Simpson
3548bfc964 Fix a long-standing limitation in IPv4 multicast group membership.
By making the imo_membership array a dynamically allocated vector,
this minimizes disruption to existing IPv4 multicast code. This
change breaks the ABI for the kernel module ip_mroute.ko, and may
cause a small amount of churn for folks working on the IGMPv3 merge.

Previously, sockets were subject to a compile-time limitation on
the number of IPv4 group memberships, which was hard-coded to 20.
The imo_membership relationship, however, is 1:1 with regards to
a tuple of multicast group address and interface address. Users who
ran routing protocols such as OSPF ran into this limitation on machines
with a large system interface tree.
2006-05-14 14:22:49 +00:00
Xin LI
92e0a4a2a4 Use consistent indent character as other IPPROTO_* lines did. 2005-12-20 09:38:03 +00:00
George V. Neville-Neil
496f9fc522 Add protocol number for SCTP.
Submitted by:	Randall Stewart rrs at cisco.com
MFC after:	1 week
2005-12-20 09:24:04 +00:00
Andre Oppermann
b2828ad291 Implement IP_DONTFRAG IP socket option enabling the Don't Fragment
flag on IP packets.  Currently this option is only repected on udp
and raw ip sockets.  On tcp sockets the DF flag is controlled by the
path MTU discovery option.

Sending a packet larger than the MTU size of the egress interface
returns an EMSGSIZE error.

Discussed with:	rwatson
Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-09-26 20:25:16 +00:00
Robert Watson
b1c53bc9c0 Take a first cut at cleaning up ifnet removal and multicast socket
panics, which occur when stale ifnet pointers are left in struct
moptions hung off of inpcbs:

- Add in_ifdetach(), which matches in6_ifdetach(), and allows the
  protocol to perform early tear-down on the interface early in
  if_detach().

- Annotate that if_detach() needs careful consideration.

- Remove calls to in_pcbpurgeif0() in the handling of SIOCDIFADDR --
  this is not the place to detect interface removal!  This also
  removes what is basically a nasty (and now unnecessary) hack.

- Invoke in_pcbpurgeif0() from in_ifdetach(), in both raw and UDP
  IPv4 sockets.

It is now possible to run the msocket_ifnet_remove regression test
using HEAD without panicking.

MFC after:	3 days
2005-09-18 17:36:28 +00:00
Andre Oppermann
936cd18dad Add socketoption IP_MINTTL. May be used to set the minimum acceptable
TTL a packet must have when received on a socket.  All packets with a
lower TTL are silently dropped.  Works on already connected/connecting
and listening sockets for RAW/UDP/TCP.

This option is only really useful when set to 255 preventing packets
from outside the directly connected networks reaching local listeners
on sockets.

Allows userland implementation of 'The Generalized TTL Security Mechanism
(GTSM)' according to RFC3682.  Examples of such use include the Cisco IOS
BGP implementation command "neighbor ttl-security".

MFC after:	2 weeks
Sponsored by:	TCP/IP Optimization Fundraise 2005
2005-08-22 16:13:08 +00:00
Gleb Smirnoff
a97719482d Add CARP (Common Address Redundancy Protocol), which allows multiple
hosts to share an IP address, providing high availability and load
balancing.

Original work on CARP done by Michael Shalayeff, with many
additions by Marco Pfatschbacher and Ryan McBride.

FreeBSD port done solely by Max Laier.

Patch by:	mlaier
Obtained from:	OpenBSD (mickey, mcbride)
2005-02-22 13:04:05 +00:00
Warner Losh
c398230b64 /* -> /*- for license, minor formatting changes 2005-01-07 01:45:51 +00:00
Andre Oppermann
539be79a9d Pre-emptively define IPPROTO_SPACER to 32767, the same value as PROTO_SPACER
to document that this value is globally assigned for a special purpose and
may not be reused within the IPPROTO number space.
2004-10-19 20:59:01 +00:00