Commit Graph

4716 Commits

Author SHA1 Message Date
Sergey Kandaurov
b8b4cfcdf6 Draft-ietf-tcpm-initcwnd-05 became RFC6928.
MFC after:	1 week
2013-12-26 04:24:08 +00:00
Bjoern A. Zeeb
415167d52b Add more (IPv6) related Internet Protocols:
- Host Identity Protocol (RFC5201)
- Shim6 Protocol (RFC5533)
- 2x experimentation and testing (RFC3692, RFC4727)

This does not indicate interest to implement/support these protocols,
but they are part of the "IPv6 Extension Header Types" [1] based on RFC7045
and might thus be needed by filtering and next header parsing
implementations.

References:	[1] http://www.iana.org/assignments/ipv6-parameters
Obtained from:	http://www.iana.org/assignments/protocol-numbers
MFC after:	1 week
2013-12-25 20:26:49 +00:00
Gleb Smirnoff
ec5df3a7b1 It'll be okay to use LibAliasDetachHandlers() here, relying
on the fact that all handlers come from modules' bss and are
followed by NODIR handler.
2013-12-25 09:43:51 +00:00
Gleb Smirnoff
535e0a0981 Cleanup alias module handler register/unregister.
- Remove locking, since all module(9) events are running under &Giant.
- Use TAILQ for protocol handlers and fix a bug which led to
  infinite cycle. Bug found in VirtualBox [1]
- Simplify code everywhere.
- Fix documentation.

[1]  https://www.virtualbox.org/pipermail/vbox-dev/2013-November/011936.html

PR:		183792 [1]
Submitted by:	Valery Ushakov <uwe NetBSD.org> [1]
Sponsored by:	Nginx, Inc.
2013-12-25 03:24:20 +00:00
Gleb Smirnoff
2fb87f0892 Kill space at eols. 2013-12-25 02:06:57 +00:00
Gleb Smirnoff
1019f603d5 Remove from kernel the "dll" code. 2013-12-25 01:58:19 +00:00
Gleb Smirnoff
22d3fb1917 Whitespace cleanup. 2013-12-25 01:52:55 +00:00
Dimitry Andric
36f54f0aaa In sys/netinet/in_mcast.c, inm_is_ifp_detached() is only used whenever
KTR is defined, so put it between #ifdef KTR guards.  This avoids a
warning about a unused function if KTR is not enabled.

MFC after:	 3 days
2013-12-24 20:25:18 +00:00
Adrian Chadd
ac7e121247 Disable the now unpredicably bogus check for whether we have
eneough queue space before queuing a bunch of IP fragments.

As the comment in the committed change says, in the post-if_transmit(),
post-SMP, post-preemption world, there's just too much overlapping
concurrent code paths and different approaches to driver transmit
queue management to have this code even remotely be effective.

The only specific place it could be useful is if ALTQ is enabled
but again it doesn't at all promise that all the fragments will be
transmitted anyway.

The main reason for committing this change is to disable a parallel
place where the drops counter is incremented.  This is a side effect
of an upcoming change to ixgbe/cxgbe to handle the queue drops
counter slightly better.

Sponsored by:	Netflix, Inc.
2013-12-20 07:41:03 +00:00
Eitan Adler
5f30ec9b63 In a situation where:
- The remote host sends a FIN
	- in an ACK for a sequence number for which an ACK has already
	  been received
	- There is still unacked data on route to the remote host
	- The packet does not contain a window update

The packet may be dropped without processing the FIN flag.

PR:		kern/99188
Submitted by:	Staffan Ulfberg <staffan@ulfberg.se>
Discussed with:	andre
MFC after:	never
2013-12-02 03:11:25 +00:00
Michael Tuexen
c302aeb123 In
http://svnweb.freebsd.org/changeset/base/258221
I introduced a bug which initialized global locks
whenever the SCTP stack initialized. This was fixed in
http://svnweb.freebsd.org/changeset/base/258574
by rodrigc@. He just initialized the locks for
the default vnet. This fix reverts to the old
behaviour before r258221, which explicitly makes
sure it is only called once, because this works also on
other platforms.
MFC after: 3 days
X-MFC with: r258574.
2013-11-30 12:51:19 +00:00
Andriy Gapon
d9fae5ab88 dtrace sdt: remove the ugly sname parameter of SDT_PROBE_DEFINE
In its stead use the Solaris / illumos approach of emulating '-' (dash)
in probe names with '__' (two consecutive underscores).

Reviewed by:	markj
MFC after:	3 weeks
2013-11-26 08:46:27 +00:00
Adrian Chadd
fa22ce1570 Convert over the TCP probes to use mtod() rather than directly
dereferencing m->m_data.

Sponsored by:	Netflix, Inc.
2013-11-25 22:55:06 +00:00
Craig Rodrigues
c0c61281b4 Only initialize some mutexes for the default VNET.
In r208160, sctp_it_ctl was made a global variable, across all VNETs.
However, sctp_init() is called for every VNET that is created.  This results
in the same global mutexes which are part of sctp_it_ctl being initialized.  This can result
in crashes if many jails are created.

To reproduce the problem:
  (1)  Take a GENERIC kernel config, and add options for: VIMAGE, WITNESS,
       INVARIANTS.
  (2)  Run this command in a loop:
       jail -l -u root -c path=/ name=foo persist vnet && jexec foo ifconfig lo0 127.0.0.1/8 && jail -r foo

       (see http://lists.freebsd.org/pipermail/freebsd-current/2010-November/021280.html )

Witness will warn about the same mutex being initialized.

Fix the problem by only initializing these mutexes in the default VNET.
2013-11-25 18:49:37 +00:00
Attilio Rao
54366c0bd7 - For kernel compiled only with KDTRACE_HOOKS and not any lock debugging
option, unbreak the lock tracing release semantic by embedding
  calls to LOCKSTAT_PROFILE_RELEASE_LOCK() direclty in the inlined
  version of the releasing functions for mutex, rwlock and sxlock.
  Failing to do so skips the lockstat_probe_func invokation for
  unlocking.
- As part of the LOCKSTAT support is inlined in mutex operation, for
  kernel compiled without lock debugging options, potentially every
  consumer must be compiled including opt_kdtrace.h.
  Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the
  dependency by opt_kdtrace.h for all files, as now only KDTRACE_FRAMES
  is linked there and it is only used as a compile-time stub [0].

[0] immediately shows some new bug as DTRACE-derived support for debug
in sfxge is broken and it was never really tested.  As it was not
including correctly opt_kdtrace.h before it was never enabled so it
was kept broken for a while.  Fix this by using a protection stub,
leaving sfxge driver authors the responsibility for fixing it
appropriately [1].

Sponsored by:	EMC / Isilon storage division
Discussed with:	rstone
[0] Reported by:	rstone
[1] Discussed with:	philip
2013-11-25 07:38:45 +00:00
Gleb Smirnoff
c1f7c3f500 In r257692 I intentionally deleted code that handled P2P interfaces
with equal addresses on both sides. It appeared that OpenVPN uses
such configutations.

Submitted by:	trociny
2013-11-17 15:14:07 +00:00
Mikolaj Golub
a3985bdd12 Deregister helper hooks on vnet destroy. 2013-11-17 15:09:39 +00:00
Michael Tuexen
2a44dbf682 Use SCTP_PR_SCTP_TTL when the user provides a positive
timetolive in sctp_sendmsg().

MFC after: 3 days
2013-11-16 19:57:56 +00:00
Michael Tuexen
04194e4f7d Remove a stray write operation.
MFC after: 3 days
2013-11-16 16:09:09 +00:00
Michael Tuexen
dcb3fc4cd6 When determining if an address belongs to an stcb, take the address family
into account for wildcard bound endpoints.

MFC after: 3 days
2013-11-16 15:34:14 +00:00
Michael Tuexen
f4f34bde23 Cleanups which result in fixes which have been made upstream
and where partially suggested by Andrew Galante.
There is no functional change in FreeBSD.

MFC after: 3 days
2013-11-16 15:04:49 +00:00
Gleb Smirnoff
555036b5f6 Remove never used ioctls that originate from KAME. The proof
of their zero usage was exp-run from misc/183538.
2013-11-11 05:39:42 +00:00
Gleb Smirnoff
2f3eb7f4d8 Make TCP_KEEP* socket options readable. At least PostgreSQL wants
to read the values.

Reported by:	sobomax
2013-11-08 13:04:14 +00:00
Michael Tuexen
de72f4e54b Get rid of the artification limitation enforced by
SCTP_AUTH_RANDOM_SIZE_MAX.
This was suggested by Andrew Galante.

MFC after: 3 days
2013-11-07 18:50:11 +00:00
Michael Tuexen
a9d94d290b Make sure that we don't try to build an ASCONF-ACK chunk
larger than what fits in the the mbuf cluster.
This issue was reported by Andrew Galante.

MFC after: 3 days
2013-11-07 17:08:09 +00:00
Michael Tuexen
c9eb4473b4 Use htons()/ntohs() appropriately.
These issues were reported by Andrew Galante.

MFC after: 3 days
2013-11-07 16:37:12 +00:00
Gleb Smirnoff
77b89ad837 Provide compat layer for OSIOCAIFADDR. 2013-11-06 19:46:20 +00:00
Gleb Smirnoff
821b5caf7a Fix my braino in r257692. For SIOCG*ADDR we don't need exact match on
specified address, actually in most cases the address isn't specified.

Reported by:	peter
2013-11-06 08:36:08 +00:00
Nathan Whitehorn
6224cd89c0 Fix build on GCC. 2013-11-06 01:14:00 +00:00
Gleb Smirnoff
fe9bfbcf5a netinet code no longer uses IFA_RTSELF. 2013-11-05 07:45:20 +00:00
Gleb Smirnoff
f7a39160c2 Rewrite in_control(), so that it is comprehendable without getting mad.
o Provide separate functions for SIOCAIFADDR and for SIOCDIFADDR, with
  clear code flow from beginning to the end. After that the rest of
  in_control() gets very small and clear.
o Provide sx(9) lock to protect against parallel ioctl() invocations.
o Reimplement logic from r201282, that tried to keep localhost route in
  table when multiple P2P interfaces with same local address are created
  and deleted.

Discussed with:		pluknet, melifaro
Sponsored by:		Netflix
Sponsored by:		Nginx, Inc.
2013-11-05 07:44:15 +00:00
Gleb Smirnoff
b1b9dcae46 Remove net.link.ether.inet.useloopback sysctl tunable. It was always on by
default from the very beginning. It was placed in wrong namespace
net.link.ether, originally it had been at another wrong namespace. It was
incorrectly documented at incorrect manual page arp(8). Since new-ARP commit,
the tunable have been consulted only on route addition, and ignored on route
deletion. Behaviour of a system with tunable turned off is not fully correct,
and has no advantages comparing to normal behavior.
2013-11-05 07:32:09 +00:00
Michael Tuexen
3b3d05d769 Unlock the lock before destroying it.
This issue was reported by Andrew Galante.

MFC after: 3 days
2013-11-03 14:00:17 +00:00
Michael Tuexen
b54ddf225f Changes from upstream to improve compilation when INET or INET6
or none of them is defined.

MFC after: 3 days
2013-11-02 20:12:19 +00:00
Gleb Smirnoff
586904c22e in_ifadown() can be void. 2013-11-01 10:29:10 +00:00
Gleb Smirnoff
237bf7f773 Cleanup in_ifscrub(), which is just an entry to in_scrubprefix(). 2013-11-01 10:18:41 +00:00
Michael Tuexen
6ed728108a Terminate a debug output with a \n. 2013-10-29 20:04:50 +00:00
Gleb Smirnoff
8d7cf9b5d4 Uninline inm_lookup_locked(). Now in_var.h doesn't dereference
fields of struct ifnet.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-29 11:21:31 +00:00
Michael Tuexen
92dfa76cbc Fis the value of *optlen when calling getsockopt() for
SCTP_REMOTE_UDP_ENCAPS_PORT.
This issue was reported by Andrew Galante.
MFC after: 3 days
2013-10-28 20:45:19 +00:00
Michael Tuexen
daac3e7db6 Fix compilation if SCTP_DONT_DO_PRIVADDR_SCOPE is defined.
The issue was reported by Andrew Galante.

MFC after: 3 days
2013-10-28 20:32:37 +00:00
Gleb Smirnoff
c3322cb91c Include necessary headers that now are available due to pollution
via if_var.h.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-28 07:29:16 +00:00
Gleb Smirnoff
eedc7fd9e8 Provide includes that are needed in these files, and before were read
in implicitly via if.h -> if_var.h pollution.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 18:18:50 +00:00
Gleb Smirnoff
76039bc84f The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
John Baldwin
3380883230 Finish r254925 and remove the last remaining sysctl name list macro. The
one port that used it has been fixed to use the more portable
getprotoent(3) instead.
2013-10-23 13:22:50 +00:00
Andre Oppermann
c1e5a6e5e8 The TCP delayed ACK logic isn't aware of LRO passing up large aggregated
segments thinking it received only one segment. This causes it to enable
the delay the ACK for 100ms to wait for another segment which may never
come because all the data was received already.

Doing delayed ACK for LRO segments is bogus for two reasons: a) it pushes
us further away from acking every other packet; b) it introduces additional
delay in responding to the sender.  The latter is especially bad because it
is in the nature of LRO to aggregated all segments of a burst with no more
coming until an ACK is sent back.

Change the delayed ACK logic to detect LRO segments by being larger than
the MSS for this connection and issuing an immediate ACK for them to keep
the ACK clock ticking without interruption.

Reported by:	julian, cperciva
Tested by:	cperciva
Reviewed by:	lstewart
MFC after:	3 days
2013-10-22 18:24:34 +00:00
Kevin Lo
9768475eb0 - Add parentheses to all internet addresses
- All the casts to uint32_t should be to in_addr_t

Suggested by:	bde
Reviewed by:	bde
2013-10-19 18:13:32 +00:00
Michael Tuexen
77dabf96d9 Remove a buggy comparision when setting manually the path MTU.
After fixing, the comparision would have become redundant.
Thanks to Andrew Galante for reporting the issue.

MFC after:	3 days
2013-10-15 20:21:27 +00:00
Gleb Smirnoff
7caf4ab7ac - Utilize counter(9) to accumulate statistics on interface addresses. Add
four counters to struct ifaddr. This kills '+=' on a variables shared
  between processors for every packet.
- Nuke struct if_data from struct ifaddr.
- In ip_input() do not put a reference on ifaddr, instead update statistics
  right now in place and do IN_IFADDR_RUNLOCK(). These removes atomic(9)
  for every packet. [1]
- To properly support NET_RT_IFLISTL sysctl used by getifaddrs(3), in
  rtsock.c fill if_data fields using counter_u64_fetch().
- Accidentially fix bug in COMPAT_32 version of NET_RT_IFLISTL, which
  took if_data not from the ifaddr, but from ifaddr's ifnet. [2]

Submitted by:	melifaro [1], pluknet[2]
Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-15 11:37:57 +00:00
Gleb Smirnoff
4675896098 Remove ifa_init() and provide ifa_alloc() that will allocate and setup
struct ifaddr internally.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-15 10:31:42 +00:00
Gleb Smirnoff
6ed910fabe Hide 'struct ifaddr' definition from userland. Two tools left that use it,
namely ipftest(1) and ifmcstat(1). These sniff structure definition using
_WANT_IFADDR define.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-15 10:19:24 +00:00