4276 Commits

Author SHA1 Message Date
bz
e7eef5c811 Fix PAWS (Protect Against Wrapped Sequence numbers) in cases when
hz >> 1000 and thus getting outside the timestamp clock frequency of
1ms < x < 1s per tick as mandated by RFC1323, leading to connection
resets on idle connections.

Always use a granularity of 1ms using getmicrouptime() making all but
relevant callouts independent of hz.

Use getmicrouptime(), not getmicrotime() as the latter may make a jump
possibly breaking TCP nfsroot mounts having our timestamps move forward
for more than 24.8 days in a second without having been idle for that
long.

PR:		kern/61404
Reviewed by:	jhb, mav, rrs
Discussed with:	silby, lstewart
Sponsored by:	Sandvine Incorporated (originally in 2011)
MFC after:	6 weeks
2012-02-15 16:09:56 +00:00
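
The 24.8-day figure in the message above follows from 32-bit timestamp arithmetic: at 1 ms per tick, half of the timestamp space (2^31 ticks) spans roughly 24.8 days, so a larger forward jump looks like a wrapped, older value. A minimal sketch of that arithmetic, assuming hypothetical helper names (ts_ms_from_uptime, ts_before, ts_half_space_days) that are not taken from the kernel sources:

```c
#include <stdint.h>
#include <sys/time.h>

/*
 * Hypothetical helper: turn an uptime sample into a 1 ms timestamp
 * clock value, independent of hz. The value wraps modulo 2^32.
 */
static uint32_t
ts_ms_from_uptime(struct timeval tv)
{
	return ((uint32_t)((uint64_t)tv.tv_sec * 1000 + tv.tv_usec / 1000));
}

/* Wrap-safe "a is older than b" comparison on 32-bit timestamps. */
static int
ts_before(uint32_t a, uint32_t b)
{
	return ((int32_t)(a - b) < 0);
}

/* Days covered by half of the 32-bit timestamp space at 1 ms per tick. */
static double
ts_half_space_days(void)
{
	return (2147483648.0 / 1000.0 / 86400.0);
}
```

The comparison stays correct across a wrap only while the two timestamps are less than 2^31 ticks apart, which is why a clock that can jump (as wall-clock time can) is a poor source for it.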
tuexen
bcfaf51260 Fix a bug where the wrong protocol overhead was used. This can lead
to a deadlock of an association when an IPv6 socket was used to
communicate with IPv4 and an ICMPv4 fragmentation needed message
was received.
While there, simplify the code a bit.

MFC after: 3 days.
2012-02-14 12:00:34 +00:00
glebius
1b77c1e93b Set vnet context in callouts and taskqueues.
PR:		164696
2012-02-08 13:39:38 +00:00
glebius
d050a38ab4 Make the 'tcpwin' option of ipfw(8) accept ranges and lists.
Submitted by:	sem
2012-02-06 11:35:29 +00:00
tuexen
8b885b79f5 Fix a typo which was already fixed by eadler in r227489. We failed
to integrate this fix in our code base, so it was removed in r227755.

MFC after: 3 days.
2012-02-06 10:47:12 +00:00
glebius
4326beb059 Add new socket options: TCP_KEEPINIT, TCP_KEEPIDLE, TCP_KEEPINTVL and
TCP_KEEPCNT, that allow control of the initial timeout, idle time, idle
re-send interval and idle send count on a per-socket basis.

Reviewed by:	andre, bz, lstewart
2012-02-05 16:53:02 +00:00
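
A userland caller might use the new options roughly as follows. This is a sketch only: set_keepalive() is a hypothetical helper, all values are assumed to be in seconds (counts for TCP_KEEPCNT), and TCP_KEEPINIT is guarded because not every platform's headers define it.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: enable keepalive on a TCP socket and tune it
 * per socket. Returns 0 on success, -1 on the first failing call.
 */
static int
set_keepalive(int fd, int init, int idle, int intvl, int cnt)
{
	int on = 1;

	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
		return (-1);
#ifdef TCP_KEEPINIT
	/* Timeout for the initial connection establishment. */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINIT, &init, sizeof(init)) == -1)
		return (-1);
#else
	(void)init;	/* option not available on this platform */
#endif
	/* Idle time before the first keepalive probe. */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) == -1)
		return (-1);
	/* Interval between probes while the peer stays silent. */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) == -1)
		return (-1);
	/* Number of unanswered probes before the connection is dropped. */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) == -1)
		return (-1);
	return (0);
}
```

For example, set_keepalive(fd, 30, 60, 10, 5) would ask for probing after 60 idle seconds, every 10 seconds, giving up after 5 unanswered probes.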
glebius
c5de302211 o Provide functions carp_ifa_addroute()/carp_ifa_delroute()
to cleanup routes from a single ifa.
o Implement carp_addroute()/carp_delroute() via above functions.
o Call carp_ifa_delroute() in the carp_detach() to avoid
  junk routes left in routing table, in case if user
  removes an address in a MASTER state. [1]

Reported by:	az [1]
2012-02-01 11:07:41 +00:00
luigi
54939631d7 a variable was erroneously declared as 32 bit instead of 64.
MFC after:	3 days
2012-01-27 13:26:25 +00:00
glebius
b7d04855ff Remove unused variable. 2012-01-24 14:27:14 +00:00
bz
2e7b42f00f Make #error messages string-literals and remove punctuation.
Reported by:	bde (for ip_divert)
Reviewed by:	bde
MFC after:	3 days
2012-01-22 10:41:58 +00:00
bz
6aadb2bfe6 Fix ip_divert handling of inet and inet6 and module building some more.
Properly sort the "carp" case in modules/Makefile after it was renamed.

Reported by:	bde (most)
Reviewed by:	bde
MFC after:	3 days
2012-01-22 02:16:31 +00:00
bz
a8d3ef905d Clean up some #endif comments, removing them from short sections.
Add #endif comments to longer ones, also refining strange ones.

Properly use #ifdef rather than #if defined() where possible.  Four
#if defined(PCBGROUP) occurrences (netinet and netinet6) were ignored to
avoid conflicts with eventually upcoming changes for RSS.

Reported by:	bde (most)
Reviewed by:	bde
MFC after:	3 days
2012-01-22 02:13:19 +00:00
bz
0aae67830d Remove a superfluous INET6 check (no opt_inet6.h included anyway).
MFC after:	3 days
2012-01-20 17:18:54 +00:00
tuexen
5b1bac3ecb Fix a problem when using the CBAPI.
While there, remove an old comment which does not apply anymore.
2012-01-20 13:26:11 +00:00
glebius
1725d903e6 Drop support for SIOCSIFADDR, SIOCSIFNETMASK, SIOCSIFBRDADDR, SIOCSIFDSTADDR
ioctl commands.

PR:		163524
Reviewed by:	net
2012-01-16 09:53:24 +00:00
tuexen
a34eb79030 Two cleanups. No functional change. 2012-01-15 13:35:55 +00:00
tuexen
f8b7dd3132 Fix two bugs, which result in a panic when calling getsockopt()
using SCTP_RECVINFO or SCTP_NXTINFO.
Reported by Clement Lecigne and forwarded to us by zi@.

MFC after: 3 days.
2012-01-14 09:10:20 +00:00
glebius
7b863fa87e Bunch of fixes to pfsync(4) module load/unload:
o Make the pfsync.ko actually usable. Before this change loading it
  didn't register protosw, so was a nop. However, the presence of the
  module in /boot/kernel confused users.
o Rewrite the way we are joining multicast group:
  - Move multicast initialization/destruction to separate functions.
  - Don't allocate memory if we aren't going to join a multicast group.
  - Use modern API for joining/leaving multicast group.
  - Now the utterly wrong pfsync_ifdetach() isn't needed.
o Move module initialization from SYSINIT(9) to moduledata_t method.
o Refuse to unload module, unless asked forcibly.
o Improve a bit some FreeBSD porting code:
  - Use separate malloc type.
  - Simplify swi scheduling.

This change is probably wrong from VIMAGE viewpoint, however pfsync
wasn't VIMAGE-correct before this change, too.

Glanced at by:	bz
2012-01-09 08:50:22 +00:00
glebius
aab03c16b7 Make it possible to use alternative source hardware address
in the ARP datagram generated by arprequest(). If caller doesn't
supply the address, then it is either picked from CARP or hardware
address of the interface is taken.

While here, make several minor fixes:

- Hold IF_ADDR_RLOCK(ifp) while traversing address list.
- Remove an untrue comment.
- Access internet address and mask via in_ifaddr fields,
  rather than ifaddr.
2012-01-08 17:25:15 +00:00
glebius
c48f3c2002 Provide IA_MASKSIN() macro similar to IA_SIN() and IA_DSTSIN(). 2012-01-08 17:20:29 +00:00
glebius
f99edf0f86 Move arprequest() declaration to if_ether.h. 2012-01-08 13:34:00 +00:00
tuexen
ebc0602463 Add an SCTP sysctl "blackhole", similar to the one for TCP.
If set to 1, no ABORT is sent back in response to an incoming
INIT. If set to 2, no ABORT is sent back in response to
an out of the blue packet. If set to 0 (the default), ABORTs
are sent.
Discussed with rrs@.

MFC after: 1 month.
2012-01-08 09:56:24 +00:00
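
The three sysctl values described above can be read as a small decision table. The sketch below is illustrative only: the enum and function names are hypothetical, and it assumes that value 2 also suppresses the ABORT for an incoming INIT (by analogy with the TCP sysctl), which the message does not state explicitly.

```c
/* Hypothetical classification of the packet that triggered the check. */
enum pkt_kind {
	PKT_INIT,	/* incoming INIT that is not accepted */
	PKT_OOTB	/* any other out-of-the-blue packet */
};

/* Returns 1 if an ABORT should be sent back, 0 if it is suppressed. */
static int
should_send_abort(int blackhole, enum pkt_kind kind)
{
	switch (blackhole) {
	case 0:
		return (1);			/* default: always answer */
	case 1:
		return (kind != PKT_INIT);	/* stay silent for INITs only */
	default:
		return (0);			/* assumed: 2 silences both cases */
	}
}
```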
tuexen
792f7ea9b5 Retire the SCTP sysctl "strict_init". We always perform the validation
and there is no reason to make it configurable.
Discussed with rrs@.
2012-01-07 14:04:00 +00:00
tuexen
582eaf225d Improve the handling of received INITs. Send an ABORT when
not accepting the connection. Also fix a crash, which
could happen when the user closed the socket.

MFC after: 1 month.
2012-01-07 13:03:33 +00:00
eadler
54d3f8299b - Fix sysctl description
PR:		163623
Submitted by:	Eugene Grosbein <eugen@eg.sd.rdtc.ru>
Approved by:	bz
2012-01-07 00:11:36 +00:00
tuexen
3a9031f3c3 Use NULL instead of 0.
MFC after: 1 month.
2012-01-06 20:20:59 +00:00
np
8b68a40419 Always release the inp lock before returning from tcp_detach.
MFC after:	5 days
2012-01-06 18:29:40 +00:00
jhb
4559cec957 Tweak the last fix to match what was actually tested.
Pointy hat to:	jhb
2012-01-06 12:49:01 +00:00
pluknet
98532dbe82 Fix a typo.
X-MFC-with:	229665
2012-01-06 00:23:17 +00:00
jhb
dc84417a34 Remove the assertion from tcp_input() that rcv_nxt is always greater
than or equal to rcv_adv and fix tcp_twstart() to handle this case by
assuming the last window was zero rather than a negative value.

The code in tcp_input() already safely handled this case.  It can happen
due to delayed ACKs along with a remote sender that sends data beyond
the window we previously advertised.  If we have room in our socket buffer
for the extra data beyond the advertised window, we will accept it.
However, if the ACK for that segment is delayed, then we will not
effectively fixup rcv_adv to account for that extra data until the
next segment arrives and forces out an ACK.  When that next segment
arrives, rcv_nxt will be beyond rcv_adv.

Tested by:	pjd
MFC after:	1 week
2012-01-05 22:29:11 +00:00
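
The "assume the last window was zero" rule above can be sketched with wrap-safe sequence arithmetic. SEQ_GT is defined locally here in the style of the kernel macro of the same name; last_advertised_window() is a hypothetical helper, not the actual tcp_twstart() code.

```c
#include <stdint.h>

/* Wrap-safe "a is after b" for 32-bit TCP sequence numbers. */
#define	SEQ_GT(a, b)	((int32_t)((uint32_t)(a) - (uint32_t)(b)) > 0)

/*
 * Hypothetical helper: the window still open at the receiver.
 * If rcv_nxt has moved past rcv_adv (data accepted beyond the last
 * advertised window while the ACK was delayed), report zero instead
 * of a negative value.
 */
static uint32_t
last_advertised_window(uint32_t rcv_adv, uint32_t rcv_nxt)
{
	if (SEQ_GT(rcv_adv, rcv_nxt))
		return (rcv_adv - rcv_nxt);
	return (0);
}
```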
jhb
4ef366671a Convert all users of IF_ADDR_LOCK to use new locking macros that specify
either a read lock or write lock.

Reviewed by:	bz
MFC after:	2 weeks
2012-01-05 19:00:36 +00:00
jhb
fad5a4a690 Use a helper variable to wrap a long line. 2012-01-04 13:29:26 +00:00
jhb
bae1448c5e In the handling of the SIOC[DG]LIFADDR ioctls in in_lifaddr_ioctl(), add
missing interface address list locking and grab a reference on the
matching interface address after dropping the lock while it is used to
avoid a potential use after free.

Reviewed by:	bz
MFC after:	1 week
2012-01-04 13:26:56 +00:00
jhb
90b391fc12 Fix the SIOC[DG]LIFADDR ioctls in in_lifaddr_ioctl() to work with IPv4
interface address rather than IPv6.

Submitted by:	hrs
Reviewed by:	bz
MFC after:	1 week
2012-01-04 13:23:51 +00:00
jhb
66d3d3405c When cancelling multicast timers on an interface, don't release the
reference on a group in the leaving state while iterating over the loop.
Instead, use the same approach used in igmp_ifdetach() and mld_ifdetach()
of placing the groups to free on pending release list and then releasing
the references after dropping the IF_ADDR_LOCK.  This closes an ugly race
where the code was dropping the lock in the middle of iterating over the
list.  It also fixes some additional potential use-after-free bugs since
the cancellation routine also applied other changes to the group after
dropping the reference.  Now those changes are performed before the
reference is dropped and the group is potentially freed.

Prodded to fix by:	glebius
Reviewed by:	bz
MFC after:	1 week
2012-01-03 20:34:52 +00:00
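
The pending-release approach described above can be sketched in userland as follows. Everything here is a hypothetical stand-in (struct group, addr_lock, cancel_timers, a pthread mutex in place of the kernel lock); the point is only the ordering: make all changes and queue the group while the lock is held, then drop references after unlocking.

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical multicast group record; not the kernel's structure. */
struct group {
	int refcnt;
	int leaving;			/* holds one extra reference */
	int timer;
	LIST_ENTRY(group) link;		/* membership list, under addr_lock */
	SLIST_ENTRY(group) pending;	/* pending-release list */
};
LIST_HEAD(group_list, group);
SLIST_HEAD(pending_list, group);

static pthread_mutex_t addr_lock = PTHREAD_MUTEX_INITIALIZER;

static struct group *
group_new(int leaving)
{
	struct group *g = calloc(1, sizeof(*g));

	if (g == NULL)
		return (NULL);
	g->refcnt = leaving ? 2 : 1;	/* list reference + leaving reference */
	g->leaving = leaving;
	return (g);
}

static void
group_release(struct group *g)
{
	if (--g->refcnt == 0)
		free(g);
}

static void
cancel_timers(struct group_list *groups)
{
	struct pending_list to_release = SLIST_HEAD_INITIALIZER(to_release);
	struct group *g;

	pthread_mutex_lock(&addr_lock);
	LIST_FOREACH(g, groups, link) {
		g->timer = 0;		/* all changes happen before release */
		if (g->leaving) {
			g->leaving = 0;
			SLIST_INSERT_HEAD(&to_release, g, pending);
		}
	}
	pthread_mutex_unlock(&addr_lock);

	/* The lock is no longer held, so dropping references is safe. */
	while ((g = SLIST_FIRST(&to_release)) != NULL) {
		SLIST_REMOVE_HEAD(&to_release, pending);
		group_release(g);
	}
}
```

Because the lock is never dropped inside the loop, the iteration cannot race with concurrent list changes, and no field of a group is touched after its reference has been given up.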
jhb
dd61fe0873 Use TAILQ_FOREACH() instead of TAILQ_FOREACH_SAFE() for some loops that
do not modify the queues they iterate over.

Submitted by:	glebius
2012-01-03 16:22:29 +00:00
bz
d9de01105e As I came by and noticed, add a comment that inp locking is a bit optimistic
(read: non-existent) here and should be fixed.
2012-01-02 09:18:58 +00:00
jhb
419867b9b4 Defer the work of freeing IPv4 multicast options from a socket to an
asynchronous task.  This avoids tearing down multicast state including
sending IGMP leave messages and reprogramming MAC filters while holding
the per-protocol global pcbinfo lock that is used in the receive path of
packet processing.

Reviewed by:	rwatson
MFC after:	1 month
2011-12-29 20:41:16 +00:00
jhb
7a0151720c Use queue(3) macros instead of home-rolled versions in several places in
the INET6 code.  This includes retiring the 'ndpr_next' and 'pfr_next'
macros.

Submitted by:	pluknet (earlier version)
Reviewed by:	pluknet
2011-12-29 18:25:18 +00:00
glebius
6d9bb65799 Don't fall back to a CARP address in BACKUP state. 2011-12-29 15:59:14 +00:00
tuexen
b9ef107414 Address issues found by clang. While there, fix also some style
issues.

MFC after: 3 months.
2011-12-27 10:16:24 +00:00
glebius
744cf7a5e4 Use a better log message for master down event. 2011-12-22 18:48:21 +00:00
glebius
653f8c5e71 Provide ABI compatibility shim to enable configuring of addresses
with ifconfig(8) prior to r228571.

Requested by:	brooks
2011-12-21 12:39:08 +00:00
glebius
8c74bad9f3 Restore a feature that was present in 5.x and 6.x, and was cleared in
7.x, 8.x and 9.x with pf(4) imports: pfsync(4) should suppress CARP
preemption, while it is running its bulk update.

However, reimplement the feature in a more elegant manner, that is
partially inspired by newer OpenBSD:

- Rename term "suppression" to "demotion", to match with OpenBSD.
- Keep a global demotion factor, that can be raised by several
  conditions, for now these are:
  - interface goes down
  - carp(4) has problems with ip_output() or ip6_output()
  - pfsync performs bulk update
- Unlike in OpenBSD the demotion factor isn't a counter, but
  is actual value added to advskew. The adjustment values for
  particular error conditions are also configurable, and their
  defaults are maximum advskew value, so a single failure bumps
  demotion to maximum. This is for POLA compatibility, and should
  satisfy most users.
- Demotion factor is a writable sysctl, so user can do
  foot shooting, if he desires to.
2011-12-20 13:53:31 +00:00
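
The "value added to advskew" behaviour above can be illustrated with a few lines of arithmetic. This is a sketch under assumptions: ASSUMED_MAXSKEW (240) and effective_advskew() are names and values chosen for illustration, and clamping to the maximum is assumed rather than quoted from the source.

```c
/* Assumed maximum advskew value; illustrative, not taken from the source. */
#define	ASSUMED_MAXSKEW	240

/*
 * Hypothetical helper: the advskew actually advertised is the
 * configured one plus the global demotion factor, clamped to the
 * valid range.
 */
static int
effective_advskew(int advskew, int demotion)
{
	int sum = advskew + demotion;

	if (sum > ASSUMED_MAXSKEW)
		return (ASSUMED_MAXSKEW);
	if (sum < 0)
		return (0);
	return (sum);
}
```

With per-condition adjustments defaulting to the maximum, a single failure already pushes the advertised skew to the limit, and further failures change nothing until the factor drops again.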
tuexen
3a4d069b21 Fix unused parameter warnings.
While there, fix some whitespace issues.

MFC after: 3 months.
2011-12-17 19:21:40 +00:00
glebius
25d9ab0f4e Since size of struct in_aliasreq has just been changed in r228571,
and thus ifconfig(8) needs recompile, it is a good chance to make
parameter checks on SIOCAIFADDR arguments more strict.
2011-12-16 13:30:17 +00:00
glebius
27a36f6ac8 A major overhaul of the CARP implementation. The ip_carp.c was started
from scratch, copying needed functionality from the old implementation
on demand, with a thorough review of all code. The main change is that
interface layer has been removed from the CARP. Now redundant addresses
are configured exactly on the interfaces, they run on.

The CARP configuration itself is, as before, configured and read via
SIOCSVH/SIOCGVH ioctls. A new prefix created with SIOCAIFADDR or
SIOCAIFADDR_IN6 may now be configured to a particular virtual host id,
which makes the prefix redundant.

ifconfig(8) semantics has been changed too: now one doesn't need
to clone carpXX interface, he/she should directly configure a vhid
on an Ethernet interface.

To supply vhid data from the kernel to an application the getifaddrs(3)
function has been changed to pass ifam_data with each address. [1]

The new implementation definitely closes all PRs related to carp(4)
being an interface, and may close several others. It also allows
to run a single redundant IP per interface.

Big thanks to Bjoern Zeeb for his help with inet6 part of patch, for
idea on using ifam_data and for several rounds of reviewing!

PR:		kern/117000, kern/126945, kern/126714, kern/120130, kern/117448
Reviewed by:	bz
Submitted by:	bz [1]
2011-12-16 12:16:56 +00:00
glebius
a062915218 Belatedly catch up with r151555. in_scrubprefix() also needs this fix. We
should compare not only addresses, but their masks, too, when searching
for matching prefix.
2011-12-13 06:56:43 +00:00
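
The matching rule described above (compare masks as well as addresses) can be sketched as follows. struct prefix and same_prefix() are hypothetical, with addresses and masks held as host-order 32-bit values for simplicity; this is not the in_scrubprefix() code itself.

```c
#include <stdint.h>

/* Hypothetical IPv4 prefix: address and netmask in host byte order. */
struct prefix {
	uint32_t addr;
	uint32_t mask;
};

/*
 * Two prefixes match only if their masks are equal and their
 * network parts are equal under that mask.
 */
static int
same_prefix(struct prefix a, struct prefix b)
{
	return (a.mask == b.mask &&
	    (a.addr & a.mask) == (b.addr & b.mask));
}
```

Comparing network parts alone would treat 10.0.0.1/8 and 10.0.0.2/24 as the same prefix whenever the shorter mask is applied to both, which is the kind of false match the mask comparison rules out.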
tuexen
e98fa27d6f Fix a bug reported by Irene Ruengeler which resulted in not sending
out HEARTBEATs when requested by the user. The HEARTBEATs were only
queued, but not actually sent out.

MFC after: 2 months.
2011-12-10 10:52:54 +00:00
glebius
35ed2c25b8 Fix a very special case when SIOCAIFADDR supplies mask of 0.0.0.0,
don't overwrite the mask with autoguessing based on classes.
2011-12-06 20:55:20 +00:00