Commit Graph

3728 Commits

Author SHA1 Message Date
Luigi Rizzo
642dddf0f8 fix handling of commands issued by RELENG_7 version of /sbin/ipfw,
Submitted by:	Riccardo Panicucci
2010-03-10 14:21:05 +00:00
Qing Li
c7ea0aa648 One of the advantages of enabling ECMP (a.k.a RADIX_MPATH) is to
allow for connection load balancing across interfaces. Currently
the address alias handling method is colliding with the ECMP code.
For example, when two interfaces are configured on the same prefix,
only one prefix route is installed. So connection load balancing
among the available interfaces is not possible.

The other advantage of ECMP is for failover. The issue with the
current code, is that the interface link-state is not reflected
in the route entry. For example, if there are two interfaces on
the same prefix, the cable on one interface is unplugged, new and
existing connections should switch over to the other interface.
This is not done today and packets go into a black hole.

Also, there is a small bug in the kernel where deleting ECMP routes
in the userland will always return an error even though the command
is successfully executed.

MFC after:	5 days
2010-03-09 01:11:45 +00:00
Luigi Rizzo
feadd2b1ca cosmetic changes and C++ compatibility 2010-03-08 11:27:39 +00:00
Luigi Rizzo
d12cc63303 don't use C++ keywords as variable names 2010-03-08 11:27:08 +00:00
Luigi Rizzo
b854138d5f do not report an error unnecessarily 2010-03-08 11:22:47 +00:00
Bjoern A. Zeeb
376aadf896 Destroy TCP UMA zones (empty or not) upon network stack teardown
to not leak them, otherwise making UMA/vmstat unhappy with every stoped vnet.
We will still leak pages (especially for zones marked NOFREE).

Reshuffle cleanup order in tcp_destroy() to get rid of what we can
easily free first.

Sponsored by:	ISPsystem
Reviewed by:	rwatson
MFC after:	5 days
2010-03-07 15:58:44 +00:00
Bjoern A. Zeeb
e253cdd07c Not only flush the ipfw tables when unloading ipfw or tearing
down a virtual netowrk stack, but also free the Radix Node Head.

Sponsored by:	ISPsystem
Reviewed by:	julian
MFC after:	5 days
2010-03-07 15:37:58 +00:00
Robert Watson
1f821c53f0 Locking the tcbinfo structure should not be necessary in tcp_timer_delack(),
so don't.

MFC after:      1 week
Reviewed by:    bz
Sponsored by:   Juniper Networks
2010-03-07 14:23:44 +00:00
Robert Watson
2bf3ce088d Add comment in tcp_discardcb() talking about how we don't, but should,
address TCP races relating to not calling tcp_drain() on stopped callouts.

Discussed with:	bz
2010-03-07 14:13:59 +00:00
Robert Watson
68b5629bf5 Make udp_set_kernel_tunneling() less forgiving when its invariants are
violated: so_pcb can never be NULL for a valid UDP socket, and it is
always SOCK_DGRAM.  Use sotoinpcb() as the rest of the UDP code does.

MFC after:	1 week
Reviewed by:	bz
Sponsored by:	Juniper Networks
2010-03-07 10:47:47 +00:00
Robert Watson
1d7429e0a9 Remove unnecessary locking of divcbinfo lock from div_output(): this has not
been required since FreeBSD 7.0 when the so_pcb pointer leading to inp was
guaranteed to be stable when a valid socket reference is held (as it is in
the output path).

MFC after:	1 week
Reviewed by:	bz
Sponsored by:	Juniper Networks
2010-03-06 22:04:45 +00:00
Robert Watson
8296cddfdd Add a comment to tcp_usr_accept() to indicate why it is we acquire the
tcbinfo lock there: r175612, which re-added it, masked a race between
sonewconn(2) and accept(2) that could allow an incompletely initialized
address on a newly-created socket on a listen queue to be exposed.  Full
details can be found in that commit message.

MFC after:	1 week
Sponsored by:	Juniper Networks
2010-03-06 21:38:31 +00:00
Bjoern A. Zeeb
391dab1c2d Destroy UDP UMA zones (empty or not) upon network stack teardown
to not leak them making the VM subsystem unhappy with every stoped vnet(*).
We will still leak pages (especially as zones are marked NOFREE).

(*) This will also keep vmstat -z more usable.

Sponsored by:	ISPsystem
MFC after:	5 days
2010-03-06 21:24:32 +00:00
Robert Watson
66f80e90ef Wrap use of rw_try_upgrade() on pcbinfo with macro INP_INFO_TRY_UPGRADE()
to match other pcbinfo locking macros.

MFC after:	1 week
2010-03-06 21:24:11 +00:00
Luigi Rizzo
67d079f342 plug a memory leak on pipe's reconfiguration 2010-03-05 17:53:28 +00:00
Luigi Rizzo
6a82d14731 fix a memory leak when deleting RED queues 2010-03-05 12:58:19 +00:00
Luigi Rizzo
b05934e2cb portability fixes 2010-03-04 21:52:40 +00:00
Luigi Rizzo
ae8b199313 don't use keywords as variable names. 2010-03-04 21:01:59 +00:00
Luigi Rizzo
44e510399b use callout_drain() (outside the lock) when unloading the module.
This prevents a potential deadlock.

Submitted by:	Francesco Magno
2010-03-04 16:53:38 +00:00
Luigi Rizzo
6aada3117b improve compatibility with RELENG_7.2 2010-03-04 16:52:26 +00:00
Luigi Rizzo
cc4d3c30ea Bring in the most recent version of ipfw and dummynet, developed
and tested over the past two months in the ipfw3-head branch.  This
also happens to be the same code available in the Linux and Windows
ports of ipfw and dummynet.

The major enhancement is a completely restructured version of
dummynet, with support for different packet scheduling algorithms
(loadable at runtime), faster queue/pipe lookup, and a much cleaner
internal architecture and kernel/userland ABI which simplifies
future extensions.

In addition to the existing schedulers (FIFO and WF2Q+), we include
a Deficit Round Robin (DRR or RR for brevity) scheduler, and a new,
very fast version of WF2Q+ called QFQ.

Some test code is also present (in sys/netinet/ipfw/test) that
lets you build and test schedulers in userland.

Also, we have added a compatibility layer that understands requests
from the RELENG_7 and RELENG_8 versions of the /sbin/ipfw binaries,
and replies correctly (at least, it does its best; sometimes you
just cannot tell who sent the request and how to answer).
The compatibility layer should make it possible to MFC this code in a
relatively short time.

Some minor glitches (e.g. handling of ipfw set enable/disable,
and a workaround for a bug in RELENG_7's /sbin/ipfw) will be
fixed with separate commits.

CREDITS:
This work has been partly supported by the ONELAB2 project, and
mostly developed by Riccardo Panicucci and myself.
The code for the qfq scheduler is mostly from Fabio Checconi,
and Marta Carbone and Francesco Magno have helped with testing,
debugging and some bug fixes.
2010-03-02 17:40:48 +00:00
Joel Dahl
7df6f59359 The NetBSD Foundation has granted permission to remove clause 3 and 4 from
their software.

Obtained from:	NetBSD
2010-03-01 17:05:46 +00:00
Bjoern A. Zeeb
aa3f803697 Upon virtual network stack teardown properly release the TCP syncache
resources.

Sponsored by:	ISPsystem
Reviewed by:	rwatson
MFC After:	5 days
2010-02-20 21:45:04 +00:00
Michael Tuexen
7b470fc31c Fix handling of SHUTDOWN-ACK chunk in COOKIE_WAIT and COOKIE_ECHOED.
MFC after: 1 week
2010-02-20 20:30:40 +00:00
Bjoern A. Zeeb
9802380e41 Split up ip_drain() into an outer lock and iterator part and
a "locked" version that will only handle a single network stack
instance. The latter is called directly from ip_destroy().

Hook up an ip_destroy() function to release resources from the
legacy IP network layer upon virtual network stack teardown.

Sponsored by:	ISPsystem
Reviewed by:	rwatson
MFC After:	5 days
2010-02-20 19:59:52 +00:00
Michael Tuexen
7291848a0b * Fix another u_long -> uint32_t issue.
* Remove an unused global variable.
* Fix an issue reported by Bruce Cran related to reusing SCTP socket which
  where connected.

MFC after: 1 week
2010-02-19 18:00:38 +00:00
Pawel Jakub Dawidek
957d68dd91 No need to include security/mac/mac_framework.h here. 2010-02-18 22:26:01 +00:00
Michael Tuexen
63eda93d1a Use uint32_t instead of u_long.
MFC after: 1 week
2010-02-18 13:46:54 +00:00
Luigi Rizzo
27c9c97a3e remove recursive lock/unlock calls, we do them already before entering
the switch.

Reported by: Marta Carbone
2010-02-17 13:06:06 +00:00
Michael Tuexen
8d9d061323 Add missing SCTP_PACKED. Spotted by Irene Ruengeler.
MFC after: 1 week
2010-02-13 21:38:15 +00:00
Bjoern A. Zeeb
fffb9f1d9c Properly free resources when destroying the TCP hostcache while
tearing down a network stack (in the VIMAGE jail+vnet case).

For that break out the logic from tcp_hc_purge() into an internal
function we can call from both, the sysctl handler and the
tcp_hc_destroy().

Sponsored by:	ISPsystem
Reviewed by:	silby, lstewart
MFC After:	8 days
2010-02-09 21:31:53 +00:00
Michael Tuexen
f1150dc0a5 Restore the checksum received before processing the packet.
MFC after: 1 week
2010-02-04 21:02:29 +00:00
Qing Li
d577d18a00 Some of the existing ppp and vpn related scripts create and set
the IP addresses of the tunnel end points to the same value. In
these cases the loopback route is not installed for the local
end.

Verified by:	avg
MFC after:	5 days
2010-02-02 20:38:30 +00:00
Luigi Rizzo
dc5fd2595c use u_char instead of u_int for short bitfields.
For our compiler the two constructs are completely equivalent, but
some compilers (including MSC and tcc) use the base type for alignment,
which in the cases touched here result in aligning the bitfields
to 32 bit instead of the 8 bit that is meant here.

Note that almost all other headers where small bitfields
are used have u_int8_t instead of u_int.

MFC after:	3 days
2010-02-01 14:13:44 +00:00
Michael Tuexen
663fdad84b Use [] instead of [0] for flexible arrays.
Obtained from: Bruce Cran
MFC after: 1 week
2010-01-22 07:53:41 +00:00
Michael Tuexen
cd55430963 Get rid of a lot of duplicated code for NR-SACK handle.
Generalize the SACK to code handle also NR-SACKs.
2010-01-17 21:00:28 +00:00
Randall Stewart
e34b217f91 Bug fix: If the allocation of a socket failed and we
freed the inpcb, it was possible to not set the
proper flags on the pcb (i.e. the socket is not there).
This is HIGHLY unlikely since no one else should be
able to find the socket.. but for consistency we
do the proper loop thing to make sure that we
mark the socket as gone on the PCB.
2010-01-17 19:47:59 +00:00
Randall Stewart
0812a4d5e6 Pulls out another leaked windows ifdef that somehow
made its way through the scrubber.
2010-01-17 19:40:21 +00:00
Randall Stewart
a10c3242c7 This change syncs up the socketAPI stream-reset
values to match those in linux and the I-D
just released to the IETF.
2010-01-17 19:35:38 +00:00
Randall Stewart
92cf719944 More leaked ifdefs for APPLE and its mobility stuff. 2010-01-17 19:24:30 +00:00
Randall Stewart
33141385fc Remove another set of "leaked" ifdefs that somehow found
their way into FreeBSD.
2010-01-17 19:21:50 +00:00
Randall Stewart
58ac2d97b7 Remove strange APPLE define that leaked
through the scrubber scripts. Scripts are
now fixed so this won't happen again.
2010-01-17 19:17:16 +00:00
Bjoern A. Zeeb
4dcc55a363 Garbage collect references to the no longer implemented tcp_fasttimo().
Discussed with:	rwatson
MFC after:	5 days
2010-01-17 13:07:52 +00:00
Bjoern A. Zeeb
592bcae802 Add ip4.saddrsel/ip4.nosaddrsel (and equivalent for ip6) to control
whether to use source address selection (default) or the primary
jail address for unbound outgoing connections.

This is intended to be used by people upgrading from single-IP
jails to multi-IP jails but not having to change firewall rules,
application ACLs, ... but to force their connections (unless
otherwise changed) to the primry jail IP they had been used for
years, as well as for people prefering to implement similar policies.

Note that for IPv6, if configured incorrectly, this might lead to
scope violations, which single-IPv6 jails could as well, as by the
design of jails. [1]

Reviewed by:	jamie, hrs (ipv6 part)
Pointed out by:	hrs [1]
MFC After:	2 weeks
Asked for by:	Jase Thew (bazerka beardz.net)
2010-01-17 12:57:11 +00:00
Hajimu UMEMOTO
416458131a Change 'me' to match any IPv6 address configured on an interface in
the system as well as any IPv4 address.

Reviewed by:	David Horn <dhorn2000__at__gmail.com>, luigi, qingli
MFC after:	2 weeks
2010-01-17 08:39:48 +00:00
Michael Tuexen
5661a9ed70 Get rid of support of an old version of the SCTP-AUTH draft.
Get rid of unused MD5 code.

MFC after: 1 week
2010-01-16 20:04:17 +00:00
Qing Li
646c800540 Ensure an address is removed from the interface address
list when the installation of that address fails.

PR:		139559
2010-01-08 17:49:24 +00:00
Ruslan Ermilov
acc0fee071 Complete the swap of carp(4) log levels and document the change.
MFC after:	3 days
2010-01-08 16:14:41 +00:00
Martin Blapp
c2ede4b379 Remove extraneous semicolons, no functional changes.
Submitted by:	Marc Balmer <marc@msys.ch>
MFC after:	1 week
2010-01-07 21:01:37 +00:00
Luigi Rizzo
5afa29b41a we don't use dummynet_drain! 2010-01-07 13:53:47 +00:00