Commit Graph

3626 Commits

Author SHA1 Message Date
Randall Stewart
54bb41671a MFC of 2 items to fix the csum for v6 issue:
Revision 205075 and 205104:

---------205075----------
With the recent change of the SCTP checksum to support offload,
no delayed checksum was added to the ip6 output code. As a result,
on cards that do not support SCTP checksum offload, SCTP packets
sent over IPv6 did NOT have the SCTP checksum computed, so you
could not communicate with a peer. This adds the missing bits to
make the checksum happen for these cards.
-------------------------
---------205104----------
The proper fix for the delayed SCTP checksum is to
have the delayed function take an argument as to the offset
to the SCTP header. This allows it to work for V4 and V6.
This of course means changing all callers of the function
to either pass the header len, if they have it, or create
it (ip_hl << 2 or sizeof(ip6_hdr)).
-------------------------
PR:		144529
2010-04-05 13:48:23 +00:00
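
The header offset described in revision 205104 above can be sketched in plain C. This is a minimal userland illustration, not the kernel code: the function names and the SKETCH_IP6_HDR_LEN constant are hypothetical, and the IPv6 case assumes no extension headers, matching the sizeof(ip6_hdr) wording in the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant: size of the fixed IPv6 header, i.e. sizeof(ip6_hdr). */
#define SKETCH_IP6_HDR_LEN 40u

/*
 * IPv4: the low nibble of the first header byte is ip_hl, the header
 * length in 32-bit words, so the SCTP header starts at ip_hl << 2 bytes.
 */
static uint32_t
sketch_sctp_offset_v4(const uint8_t *ip_packet)
{
	uint32_t ip_hl;

	assert(ip_packet != NULL);
	ip_hl = ip_packet[0] & 0x0fu;
	return (ip_hl << 2);
}

/* IPv6 (no extension headers assumed): the offset is the fixed header size. */
static uint32_t
sketch_sctp_offset_v6(void)
{
	return (SKETCH_IP6_HDR_LEN);
}
```

A caller would pass whichever offset applies to a single checksum routine, which is what lets one delayed-checksum function serve both V4 and V6.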
Qing Li
c951da56b4 MFC 204902
One of the advantages of enabling ECMP (a.k.a. RADIX_MPATH) is to
allow for connection load balancing across interfaces. Currently
the address alias handling method is colliding with the ECMP code.
For example, when two interfaces are configured on the same prefix,
only one prefix route is installed. So connection load balancing
among the available interfaces is not possible.

The other advantage of ECMP is for failover. The issue with the
current code is that the interface link-state is not reflected
in the route entry. For example, if there are two interfaces on
the same prefix and the cable on one interface is unplugged, new
and existing connections should switch over to the other interface.
This is not done today and packets go into a black hole.

Also, there is a small bug in the kernel where deleting ECMP routes
from userland always returns an error even though the command
is successfully executed.
2010-04-02 05:02:50 +00:00
Qing Li
ca2d42b2a1 MFC 201131
introduce a local variable rte acting as a cache of ro->ro_rt
within ip_output, achieving (in random order of importance):
- a reduction of the number of 'r's in the source code;
- improved legibility;
- a reduction of 64 bytes in the .text
2010-04-02 04:58:17 +00:00
Kip Macy
e952596a10 MFC 205066, 205069, 205093, 205097, 205488:
r205066:

Log:
 - restructure flowtable to support ipv6
 - add a name argument to flowtable_alloc for printing with ddb commands
 - extend ddb commands to print destination address or 4-tuples
 - don't parse ports in ulp header if FL_HASH_ALL is not passed
 - add kern_flowtable_insert to enable more generic use of flowtable
   (e.g. system calls for adding entries)
 - don't hash loopback addresses
 - cleanup whitespace
 - keep statistics per-cpu for per-cpu flowtables to avoid cache line contention
 - add sysctls to accumulate stats and report aggregate

r205069:
Log:
 fix stats reporting sysctl

r205093:
Log:
 re-update copyright to 2010
 pointed out by danfe@

r205097:

Log:
 flowtable_get_hashkey is only used by a DDB function - move under #ifdef DDB

 pointed out by jkim@

r205488:

Log:
 - boot-time size the ipv4 flowtable and the maximum number of flows
 - increase flow cleaning frequency and decrease flow caching time
   when near the flow limit
 - stop allocating new flows when within 3% of maxflows; don't start
   allocating again until below 12.5%
2010-04-01 00:36:40 +00:00
Luigi Rizzo
353be77138 A last-minute change in the previous commit broke rule deletion,
so I am fixing it, this time with a more detailed description
of what the code is supposed to do.
2010-03-31 01:51:08 +00:00
Luigi Rizzo
d15984d46e mfc 205830:
fixes to rule set handling (including potential kernel panics)
2010-03-29 12:32:16 +00:00
Luigi Rizzo
0d3003c0c8 remove a leftover debugging message
2010-03-29 12:29:34 +00:00
Bjoern A. Zeeb
397069f2c5 MFC r205251:
Add pcb reference counting to the pcblist sysctl handler functions
  to ensure type stability while caching the pcb pointers for the
  copyout.

  Reviewed by:  rwatson
2010-03-27 17:51:27 +00:00
Bjoern A. Zeeb
62f500d0c2 MFC r204838:
Destroy TCP UMA zones (empty or not) upon network stack teardown
  to not leak them, otherwise making UMA/vmstat unhappy with every
  stopped vnet.
  We will still leak pages (especially for zones marked NOFREE).

  Reshuffle cleanup order in tcp_destroy() to get rid of what we can
  easily free first.

  Reviewed by:  rwatson
2010-03-27 17:50:02 +00:00
Bjoern A. Zeeb
3662f299d2 MFC r204807:
Destroy UDP UMA zones (empty or not) upon network stack teardown
  to not leak them, which would make UMA/vmstat -z unhappy with every stopped vnet.
  We will still leak pages (especially as zones are marked NOFREE).
2010-03-27 17:46:06 +00:00
Bjoern A. Zeeb
1198bd71ba MFC r204143:
Upon virtual network stack teardown properly release the TCP syncache
  resources.

  Reviewed by:  rwatson
2010-03-27 17:36:52 +00:00
Bjoern A. Zeeb
e47658ce90 MFC r204140:
Split up ip_drain() into an outer lock and iterator part and
  a "locked" version that will only handle a single network stack
  instance. The latter is called directly from ip_destroy().

  Hook up an ip_destroy() function to release resources from the
  legacy IP network layer upon virtual network stack teardown.

  Reviewed by:  rwatson
2010-03-27 17:34:57 +00:00
Bjoern A. Zeeb
ef18ad7e2d MFC r203724:
Properly free resources when destroying the TCP hostcache while
  tearing down a network stack (in the VIMAGE jail+vnet case).

  For that break out the logic from tcp_hc_purge() into an internal
  function we can call from both, the sysctl handler and the
  tcp_hc_destroy().

  Reviewed by:  silby, lstewart
2010-03-27 17:26:31 +00:00
Luigi Rizzo
7da98b8ab6 MFC 205602:
Honor ip.fw.one_pass when a packet comes out of a pipe without being delayed.
I forgot to handle this case when I did the mtag cleanup three months ago.

I am merging immediately because this bugfix is important for
people using RELENG_8.

PR:           145004
2010-03-24 15:19:47 +00:00
Luigi Rizzo
8018e843a3 MFC of a large number of ipfw and dummynet fixes and enhancements
done in CURRENT over the last 4 months.
HEAD and RELENG_8 are almost in sync now for ipfw, dummynet,
the pfil hooks and related components.

Among the most noticeable changes:
- r200855 more efficient lookup of skipto rules, and remove O(N)
  blocks from critical sections in the kernel;
- r204591 large restructuring of the dummynet module, with support
  for multiple scheduling algorithms (4 available so far)
See the original commit logs for details.

Changes in the kernel/userland ABI should be harmless because the
kernel is able to understand previous requests from RELENG_8 and
RELENG_7. For this reason, this changeset would be applicable
to RELENG_7 as well, but I am not sure if it is worthwhile.
2010-03-23 09:58:59 +00:00
Matt Jacob
7733cf8fff MFC a number of changes from head for ISP (203478,203463,203444,202418,201758,
201408,201325,200089,198822,197373,197372,197214,196162). Since one of those
changes was a semicolon cleanup from somebody else, this touches a lot more.
2010-02-11 18:34:06 +00:00
Qing Li
613e96b8c6 MFC r203401
Some of the existing ppp and vpn related scripts create and set
the IP addresses of the tunnel end points to the same value. In
these cases the loopback route is not installed for the local
end.
2010-02-09 19:27:54 +00:00
Julian Elischer
2ae7ec29fd MFC of 197952 and 198075
Virtualize the pfil hooks so that different jails may choose different
    packet filters. Also allows ipfw to be enabled on one jail and disabled
    on another. In 8.0 it's a global setting.
and
    Unbreak the VIMAGE build with IPSEC, broken with r197952 by
    virtualizing the pfil hooks.
    For consistency add the V_ to virtualize the pfil hooks in here as well.
2010-02-07 09:00:22 +00:00
Antoine Brodin
e2b36efde5 MFC r201145 to stable/8:
(S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument.
  Fix some wrong usages.
  Note: this does not affect generated binaries as this argument is not used.

  PR:		137213
  Submitted by:	Eygene Ryabinkin (initial version)
2010-01-30 12:11:21 +00:00
George V. Neville-Neil
fbbbfe0ba5 MFC r196797:
Add ARP statistics to the kernel and netstat.
2010-01-28 16:48:44 +00:00
Michael Tuexen
b93b253dc6 MFC 202449:
Get rid of support of an old version of the SCTP-AUTH draft.
Get rid of unused MD5 code.
2010-01-24 22:17:08 +00:00
Bjoern A. Zeeb
be6797dde8 MFC r202469:
Garbage collect references to the no longer implemented tcp_fasttimo().
2010-01-24 12:22:38 +00:00
Bjoern A. Zeeb
3bcceea40e MFC r202468:
Add ip4.saddrsel/ip4.nosaddrsel (and equivalent for ip6) to control
  whether to use source address selection (default) or the primary
  jail address for unbound outgoing connections.

  This is intended to be used by people upgrading from single-IP
  jails to multi-IP jails but not having to change firewall rules,
  application ACLs, ... but to force their connections (unless
  otherwise changed) to the primary jail IP they had been using for
  years, as well as for people preferring to implement similar policies.

  Note that for IPv6, if configured incorrectly, this might lead to
  scope violations, which single-IPv6 jails could cause as well, by
  the design of jails. [1]

  Reviewed by:		jamie, hrs (ipv6 part)
  Pointed out by:	hrs [1]
2010-01-23 16:40:35 +00:00
Navdeep Parhar
cbbe4a754c MFC r201416:
Avoid NULL dereference in arpresolve.

Requested by: kib@
2010-01-21 10:12:21 +00:00
Michael Tuexen
06ee5047d5 MFC 201523
Correct usage of parentheses.
2010-01-17 18:18:01 +00:00
Michael Tuexen
45bde0da39 MFC 199459
Get rid of the field addr_over, which is never really used,
only copied around.
2010-01-17 17:49:28 +00:00
Michael Tuexen
64224569da MFC 199374
Fix a bug where queued ASCONF messages are not sent out.
From Irene Ruengeler.
2010-01-17 17:46:48 +00:00
Michael Tuexen
533e1ca310 MFC 198621
Improve round robin stream scheduler and cleanup some code.
2010-01-17 17:45:09 +00:00
Michael Tuexen
53b14b7294 MFC 197341
Fix errnos.
2010-01-17 17:41:43 +00:00
Michael Tuexen
fb7bf5f374 MFC 198499
Improve the round robin stream scheduler.
2010-01-17 17:10:17 +00:00
Michael Tuexen
33dabcc064 MFC 199437
Always use LIST_EMPTY instead of sometimes SCTP_LIST_EMPTY,
which is defined as LIST_EMPTY.
2010-01-17 17:05:59 +00:00
Michael Tuexen
24a263d9da MFC 199372
Do not start the iterator when there are no associations.
This fixes a bug found by Irene Ruengeler.
2010-01-17 17:03:40 +00:00
Michael Tuexen
801fc2d035 MFC 199369
Do not hold the lock longer than necessary.
2010-01-17 17:01:01 +00:00
Michael Tuexen
a8725a275a MFC 198522:
Bugfix: Use formula from section 7.2.3 of RFC 4960. Reported by Martin Becke.
2010-01-17 16:58:37 +00:00
Qing Li
4cc5ccf399 MFC r201544
An existing incomplete ARP entry would expire a subsequent
statically configured entry of the same host. This bug was
caused by the expiration timer not being cancelled when installing
the static entry. Since there exists a potential race condition
with respect to timer cancellation, simply check for the
LLE_STATIC bit inside the expiration function instead of
cancelling the active timer.
2010-01-12 00:04:13 +00:00
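
The approach in r201544 above, checking the static bit in the timer callback rather than racing to cancel the timer, can be sketched as follows. This is an illustration only: the struct, the function, and the flag value are hypothetical stand-ins; only the name LLE_STATIC comes from the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value; the log names the real flag LLE_STATIC. */
#define SKETCH_LLE_STATIC 0x0002u

/* Hypothetical, stripped-down link-layer entry. */
struct sketch_lle {
	unsigned int flags;
	bool         expired;
};

/*
 * Timer callback: if the entry became static after the timer was armed,
 * leave it alone instead of expiring it. No timer cancellation is needed.
 */
static void
sketch_arp_timer(struct sketch_lle *lle)
{
	assert(lle != NULL);
	if (lle->flags & SKETCH_LLE_STATIC)
		return;
	lle->expired = true;
}
```

Because the check happens inside the callback itself, it does not matter whether the static entry was installed before or after the timer fired.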
Ruslan Ermilov
60ee8f1ae1 MFC: r200026,201801: Swap carp(4) log levels.
2010-01-11 12:32:06 +00:00
Qing Li
a17a2dcab6 MFC r201285
Consolidate the route message generation code for when address
aliases are added or deleted. The announced route entry for
an address alias is no longer empty because this empty route
entry was causing some routing daemons to fail and exit abnormally.
2010-01-05 22:33:10 +00:00
Qing Li
32c5340155 MFC r201282, r201543
r201282
-------
The proxy arp entries could not be added into the system over the
IFF_POINTOPOINT link types. The reason was that the routing
entry returned from the kernel covering the remote end is of an
interface type that does not support ARP. This patch fixes this
problem by providing a hint to the kernel routing code, which
indicates the prefix route instead of the PPP host route should
be returned to the caller. Since a host route to the local end
point is also added into the routing table, and there could be
multiple such instantiations because multiple PPP links can be
created with the same local end IP address, this patch also fixes
the loopback route installation failure problem observed prior to
this patch. The reference count of the loopback route to the local
end would be either incremented or decremented. The first instantiation
would create the entry and the last removal would delete the route entry.

r201543
-------
The IFA_RTSELF address flag marks that a loopback route has been installed
for the interface address. This marker is necessary to properly support
PPP types of links where multiple links can have the same local end
IP address. The IFA_RTSELF flag bit maps to the RTF_HOST value, which
was combined into the route flag bits during prefix installation in
IPv6. This inclusion caused the prefix route to be unusable. This
patch fixes this bug by excluding the IFA_RTSELF flag during route
installation.

PR:		ports/141342, kern/141134
2010-01-05 22:14:55 +00:00
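
The reference counting of the loopback route described in r201282 above (first instantiation creates the entry, last removal deletes it) follows a common pattern. A minimal sketch, with a hypothetical struct and function names that are not taken from the kernel:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for the loopback route of one local end address. */
struct sketch_loopback_route {
	unsigned int refcount;
	bool         installed;
};

/* A link using this local address comes up: the first user installs the route. */
static void
sketch_loopback_ref(struct sketch_loopback_route *rt)
{
	assert(rt != NULL);
	if (rt->refcount++ == 0)
		rt->installed = true;
}

/* A link goes away: the last user removes the route. */
static void
sketch_loopback_unref(struct sketch_loopback_route *rt)
{
	assert(rt != NULL && rt->refcount > 0);
	if (--rt->refcount == 0)
		rt->installed = false;
}
```

With two PPP links sharing one local address, the route stays installed until both links have been removed.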
John Baldwin
e10b0dfd66 MFC 200847:
- Rename the __tcpi_(snd|rcv)_mss fields of the tcp_info structure to remove
  the leading underscores since they are now implemented.
- Implement the tcpi_rto and tcpi_last_data_recv fields in the tcp_info
  structure.
2010-01-05 17:04:14 +00:00
Shteryana Shopova
aed7a0f878 MFC r201254:
Make sure the multicast forwarding cache entry's stall queue is properly
initialized before trying to insert an entry into it.

PR:		kern/142052
Reviewed by:	bms
2010-01-04 15:58:36 +00:00
Hajimu UMEMOTO
479812d91c MFC r200055, r200102:
- Teach IPv6 to the debug prints.
- Use INET_ADDRSTRLEN and INET6_ADDRSTRLEN rather than hard-coded
  numbers.
2010-01-04 15:22:38 +00:00
Hajimu UMEMOTO
30feab0076 MFC r200027: Teach IPv6 to send_pkt() and ipfw_tick().
This fixes the issue where keep-alive doesn't work for IPv6.
2010-01-04 15:05:11 +00:00
Bjoern A. Zeeb
950cde5085 MFC r200473:
Throughout the network stack we have a few places of
        if (jailed(cred))
  left.  If you are running with a vnet (virtual network stack) those will
  return true and defer you to classic IP-jails handling and thus things
  will be "denied" or returned with an error.

  Work around this problem by introducing another "jailed()" function,
  jailed_without_vnet(), that also takes vnets into account, and permits
  the calls, should the jail from the given cred have its own virtual
  network stack.

  We cannot change the classic jailed() call to do that,  as it is used
  outside the network stack as well.

  Discussed with:       julian, zec, jamie, rwatson (back in Sept)
2009-12-28 14:40:58 +00:00
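
The distinction drawn in r200473 above between jailed() and jailed_without_vnet() can be sketched like this. The credential struct and the sketch_ function names are hypothetical simplifications for illustration; the real functions operate on kernel credentials.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified credential: just the two facts the check needs. */
struct sketch_cred {
	bool in_jail;
	bool jail_owns_vnet;
};

/* Classic check: true for any jailed credential, vnet or not. */
static bool
sketch_jailed(const struct sketch_cred *cred)
{
	assert(cred != NULL);
	return (cred->in_jail);
}

/*
 * Vnet-aware check: a jail with its own virtual network stack is not
 * subject to the classic IP-jail restrictions, so report false for it.
 */
static bool
sketch_jailed_without_vnet(const struct sketch_cred *cred)
{
	if (!sketch_jailed(cred))
		return (false);
	return (!cred->jail_owns_vnet);
}
```

Keeping the classic check unchanged and adding a second predicate means callers outside the network stack see no difference in behaviour.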
Robert Watson
1120ce6b69 Merge r198438 from head to stable/8:
Correct spelling typo in ip_input comment.

  Pointed out by:       N.J. Mann <njm at njm.me.uk>,
                John Nielsen <john at jnielsen.net>, julian (!), lstewart
2009-12-14 11:53:02 +00:00
Robert Watson
ec610c212b Merge r198393 from head to stable/8:
Improve grammar in ip_input comment while attempting to maintain what
  might be its meaning.

(Note, merge of the revision correcting a spelling error in this commit
will follow as well!)
2009-12-14 11:15:47 +00:00
Michael Tuexen
cf19fced17 MFC 197288,197326,197327,197328,197342,197914,197929,
197955,199365,199370,199371,199373,199866
This MFCs all SCTP/VNET relevant fixes from head.

Approved by: rrs (mentor)
2009-12-07 07:33:51 +00:00
Bjoern A. Zeeb
b4e227f473 MFC r198050:
Compare pointer to NULL rather than 0.
2009-12-05 19:44:16 +00:00
Luigi Rizzo
3cdcbc4885 some simple MFC:
r200020:
  change the type of the opcode from enum *:8  to u_int8_t
  so the size and alignment of the ipfw_insn is not compiler dependent.
  No changes in the code generated by gcc.

r200023:
  Add new sockopt names for ipfw and dummynet.

  This commit is just grabbing entries for the new names
  that will be used in the future, so you don't need to
  rebuild anything now.

r200034
  Dispatch sockopt calls to ipfw and dummynet
  using the new option numbers, IP_FW3 and IP_DUMMYNET3.
  Right now the modules return an error if called with those arguments
  so there is no danger of unwanted behaviour.

r200040
  - initialize src_ip in the main loop to prevent a compiler warning
    (gcc 4.x under linux, not sure how real the complaint is).
  - rename a macro argument to prevent name clashes.
  - add the macro name on a couple of #endif
  - add a blank line for readability.
2009-12-05 12:51:51 +00:00
Attilio Rao
a5e831ded9 MFC r199208, r199223:
Move inet_aton() (the counterpart of inet_ntoa(), already present in libkern)
into libkern in order to make it usable by modules other than alias_proxy.

Sponsored by:	Sandvine Incorporated
2009-11-22 16:04:49 +00:00
Bruce M Simpson
025bbb4984 MFC r199522..199528:
Pullup IPv6 mcast SSM KPI fixes from HEAD, including fix for
  filter deallocation from Stef Walter.
2009-11-20 12:30:40 +00:00