3127 Commits

Author SHA1 Message Date
Robert Watson
ca528788b8 Fix error in comment.
MFC after:	3 weeks
2008-07-16 10:55:50 +00:00
Robert Watson
43cc0bc1df Merge last of a series of rwlock conversion changes to UDP, which
completes the move to a fully parallel UDP transmit path by using
global read, rather than write, locking of inpcbinfo in further
semi-connected cases:

- Add macros to allow try-locking of inpcb and inpcbinfo.
- Always acquire an incpcb read lock in udp_output(), which stablizes the
  local inpcb address and port bindings in order to determine what further
  locking is required:
  - If the inpcb is currently not bound (at all) and are implicitly
    connecting, we require inpcbinfo and inpcb write locks, so drop the
    read lock and re-acquire.
  - If the inpcb is bound for at least one of the port or address, but an
    explicit source or destination is requested, trylock the inpcbinfo
    lock, and if that fails, drop the inpcb lock, lock the global lock,
    and relock the inpcb lock.
  - Otherwise, no further locking is required (common case).
- Update comments.

In practice, this means that the vast majority of consumers of UDP sockets
will not acquire any exclusive locks at the socket or UDP levels of the
network stack.  This leads to a marked performance improvement in several
important workloads, including BIND, nsd, and memcached over UDP, as well
as significant improvements in pps microbenchmarks.

The plan is to MFC all of the rwlock changes to RELENG_7 once they have
settled for a weeks in the tree.

Tested by:	ps, kris (older revision), bde
MFC after:	3 weeks
2008-07-15 15:38:47 +00:00
Rui Paulo
b27227029b Fix commment in typo.
M    tcp_output.c
2008-07-15 10:32:35 +00:00
Ermal Luçi
7972c979c5 Fix carp(4) panics that can occur during carp interface configuration.
Approved by:	mlaier (mentor)
Reported by:	Scott Ullrich
MFC after:	1 week
2008-07-14 20:11:51 +00:00
Robert Watson
3144b7d3d3 Slightly rearrange validation of UDP arguments and jail processing in
udp_output() so that argument validation occurs before jail processing.

Add additional comments explaining what's going on when we process
addresses and binding during udp_output().

MFC after:	3 weeks
2008-07-10 16:20:18 +00:00
Bjoern A. Zeeb
078b704233 Pass the ucred along into in{,6}_pcblookup_local for upcoming
prison checks.

Reviewed by:	rwatson
2008-07-10 13:31:11 +00:00
Bjoern A. Zeeb
cdcb11b92c For consistency take lport as u_short in in{,6}_pcblookup_local.
All callers either pass in an u_short or u_int16_t.

Reviewed by:	rwatson
2008-07-10 13:23:22 +00:00
Robert Watson
1175d9d56d Apply the MAC label to an outgoing UDP packet when other inpcb properties are
processed, meaning that we avoid the cost of MAC label assignment if we're
going to drop the packet due to mbuf exhaustion, etc.

MFC after:	3 weeks
2008-07-10 09:45:28 +00:00
Bjoern A. Zeeb
e5cf427baf For consistency with the rest of the function use the locally cached
pointer pcbinfo rather than inp->inp_pcbinfo.

MFC after:	3 weeks
2008-07-09 19:03:06 +00:00
Randall Stewart
fc14de76f4 1) Adds the rest of the VIMAGE change macros
2) Adds some __UserSpace__ on some of the common defines that
   the user space code needs
3) Fixes a bug when we send up data to a user that failed. We
   need to a) trim off the data chunk headers, if present, and
   b) make sure the frag bit is communicated properly for the
   msgs coming off the stream queues... i.e. we see if some
   of the msg has been taken.

Obtained from:	jeli contributed the VIMAGE changes on this pass Thanks Julain!
2008-07-09 16:45:30 +00:00
Robert Watson
7b709f8ad4 Provide some initial chicken-scratching annotations of locking for
struct inpcb.

Prodded by:	bz
MFC after:	3 days
2008-07-08 17:22:59 +00:00
Robert Watson
ac9ae27991 Allow udp_notify() to accept read, as well as write, locks on the passed
inpcb.  When directly invoking udp_notify() from udp_ctlinput(), acquire
only a read lock; we may still see write locks in udp_notify() as the
in_pcbnotifyall() routine is shared with TCP and always uses a write lock
on the inpcb being notified.

MFC after:	1 month
2008-07-07 12:27:55 +00:00
Robert Watson
c4d585aefe Add additional udbinfo and inpcb locking assertions to udp_output(); for
some code paths, global or inpcb write locks are required, but for other
code paths, read locks or no locking at all are sufficient for the data
structures.

MFC after:	1 month
2008-07-07 12:14:10 +00:00
Robert Watson
948d0fc926 First step towards parallel transmit in UDP: if neither a specific
source or a specific destination address is requested as part of a send
on a UDP socket, read lock the inpcb rather than write lock it.  This
will allow fully parallel transmit down to the IP layer when sending
simultaneously from multiple threads on a connected UDP socket.

Parallel transmit for more complex cases, such as when sendto(2) is
invoked with an address and there's already a local binding, will
follow.

MFC after:	1 month
2008-07-07 10:56:55 +00:00
Robert Watson
10cc62b7a6 Drop read lock on udbinfo earlier during delivery to the last matching
UDP socket for a datagram; the inpcb read lock is sufficient to provide
inpcb stability during udp_append().

MFC after:	1 month
2008-07-07 09:26:52 +00:00
Robert Watson
cec9ffee22 Rename raw_append() to rip_append(): the raw_ prefix is generally used
for functions in the generic raw socket library (raw_cb.c, raw_usrreq.c),
and they are not used for IPv4 raw sockets.

MFC after:	3 days
2008-07-05 18:55:03 +00:00
Robert Watson
0ae76120da Improve approximation of style(9) in raw socket code. 2008-07-05 18:03:39 +00:00
Oleksandr Tymoshenko
06a37c4203 Enqueue de-capsulated packet instead of performing direct dispatch. It's
possible to exhaust and garble stack with a packet that contains a couple
of hundreds nested encapsulation levels.

Submitted by:   Ming Fu <fming@borderware.com>
Reviewed by:    rwatson
PR:             kern/85320
2008-07-04 21:01:30 +00:00
Robert Watson
59dd72d040 Remove NETISR_MPSAFE, which allows specific netisr handlers to be directly
dispatched without Giant, and add NETISR_FORCEQUEUE, which allows specific
netisr handlers to always be dispatched via a queue (deferred).  Mark the
usb and if_ppp netisr handlers as NETISR_FORCEQUEUE, and explicitly
acquire Giant in those handlers.

Previously, any netisr handler not marked NETISR_MPSAFE would necessarily
run deferred and with Giant acquired.  This change removes Giant
scaffolding from the netisr infrastructure, but NETISR_FORCEQUEUE allows
non-MPSAFE handlers to continue to force deferred dispatch so as to avoid
lock order reversals between their acqusition of Giant and any calling
context.

It is likely we will be able to remove NETISR_FORCEQUEUE once
IFF_NEEDSGIANT is removed, as non-MPSAFE usb and if_ppp drivers will no
longer be supported.

Reviewed by:	bz
MFC after:	1 month
X-MFC note:	We can't remove NETISR_MPSAFE from stable/7 for KPI reasons,
		but the rest can go back.
2008-07-04 00:21:38 +00:00
Bjoern A. Zeeb
62ee136457 Remove a bogusly introduced rtalloc_ign() in rev. 1.335/SVN 178029,
generating an RTM_MISS for every IP packet forwarded making user space
routing daemons unhappy.

PR:		kern/123621, kern/124540, kern/122338
Reported by:	Paul <paul gtcomm.net>, Mike Tancsa <mike sentex.net> on net@
Tested by:	Paul and Mike
Reviewed by:	andre
MFC after:	3 days
2008-07-03 12:44:36 +00:00
Robert Watson
5df3e83946 Add soreceive_dgram(9), an optimized socket receive function for use by
datagram-only protocols, such as UDP.  This version removes use of
sblock(), which is not required due to an inability to interlace data
improperly with datagrams, as well as avoiding some of the larger loops
and state management that don't apply on datagram sockets.

This is experimental code, so hook it up only for UDPv4 for testing; if
there are problems we may need to revise it or turn it off by default,
but it offers *significant* performance improvements for threaded UDP
applications such as BIND9, nsd, and memcached using UDP.

Tested by:	kris, ps
2008-07-02 23:23:27 +00:00
Robert Watson
119d85f6e0 In udp_append() and udp_input(), make use of read locking on incpbs
rather than write locking: while we need to maintain a valid reference
to the inpcb and fix its state, no protocol layer state is modified
during an IPv4 UDP receive -- there are only changes at the socket
layer, which is separately protected by socket locking.

While parallel concurrent receive on a single UDP socket is currently
relatively unusual, introducing read locking in the transmit path,
allowing concurrent receive and transmit, will significantly improve
performance for loads such as BIND, memcached, etc.

MFC after:	2 months
Tested by:	gnn, kris, ps
2008-06-30 18:26:43 +00:00
Oleksandr Tymoshenko
cf77b84879 In case of interface initialization failure remove struct in_ifaddr* from
in_ifaddrhashtbl in in_ifinit because error handler in in_control removes
entries only for AF_INET addresses. If in_ifinit is called for the cloned
inteface that has just been created its address family is not AF_INET and
therefor LIST_REMOVE is not called for respective LIST_INSERT_HEAD and
freed entries remain in in_ifaddrhashtbl and lead to memory corruption.

PR:	kern/124384
2008-06-24 13:58:28 +00:00
Alexander Motin
48ca67bea6 Partially revert previous commit. DeleteLink() does not deletes permanent
links so we should be aware of it and try to delete every link only once
or we will loop forever.
2008-06-22 11:39:42 +00:00
Alexander Motin
ea29dd9241 Implement UDP transparent proxy support.
PR:		bin/54274
Submitted by:	Nicolai Petri <nicolai@petri.cc>
2008-06-21 20:18:57 +00:00
Alexander Motin
b46d3e21bb Add support for PORT/EPRT FTP commands in lowercase.
Use strncasecmp() instead of huge local implementation to reduce code size.
Check space presence after command/code.

PR:		kern/73034
2008-06-21 16:22:56 +00:00
Stephan Uphoff
606a2669cf Change incorrect stale cookie detection in syncookie_lookup() that prematurely
declared a cookie as expired.

Reviewed by:	andre@, silby@
Reported by:    Yahoo!
2008-06-16 20:08:22 +00:00
Stephan Uphoff
104ac85378 Fix a check in SYN cache expansion (syncache_expand()) to accept packets that arrive in the receive window instead of just on the left edge of the receive window.
This is needed for correct behavior when packets are lost or reordered.

PR:	kern/123950
Reviewed by:	andre@, silby@
Reported by:	Yahoo!, Wang Jin
MFC after:	1 week
2008-06-16 19:56:59 +00:00
Randall Stewart
97a7b90ff3 More prep for Vimage:
- only one functino to destroy an SCTP stack sctp_finish()
 - Make it so this function also arranges for any threads
   created by the image to do a kthread_exit()
2008-06-15 12:31:23 +00:00
Randall Stewart
9b02321796 - Fixes foobar on my part. Some missing virtualization macros from
specific logging cases.
2008-06-14 13:24:49 +00:00
Randall Stewart
b3f1ea41fd - Macro-izes the packed declaration in all headers.
- Vimage prep - these are major restructures to move
  all global variables to be accessed via a macro or two.
  The variables all go into a single structure.
- Asconf address addition tweaks (add_or_del Interfaces)
- Fix rwnd calcualtion to be more conservative.
- Support SACK_IMMEDIATE flag to skip delayed sack
  by demand of peer.
- Comment updates in the sack mapping calculations
- Invarients panic added.
- Pre-support for UDP tunneling (we can do this on
  MAC but will need added support from UDP to
  get a "pipe" of UDP packets in.
- clear trace buffer sysctl added when local tracing on.

Note the majority of this huge patch is all the vimage prep stuff :-)
2008-06-14 07:58:05 +00:00
Jack F Vogel
6c5087a818 Add generic TCP LOR into netinet 2008-06-11 22:12:50 +00:00
Max Laier
1ead26d4e1 Sort IP addresses before hashing them for the signature. Otherwise carp is
sensitive to address configuration order.

PR:		kern/121574
Reported by:	Douglas K. Rand, Wouter de Jong
Obtained from:	OpenBSD (rev 1.114 + fixes)
MFC after:	2 weeks
2008-06-02 18:58:07 +00:00
Robert Watson
53640b0e3a When allocating temporary storage to hold a TCP/IP packet header
template, use an M_TEMP malloc(9) allocation rather than an mbuf
with mtod(9) and dtom(9).  This eliminates the last use of
dtom(9) in TCP.

MFC after:	3 weeks
2008-06-02 14:20:26 +00:00
Alexander Motin
ef30318ee9 Increase LINK_TABLE_OUT_SIZE from 101 to 4001 like LINK_TABLE_IN_SIZE
to reduce performance degradation under heavy outgoing scan/flood.
Scalability is now much more important then several kilobytes of RAM.

Remove unneded TCP-specific expiration handeling. Before this connected
TCP sessions could never expire. Now connected TCP sessions will expire
after 24hours of inactivity.

Simplify HouseKeeping() to avoid several mul/div-s per packet. Taking into
account increased LINK_TABLE_OUT_SIZE, precision is still much more then
required.
2008-06-01 18:34:58 +00:00
Alexander Motin
efc66711f9 Make m_megapullup() more intelligent:
- to increase performance do not reallocate mbuf when possible,
 - to support up to 16K packets (was 2K max) use mbuf cluster of proper size.
This change depends on recent ng_nat and ip_fw_nat changes.
2008-06-01 17:52:40 +00:00
Alexander Motin
1913488d10 PKT_ALIAS_FOUND_HEADER_FRAGMENT result is not an error, so pass that packet.
This fixes packet fragmentation handeling.

Pass really available buffer size to libalias instead of MCLBYTES constant.
MCLBYTES constant were used with believe that m_megapullup() always moves
date into a fresh cluster that sometimes may become not so.
2008-06-01 12:29:23 +00:00
Alexander Motin
aac54f0a70 Fix packet fragmentation support broken by copy/paste error in rev.1.60.
ip_id should be u_short, but not u_char.
2008-06-01 11:47:04 +00:00
Robert Watson
c28cb4d82f Read lock rather than write lock TCP inpcbs in monitoring sysctls. In
some cases, add explicit inpcb locking rather than relying on the global
lock, as we dereference inp_socket, but also allowing us to drop the
global lock more quickly.

MFC after:	1 week
2008-05-29 14:28:26 +00:00
Robert Watson
9622e84fcf Employ read locks on UDP inpcbs, rather than write locks, when
monitoring UDP connections using sysctls.  In some cases, add
previously missing locking of inpcbs, as inp_socket is followed,
which also allows us to drop global locks more quickly.

MFC after:	1 week
2008-05-29 08:27:14 +00:00
Bjoern A. Zeeb
9a38ba8101 Factor out the v4-only vs. the v6-only inp_flags processing in
ip6_savecontrol in preparation for udp_append() to no longer
need an WLOCK as we will no longer be modifying socket options.

Requested by:		rwatson
Reviewed by:		gnn
MFC after:		10 days
2008-05-24 15:20:48 +00:00
Robert Watson
22c82719cf Consistently check IPFW and DUMMYNET privileges in the configuration
routines for those modules, rather than in the raw socket code.  This
each privilege check to occur in exactly once place and avoids
duplicate checks across layers.

MFC after:	3 weeks
Sponsored by:	nCircle Network Security, Inc.
2008-05-22 08:10:31 +00:00
Randall Stewart
d61374e183 - sctputil.c - If debug is on, the INPKILL timer can deref a freed value.
Change so that we save off a type field for display and
               NULL inp just for good measure.

- sctp_output.c - Fix it so in sending to the loopback we use the
                  src address of the inbound INIT. We don't want
                  to do this for non local addresses since otherwise
                  we might be ingressed filtered so we need to use
                  the best src address and list the address sent to.

Obtained from:	time bug - Neil Wilson
MFC after:	1 week
2008-05-21 16:51:21 +00:00
Randall Stewart
c54a18d26b - Adds support for the multi-asconf (From Kozuka-san)
- Adds some prepwork (Not all yet) for vimage in particular
  support the delete the sctppcbinfo.xx structs. There is
  still a leak in here if it were to be called plus we stil
  need the regrouping (From Me and Michael Tuexen)
- Adds support for UDP tunneling. For BSD there is no
  socket yet setup so its disabled, but major argument
  changes are in here to emcompass the passing of the port
  number (zero when you don't have a udp tunnel, the default
  for BSD). Will add some hooks in UDP here shortly (discussed
  with Robert) that will allow easy tunneling. (Mainly from
  Peter Lei and Michael Tuexen with some BSD work from me :-D)
- Some ease for windows, evidently leave is reserved by their
  compile move label leave: -> out:

MFC after:	1 week
2008-05-20 13:47:46 +00:00
Randall Stewart
bfefd19036 - Define changes in sctp.h
- Bug in CA that does not get us incrementing the PBA properly which
  made us more conservative.
- comment updated in sctp_input.c
- memsets added before we log
- added arg to hmac id's
MFC after:	2 weeks
2008-05-20 09:51:36 +00:00
George V. Neville-Neil
fff0ededf8 Fix the loopback interface. Cleaning up some code with new macros
was a tad too aggressive.

PR:		kern/123568
Submitted by:	Vladimir Ermakov <samflanker at gmail dot com>
Obtained from:	antoine
2008-05-12 02:44:53 +00:00
Julian Elischer
8b07e49a00 Add code to allow the system to handle multiple routing tables.
This particular implementation is designed to be fully backwards compatible
and to be MFC-able to 7.x (and 6.x)

Currently the only protocol that can make use of the multiple tables is IPv4
Similar functionality exists in OpenBSD and Linux.

From my notes:

-----

  One thing where FreeBSD has been falling behind, and which by chance I
  have some time to work on is "policy based routing", which allows
  different
  packet streams to be routed by more than just the destination address.

  Constraints:
  ------------

  I want to make some form of this available in the 6.x tree
  (and by extension 7.x) , but FreeBSD in general needs it so I might as
  well do it in -current and back port the portions I need.

  One of the ways that this can be done is to have the ability to
  instantiate multiple kernel routing tables (which I will now
  refer to as "Forwarding Information Bases" or "FIBs" for political
  correctness reasons). Which FIB a particular packet uses to make
  the next hop decision can be decided by a number of mechanisms.
  The policies these mechanisms implement are the "Policies" referred
  to in "Policy based routing".

  One of the constraints I have if I try to back port this work to
  6.x is that it must be implemented as a EXTENSION to the existing
  ABIs in 6.x so that third party applications do not need to be
  recompiled in timespan of the branch.

  This first version will not have some of the bells and whistles that
  will come with later versions. It will, for example, be limited to 16
  tables in the first commit.
  Implementation method, Compatible version. (part 1)
  -------------------------------
  For this reason I have implemented a "sufficient subset" of a
  multiple routing table solution in Perforce, and back-ported it
  to 6.x. (also in Perforce though not  always caught up with what I
  have done in -current/P4). The subset allows a number of FIBs
  to be defined at compile time (8 is sufficient for my purposes in 6.x)
  and implements the changes needed to allow IPV4 to use them. I have not
  done the changes for ipv6 simply because I do not need it, and I do not
  have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it.

  Other protocol families are left untouched and should there be
  users with proprietary protocol families, they should continue to work
  and be oblivious to the existence of the extra FIBs.

  To understand how this is done, one must know that the current FIB
  code starts everything off with a single dimensional array of
  pointers to FIB head structures (One per protocol family), each of
  which in turn points to the trie of routes available to that family.

  The basic change in the ABI compatible version of the change is to
  extent that array to be a 2 dimensional array, so that
  instead of protocol family X looking at rt_tables[X] for the
  table it needs, it looks at rt_tables[Y][X] when for all
  protocol families except ipv4 Y is always 0.
  Code that is unaware of the change always just sees the first row
  of the table, which of course looks just like the one dimensional
  array that existed before.

  The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign()
  are all maintained, but refer only to the first row of the array,
  so that existing callers in proprietary protocols can continue to
  do the "right thing".
  Some new entry points are added, for the exclusive use of ipv4 code
  called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(),
  which have an extra argument which refers the code to the correct row.

  In addition, there are some new entry points (currently called
  rtalloc_fib() and friends) that check the Address family being
  looked up and call either rtalloc() (and friends) if the protocol
  is not IPv4 forcing the action to row 0 or to the appropriate row
  if it IS IPv4 (and that info is available). These are for calling
  from code that is not specific to any particular protocol. The way
  these are implemented would change in the non ABI preserving code
  to be added later.

  One feature of the first version of the code is that for ipv4,
  the interface routes show up automatically on all the FIBs, so
  that no matter what FIB you select you always have the basic
  direct attached hosts available to you. (rtinit() does this
  automatically).

  You CAN delete an interface route from one FIB should you want
  to but by default it's there. ARP information is also available
  in each FIB. It's assumed that the same machine would have the
  same MAC address, regardless of which FIB you are using to get
  to it.

  This brings us as to how the correct FIB is selected for an outgoing
  IPV4 packet.

  Firstly, all packets have a FIB associated with them. if nothing
  has been done to change it, it will be FIB 0. The FIB is changed
  in the following ways.

  Packets fall into one of a number of classes.

  1/ locally generated packets, coming from a socket/PCB.
     Such packets select a FIB from a number associated with the
     socket/PCB. This in turn is inherited from the process,
     but can be changed by a socket option. The process in turn
     inherits it on fork. I have written a utility call setfib
     that acts a bit like nice..

         setfib -3 ping target.example.com # will use fib 3 for ping.

     It is an obvious extension to make it a property of a jail
     but I have not done so. It can be achieved by combining the setfib and
     jail commands.

  2/ packets received on an interface for forwarding.
     By default these packets would use table 0,
     (or possibly a number settable in a sysctl(not yet)).
     but prior to routing the firewall can inspect them (see below).
     (possibly in the future you may be able to associate a FIB
     with packets received on an interface..  An ifconfig arg, but not yet.)

  3/ packets inspected by a packet classifier, which can arbitrarily
     associate a fib with it on a packet by packet basis.
     A fib assigned to a packet by a packet classifier
     (such as ipfw) would over-ride a fib associated by
     a more default source. (such as cases 1 or 2).

  4/ a tcp listen socket associated with a fib will generate
     accept sockets that are associated with that same fib.

  5/ Packets generated in response to some other packet (e.g. reset
     or icmp packets). These should use the FIB associated with the
     packet being reponded to.

  6/ Packets generated during encapsulation.
     gif, tun and other tunnel interfaces will encapsulate using the FIB
     that was in effect withthe proces that set up the tunnel.
     thus setfib 1 ifconfig gif0 [tunnel instructions]
     will set the fib for the tunnel to use to be fib 1.

  Routing messages would be associated with their
  process, and thus select one FIB or another.
  messages from the kernel would be associated with the fib they
  refer to and would only be received by a routing socket associated
  with that fib. (not yet implemented)

  In addition Netstat has been edited to be able to cope with the
  fact that the array is now 2 dimensional. (It looks in system
  memory using libkvm (!)). Old versions of netstat see only the first FIB.

  In addition two sysctls are added to give:
  a) the number of FIBs compiled in (active)
  b) the default FIB of the calling process.

  Early testing experience:
  -------------------------

  Basically our (IronPort's) appliance does this functionality already
  using ipfw fwd but that method has some drawbacks.

  For example,
  It can't fully simulate a routing table because it can't influence the
  socket's choice of local address when a connect() is done.

  Testing during the generating of these changes has been
  remarkably smooth so far. Multiple tables have co-existed
  with no notable side effects, and packets have been routes
  accordingly.

  ipfw has grown 2 new keywords:

  setfib N ip from anay to any
  count ip from any to any fib N

  In pf there seems to be a requirement to be able to give symbolic names to the
  fibs but I do not have that capacity. I am not sure if it is required.

  SCTP has interestingly enough built in support for this, called VRFs
  in Cisco parlance. it will be interesting to see how that handles it
  when it suddenly actually does something.

  Where to next:
  --------------------

  After committing the ABI compatible version and MFCing it, I'd
  like to proceed in a forward direction in -current. this will
  result in some roto-tilling in the routing code.

  Firstly: the current code's idea of having a separate tree per
  protocol family, all of the same format, and pointed to by the
  1 dimensional array is a bit silly. Especially when one considers that
  there is code that makes assumptions about every protocol having the
  same internal structures there. Some protocols don't WANT that
  sort of structure. (for example the whole idea of a netmask is foreign
  to appletalk). This needs to be made opaque to the external code.

  My suggested first change is to add routing method pointers to the
  'domain' structure, along with information pointing the data.
  instead of having an array of pointers to uniform structures,
  there would be an array pointing to the 'domain' structures
  for each protocol address domain (protocol family),
  and the methods this reached would be called. The methods would have
  an argument that gives FIB number, but the protocol would be free
  to ignore it.

  When the ABI can be changed it raises the possibilty of the
  addition of a fib entry into the "struct route". Currently,
  the structure contains the sockaddr of the desination, and the resulting
  fib entry. To make this work fully, one could add a fib number
  so that given an address and a fib, one can find the third element, the
  fib entry.

  Interaction with the ARP layer/ LL layer would need to be
  revisited as well. Qing Li has been working on this already.

  This work was sponsored by Ironport Systems/Cisco

Reviewed by:    several including rwatson, bz and mlair (parts each)
Obtained from:  Ironport systems/Cisco
2008-05-09 23:03:00 +00:00
John Baldwin
790fce68dd Always bump tcpstat.tcps_badrst if we get a RST for a connection in the
syncache that has an invalid SEQ instead of only doing it when we suceed
in mallocing space for the log message.

MFC after:	1 week
Reviewed by:	sam, bz
2008-05-08 22:21:09 +00:00
Kip Macy
8ab7ce7c61 replace spaces added in last change with tabs 2008-05-05 23:13:27 +00:00
Kip Macy
535fbad68f add rcv_nxt, snd_nxt, and toe offload id to FreeBSD-specific
extension fields for tcp_info
2008-05-05 20:13:31 +00:00