Commit Graph

93 Commits

Author SHA1 Message Date
sam
c165a87f8d o eliminate widespread on-stack mbuf use for bpf by introducing
a new bpf_mtap2 routine that does the right thing for an mbuf
  and a variable-length chunk of data that should be prepended.
o while we're sweeping the drivers, use u_int32_t uniformly when
  when prepending the address family (several places were assuming
  sizeof(int) was 4)
o return M_ASSERTVALID to BPF_MTAP* now that all stack-allocated
  mbufs have been eliminated; this may better be moved to the bpf
  routines

Reviewed by:	arch@ and several others
2003-12-28 03:56:00 +00:00
andre
6164d7c280 Introduce tcp_hostcache and remove the tcp specific metrics from
the routing table.  Move all usage and references in the tcp stack
from the routing table metrics to the tcp hostcache.

It caches measured parameters of past tcp sessions to provide better
initial start values for following connections from or to the same
source or destination.  Depending on the network parameters to/from
the remote host this can lead to significant speedups for new tcp
connections after the first one because they inherit and shortcut
the learning curve.

tcp_hostcache is designed for multiple concurrent access in SMP
environments with high contention and is hash indexed by remote
ip address.

It removes significant locking requirements from the tcp stack with
regard to the routing table.

Reviewed by:	sam (mentor), bms
Reviewed by:	-net, -current, core@kame.net (IPv6 parts)
Approved by:	re (scottl)
2003-11-20 20:07:39 +00:00
silby
be846dcb4f Remove the m_defrag call from if_loop; testing with m_fragment
has shown that the IPv6 stack can clearly handle fragmented
mbuf chains without a problem.

MFC after:	1 week
2003-11-11 17:58:36 +00:00
brooks
f1e94c6f29 Replace the if_name and if_unit members of struct ifnet with new members
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.

This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.

Approved By:	re (in principle)
Reviewed By:	njl, imp
Tested On:	i386, amd64, sparc64
Obtained From:	NetBSD (if_xname)
2003-10-31 18:32:15 +00:00
sam
357db6acd2 Always queue looped back packets (rather than potentially using
direct dispatch) to avoid extensive kernel stack usage and to
avoid directly re-entering the network stack.  The latter causes
locking problems when, for example, a complete TCP handshake`
happens w/o a context switch.
2003-10-29 18:37:47 +00:00
sam
409cf5f514 Introduce the notion of "persistent mbuf tags"; these are tags that stay
with an mbuf until it is reclaimed.  This is in contrast to tags that
vanish when an mbuf chain passes through an interface.  Persistent tags
are used, for example, by MAC labels.

Add an m_tag_delete_nonpersistent function to strip non-persistent tags
from mbufs and use it to strip such tags from packets as they pass through
the loopback interface and when turned around by icmp.  This fixes problems
with "tag leakage".

Pointed out by:	Jonathan Stone
Reviewed by:	Robert Watson
2003-10-29 05:40:07 +00:00
kan
d4fed514b5 Delay if_lo module intialization until domain list has been
completely populated. This prevents a system crash on boot.
2003-10-27 04:16:32 +00:00
sam
9d93fce265 Locking for updates to routing table entries. Each rtentry gets a mutex
that covers updates to the contents.  Note this is separate from holding
a reference and/or locking the routing table itself.

Other/related changes:

o rtredirect loses the final parameter by which an rtentry reference
  may be returned; this was never used and added unwarranted complexity
  for locking.
o minor style cleanups to routing code (e.g. ansi-fy function decls)
o remove the logic to bump the refcnt on the parent of cloned routes,
  we assume the parent will remain as long as the clone; doing this avoids
  a circularity in locking during delete
o convert some timeouts to MPSAFE callouts

Notes:

1. rt_mtx in struct rtentry is guarded by #ifdef _KERNEL as user-level
   applications cannot/do-no know about mutex's.  Doing this requires
   that the mutex be the last element in the structure.  A better solution
   is to introduce an externalized version of struct rtentry but this is
   a major task because of the intertwining of rtentry and other data
   structures that are visible to user applications.
2. There are known LOR's that are expected to go away with forthcoming
   work to eliminate many held references.  If not these will be resolved
   prior to release.
3. ATM changes are untested.

Sponsored by:	FreeBSD Foundation
Obtained from:	BSD/OS (partly)
2003-10-04 03:44:50 +00:00
silby
ad2238f4b4 Also ifdef the variable which becomes unused w/o INET6, fixing the build
after the previous commit.

Noticed by:	alc
2003-08-09 05:44:00 +00:00
silby
5d2c6e8c37 #ifdef INET6 the if_loop packet defrag; since only the ipv6 stack (might)
require this to be done, there's no reason to waste time doing it if
ipv6 isn't compiled in.

MFC after:	1 week
2003-08-09 02:55:11 +00:00
silby
b09db08f24 Replace a handrolled defrag function with m_defrag. The handrolled
function couldn't handle chains of > MCLBYTES, and it had a bug which
caused corruption and panics in certain low mbuf situations.

Additionally, change the failure case so that looutput returns ENOBUFS
rather than attempting to pass on non-defragmented mbuf chains.

Finally, remove the printf which would happen every time the low memory
situation occured.  It served no useful purpose other than to clue me
in as to what was causing the panic in question. :)

MFC after:	4 days
2003-05-28 02:04:33 +00:00
rwatson
53a050aac3 Move MAC label storage for mbufs into m_tags from the m_pkthdr structure,
returning some additional room in the first mbuf in a chain, and
avoiding feature-specific contents in the mbuf header.  To do this:

- Modify mbuf_to_label() to extract the tag, returning NULL if not
  found.

- Introduce mac_init_mbuf_tag() which does most of the work
  mac_init_mbuf() used to do, except on an m_tag rather than an
  mbuf.

- Scale back mac_init_mbuf() to perform m_tag allocation and invoke
  mac_init_mbuf_tag().

- Replace mac_destroy_mbuf() with mac_destroy_mbuf_tag(), since
  m_tag's are now GC'd deep in the m_tag/mbuf code rather than
  at a higher level when mbufs are directly free()'d.

- Add mac_copy_mbuf_tag() to support m_copy_pkthdr() and related
  notions.

- Generally change all references to mbuf labels so that they use
  mbuf_to_label() rather than &mbuf->m_pkthdr.label.  This
  required no changes in the MAC policies (yay!).

- Tweak mbuf release routines to not call mac_destroy_mbuf(),
  tag destruction takes care of it for us now.

- Remove MAC magic from m_copy_pkthdr() and m_move_pkthdr() --
  the existing m_tag support does all this for us.  Note that
  we can no longer just zero the m_tag list on the target mbuf,
  rather, we have to delete the chain because m_tag's will
  already be hung off freshly allocated mbuf's.

- Tweak m_tag copying routines so that if we're copying a MAC
  m_tag, we don't do a binary copy, rather, we initialize the
  new storage and do a deep copy of the label.

- Remove use of MAC_FLAG_INITIALIZED in a few bizarre places
  having to do with mbuf header copies previously.

- When an mbuf is copied in ip_input(), we no longer need to
  explicitly copy the label because it will get handled by the
  m_tag code now.

- No longer any weird handling of MAC labels in if_loop.c during
  header copies.

- Add MPC_LOADTIME_FLAG_LABELMBUFS flag to Biba, MLS, mac_test.
  In mac_test, handle the label==NULL case, since it can be
  dynamically loaded.

In order to improve performance with this change, introduce the notion
of "lazy MAC label allocation" -- only allocate m_tag storage for MAC
labels if we're running with a policy that uses MAC labels on mbufs.
Policies declare this intent by setting the MPC_LOADTIME_FLAG_LABELMBUFS
flag in their load-time flags field during declaration.  Note: this
opens up the possibility of post-boot policy modules getting back NULL
slot entries even though they have policy invariants of non-NULL slot
entries, as the policy might have been loaded after the mbuf was
allocated, leaving the mbuf without label storage.  Policies that cannot
handle this case must be declared as NOTLATE, or must be modified.

- mac_labelmbufs holds the current cumulative status as to whether
  any policies require mbuf labeling or not.  This is updated whenever
  the active policy set changes by the function mac_policy_updateflags().
  The function iterates the list and checks whether any have the
  flag set.  Write access to this variable is protected by the policy
  list; read access is currently not protected for performance reasons.
  This might change if it causes problems.

- Add MAC_POLICY_LIST_ASSERT_EXCLUSIVE() to permit the flags update
  function to assert appropriate locks.

- This makes allocation in mac_init_mbuf() conditional on the flag.

Reviewed by:	sam
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-04-14 20:39:06 +00:00
des
567ac2b268 Introduce an M_ASSERTPKTHDR() macro which performs the very common task
of asserting that an mbuf has a packet header.  Use it instead of hand-
rolled versions wherever applicable.

Submitted by:	Hiten Pandya <hiten@unixdaemons.com>
2003-04-08 14:25:47 +00:00
peter
fbc7526e8f Finish driving a stake through the heart of netns and the associated
ifdefs scattered around the place - its dead Jim!

The SMB stuff had stolen AF_NS, make it official.
2003-03-05 19:24:24 +00:00
jlemon
04e28d5a81 Update netisr handling; Each SWI now registers its queue, and all queue
drain routines are done by swi_net, which allows for better queue control
at some future point.  Packets may also be directly dispatched to a netisr
instead of queued, this may be of interest at some installations, but
currently defaults to off.

Reviewed by: hsu, silby, jayanth, sam
Sponsored by: DARPA, NAI Labs
2003-03-04 23:19:55 +00:00
imp
cf874b345d Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
alfred
bf8e8a6e8f Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
sam
01e5d3cdfd don't reference a pkthdr after M_MOVE_PKTHDR has "remove it"; instead
reference the pkthdr now in the destination of the move

Sponsored by:	Vernier Networks
2003-01-06 21:33:54 +00:00
sam
b16cb0a948 Correct mbuf packet header propagation. Previously, packet headers
were sometimes propagated using M_COPY_PKTHDR which actually did
something between a "move" and a  "copy" operation.  This is replaced
by M_MOVE_PKTHDR (which copies the pkthdr contents and "removes" it
from the source mbuf) and m_dup_pkthdr which copies the packet
header contents including any m_tag chain.  This corrects numerous
problems whereby mbuf tags could be lost during packet manipulations.

These changes also introduce arguments to m_tag_copy and m_tag_copy_chain
to specify if the tag copy work should potentially block.  This
introduces an incompatibility with openbsd which we may want to revisit.

Note that move/dup of packet headers does not handle target mbufs
that have a cluster bound to them.  We may want to support this;
for now we watch for it with an assert.

Finally, M_COPYFLAGS was updated to include M_FIRSTFRAG|M_LASTFRAG.

Supported by:	Vernier Networks
Reviewed by:	Robert Watson <rwatson@FreeBSD.org>
2002-12-30 20:22:40 +00:00
rwatson
bd8533b1a2 Under some circumstances, the loopback interface will allocate a new
mbuf for a packet looping back to provide alignment guarantees for
KAME.  Unfortunately, this code performs a direct copy of the header
rather than using a header copying primitive (largely because we have
sucky header copying primitives).  This results in a multiple free
of the MAC label in the header when the same label data is freed
twice when the two mbufs with that header are freed.  As a temporary
work-around, clear the initialized flag on the label to prevent the
duplicate free, which prevents panics on large unaligned loopback
IP and IPv6 data.  The real fix is to improve and make use of proper
packet header copying routines here.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-12-18 15:34:17 +00:00
sam
6a05792540 network interface and link layer changes:
o on input don't strip the Ethernet header from packets
o input packet handling is now done with if_input
o track changes to ether_ifattach/ether_ifdetach API
o track changes to bpf tapping
o call ether_ioctl for default handling of ioctl's
o use constants from net/ethernet.h where possible

Reviewed by:	many
Approved by:	re
2002-11-15 00:00:15 +00:00
sam
2a86be217a Replace aux mbufs with packet tags:
o instead of a list of mbufs use a list of m_tag structures a la openbsd
o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit
  ABI/module number cookie
o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and
  use this in defining openbsd-compatible m_tag_find and m_tag_get routines
o rewrite KAME use of aux mbufs in terms of packet tags
o eliminate the most heavily used aux mbufs by adding an additional struct
  inpcb parameter to ip_output and ip6_output to allow the IPsec code to
  locate the security policy to apply to outbound packets
o bump __FreeBSD_version so code can be conditionalized
o fixup ipfilter's call to ip_output based on __FreeBSD_version

Reviewed by:	julian, luigi (silent), -arch, -net, darren
Approved by:	julian, silence from everyone else
Obtained from:	openbsd (mostly)
MFC after:	1 month
2002-10-16 01:54:46 +00:00
brooks
6cfd5a5a1d Move all unit number management cloned interfaces into the cloning
code.  The reverts the API change which made the <if>_clone_destory()
functions return an int instead of void bringing us into closer
alignment with NetBSD.

Reviewed by:	net (a long time ago)
2002-05-25 20:17:04 +00:00
luigi
f7c71ab2f2 Replace (deprecated ?) FREE() macro with direct calls to free() 2002-04-04 06:03:17 +00:00
mux
9a5e4c88a3 Simplify the interface cloning framework by handling unit
unit allocation with a bitmap in the generic layer.  This
allows us to get rid of the duplicated rman code in every
clonable interface.

Reviewed by:	brooks
Approved by:	phk
2002-03-11 09:26:07 +00:00
brooks
be085a899e Add cloning support to the loopback interface.
Submitted by:	mux
2002-03-04 21:46:00 +00:00
jake
fc59b98b83 sparc64 needs the same alingment fixes that alpha and ia64 do.
Submitted by:	tmm
2001-12-29 08:45:17 +00:00
jlemon
0a6314db1d whitespace fixes. 2001-12-14 19:32:47 +00:00
ru
ecb4d3d05f Pull post-4.4BSD change to sys/net/route.c from BSD/OS 4.2.
Have sys/net/route.c:rtrequest1(), which takes ``rt_addrinfo *''
as the argument.  Pass rt_addrinfo all the way down to rtrequest1
and ifa->ifa_rtrequest.  3rd argument of ifa->ifa_rtrequest is now
``rt_addrinfo *'' instead of ``sockaddr *'' (almost noone is
using it anyways).

Benefit: the following command now works.  Previously we needed
two route(8) invocations, "add" then "change".
# route add -inet6 default ::1 -ifp gif0

Remove unsafe typecast in rtrequest(), from ``rtentry *'' to
``sockaddr *''.  It was introduced by 4.3BSD-Reno and never
corrected.

Obtained from:	BSD/OS, NetBSD
MFC after:	1 month
PR:		kern/28360
2001-10-17 18:07:05 +00:00
ru
9653180034 Revision 1.32 corresponded to CSRG revision 8.2. 2001-10-17 10:44:39 +00:00
dfr
0eec0bfe7b Add ia64 to the list of machines which don't do unaligned reads. 2001-10-05 19:04:23 +00:00
peter
9648bb82ea Remove/comment tokens after #endif (#endif NETATALK) 2001-09-10 01:33:03 +00:00
ume
832f8d2249 Sync with recent KAME.
This work was based on kame-20010528-freebsd43-snap.tgz and some
critical problem after the snap was out were fixed.
There are many many changes since last KAME merge.

TODO:
  - The definitions of SADB_* in sys/net/pfkeyv2.h are still different
    from RFC2407/IANA assignment because of binary compatibility
    issue.  It should be fixed under 5-CURRENT.
  - ip6po_m member of struct ip6_pktopts is no longer used.  But, it
    is still there because of binary compatibility issue.  It should
    be removed under 5-CURRENT.

Reviewed by:	itojun
Obtained from:	KAME
MFC after:	3 weeks
2001-06-11 12:39:29 +00:00
peter
4b91e2ecf0 "Fix" the previous initial attempt at fixing TUNABLE_INT(). This time
around, use a common function for looking up and extracting the tunables
from the kernel environment.  This saves duplicating the same function
over and over again.  This way typically has an overhead of 8 bytes + the
path string, versus about 26 bytes + the path string.
2001-06-08 05:24:21 +00:00
peter
c1df44ae51 Back out part of my previous commit. This was a last minute change
and I botched testing.  This is a perfect example of how NOT to do
this sort of thing. :-(
2001-06-07 03:17:26 +00:00
peter
0732738ec4 Make the TUNABLE_*() macros look and behave more consistantly like the
SYSCTL_*() macros.  TUNABLE_INT_DECL() was an odd name because it didn't
actually declare the int, which is what the name suggests it would do.
2001-06-06 22:17:08 +00:00
peter
6be84866ea Exterminate the use of PSEUDO_SET() with extreme prejudice. 2001-01-31 07:58:58 +00:00
bp
076c2e70fd Let M_PANIC go back to the private tree as its intention isn't understood well
for now.
2001-01-31 04:50:20 +00:00
peter
695e2716fd Use M_PANIC instead of if (sc == NULL) panic(); 2001-01-29 13:21:34 +00:00
peter
20862096d0 Make the number of loopback interfaces dynamically tunable. Why one
would *want* to is a different story, but it used to be able to be done
statically.  Get rid of #include "loop.h" and struct ifnet loif[NLOOP];
This could be used as an example of how to do this in other drivers,
for example: ccd.
2001-01-29 11:06:26 +00:00
jlemon
954e1d2ccd Lock down the network interface queues. The queue mutex must be obtained
before adding/removing packets from the queue.  Also, the if_obytes and
if_omcasts fields should only be manipulated under protection of the mutex.

IF_ENQUEUE, IF_PREPEND, and IF_DEQUEUE perform all necessary locking on
the queue.  An IF_LOCK macro is provided, as well as the old (mutex-less)
versions of the macros in the form _IF_ENQUEUE, _IF_QFULL, for code which
needs them, but their use is discouraged.

Two new macros are introduced: IF_DRAIN() to drain a queue, and IF_HANDOFF,
which takes care of locking/enqueue, and also statistics updating/start
if necessary.
2000-11-25 07:35:38 +00:00
rwatson
283b0b9a59 o Add missing "\n" to warning output in netinet/if_loop.c, when an
unsupported address family is used on localhost interface.

    looutput: af=0 unexpected

  Speculation as to the reasons for my seeing this error are welcome, of
  course.  :-)
2000-09-04 21:46:27 +00:00
itojun
5f4e854de1 sync with kame tree as of july00. tons of bug fixes/improvements.
API changes:
- additional IPv6 ioctls
- IPsec PF_KEY API was changed, it is mandatory to upgrade setkey(8).
  (also syntax change)
2000-07-04 16:35:15 +00:00
mjacob
bb3c05baed fix KASSERT usage 2000-06-02 19:06:33 +00:00
gallatin
0eb0bb6777 Rather than checking for hlen causing misalignment, we should do the
m_adj() and then check the resulting mbuf for misalignment, copying
backwards to align the mbuf if required.

This fixes a longstanding problem where an mbuf which would have been
properly aligned after an m_adj() was being misaligned and causing an
unaligned access trap in ip_input().  This bug only triggered when booting
diskless.

Reviewed by:	dfr
2000-05-26 13:47:02 +00:00
archie
4ba625d0ce Just need to pass the address family to if_simloop(), not the whole sockaddr. 2000-05-24 21:16:56 +00:00
peter
a5441090de Clean up some loose ends in the network code, including the X.25 and ISO
#ifdefs.  Clean out unused netisr's and leftover netisr linker set gunk.
Tested on x86 and alpha, including world.

Approved by:	jkh
2000-02-13 03:32:07 +00:00
peter
d15f4a9e26 Remove some #if NFOO > 0 that are always true because of config rules. 2000-01-29 16:56:24 +00:00
shin
70f0bdf681 udp IPv6 support, IPv6/IPv4 tunneling support in kernel,
packet divert at kernel for IPv6/IPv4 translater daemon

This includes queue related patch submitted by jburkhol@home.com.

Submitted by: queue related patch from jburkhol@home.com
Reviewed by: freebsd-arch, cvs-committers
Obtained from: KAME project
1999-12-07 17:39:16 +00:00
shin
cad2014b27 KAME netinet6 basic part(no IPsec,no V6 Multicast Forwarding, no UDP/TCP
for IPv6 yet)

With this patch, you can assigne IPv6 addr automatically, and can reply to
IPv6 ping.

Reviewed by: freebsd-arch, cvs-committers
Obtained from: KAME project
1999-11-22 02:45:11 +00:00