them mostly with packet tags (one case is handled by using an mbuf flag
since the linkage between "caller" and "callee" is direct and there's no
need to incur the overhead of a packet tag).
This is (mostly) work from: sam
Silence from: -arch
Approved by: bms(mentor), sam, rwatson
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.
This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.
Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)
then the mbuf has been consumed by a hook; otherwise beware of a null
mbuf return (gack). In particular the bridge was doing the wrong thing.
While in the ipv6 code make it's handling of pfil_run_hooks identical
to netbsd.
Pointed out by: Pyun YongHyeon <yongari@kt-is.co.kr>
o revamp IPv4+IPv6+bridge usage to match API changes
o remove pfil_head instances from protosw entries (no longer used)
o add locking
o bump FreeBSD version for 3rd party modules
Heavy lifting by: "Max Laier" <max@love2party.net>
Supported by: FreeBSD Foundation
Obtained from: NetBSD (bits of pfil.h and pfil.c)
o replace magic constants with #defines (e.g. ETHER_ADDR_LEN)
o move mib variables to net.link.ether.bridge with backwards compatible
entries for well-known items maintained under BURN_BRIDGES
o revamp debugging support so it is conditioanlly compiled with BRIDGE_DEBUG
(on currently) and runtime controlled by net.link.ether.bridge.debug
o change timeout to MPSAFE callout
o optimize lookup for common case of two interfaces
o optimize forwarding path to take IFNET lock only when needed
o make boot-time printf dependent on bootverbose
o sundry style changes (ANSI decls, extraneous spaces, etc.)
Sponsored by: FreeBSD Foundation
warning which breaks builds.
cc1: warnings being treated as errors
src/sys/net/bridge.c: In function `bdg_forward':
sys/net/bridge.c:931: warning: suggest parentheses around assignment used as truth value
*** Error code 1
revision 1.62. It was checking for M_PREPEND() failing, not for the
case of a NULL mbuf pointer being supplied to the macro. Back out
that revision, and fix the NULL dereference by not calling EH_RESTORE()
in the case where the mbuf pointer is NULL because the firewall
rejected the packet.
the mbuf allocator flags {M_TRYWAIT, M_DONTWAIT}.
o Fix a bpf_compat issue where malloc() was defined to just call
bpf_alloc() and pass the 'canwait' flag(s) along. It's been changed
to call bpf_alloc() but pass the corresponding M_TRYWAIT or M_DONTWAIT
flag (and only one of those two).
Submitted by: Hiten Pandya <hiten@unixdaemons.com> (hiten->commit_count++)
o instead of a list of mbufs use a list of m_tag structures a la openbsd
o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit
ABI/module number cookie
o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and
use this in defining openbsd-compatible m_tag_find and m_tag_get routines
o rewrite KAME use of aux mbufs in terms of packet tags
o eliminate the most heavily used aux mbufs by adding an additional struct
inpcb parameter to ip_output and ip6_output to allow the IPsec code to
locate the security policy to apply to outbound packets
o bump __FreeBSD_version so code can be conditionalized
o fixup ipfilter's call to ip_output based on __FreeBSD_version
Reviewed by: julian, luigi (silent), -arch, -net, darren
Approved by: julian, silence from everyone else
Obtained from: openbsd (mostly)
MFC after: 1 month
I should have committed this ages ago.
The MFC for if_ethersubr.c could be done in the usual few days (only
ipfw2 uses it), the one for bridge.c should probably wait until
after 4.7 because it changes an existing though mostly undocumented
behaviour (on which i hope nobody relies). All in all, i'll wait for
both things unless there is demand.
MFC after: 35 days
packet forwarding state ("annotations") during ip processing.
The code is considerably cleaner now.
The variables removed by this change are:
ip_divert_cookie used by divert sockets
ip_fw_fwd_addr used for transparent ip redirection
last_pkt used by dynamic pipes in dummynet
Removal of the first two has been done by carrying the annotations
into volatile structs prepended to the mbuf chains, and adding
appropriate code to add/remove annotations in the routines which
make use of them, i.e. ip_input(), ip_output(), tcp_input(),
bdg_forward(), ether_demux(), ether_output_frame(), div_output().
On passing, remove a bug in divert handling of fragmented packet.
Now it is the fragment at offset 0 which sets the divert status of
the whole packet, whereas formerly it was the last incoming fragment
to decide.
Removal of last_pkt required a change in the interface of ip_fw_chk()
and dummynet_io(). On passing, use the same mechanism for dummynet
annotations and for divert/forward annotations.
option IPFIREWALL_FORWARD is effectively useless, the code to
implement it is very small and is now in by default to avoid the
obfuscation of conditionally compiled code.
NOTES:
* there is at least one global variable left, sro_fwd, in ip_output().
I am not sure if/how this can be removed.
* I have deliberately avoided gratuitous style changes in this commit
to avoid cluttering the diffs. Minor stule cleanup will likely be
necessary
* this commit only focused on the IP layer. I am sure there is a
number of global variables used in the TCP and maybe UDP stack.
* despite the number of files touched, there are absolutely no API's
or data structures changed by this commit (except the interfaces of
ip_fw_chk() and dummynet_io(), which are internal anyways), so
an MFC is quite safe and unintrusive (and desirable, given the
improved readability of the code).
MFC after: 10 days
were totally useless and have been removed.
ip_input.c, ip_output.c:
Properly initialize the "ip" pointer in case the firewall does an
m_pullup() on the packet.
Remove some debugging code forgotten long ago.
ip_fw.[ch], bridge.c:
Prepare the grounds for matching MAC header fields in bridged packets,
so we can have 'etherfw' functionality without a lot of kernel and
userland bloat.
# sysctl net.link.ether.bdg_ipf=1
To enable. Just like ipfw(8) bridging, only input packets are filtered
in the bridge. Filtering works just like in the IP layer, ipf(8)
first, then ipfw(8). And just like in the IP layer, both are
independent, one need not be run to use the other. (Note: This will
not work in, but doesn't break, the bridge.ko module. The ipl.ko
module would need to be fixed before that is worth worrying about.)
Reviewed by: luigi
deprecated in favor of the POSIX-defined lowercase variants.
o Change all occurrences of NTOHL() and associated marcros in the
source tree to use the lowercase function variants.
o Add missing license bits to sparc64's <machine/endian.h>.
Approved by: jake
o Clean up <machine/endian.h> files.
o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>.
o Remove prototypes for non-existent bswapXX() functions.
o Include <machine/endian.h> in <arpa/inet.h> to define the
POSIX-required ntohl() family of functions.
o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>,
and <sys/param.h>.
o Prepend underscores to the ntohl() family to help deal with
complexities associated with having MD (asm and inline) versions, and
having to prevent exposure of these functions in other headers that
happen to make use of endian-specific defines.
o Create weak aliases to the canonical function name to help deal with
third-party software forgetting to include an appropriate header.
o Remove some now unneeded pollution from <sys/types.h>.
o Add missing <arpa/inet.h> includes in userland.
Tested on: alpha, i386
Reviewed by: bde, jake, tmm
In order of importance:
+ each cluster now uses private data structures (filtering and
local address tables) so you can treat them as fully independent
switches. This part of the work was supported by:
Cisco Systems, Inc. - NSITE lab, RTP, NC.
+ cleaned up the handling of configuration, so the system will behave
much better when real or pseudo devices are dynamically attached
or detached. It should also not panic anymore on systems with large
number of devices, closing a few existings PRs on the topic.
+ while at it, add support for VLAN. This means that a FreeBSD box
can now work as a real VLAN switch, with trunk interfaces etc.
As an example:
ifconfig vlan0 vlan 3 vlandev dc0
ifconfig vlan1 vlan 4 vlandev dc0
net.link.ether.bridge_cfg="vlan0:3,dc1:3,vlan1:4,dc1:4"
uses dc0 as a trunk interface, and dc1 and dc3 as ports on vlans 3 and 4
You get the idea...
NOTA BENE: by default bridge_cfg is initialised to "" so even if
you enable bridging, no packets will be bridged until you set the
list of interfaces on which you want this to happen.
+ large restructuring of the code, moving private vars and types from
bridge.h to bridge.c.
+ added a lot of comments to the code to explain how to use it.
+ implement "limit" rules, which permit to limit the number of sessions
between certain host pairs (according to masks). These are a special
type of stateful rules, which might be of interest in some cases.
See the ipfw manpage for details.
+ merge the list pointers and ipfw rule descriptors in the kernel, so
the code is smaller, faster and more readable. This patch basically
consists in replacing "foo->rule->bar" with "rule->bar" all over
the place.
I have been willing to do this for ages!
MFC after: 1 week
+ configuration: make sure that the NUL at the end of the config
string is properly detected and handled, and the stats passed
up via sysctl properly reflect which interfaces do bridging.
(The whole config support might make good use of some cleanup
in the future).
+ fixed some bugs related to the corruption of multicast and
broadcast packets: make sure that for those packets the entire
IP + ethernet header is in the mbuf, not in a cluster, so
that writes performed in that area by the upper layers do
not affect us.
+ performance: when calling m_pullup, make room for the ethernet header
as well, we are going to add it in right after. Also, change an m_dup
back to m_copypacket. The former is not necessary anymore now, and
it did not help, anyways.
I will do a fast MFC because 95% of this patch is fixing bad bugs
and i doubt anyone would test the fix in CURRENT. Plus the last
two items mostly bring back some code which was already there in 4.0
times.
before adding/removing packets from the queue. Also, the if_obytes and
if_omcasts fields should only be manipulated under protection of the mutex.
IF_ENQUEUE, IF_PREPEND, and IF_DEQUEUE perform all necessary locking on
the queue. An IF_LOCK macro is provided, as well as the old (mutex-less)
versions of the macros in the form _IF_ENQUEUE, _IF_QFULL, for code which
needs them, but their use is discouraged.
Two new macros are introduced: IF_DRAIN() to drain a queue, and IF_HANDOFF,
which takes care of locking/enqueue, and also statistics updating/start
if necessary.
* Some dummynet code incorrectly handled a malloc()-allocated pseudo-mbuf
header structure, called "pkt," and could consequently pollute the mbuf
free list if it was ever passed to m_freem(). The fix involved passing not
pkt, but essentially pkt->m_next (which is a real mbuf) to the mbuf
utility routines.
* Also, for dummynet, in bdg_forward(), made the code copy the ethernet header
back into the mbuf (prepended) because the dummynet code that follows expects
it to be there but it is, unfortunately for dummynet, passed to bdg_forward
as a seperate argument.
PRs: kern/19551 ; misc/21534 ; kern/23010
Submitted by: Thomas Moestl <tmoestl@gmx.net>
Reviewed by: bmilekic
Approved by: luigi