During tests an assert was triggered and pointed to missing flags in
the newlink function of ng_bridge(4).
Reported by: markj
Reviewed by: markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D30759
If sending out a packet fails during the loop over all links, the
allocated memory is leaked and not all links receive a copy. This
patch fixes those problems, clarifies a premature abort of the loop,
and fixes a minory style(9) bug.
PR: 255430
Submitted by: Dancho Penev
Tested by: Dancho Penev
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30008
Hint the compiler, that this update is needed at most once per second.
Only in this case the memory line needs to be written. This will
reduce the amount of cache trashing during forward of most frames.
Suggested by: zec
Approved by: zec
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28601
The node ng_bridge underwent a lot of changes in the last few months.
All those steps were necessary to distinguish between structure
modifying and read-only data transport paths. Now it's done, the node
can perform frame forwarding on multiple cores in parallel.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28123
Use the new control message to move ethernet addresses from a link to
a new link in ng_bridge(4). Send this message instead of doing the
work directly requires to move the loop detection into the control
message processing. This will delay the loop detection by a few
frames.
This decouples the read-only activity from the modification under a
more strict writer lock.
Reviewed by: manpages (gbe)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28559
Add a new control message to move ethernet addresses to a given link
in ng_bridge(4). Send this message instead of doing the work directly.
This decouples the read-only activity from the modification under a
more strict writer lock.
Decoupling the work is a prerequisite for multithreaded operation.
Approved by: manpages (bcr), kp (earlier version)
MFC: 3 weeks
Differential Revision: https://reviews.freebsd.org/D28516
For broadcast, multicast and unknown unicast, the replication loop
sends a copy of the packet to each link, beside the first one. This
special path is handled later, but the counters are not updated.
Factor out the common send and count actions as a function.
Reviewed by: kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28537
In the data path of ng_bridge(4), the only value of the host struct,
which needs to be modified, is the staleness, which is reset every
time a frame is received. It's save to leave the code as it is.
This patch is part of a series to make ng_bridge(4) multithreaded.
Reviewed by: kp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28546
In a earlier version of ng_bridge(4) the exernal visible host entry
structure was a strict subset of the internal one. So internal view
was a direct annotation of the external structure. This strict
inheritance was lost many versions ago. There is no need to
encapsulate a part of the internal represntation as a separate
structure.
This patch is a preparation to make the internal structure read only
in the data path in order to make ng_bridge(4) multithreaded.
Reviewed by: kp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28545
The data path in netgraph is designed to work on an read only state of
the whole netgraph network. Currently this is achived by convention,
there is no technical enforcment. In the case of NETGRAPH_DEBUG all
nodes can be annotated for debugging purposes, so the strict
enforcment needs to be lifted for this purpose.
This patch is part of a series to make ng_bridge multithreaded, which
is done by rewrite the data path to operate on const.
Reviewed By: kp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28141
The data path in netgraph is designed to work on an read only state of
the whole netgraph network. Currently this is achived by convetion,
there is no technical enforcment. This patch is part of a series to
make ng_brigde multithreaded, which is done by rewrite the data path
to const handling.
Reviewed By: kp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28141
This was announced to happen after the 12 relases.
Remove a depeciated ABI.
The complete removal is for HEAD only. I'll remove the #define in
stable/13 as MFC, so the code will still exist in 13.x, but will not
included by default. Earlier versions will not be affected.
Reviewed by: kp
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D28518
This is the first patch of a series of necessary steps
to make ng_bridge(4) multithreaded.
Reviewed by: melifaro (network), afedorov
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D28125
Handling of unknown MACs on an bridge with incomplete learning
capabilites (aka uplink ports) can be defined in different ways.
The classical approach is to broadcast unicast frames send to an
unknown MAC, because the unknown devices can be everywhere. This mode
is default for ng_bridge(4).
In the case of dedicated uplink ports, which prohibit learning of MAC
addresses in order to save memory and CPU cycles, the broadcast
approach is dangerous. All traffic to the uplink port is broadcasted
to every downlink port, too. In this case, it's better to restrict the
distribution of frames to unknown MAC to the uplink ports only.
In order to keep the chance small and the handling as natural as
possible, the first attached link is used to determine the behaviour
of the bridge: If it is an "uplink" port, then the bridge switch from
classical mode to restricted mode.
Reviewed By: kp
Approved by: kp (mentor)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28487
The ng_bridge(4) node is designed to work in moderately small
environments. Connecting such a node to a larger network rapidly fills
the MAC table for no reason. It even become complicated to obtain data
from the gettable message, because the result is too large to
transmit.
This patch introduces, two new functionality bits on the hooks:
- Allow or disallow MAC address learning for incoming patckets.
- Allow or disallow sending unknown MACs through this hook.
Uplinks are characterized by denied learing while sending out
unknowns. Normal links are charaterized by allowed learning and
sending out unknowns.
Reviewed by: kp
Approved by: kp (mentor)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D23963
In order to be able to merge r353026 bring back support for the old
cookie API for a transition period in 12.x releases (and possibly 13)
before the old API can be removed again entirely.
Suggested by: julian
Submitted by: Lutz Donnerhacke (lutz donnerhacke.de)
PR: 240787
Reviewed by: julian
MFC after: 2 weeks
X-MFC with: r353026
Differential Revision: https://reviews.freebsd.org/D21961
can handle. Instead using an array on node private data, use per-hook
private data.
- Use NG_NODE_FOREACH_HOOK() to traverse through hooks instead of array.
PR: 240787
Submitted by: Lutz Donnerhacke <lutz donnerhacke.de>
Differential Revision: https://reviews.freebsd.org/D21803
Uses of mallocarray(9).
The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.
Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.
Reported by: wosch
PR: 225197
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.
This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.
X-Differential revision: https://reviews.freebsd.org/D13837
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
packet filters. ALso allows ipfw to be enabled on on ejail and disabled
on another. In 8.0 it's a global setting.
Sitting aroung in tree waiting to commit for: 2 months
MFC after: 2 months
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks. Minor cleanups are done in the process,
and comments updated to reflect these changes.
Reviewed by: bz
Approved by: re (vimage blanket)
container structures, depending on VIMAGE_GLOBALS compile time option.
Make VIMAGE_GLOBALS a new compile-time option, which by default will not
be defined, resulting in instatiations of global variables selected for
V_irtualization (enclosed in #ifdef VIMAGE_GLOBALS blocks) to be
effectively compiled out. Instantiate new global container structures
to hold V_irtualized variables: vnet_net_0, vnet_inet_0, vnet_inet6_0,
vnet_ipsec_0, vnet_netgraph_0, and vnet_gif_0.
Update the VSYM() macro so that depending on VIMAGE_GLOBALS the V_
macros resolve either to the original globals, or to fields inside
container structures, i.e. effectively
#ifdef VIMAGE_GLOBALS
#define V_rt_tables rt_tables
#else
#define V_rt_tables vnet_net_0._rt_tables
#endif
Update SYSCTL_V_*() macros to operate either on globals or on fields
inside container structs.
Extend the internal kldsym() lookups with the ability to resolve
selected fields inside the virtualization container structs. This
applies only to the fields which are explicitly registered for kldsym()
visibility via VNET_MOD_DECLARE() and vnet_mod_register(), currently
this is done only in sys/net/if.c.
Fix a few broken instances of MODULE_GLOBAL() macro use in SCTP code,
and modify the MODULE_GLOBAL() macro to resolve to V_ macros, which in
turn result in proper code being generated depending on VIMAGE_GLOBALS.
De-virtualize local static variables in sys/contrib/pf/net/pf_subr.c
which were prematurely V_irtualized by automated V_ prepending scripts
during earlier merging steps. PF virtualization will be done
separately, most probably after next PF import.
Convert a few variable initializations at instantiation to
initialization in init functions, most notably in ipfw. Also convert
TUNABLE_INT() initializers for V_ variables to TUNABLE_FETCH_INT() in
initializer functions.
Discussed at: devsummit Strassburg
Reviewed by: bz, julian
Approved by: julian (mentor)
Obtained from: //depot/projects/vimage-commit2/...
X-MFC after: never
Sponsored by: NLnet Foundation, The FreeBSD Foundation
virtualization work done by Marko Zec (zec@).
This is the first in a series of commits over the course
of the next few weeks.
Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.
We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.
Obtained from: //depot/projects/vimage-commit2/...
Reviewed by: brooks, des, ed, mav, julian,
jamie, kris, rwatson, zec, ...
(various people I forgot, different versions)
md5 (with a bit of help)
Sponsored by: NLnet Foundation, The FreeBSD Foundation
X-MFC after: never
V_Commit_Message_Reviewed_By: more people than the patch
The difference is that the callout function installed via the
ng_callout() method is guaranteed to NOT fire after the shutdown
method was run (when a node is marked NGF_INVALID). Also, the
shutdown method and the callout function are guaranteed to NOT
run at the same time, as both require the writer lock. Thus
we can safely ignore a zero return value from ng_uncallout()
(callout_stop()) in shutdown methods, and go on with freeing
the node.
The said revision broke the node shutdown -- ng_bridge_timeout()
is no longer fired after ng_bridge_shutdown() was run, resulting
in a memory leak, dead nodes, and inability to unload the module.
Fix this by cancelling the callout on shutdown, and moving part
responsible for freeing a node resources from ng_bridge_timer()
to ng_bridge_shutdown().
Noticed by: ru
Submitted by: glebius, ru
and preserves the ipfw ABI. The ipfw core packet inspection and filtering
functions have not been changed, only how ipfw is invoked is different.
However there are many changes how ipfw is and its add-on's are handled:
In general ipfw is now called through the PFIL_HOOKS and most associated
magic, that was in ip_input() or ip_output() previously, is now done in
ipfw_check_[in|out]() in the ipfw PFIL handler.
IPDIVERT is entirely handled within the ipfw PFIL handlers. A packet to
be diverted is checked if it is fragmented, if yes, ip_reass() gets in for
reassembly. If not, or all fragments arrived and the packet is complete,
divert_packet is called directly. For 'tee' no reassembly attempt is made
and a copy of the packet is sent to the divert socket unmodified. The
original packet continues its way through ip_input/output().
ipfw 'forward' is done via m_tag's. The ipfw PFIL handlers tag the packet
with the new destination sockaddr_in. A check if the new destination is a
local IP address is made and the m_flags are set appropriately. ip_input()
and ip_output() have some more work to do here. For ip_input() the m_flags
are checked and a packet for us is directly sent to the 'ours' section for
further processing. Destination changes on the input path are only tagged
and the 'srcrt' flag to ip_forward() is set to disable destination checks
and ICMP replies at this stage. The tag is going to be handled on output.
ip_output() again checks for m_flags and the 'ours' tag. If found, the
packet will be dropped back to the IP netisr where it is going to be picked
up by ip_input() again and the directly sent to the 'ours' section. When
only the destination changes, the route's 'dst' is overwritten with the
new destination from the forward m_tag. Then it jumps back at the route
lookup again and skips the firewall check because it has been marked with
M_SKIP_FIREWALL. ipfw 'forward' has to be compiled into the kernel with
'option IPFIREWALL_FORWARD' to enable it.
DUMMYNET is entirely handled within the ipfw PFIL handlers. A packet for
a dummynet pipe or queue is directly sent to dummynet_io(). Dummynet will
then inject it back into ip_input/ip_output() after it has served its time.
Dummynet packets are tagged and will continue from the next rule when they
hit the ipfw PFIL handlers again after re-injection.
BRIDGING and IPFW_ETHER are not changed yet and use ipfw_chk() directly as
they did before. Later this will be changed to dedicated ETHER PFIL_HOOKS.
More detailed changes to the code:
conf/files
Add netinet/ip_fw_pfil.c.
conf/options
Add IPFIREWALL_FORWARD option.
modules/ipfw/Makefile
Add ip_fw_pfil.c.
net/bridge.c
Disable PFIL_HOOKS if ipfw for bridging is active. Bridging ipfw
is still directly invoked to handle layer2 headers and packets would
get a double ipfw when run through PFIL_HOOKS as well.
netinet/ip_divert.c
Removed divert_clone() function. It is no longer used.
netinet/ip_dummynet.[ch]
Neither the route 'ro' nor the destination 'dst' need to be stored
while in dummynet transit. Structure members and associated macros
are removed.
netinet/ip_fastfwd.c
Removed all direct ipfw handling code and replace it with the new
'ipfw forward' handling code.
netinet/ip_fw.h
Removed 'ro' and 'dst' from struct ip_fw_args.
netinet/ip_fw2.c
(Re)moved some global variables and the module handling.
netinet/ip_fw_pfil.c
New file containing the ipfw PFIL handlers and module initialization.
netinet/ip_input.c
Removed all direct ipfw handling code and replace it with the new
'ipfw forward' handling code. ip_forward() does not longer require
the 'next_hop' struct sockaddr_in argument. Disable early checks
if 'srcrt' is set.
netinet/ip_output.c
Removed all direct ipfw handling code and replace it with the new
'ipfw forward' handling code.
netinet/ip_var.h
Add ip_reass() as general function. (Used from ipfw PFIL handlers
for IPDIVERT.)
netinet/raw_ip.c
Directly check if ipfw and dummynet control pointers are active.
netinet/tcp_input.c
Rework the 'ipfw forward' to local code to work with the new way of
forward tags.
netinet/tcp_sack.c
Remove include 'opt_ipfw.h' which is not needed here.
sys/mbuf.h
Remove m_claim_next() macro which was exclusively for ipfw 'forward'
and is no longer needed.
Approved by: re (scottl)
Also introduce a macro to be called by persistent nodes to signal their
persistence during shutdown to hide this mechanism from the node author.
Make node flags have a consistent style in naming.
Document the change.
Only the first link0..link$NLINKS hooks would be utilized, whereas
the link hooks may be connected sparsely.
Add a counter variable so that the link hook array is only traversed
while there is still work to do, but that it continues up to the end
if it has to.
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.
This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.
Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)
in the case where the bridge node was closed down but a timeout
still applied to it, the final reference to the node was freeing the private
data structure using the wrong malloc type.
Approved by: re@