eddc45e797
precedent.
1115 lines
38 KiB
Groff
1115 lines
38 KiB
Groff
.\" Copyright (c) 1996-1999 Whistle Communications, Inc.
|
|
.\" All rights reserved.
|
|
.\"
|
|
.\" Subject to the following obligations and disclaimer of warranty, use and
|
|
.\" redistribution of this software, in source or object code forms, with or
|
|
.\" without modifications are expressly permitted by Whistle Communications;
|
|
.\" provided, however, that:
|
|
.\" 1. Any and all reproductions of the source or object code must include the
|
|
.\" copyright notice above and the following disclaimer of warranties; and
|
|
.\" 2. No rights are granted, in any manner or form, to use Whistle
|
|
.\" Communications, Inc. trademarks, including the mark "WHISTLE
|
|
.\" COMMUNICATIONS" on advertising, endorsements, or otherwise except as
|
|
.\" such appears in the above copyright notice or in the software.
|
|
.\"
|
|
.\" THIS SOFTWARE IS BEING PROVIDED BY WHISTLE COMMUNICATIONS "AS IS", AND
|
|
.\" TO THE MAXIMUM EXTENT PERMITTED BY LAW, WHISTLE COMMUNICATIONS MAKES NO
|
|
.\" REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, REGARDING THIS SOFTWARE,
|
|
.\" INCLUDING WITHOUT LIMITATION, ANY AND ALL IMPLIED WARRANTIES OF
|
|
.\" MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
|
|
.\" WHISTLE COMMUNICATIONS DOES NOT WARRANT, GUARANTEE, OR MAKE ANY
|
|
.\" REPRESENTATIONS REGARDING THE USE OF, OR THE RESULTS OF THE USE OF THIS
|
|
.\" SOFTWARE IN TERMS OF ITS CORRECTNESS, ACCURACY, RELIABILITY OR OTHERWISE.
|
|
.\" IN NO EVENT SHALL WHISTLE COMMUNICATIONS BE LIABLE FOR ANY DAMAGES
|
|
.\" RESULTING FROM OR ARISING OUT OF ANY USE OF THIS SOFTWARE, INCLUDING
|
|
.\" WITHOUT LIMITATION, ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
|
|
.\" PUNITIVE, OR CONSEQUENTIAL DAMAGES, PROCUREMENT OF SUBSTITUTE GOODS OR
|
|
.\" SERVICES, LOSS OF USE, DATA OR PROFITS, HOWEVER CAUSED AND UNDER ANY
|
|
.\" THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
|
.\" (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
|
|
.\" THIS SOFTWARE, EVEN IF WHISTLE COMMUNICATIONS IS ADVISED OF THE POSSIBILITY
|
|
.\" OF SUCH DAMAGE.
|
|
.\"
|
|
.\" Authors: Julian Elischer <julian@FreeBSD.org>
|
|
.\" Archie Cobbs <archie@FreeBSD.org>
|
|
.\"
|
|
.\" $FreeBSD$
|
|
.\" $Whistle: netgraph.4,v 1.7 1999/01/28 23:54:52 julian Exp $
|
|
.\"
|
|
.Dd January 19, 1999
|
|
.Dt NETGRAPH 4
|
|
.Os FreeBSD
|
|
.Sh NAME
|
|
.Nm netgraph
|
|
.Nd graph based kernel networking subsystem
|
|
.Sh DESCRIPTION
|
|
The
|
|
.Nm
|
|
system provides a uniform and modular system for the implementation
|
|
of kernel objects which perform various networking functions. The objects,
|
|
known as
|
|
.Em nodes ,
|
|
can be arranged into arbitrarily complicated graphs. Nodes have
|
|
.Em hooks
|
|
which are used to connect two nodes together, forming the edges in the graph.
|
|
Nodes communicate along the edges to process data, implement protocols, etc.
|
|
.Pp
|
|
The aim of
|
|
.Nm
|
|
is to supplement rather than replace the existing kernel networking
|
|
infrastructure. It provides:
|
|
.Pp
|
|
.Bl -bullet -compact -offset 2n
|
|
.It
|
|
A flexible way of combining protocol and link level drivers
|
|
.It
|
|
A modular way to implement new protocols
|
|
.It
|
|
A common framework for kernel entities to inter-communicate
|
|
.It
|
|
A reasonably fast, kernel-based implementation
|
|
.El
|
|
.Sh Nodes and Types
|
|
The most fundamental concept in
|
|
.Nm
|
|
is that of a
|
|
.Em node .
|
|
All nodes implement a number of predefined methods which allow them
|
|
to interact with other nodes in a well defined manner.
|
|
.Pp
|
|
Each node has a
|
|
.Em type ,
|
|
which is a static property of the node determined at node creation time.
|
|
A node's type is described by a unique
|
|
.Tn ASCII
|
|
type name.
|
|
The type implies what the node does and how it may be connected
|
|
to other nodes.
|
|
.Pp
|
|
In object-oriented language, types are classes and nodes are instances
|
|
of their respective class. All node types are subclasses of the generic node
|
|
type, and hence inherit certain common functionality and capabilities
|
|
(e.g., the ability to have an
|
|
.Tn ASCII
|
|
name).
|
|
.Pp
|
|
Nodes may be assigned a globally unique
|
|
.Tn ASCII
|
|
name which can be
|
|
used to refer to the node.
|
|
The name must not contain the characters
|
|
.Dq \&.
|
|
or
|
|
.Dq \&:
|
|
and is limited to
|
|
.Dv "NG_NODELEN + 1"
|
|
characters (including NUL byte).
|
|
.Pp
|
|
Each node instance has a unique
|
|
.Em ID number
|
|
which is expressed as a 32-bit hex value. This value may be used to
|
|
refer to a node when there is no
|
|
.Tn ASCII
|
|
name assigned to it.
|
|
.Sh Hooks
|
|
Nodes are connected to other nodes by connecting a pair of
|
|
.Em hooks ,
|
|
one from each node. Data flows bidirectionally between nodes along
|
|
connected pairs of hooks. A node may have as many hooks as it
|
|
needs, and may assign whatever meaning it wants to a hook.
|
|
.Pp
|
|
Hooks have these properties:
|
|
.Pp
|
|
.Bl -bullet -compact -offset 2n
|
|
.It
|
|
A hook has an
|
|
.Tn ASCII
|
|
name which is unique among all hooks
|
|
on that node (other hooks on other nodes may have the same name).
|
|
The name must not contain a
|
|
.Dq \&.
|
|
or a
|
|
.Dq \&:
|
|
and is
|
|
limited to
|
|
.Dv "NG_HOOKLEN + 1"
|
|
characters (including NUL byte).
|
|
.It
|
|
A hook is always connected to another hook. That is, hooks are
|
|
created at the time they are connected, and breaking an edge by
|
|
removing either hook destroys both hooks.
|
|
.El
|
|
.Pp
|
|
A node may decide to assign special meaning to some hooks.
|
|
For example, connecting to the hook named
|
|
.Dq debug
|
|
might trigger
|
|
the node to start sending debugging information to that hook.
|
|
.Sh Data Flow
|
|
Two types of information flow between nodes: data messages and
|
|
control messages. Data messages are passed in mbuf chains along the edges
|
|
in the graph, one edge at a time. The first mbuf in a chain must have the
|
|
.Dv M_PKTHDR
|
|
flag set. Each node decides how to handle data coming in on its hooks.
|
|
.Pp
|
|
Control messages are type-specific C structures sent from one node
|
|
directly to some arbitrary other node. Control messages have a common
|
|
header format, followed by type-specific data, and are binary structures
|
|
for efficiency. However, node types also may support conversion of the
|
|
type specific data between binary and
|
|
.Tn ASCII
|
|
for debugging and human interface purposes (see the
|
|
.Dv NGM_ASCII2BINARY
|
|
and
|
|
.Dv NGM_BINARY2ASCII
|
|
generic control messages below). Nodes are not required to support
|
|
these conversions.
|
|
.Pp
|
|
There are two ways to address a control message. If
|
|
there is a sequence of edges connecting the two nodes, the message
|
|
may be
|
|
.Dq source routed
|
|
by specifying the corresponding sequence
|
|
of hooks as the destination address for the message (relative
|
|
addressing). Otherwise, the recipient node global
|
|
.Tn ASCII
|
|
name
|
|
(or equivalent ID based name) is used as the destination address
|
|
for the message (absolute addressing). The two types of addressing
|
|
may be combined, by specifying an absolute start node and a sequence
|
|
of hooks.
|
|
.Pp
|
|
Messages often represent commands that are followed by a reply message
|
|
in the reverse direction. To facilitate this, the recipient of a
|
|
control message is supplied with a
|
|
.Dq return address
|
|
that is suitable
|
|
for addressing a reply.
|
|
.Pp
|
|
Each control message contains a 32 bit value called a
|
|
.Em typecookie
|
|
indicating the type of the message, i.e., how to interpret it.
|
|
Typically each type defines a unique typecookie for the messages
|
|
that it understands. However, a node may choose to recognize and
|
|
implement more than one type of message.
|
|
.Pp
|
|
If message is delivered to an address that implies that it arrived
|
|
at that node through a particular hook, that hook is identified to the
|
|
receiving node. This allows a message to be rerouted or passed on, should
|
|
a node decide that this is required.
|
|
.Sh Netgraph is Functional
|
|
In order to minimize latency, most
|
|
.Nm
|
|
operations are functional.
|
|
That is, data and control messages are delivered by making function
|
|
calls rather than by using queues and mailboxes. For example, if node
|
|
A wishes to send a data mbuf to neighboring node B, it calls the
|
|
generic
|
|
.Nm
|
|
data delivery function. This function in turn locates
|
|
node B and calls B's
|
|
.Dq receive data
|
|
method.
|
|
.Pp
|
|
It is allowable for nodes to reject a data packet, or to pass it back to the
|
|
caller in a modified or completely replaced form. The caller can notify the
|
|
node being called that it does not wish to receive any such packets
|
|
by using the
|
|
.Fn NG_SEND_DATA
|
|
macro, in which case, the second node should just discard rejected packets.
|
|
If the sender knows how to handle returned packets, it must use the
|
|
.Fn NG_SEND_DATA_RET
|
|
macro, which will adjust the parameters to point to the returned data
|
|
or NULL if no data was returned to the caller. No packet return is possible
|
|
across a queuing link (though an explicitly sent return is of course possible,
|
|
it doesn't mean quite the same thing).
|
|
.Pp
|
|
While this mode of operation
|
|
results in good performance, it has a few implications for node
|
|
developers:
|
|
.Pp
|
|
.Bl -bullet -compact -offset 2n
|
|
.It
|
|
Whenever a node delivers a data or control message, the node
|
|
may need to allow for the possibility of receiving a returning
|
|
message before the original delivery function call returns.
|
|
.It
|
|
Netgraph nodes and support routines generally run at
|
|
.Fn splnet .
|
|
However, some nodes may want to send data and control messages
|
|
from a different priority level. Netgraph supplies queueing routines which
|
|
utilize the NETISR system to move message delivery to
|
|
.Fn splnet .
|
|
Nodes that run at other priorities (e.g. interfaces) can be directly
|
|
linked to other nodes so that the combination runs at the other priority,
|
|
however any interaction with nodes running at splnet MUST be achievd via the
|
|
queueing functions, (which use the
|
|
.Fn netisr
|
|
feature of the kernel).
|
|
Note that messages are always received at
|
|
.Fn splnet .
|
|
.It
|
|
It's possible for an infinite loop to occur if the graph contains cycles.
|
|
.El
|
|
.Pp
|
|
So far, these issues have not proven problematical in practice.
|
|
.Sh Interaction With Other Parts of the Kernel
|
|
A node may have a hidden interaction with other components of the
|
|
kernel outside of the
|
|
.Nm
|
|
subsystem, such as device hardware,
|
|
kernel protocol stacks, etc. In fact, one of the benefits of
|
|
.Nm
|
|
is the ability to join disparate kernel networking entities together in a
|
|
consistent communication framework.
|
|
.Pp
|
|
An example is the node type
|
|
.Em socket
|
|
which is both a netgraph node and a
|
|
.Xr socket 2
|
|
BSD socket in the protocol family
|
|
.Dv PF_NETGRAPH .
|
|
Socket nodes allow user processes to participate in
|
|
.Nm Ns .
|
|
Other nodes communicate with socket nodes using the usual methods, and the
|
|
node hides the fact that it is also passing information to and from a
|
|
cooperating user process.
|
|
.Pp
|
|
Another example is a device driver that presents
|
|
a node interface to the hardware.
|
|
.Sh Node Methods
|
|
Nodes are notified of the following actions via function calls
|
|
to the following node methods (all at
|
|
.Fn splnet )
|
|
and may accept or reject that action (by returning the appropriate
|
|
error code):
|
|
.Bl -tag -width xxx
|
|
.It Creation of a new node
|
|
The constructor for the type is called. If creation of a new node is
|
|
allowed, the constructor must call the generic node creation
|
|
function (in object-oriented terms, the superclass constructor)
|
|
and then allocate any special resources it needs. For nodes that
|
|
correspond to hardware, this is typically done during the device
|
|
attach routine. Often a global
|
|
.Tn ASCII
|
|
name corresponding to the
|
|
device name is assigned here as well.
|
|
.It Creation of a new hook
|
|
The hook is created and tentatively
|
|
linked to the node, and the node is told about the name that will be
|
|
used to describe this hook. The node sets up any special data structures
|
|
it needs, or may reject the connection, based on the name of the hook.
|
|
.It Successful connection of two hooks
|
|
After both ends have accepted their
|
|
hooks, and the links have been made, the nodes get a chance to
|
|
find out who their peer is across the link and can then decide to reject
|
|
the connection. Tear-down is automatic.
|
|
.It Destruction of a hook
|
|
The node is notified of a broken connection. The node may consider some hooks
|
|
to be critical to operation and others to be expendable: the disconnection
|
|
of one hook may be an acceptable event while for another it
|
|
may effect a total shutdown for the node.
|
|
.It Shutdown of a node
|
|
This method allows a node to clean up
|
|
and to ensure that any actions that need to be performed
|
|
at this time are taken. The method must call the generic (i.e., superclass)
|
|
node destructor to get rid of the generic components of the node.
|
|
Some nodes (usually associated with a piece of hardware) may be
|
|
.Em persistent
|
|
in that a shutdown breaks all edges and resets the node,
|
|
but doesn't remove it, in which case the generic destructor is not called.
|
|
.El
|
|
.Sh Sending and Receiving Data
|
|
Three other methods are also supported by all nodes:
|
|
.Bl -tag -width xxx
|
|
.It Receive data message
|
|
An mbuf chain is passed to the node.
|
|
The node is notified on which hook the data arrived,
|
|
and can use this information in its processing decision.
|
|
The receiving node must always
|
|
.Fn m_freem
|
|
the mbuf chain on completion or error, pass it back (reject it), or pass
|
|
it on to another node
|
|
(or kernel module) which will then be responsible for freeing it.
|
|
If a node passes a packet back to the caller, it does not have to be the
|
|
same mbuf, in which case the original must be freed. Passing a packet
|
|
back allows a module to modify the original data (e.g. encrypt it),
|
|
or in some other way filter it (e.g. packet filtering).
|
|
.Pp
|
|
In addition to the mbuf chain itself there is also a pointer to a
|
|
structure describing meta-data about the message
|
|
(e.g. priority information). This pointer may be
|
|
.Dv NULL
|
|
if there is no additional information. The format for this information is
|
|
described in
|
|
.Pa netgraph.h .
|
|
The memory for meta-data must allocated via
|
|
.Fn malloc
|
|
with type
|
|
.Dv M_NETGRAPH .
|
|
As with the data itself, it is the receiver's responsibility to
|
|
.Fn free
|
|
the meta-data. If the mbuf chain is freed the meta-data must
|
|
be freed at the same time. If the meta-data is freed but the
|
|
real data on is passed on, then a
|
|
.Dv NULL
|
|
pointer must be substituted.
|
|
Meta-data may be passed back in the same way that mbuf data may be passed back.
|
|
As with mbuf data, the rejected or returned meta-data pointer may point to
|
|
the same or different meta-data as that passed in,
|
|
and if it is different, the original must be freed.
|
|
.Pp
|
|
The receiving node may decide to defer the data by queueing it in the
|
|
.Nm
|
|
NETISR system (see below).
|
|
.Pp
|
|
The structure and use of meta-data is still experimental, but is presently used in
|
|
frame-relay to indicate that management packets should be queued for transmission
|
|
at a higher priority than data packets. This is required for
|
|
conformance with Frame Relay standards.
|
|
.Pp
|
|
.It Receive queued data message
|
|
Usually this will be the same function as
|
|
.Em Receive data message.
|
|
This is the entry point called when a data message is being handed to
|
|
the node after having been queued in the NETISR system.
|
|
This allows a node to decide in the
|
|
.Em Receive data message
|
|
method that a message should be deferred and queued,
|
|
and be sure that when it is processed from the queue,
|
|
it will not be queued again.
|
|
.It Receive control message
|
|
This method is called when a control message is addressed to the node.
|
|
A return address is always supplied, giving the address of the node
|
|
that originated the message so a reply message can be sent anytime later.
|
|
.Pp
|
|
It is possible for a synchronous reply to be made, and in fact this
|
|
is more common in practice.
|
|
This is done by setting a pointer (supplied as an extra function parameter)
|
|
to point to the reply.
|
|
Then when the control message delivery function returns,
|
|
the caller can check if this pointer has been made non-NULL,
|
|
and if so then it points to the reply message allocated via
|
|
.Fn malloc
|
|
and containing the synchronous response. In both directions,
|
|
(request and response) it is up to the
|
|
receiver of that message to
|
|
.Fn free
|
|
the control message buffer. All control messages and replies are
|
|
allocated with
|
|
.Fn malloc
|
|
type
|
|
.Dv M_NETGRAPH .
|
|
.Pp
|
|
If the message was delivered via a specific hook, that hook will
|
|
also be made known, which allows the use of such things as flow-control
|
|
messages, and status change messages, where the node may want to forward
|
|
the message out another hook to that on which it arrived.
|
|
.El
|
|
.Pp
|
|
Much use has been made of reference counts, so that nodes being
|
|
free'd of all references are automatically freed, and this behaviour
|
|
has been tested and debugged to present a consistent and trustworthy
|
|
framework for the
|
|
.Dq type module
|
|
writer to use.
|
|
.Sh Addressing
|
|
The
|
|
.Nm
|
|
framework provides an unambiguous and simple to use method of specifically
|
|
addressing any single node in the graph. The naming of a node is
|
|
independent of its type, in that another node, or external component
|
|
need not know anything about the node's type in order to address it so as
|
|
to send it a generic message type. Node and hook names should be
|
|
chosen so as to make addresses meaningful.
|
|
.Pp
|
|
Addresses are either absolute or relative. An absolute address begins
|
|
with a node name, (or ID), followed by a colon, followed by a sequence of hook
|
|
names separated by periods. This addresses the node reached by starting
|
|
at the named node and following the specified sequence of hooks.
|
|
A relative address includes only the sequence of hook names, implicitly
|
|
starting hook traversal at the local node.
|
|
.Pp
|
|
There are a couple of special possibilities for the node name.
|
|
The name
|
|
.Dq \&.
|
|
(referred to as
|
|
.Dq \&.: )
|
|
always refers to the local node.
|
|
Also, nodes that have no global name may be addressed by their ID numbers,
|
|
by enclosing the hex representation of the ID number within square brackets.
|
|
Here are some examples of valid netgraph addresses:
|
|
.Bd -literal -offset 4n -compact
|
|
|
|
.:
|
|
foo:
|
|
.:hook1
|
|
foo:hook1.hook2
|
|
[f057cd80]:hook1
|
|
.Ed
|
|
.Pp
|
|
Consider the following set of nodes might be created for a site with
|
|
a single physical frame relay line having two active logical DLCI channels,
|
|
with RFC-1490 frames on DLCI 16 and PPP frames over DLCI 20:
|
|
.Pp
|
|
.Bd -literal
|
|
[type SYNC ] [type FRAME] [type RFC1490]
|
|
[ "Frame1" ](uplink)<-->(data)[<un-named>](dlci16)<-->(mux)[<un-named> ]
|
|
[ A ] [ B ](dlci20)<---+ [ C ]
|
|
|
|
|
| [ type PPP ]
|
|
+>(mux)[<un-named>]
|
|
[ D ]
|
|
.Ed
|
|
.Pp
|
|
One could always send a control message to node C from anywhere
|
|
by using the name
|
|
.Em "Frame1:uplink.dlci16" .
|
|
In this case, node C would also be notified that the message
|
|
reached it via its hook
|
|
.Dq mux .
|
|
Similarly,
|
|
.Em "Frame1:uplink.dlci20"
|
|
could reliably be used to reach node D, and node A could refer
|
|
to node B as
|
|
.Em ".:uplink" ,
|
|
or simply
|
|
.Em "uplink" .
|
|
Conversely, B can refer to A as
|
|
.Em "data" .
|
|
The address
|
|
.Em "mux.data"
|
|
could be used by both nodes C and D to address a message to node A.
|
|
.Pp
|
|
Note that this is only for
|
|
.Em control messages .
|
|
In each of these cases, where a relative addressing mode is
|
|
used, the recipient is notified of the hook on which the
|
|
message arrived, as well as
|
|
the originating node.
|
|
This allows the option of hop-by-hop distibution of messages and
|
|
state information.
|
|
Data messages are
|
|
.Em only
|
|
routed one hop at a time, by specifying the departing
|
|
hook, with each node making
|
|
the next routing decision. So when B receives a frame on hook
|
|
.Dq data
|
|
it decodes the frame relay header to determine the DLCI,
|
|
and then forwards the unwrapped frame to either C or D.
|
|
.Pp
|
|
A similar graph might be used to represent multi-link PPP running
|
|
over an ISDN line:
|
|
.Pp
|
|
.Bd -literal
|
|
[ type BRI ](B1)<--->(link1)[ type MPP ]
|
|
[ "ISDN1" ](B2)<--->(link2)[ (no name) ]
|
|
[ ](D) <-+
|
|
|
|
|
+----------------+
|
|
|
|
|
+->(switch)[ type Q.921 ](term1)<---->(datalink)[ type Q.931 ]
|
|
[ (no name) ] [ (no name) ]
|
|
.Ed
|
|
.Sh Netgraph Structures
|
|
Interesting members of the node and hook structures are shown below:
|
|
.Bd -literal
|
|
struct ng_node {
|
|
char *name; /* Optional globally unique name */
|
|
void *private; /* Node implementation private info */
|
|
struct ng_type *type; /* The type of this node */
|
|
int refs; /* Number of references to this struct */
|
|
int numhooks; /* Number of connected hooks */
|
|
hook_p hooks; /* Linked list of (connected) hooks */
|
|
};
|
|
typedef struct ng_node *node_p;
|
|
|
|
struct ng_hook {
|
|
char *name; /* This node's name for this hook */
|
|
void *private; /* Node implementation private info */
|
|
int refs; /* Number of references to this struct */
|
|
struct ng_node *node; /* The node this hook is attached to */
|
|
struct ng_hook *peer; /* The other hook in this connected pair */
|
|
struct ng_hook *next; /* Next in list of hooks for this node */
|
|
};
|
|
typedef struct ng_hook *hook_p;
|
|
.Ed
|
|
.Pp
|
|
The maintenance of the name pointers, reference counts, and linked list
|
|
of hooks for each node is handled automatically by the
|
|
.Nm
|
|
subsystem.
|
|
Typically a node's private info contains a back-pointer to the node or hook
|
|
structure, which counts as a new reference that must be registered by
|
|
incrementing
|
|
.Dv "node->refs" .
|
|
.Pp
|
|
From a hook you can obtain the corresponding node, and from
|
|
a node the list of all active hooks.
|
|
.Pp
|
|
Node types are described by these structures:
|
|
.Bd -literal
|
|
/** How to convert a control message from binary <-> ASCII */
|
|
struct ng_cmdlist {
|
|
u_int32_t cookie; /* typecookie */
|
|
int cmd; /* command number */
|
|
const char *name; /* command name */
|
|
const struct ng_parse_type *mesgType; /* args if !NGF_RESP */
|
|
const struct ng_parse_type *respType; /* args if NGF_RESP */
|
|
};
|
|
|
|
struct ng_type {
|
|
u_int32_t version; /* Must equal NG_VERSION */
|
|
const char *name; /* Unique type name */
|
|
|
|
/* Module event handler */
|
|
modeventhand_t mod_event; /* Handle load/unload (optional) */
|
|
|
|
/* Constructor */
|
|
int (*constructor)(node_p *node); /* Create a new node */
|
|
|
|
/** Methods using the node **/
|
|
int (*rcvmsg)(node_p node, /* Receive control message */
|
|
struct ng_mesg *msg, /* The message */
|
|
const char *retaddr, /* Return address */
|
|
struct ng_mesg **resp /* Synchronous response */
|
|
hook_p lasthook); /* last hook traversed */
|
|
int (*shutdown)(node_p node); /* Shutdown this node */
|
|
int (*newhook)(node_p node, /* create a new hook */
|
|
hook_p hook, /* Pre-allocated struct */
|
|
const char *name); /* Name for new hook */
|
|
|
|
/** Methods using the hook **/
|
|
int (*connect)(hook_p hook); /* Confirm new hook attachment */
|
|
int (*rcvdata)(hook_p hook, /* Receive data on a hook */
|
|
struct mbuf *m, /* The data in an mbuf */
|
|
meta_p meta, /* Meta-data, if any */
|
|
struct mbuf **ret_m, /* return data here */
|
|
meta_p *ret_meta); /* return Meta-data here */
|
|
int (*disconnect)(hook_p hook); /* Notify disconnection of hook */
|
|
|
|
/** How to convert control messages binary <-> ASCII */
|
|
const struct ng_cmdlist *cmdlist; /* Optional; may be NULL */
|
|
};
|
|
.Ed
|
|
.Pp
|
|
Control messages have the following structure:
|
|
.Bd -literal
|
|
#define NG_CMDSTRLEN 15 /* Max command string (16 with null) */
|
|
|
|
struct ng_mesg {
|
|
struct ng_msghdr {
|
|
u_char version; /* Must equal NG_VERSION */
|
|
u_char spare; /* Pad to 2 bytes */
|
|
u_short arglen; /* Length of cmd/resp data */
|
|
u_long flags; /* Message status flags */
|
|
u_long token; /* Reply should have the same token */
|
|
u_long typecookie; /* Node type understanding this message */
|
|
u_long cmd; /* Command identifier */
|
|
u_char cmdstr[NG_CMDSTRLEN+1]; /* Cmd string (for debug) */
|
|
} header;
|
|
char data[0]; /* Start of cmd/resp data */
|
|
};
|
|
|
|
#define NG_VERSION 1 /* Netgraph version */
|
|
#define NGF_ORIG 0x0000 /* Command */
|
|
#define NGF_RESP 0x0001 /* Response */
|
|
.Ed
|
|
.Pp
|
|
Control messages have the fixed header shown above, followed by a
|
|
variable length data section which depends on the type cookie
|
|
and the command. Each field is explained below:
|
|
.Bl -tag -width xxx
|
|
.It Dv version
|
|
Indicates the version of netgraph itself. The current version is
|
|
.Dv NG_VERSION .
|
|
.It Dv arglen
|
|
This is the length of any extra arguments, which begin at
|
|
.Dv data .
|
|
.It Dv flags
|
|
Indicates whether this is a command or a response control message.
|
|
.It Dv token
|
|
The
|
|
.Dv token
|
|
is a means by which a sender can match a reply message to the
|
|
corresponding command message; the reply always has the same token.
|
|
.Pp
|
|
.It Dv typecookie
|
|
The corresponding node type's unique 32-bit value.
|
|
If a node doesn't recognize the type cookie it must reject the message
|
|
by returning
|
|
.Er EINVAL .
|
|
.Pp
|
|
Each type should have an include file that defines the commands,
|
|
argument format, and cookie for its own messages.
|
|
The typecookie
|
|
insures that the same header file was included by both sender and
|
|
receiver; when an incompatible change in the header file is made,
|
|
the typecookie
|
|
.Em must
|
|
be changed.
|
|
The de facto method for generating unique type cookies is to take the
|
|
seconds from the epoch at the time the header file is written
|
|
(i.e., the output of
|
|
.Dv "date -u +'%s'" ) .
|
|
.Pp
|
|
There is a predefined typecookie
|
|
.Dv NGM_GENERIC_COOKIE
|
|
for the
|
|
.Dq generic
|
|
node type, and
|
|
a corresponding set of generic messages which all nodes understand.
|
|
The handling of these messages is automatic.
|
|
.It Dv command
|
|
The identifier for the message command. This is type specific,
|
|
and is defined in the same header file as the typecookie.
|
|
.It Dv cmdstr
|
|
Room for a short human readable version of
|
|
.Dq command
|
|
(for debugging purposes only).
|
|
.El
|
|
.Pp
|
|
Some modules may choose to implement messages from more than one
|
|
of the header files and thus recognize more than one type cookie.
|
|
.Sh Control Message ASCII Form
|
|
Control messages are in binary format for efficiency. However, for
|
|
debugging and human interface purposes, and if the node type supports
|
|
it, control messages may be converted to and from an equivalent
|
|
.Tn ASCII
|
|
form. The
|
|
.Tn ASCII
|
|
form is similar to the binary form, with two exceptions:
|
|
.Pp
|
|
.Bl -tag -compact -width xxx
|
|
.It o
|
|
The
|
|
.Dv cmdstr
|
|
header field must contain the
|
|
.Tn ASCII
|
|
name of the command, corresponding to the
|
|
.Dv cmd
|
|
header field.
|
|
.It o
|
|
The
|
|
.Dv args
|
|
field contains a NUL-terminated
|
|
.Tn ASCII
|
|
string version of the message arguments.
|
|
.El
|
|
.Pp
|
|
In general, the arguments field of a control messgage can be any
|
|
arbitrary C data type. Netgraph includes parsing routines to support
|
|
some pre-defined datatypes in
|
|
.Tn ASCII
|
|
with this simple syntax:
|
|
.Pp
|
|
.Bl -tag -compact -width xxx
|
|
.It o
|
|
Integer types are represented by base 8, 10, or 16 numbers.
|
|
.It o
|
|
Strings are enclosed in double quotes and respect the normal
|
|
C language backslash escapes.
|
|
.It o
|
|
IP addresses have the obvious form.
|
|
.It o
|
|
Arrays are enclosed in square brackets, with the elements listed
|
|
consecutively starting at index zero. An element may have an optional
|
|
index and equals sign preceeding it. Whenever an element
|
|
does not have an explicit index, the index is implicitly the previous
|
|
element's index plus one.
|
|
.It o
|
|
Structures are enclosed in curly braces, and each field is specified
|
|
in the form
|
|
.Dq fieldname=value .
|
|
.It o
|
|
Any array element or structure field whose value is equal to its
|
|
.Dq default value
|
|
may be omitted. For integer types, the default value
|
|
is usually zero; for string types, the empty string.
|
|
.It o
|
|
Array elements and structure fields may be specified in any order.
|
|
.El
|
|
.Pp
|
|
Each node type may define its own arbitrary types by providing
|
|
the necessary routines to parse and unparse.
|
|
.Tn ASCII
|
|
forms defined
|
|
for a specific node type are documented in the documentation for
|
|
that node type.
|
|
.Sh Generic Control Messages
|
|
There are a number of standard predefined messages that will work
|
|
for any node, as they are supported directly by the framework itself.
|
|
These are defined in
|
|
.Pa ng_message.h
|
|
along with the basic layout of messages and other similar information.
|
|
.Bl -tag -width xxx
|
|
.It Dv NGM_CONNECT
|
|
Connect to another node, using the supplied hook names on either end.
|
|
.It Dv NGM_MKPEER
|
|
Construct a node of the given type and then connect to it using the
|
|
supplied hook names.
|
|
.It Dv NGM_SHUTDOWN
|
|
The target node should disconnect from all its neighbours and shut down.
|
|
Persistent nodes such as those representing physical hardware
|
|
might not disappear from the node namespace, but only reset themselves.
|
|
The node must disconnect all of its hooks.
|
|
This may result in neighbors shutting themselves down, and possibly a
|
|
cascading shutdown of the entire connected graph.
|
|
.It Dv NGM_NAME
|
|
Assign a name to a node. Nodes can exist without having a name, and this
|
|
is the default for nodes created using the
|
|
.Dv NGM_MKPEER
|
|
method. Such nodes can only be addressed relatively or by their ID number.
|
|
.It Dv NGM_RMHOOK
|
|
Ask the node to break a hook connection to one of its neighbours.
|
|
Both nodes will have their
|
|
.Dq disconnect
|
|
method invoked.
|
|
Either node may elect to totally shut down as a result.
|
|
.It Dv NGM_NODEINFO
|
|
Asks the target node to describe itself. The four returned fields
|
|
are the node name (if named), the node type, the node ID and the
|
|
number of hooks attached. The ID is an internal number unique to that node.
|
|
.It Dv NGM_LISTHOOKS
|
|
This returns the information given by
|
|
.Dv NGM_NODEINFO ,
|
|
but in addition
|
|
includes an array of fields describing each link, and the description for
|
|
the node at the far end of that link.
|
|
.It Dv NGM_LISTNAMES
|
|
This returns an array of node descriptions (as for
|
|
.Dv NGM_NODEINFO ")"
|
|
where each entry of the array describes a named node.
|
|
All named nodes will be described.
|
|
.It Dv NGM_LISTNODES
|
|
This is the same as
|
|
.Dv NGM_LISTNAMES
|
|
except that all nodes are listed regardless of whether they have a name or not.
|
|
.It Dv NGM_LISTTYPES
|
|
This returns a list of all currently installed netgraph types.
|
|
.It Dv NGM_TEXT_STATUS
|
|
The node may return a text formatted status message.
|
|
The status information is determined entirely by the node type.
|
|
It is the only "generic" message
|
|
that requires any support within the node itself and as such the node may
|
|
elect to not support this message. The text response must be less than
|
|
.Dv NG_TEXTRESPONSE
|
|
bytes in length (presently 1024). This can be used to return general
|
|
status information in human readable form.
|
|
.It Dv NGM_BINARY2ASCII
|
|
This message converts a binary control message to its
|
|
.Tn ASCII
|
|
form.
|
|
The entire control message to be converted is contained within the
|
|
arguments field of the
|
|
.Dv Dv NGM_BINARY2ASCII
|
|
message itself. If successful, the reply will contain the same control
|
|
message in
|
|
.Tn ASCII
|
|
form.
|
|
A node will typically only know how to translate messages that it
|
|
itself understands, so the target node of the
|
|
.Dv NGM_BINARY2ASCII
|
|
is often the same node that would actually receive that message.
|
|
.It Dv NGM_ASCII2BINARY
|
|
The opposite of
|
|
.Dv NGM_BINARY2ASCII .
|
|
The entire control message to be converted, in
|
|
.Tn ASCII
|
|
form, is contained
|
|
in the arguments section of the
|
|
.Dv NGM_ASCII2BINARY
|
|
and need only have the
|
|
.Dv flags ,
|
|
.Dv cmdstr ,
|
|
and
|
|
.Dv arglen
|
|
header fields filled in, plus the NUL-terminated string version of
|
|
the arguments in the arguments field. If successful, the reply
|
|
contains the binary version of the control message.
|
|
.El
|
|
.Sh Metadata
|
|
Data moving through the
|
|
.Nm
|
|
system can be accompanied by meta-data that describes some
|
|
aspect of that data. The form of the meta-data is a fixed header,
|
|
which contains enough information for most uses, and can optionally
|
|
be supplemented by trailing
|
|
.Em option
|
|
structures, which contain a
|
|
.Em cookie
|
|
(see the section on control messages), an identifier, a length and optional
|
|
data. If a node does not recognize the cookie associated with an option,
|
|
it should ignore that option.
|
|
.Pp
|
|
Meta data might include such things as priority, discard eligibility,
|
|
or special processing requirements. It might also mark a packet for
|
|
debug status, etc. The use of meta-data is still experimental.
|
|
.Sh INITIALIZATION
|
|
The base
|
|
.Nm
|
|
code may either be statically compiled
|
|
into the kernel or else loaded dynamically as a KLD via
|
|
.Xr kldload 8 .
|
|
In the former case, include
|
|
.Bd -literal -offset 4n -compact
|
|
|
|
options NETGRAPH
|
|
|
|
.Ed
|
|
in your kernel configuration file. You may also include selected
|
|
node types in the kernel compilation, for example:
|
|
.Bd -literal -offset 4n -compact
|
|
|
|
options NETGRAPH
|
|
options NETGRAPH_SOCKET
|
|
options NETGRAPH_ECHO
|
|
|
|
.Ed
|
|
.Pp
|
|
Once the
|
|
.Nm
|
|
subsystem is loaded, individual node types may be loaded at any time
|
|
as KLD modules via
|
|
.Xr kldload 8 .
|
|
Moreover,
|
|
.Nm
|
|
knows how to automatically do this; when a request to create a new
|
|
node of unknown type
|
|
.Em type
|
|
is made,
|
|
.Nm
|
|
will attempt to load the KLD module
|
|
.Pa ng_type.ko .
|
|
.Pp
|
|
Types can also be installed at boot time, as certain device drivers
|
|
may want to export each instance of the device as a netgraph node.
|
|
.Pp
|
|
In general, new types can be installed at any time from within the
|
|
kernel by calling
|
|
.Fn ng_newtype ,
|
|
supplying a pointer to the type's
|
|
.Dv struct ng_type
|
|
structure.
|
|
.Pp
|
|
The
|
|
.Fn NETGRAPH_INIT
|
|
macro automates this process by using a linker set.
|
|
.Sh EXISTING NODE TYPES
|
|
Several node types currently exist. Each is fully documented
|
|
in its own man page:
|
|
.Bl -tag -width xxx
|
|
.It SOCKET
|
|
The socket type implements two new sockets in the new protocol domain
|
|
.Dv PF_NETGRAPH .
|
|
The new sockets protocols are
|
|
.Dv NG_DATA
|
|
and
|
|
.Dv NG_CONTROL ,
|
|
both of type
|
|
.Dv SOCK_DGRAM .
|
|
Typically one of each is associated with a socket node.
|
|
When both sockets have closed, the node will shut down. The
|
|
.Dv NG_DATA
|
|
socket is used for sending and receiving data, while the
|
|
.Dv NG_CONTROL
|
|
socket is used for sending and receiving control messages.
|
|
Data and control messages are passed using the
|
|
.Xr sendto 2
|
|
and
|
|
.Xr recvfrom 2
|
|
calls, using a
|
|
.Dv struct sockaddr_ng
|
|
socket address.
|
|
.Pp
|
|
.It HOLE
|
|
Responds only to generic messages and is a
|
|
.Dq black hole
|
|
for data, Useful for testing. Always accepts new hooks.
|
|
.Pp
|
|
.It ECHO
|
|
Responds only to generic messages and always echoes data back through the
|
|
hook from which it arrived. Returns any non generic messages as their
|
|
own response. Useful for testing. Always accepts new hooks.
|
|
.Pp
|
|
.It TEE
|
|
This node is useful for
|
|
.Dq snooping .
|
|
It has 4 hooks:
|
|
.Dv left ,
|
|
.Dv right ,
|
|
.Dv left2right ,
|
|
and
|
|
.Dv right2left .
|
|
Data entering from the right is passed to the left and duplicated on
|
|
.Dv right2left,
|
|
and data entering from the left is passed to the right and
|
|
duplicated on
|
|
.Dv left2right .
|
|
Data entering from
|
|
.Dv left2right
|
|
is sent to the right and data from
|
|
.Dv right2left
|
|
to left.
|
|
.Pp
|
|
.It RFC1490 MUX
|
|
Encapsulates/de-encapsulates frames encoded according to RFC 1490.
|
|
Has a hook for the encapsulated packets
|
|
.Pq Dq downstream
|
|
and one hook
|
|
for each protocol (i.e., IP, PPP, etc.).
|
|
.Pp
|
|
.It FRAME RELAY MUX
|
|
Encapsulates/de-encapsulates Frame Relay frames.
|
|
Has a hook for the encapsulated packets
|
|
.Pq Dq downstream
|
|
and one hook
|
|
for each DLCI.
|
|
.Pp
|
|
.It FRAME RELAY LMI
|
|
Automatically handles frame relay
|
|
.Dq LMI
|
|
(link management interface) operations and packets.
|
|
Automatically probes and detects which of several LMI standards
|
|
is in use at the exchange.
|
|
.Pp
|
|
.It TTY
|
|
This node is also a line discipline. It simply converts between mbuf
|
|
frames and sequential serial data, allowing a tty to appear as a netgraph
|
|
node. It has a programmable
|
|
.Dq hotkey
|
|
character.
|
|
.Pp
|
|
.It ASYNC
|
|
This node encapsulates and de-encapsulates asynchronous frames
|
|
according to RFC 1662. This is used in conjunction with the TTY node
|
|
type for supporting PPP links over asynchronous serial lines.
|
|
.Pp
|
|
.It INTERFACE
|
|
This node is also a system networking interface. It has hooks representing
|
|
each protocol family (IP, AppleTalk, IPX, etc.) and appears in the output of
|
|
.Xr ifconfig 8 .
|
|
The interfaces are named
|
|
.Em ng0 ,
|
|
.Em ng1 ,
|
|
etc.
|
|
.El
|
|
.Sh NOTES
|
|
Whether a named node exists can be checked by trying to send a control message
|
|
to it (e.g.,
|
|
.Dv NGM_NODEINFO
|
|
).
|
|
If it does not exist,
|
|
.Er ENOENT
|
|
will be returned.
|
|
.Pp
|
|
All data messages are mbuf chains with the M_PKTHDR flag set.
|
|
.Pp
|
|
Nodes are responsible for freeing what they allocate.
|
|
There are three exceptions:
|
|
.Bl -tag -width xxxx
|
|
.It 1
|
|
Mbufs sent across a data link are never to be freed by the sender,
|
|
unless it is returned from the recipient.
|
|
.It 2
|
|
Any meta-data information traveling with the data has the same restriction.
|
|
It might be freed by any node the data passes through, and a
|
|
.Dv NULL
|
|
passed onwards, but the caller will never free it.
|
|
Two macros
|
|
.Fn NG_FREE_META "meta"
|
|
and
|
|
.Fn NG_FREE_DATA "m" "meta"
|
|
should be used if possible to free data and meta data (see
|
|
.Pa netgraph.h ) .
|
|
.It 3
|
|
Messages sent using
|
|
.Fn ng_send_message
|
|
are freed by the recipient. As in the case above, the addresses
|
|
associated with the message are freed by whatever allocated them so the
|
|
recipient should copy them if it wants to keep that information.
|
|
.El
|
|
.Sh FILES
|
|
.Bl -tag -width xxxxx -compact
|
|
.It Pa /sys/netgraph/netgraph.h
|
|
Definitions for use solely within the kernel by
|
|
.Nm
|
|
nodes.
|
|
.It Pa /sys/netgraph/ng_message.h
|
|
Definitions needed by any file that needs to deal with
|
|
.Nm
|
|
messages.
|
|
.It Pa /sys/netgraph/ng_socket.h
|
|
Definitions needed to use
|
|
.Nm
|
|
socket type nodes.
|
|
.It Pa /sys/netgraph/ng_{type}.h
|
|
Definitions needed to use
|
|
.Nm
|
|
{type}
|
|
nodes, including the type cookie definition.
|
|
.It Pa /modules/netgraph.ko
|
|
Netgraph subsystem loadable KLD module.
|
|
.It Pa /modules/ng_{type}.ko
|
|
Loadable KLD module for node type {type}.
|
|
.El
|
|
.Sh USER MODE SUPPORT
|
|
There is a library for supporting user-mode programs that wish
|
|
to interact with the netgraph system. See
|
|
.Xr netgraph 3
|
|
for details.
|
|
.Pp
|
|
Two user-mode support programs,
|
|
.Xr ngctl 8
|
|
and
|
|
.Xr nghook 8 ,
|
|
are available to assist manual configuration and debugging.
|
|
.Pp
|
|
There are a few useful techniques for debugging new node types.
|
|
First, implementing new node types in user-mode first
|
|
makes debugging easier.
|
|
The
|
|
.Em tee
|
|
node type is also useful for debugging, especially in conjunction with
|
|
.Xr ngctl 8
|
|
and
|
|
.Xr nghook 8 .
|
|
.Sh SEE ALSO
|
|
.Xr socket 2 ,
|
|
.Xr netgraph 3 ,
|
|
.Xr ng_async 4 ,
|
|
.Xr ng_bpf 4 ,
|
|
.Xr ng_cisco 4 ,
|
|
.Xr ng_ether 4 ,
|
|
.Xr ng_echo 4 ,
|
|
.Xr ng_frame_relay 4 ,
|
|
.Xr ng_hole 4 ,
|
|
.Xr ng_iface 4 ,
|
|
.Xr ng_ksocket 4 ,
|
|
.Xr ng_lmi 4 ,
|
|
.Xr ng_mppc 4 ,
|
|
.Xr ng_ppp 4 ,
|
|
.Xr ng_pppoe 4 ,
|
|
.Xr ng_rfc1490 4 ,
|
|
.Xr ng_socket 4 ,
|
|
.Xr ng_tee 4 ,
|
|
.Xr ng_tty 4 ,
|
|
.Xr ng_UI 4 ,
|
|
.Xr ng_vjc 4 ,
|
|
.Xr ng_{type} 4 ,
|
|
.Xr ngctl 8 ,
|
|
.Xr nghook 8
|
|
.Sh HISTORY
|
|
The
|
|
.Nm
|
|
system was designed and first implemented at Whistle Communications, Inc.
|
|
in a version of
|
|
.Fx 2.2
|
|
customized for the Whistle InterJet.
|
|
It first made its debut in the main tree in
|
|
.Fx 3.4 .
|
|
.Sh AUTHORS
|
|
.An Julian Elischer Aq julian@FreeBSD.org ,
|
|
with contributions by
|
|
.An Archie Cobbs Aq archie@FreeBSD.org .
|