times consequently, without checking whether callout has been serviced
or not. (ng_pptpgre and ng_ppp were catched in this behavior).
- In ng_callout() save old item before calling callout_reset(). If the
latter has returned 1, then free this item.
- In ng_uncallout() clear c->c_arg.
Problem reported by: Alexandre Kardanev
First, mutexed callouts are incompatible with netgraph nodes, because
netgraph(4) can guarantee that the function will be called with mutex
held.
Second, nodes should not send data to their neighbor holding their
mutex. A node does not know what stack can it enter sending data in
some direction. May be executing will encounter a place to sleep.
New locking:
- ng_pptpgre_recv() and ng_pptpgre_xmit() must be entered with mutex held.
- ng_pptpgre_recv() and ng_pptpgre_xmit() unlock mutex before
sending data and then return unlocked.
- callout routines acquire mutex themselves.
does not clear m_nextpkt for us. The mbufs are sent into netgraph and
then, if they contain a TCP packet delivered locally, they will enter
socket code again. They can pass the first assert in sbappendstream()
because m_nextpkt may be set not in the first mbuf, but deeper in the
chain. So the problem will trigger much later, when local program
reads the data from socket, and an mbuf with m_nextpkt becomes a
first one.
This bug was demasked by revision 1.54, when I made upcall queueable.
Before revision 1.54 there was a very small probability to have 2
mbufs in GRE socket buffer, because ng_ksocket_incoming2() dequeued
the first one immediately.
- in ng_ksocket_incoming2() clear m_nextpkt on all mbufs
read from socket.
- restore rev. 1.54 change in ng_ksocket_incoming().
PR: kern/84952
PR: kern/82413
In collaboration with: rwatson
panic. The panic happens when outgoing L2CAP connection descriptor is
deleted with the L2CAP command(s) pending in the queue. In this case when
the last L2CAP command is deleted (due to cleanup) and reference counter
for the L2CAP connection goes down to zero the auto disconnect timeout
is incorrectly set. pjd gets credit for tracking this down and committing
bandaid.
Reported by: Jonatan B <onatan at gmail dot com>
MFC after: 3 days
parallel ng_pptp_rcvdata():
- Add a per-node mutex.
- Acquire mutex during all ng_pptp_rcvdata() method.
- Make callouts protected by mutex. Now callouts count as
netgraph writers, but there are plans to allow reader callouts
for nodes, that have internal locking.
- Acquire mutex in ng_pptp_reset(), which can be triggered
by a message or node shutdown.
PR: kern/80035
Tested by: Deomid Ryabkov <myself rojer.pp.ru>
Reviewed by: Deomid Ryabkov <myself rojer.pp.ru>
either reader or writer flag on item in the function, that
allocates the item. Do not modify these flags when item is
applied or queued.
The only exceptions are node and hook overrides - they can
change item flags to writer.
the code, i.e. ng_fec_init() is called with the ifp->if_softc pointer and
NOT with the ifp pointer.
PR: kern/85239
Reviewed by: brooks
MFC after: 1 day
it fixes. I believe the problem lives somewhere outside ng_ksocket,
but until it is found, let the node be working.
PR: kern/84952
PR: kern/82413
MFC after: 3 days
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.
Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.
Reviewed by: pjd, bz
MFC after: 7 days
data link type of the hook. It will be used to ease autoconfiguration
of netgraph and also to print warning messages, when incompatoble nodes
are connected together.
At the end of ng_snd_item(), node queue is processed. In certain
netgraph setups deep recursive calls can occur.
For example this happens, when two nodes are connected and can send
items to each other in both directions. If, for some reason, both nodes
have a lot of items in their queues, then the processing thread will
recurse between these two nodes, delivering items left and right, going
deeper in the stack. Other setups can suffer from deep recursion, too.
The following factors can influence risk of deep netgraph call:
- periodical write-access events on node
- combination of slow link and fast one in one graph
- net.inet.ip.fastforwarding
Changes made:
- In ng_acquire_{read,write}() do not dequeue another item. Instead,
call ng_setisr() for this node.
- At the end of ng_snd_item(), do not process queue. Call ng_setisr(),
if there are any dequeueable items on node queue.
- In ng_setisr() narrow worklist mutex holding.
- In ng_setisr() assert queue mutex.
Theoretically, the first two changes should negatively affect performance.
To check this, some profiling was made:
1) In general real tasks, no noticable performance difference was found.
2) The following test was made: two multithreaded nodes and one
single-threaded were connected into a ring. A large queues of packets
were sent around this ring. Time to pass the ring N times was measured.
This is a very vacuous test: no items/mbufs are allocated, no upcalls or
downcalls outside of netgraph. It doesn't represent a real load, it is
a stress test for ng_acquire_{read,write}() and item queueing functions.
Surprisingly, the performance impact was positive! New code is 13% faster
on UP and 17% faster on SMP, in this particular test.
The problem was originally found, described, analyzed and original patch
was written by Roselyn Lee from Vernier Networks. Thanks!
Submitted by: Roselyn Lee <rosel verniernetworks com>
It does not work with ng_ubt(4) and require special driver and firmware.
Obtained from: Marcel Holtmann < marcel at holtmann dot org >
Submitted by: Rainer Goellner < rainer at jabbe dot de >
MFC after: 3 days
there are at least two versions of the adapter. Version 1 (product ID 0x2200)
of the adapter does not work with ng_ubt(4) and require special driver and
firmware. Version 2 (product ID 0x3800) seems to work just fine, except it
does not have bDeviceClass, bDeviceSubClass and bDeviceProtocol set to required
(by specification) values. This change forces ng_ubt(4) to attach to the
version 2 adapter.
Obtained from: Marcel Holtmann <marcel at holtmann dot org>
Submitted by: Rainer Goellner <rainer at jabbe dot de>
PPPoE modes. The interface was declared obsoleted before 5.3-RELEASE.
When running as access concentrator ng_pppoe(4) supports both modes
simultanously. When running as client mode can be swicthed in ppp(8)
configuration.
Approved by: re (scottl)
an item may be queued and processed later. While this is OK for mbufs,
this is a problem for control messages.
In the framework:
- Add optional callback function pointer to an item. When item gets
applied the callback is executed from ng_apply_item().
- Add new flag NG_PROGRESS. If this flag is supplied, then return
EINPROGRESS instead of 0 in case if item failed to deliver
synchronously and was queued.
- Honor NG_PROGRESS in ng_snd_item().
In ng_socket:
- When userland sends control message add callback to the item.
- If ng_snd_item() returns EINPROGRESS, then sleep.
This change fixes possible races in ngctl(8) scripts.
Reviewed by: julian
Approved by: re (scottl)
a DLT_NULL interface. In particular:
1) Consistently use type u_int32_t for the header of a
DLT_NULL device - it continues to represent the address
family as always.
2) In the DLT_NULL case get bpf_movein to store the u_int32_t
in a sockaddr rather than in the mbuf, to be consistent
with all the DLT types.
3) Consequently fix a bug in bpf_movein/bpfwrite which
only permitted packets up to 4 bytes less than the MTU
to be written.
4) Fix all DLT_NULL devices to have the code required to
allow writing to their bpf devices.
5) Move the code to allow writing to if_lo from if_simloop
to looutput, because it only applies to DLT_NULL devices
but was being applied to other devices that use if_simloop
possibly incorrectly.
PR: 82157
Submitted by: Matthew Luckie <mjl@luckie.org.nz>
Approved by: re (scottl)
Provide a backwards compatible way to have the extra macro by defining
PCCARD_API_LEVEL 5 before including pccarddevs for driver writers that
want/need to have the same driver on 5 and 6 with pccard attachments.
Approved by: re (dwhite)
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.
This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.
Other changes of note:
- Struct arpcom is no longer referenced in normal interface code.
Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
To enforce this ac_enaddr has been renamed to _ac_enaddr.
- The second argument to ether_ifattach is now always the mac address
from driver private storage rather than sometimes being ac_enaddr.
Reviewed by: sobomax, sam
- Do not edit pullup_len outside M_CHECK macro.
- Do not reimplement NG_FWD_NEW_DATA().
- Remove redundant check for item being not NULL.
Submitted by: ru
hack MSS of packets outgoing via interface with small MTU, to workaround
path MTU discovery problems.
Written by Alexey Popov, with some cleanups from me. There are also plans
to improve mpd port, so that it uses this node, instead of doing MSS
hacking in userland, when 'enable tcpmssfix' option is on.
Submitted by: Alexey Popov <lollypop@flexuser.ru>
Note: len gets intialized to 0 for sap == NULL case only to
make compiler on amd64 happy. This has nothing todo with the
former uninitialized use of len in sap != NULL case.
Reviewed by: glebius
Approved by: pjd (mentor)
specified by caller.
- Change ng_send_item() interface - use 'flags' argument instead of
boolean 'queue'.
- Extend ng_send_fn(), ng_package_data() and ng_package_msg()
interface - add possibility to pass flags. Rename ng_send_fn() to
ng_send_fn1(). Create macro for ng_send_fn().
- Update all macros, that use ng_package_data() and ng_package_msg().
Reviewed by: julian
The most significant changes are:
- Use UMA zone instead of own chunk of memory.
- Lock each hash entry separately.
- Expire items "actively" - interrupt method can expire flows
from hash slot, when it searches through it.
- Remove global tailqueue. Make callout thread search through
every hash slot.
- Export datagram is detached from private data and filled. If
it is incomplete, it is attached back. Another thread will
continue working with it.
Lesser, but also important speedups:
- Flows in hash slot are stored in tailqueue. Whenever a flow is
hit, it is moved to the begging, so it can be located quicker.
- When callout thread works with hash slot it bails out if
slot mutex is contested.
to the mbuf. Offset cannot exceed MHLEN bytes. This is currently used to
fix Ethernet header alignment problem on alpha and sparc64. Also change all
users of m_uiotombuf to pass proper offset.
Reviewed by: jmg, sam
Tested by: Sten Spans "sten AT blinkenlights DOT nl"
MFC after: 1 week
protocol. RFCOMM is a SOCK_STREAM protocol not SOCK_SEQPACKET. This was a
serious bug caused by cut-and-paste. I'm surprised it did not bite me before.
Dunce hat goes to me.
MFC after: 3 days
EA bit is set in hdr->length (16-bit length). This currently has no effect
on the rest of the code. It just fixes the debug message.
MFC After: 3 weeks
Functional changes:
- Cut struct source_hookinfo. Just use hook_p pointer.
- Remove "start_now" command. "start" command now requires number of
packets to send as argument. "start" command actually starts sending.
Move the code that actually starts sending from ng_source_rcvmsg()
to ng_source_start().
- Remove check for NG_SOURCE_ACTIVE in ng_source_stop(). We can be called
with flag cleared (see begin of ng_source_intr()).
- If NG_SEND_DATA_ONLY() use log(LOG_DEBUG) instead of printf(). Otherwise
we will *flood* console.
- Add ng_connect_t method, which sends NGM_ETHER_GET_IFNAME command
to "output" hook. Cut ng_source_request_output_ifp(). Refactor
ng_source_store_output_ifp() to use ifunit() and don't muck through
interface list.
- Add "setiface" command, which gives ability to configure interface
in case when ng_source_connect() failed. This happens, when we are not
connected directly to ng_ether(4) node.
- Remove KASSERTs, which can never fire.
- Don't check for M_PKTHDR in rcvdata method. netgraph(4) does this
for us.
Style:
- Assign sc_p = NG_NODE_PRIVATE(node) in declaration, to be
consistent with style of other nodes.
- Sort variables.
- u_intXX -> uintXX.
- Dots at ends of comments.
Sponsored by: Rambler
be pass-thru mode, when traffic is not copied by ng_tee, but passed thru
ng_netflow.
Changes made:
- In ng_netflow_rcvdata() do all necessary pulluping: Ethernet header,
IP header, and TCP/UDP header.
- Pass only pointer to struct ip to ng_netflow_flow_add(). Any TCP/UDP
headers are guaranteed to by after it.
- Merge make_flow_rec() function into ng_netflow_flow_add().
be pass-thru mode, when traffic is not copied by ng_tee, but passed thru
ng_netflow.
Changes made:
- In ng_netflow_rcvdata() do all necessary pulluping: Ethernet header,
IP header, and TCP/UDP header.
- Pass only pointer to struct ip to ng_netflow_flow_add(). Any TCP/UDP
headers are guaranteed to by after it.
- Merge make_flow_rec() function into ng_netflow_flow_add().
precision when IP packet may travel through internet for several seconds.
Also uptime measured in milliseconds overflows every 48+ days.
But we have to do same to keep compatibility with Cisco and flow-tools.
Make a macro MILLIUPTIME, which does overflowable multiplication to 1000.
Requested by: Sergey Ryabin, Oleg Bulyzhin
MFC after: 1 week
its return value and free resources if function returns error. Plug
several memory leaks with this change.
Submitted by: archie
Found by: Coverity Prevent analysis tool
a socket from a regular socket to a listening socket able to accept new
connections. As part of this state transition, solisten() calls into the
protocol to update protocol-layer state. There were several bugs in this
implementation that could result in a race wherein a TCP SYN received
in the interval between the protocol state transition and the shortly
following socket layer transition would result in a panic in the TCP code,
as the socket would be in the TCPS_LISTEN state, but the socket would not
have the SO_ACCEPTCONN flag set.
This change does the following:
- Pushes the socket state transition from the socket layer solisten() to
to socket "library" routines called from the protocol. This permits
the socket routines to be called while holding the protocol mutexes,
preventing a race exposing the incomplete socket state transition to TCP
after the TCP state transition has completed. The check for a socket
layer state transition is performed by solisten_proto_check(), and the
actual transition is performed by solisten_proto().
- Holds the socket lock for the duration of the socket state test and set,
and over the protocol layer state transition, which is now possible as
the socket lock is acquired by the protocol layer, rather than vice
versa. This prevents additional state related races in the socket
layer.
This permits the dual transition of socket layer and protocol layer state
to occur while holding locks for both layers, making the two changes
atomic with respect to one another. Similar changes are likely require
elsewhere in the socket/protocol code.
Reported by: Peter Holm <peter@holm.cc>
Review and fixes from: emax, Antoine Brodin <antoine.brodin@laposte.net>
Philosophical head nod: gnn
- refactor ngd_constructor, so that make_dev() is called without
any locks held, since it mallocs memory with M_WAITOK flag.
- rename global mtx, to have name different to per-node mtx
MFC after: 2 weeks
removes netgraph node and unwraps Ethernet interface.
This gives us ability to unload ng_ether.ko, when all interfaces
are detached, making ng_ether(4) developers happy.
Reviewed by: ru
a definite setup was broken: two ng_ksockets are connected to each other,
connect()ed to different remote hosts, and bind()ed to different local
interfaces. In this case one ng_ksocket is fooled with tag from the other
one.
Put node id into tag. In rcvdata method utilize tag only if it has our
own id inside or id equals zero. The latter case is added to support
packets send by some third, not ng_ksocket node.
MFC after: 1 week
with net byte order. Change byte order to net in ng_ipfw_input(), change
byte order to host before ip_output(), do not change before ip_input().
In collaboration with: ru
The difference is that the callout function installed via the
ng_callout() method is guaranteed to NOT fire after the shutdown
method was run (when a node is marked NGF_INVALID). Also, the
shutdown method and the callout function are guaranteed to NOT
run at the same time, as both require the writer lock. Thus
we can safely ignore a zero return value from ng_uncallout()
(callout_stop()) in shutdown methods, and go on with freeing
the node.
The said revision broke the node shutdown -- ng_bridge_timeout()
is no longer fired after ng_bridge_shutdown() was run, resulting
in a memory leak, dead nodes, and inability to unload the module.
Fix this by cancelling the callout on shutdown, and moving part
responsible for freeing a node resources from ng_bridge_timer()
to ng_bridge_shutdown().
Noticed by: ru
Submitted by: glebius, ru