either reader or writer flag on item in the function, that
allocates the item. Do not modify these flags when item is
applied or queued.
The only exceptions are node and hook overrides - they can
change item flags to writer.
At the end of ng_snd_item(), node queue is processed. In certain
netgraph setups deep recursive calls can occur.
For example this happens, when two nodes are connected and can send
items to each other in both directions. If, for some reason, both nodes
have a lot of items in their queues, then the processing thread will
recurse between these two nodes, delivering items left and right, going
deeper in the stack. Other setups can suffer from deep recursion, too.
The following factors can influence risk of deep netgraph call:
- periodical write-access events on node
- combination of slow link and fast one in one graph
- net.inet.ip.fastforwarding
Changes made:
- In ng_acquire_{read,write}() do not dequeue another item. Instead,
call ng_setisr() for this node.
- At the end of ng_snd_item(), do not process queue. Call ng_setisr(),
if there are any dequeueable items on node queue.
- In ng_setisr() narrow worklist mutex holding.
- In ng_setisr() assert queue mutex.
Theoretically, the first two changes should negatively affect performance.
To check this, some profiling was made:
1) In general real tasks, no noticable performance difference was found.
2) The following test was made: two multithreaded nodes and one
single-threaded were connected into a ring. A large queues of packets
were sent around this ring. Time to pass the ring N times was measured.
This is a very vacuous test: no items/mbufs are allocated, no upcalls or
downcalls outside of netgraph. It doesn't represent a real load, it is
a stress test for ng_acquire_{read,write}() and item queueing functions.
Surprisingly, the performance impact was positive! New code is 13% faster
on UP and 17% faster on SMP, in this particular test.
The problem was originally found, described, analyzed and original patch
was written by Roselyn Lee from Vernier Networks. Thanks!
Submitted by: Roselyn Lee <rosel verniernetworks com>
an item may be queued and processed later. While this is OK for mbufs,
this is a problem for control messages.
In the framework:
- Add optional callback function pointer to an item. When item gets
applied the callback is executed from ng_apply_item().
- Add new flag NG_PROGRESS. If this flag is supplied, then return
EINPROGRESS instead of 0 in case if item failed to deliver
synchronously and was queued.
- Honor NG_PROGRESS in ng_snd_item().
In ng_socket:
- When userland sends control message add callback to the item.
- If ng_snd_item() returns EINPROGRESS, then sleep.
This change fixes possible races in ngctl(8) scripts.
Reviewed by: julian
Approved by: re (scottl)
specified by caller.
- Change ng_send_item() interface - use 'flags' argument instead of
boolean 'queue'.
- Extend ng_send_fn(), ng_package_data() and ng_package_msg()
interface - add possibility to pass flags. Rename ng_send_fn() to
ng_send_fn1(). Create macro for ng_send_fn().
- Update all macros, that use ng_package_data() and ng_package_msg().
Reviewed by: julian
SI_SUB_INIT_IF but before SI_SUB_DRIVERS. Make Netgraph(4)
framework initialize at SI_SUB_NETGRAPH level.
This does not address the bigger problem: MODULE_DEPEND
does not seem to work when modules are compiled in the
kernel, but it fixes the problem with Netgraph Bluetooth
device drivers reported by a few folks.
PR: i386/69876
Reviewed by: julian, rik, scottl
MFC after: 3 days
out c->c_func, we can't take it after callout_stop(). To take it before
we need to acquire callout_lock, to avoid race. This commit narrows
down area where lock is held, but hack is still present.
This should be redesigned.
Approved by: julian (mentor)
using linker_load_module(). This works OK if NGM_MKPEER message came
from userland and we have process associated with thread. But when
NGM_MKPEER was queued because target node was busy, linker_load_module()
is called from netisr thread leading to panic.
To workaround that we do not load modules by framework, instead ng_socket
loads module (if this is required) before sending NGM_MKPEER.
However, the race condition between return from NgSendMsg() and actual
creation of node still exist and needs to be solved.
PR: kern/62789
Approved by: julian
Also introduce a macro to be called by persistent nodes to signal their
persistence during shutdown to hide this mechanism from the node author.
Make node flags have a consistent style in naming.
Document the change.
is obviously not run a lot. (but is in some test cases).
This code is not usually run because it covers a case that doesn't
happen a lot (removing a node that has data traversing it).
for unknown events.
A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
- Assert the mutex in NG_IDHASH_FIND() since the mutex is required to
safely walk the node lists in the ng_ID_hash table.
- Acquire the ng_nodelist_mtx when walking ng_allnodes or ng_allhooks
to generate state dump output from the netgraph sysctls.
behaviour lost in the change from 4.x style netgraph tee nodes.
Alter the tee node to use the new method. Document the behaviour.
Step the ABI version number... old netgraph klds will refuse to load.
Better than just crashing.
Submitted by: Gleb Smirnoff <glebius@cell.sick.ru>
whether or not the isr needs to hold Giant when running; Giant-less
operation is also controlled by the setting of debug_mpsafenet
o mark all netisr's except NETISR_IP as needing Giant
o add a GIANT_REQUIRED assertion to the top of netisr's that need Giant
o pickup Giant (when debug_mpsafenet is 1) inside ip_input before
calling up with a packet
o change netisr handling so swi_net runs w/o Giant; instead we grab
Giant before invoking handlers based on whether the handler needs Giant
o change netisr handling so that netisr's that are marked MPSAFE may
have multiple instances active at a time
o add netisr statistics for packets dropped because the isr is inactive
Supported by: FreeBSD Foundation
conservative lock. The problem with the lock-less algorithm is that
it suffers from the ABA problem. Running an application with funnels
a couple of 100kpkts/s through the netgraph system on a dual CPU system
with MPSAFE drivers will panic almost immediatly with the old algorithm.
It may be possible to eliminate the contention between threads that insert
free items into the list and those that get free items by using the
Michael/Scott queue algorithm that has two locks.
of asserting that an mbuf has a packet header. Use it instead of hand-
rolled versions wherever applicable.
Submitted by: Hiten Pandya <hiten@unixdaemons.com>
drain routines are done by swi_net, which allows for better queue control
at some future point. Packets may also be directly dispatched to a netisr
instead of queued, this may be of interest at some installations, but
currently defaults to off.
Reviewed by: hsu, silby, jayanth, sam
Sponsored by: DARPA, NAI Labs
queue items that can be allocated by netgraph and the number of free queue
items that are cached on a private list.
Netgraph places an upper limit on the number of queue items it may allocate.
When there is a large number of netgraph messages travelling through the
system (100k/sec and more) there is a high probability, that messages get
queued at the nodes and netgraph runs out of queue items. In this case the data
flow through netgraph gets blocked. The tuneable for the number of free
items lets one trade memory for performance.
The tunables are also available as read-only sysctls.
PR: kern/47393
Reviewed by: julian
Approved by: jake (mentor)
linker_load_module() instead.
This fixes a bug where the kernel was unable to properly locate and
load a kernel module in vfs_mount() (and probably in the netgraph
code as well since it was using the same function). This is because
the linker_load_file() does not properly search the module path.
Problem found by: peter
Reviewed by: peter
Thanks to: peter
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64