in network byte order. Any host byte order processing is
done in local variables and host byte order values are
never[1] written to a packet.
After this change a packet processed by the stack isn't
modified at all[2] except for TTL.
After this change a network stack hacker doesn't need to
scratch his head trying to figure out what is the byte order
at the given place in the stack.
[1] One exception still remains. The raw sockets convert host
byte order before pass a packet to an application. Probably
this would remain for ages for compatibility.
[2] The ip_input() still subtructs header len from ip->ip_len,
but this is planned to be fixed soon.
Reviewed by: luigi, Maxim Dounin <mdounin mdounin.ru>
Tested by: ray, Olivier Cochard-Labbe <olivier cochard.me>
host byte order, was sometimes called with net byte order. Since we are
moving towards net byte order throughout the stack, the function was
converted to expect net byte order, and its consumers fixed appropriately:
- ip_output(), ipfilter(4) not changed, since already call
in_delayed_cksum() with header in net byte order.
- divert(4), ng_nat(4), ipfw_nat(4) now don't need to swap byte order
there and back.
- mrouting code and IPv6 ipsec now need to switch byte order there and
back, but I hope, this is temporary solution.
- In ipsec(4) shifted switch to net byte order prior to in_delayed_cksum().
- pf_route() catches up on r241245 changes to ip_output().
ngthread properly set the item's depth to 1. In particular, prior to this
change if ng_snd_item failed to acquire a lock on a node, the item's depth
would not be set at all. This fix ensures that the error code from rcvmsg/
rcvdata is properly passed back to the apply callback. For example, this
fixes a bug where an error from rcvmsg/rcvdata would not previously
propagate back to a libnetgraph consumer when the message was queued.
Reviewed by: mav
MFC after: 1 month
Sponsored by: Sandvine Incorporated
reside, and move there ipfw(4) and pf(4).
o Move most modified parts of pf out of contrib.
Actual movements:
sys/contrib/pf/net/*.c -> sys/netpfil/pf/
sys/contrib/pf/net/*.h -> sys/net/
contrib/pf/pfctl/*.c -> sbin/pfctl
contrib/pf/pfctl/*.h -> sbin/pfctl
contrib/pf/pfctl/pfctl.8 -> sbin/pfctl
contrib/pf/pfctl/*.4 -> share/man/man4
contrib/pf/pfctl/*.5 -> share/man/man5
sys/netinet/ipfw -> sys/netpfil/ipfw
The arguable movement is pf/net/*.h -> sys/net. There are
future plans to refactor pf includes, so I decided not to
break things twice.
Not modified bits of pf left in contrib: authpf, ftp-proxy,
tftp-proxy, pflogd.
The ipfw(4) movement is planned to be merged to stable/9,
to make head and stable match.
Discussed with: bz, luigi
and configurable on per-interface basis.
Remove __inline__ for several functions being called once per
flow (e.g once per 10-20 packets on common traffic flows).
Update manual page to simplify search for BPF data link types.
Sponsored by Yandex LLC
Reviewed by: glebius
Approved by: ae(mentor)
MFC after: 2 weeks
PCP and CFI fields.
* Ethernet_type for VLAN encapsulation is tunable, default is 0x8100;
* PCP (Priority code point) and CFI (canonical format indicator) is
tunable per VID;
* Tunable encapsulation to support 802.1q
* Encapsulation/Decapsulation code improvements
New messages have been added for this netgraph node to support the
new features.
However, the legacy "vlan" id is still supported and compiled in by
default. It can be disabled in a future release.
TODO:
* Documentation
* Examples
PR: kern/161908
Submitted by: Ivan <rozhuk.im@gmail.com>
- Make hash sizes growable, to satisfy users running large mpd
installations, having thousands of nodes.
- NG_NAMEHASH() proved to give a very bad distribution in real life
name sets, while generic hash32_str(name, HASHINIT) proved to give
an even one, so you the latter for name hash.
- Do not store unnamed nodes in slot 0 of name hash, no reason for that.
- Use the ID hash in cases when we need to run through all nodes: the
NGM_LISTNODES command and in the vnet_netgraph_uninit().
- Implement NGM_LISTNODES and NGM_LISTNAMES as separate code, the former
iterates through the ID hash, and the latter through the name hash.
- Keep count of all nodes and of named nodes, so that we don't need
to count nodes in NGM_LISTNODES and NGM_LISTNAMES. The counters are
also used to estimate whether we need to grow hashes.
- Close a race between two threads running ng_name_node() assigning same
name to different nodes.
Code should just use the devtoname() function to obtain the name of a
character device. Also add const keywords to pieces of code that need it
to build properly.
MFC after: 2 weeks
hash with names of its hooks. It starts with size of 16, and
grows when number of hooks reaches twice the current size. A
failure to grow (memory is allocated with M_NOWAIT) isn't
fatal, however.
I used standard hash(9) function for the hash. With 25000
hooks named in the mpd (ports/net/mpd5) manner of "b%u", the
distributions is the following: 72.1% entries consist of one
element, 22.1% consist of two, 5.2% consist of three and
0.6% of four.
Speedup in a synthetic test that creates 25000 hooks and then
runs through a long cyclce dereferencing them in a random order
is over 25 times.
mutex(9) to rwlock(9) based locks.
While here remove dropping lock when processing NGM_LISTNODES,
and NGM_LISTTYPES generic commands. We don't need to drop it
since memory allocation is done with M_NOWAIT.
It seems strchr() and strrchr() are used more often than index() and
rindex(). Therefore, simply migrate all kernel code to use it.
For the XFS code, remove an empty line to make the code identical to
the code in the Linux kernel.
if_alloctype was used to store the origional interface type. Take
advantage of this change by removing all existing uses of if_free_type()
in favor of if_free().
MFC after: 1 Month
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.