Commit Graph

1190 Commits

Author SHA1 Message Date
Gleb Smirnoff
8f134647ca Switch the entire IPv4 stack to keep the IP packet header
in network byte order. Any host byte order processing is
done in local variables and host byte order values are
never[1] written to a packet.

  After this change a packet processed by the stack isn't
modified at all[2] except for TTL.

  After this change a network stack hacker doesn't need to
scratch his head trying to figure out what is the byte order
at the given place in the stack.

[1] One exception still remains. The raw sockets convert host
byte order before pass a packet to an application. Probably
this would remain for ages for compatibility.

[2] The ip_input() still subtructs header len from ip->ip_len,
but this is planned to be fixed soon.

Reviewed by:	luigi, Maxim Dounin <mdounin mdounin.ru>
Tested by:	ray, Olivier Cochard-Labbe <olivier cochard.me>
2012-10-22 21:09:03 +00:00
Andre Oppermann
c9b652e3e8 Mechanically remove the last stray remains of spl* calls from net*/*.
They have been Noop's for a long time now.
2012-10-18 13:57:24 +00:00
Alexander V. Chernikov
10fcb07c91 Add NG_NETFLOW_V9INFO_TYPE command to be able to request netflowv9-specific
data.

Submitted by:	Dmitry Luhtionov <dmitryluhtionov at gmail.com>
MFC after:	2 weeks
2012-10-11 16:15:18 +00:00
Kevin Lo
9823d52705 Revert previous commit...
Pointyhat to:	kevlo (myself)
2012-10-10 08:36:38 +00:00
Kevin Lo
a10cee30c9 Prefer NULL over 0 for pointers 2012-10-09 08:27:40 +00:00
Kevin Lo
10d66948a8 Fix typo: s/unknow/unknown 2012-10-09 06:15:16 +00:00
Gleb Smirnoff
23e9c6dc1e After r241245 it appeared that in_delayed_cksum(), which still expects
host byte order, was sometimes called with net byte order. Since we are
moving towards net byte order throughout the stack, the function was
converted to expect net byte order, and its consumers fixed appropriately:
  - ip_output(), ipfilter(4) not changed, since already call
    in_delayed_cksum() with header in net byte order.
  - divert(4), ng_nat(4), ipfw_nat(4) now don't need to swap byte order
    there and back.
  - mrouting code and IPv6 ipsec now need to switch byte order there and
    back, but I hope, this is temporary solution.
  - In ipsec(4) shifted switch to net byte order prior to in_delayed_cksum().
  - pf_route() catches up on r241245 changes to ip_output().
2012-10-08 08:03:58 +00:00
Hans Petter Selasky
12b16d85ae The USB Bluetooth driver should only grab its own interfaces. This allows the
USB bluetooth driver to co-exist with other USB device classes and drivers.

Reported by:	Geoffrey Levand
MFC after:	1 week
2012-09-30 19:31:20 +00:00
Ryan Stone
3fabe28bdc Ensure that all cases that enqueue a netgraph item for delivery by a
ngthread properly set the item's depth to 1.  In particular, prior to this
change if ng_snd_item failed to acquire a lock on a node, the item's depth
would not be set at all.  This fix ensures that the error code from rcvmsg/
rcvdata is properly passed back to the apply callback.  For example, this
fixes a bug where an error from rcvmsg/rcvdata would not previously
propagate back to a libnetgraph consumer when the message was queued.

Reviewed by:	mav
MFC after:	1 month
Sponsored by:	Sandvine Incorporated
2012-09-27 20:12:51 +00:00
Gleb Smirnoff
3b3a8eb937 o Create directory sys/netpfil, where all packet filters should
reside, and move there ipfw(4) and pf(4).

o Move most modified parts of pf out of contrib.

Actual movements:

sys/contrib/pf/net/*.c		-> sys/netpfil/pf/
sys/contrib/pf/net/*.h		-> sys/net/
contrib/pf/pfctl/*.c		-> sbin/pfctl
contrib/pf/pfctl/*.h		-> sbin/pfctl
contrib/pf/pfctl/pfctl.8	-> sbin/pfctl
contrib/pf/pfctl/*.4		-> share/man/man4
contrib/pf/pfctl/*.5		-> share/man/man5

sys/netinet/ipfw		-> sys/netpfil/ipfw

The arguable movement is pf/net/*.h -> sys/net. There are
future plans to refactor pf includes, so I decided not to
break things twice.

Not modified bits of pf left in contrib: authpf, ftp-proxy,
tftp-proxy, pflogd.

The ipfw(4) movement is planned to be merged to stable/9,
to make head and stable match.

Discussed with:		bz, luigi
2012-09-14 11:51:49 +00:00
Alexander Motin
2c2e2be746 Remove duplicate check.
Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
2012-08-03 12:55:31 +00:00
Ed Maste
39b553cea0 Add version so others can depend on this module 2012-07-27 13:57:28 +00:00
Alexander V. Chernikov
36374fcf4b Make radix lookup on src and dst flow addresses optional
and configurable on per-interface basis.
Remove __inline__ for several functions being called once per
flow (e.g once per 10-20 packets on common traffic flows).
Update manual page to simplify search for BPF data link types.

Sponsored by Yandex LLC

Reviewed by:      glebius
Approved by:      ae(mentor)
MFC after:        2 weeks
2012-06-18 13:56:36 +00:00
Alexander V. Chernikov
0bd6bb6bb0 Simplify IP pointer recovery in case of mbuf reallocation.
Reviewed by:     glebius (previous version)
Approved by:     ae(mentor)
MFC after:       2 weeks
2012-06-18 13:50:41 +00:00
Alexander V. Chernikov
e8cce25549 Use time_uptime instead of getnanotime for accouting integer number of seconds.
Reviewed by:     glebius
Approved by:     ae(mentor)
MFC after:       1 week
2012-06-16 13:55:31 +00:00
Alexander V. Chernikov
6a0d28ec21 Set netflow v9 observation domain value to fib number instead of node id.
This fixes multi-fib netflow v9 export.

Reviewed by:     glebius
Approved by:     kib(mentor)
MFC after:       1 week
2012-06-16 13:53:14 +00:00
Alexander V. Chernikov
f75083f064 Fix improper L4 header handling for IPv6 packets passed via DLT_RAW.
Reported by:     Emil Muratov <gpm@hotplug.ru>
Reviewed by:     glebius
Approved by:     ae(mentor)
MFC after:       1 week
2012-06-16 13:51:01 +00:00
Gleb Smirnoff
5372c30b8f Revert my local not yet properly tested changes, that leaked in
with r235923.
2012-05-25 07:46:24 +00:00
Gleb Smirnoff
38f1b2d1bc Revert r220768 for ng_ksocket. This node is special and
when it is cloning, its constructor method may be called
in a context that isn't allowed to sleep.

Noticed by:	Vadim Goncharov
2012-05-24 18:22:57 +00:00
Alexander V. Chernikov
3a0cd8db78 Fix panic in ng_patch(4) caused by checksum flags being added to mbuf flags.
Tested by:        Maxim Ignatenko <gelraen.ua@gmail.com>
Approved by:      kib(mentor)

MFC after:        3 days
2012-04-22 17:00:52 +00:00
Marko Zec
5bc2249ff8 #include <net/vnet.h> is no longer needed here.
Spotted by:	Ed Maste
MFC after:	3 days.
2012-04-16 13:41:46 +00:00
Hans Petter Selasky
6d917491f5 Fix compiler warnings, mostly signed issues,
when USB modules are compiled with WARNS=9.

MFC after:	1 weeks
2012-04-02 10:50:42 +00:00
Alexander V. Chernikov
147972555f Use rt_numfibs variable instead of compile-time RT_NUMFIBS.
Reviewed by:    glebius (previous version)
Approved by:    kib(mentor), ae(mentor)
2012-03-13 11:08:40 +00:00
Adrian Chadd
bbf53c35ea Upgrade the netgraph vlan node to support 802.1q, encapsulation type,
PCP and CFI fields.

* Ethernet_type for VLAN encapsulation is tunable, default is 0x8100;
* PCP (Priority code point) and CFI (canonical format indicator) is
  tunable per VID;
* Tunable encapsulation to support 802.1q
* Encapsulation/Decapsulation code improvements

New messages have been added for this netgraph node to support the
new features.

However, the legacy "vlan" id is still supported and compiled in by
default.  It can be disabled in a future release.

TODO:

* Documentation
* Examples

PR:		kern/161908
Submitted by:	Ivan <rozhuk.im@gmail.com>
2012-03-11 19:08:56 +00:00
Gleb Smirnoff
77a117ca28 Revert r231829, that was my braino. 2012-02-22 09:08:51 +00:00
Gleb Smirnoff
687adb703d Refactor the name hash and the ID hash, that are used to address nodes:
- Make hash sizes growable, to satisfy users running large mpd
  installations, having thousands of nodes.
- NG_NAMEHASH() proved to give a very bad distribution in real life
  name sets, while generic hash32_str(name, HASHINIT) proved to give
  an even one, so you the latter for name hash.
- Do not store unnamed nodes in slot 0 of name hash, no reason for that.
- Use the ID hash in cases when we need to run through all nodes: the
  NGM_LISTNODES command and in the vnet_netgraph_uninit().
- Implement NGM_LISTNODES and NGM_LISTNAMES as separate code, the former
  iterates through the ID hash, and the latter through the name hash.
- Keep count of all nodes and of named nodes, so that we don't need
  to count nodes in NGM_LISTNODES and NGM_LISTNAMES. The counters are
  also used to estimate whether we need to grow hashes.
- Close a race between two threads running ng_name_node() assigning same
  name to different nodes.
2012-02-16 19:10:01 +00:00
Gleb Smirnoff
320d00eee8 Specify correct loading order for core of netgraph(4). 2012-02-16 18:54:44 +00:00
Gleb Smirnoff
04fdc6c689 Supply correct "how" argument to the uma_zcreate(). 2012-02-16 18:51:12 +00:00
Gleb Smirnoff
8338a34a82 In ng_getsockaddr() allocate memory prior to obtaining lock.
Reported & tested by:	Mykola Dzham <i levsha.me>
2012-02-16 14:44:52 +00:00
Gleb Smirnoff
b99a737923 Fix includes list.
Submitted by:	bde
2012-02-15 15:54:57 +00:00
Gleb Smirnoff
923d1d7814 Trim double empty lines. 2012-02-15 15:06:03 +00:00
Gleb Smirnoff
3eb05c287a Remove testing stuff, reducing kernel memory footprint by 1 Kb.
Anyway, when we are building a LINT kernel, all these macros
are tested via nodes.
2012-02-15 14:56:18 +00:00
Gleb Smirnoff
c3189b3fb4 In ng_bypass() add more protection against potential race
with ng_rmnode() and its followers.
2012-02-15 14:29:23 +00:00
Gleb Smirnoff
19afcd9829 style(9): sort includes. 2012-02-15 14:26:50 +00:00
Gleb Smirnoff
dea55037d0 No need to optimise for a node with no hooks, my braino. 2012-02-13 13:07:56 +00:00
Max Khon
36fb423e4f - Use fixed-width integer types.
- Prefer to use C99 stdint types.

This fixes ng_cisco on 64-bit architectures.

MFC after:	1 week
2012-02-12 05:14:12 +00:00
Ed Schouten
7870adb640 Remove direct access to si_name.
Code should just use the devtoname() function to obtain the name of a
character device. Also add const keywords to pieces of code that need it
to build properly.

MFC after:	2 weeks
2012-02-10 12:35:57 +00:00
Gleb Smirnoff
48a47609bc Provide a findhook method for ng_socket(4). The node stores a
hash with names of its hooks. It starts with size of 16, and
grows when number of hooks reaches twice the current size. A
failure to grow (memory is allocated with M_NOWAIT) isn't
fatal, however.

I used standard hash(9) function for the hash. With 25000
hooks named in the mpd (ports/net/mpd5) manner of "b%u", the
distributions is the following: 72.1% entries consist of one
element, 22.1% consist of two, 5.2% consist of three and
0.6% of four.

Speedup in a synthetic test that creates 25000 hooks and then
runs through a long cyclce dereferencing them in a random order
is over 25 times.
2012-01-23 16:43:13 +00:00
Gleb Smirnoff
4b2b8a370c In ng_socket(4) expose less kernel internals to userland. This commit
breaks ABI, but makes probability of ABI breakage in future less.
2012-01-23 15:39:45 +00:00
Gleb Smirnoff
c4282b741b Convert locks that protect name hash, ID hash and typelist from
mutex(9) to rwlock(9) based locks.

While here remove dropping lock when processing NGM_LISTNODES,
and NGM_LISTTYPES generic commands. We don't need to drop it
since memory allocation is done with M_NOWAIT.
2012-01-23 15:17:14 +00:00
Gleb Smirnoff
bdc99a491d The newhook method can be called in ISR context at
certain circumstances, so better use M_NOWAIT in it.
2012-01-17 18:10:25 +00:00
Gleb Smirnoff
fe4ead276d Add missing static. 2012-01-16 12:33:55 +00:00
Gleb Smirnoff
abe5a2ce52 Remove some disabled NOTYET code. Probability of enabling it is low,
if anyone wants, he/she can take it from svn.
2012-01-16 12:31:33 +00:00
Ed Schouten
dc15eac046 Use strchr() and strrchr().
It seems strchr() and strrchr() are used more often than index() and
rindex(). Therefore, simply migrate all kernel code to use it.

For the XFS code, remove an empty line to make the code identical to
the code in the Linux kernel.
2012-01-02 12:12:10 +00:00
Gleb Smirnoff
4bd1b55756 style(9), whitespace and spelling nits. 2011-12-30 15:41:28 +00:00
Brooks Davis
4b22573a89 In r191367 the need for if_free_type() was removed and a new member
if_alloctype was used to store the origional interface type.  Take
advantage of this change by removing all existing uses of if_free_type()
in favor of if_free().

MFC after:	1 Month
2011-11-11 22:57:52 +00:00
Ed Schouten
6472ac3d8a Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
2011-11-07 15:43:11 +00:00
Ed Schouten
d745c852be Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.
This means that their use is restricted to a single C file.
2011-11-07 06:44:47 +00:00
Max Khon
4cf39b5da4 - Fix potential double mbuf free: M_PREPEND may free mbuf chain and return
NULL but item will still have the reference ot the mbuf chain and will free
it upon destruction.
- Fix memory leak (unfree'd item on error path).
2011-11-06 05:24:54 +00:00
Max Khon
6812e78328 Fix potential double mbuf free: M_PREPEND may free mbuf chain and return
NULL but item will still have the reference ot the mbuf chain and will free
it upon destruction.
2011-11-06 05:23:42 +00:00