Commit Graph

3347 Commits

Author SHA1 Message Date
Hiren Panchasara
a9467c3c45 Currently there is no easy way to specify net.isr.maxthreads = all cpus. We need
to specify exact number of cpus in loader.conf which get annoying when you have
mix of machines which don't have equal number of total cpus. I propose "-1" as
that value. When loader.conf has net.isr.maxthreads = -1, netisr will use all
available cpus.

In collaboration with:	davide
Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D2318
MFC after:	2 weeks
Sponsored by:	Limelight Networks
2015-04-25 16:12:06 +00:00
Gleb Smirnoff
9f7d0f4830 Don't propagate SIOCSIFCAPS from a vlan(4) to its parent. This leads to
quite unexpected result of toggling capabilities on the neighbour vlan(4)
interfaces.

Reviewed by:		melifaro, np
Differential Revision:	https://reviews.freebsd.org/D2310
Sponsored by:		Nginx, Inc.
2015-04-23 13:19:00 +00:00
Craig Rodrigues
d9db52256e Move zlib.c from net to libkern.
It is not network-specific code and would
be better as part of libkern instead.
Move zlib.h and zutil.h from net/ to sys/
Update includes to use sys/zlib.h and sys/zutil.h instead of net/

Submitted by:		Steve Kiernan stevek@juniper.net
Obtained from:		Juniper Networks, Inc.
GitHub Pull Request:	https://github.com/freebsd/freebsd/pull/28
Relnotes:		yes
2015-04-22 14:38:58 +00:00
Gleb Smirnoff
41c1a23326 Make IFMEDIA_DEBUG a kernel option.
Sponsored by:	Nginx, Inc.
2015-04-21 10:35:23 +00:00
Mark Johnston
b23cbbe6db Move the definition of struct bpf_if to bpf.c.
A couple of fields are still exposed via struct bpf_if_ext so that
bpf_peers_present() can be inlined into its callers. However, this change
eliminates some type duplication in the resulting CTF container, since
otherwise ctfmerge(1) propagates the duplication through all types that
contain a struct bpf_if.

Differential Revision:	https://reviews.freebsd.org/D2319
Reviewed by:	melifaro, rpaulo
2015-04-20 22:08:11 +00:00
Alexander Motin
7144875388 Activate write-only optimization if bpf device opened with O_WRONLY.
dhclient opens bpf as write-only to send packets. It never reads received
packets from that descriptor, but processing them in kernel takes time.
Especially much time takes packet timestamping on systems with expensive
timecounter, such as bhyve guest, where network speed dropped in half.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2015-04-20 10:44:46 +00:00
Gleb Smirnoff
6456c04aae Bring in if_types.h from projects/ifnet, where types are
defined in enum.
2015-04-17 06:39:15 +00:00
Gleb Smirnoff
da8ae05d14 - Format copyright notices, VCS ids.
- Run through unifdef(1).
2015-04-17 06:38:31 +00:00
Gleb Smirnoff
772e66a6fc Move ALTQ from contrib to net/altq. The ALTQ code is for many years
discontinued by its initial authors. In FreeBSD the code was already
slightly edited during the pf(4) SMP project. It is about to be edited
more in the projects/ifnet. Moving out of contrib also allows to remove
several hacks to the make glue.

Reviewed by:	net@
2015-04-16 20:22:40 +00:00
Marcelo Araujo
546afaf83d Remove duplicate header entry. 2015-04-16 02:44:37 +00:00
George V. Neville-Neil
6332e4cc38 Minor change to the macros to make sure that if an AF is passed that is neither AF_INET6 nor AF_INET that we don't touch random bits of memory.
Differential Revision:	https://reviews.freebsd.org/D2291
2015-04-15 14:46:45 +00:00
George V. Neville-Neil
3085e1216e Document internal interface types which are specific to FreeBSD. 2015-04-14 15:21:20 +00:00
Gleb Smirnoff
4651df570c Redo r274966. Instead of global all-interface all-vnet undocumented sysctl,
use per-interface flag, and document it.

Sponsored by:	Nginx, Inc.
2015-04-10 09:50:13 +00:00
George V. Neville-Neil
51d4054eeb Revert 281276 as unnecessary. Proper change to be committed
to the base polling code in a subsequent commit.

Pointed out by: glebius

Sponsored by:	Rubicon Communications (NetGate)
2015-04-09 14:44:30 +00:00
George V. Neville-Neil
8a7ad10169 Add support for a netisr polling tunable, which allows run time switching of
device polling rather than having it only be controlled by the compile
time option.

Summary: Rubicon Communications (Netgate)

Reviewers: #network, hiren

Reviewed By: #network, hiren

Subscribers: hiren

Differential Revision: https://reviews.freebsd.org/D2258
2015-04-08 20:25:51 +00:00
Eric Joyner
eb7e25b22f ifmedia changes:
- Extend the number of available subtypes for Ethernet media by using some
of the ifmedia word's option bits to help denote subtypes. As a result, the
number of possible Ethernet subtype values increases from 31 to 511.

- Use some of those new values to define new media types.

- lacp_compose_key() recgonizes the new Ethernet media types added.
  (Change made as required by a comment in if_media.h)

- New ioctl, SIOGIFXMEDIA, to handle getting the new extended media types.
  SIOCGIFMEDIA is retained for backwards compatibility.

- Changes to ifconfig to allow it to handle the new extended media types.

Submitted by:	mike@karels.net (original), hselasky
Reviewed by:	jfvogel, gnn, hselasky
Approved by:	jfvogel (mentor), gnn (mentor)
Differential Revision: http://reviews.freebsd.org/D1965
2015-04-07 21:31:17 +00:00
Andrey V. Elsukov
a4b65afcab Fix a possible mbuf leak on interface departure.
Reported by:	Alexandre Martins
2015-03-26 23:40:22 +00:00
Gleb Smirnoff
c03044244f Fix couple of fallouts from r280280. The first one is a simple typo,
where counter was incremented on parent, instead of vlan(4) interface.

The second is more complicated. Historically, in our stack the incoming
packets are accounted in drivers, while incoming bytes for Ethernet
drivers are accounted in ether_input_internal(). Thus, it should be
removed from vlan(4) driver.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2015-03-25 16:01:46 +00:00
Gleb Smirnoff
b1828acf05 Make vlan_config() the signle point of validity checks.
Sponsored by:	Nginx, Inc.
2015-03-20 21:09:03 +00:00
Gleb Smirnoff
f941c31aae In vlan_clone_match_ethervid():
- Use ifunit() instead of going through the interface list ourselves.
- Remove unused parameter.
- Move the most important comment above the function.

Sponsored by:	Nginx, Inc.
2015-03-20 20:42:58 +00:00
Gleb Smirnoff
67975c79be Tiny comment fix. 2015-03-20 14:16:26 +00:00
Gleb Smirnoff
a58ea6b1cf Now, when r272244 introduced counter(9) based counters for all interfaces,
revert the r271538, which did that for vlan(4) only.

No objections:	melifaro
Sponsored by:	Nginx, Inc.
2015-03-20 14:05:17 +00:00
Gleb Smirnoff
3e8c6d74bb Always lock the hash row of a source node when updating its 'states' counter.
PR:		182401
Sponsored by:	Nginx, Inc.
2015-03-17 12:19:28 +00:00
Andrey V. Elsukov
b57d97215e Add if_input_default() method, that will be used for if_input
initialization, when no input method specified before if_attach().

This prevents panics when if_input() method called directly e.g.
from bpf(4) code.

PR:		192426
Reviewed by:	glebius
MFC after:	1 week
2015-03-12 14:55:33 +00:00
Hans Petter Selasky
b7ba031ff7 Factor out mbuf hashing code from LAGG driver so that other network
drivers can use it. This avoids some code duplication. Add missing
default case to all switch statements while at it. Also move the
hashing of the IPv6 flow field to layer 4 because the IPv6 flow field
is constant on a per L4 connection basis and not on a per L3 network.

Differential Revision:	https://reviews.freebsd.org/D1987
Sponsored by:		Mellanox Technologies
MFC after:		1 month
2015-03-11 16:02:24 +00:00
Mark Johnston
aa14e9b7c9 Reimplement support for userland core dump compression using a new interface
in kern_gzio.c. The old gzio interface was somewhat inflexible and has not
worked properly since r272535: currently, the gzio functions are called with
a range lock held on the output vnode, but kern_gzio.c does not pass the
IO_RANGELOCKED flag to vn_rdwr() calls, resulting in deadlock when vn_rdwr()
attempts to reacquire the range lock. Moreover, the new gzio interface can
be used to implement kernel core compression.

This change also modifies the kernel configuration options needed to enable
userland core dump compression support: gzio is now an option rather than a
device, and the COMPRESS_USER_CORES option is removed. Core dump compression
is enabled using the kern.compress_user_cores sysctl/tunable.

Differential Revision:	https://reviews.freebsd.org/D1832
Reviewed by:	rpaulo
Discussed with:	kib
2015-03-09 03:50:53 +00:00
Gleb Smirnoff
607e337454 Optimize SIOCGIFMEDIA handling removing malloc(9) and double
traversal of the list.

Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2015-03-04 15:00:20 +00:00
Hiroki Sato
c92a456b55 Fix group membership of cloned interfaces when one is moved by
if_vmove().

In if_vmove(), if_detach_internal() and if_attach_internal() were
called in series to detach and reattach the interface.  When
detaching, if_delgroup() was called and the interface leaves all of
the group membership.  And then upon attachment, if_addgroup(ifp,
IFG_ALL) was called and it joined only "all" group again.

This had a problem. Normally, a cloned interface automatically joins
a group whose name is ifc_name of the cloner in addition to "all"
upon creation.  However, if_vmove() removed the membership and did
not restore upon attachment.

Differential Revision:	https://reviews.freebsd.org/D1859
2015-03-02 20:00:03 +00:00
Gleb Smirnoff
fec642add7 Hide struct ifmultiaddr under _KERNEL, too. 2015-02-27 01:15:23 +00:00
Xin LI
08e5736618 Handle SIOCSIFCAP by propogating the request to the parent interface. This
allows adding an vlan interface into a bridge.

Thanks for William Katsak <wkatsak cs rutgers edu> for testing and fixing
an issue in my previous patch draft.

MFC after:	2 weeks
2015-02-20 18:39:12 +00:00
Gleb Smirnoff
e072c794ad Now that all users of _WANT_IFADDR are fixed, remove this crutch and
hide ifaddr, in_ifaddr and in6_ifaddr under _KERNEL.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2015-02-19 23:16:10 +00:00
Gleb Smirnoff
0324938a0f - Improve INET/INET6 scope.
- style(9) declarations.
- Make couple of local functions static.
2015-02-16 23:50:53 +00:00
Gleb Smirnoff
8dc98c2a36 Toss declarations to fix regular build and NO_INET6 build. 2015-02-16 21:52:28 +00:00
Gleb Smirnoff
f0b0fe5b45 Commit a miss from r278843.
Pointy hat to:	glebius
2015-02-16 18:33:33 +00:00
Brad Davis
936bdf364d Fix build.
Approved by:	gibbs
2015-02-16 18:06:24 +00:00
Gleb Smirnoff
6004805208 Missed from r278831. 2015-02-16 06:02:46 +00:00
Hiroki Sato
25792b116f Fix a panic when tearing down a vnet on a VIMAGE-enabled kernel.
There was a race that bridge_ifdetach() could be called via
ifnet_departure event handler after vnet_bridge_uninit().

PR:		195859
Reported by:	Danilo Egea Gondolfo
2015-02-14 18:15:14 +00:00
Will Andrews
0e5f55bb95 Improve the distribution of LAGG port traffic.
I edited the original change to retain the use of arc4random() as a seed for
the hashing as a very basic defense against intentional lagg port selection.

The author's original commit message (edited slightly):

sys/net/ieee8023ad_lacp.c
sys/net/if_lagg.c
	In lagg_hashmbuf, use the FNV hash instead of the old
	hash32_buf.  The hash32 family of functions operate one octet
	at a time, and when run on a string s of length n, their output
	is equivalent to :

		   ----- i=n-1
		   \
	       n    \           (n-i-1)              32
	( seed^  +  /        33^        * s[i] ) % 2^
		   /
		   ----- i=0

	The problem is that the last five bytes of input don't get
	multiplied by sufficiently many powers of 33 to rollover 2^32.
	That means that changing the last few bytes (but obviously not
	the very last) of input will always change the value of the
	hash by a multiple of 33.  In the case of lagg_hashmbuf() with
	ipv4 input, the last four bytes are the TCP or UDP port
	numbers.  Since the output of lagg_hashmbuf is always taken
	modulo the port count, and 3 is a common port count for a lagg,
	that's bad.  It means that the UDP or TCP source port will
	never affect which lagg member is selected on a 3-port lagg.

	At 10Gbps, I was not able to measure any difference in CPU
	consumption between the old and new hash.

Submitted by:	asomers (original commit)
Reviewed by:	emaste, glebius
MFC after:	1 week
Sponsored by:	Spectra Logic
MFSpectraBSD:	1001723 on 2013/08/28 (original)
		1114258 on 2015/01/22 (edit)
2015-01-23 00:06:35 +00:00
Gleb Smirnoff
efc6c51ffa Back out r276841, r276756, r276747, r276746. The change in r276747 is very
very questionable, since it makes vimages more dependent on each other. But
the reason for the backout is that it screwed up shutting down the pf purge
threads, and now kernel immedially panics on pf module unload. Although module
unloading isn't an advertised feature of pf, it is very important for
development process.

I'd like to not backout r276746, since in general it is good. But since it
has introduced numerous build breakages, that later were addressed in
r276841, r276756, r276747, I need to back it out as well. Better replay it
in clean fashion from scratch.
2015-01-22 01:23:16 +00:00
Adrian Chadd
b2bdc62a95 Refactor / restructure the RSS code into generic, IPv4 and IPv6 specific
bits.

The motivation here is to eventually teach netisr and potentially
other networking subsystems a bit more about how RSS work queues / buckets
are configured so things have a hope of auto-configuring in the future.

* net/rss_config.[ch] takes care of the generic bits for doing
  configuration, hash function selection, etc;
* topelitz.[ch] is now in net/ rather than netinet/;
* (and would be in libkern if it didn't directly include RSS_KEYSIZE;
  that's a later thing to fix up.)
* netinet/in_rss.[ch] now just contains the IPv4 specific methods;
* and netinet/in6_rss.[ch] now just contains the IPv6 specific methods.

This should have no functional impact on anyone currently using
the RSS support.

Differential Revision:	D1383
Reviewed by:	gnn, jfv (intel driver bits)
2015-01-18 18:06:40 +00:00
Andrey V. Elsukov
504289ea5a Fix condition and really sort ports. Also add comment describing
the intent of this code.

Reported by:	sbruno
MFC after:	1 week
Sponsored by:	Yandex LLC
2015-01-17 11:32:09 +00:00
Alexander V. Chernikov
29e0d65d7a Eliminate SIOCGIFADDR handling in bpf.
Quoting 19 years bpf.4 manual from bpf-1.2a1:
"
(SIOCGIFADDR is obsolete under BSD systems.  SIOCGIFCONF should be
 used to query link-level addresses.)
"
* SIOCGIFADDR was not imported in NetBSD (bpf.c 1.36) and OpenBSD.
* Last bits (e.g. manpage claiming SIOCGIFADDR exists) was cleaned
  from NetBSD via kern/21513 5 years ago,
  from OpenBSD via documentation/6352 5 years ago.
2015-01-16 10:09:28 +00:00
Andrey V. Elsukov
9c0265c6fd Restore Ethernet-within-IP Encapsulation support that was broken after
r273087. Move all checks from gif_output() into gif_transmit(). Previously
they were checked always, because if_start always called gif_output.
Now gif_transmit() can be called directly from if_bridge() code and we need
do checks here.

PR:		196646
MFC after:	1 week
2015-01-10 08:28:50 +00:00
Andrey V. Elsukov
3a9f9af803 Use if_name() macro instead of ifp->if_xname.
MFH:		1 week
2015-01-10 03:29:17 +00:00
Andrey V. Elsukov
84d03ddad3 Fix an error introduced in r274246.
Pass mtag argument into m_tag_locate() to continue the search from
the last found mtag.

X-MFC after:	r274246
2015-01-10 03:26:46 +00:00
Andrey V. Elsukov
c26230adee Move the recursion detection code into separate function gif_check_nesting().
Also make MTAG_GIF definition private to if_gif.c.

MFC after:	1 week
2015-01-10 03:13:16 +00:00
Alexander V. Chernikov
ecf09f8321 Fix typo.
Submitted by:	Olivér Pintér
2015-01-09 20:29:13 +00:00
Alexander V. Chernikov
d63e657c04 * Deal with ARCNET L2 multicast mapping for IPv6 the same way as in IPv4:
handle it in arc_output() instead of nd6_storelladdr().
* Remove IFT_ARCNET check from arpresolve() since arc_output() does not
  use arpresolve() to handle broadcast/multicast. This check was there
  since r84931. It looks like it was not used since r89099 (initial
  import of Arcnet support where multicast is handled separately).
* Remove IFT_IEEE1394 case from nd6_storelladdr() since firewire_output()
  calles nd6_storelladdr() for unicast addresses only.
* Remove IFT_ARCNET case from nd6_storelladdr() since arc_output() now
  handles multicast by itself.

As a result, we have the following pattern: all non-ethernet-style
media have their own multicast map handling inside their appropriate
routines. On the other hand, arpresolve() (and nd6_storelladdr()) which
meant to be 'generic' ones de-facto handles ethernet-only multicast maps.

MFC after:	3 weeks
2015-01-09 12:56:51 +00:00
Xin LI
681ed54caa MFV r276759: libpcap 1.6.2.
MFC after:	1 month
2015-01-06 22:29:12 +00:00
Craig Rodrigues
8d665c6ba8 Reapply previous patch to fix build.
PR: 194515
2015-01-06 16:47:02 +00:00