Commit Graph

45 Commits

Author SHA1 Message Date
Hans Petter Selasky
4f8c17ab35 Fix compile warning by removing unused variable.
MFC:		3 days
Sponsored by:	Mellanox Technologies
2014-10-30 16:57:56 +00:00
Hans Petter Selasky
7cabdf161d Update the network interface baudrate integer according to the actual
line rate.

Submitted by:	jhb @
MFC after:	1 week
2014-10-24 16:39:01 +00:00
Hans Petter Selasky
f0188618f2 Fix multiple incorrect SYSCTL arguments in the kernel:
- Wrong integer type was specified.

- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.

- Logical OR where binary OR was expected.

- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.

- Properly assert the the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, hence
there is no easy way to assert types in the C language outside a
C-function.

- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.

- Updated "EXAMPLES" section in SYSCTL manual page.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-21 07:31:21 +00:00
Hans Petter Selasky
2c6eb461a7 Update the OFED Linux compatibility layer and
Mellanox hardware driver(s):

- Properly name an inclusion guard
- Fix compile warnings regarding unsigned enums
- Add two new sysctl nodes
- Remove all empty linux header files
- Make an error printout more verbose
- Use "mod_delayed_work()" instead of
  cancelling and starting a timeout.
- Implement more Linux scatterlist
  functions.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-15 13:40:29 +00:00
Hans Petter Selasky
15983afb04 Fix compile warning when compiling with GCC.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-07 10:04:25 +00:00
Hans Petter Selasky
7b8455ae34 Update code to use new network counter API.
Fix some minor compile warnings while at it.

Sponsored by:	Mellanox Technologies
Suggested by:	glebius@
MFC after:	1 week
2014-09-24 08:28:34 +00:00
Hans Petter Selasky
f02f742280 Hardware driver update from Mellanox Technologies, including:
- improved performance
 - better stability
 - new features
 - bugfixes

Supported HCAs:
 - ConnectX-2
 - ConnectX-3
 - ConnectX-3 Pro

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2014-09-23 12:37:01 +00:00
Hans Petter Selasky
9fd573c39d Improve transmit sending offload, TSO, algorithm in general.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.

Reviewed by:	adrian, rmacklem
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2014-09-22 08:27:27 +00:00
Hans Petter Selasky
72f3100047 Revert r271504. A new patch to solve this issue will be made.
Suggested by:	adrian @
2014-09-13 20:52:01 +00:00
Hans Petter Selasky
eb93b77ae4 Improve transmit sending offload, TSO, algorithm in general.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2014-09-13 08:26:09 +00:00
Hans Petter Selasky
c7818b48b6 - Update the OFED Linux Emulation layer as a preparation for a
hardware driver update from Mellanox Technologies.
- Remove empty files from the OFED Linux Emulation layer.
- Fix compile warnings related to printf() and the "%lld" and "%llx"
format specifiers.
- Add some missing 2-clause BSD copyrights.
- Add "Mellanox Technologies, Ltd." to list of copyright holders.
- Add some new compatibility files.
- Fix order of uninit in the mlx4ib module to avoid crash at unload
using the new module_exit_order() function.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2014-08-27 13:21:53 +00:00
Hans Petter Selasky
5cb6b3afa4 Fix OFED startup order: All SYSINIT()'s and modules should be loaded
prior to starting "/sbin/init" which will run all the "/etc/rc.d/xxx"
scripts. Else there can be a race configuring the interfaces via
"/etc/rc.conf".

MFC after:	4 weeks
Sponsored by:	Mellanox Technologies
2014-07-06 14:22:13 +00:00
Hans Petter Selasky
4813ad54f8 Compile fixes:
Remove duplicate "debug_ktr.mask" sysctl definition.
Remove now unused variable from "kern_ktr.c".
This fixes build of "ktr" which was broken by r267961.

Let the default value for "vm_kmem_size_scale" be zero. It is setup
after that the sysctl has been initialized from "getenv()" in the
"kmeminit()" function to equal the "VM_KMEM_SIZE_MAX" value, if
zero. On Sparc64 the "VM_KMEM_SIZE_MAX" macro is not a constant. This
fixes build of Sparc64 which was broken by r267961.

Add a special macro to dynamically create SYSCTL root nodes, because
root nodes have a special parent. This fixes build of existing OFED
module and CANBUS module for pc98 which was broken by r267961.

Add missing "sysctl.h" includes to get the needed sysctl header file
declarations. This is needed after r267961.

MFC after:	2 weeks
2014-06-28 17:36:18 +00:00
Warner Losh
c6063d0da8 Use src.opts.mk in preference to bsd.own.mk except where we need stuff
from the latter.
2014-05-06 04:22:01 +00:00
Gleb Smirnoff
b245f96c44 Since 32-bit if_baudrate isn't enough to describe a baud rate of a 10 Gbit
interface, in the r241616 a crutch was provided. It didn't work well, and
finally we decided that it is time to break ABI and simply make if_baudrate
a 64-bit value. Meanwhile, the entire struct if_data was reviewed.

o Remove the if_baudrate_pf crutch.

o Make all fields of struct if_data fixed machine independent size. The
  notion of data (packet counters, etc) are by no means MD. And it is a
  bug that on amd64 we've got a 64-bit counters, while on i386 32-bit,
  which at modern speeds overflow within a second.

  This also removes quite a lot of COMPAT_FREEBSD32 code.

o Give 16 bit for the ifi_datalen field. This field was provided to
  make future changes to if_data less ABI breaking. Unfortunately the
  8 bit size of it had effectively limited sizeof if_data to 256 bytes.

o Give 32 bits to ifi_mtu and ifi_metric.
o Give 64 bits to the rest of fields, since they are counters.

__FreeBSD_version bumped.

Discussed with:	emax
Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-03-13 03:42:24 +00:00
Dimitry Andric
86390f9444 Similar to r260020, only use -fms-extensions with gcc, for all other
modules which require this flag to compile.  Use a GCC_MS_EXTENSIONS
variable, defined in kern.pre.mk, which can be used to easily supply the
flag (or not), depending on the compiler type.

MFC after:	3 days
2013-12-30 20:34:53 +00:00
Alfred Perlstein
c3e51c9ce1 Defer start/stop port to workqueues.
We need to do this because the Linux compat layer uses sx(9) for
mutex, however the lagg code uses rmlocks and calls into the mellanox
driver.  This causes deadlock due to sleeping while holding a rmlock.

Submitted by: Shahar Klein (shahark mellanox.com)
MFC After: 3 days.
2013-12-15 07:07:13 +00:00
Eitan Adler
7a22215c53 Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this
shifts into the sign bit.  Instead use (1U << 31) which gets the
expected result.

This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.

A similar change was made in OpenBSD.

Discussed with:	-arch, rdivacky
Reviewed by:	cperciva
2013-11-30 22:17:27 +00:00
Alfred Perlstein
a8eb9360d3 Fix creating a vlan over lagg over mlxen crash.
PR:		181931
Submitted by:	Shahar Klein (shahark mellanox.com)
2013-11-17 20:58:31 +00:00
Alfred Perlstein
a91c93ed22 Do not use a sleep lock when protecting the driver flags.
This was causing a locking issue with lagg

Submitted by:	odeds
2013-11-08 18:28:48 +00:00
Alfred Perlstein
d3e98a133b Fix for bad performance when mtu is increased.
Update the auto moderation behavior in the mlxen driver to match
the new LINUX OFED code.

Submitted by:	odeds
2013-11-08 18:26:28 +00:00
Alfred Perlstein
df5f4d8202 Fix API mismatch exposed by lagg.
When destroying a lagg the driver tries to restore the old mac and
fails due to API mismatch
2013-11-02 10:49:47 +00:00
Alfred Perlstein
c02a5c3915 Fix resource free.
The order of releasing resources in mlxen was wrong, which caused
panic on reload of the module.

conf_ctx list should be released before stat_ctx list, otherwise
the leafs in conf_ctx list won't be released because of the dependancy.

The fix is to change the order of the releases.

Submitted by:	Shahar Klein (shahark at mellanox.com)
2013-10-17 12:19:36 +00:00
Alfred Perlstein
549e8f8b3f Fixed kernel crash when removing IPOIB_CM option from configuration file
Changed module init from module_init() to module_init_order() with
SI_ORDER_MIDDLE flag
Submitted by:	Orit Moskovich (oritm mellanox.com)
Approved by:	re
2013-10-01 15:36:51 +00:00
Alfred Perlstein
c9f432b7ba Update OFED to Linux 3.7 and update Mellanox drivers.
Update the OFED Infiniband core to the version supplied in Linux
version 3.7.

The update to OFED is nearly all additional defines and functions
with the exception of the addition of additional parameters to
ib_register_device() and the reg_user_mr callback.

In addition the ibcore (Infiniband core) and ipoib (IP over Infiniband)
have both been made into completely loadable modules to facilitate
testing of the OFED stack in FreeBSD.

Finally the Mellanox Infiniband drivers are now updated to the
latest version shipping with Linux 3.7.

Submitted by: Mellanox FreeBSD driver team:
                Oded Shanoon (odeds mellanox.com),
                Meny Yossefi (menyy mellanox.com),
                Orit Moskovich (oritm mellanox.com)

Approved by: re
2013-09-29 00:35:03 +00:00
Andre Oppermann
24c6ede6f0 Change m->pkthdr.header to m->pkthdr.PH_loc.ptr after r254804
to transiently store pointers to packet headers.

Sponsored by:	The FreeBSD Foundation
2013-08-25 09:45:26 +00:00
John Baldwin
c183dc9519 Add a missing prototype.
Pointy hat:	me
2013-07-29 20:48:10 +00:00
John Baldwin
575a6840de Various fixes to the mlxen(4) driver:
- Remove an incorrect assertion that can trigger when downing an interface.
- Stop the interface during detach to avoid panics when unloading the
  driver.
- A few locking fixes to be more consistent with other FreeBSD drivers:
  - Protect if_drv_flags with the driver lock, not atomic ops
  - Hold the driver lock when adjusting multicast state.
  - Hold the driver lock while adjusting if_capenable.

PR:		kern/180791 [1,2]
Submitted by:	Shakar Klein @ Mellanox [1,2]
MFC after:	3 days
2013-07-29 18:44:52 +00:00
John Baldwin
ba90c51af3 Avoid trashing IP fragments:
- Only enable UDP/TCP hardware checksums if CSUM_UDP or CSUM_TCP is set.
- Only enable IP hardware checksums if CSUM_IP is set.

PR:		kern/180430
Submitted by:	Meny Yossefi <menyy@mellanox.com>
MFC after:	1 week
2013-07-25 16:34:34 +00:00
John Baldwin
c232b2aebb Rework the previous fix for the IB vs Ethernet sysctl handler to be more
generic and apply to all sysfs attributes:
- Use sysctl_handle_string() instead of reimplementing it.
- Remove trailing newline from the current value before passing it to
  userland and append a newline to the new string value before passing it
  to the attribute's store function.
- Don't leak the temporary buffer if the first error check triggers.
- Revert earlier change to mlx4 port mode handler.

PR:		kern/174213
Submitted by:	Garrett Cooper
Reviewed by:	Shakar Klein @ Mellanox
MFC after:	1 week
2013-07-18 14:06:01 +00:00
John Baldwin
d356583dbc Remove check forbidding requests that would result in one port being set
to Ethernet and the subsequent port being set to IB.

Submitted by:	Shakar Klein @ Mellanox
Tested by:	Morgan Robertson <morganrobertson@gmail.com>
MFC after:	1 week
2013-07-17 13:41:54 +00:00
John Baldwin
335349690f Allow mlx4 devices to switch from Ethernet to Infiniband (and vice versa):
- Fix sysctl wrapper for sysfs attributes to properly handle new string
  values similar to sysctl_handle_string() (only copyin the user's
  supplied length and nul-terminate the string).
- Don't check for a trailing newline when evaluating the desired operating
  mode of a mlx4 device.

PR:		kern/179999
Submitted by:	Shahar Klein <shahark@mellanox.com>
MFC after:	1 week
2013-07-08 21:25:12 +00:00
Eitan Adler
7a2b450ff8 Fxi a bunch of typos.
PR:	misc/174625
Submitted by:	Jeremy Chadwick <jdc@koitsu.org>
2013-05-10 16:41:26 +00:00
Xin LI
3b9ba32d88 Fix LINT build on amd64. 2013-02-09 04:13:45 +00:00
Randall Stewart
ded5ea6a25 This fixes a out-of-order problem with several
of the newer drivers. The basic problem was
that the driver was pulling the mbuf off the
drbr ring and then when sending with xmit(), encounting
a full transmit ring. Thus the lower layer
xmit() function would return an error, and the
drivers would then append the data back on to the ring.
For TCP this is a horrible scenario sure to bring
on a fast-retransmit.

The fix is to use drbr_peek() to pull the data pointer
but not remove it from the ring. If it fails then
we either call the new drbr_putback or drbr_advance
method. Advance moves it forward (we do this sometimes
when the xmit() function frees the mbuf). When
we succeed we always call advance. The
putback will always copy the mbuf back to the top
of the ring. Note that the putback *cannot* be used
with a drbr_dequeue() only with drbr_peek(). We most
of the time, in putback, would not need to copy it
back since most likey the mbuf is still the same, but
sometimes xmit() functions will change the mbuf via
a pullup or other call. So the optimial case for
the single consumer is to always copy it back. If
we ever do a multiple_consumer (for lagg?) we
will  need a test and atomic in the put back possibly
a seperate putback_mc() in the ring buf.

Reviewed by:	jhb@freebsd.org, jlv@freebsd.org
2013-02-07 15:20:54 +00:00
Gleb Smirnoff
eb1b1807af Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually
2012-12-05 08:04:20 +00:00
Eitan Adler
db702c59cf remove duplicate semicolons where possible.
Approved by:	cperciva
MFC after:	1 week
2012-10-22 03:00:37 +00:00
Gleb Smirnoff
063efed28c The drbr(9) API appeared to be so unclear, that most drivers in
tree used it incorrectly, which lead to inaccurate overrated
if_obytes accounting. The drbr(9) used to update ifnet stats on
drbr_enqueue(), which is not accurate since enqueuing doesn't
imply successful processing by driver. Dequeuing neither mean
that. Most drivers also called drbr_stats_update() which did
accounting again, leading to doubled if_obytes statistics. And
in case of severe transmitting, when a packet could be several
times enqueued and dequeued it could have been accounted several
times.

o Thus, make drbr(9) API thinner. Now drbr(9) merely chooses between
  ALTQ queueing or buf_ring(9) queueing.
  - It doesn't touch the buf_ring stats any more.
  - It doesn't touch ifnet stats anymore.
  - drbr_stats_update() no longer exists.

o buf_ring(9) handles its stats itself:
  - It handles br_drops itself.
  - br_prod_bytes stats are dropped. Rationale: no one ever
    reads them but update of a common counter on every packet
    negatively affects performance due to excessive cache
    invalidation.
  - buf_ring_enqueue_bytes() reduced to buf_ring_enqueue(), since
    we no longer account bytes.

o Drivers handle their stats theirselves: if_obytes, if_omcasts.

o mlx4(4), igb(4), em(4), vxge(4), oce(4) and  ixv(4) no longer
  use drbr_stats_update(), and update ifnet stats theirselves.

o bxe(4) was the most correct driver, it didn't call
  drbr_stats_update(), thus it was the only driver accurate under
  moderate load. Now it also maintains stats itself.

o ixgbe(4) had already taken stats from hardware, so just
  - drop software stats updating.
  - take multicast packet count from hardware as well.

o mxge(4) just no longer needs NO_SLOW_STATS define.

o cxgb(4), cxgbe(4) need no change, since they obtain stats
  from hardware.

Reviewed by:	jfv, gnn
2012-09-28 18:28:27 +00:00
Bjoern A. Zeeb
7a1421ee03 Do not announce IPv6 TSO support yet. The driver seems to make assumptions
based on IPv4 header parsing only.

MFC after:	1 week
2012-04-23 21:50:10 +00:00
John Baldwin
ed5a2b61fd Add OFED and the associated options and drivers to x86 LINT builds:
- Mark 'sdp' as requiring 'inet'.
- Always include "opt_inet.h" and "opt_inet6.h" and modify the IB
  driver Makefiles to honor WITH/WITHOUT_INET/INET6/_SUPPORT options
  to determine what should be enabled during a module build.
- Fix the mlxen(4) driver and the core IB code to compile without
  if INET is disabled (including when both INET and INET6 are disabled).

Reviewed by:	bz
MFC after:	2 weeks
2012-04-12 14:01:06 +00:00
John Baldwin
efbebba22a Properly parse 40G media types from newer Mellanox adapters that are
40G capable.  For now, map all 40G links to 40GBase-CR4.

MFC after:	2 weeks
2012-04-10 14:01:09 +00:00
Jeff Roberson
cafd78fc6e - Implement wake-on-lan support in mlxen. 2011-03-26 00:54:01 +00:00
Jeff Roberson
a340f09abe - Correct the vlan filter programming. The device filter is built in
reverse order.
 - Name the cq taskqueues according to whether they handle rx or tx.
 - Default LRO to on.
2011-03-23 02:47:04 +00:00
Jeff Roberson
6146335abb - Don't use a separate set of rx queues for UDP, hash them into the same
set as TCP.
 - Eliminate the fully linear non-scatter/gather rx path, there is no
   harm in using arrays of clusters for both TCP and UDP.
 - Implement support for enabling/disabling per-vlan priority pause and
   queues via sysctl.
2011-03-22 04:50:47 +00:00
Jeff Roberson
aa0a1e58f0 - Merge in OFED 1.5.3 from projects/ofed/head 2011-03-21 09:58:24 +00:00