Commit Graph

308 Commits

Author SHA1 Message Date
Hans Petter Selasky
3884ff1831 Add some defines needed by the coming mlx5 infiniband support.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2015-11-24 12:11:56 +00:00
Enji Cooper
ae9356f143 Don't leak work if __mlx4_register_vlan(..) fails in
mlx4_master_immediate_activate_vlan_qos(..)

MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D4203
Submitted by: Miles Olrich <miles.olrich@isilon.com>
Sponsored by: EMC / Isilon Storage Division
2015-11-19 01:08:16 +00:00
Hans Petter Selasky
0f5150a757 Fix integer to pointer of different size conversion warnings when
using GCC for 32-bit platforms. The integer size in this case is
hardcoded 64-bit while the pointer size is 32-bit.

Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2015-11-12 10:12:20 +00:00
Hans Petter Selasky
3143f07779 Fix print formatting compile warnings for Sparc64 and PowerPC platforms.
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2015-11-12 09:56:25 +00:00
Hans Petter Selasky
8d59ecb214 Finish process of moving the LinuxKPI module into the default kernel build.
- Move all files related to the LinuxKPI into sys/compat/linuxkpi and
  its subfolders.
- Update sys/conf/files and some Makefiles to use new file locations.
- Added description of COMPAT_LINUXKPI to sys/conf/NOTES which in turn
  adds the LinuxKPI to all LINT builds.
- The LinuxKPI can be added to the kernel by setting the
  COMPAT_LINUXKPI option. The OFED kernel option no longer builds the
  LinuxKPI into the kernel. This was done to keep the build rules for
  the LinuxKPI in sys/conf/files simple.
- Extend the LinuxKPI module to include support for USB by moving the
  Linux USB compat from usb.ko to linuxkpi.ko.
- Bump the FreeBSD_version.
- A universe kernel build has been done.

Reviewed by:	np @ (cxgb and cxgbe related changes only)
Sponsored by:	Mellanox Technologies
2015-10-29 08:28:39 +00:00
Hans Petter Selasky
2c8d721186 Add missing FreeBSD RCS keyword and SVN properties.
Sponsored by:	Mellanox Technologies
2015-10-27 12:21:15 +00:00
Hans Petter Selasky
aac7caaf47 Add support for binding IRQs to CPUs in the LinuxKPI. The new function
added is for BSD only and does not exist in Linux.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2015-10-26 13:28:34 +00:00
Hans Petter Selasky
dfcc270f25 Build fix for MIPS.
Sponsored by:	Mellanox Technologies
2015-10-26 09:34:43 +00:00
Hans Petter Selasky
63ec90e212 Build fix for non-i386 and non-amd64 platforms.
Sponsored by:	Mellanox Technologies
2015-10-23 14:52:05 +00:00
Hans Petter Selasky
2da3897d01 Rename linuxapi[.ko] into linuxkpi[.ko], to reflect that it is a
kernel programming interface module, KPI, to avoid confusion with the
existing Linux userspace binary compatibility shims. Bump the
FreeBSD_version number.

Reviewed by:	np @
Suggested by:	dumbbell @
Sponsored by:	Mellanox Technologies
2015-10-22 09:50:45 +00:00
Hans Petter Selasky
03ae38081c Remove all comments deriving from Linux.
Minor rework of ilog2() function.

Suggested by:	emaste @
Sponsored by:	Mellanox Technologies
2015-10-21 09:37:34 +00:00
Hans Petter Selasky
6f2fc610dd Remove all comments deriving from Linux. Style file for FreeBSD.
Suggested by:	emaste @
Sponsored by:	Mellanox Technologies
2015-10-21 08:51:49 +00:00
Hans Petter Selasky
29a2e474c0 Reimplement header file, remove all comments deriving from Linux and
update copyright to 2-clause BSD.

Suggested by:	emaste @
Sponsored by:	Mellanox Technologies
2015-10-21 07:59:46 +00:00
Hans Petter Selasky
382d6bebd3 Move location of RCS keyword according to style.
Suggested by:	jhb @
Sponsored by:	Mellanox Technologies
2015-10-20 19:08:26 +00:00
Hans Petter Selasky
f89453cfb9 Add missing FreeBSD RCS keyword and SVN properties.
Sponsored by:	Mellanox Technologies
2015-10-20 16:02:11 +00:00
Hans Petter Selasky
65324421da Add missing FreeBSD RCS keyword and SVN properties.
Sponsored by:	Mellanox Technologies
2015-10-20 15:28:02 +00:00
Hans Petter Selasky
c0a8182919 Add missing dash to copyright clause.
Sponsored by:	Mellanox Technologies
2015-10-20 11:42:00 +00:00
Hans Petter Selasky
77320fe897 Add missing FreeBSD RCS keyword and SVN properties.
Sponsored by:	Mellanox Technologies
2015-10-20 11:40:04 +00:00
Hans Petter Selasky
3f862b56a1 Merge LinuxKPI changes from DragonflyBSD:
- Remove redundant NBLONG macro and use BIT_WORD()
  and BIT_MASK() instead.
- Correctly define BIT_MASK() according to Linux and
  update all users of this macro.
- Add missing GENMASK() macro.
- Remove all comments deriving from Linux.

Sponsored by:	Mellanox Technologies
2015-10-20 09:13:35 +00:00
Hans Petter Selasky
a4b5fa85df The returned value from vm_fault_disable_pagefaults() must be stored
and passed to vm_fault_enable_pagefaults(). Otherwise, recursion on
the state can be lost.

Sponsored by:	Mellanox Technologies
Suggested by:	kib @
2015-10-19 16:03:08 +00:00
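
Note on a4b5fa85df: the save/restore rule described in that message can be sketched in userspace as follows. This is only a model under stated assumptions: a single global flag stands in for the per-thread state, and the `model_*` names are hypothetical, not the kernel functions themselves.

```c
/* Userspace model: one flag stands in for the per-thread "no faulting" state. */
static int nofault_flag;

/* Hypothetical stand-in for the disable call: returns the previous state. */
static int model_disable_pagefaults(void)
{
	int save = nofault_flag;	/* remember the state before this call */

	nofault_flag = 1;
	return save;
}

/* Hypothetical stand-in for the enable call: restores the saved state. */
static void model_enable_pagefaults(int save)
{
	nofault_flag = save;		/* restore, rather than blindly clearing */
}

/* Returns 1 if leaving an inner section keeps the outer section disabled. */
static int nested_sections_keep_state(void)
{
	int outer = model_disable_pagefaults();
	int inner = model_disable_pagefaults();

	model_enable_pagefaults(inner);
	int still_disabled = nofault_flag;	/* inner exit must not re-enable */
	model_enable_pagefaults(outer);
	return still_disabled && nofault_flag == 0;
}
```

If the enable step simply cleared the flag instead of restoring the saved value, the inner exit would re-enable page faults while the outer section still expected them to be disabled.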
Hans Petter Selasky
2ca3cc5132 Merge LinuxKPI changes from DragonflyBSD:
- Redefine DIV_ROUND_UP as a function macro taking two arguments
  instead of none.
- Implement more Linux kernel functions related to various forms
  of DELAY() and basic mathematical operations.

Sponsored by:	Mellanox Technologies
2015-10-19 12:44:41 +00:00
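
Note on 2ca3cc5132: a two-argument rounding-up division macro of the kind mentioned above might look like the sketch below. The definition shown is the commonly used form and is an assumption here, not a copy of the committed header; `pages_needed()` and the 4096-byte page size are hypothetical, for illustration only.

```c
#include <stddef.h>

/* Illustrative two-argument form; rounds the quotient n / d upwards. */
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Hypothetical helper: number of 4096-byte pages needed to hold len bytes. */
static size_t pages_needed(size_t len)
{
	return DIV_ROUND_UP(len, (size_t)4096);
}
```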
Hans Petter Selasky
e490164bee Merge LinuxKPI changes from DragonflyBSD:
- Implement more Linux kernel functions.

Sponsored by:	Mellanox Technologies
2015-10-19 12:33:09 +00:00
Hans Petter Selasky
f556cede8a Merge LinuxKPI changes from DragonflyBSD:
- Define the kref structure identical to the one found in Linux.
- Update clients referring inside the kref structure.
- Implement kref_sub() for FreeBSD.

Reviewed by:	np @
Sponsored by:	Mellanox Technologies
2015-10-19 12:26:38 +00:00
Hans Petter Selasky
35d974cd0c Merge LinuxKPI changes from DragonflyBSD:
- Map more Linux compiler related defines to FreeBSD ones.

Sponsored by:	Mellanox Technologies
2015-10-19 12:08:06 +00:00
Hans Petter Selasky
f940cc8ffc Map two more Linux error return codes to FreeBSD ones.
Sponsored by:	Mellanox Technologies
2015-10-19 12:04:20 +00:00
Hans Petter Selasky
af5648c465 Implement IS_ERR_OR_NULL() function.
Sponsored by:	Mellanox Technologies
2015-10-19 12:00:52 +00:00
Hans Petter Selasky
2404bdddf1 Merge LinuxKPI changes from DragonflyBSD:
- Add more list related functions and macros.
- Update the hlist_for_each_entry() macro to take one less argument.

Sponsored by:	Mellanox Technologies
2015-10-19 11:57:33 +00:00
Hans Petter Selasky
64bda586e1 Merge LinuxKPI changes from DragonflyBSD:
- Reimplement ktime header file to distinguish more from Linux.
- Add new time header file to handle time related Linux functions.

Sponsored by:	Mellanox Technologies
2015-10-19 11:46:48 +00:00
Hans Petter Selasky
ecfc226c7d Fix compile warning.
Sponsored by:	Mellanox Technologies
2015-10-19 11:29:50 +00:00
Hans Petter Selasky
1610bf8edf Merge LinuxKPI changes from DragonflyBSD:
- Reimplement math64 header file to distinguish more from Linux.

Sponsored by:	Mellanox Technologies
2015-10-19 11:16:38 +00:00
Hans Petter Selasky
96e8192d3c Merge LinuxKPI changes from DragonflyBSD:
- Whitespace fixes.

Sponsored by:	Mellanox Technologies
2015-10-19 11:11:15 +00:00
Hans Petter Selasky
b526833859 Merge LinuxKPI changes from DragonflyBSD:
- Avoid using PAGE_MASK, because Linux defines it differently.
  Use (PAGE_SIZE - 1) instead.
- Add support for for_each_sg_page() and sg_page_iter_dma_address().

Sponsored by:	Mellanox Technologies
2015-10-19 11:09:51 +00:00
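
Note on b526833859: the PAGE_MASK difference referred to above is that FreeBSD defines the mask as the low offset bits while Linux defines it as the complement. A minimal sketch, assuming a 4096-byte page and using hypothetical `MODEL_*` names so that neither system's real macro is redefined:

```c
#define MODEL_PAGE_SIZE		4096UL
#define MODEL_BSD_PAGE_MASK	(MODEL_PAGE_SIZE - 1)		/* selects the offset bits */
#define MODEL_LINUX_PAGE_MASK	(~(MODEL_PAGE_SIZE - 1))	/* selects the page-number bits */

/* Spelling out (PAGE_SIZE - 1) gives the same result under either convention. */
static unsigned long offset_in_page(unsigned long addr)
{
	return addr & (MODEL_PAGE_SIZE - 1);
}
```

Code written against one convention silently computes the wrong half of the address under the other, which is why the explicit expression is the safer choice in shared code.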
Hans Petter Selasky
e6d1c6e382 Merge LinuxKPI changes from DragonflyBSD:
- Implement schedule_timeout().

Sponsored by:	Mellanox Technologies
2015-10-19 10:57:56 +00:00
Hans Petter Selasky
da7c18e051 Merge LinuxKPI changes from DragonflyBSD:
- Implement pagefault_disable() and pagefault_enable().

Sponsored by:	Mellanox Technologies
2015-10-19 10:56:32 +00:00
Hans Petter Selasky
dad154ab93 Merge LinuxKPI changes from DragonflyBSD:
- Added support for multiple new Linux functions.
- Properly implement DEFINE_WAIT() and init_waitqueue_head() macros.
- Removed FreeBSD specific __wait_queue_head structure definition.

Sponsored by:	Mellanox Technologies
2015-10-19 10:54:24 +00:00
Hans Petter Selasky
7c50bc1cf6 Merge LinuxKPI changes from DragonflyBSD:
- Some minor whitespace fixes.
- Added support for two new Linux functions.

Sponsored by:	Mellanox Technologies
2015-10-19 10:49:15 +00:00
Alexander V. Chernikov
fb373bc2b1 Fix build broken by r287861.
Spotted by:	zb
2015-09-16 15:40:08 +00:00
Alexander V. Chernikov
1fe201c322 Simplify the way of attaching IPv6 link-layer header.
Problem description:
How do we currently perform layer 2 resolution and header imposition:

For IPv4 we have the following chain:
  ip_output() -> (ether|atm|whatever)_output() -> arpresolve()

Lookup is done in proper place (link-layer output routine) and it is possible
  to provide cached lle data.

For IPv6 the situation is more complex:
  ip6_output() -> nd6_output() -> nd6_output_ifp() -> (whatever)_output() ->
    nd6_storelladdr()

We have ip6_output() which calls nd6_output() instead of link output routine.
nd6_output() does the following:
  * checks if lle exists, creates it if needed (similar to arpresolve())
  * performs lle state transitions (similar to arpresolve())
  * calls nd6_output_ifp() which pushes packets to link output routine along
    with running SeND/MAC hooks regardless of lle state
    (e.g. works as run-hooks placeholder).

After that, iface output routine like ether_output() calls nd6_storelladdr()
  which performs lle lookup once again.

As a result, we perform lookup twice for each outgoing packet for most types
  of interfaces. We also need to maintain runtime-checked table of 'nd6-free'
  interfaces (see nd6_need_cache()).

Fix this behavior by eliminating first ND lookup. To be more specific:
  * make all nd6_output() consumers use nd6_output_ifp() instead
  * rename nd6_output[_slow]() to nd6_resolve_[slow]()
  * convert nd6_resolve() and nd6_resolve_slow() to arpresolve() semantics,
    e.g. copy L2 address to buffer instead of pushing packet towards lower
    layers
  * Make all nd6_storelladdr() users use nd6_resolve()
  * eliminate nd6_storelladdr()

The resulting callchain is the following:
  ip6_output() -> nd6_output_ifp() -> (whatever)_output() -> nd6_resolve()

Error handling:
Currently sending a packet to non-existing la results in ip6_<output|forward>
  -> nd6_output() -> nd6_output_lle() which returns 0.
In the new scenario the packet is propagated to <ether|whatever>_output() ->
  nd6_resolve() which will return EWOULDBLOCK, and that result
  will be converted to 0.

(And EWOULDBLOCK is actually used by IB/TOE code).

Sponsored by:		Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D1469
2015-09-16 14:26:28 +00:00
Mark Johnston
4af587d062 Ensure that the MAD agent's delayed taskqueue is completely stopped
before proceeding. Otherwise, nothing prevents it from running after the
MAD agent struct has been freed, and this results in a use-after-free
when the task's ta_pending count is incremented in the callout handler.

MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2015-09-15 23:56:31 +00:00
John Baldwin
188458ea7c Currently the Linux character device mmap handling only supports mmap
operations that map a single page that has an associated vm_page_t.
This does not permit mapping larger regions (such as a PCI memory
BAR) and it does not permit mapping addresses beyond the top of RAM
(such as a 64-bit BAR located above the top of RAM).

Instead of using a single OBJT_DEVICE object and passing the physaddr via
the offset as a hack, create a new sglist and OBJT_SG object for each
mmap request. The requested memory attribute is applied to the object
thus affecting all pages mapped by the request.

Reviewed by:	hselasky, np
MFC after:	1 week
Sponsored by:	Chelsio
Differential Revision:	https://reviews.freebsd.org/D3386
2015-09-03 18:27:39 +00:00
Navdeep Parhar
6b5c8394f1 Reinstate unify_tcp_port_space and associated code that was lost during
the last OFED update (r278886).

iWARP on FreeBSD is properly integrated with the network stack and the
iWARP drivers _never_ operate out of any private TCP port-space that is
invisible to the kernel.  Instead, an iWARP connection shows up as a TCP
socket (which is what it is) fully visible to the kernel and standard
tools like netstat, sockstat, etc.
2015-08-12 22:09:58 +00:00
Mark Johnston
8811063172 ipv4_is_zeronet() and ipv4_is_loopback() expect an address in network
order, but IN_ZERONET and IN_LOOPBACK expect it in host order.

Submitted by:	Tao Liu <Tao.Liu@isilon.com>
MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2015-08-07 18:30:11 +00:00
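
Note on 8811063172: the byte-order mismatch described above can be illustrated with the sketch below. The commit message does not say how the mismatch was resolved; converting with ntohl() before the host-order test is one plausible approach and is an assumption here. Both function names are hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical host-order test: is the top octet 127? */
static int host_order_is_loopback(uint32_t host_addr)
{
	return (host_addr >> 24) == 127;
}

/* Network-order wrapper: convert first, then apply the host-order test. */
static int net_order_is_loopback(uint32_t net_addr)
{
	return host_order_is_loopback(ntohl(net_addr));
}
```

On a little-endian machine, passing a network-order address straight to the host-order test would inspect the last octet instead of the first.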
Hans Petter Selasky
5884383f19 Avoid calling into the random subsystem before it is initialized.
Sponsored by:	Mellanox Technologies
2015-08-04 09:45:10 +00:00
Mark Johnston
e2e45da0e8 ib mad: fix an incorrect use of list_for_each_entry
In tf_dequeue(), if we reach the end of the list without finding a
non-cancelled element, "tmp" will be a pointer into the list head, so the
tmp->canceled check is bogus. Use a flag instead.

Submitted by:	Tao Liu <Tao.Liu@isilon.com>
Reviewed by:	hselasky
MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D3244
2015-07-30 18:28:37 +00:00
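
Note on e2e45da0e8: the "use a flag instead" fix can be sketched with a simplified circular list. This is a userspace model, not the tf_dequeue() code: `struct node`, `first_not_canceled()` and `demo_first_id()` are hypothetical, and the sentinel head is a plain node here rather than an embedded list head.

```c
#include <stddef.h>

struct node {
	struct node *next;
	int canceled;
	int id;
};

/* Returns the first non-canceled element, or NULL if there is none. */
static struct node *first_not_canceled(struct node *head)
{
	struct node *tmp;
	int found = 0;

	for (tmp = head->next; tmp != head; tmp = tmp->next) {
		if (!tmp->canceled) {
			found = 1;
			break;
		}
	}
	/* After a full traversal tmp is the head, which is not a real element. */
	return found ? tmp : NULL;
}

/* Builds a two-element list and returns the id found, or -1 if none. */
static int demo_first_id(int cancel_a, int cancel_b)
{
	struct node head, a, b;

	head.next = &a; head.canceled = 0; head.id = -1;
	a.next = &b; a.canceled = cancel_a; a.id = 1;
	b.next = &head; b.canceled = cancel_b; b.id = 2;

	struct node *n = first_not_canceled(&head);
	return n != NULL ? n->id : -1;
}
```

Testing `tmp->canceled` after the loop, instead of the flag, would read a field of the sentinel and could report a match that does not exist.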
Hans Petter Selasky
49557d2481 Fix broken implementation of "kvasprintf()" function by adding missing
kmalloc() call. Make function global instead of static inline to fix
compiler warnings about passing variable argument lists to inline
functions.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2015-07-03 11:16:20 +00:00
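
Note on 49557d2481: a formatted-allocation function needs to size the output, allocate, and only then format into the buffer. The userspace sketch below shows that sequence with the standard C library; it is not the LinuxKPI code, and `model_vasprintf()` and `model_asprintf()` are hypothetical names. The variadic wrapper is an ordinary non-inline function, in line with the message's remark about variable argument lists.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Formats into a freshly allocated buffer; the caller frees the result. */
static char *model_vasprintf(const char *fmt, va_list ap)
{
	va_list copy;
	int len;
	char *buf;

	va_copy(copy, ap);
	len = vsnprintf(NULL, 0, fmt, copy);	/* first pass: measure only */
	va_end(copy);
	if (len < 0)
		return NULL;

	buf = malloc((size_t)len + 1);		/* the allocation step */
	if (buf != NULL)
		vsnprintf(buf, (size_t)len + 1, fmt, ap);
	return buf;
}

/* Variadic front end. */
static char *model_asprintf(const char *fmt, ...)
{
	va_list ap;
	char *s;

	va_start(ap, fmt);
	s = model_vasprintf(fmt, ap);
	va_end(ap);
	return s;
}
```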
Mateusz Guzik
9ef8328d52 fd: make rights a mandatory argument to fget_unlocked
2015-06-16 09:52:36 +00:00
Mateusz Guzik
f6f6d24062 Implement lockless resource limits.
Use the same scheme implemented to manage credentials.

Code needing to look at process's credentials (as opposed to thread's) is
provided with *_proc variants of relevant functions.

Places which possibly had to take the proc lock anyway still use the proc
pointer to access limits.
2015-06-10 10:48:12 +00:00
Gleb Smirnoff
dfd828e931 Add SIOCGI2C ioctl support to the driver. Would work only on ConnectX-3
with fresh firmware. The low level code is based on code provided by
Mellanox.

Thanks to Mellanox and their distributor Must (http://mustcompany.ru)
for providing hardware.

In collaboration with:	Andre Melkoumian <andre mellanox.com>
Reviewed by:		hselasky
Sponsored by:		Netflix
Sponsored by:		Nginx, Inc.
2015-05-27 13:42:28 +00:00
Jung-uk Kim
fd90e2ed54 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
Hans Petter Selasky
d27c74649b Apply proper locking when iterating the multicast addresses and add a
missing check for NULL from a non-blocking "kzalloc()" function call.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Found by:	glebius @
2015-05-12 11:52:34 +00:00
Mark Johnston
760a181bb2 msecs_to_jiffies() is implemented using tvtohz(9), which always returns a
positive value since it adds the current tick to its result. This differs
from the behaviour in Linux, whose implementation does not add the extra
tick, so subtract the extra tick in the OFED compat layer implementation.
This addresses some incorrect handling of IB MAD timeouts, since some IB
code depends on msecs_to_jiffies(0) returning 0.

MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2015-05-10 22:21:00 +00:00
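
Note on 760a181bb2: the off-by-one described above can be modelled as follows. `model_tvtohz()` is a hypothetical stand-in that rounds up and then adds one tick; it is not the real tvtohz(9) and is only meant for small inputs where the arithmetic cannot overflow.

```c
/* Hypothetical conversion that always adds one extra tick to its result. */
static int model_tvtohz(int msecs, int hz)
{
	return (msecs * hz + 999) / 1000 + 1;
}

/* Compat-layer style wrapper: subtract the extra tick so that 0 maps to 0. */
static int model_msecs_to_jiffies(int msecs, int hz)
{
	return model_tvtohz(msecs, hz) - 1;
}
```

With the subtraction in place a zero timeout stays zero, which is the property the message says some IB code depends on.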
Mark Johnston
979b8eaf4b find_next_bit() and find_next_zero_bit(): if the caller-specified offset
lies within the last block of the bit set and no bits are set beyond the
offset, terminate the search immediately instead of continuing as though
there are further blocks in the set and subsequently returning an incorrect
result.

MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2015-05-10 22:04:42 +00:00
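
Note on 979b8eaf4b: the contract being restored is that a search starting inside the last block returns the set size when no further bit is set. The sketch below states that contract with a deliberately simple bit-by-bit loop; it is not the word-at-a-time implementation from the compat layer, and the `model_`/`demo_` names are hypothetical.

```c
#include <limits.h>

#define MODEL_BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

/* Returns the index of the first set bit at or after offset, or size if none. */
static unsigned long model_find_next_bit(const unsigned long *addr,
    unsigned long size, unsigned long offset)
{
	unsigned long i;

	for (i = offset; i < size; i++) {
		if (addr[i / MODEL_BITS_PER_WORD] &
		    (1UL << (i % MODEL_BITS_PER_WORD)))
			return i;
	}
	return size;	/* nothing found: never report a bit beyond the set */
}

/* Demo: a 70-bit set with only bits 0 and 2 set. */
static unsigned long demo_search_from(unsigned long offset)
{
	unsigned long map[3] = { 0x5UL, 0, 0 };

	return model_find_next_bit(map, 70, offset);
}
```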
Mark Johnston
86d4dbf1cb Don't drop the idr lock before verifying that the newly-inserted element
is present in the tree. Otherwise there exists a window during which the
element could be removed by another thread, triggering an incorrect
assertion failure.

Reviewed by:	jeff
MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2015-05-02 00:26:38 +00:00
Mateusz Guzik
90f54cbfeb fd: remove filedesc argument from fdclose
Just accept a thread instead. This makes it consistent with fdalloc.

No functional changes.
2015-04-11 15:40:28 +00:00
Hans Petter Selasky
8de7453501 Fix variable casting:
- Jiffies or ticks in FreeBSD have integer type and are not long.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2015-03-27 19:08:11 +00:00
Hans Petter Selasky
932493ff29 Fixes for the LinuxAPI completion wrappers:
- make sure the timeout computations are always above zero by using
the existing "linux_timer_jiffies_until()" function. Negative timeouts
can result in undefined behaviour.
- declare all completion functions like external symbols and move the
code to the LinuxAPI kernel module.
- add a proper prefix to all LinuxAPI kernel functions to avoid
namespace collision with other parts of the FreeBSD kernel.
- clean up header file inclusions in the linux/completion.h, linux/in.h
and linux/fs.h header files.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2015-03-27 16:16:23 +00:00
Hans Petter Selasky
2162d4f09e Add missing void pointer argument to SYSINIT() functions.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2015-03-18 10:50:10 +00:00
Hans Petter Selasky
54fe6f6bdf Fix problems about 32-bit ticks wraparound and unsigned long
conversion:
- The linux compat API layer casts the ticks to unsigned long which
might cause problems when the ticks value is negative.
- Guard against already expired ticks values, by checking if the
passed expiry tick is already elapsed.
- While at it avoid referring to the address of an inlined function.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2015-03-18 10:49:17 +00:00
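
Note on 54fe6f6bdf: a wraparound-tolerant "ticks until expiry" computation can be sketched as below. This is a model under stated assumptions: 32-bit tick counters, an already expired deadline clamped to one tick, and the usual two's-complement result when converting the unsigned difference to a signed value. `model_ticks_until()` is a hypothetical name.

```c
#include <stdint.h>

/* Returns the number of ticks from now until expires, at least 1. */
static int32_t model_ticks_until(uint32_t now, uint32_t expires)
{
	/* Unsigned subtraction wraps cleanly; the signed view gives the distance. */
	int32_t delta = (int32_t)(expires - now);

	if (delta < 1)
		delta = 1;	/* already expired: fire as soon as possible */
	return delta;
}
```

Comparing the two raw tick values directly, or widening a negative tick value to unsigned long first, would misjudge deadlines that straddle the wraparound point.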
Hans Petter Selasky
805b1f609d Declare missing symbol and inline macro which is only used once.
MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
Submitted by:	glebius@
2015-03-18 08:46:08 +00:00
Hans Petter Selasky
b7ba031ff7 Factor out mbuf hashing code from LAGG driver so that other network
drivers can use it. This avoids some code duplication. Add missing
default case to all switch statements while at it. Also move the
hashing of the IPv6 flow field to layer 4 because the IPv6 flow field
is constant on a per L4 connection basis and not on a per L3 network.

Differential Revision:	https://reviews.freebsd.org/D1987
Sponsored by:		Mellanox Technologies
MFC after:		1 month
2015-03-11 16:02:24 +00:00
Hans Petter Selasky
5a61df011e Ensure that setting promiscuous mode when a network interface is up
is always non-blocking, by not locking an SX type of mutex.

Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-03-10 21:17:10 +00:00
Hans Petter Selasky
e53c954d45 Define PTR_ALIGN() macro which will be needed by coming Mellanox
driver releases.

Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-03-04 09:58:39 +00:00
Hans Petter Selasky
2b8859521d Updates for the Mellanox ethernet driver
> List of fixes:
  * use correct format for GID printouts
  * double array indexing
  * spelling in printouts
  * void pointer arithmetic
  * allow more receive rings
  * correct maximum number of transmit rings
  * use "const" instead of "static" for constants
  * check for invalid VLAN tags
  * check for lack of IRQ resources
> Added more hardware specific defines
> Added more verbose printouts of firmware status codes

Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-03-04 09:30:03 +00:00
Hans Petter Selasky
74db12ca99 Macro fixes:
- Add missing "order_base_2()" macro.
- Fix BUILD_BUG_ON() macro.

MFC after:	1 month
Sponsored by:	Mellanox Technologies
2015-02-23 12:54:46 +00:00
Bjoern A. Zeeb
daa99f441e Try to unbreak NOIP and NOINET6 LINT builds after r278886
by placing appropriate #ifdefs around otherwise unused variables
or sections with functions called which are not available without
IPv6 support in the kernel.
2015-02-19 11:48:00 +00:00
Mateusz Guzik
b7a39e9e07 filedesc: simplify fget_unlocked & friends
Introduce fget_fcntl which performs appropriate checks when needed.
This removes a branch from fget_unlocked.

Introduce fget_mmap dealing with cap_rights_to_vmprot conversion.
This removes a branch from _fget.

Modify fget_unlocked to pass sequence counter to interested callers so
that they can perform their own checks and make sure the result was
obtained from stable & current state.

Reviewed by:	silence on -hackers
2015-02-17 23:54:06 +00:00
Hans Petter Selasky
76a2b0a487 Fix compilation of LINT-NOINET kernel target after r278886.
Sponsored by:	Mellanox Technologies
MFC after:	1 month
2015-02-17 21:59:15 +00:00
Gleb Smirnoff
cc4a90c445 Globally enable -fms-extensions when building kernel with gcc, and remove
this option from all modules that enable it themselves.
  In C mode -fms-extensions option enables anonymous structs and unions,
allowing us to use this C11 feature in kernel. Of course, clang supports
it without any extra options.

Reviewed by:	dim
2015-02-17 19:27:14 +00:00
Hans Petter Selasky
8a8f7d5bad Fix compilation of the SDP driver and a compile warning after r278886.
Also fix the kernel build rule for mlx4_exp.c.
This fixes the LINT kernel target for amd64.

Sponsored by:	Mellanox Technologies
MFC after:	1 month
2015-02-17 10:00:15 +00:00
Hans Petter Selasky
8a3fed4e54 Fix compilation when DEBUG is defined.
Sponsored by:	Mellanox Technologies
MFC after:	1 month
2015-02-17 08:57:36 +00:00
Hans Petter Selasky
b5c1e0cb8d Update the infiniband stack to Mellanox's OFED version 2.1.
Highlights:
 - Multiple verbs API updates
 - Support for RoCE, RDMA over ethernet

All hardware drivers depending on the common infiniband stack have been
updated as well.

Discussed with:	np @
Sponsored by:	Mellanox Technologies
MFC after:	1 month
2015-02-17 08:40:27 +00:00
Hans Petter Selasky
3122da88ca Define standard formatting strings to print GIDs
in a separate header file.

Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-02-16 21:26:16 +00:00
Hans Petter Selasky
7383a0591c The kasprintf() function cannot be inlined due to using a variable
number of arguments. Move it to a C-file in the linuxapi module to
make the function usable.

Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-02-16 21:22:56 +00:00
Hans Petter Selasky
271aa1089b The "frag_info" pointer is already pointing to an array index.
Don't index twice.

Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-02-16 17:05:59 +00:00
Hans Petter Selasky
9cdc138838 Add more functions to the Linux kernel compatibility layer. Add some
missing includes which are needed when the header files are not
included in a particular order.

MFC after:	1 month
Sponsored by:	Mellanox Technologies
2015-02-13 16:35:12 +00:00
Navdeep Parhar
9d3c01e391 Fix bug in idr_pre_get where it doesn't handle 'need' correctly.
Obtained from:	Chelsio Communications' internal repository.
2015-02-02 23:41:43 +00:00
Hans Petter Selasky
a115fb62ed Revert for r277213:
FreeBSD developers need more time to review patches in the surrounding
areas like the TCP stack which are using MPSAFE callouts to restore
distribution of callouts on multiple CPUs.

Bump the __FreeBSD_version instead of reverting it.

Suggested by:		kmacy, adrian, glebius and kib
Differential Revision:	https://reviews.freebsd.org/D1438
2015-01-22 11:12:42 +00:00
Hans Petter Selasky
d39d7c8636 Add missing linuxapi module dependencies and always use the FreeBSD
"MODULE_VERSION" macro definition. Remove the redefinition of the
"MODULE_VERSION" macro from the Linux kernel compatibility API.

MFC after:	1 month
Reported by:	np@
Sponsored by:	Mellanox Technologies
2015-01-19 21:53:00 +00:00
Hans Petter Selasky
7c3892fc82 Add more functions to the Linux kernel compatibility layer. Add some
missing includes which are needed when the header files are not
included in a particular order.

MFC after:	1 month
Sponsored by:	Mellanox Technologies
2015-01-19 20:39:48 +00:00
Hans Petter Selasky
e982e5c561 Start importing the basic OFED linux compatibility layer changes made
by dumbbell@ to be able to compile this layer as a dependency module.
Clean up some Makefiles and remove the no longer used OFED define.
Currently only i386 and amd64 targets are supported.

MFC after:		1 month
Sponsored by:		Mellanox Technologies
2015-01-17 16:36:39 +00:00
Navdeep Parhar
24e2fa2b4d Use parentheses instead of close proximity to ensure layer + 1 is evaluated
before the rest of the expression.
2015-01-16 02:20:24 +00:00
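
Note on 24e2fa2b4d: spacing has no effect on operator precedence, which is the point of the message above. The expression changed by that commit is not shown in this listing, so the sketch below uses a hypothetical multiplication to illustrate the difference.

```c
/* The tight spacing suggests "layer+1" groups first, but '*' binds tighter. */
static int spaced_misleadingly(int bits, int layer)
{
	return bits * layer+1;		/* evaluated as (bits * layer) + 1 */
}

/* Parentheses make the intended grouping explicit. */
static int parenthesized(int bits, int layer)
{
	return bits * (layer + 1);
}
```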
Hans Petter Selasky
1a26c3c047 Major callout subsystem cleanup and rewrite:
- Close a migration race where callout_reset() failed to set the
  CALLOUT_ACTIVE flag.
- Callout callback functions are now allowed to be protected by
  spinlocks.
- Switching the callout CPU number cannot always be done on a
  per-callout basis. See the updated timeout(9) manual page for more
  information.
- The timeout(9) manual page has been updated to reflect how all the
  functions inside the callout API are working. The manual page has
  been made function oriented to make it easier to deduce how each of
  the functions making up the callout API are working without having
  to first read the whole manual page. Group all functions into a
  handful of sections which should give a quick top-level overview
  when the different functions should be used.
- The CALLOUT_SHAREDLOCK flag and its functionality has been removed
  to reduce the complexity in the callout code and to avoid problems
  about atomically stopping callouts via callout_stop(). If someone
  needs it, it can be re-added. From my quick grep there are no
  CALLOUT_SHAREDLOCK clients in the kernel.
- A new callout API function named "callout_drain_async()" has been
  added. See the updated timeout(9) manual page for a complete
  description.
- Update the callout clients in the "kern/" folder to use the callout
  API properly, like cv_timedwait(). Previously there was some custom
  sleepqueue code in the callout subsystem, which has been removed,
  because we now allow callouts to be protected by spinlocks. This
  allows us to tear down the callout like done with regular mutexes,
  and a "td_slpmutex" has been added to "struct thread" to atomically
  teardown the "td_slpcallout". Further the "TDF_TIMOFAIL" and
  "SWT_SLEEPQTIMO" states can now be completely removed. Currently
  they are marked as available and will be cleaned up in a follow up
  commit.
- Bump the __FreeBSD_version to indicate kernel modules need
  recompilation.
- There have been several reports that this patch "seems to squash a
  serious bug leading to a callout timeout and panic".

Kernel build testing:	all architectures were built
MFC after:		2 weeks
Differential Revision:	https://reviews.freebsd.org/D1438
Sponsored by:		Mellanox Technologies
Reviewed by:		jhb, adrian, sbruno and emaste
2015-01-15 15:32:30 +00:00
Hans Petter Selasky
ec680ff8a8 Don't mask the IP-address when doing multicast IP over infiniband.
PR:		196631
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2015-01-09 06:39:07 +00:00
Hans Petter Selasky
5e3cd9e19c Use the M_SIZE() macro when possible.
MFC after:	3 days
Suggested by:	rwatson@
2015-01-08 14:58:54 +00:00
Hans Petter Selasky
fe657d68f5 Fixes and updates for the Linux compatibility layer:
- Remove unsupported "bus" field from "struct pci_dev".
- Fix logic inside "pci_enable_msix()" when the number of allocated
  interrupts is less than the number of available interrupts.
- Update header files included from "list.h".
- Ensure that "idr_destroy()" removes all entries before destroying
  the IDR root node(s).
- Set the "device->release" function so that we don't leak memory at
  device destruction.
- Use FreeBSD's "log()" function for certain debug printouts.
- Put parentheses around arguments inside the min, max, min_t and max_t macros.
- Make sure we don't leak file descriptors by dropping the extra file
  reference counts done by the FreeBSD kernel when calling falloc()
  and fget_unlocked().

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2015-01-06 10:02:14 +00:00
Hans Petter Selasky
a94c3b7d98 Make sure callbacks being freed are not pending when the
"mlx4_en_deactivate_cq()" function returns.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2014-12-11 10:47:50 +00:00
Hans Petter Selasky
4184fb4d2a Move OFED init a bit earlier so that PXE boot works.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-12-09 08:56:01 +00:00
Craig Rodrigues
7b345d3999 Use CURVNET macros inside inet_get_local_port_range() function.
Without this fix, a kernel with VIMAGE + Infiniband will panic on bootup.

Certain necessary #include statements require LIST_HEAD.
Add these includes to ofed/include/linux/list.h, because
LIST_HEAD is specifically overridden in this file.

PR: 191468
Differential Revision: D1279
Reviewed by: hselasky
2014-12-08 07:25:59 +00:00
Hans Petter Selasky
c25290420e Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but no longer has any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.

MFC after:	1 month
Sponsored by:	Mellanox Technologies
2014-12-01 11:45:24 +00:00
Gleb Smirnoff
651e4e6a30 Merge from projects/sendfile: extend protocols API to support
sending not ready data:
o Add new flag to pru_send() flags - PRUS_NOTREADY.
o Add new protocol method pru_ready().

Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2014-11-30 13:24:21 +00:00
Alexander V. Chernikov
74860d4f7c Do not return unlocked/unreferenced lle in arpresolve/nd6_storelladdr -
return lle flags IFF needed.
Do not pass rte to arpresolve - pass is_gateway flag instead.
2014-11-27 23:06:25 +00:00
Gleb Smirnoff
cfa6009e36 In preparation of merging projects/sendfile, transform bare access to
sb_cc member of struct sockbuf to a couple of inline functions:

sbavail() and sbused()

Right now they are equal, but once the notion of "not ready socket buffer
data" is checked in, they are going to be different.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-12 09:57:15 +00:00
Hans Petter Selasky
4f8c17ab35 Fix compile warning by removing unused variable.
MFC:		3 days
Sponsored by:	Mellanox Technologies
2014-10-30 16:57:56 +00:00
Hans Petter Selasky
7cabdf161d Update the network interface baudrate integer according to the actual
line rate.

Submitted by:	jhb @
MFC after:	1 week
2014-10-24 16:39:01 +00:00
Hans Petter Selasky
f0188618f2 Fix multiple incorrect SYSCTL arguments in the kernel:
- Wrong integer type was specified.

- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.

- Logical OR where binary OR was expected.

- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.

- Properly assert the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, because
there is no easy way to assert types in the C language outside a
C-function.

- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.

- Updated "EXAMPLES" section in SYSCTL manual page.
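
The "logical OR where binary OR was expected" item can be illustrated with a
small sketch. The flag names and values below are hypothetical and chosen for
the example; they are not claimed to be the kernel's CTLFLAG definitions.

```c
/* Hypothetical flag bits, for illustration only. */
#define EX_FLAG_RD	0x80000000u
#define EX_FLAG_MPSAFE	0x00040000u

/* Bug pattern: "||" yields 0 or 1, so the individual flag bits are lost. */
static unsigned int
combine_wrong(unsigned int a, unsigned int b)
{
	return (a || b);
}

/* Intended: "|" merges the bit masks. */
static unsigned int
combine_right(unsigned int a, unsigned int b)
{
	return (a | b);
}
```

With the values above, the logical form collapses to 1 while the bitwise form
keeps both bits set, which is why a compile-time assertion on the "access"
argument can catch this class of mistake.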

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-21 07:31:21 +00:00
Hans Petter Selasky
2c6eb461a7 Update the OFED Linux compatibility layer and
Mellanox hardware driver(s):

- Properly name an inclusion guard
- Fix compile warnings regarding unsigned enums
- Add two new sysctl nodes
- Remove all empty linux header files
- Make an error printout more verbose
- Use "mod_delayed_work()" instead of
  cancelling and starting a timeout.
- Implement more Linux scatterlist
  functions.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-15 13:40:29 +00:00
Hans Petter Selasky
15983afb04 Fix compile warning when compiling with GCC.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-07 10:04:25 +00:00
Gleb Smirnoff
a917ed7723 Mechanically convert to if_inc_counter(). 2014-09-27 20:39:24 +00:00
Hans Petter Selasky
7b8455ae34 Update code to use new network counter API.
Fix some minor compile warnings while at it.

Sponsored by:	Mellanox Technologies
Suggested by:	glebius@
MFC after:	1 week
2014-09-24 08:28:34 +00:00
Hans Petter Selasky
f02f742280 Hardware driver update from Mellanox Technologies, including:
 - improved performance
 - better stability
 - new features
 - bugfixes

Supported HCAs:
 - ConnectX-2
 - ConnectX-3
 - ConnectX-3 Pro

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2014-09-23 12:37:01 +00:00
John Baldwin
9696feebe2 Add a new fo_fill_kinfo fileops method to add type-specific information to
struct kinfo_file.
- Move the various fill_*_info() methods out of kern_descrip.c and into the
  various file type implementations.
- Rework the support for kinfo_ofile to generate a suitable kinfo_file object
  for each file and then convert that to a kinfo_ofile structure rather than
  keeping a second, different set of code that directly manipulates
  type-specific file information.
- Remove the shm_path() and ksem_info() layering violations.

Differential Revision:	https://reviews.freebsd.org/D775
Reviewed by:	kib, glebius (earlier version)
2014-09-22 16:20:47 +00:00
Hans Petter Selasky
9fd573c39d Improve transmit sending offload, TSO, algorithm in general.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware are limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot be loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible, per input from other FreeBSD developers, and will use
defaults for values not set.
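
A rough sketch of checking a chain against all three limits. The type and
field names are hypothetical, and treating a zero limit as "not set, do not
enforce" is an assumption made for the example; it is not taken from the
commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device limits; 0 means "not set" in this sketch. */
struct tso_limits {
	size_t max_bytes;	/* total bytes in one chain */
	size_t max_frag_count;	/* number of fragments in one chain */
	size_t max_frag_size;	/* bytes in a single fragment */
};

/* Hypothetical stand-in for one buffer in an mbuf chain. */
struct frag {
	size_t len;
	const struct frag *next;
};

/* Return true if the chain respects every limit that is set. */
static bool
chain_fits(const struct frag *f, const struct tso_limits *lim)
{
	size_t total = 0, count = 0;

	for (; f != NULL; f = f->next) {
		if (lim->max_frag_size != 0 && f->len > lim->max_frag_size)
			return (false);
		total += f->len;
		count++;
	}
	if (lim->max_frag_count != 0 && count > lim->max_frag_count)
		return (false);
	if (lim->max_bytes != 0 && total > lim->max_bytes)
		return (false);
	return (true);
}
```

A chain of three 1000-byte fragments passes a byte-only limit of 65535 but
fails once a fragment-count limit of 2 is also set, which is the case the
byte-only check could not express.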

Reviewed by:	adrian, rmacklem
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2014-09-22 08:27:27 +00:00
Hans Petter Selasky
72f3100047 Revert r271504. A new patch to solve this issue will be made.
Suggested by:	adrian @
2014-09-13 20:52:01 +00:00
Hans Petter Selasky
eb93b77ae4 Improve transmit sending offload, TSO, algorithm in general.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware are limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot be loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible, per input from other FreeBSD developers, and will use
defaults for values not set.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2014-09-13 08:26:09 +00:00
John Baldwin
2d69d0dcc2 Fix various issues with invalid file operations:
- Add invfo_rdwr() (for read and write), invfo_ioctl(), invfo_poll(),
  and invfo_kqfilter() for use by file types that do not support the
  respective operations.  Home-grown versions of invfo_poll() were
  universally broken (they returned an errno value, invfo_poll()
  uses poll_no_poll() to return an appropriate event mask).  Home-grown
  ioctl routines also tended to return an incorrect errno (invfo_ioctl
  returns ENOTTY).
- Use the invfo_*() functions instead of local versions for
  unsupported file operations.
- Reorder fileops members to match the order in the structure definition
  to make it easier to spot missing members.
- Add several missing methods to linuxfileops used by the OFED shim
  layer: fo_write(), fo_truncate(), fo_kqfilter(), and fo_stat().  Most
  of these used invfo_*(), but a dummy fo_stat() implementation was
  added.
2014-09-12 21:29:10 +00:00
Bjoern A. Zeeb
a10816090b Forward declare struct kiocb, which is only used for an unused function
argument but not actually defined anywhere.

This fixes the compiler complaining about
"declaration of 'struct kiocb' will not be visible outside of this function".

MFC after:	2 weeks
X-MFC with:	whatever changed caused the breakage ;-)
2014-08-29 14:47:05 +00:00
Hans Petter Selasky
c7818b48b6 - Update the OFED Linux Emulation layer as a preparation for a
hardware driver update from Mellanox Technologies.
- Remove empty files from the OFED Linux Emulation layer.
- Fix compile warnings related to printf() and the "%lld" and "%llx"
format specifiers.
- Add some missing 2-clause BSD copyrights.
- Add "Mellanox Technologies, Ltd." to list of copyright holders.
- Add some new compatibility files.
- Fix order of uninit in the mlx4ib module to avoid crash at unload
using the new module_exit_order() function.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2014-08-27 13:21:53 +00:00
Roger Pau Monné
073bf9dd70 pci: make MSI(-X) enable and disable methods of the PCI bus
Make the functions pci_disable_msi, pci_enable_msi and pci_enable_msix
methods of the newbus PCI bus. This code should not include any
functional change.

Sponsored by: Citrix Systems R&D
Reviewed by: imp, jhb
Differential Revision: https://reviews.freebsd.org/D354

dev/pci/pci.c:
 - Convert the mentioned functions to newbus methods.
 - Fix the callers of the converted functions.

sys/dev/pci/pci_private.h:
dev/pci/pci_if.m:
 - Declare the new methods.

dev/pci/pcivar.h:
 - Add helpers to call the newbus methods.

ofed/include/linux/pci.h:
 - Add define to prevent the ofed version of pci_enable_msix from
   clashing with the FreeBSD native version.
2014-08-20 14:57:20 +00:00
Hans Petter Selasky
918ba0175b - Fix radix tree memory leakage when unloading modules using radix
trees. This happens because the logic inserting items into the radix
tree is allocating empty radix levels, when index zero does not
contain any items.
- Add proper error case handling, so that the radix tree does not end
up in a bad state, if memory cannot be allocated during insertion of
an item.
- Add check for inserting NULL items into the radix tree.
- Add check for radix tree getting too big.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2014-08-12 11:45:57 +00:00
Hans Petter Selasky
5cb6b3afa4 Fix OFED startup order: All SYSINIT()'s and modules should be loaded
prior to starting "/sbin/init" which will run all the "/etc/rc.d/xxx"
scripts. Else there can be a race configuring the interfaces via
"/etc/rc.conf".

MFC after:	4 weeks
Sponsored by:	Mellanox Technologies
2014-07-06 14:22:13 +00:00
Hans Petter Selasky
22239af86c Fix compile warning.
MFC after:	4 weeks
Sponsored by:	Mellanox Technologies
2014-07-06 14:20:47 +00:00
Hans Petter Selasky
d291b07865 Fix some compile warnings.
MFC after:	4 weeks
Sponsored by:	Mellanox Technologies
2014-07-06 14:14:07 +00:00
Hans Petter Selasky
4813ad54f8 Compile fixes:
Remove duplicate "debug_ktr.mask" sysctl definition.
Remove now unused variable from "kern_ktr.c".
This fixes build of "ktr" which was broken by r267961.

Let the default value for "vm_kmem_size_scale" be zero. After the
sysctl has been initialized from "getenv()", it is set up in the
"kmeminit()" function to equal the "VM_KMEM_SIZE_MAX" value, if
zero. On Sparc64 the "VM_KMEM_SIZE_MAX" macro is not a constant. This
fixes build of Sparc64 which was broken by r267961.

Add a special macro to dynamically create SYSCTL root nodes, because
root nodes have a special parent. This fixes build of existing OFED
module and CANBUS module for pc98 which was broken by r267961.

Add missing "sysctl.h" includes to get the needed sysctl header file
declarations. This is needed after r267961.

MFC after:	2 weeks
2014-06-28 17:36:18 +00:00
Hans Petter Selasky
648ad2e720 - Fix out of range shifting bug in bitops.h.
- Make code a bit easier to read by adding parentheses.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-06-12 13:33:01 +00:00
Warner Losh
c6063d0da8 Use src.opts.mk in preference to bsd.own.mk except where we need stuff
from the latter.
2014-05-06 04:22:01 +00:00
Bryan Drewery
44f1c91610 Rename global cnt to vm_cnt to avoid shadowing.
To reduce the diff struct pcpu.cnt field was not renamed, so
PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in
kvm(3) and vmstat(8). The goal was to not affect externally used KPI.

Bump __FreeBSD_version in case some out-of-tree module/code relies on
the global cnt variable.

Exp-run revealed no ports using it directly.

No objection from:	arch@
Sponsored by:	EMC / Isilon Storage Division
2014-03-22 10:26:09 +00:00
Gleb Smirnoff
b245f96c44 Since 32-bit if_baudrate isn't enough to describe a baud rate of a 10 Gbit
interface, in the r241616 a crutch was provided. It didn't work well, and
finally we decided that it is time to break ABI and simply make if_baudrate
a 64-bit value. Meanwhile, the entire struct if_data was reviewed.
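
The arithmetic behind the problem, as a small sketch: 10 Gbit/s is
10,000,000,000 bits per second, which exceeds the largest 32-bit unsigned
value of 4,294,967,295. The helper name below is hypothetical.

```c
#include <stdint.h>

/* Store a line rate in a 32-bit field, as the old if_baudrate did. */
static uint32_t
baudrate_as_32bit(uint64_t bits_per_second)
{
	/* The conversion keeps only the low 32 bits (value mod 2^32). */
	return ((uint32_t)bits_per_second);
}
```

For 10,000,000,000 the stored value is 10,000,000,000 - 2 * 4,294,967,296 =
1,410,065,408, so a 10 Gbit interface would appear to run at roughly
1.4 Gbit/s. A 64-bit field holds the value unchanged.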

o Remove the if_baudrate_pf crutch.

o Make all fields of struct if_data fixed machine independent size. The
  notion of data (packet counters, etc) are by no means MD. And it is a
  bug that on amd64 we've got a 64-bit counters, while on i386 32-bit,
  which at modern speeds overflow within a second.

  This also removes quite a lot of COMPAT_FREEBSD32 code.

o Give 16 bit for the ifi_datalen field. This field was provided to
  make future changes to if_data less ABI breaking. Unfortunately the
  8 bit size of it had effectively limited sizeof if_data to 256 bytes.

o Give 32 bits to ifi_mtu and ifi_metric.
o Give 64 bits to the rest of fields, since they are counters.

__FreeBSD_version bumped.

Discussed with:	emax
Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-03-13 03:42:24 +00:00
Alexander V. Chernikov
95fbe4d0cc Simplify filling sockaddr_dl structure for if_resolvemulti()
callback providers. link_init_sdl() function can be used to
fill most of the parameters. Use caller stack instead of
allocation / freeing memory for each request. Do not drop support
for extra-long (probably non-existing) link-layer protocols by
introducing link_alloc_sdl() (used by if_resolvemulti() callback)
and link_free_sdl() (used by caller).
Since this change breaks KBI, MFC requires slightly different approach
(link_init_sdl() auto-allocating buffer if necessary to handle cases
 with unmodified if_resolvemulti() callers).

MFC after:	2 weeks
2014-01-18 23:24:51 +00:00
Dimitry Andric
86390f9444 Similar to r260020, only use -fms-extensions with gcc, for all other
modules which require this flag to compile.  Use a GCC_MS_EXTENSIONS
variable, defined in kern.pre.mk, which can be used to easily supply the
flag (or not), depending on the compiler type.

MFC after:	3 days
2013-12-30 20:34:53 +00:00
Alfred Perlstein
c3e51c9ce1 Defer start/stop port to workqueues.
We need to do this because the Linux compat layer uses sx(9) for
mutex, however the lagg code uses rmlocks and calls into the mellanox
driver.  This causes deadlock due to sleeping while holding a rmlock.

Submitted by: Shahar Klein (shahark mellanox.com)
MFC After: 3 days.
2013-12-15 07:07:13 +00:00
Eitan Adler
7a22215c53 Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this
shifts into the sign bit.  Instead use (1U << 31) which gets the
expected result.

This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.
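
A short sketch of the fix. With a 32-bit "int", the expression "(1 << 31)"
shifts a signed 1 into the sign bit, which the C standard leaves undefined;
making the operand unsigned gives a well-defined result. The helper names are
hypothetical.

```c
#include <stdint.h>

/*
 * Undefined with a 32-bit int, so it is shown only as a comment:
 *	int bad = (1 << 31);
 */

/* Well-defined when unsigned int is at least 32 bits wide. */
static unsigned int
top_bit_unsigned(void)
{
	return (1U << 31);
}

/* Variant that starts from a 32-bit unsigned constant. */
static uint32_t
top_bit_u32(void)
{
	return (UINT32_C(1) << 31);
}
```

Both helpers return 0x80000000 on platforms with a 32-bit "int", which
matches the caveat in the commit message about assuming that width.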

A similar change was made in OpenBSD.

Discussed with:	-arch, rdivacky
Reviewed by:	cperciva
2013-11-30 22:17:27 +00:00
Alfred Perlstein
a8eb9360d3 Fix creating a vlan over lagg over mlxen crash.
PR:		181931
Submitted by:	Shahar Klein (shahark mellanox.com)
2013-11-17 20:58:31 +00:00
Alfred Perlstein
a91c93ed22 Do not use a sleep lock when protecting the driver flags.
This was causing a locking issue with lagg

Submitted by:	odeds
2013-11-08 18:28:48 +00:00
Alfred Perlstein
d3e98a133b Fix for bad performance when mtu is increased.
Update the auto moderation behavior in the mlxen driver to match
the new LINUX OFED code.

Submitted by:	odeds
2013-11-08 18:26:28 +00:00
Alfred Perlstein
58f91ead4b Use explicit long cast to avoid overflow in bitops.
This was causing problems with the buddy allocator inside of
ofed.
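
The kind of overflow involved can be sketched as follows. This is not the
code from the commit: the macro and function names are hypothetical, and the
sketch only shows why the shifted constant must be as wide as the word that
stores the bits.

```c
#include <limits.h>

#define EX_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/* Mask for bit "nr" within its word; 1UL keeps the shift long-sized. */
static unsigned long
ex_bit_mask(unsigned int nr)
{
	return (1UL << (nr % EX_BITS_PER_LONG));
}

/* Index of the word that holds bit "nr". */
static unsigned long
ex_bit_word(unsigned int nr)
{
	return (nr / EX_BITS_PER_LONG);
}
```

Writing "1 << n" instead would shift an "int", so any bit position at or
above the width of "int" would overflow even though the destination word is
a "long".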

Submitted by: odeds
2013-11-08 18:20:19 +00:00
Alfred Perlstein
df5f4d8202 Fix API mismatch exposed by lagg.
When destroying a lagg the driver tries to restore the old mac and
fails due to API mismatch
2013-11-02 10:49:47 +00:00
Gleb Smirnoff
76039bc84f The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
for this event by adding if_var.h to files that do need it. Also, add
all includes that are currently pulled in only via implicit pollution from if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
Alfred Perlstein
c02a5c3915 Fix resource free.
The order of releasing resources in mlxen was wrong, which caused
panic on reload of the module.

conf_ctx list should be released before stat_ctx list, otherwise
the leaves in conf_ctx list won't be released because of the dependency.

The fix is to change the order of the releases.

Submitted by:	Shahar Klein (shahark at mellanox.com)
2013-10-17 12:19:36 +00:00
Alfred Perlstein
7c1be871e4 Fix __free_pages() in the linux shim.
__free_pages() is actually supposed to take a "struct page *" not
an address.
2013-10-15 15:50:43 +00:00
Alfred Perlstein
59175f9f30 Fix for When more than one NIC is present.
The device name was incorrect due to a specific function we ported
from the Linux driver that is not FBSD compatible.  This resulted
in a false sysctl registration and some more problematic issues.

The patch basically revokes it altogether.

Submitted by: Meny Yossefi (menyy mellanox.com)

Approved by:	re
2013-10-10 14:03:03 +00:00
Dimitry Andric
cb53fc2dad Remove redundant declaration of cmclass in
sys/ofed/drivers/infiniband/core/ucm.c, to silence a gcc warning.

Approved by:	re (kib)
X-MFC-With:	r255932
2013-10-09 07:02:03 +00:00
Dimitry Andric
7ea82d6741 Give an unnamed union in sys/ofed/include/rdma/ib_verbs.h a name, to
silence a gcc warning.

Approved by:	re (gjb)
MFC after:      3 days
2013-10-07 16:54:29 +00:00
Alfred Perlstein
ef4f67f0b8 Fixed kernel crash when running devinfo
When calling to ib_uverbs_cleanup_ucontext, there is a call to
mutex_lock of xrcd_table_mutex, which was not initialized.
Added missing initialization for xrcd_table_mutex.

Submitted by: Orit Moskovich (oritm mellanox.com)

Approved by:	re
2013-10-01 15:43:23 +00:00
Alfred Perlstein
e18c176d9d Enable ib_dev.mmap function
Removed the ifdef linux from this function.
Added stub function for contiguous pages to avoid compilation
errors.

Submitted by:	Orit Moskovich (oritm mellanox.com)
Approved by:	re
2013-10-01 15:42:38 +00:00
Alfred Perlstein
ef7a7bb8ab Fixed 'Couldn't Create QP' issue when running rc_pingpong, uc_pingpong,
srq_pingpong IBverbs

Removed references using 'ifdef __linux__' to qpg functions and
related fields in struct
ib_qp_init_attr.

Submitted by: Orit Moskovich (oritm mellanox.com)

Approved by:	re
2013-10-01 15:38:29 +00:00
Alfred Perlstein
549e8f8b3f Fixed kernel crash when removing IPOIB_CM option from configuration file
Changed module init from module_init() to module_init_order() with
SI_ORDER_MIDDLE flag
Submitted by:	Orit Moskovich (oritm mellanox.com)
Approved by:	re
2013-10-01 15:36:51 +00:00
Alfred Perlstein
babfea95f9 Fix mis-merge of upstream fix.
We would accidentally make the string one byte too short.

Submitted by: Orit Moskovich (oritm mellanox.com)

Approved by:	re
2013-10-01 15:33:00 +00:00
Alfred Perlstein
c9f432b7ba Update OFED to Linux 3.7 and update Mellanox drivers.
Update the OFED Infiniband core to the version supplied in Linux
version 3.7.

The update to OFED is nearly all additional defines and functions
with the exception of the addition of additional parameters to
ib_register_device() and the reg_user_mr callback.

In addition the ibcore (Infiniband core) and ipoib (IP over Infiniband)
have both been made into completely loadable modules to facilitate
testing of the OFED stack in FreeBSD.

Finally the Mellanox Infiniband drivers are now updated to the
latest version shipping with Linux 3.7.

Submitted by: Mellanox FreeBSD driver team:
                Oded Shanoon (odeds mellanox.com),
                Meny Yossefi (menyy mellanox.com),
                Orit Moskovich (oritm mellanox.com)

Approved by: re
2013-09-29 00:35:03 +00:00
Pawel Jakub Dawidek
ab568de789 Handle cases where capability rights are not provided.
Reported by:	kib
2013-09-05 11:58:12 +00:00
Andre Oppermann
24c6ede6f0 Change m->pkthdr.header to m->pkthdr.PH_loc.ptr after r254804
to transiently store pointers to packet headers.

Sponsored by:	The FreeBSD Foundation
2013-08-25 09:45:26 +00:00
Navdeep Parhar
f336c6303e Fix implementation of sock_getname.
MFC after:	1 week
2013-08-23 18:54:27 +00:00
John Baldwin
c8e1113b37 Stop an ipoib interface before detaching it.
PR:		kern/181225
Submitted by:	Shahar Klein
Obtained from:	Mellanox
MFC after:	1 week
2013-08-20 18:08:06 +00:00
Andre Oppermann
86bd049144 Add m_clrprotoflags() to clear protocol specific mbuf flags at up and
downwards layer crossings.

Consistently use it within IP, IPv6 and ethernet protocols.

Discussed with:	trociny, glebius
2013-08-19 13:27:32 +00:00
Gleb Smirnoff
ca04d21d5f Make sendfile() a method in the struct fileops. Currently only
vnode backed file descriptors have this method implemented.

Reviewed by:	kib
Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2013-08-15 07:54:31 +00:00
Jeff Roberson
863c7e4562 - Reserve a special AF for SDP. The one we were incorrectly using before
was taken by another AF.

Sponsored by:	EMC / Isilon Storage Division
2013-08-09 03:26:17 +00:00
Jeff Roberson
3d71c6cf45 - Correctly handle various edge cases in sysfs emulation.
Sponsored by:	EMC / Isilon Storage Division
2013-08-09 03:24:48 +00:00
Jeff Roberson
ba397d0f16 - Use the correct type in the linux bitops emulation.
Submitted by:	Maxim Ignatenko <gelraen.ua@gmail.com>
2013-08-09 03:24:12 +00:00
Konstantin Belousov
449c2e92c9 Split the pagequeues per NUMA domains, and split pagedaemon process
into threads each processing queue in a single domain.  The structure
of the pagedaemons and queues is kept intact, most of the changes come
from the need for code to find an owning page queue for given page,
calculated from the segment containing the page.

The tie between NUMA domain and pagedaemon thread/pagequeue split is
rather arbitrary, the multithreaded daemon could be allowed for the
single-domain machines, or one domain might be split into several page
domains, to further increase concurrency.

Right now, each pagedaemon thread tries to reach the global target,
precalculated at the start of the pass.  This is not optimal, since it
could cause excessive page deactivation and freeing.  The code should
be changed to re-check the global page deficit state in the loop after
some number of iterations.

The pagedaemons reach the quorum before starting the OOM, since one
thread's inability to meet the target is normal for split queues.  Only
when all pagedaemons fail to produce enough reusable pages, OOM is
started by a single selected thread.

Launder is modified to take into account the segments layout with
regard to the region for which cleaning is performed.

Based on the preliminary patch by jeff, sponsored by EMC / Isilon
Storage Division.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-08-07 16:36:38 +00:00
Jeff Roberson
5df87b21d3 Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation.

 - Normalize functions that allocate memory to use kmem_*
 - Those that allocate address space are named kva_*
 - Those that operate on maps are named kmap_*
 - Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	EMC / Isilon Storage Division
2013-08-07 06:21:20 +00:00
John Baldwin
c183dc9519 Add a missing prototype.
Pointy hat:	me
2013-07-29 20:48:10 +00:00