Introduction
============

libibverbs is a library that allows programs to use RDMA "verbs" for
direct access to RDMA (currently InfiniBand and iWARP) hardware from
userspace.  For more information on RDMA verbs, see the InfiniBand
Architecture Specification vol. 1, especially chapter 11, and the RDMA
Consortium's RDMA Protocol Verbs Specification.

Using libibverbs
================

Device nodes
------------

The verbs library expects special character device files named
/dev/infiniband/uverbsN to be created.  When you load the kernel
modules, including both the low-level driver for your IB hardware and
the ib_uverbs module, you should see one or more uverbsN entries in
/sys/class/infiniband_verbs in addition to the
/dev/infiniband/uverbsN character device files.

To create the appropriate character device files automatically with
udev, a rule like

  KERNEL=="uverbs*", NAME="infiniband/%k"

can be used.  This will create device nodes named

  /dev/infiniband/uverbs0

and so on.  Since the RDMA userspace verbs should be safe for use by
non-privileged users, you may want to add an appropriate MODE or GROUP
to your udev rule.

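To confirm that every uverbsN entry in sysfs has a matching device
node, a check along the following lines may help.  This is only a
sketch: the function name missing_device_nodes is made up for
illustration, and it assumes the two directories named in this section.

```python
import os


def missing_device_nodes(sysfs_dir="/sys/class/infiniband_verbs",
                         dev_dir="/dev/infiniband"):
    """Return uverbsN names listed in sysfs_dir that have no node in dev_dir."""
    try:
        entries = os.listdir(sysfs_dir)
    except FileNotFoundError:
        # The sysfs directory is absent, most likely because ib_uverbs
        # is not loaded; there is nothing to compare against.
        return []
    # Skip anything in the directory that is not a uverbsN entry.
    names = sorted(e for e in entries if e.startswith("uverbs"))
    return [n for n in names
            if not os.path.exists(os.path.join(dev_dir, n))]


if __name__ == "__main__":
    for name in missing_device_nodes():
        print("no device node found for", name)
```

An empty result means either that all nodes exist or that no uverbsN
entries were found at all, so it is worth looking at the sysfs
directory by hand as well.
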
Permissions
-----------

To use IB verbs from userspace, a process must be able to access the
appropriate /dev/infiniband/uverbsN special device file.  You can
check the permissions on this file with the command

  ls -l /dev/infiniband/uverbs*

Make sure that the permissions on these files are such that the
user/group that your verbs program runs as can access the device file.

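The same check can be made from inside the account that will run the
verbs program.  The sketch below assumes that both read and write
access to the device file are needed; the function name
inaccessible_devices is hypothetical.

```python
import glob
import os


def inaccessible_devices(pattern="/dev/infiniband/uverbs*"):
    """Return the files matching pattern that this process cannot read and write."""
    return [path for path in sorted(glob.glob(pattern))
            if not os.access(path, os.R_OK | os.W_OK)]


if __name__ == "__main__":
    for path in inaccessible_devices():
        print("cannot open for read/write:", path)
```

Note that os.access() tests the real user and group IDs of the
process, so run the script as the same user that will run the verbs
application.
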
To use IB verbs from userspace, a process must also have permission to
tell the kernel to lock sufficient memory for all of your registered
memory regions as well as the memory used internally by IB resources
such as queue pairs (QPs) and completion queues (CQs).  To check your
resource limits, use the command

  ulimit -l

(or "limit memorylocked" for csh-like shells).

If you see a small number such as 32 (the units are KB) then you will
need to increase this limit.  This is usually done for ordinary users
via the file /etc/security/limits.conf.  More configuration may be
necessary if you are logging in via OpenSSH and your sshd is
configured to use privilege separation.

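A process can also read its own locked-memory limit directly, which is
useful when the limit seen by a login shell differs from the one the
application actually inherits.  The following is a minimal sketch
using the RLIMIT_MEMLOCK resource limit; the function name
memlock_limit_kib is made up for illustration.  The kernel reports the
limit in bytes, so the value is converted to the 1024-byte units that
"ulimit -l" prints.

```python
import resource


def memlock_limit_kib():
    """Return (soft, hard) RLIMIT_MEMLOCK in KiB; None stands for unlimited."""
    def to_kib(value):
        return None if value == resource.RLIM_INFINITY else value // 1024

    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return to_kib(soft), to_kib(hard)


if __name__ == "__main__":
    soft, hard = memlock_limit_kib()
    print("soft limit (KiB):", "unlimited" if soft is None else soft)
    print("hard limit (KiB):", "unlimited" if hard is None else hard)
```
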
Valgrind support
----------------

When running applications that use libibverbs under the Valgrind
memory-checking debugger, Valgrind will falsely report "read from
uninitialized" for memory that was initialized by the kernel drivers.
Specifically, Valgrind cannot see when kernel drivers write to
userspace memory, so when the process reads from that memory, Valgrind
incorrectly assumes that the memory contents are uninitialized, and
therefore raises a warning.

libibverbs can be built with specific support for the Valgrind
memory-checking debugger by specifying the --with-valgrind command
line argument to configure.  This flag enables code in libibverbs to
tell Valgrind "this memory may look uninitialized, but it's really
OK," which therefore suppresses the incorrect "read from
uninitialized" warnings.  This code adds trivial overhead to the
critical performance path, so it is disabled by default.  The intent
is that production users can use a "normal" build of libibverbs and
developers can use the "valgrind debug" build by simply switching
their LD_LIBRARY_PATH environment variables.

libibverbs needs some header files from Valgrind in order to compile
this support; it is important to use the header files from the same
version of Valgrind that will be used at run time.  You may need to
specify the directory where Valgrind's header files are installed as
an argument to --with-valgrind.  For example,

  ./configure --with-valgrind=/opt/valgrind

will make the libibverbs build look for Valgrind headers in
/opt/valgrind/include.

Reporting bugs
==============

Bugs should be reported to the OpenFabrics mailing list
<general@lists.openfabrics.org>.  In your bug report, please include:

 * Information about your system:
   - Linux distribution and version
   - Linux kernel and version
   - InfiniBand/iWARP hardware and firmware version
   - ... any other relevant information

 * How to reproduce the bug.  Command line arguments for a libibverbs
   example program or source code that other developers can
   compile and run is most convenient.

 * If the bug is a crash, the exact output printed out when the crash
   occurred, including any kernel messages produced.

 * If a verbs call is mysteriously returning an error or failing, the
   output of "strace -ewrite -ewrite=all <command>".

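Part of the system information requested above can be gathered with a
short script.  The sketch below is only an illustration: it assumes
the distribution describes itself in /etc/os-release, the function
names are hypothetical, and it does not attempt to collect hardware or
firmware versions, which still have to be added by hand.

```python
import platform


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        info[key] = value.strip().strip("\"'")
    return info


def system_summary(os_release_path="/etc/os-release"):
    """Return the distribution name and kernel release for a bug report."""
    try:
        with open(os_release_path) as f:
            fields = parse_os_release(f.read())
        distribution = fields.get("PRETTY_NAME", "unknown")
    except OSError:
        distribution = "unknown"
    return {"distribution": distribution, "kernel": platform.release()}


if __name__ == "__main__":
    for key, value in system_summary().items():
        print(key + ":", value)
```
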
Submitting patches
==================

Patches should also be submitted to the OpenFabrics mailing list
<general@lists.openfabrics.org>.  Please use unified diff form (the -u
option to GNU diff), and include a good description of what your patch
does and why it should be applied.  If your patch fixes a bug, please
make sure to describe the bug and how your fix works.

Please include a change to the ChangeLog file (in standard GNU
changelog format) as part of your patch.

Make sure that your contribution can be licensed under the same
license as the original code you are patching, and that you have all
necessary permissions to release your work.

TODO
====

1.1 series
----------

The libibverbs API and ABI are frozen for all releases in the 1.1
series.  Methods were added to struct ibv_context to implement the
following features, so it should be possible to add them in a future
release in the 1.1 series:

 * Memory window (MW) support.

 * Implement the reregister memory region (MR) verb.  We will add an
   extension to the IB spec to allow the application to indicate that
   the region is only being extended, and that operations in progress
   should _not_ fail (contrary to the IB spec, which states that
   reregister must be implemented so that it behaves equivalently to a
   deregister followed by a register).

Other possibilities
-------------------

There are no plans to implement the following features, which would be
needed for completeness but don't seem particularly useful.  However,
if there is demand from application developers or an implementation is
contributed, then the feature may be added.

 * Implement the query address handle (AH) verb.
 * Implement the query memory region (MR) verb.