The module parameter is read-only since changing mode after loading
isn't going to work.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Access to PCI config space should be inside pci_cfg_access_lock
to avoid read/modify/write races.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
It is good practice to propogate the return values of failing
functions so that more information can be reported. The failed result
of probe will make it out to errno and get printed by modprobe
and will aid in diagnosis of failures.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
It is better style to just use the pci_num_vf directly, rather
than wrapping it with a local (but globally named) function with
the same effect.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Fix style issues reported by checkpatch.
There was a real bug in that the setup code was returning
positive value for errors which goes against convention and
might have caused a problem.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Don't put capitialization and space in name since it will show
up in /proc/interrupts. Instead use driver name to follow the
conventions used in the kernel by other drivers.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Use Linux kernel standard coding conventions for console messages.
Bare use of printk() is not desirable and is reported as a style
problem by checkpatch. Instead use pr_info() and dev_info()
to print out log messages where appropriate.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Add relevant callback function to change a KNI device's MAC address.
Signed-off-by: Padam Jeet Singh <padam.singh@inventum.net>
Reviewed-by: Helin Zhang <helin.zhang@intel.com>
Per netif_receive_skb function description, it may only be called from
interrupt contex, but KNI is run on kthread that like as user-space
context. It may occur deadlock, if netif_receive_skb called from kthread,
so it should be repleaced by netif_rx or adding local_bh_disable/enable
around netif_receive_skb.
Signed-off-by: Yao-Po Wang <blue119@gmail.com>
Acked-by: Alex Markuze <alex@weka.io>
The FreeBSD nic_uio driver was missing the #defines to include the device ids
for devices using the i40e driver. This change adds in the missing defines.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
When adding link bonding to EAL initialization (a155d430119),
an include was missing for BSD.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tested-by: Zhaochen Zhan <zhaochen.zhan@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
SET_ETHTOOL_OPS is gone in 3.16, so modify drivers accordingly.
Signed-off-by: Aaro Koskinen <aaro.koskinen@nsn.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This follows the mainline Linux kernel commit
ed616689a3d95eb6c9bdbb1ef74b0f50cbdf276a (Add support to configure SR-IOV
VF minimum and maximum Tx rate) by Sucheta Chakraborty, and enables to
build the driver against 3.16.
Signed-off-by: Aaro Koskinen <aaro.koskinen@nsn.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Compilation in RHEL7 is failed. This fixes the build issue.
RHEL7 has skb_set_hash, the kernel version is 3.10 though.
Don't define skb_set_hash for RHEL7.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Reviewed-by: Hayato Momma <h-momma@ce.jp.nec.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Updating functionality in EAL to support adding link bonding
devices via –vdev option. Link bonding devices will be
initialized after all physical devices have been probed and
initialized.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Valgrind reports this issue:
==29880== Invalid read of size 1
==29880== at 0x56FF9A5: cpu_socket_id (eal_lcore.c:101)
==29880== by 0x56FFAE9: rte_eal_cpu_init (eal_lcore.c:168)
==29880== by 0x56F944A: rte_eal_init (eal.c:975)
The problem is that endptr points to memory allocated underneath the DIR
handle, which has already been freed. So move the closedir() call lower.
Signed-off-by: Aaron Campbell <aaron@arbor.net>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The function rte_snprintf serves no useful purpose. It is the
same as snprintf() for all valid inputs. Deprecate it and
replace all uses in current code.
Leave the tests for the deprecated function in place.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Mark the rte_log, cmdline_printf and rte_snprintf functions as
being printf-style functions. This causes compilation errors
due to mis-matched parameter types, so the parameter types are
fixed where appropriate.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Since commit "add FILE argument to debug functions" (591a9d7985c1230),
application which includes rte_memory.h without stdio.h will be hit
compilation failure:
/path/to/include/rte_memory.h:146:30: error: unknown type name ‘FILE’
void rte_dump_physmem_layout(FILE *f);
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Reviewed-by: Hayato Momma <h-momma@ce.jp.nec.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Only some devices support the link state interrupt configuration option.
Link state control does not work in virtual drivers
(virtio, vmxnet3, igbvf, and ixgbevf). Instead of having the application
try and guess whether it will work or not provide a driver flag that
can be checked instead.
Note: if device driver doesn't support link state control, what
would happen previously is that the code would never detect link
transitions. This prevents that.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
[Thomas: rename flag]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Problems with lib rte_malloc:
1. Rte_malloc searches a heap's entire free list looking for the best
fit, resulting in linear complexity.
2. Heaps store free blocks in a singly-linked list, resulting in
linear complexity when rte_free needs to remove an adjacent block.
3. The library inserts and removes free blocks with ad hoc, in-line
code, rather than using linked-list functions or macros.
4. The library wastes potential small blocks of size 64 and 128 bytes
(plus overhead of 64 bytes) as padding when reusing free blocks or
resizing allocated blocks.
This patch addresses those problems as follows:
1. Replace single free list with a handful of free lists. Each free
list contains blocks of a specified size range, for example:
list[0]: (0 , 2^8]
list[1]: (2^8 , 2^10]
list[2]: (2^10, 2^12]
list[3]: (2^12, 2^14]
list[4]: (2^14, MAX_SIZE]
When allocating a block, start at the first list that can contain
a big enough block. Search subsequent lists, if necessary.
Terminate the search as soon as we find a block that is big enough.
2. Use doubly-linked lists, so that we can remove free blocks in
constant time.
3. Use BSD LIST macros, as defined in sys/queue.h and the QUEUE(3)
man page.
4. Change code to utilize small blocks of data size 64 and 128, when
splitting larger blocks.
Signed-off-by: Robert Sanford <rsanford2@gmail.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
The compile errors are copied as follows. The fixes came from
Linux drivers of ixgbe-3.21.2 and igb-5.1.2 with modifications.
The idea is to use self-defined functions no matter they have
already been defined somewhere or not.
* Oracle Linux6.4
lib/librte_eal/linuxapp/kni/ethtool/ixgbe/kcompat.h:3111:
error: redefinition of 'ether_addr_equal'
include/linux/etherdevice.h:180: note: previous definition
of 'ether_addr_equal' was here
* RHEL6.5
lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h:3597:
error: redefinition of 'mmd_eee_cap_to_ethtool_sup_t'
include/linux/mdio.h:387: note: previous definition of
'mmd_eee_cap_to_ethtool_sup_t' was here
lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h:3625:
error: redefinition of 'mmd_eee_adv_to_ethtool_adv_t'
include/linux/mdio.h:415: note: previous definition of
'mmd_eee_adv_to_ethtool_adv_t' was here
lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h:3653:
error: redefinition of 'ethtool_adv_to_mmd_eee_adv_t'
include/linux/mdio.h:443: note: previous definition of
'ethtool_adv_to_mmd_eee_adv_t' was here
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
When parsing EAL option --base-virtaddr
errno was not being set to 0 before calling strtoull,
therefore function might fail unnecesarily.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Aaron Campbell <aaron@arbor.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
also, making VFIO code distinguish between actual unexpected values
and ioctl() failures, providing appropriate error messages.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Currently, VFIO only checks for being able to access the /dev/vfio
directory when initializing VFIO, deferring actual VFIO container
initialization to VFIO binding code. This doesn't bode well for when
VFIO container cannot be initialized for whatever reason, because
it results in unrecoverable error even if the user didn't set up
VFIO and didn't even want to use it in the first place.
This patch fixes this by moving container initialization into the
code that checks if VFIO is available at runtime. Therefore, any
issues with the container will be known at initialization stage and
VFIO will simply be turned off if container could not be set up.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Enabling 'Extended Tag' and resetting 'Max Read Request Size' in PCI
config space have big impacts to i40e performance. They cannot be
changed on some BIOS implementations, though can on others. Two sys
files of 'extended_tag' and 'max_read_request_size' are added to
support changing them by 'echo' in user space.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Heqing Zhu <heqing.zhu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Unit tests for Packet Framework libraries.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
The Packet Framework pipeline library provides a standard methodology
(logically similar to OpenFlow) for rapid development of complex packet
processing pipelines out of ports, tables and actions.
A pipeline is constructed by connecting its input ports to its output ports
through a chain of lookup tables. As result of lookup operation into the
current table, one of the table entries (or the default table entry, in case
of lookup miss) is identified to provide the actions to be executed on the
current packet and the associated action meta-data.
The behavior of user actions is defined through the configurable table action
handler, while the reserved actions define the next hop for the current packet
(either another table, an output port or packet drop) and are handled
transparently by the framework.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
This file defines the operations to be implemented by
any Packet Framework table.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
This file defines the port operations that have to be implemented
by Packet Framework ports.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
Removing PCI ID list to make igb_uio more similar to a generic driver
like vfio-pci or pci_uio_generic. This is done to make it easier for
the binding script to support multiple drivers.
Note that since igb_uio no longer has a PCI ID list, it can now be
bound to any device, not just those explicitly supported by DPDK. In
other words, it now behaves similar to PCI stub, VFIO and other generic
PCI drivers.
Therefore to bind a new device to igb_uio, the user will now have to
first write its PCI ID to "new_id" file inside the igb_uio driver
directory, and only then write the PCI ID to "bind". This is reflected
in changes to PCI binding script as well.
There's a weird behaviour of sysfs when a new device ID is added to
new_id. Subsequent writing to "bind" will result in IOError on
closing the file. This error is harmless but it triggers the
exception anyway, so in order to work around that, we check if the
device was actually bound to the driver before raising an error.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Unlike igb_uio, VFIO interrupt type is not set by kernel module
parameters but is set up via ioctl() calls at runtime. This warrants
a new EAL command-line parameter. It will have no effect if VFIO is
not compiled, but will set VFIO interrupt type to either "legacy", "msi"
or "msix" if VFIO support is compiled. Note that VFIO initialization
will fail if the interrupt type selected is not supported by the system.
If the interrupt type parameter wasn't specified, VFIO will try all
interrupt types (starting with MSI-X).
In unit tests, we don't know if VFIO is compiled (eal_vfio.h header is
internal to Linuxapp EAL), so we check this flag regardless.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Add support for binding VFIO devices if RTE_PCI_DRV_NEED_MAPPING is set
for this driver. Try VFIO first, if not mapped then try IGB_UIO too.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Since VFIO cannot be used to map the same device twice, secondary
processes receive the device/group fd's by means of communicating over a
local socket. Only group and container fd's should be sent, as device
fd's can be obtained via ioctl() calls' on the group fd.
For multiprocess, VFIO distinguishes between existing but unused groups
(e.g. grups that aren't bound to VFIO driver) and non-existing groups in
order to know if the secondary process requests a valid group, or if
secondary process requests something that doesn't exist.
VFIO multiprocess sync communicates over a simple protocol. It defines
two requests - request for group fd, and request for container fd.
Possible replies are: SOCKET_OK (an OK signal), SOCKET_ERR (error
signal) and SOCKET_NO_FD (a signal that indicates that the requested
VFIO group is valid, but no fd is present for that group - indicating
that the respective group is simply not bound to VFIO driver).
Here is the logic in a nutshell:
1. secondary process sends SOCKET_REQ_CONTAINER or SOCKET_REQ_GROUP
1a. in case of SOCKET_REQ_GROUP, client also then sends group number
2. primary process receives message
2a. in case of invalid group, SOCKET_ERR is sent back to secondary
2b. in case of unbound group, SOCKET_NO_FD is sent back to secondary
2c. in case of valid group, SOCKET_OK is sent and followed by fd
3. socket is closed
in case of any error, socket is closed and SOCKET_ERR is sent.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Adding code to support VFIO mapping (primary processes only). Most of
the things are done via ioctl() calls on either /dev/vfio/vfio (the
container) or a /dev/vfio/$GROUP_NR (IOMMU group).
In a nutshell, the code does the following:
1. creates a VFIO container (an entity that allows sharing IOMMU DMA
mappings between devices)
2. checks if a given PCI device is a member of an IOMMU group (if it's
not, this indicates that the device isn't bound to VFIO)
3. calls open() the group file to obtain a group fd
4. checks if the group is viable (that is, if all the devices in the
same IOMMU group are either bound to VFIO or not bound to anything)
5. adds the group to a container
6. sets up DMA mappings (only done once, mapping whole DPDK hugepage
memory for DMA, with a 1:1 correspondence of IOVA to PA)
7. gets the actual PCI device fd from the group fd (can fail, which
simply means that this particular device is not bound to VFIO)
8. maps BARs (MSI-X BAR cannot be mmaped, so skipping it)
9. sets up interrupt structures (but not enables them!)
10. enables PCI bus mastering
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Creating code to handle VFIO interrupts in EAL interrupts (supports all
types of interrupts).
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Add VFIO compilation option to linuxapp config.
Adding a header that will determine if VFIO support should be compiled
in. If VFIO is enabled in config (and it's enabled by default), then the
header will also check for kernel version. If VFIO is enabled in config
and if the kernel version is 3.6+, then VFIO_PRESENT will be defined.
This is the macro that should be used to determine if VFIO support is
being compiled in.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Moving interrupt type enum out of igb_uio and renaming it to be more
generic. Such a strange header naming and separation is done mostly to
make coming virtio patches easier to port to dpdk.org tree.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Currently, igb_uio is always compiled. Some Linux distributions may not
want to include igb_uio with DPDK, so we need to make sure that igb_uio
compilation for Linuxapp targets can be optional.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Rename the RTE_PCI_DRV_NEED_IGB_UIO to be more generic.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Currently, EAL does not distinguish between actual failures and expected
initialization errors. E.g. sometimes the driver fails to initialize
because it was not supposed to be initialized in the first place, such
as device not being managed by said driver.
This patch makes EAL fail on actual initialization errors while still
skipping over expected initialization errors.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>