79 Commits

Author SHA1 Message Date
Gleb Smirnoff
256d9417f8 Fix build. 2013-12-18 04:36:35 +00:00
Luigi Rizzo
2e159ef0b5 fix the build using __builtin_prefetch() instead of redefining prefetch() 2013-12-16 23:57:43 +00:00
Luigi Rizzo
f9790aeb88 split netmap code according to functions:
- netmap.c		base code
- netmap_freebsd.c	FreeBSD-specific code
- netmap_generic.c	emulate netmap over standard drivers
- netmap_mbq.c		simple mbuf tailq
- netmap_mem2.c		memory management
- netmap_vale.c		VALE switch

simplify device-specific code
2013-12-15 08:37:24 +00:00
Luigi Rizzo
5864b3a586 remove a debugging message 2013-11-06 19:18:39 +00:00
Luigi Rizzo
bc4c07d8e6 remove some test code. 2013-11-05 01:06:22 +00:00
Luigi Rizzo
954dca4c99 fix a bug when a device has 1 tx (or rx) queue and more than
one queue of a different type.

Submitted by:	Vincenzo Maffione
MFC after:	3 days
2013-11-05 00:56:07 +00:00
Luigi Rizzo
5ab0d24d48 check errors on return from netmap_attach()
Submitted by:	Giuseppe Lettieri
MFC after:	3 days
2013-11-05 00:50:59 +00:00
Luigi Rizzo
3d819cb610 circumvent a couple of warnings:
- on line 2550 intentionally overriding a const qualifier
- on line 3219 intentionally converting uint64_t to a pointer
2013-11-02 18:03:21 +00:00
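The two warnings mentioned in the commit above (dropping a const qualifier on purpose, and turning a uint64_t into a pointer) are commonly silenced by casting through uintptr_t, the same idea as FreeBSD's __DECONST macro. The following is only a minimal sketch of that idiom; the helper names are hypothetical and the commit message does not say that the netmap code uses exactly this form.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: intentionally drop a const qualifier.
 * Going through uintptr_t makes the intent explicit.
 */
static inline void *
demo_deconst(const void *p)
{
	return (void *)(uintptr_t)p;
}

/*
 * Hypothetical helper: a 64-bit integer that carries an address
 * is converted back to a pointer. The assert guards platforms
 * where pointers are narrower than 64 bits.
 */
static inline void *
demo_u64_to_ptr(uint64_t v)
{
	assert(v <= UINTPTR_MAX);
	return (void *)(uintptr_t)v;
}

/* Hypothetical helper: the reverse direction, pointer to uint64_t. */
static inline uint64_t
demo_ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}
```

The intermediate uintptr_t cast matters in both directions: it is the integer type intended to round-trip a pointer, so the compiler sees an integer-to-pointer conversion of matching width.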
Luigi Rizzo
ab1e286051 add missing file from previous netmap update... 2013-11-02 00:54:47 +00:00
Luigi Rizzo
ce3ee1e7c4 update to the latest netmap snapshot.
This includes the following:
- use separate memory regions for VALE ports
- locking fixes
- some simplifications in the NIC-specific routines
- performance improvements for the VALE switch
- some new features in the pkt-gen test program
- documentation updates

There are small API changes that require programs to be recompiled
(NETMAP_API has been bumped so you will detect old binaries at runtime).

In particular:
- struct netmap_slot now is 16 bytes to support an extra pointer,
  which may save one data copy when using VALE ports or VMs;
- the struct netmap_if has two extra fields;

MFC after:	3 days
2013-11-01 21:21:14 +00:00
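The commit above says that struct netmap_slot grew to 16 bytes to make room for an extra pointer. A layout along the following lines would give that size; this is a sketch under assumptions, with a hypothetical struct name and field names chosen for illustration, and the authoritative definition is the one in net/netmap.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative stand-in for a 16-byte slot: 8 bytes of index,
 * length and flags, followed by a 64-bit field that can carry
 * a pointer. Field names are assumptions for this sketch.
 */
struct demo_slot {
	uint32_t buf_idx;	/* buffer index */
	uint16_t len;		/* packet length */
	uint16_t flags;		/* per-slot flags */
	uint64_t ptr;		/* extra pointer-sized field */
};

/* Fail at compile time if the layout is not what we expect. */
static_assert(sizeof(struct demo_slot) == 16,
    "slot must be 16 bytes");
static_assert(offsetof(struct demo_slot, ptr) == 8,
    "pointer field must follow the first 8 bytes");
```

Compile-time checks like these are one way to catch a layout mismatch early, alongside the runtime NETMAP_API version check the commit mentions.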
Gleb Smirnoff
76039bc84f The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
for this event by adding if_var.h to files that do need it. Also, add
all includes that are currently pulled in only through implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
Jack F Vogel
7609433eb6 Update the Intel igb driver to version 2.4.0
- This version has support for the new Intel Avoton systems,
including 2.5Gb support, further it now has IPv6/TSO6 support as
well. Shared code has been updated where necessary as well. Thanks
to my new assistant Eric Joyner for doing the transmit path changes
to bring in the IPv6/TSO6 support. Thanks to Gleb for catching the
one bug and change needed in NETMAP.

Approved by: re
2013-10-09 17:32:52 +00:00
Luigi Rizzo
85233a7d39 - fix a bug in the previous commit that was dropping the last packet
from each batch flowing on the VALE switch

- feature: add glue for 'indirect' buffers on the sender side:
  if a slot has NS_INDIRECT set, the netmap buffer contains pointer(s)
  to the actual userspace buffers, which are accessed with copyin().
     The feature is not finalised yet, as it will likely need to deal
  with some iovec variant for proper scatter/gather support.
  This will save one copy for clients (e.g. qemu) that cannot
  use the netmap buffer directly.

A curiosity: on amd64 copyin() appears to be 10-15% faster than pkt_copy()
or bcopy() at least for sizes of 256 and greater.
2013-06-05 17:27:59 +00:00
Luigi Rizzo
f18be5766f Bring in a number of new features, mostly implemented by Michio Honda:
- the VALE switch now supports up to 254 destinations per switch,
  unicast or broadcast (multicast goes to all ports).

- we can attach hw interfaces and the host stack to a VALE switch,
  which means we will be able to use it more or less as a native bridge
  (minor tweaks still necessary).
  A 'vale-ctl' program is supplied in tools/tools/netmap
  to attach/detach ports to/from the switch, and list the current configuration.

- the lookup function in the VALE switch can be reassigned to
  something else, similar to the pf hooks. This will enable
  attaching the firewall, or other processing functions (e.g. in-kernel
  openvswitch) directly on the netmap port.

The internal API used by device drivers does not change.

Userspace applications should be recompiled because we
bump NETMAP_API as we now use some fields in the struct nmreq
that were previously ignored -- otherwise, data structures
are the same.

Manpages will be committed separately.
2013-05-30 14:07:14 +00:00
Luigi Rizzo
ede69cff5b another minor bugfix in the memory allocator, this time in the free routine. 2013-05-10 08:46:10 +00:00
Luigi Rizzo
654ae8d68c remove trailing whitespace 2013-05-02 16:01:04 +00:00
Luigi Rizzo
849bec0e76 Partial cleanup in preparation for upcoming changes:
- netmap_rx_irq()/netmap_tx_irq() can now be called by FreeBSD drivers
  hiding the logic for handling NIC interrupts in netmap mode.
  This also simplifies the case of NICs attached to VALE switches.
     Individual drivers will be updated with separate commits.

- use the same refcount() API for FreeBSD and linux

- plus some comments, typos and formatting fixes

Portions contributed by Michio Honda
2013-04-30 16:08:34 +00:00
Luigi Rizzo
28228e0816 whitespace - document alternative locking under linux 2013-04-29 19:30:35 +00:00
Luigi Rizzo
d4b42e0869 whitespace changes:
remove $Id$ lines, and add blank lines around some #if / #elif /#endif
2013-04-29 18:00:53 +00:00
Luigi Rizzo
b865453e3e explicitly mark some variables as const 2013-04-29 16:58:21 +00:00
Luigi Rizzo
2579e2d715 mostly whitespace changes:
- remove vestiges of the old memory allocator
- clean up some comments
2013-04-19 21:08:21 +00:00
Luigi Rizzo
aa76317cfc fix a bug in the computation of the userspace offset for a given netmap buffer.
Submitted by: Hugh Nhan
2013-04-15 11:49:16 +00:00
Attilio Rao
89f6b8632c Switch the vm_object mutex to be a rwlock. This will enable
further optimizations in the future, where the vm_object lock is held
in read mode most of the time that the page cache resident pool of pages
is accessed for reading purposes.

The change is mostly mechanical, but a few notes are in order:
* The KPI changes as follows:
  - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK()
  - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK()
  - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK()
  - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED()
    (in order to avoid visibility of implementation details)
  - The read-mode operations are added:
    VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(),
    VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED()
* To avoid namespace pollution, vm/vm_pager.h does not include the lock
  headers itself; consumers previously had to include sys/mutex.h directly
  to cater for its inline functions that use VM_OBJECT_LOCK(). All
  vm/vm_pager.h consumers must now also include sys/rwlock.h.
* zfs requires a quite convoluted fix to include FreeBSD rwlocks into
  the compat layer because the name clash between FreeBSD and solaris
  versions must be avoided.
  For this purpose zfs redefines the vm_object locking functions
  directly, isolating the FreeBSD components in specific compat stubs.

The KPI is heavily broken by this commit.  Third-party ports must
be updated accordingly (I can think off-hand of VirtualBox, for example).

Sponsored by:	EMC / Isilon storage division
Reviewed by:	jeff
Reviewed by:	pjd (ZFS specific review)
Discussed with:	alc
Tested by:	pho
2013-03-09 02:32:23 +00:00
Luigi Rizzo
091fd0ab54 Add support for transparent mode while in netmap.
By setting dev.netmap.fwd=1 (or enabling the feature with a per-ring flag),
packets are forwarded between the NIC and the host stack unless the
netmap client clears the NS_FORWARD flag on the individual descriptors.

This feature greatly simplifies applications where some traffic
(think of ARP, control traffic, ssh sessions...) must be processed
by the host stack, whereas the bulk is handled by the netmap process
which simply (un)marks packets that should not be forwarded.
The default is chosen so that now a netmap receiver operates
in a mode very similar to bpf.

Of course there is no free lunch: traffic to/from the host stack
still operates at OS speed (or less, as there is one extra copy in
one direction).
HOWEVER, since traffic goes to the user process before being
reinjected, and reinjection occurs in a user context, you get some
form of livelock protection for free.
2013-01-23 05:37:45 +00:00
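As a rough illustration of the transparent mode described above, a netmap client that wants to keep some packets for itself would walk the received slots and clear NS_FORWARD on those, leaving the flag set on everything that should continue to the other side. The sketch below uses mock types so that it is self-contained: the struct, the flag value, and the helper names are stand-ins for illustration only, and a real client would use the definitions from net/netmap.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in value for this sketch; take the real one from net/netmap.h. */
#define NS_FORWARD	0x0004

/* Mock slot descriptor, reduced to the fields this sketch needs. */
struct demo_slot {
	uint32_t buf_idx;
	uint16_t len;
	uint16_t flags;
};

/* Hypothetical predicate: nonzero if the client handles this packet itself. */
typedef int (*demo_wants_fn)(const struct demo_slot *);

/*
 * Clear NS_FORWARD on every slot the client claims, so only the
 * remaining slots stay marked for forwarding.
 * Returns the number of slots claimed by the client.
 */
static size_t
demo_claim_slots(struct demo_slot *slots, size_t n, demo_wants_fn wants)
{
	size_t claimed = 0;

	assert(slots != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (wants(&slots[i])) {
			slots[i].flags &= (uint16_t)~NS_FORWARD;
			claimed++;
		}
	}
	return claimed;
}

/* Hypothetical policy: the client keeps anything longer than 64 bytes. */
static int
demo_wants_large(const struct demo_slot *s)
{
	return s->len > 64;
}
```

With this shape, the default (flag left set) is "forward", which matches the commit's description of a receiver that behaves much like bpf unless it explicitly claims a packet.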
Luigi Rizzo
ae10d1afee control some debugging messages with dev.netmap.verbose
add infrastructure to adapt to changes in number of queues
and buffers at runtime
2013-01-23 03:51:47 +00:00
Luigi Rizzo
70ca194a4c remove the old memory allocator, not useful anymore 2013-01-17 23:14:17 +00:00
Luigi Rizzo
1dce924d25 add some definition and driver changes in preparation for
two upcoming features:

semi-transparent mode:
    when a device is opened in this mode, the
    user program will be able to mark slots that must be forwarded
    to the "other" side (i.e. from NIC to host stack, or vice versa),
    and the forwarding will occur automatically at the next netmap syscall.
    This saves the need to open another file descriptor and do
    the forwarding manually.

direct-forwarding mode:
    when operating with a VALE port, the user can specify in the slot
    the actual destination port, overriding the forwarding decision
    made by a lookup of the destination MAC. This can be useful to
    implement packet dispatchers.

No API changes will be introduced.
No new functionality in this patch yet.
2013-01-17 22:14:58 +00:00
Luigi Rizzo
e814dcebf3 remove an incorrect comment and debugging code 2013-01-17 19:27:12 +00:00
Luigi Rizzo
60372f6f58 rename the 'tag' and 'map' fields used in the rx ring to their
previous names, 'ptag' and 'pmap' -- p stands for packet.

This change reduces the difference between the code in stable/9
and head, and also helps using the same ixgbe_netmap.h on both branches.

Approved by:	Jack Vogel
2012-12-20 22:26:03 +00:00
Jack F Vogel
7d1157eec8 First of a series of 11 patches leading to new ixgbe version 2.5.0
This removes the header split and supporting code from the driver.
2012-11-30 22:19:18 +00:00
Ed Maste
d2b9185176 Use M_NOWAIT when calling malloc with a lock held.
The check for a NULL return was already in place so I assume this was just
an oversight.
2012-10-19 19:28:35 +00:00
Gleb Smirnoff
88f7905789 Fix build. 2012-10-19 09:41:45 +00:00
Luigi Rizzo
8241616dc5 This is an import of code, mostly from Giuseppe Lettieri,
that revises the netmap memory allocator so that the
various parameters (number and size of buffers, rings, descriptors)
can be modified at runtime through sysctl variables.
The changes become effective when no netmap clients are active.

The API is mostly unchanged, although the NIOCUNREGIF ioctl now
does not bring the interface back to normal mode; you
need to close the file descriptor for that.
This change was necessary to track who is using the mapped region,
and since it is a simplification of the API there was no
incentive to try to preserve NIOCUNREGIF.
We will remove the ioctl from the kernel next time we need
a real API change (and version bump).

Among other things, buffer allocation when opening devices is
now much faster: it used to take O(N^2) time, now it is linear.

Submitted by:	Giuseppe Lettieri
2012-10-19 04:13:12 +00:00
Ed Maste
4cf8455f59 Avoid panic when a netmap instance cannot obtain memory.
A uint32_t is always >= 0.

Sponsored by: ADARA Networks
2012-10-17 18:21:14 +00:00
Ed Maste
033ed050a0 Reword comment to try to improve clarity, and fix a typo. 2012-08-13 19:14:45 +00:00
Ed Maste
2f70fca5ec Improve lock and unlock symmetry
- Move destruction of per-ring locks to netmap_dtor_locked to mirror the
initialization that happens in NIOCREGIF.  Otherwise unloading a netmap-
capable interface that was never put into netmap mode would try to
mtx_destroy an uninitialized mutex, and panic.

- Destroy core_lock in netmap_detach, mirroring init in netmap_attach.

- Also comment out the knlist_destroy for now as there is currently no
knlist_init.

Sponsored by:   ADARA Networks
Reviewed by:    luigi@
2012-08-09 14:46:52 +00:00
Ed Maste
0bf8895411 Fix whitespace (missing newline) 2012-08-08 15:28:29 +00:00
Ed Maste
24e57ec96d Clarify comments about number of tx / rx rings 2012-08-08 15:27:01 +00:00
Luigi Rizzo
b3d5301688 fix some signed/unsigned warnings in the netmap code.
Unfortunately the original drivers still have a lot of
sign conversion/comparison warnings.
2012-08-02 11:59:43 +00:00
Luigi Rizzo
42a3a5bd91 Add a newline on an error message;
rename linux functions to avoid confusion;
fix error reporting on linux
2012-08-02 07:35:40 +00:00
Luigi Rizzo
d198a63d44 remove a redundant MALLOC_DECLARE 2012-07-31 05:51:48 +00:00
Luigi Rizzo
0b8ed8e069 - move the inclusion of netmap headers to the common part of the code;
- more portable annotations for unused arguments;
2012-07-30 18:21:48 +00:00
Luigi Rizzo
01c7d25ff4 use __builtin_prefetch() for prefetch.
merge in the remaining part of the linux-specific glue so i do not need
to maintain two different distributions.
2012-07-27 10:52:21 +00:00
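The prefetch-related commits in this log switch to __builtin_prefetch() and fall back to a no-op where prefetching is not available. A minimal sketch of that pattern follows. The macro and function names are hypothetical, the lookahead distance of 8 elements is arbitrary, and the fallback here is keyed on the compiler rather than on the CPU architecture, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Use the compiler builtin where it exists, otherwise expand to a
 * no-op that still evaluates its argument.
 */
#if defined(__GNUC__) || defined(__clang__)
#define demo_prefetch(p)	__builtin_prefetch(p)
#else
#define demo_prefetch(p)	((void)(p))
#endif

/*
 * Sum an array while hinting the cache about an element a few
 * iterations ahead. The prefetch is only a hint and does not
 * change the result.
 */
static long
demo_sum_with_prefetch(const long *v, size_t n)
{
	long sum = 0;

	assert(v != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (i + 8 < n)
			demo_prefetch(&v[i + 8]);
		sum += v[i];
	}
	return sum;
}
```

Wrapping the builtin in a project-specific macro, rather than redefining a generic prefetch() name, also avoids clashing with definitions provided by other headers.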
Luigi Rizzo
826e7ddbfc remove unused definition, whitespace cleanup 2012-07-27 10:31:26 +00:00
Luigi Rizzo
29ecb031b6 define prefetch as a noop on !x86 2012-07-26 21:37:58 +00:00
Luigi Rizzo
f196ce3869 Add support for VALE bridges to the netmap core, see
http://info.iet.unipi.it/~luigi/vale/

VALE lets you dynamically instantiate multiple software bridges
that talk the netmap API (and are *extremely* fast), so you can test
netmap applications without the need for high end hardware.

This is particularly useful as I am completing a netmap-aware
version of ipfw, and VALE provides an excellent testing platform.

I also have netmap backends for qemu mostly ready for commit
to the port, and this too will let you interconnect virtual machines
at high speed without fiddling with bridges, tap or other slow solutions.

The API for applications is unchanged, so you can use the code
in tools/tools/netmap (which i will update soon) on the VALE ports.

This commit also syncs the code with the one in my internal repository,
so you will see some conditional code for other platforms.
The code should run mostly unmodified on stable/9 so people interested
in trying it can just copy sys/dev/netmap/ and sys/net/netmap*.h
from HEAD

VALE is joint work with my colleague Giuseppe Lettieri, and
is partly supported by the EU Projects CHANGE and OPENLAB
2012-07-26 16:45:28 +00:00
Luigi Rizzo
0ee29d4125 this file is too old and not interesting anymore now that netmap
has been MFC'ed.
2012-05-17 20:05:13 +00:00
Luigi Rizzo
5b24837478 print 'netmap stack ring full' only in verbose mode. 2012-05-03 21:16:53 +00:00
Luigi Rizzo
b1123b0137 i prefer this fix for the -Wformat warning (just one cast,
all the other variables are already correct for %x).
My previous attempt put the cast in the wrong place.
2012-04-14 16:44:18 +00:00
Bjoern A. Zeeb
92083c91d2 Make this compile on 64-bit for now, after a first try at r234242 (on
32-bit, maybe?)
2012-04-14 13:39:39 +00:00