Commit Graph

25 Commits

Author SHA1 Message Date
Kenneth D. Merry
98cb733c67 At long last, commit the zero copy sockets code.
MAKEDEV:	Add MAKEDEV glue for the ti(4) device nodes.

ti.4:		Update the ti(4) man page to include information on the
		TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options,
		and also include information about the new character
		device interface and the associated ioctls.

man9/Makefile:	Add jumbo.9 and zero_copy.9 man pages and associated
		links.

jumbo.9:	New man page describing the jumbo buffer allocator
		interface and operation.

zero_copy.9:	New man page describing the general characteristics of
		the zero copy send and receive code, and what an
		application author should do to take advantage of the
		zero copy functionality.

NOTES:		Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS,
		TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT.

conf/files:	Add uipc_jumbo.c and uipc_cow.c.

conf/options:	Add the 5 options mentioned above.

kern_subr.c:	Receive side zero copy implementation.  This takes
		"disposable" pages attached to an mbuf, gives them to
		a user process, and then recycles the user's page.
		This is only active when ZERO_COPY_SOCKETS is turned on
		and the kern.ipc.zero_copy.receive sysctl variable is
		set to 1.

uipc_cow.c:	Send side zero copy functions.  Takes a page written
		by the user and maps it copy on write and assigns it
		kernel virtual address space.  Removes copy on write
		mapping once the buffer has been freed by the network
		stack.

uipc_jumbo.c:	Jumbo disposable page allocator code.  This allocates
		(optionally) disposable pages for network drivers that
		want to give the user the option of doing zero copy
		receive.

uipc_socket.c:	Add kern.ipc.zero_copy.{send,receive} sysctls that are
		enabled if ZERO_COPY_SOCKETS is turned on.

		Add zero copy send support to sosend() -- pages get
		mapped into the kernel instead of getting copied if
		they meet size and alignment restrictions.

uipc_syscalls.c:Un-staticize some of the sf* functions so that they
		can be used elsewhere.  (uipc_cow.c)

if_media.c:	In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid
		calling malloc() with M_WAITOK.  Return an error if
		the M_NOWAIT malloc fails.

		The ti(4) driver and the wi(4) driver, at least, call
		this with a mutex held.  This causes witness warnings
		for 'ifconfig -a' with a wi(4) or ti(4) board in the
		system.  (I've only verified for ti(4)).

ip_output.c:	Fragment large datagrams so that each segment contains
		a multiple of PAGE_SIZE amount of data plus headers.
		This allows the receiver to potentially do page
		flipping on receives.

if_ti.c:	Add zero copy receive support to the ti(4) driver.  If
		TI_PRIVATE_JUMBOS is not defined, it now uses the
		jumbo(9) buffer allocator for jumbo receive buffers.

		Add a new character device interface for the ti(4)
		driver for the new debugging interface.  This allows
		(a patched version of) gdb to talk to the Tigon board
		and debug the firmware.  There are also a few additional
		debugging ioctls available through this interface.

		Add header splitting support to the ti(4) driver.

		Tweak some of the default interrupt coalescing
		parameters to more useful defaults.

		Add hooks for supporting transmit flow control, but
		leave it turned off with a comment describing why it
		is turned off.

if_tireg.h:	Change the firmware rev to 12.4.11, since we're really
		at 12.4.11 plus fixes from 12.4.13.

		Add defines needed for debugging.

		Remove the ti_stats structure, it is now defined in
		sys/tiio.h.

ti_fw.h:	12.4.11 firmware.

ti_fw2.h:	12.4.11 firmware, plus selected fixes from 12.4.13,
		and my header splitting patches.  Revision 12.4.13
		doesn't handle 10/100 negotiation properly.  (This
		firmware is the same as what was in the tree previously,
		with the addition of header splitting support.)

sys/jumbo.h:	Jumbo buffer allocator interface.

sys/mbuf.h:	Add a new external mbuf type, EXT_DISPOSABLE, to
		indicate that the payload buffer can be thrown away /
		flipped to a userland process.

socketvar.h:	Add prototype for socow_setup.

tiio.h:		ioctl interface to the character portion of the ti(4)
		driver, plus associated structure/type definitions.

uio.h:		Change prototype for uiomoveco() so that we'll know
		whether the source page is disposable.

ufs_readwrite.c:Update for new prototype of uiomoveco().

vm_fault.c:	In vm_fault(), check to see whether we need to do a page
		based copy on write fault.

vm_object.c:	Add a new function, vm_object_allocate_wait().  This
		does the same thing that vm_object allocate does, except
		that it gives the caller the opportunity to specify whether
		it should wait on the uma_zalloc() of the object structre.

		This allows vm objects to be allocated while holding a
		mutex.  (Without generating WITNESS warnings.)

		vm_object_allocate() is implemented as a call to
		vm_object_allocate_wait() with the malloc flag set to
		M_WAITOK.

vm_object.h:	Add prototype for vm_object_allocate_wait().

vm_page.c:	Add page-based copy on write setup, clear and fault
		routines.

vm_page.h:	Add page based COW function prototypes and variable in
		the vm_page structure.

Many thanks to Drew Gallatin, who wrote the zero copy send and receive
code, and to all the other folks who have tested and reviewed this code
over the years.
2002-06-26 03:37:47 +00:00
Bill Paul
6263665f87 Fix the definitions for memory bank sizes, which I somehow got wrong.
The constant I was using was correct, but I mislabeled it as 256K when
it should have been 512K. This doesn't actually change the code, but
it clarifies things somewhat.

Submitted by:	Chuck Cranor <chuck@research.att.com>
2001-04-26 16:40:45 +00:00
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
Bosko Milekic
ca43854a66 (Introduce something sitting in my repo for 3 weeks now...)
Have if_ti stop "hiding" the softc pointer in the buffer region. Rather,
use the available void * passed to the free routine and pass the softc
pointer through there.

To note: in MEXTADD(), TI_JUMBO_FRAMELEN should probably be TI_JLEN. I left it
unchanged, because this way I'm sure to not damage anything in this respect...
2000-10-21 00:13:35 +00:00
Bill Paul
d1ce910572 First round of converting network drivers from spls to mutexes. This
takes care of all the 10/100 and gigE PCI drivers that I've done.
Next will be the wireless drivers, then the USB ones. I may pick up
some stragglers along the way. I'm sort of playing this by ear: if
anyone spots any places where I've screwed up horribly, please let me
know.
2000-10-13 17:54:19 +00:00
David Malone
a5c4836d39 Replace the mbuf external reference counting code with something
that should be better.

The old code counted references to mbuf clusters by using the offset
of the cluster from the start of memory allocated for mbufs and
clusters as an index into an array of chars, which did the reference
counting. If the external storage was not a cluster then reference
counting had to be done by the code using that external storage.

NetBSD's system of linked lists of mbufs was cosidered, but Alfred
felt it would have locking issues when the kernel was made more
SMP friendly.

The system implimented uses a pool of unions to track external
storage. The union contains an int for counting the references and
a pointer for forming a free list. The reference counts are
incremented and decremented atomically and so should be SMP friendly.
This system can track reference counts for any sort of external
storage.

Access to the reference counting stuff is now through macros defined
in mbuf.h, so it should be easier to make changes to the system in
the future.

The possibility of storing the reference count in one of the
referencing mbufs was considered, but was rejected 'cos it would
often leave extra mbufs allocated. Storing the reference count in
the cluster was also considered, but because the external storage
may not be a cluster this isn't an option.

The size of the pool of reference counters is available in the
stats provided by "netstat -m".

PR:		19866
Submitted by:	Bosko Milekic <bmilekic@dsuper.net>
Reviewed by:	alfred (glanced at by others on -net)
2000-08-19 08:32:59 +00:00
Bill Paul
6f069b494a Add support for the Netgear GA620T copper gigabit card. 2000-08-02 18:49:17 +00:00
Bill Paul
e87631b976 Update the Tigon driver to support 1000baseTX gigE over copper AceNIC
cards. This basically involves switching to the 12.4.13 firmware, plus
a couple of minor tweaks to the driver.

Also changed the jumbo buffer allocation scheme just a little to avoid
'failed to allocate jumbo buffer' conditions in certain cases.
2000-07-20 22:24:43 +00:00
Jake Burkholder
e39756439c Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00
Jake Burkholder
740a1973a6 Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by:	phk
Reviewed by:	phk
Approved by:	mdodd
2000-05-23 20:41:01 +00:00
Bill Paul
27440dd237 Update the Tigon firmware to 12.3.21. This fixes a few bugs and adds support
for cards with 2MB of on-board SRAM.
2000-04-24 17:44:45 +00:00
Bill Paul
827a61b63d Update Tigon firmware yet again, this time to version 12.3.20. 2000-03-18 01:30:36 +00:00
Bill Paul
74ea2d6f60 Update the Tigon driver to use the 12.3.18 firmware release from Alteon.
(No changes to the driver code itself.)

Approved by: jkh
2000-02-10 00:37:48 +00:00
Bill Paul
b822a5eae9 Add the vendor/device ID for the Farallon PN9000SX gigabit ethernet
card, which is apparently also a Tigon 2 device.
2000-01-18 00:26:29 +00:00
Bill Paul
981069a71b Update the Tigon driver firmware images to the latest release from
Alteon (12.6.15).
1999-09-22 06:43:16 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Bill Paul
af1c062105 On FreeBSD/i386, when you use the SYS_RES_MEMORY resource to allocate
a PCI memory mapped region, rman_get_bushandle() returns what happens
to be a kernel virtual address pointing to the base of the PCI shared
memory window. However this is not the behavior on all platforms:
the only thing you should do with the bushandle is pass it to the
bus_spare_read()/bus_space_write() routines. If you actually do want
the kernel virtual address of the base of the PCI memory window, you
need to use rman_get_virtual().

The problem is that at the moment, rman_get_virtual() returns a physical
address, which is bad. In order to get the kernel virtual address we
need, we have to play with it a little.

Presumeably this behavior will be changed, but in the meantime the
Tigon driver won't work. So for the moment, I'm adding a kludge to
make things happy on the alpha: the correct kernel virtual address
is calculated from the value returned by rman_get_virtual(). This
should be removed once rman_get_virtual() starts doing the right
thing.

This should make the Tigon actuall work on the alpha now.
1999-07-27 03:54:48 +00:00
Bill Paul
571a80b261 Clean up the buffer allocation code a bit. Make sure to initialize certain
critical mbuf fields to sane values. Simplify the use of ETHER_ALIGN to
enforce payload alignment, and turn it on on the x86 as well as alpha
since it helps with NFS which wants the payload to be longword aligned
even though the hardware doesn't require it.

This fixes a problem with the ti driver causing an unaligned access trap
on the Alpha due to m_adj() sometimes not setting the alignment correctly
because of incomplete mbuf initialization.
1999-07-23 18:46:24 +00:00
Bill Paul
39d837d4b5 Dangit. Somehow the pmap_kextract hack for alpha snuck back into these
files. Change them back to alpha_XXX_dmamap().

Pointed out by: Andrew Gallatin
1999-07-23 02:18:01 +00:00
Bill Paul
89ca84e6db Convert the Alteon Tigon gigabit ethernet driver to newbus. Also upgrade
to the latest firmware release from Alteon (12.3.12).
1999-07-23 02:10:11 +00:00
Bill Paul
7f971fc2ca Remove ti_refill_rx_rings() and associated stuff; replace dirty RX buffers
in ti_rxeof() instead. This doesn't really seem to provide much in the
way of a performance boost, and I'm pretty sure it can cause mbuf leakage
in some extreme cases.
1999-07-05 20:19:41 +00:00
Bill Paul
274342303b Add a transmit descriptor usage counter and use it to absolutely,
positively not let ti_encap() fill up the TX ring all the way and wrap
around. This fixes a potential transmit lockup where a really fast
machine (or particular TX traffic pattern) can overrun the end of the
ring.

Reported by: John Plevyak <jplevyak@inktomi.com>
1999-06-19 00:36:56 +00:00
Andrew Gallatin
50b8d1cce4 Allow chipset drivers to specify the direct-mapped DMA window's mask in
preparation for tsunami support.  Previous chipsets' direct-mapped DMA
mask was always 1024*1024*1024.  The Tsunami chipset needs it to be
2*1024*1024*1024

These changes should not affect the i386 port

Reviewed by:	Doug Rabson <dfr@nlsystems.com>
1999-05-26 23:01:57 +00:00
Bill Paul
6263933e7a Upgrade firmware images Alteon's latest release (12.3.10). This fixes a
bug in the stats accounting (nicSendBDs counter was bogus when TX ring was
configured to be in host memory).

Update if_tireg.h to look for new firmware fix level.
1999-05-03 17:44:53 +00:00
Bill Paul
d02c233129 Add driver support for gigabit ethernet adapters based on the Alteon
Networks Tigon 1 and Tigon 2 chipsets. There are a _lot_ of OEM'ed
gigabit ethernet adapters out there which use the Alteon chipset so
this driver covers a fair amount of hardware. I know that it works with
the Alteon AceNIC, 3Com 3c985 and Netgear GA620, however it should also
work with the DEC/Compaq EtherWORKS 1000, Silicon Graphics Gigabit
ethernet board, NEC Gigabit Ethernet board and maybe even the IBM and
and Sun boards. The Netgear board is the cheapest (~$350US) but still
yields fairly good performance.

Support is provided for jumbo frames with all adapters (just set the
MTU to something larger than 1500 bytes), as well as hardware multicast
filtering and vlan tagging (in conjunction with the vlan support in
-current, which I should merge into -stable soon). There are some hooks
for checksum offload support, but they're turned off for now since
FreeBSD doesn't have an officially sanctioned way to support checksum
offloading (yet).

I have not added the 'device ti0' entry to GENERIC since the driver
with all the firmware compiled in is quite large, and it doesn't really
fit into the category of generic hardware.
1999-04-06 17:08:31 +00:00