37 Commits

Author SHA1 Message Date
Pyun YongHyeon
ff3ced1270 - Tx side bus_dmamap_load_mbuf_sg(9) support. This reduces bookkeeping
requiried to keep consistent softc state before/after callback function
  invocation and supposed to be sligntly faster than previous one as it
  wouldn't incur callback overhead. With this change callback function
  was gone.
- Decrease TI_MAXTXSEGS to 32 from 128. It seems that most mbuf chain
  length is less than 32 and it would be re-packed with m_defrag(9) if
  its chain length is larger than TI_MAXTXSEGS. This would protect ti(4)
  against possible kernel stack overflow when txsegs[] is put on stack.
  Alternatively, we can embed the txsegs[] into softc. However, that
  would waste memory and make Tx/Rx speration hard when we want to
  sperate Tx/Rx handlers to optimize locking.
- Fix dma map tracking used in Tx path. Previously it used the dma map
  of the last mbuf chain in ti_txeof() which was incorrect as ti(4)
  used dma map of the first mbuf chain when it loads a mbuf chain with
  bus_dmamap_load_mbuf(9). Correct the bug by introducing queues that
  keep track of active/inactive dma maps/mbuf chain.
- Use ti_txcnt to check whether driver need to set watchdog timer instead
  of blidnly clearing the timer in ti_txeof().
- Remove the 3rd arg. of ti_encap(). Since ti(4) now caches the last
  descriptor index(ti_tx_saved_prodidx) used in Tx there is no need to
  pass it as a fuction arg.
- Change data type of producer/consumer index to int from u_int16_t in
  order to remove implicit type conversions in Tx/Rx handlers.
- Check interface queue before getting a mbuf chain to reduce locking
  overhead.
- Check number of available Tx descriptores to be 16 or higher in
  ti_start(). This wouldn't protect Tx descriptor shortage but it would
  reduce number of bus_dmamap_unload(9) calls in ti_encap() when we are
  about to running out of Tx descriptors.
- Command NIC to send packets ony when the driver really has packets
  enqueued. Previously it always set TI_MB_SENDPROD_IDX which would
  command NIC to DMA Tx descriptors into NIC local memory regardless
  of Tx descriptor changes.

Reviewed by:	scottl
2006-01-03 06:14:07 +00:00
Scott Long
557e53c6f7 Cache the tx producer index instead of reading it every time ti_start is
called.
2005-12-28 08:36:32 +00:00
Pyun YongHyeon
d54c905707 Bring big-endian architecture support for ti(4).
. remove unnecessay header files after Scott's bus_dma(9) commit.
 . remove global variable tis which was introduced at the time of
   zero_copy(9) changes. The variable tis was not used at all. The
   same applyes to ti_links in softc so axe it.
 . deregister variables.
 . axe ti_vhandle and switch to use explicit register access for
   accessing NIC local memory. Creates three variants of ti_mem to
   read/write NIC local memory(ti_mem_read, ti_mem_write) and clearing
   NIC local memory(ti_mem_zero). This greatly enhances code
   readability and have ti(4) drop using shared memory scheme for
   Tigon 1. As Tigon 1 switched to use explicit register access for Tx,
   axe ti_tx_ring_nic/ti_cmd_ring in softc.(Tigon 2 used to host ring
   scheme which means there is no need to access NIC local memory via
   register access for Tx and NIC would DMA the modified Tx rings into
   its local memory.) [1]
 . introduce new macro TI_EVENT_*/TI_CMD_* to handle NIC envent/command.
   Instead of using bit fields assginment for accessing the event, use
   shift operations to set/get it. [1]
 . add additional check for valid DMA tags in ti_free_dmamaps().
 . add missing bus_dmamap_sync/bus_dmamap_unload in ti_free_*_ring_*.
 . fix locking nits(MTX_RECURSE mutex) and make ti(4) MPSAFE.
 . change data type of ti_rdata_phys to bus_addr_t and don't blindly
   cast to uint32_t.
 . rearrange detach path and make ti(4) survive during device detach.
 . for Tigon 1, use explicit register access for checking Tx descriptors
   in ti_encap()/ti_txeof(). [1]
 . properly call bus_dmamap_sync(9) for updating statistics.
 . remove extra semicolon in ti_encap()
 . rewrite loading MAC address to work on strict-alignment architectures.
 . move TI_RD_OFF macro to if_tireg.h
 . axe ETHER_ALIGN as it's already defined in <net/ethernet.h>.
 . make macros immuine from expansion by adding parenthesis and do-while.
 . remove alpha specific hack as vtophys(9) is no longer used in ti(4)
   after Scott's bus_dma(9) fix.

Reviewed by:	scottl
Obtained from:	OpenBSD [1]
2005-12-28 02:57:19 +00:00
Scott Long
6239708b1c Fix the Tigon I/II driver to support 64-bit DMA. In the process, convert it
to use busdma.  Unlike most of the other drivers, but similar to the
if_em driver, pre-allocate the dmamaps at init time instead of allocating
them on the fly when descriptors need to be filled.  This isn't ideal right
now because a map is allocated for every descriptor slot in the tx, rx, mini,
and jumbo rings (which is a lot!) in order to simplify the bookkeeping, even
though the driver might support filling only a subset of those slots.
Luckily, maps are typically NULL on i386 and amd64, so the cost isn't
very high.  It could be an issue with sparc64, but the driver isn't endian
clean either, and that is a much bigger problem to solve first.

Note that jumbo frame support is under-tested, and I'm not even sure if
it till really works correctly given the evil VM magic that is does.
The changes here attempt to preserve the existing semanitcs.

Thanks to Martin Nillson for contributing the Netgear card for this work.

MFC-After: 3 weeks
2005-12-14 00:03:41 +00:00
Scott Long
d29dab303f Allocate the jumbo rx frame buffer with busdma. 2005-12-10 08:58:48 +00:00
John Baldwin
ccb7a62ef7 Typo. 2005-09-29 15:04:58 +00:00
Brooks Davis
fc74a9f93a Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
 - Struct arpcom is no longer referenced in normal interface code.
   Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
   To enforce this ac_enaddr has been renamed to _ac_enaddr.
 - The second argument to ether_ifattach is now always the mac address
   from driver private storage rather than sometimes being ac_enaddr.

Reviewed by:	sobomax, sam
2005-06-10 16:49:24 +00:00
Scott Long
586cfbb265 Start the process of modernizing the Tigon driver by using busdma for the
descriptor and configuration data.  Thanks to Martin Nilsson for providing
hardware.
2005-03-21 07:17:27 +00:00
Warner Losh
60727d8b86 /* -> /*- for license, minor formatting changes 2005-01-07 02:29:27 +00:00
Poul-Henning Kamp
89c9c53da0 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
Sam Leffler
5120abbfb4 Drop the driver lock around calls to if_input to avoid a LOR when
the packets are immediately returned for sending (e.g.  when bridging
or packet forwarding).  There are more efficient ways to do this
but for now use the least intrusive approach.

Reviewed by:	imp, rwatson
2003-11-14 19:00:32 +00:00
Alfred Perlstein
29f194457c Fix instances of macros with improperly parenthasized arguments.
Verified by: md5
2002-11-09 12:55:07 +00:00
Kenneth D. Merry
98cb733c67 At long last, commit the zero copy sockets code.
MAKEDEV:	Add MAKEDEV glue for the ti(4) device nodes.

ti.4:		Update the ti(4) man page to include information on the
		TI_JUMBO_HDRSPLIT and TI_PRIVATE_JUMBOS kernel options,
		and also include information about the new character
		device interface and the associated ioctls.

man9/Makefile:	Add jumbo.9 and zero_copy.9 man pages and associated
		links.

jumbo.9:	New man page describing the jumbo buffer allocator
		interface and operation.

zero_copy.9:	New man page describing the general characteristics of
		the zero copy send and receive code, and what an
		application author should do to take advantage of the
		zero copy functionality.

NOTES:		Add entries for ZERO_COPY_SOCKETS, TI_PRIVATE_JUMBOS,
		TI_JUMBO_HDRSPLIT, MSIZE, and MCLSHIFT.

conf/files:	Add uipc_jumbo.c and uipc_cow.c.

conf/options:	Add the 5 options mentioned above.

kern_subr.c:	Receive side zero copy implementation.  This takes
		"disposable" pages attached to an mbuf, gives them to
		a user process, and then recycles the user's page.
		This is only active when ZERO_COPY_SOCKETS is turned on
		and the kern.ipc.zero_copy.receive sysctl variable is
		set to 1.

uipc_cow.c:	Send side zero copy functions.  Takes a page written
		by the user and maps it copy on write and assigns it
		kernel virtual address space.  Removes copy on write
		mapping once the buffer has been freed by the network
		stack.

uipc_jumbo.c:	Jumbo disposable page allocator code.  This allocates
		(optionally) disposable pages for network drivers that
		want to give the user the option of doing zero copy
		receive.

uipc_socket.c:	Add kern.ipc.zero_copy.{send,receive} sysctls that are
		enabled if ZERO_COPY_SOCKETS is turned on.

		Add zero copy send support to sosend() -- pages get
		mapped into the kernel instead of getting copied if
		they meet size and alignment restrictions.

uipc_syscalls.c:Un-staticize some of the sf* functions so that they
		can be used elsewhere.  (uipc_cow.c)

if_media.c:	In the SIOCGIFMEDIA ioctl in ifmedia_ioctl(), avoid
		calling malloc() with M_WAITOK.  Return an error if
		the M_NOWAIT malloc fails.

		The ti(4) driver and the wi(4) driver, at least, call
		this with a mutex held.  This causes witness warnings
		for 'ifconfig -a' with a wi(4) or ti(4) board in the
		system.  (I've only verified for ti(4)).

ip_output.c:	Fragment large datagrams so that each segment contains
		a multiple of PAGE_SIZE amount of data plus headers.
		This allows the receiver to potentially do page
		flipping on receives.

if_ti.c:	Add zero copy receive support to the ti(4) driver.  If
		TI_PRIVATE_JUMBOS is not defined, it now uses the
		jumbo(9) buffer allocator for jumbo receive buffers.

		Add a new character device interface for the ti(4)
		driver for the new debugging interface.  This allows
		(a patched version of) gdb to talk to the Tigon board
		and debug the firmware.  There are also a few additional
		debugging ioctls available through this interface.

		Add header splitting support to the ti(4) driver.

		Tweak some of the default interrupt coalescing
		parameters to more useful defaults.

		Add hooks for supporting transmit flow control, but
		leave it turned off with a comment describing why it
		is turned off.

if_tireg.h:	Change the firmware rev to 12.4.11, since we're really
		at 12.4.11 plus fixes from 12.4.13.

		Add defines needed for debugging.

		Remove the ti_stats structure, it is now defined in
		sys/tiio.h.

ti_fw.h:	12.4.11 firmware.

ti_fw2.h:	12.4.11 firmware, plus selected fixes from 12.4.13,
		and my header splitting patches.  Revision 12.4.13
		doesn't handle 10/100 negotiation properly.  (This
		firmware is the same as what was in the tree previously,
		with the addition of header splitting support.)

sys/jumbo.h:	Jumbo buffer allocator interface.

sys/mbuf.h:	Add a new external mbuf type, EXT_DISPOSABLE, to
		indicate that the payload buffer can be thrown away /
		flipped to a userland process.

socketvar.h:	Add prototype for socow_setup.

tiio.h:		ioctl interface to the character portion of the ti(4)
		driver, plus associated structure/type definitions.

uio.h:		Change prototype for uiomoveco() so that we'll know
		whether the source page is disposable.

ufs_readwrite.c:Update for new prototype of uiomoveco().

vm_fault.c:	In vm_fault(), check to see whether we need to do a page
		based copy on write fault.

vm_object.c:	Add a new function, vm_object_allocate_wait().  This
		does the same thing that vm_object allocate does, except
		that it gives the caller the opportunity to specify whether
		it should wait on the uma_zalloc() of the object structre.

		This allows vm objects to be allocated while holding a
		mutex.  (Without generating WITNESS warnings.)

		vm_object_allocate() is implemented as a call to
		vm_object_allocate_wait() with the malloc flag set to
		M_WAITOK.

vm_object.h:	Add prototype for vm_object_allocate_wait().

vm_page.c:	Add page-based copy on write setup, clear and fault
		routines.

vm_page.h:	Add page based COW function prototypes and variable in
		the vm_page structure.

Many thanks to Drew Gallatin, who wrote the zero copy send and receive
code, and to all the other folks who have tested and reviewed this code
over the years.
2002-06-26 03:37:47 +00:00
Bill Paul
6263665f87 Fix the definitions for memory bank sizes, which I somehow got wrong.
The constant I was using was correct, but I mislabeled it as 256K when
it should have been 512K. This doesn't actually change the code, but
it clarifies things somewhat.

Submitted by:	Chuck Cranor <chuck@research.att.com>
2001-04-26 16:40:45 +00:00
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
Bosko Milekic
ca43854a66 (Introduce something sitting in my repo for 3 weeks now...)
Have if_ti stop "hiding" the softc pointer in the buffer region. Rather,
use the available void * passed to the free routine and pass the softc
pointer through there.

To note: in MEXTADD(), TI_JUMBO_FRAMELEN should probably be TI_JLEN. I left it
unchanged, because this way I'm sure to not damage anything in this respect...
2000-10-21 00:13:35 +00:00
Bill Paul
d1ce910572 First round of converting network drivers from spls to mutexes. This
takes care of all the 10/100 and gigE PCI drivers that I've done.
Next will be the wireless drivers, then the USB ones. I may pick up
some stragglers along the way. I'm sort of playing this by ear: if
anyone spots any places where I've screwed up horribly, please let me
know.
2000-10-13 17:54:19 +00:00
David Malone
a5c4836d39 Replace the mbuf external reference counting code with something
that should be better.

The old code counted references to mbuf clusters by using the offset
of the cluster from the start of memory allocated for mbufs and
clusters as an index into an array of chars, which did the reference
counting. If the external storage was not a cluster then reference
counting had to be done by the code using that external storage.

NetBSD's system of linked lists of mbufs was cosidered, but Alfred
felt it would have locking issues when the kernel was made more
SMP friendly.

The system implimented uses a pool of unions to track external
storage. The union contains an int for counting the references and
a pointer for forming a free list. The reference counts are
incremented and decremented atomically and so should be SMP friendly.
This system can track reference counts for any sort of external
storage.

Access to the reference counting stuff is now through macros defined
in mbuf.h, so it should be easier to make changes to the system in
the future.

The possibility of storing the reference count in one of the
referencing mbufs was considered, but was rejected 'cos it would
often leave extra mbufs allocated. Storing the reference count in
the cluster was also considered, but because the external storage
may not be a cluster this isn't an option.

The size of the pool of reference counters is available in the
stats provided by "netstat -m".

PR:		19866
Submitted by:	Bosko Milekic <bmilekic@dsuper.net>
Reviewed by:	alfred (glanced at by others on -net)
2000-08-19 08:32:59 +00:00
Bill Paul
6f069b494a Add support for the Netgear GA620T copper gigabit card. 2000-08-02 18:49:17 +00:00
Bill Paul
e87631b976 Update the Tigon driver to support 1000baseTX gigE over copper AceNIC
cards. This basically involves switching to the 12.4.13 firmware, plus
a couple of minor tweaks to the driver.

Also changed the jumbo buffer allocation scheme just a little to avoid
'failed to allocate jumbo buffer' conditions in certain cases.
2000-07-20 22:24:43 +00:00
Jake Burkholder
e39756439c Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00
Jake Burkholder
740a1973a6 Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by:	phk
Reviewed by:	phk
Approved by:	mdodd
2000-05-23 20:41:01 +00:00
Bill Paul
27440dd237 Update the Tigon firmware to 12.3.21. This fixes a few bugs and adds support
for cards with 2MB of on-board SRAM.
2000-04-24 17:44:45 +00:00
Bill Paul
827a61b63d Update Tigon firmware yet again, this time to version 12.3.20. 2000-03-18 01:30:36 +00:00
Bill Paul
74ea2d6f60 Update the Tigon driver to use the 12.3.18 firmware release from Alteon.
(No changes to the driver code itself.)

Approved by: jkh
2000-02-10 00:37:48 +00:00
Bill Paul
b822a5eae9 Add the vendor/device ID for the Farallon PN9000SX gigabit ethernet
card, which is apparently also a Tigon 2 device.
2000-01-18 00:26:29 +00:00
Bill Paul
981069a71b Update the Tigon driver firmware images to the latest release from
Alteon (12.6.15).
1999-09-22 06:43:16 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Bill Paul
af1c062105 On FreeBSD/i386, when you use the SYS_RES_MEMORY resource to allocate
a PCI memory mapped region, rman_get_bushandle() returns what happens
to be a kernel virtual address pointing to the base of the PCI shared
memory window. However this is not the behavior on all platforms:
the only thing you should do with the bushandle is pass it to the
bus_spare_read()/bus_space_write() routines. If you actually do want
the kernel virtual address of the base of the PCI memory window, you
need to use rman_get_virtual().

The problem is that at the moment, rman_get_virtual() returns a physical
address, which is bad. In order to get the kernel virtual address we
need, we have to play with it a little.

Presumeably this behavior will be changed, but in the meantime the
Tigon driver won't work. So for the moment, I'm adding a kludge to
make things happy on the alpha: the correct kernel virtual address
is calculated from the value returned by rman_get_virtual(). This
should be removed once rman_get_virtual() starts doing the right
thing.

This should make the Tigon actuall work on the alpha now.
1999-07-27 03:54:48 +00:00
Bill Paul
571a80b261 Clean up the buffer allocation code a bit. Make sure to initialize certain
critical mbuf fields to sane values. Simplify the use of ETHER_ALIGN to
enforce payload alignment, and turn it on on the x86 as well as alpha
since it helps with NFS which wants the payload to be longword aligned
even though the hardware doesn't require it.

This fixes a problem with the ti driver causing an unaligned access trap
on the Alpha due to m_adj() sometimes not setting the alignment correctly
because of incomplete mbuf initialization.
1999-07-23 18:46:24 +00:00
Bill Paul
39d837d4b5 Dangit. Somehow the pmap_kextract hack for alpha snuck back into these
files. Change them back to alpha_XXX_dmamap().

Pointed out by: Andrew Gallatin
1999-07-23 02:18:01 +00:00
Bill Paul
89ca84e6db Convert the Alteon Tigon gigabit ethernet driver to newbus. Also upgrade
to the latest firmware release from Alteon (12.3.12).
1999-07-23 02:10:11 +00:00
Bill Paul
7f971fc2ca Remove ti_refill_rx_rings() and associated stuff; replace dirty RX buffers
in ti_rxeof() instead. This doesn't really seem to provide much in the
way of a performance boost, and I'm pretty sure it can cause mbuf leakage
in some extreme cases.
1999-07-05 20:19:41 +00:00
Bill Paul
274342303b Add a transmit descriptor usage counter and use it to absolutely,
positively not let ti_encap() fill up the TX ring all the way and wrap
around. This fixes a potential transmit lockup where a really fast
machine (or particular TX traffic pattern) can overrun the end of the
ring.

Reported by: John Plevyak <jplevyak@inktomi.com>
1999-06-19 00:36:56 +00:00
Andrew Gallatin
50b8d1cce4 Allow chipset drivers to specify the direct-mapped DMA window's mask in
preparation for tsunami support.  Previous chipsets' direct-mapped DMA
mask was always 1024*1024*1024.  The Tsunami chipset needs it to be
2*1024*1024*1024

These changes should not affect the i386 port

Reviewed by:	Doug Rabson <dfr@nlsystems.com>
1999-05-26 23:01:57 +00:00
Bill Paul
6263933e7a Upgrade firmware images Alteon's latest release (12.3.10). This fixes a
bug in the stats accounting (nicSendBDs counter was bogus when TX ring was
configured to be in host memory).

Update if_tireg.h to look for new firmware fix level.
1999-05-03 17:44:53 +00:00
Bill Paul
d02c233129 Add driver support for gigabit ethernet adapters based on the Alteon
Networks Tigon 1 and Tigon 2 chipsets. There are a _lot_ of OEM'ed
gigabit ethernet adapters out there which use the Alteon chipset so
this driver covers a fair amount of hardware. I know that it works with
the Alteon AceNIC, 3Com 3c985 and Netgear GA620, however it should also
work with the DEC/Compaq EtherWORKS 1000, Silicon Graphics Gigabit
ethernet board, NEC Gigabit Ethernet board and maybe even the IBM and
and Sun boards. The Netgear board is the cheapest (~$350US) but still
yields fairly good performance.

Support is provided for jumbo frames with all adapters (just set the
MTU to something larger than 1500 bytes), as well as hardware multicast
filtering and vlan tagging (in conjunction with the vlan support in
-current, which I should merge into -stable soon). There are some hooks
for checksum offload support, but they're turned off for now since
FreeBSD doesn't have an officially sanctioned way to support checksum
offloading (yet).

I have not added the 'device ti0' entry to GENERIC since the driver
with all the firmware compiled in is quite large, and it doesn't really
fit into the category of generic hardware.
1999-04-06 17:08:31 +00:00