Commit Graph

76 Commits

Author SHA1 Message Date
Gleb Smirnoff
ba583cfed2 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:07:11 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Pedro F. Giffuni
df57947f08 spdx: initial adoption of licensing ID tags.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.

Initially, only tag files that use BSD 4-Clause "Original" license.

RelNotes:	yes
Differential Revision:	https://reviews.freebsd.org/D13133
2017-11-18 14:26:50 +00:00
Gleb Smirnoff
e8fd18f306 Shorten list of arguments to mbuf external storage freeing function.
All of these arguments are stored in m_ext, so there is no reason
to pass them in the argument list.  Not all functions need the second
argument, some don't even need the first one.  The second argument
lives in next cache line, so not dereferencing it is a performance
gain.  This was discovered in sendfile(2), which will be covered by
next commits.

The second goal of this commit is to bring even more flexibility
to m_ext mbufs, allowing to create more fields in m_ext, opaque to
the generic mbuf code, and potentially set and dereferenced by
subsystems.

Reviewed by:	gallatin, kbowling
Differential Revision:	https://reviews.freebsd.org/D12615
2017-10-09 20:35:31 +00:00
Ed Maste
3e85b721d6 Remove register keyword from sys/ and ANSIfy prototypes
A long long time ago the register keyword told the compiler to store
the corresponding variable in a CPU register, but it is not relevant
for any compiler used in the FreeBSD world today.

ANSIfy related prototypes while here.

Reviewed by:	cem, jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D10193
2017-05-17 00:34:34 +00:00
Pedro F. Giffuni
453130d9bf sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
2016-05-03 03:41:25 +00:00
Gleb Smirnoff
c8dfaf382f Mechanically convert to if_inc_counter(). 2014-09-19 03:51:26 +00:00
Gleb Smirnoff
15c28f87b8 All mbuf external free functions never fail, so let them be void.
Sponsored by:	Nginx, Inc.
2014-07-11 13:58:48 +00:00
Gleb Smirnoff
76039bc84f The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
Andre Oppermann
bb25e5ab00 Give (*ext_free) an int return value allowing for very sophisticated
external mbuf buffer management capabilities in the future.

For now only EXT_FREE_OK is defined with current legacy behavior.

Sponsored by:	The FreeBSD Foundation
2013-08-25 10:57:09 +00:00
Andre Oppermann
9a73687609 Add an mbuf pointer parameter to (*ext_free) to give the external
free function access to the mbuf the external memory was attached
to.

Mechanically adjust all users to include the mbuf parameter.

This fixes a long standing annoyance for external free functions.
Before one had to sacrifice one of the argument pointers for this.

Sponsored by:	The FreeBSD Foundation
2013-08-24 16:57:44 +00:00
Gleb Smirnoff
c6499eccad Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags in sys/dev.
2012-12-04 09:32:43 +00:00
Dimitry Andric
29658c96ce Remove duplicate const specifiers in many drivers (I hope I got all of
them, please let me know if not).  Most of these are of the form:

static const struct bzzt_type {
	[...list of members...]
} const bzzt_devs[] = {
	[...list of initializers...]
};

The second const is unnecessary, as arrays cannot be modified anyway,
and if the elements are const, the whole thing is const automatically
(e.g. it is placed in .rodata).

I have verified this does not change the binary output of a full kernel
build (except for build timestamps embedded in the object files).

Reviewed by:	yongari, marius
MFC after:	1 week
2012-11-05 19:16:27 +00:00
Kevin Lo
5bbe0c5357 ether_ifattach() sets if_mtu to ETHERMTU, don't bother set it again
Reviewed by:	yongari
2012-01-07 09:41:57 +00:00
Marius Strobl
4b7ec27007 - There's no need to overwrite the default device method with the default
one. Interestingly, these are actually the default for quite some time
  (bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
  since r52045) but even recently added device drivers do this unnecessarily.
  Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
  Discussed with: jhb
- Also while at it, use __FBSDID.
2011-11-22 21:28:20 +00:00
Pyun YongHyeon
57c81d92ae Close a race where SIOCGIFMEDIA ioctl get inconsistent link status.
Because driver is accessing a common MII structure in
mii_pollstat(), updating user supplied structure should be done
before dropping a driver lock.

Reported by:	Karim (fodillemlinkarimi <> gmail dot com)
2011-10-17 19:49:00 +00:00
Marius Strobl
6aa1145c84 - Remove unused remnants of MII bitbang'ing.
- Sprinkle const.
2011-10-11 21:52:24 +00:00
Marius Strobl
3fcb7a5365 - Remove attempts to implement setting of BMCR_LOOP/MIIF_NOLOOP
(reporting IFM_LOOP based on BMCR_LOOP is left in place though as
  it might provide useful for debugging). For most mii(4) drivers it
  was unclear whether the PHYs driven by them actually support
  loopback or not. Moreover, typically loopback mode also needs to
  be activated on the MAC, which none of the Ethernet drivers using
  mii(4) implements. Given that loopback media has no real use (and
  obviously hardly had a chance to actually work) besides for driver
  development (which just loopback mode should be sufficient for
  though, i.e one doesn't necessary need support for loopback media)
  support for it is just dropped as both NetBSD and OpenBSD already
  did quite some time ago.
- Let mii_phy_add_media() also announce the support of IFM_NONE.
- Restructure the PHY entry points to use a structure of entry points
  instead of discrete function pointers, and extend this to include
  a "reset" entry point. Make sure any PHY-specific reset routine is
  always used, and provide one for lxtphy(4) which disables MII
  interrupts (as is done for a few other PHYs we have drivers for).
  This includes changing NIC drivers which previously just called the
  generic mii_phy_reset() to now actually call the PHY-specific reset
  routine, which might be crucial in some cases. While at it, the
  redundant checks in these NIC drivers for mii->mii_instance not being
  zero before calling the reset routines were removed because as soon
  as one PHY driver attaches mii->mii_instance is incremented and we
  hardly can end up in their media change callbacks etc if no PHY driver
  has attached as mii_attach() would have failed in that case and not
  attach a miibus(4) instance.
  Consequently, NIC drivers now no longer should call mii_phy_reset()
  directly, so it was removed from EXPORT_SYMS.
- Add a mii_phy_dev_attach() as a companion helper to mii_phy_dev_probe().
  The purpose of that function is to perform the common steps to attach
  a PHY driver instance and to hook it up to the miibus(4) instance and to
  optionally also handle the probing, addition and initialization of the
  supported media. So all a PHY driver without any special requirements
  has to do in its bus attach method is to call mii_phy_dev_attach()
  along with PHY-specific MIIF_* flags, a pointer to its PHY functions
  and the add_media set to one. All PHY drivers were updated to take
  advantage of mii_phy_dev_attach() as appropriate. Along with these
  changes the capability mask was added to the mii_softc structure so
  PHY drivers taking advantage of mii_phy_dev_attach() but still
  handling media on their own do not need to fiddle with the MII attach
  arguments anyway.
- Keep track of the PHY offset in the mii_softc structure. This is done
  for compatibility with NetBSD/OpenBSD.
- Keep track of the PHY's OUI, model and revision in the mii_softc
  structure. Several PHY drivers require this information also after
  attaching and previously had to wrap their own softc around mii_softc.
  NetBSD/OpenBSD also keep track of the model and revision on their
  mii_softc structure. All PHY drivers were updated to take advantage
  as appropriate.
- Convert the mebers of the MII data structure to unsigned where
  appropriate. This is partly inspired by NetBSD/OpenBSD.
- According to IEEE 802.3-2002 the bits actually have to be reversed
  when mapping an OUI to the MII ID registers. All PHY drivers and
  miidevs where changed as necessary. Actually this now again allows to
  largely share miidevs with NetBSD, which fixed this problem already
  9 years ago. Consequently miidevs was synced as far as possible.
- Add MIIF_NOMANPAUSE and mii_phy_flowstatus() calls to drivers that
  weren't explicitly converted to support flow control before. It's
  unclear whether flow control actually works with these but typically
  it should and their net behavior should be more correct with these
  changes in place than without if the MAC driver sets MIIF_DOPAUSE.

Obtained from:	NetBSD (partially)
Reviewed by:	yongari (earlier version), silence on arch@ and net@
2011-05-03 19:51:29 +00:00
Marius Strobl
d6c65d276e Converted the remainder of the NIC drivers to use the mii_attach()
introduced in r213878 instead of mii_phy_probe(). Unlike r213893 these
are only straight forward conversions though.

Reviewed by:	yongari
2010-10-15 15:00:30 +00:00
John Baldwin
e55bc0154d - Hook into the existing stat timer to drive the transmit watchdog instead
of using if_watchdog and if_timer.
- Reorder detach to call ether_ifdetach() before anything else in tl(4)
  and wb(4).
2009-11-19 22:14:23 +00:00
Robert Watson
eb956cd041 Use if_maddr_rlock()/if_maddr_runlock() rather than IF_ADDR_LOCK()/
IF_ADDR_UNLOCK() across network device drivers when accessing the
per-interface multicast address list, if_multiaddrs.  This will
allow us to change the locking strategy without affecting our driver
programming interface or binary interface.

For two wireless drivers, remove unnecessary locking, since they
don't actually access the multicast address list.

Approved by:	re (kib)
MFC after:	6 weeks
2009-06-26 11:45:06 +00:00
Christian Brueffer
6282e61346 Remove unused variable.
Found with:	Coverity Prevent(tm)
CID:		549
2009-05-12 19:33:36 +00:00
Pyun YongHyeon
eeeebe75aa Plug memory leak in jumbo buffer allocation failure path.
Patch in the PR was modified to check active jumbo buffers in use
and other possible jumbo buffer leak.

Jumbo buffer usage in lge(4) still wouldn't be reliable due to lack
of driver lock in local jumbo buffer allocator. Either introduce
a new lock to protect jumbo buffer or switch to UMA backed page
allocator for jumbo frame is required.

PR:	kern/78072
2008-03-05 05:36:09 +00:00
Poul-Henning Kamp
cf827063a9 Give MEXTADD() another argument to make both void pointers to the
free function controlable, instead of passing the KVA of the buffer
storage as the first argument.

Fix all conventional users of the API to pass the KVA of the buffer
as the first argument, to make this a no-op commit.

Likely break the only non-convetional user of the API, after informing
the relevant committer.

Update the mbuf(9) manual page, which was already out of sync on
this point.

Bump __FreeBSD_version to 800016 as there is no way to tell how
many arguments a CPP macro needs any other way.

This paves the way for giving sendfile(9) a way to wait for the
passed storage to have been accessed before returning.

This does not affect the memory layout or size of mbufs.

Parental oversight by:	sam and rwatson.

No MFC is anticipated.
2008-02-01 19:36:27 +00:00
Pyun YongHyeon
6a087a8722 Fix function prototype for device_shutdown method. 2007-11-22 02:45:00 +00:00
Paolo Pisati
ef544f6312 o break newbus api: add a new argument of type driver_filter_t to
bus_setup_intr()

o add an int return code to all fast handlers

o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@
2007-02-23 12:19:07 +00:00
Gleb Smirnoff
6b9f5c941c - Consistently use if_printf() only in interface methods: if_start(),
if_watchdog, etc., or in functions used only in these methods.
  In all other functions in the driver use device_printf().
- Use __func__ instead of typing function name.

Submitted by:	Alex Lyashkov <umka sevcity.net>
2006-09-15 15:16:12 +00:00
Poul-Henning Kamp
c40da00ca3 Since DELAY() was moved, most <machine/clock.h> #includes have been
unnecessary.
2006-05-16 14:37:58 +00:00
John Baldwin
73dbd3da73 Remove various bits of conditional Alpha code and fixup a few comments. 2006-05-12 05:04:46 +00:00
Gleb Smirnoff
23033eebf4 Do not touch ifp->if_baudrate in miibus aware drivers. 2006-02-14 12:44:56 +00:00
John Baldwin
b05223a327 Add locking and mark MPSAFE:
- Add locked variants of start, init, and ifmedia_upd.
- Add a mutex to the softc and remove spl calls.
- Use callout(9) rather than timeout(9).
- Setup interrupt handler last in attach.
- Use M_ZERO rather than bzero.

MFC after:	1 week
Tested by:	wpaul
2005-11-23 18:51:34 +00:00
Ruslan Ermilov
4a0d6638b3 - Store pointer to the link-level address right in "struct ifnet"
rather than in ifindex_table[]; all (except one) accesses are
  through ifp anyway.  IF_LLADDR() works faster, and all (except
  one) ifaddr_byindex() users were converted to use ifp->if_addr.

- Stop storing a (pointer to) Ethernet address in "struct arpcom",
  and drop the IFP2ENADDR() macro; all users have been converted
  to use IF_LLADDR() instead.
2005-11-11 16:04:59 +00:00
Warner Losh
7b279558cc Replace FreeBSD 3.x syntax (controller miibus0) with 4.x syntax
(device miibus) in time for 7.0 :-)
2005-10-22 05:06:55 +00:00
John Baldwin
78de678e75 - Use if_printf() and device_printf() and axe lge_unit from the softc.
- Don't bzero the softc first thing in attach.
- Cleanup error handling in attach() to avoid lots of duplication.
- Don't initialize the callout handle twice.

MFC after:	3 days
2005-10-03 15:52:34 +00:00
Warner Losh
ad4f426ef6 Make sure that we call if_free(ifp) after bus_teardown_intr. Since we
could get an interrupt after we free the ifp, and the interrupt
handler depended on the ifp being still alive, this could, in theory,
cause a crash.  Eliminate this possibility by moving the if_free to
after the bus_teardown_intr() call.
2005-09-19 03:10:21 +00:00
Robert Watson
13f4c340ae Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags.  Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags.  This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by:	pjd, bz
MFC after:	7 days
2005-08-09 10:20:02 +00:00
Robert Watson
13b203d0d7 Modify device drivers supporting multicast addresses to lock if_addr_mtx
over iteration of their multicast address lists when synchronizing the
hardware address filter with the network stack-maintained list.

Problem reported by:	Ed Maste (emaste at phaedrus dot sandvine dot ca>
MFC after:		1 week
2005-08-03 00:18:35 +00:00
Brooks Davis
fc74a9f93a Stop embedding struct ifnet at the top of driver softcs. Instead the
struct ifnet or the layer 2 common structure it was embedded in have
been replaced with a struct ifnet pointer to be filled by a call to the
new function, if_alloc(). The layer 2 common structure is also allocated
via if_alloc() based on the interface type. It is hung off the new
struct ifnet member, if_l2com.

This change removes the size of these structures from the kernel ABI and
will allow us to better manage them as interfaces come and go.

Other changes of note:
 - Struct arpcom is no longer referenced in normal interface code.
   Instead the Ethernet address is accessed via the IFP2ENADDR() macro.
   To enforce this ac_enaddr has been renamed to _ac_enaddr.
 - The second argument to ether_ifattach is now always the mac address
   from driver private storage rather than sometimes being ac_enaddr.

Reviewed by:	sobomax, sam
2005-06-10 16:49:24 +00:00
Yoshihiro Takahashi
d4fcf3cba5 Remove bus_{mem,p}io.h and related code for a micro-optimization on i386
and amd64.  The optimization is a trivial on recent machines.

Reviewed by:	-arch (imp, marcel, dfr)
2005-05-29 04:42:30 +00:00
Warner Losh
b77e575e1d Use BUS_PROBE_DEFAULT for pci probe return value 2005-03-05 18:17:35 +00:00
Warner Losh
098ca2bda9 Start each of the license/copyright comments with /*-, minor shuffle of lines 2005-01-06 01:43:34 +00:00
Poul-Henning Kamp
6d6b7a18aa Hide link up/down/media printfs behind bootverbose 2004-11-08 19:21:57 +00:00
Robert Watson
b095765df6 Since if_lge doesn't contain locking or run with INTR_MPSAFE, mark
the interface as IFF_NEEDSGIANT so if_start is run holding Giant.
2004-08-13 23:18:01 +00:00
Warner Losh
794950069f Remove the setting of the pci config variables on power state changes.
The bus does this now.
2004-06-28 20:26:21 +00:00
Christian Weisgerber
0e939c0cea Replace handrolled CRC calculation with ether_crc32_[lb]e(). 2004-06-09 14:34:04 +00:00
Poul-Henning Kamp
fe12f24bb0 Add missing <sys/module.h> includes 2004-05-30 20:08:47 +00:00
Maxime Henrion
866a788cc2 We don't need to initialize if_output, ether_ifattach() does it
for us.
2004-05-23 16:11:53 +00:00
Nate Lawson
5f96beb9e0 Convert callers to the new bus_alloc_resource_any(9) API.
Submitted by:	Mark Santcroos <marks@ripe.net>
Reviewed by:	imp, dfr, bde
2004-03-17 17:50:55 +00:00
Matthew N. Dodd
e3bbbec2ca Announce ethernet MAC addresss in ether_ifattach(). 2004-03-14 07:12:25 +00:00
David E. O'Brien
a55a017f42 Don't use caddr_t in mchash(). Also use C99 spellings over BSD ones.
Requested by:	bde,imp
2003-12-08 07:54:15 +00:00