Commit Graph

51 Commits

Author SHA1 Message Date
mmacy
7aeac9ef18 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
imp
d7cd0a5408 Add PNP info to xl as an example. 2018-03-23 15:35:15 +00:00
pfg
1537078d8f sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 14:52:40 +00:00
pfg
9da7bdde06 spdx: initial adoption of licensing ID tags.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.

Initially, only tag files that use BSD 4-Clause "Original" license.

RelNotes:	yes
Differential Revision:	https://reviews.freebsd.org/D13133
2017-11-18 14:26:50 +00:00
emaste
1901c3e1f2 Remove register keyword from sys/ and ANSIfy prototypes
A long long time ago the register keyword told the compiler to store
the corresponding variable in a CPU register, but it is not relevant
for any compiler used in the FreeBSD world today.

ANSIfy related prototypes while here.

Reviewed by:	cem, jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D10193
2017-05-17 00:34:34 +00:00
pfg
eed4bd22ad sys/dev: minor spelling fixes.
Most affect comments, very few have user-visible effects.
2016-05-03 03:41:25 +00:00
glebius
306a6faf84 These files were getting sys/malloc.h and vm/uma.h with header pollution
via sys/mbuf.h
2016-02-01 17:41:21 +00:00
glebius
54ea96669f Mechanically convert to if_inc_counter(). 2014-09-18 20:35:22 +00:00
glebius
6c652d2b37 The MII layer shouldn't care about administrative status of an
interface. Make MII drivers forget about 'struct ifnet'.

Later plan is to provide an administrative downcall from ifnet
layer into drivers, to inform them about administrative status
change. If someone thinks that processing MII events for an
administratively down interface is a big problem, then drivers
would turn MII processing off.

The following MII drivers do evil things, like strcmp() on
driver name, so they still need knowledge of ifnet and thus
include if_var.h. They all need to be fixed:

sys/dev/mii/brgphy.c
sys/dev/mii/e1000phy.c
sys/dev/mii/ip1000phy.c
sys/dev/mii/jmphy.c
sys/dev/mii/nsphy.c
sys/dev/mii/rgephy.c
sys/dev/mii/truephy.c

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 18:40:17 +00:00
glebius
ff6e113f1b The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
glebius
a69aaa7721 Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags in sys/dev.
2012-12-04 09:32:43 +00:00
dim
373133f0ad Remove duplicate const specifiers in many drivers (I hope I got all of
them, please let me know if not).  Most of these are of the form:

static const struct bzzt_type {
	[...list of members...]
} const bzzt_devs[] = {
	[...list of initializers...]
};

The second const is unnecessary, as arrays cannot be modified anyway,
and if the elements are const, the whole thing is const automatically
(e.g. it is placed in .rodata).

I have verified this does not change the binary output of a full kernel
build (except for build timestamps embedded in the object files).

Reviewed by:	yongari, marius
MFC after:	1 week
2012-11-05 19:16:27 +00:00
marius
8abe2314ae - Change the module order of these MAC drivers to be last so they are
deterministically handled after the corresponding PHY drivers when
  loaded as modules. Otherwise, when these MAC/PHY driver pairs are
  compiled into a single module probing the PHY driver may fail. This
  makes r151438 and r226154 actually work. [1]
  Reported and tested by: yongari (fxp(4))
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.

Submitted by:	jhb [1]
MFC after:	3 days
2012-05-11 02:40:40 +00:00
marius
0f30a4cf89 Use DEVMETHOD_END. 2011-11-23 20:27:26 +00:00
marius
17e14c6132 - There's no need to overwrite the default device method with the default
one. Interestingly, these are actually the default for quite some time
  (bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
  since r52045) but even recently added device drivers do this unnecessarily.
  Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
  Discussed with: jhb
- Also while at it, use __FBSDID.
2011-11-22 21:28:20 +00:00
marius
b4610d98b0 - Import the common MII bitbang'ing code from NetBSD and convert drivers to
take advantage of it instead of duplicating it. This reduces the size of
  the i386 GENERIC kernel by about 4k. The only potential in-tree user left
  unconverted is xe(4), which generally should be changed to use miibus(4)
  instead of implementing PHY handling on its own, as otherwise it makes not
  much sense to add a dependency on miibus(4)/mii_bitbang(4) to xe(4) just
  for the MII bitbang'ing code. The common MII bitbang'ing code also is
  useful in the embedded space for using GPIO pins to implement MII access.
- Based on lessons learnt with dc(4) (see r185750), add bus barriers to the
  MII bitbang read and write functions of the other drivers converted in
  order to ensure the intended ordering. Given that register access via an
  index register as well as register bank/window switching is subject to the
  same problem, also add bus barriers to the respective functions of smc(4),
  tl(4) and xl(4).
- Sprinkle some const.

Thanks to the following testers:
Andrew Bliznak (nge(4)), nwhitehorn@ (bm(4)), yongari@ (sis(4) and ste(4))
Thanks to Hans-Joerg Sirtl for supplying hardware to test stge(4).

Reviewed by:	yongari (subset of drivers)
Obtained from:	NetBSD (partially)
2011-11-01 16:13:59 +00:00
marius
07ac74d257 - Follow the lead of dcphy(4) and pnphy(4) and move the reminder of the PHY
drivers that only ever attach to a particular MAC driver, i.e. inphy(4),
  ruephy(4) and xlphy(4), to the directory where the respective MAC driver
  lives and only compile it into the kernel when the latter is also there,
  also removing it from miibus.ko and moving it into the module of the
  respective MAC driver.
- While at it, rename exphy.c, which comes from NetBSD where the MAC driver
  it corresponds to also is named ex(4) instead of xl(4) but that in FreeBSD
  actually identifies itself as xlphy(4), and its function names accordingly
  for consistency.
- Additionally while at it, fix some minor style issues like whitespace
  in the register headers and add multi-inclusion protection to inphyreg.h.
2011-10-08 12:33:10 +00:00
imp
87cfd8e26b Really spell suppress the right way 2011-06-21 22:17:28 +00:00
imp
19f3f95131 My broken 'u' key scks! 2011-06-21 22:16:04 +00:00
imp
3e1f211217 Supress warning that command didn't complete when the parent bus
thinks the card is gone.
2011-06-21 20:51:09 +00:00
yongari
cdcd15ee71 Fix build. 2011-05-07 04:40:44 +00:00
yongari
84665e966a Remove unneeded use of variable status. This should have been done
in r221557.
2011-05-07 02:19:46 +00:00
yongari
e97ce6dc83 XL_DMACTL is 32bit register, use 32bit write macro.
While I'm here add more bits for the register.
2011-05-07 00:25:12 +00:00
yongari
3a014bc50f Rearm watchdog timer if driver kick controller to recover from TX
underrun error.
While here, prepend 0x to status code to show TX status is hex
number.
2011-05-07 00:18:58 +00:00
yongari
091c72cf13 Rename xl_stats_update() callout handler to xl_tick() and move MII
tick driving logic to xl_tick(). Now xl_tick() handles MII tick as
well as periodic updating of statistics.
This change removes a hack used in interrupt handler where it
wanted to update statistics without driving MII tick.
2011-05-07 00:06:02 +00:00
yongari
79e423bbd2 Reuse the TX descriptor(DPD) if xl_encap() failed instead of just
picking the next available one. This may explain why xl(4) sees TX
underrun error with no queued frame. I hope this addresses a long
standing xl(4) watchdog timeout issue as well.

Obtained from:	OpenBSD
2011-05-06 23:49:10 +00:00
yongari
a3f2c28a33 Change xl_rxeof() a bit to return the number of processed frames in
RX descriptor ring. Previously it returned the number of frames
that were successfully passed to upper stack which in turn means it
ignored frames that were discarded due to errors. The number of
processed frames in RX descriptor ring is used to detect whether
driver is out of sync with controller's current descriptor pointer.
Returning number of processed frames reduces unnecessary (probably
wrong) re-synchronization.

While here, remove unnecessary local variable initialization.
2011-05-06 23:01:29 +00:00
yongari
9225850fe6 Terminate interrupt handler if driver detect it's not running.
Also add check for driver running state before trying to send
frames. While I'm here, use for loop.
2011-05-06 22:55:53 +00:00
yongari
6356409dba Updating status word should be the last operation of UPD structure
renewal.  Disable instruction reordering by adding volatile to
xl_list_onefrag structure.
2011-05-06 22:45:13 +00:00
yongari
40c60ef59e Call bus_dmamap_sync() only after TX DPD update. 2011-05-06 22:36:43 +00:00
yongari
d5f31a325b Set status word once instead of twice. For 3C90xB/3C90xC, frame
length of status word is ignored. While here move bus_dmamap_sync()
up where DMA map is loaded.
2011-05-06 22:26:57 +00:00
yongari
ee61ee4139 Remove unnecessary htole32/le32toh dance. 2011-05-06 22:16:43 +00:00
yongari
809c03efc5 Rewrite RX filter logic and provide controller specific filter
handler for 3C90x and 3C90xB/C respectively.  This simplifies ioctl
handler as well as enhancing readability.
While I'm here don't reprogram multicast filter when driver is not
running.
2011-05-06 22:01:46 +00:00
jhb
00c3c01f4f Do a sweep of the tree replacing calls to pci_find_extcap() with calls to
pci_find_cap() instead.
2011-03-23 13:10:15 +00:00
marius
fc754ef053 Allocate the DMA memory shared between the host and the controller as
coherent.

MFC after:	2 weeks
2011-03-11 22:25:34 +00:00
yongari
1bdd9a8607 Add flow control for 3C905B and newer controllers. Note, these
controllers support RX pause only.

Reviewed by:	marius
2010-11-14 23:53:13 +00:00
marius
d68ca83621 Correct an inverted check in r213893. 2010-11-05 19:38:28 +00:00
marius
385153aa98 Convert the PHY drivers to honor the mii_flags passed down and convert
the NIC drivers as well as the PHY drivers to take advantage of the
mii_attach() introduced in r213878 to get rid of certain hacks. For
the most part these were:
- Artificially limiting miibus_{read,write}reg methods to certain PHY
  addresses; we now let mii_attach() only probe the PHY at the desired
  address(es) instead.
- PHY drivers setting MIIF_* flags based on the NIC driver they hang
  off from, partly even based on grabbing and using the softc of the
  parent; we now pass these flags down from the NIC to the PHY drivers
  via mii_attach(). This got us rid of all such hacks except those of
  brgphy() in combination with bce(4) and bge(4), which is way beyond
  what can be expressed with simple flags.

While at it, I took the opportunity to change the NIC drivers to pass
up the error returned by mii_attach() (previously by mii_phy_probe())
and unify the error message used in this case where and as appropriate
as mii_attach() actually can fail for a number of reasons, not just
because of no PHY(s) being present at the expected address(es).

Reviewed by:	jhb, yongari
2010-10-15 14:52:11 +00:00
yongari
05b7de36b6 Implement basic WOL support. Note, not all xl(4) controllers
support WOL. Some controllers require additional 3-wire auxiliary
remote wakeup connector to draw power. More recent xl(4)
controllers may not need the wakeup connector though.
2010-08-23 19:18:50 +00:00
yongari
f76184d2bf Move xl_reset() to xl_init_locked(). This will make driver
initialize controller from a known good state. Previously driver
used to issue controller reset while TX/RX DMA are in progress.
I guess resetting controller in active TX/RX DMA cycle is to ensure
stopping I/Os in xl_shutdown(). I remember some buggy controllers
didn't respond with stop command if controller is under high
network load at the time of shutdown so resetting controller was
the only safe way to stop the I/Os. However, from my experiments,
controller always responded with stop command under high network
load so I think it's okay to remove the xl_reset() in
device_shutdown handler.
Resetting controller also will clear configured RX filter which
in turn will make WOL support hard because driver have to reprogram
RX filter in WOL handler as well as setting station address.
2010-08-23 18:51:31 +00:00
yongari
9e34256876 Remove unnecessary controller reinitialization by checking
IFF_DRV_RUNNING flag.
2010-08-23 00:31:55 +00:00
yongari
73721416ee Clean up SIOCSIFCAP handler and allow RX checksum offloading could
be controlled by user.
2010-08-23 00:24:12 +00:00
imp
2706cc4a07 cardbus -> CardBus 2010-01-03 23:29:49 +00:00
yongari
2706e21f2c Make xl(4) build with Tx checksum offload.
PR:		kern/136409
Approved by:	re (kib)
2009-07-09 01:58:59 +00:00
rwatson
be5740a255 Use if_maddr_rlock()/if_maddr_runlock() rather than IF_ADDR_LOCK()/
IF_ADDR_UNLOCK() across network device drivers when accessing the
per-interface multicast address list, if_multiaddrs.  This will
allow us to change the locking strategy without affecting our driver
programming interface or binary interface.

For two wireless drivers, remove unnecessary locking, since they
don't actually access the multicast address list.

Approved by:	re (kib)
MFC after:	6 weeks
2009-06-26 11:45:06 +00:00
attilio
b523608331 When user_frac in the polling subsystem is low it is going to busy the
CPU for too long period than necessary.  Additively, interfaces are kept
polled (in the tick) even if no more packets are available.
In order to avoid such situations a new generic mechanism can be
implemented in proactive way, keeping track of the time spent on any
packet and fragmenting the time for any tick, stopping the processing
as soon as possible.

In order to implement such mechanism, the polling handler needs to
change, returning the number of packets processed.
While the intended logic is not part of this patch, the polling KPI is
broken by this commit, adding an int return value and the new flag
IFCAP_POLLING_NOCOUNT (which will signal that the return value is
meaningless for the installed handler and checking should be skipped).

Bump __FreeBSD_version in order to signal such situation.

Reviewed by:	emaste
Sponsored by:	Sandvine Incorporated
2009-05-30 15:14:44 +00:00
kmacy
73893c1685 remove dead code with reference to IFQ_HANDOFF 2009-04-27 22:53:35 +00:00
yongari
5632a9bf6f To make it easy whether xl(4) missed Tx completion interrupt check
number of queued packets in watchdog timeout handler. If there are
no queued packets just print a informational message and return
without resetting controller. Also fix to invoke correct Tx
completion handler as 3C905B needs different handler.
2009-04-21 00:42:11 +00:00
yongari
7364df80c2 Clear IFF_DRV_OACTIVE flag if one of queued packets was transmitted.
Previously it used to clear the flag only when the transmit queue
is empty which may slow down Tx performance.
While I'm here check whether driver is running and whether we can
queue more packets in if_start handler. This fixes occasional
watchdog timeouts.

Reported by:	xer < xernet <> hotmail dot it >
Tested by:	xer < xernet <> hotmail dot it >
2009-04-21 00:34:31 +00:00
imp
ed87bed6c0 remove now-redunant cardbus attachment. 2009-03-09 13:23:54 +00:00