Commit Graph

164 Commits

Author SHA1 Message Date
Justin Hibbits
93037a67bf Mechanically convert mxge(4) to IfAPI
Reviewed by:	gallatin
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D37848
2023-02-06 12:52:46 -05:00
Elliott Mitchell
6147584bc0 mxge: cleanup from purging EOL release compatibility
The FreeBSD version was being used to trigger preprocessor conditionals
IFNET_BUF_RING and MXGE_VIRT_JUMBOS.  Since those are no longer
conditional, purge their effects.

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/603
Differential Revision: https://reviews.freebsd.org/D35560
2023-02-04 09:11:01 -07:00
Elliott Mitchell
c98b837ad0 mxge: purge EOL release compatibility
Remove FreeBSD 7, 8, 9 and 10 compatibility code.

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/603
Differential Revision: https://reviews.freebsd.org/D35560
2023-02-04 09:10:57 -07:00
John Baldwin
7ae99f80b6 pmap_unmapdev/bios: Accept a pointer instead of a vm_offset_t.
This matches the return type of pmap_mapdev/bios.

Reviewed by:	kib, markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D36548
2022-09-22 15:08:52 -07:00
John Baldwin
99b29e3408 mxge: Remove unused devclass argument to DRIVER_MODULE. 2022-05-09 12:22:03 -07:00
John Baldwin
b08a56ad46 mxgbe: Remove a dummy variable used to force a register read.
Keep the register read by casting the result of the pointer indirect
to void.
2022-04-07 17:01:26 -07:00
Warner Losh
37466424b2 mxge: Remove write only variables, mark ifp __unused
Sponsored by:		Netflix
2022-04-05 21:42:06 -06:00
Warner Losh
273676a44c mxge_rss_ethp_z8e_fw_modevent: eliminate write only variable parent
Sponsored by:		Netflix
2022-04-04 22:29:49 -06:00
Warner Losh
5f136a4c01 mxge_rss_eth_z8e_fw_modevent: eliminate write only variable parent
Sponsored by:		Netflix
2022-04-04 22:29:47 -06:00
Warner Losh
886bc93da8 mxge_ethp_z8e_fw_modevent: eliminate write only variable parent
Sponsored by:		Netflix
2022-04-04 22:29:45 -06:00
Warner Losh
498276b4b4 mxge_eth_z8e_fw_modevent: eliminate write only variable parent
Sponsored by:		Netflix
2022-04-04 22:29:43 -06:00
John Baldwin
aa54c24283 Use uintptr_t instead of unsigned long for integers holding pointers.
Reviewed by:	imp, gallatin
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27580
2020-12-16 00:17:54 +00:00
Mateusz Guzik
ecb86c27ba mxge: clean up empty lines in .c and .h files 2020-09-01 22:05:00 +00:00
Ryan Moeller
cbb9ccf735 Avoid trying to toggle TSO twice
Remove TSO from the toggle mask when automatically disabled by TXCKSUM* in
various NIC drivers.

Reviewed by:	hselasky, np, gallatin, jpaetzel
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25120
2020-06-15 16:35:27 +00:00
Brooks Davis
1b8b041ce9 Use ifr_data_get_ptr() consistently. 2020-03-03 18:58:43 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Warner Losh
58aa35d429 Remove sparc64 kernel support
Remove all sparc64 specific files
Remove all sparc64 ifdefs
Removee indireeect sparc64 ifdefs
2020-02-03 17:35:11 +00:00
Gleb Smirnoff
2a0f8518d6 Convert to if_foreach_llmaddr() KPI.
Reviewed by:	gallatin
2019-10-14 20:18:36 +00:00
Xin LI
1dbf944a91 if_mxge: update zlib version 1.0.4 to 1.2.11.
PR:		229763
Submitted by:	Yoshihiro Ota <ota j email ne jp>
Differential Revision:	https://reviews.freebsd.org/D20272
2019-08-03 03:36:18 +00:00
Andrew Gallatin
3d07f89450 mxge: replace 65536 with IP_MAXPACKET in tso settings. 2018-07-05 02:43:10 +00:00
Andrew Gallatin
7eafc4d50b mxge: choose appropriate values for hw tso 2018-07-04 19:29:06 +00:00
Andrew Gallatin
df131e8459 mxge: Add SIOCGI2C support for devices with SFP/XFP cages 2018-07-04 18:54:44 +00:00
Andrew Gallatin
c0a1f0af0c mxge: fix panic at module unload
r333175 (multicast changes) exposed a bug where
mxge was not checking to see if the driver was being
unloaded while handing ioctls that touch hardware.
As a result, now that in6m_disconnect() is run from
an async gtaskq, it was busy-waiting in mxge_send_cmd()
while the mcast list was destroyed.
2018-07-04 14:25:38 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Ravi Pokala
c756fb6ebb mxge(4) should pass unhandled ioctls to ether_ioctl()
Panasas discovered that ioctl(SIOCGLAGGPORT) returns ENOTTY for mxge(4) when
the NIC is not a member of a lagg. This came as a surprise, because the
SIOCGLAGGPORT handler in if_lagg.c only returns ENOENT (if run against the
laggX interface, rather than a physical port) or EINVAL (if run against a
non-member physical port). This behavior was not seen with other drivers,
such as bge(4), igb(4), and cxl(4). When I compared their respective ioctl
handlers, I found that they all called ether_ioctl() for the default (i.e.
unhandled) case; by contrast, mxge(4) only calls ether_ioctl() for two
specific cases, and returns ENOTTY for the default case.

Remove the two cases which explicitly call ether_ioctl(), and let the
default case call it instead. This matches what the vast majority of the NIC
drivers do.

Reviewed by:	kmacy
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D14381
2018-02-15 03:22:04 +00:00
Pedro F. Giffuni
ac2fffa4b7 Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
Pedro F. Giffuni
b0ae8f91ac Fix build after r327949.
Reported by:	Cy Schubert
2018-01-14 00:31:34 +00:00
Pedro F. Giffuni
26c1d774b5 dev: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these is likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.
2018-01-13 22:30:30 +00:00
Pedro F. Giffuni
718cf2ccb9 sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 14:52:40 +00:00
Sepherosa Ziehau
eacb70ba70 mxge: Setup mbuf flowid before calling tcp_lro_rx().
Reviewed by:	gallatin
MFC after:	1 week
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D6320
2016-05-12 03:36:49 +00:00
Pedro F. Giffuni
73a1170a8c sys/dev: use our nitems() macro when it is avaliable through param.h.
No functional change, only trivial cases are done in this sweep,
Drivers that can get further enhancements will be done independently.

Discussed in:	freebsd-current
2016-04-19 23:37:24 +00:00
Sepherosa Ziehau
6dd38b8716 tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication
And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c

Reviewed by:	gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5725
2016-04-01 06:28:33 +00:00
Justin Hibbits
da1b038af9 Use uintmax_t (typedef'd to rman_res_t type) for rman ranges.
On some architectures, u_long isn't large enough for resource definitions.
Particularly, powerpc and arm allow 36-bit (or larger) physical addresses, but
type `long' is only 32-bit.  This extends rman's resources to uintmax_t.  With
this change, any resource can feasibly be placed anywhere in physical memory
(within the constraints of the driver).

Why uintmax_t and not something machine dependent, or uint64_t?  Though it's
possible for uintmax_t to grow, it's highly unlikely it will become 128-bit on
32-bit architectures.  64-bit architectures should have plenty of RAM to absorb
the increase on resource sizes if and when this occurs, and the number of
resources on memory-constrained systems should be sufficiently small as to not
pose a drastic overhead.  That being said, uintmax_t was chosen for source
clarity.  If it's specified as uint64_t, all printf()-like calls would either
need casts to uintmax_t, or be littered with PRI*64 macros.  Casts to uintmax_t
aren't horrible, but it would also bake into the API for
resource_list_print_type() either a hidden assumption that entries get cast to
uintmax_t for printing, or these calls would need the PRI*64 macros.  Since
source code is meant to be read more often than written, I chose the clearest
path of simply using uintmax_t.

Tested on a PowerPC p5020-based board, which places all device resources in
0xfxxxxxxxx, and has 8GB RAM.
Regression tested on qemu-system-i386
Regression tested on qemu-system-mips (malta profile)

Tested PAE and devinfo on virtualbox (live CD)

Special thanks to bz for his testing on ARM.

Reviewed By: bz, jhb (previous)
Relnotes:	Yes
Sponsored by:	Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D4544
2016-03-18 01:28:41 +00:00
Justin Hibbits
43cd61606b Replace several bus_alloc_resource() calls using default arguments with bus_alloc_resource_any()
Since these calls only use default arguments, bus_alloc_resource_any() is the
right call.

Differential Revision: https://reviews.freebsd.org/D5306
2016-02-19 03:37:56 +00:00
Hans Petter Selasky
e936121d31 Add optimizing LRO wrapper:
- Add optimizing LRO wrapper which pre-sorts all incoming packets
  according to the hash type and flowid. This prevents exhaustion of
  the LRO entries due to too many connections at the same time.
  Testing using a larger number of higher bandwidth TCP connections
  showed that the incoming ACK packet aggregation rate increased from
  ~1.3:1 to almost 3:1. Another test showed that for a number of TCP
  connections greater than 16 per hardware receive ring, where 8 TCP
  connections was the LRO active entry limit, there was a significant
  improvement in throughput due to being able to fully aggregate more
  than 8 TCP stream. For very few very high bandwidth TCP streams, the
  optimizing LRO wrapper will add CPU usage instead of reducing CPU
  usage. This is expected. Network drivers which want to use the
  optimizing LRO wrapper needs to call "tcp_lro_queue_mbuf()" instead
  of "tcp_lro_rx()" and "tcp_lro_flush_all()" instead of
  "tcp_lro_flush()". Further the LRO control structure must be
  initialized using "tcp_lro_init_args()" passing a non-zero number
  into the "lro_mbufs" argument.

- Make LRO statistics 64-bit. Previously 32-bit integers were used for
  statistics which can be prone to wrap-around. Fix this while at it
  and update all SYSCTL's which expose LRO statistics.

- Ensure all data is freed when destroying a LRO control structures,
  especially leftover LRO entries.

- Reduce number of memory allocations needed when setting up a LRO
  control structure by precomputing the total amount of memory needed.

- Add own memory allocation counter for LRO.

- Bump the FreeBSD version to force recompilation of all KLDs due to
  change of the LRO control structure size.

Sponsored by:	Mellanox Technologies
Reviewed by:	gallatin, sbruno, rrs, gnn, transport
Tested by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D4914
2016-01-19 15:33:28 +00:00
Craig Rodrigues
d9db52256e Move zlib.c from net to libkern.
It is not network-specific code and would
be better as part of libkern instead.
Move zlib.h and zutil.h from net/ to sys/
Update includes to use sys/zlib.h and sys/zutil.h instead of net/

Submitted by:		Steve Kiernan stevek@juniper.net
Obtained from:		Juniper Networks, Inc.
GitHub Pull Request:	https://github.com/freebsd/freebsd/pull/28
Relnotes:		yes
2015-04-22 14:38:58 +00:00
Hans Petter Selasky
c25290420e Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but does no longer have any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.

MFC after:	1 month
Sponsored by:	Mellanox Technologies
2014-12-01 11:45:24 +00:00
Hans Petter Selasky
f0188618f2 Fix multiple incorrect SYSCTL arguments in the kernel:
- Wrong integer type was specified.

- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.

- Logical OR where binary OR was expected.

- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.

- Properly assert the the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, hence
there is no easy way to assert types in the C language outside a
C-function.

- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.

- Updated "EXAMPLES" section in SYSCTL manual page.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-21 07:31:21 +00:00
Gleb Smirnoff
56ec302f3c Whitespace cleanup. 2014-09-25 05:47:33 +00:00
Gleb Smirnoff
f3f040d929 - Provide mxge_get_counter() to return counters that are not collected,
but taken from hardware.
- Mechanically convert to if_inc_counter() the rest of counters.
2014-09-25 05:45:52 +00:00
Gleb Smirnoff
56b61ca27a Remove ifq_drops from struct ifqueue. Now queue drops are accounted in
struct ifnet if_oqdrops.

Some netgraph modules used ifqueue w/o ifnet. Accounting of queue drops
is simply removed from them. There were no API to read this statistic.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-09-19 09:01:19 +00:00
Gleb Smirnoff
b245f96c44 Since 32-bit if_baudrate isn't enough to describe a baud rate of a 10 Gbit
interface, in the r241616 a crutch was provided. It didn't work well, and
finally we decided that it is time to break ABI and simply make if_baudrate
a 64-bit value. Meanwhile, the entire struct if_data was reviewed.

o Remove the if_baudrate_pf crutch.

o Make all fields of struct if_data fixed machine independent size. The
  notion of data (packet counters, etc) are by no means MD. And it is a
  bug that on amd64 we've got a 64-bit counters, while on i386 32-bit,
  which at modern speeds overflow within a second.

  This also removes quite a lot of COMPAT_FREEBSD32 code.

o Give 16 bit for the ifi_datalen field. This field was provided to
  make future changes to if_data less ABI breaking. Unfortunately the
  8 bit size of it had effectively limited sizeof if_data to 256 bytes.

o Give 32 bits to ifi_mtu and ifi_metric.
o Give 64 bits to the rest of fields, since they are counters.

__FreeBSD_version bumped.

Discussed with:	emax
Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-03-13 03:42:24 +00:00
Eitan Adler
7a22215c53 Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this
shifts into the sign bit.  Instead use (1U << 31) which gets the
expected result.

This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.

A similar change was made in OpenBSD.

Discussed with:	-arch, rdivacky
Reviewed by:	cperciva
2013-11-30 22:17:27 +00:00
Gleb Smirnoff
76039bc84f The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
Scott Long
c68534f1d5 Update PCI drivers to no longer look at the MEMIO-enabled bit in the PCI
command register.  The lazy BAR allocation code in FreeBSD sometimes
disables this bit when it detects a range conflict, and will re-enable
it on demand when a driver allocates the BAR.  Thus, the bit is no longer
a reliable indication of capability, and should not be checked.  This
results in the elimination of a lot of code from drivers, and also gives
the opportunity to simplify a lot of drivers to use a helper API to set
the busmaster enable bit.

This changes fixes some recent reports of disk controllers and their
associated drives/enclosures disappearing during boot.

Submitted by:	jhb
Reviewed by:	jfv, marius, achadd, achim
MFC after:	1 day
2013-08-12 23:30:01 +00:00
Gabor Kovesdan
b78540b1c7 - Correct mispellings of word resource
Submitted by:	Christoph Mallon <christoph.mallon@gmx.de>
2013-04-17 11:47:32 +00:00
Andrew Gallatin
dedbe8362f Several cleanups and fixes to mxge:
- Remove vestigial null pointer tests after malloc(..., M_WAITOK).

- Remove vestigal qualhack union

- Use strlcpy() instead of the error-prone strncpy() when parsing
  EEPROM and copying strings

- Check the MAC address in the EEPROM strings more strictly.

- Expand the macro MXGE_NEXT_STRING() at its only user. Due to a typo,
  the macro was very confusing.

- Remove unnecessary buffer limit check.  The buffer is double-NUL
  terminated per construction.

PR:		kern/176369
Submitted by:	Christoph Mallon <christoph.mallon gmx.de>
2013-02-25 16:22:40 +00:00
Andrew Gallatin
cabc512fe4 Bump mxge copyright.
Sponsored by: Myricom

MFC After: 7 days
2013-02-22 19:23:33 +00:00
Andrew Gallatin
a4b233dd06 Improvements for newer mxge nics:
- Some mxge nics may store the serial number in the SN2 field of the
  EEPROM.  These will also have an SN=0 field, so parse the SN2 field,
  and give it precedence.

- Skip MXGEFW_CMD_UNALIGNED_TEST on mxge nics which do not require it.
  This saves roughly 10ms per port at device attach time.

Sponsored by: Myricom

MFC After: 7 days
2013-02-22 19:21:29 +00:00
Andrew Gallatin
abc5b96b99 Try harder to make mxge safe for all combinations of INET and INET6
- Re-fix build by restoring local removed in r247151, but protected
  by #if defined(INET) || defined(INET6) so that the compile
  succeeds in the !(INET||INET6) case.

- Protect call to in_pseudo() with an #ifdef INET, to allow
  a kernel to link with mxge when INET is not compiled in.

- Also remove an errant (improperly commented) obsolete debugging printf

Thanks to Glebius for pointing out the !(INET||INET6) build issue.

Sponsored by: Myricom

MFC After: 7 days
2013-02-22 16:46:28 +00:00