Commit Graph

70 Commits

Author SHA1 Message Date
Gleb Smirnoff
a6b55ee6be net: replace IFF_KNOWSEPOCH with IFF_NEEDSEPOCH
Expect that drivers call into the network stack with the net epoch
entered. This has already been the fact since early 2020. The net
interrupts, that are marked with INTR_TYPE_NET, were entering epoch
since 511d1afb6b. For the taskqueues there is NET_TASK_INIT() and
all drivers that were known back in 2020 we marked with it in
6c3e93cb5a. However in e87c494015 we took conservative approach
and preferred to opt-in rather than opt-out for the epoch.

This change not only reverts e87c494015 but adds a safety belt to
avoid panicing with INVARIANTS if there is a missed driver. With
INVARIANTS we will run in_epoch() check, print a warning and enter
the net epoch.  A driver that prints can be quickly fixed with the
IFF_NEEDSEPOCH flag, but better be augmented to properly enter the
epoch itself.

Note on TCP LRO: it is a backdoor to enter the TCP stack bypassing
some layers of net stack, ignoring either old IFF_KNOWSEPOCH or the
new IFF_NEEDSEPOCH.  But the tcp_lro_flush_all() asserts the presence
of network epoch.  Indeed, all NIC drivers that support LRO already
provide the epoch, either with help of INTR_TYPE_NET or just running
NET_EPOCH_ENTER() in their code.

Reviewed by:		zlei, gallatin, erj
Differential Revision:	https://reviews.freebsd.org/D39510
2023-04-17 09:08:35 -07:00
Justin Hibbits
67fd4c9d07 Mechanically convert oce(4) to IfAPI
Reviewed By:	zlei
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D37829
2023-02-07 14:16:17 -05:00
John Baldwin
4e75e49626 oce: Remove unused devclass argument to DRIVER_MODULE. 2022-05-09 12:22:03 -07:00
Gordon Bergling
88cdccff3f oce(4): Fix a typo in a sysctl description
- s/interupt/interrupt/

MFC after:	3 days
2022-04-20 12:51:52 +02:00
John Baldwin
19c099a866 oce: Remove unused variables. 2022-04-07 17:01:27 -07:00
Michael Tuexen
e363f832cf if_oce: fix epoch handling
Thanks to gallatin@ for suggesting the patch.

PR:			260330
Reported by:		Vincent Milum Jr.
Reviewed by:		gallatin, glebius
Tested by:		Vincent Milum Jr.
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D33395
2021-12-18 23:43:00 +01:00
Gordon Bergling
5bdf58e196 Fix some common typos in source code comments
- s/priviledged/privileged/
- s/funtion/function/
- s/doens't/doesn't/
- s/sychronization/synchronization/

MFC after:	3 days
2021-08-28 18:57:23 +02:00
Mark Johnston
71776d6719 oce: Fix handling of m_pullup() errors in oce_tso_setup()
m_pullup() frees the input mbuf chain upon a failure.  Set *mpp to NULL
in this case to ensure that the caller does not free the chain again.

PR:		224928
Submitted by:	Lv Yunlong <lylgood@foxmail.com> (original version)
MFC after:	1 week
2021-05-26 10:42:36 -04:00
Mateusz Guzik
cec3c5649c oce: clean up empty lines in .c and .h files 2020-09-01 22:02:32 +00:00
Conrad Meyer
b75a772875 oce(4): Account and trace mbufs before handing to hw
Once tx mbufs have been handed to hardware, nothing serializes the tx
path against completion and potential use-after-free of the outbound
mbuf.  Perform accounting and BPF tap before queueing to hardware to
avoid this race.

Submitted by:	Steve Wirtz <steve_wirtz AT dell.com>
Reviewed by:	markj, rstone
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25364
2020-06-20 17:22:46 +00:00
Ryan Moeller
cbb9ccf735 Avoid trying to toggle TSO twice
Remove TSO from the toggle mask when automatically disabled by TXCKSUM* in
various NIC drivers.

Reviewed by:	hselasky, np, gallatin, jpaetzel
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25120
2020-06-15 16:35:27 +00:00
Warner Losh
a9208e9862 Remove FreeBSD 7-9 support from oce
Use newer pci_find_cap API now that the need to remap the old API is gone.
2020-02-27 15:25:26 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Conrad Meyer
f3bae413e9 random(9): Deprecate random(9), remove meaningless srandom(9)
srandom(9) is meaningless on SMP systems or any system with, say,
interrupts.  One could never rely on random(9) to produce a reproducible
sequence of outputs on the basis of a specific srandom() seed because the
global state was shared by all kernel contexts.  As such, removing it is
literally indistinguishable to random(9) consumers (as compared with
retaining it).

Mark random(9) as deprecated and slated for quick removal.  This is not to
say we intend to remove all fast, non-cryptographic PRNG(s) in the kernel.
It/they just won't be random(9), as it exists today, in either name or
implementation.

Before random(9) is removed, a replacement will be provided and in-tree
consumers will be converted.

Note that despite the name, the random(9) interface does not bear any
resemblance to random(3).  Instead, it is the same crummy 1988 Park-Miller
LCG used in libc rand(3).
2019-12-26 19:41:09 +00:00
Mark Johnston
c76ddeeb1c oce: Disallow the passthrough ioctl for unprivileged users.
A missing check meant that unprivileged users could send passthrough
commands to the device firmware.

Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2019-12-23 23:43:50 +00:00
Mark Johnston
6e7ecc9a89 oce: Tighten input validation in the SIOCGI2C handler.
Missing validation meant that it was possible to read 8 bytes beyond
the end of sfp_vpd_dump_buffer.

Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by:	delphij, ram
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22859
2019-12-18 18:44:16 +00:00
Eric Joyner
e81998f407 net: prefer ETHER_ADDR_LEN over ETH_ADDR_LEN
A couple of drivers and one place in if.c use ETH_ADDR_LEN, even though
net/ethernet.h provides an equivalent ETHER_ADDR_LEN definition.

Cleanup all of the locations which refer to ETH_ADDR_LEN to use the
standard ETHER_ADDR_LEN instead.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	erj@, jpaetzel@
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D21239
2019-11-04 22:57:36 +00:00
Gleb Smirnoff
7384b10ff1 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:12:02 +00:00
Kyle Evans
ab7de25c25 oce(4): potential out of bounds access before vector validation
Submitted by:	Augustin Cavalier <waddlesplash@gmail.com>
Obtained from:	Haiku (ec2b89264cfc63e05e611cce82cc449197403aa4)
MFC after:	3 days
2019-08-06 13:09:20 +00:00
Alexander Motin
3582828053 Fix array out of bound panic introduced in r306219.
As I see, different NICs in different configurations may have different
numbers of TX and RX queues.  The code was assuming 1:1 mapping between
event queues (interrupts) and TX/RX queues.  Since number of interrupts
is set to maximum of TX and RX queues, when those two are different, the
system is doomed.

I have no documentation or deep knowledge about this hardware, so this
change is based on general observations and code reading.  If some of my
guesses are wrong, please do better.  I just confirmed HP NC550SFP NICs
are working now.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-05-28 18:32:04 +00:00
Xin LI
e0dd0fe31e Added support for the SIOCGI2C ioctl.
Submitted by:	Ram Kishore Vegesna <ram.vegesna@broadcom.com>
Obtained from:	Broadcom
MFC after:	2 weeks
2019-01-08 05:41:04 +00:00
Warner Losh
0dc34160f3 Add PNP info to PCI attachments of cbb, cxgb, ida, iwn, ixl, ixlv,
mfi, mps, mpr, mvs, my, oce, pcn, ral, rl. This only labels existing
pci device tables, and has no probe / attach code changes.

Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
Approved by: re (glen)
2018-09-26 17:12:30 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Josh Paetzel
dced0d1821 Add ability to perform a firmware reset during driver initialization.
Required by Lancer Gen 5 hardware.

Submitted by:	Ram Kishore Vegesna <ram.vegesna@broadcom.com>
Obtained from:	Broadcom
2018-05-01 17:39:20 +00:00
Brooks Davis
541d96aaaf Use an accessor function to access ifr_data.
This fixes 32-bit compat (no ioctl command defintions are required
as struct ifreq is the same size).  This is believed to be sufficent to
fully support ifconfig on 32-bit systems.

Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14900
2018-03-30 18:50:13 +00:00
Pedro F. Giffuni
7282444b10 sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:36:21 +00:00
Josh Paetzel
c2625e6e38 Update oce to version 11.0.50.0
Submitted by:	Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
2016-09-22 22:51:11 +00:00
Conrad Meyer
14410265b6 Revert r306148 to fix build
Requested by:	jpaetzel
Reported by:	Larry Rosenman <ler at lerctr.org>, Jenkins
2016-09-22 00:25:23 +00:00
Josh Paetzel
764c812d84 Update oce driver to 11.0.50.0
Submitted by:	Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
2016-09-21 22:53:16 +00:00
Pedro F. Giffuni
74b8d63dcc Cleanup unnecessary semicolons from the kernel.
Found with devel/coccinelle.
2016-04-10 23:07:00 +00:00
Sepherosa Ziehau
6dd38b8716 tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication
And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c

Reviewed by:	gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5725
2016-04-01 06:28:33 +00:00
John Baldwin
cbc4d2db75 Remove taskqueue_enqueue_fast().
taskqueue_enqueue() was changed to support both fast and non-fast
taskqueues 10 years ago in r154167.  It has been a compat shim ever
since.  It's time for the compat shim to go.

Submitted by:	Howard Su <howard0su@gmail.com>
Reviewed by:	sephe
Differential Revision:	https://reviews.freebsd.org/D5131
2016-03-01 17:47:32 +00:00
Justin Hibbits
c47476d7e6 Migrate many bus_alloc_resource() calls to bus_alloc_resource_anywhere().
Most calls to bus_alloc_resource() use "anywhere" as the range, with a given
count.  Migrate these to use the new bus_alloc_resource_anywhere() API.

Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D5370
2016-02-27 03:38:01 +00:00
Gleb Smirnoff
8ec07310fa These files were getting sys/malloc.h and vm/uma.h with header pollution
via sys/mbuf.h
2016-02-01 17:41:21 +00:00
Jung-uk Kim
fd90e2ed54 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
John-Mark Gurney
0d6c1105bc srandom has no influence on read_random, at least not this late... 2015-02-13 19:44:04 +00:00
Hans Petter Selasky
c25290420e Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but does no longer have any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.

MFC after:	1 month
Sponsored by:	Mellanox Technologies
2014-12-01 11:45:24 +00:00
Hans Petter Selasky
f0188618f2 Fix multiple incorrect SYSCTL arguments in the kernel:
- Wrong integer type was specified.

- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.

- Logical OR where binary OR was expected.

- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.

- Properly assert the the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, hence
there is no easy way to assert types in the C language outside a
C-function.

- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.

- Updated "EXAMPLES" section in SYSCTL manual page.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-21 07:31:21 +00:00
Hans Petter Selasky
9fd573c39d Improve transmit sending offload, TSO, algorithm in general.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.

Reviewed by:	adrian, rmacklem
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2014-09-22 08:27:27 +00:00
Gleb Smirnoff
c8dfaf382f Mechanically convert to if_inc_counter(). 2014-09-19 03:51:26 +00:00
Hans Petter Selasky
72f3100047 Revert r271504. A new patch to solve this issue will be made.
Suggested by:	adrian @
2014-09-13 20:52:01 +00:00
Hans Petter Selasky
eb93b77ae4 Improve transmit sending offload, TSO, algorithm in general.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2014-09-13 08:26:09 +00:00
Warner Losh
c236d64745 Cast queue length because q_len isn't really an enum in the same sense
that clang wants it to be (a value that can only have values inside
the enum range), but rather an unsigned count of bytes.
2014-08-07 21:56:46 +00:00
Luigi Rizzo
d398c8634a Various bugfixes from Stefano Garzarella:
1. oce_multiq_start(): make sure the buffer is consumed even on ENXIO
2. oce_multiq_transmit(): there is an extra call to drbr_enqueue()
  causing the mbuf to be enqueued twice when the NIC's queue is full,
  and potential panics
3. oce_multiq_transmit(): same problem fixed recently in ixgbe (r267187)
   and other drivers: if the mbuf is enqueued, the proper return value is 0

Submitted by:	Stefano Garzarella
MFC after:	3 days
2014-07-02 12:13:11 +00:00
Xin LI
a4f734b4fc Apply vendor fixes for big endian support and 20GBps/25GBps link speeds.
Many thanks to Emulex for their continued support of FreeBSD!

Submitted by:	Venkata Duvvuru <VenkatKumar.Duvvuru Emulex.Com>
MFC after:	3 days
2014-06-24 20:11:22 +00:00
John Baldwin
c34f1a08c6 Fix teardown of static DMA allocations in various NIC drivers:
- Add missing calls to bus_dmamap_unload() in et(4).
- Check the bus address against 0 to decide when to call
  bus_dmamap_unload() instead of comparing the bus_dma map against NULL.
- Check the virtual address against NULL to decide when to call
  bus_dmamem_free() instead of comparing the bus_dma map against NULL.
- Don't clear bus_dma map pointers to NULL for static allocations.
  Instead, treat the value as completely opaque.
- Pass the correct virtual address to bus_dmamem_free() in wpi(4) instead
  of trying to free a pointer to the virtual address.

Reviewed by:	yongari
2014-06-17 14:47:49 +00:00
Gleb Smirnoff
b245f96c44 Since 32-bit if_baudrate isn't enough to describe a baud rate of a 10 Gbit
interface, in the r241616 a crutch was provided. It didn't work well, and
finally we decided that it is time to break ABI and simply make if_baudrate
a 64-bit value. Meanwhile, the entire struct if_data was reviewed.

o Remove the if_baudrate_pf crutch.

o Make all fields of struct if_data fixed machine independent size. The
  notion of data (packet counters, etc) are by no means MD. And it is a
  bug that on amd64 we've got a 64-bit counters, while on i386 32-bit,
  which at modern speeds overflow within a second.

  This also removes quite a lot of COMPAT_FREEBSD32 code.

o Give 16 bit for the ifi_datalen field. This field was provided to
  make future changes to if_data less ABI breaking. Unfortunately the
  8 bit size of it had effectively limited sizeof if_data to 256 bytes.

o Give 32 bits to ifi_mtu and ifi_metric.
o Give 64 bits to the rest of fields, since they are counters.

__FreeBSD_version bumped.

Discussed with:	emax
Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-03-13 03:42:24 +00:00
Xin LI
198f573526 Eliminate unused drbr_stats_update implementation in oce(4) driver.
Noticed by:	dim
MFC after:	2 weeks
2013-12-30 21:30:49 +00:00
Xin LI
b41206d84a Apply vendor improvements to oce(4) driver:
- Add support to 40Gbps devices;
 - Add support to control adaptive interrupt coalescing (AIC)
   via sysctl;
 - Improve support of BE3 devices;

Many thanks to Emulex for their continued support of FreeBSD.

Submitted by:	Venkata Duvvuru <VenkatKumar.Duvvuru Emulex Com>
MFC after:	3 days
2013-12-04 20:24:18 +00:00
Xin LI
d8f7bfb8bd The previous code makes a memory allocation in size of
struct mbx_common_read_write_flashrom plus 32KB and caps the actual
transfer size at 32KB.  This is harmless as it is but may confuse
static code analyzer, so allocate a full 32KB instead.

Reported by:	Coverity via mjacob
Submitted by:	Venkata Duvvuru <VenkatKumar.Duvvuru Emulex Com>
Coverity CID:	1125820
2013-11-14 18:53:17 +00:00