Commit Graph

3837 Commits

Author SHA1 Message Date
Stephen Hurd
de13095409 Fix indentation.
Sponsored by:	Limelight Networks
2017-12-27 19:12:32 +00:00
Eitan Adler
caa7e52f3f kernel: Fix several typos and minor errors
- duplicate words
- typos
- references to old versions of FreeBSD

Reviewed by:	imp, benno
2017-12-27 03:23:21 +00:00
Alexander Kabaev
151ba7933a Do pass removing some write-only variables from the kernel.
This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.

Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385
2017-12-25 04:48:39 +00:00
Alexander Kabaev
92f19df431 Do not pass NULL pointer to copyout in if_clone_list.
Sometimes caller is only interested in how many clones
are there and NULL pointer is passed for the destination
buffer. Do not pass it to copyout then.
2017-12-23 16:45:24 +00:00
Alexander Kabaev
ce4ab99d82 Remove some trailing whitespace.
Reviewed by: glebius, ae
Differential Revision: https://reviews.freebsd.org/D10386
2017-12-23 16:24:00 +00:00
Alexander Kabaev
3395dd6eb8 Do not double free the memory in if_clone.
if_clone_attach function will drop the reference on failure  which will
free the if_clone structure. No need to do it second time.

Reviewed by: glebius, ae
Differential Revision: https://reviews.freebsd.org/D10386
2017-12-23 16:23:58 +00:00
Warner Losh
79554b4049 The device tables end with a sentinel in iflib. Don't include the
sentinel in the output.
2017-12-23 04:50:52 +00:00
Warner Losh
d2064cf030 Use '#' rather than some made up name for fields we want to ignore. 2017-12-22 17:53:27 +00:00
Konstantin Belousov
97755e83f5 Fix build for kernels with SCHED_4BSD.
Sponsored by:	The FreeBSD Foundation
2017-12-21 23:05:13 +00:00
Stephen Hurd
25ac1dd5c7 Don't call tcp_lro_rx() unless hardware verified TCP/UDP csum
It seems that tcp_lro_rx() doesn't verify TCP checksums, so
if there are bad checksums in the packets caused by invalid data, the
invalid data will pass through without errors.

This was noticed with the igb driver and a specific internet host:
fetch http://www.mpfr.org/mpfr-current/mpfr-3.1.6.tar.xz -o test.bin && sha256 test.bin
Would result in a different value sometimes.

This ends up making LRO require RXCSUM to be enabled, and RXCSUM to
support TCP and UDP checksums.

PR:		224346
Reported by:	gjb
Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13561
2017-12-21 01:22:36 +00:00
Li-Wen Hsu
40cf51c438 Add missing ;
Approved by:	kevlo
2017-12-20 06:08:16 +00:00
Stephen Hurd
b103855e18 Support attaching tx queues to cpus
This will attempt to use a different thread/core on the same L2
cache when possible, or use the same cpu as the rx thread when not.
If SMP isn't enabled, don't go looking for cores to use. This is mostly
useful when using shared TX/RX queues.

Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12446
2017-12-20 01:03:34 +00:00
Stephen Hurd
96fc97c81f Update Matthew Macy contact info
Email address has changed, uses consistent name (Matthew, not Matt)

Reported by:	Matthew Macy <mmacy@mattmacy.io>
Differential Revision:	https://reviews.freebsd.org/D13537
2017-12-19 17:59:00 +00:00
Andrey V. Elsukov
0e253fd12c Fix possible memory leak.
vxlan_ftable entries are sorted in ascending order, due to wrong arguments
order it is possible to stop search before existing element will be found.
Then new element will be allocated in vxlan_ftable_update_locked() and can
be inserted in the list second time or trigger MPASS() assertion with
enabled INVARIANTS.

PR:		224371
MFC after:	1 week
2017-12-16 14:36:21 +00:00
Ryan Stone
19f41c2a52 Plug an ifaddr leak when changing a route's src
If a route is modified in a way that changes the route's source
address (i.e. the address used to access the gateway), then a
reference on the ifaddr representing the old source address will
be leaked if the address type does not have an ifa_rtrequest
method defined.  Plug the leak by releasing the reference in
all cases.

Differential Revision:	https://reviews.freebsd.org/D13417
Reviewed by:	ae
MFC after:	3 weeks
Sponsored by:	Dell
2017-12-14 20:48:50 +00:00
Stephen Hurd
06c47d48dc Increment encap_pad_mbuf_fail when m_dup() fails in padding
Previously, the counter was only incremented when m_append() failed.  Since
the function can also fail on m_dup() now, increment the counter there as
well.

Sponsored by:	Limelight Networks
2017-12-11 20:01:28 +00:00
Stephen Hurd
04993890dc Free mbuf chain when m_dup fails
Fix memory leak where mbuf chain wasn't free()d if iflib_ether_pad()
has a failure in m_dup().

Reported by:	"Ryan Stone" <rysto32@gmail.com>
Sponsored by:	Limelight Networks
2017-12-08 19:50:06 +00:00
Stephen Hurd
a15fbbb8fe Handle read-only mbufs in iflib ether pad function
If ethernet padding is enabled, and a read-only mbuf is passed,
it would modify the mbuf using m_append(). Instead, call m_dup() and
append to the new packet.

Reported by:	Pyun YongHyeon
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13414
2017-12-08 18:43:31 +00:00
Gleb Smirnoff
17eea3202a Garbage collect IFCAP_POLLING_NOCOUNT. It wasn't used since very
beginning of polling(4).  The module always ignored return value
from driver polling handler.
2017-12-06 23:03:34 +00:00
Stephen Hurd
d14c853ba3 iflib: Support to padding Ethernet frames to a min size
Some bnxt devices do not correctly send frames smaller than
52 bytes (without CRC), so add a quirk that will pad frames to an
arbitrary size before passing off to the encap routine.

Reported by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13269
2017-12-05 21:00:31 +00:00
Stephen Hurd
fe1bcada9b Avoid calling CURVNET_[SET|RESTORE] for each packet
The LRO possible test was calling CURVNET_SET once for IPv4 or IPv6 for
each packet in a chain. Only call it once per chain instead.

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	cem, ae
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13368
2017-12-05 20:43:24 +00:00
Eric Joyner
b0f3e715fa ifconfig(8): Display extended compliance code string for SFP transceivers
- Updates tables in affected files with new entries from newer spec
revisions of SFF-8472, SFF-8024, and SFF-8636

- Change ifconfig to read and display the extended compliance code for
SFP media if the extended compliance code is not 0. This was being displayed
for QSFP transceivers only, but SFP28 media uses this to report 25G
capability.

Reviewed by:	melifaro, sbruno
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D13286
2017-12-05 18:42:07 +00:00
Bryan Venteicher
93a5a3b019 Add if media and link status events to vxlan
PR:		214359
MFC after:	2 weeks
2017-12-02 22:04:00 +00:00
Stephen Hurd
a027c8e958 Add support for SIOCGIFXMEDIA to iflib
SIOCGIFXMEDIA is required for extended ethernet media types,
but iflib did not support it.

Reported by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13312
2017-12-01 17:58:20 +00:00
Hans Petter Selasky
b58e7aacf6 Properly define the VLAN_XXX() function macros to avoid miscompilation when
used inside "if" statements comparing with another value.

Detailed explanation:
"if (a ? b : c != 0)" is not the same like "if ((a ? b : c) != 0)"
which is the expected behaviour of a function macro.

Affects:
toecore, linuxkpi and ibcore.

Reviewed by:	kib
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-11-30 11:35:22 +00:00
Stephen Hurd
772593dbd9 Fix comment introduced in r326369
The code uses the set of all CPUs, it doesn't zero out the set.

Sponsored by:	Limelight Networks
2017-11-29 18:21:17 +00:00
Stephen Hurd
e516b5353b Ensure that ctx->ifc_cpus is always initialized
If a device didn't support MSI-X, ctx->ifc_cpus would not be initialized,
but the IRQ allocation routines still uses the value.  Move the
initialization to common code.

Sponsored by:	Limelight Networks
2017-11-29 18:14:57 +00:00
Hans Petter Selasky
fa3f256682 Disallow TUN and TAP character device IOCTLs to modify the network device
type to any value. This can cause page faults and panics due to accessing
uninitialized fields in the "struct ifnet" which are specific to the network
device type.

MFC after:	1 week
Found by:	jau@iki.fi
PR:		223767
Sponsored by:	Mellanox Technologies
2017-11-29 09:40:11 +00:00
Pedro F. Giffuni
fe267a5590 sys: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:23:17 +00:00
Stephen Hurd
7274b2f6be Fix off-by-one error in bit_nclear() usage
bit_nclear() takes the bit numbers for the start and end bits, not the start
and a count.  This was resulting in memory corruption past the end of the
bitstr_t.

Sponsored by:	Limelight Networks
2017-11-20 21:57:04 +00:00
Pedro F. Giffuni
51369649b0 sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:43:44 +00:00
Konstantin Belousov
319a53f5ad Fix build.
Sponsored by:	The FreeBSD Foundation
2017-11-19 11:21:16 +00:00
Pedro F. Giffuni
df57947f08 spdx: initial adoption of licensing ID tags.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.

Initially, only tag files that use BSD 4-Clause "Original" license.

RelNotes:	yes
Differential Revision:	https://reviews.freebsd.org/D13133
2017-11-18 14:26:50 +00:00
Stephen Hurd
d27352644a Fix default numbers of iflib queue sets
The intent appears to be having one RX/TX queue set per core,
but since scctx->isc_n[tr]xqsets is set to max before calling
iflib_msix_init(), both end up being set to total number of cores.

Use ctx->ifc_sysctl_n[rt]xqs as the selected value and
scctx->isc_n[rt]xqsets as the max. This should result in what appears
to be the intended behaviour

Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13096
2017-11-16 18:52:58 +00:00
Antoine Brodin
f5056d933a Do not leak control in raw_usend 2017-11-08 23:20:05 +00:00
Konstantin Belousov
3cf8254f1e Add a place for a driver to report rx timestamps in nanoseconds from
boot for the received packets.

The rcv_tstmp field overlaps the place of Ln header length indicators,
not used by received packets.  The basic pkthdr rearrangement change
in sys/mbuf.h was provided by gallatin.

There are two accompanying M_ flags: M_TSTMP means that there is the
timestamp (and it was generated by hardware).

Another flag M_TSTMP_HPREC indicates that the timestamp is
high-precision.  Practically M_TSTMP_HPREC means that hardware
provided additional precision comparing with the stamps when the flag
is not set.  E.g., for ConnectX all packets are stamped by hardware
when PCIe transaction to write out the completion descriptor is
performed, but PTP packet are stamped on port.  For Intel cards, when
PTP assist is enabled, only PTP packets are stamped in the limited
number of registers, so if Intel cards ever start support this
mechanism, they would always set M_TSTMP | M_TSTMP_HPREC if hardware
timestamp is present for the given packet.

Add IFCAP_HWRXTSTMP interface capability to indicate the support for
hardware rx timestamping, and ifconfig(8) command to toggle it.

Based on the patch by:	gallatin
Reviewed by:	gallatin (previous version), hselasky
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks (? mbuf KBI issue)
X-Differential revision:	https://reviews.freebsd.org/D12638
2017-11-07 09:29:14 +00:00
Sean Bruno
abec47242a Fix NOINET/NOINET6 build during compilation of iflib.
Reported by:	kib
2017-11-06 19:54:25 +00:00
Stephen Hurd
35e4e998d8 Only chain non-LRO mbufs when LRO is not possible
Preserve packet order between tcp_lro_rx() and if_input() to avoid
creating extra corner cases. If no packets can be LROed, combine them
into one chain for submission via if_input(). If any packet can
potentially be LROed however, retain old behaviour and call if_input()
for each packet.

This should keep the 12% improvement for small packet forwarding intact,
but mostly avoids impacting the LRO case.

Reviewed by:	cem, sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12876
2017-11-06 16:23:21 +00:00
Eugene Grosbein
9f23a54e52 Allow a process to assign an IP address to local ppp interface
even if kernel routing table already has a route to the address in question
installed by some routing daemon (PR 223129).

Also, allow loopback route deletion when stopping a VIMAGE jail (PR 222647).

PR:			222647, 223129
Reviewed by:		gnn
Approved by:		avg (mentor), mav (mentor)
MFC after:		2 weeks
Differential Revision:	https://reviews.freebsd.org/D12747
2017-11-05 14:41:48 +00:00
Kristof Provost
85f330e5fa epair: Fix panic on unload
The VNET_SYSUNINIT() callback is executed after the MOD_UNLOAD. That means
that netisr_unregister() has already been called when
netisr_unregister_vnet() gets calls, leading to an assertion failure.

Restore the expected order of operations by performing everything that
was done in MOD_UNLOAD to a SYSUNINIT() (that will be called after the
VNET_SYSUNINIT()).

Differential Revision:	https://reviews.freebsd.org/D12771
2017-11-01 14:27:26 +00:00
Stephen Hurd
0b6c52b69f Preserve TSO checksum flags
r323941 incorrectly disabled TSO flags based on MTU.

Reported by:	Yuri Pankov <yuripv@gmx.com>
Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12880
2017-10-31 19:03:35 +00:00
Stephen Hurd
a1b799ca5b Fix PR221990 - Assertion at iflib.c:1947
ifl_pidx and ifl_credits are going out of sync in _iflib_fl_refill() as they
use different update log.  Use the same update logic for both, and add a
final call to isc_rxd_refill() to handle early exits from the loop.

PR:		221990
Reported by:	pho
Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12798
2017-10-31 17:50:42 +00:00
Stephen Hurd
10e0d93811 Fix build with nodevice netmap
iru_init() was declared and used outside the DEV_NETMAP
conditional blocks, but was implemented inside one. Move the
implementation out of the DEV_NETMAP block to allow building with
netmap disabled.

Reported by:	Andrew Turner <andrew@fubar.geek.nz>
Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12842
2017-10-31 02:49:28 +00:00
Stephen Hurd
09b57b7f40 bnxt: HW_LRO Rx Pkt with > 32 fragments caused Crash (iflib)
Broadcom NIC with HW_LRO setting max_agg_segs >= 6 can generate Rx pkt with
64 (2^6) fragments, modify IFLIB_MAX_RX_SEGS to 64 to avoid memory
corruption / Crash.

Submitted by:	Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Reviewed by:	shurd, sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Broadcom Limited
Differential Revision:	https://reviews.freebsd.org/D12774
2017-10-30 21:20:33 +00:00
Stephen Hurd
2d873474b2 Fix PR222744 - netmap errors with iflib em driver
Fix error when refilling netmap buffers that resulted in the first
buffer of the successive passes through ifl_bus_addrs[] leaving the
first value unset (tmp_pidx started at 1, not zero after the first time
through the loop).

Leave the one unused buffer required by some NICs visible in the netmap
ring rather than hidden. There will always be a buffer in use by the
kernel now when an iflib driver is used via netmap.

Always get the netmap slot index via netmap_idx_n2k() to account for
nkr_hwofs in a consistent way.

Split shared functionality into new functions.
iru_init(): shared by _iflib_fl_refill() and netmap_fl_refill()
netmap_fl_refill(): shared by iflib_netmap_rxsync() and
iflib_netmap_rxq_init()

PR:		222744
Reported by:	Shirkdog <mshirk@daemon-security.com>
Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12769
2017-10-30 21:14:31 +00:00
Stephen Hurd
0fdea53954 Avoid enabling MSI-X if MSI-X is disabled globally
It was reported on the community call that with
hw.pci.enable_msix=0, iflib would enable MSI-X on the device and attempt
to use it, which caused issues. Test the sysctl explicitly and do not
enable MSI-X if it's disabled globally.

Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12805
2017-10-30 21:08:12 +00:00
Stephen Hurd
3429c02f82 Some cache related optimizations
1. prefetch 128 bytes of mbufs.
2. Re-order filling the pkt_info so cache stalls happen at the end
3. Define empty prefetch2cachelines() macro when the function isn't present.

Provides small performance improvments on some hardware

Reviewed by:	sbruno
Approved by:	sbruno (mentor)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12447
2017-10-23 20:50:08 +00:00
Bjoern A. Zeeb
8e94025b41 With r181803 on 2008-08-17 23:27:27Z the first VIMAGE commit went into
HEAD.  Enable VIMAGE in GENERIC kernels and some others (where GENERIC does
not exist) on HEAD.

Disable building LINT-VIMAGE with VIMAGE being default.

This should give it a lot more exposure in the run-up to 12 to help
us evaluate whether to keep it on by default or not.
We are also hoping to get better performance testing.
The feature can be disabled using nooptions.

Requested by:		many
Reviewed by:		kristof, emaste, hiren
X-MFC after:		never
Relnotes:		yes
Differential Revision:	https://reviews.freebsd.org/D12639
2017-10-20 21:40:59 +00:00
Andriy Voskoboinyk
c64c1f95a7 ifnet(9): split ifc_alloc_unit() (should simplify code flow)
Allocate smallest unit number from pool via ifc_alloc_unit_next()
and exact unit number (if available) via ifc_alloc_unit_specific().

While here, address possible deadlock (mentioned in PR).

PR:		217401
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D12551
2017-10-16 21:21:31 +00:00
Sepherosa Ziehau
2832cbe7d2 rss: Remove never defined UDP_IPV4_EX
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D12455
2017-10-11 06:08:01 +00:00