Commit Graph

3844 Commits

Author SHA1 Message Date
Sean Bruno
3600bd1e38 Revert r319921 which seems to cause NFS booting assertion panics in
various configurations.

Reported by:	pho@
2017-06-15 17:46:20 +00:00
Sean Bruno
f7587db036 Add new sysctl to allow changing of timing of the txq timers.
Add new sysctl to override use of busdma in the driver.

Submitted by:	Drew Gallitin <gallatin@netflix.com>
2017-06-13 23:16:38 +00:00
Sean Bruno
aa8fa07cd8 Plug mbuf leak in the busdma path of iflib.
Submitted by:	Michael Tuexen <tuexen@freebsd.org>
Reported by:	Drew Gallitin <gallatin@netflix.com>
2017-06-13 19:32:23 +00:00
Andrey V. Elsukov
b83aa367a5 Resurrect RTF_RNH_LOCKED flag and restore ability to call rtalloc1_fib()
with acquired RIB lock.

This fixes a possible panic due to trying to acquire RIB rlock when it is
already exclusive locked.

PR:		215963, 215122
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-06-13 10:52:31 +00:00
Alexander Motin
41cf0d54a2 Call VLAN_CAPABILITIES() when LAGG capabilities change.
This makes VLAN on top of LAGG to expose proper capabilities if they are
changed after creation.

MFC after:	1 week
2017-05-26 22:22:48 +00:00
Alexander Motin
8403ab7919 Improve applying unified capabilities to the lagg ports.
Some NICs have some capabilities dependent, so that disabling one require
disabling some other (TXCSUM/RXCSUM on em).  This code tries to reach the
consensus more insistently.

PR:		219453
MFC after:	1 week
2017-05-26 20:15:33 +00:00
Alexander Motin
e3d90506c4 Remove some code, dead from the day one. 2017-05-25 23:19:09 +00:00
Alexander Motin
9bcf3ae4c7 Add parent interface reference counting to if_vlan.
Using plain ifunit() looks like a request for troubles.

MFC after:	1 week
2017-05-23 00:13:27 +00:00
Sepherosa Ziehau
ae0669e390 net/vlan: Revert 305177
Miss read the parentheses.

Reported by:	oleg@
Reviewed by:	hps@
MFC after:	3 days
Sponsored by:	Microsoft
2017-05-19 01:42:31 +00:00
Ed Maste
3e85b721d6 Remove register keyword from sys/ and ANSIfy prototypes
A long long time ago the register keyword told the compiler to store
the corresponding variable in a CPU register, but it is not relevant
for any compiler used in the FreeBSD world today.

ANSIfy related prototypes while here.

Reviewed by:	cem, jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D10193
2017-05-17 00:34:34 +00:00
Ravi Pokala
5b1a5e45b2 Persistently store NIC's hardware MAC address, and add a way to retrive it
An earlier version of r318160 allocated if_hw_addr unconditionally; when it
became conditional, I forgot to check for NULL in ether_ifattach().

Reviewed by:	kp
MFC after:	1 week
MFC with:	r318160
Sponsored by:	Panasas
Differential Revision:	https://reviews.freebsd.org/D10678
Pointy-hat to:	rpokala
2017-05-11 06:46:39 +00:00
Ravi Pokala
ddae57504b Persistently store NIC's hardware MAC address, and add a way to retrive it
The MAC address reported by `ifconfig ${nic} ether' does not always match
the address in the hardware, as reported by the driver during attach. In
particular, NICs which are components of a lagg(4) interface all report the
same MAC.

When attaching, the NIC driver passes the MAC address it read from the
hardware as an argument to ether_ifattach(). Keep a second copy of it, and
create ioctl(SIOCGHWADDR) to return it. Teach `ifconfig' to report it along
with the active MAC address.

PR:		194386
Reviewed by:	glebius
MFC after:	1 week
Sponsored by:	Panasas
Differential Revision:	https://reviews.freebsd.org/D10609
2017-05-10 22:13:47 +00:00
Eric Joyner
6e105d4e35 Add several new media types to if_media.h
These include several 25G types (for active direct attach cables and LR modules),
and a missing type for 10G active direct attach.

Differential Revision:	https://reviews.freebsd.org/D10425
Reviewed by:	smh, imp
MFC after:	3 days
Sponsored by:	Intel Corporation
2017-05-10 18:33:40 +00:00
Bjoern A. Zeeb
1d898b914e Adjust a comment.
MFC after:	3 days
2017-05-09 08:29:55 +00:00
Alexander Motin
bbfc32a6b5 Relax r317696 locking to not drain taskqueue under the lock.
MFC after:	11 days
2017-05-05 16:51:53 +00:00
Alexander Motin
e83177fba8 Fix r317696 build without debug.
MFC after:	2 weeks
2017-05-03 02:54:11 +00:00
Alexander Motin
2f86d4b001 Introduce sleepable locks into if_lagg.
Before this change if_lagg was using nonsleepable rmlocks to protect its
internal state.  This patch introduces another sx lock to protect code
paths that require sleeping, while still uses old rmlock to protect hot
nonsleepable data paths.

This change allows to remove taskqueue decoupling used before to change
interface addresses without holding the lock.  Instead it uses sx lock to
protect direct if_ioctl() calls.

As another bonus, the new code synchronizes enabled capabilities of member
interfaces, and allows to control them with ifconfig laggX, that was
impossible before.  This part should fix interoperation with if_bridge,
that may need to disable some capabilities, such as TXCSUM or LRO, to allow
bridging with noncapable interfaces.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D10514
2017-05-02 19:09:11 +00:00
Alexander Motin
ebe4288151 Make if_bridge complain if it can't disable some capabilities.
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2017-04-29 08:52:07 +00:00
Alexander Motin
59150e9141 Propagate IFCAP_LRO from trunk to vlan interface.
False positive here cost nothing, while false negative may lead to some
confusions.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2017-04-29 08:28:59 +00:00
Alexander Motin
d89baa5aac Allow some control over enabled capabilities for if_vlan.
It improves interoperability with if_bridge, which may need to disable
some capabilities not supported by other members.  IMHO there is still
open question about LRO capability, which may need to be disabled on
physical interface.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2017-04-28 11:00:58 +00:00
Brooks Davis
a7dc31283a Remove the NATM framework including the en(4), fatm(4), hatm(4), and
patm(4) devices.

Maintaining an address family and framework has real costs when we make
infrastructure improvements.  In the case of NATM we support no devices
manufactured in the last 20 years and some will not even work in modern
motherboards (some newer devices that patm(4) could be updated to
support apparently exist, but we do not currently have support).

With this change, support remains for some netgraph modules that don't
require NATM support code. It is unclear if all these should remain,
though ng_atmllc certainly stands alone.

Note well: FreeBSD 11 supports NATM and will continue to do so until at
least September 30, 2021.  Improvements to the code in FreeBSD 11 are
certainly welcome.

Reviewed by:	philip
Approved by:	harti
2017-04-24 21:21:49 +00:00
Alexander Motin
1e04441a9d Remove unneeded conditions.
MFC after:	2 weeks
2017-04-22 08:38:49 +00:00
Alexander Motin
b98b5ae8ec Add interface reference counting to if_lagg.
Using plain ifunit() looks like request for troubles.

MFC after:	2 weeks
2017-04-21 13:45:01 +00:00
Jung-uk Kim
af1973281e Use kmem_malloc() instead of malloc(9) for the native amd64 filter.
r316767 broke the BPF JIT compiler for amd64 because malloc()'d space is no
longer executable.

Discussed with:	kib, alc
2017-04-17 22:02:09 +00:00
Jung-uk Kim
c7ff2b13d1 Remove an unnecessary declaration missed in the previous commit. 2017-04-17 21:57:23 +00:00
Jung-uk Kim
e329e330d4 Move declarations for a machine-dependent function to the header file. 2017-04-17 21:51:26 +00:00
Patrick Kelsey
2f8c6c0a58 Fix userland tools that don't check the format of routing socket
messages before accessing message fields that may not be present,
removing dead/duplicate/misleading code along the way.

Document the message format for each routing socket message in
route.h.

Fix a bug in usr.bin/netstat introduced in r287351 that resulted in
pointer computation with essentially random 16-bit offsets and
dereferencing of the results.

Reviewed by:	ae
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D10330
2017-04-16 19:17:10 +00:00
Andrey V. Elsukov
070f87e878 Inherit IPv6 checksum offloading flags to vlan interfaces.
if_vlan(4) interfaces inherit IPv4 checksum offloading flags from the
parent when VLAN_HWCSUM and VLAN_HWTAGGING flags are present on the
parent interface. Do the same for IPv6 checksum offloading flags.

Reported by:	Harry Schmalzbauer
Reviewed by:	np, gnn
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D10356
2017-04-11 19:23:25 +00:00
Andrey V. Elsukov
c00bf73076 Do not adjust interface MTU automatically. Leave this task to the system
administrator.

This restores the behavior that was prior to r274246.

No objection from:	#network
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D10215
2017-04-11 08:56:18 +00:00
Patrick Kelsey
75580d5881 Fixed typo in comment found while reading commit email for fix of
other typo in same comment.

ned -> need

MFC after:	3 days
2017-04-08 04:50:50 +00:00
Patrick Kelsey
59f35a8290 Fixed typo in comment.
patckets -> packets

MFC after:	3 days
2017-04-08 04:45:52 +00:00
Patrick Kelsey
68ce5a03a2 Fix typo in comment.
logest -> longest

MFC after:	3 days
2017-04-08 04:37:01 +00:00
Sean Bruno
60596476cf Move pause frame counter out of struct if_ctx and into struct if_softc_ctx_t
so that we can use it in iflib to detect pause frames.

The igb(4) driver definitely used to use this in its old timer function and
I see no reason to restrict it to that driver only.

Sponsored by:	Limelight Networks
2017-04-07 00:33:03 +00:00
Sean Bruno
ea351d3f14 Allow MSIX to be turned off by tuneable per interface, per driver.
Sponsored by:	Limelight Networks
2017-04-04 21:03:34 +00:00
Andrey V. Elsukov
88d950a650 Remove "IPFW static rules" rmlock.
Make PFIL's lock global and use it for this purpose.
This reduces the number of locks needed to acquire for each packet.

Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
No objection from: #network
Differential Revision:	https://reviews.freebsd.org/D10154
2017-04-03 13:35:04 +00:00
Sean Bruno
2b2fc97356 Don't call init functions directly from the timer/watchdog function.
Enqueue this in the admin task now that it can process it.

Submitted by:	Matt Macy <mmacy@nextbsd.org>
Sponsored by:	Limelight Networks
2017-03-30 16:54:01 +00:00
Sean Bruno
5c1ff25517 Assert IFF_DRV_OACTIVE in iflib_timer() when the "hung" case is detected
so that iflib's admin task can still process the reset directive and restore
functionality.

Sponsored by:	Limelight Networks
2017-03-30 16:03:51 +00:00
Andrey V. Elsukov
af48c203d6 ake pfil's locking macros private.
Obtained from:	Yandex LLC
MFC after:	1 week
2017-03-27 08:18:13 +00:00
Andrey V. Elsukov
52b8eb0b31 Declare module version.
MFC after:	1 week
2017-03-27 07:56:41 +00:00
Ermal Luçi
4e950412ff Correct handling of ALTQ with epair(4) interfaces but presenting that ALTQ(9) is supported.
Approved by:	ae
MFC after:	2 weeks
2017-03-24 00:55:16 +00:00
Kristof Provost
2f8fb3a868 pf: Fix possible shutdown race
Prevent possible races in the pf_unload() / pf_purge_thread() shutdown
code. Lock the pf_purge_thread() with the new pf_end_lock to prevent
these races.

Use a shared/exclusive lock, as we need to also acquire another sx lock
(VNET_LIST_RLOCK). It's fine for both pf_purge_thread() and pf_unload()
to sleep,

Pointed out by: eri, glebius, jhb
Differential Revision:	https://reviews.freebsd.org/D10026
2017-03-22 21:18:18 +00:00
Sean Bruno
5e88838850 Change casting to a uintptr_t to be compatible with non-x86 architectures.
Submitted by:	Matt Macy <mmacy@nextbsd.org>
Reported by:	rpokala
Sponsored by:	Limelight Networks
2017-03-14 22:25:07 +00:00
Sean Bruno
0a1b74a3d1 Fixup LINT by using uint64_t type as we do on all other calls to PNMB()
Found with Jenkins.

Reported by:	lwshu
Sponsored by:	Limelight Networks
2017-03-14 15:08:56 +00:00
Sean Bruno
95246abb21 IFLIB updates
- unconditionally enable BUS_DMA on non-x86 architectures
- speed up rxd zeroing via customized function
- support out of order updates to rxd's
- add prefetching to hardware descriptor rings
- only prefetch on 10G or faster hardware
- add seperate tx queue intr function
- preliminary rework of NETMAP interfaces, WIP

Submitted by:	Matt Macy <mmacy@nextbsd.org>
Sponsored by:	Limelight Networks
2017-03-13 22:53:06 +00:00
Andrey V. Elsukov
250a8e2720 Ignore ifnet renaming in the bpf ifnet departure handler.
PR:		213015
MFC after:	1 week
2017-03-13 09:04:10 +00:00
Andrey V. Elsukov
350e622703 Remove now unneded cast. 2017-03-08 08:09:41 +00:00
Andrey V. Elsukov
22986c6740 Introduce the concept of IPsec security policies scope.
Currently are defined three scopes: global, ifnet, and pcb.
Generic security policies that IKE daemon can add via PF_KEY interface
or an administrator creates with setkey(8) utility have GLOBAL scope.
Such policies can be applied by the kernel to outgoing packets and checked
agains inbound packets after IPsec processing.
Security policies created by if_ipsec(4) interfaces have IFNET scope.
Such policies are applied to packets that are passed through if_ipsec(4)
interface.
And security policies created by application using setsockopt()
IP_IPSEC_POLICY option have PCB scope. Such policies are applied to
packets related to specific socket. Currently there is no way to list
PCB policies via setkey(8) utility.

Modify setkey(8) and libipsec(3) to be able distinguish the scope of
security policies in the `setkey -DP` listing. Add two optional flags:
'-t' to list only policies related to virtual *tunneling* interfaces,
i.e. policies with IFNET scope, and '-g' to list only policies with GLOBAL
scope. By default policies from all scopes are listed.

To implement this PF_KEY's sadb_x_policy structure was modified.
sadb_x_policy_reserved field is used to pass the policy scope from the
kernel to userland. SADB_SPDDUMP message extended to support filtering
by scope: sadb_msg_satype field is used to specify bit mask of requested
scopes.

For IFNET policies the sadb_x_policy_priority field of struct sadb_x_policy
is used to pass if_ipsec's interface if_index to the userland. For GLOBAL
policies sadb_x_policy_priority is used only to manage order of security
policies in the SPDB. For IFNET policies it is not used, so it can be used
to keep if_index.

After this change the output of `setkey -DP` now looks like:
# setkey -DPt
0.0.0.0/0[any] 0.0.0.0/0[any] any
	in ipsec
	esp/tunnel/87.250.242.144-87.250.242.145/unique:145
	spid=7 seq=3 pid=58025 scope=ifnet ifname=ipsec0
	refcnt=1
# setkey -DPg
::/0 ::/0 icmp6 135,0
	out none
	spid=5 seq=1 pid=872 scope=global
	refcnt=1

No objection from:	#network
Obtained from:	Yandex LLC
MFC after:	2 weeks
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D9805
2017-03-07 00:13:53 +00:00
Sean Bruno
d945ed6472 Make gtaskqueue compatible with drm-next such that they can be used with the
linuxkpi tasklets.

Submitted by:	mmacy@nextbsd.org
Reported by:	hps
2017-03-01 18:37:35 +00:00
Warner Losh
607a4c520e Back out r314471. In https://reviews.freebsd.org/D1858 it was clear
that this shouldn't go in. I was unaware when I merged the pull
request. I don't wish to upset the status quo, so backout per
project practice.

Pull Request:	https://github.com/freebsd/freebsd/pull/92
Noted by:	hrs@
2017-03-01 05:38:04 +00:00
Warner Losh
7d85b06ecf Fix VNET - DAD detected duplicate IPv6 address
Assign a hopefully unique, locally administered etheraddr. - for
epairNa & epairNb

Submitted by:	Catalin <sslevil@users.noreply.github.com>
Pull Request:	https://github.com/freebsd/freebsd/pull/92
2017-03-01 04:47:22 +00:00
Warner Losh
fbbd9655e5 Renumber copyright clause 4
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by:	Jan Schaumann <jschauma@stevens.edu>
Pull Request:	https://github.com/freebsd/freebsd/pull/96
2017-02-28 23:42:47 +00:00
Gleb Smirnoff
efe3b0de14 Remove SVR4 (System V Release 4) binary compatibility support.
UNIX System V Release 4 is operating system released in 1988. It ceased
to exist in early 2000-s.
2017-02-28 05:14:42 +00:00
Jonathan T. Looney
e80039007a Do some minimal work to better conform to the 802.3ad (LACP) standard.
In particular, don't set the synchronized bit for the peer unless it truly
appears to be synchronized to us. Also, don't set our own synchronized bit
unless we have actually seen a remote system.

Prior to this change, we were seeing some strange behavior, such as:

1. We send an advertisement with the Activity, Aggregation, and Default
flags, followed by an advertisement with the Activity, Aggregation,
Synchronization, and Default flags. However, we hadn't seen an
advertisement from another peer and were still advertising the default
(NULL) peer. A closer examination of the in-kernel data structures (using
kgdb) showed that the system had added the default (NULL) peer as a valid
aggregator for the segment.
2. We were receiving an advertisement from a peer that included the
default (NULL) peer instead of including our system information. However,
we responded with an advertisement that included the Synchronization flag
for both our system and the peer. (Since the peer's advertisement did not
include our system information, we shouldn't add the synchronization bit
for the peer.)

This patch corrects those two items.

Reviewed by:	smh
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D9485
2017-02-26 00:19:02 +00:00
Pedro F. Giffuni
e099b90b80 sys: Replace zero with NULL for pointers.
Found with:	devel/coccinelle
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D9694
2017-02-22 02:35:59 +00:00
Jason A. Harmening
e2a8d17887 Bring back r313037, with fixes for mips:
Implement get_pcpu() for amd64/sparc64/mips/powerpc, and use it to
replace pcpu_find(curcpu) in MI code.

Reviewed by:	andreast, kan, lidl
Tested by:	lidl(mips, sparc64), andreast(powerpc)
Differential Revision:	https://reviews.freebsd.org/D9587
2017-02-19 02:03:09 +00:00
Xin LI
b86fcc147f MFV r313759: license change for a few headers (4 clause BSD to 3 clause BSD).
MFC after:	28 days
X-MFC-with:	r313695
2017-02-15 07:22:47 +00:00
Xin LI
ada6f083b9 MFV r313676: libpcap 1.8.1
MFC after:	1 month
2017-02-13 08:23:39 +00:00
Gleb Smirnoff
10addc1eb5 Last consumer of _WANT_RTENTRY gone. 2017-02-10 17:37:04 +00:00
Andrey V. Elsukov
fcf596178b Merge projects/ipsec into head/.
Small summary
 -------------

o Almost all IPsec releated code was moved into sys/netipsec.
o New kernel modules added: ipsec.ko and tcpmd5.ko. New kernel
  option IPSEC_SUPPORT added. It enables support for loading
  and unloading of ipsec.ko and tcpmd5.ko kernel modules.
o IPSEC_NAT_T option was removed. Now NAT-T support is enabled by
  default. The UDP_ENCAP_ESPINUDP_NON_IKE encapsulation type
  support was removed. Added TCP/UDP checksum handling for
  inbound packets that were decapsulated by transport mode SAs.
  setkey(8) modified to show run-time NAT-T configuration of SA.
o New network pseudo interface if_ipsec(4) added. For now it is
  build as part of ipsec.ko module (or with IPSEC kernel).
  It implements IPsec virtual tunnels to create route-based VPNs.
o The network stack now invokes IPsec functions using special
  methods. The only one header file <netipsec/ipsec_support.h>
  should be included to declare all the needed things to work
  with IPsec.
o All IPsec protocols handlers (ESP/AH/IPCOMP protosw) were removed.
  Now these protocols are handled directly via IPsec methods.
o TCP_SIGNATURE support was reworked to be more close to RFC.
o PF_KEY SADB was reworked:
  - now all security associations stored in the single SPI namespace,
    and all SAs MUST have unique SPI.
  - several hash tables added to speed up lookups in SADB.
  - SADB now uses rmlock to protect access, and concurrent threads
    can do SA lookups in the same time.
  - many PF_KEY message handlers were reworked to reflect changes
    in SADB.
  - SADB_UPDATE message was extended to support new PF_KEY headers:
    SADB_X_EXT_NEW_ADDRESS_SRC and SADB_X_EXT_NEW_ADDRESS_DST. They
    can be used by IKE daemon to change SA addresses.
o ipsecrequest and secpolicy structures were cardinally changed to
  avoid locking protection for ipsecrequest. Now we support
  only limited number (4) of bundled SAs, but they are supported
  for both INET and INET6.
o INPCB security policy cache was introduced. Each PCB now caches
  used security policies to avoid SP lookup for each packet.
o For inbound security policies added the mode, when the kernel does
  check for full history of applied IPsec transforms.
o References counting rules for security policies and security
  associations were changed. The proper SA locking added into xform
  code.
o xform code was also changed. Now it is possible to unregister xforms.
  tdb_xxx structures were changed and renamed to reflect changes in
  SADB/SPDB, and changed rules for locking and refcounting.

Reviewed by:	gnn, wblock
Obtained from:	Yandex LLC
Relnotes:	yes
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D9352
2017-02-06 08:49:57 +00:00
Sean Bruno
67af525c55 Delete duplicate break. 2017-02-04 18:25:09 +00:00
Jason A. Harmening
ad62ba6e96 Revert r313037
The switch to get_pcpu() in MI code seems to cause hangs on MIPS.
Back out until we can get a better idea of what's happening there.

Reported by:	kan, lidl
2017-02-04 06:24:49 +00:00
Jason A. Harmening
65ed483615 Implement get_pcpu() for the remaining architectures and use it to
replace pcpu_find(curcpu) in MI code.
2017-02-01 03:32:49 +00:00
Stephen J. Kiernan
d0b2cad1ca Add the folowing set accessor functions for recently-added members of ifnet
structure:

if_gethwtsomax(), if_sethwtsomax()                 - if_hw_tsomax
if_gethwtsomaxsegcount(), if_sethwtsomaxsegcount() - if_hw_tsomaxsegcount
if_gethwtsomaxsegsize(), if_sethwtsomaxsegsize()   - if_hw_tsomaxsegsize

Update em and vnic drivers which had already been coverted to use accessor
functions for the other ifnet structure members.

Reviewed by:	erj
Approved by:	sjg (mentor)
Obtained from:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D8544
2017-01-31 16:12:31 +00:00
Luiz Otavio O Souza
13157b2baf Do not update the lagg link layer address when destroying a lagg clone.
This would enqueue an event to send the gratuitous arp on a dying lagg
interface without any physical ports attached to it.

Apart from that, the taskqueue_drain() on lagg_clone_destroy() runs too
late, when the ifp data structure is already freed.  Fix that too.

Obtained from:	pfSense
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC (Netgate)
2017-01-30 03:04:33 +00:00
Luiz Otavio O Souza
d177868c16 The stf(4) interface name does not conform with the default naming
convention for interfaces, because only one stf(4) interface can exist
in the system.

This disallow the use of unit numbers different than 0, however, it is
possible to create the clone without specify the unit number (wildcard).

In the wildcard case we must update the interface name before return.

This fix an infinite recursion in pf code that keeps track of network
interfaces and groups:

1 - a group for the cloned type of the interface is added (stf in this
    case);
2 - the system will now try to add an interface named stf (instead of
    stf0) to stf group;
3 - when pfi_kif_attach() tries to search for an already existing 'stf'
    interface, the 'stf' group is returned and thus the group is added
    as an interface of itself;

This will now cause a crash at the first attempt to traverse the groups
which the stf interface belongs (which loops over itself).

Obtained from:	pfSense
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC (Netgate)
2017-01-29 18:41:09 +00:00
Andriy Voskoboinyk
2bbd06fc33 Garbage collect IFT_IEEE80211 (but leave the define for possible reuse)
This interface type ("a parent interface of wlanX") is not used since
r287197

Reviewed by:	adrian, glebius
Differential Revision:	https://reviews.freebsd.org/D9308
2017-01-28 17:08:40 +00:00
Sean Bruno
835809f99f Fix i386 compile failure by moving needed closing parenthesis out of
conditional block.

Submitted by:   hiren
Reported by:    cy
2017-01-28 15:44:14 +00:00
Dexuan Cui
6597559ea7 ifnet: move the new ifnet_event EVENTHANDLER_DECLARE to net/if_var.h
Thank glebius for pointing this out:
"The network stuff shall not be added to sys/eventhandler.h"

Reviewed by:	David_A_Bright_DELL.com, sephe, glebius
Approved by:	sephe (mentor)
MFC after:	2 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D9345
2017-01-28 07:26:42 +00:00
Sean Bruno
e035717e57 IFLIB updates:
We found routing performance dropped significantly when configuring
FreeBSD as a router, we are applying the following changes in order to
resolve those issues and hopefully perform better.
 - don't prefetch the flags array, we usually don't need it
 - prefetch the next cache line of each of the software descriptor arrays as
   well as the first cache line of each of the next four packets' mbufs and
   clusters
 - reduce max copy size to 63 bytes
 - convert rx soft descriptors from array of structures to a structure of arrays
 - update copyrights

Submitted by:	Matt Macy <mmacy@nextbsd.org>
2017-01-27 23:08:06 +00:00
Sean Bruno
96eeabefbe Replace customized busmaster code with standardized setup call.
Reported by:	jhb
2017-01-27 22:30:27 +00:00
Sean Bruno
69b7fc3e67 Minor style annoyance.
Submitted by:	bde
2017-01-26 13:50:09 +00:00
Kristof Provost
ab5cda71df bridge: Release the bridge lock when calling bridge_set_ifcap()
This calls ioctl() handlers for the different interfaces in the bridge.
These handlers expect to get called in an ioctl context where it's safe
for them to sleep. We may not sleep with the bridge lock held.

However, we still need to protect the interface list, to ensure it
doesn't get changed while we iterate over it.
Use BRIDGE_XLOCK(), which prevents bridge members from being removed.
Adding bridge members is safe, because it uses LIST_INSERT_HEAD().

This caused panics when adding xen interfaces to a bridge.

PR:		216304
Reviewed by:	ae
MFC after:	1 week
Sponsored by:	RootBSD
Differential Revision:	https://reviews.freebsd.org/D9290
2017-01-25 21:25:26 +00:00
Luiz Otavio O Souza
338e227ac0 After the in_control() changes in r257692, an existing address is
(intentionally) deleted first and then completely added again (so all the
events, announces and hooks are given a chance to run).

This cause an issue with CARP where the existing CARP data structure is
removed together with the last address for a given VHID, which will cause
a subsequent fail when the address is later re-added.

This change fixes this issue by adding a new flag to keep the CARP data
structure when an address is not being removed.

There was an additional issue with IPv6 CARP addresses, where the CARP data
structure would never be removed after a change and lead to VHIDs which
cannot be destroyed.

Reviewed by:	glebius
Obtained from:	pfSense
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC (Netgate)
2017-01-25 19:04:08 +00:00
Sean Bruno
f7ae9a84e3 Add error checking to the pci_find_cap(, PCIY_MSIX,) call that is returns
success and a good value.  Only then try to use it and set the MSIX_ENABLE
bit.

With the current em(4) driver we have observed failures in this case in a
specific environment when pci_find_cap() would not return the assumed
value, which meant we ended up writing to PCI register 2 (PCI_DEVICE_ID)
which is read-only.

PR:		216456
Submitted by:	bz
2017-01-25 14:37:05 +00:00
Sean Bruno
bd84f70044 iflib:
Add internal tracking of smp startup status to reliably figure out
     what methods are to be used to get gtaskqueue up and running.

e1000:
     Calculating this pointer gives undefined behaviour when (last == -1)
     (it is before the buffer).  The pointer is always followed.  Panics
     occurred when it points to an unmapped page.  Otherwise, the pointed-to
     garbage tends to not have the E1000_TXD_STAT_DD bit set in it, so in the
     broken case the loop was usually null and the function just returned, and
     this was acidentally correct.

Submitted by:	bde
Reported by:	Matt Macy <mmacy@nextbsd.org>
2017-01-24 16:05:42 +00:00
Sean Bruno
36fa5d5b64 Revert 312696 due to build tests. 2017-01-24 15:55:52 +00:00
Sean Bruno
562a3182f6 iflib:
Add internal tracking of smp startup status to reliably figure out
   what methods are to be used to get gtaskqueue up and running.

e1000:
   Calculating this pointer gives undefined behaviour when (last == -1)
   (it is before the buffer).  The pointer is always followed.  Panics
   occurred when it points to an unmapped page.  Otherwise, the pointed-to
   garbage tends to not have the E1000_TXD_STAT_DD bit set in it, so in the
   broken case the loop was usually null and the function just returned, and
   this was acidentally correct.

Submitted by:	bde
Reviewed by:	Matt Macy <mmacy@nextbsd.org>
2017-01-24 14:48:32 +00:00
Dexuan Cui
92a6859b91 ifnet: introduce event handlers for ifup/ifdown events
Hyper-V's NIC SR-IOV implementation needs a Hyper-V synthetic NIC and
a VF NIC to work together, mainly to support seamless live migration.

When the VF device becomes UP (or DOWN), the synthetic NIC driver needs
to switch the data path from the synthetic NIC to the VF (or the opposite).

So the synthetic NIC driver needs to know when a VF device is becoming
UP or DOWN and hence the patch is made.

Reviewed by:	sephe
Approved by:	sephe (mentor)
MFC after:	2 weeks
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8963
2017-01-24 09:19:46 +00:00
Ravi Pokala
d592868ebf Eliminate misleading comments and dead code in lacp_port_create()
Variables "fast" and "active" are both constant in lacp_port_create(), but
comments mispleadingly suggest that "fast" can be changed via ioctl. The
constant values control the value of "lp->lp_state", so it too is constant,
and the code for assigning different value to it is essentially dead.

Remove both "fast" and "active", and set "lp->lp_state" unconditionally;
that gets rid of the dead code and misleading comments.

CID: 1305692
CID: 1305734

Reported by:	asomers
Reviewed by:	asomers
MFC after:	1 week
Sponsored by:	Panasas
Differential Revision:	https://reviews.freebsd.org/D9302
2017-01-24 01:39:40 +00:00
Ryan Stone
7d309e8e40 Fix reference to free memory in ixgbe/if_media.c
When ixgbe receives an interrupt indicating that a new optical module
may have been inserted, it discards all of its current media types
by calling ifmedia_removeall() and then creates a new set of media
types for the supported media on the new module.  However,
ifmedia_removeall() was maintaining a pointer to whatever the
current media type was before the call to ifmedia_removealL().
The result of this was that any attempt to read the current media
type of the interface (e.g. via ifconfig) would return potentially
garbage data from free memory (or if one were particularly unlucky
on an architecture that does not malloc() from a direct map, page
fault the kernel).

Fix this by NULL'ing out the current media field in if_media.c,
and have ixgbe update the current media type after recreating
them.

Submitted by:	Matt Joras <matt.joras AT gmail DOT com>
Reviewed by:	sbruno, erj
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D9164
2017-01-20 17:16:48 +00:00
Hans Petter Selasky
f3e7afe2d7 Implement kernel support for hardware rate limited sockets.
- Add RATELIMIT kernel configuration keyword which must be set to
enable the new functionality.

- Add support for hardware driven, Receive Side Scaling, RSS aware, rate
limited sendqueues and expose the functionality through the already
established SO_MAX_PACING_RATE setsockopt(). The API support rates in
the range from 1 to 4Gbytes/s which are suitable for regular TCP and
UDP streams. The setsockopt(2) manual page has been updated.

- Add rate limit function callback API to "struct ifnet" which supports
the following operations: if_snd_tag_alloc(), if_snd_tag_modify(),
if_snd_tag_query() and if_snd_tag_free().

- Add support to ifconfig to view, set and clear the IFCAP_TXRTLMT
flag, which tells if a network driver supports rate limiting or not.

- This patch also adds support for rate limiting through VLAN and LAGG
intermediate network devices.

- How rate limiting works:

1) The userspace application calls setsockopt() after accepting or
making a new connection to set the rate which is then stored in the
socket structure in the kernel. Later on when packets are transmitted
a check is made in the transmit path for rate changes. A rate change
implies a non-blocking ifp->if_snd_tag_alloc() call will be made to the
destination network interface, which then sets up a custom sendqueue
with the given rate limitation parameter. A "struct m_snd_tag" pointer is
returned which serves as a "snd_tag" hint in the m_pkthdr for the
subsequently transmitted mbufs.

2) When the network driver sees the "m->m_pkthdr.snd_tag" different
from NULL, it will move the packets into a designated rate limited sendqueue
given by the snd_tag pointer. It is up to the individual drivers how the rate
limited traffic will be rate limited.

3) Route changes are detected by the NIC drivers in the ifp->if_transmit()
routine when the ifnet pointer in the incoming snd_tag mismatches the
one of the network interface. The network adapter frees the mbuf and
returns EAGAIN which causes the ip_output() to release and clear the send
tag. Upon next ip_output() a new "snd_tag" will be tried allocated.

4) When the PCB is detached the custom sendqueue will be released by a
non-blocking ifp->if_snd_tag_free() call to the currently bound network
interface.

Reviewed by:		wblock (manpages), adrian, gallatin, scottl (network)
Differential Revision:	https://reviews.freebsd.org/D3687
Sponsored by:		Mellanox Technologies
MFC after:		3 months
2017-01-18 13:31:17 +00:00
Sean Bruno
4ecb427a49 Fix hangs in a uniprocessor configuration (qemu, virtualbox, real hw).
sys/net/iflib.c:
  Add ctx to filter_info and don't skpi interrupt early on unless we're on an
  SMP system

sys/kern/subr_gtaskqueue.c:
  Skip smp check if we're running UP

Submitted by:	Matt Macy <mmacy@nextbsd.org>
Reported by:	emaste bde
2017-01-15 00:50:10 +00:00
Sean Bruno
5b51fcfc7a Remove unused mtx_held() macro. 2017-01-09 23:41:10 +00:00
Sepherosa Ziehau
cc5bb78be1 if: Defer the if_up until the ifnet.if_ioctl is called.
This ensures the interface is initialized by the interface driver
before it can be used by the rest of the system.

Reviewed by:	jhb, karels, gnn
MFC after:	3 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8905
2017-01-06 05:10:49 +00:00
Adrian Chadd
f69a29b5f9 [net80211] add VHT media types in the media layer. 2017-01-05 04:49:23 +00:00
Sean Bruno
1248952a50 2017 IFLIB updates in preparation for commits to e1000 and ixgbe.
- iflib - add checksum in place support (mmacy)
- iflib - initialize IP for TSO (going to be needed for e1000) (mmacy)
- iflib - move isc_txrx from shared context to softc context (mmacy)
- iflib - Normalize checks in TXQ drainage. (shurd)
- iflib - Fix queue capping checks (mmacy)
- iflib - Fix invalid assert, em can need 2 sentinels (mmacy)
- iflib - let the driver determine what capabilities are set and what
          tx csum flags are used (mmacy)
- add INVARIANTS debugging hooks to gtaskqueue enqueue (mmacy)
- update bnxt(4) to support the changes to iflib (shurd)

Some other various, sundry updates.  Slightly more verbose changelog:

Submitted by:	mmacy@nextbsd.org
Reviewed by:	shurd
mFC after:
Sponsored by:	LimeLight Networks and Dell EMC Isilon
2017-01-02 00:56:33 +00:00
Alan Somers
8a73c85db3 Remove stray debugging code from r310180
Reported by:	rstone
Pointy hat to:	asomers
MFC after:	3 weeks
X-MFC-with:	310180
Sponsored by:	Spectra Logic Corp
2016-12-20 15:45:53 +00:00
Alan Somers
d9fa2d67eb Fix panic during lagg destruction with simultaneous status check
If you run "ifconfig lagg0 destroy" and "ifconfig lagg0" at the same time a
page fault may result. The first process will destroy ifp->if_lagg in
lagg_clone_destroy (called by if_clone_destroy). Then the second process
will observe that ifp->if_lagg is NULL at the top of lagg_port_ioctl and
goto fallback: where it will promptly dereference ifp->if_lagg anyway.

The solution is to repeat the NULL check for ifp->if_lagg

MFC after:	4 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D8512
2016-12-16 22:39:30 +00:00
Adrian Chadd
fdbc9e6e82 [net80211] start laying down the foundation for 11ac support.
This is a work in progress and some of this stuff may change;
but hopefully I'm laying down enough stuff and space in fields
to allow it to grow without another major recompile.

We'll see!

* Add a net80211 PHY type for VHT 2G and VHT 5G.

  Note - yes, VHT is supposed to be for 5GHZ, however some vendors
  (*cough* most of them) support some subset of VHT rate support
  in 2GHz.  No - not 80MHz wide channels, but at least some MCS8-9
  support, maybe some beamforming, and maybe some longer A-MPDU
  aggregates.  I don't want to even think about MU-MIMO on 2GHz.

* Add an ifmedia placeholder type for VHT rates.

* Add channel flags for VHT, VHT20/40U/40D/80/80+80/160
* Add channel macros for the above
* Add ieee80211_channel fields for the VHT information and flags,
  along with some padding (so this struct definitely grows.)
* Add a phy type flag for VHT - 'v'

* Bump the number of channels to a much higher amount - until we get
  something like the linux mac80211 chanctx abstraction (where the
  stack provides a current channel configuration via callbacks,
  versus the driver ever checking ic->ic_curchan or similar) we'll
  have to populate VHT+HT combinations.

Eg, there'll likely be a full set of duplicate VHT20/40 channels to match
HT channels.  There will also be a full set of duplicate VHT80 channels -
note that for VHT80, its assumed you're doing VHT40 as a base, so we
don't need a duplicate of VHT80 + 20MHz only primary channels, only
a duplicate of all the VHT40 combinations.

I don't want to think about VHT80+80 or VHT160 for now - and I won't,
as the current device I'm doing 11ac bringup on (QCA9880) only does
VHT80.

I'll likely revisit the channel configuration and scanning related
stuff after I get VHT20/40 up.

* Add vht flags and the basic MCS rate setup to ieee80211com, ieee80211vap
  and ieee80211_node in preparation for 11ac configuration.
  There is zero code that uses this right now.
* Whilst here, add some more placeholders in case I need to extend
  out things by some uint32_t flag sized fields.  Hopefully I won't!

What I haven't yet done:

* any of the code that uses this
* any of the beamforming related fields
* any of the MU-MIMO fields required for STA/AP operation
* any of the IE fields in beacon frame / probe request/response handling
  and the calculations required for shifting beacon contents around
  when the TIM grows/shrinks

This will require a full rebuild of net80211 related programs -
ifconfig, hostapd, wpa_supplicant.
2016-12-16 04:43:31 +00:00
Luiz Otavio O Souza
8f1c8ade60 Fix the typos and style(9) in comment.
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC (Netgate)
2016-12-08 18:18:48 +00:00
Fabien Thomas
bf4356266d IPsec RFC6479 support for replay window sizes up to 2^32 - 32 packets.
Since the previous algorithm, based on bit shifting, does not scale
with large replay windows, the algorithm used here is based on
RFC 6479: IPsec Anti-Replay Algorithm without Bit Shifting.
The replay window will be fast to be updated, but will cost as many bits
in RAM as its size.

The previous implementation did not provide a lock on the replay window,
which may lead to replay issues.

Reviewed by:	ae
Obtained from:	emeric.poupon@stormshield.eu
Sponsored by:	Stormshield
Differential Revision:	https://reviews.freebsd.org/D8468
2016-11-25 14:44:49 +00:00
Sean Bruno
da69b8f9d1 iflib updates and fixes:
-    reset gen on down
-    initialize admin task statically
-    drain mp_ring on down
-    don't drop context lock on stop
-    reset error stats on down
-    fix typo in min_latency sysctl
-    return ENOBUFS from if_transmit if the driver isn't running or the link is down

Submitted by:	mmacy@nextbsd.org
Reviewed by:	shurd
MFC after:	2 days
Sponsored by:	Isilon and Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D8558
2016-11-18 04:19:21 +00:00
Mark Johnston
55dfce589c Plug a lock leak in sysctl_ifmalist().
Fix style in the local variable declarations.

PR:		214542
MFC after:	1 week
2016-11-15 19:23:48 +00:00
Ryan Stone
ab607f28e3 Don't read if_counters with if_addr_lock held
Calling into an ifnet implementation with the if_addr_lock already
held can cause a LOR and potentially a deadlock, as ifnet
implementations typically can take the if_addr_lock after their
own locks during configuration.  Refactor a sysctl handler that
was violating this to read if_counter data in a temporary buffer
before the if_addr_lock is taken, and then copying the data
in its final location later, when the if_addr_lock is held.

PR: 194109
Reported by: Jean-Sebastien Pedron
MFC after: 2 weeks
Differential Revision:	https://reviews.freebsd.org/D8498
Reviewed by: sbruno
2016-11-12 19:03:23 +00:00
Luigi Rizzo
844a6f0c53 Various fixes for ptnet/ptnetmap (passthrough of netmap ports). In detail:
- use PCI_VENDOR and PCI_DEVICE ids from a publicly allocated range
  (thanks to RedHat)
- export memory pool information through PCI registers
- improve mechanism for configuring passthrough on different hypervisors
Code is from Vincenzo Maffione as a follow up to his GSOC work.
2016-10-27 09:46:22 +00:00
Sepherosa Ziehau
14a31e99d7 hyperv/hn: Define empty packet filter.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8342
2016-10-27 04:55:19 +00:00
Bryan Drewery
921e5f5675 Remove excess CTLFLAG_VNET
Sponsored by:	Dell EMC Isilon
2016-10-26 23:40:07 +00:00
Sepherosa Ziehau
121e98e697 hyperv/hn: Fix RX filter settings.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8313
2016-10-24 05:10:35 +00:00
Sepherosa Ziehau
970ead008d hyperv/hn: Add network change support.
Currently the network change is simulated by link status changes.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8295
2016-10-21 08:02:05 +00:00
Jung-uk Kim
69d410eeb1 Implement BPF_MOD and BPF_XOR instructions.
These two ALU instructions first appeared on Linux.  Then, libpcap adopted
and made them available since 1.6.2.  Now more platforms including NetBSD
have them in kernel.  So do we.
 --이 줄 이하는 자동으로 제거됩니다--
2016-10-21 06:55:07 +00:00
Andrew Gallatin
eedb49598b Clear mbuf hashtype on loopback when RSS is enabled.
The hashtype on an outgoing mbuf reflects the correct hash on the
transmit side of the connection.  If this hash persists on loopback,
the receiving RSS/PCBGROUP code will use it to look up the pcbgroup
for the transmit side, which will often not match the pcbgroup for the
receive side of the connection.  This leads to TCP connections
hanging, and dropping the SYN/ACK packet.   This is essentially
the same as having a hardware network card generate mbufs with an
incorrect RSS hash.

There are a number of places which can set the hash on transmit,
so the simplest fix is to simply clear the hash at loopback time.
Clearing the hash allows a new, correct hash to be calculated in
software on the receive side.

Reviewed by:	jtl
Discussed with:	adrian
Sponsored by:	Netflix
2016-10-20 13:48:29 +00:00
Kevin Lo
b95d46da24 Fix typo in comment. 2016-10-19 02:24:57 +00:00
Luigi Rizzo
a2a7409151 remove stale and unused code from various files
fix build on 32 bit platforms
simplify logic in netmap_virt.h

The commands (in net/netmap.h) to configure communication with the
hypervisor may be revised soon.
At the moment they are unused so this will not be a change of API.
2016-10-18 16:18:25 +00:00
Luigi Rizzo
6ad42d71b2 remove trailing whitespace. No code changes. 2016-10-18 15:41:57 +00:00
Sean Bruno
2fe66646bd Set default capabilities at attach.
ref: 6425f45e5f

Submitted by:	mmacy@nextbsd.org
2016-10-18 14:02:45 +00:00
Sean Bruno
add6f7d069 When deciding whether or not to call tqg_attach_cpu(), reference rid
directly.

ref: c9b47b468b

Submitted by:	mmacy@nextbsd.org
2016-10-18 13:29:30 +00:00
Sean Bruno
8b2a1db901 Toggle v4/v6 rxcsum together
Only re-init if driver is running

ref: 106518e874

Submitted by:	mmacy@nextbsd.org
2016-10-18 13:22:44 +00:00
Sean Bruno
aa3c5dd8a8 Fix misusage of CPU_FFS when binding queues to cpus
ref: 922d0bdf22

Submitted by:	mmacy@nextbsd.org
2016-10-18 13:12:19 +00:00
Luigi Rizzo
a82ab41168 add a missing header. 2016-10-16 18:27:41 +00:00
Luigi Rizzo
37e3a6d349 Import the current version of netmap, aligned with the one on github.
This commit, long overdue, contains contributions in the last 2 years
from Stefano Garzarella, Giuseppe Lettieri, Vincenzo Maffione, including:
+ fixes on monitor ports
+ the 'ptnet' virtual device driver, and ptnetmap backend, for
  high speed virtual passthrough on VMs (bhyve fixes in an upcoming commit)
+ improved emulated netmap mode
+ more robust error handling
+ removal of stale code
+ various fixes to code and documentation (some mixup between RX and TX
  parameters, and private and public variables)

We also include an additional tool, nmreplay, which is functionally
equivalent to tcpreplay but operating on netmap ports.
2016-10-16 14:13:32 +00:00
Sepherosa Ziehau
368bf0c2c6 ifnet: Use if_link_state snapshot to invoke ifnet_link_event
So that everyone in this task have consistent view of link state.

Reviewed by:	ae
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8214
2016-10-12 01:52:29 +00:00
Andrey V. Elsukov
199511bcdd Make LLTABLE list lock private for if_llatbl.c
Rename lock and macros to reflect that it protects V_lltables list.
2016-10-11 17:41:13 +00:00
Sepherosa Ziehau
65ca331080 hyperv/hn: Fix checksum offload settings
The _correct_ way to identify the supported checksum offloading and
TSO parameters is to query OID_TCP_OFFLOAD_HARDWARE_CAPABILITIES.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8088
2016-10-10 05:41:39 +00:00
Andrey V. Elsukov
abe95d87ee Replace rw_init/rw_destroy with corresponding macros.
Obtained from:	Yandex LLC
2016-10-06 14:42:06 +00:00
Kevin Lo
c2b5ba7661 Remove an alias if_list, use if_link consistently.
Reviewed by:	tuexen
Differential Revision:	https://reviews.freebsd.org/D8075
2016-10-06 00:51:27 +00:00
Sepherosa Ziehau
1a3c881209 hyperv/hn: Add stubs for OFFLOAD_CURRENT_CONFIG and NETWORK_CHANGE status
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8068
2016-09-30 06:58:45 +00:00
Kevin Lo
decb239dff Remove the compatibility macro if_addrlist.
Since if_addrlist is used only for ipfilter(4), add a macro if_addrlist
in ip_compat.h.

Reviewed by:	cy
Differential Revision:	https://reviews.freebsd.org/D8059
2016-09-29 05:37:45 +00:00
Kevin Lo
c7641cd18d Remove ifa_list, use ifa_link (structure field) instead.
While here, prefer if_addrhead (FreeBSD) to if_addrlist (BSD compat) naming
for the interface address list in sctp_bsd_addr.c

Reviewed by:	tuexen
Differential Revision:	https://reviews.freebsd.org/D8051
2016-09-28 13:29:11 +00:00
Kevin Lo
3ff511d316 Remove a comment about the size of the ifnet structure.
Reviewed by:	adrian
Differential Revision:	https://reviews.freebsd.org/D8036
2016-09-27 08:11:09 +00:00
Kristof Provost
f18598a43e bridge: Fix fragment handling and memory leak
Fragmented UDP and ICMP packets were corrupted if a firewall with reassembling
feature (like pf'scrub) is enabled on the bridge.  This patch fixes corrupted
packet problem and the panic (triggered easly with low RAM) as explain in PR
185633.

bridge_pfil and bridge_fragment relationship:

bridge_pfil() receive (IN direction) packets and sent it to the firewall The
firewall can be configured for reassembling fragmented packet (like pf'scrubing)
in one mbuf chain when bridge_pfil() need to send this reassembled packet to the
outgoing interface, it needs to re-fragment it by using bridge_fragment()
bridge_fragment() had to split this mbuf (using ip_fragment) first then
had to M_PREPEND each packet in the mbuf chain for adding Ethernet
header.

But M_PREPEND can sometime create a new mbuf on the begining of the mbuf chain,
then the "main" pointer of this mbuf chain should be updated and this case is
tottaly forgotten. The original bridge_fragment code (Revision 158140,
2006 April 29) came from OpenBSD, and the call to bridge_enqueue was
embedded.  But on FreeBSD, bridge_enqueue() is done after bridge_fragment(),
then the original OpenBSD code can't work as-it of FreeBSD.

PR:		185633
Submitted by:	Olivier Cochard-Labbé
Differential Revision:	https://reviews.freebsd.org/D7780
2016-09-24 07:09:43 +00:00
Kevin Lo
c3bef61e58 Remove the 4.3BSD compatible macro m_copy(), use m_copym() instead.
Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D7878
2016-09-15 07:41:48 +00:00
Sepherosa Ziehau
b349357819 hyperv/hn: Stringent RNDIS packet message length/offset check.
While I'm here, use definition in net/rndis.h

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7782
2016-09-06 03:20:06 +00:00
Sepherosa Ziehau
a8197ee35e net/rndis: Define RNDIS status message, which could be sent by device.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7757
2016-09-05 04:56:56 +00:00
Sepherosa Ziehau
772b86ba12 net/rndis: Define common message header for RNDIS messages.
And avoid RNDIS_HEADER_OFFSET hardcoding.

Reviewed by:	hps
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7739
2016-09-02 05:57:13 +00:00
Sepherosa Ziehau
178228a10e net/rndis: Add comment for rndis_comp_hdr
Reviewed by:	hps
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7738
2016-09-02 05:49:38 +00:00
Sepherosa Ziehau
46ebd74ce1 net/rndis: Define types for RNDIS pktinfo rm_type field.
They are defined by NDIS spec, so the NDIS prefix.

Reviewed by:	hps
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7717
2016-09-01 07:17:06 +00:00
Sepherosa Ziehau
f320cbed5a net/vlan: Shift for pri is 13 (pri mask 0xe000) not 1.
Reviewed by:	araujo, hps
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7710
2016-09-01 06:32:35 +00:00
Sepherosa Ziehau
6f67f21938 net/rndis: Define per-packet-info for RNDIS packet message
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7708
2016-09-01 05:40:13 +00:00
Sepherosa Ziehau
947175ca10 net/rndis: Add comment for rndis_set_parameter
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7705
2016-09-01 05:15:04 +00:00
Sepherosa Ziehau
1010113dad net/rndis: Packet types are defined by NDIS; not RNDIS specific.
Reviewed by:	hps
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7681
2016-08-30 03:11:07 +00:00
Sepherosa Ziehau
8bb1a21b56 hyperv/hn: Move OIDs to net/rndis.h; they are standard NDIS OIDs.
Actually all OIDs defined in net/rndis.h are standard NDIS OIDs.
While I'm here, use the verbose macro name as in NDIS spec.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7679
2016-08-30 02:55:07 +00:00
Sepherosa Ziehau
77c4f5aa9d hyperv/hn: Use vmbus xact for RNDIS set.
And use new RNDIS set to configure NDIS offloading parameters.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7641
2016-08-26 05:18:27 +00:00
Sepherosa Ziehau
cc3d96db55 hyperv/hn: Use vmbus xact for RNDIS query.
And switch MAC address query to use new RNDIS query function.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7639
2016-08-26 05:12:09 +00:00
Sepherosa Ziehau
550bbdbd27 hyperv/hn: Use vmbus xact for RNDIS initialize.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7624
2016-08-25 05:00:41 +00:00
Sepherosa Ziehau
6d79d63a7b net/rndis: Fix RNDIS_STATUS_PENDING definition.
While I'm here, sort the RNDIS status in ascending order.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7594
2016-08-24 03:16:25 +00:00
Sepherosa Ziehau
48ef7b17f0 net/rndis: Add canonical RNDIS major/minor version as of today.
Reviewed by:	hps
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7593
2016-08-24 03:08:13 +00:00
Sepherosa Ziehau
1ba241d223 net: Split RNDIS protocol structs/macros out of dev/usb/net/if_urndisreg.h
So that Hyper-V can leverage them instead of rolling its own definition.

Discussed with:	hps
Reviewed by:	hps
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7592
2016-08-23 02:54:06 +00:00
Andrey V. Elsukov
fdf95c0b81 Teach netisr_get_cpuid() to limit a given value to supported by netisr.
Use netisr_get_cpuid() in netisr_select_cpuid() to limit cpuid value
returned by protocol to be sure that it is not greather than nws_count.

PR:		211836
Reviewed by:	adrian
MFC after:	3 days
2016-08-17 20:21:33 +00:00
Stephen Hurd
23ac9029f9 Update iflib to support more NIC designs
- Move group task queue into kern/subr_gtaskqueue.c
- Change intr_enable to return an int so it can be detected if it's not
  implemented
- Allow different TX/RX queues per set to be different sizes
- Don't split up TX mbufs before transmit
- Allow a completion queue for TX as well as RX
- Pass the RX budget to isc_rxd_available() to allow an earlier return
  and avoid multiple calls

Submitted by:	shurd
Reviewed by:	gallatin
Approved by:	scottl
Differential Revision:	https://reviews.freebsd.org/D7393
2016-08-12 21:29:44 +00:00
Adrian Chadd
eb81dc79e9 Extract out the various local definitions of ETHER_IS_BROADCAST() and
turn them into a shared definition.

Set M_MCAST/M_BCAST appropriately upon packet reception in net80211, just
before they are delivered up to the ethernet stack.

Submitted by:	rstone
2016-08-07 03:48:33 +00:00
John Baldwin
f454e7ebf5 Add __printflike() to bus_describe_intr() to enable -Wformat checks.
Fix a few places that were passing a raw string as the format to use
a "%s" format string instead.

MFC after:	2 months
2016-08-04 18:29:16 +00:00
Conrad Meyer
9809e9dc3a rtentry: Initialize rt_mtx with MTX_NEW
The "rtentry" zone does not use UMA_ZONE_ZINIT, so it is invalid to assume the
mutex's memory will be zero.  Without MTX_NEW, garbage backing memory may
trigger the "re-initializing a mutex" assertion.

PR:		200991
Submitted by:	Chang-Hsien Tsai <luke.tw AT gmail.com>
2016-08-01 23:07:31 +00:00
Konstantin Belousov
584b675ed6 Hide the boottime and bootimebin globals, provide the getboottime(9)
and getboottimebin(9) KPI. Change consumers of boottime to use the
KPI.  The variables were renamed to avoid shadowing issues with local
variables of the same name.

Issue is that boottime* should be adjusted from tc_windup(), which
requires them to be members of the timehands structure.  As a
preparation, this commit only introduces the interface.

Some uses of boottime were found doubtful, e.g. NLM uses boottime to
identify the system boot instance.  Arguably the identity should not
change on the leap second adjustment, but the commit is about the
timekeeping code and the consumers were kept bug-to-bug compatible.

Tested by:	pho (as part of the bigger patch)
Reviewed by:	jhb (same)
Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
X-Differential revision:	https://reviews.freebsd.org/D7302
2016-07-27 11:08:59 +00:00
Gleb Smirnoff
32e0ade6c4 Partially revert r257696/r257713, which have an issue with writing to user
controlled address. Restore the old code that emulated OSIOCGIFCONF in if.c.

Noticed by:	C Turt
2016-07-24 10:10:09 +00:00
Alexander Motin
84e633724f Negotiate/disable TXCSUM_IPV6 same as TXCSUM. 2016-07-18 16:58:47 +00:00
Nathan Whitehorn
8c636a11dc Remove assumptions in MI code that the BSP is CPU 0.
MFC after:	2 weeks
2016-07-11 21:25:28 +00:00
Pedro F. Giffuni
a4bf2e2d49 ng_mppc(4):: basic readability cleanups.
In particular use __unreachable() to appease static analyzers.
No functional change.

CID:		1356591
MFC after:	3 days
2016-07-09 02:33:45 +00:00
Conrad Meyer
91d546a05d iflib: Fix typo in 'iflib_rx_miss_bufs' sysctl name
It looks like these sysctls were copy-pasted from netmap.  Most were changed
from 'ixl_' prefix to 'iflib_', but this one was missed.

Fix the "can't re-use a leaf (ixl_rx_miss_bufs)!" warning.

Reported by:	dim@ and others
Sponsored by:	EMC / Isilon Storage Division
2016-07-08 17:04:21 +00:00
Nathan Whitehorn
6415e9aafb Add variable declaration missing in r302372.
Submitted by:	andrew
Approved by:	re (gjb, kib)
2016-07-06 17:46:49 +00:00
Nathan Whitehorn
96c85efb4b Replace a number of conflations of mp_ncpus and mp_maxid with either
mp_maxid or CPU_FOREACH() as appropriate. This fixes a number of places in
the kernel that assumed CPU IDs are dense in [0, mp_ncpus) and would try,
for example, to run tasks on CPUs that did not exist or to allocate too
few buffers on systems with sparse CPU IDs in which there are holes in the
range and mp_maxid > mp_ncpus. Such circumstances generally occur on
systems with SMT, but on which SMT is disabled. This patch restores system
operation at least on POWER8 systems configured in this way.

There are a number of other places in the kernel with potential problems
in these situations, but where sparse CPU IDs are not currently known
to occur, mostly in the ARM machine-dependent code. These will be fixed
in a follow-up commit after the stable/11 branch.

PR:		kern/210106
Reviewed by:	jhb
Approved by:	re (glebius)
2016-07-06 14:09:49 +00:00