Commit Graph

99759 Commits

Author SHA1 Message Date
Adrian Chadd
a4d98bf442 Add basic RSS awareness for the UDPv6 send path.
This doesn't include the same kind of userland overriding that the IPv4
path has; nor does it yet know about 2-tuple versus 4-tuple hashing.
That'll come later.

Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 04:20:53 +00:00
Adrian Chadd
8ad1a83b48 Calculate the RSS hash for outbound UDPv4 frames.
Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 04:19:36 +00:00
Adrian Chadd
b8bc95cd49 Update the IPv4 input path to handle reassembled frames and incoming frames
with no RSS hash.

When doing RSS:

* Create a new IPv4 netisr which expects the frames to have been verified;
  it just directly dispatches to the IPv4 input path.
* Once IPv4 reassembly is done, re-calculate the RSS hash with the new
  IP and L3 header; then reinject it as appropriate.
* Update the IPv4 netisr to be a CPU affinity netisr with the RSS hash
  function (rss_soft_m2cpuid) - this will do a software hash if the
  hardware doesn't provide one.

NICs that don't implement hardware RSS hashing will now benefit from RSS
distribution - it'll inject into the correct destination netisr.

Note: the netisr distribution doesn't work out of the box - netisr doesn't
query RSS for how many CPUs and the affinity setup.  Yes, netisr likely
shouldn't really be doing CPU stuff anymore and should be "some kind of
'thing' that is a workqueue that may or may not have any CPU affinity";
that's for a later commit.

Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 04:18:20 +00:00
Adrian Chadd
72d33245f5 Implement IPv4 RSS software hash functions to use during packet ingress
and egress.

* rss_mbuf_software_hash_v4 - look at the IPv4 mbuf to fetch the IPv4 details
  + direction to calculate a hash.
* rss_proto_software_hash_v4 - hash the given source/destination IPv4 address,
  port and direction.
* rss_soft_m2cpuid - map the given mbuf to an RSS CPU ("bucket" for now)

These functions are intended to be used by the stack to support
the following:

* Not all NICs do RSS hashing, so we should support some way of doing
  a hash in software;
* The NIC / driver may not hash frames the way we want (eg UDP 4-tuple
  hashing when the stack is only doing 2-tuple hashing for UDP); so we
  may need to re-hash frames;
* .. same with IPv4 fragments - they will need to be re-hashed after
  reassembly;
* .. and same with things like IP tunneling and such;
* The transmit path for things like UDP, RAW and ICMP don't currently
  have any RSS information attached to them - so they'll need an
  RSS calculation performed before transmit.

TODO:

* Counters! Everywhere!
* Add a debug mode that software hashes received frames and compares them
  to the hardware hash provided by the hardware to ensure they match.

The IPv6 part of this is missing - I'm going to do some re-juggling of
where various parts of the RSS framework live before I add the IPv6
code (read: the IPv6 code is going to go into netinet6/in6_rss.[ch],
rather than living here.)

Note: This API is still fluid.  Please keep that in mind.

Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 03:10:21 +00:00
Adrian Chadd
9d3ddf4384 Add support for receiving and setting flowtype, flowid and RSS bucket
information as part of recvmsg().

This is primarily used for debugging/verification of the various
processing paths in the IP, PCB and driver layers.

Unfortunately the current implementation of the control message path
results in a ~10% or so drop in UDP frame throughput when it's used.

Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 01:45:39 +00:00
Adrian Chadd
b174de323a Add IP_NODEFAULTFLOWID awareness to ip6_output().
Differential Revision:	https://reviews.freebsd.org/D527
2014-09-09 00:21:21 +00:00
Adrian Chadd
061a4b4c36 Add a flag to ip_output() - IP_NODEFAULTFLOWID - which prevents it from
overriding an existing flowid/flowtype field in the outbound mbuf with
the inp_flowid/inp_flowtype details.

The upcoming RSS UDP support calculates a valid RSS value for outbound
mbufs and since it may change per send, it doesn't cache it in the inpcb.
So overriding it here would be wrong.

Differential Revision:	https://reviews.freebsd.org/D527
Reviewed by:	grehan
2014-09-09 00:19:02 +00:00
Christian Brueffer
380064c8dc Use the right constants in comparisons. This is currently a nop, as
MIN_RXD == MIN_TXD and MAX_RXD == MAX_TXD.

Reviewed by:	Eric Joyner @ Intel
MFC after:	1 week
2014-09-08 19:24:25 +00:00
Ian Lepore
6966304042 Add a 'ubenv import' command to import environment variables from the
u-boot env into the loader(8) env (which also gets them into the kernel
env).  You can import selected variables or the whole environment.  Each
u-boot var=value becomes uboot.var=value in the loader env.  You can also
use 'ubenv show' to display uboot vars without importing them.
2014-09-08 19:19:10 +00:00
Alexander Motin
fcd7f38fb0 Bunch of microoptimizations to reduce dereferences and cache collisions. 2014-09-08 12:11:49 +00:00
Hiroki Sato
714266373c - Make hhook_run_socket() vnet-aware instead of adding CURVNET_SET() around
the function calls.
- Fix a memory leak and stats in the case that hhook_run_socket() fails
  in soalloc().

PR:	193265
2014-09-08 09:04:22 +00:00
Jean-Sébastien Pédron
ffea80b445 pause_sbt(): Take the cold path (ie. use DELAY()) if KDB is active
This fixes a panic in the i915 driver when one uses debug.kdb.enter=1
under vt(4).

PR:		193269
Reported by:	emaste@
Submitted by:	avg@
MFC after:	3 days
2014-09-08 08:44:50 +00:00
Bjoern A. Zeeb
afa0a6efe0 Use __DECONST to avoid compiler warnings (and thus build failures)
with gcc on sparc64, mips, and powerpc after r271173.
2014-09-08 08:12:09 +00:00
Jean-Sébastien Pédron
313ef9368f vt(4): Change the terminal and buffer sizes, even without a font
This fixes a bug where scroll lock would not work for tty #0 when using
vt_vga's textmode. The reason was that this window is created with a
static 256x100 buffer, larger than the real size of 80x25.

Now, in vt_change_font() and vt_compute_drawable_area(), we still
perform operations even of the window has no font loaded (this is the
case in textmode here vw->vw_font == NULL). One of these operation
resizes the buffer accordingly.

In vt_compute_drawable_area(), we take the terminal size as is (ie.
80x25) for the drawable area.

The font argument to vt_set_border() is removed (it was never used) and
the code now uses the computed drawable area instead of re-doing its own
calculation.

Reported by:	Harald Schmalzbauer <h.schmalzbauer_omnilan.de>
Tested by:	Harald Schmalzbauer <h.schmalzbauer_omnilan.de>
MFC after:	3 days
2014-09-08 07:37:03 +00:00
Adrian Chadd
9dd099b274 Implement htprotmode handling.
This is separate to 11g protection - the default is to RTS protect
11n frames, including A-MPDU frames.

Tested:

* Intel 5100, STA mode
2014-09-08 07:16:00 +00:00
Adrian Chadd
339503ab41 (more) correctly account TX completion status for A-MPDU session frames.
The rules turn out to be:

* for non-aggregation session TX queues - it's either sent or not sent.
* for aggregation session TX queues - if nframes=1, then the status reflects
  the completed transmission.
* however, for nframes > 1, then this is just a status reflecting what
  the initial transmission did.  The compressed BA (immediate or delayed)
  may not have yet been received, so the actual frame status is in the
  compressed BA updates.

Whilst here, I fiddled with debugging and formatting a bit.

There's also RTS attempts (what the atheros chips call "short retries")
which weren't being logged and they aren't yet being used in the rate
control statistics updates.  For now, at least log them.

TODO:

* This still isn't 100% correct! So I have to tinker with this some more.
  (The failures aren't always failures..)
* Extend the rate control API in net80211 so it can take both short and
  long retry counts.

Tested:

* Intel 5100, STA mode
2014-09-08 03:16:28 +00:00
Adrian Chadd
f7efe7e999 Bring over some more status codes from the Linux iwlwifi driver.
The (eventual) intention is to create MIB counters for transmitted
frame completion to count how many packets with each status are
transmitted.

Note the difference between A-MPDU and non A-MPDU status.

Obtained from:	Linux iwlwifi/dvm driver
2014-09-08 03:12:42 +00:00
Glen Barber
1bba7a88fc Silence a bmake(1) warning in the gif(4) module build
when built with WITHOUT_INET6.

LGTM: 		sbruno
Sponsored by:	The FreeBSD Foundation
2014-09-08 02:37:45 +00:00
Alan Cox
0afcd3af8b Oops. vm_map_simplify_entry() is used by mac_proc_vm_revoke_recurse(), so
it can't be static.
2014-09-08 02:25:01 +00:00
Alan Cox
077ec27cd6 Make two functions static and eliminate an unused #define. 2014-09-08 00:19:03 +00:00
Andrew Turner
89fd02846e When entering the kernel with the MMU off assume we are running from a
va == pa map.

I'm not sure the code would work if we are not running from the identity
map as the ARM core may attempt to read the next instruction from an
invalid memory location.
2014-09-07 21:46:54 +00:00
Sean Bruno
082afa63f6 Remove redundant kern conf entries that are inherited via include
Move cfg to spi1 as that's where it is located from the factory
2014-09-07 20:27:48 +00:00
Andrew Turner
d5294df7d0 Remove Lvirtaddr and Lphysaddr, these don't appear to be used. 2014-09-07 19:33:38 +00:00
Andrew Turner
d26c7c862b Generalise the va to pa code and use it when starting secondary cores
Reviewed by:	ian@, rpaulo@
Differential Revision: https://reviews.freebsd.org/D736
2014-09-07 18:32:42 +00:00
Michael Tuexen
ad234e3c3d Address warnings generated by the clang analyzer.
MFC after: 1 week
2014-09-07 18:05:37 +00:00
Michael Tuexen
23602b60fb Address another warnings reported by Patrick Laimbock when compiling
in userspace. While there, improve consistency.

MFC after: 1 week
2014-09-07 17:07:19 +00:00
Xin LI
817d804595 MFV r271223:
In dnode_sync(), do dnode_increase_indirection() before processing
the dn_next_nblkptr.

Illumos issue:
    5117 space map reallocation can cause corruption

MFC after:	3 days
2014-09-07 13:13:42 +00:00
Michael Tuexen
24aaac8d59 Use union sctp_sockstore instead of struct sockaddr_storage. This
eliminiates some warnings when building in userland.
Thanks to Patrick Laimbock for reporting this issue.
Remove also some unnecessary casts.
There should be no functional change.

MFC after: 1 week
2014-09-07 09:06:26 +00:00
Andrew Turner
e72deda6c8 Create a common i.MX53 config and use it with the two existing i.MX53
boards.

This is just intended to split the common config entries out, further
cleanup is expected.

Reviewed by:	ian@, rpaulo@ (earlier version)
Differential Revision: https://reviews.freebsd.org/D731
2014-09-07 08:16:27 +00:00
Michael Tuexen
95e550801c Use SYSCTL_PROC instead of SYSCTL_VNET_PROC.
Suggested by: glebius@
MFC after: 1 week
2014-09-07 07:49:49 +00:00
Hans Petter Selasky
113336eec2 Update mixer description for FastTrackPro.
MFC after:	3 days
2014-09-07 07:23:33 +00:00
Gleb Smirnoff
0abfc519b4 style(9) 2014-09-07 05:47:48 +00:00
Gleb Smirnoff
9e739a5a05 Fix for r271182.
Submitted by:	mjg
Pointy hat to:	me, submitter and everyone who urged me to commit
2014-09-07 05:44:14 +00:00
Adrian Chadd
89a4f450d7 Implement local sfbuf_map and sfbuf_unmap for MIPS32.
The pre-rework behaviour was not to keep the cached mappings around after
the sfbuf was used but instead to recycle said mappings.

PR:		kern/193400
2014-09-06 22:38:32 +00:00
Michael Tuexen
24110da033 Fix a leak of an address, if the address is scheduled for removal
and the stack is torn down.
Thanks to Peter Bostroem and Jiayang Liu from Google for reporting the
issue.

MFC after: 1 week
2014-09-06 20:03:24 +00:00
Konstantin Belousov
27d21b9e9a Add a define for index of IA32_XSS MSR, which is, per SDM rev. 50, an
analog of XCR0 for ring 0 FPU state, used by XSAVES and XRSTORS.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-09-06 19:47:37 +00:00
Alexander Motin
c45ff92177 Save one register read (AHCI_IS) for AHCI controllers with only one port.
For controllers with only one port (like PCIe or M.2 SSDs) interrupt can
come from only one source, and skipping read saves few percents of CPU time.

MFC after:	1 month
H/W donated by:	I/O Switch
2014-09-06 19:43:48 +00:00
Konstantin Belousov
f3f509767a SDM rev. 50 defines the use of the next 8 bytes in the xstate header.
It is the compaction bitmask, with the highest bit defining if compact
format of the xsave area is used at all.

Adjust the definition of struct xstate_hdr, provide define for bit 63.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-09-06 19:39:12 +00:00
Michael Tuexen
f47f328dc5 Fix the handling of sysctl variables when used with VIMAGE.
While there do some cleanup of the code.

MFC after: 1 week
2014-09-06 19:12:14 +00:00
Ian Lepore
26511eb02e When registering an association between a device and an xref phandle, create
an entry in the xref list if one doesn't already exist for the given handle.

On a system that uses phandle properties, the init-time scan of the tree
which builds the xref list will pre-create entries for every xref handle
that exists in the data.  On systems where the xref and node handles are
synonymous there is no phandle property in referenced nodes, and the xref
list will initialize to an empty state.  In the latter case, we still need
to be able to associate a device_t with an xref handle, so we create list
entries on the fly as needed.  Since the node and xref handles are
synonymous, we have all the info needed to create a list entry at device
registration time.

The downside to this change is that it basically allows on the fly creation
of xref handles as synonyms of node handles, and the association of a
device_t with them.  Whether this is a bug or a feature is in the eye of
the beholder, I guess.
2014-09-06 18:43:17 +00:00
Warner Losh
4acab041dd Restore order of interrupt setup. Minor problems can result by
setting up the interrupts too early:

Reviewed by: mav@
Sponsored by: Netflix
2014-09-06 18:20:50 +00:00
Ruslan Bukin
8aabb9719d o Remove __unused attribute on variables which actually used
o Unmagic 'configuration done' bit
o Move probe() to place before attach() for better navigation
o Use bus_read_n instead of bus_space_read_n functions

Pointed out by:	andrew
Sponsored by:	DARPA, AFRL
2014-09-06 18:08:21 +00:00
Ian Lepore
f021180bfb Revert rr271190, it was based on a misunderstanding. The problem of
non-existant device<->xref info needs to be handled by creating the info,
which will come in a subsequent commit.
2014-09-06 17:50:59 +00:00
Andrew Turner
f622737b2e Fixthe spelling of ehci 2014-09-06 17:33:41 +00:00
Konstantin Belousov
dc7c2b07da Add more bits for the XSAVE features from CPUID 0xd, sub-function 1
%eax report.

Print the XSAVE features 0xd/1 in the boot banner.  The printcpuinfo()
is executed late enough so that XSAVE is already enabled.

There is no known to me off the shelf hardware that implements any
feature bits except XSAVEOPT, the list is taken from SDM rev. 50.  The
banner printing will allow us to note the hardware arrival.

Sponsored by:	    The FreeBSD Foundation
MFC after:	    1 week
2014-09-06 15:45:45 +00:00
Alexander Motin
985da6db1c Fix typo in comments.
Submitted by:	Benedict Reuschling <bcr@FreeBSD.org>
MFC after:	6 days
2014-09-06 15:37:55 +00:00
John Baldwin
b1d735ba4c Create a separate structure for per-CPU state saved across suspend and
resume that is a superset of a pcb.  Move the FPU state out of the pcb and
into this new structure.  As part of this, move the FPU resume code on
amd64 into a C function.  This allows resumectx() to still operate only on
a pcb and more closely mirrors the i386 code.

Reviewed by:	kib (earlier version)
2014-09-06 15:23:28 +00:00
Ian Lepore
00eea22f11 Add OF_xref_from_node_strict() which returns -1 if there is no xref handle
for the node.  The default routine returns the untranslated handle, which
is sometimes useful, but sometimes you really need to know there's no
entry in the xref<->node<->device translation table.
2014-09-06 15:11:35 +00:00
Andrew Turner
ae46df0eab Allow us to use the virtual timer. It is currently disabled, but should
be usable as the default timer in place of the physical timer.

We are guaranteed to have access to the virtual timer, but when running
under a hypervisor may not have access to the physical.

Differential Revision: https://reviews.freebsd.org/D588
2014-09-06 13:21:07 +00:00
Ruslan Bukin
34a0189341 Add FPGA Manager driver. This driver allows to program FPGA core
from FreeBSD userspace running on ARM core.

Sponsored by:	DARPA, AFRL
2014-09-06 08:48:57 +00:00