Commit Graph

190824 Commits

Author SHA1 Message Date
Adrian Chadd
7063e348ab Add initial RSS awareness to the ixgbe(4) driver.
The ixgbe(4) hardware is capable of RSS hashing RX packets and doing RSS
queue selection for up to 8 queues.

However, even if multi-queue is enabled for ixgbe(4), the RX path doesn't use
the RSS flowid from the received descriptor.  It just uses the MSIX queue id.

This patch does a handful of things if RSS is enabled:

* Instead of using a random key at boot, fetch the RSS key from the RSS code
  and program that in to the RSS redirection table.

  That whole chunk of code should be double checked for endian correctness.

* Use the RSS queue mapping to CPU ID to figure out where to thread pin
  the RX swi thread and the taskqueue threads for each queue.

* The software queue is now really an "RSS bucket".

* When programming the RSS indirection table, use the RSS code to
  figure out which RSS bucket each slot in the indirection table maps
  to.

* When transmitting, use the flowid RSS mapping if the mbuf has
  an RSS aware hash.  The existing method wasn't guaranteed to align
  correctly with the destination RSS bucket (and thus CPU ID.)

This code warns if the number of RSS buckets isn't the same as the
automatically configured number of hardware queues.  The administrator
will have to tweak one of them for better performance.

There's currently no way to re-balance the RSS indirection table after
startup.  I'll worry about that later.

Additionally, it may be worthwhile to always use the full 32 bit flowid if
multi-queue is enabled.  It'll make things like lagg(4) behave better with
respect to traffic distribution.
2014-06-30 04:38:29 +00:00
Adrian Chadd
1d72a9bea9 Add initial RSS awareness to the igb(4) driver.
The igb(4) hardware is capable of RSS hashing RX packets and doing RSS
queue selection for up to 8 queues.  (I believe some hardware is limited
to 4 queues, but I haven't tested on that.)

However, even if multi-queue is enabled for igb(4), the RX path doesn't use
the RSS flowid from the received descriptor.  It just uses the MSIX queue id.

This patch does a handful of things if RSS is enabled:

* Instead of using a random key at boot, fetch the RSS key from the RSS code
  and program that in to the RSS redirection table.

  That whole chunk of code should be double checked for endian correctness.

* Use the RSS queue mapping to CPU ID to figure out where to thread pin
  the RX swi thread and the taskqueue threads for each queue.

* The software queue is now really an "RSS bucket".

* When programming the RSS indirection table, use the RSS code to
  figure out which RSS bucket each slot in the indirection table maps
  to.

* When transmitting, use the flowid RSS mapping if the mbuf has
  an RSS aware hash.  The existing method wasn't guaranteed to align
  correctly with the destination RSS bucket (and thus CPU ID.)

This code warns if the number of RSS buckets isn't the same as the
automatically configured number of hardware queues.  The administrator
will have to tweak one of them for better performance.

There's currently no way to re-balance the RSS indirection table after
startup.  I'll worry about that later.

Additionally, it may be worthwhile to always use the full 32 bit flowid if
multi-queue is enabled.  It'll make things like lagg(4) behave better with
respect to traffic distribution.
2014-06-30 04:34:59 +00:00
Adrian Chadd
8f7e75cbbd If we're doing RSS then ensure the TCP timer selection uses the multi-CPU
callwheel setup, rather than just dumping all the timers on swi0.
2014-06-30 04:26:29 +00:00
Adrian Chadd
c445c3c7f6 If we're doing RSS then ensure that the callwheel swi's are CPU pinned. 2014-06-30 04:25:51 +00:00
Scott Long
3da2a91a57 In rare cases, a SATA drive can stop responding to commands and trigger a
reset device task request from the driver.  If the drive fails to respond
with a signature FIS, the driver would previously get into an endless retry
loop, stalling all I/O to the drive and keeping user processes stranded.
Instead, fail the i/o and invalidate the device if the task management
command times out.  This is controllable with the sysctl and tunable
hw.isci.fail_on_task_timeout
dev.isci.0.fail_on_task_timeout

The default for these is 1.

Reviewed by:	jimharris
Obtained from:	Netflix, Inc.
MFC after:	2 days
2014-06-30 01:01:54 +00:00
Scott Long
58cf99d20d Fix a case in ndling ATA_PASSTHROUGH commands that have an unaligned buffer.
This impacts some home-rolled SMART tools.

Reviewed by:	jimharris
Obtained from:	Netflix
MFC after:	2 days
2014-06-30 00:41:46 +00:00
Ed Maste
d02efe3ffe Regenerate after r268022
MFC after:	1 week
2014-06-30 00:22:41 +00:00
Ed Maste
824a909300 Rename the WITHOUT_VT_SUPPORT knob to WITHOUT_VT
The _SUPPORT knobs have a consistent meaning which differs from the
behaviour controlled by this knob.  As the knob is opt-out and has not
appeared in a release the impact should be low.

Suggested by:	imp, wblock
MFC after:	1 week
2014-06-30 00:20:12 +00:00
Sean Bruno
0fb31ed073 Add detection for ciss(4) controllers that are set to non-raid JBOD mode.
If a controller is set to JBOD, it has no RAID functions turned on.

Populate even more of the firmware specification headers, copied from
cciss_vol_status.

Reviewed by:	Benesh, Scott <scott.benesh@hp.com>
MFC after:	2 weeks
2014-06-29 18:53:15 +00:00
Sean Bruno
97ed09b5ff Check return of cam_periph_find() before using it in a printf.
If cam_periph_find() doesn't locate the path we requested, bail to error
condition.

Acquire ciss->mtx for this operation.

Reviewed by:	"Benesh, Scott" <scott.benesh@hp.com>
MFC after:	2 weeks
2014-06-29 18:38:44 +00:00
Pedro F. Giffuni
0135aadfc3 Reduce some warnings in the Solaris unicode support.
Clean some warnings from parenthesis and minor style issues.

MFC after:	3 days
2014-06-29 02:28:05 +00:00
Bryan Venteicher
efd48b30f1 Give each interrupt a descriptive name when using MSIX
MFC after:	3 days
2014-06-29 01:04:11 +00:00
Rick Macklem
2d5f835917 There might be a potential race condition for the NFSv4 client
when a newly created file has another open done on it that
update the open mode. This patch moves the code that updates
the open mode up into the block where the mutex is held to
ensure this cannot happen. No bug caused by this potential
race has been observed, but this fix is a safety belt to ensure
it cannot happen.

MFC after:	2 weeks
2014-06-28 21:47:15 +00:00
Pedro F. Giffuni
f34dd28f7d Revert r267869:
MFV	r260708
4427 pid provider rejects probes with valid UTF-8 names

Use of u8_textprep.c broke the build on powerpc.

Reported by:	bz, rpaulo and tinderbox.
Pointyhat:	me
2014-06-28 19:59:12 +00:00
Rui Paulo
57c2423017 Move the -I of common/util to the proper place to fix the powerpc build.
MFC after:	2 weeks
2014-06-28 18:53:02 +00:00
Hans Petter Selasky
4813ad54f8 Compile fixes:
Remove duplicate "debug_ktr.mask" sysctl definition.
Remove now unused variable from "kern_ktr.c".
This fixes build of "ktr" which was broken by r267961.

Let the default value for "vm_kmem_size_scale" be zero. It is setup
after that the sysctl has been initialized from "getenv()" in the
"kmeminit()" function to equal the "VM_KMEM_SIZE_MAX" value, if
zero. On Sparc64 the "VM_KMEM_SIZE_MAX" macro is not a constant. This
fixes build of Sparc64 which was broken by r267961.

Add a special macro to dynamically create SYSCTL root nodes, because
root nodes have a special parent. This fixes build of existing OFED
module and CANBUS module for pc98 which was broken by r267961.

Add missing "sysctl.h" includes to get the needed sysctl header file
declarations. This is needed after r267961.

MFC after:	2 weeks
2014-06-28 17:36:18 +00:00
David Malone
a88289fcb9 Don't accidently skip every second line when calculating the
idle time.

MFC after:	2 weeks
2014-06-28 15:53:28 +00:00
Dimitry Andric
08e09c6e13 Fix breakage after r267981.
Pointy hat to:	dim
MFC after:	3 days
X-MFC-With:	r267981
2014-06-28 09:53:44 +00:00
Mateusz Guzik
b0bc0cadbe Call fdcloseexec right after fdunshare.
No functional changes.

MFC after:	1 week
2014-06-28 05:51:45 +00:00
Mateusz Guzik
b9d32c36fa Make fdunshare accept only td parameter.
Proc had to match the thread anyway and 2 parameters were inconsistent
with the rest.

MFC after:	1 week
2014-06-28 05:41:53 +00:00
Mateusz Guzik
35778d7aa9 Make sure to always clear p_fd for process getting rid of its filetable.
Filetable can be shared with other processes. Previous code failed to
clear the pointer for all but the last process getting rid of the table.
This is mostly cosmetics.

Get rid of 'This should happen earlier' comment. Clearing the pointer in
this place is fine as consumers can reliably check for files availability
by inspecting fd_refcnt and vnodes availabity by NULL-checking them.

MFC after:	1 week
2014-06-28 05:18:03 +00:00
Hans Petter Selasky
6a3287f889 Fix regression issue after r267961. Handle special string case for
SYSCTLs like previously.

MFC after:	2 weeks
Reported by:	several people
2014-06-28 03:59:04 +00:00
Hans Petter Selasky
af3b2549c4 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
Gavin Atkinson
b152235544 Minimal update for cvsup -> svn change. 2014-06-28 00:01:18 +00:00
Rui Paulo
63083de0a6 Redefine SUNW based on SYSDIR in an attempt to fix a build problem.
MFC after:	2 weeks
2014-06-27 22:38:42 +00:00
Alexander Motin
acee7463b6 Remove odd practice of inverting error codes.
-EPERM is equal to ERESTART, returning which from ioctl() handler causes
infinite syscall restart.

MFC after:	2 weeks
2014-06-27 22:28:14 +00:00
Glen Barber
37a107a407 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
Xin LI
d2f1b8f4d2 Use Intel's official name (Secure Key) per Intel® Digital Random Number
Generator (DRNG) Software Implementation Guide.

Reviewed by:	kib
Approved by:	so
MFC after:	2 weeks
2014-06-27 21:33:15 +00:00
Dimitry Andric
0506f5060d Add the llvm patch for r267981. 2014-06-27 20:45:17 +00:00
Dimitry Andric
96b9c77676 Pull in r211627 from upstream llvm trunk (by Bill Schmidt):
[PPC64] Fix PR20071 (fctiduz generated for targets lacking that
  instruction)

  PR20071 identifies a problem in PowerPC's fast-isel implementation
  for floating-point conversion to integer.  The fctiduz instruction
  was added in Power ISA 2.06 (i.e., Power7 and later).  However, this
  instruction is being generated regardless of which 64-bit PowerPC
  target is selected.

  The intent is for fast-isel to punt to DAG selection when this
  instruction is not available.  This patch implements that change.
  For testing purposes, the existing fast-isel-conversion.ll test adds
  a RUN line for -mcpu=970 and tests for the expected code generation.
  Additionally, the existing test fast-isel-conversion-p5.ll was found
  to be incorrectly expecting the unavailable instruction to be
  generated.  I've removed these test variants since we have adequate
  coverage in fast-isel-conversion.ll.

This is needed to compile clang with debug+asserts on older powerpc64
and ppc970 targets.

Requested by:	jhibbits
MFC after:	3 days
2014-06-27 20:41:12 +00:00
Marius Strobl
7344ee184b In order to get vt(4) a bit closer to the feature set provided by sc(4),
implement options TERMINAL_{KERN,NORM}_ATTR. These are aliased to
SC_{KERNEL_CONS,NORM}_ATTR and like these latter, allow to change the
default colors of normal and kernel text respectively.
Note on the naming: Although affecting the output of vt(4), technically
kern/subr_terminal.c is primarily concerned with changing default colors
so it would be inconsistent to term these options VT_{KERN,NORM}_ATTR.
Actually, if the architecture and abstraction of terminal+teken+vt would
be perfect, dev/vt/* wouldn't be touched by this commit at all.

Reviewed by:	emaste
MFC after:	3 days
Sponsored by:	Bally Wulff Games & Entertainment GmbH
2014-06-27 19:57:57 +00:00
Xin LI
e6683f19dd Always set UF_ARCHIVE on target (because they are by definition new files
and should be archived) and ignore error when we can't set it (e.g. NFS).

Reviewed by:	ken
MFC after:	2 weeks
2014-06-27 19:57:54 +00:00
Ed Maste
6ac6c9d5f4 Add CTLFLAG_NOFETCH flag; console vty code runs before tunable fetch
Also remove redundant "" assignment for string in BSS.

Submitted by:	hselasky@
2014-06-27 19:07:35 +00:00
Adrian Chadd
dc847eb656 Add missing variable declarations when using RSS.
Reported by: bryanv@
2014-06-27 19:07:00 +00:00
Luiz Otavio O Souza
646707db80 Simplify the code a little bit using the update_sensor_sysctl() routine to
retrieve the sensor temperature.

This also avoid the overflow that could happen on sysctlnametomib(3)
because the code was not checking the length of the mib array.

CID:		1222504
2014-06-27 18:58:22 +00:00
Mateusz Guzik
75ad9daa46 pw: fix up deletion of users from groups
Previuosly given 'foo,bar' members, removing 'foo' would result in an
infinite loop.

PR:		191427
Submitted by:	Voradesh Yenbut <yenbut cs.washington.edu>
MFC after:	1 week
2014-06-27 18:51:19 +00:00
Luiz Otavio O Souza
c04cdb5738 Correct the buffer length check to avoid overflows.
Found with:	Coverity Scan
CID:		1222502, 1222503
2014-06-27 18:40:14 +00:00
Joel Dahl
3a8794be33 Minor mdoc fix. 2014-06-27 18:32:20 +00:00
Marius Strobl
a102d82271 - SC_NO_SYSMOUSE isn't currently supported by vt(4), so nuke it from vt.4.
- vt_vga(4) is a driver rather than a function so reference it accordingly.
- Uncomment HISTORY section given that vt(4) will first appear in 9.3.

Reviewed by:	emaste (modulo last part)
MFC after:	3 days
Sponsored by:	Bally Wulff Games & Entertainment GmbH
2014-06-27 18:24:20 +00:00
Neel Natu
64fe72354c Add post-mortem debugging for "EPT Misconfiguration" VM-exit. This error
is hard to reproduce so try to collect all the breadcrumbs when it happens.

Reviewed by:	grehan
2014-06-27 18:00:38 +00:00
Ed Maste
59644098f8 Use a common tunable to choose between vt(4)/sc(4)
With this change and previous work from ray@ it will be possible to put
both in GENERIC, and have one enabled by default, but allow the other to
be selected via the loader.

(The previous implementation had separate kern.vt.disable and
hw.syscons.disable tunables, and would panic if both drivers were
compiled in and neither was explicitly disabled.)

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2014-06-27 17:50:33 +00:00
Neel Natu
874a6e051b After r267897 brought in a new version of file/libmagic, a filesystem image
is identified as "DOS/MBR boot sector" as opposed to "x86 boot sector".

This trips up vmrun.sh when using the new file(1) and makes it want to boot
into the installer instead.

Fix this by just looking for "boot sector" instead.
2014-06-27 17:18:54 +00:00
Hans Petter Selasky
3da1cf1e88 Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
Hans Petter Selasky
04006eabea Don't hide zero-length strings when doing sysctl listings.
MFC after:	1 week
2014-06-27 15:23:12 +00:00
John Baldwin
cde1f5b8a0 Sort command flags in usage output and the manpages. 2014-06-27 15:20:34 +00:00
Hans Petter Selasky
561227da30 Add proper rangechecks in "axge_rx_frame()" function and
fix receive loop header parsing.

MFC after:	3 days
PR:		191432
2014-06-27 10:24:36 +00:00
Alexander Motin
1d6f7db544 Fix typo in r267481.
MFC after:	3 days
2014-06-27 06:52:37 +00:00
Peter Grehan
62f17e92fe Set the version and date to fixed fields rather than using
preprocessor macros that don't allow reproducible builds.
As a side-effect, the date string is now spec-compliant.

root@bhyve:~ # dmidecode
# dmidecode 2.12
SMBIOS 2.4 present.
12 structures occupying 514 bytes.
Table at 0x000F101F.

Handle 0x0001, DMI type 0, 24 bytes
BIOS Information
        Vendor: BHYVE
        Version: 1.0
        Release Date: 03/14/2014

Submitted by:	des (original version)
Reviewed by:	tychon
MFC after:	1 week
2014-06-27 05:27:37 +00:00
Mateusz Guzik
de966666a2 Check lower bound of cmsg_len.
If passed cm->cmsg_len was below cmsghdr size the experssion:
datalen = (caddr_t)cm + cm->cmsg_len - (caddr_t)data;

would give negative result. However, in practice it would not
result in a crash because the kernel would try to obtain garbage fds
for given process and would error out with EBADF.

PR:		124908
Submitted by:	campbell mumble.net (modified a little)
MFC after:	1 week
2014-06-27 05:04:36 +00:00
Rui Paulo
a43f0be9fe MFV illumos
4471 DTrace count() with histogram
4472 DTrace full width distribution histograms
4473 DTrace frequency trails

MFC after:	2 weeks
2014-06-26 23:24:59 +00:00