Commit Graph

6951 Commits

Author SHA1 Message Date
Mark Murray
e866d8f05b Make the UMA harvesting go away completely if not wanted. Default to "not wanted".
Provide and document the RANDOM_ENABLE_UMA option.

Change RANDOM_FAST to RANDOM_UMA to clarify the harvesting.

Remove RANDOM_DEBUG option, replace with SDT probes. These will be of
use to folks measuring the harvesting effect when deciding whether to
use RANDOM_ENABLE_UMA.

Requested by:	scottl and others.
Approved by:	so (/dev/random blanket)
Differential Revision:    https://reviews.freebsd.org/D3197
2015-08-22 12:59:05 +00:00
Justin Hibbits
6aabc119b6 Create a RouterBoard platform and use it to create a flash map
Summary:
The RouterBoard uses a predefined partition map which doesn't exist in the fdt.
This change allows overriding the fdt slicer with a custom slicer, and uses this
custom slicer to define the flash map on the RouterBoard RB800.
D3305 converts the mpc85xx platform into a base class, so that systems based on
the mpc85xx platform can add their own overrides.  This change builds on D3305,
and creates a RouterBoard (RB800) platform to initialize the slicer override.

Reviewed By: nwhitehorn, imp
Differential Revision: https://reviews.freebsd.org/D3345
2015-08-22 05:50:18 +00:00
Luiz Otavio O Souza
0a70aaf8f5 Add ALTQ(9) support for the CoDel algorithm.
CoDel is a parameterless queue discipline that handles variable bandwidth
and RTT.

It can be used as the single queue discipline on an interface or as a sub
discipline of existing queue disciplines such as PRIQ, CBQ, HFSC, FAIRQ.

Differential Revision:	https://reviews.freebsd.org/D3272
Reviewd by:	rpaulo, gnn (previous version)
Obtained from:	pfSense
Sponsored by:	Rubicon Communications (Netgate)
2015-08-21 22:02:22 +00:00
Michael Gmelin
12e413be27 Allow building a kernel with baked in ig4, isl and cyapa drivers.
Also addresses jhb's remarks on D2811 and D3068.

PR:		202059
Differential Revision:	https://reviews.freebsd.org/D3351
Reviewed by:	jhb
Approved by:	jhb
2015-08-19 09:49:29 +00:00
Luiz Otavio O Souza
3df058ffaf Add the GPIO driver for the ADI Engineering RCC-VE and RCC-DFF/DFFv2.
This driver allows read the software reset switch state and control the
status LEDs.

The GPIO pins have their direction (input/output) locked down to prevent
possible short circuits.

Note that most people get a reset button that is a hardware reset.  The
software reset button is available on boards from Netgate.

Sponsored by:	Rubicon Communications (Netgate)
2015-08-18 21:05:56 +00:00
Mark Murray
646041a89a Add DEV_RANDOM pseudo-option and use it to "include out" random(4)
if desired.

Retire randomdev_none.c and introduce random_infra.c for resident
infrastructure. Completely stub out random(4) calls in the "without
DEV_RANDOM" case.

Add RANDOM_LOADABLE option to allow loadable Yarrow/Fortuna/LocallyWritten
algorithm.  Add a skeleton "other" algorithm framework for folks
to add their own processing code. NIST, anyone?

Retire the RANDOM_DUMMY option.

Build modules for Yarrow, Fortuna and "other".

Use atomics for the live entropy rate-tracking.

Convert ints to bools for the 'seeded' logic.

Move _write() function from the algorithm-specific areas to randomdev.c

Get rid of reseed() function - it is unused.

Tidy up the opt_*.h includes.

Update documentation for random(4) modules.

Fix test program (reviewers, please leave this).

Differential Revision:    https://reviews.freebsd.org/D3354
Reviewed by:              wblock,delphij,jmg,bjk
Approved by:              so (/dev/random blanket)
2015-08-17 07:36:12 +00:00
Alexander Motin
67ceb24bca Move "ioctl" CAM frontend into separate file.
It has nothing to share with too huge ctl.c other then device descriptor,
but even that may be counted as design error that may be fixed later.
At some point we may even want to have several ioctl ports.
2015-08-15 15:42:21 +00:00
Alexander Motin
2f444d157b Drop "internal" CTL frontend.
Its idea was to be a simple initiator and execute several commands from
kernel level, but FreeBSD never had consumer for that functionality,
while its implementation polluted many unrelated places..
2015-08-15 13:34:38 +00:00
Rui Paulo
1558258bc4 sys/conf: pass NMFLAGS to nm(1) via genassym.sh. 2015-08-14 22:58:32 +00:00
Alexander Motin
e0360e14d2 MFV r277431: 5497 lock contention on arcs_mtx
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Richard Elling <richard.elling@richardelling.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Prakash Surya <prakash.surya@delphix.com>

illumos/illumos-gate@244781f10d

This patch attempts to reduce lock contention on the current arc_state_t
mutexes. These mutexes are used liberally to protect the number of LRU
lists within the ARC (e.g. ARC_mru, ARC_mfu, etc). The granularity at
which these locks are acquired has been shown to greatly affect the
performance of highly concurrent, cached workloads.
2015-08-14 09:31:07 +00:00
Marcel Moolenaar
cc787e3d0e Change md(4) to use weak symbols as start, end and size for the embedded
root disk. The embedded image is linked into the kernel in the .mfs
section.

Add rules and variables to kern.pre.mk and kern.post.mk that handle the
linking of the image. First objcopy is used to generate an object file.
Then, the object file is linked into the kernel.

Submitted by:	Steve Kiernan <stevek@juniper.net>
Reviewed by:	brooks@
Obtained from: Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D2903
2015-08-13 15:16:34 +00:00
Peter Wemm
6d57346a39 Add missing cddl/contrib/opensolaris/uts/common/fs/zfs/bqueue.c that
was left out of r286705.

Forgotten by:  mav
2015-08-13 05:42:56 +00:00
Marcel Moolenaar
7ef5e8bc80 Better support memory mapped console devices, such as VGA and EFI
frame buffers and memory mapped UARTs.

1.  Delay calling cninit() until after pmap_bootstrap(). This makes
    sure we have PMAP initialized enough to add translations. Keep
    kdb_init() after cninit() so that we have console when we need
    to break into the debugger on boot.
2.  Unfortunately, the ATPIC code had be moved as well so as to
    avoid a spurious trap #30. The reason for which is not known
    at this time.
3.  In pmap_mapdev_attr(), when we need to map a device prior to the
    VM system being initialized, use virtual_avail as the KVA to map
    the device at. In particular, avoid using the direct map on amd64
    because we can't demote by virtue of not being able to allocate
    yet. Keep track of the translation.
    Re-use the translation after the VM has been initialized to not
    waste KVA and to satisfy the assumption in uart(4) that the handle
    returned for the low-level console is the same as later returned
    when the device is probed and attached.
4.  In pmap_unmapdev() remove the mapping from the table when called
    pre-init. Otherwise keep the mapping. During bus probe and attach
    device resources are mapped and unmapped multiple times, which
    would have us destroy the mapping used by the low-level console.
5.  In pmap_init(), set pmap_initialized to signal that we're not
    pre-init anymore. On amd64, bring the direct map in sync with the
    translations created at that time.
6.  Implement bus_space_map() and bus_space_unmap() for real: when
    the tag corresponds to memory space, call the corresponding
    pmap_mapdev() and pmap_unmapdev() functions to construct and
    actual handle.
7.  In efifb.c and vt_vga.c, remove the crutches and hacks and simply
    call pmap_mapdev_attr() or bus_space_map() as desired.

Notes:
1.  uart(4) already used bus_space_map() during low-level console
    setup but since serial ports have traditionally been I/O port
    based, the lack of a proper implementation for said function
    was not a problem. It has always supported memory mapped UARTs
    for low-level consoles by setting hw.uart.console accordingly.
2.  The use of the direct map on amd64 without setting caching
    attributes has been a bigger problem than previously thought.
    This change has the fortunate (and unexpected) side-effect of
    fixing various EFI frame buffer problems (though not all).

PR: 191564, 194952

Special thanks to:
1.  XipLink, Inc -- generously donated an Intel Bay Trail E3800
    based eval board (ADLE3800PC).
2.  The FreeBSD Foundation, in particular emaste@ -- for UEFI
    support in general and testing.
3.  Everyone who tested the proposed for PR 191564.
4.  jhb@ and kib@ for being a soundboard and applying a clue bat
    if so needed.
2015-08-12 15:26:32 +00:00
Rui Paulo
97e2b41abf Fix a comment for iwm.
Submitted by:	brueffer
2015-08-10 16:32:47 +00:00
Conrad Meyer
051b71e564 files.amd64: Build ntb_hw.o if if_ntb OR ntb_hw
Approved by:	markj (mentor)
Sponsored by:	EMC / Isilon Storage Division
Differential Revision:	https://reviews.freebsd.org/D3349
2015-08-10 14:03:39 +00:00
Zbigniew Bodek
bfc3978594 Add support for external PCIe (PEM) on Cavium's ThunderX
Reviewed by:   jhb
Obtained from: Semihalf
Sponsored by:  The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3257
2015-08-08 21:32:03 +00:00
Rui Paulo
963b53ea7f sys/conf/files: add iwm and iwmfw. 2015-08-08 21:08:10 +00:00
Rui Paulo
cbe49cc415 Add nodevice iwmfw to WITHOUT_SOURCELESS_UCODE. 2015-08-08 20:45:47 +00:00
Rui Paulo
3d4e84fe5a sys/conf/options: add IWM_DEBUG. 2015-08-08 20:45:12 +00:00
Zbigniew Bodek
d943d79a36 Introduce support for internal PCIe for Cavium's ThunderX
This driver supports internal PCIe Root Complex on
Cavium ThunderX Pass 1.1 hardware.

Reviewed by:   andrew, jhb
Obtained from: Semihalf
Sponsored by:  The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3031
2015-08-08 20:34:55 +00:00
Navdeep Parhar
d94b89b915 cxgbe(4): Update T5 and T4 firmwares bundled with the driver to 1.14.4.0. The
changes in the firmwares since 1.11.27.0 are listed here (straight copy-paste
from the "Release Notes.txt" accompanying the Chelsio Unified Wire 2.11.1.0
release on the website).

22.1. T5 Firmware
+++++++++++++++++++++++++++++++++

Version : 1.14.4.0
Date    : 08/05/2015
================================================================================

FIXES
-----

BASE:
- Fixes a potential data path hang by properly programming PMTX congestion
  threshold settings.
- Fixes a potential initialization error when accessing a configuration file
  stored on the flash.
- Fixes a regression where SGE resources can be miss-sized if iWARP is disabled.

ETH:
- Fixes a timing issue that would prevent CR4 links from coming up with some
  switches.

FOFCoE:
- Defers fcoe linkdown mailbox command handling till LOGO is sent.
- Updates vlan prio for all outstanding IOs during dcbx update.

ENHANCEMENTS
------------

BASE:
- Adds support for PAUSE OFF watchdog.
- Reports devlog access information in PCIE_FW_PF register 7.

ETH:
- Enhances segmentation offload to include VxLAN and Geneve.
- Adds PTP support.
- Adds new interface to allow the driver to query the VI rss table base
  addresses.
- Allows the driver to program the SGE ingrext contxt CongDrop field.

OFLD:
- Adds new interface for the driver to specify offloaded connections TCP snd
  and rcv scale factors.

iSCSI:
- Adds support for iscsi segmentatation offload (ISO).
- Adds support for iscsi t10-dif offload.

FOiSCSI:
- Sets FORCE_BIT for cut through processing for FOiSCSI.

FOFCoE:
- Adds support for FCoE BB6.
- Improves WRITE performance.

================================================================================
================================================================================

Version : 1.13.32.0
Date    : 03/25/2015
================================================================================

FIXES
-----

BASE:
- Fixes FW_CAPS_CONFIG_CMD return value on error (was positive instead of
   negative)
- Fixes FW_PARAMS_PARAM_DEV_FLOWC_BUFFIFO_SZ indication (was wrong on certain
   adapter configurations)
- Fixes config file based PL_TIMEOUT register programming

ETH:
- Fixes a potential EO UDP SEG header corruption
- Fixes an issue where 1000Base-X was not enabled correctly when using QSA
   modules

OFLD:
- Fixes timeout issue with half-open connections
- Fixes FW_FLOWC_WR processing when state is set to finwait1

FOFCoE:
- Fixes fcoe xchg leaks in linkdown/peer down path
- Fixes cleanup in FCoE linkdown and fixed buf timer flowid abuse
- Fixes fw crash by clearing fcf flowc during bye

FOiSCSI:
- Don't create a new tcp socket if ERL0 attempt has timed out.

ENHANCEMENTS
------------

BASE:
- Adds support for VFs on PFs 4 to 7
- Adds support for QPs/CQs on any physical and virtual function

ETH:
- Stops sending LACP frames on loopback interface
- Adds an AUTOEQU indication to CPL_SGE_EGR_UPDATE
- Adds support for CR4 links (BEAN/AEC on 40G TwinAx cables)

OFLD:
- Improves default settings of LAN and CLUSTER TCP timer settings
- Sends Negative Advice CPLs to software

FOISCSI:
- Adds IPv6 support for foiscsi. Keeps backward compatibility with
   old foiscsi drivers which doesn't support ipv6.

FOFCoE:
- Added fcoe debug support in flowc dump

================================================================================
================================================================================

Version : 1.12.25.0
Date    : 10/22/2014
================================================================================

FIXES
-----

BASE:
- Improves precision of the Weight Round Robing Traffic Management Algorithm
- Fixes an issue where the link would intermittently fail to come up
- Fixes an issue where adapters with an external PHY couldn't run at 100Mbps
- Fixes an issue where active optical cables were not recognized
- Fixes link advertising issues on T520-BT (speed and pause frames) that would
  cause the link to negotiate unexpected settings
- Forces link restart when auto-negotiation is disabled
- Fix an issue where pause frames wouldn't be fully disabled even if requested

ETH:
- Fixes NVGRE Segmentation Offload network header generation.

DCBX:
- Fixes an issue where some settings were not being sent to the switch
  correctly
- Fixes an issue where back-to-back DCBX port updates could get overwritten by
  FW
- Fixes a firmware crash on DCBX APP information request before link up

FOiSCSI:
- Fixes abort task leak in tmf response handling
- Fixes TCP RST handling while in iSCSI ERL0
- Fixes a firmware crash on BYE without INIT

ENHANCEMENTS
-------------

BASE:
- Adds link partner settings reporting when available
- Adds QSA support (in conjunction with QSA VPD)
- Adds T520-BT LED support
- Reports NOTSUPPORTED for modules with an unhandled identifier

DCBX:
- Adds version reporting (indicating which version FW is trying to negotiate)
- Adds IEEE support
- Reports LLDP time outs

FOiSCSI:
- Add support for multiple iSCSI DDP client
- Sends DHCP renew request when lease expires

================================================================================

22.2. T4 Firmware
+++++++++++++++++

Version : 1.14.4.0
Date    : 08/05/2015
================================================================================

FIXES
-----

BASE:
- Fixes a potential initialization error when accessing a configuration file
  stored on the flash.
- Initialize PCIE_DBG_INDIR_REQ.Enable to 0, as hardware failed to do so and
  register dumps could result in errors.

ETH:
- Fixes an issue that sometimes prevented the link from coming up in CR adapters.

ENHANCEMENTS
------------

BASE:
- Adds support for PAUSE OFF watchdog.
- Reports devlog access information in PCIE_FW_PF register 7.

ETH:
- Adds new interface to allow the driver to query the VI rss table base
  addresses.

OFLD:
- Adds new interface for the driver to specify offloaded connections TCP snd
  and rcv scale factors.

================================================================================
================================================================================

Version : 1.13.32.0
Date    : 03/25/2015
================================================================================

FIXES
-----

BASE:
- Fixes FW_CAPS_CONFIG_CMD return value on error (was positive instead of
    negative)
- Fixes FW_PARAMS_PARAM_DEV_FLOWC_BUFFIFO_SZ indication (was wrong on certain
    adapter configurations)
- Fixes config file based PL_TIMEOUT register programming

ETH:
- Fixes a potential EO UDP SEG header corruption

OFLD:
- Fixes timeout issue with half-open connections
- Fixes FW_FLOWC_WR processing when state is set to finwait1

FOiSCSI:
- Don't create a new tcp socket if ERL0 attempt has timed out.

ENHANCEMENTS
------------

ETH:
- Stops sending LACP frames on loopback interface
- Adds an AUTOEQU indication to CPL_SGE_EGR_UPDATE

OFLD:
- Improves default settings of LAN and CLUSTER TCP timer settings
- Sends Negative Advice CPLs to software

================================================================================
================================================================================

Version : 1.12.25.0
Date    : 10/22/2014
================================================================================

FIXES
-----

BASE:
- Improves precision of the Weight Round Robing Traffic Management Algorithm
- Forces link restart when auto-negotiation is disabled
- Fix an issue where pause frames wouldn't be fully disabled even if requested

DCBX:
- Fixes an issue where some settings were not being sent to the switch
  correctly
- Fixes an issue where back-to-back DCBX port updates could get overwritten by
  FW
- Fixes a firmware crash on DCBX APP information request before link up

FOiSCSI:
- Fixes abort task leak in tmf response handling
- Fixes TCP RST handling while in iSCSI ERL0
- Fixes a firmware crash on BYE without INIT

ENHANCEMENTS
------------

BASE:
- Adds link partner settings reporting when available
- Firmware now reports NOTSUPPORTED for modules with an unhandled identifier

DCBX:
- Adds version reporting (indicating which version FW is trying to negotiate)
- Adds IEEE support
- Reports LLDP time outs

FOiSCSI:
- Adds support for multiple iSCSI DDP clients
- Sends DHCP renew request when lease expires

================================================================================

Obtained from:	Chelsio Communications
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2015-08-05 19:45:11 +00:00
Andrew Turner
176739d3f5 Load the stack in stack_save and stack_save_td. This uses the generalised
unwind_frame function to read each stack frame until either the pc or stack
are no longer withing the kernel's address space.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2015-07-31 15:32:32 +00:00
Andrew Turner
36baf858c9 Add support for uma_small_alloc and uma_small_free, and make use of these.
This is copied from the amd64 version with minor changes. These should be
merged into a single file as from a quick look there are other copies of
the same file in other parts of the tree.

Obtained from:	ABT Systems Ltd
Sponsored by:	The FreeBSD Foundation
2015-07-31 14:17:26 +00:00
Zbigniew Bodek
a2b3dfad08 Apply erratum for mrs ICC_IAR1_EL1 speculative execution on ThunderX
ERRATUM:     22978, 23154
PASS (rev.): 1.0/1.1

Reviewed by:   imp
Obtained from: Semihalf
Sponsored by:  The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3184
2015-07-31 10:00:45 +00:00
Oleksandr Tymoshenko
40ebf1626f Add GPIO backlight driver compatible with Linux FDT bindings.
Brightness is controlled through sysctl dev.gpiobacklight.X.brightness:
  - any value greater than 0: backlight is on
  - any value less than or equal to  0: backlight is off

FDT bindings docs in Linux tree:
    Documentation/devicetree/bindings/video/backlight/gpio-backlight.txt
2015-07-30 19:04:14 +00:00
Jung-uk Kim
fe0f0bbb19 Merge ACPICA 20150717. 2015-07-22 16:25:07 +00:00
Conrad Meyer
75ac3a7359 vt: Draw logos per CPU core
This feature is inspired by another Unix-alike OS commonly found on
airplane headrests.

A number of beasties[0] are drawn at top of framebuffer during boot,
based on the number of active SMP CPUs[1]. Console buffer output
continues to scroll in the screen area below beastie(s)[2].

After some time[3] has passed, the beasties are erased leaving the
entire terminal for use.

Includes two 80x80 vga16 beastie graphics and an 80x80 vga16 orb
graphic. (The graphics are RLE compressed to save some space -- 3x 3200
bytes uncompressed, or 4208 compressed.)

[0]: The user may select the style of beastie with

    kern.vt.splash_cpu_style=(0|1|2)

[1]: Or the number may be overridden with tunable kern.vt.splash_ncpu.
[2]: https://www.youtube.com/watch?v=UP2jizfr3_o
[3]: Configurable with kern.vt.splash_cpu_duration (seconds, def. 10).

Differential Revision:	https://reviews.freebsd.org/D2181
Reviewed by:	dumbbell, emaste
Approved by:	markj (mentor)
MFC after:	2 weeks
2015-07-21 20:33:36 +00:00
Mark Johnston
32cd0147fa Implement the lockstat provider using SDT(9) instead of the custom provider
in lockstat.ko. This means that lockstat probes now have typed arguments and
will utilize SDT probe hot-patching support when it arrives.

Reviewed by:	gnn
Differential Revision:	https://reviews.freebsd.org/D2993
2015-07-19 22:14:09 +00:00
Mark Murray
d657959305 Clarify the intent of the RANDOM_* options.
Approved by:	so (/dev/random blanket)
2015-07-19 16:05:26 +00:00
Benno Rice
eacbeb2b95 Merge driver for PMC Sierra's range of SAS/SATA HBAs.
Submitted by:	Achim Leubner <Achim.Leubner@pmcs.com>
Reviewed by:	scottl
2015-07-17 23:30:43 +00:00
Ed Schouten
6e5fcd99df Add a sysentvec for CloudABI on x86-64.
Summary:
For CloudABI we need to put two things on the stack of new processes:
the argument data (a binary blob; not strings) and a startup data
structure. The startup data structure contains interesting things such
as a pointer to the ELF program header, the thread ID of the initial
thread, a stack smashing protection canary, and a pointer to the
argument data.

Fetching system call arguments and setting the return value is similar
to FreeBSD. The only differences are that system call 0 does not exist
and that we call into cloudabi_convert_errno() to convert the error
code. We also need this function in a couple of other places, so we'd
better reuse it here.

Reviewers: dchagin, kib

Reviewed By: kib

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D3098
2015-07-16 18:24:06 +00:00
Navdeep Parhar
c7dbd80213 cxgbe(4): Update T4 and T5 firmwares to 1.14.2.0.
Obtained from:	Chelsio Communications
MFC after:	3 days
2015-07-14 08:02:05 +00:00
John-Mark Gurney
e0b231cbc8 fix typos..
Submitted by:	brueffer
2015-07-14 06:34:57 +00:00
John-Mark Gurney
b65946c631 cryptodev is not needed for TCP_SIGNATURE...
Comment that cryptodev shouldn't be used unless you know what you're
doing...

The various arm/mips and one powerpc configs that have cryptodev in
them need to be addressed, audited if they provide benefit and removed
if they don't...
2015-07-14 05:09:58 +00:00
Mark Murray
3aa77530ca * Address review (and add a bit myself).
- Tweek man page.
 - Remove all mention of RANDOM_FORTUNA. If the system owner wants YARROW or DUMMY, they ask for it, otherwise they get FORTUNA.
 - Tidy up headers a bit.
 - Tidy up declarations a bit.
 - Make static in a couple of places where needed.
 - Move Yarrow/Fortuna SYSINIT/SYSUNINIT to randomdev.c, moving us towards a single file where the algorithm context is used.
 - Get rid of random_*_process_buffer() functions. They were only used in one place each, and are better subsumed into those places.
 - Remove *_post_read() functions as they are stubs everywhere.
 - Assert against buffer size illegalities.
 - Clean up some silly code in the randomdev_read() routine.
 - Make the harvesting more consistent.
 - Make some requested argument name changes.
 - Tidy up and clarify a few comments.
 - Make some requested comment changes.
 - Make some requested macro changes.

* NOTE: the thing calling itself a 'unit test' is not yet a proper
  unit test, but it helps me ensure things work. It may be a proper
  unit test at some time in the future, but for now please don't make
  any assumptions or hold any expectations.

Differential Revision:	https://reviews.freebsd.org/D2025
Approved by:	so (/dev/random blanket)
2015-07-12 18:14:38 +00:00
Zbigniew Bodek
e7c14c38ba Implement stubs for ACPI PCI routines
ACPI driver requires special functions to be provided by machdep code.
Add temporary stubs to satisfy the compiler when both "pci" and "acpi"
are enabled in the kernel configuration file.

Reviewed by:   andrew
Obtained from: Semihalf
Sponsored by:  The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3028
2015-07-12 17:28:31 +00:00
Adrian Chadd
6520495abc Add an initial NUMA affinity/policy configuration for threads and processes.
This is based on work done by jeff@ and jhb@, as well as the numa.diff
patch that has been circulating when someone asks for first-touch NUMA
on -10 or -11.

* Introduce a simple set of VM policy and iterator types.
* tie the policy types into the vm_phys path for now, mirroring how
  the initial first-touch allocation work was enabled.
* add syscalls to control changing thread and process defaults.
* add a global NUMA VM domain policy.
* implement a simple cascade policy order - if a thread policy exists, use it;
  if a process policy exists, use it; use the default policy.
* processes inherit policies from their parent processes, threads inherit
  policies from their parent threads.
* add a simple tool (numactl) to query and modify default thread/process
  policities.
* add documentation for the new syscalls, for numa and for numactl.
* re-enable first touch NUMA again by default, as now policies can be
  set in a variety of methods.

This is only relevant for very specific workloads.

This doesn't pretend to be a final NUMA solution.

The previous defaults in -HEAD (with MAXMEMDOM set) can be achieved by
'sysctl vm.default_policy=rr'.

This is only relevant if MAXMEMDOM is set to something other than 1.
Ie, if you're using GENERIC or a modified kernel with non-NUMA, then
this is a glorified no-op for you.

Thank you to Norse Corp for giving me access to rather large
(for FreeBSD!) NUMA machines in order to develop and verify this.

Thank you to Dell for providing me with dual socket sandybridge
and westmere v3 hardware to do NUMA development with.

Thank you to Scott Long at Netflix for providing me with access
to the two-socket, four-domain haswell v3 hardware.

Thank you to Peter Holm for running the stress testing suite
against the NUMA branch during various stages of development!

Tested:

* MIPS (regression testing; non-NUMA)
* i386 (regression testing; non-NUMA GENERIC)
* amd64 (regression testing; non-NUMA GENERIC)
* westmere, 2 socket (thankyou norse!)
* sandy bridge, 2 socket (thankyou dell!)
* ivy bridge, 2 socket (thankyou norse!)
* westmere-EX, 4 socket / 1TB RAM (thankyou norse!)
* haswell, 2 socket (thankyou norse!)
* haswell v3, 2 socket (thankyou dell)
* haswell v3, 2x18 core (thankyou scott long / netflix!)

* Peter Holm ran a stress test suite on this work and found one
  issue, but has not been able to verify it (it doesn't look NUMA
  related, and he only saw it once over many testing runs.)

* I've tested bhyve instances running in fixed NUMA domains and cpusets;
  all seems to work correctly.

Verified:

* intel-pcm - pcm-numa.x and pcm-memory.x, whilst selecting different
  NUMA policies for processes under test.

Review:

This was reviewed through phabricator (https://reviews.freebsd.org/D2559)
as well as privately and via emails to freebsd-arch@.  The git history
with specific attributes is available at https://github.com/erikarn/freebsd/
in the NUMA branch (https://github.com/erikarn/freebsd/compare/local/adrian_numa_policy).

This has been reviewed by a number of people (stas, rpaulo, kib, ngie,
wblock) but not achieved a clear consensus.  My hope is that with further
exposure and testing more functionality can be implemented and evaluated.

Notes:

* The VM doesn't handle unbalanced domains very well, and if you have an overly
  unbalanced memory setup whilst under high memory pressure, VM page allocation
  may fail leading to a kernel panic.  This was a problem in the past, but it's
  much more easily triggered now with these tools.

* This work only controls the path through vm_phys; it doesn't yet strongly/predictably
  affect contigmalloc, KVA placement, UMA, etc.  So, driver placement of memory
  isn't really guaranteed in any way.  That's next on my plate.

Sponsored by:	Norse Corp, Inc.; Dell
2015-07-11 15:21:37 +00:00
Mariusz Zaborski
306a82f8f4 Rename zfs nvpair files to not colidate with our nvlist.
PR:		201356
Approved by:	pjd (mentor)
2015-07-09 21:53:40 +00:00
Andrew Turner
6c50960be6 Add support for __aeabi_memclr4, clang 3.7 calls it.
Sponsored by:	ABT Systems Ltd
2015-07-09 20:54:38 +00:00
Andrew Turner
b2b5507779 Add support for SMP. This uses the FDT data to find the CPUs to start on,
and psci to start them. I expect ACPI support to be added later.

This has been tested on qemu with 2 cpus as that is the current value of
MAXCPUS. This is expected to be increased in the future as FreeBSD has
been tested on 48 cores on the Cavium ThunderX hardware.

Partially based on a patch from Robin Randhawa from ARM.

Approved by:	ABT Systems Ltd
Relnotes:	yes
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D3024
2015-07-09 13:23:29 +00:00
Ed Schouten
6d338f9a81 Import the CloudABI datatypes and create a system call table.
CloudABI is a pure capability-based runtime environment for UNIX. It
works similar to Capsicum, except that processes already run in
capabilities mode on startup. All functionality that conflicts with this
model has been omitted, making it a compact binary interface that can be
supported by other operating systems without too much effort.

CloudABI is 'secure by default'; the idea is that it should be safe to
run arbitrary third-party binaries without requiring any explicit
hardware virtualization (Bhyve) or namespace virtualization (Jails). The
rights of an application are purely determined by the set of file
descriptors that you grant it on startup.

The datatypes and constants used by CloudABI's C library (cloudlibc) are
defined in separate files called syscalldefs_mi.h (pointer size
independent) and syscalldefs_md.h (pointer size dependent). We import
these files in sys/contrib/cloudabi and wrap around them in
cloudabi*_syscalldefs.h.

We then add stubs for all of the system calls in sys/compat/cloudabi or
sys/compat/cloudabi64, depending on whether the system call depends on
the pointer size. We only have nine system calls that depend on the
pointer size. If we ever want to support 32-bit binaries, we can simply
add sys/compat/cloudabi32 and implement these nine system calls again.

The next step is to send in code reviews for the individual system call
implementations, but also add a sysentvec, to allow CloudABI executabled
to be started through execve().

More information about CloudABI:
- GitHub: https://github.com/NuxiNL/cloudlibc
- Talk at BSDCan: https://www.youtube.com/watch?v=SVdF84x1EdA

Differential Revision:	https://reviews.freebsd.org/D2848
Reviewed by:	emaste, brooks
Obtained from:	https://github.com/NuxiNL/freebsd
2015-07-09 07:20:15 +00:00
Achim Leubner
4e1bc9a039 Driver 'pmspcv' added. Supports PMC-Sierra PM8001/8081/8088/8089/8074/8076/8077 SAS/SATA HBA Controllers. 2015-07-07 13:17:02 +00:00
Zbigniew Bodek
1ae9c994c8 Introduce ITS support for ARM64
Add ARM ITS (Interrupt Translation Services) support required
to bring-up message signalled interrupts on some ARM64 platforms.

Obtained from: Semihalf
Sponsored by:  The FreeBSD Foundation
2015-07-06 18:27:41 +00:00
Justin Hibbits
3f3cffedce Merge booke and aim interrupt.c files.
Summary:
Both booke and AIM interrupt.c files contain nearly identical code.  This merges
the two files, to reduce duplication.

Reviewers: #powerpc, marcel

Reviewed By: marcel

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D2991
2015-07-06 05:08:57 +00:00
Ian Lepore
b108902461 Ensure all the required files get built when you include the IPSEC option. 2015-07-05 14:15:58 +00:00
George V. Neville-Neil
aaa7cbfe1a Summary: Add missing files necessary to build with IPSEC and crypto 2015-07-04 21:32:44 +00:00
Mariusz Zaborski
54f98da930 Move the nvlist source and private includes from sys/kern to seperate
directory sys/contrib/libnv.

The goal of this operation is to NOT install header files which shouldn't
be used outside the nvlist library.

Approved by:	pjd (mentor)
2015-07-04 16:33:37 +00:00
Warner Losh
8cad626fbb Cache _MPATH and pass it down into the modules build. Some NFS setups
make the find it does extremely expensive, so compute it only
once. Also make sure the 'traditional' module building method works at
the expense of a bit of duplicated code.
2015-07-04 05:43:45 +00:00
Marcel Moolenaar
194709a520 Remove commented-out and non-existent cbus(4) attachment for uart(4). 2015-07-03 16:02:06 +00:00
Marcel Moolenaar
472aa69925 Allow proto(4) to be compiled into the kernel. 2015-07-03 15:56:00 +00:00