Submitted by: Howard Su <howard0su@gmail.com>
Reviewed by: delphij, royger, adrian
Approved by: adrian (mentor)
Sponsored by: Microsoft OSTC
Differential Revision: https://reviews.freebsd.org/D4676
The upcoming GELI support in the loader reuses parts of this code
Some ifdefs are added, and some code is moved outside of existing ifdefs
The HMAC parts of GELI are broken out into their own file, to separate
them from the kernel crypto/openssl dependant parts that are replaced
in the boot code.
Passed the GELI regression suite (tools/regression/geom/eli)
Files=20 Tests=14996
Result: PASS
Reviewed by: pjd, delphij
MFC after: 1 week
Sponsored by: ScaleEngine Inc.
Differential Revision: https://reviews.freebsd.org/D4699
It seems that `options GZIP` and `options ZFS` collide because they both
define inconsistent definitions for inflate, etc
Fixing this will require upgrading zlib in the kernel, as suggested in
r245102.
Pointyhat to: ngie
Reported by: bz
Sponsored by: EMC / Isilon Storage Division
KERNCONF when "make tinderbox" is run
This will help ensure that "options ZFS" will not be accidentally
regressed, as the current LINT configuration tests the zfs module
MFC after: 3 weeks
Sponsored by: EMC / Isilon Storage Division
cperciva's libmd implementation is 5-30% faster
The same was done for SHA256 previously in r263218
cperciva's implementation was lacking SHA-384 which I implemented, validated against OpenSSL and the NIST documentation
Extend sbin/md5 to create sha384(1)
Chase dependancies on sys/crypto/sha2/sha2.{c,h} and replace them with sha512{c.c,.h}
Reviewed by: cperciva, des, delphij
Approved by: secteam, bapt (mentor)
MFC after: 2 weeks
Sponsored by: ScaleEngine Inc.
Differential Revision: https://reviews.freebsd.org/D3929
The mdio driver interface is generally useful for devices that require
MDIO without the full MII bus interface. This lifts the driver/interface
out of etherswitch(4), and adds a mdio(4) man page.
Submitted by: Landon Fuller <landon@landonf.org>
Differential Revision: https://reviews.freebsd.org/D4606
TFO is disabled by default in the kernel build. See the top comment
in sys/netinet/tcp_fastopen.c for implementation particulars.
Reviewed by: gnn, jch, stas
MFC after: 3 days
Sponsored by: Verisign, Inc.
Differential Revision: https://reviews.freebsd.org/D4350
ports.
The sys/mips/rt305x/ code currently has these hard-coded with a comment
to make them configurable; this is the first step towards that.
Submitted by: Stanislav Galabov <sgalabov@gmail.com>
Add support for two new devices: X552 SFP+ 10 GbE, and the single port
version of X550T.
Submitted by: erj
Reviewed by: gnn
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D4186
into a new function that other platforms can share.
This creates a new ofw_reg_to_paddr() function (in a new ofw_subr.c file)
that contains most of the existing ppc implementation, mostly unchanged.
The ppc code now calls the new MI code from the MD code, then creates a
ppc-specific bus_space mapping from the results. The new arm implementation
does the same in an arm-specific way.
This also moves the declaration of OF_decode_addr() from ofw_machdep.h to
openfirm.h, except on sparc64 which uses a different function signature.
This will help all FDT platforms to set up early console access using
OF_decode_addr().
The ci20 port (by kan@) is going to reuse almost all of the intrng code
since the SoC in question looks suspiciously like someone took an ARM
SoC design and replaced the ARM core with a MIPS core.
* migrate out the code;
* rename ARM_ -> INTR_;
* rename arm_ -> intr_;
* move the interrupt flush routine from intr.c / intrng.c into
arm/machdep_intr.c - removing the code duplication and removing
the ARM specific bits from here.
Thanks to the Star Wars: The Force Awakens premiere line for allowing
me a couple hours of quiet time to finish the universe builds.
Tested:
* make universe
TODO:
* The structure definitions in subr_intr.c still includes machine/intr.h
which requires one duplicates all of the intrng definitions in
the platform code (which kan has done, and I think we don't have to.)
Instead I should break out the generic things (function declarations,
common intr structures, etc) into a separate header.
* Kan has requested I make the PIC based IPI stuff optional.
Vast majority of rtalloc(9) users require only basic info from
route table (e.g. "does the rtentry interface match with the interface
I have?". "what is the MTU?", "Give me the IPv4 source address to use",
etc..).
Instead of hand-rolling lookups, checking if rtentry is up, valid,
dealing with IPv6 mtu, finding "address" ifp (almost never done right),
provide easy-to-use API hiding all the complexity and returning the
needed info into small on-stack structure.
This change also helps hiding route subsystem internals (locking, direct
rtentry accesses).
Additionaly, using this API improves lookup performance since rtentry is not
locked.
(This is safe, since all the rtentry changes happens under both radix WLOCK
and rtentry WLOCK).
Sponsored by: Yandex LLC
clock_gettime(2) on ARMv7 and ARMv8 systems which have architectural
generic timer hardware. It is similar how the RDTSC timer is used in
userspace on x86.
Fix a permission problem where generic timer access from EL0 (or
userspace on v7) was not properly initialized on APs.
For ARMv7, mark the stack non-executable. The shared page is added for
all arms (including ARMv8 64bit), and the signal trampoline code is
moved to the page.
Reviewed by: andrew
Discussed with: emaste, mmel
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D4209
One reason the kernel does not build reproducibly is that it includes
a timestamp in the version string. SOURCE_DATE_EPOCH provides a standard
method to address this: it should be set to the last modification time
of the source, and build processes use the specified timestamp instead
of the "current" date and time.
This change uses SOURCE_DATE_EPOCH if it is set; how it gets set needs
to be addressed elsewhere.
Reviewed by: bapt
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
typically memory mapped bus, for example on the AMD Opteron A1100 the AHCI
device is mapped in the CPUs address space, and not through a PCI
controller.
Further work is needed for this to work with ACPI as this is expected to be
common on ARMv8 servers.
Reviewed by: mav, mmel
Obtained from: mmel, ABT Systems Ltd
Relnotes: yes
Sponsored by: SoftIron Inc
Differential Revision: https://reviews.freebsd.org/D4269
changes. Use the list of MFILES found by find to identify the set of
possible auto-generated files and add the intersection of this set and
SRCS to CLEANFILES.
Submitted by: imp (previous version), sbruno
Differential Revision: https://reviews.freebsd.org/D4336
IPv4/IPv6 checksum offloading and VLAN tag insertion/stripping.
Since uether doesn't provide a way to announce driver specific offload
capabilities to upper stack, checksum offloading support needs more work
and will be done in the future.
Special thanks to Hayes Wang from RealTek who gave input.
Because of how osreldate.h was being built with newvers.sh, which always
spat out a vers.c dependent on SVN or git, the meta mode build was
considering osreldate.h to depend on the current git or SVN index. This
would lead to entire tree rebuilds when modifying git's index. There's
no reason to be generating vers.c here so just skip it.
While here, in mk-osreldate.sh rename PARAM_H to proper PARAMFILE (which
newvers.sh already has a default for) and remove unneeded export.
Sponsored by: EMC / Isilon Storage Division
Use hhook(9) framework to achieve ability of loading and unloading
if_enc(4) kernel module. INET and INET6 code on initialization registers
two helper hooks points in the kernel. if_enc(4) module uses these helper
hook points and registers its hooks. IPSEC code uses these hhook points
to call helper hooks implemented in if_enc(4).
This should be a big no-op pass; and reduces the size of if_ath.c.
I'm hopefully soon going to take a whack at the USB support for ath(4)
and this'll require some reuse of the busdma memory code.
default and add a manual page for mlx5en. The mlx5 module contains
shared code for both infiniband and ethernet. The mlx5en module
contains specific code for ethernet functionality only. A mlx5ib
module is in the works for infiniband support.
Supported hardware:
- ConnectX-4: 10/20/25/40/50/56/100Gb/s speeds.
- ConnectX-4 LX: 10/25/40/50Gb/s speeds (low power consumption)
Refer to the mlx5en(4) manual page for a comprehensive list.
The team porting the mlx5 driver(s) to FreeBSD:
- Hans Petter Selasky <hselasky@freebsd.org>
- Oded Shanoon <odeds@mellanox.com>
- Meny Yossefi <menyy@mellanox.com>
- Shany Michaely <shanim@mellanox.com>
- Shahar Klein <shahark@mellanox.com>
- Daria Genzel <dariaz@mellanox.com>
- Mark Bloch <markb@mellanox.com>
Differential Revision: https://reviews.freebsd.org/D4163
Submitted by: Mark Block <markb@mellanox.com>
Sponsored by: Mellanox Technologies
Reviewed by: gnn @
MFC after: 3 days
It turns out on a 16550 w/ a 25MHz SoC reference clock you get a little
over 3% error at 115200 baud, which causes this to fail.
Just .. cope. Things cope these days.
Default to 30 (3.0%) as before, but allow UART_DEV_TOLERANCE_PCT to be
set at build time to change that.
QorIQ SoCs (e5500 core, P5 family) have 2 BARs for local access windows, while
MPC85XX, and P1/P2 families use only a single BAR register.
This also adds the QORIQ_DPAA option, mutually exclusive to MPC85XX, to handle
this difference.
Obtained from: Semihalf
Sponsored by: Alex Perez/Inertial Computing
as of r288992 use it to manage the CCNT.
Use the CNNT for get_cyclecount() instead of binuptime() when device pmu
is compiled in; if it fails to attach, fall back to the former method.
Enable by default for the BeagleBoneBlack configuration.
Optained from: Cambridge/L41
Sponsored by: DARPA/AFRL
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D3837
ccache is mostly beneficial for frequent builds where -DNO_CLEAN is not
used to achieve a safe pseudo-incremental build. This is explained in
more detail upstream [1] [2]. It incurs about a 20%-28% hit to populate the
cache, but with a full cache saves 30-50% in build times. When combined with
the WITH_FAST_DEPEND feature it saves up to 65% since ccache does cache the
resulting dependency file, which it does not do when using mkdep(1)/'CC
-E'. Stats are provided at the end of this message.
This removes the need to modify /etc/make.conf with the CC:= and CXX:=
lines which conflicted with external compiler support [3] (causing the
bootstrap compiler to not be built which lead to obscure failures [4]),
incorrectly invoked ccache in various stages, required CCACHE_CPP2 to avoid
Clang errors with parenthesis, and did not work with META_MODE.
The option name was picked to match the existing option in ports. This
feature is available for both in-src and out-of-src builds that use
/usr/share/mk.
Linking, assembly compiles, and pre-processing avoid using ccache since it is
only overhead. ccache does nothing special in these modes, although there is
no harm in calling it for them.
CCACHE_COMPILERCHECK is set to 'content' when using the in-tree bootstrap
compiler to hash the content of the compiler binary to determine if it
should be a cache miss. For external compilers the 'mtime' option is used
as it is more efficient and likely to be correct. Future work may optimize the
'content' check using the same checks as whether a bootstrap compiler is needed
to be built.
The CCACHE_CPP2 pessimization is currently default in our devel/ccache
port due to Clang requiring it. Clang's -Wparentheses-equality,
-Wtautological-compare, and -Wself-assign warnings do not mix well with
compiling already-pre-processed code that may have expanded macros that
trigger the warnings. GCC has so far not had this issue so it is allowed to
disable the CCACHE_CPP2 default in our port.
Sharing a cache between multiple checkouts, or systems, is explained in
the ccache manual. Sharing a cache over NFS would likely not be worth
it, but syncing cache directories between systems may be useful for an
organization. There is also a memcached backend available [5]. Due to using
an object directory outside of the source directory though you will need to
ensure that both are in the same prefix and all users use the same layout. A
possible working layout is as follows:
Source: /some/prefix/src1
Source: /some/prefix/src2
Source: /some/prefix/src3
Objdir: /some/prefix/obj
Environment: CCACHE_BASEDIR='${SRCTOP:H}' MAKEOBJDIRPREFIX='${SRCTOP:H}/obj'
This will use src*/../obj as the MAKEOBJDIRPREFIX and tells ccache to replace
all absolute paths to be relative. Using something like this is required due
to -I and -o flags containing both SRC and OBJDIR absolute paths that ccache
adds into its hash for the object without CCACHE_BASEDIR.
distcc can be hooked into by setting CCACHE_PREFIX=/usr/local/bin/distcc.
I have not personally tested this and assume it will not mix well with
using the bootstrap compiler.
The cache from buildworld can be reused in a subdir by first running
'make buildenv' (from r290424).
Note that the cache is currently different depending on whether -j is
used or not due to ccache enabling -fdiagnostics-color automatically if
stderr is a TTY, which bmake only does if not using -j.
The system I used for testing was:
WITNESS
Build options: -j20 WITH_LLDB=yes WITH_DEBUG_FILES=yes WITH_CCACHE_BUILD=yes
DISK: ZFS 3-way mirror with very slow disks using SSD l2arc/log.
The arc was fully populated with src tree files and ccache objects.
RAM: 76GiB
CPU: Intel(R) Xeon(R) CPU L5520 @2.27GHz
2 package(s) x 4 core(s) x 2 SMT threads = hw.ncpu=16
The WITH_FAST_DEPEND feature was used for comparison here as well to show
the dramatic time savings with a full cache.
buildworld:
x buildworld-before
+ buildworld-ccache-empty
* buildworld-ccache-full
% buildworld-ccache-full-fastdep
# buildworld-fastdep
+-------------------------------------------------------------------------------+
|% * # +|
|% * # +|
|% * # xxx +|
| |A |
| A|
| A |
|A |
| A |
+-------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 3 3744.13 3794.31 3752.25 3763.5633 26.935139
+ 3 4519 4525.04 4520.73 4521.59 3.1104823
Difference at 95.0% confidence
758.027 +/- 43.4565
20.1412% +/- 1.15466%
(Student's t, pooled s = 19.1726)
* 3 1823.08 1827.2 1825.62 1825.3 2.0785572
Difference at 95.0% confidence
-1938.26 +/- 43.298
-51.5007% +/- 1.15045%
(Student's t, pooled s = 19.1026)
% 3 1266.96 1279.37 1270.47 1272.2667 6.3971113
Difference at 95.0% confidence
-2491.3 +/- 44.3704
-66.1952% +/- 1.17895%
(Student's t, pooled s = 19.5758)
# 3 3153.34 3155.16 3154.2 3154.2333 0.91045776
Difference at 95.0% confidence
-609.33 +/- 43.1943
-16.1902% +/- 1.1477%
(Student's t, pooled s = 19.0569)
buildkernel:
x buildkernel-before
+ buildkernel-ccache-empty
* buildkernel-ccache-empty-fastdep
% buildkernel-ccache-full
# buildkernel-ccache-full-fastdep
@ buildkernel-fastdep
+-------------------------------------------------------------------------------+
|# @ % * |
|# @ % * x + |
|# @ % * xx ++|
| MA |
| MA|
| A |
| A |
|A |
| A |
+-------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 3 571.57 573.94 571.79 572.43333 1.3094401
+ 3 727.97 731.91 728.06 729.31333 2.2492295
Difference at 95.0% confidence
156.88 +/- 4.17129
27.4058% +/- 0.728695%
(Student's t, pooled s = 1.84034)
* 3 527.1 528.29 528.08 527.82333 0.63516402
Difference at 95.0% confidence
-44.61 +/- 2.33254
-7.79305% +/- 0.407478%
(Student's t, pooled s = 1.02909)
% 3 400.4 401.05 400.62 400.69 0.3306055
Difference at 95.0% confidence
-171.743 +/- 2.16453
-30.0023% +/- 0.378128%
(Student's t, pooled s = 0.954969)
# 3 201.94 203.34 202.28 202.52 0.73020545
Difference at 95.0% confidence
-369.913 +/- 2.40293
-64.6212% +/- 0.419774%
(Student's t, pooled s = 1.06015)
@ 3 369.12 370.57 369.3 369.66333 0.79033748
Difference at 95.0% confidence
-202.77 +/- 2.45131
-35.4225% +/- 0.428227%
(Student's t, pooled s = 1.0815)
[1] https://ccache.samba.org/performance.html
[2] http://www.mail-archive.com/ccache@lists.samba.org/msg00576.html
[3] https://reviews.freebsd.org/D3484
[5] https://github.com/jrosdahl/ccache/pull/30
PR: 182944 [4]
MFC after: 3 weeks
Sponsored by: EMC / Isilon Storage Division
Relnotes: yes
This is because the .meta files generated from filemon already contain a
list of all files read to generate the object.
X-MFC-With: r290433
MFC after: 3 weeks
Sponsored by: EMC / Isilon Storage Division
This is especially noticeable in the kernel obj directory since it
includes so many files.
X-MFC-With: r290433
MFC after: 3 weeks
Sponsored by: EMC / Isilon Storage Division
driver. This is taken from the MAC at boot, but can be overridden with
'options AT91_MACB_USE_RMII'.
Switch to macb for HL201 and SAM9G20EK boards. It now works both
places. Also start to sneak up on FDT for the SAM9G20EK board, but
leave disabled due to issues with MMC that haven't been resolved.
Add early debug support for the SAM9G20EK since that is required
for FDT to work presently on these SoC.
This speeds up buildworld by 16% on my system and buildkernel by 35%.
Rather than calling mkdep(1), which is just a wrapper around 'cc -E',
use the modern -MD -MT -MF flags to gather and generate dependencies during
compilation. This flag was introduced in GCC "a long time ago", in GCC 3.0,
and is also supported by Clang. (It appears that ICC also supports this but I
do not have access to test it). This avoids running the preprocessor *twice*
for every build, in both 'make depend' and 'make all'. This is especially
noticeable when using ccache since it does not cache preprocessor results from
mkdep(1) / 'cc -E', but still speeds up compilation with the -MD flags.
For 'make depend' a tree-walk is still done to ensure that all DPSRCS
are generated when expected, and that beforedepend/afterdepend and
_EXTRADEPEND are all still respected. In time this may change but for now
I've been conservative. The time for a tree-walk with -j combined with
SUBDIR_PARALLEL is not significant. For example, it takes about 9 seconds
with -j15 to walk all of src/ for 'make depend' now on my system.
A .depend file is still generated with the various rules that apply to
the final target, or custom rules. Otherwise there are now
per-built-object-file .depend files, such as .depend.filename.o. These
are included directly by make rather than populating .depend with a loop
and .depend lines, which only added overhead to the now almost-NOP 'make
depend' phase.
Before this I experimented with having mkdep(1) called in parallel per-file.
While this improved the kernel and lib/libc 'make depend' phase, it resulted
in slower build times overall.
The -M flags are removed from CFLAGS when linking since they have no effect.
Enabling this by default, for src or out-of-src, can be done once more testing
has been done, such as a ports exp-run, and with more compilers.
The system I used for testing was:
WITNESS
Build options: -j20 WITH_LLDB=yes WITH_DEBUG_FILES=yes WITH_FAST_DEPEND=yes
DISK: ZFS 3-way mirror with very slow disks using SSD l2arc/log.
The arc was fully populated with src tree files.
RAM: 76GiB
CPU: Intel(R) Xeon(R) CPU L5520 @2.27GHz
2 package(s) x 4 core(s) x 2 SMT threads = hw.ncpu=16
buildworld:
x buildworld-before
+ buildworld-fastdep
+-------------------------------------------------------------------------------+
|+ |
|+ |
|+ xx x|
| |_MA___||
|A |
+-------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 3 3744.13 3794.31 3752.25 3763.5633 26.935139
+ 3 3153.34 3155.16 3154.2 3154.2333 0.91045776
Difference at 95.0% confidence
-609.33 +/- 43.1943
-16.1902% +/- 1.1477%
(Student's t, pooled s = 19.0569)
buildkernel:
x buildkernel-before
+ buildkernel-fastdep
+-------------------------------------------------------------------------------+
|+ x |
|++ xx|
| A||
|A| |
+-------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 3 571.57 573.94 571.79 572.43333 1.3094401
+ 3 369.12 370.57 369.3 369.66333 0.79033748
Difference at 95.0% confidence
-202.77 +/- 2.45131
-35.4225% +/- 0.428227%
(Student's t, pooled s = 1.0815)
Sponsored by: EMC / Isilon Storage Division
MFC after: 3 weeks
Relnotes: yes
wrong value in the comparison, leading to incorrectly setting the new
value.
This has been observed in the ZFS code. Without this we can lose track of
the reference count in a zrlock object.
We should move to use the generic atomic functions, however as this has
been observed I would prefer to have this working, then move to the generic
functions.
PR: 204037
Sponsored by: ABT Systems Ltd
- Move all files related to the LinuxKPI into sys/compat/linuxkpi and
its subfolders.
- Update sys/conf/files and some Makefiles to use new file locations.
- Added description of COMPAT_LINUXKPI to sys/conf/NOTES which in turn
adds the LinuxKPI to all LINT builds.
- The LinuxKPI can be added to the kernel by setting the
COMPAT_LINUXKPI option. The OFED kernel option no longer builds the
LinuxKPI into the kernel. This was done to keep the build rules for
the LinuxKPI in sys/conf/files simple.
- Extend the LinuxKPI module to include support for USB by moving the
Linux USB compat from usb.ko to linuxkpi.ko.
- Bump the FreeBSD_version.
- A universe kernel build has been done.
Reviewed by: np @ (cxgb and cxgbe related changes only)
Sponsored by: Mellanox Technologies
This commit introduces support for etherswitch devices that utilize SMI as
a way of accessing its registers. SMI register is located in address space
of mge -- access to it was exported through MDIO interface.
Attachment functions were enhanced so as to ensure proper initialisation
in both cases: 1) PHYs attached directly to mge, 2) PHYs attached to
switch device and switch attached to mge. Attachment of etherswitch device
depends on dts entry with compatible="mrvl,sw" property. If none is found,
typical PHY attachment procedure follows.
In case of switch attached, PHYs' status and configuration is accessible
via etherswitchcfg, and ifconfig shows always-up, non-configurable mge
interfaces.
Due to the fact that there may be simultaneous accessess to SMI
registers (e.g. from PHY attached to one of mge instances and switch
to the other), SMI access interlock was added. It is SX lock,
because sleep ability is necessary -- busy-waiting would result
in poor performance due to long delays required by hardware.
Underlying switch driver is obliged to use sleepable locks as well.
Reviewed by: adrian
Obtained from: Semihalf
Submitted by: Bartosz Szczepanek <bsz@semihalf.com>
Differential revision: https://reviews.freebsd.org/D3900
It turns out that it is pretty easy to make CloudABI work on ARM64. We
essentially only need to copy over the sysvec from AMD64 and ensure that
we use ARM64 specific registers.
As there is an overlap between function argument and return registers,
we do need to extend cloudabi64_schedtail() to only set its values if
we're actually forking. Not when we're creating a new thread.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D3917
In order to make it easier to support CloudABI on ARM64, move out all of
the bits from the AMD64 cloudabi_sysvec.c into a new file
cloudabi_module.c that would otherwise remain identical. This reduces
the AMD64 specific code to just ~160 lines.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D3974
to copying in some code from the armv4 busdma, and adapting a few variable
and flag names to match the surrounding mips code.
Instead of keeping a local cache of prealloced busdma_map structs on a
mutex-protected list, set up an uma zone to cache them.
Instead of all memory allocations using M_DEVBUF, use new categories
M_BUSDMA for allocations of metadata (tags, maps, segment tracking lists),
and M_BOUNCE for bounce pages.
When buffers are allocated out of the busdma_bufalloc zones the alignment
and size of the buffers is known, and the code can skip doing any "partial
cacheline flush" logic to preserve data that may be adjacent to the DMA
buffer but contain non-DMA data.
Reviewed by: adrian, imp
This commit adds support for MDIO present in the ThunderX SoC.
From the FDT point of view it is compatible with "octeon-3860-mdio"
however only C22 mode is used.
The code also implements lmac_if interface functions.
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
- The driver consists of three main componens: PF, VF, BGX
- Requires appropriate entries in DTS and MDIO driver
- Supports only FDT configuration
- Multiple Tx queues and single Rx queue supported
- No RSS, HW checksum and TSO support
- No more than 8 queues per-IF (only one Queue Set per IF)
- HW statistics enabled
- Works in all available MAC modes (1,10,20,40G)
- Style converted to BSD according to style(9)
- The code brings lmac_if interface used by the BGX driver to
update its logical MACs state.
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
and armv6 architecures. The primary enhancement over the old design is
support for hierarchical interrupt controllers (such as a gpio driver
which can receive interrupts from a root PIC and act as a PIC itself for
clients interested in handling a change of gpio pin state as an
interrupt). The new code also provides an infrastructure for mapping
interrupts described in metadata in the form of a "controller reference
plus interrupt number" tuple into the simple "0-n" flat numeric space
understood by rman and the bus resource mechanisms.
Use of the new code is enabled by setting the ARM_INTRNG option, and by
making a few simple changes to the platform's support code. In addition
each existing PIC driver needs changes to be ready for INTRNG; this commit
contains the changes for the arm/gic driver, which most armv6 SoCs use, but
it does not enable the new code yet on any platform.
This project has been many years in the making, starting as a GSoC project
by Jakub Klama (jceel@) in 2012. That didn't get committed right away and
the source base evolved out from under it to some degree. In 2014 I rebased
the diffs to then -current and did some enhancements in the area of mapping
interrupt numbers and storing associated fdt data, then the project went
cold again for a while. Eventually Svata Kraus took that work in progress
and did another big round of work on it, removing most of the remaining
rough edges. Finally I took that and made one more pass through it, mostly
disabling the "INTR_SOLO" feature for now, pending further design
discussions on how to most efficiently dispatch a pending interrupt through
more than one layer of PIC. The current code with the INTR_SOLO feature
disabled uses approximate 100 extra cpu cycles for each cascaded PIC the
interrupt has to be passed to, so what's left to do is about efficiency, not
correct operation.
Differential Revision: https://reviews.freebsd.org/D2047
an error.
Most of these do a 'mkdir -p' or 'install -d' before installing, but add
the trailing / here for consistency with the userland install.
MFC after: 2 weeks
X-MFC-With: r289391
Sponsored by: EMC / Isilon Storage Division
HWPMC depends on pmu.c even if device pmu is not specified.
Would be great if we could just automatically enabled "device pmu"
if we try to compile in HWPMC.
Also several old kernel cnfigurations seem to have HWPMC enabled but are
pre-FDT and thus fail. So make pmu.c depend on fdt in case of hwpmc as
well.
MFC after: 2 weeks
Sponsored by: DARPA/AFRL
Differential Revision: https://reviews.freebsd.org/D3877
To me that seems broken as certain interrupts will never be handled
properly. I'll re-open D3877 and we can seek a better solution and
try again. For now go back to that state and avoid compile time errors.
Would be great if we could just automatically enabled "device pmu"
if we try to compile in HWPMC.
MFC after: 2 weeks
Sponsored by: DARPA/AFRL
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D3877
packets and/or state transitions from each TCP socket. That would help with
narrowing down certain problems we see in the field that are hard to reproduce
without understanding the history of how we got into a certain state. This
change provides just that.
It saves copies of the last N packets in a list in the tcpcb. When the tcpcb is
destroyed, the list is freed. I thought this was likely to be more
performance-friendly than saving copies of the tcpcb. Plus, with the packets,
you should be able to reverse-engineer what happened to the tcpcb.
To enable the feature, you will need to compile a kernel with the TCPPCAP
option. Even then, the feature defaults to being deactivated. You can activate
it by setting a positive value for the number of captured packets. You can do
that on either a global basis or on a per-socket basis (via a setsockopt call).
There is no way to get the packets out of the kernel other than using kmem or
getting a coredump. I thought that would help some of the legal/privacy concerns
regarding such a feature. However, it should be possible to add a future effort
to export them in PCAP format.
I tested this at low scale, and found that there were no mbuf leaks and the peak
mbuf usage appeared to be unchanged with and without the feature.
The main performance concern I can envision is the number of mbufs that would be
used on systems with a large number of sockets. If you save five packets per
direction per socket and have 3,000 sockets, that will consume at least 30,000
mbufs just to keep these packets. I tried to reduce the concerns associated with
this by limiting the number of clusters (not mbufs) that could be used for this
feature. Again, in my testing, that appears to work correctly.
Differential Revision: D3100
Submitted by: Jonathan Looney <jlooney at juniper dot net>
Reviewed by: gnn, hiren
- Move the required kernel compiler flags from Makefile.arm64 to kern.mk.
- Build arm64 modules as PIC; non-PIC relocations in .o for shared object
output cannot be handled.
- Do not try to install aarch64 symlink.
- A hack for arm64 to avoid ld -r stage. See the comment for the explanation.
Some functionality is lost, like ctf handling, but hopefully will be
restored after newer linker is available.
Reviewed by: andrew, emaste
Tested by: andrew (on real hardware)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3796
The current Xen console driver is crashing very quickly when using it on
an ARM guest. This is because the console lock is recursive and it may
lead to recursion on the tty lock and/or corrupt the ring pointer.
Furthermore, the console lock is not always taken where it should be and has
to be released too early because of the way the console has been designed.
Over the years, code has been modified to support various new features but
the driver has not been reworked.
This new driver has been rewritten with the idea of only having a small set
of specific function to write either via the shared ring or the hypercall
interface.
Note that HVM support has been left aside for now because it requires
additional features which are not yet supported. A follow-up patch will be
sent with HVM guest support.
List of items that may be good to have but not mandatory:
- Avoid to flush for each character written when using the tty
- Support multiple consoles
Submitted by: Julien Grall <julien.grall@citrix.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D3698
Sponsored by: Citrix Systems R&D
specific as we may use the pmu registers for other uses. No configs seem
to currently build this.
This will allow for more use of this device.
Discussed with: bz
Sponsored by: ABT Systems Ltd
This avoids needing a large boot partition / file system in order to
accommodate multiple kernels, and provides consistency with userland
debug. This also simplifies the process of moving kernel debug files
to a separate package and installing them on demand.
In addition, change kernel debug file extension to .debug, to match
userland debug files.
When using the supported kernel installation method the
/usr/lib/debug/boot/kernel directory will be renamed (to kernel.old)
as is done with /boot/kernel.
Developers wishing to maintain the historical behavior of installing
debug files in /boot/kernel/ can set KERN_DEBUGDIR="" in src.conf(5).
Reviewed by: bdrewery, brooks, imp, markj
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D1006
This also adds a newbus interface that allows a SoC to override the
following settings:
- if_dwc specific SoC initialization;
- if_dwc descriptor type;
- if_dwc MII clock.
This seems to be an old version of the hardware descriptors but it is
still in use in a few SoCs (namely Allwinner A20 and Amlogic at least).
Tested on Cubieboard2 and Banana pi.
Tested for regressions on Altera Cyclone by br@ (old version).
Obtained from: NetBSD
drivers into the revived sys/sparc64/pci/ofw_pci.c, previously already
serving a similar purpose. This has been done with sun4v in mind, which
explains a) the otherwise not that obvious scheme employed and b) why
reusing sys/powerpc/ofw/ofw_pci.c was even lesser an option.
- Add a workaround for QEMU once again not emulating real machines, in
this case by not providing the OFW_PCI_CS_MEM64 range. [1]
Submitted by: jhb [1]
MFC after: 1 week
CTL HA functionality was originally implemented by Copan many years ago,
but large part of the sources was never published. This change includes
clean room implementation of the missing code and fixes for many bugs.
This code supports dual-node HA with ALUA in four modes:
- Active/Unavailable without interlink between nodes;
- Active/Standby with second node handling only basic LUN discovery and
reservation, synchronizing with the first node through the interlink;
- Active/Active with both nodes processing commands and accessing the
backing storage, synchronizing with the first node through the interlink;
- Active/Active with second node working as proxy, transfering all
commands to the first node for execution through the interlink.
Unlike original Copan's implementation, depending on specific hardware,
this code uses simple custom TCP-based protocol for interlink. It has
no authentication, so it should never be enabled on public interfaces.
The code may still need some polishing, but generally it is functional.
Relnotes: yes
Sponsored by: iXsystems, Inc.
RANDOM_LOADABLE and RANDOM_YARROW's definitions from opt_random.h to
opt_global.h
This unbreaks `make depend` in sys/modules with multiple drivers (tmpfs, etc)
after r286839
X-MFC with: r286839
Reviewed by: imp
Submitted by: lwhsu
Differential Revision: D3486
SoC is used in the HiKey board from 96boards.
Currently on the SD card is working on the HiKey, as such devices 0 and 2
will need to be disabled, for example by adding the following to
loader.conf:
hint.hisi_dwmmc.0.disabled=1
hint.hisi_dwmmc.2.disabled=1
Relnotes: yes (Hikey board booting)
Sponsored by: ABT Systems Ltd
only gpiobus configured via FDT is supported. Bus enumeration is
supported. Devices are created for each device found. 1-Wire
temperature controllers are supported, but other drivers could be
written. Temperatures are polled and reported via a sysctl. Errors
are reported via sysctl counters. Mis-wired bus detection is included
for more trouble shooting. See ow(4), owc(4) and ow_temp(4) for
details of what's supported and known issues.
This has been tested on Raspberry Pi-B, Pi2 and Beagle Bone Black
with up to 7 devices.
Differential Revision: https://reviews.freebsd.org/D2956
Relnotes: yes
MFC after: 2 weeks
Reviewed by: loos@ (with many insightful comments)
be used with any SoC specific drivers, for example a ThunderX nic driver
would use something like the following in files.arm64:
arm64/cavium/thunder_nic.c optional soc_cavm_thunderx thndr_nic
Reviewed by: imp
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D3479
I/OAT is also referred to as Crystal Beach DMA and is a Platform Storage
Extension (PSE) on some Intel server platforms.
This driver currently supports DMA descriptors only and is part of a
larger effort to upstream an interconnect between multiple systems using
the Non-Transparent Bridge (NTB) PSE.
For now, this driver is only built on AMD64 platforms. It may be ported
to work on i386 later, if that is desired. The hardware is exclusive to
x86.
Further documentation on ioat(4), including API documentation and usage,
can be found in the new manual page.
Bring in a test tool, ioatcontrol(8), in tools/tools/ioat. The test
tool is not hooked up to the build and is not intended for end users.
Submitted by: jimharris, Carl Delsey <carl.r.delsey@intel.com>
Reviewed by: jimharris (reviewed my changes)
Approved by: markj (mentor)
Relnotes: yes
Sponsored by: Intel
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D3456
Provide and document the RANDOM_ENABLE_UMA option.
Change RANDOM_FAST to RANDOM_UMA to clarify the harvesting.
Remove RANDOM_DEBUG option, replace with SDT probes. These will be of
use to folks measuring the harvesting effect when deciding whether to
use RANDOM_ENABLE_UMA.
Requested by: scottl and others.
Approved by: so (/dev/random blanket)
Differential Revision: https://reviews.freebsd.org/D3197
Summary:
The RouterBoard uses a predefined partition map which doesn't exist in the fdt.
This change allows overriding the fdt slicer with a custom slicer, and uses this
custom slicer to define the flash map on the RouterBoard RB800.
D3305 converts the mpc85xx platform into a base class, so that systems based on
the mpc85xx platform can add their own overrides. This change builds on D3305,
and creates a RouterBoard (RB800) platform to initialize the slicer override.
Reviewed By: nwhitehorn, imp
Differential Revision: https://reviews.freebsd.org/D3345
CoDel is a parameterless queue discipline that handles variable bandwidth
and RTT.
It can be used as the single queue discipline on an interface or as a sub
discipline of existing queue disciplines such as PRIQ, CBQ, HFSC, FAIRQ.
Differential Revision: https://reviews.freebsd.org/D3272
Reviewd by: rpaulo, gnn (previous version)
Obtained from: pfSense
Sponsored by: Rubicon Communications (Netgate)
This driver allows read the software reset switch state and control the
status LEDs.
The GPIO pins have their direction (input/output) locked down to prevent
possible short circuits.
Note that most people get a reset button that is a hardware reset. The
software reset button is available on boards from Netgate.
Sponsored by: Rubicon Communications (Netgate)
if desired.
Retire randomdev_none.c and introduce random_infra.c for resident
infrastructure. Completely stub out random(4) calls in the "without
DEV_RANDOM" case.
Add RANDOM_LOADABLE option to allow loadable Yarrow/Fortuna/LocallyWritten
algorithm. Add a skeleton "other" algorithm framework for folks
to add their own processing code. NIST, anyone?
Retire the RANDOM_DUMMY option.
Build modules for Yarrow, Fortuna and "other".
Use atomics for the live entropy rate-tracking.
Convert ints to bools for the 'seeded' logic.
Move _write() function from the algorithm-specific areas to randomdev.c
Get rid of reseed() function - it is unused.
Tidy up the opt_*.h includes.
Update documentation for random(4) modules.
Fix test program (reviewers, please leave this).
Differential Revision: https://reviews.freebsd.org/D3354
Reviewed by: wblock,delphij,jmg,bjk
Approved by: so (/dev/random blanket)
It has nothing to share with too huge ctl.c other then device descriptor,
but even that may be counted as design error that may be fixed later.
At some point we may even want to have several ioctl ports.
Its idea was to be a simple initiator and execute several commands from
kernel level, but FreeBSD never had consumer for that functionality,
while its implementation polluted many unrelated places..
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Richard Elling <richard.elling@richardelling.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Prakash Surya <prakash.surya@delphix.com>
illumos/illumos-gate@244781f10d
This patch attempts to reduce lock contention on the current arc_state_t
mutexes. These mutexes are used liberally to protect the number of LRU
lists within the ARC (e.g. ARC_mru, ARC_mfu, etc). The granularity at
which these locks are acquired has been shown to greatly affect the
performance of highly concurrent, cached workloads.
root disk. The embedded image is linked into the kernel in the .mfs
section.
Add rules and variables to kern.pre.mk and kern.post.mk that handle the
linking of the image. First objcopy is used to generate an object file.
Then, the object file is linked into the kernel.
Submitted by: Steve Kiernan <stevek@juniper.net>
Reviewed by: brooks@
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D2903
frame buffers and memory mapped UARTs.
1. Delay calling cninit() until after pmap_bootstrap(). This makes
sure we have PMAP initialized enough to add translations. Keep
kdb_init() after cninit() so that we have console when we need
to break into the debugger on boot.
2. Unfortunately, the ATPIC code had be moved as well so as to
avoid a spurious trap #30. The reason for which is not known
at this time.
3. In pmap_mapdev_attr(), when we need to map a device prior to the
VM system being initialized, use virtual_avail as the KVA to map
the device at. In particular, avoid using the direct map on amd64
because we can't demote by virtue of not being able to allocate
yet. Keep track of the translation.
Re-use the translation after the VM has been initialized to not
waste KVA and to satisfy the assumption in uart(4) that the handle
returned for the low-level console is the same as later returned
when the device is probed and attached.
4. In pmap_unmapdev() remove the mapping from the table when called
pre-init. Otherwise keep the mapping. During bus probe and attach
device resources are mapped and unmapped multiple times, which
would have us destroy the mapping used by the low-level console.
5. In pmap_init(), set pmap_initialized to signal that we're not
pre-init anymore. On amd64, bring the direct map in sync with the
translations created at that time.
6. Implement bus_space_map() and bus_space_unmap() for real: when
the tag corresponds to memory space, call the corresponding
pmap_mapdev() and pmap_unmapdev() functions to construct and
actual handle.
7. In efifb.c and vt_vga.c, remove the crutches and hacks and simply
call pmap_mapdev_attr() or bus_space_map() as desired.
Notes:
1. uart(4) already used bus_space_map() during low-level console
setup but since serial ports have traditionally been I/O port
based, the lack of a proper implementation for said function
was not a problem. It has always supported memory mapped UARTs
for low-level consoles by setting hw.uart.console accordingly.
2. The use of the direct map on amd64 without setting caching
attributes has been a bigger problem than previously thought.
This change has the fortunate (and unexpected) side-effect of
fixing various EFI frame buffer problems (though not all).
PR: 191564, 194952
Special thanks to:
1. XipLink, Inc -- generously donated an Intel Bay Trail E3800
based eval board (ADLE3800PC).
2. The FreeBSD Foundation, in particular emaste@ -- for UEFI
support in general and testing.
3. Everyone who tested the proposed for PR 191564.
4. jhb@ and kib@ for being a soundboard and applying a clue bat
if so needed.
changes in the firmwares since 1.11.27.0 are listed here (straight copy-paste
from the "Release Notes.txt" accompanying the Chelsio Unified Wire 2.11.1.0
release on the website).
22.1. T5 Firmware
+++++++++++++++++++++++++++++++++
Version : 1.14.4.0
Date : 08/05/2015
================================================================================
FIXES
-----
BASE:
- Fixes a potential data path hang by properly programming PMTX congestion
threshold settings.
- Fixes a potential initialization error when accessing a configuration file
stored on the flash.
- Fixes a regression where SGE resources can be miss-sized if iWARP is disabled.
ETH:
- Fixes a timing issue that would prevent CR4 links from coming up with some
switches.
FOFCoE:
- Defers fcoe linkdown mailbox command handling till LOGO is sent.
- Updates vlan prio for all outstanding IOs during dcbx update.
ENHANCEMENTS
------------
BASE:
- Adds support for PAUSE OFF watchdog.
- Reports devlog access information in PCIE_FW_PF register 7.
ETH:
- Enhances segmentation offload to include VxLAN and Geneve.
- Adds PTP support.
- Adds new interface to allow the driver to query the VI rss table base
addresses.
- Allows the driver to program the SGE ingrext contxt CongDrop field.
OFLD:
- Adds new interface for the driver to specify offloaded connections TCP snd
and rcv scale factors.
iSCSI:
- Adds support for iscsi segmentatation offload (ISO).
- Adds support for iscsi t10-dif offload.
FOiSCSI:
- Sets FORCE_BIT for cut through processing for FOiSCSI.
FOFCoE:
- Adds support for FCoE BB6.
- Improves WRITE performance.
================================================================================
================================================================================
Version : 1.13.32.0
Date : 03/25/2015
================================================================================
FIXES
-----
BASE:
- Fixes FW_CAPS_CONFIG_CMD return value on error (was positive instead of
negative)
- Fixes FW_PARAMS_PARAM_DEV_FLOWC_BUFFIFO_SZ indication (was wrong on certain
adapter configurations)
- Fixes config file based PL_TIMEOUT register programming
ETH:
- Fixes a potential EO UDP SEG header corruption
- Fixes an issue where 1000Base-X was not enabled correctly when using QSA
modules
OFLD:
- Fixes timeout issue with half-open connections
- Fixes FW_FLOWC_WR processing when state is set to finwait1
FOFCoE:
- Fixes fcoe xchg leaks in linkdown/peer down path
- Fixes cleanup in FCoE linkdown and fixed buf timer flowid abuse
- Fixes fw crash by clearing fcf flowc during bye
FOiSCSI:
- Don't create a new tcp socket if ERL0 attempt has timed out.
ENHANCEMENTS
------------
BASE:
- Adds support for VFs on PFs 4 to 7
- Adds support for QPs/CQs on any physical and virtual function
ETH:
- Stops sending LACP frames on loopback interface
- Adds an AUTOEQU indication to CPL_SGE_EGR_UPDATE
- Adds support for CR4 links (BEAN/AEC on 40G TwinAx cables)
OFLD:
- Improves default settings of LAN and CLUSTER TCP timer settings
- Sends Negative Advice CPLs to software
FOISCSI:
- Adds IPv6 support for foiscsi. Keeps backward compatibility with
old foiscsi drivers which doesn't support ipv6.
FOFCoE:
- Added fcoe debug support in flowc dump
================================================================================
================================================================================
Version : 1.12.25.0
Date : 10/22/2014
================================================================================
FIXES
-----
BASE:
- Improves precision of the Weight Round Robing Traffic Management Algorithm
- Fixes an issue where the link would intermittently fail to come up
- Fixes an issue where adapters with an external PHY couldn't run at 100Mbps
- Fixes an issue where active optical cables were not recognized
- Fixes link advertising issues on T520-BT (speed and pause frames) that would
cause the link to negotiate unexpected settings
- Forces link restart when auto-negotiation is disabled
- Fix an issue where pause frames wouldn't be fully disabled even if requested
ETH:
- Fixes NVGRE Segmentation Offload network header generation.
DCBX:
- Fixes an issue where some settings were not being sent to the switch
correctly
- Fixes an issue where back-to-back DCBX port updates could get overwritten by
FW
- Fixes a firmware crash on DCBX APP information request before link up
FOiSCSI:
- Fixes abort task leak in tmf response handling
- Fixes TCP RST handling while in iSCSI ERL0
- Fixes a firmware crash on BYE without INIT
ENHANCEMENTS
-------------
BASE:
- Adds link partner settings reporting when available
- Adds QSA support (in conjunction with QSA VPD)
- Adds T520-BT LED support
- Reports NOTSUPPORTED for modules with an unhandled identifier
DCBX:
- Adds version reporting (indicating which version FW is trying to negotiate)
- Adds IEEE support
- Reports LLDP time outs
FOiSCSI:
- Add support for multiple iSCSI DDP client
- Sends DHCP renew request when lease expires
================================================================================
22.2. T4 Firmware
+++++++++++++++++
Version : 1.14.4.0
Date : 08/05/2015
================================================================================
FIXES
-----
BASE:
- Fixes a potential initialization error when accessing a configuration file
stored on the flash.
- Initialize PCIE_DBG_INDIR_REQ.Enable to 0, as hardware failed to do so and
register dumps could result in errors.
ETH:
- Fixes an issue that sometimes prevented the link from coming up in CR adapters.
ENHANCEMENTS
------------
BASE:
- Adds support for PAUSE OFF watchdog.
- Reports devlog access information in PCIE_FW_PF register 7.
ETH:
- Adds new interface to allow the driver to query the VI rss table base
addresses.
OFLD:
- Adds new interface for the driver to specify offloaded connections TCP snd
and rcv scale factors.
================================================================================
================================================================================
Version : 1.13.32.0
Date : 03/25/2015
================================================================================
FIXES
-----
BASE:
- Fixes FW_CAPS_CONFIG_CMD return value on error (was positive instead of
negative)
- Fixes FW_PARAMS_PARAM_DEV_FLOWC_BUFFIFO_SZ indication (was wrong on certain
adapter configurations)
- Fixes config file based PL_TIMEOUT register programming
ETH:
- Fixes a potential EO UDP SEG header corruption
OFLD:
- Fixes timeout issue with half-open connections
- Fixes FW_FLOWC_WR processing when state is set to finwait1
FOiSCSI:
- Don't create a new tcp socket if ERL0 attempt has timed out.
ENHANCEMENTS
------------
ETH:
- Stops sending LACP frames on loopback interface
- Adds an AUTOEQU indication to CPL_SGE_EGR_UPDATE
OFLD:
- Improves default settings of LAN and CLUSTER TCP timer settings
- Sends Negative Advice CPLs to software
================================================================================
================================================================================
Version : 1.12.25.0
Date : 10/22/2014
================================================================================
FIXES
-----
BASE:
- Improves precision of the Weight Round Robing Traffic Management Algorithm
- Forces link restart when auto-negotiation is disabled
- Fix an issue where pause frames wouldn't be fully disabled even if requested
DCBX:
- Fixes an issue where some settings were not being sent to the switch
correctly
- Fixes an issue where back-to-back DCBX port updates could get overwritten by
FW
- Fixes a firmware crash on DCBX APP information request before link up
FOiSCSI:
- Fixes abort task leak in tmf response handling
- Fixes TCP RST handling while in iSCSI ERL0
- Fixes a firmware crash on BYE without INIT
ENHANCEMENTS
------------
BASE:
- Adds link partner settings reporting when available
- Firmware now reports NOTSUPPORTED for modules with an unhandled identifier
DCBX:
- Adds version reporting (indicating which version FW is trying to negotiate)
- Adds IEEE support
- Reports LLDP time outs
FOiSCSI:
- Adds support for multiple iSCSI DDP clients
- Sends DHCP renew request when lease expires
================================================================================
Obtained from: Chelsio Communications
MFC after: 2 weeks
Sponsored by: Chelsio Communications
unwind_frame function to read each stack frame until either the pc or stack
are no longer withing the kernel's address space.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
This is copied from the amd64 version with minor changes. These should be
merged into a single file as from a quick look there are other copies of
the same file in other parts of the tree.
Obtained from: ABT Systems Ltd
Sponsored by: The FreeBSD Foundation
Brightness is controlled through sysctl dev.gpiobacklight.X.brightness:
- any value greater than 0: backlight is on
- any value less than or equal to 0: backlight is off
FDT bindings docs in Linux tree:
Documentation/devicetree/bindings/video/backlight/gpio-backlight.txt
This feature is inspired by another Unix-alike OS commonly found on
airplane headrests.
A number of beasties[0] are drawn at top of framebuffer during boot,
based on the number of active SMP CPUs[1]. Console buffer output
continues to scroll in the screen area below beastie(s)[2].
After some time[3] has passed, the beasties are erased leaving the
entire terminal for use.
Includes two 80x80 vga16 beastie graphics and an 80x80 vga16 orb
graphic. (The graphics are RLE compressed to save some space -- 3x 3200
bytes uncompressed, or 4208 compressed.)
[0]: The user may select the style of beastie with
kern.vt.splash_cpu_style=(0|1|2)
[1]: Or the number may be overridden with tunable kern.vt.splash_ncpu.
[2]: https://www.youtube.com/watch?v=UP2jizfr3_o
[3]: Configurable with kern.vt.splash_cpu_duration (seconds, def. 10).
Differential Revision: https://reviews.freebsd.org/D2181
Reviewed by: dumbbell, emaste
Approved by: markj (mentor)
MFC after: 2 weeks
in lockstat.ko. This means that lockstat probes now have typed arguments and
will utilize SDT probe hot-patching support when it arrives.
Reviewed by: gnn
Differential Revision: https://reviews.freebsd.org/D2993
Summary:
For CloudABI we need to put two things on the stack of new processes:
the argument data (a binary blob; not strings) and a startup data
structure. The startup data structure contains interesting things such
as a pointer to the ELF program header, the thread ID of the initial
thread, a stack smashing protection canary, and a pointer to the
argument data.
Fetching system call arguments and setting the return value is similar
to FreeBSD. The only differences are that system call 0 does not exist
and that we call into cloudabi_convert_errno() to convert the error
code. We also need this function in a couple of other places, so we'd
better reuse it here.
Reviewers: dchagin, kib
Reviewed By: kib
Subscribers: imp
Differential Revision: https://reviews.freebsd.org/D3098
Comment that cryptodev shouldn't be used unless you know what you're
doing...
The various arm/mips and one powerpc configs that have cryptodev in
them need to be addressed, audited if they provide benefit and removed
if they don't...
- Tweek man page.
- Remove all mention of RANDOM_FORTUNA. If the system owner wants YARROW or DUMMY, they ask for it, otherwise they get FORTUNA.
- Tidy up headers a bit.
- Tidy up declarations a bit.
- Make static in a couple of places where needed.
- Move Yarrow/Fortuna SYSINIT/SYSUNINIT to randomdev.c, moving us towards a single file where the algorithm context is used.
- Get rid of random_*_process_buffer() functions. They were only used in one place each, and are better subsumed into those places.
- Remove *_post_read() functions as they are stubs everywhere.
- Assert against buffer size illegalities.
- Clean up some silly code in the randomdev_read() routine.
- Make the harvesting more consistent.
- Make some requested argument name changes.
- Tidy up and clarify a few comments.
- Make some requested comment changes.
- Make some requested macro changes.
* NOTE: the thing calling itself a 'unit test' is not yet a proper
unit test, but it helps me ensure things work. It may be a proper
unit test at some time in the future, but for now please don't make
any assumptions or hold any expectations.
Differential Revision: https://reviews.freebsd.org/D2025
Approved by: so (/dev/random blanket)
ACPI driver requires special functions to be provided by machdep code.
Add temporary stubs to satisfy the compiler when both "pci" and "acpi"
are enabled in the kernel configuration file.
Reviewed by: andrew
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3028
This is based on work done by jeff@ and jhb@, as well as the numa.diff
patch that has been circulating when someone asks for first-touch NUMA
on -10 or -11.
* Introduce a simple set of VM policy and iterator types.
* tie the policy types into the vm_phys path for now, mirroring how
the initial first-touch allocation work was enabled.
* add syscalls to control changing thread and process defaults.
* add a global NUMA VM domain policy.
* implement a simple cascade policy order - if a thread policy exists, use it;
if a process policy exists, use it; use the default policy.
* processes inherit policies from their parent processes, threads inherit
policies from their parent threads.
* add a simple tool (numactl) to query and modify default thread/process
policities.
* add documentation for the new syscalls, for numa and for numactl.
* re-enable first touch NUMA again by default, as now policies can be
set in a variety of methods.
This is only relevant for very specific workloads.
This doesn't pretend to be a final NUMA solution.
The previous defaults in -HEAD (with MAXMEMDOM set) can be achieved by
'sysctl vm.default_policy=rr'.
This is only relevant if MAXMEMDOM is set to something other than 1.
Ie, if you're using GENERIC or a modified kernel with non-NUMA, then
this is a glorified no-op for you.
Thank you to Norse Corp for giving me access to rather large
(for FreeBSD!) NUMA machines in order to develop and verify this.
Thank you to Dell for providing me with dual socket sandybridge
and westmere v3 hardware to do NUMA development with.
Thank you to Scott Long at Netflix for providing me with access
to the two-socket, four-domain haswell v3 hardware.
Thank you to Peter Holm for running the stress testing suite
against the NUMA branch during various stages of development!
Tested:
* MIPS (regression testing; non-NUMA)
* i386 (regression testing; non-NUMA GENERIC)
* amd64 (regression testing; non-NUMA GENERIC)
* westmere, 2 socket (thankyou norse!)
* sandy bridge, 2 socket (thankyou dell!)
* ivy bridge, 2 socket (thankyou norse!)
* westmere-EX, 4 socket / 1TB RAM (thankyou norse!)
* haswell, 2 socket (thankyou norse!)
* haswell v3, 2 socket (thankyou dell)
* haswell v3, 2x18 core (thankyou scott long / netflix!)
* Peter Holm ran a stress test suite on this work and found one
issue, but has not been able to verify it (it doesn't look NUMA
related, and he only saw it once over many testing runs.)
* I've tested bhyve instances running in fixed NUMA domains and cpusets;
all seems to work correctly.
Verified:
* intel-pcm - pcm-numa.x and pcm-memory.x, whilst selecting different
NUMA policies for processes under test.
Review:
This was reviewed through phabricator (https://reviews.freebsd.org/D2559)
as well as privately and via emails to freebsd-arch@. The git history
with specific attributes is available at https://github.com/erikarn/freebsd/
in the NUMA branch (https://github.com/erikarn/freebsd/compare/local/adrian_numa_policy).
This has been reviewed by a number of people (stas, rpaulo, kib, ngie,
wblock) but not achieved a clear consensus. My hope is that with further
exposure and testing more functionality can be implemented and evaluated.
Notes:
* The VM doesn't handle unbalanced domains very well, and if you have an overly
unbalanced memory setup whilst under high memory pressure, VM page allocation
may fail leading to a kernel panic. This was a problem in the past, but it's
much more easily triggered now with these tools.
* This work only controls the path through vm_phys; it doesn't yet strongly/predictably
affect contigmalloc, KVA placement, UMA, etc. So, driver placement of memory
isn't really guaranteed in any way. That's next on my plate.
Sponsored by: Norse Corp, Inc.; Dell
and psci to start them. I expect ACPI support to be added later.
This has been tested on qemu with 2 cpus as that is the current value of
MAXCPUS. This is expected to be increased in the future as FreeBSD has
been tested on 48 cores on the Cavium ThunderX hardware.
Partially based on a patch from Robin Randhawa from ARM.
Approved by: ABT Systems Ltd
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3024
CloudABI is a pure capability-based runtime environment for UNIX. It
works similar to Capsicum, except that processes already run in
capabilities mode on startup. All functionality that conflicts with this
model has been omitted, making it a compact binary interface that can be
supported by other operating systems without too much effort.
CloudABI is 'secure by default'; the idea is that it should be safe to
run arbitrary third-party binaries without requiring any explicit
hardware virtualization (Bhyve) or namespace virtualization (Jails). The
rights of an application are purely determined by the set of file
descriptors that you grant it on startup.
The datatypes and constants used by CloudABI's C library (cloudlibc) are
defined in separate files called syscalldefs_mi.h (pointer size
independent) and syscalldefs_md.h (pointer size dependent). We import
these files in sys/contrib/cloudabi and wrap around them in
cloudabi*_syscalldefs.h.
We then add stubs for all of the system calls in sys/compat/cloudabi or
sys/compat/cloudabi64, depending on whether the system call depends on
the pointer size. We only have nine system calls that depend on the
pointer size. If we ever want to support 32-bit binaries, we can simply
add sys/compat/cloudabi32 and implement these nine system calls again.
The next step is to send in code reviews for the individual system call
implementations, but also add a sysentvec, to allow CloudABI executabled
to be started through execve().
More information about CloudABI:
- GitHub: https://github.com/NuxiNL/cloudlibc
- Talk at BSDCan: https://www.youtube.com/watch?v=SVdF84x1EdA
Differential Revision: https://reviews.freebsd.org/D2848
Reviewed by: emaste, brooks
Obtained from: https://github.com/NuxiNL/freebsd
Add ARM ITS (Interrupt Translation Services) support required
to bring-up message signalled interrupts on some ARM64 platforms.
Obtained from: Semihalf
Sponsored by: The FreeBSD Foundation
Summary:
Both booke and AIM interrupt.c files contain nearly identical code. This merges
the two files, to reduce duplication.
Reviewers: #powerpc, marcel
Reviewed By: marcel
Subscribers: imp
Differential Revision: https://reviews.freebsd.org/D2991
directory sys/contrib/libnv.
The goal of this operation is to NOT install header files which shouldn't
be used outside the nvlist library.
Approved by: pjd (mentor)
make the find it does extremely expensive, so compute it only
once. Also make sure the 'traditional' module building method works at
the expense of a bit of duplicated code.