249059 Commits

Author SHA1 Message Date
asomers
62ee35c06a Delete copypasta
Reported by:	rpokala
MFC after:	20 days
X-MFC-With:	329845
Sponsored by:	Spectra Logic Corp
2018-02-23 17:20:53 +00:00
asomers
3b1068d587 Add the ZFS test suite
It was originally written by Sun as part of the STF (Solaris test framework).
They open sourced it in OpenSolaris, then HighCloud partially ported it to
FreeBSD, and Spectra Logic finished the port.  We also added many testcases,
fixed many broken ones, and converted them all to the ATF framework.  We've had
help along the way from avg, araujo, smh, and brd.

By default most of the tests are disabled.  Set the disks Kyua variable to
enable them.

Submitted by:	asomers, will, justing, ken, brd, avg, araujo, smh
Sponsored by:	Spectra Logic Corp, HighCloud
2018-02-23 16:31:00 +00:00
imp
623826ceb4 Use bool instead of int for predicate functions relating to work
available.
2018-02-23 16:06:54 +00:00
kib
de90d5191c Do not return out of bound pointers from intr_lookup_source().
This hardens the code against driver and upper level bugs causing
invalid indexes used, e.g. on msi release.

Reported by:	gallatin
Reviewed by:	gallatin, hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D14470
2018-02-23 11:20:59 +00:00
wma
e19e0680ad powerpc64: add NVMe to GENERIC64
NVMe support is ready and should be compiled-in
to the ppc64 kernel.

Submitted by:          Wojciech Macek <wma@semihalf.org>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-02-23 07:43:52 +00:00
kevans
89dd1b63b1 lualoader: Track effective line number, use it for drawing
Takes into account hidden entries, so that we don't draw blank lines in
place of a hidden item.
2018-02-23 04:12:19 +00:00
imp
9a19781ae7 Floaty McFloatface is funnier...
Submitted by: emaste@
2018-02-23 04:06:15 +00:00
imp
f735e1eb15 Do not include float interfaces when using libsa.
We don't support float in the boot loaders, so don't include
interfaces for float or double in systems headers. In addition, take
the unusual step of spiking double and float to prevent any more
accidental seepage.
2018-02-23 04:04:25 +00:00
imp
c1023867ee When the LUA_FLOAT_TYPE != LUA_FLOAT_INT64, we can't reference float
or double so ifdef that code out when the numbers aren't float at all.

There's still references in the lmathlib.c, but we don't compile that
for the loader yet.

Differential Revision: https://reviews.freebsd.org/D14472
2018-02-23 04:04:18 +00:00
imp
0326683caf Centralize lua defines
We need to ensure that we defined Numbers as int64_t everywhere we
build for lua. Previously, we were compiling part of the code with
Numbers as int64_t and part as double. Move lua number definition to a
more-central location
2018-02-23 04:04:03 +00:00
kevans
ef8f5a3bc4 lualoader: Use "local function x()" instead of "local x = function()"
The latter is good, but the former is more elegant and clear about what 'x'
is. Adopt it, preferably only using the latter kind of notation where needed
as values for tables.
2018-02-23 04:03:07 +00:00
davidcs
c2d6bf66d7 1. Added support to offline a port if is error recovery on successful.
2. Sysctls to enable/disable driver_state_dump and error_recovery.
3. Sysctl to control the delay between hw/fw reinitialization and
   restarting the fastpath.
4. Stop periodic stats retrieval if interface has IFF_DRV_RUNNING flag off.
5. Print contents of PEG_HALT_STATUS1 and PEG_HALT_STATUS2 on heartbeat
   failure.
6. Speed up slowpath shutdown during error recovery.
7. link_state update using atomic_store.
8. Added timestamp information on driver state and minidump captures.
9. Added support for Slowpath event logging
10.Added additional failure injection types to simulate failures.
2018-02-23 03:36:24 +00:00
kevans
826e79ce74 lualoader: shallowCopyTable => deepCopyTable
I called it a shallow copy, but it wasn't really a shallow copy at all.
2018-02-23 03:18:24 +00:00
asomers
0d688befe3 libifconfig: multiple feature additions
Added the ability to:

* Create virtual interfaces
* Create vlan interfaces
* Get interface fib
* Get interface groups
* Get interface status
* Get nd6 info
* Get media status
* Get additional ifaddr info in a convenient struct
* Get vhids
* Get carp info
* Get lagg and laggport status
* Iterate over all interfaces and ifaddrs

And add more examples, too.

Note that this is a backwards-incompatible change. But that's ok, because it's
a private library.

MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D14463
2018-02-23 03:11:43 +00:00
kevans
5ea1f55a46 Add copyright notice to core.lua
I've also made some not-insignificant changes/additions to this file, to
include the added constants, ACPI changes, boot environment listing, and
some utility functions.
2018-02-23 02:53:50 +00:00
kevans
50b03cd46b Add SPDX tags to lua files 2018-02-23 02:51:35 +00:00
kevans
f358c43aae lualoader: Drop unused return values; we'll only use the first 2018-02-23 02:47:51 +00:00
pfg
832474ebd4 __printf_render_int(): small type change to match use.
Variable l is consistently used as an int rather than a char.
Sort names while here.

Obtained from:	Apple's Libc-1244.30.3
MFC after:	5 days
2018-02-23 01:11:57 +00:00
pfg
fc32530227 getpeereid(3): Fix behavior on failure to match documentation.
According to the getpeereid(3) documentation, on failure the value -1 is
returned and the global variable errno is set to indicate the error. We
were returning the error instead.

Obtained from:	Apple's Libc-1244.30.3
MFC after:	5 days
2018-02-23 00:28:00 +00:00
asomers
3102e5665c Fix numerous Coverity issues in mptutil
Most are memory or file descriptor leaks. Three were unannotated
fallthroughs in a switch/case statement. One was an integer overflow before
widen.

Reported by:	Coverity
CID:		1007463 1007462 1007461 1007460 1007459 1007458 1007457
CID:		1006855 1006854 1006853 1006852 1006851 1006850 1006849
CID:		1006848 1006845 1006844 1006843 1006842 1006841 1006840
CID:		1006839 1006838 1006837 1006836 1006835 1006834 1006833
CID:		1006832 1006831 1006831 1006830 1006829 1008334 1008170
CID:		1008169 1008168
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D11013
2018-02-23 00:17:50 +00:00
truckman
1e9bbe93d1 Decrease latency by not wrapping the idle loop's potentially lengthy
search for a thread to steal inside a critical section.  Since this
allows the search to be preempted, restart the search if preemption
happens since the search results found earlier may no longer be
valid.

Decrease the latency of starting a thread that may be assigned to
this CPU during the search by polling for incoming threads during
the search and switching to that thread instead of continuing the
search.

Test for stale search results and restart the search before going
through the expense of calling tdq_lock_pair().  Retry some tests
after grabbing the locks since things may have changed while waiting
to get both locks.

Eliminate special case handling for stealing from an SMT peer that
uses 1 as the steal threshold.  This can only succeed if a thread
has been assigned but our SMT peer has not yet started executing
it.  This is quite rare and when it happens the other SMT thread
is generally waiting for the same tdq lock that we hold.  Basically
both SMT threads are racing to grab the same spin lock.

Add the kern.sched.always_steal knob from a ULE patch by jeff@.

Incorporate another idea from Jeff's ULE patch.  If the sched_switch()
detects that the CPU is about to go idle, try to steal a thread
before switching to the idle thread.  Since the search for a thread
to steal has to be done inside a critical section in this context,
limit the impact on latency by adding the knob kern.sched.trysteal_limit
to limit the topological distance of the search and don't restart
the search if we detect stale results.  If this search can't find
an stealable thread, the idle loop can do a more complete search.
Also poll for threads being assigned to this CPU during the search
and switch to them instead of continuing the search.  This change
is responsibile for the majority of the improvement in parallel
buildworld times.

In sched_balance_group() change the minimum threshold from stealing
a thread from 1 to 2.  Poaching a newly assigned thread from a CPU
that is waking up hasn't yet switched to that thread from idle is
likely very rare and is likely to have the same lock race as is
seen when stealing threads in the idle loop.  Also use tdq_notify()
to kick the destintation CPU instead of always sending an IPI.
Update a stale comment, the number of transferable threads is not
calculated.

Reviewed by:	kib (earlier version)
Comments by:	avg, jeff, mav
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D12130
2018-02-23 00:12:51 +00:00
rpokala
85249cf415 jedec_dimm(4): report asset info and temperatures for DDR3 and DDR4 DIMMs
A super-set of the functionality of jedec_ts(4). jedec_dimm(4) reports asset
information (Part Number, Serial Number) encoded in the "Serial Presence
Detect" (SPD) data on JEDEC DDR3 and DDR4 DIMMs. It also calculates and
reports the memory capacity of the DIMM, in megabytes. If the DIMM includes
a "Thermal Sensor On DIMM" (TSOD), the temperature is also reported.

Reviewed by:	cem
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Panasas
Differential Revision:	https://reviews.freebsd.org/D14392
Discussed with:	avg, cem
Tested by:	avg, cem (previous version, no semantic changes)
2018-02-22 23:18:46 +00:00
ian
424d880d23 Add a missing line continuation.
How many commits does it take to get a simple module makefile working?
Apparently at least three.

Pointy hat to:  ian
2018-02-22 22:25:26 +00:00
mjg
f85647797f Fix up sysctl vfs.buffercache broken in r329612
Sample problem:
top: sysctl(vfs.bufspace...) expected 8, got 4

Reported by:	O. Hartmann <ohartmann walstatt.org>
2018-02-22 20:39:25 +00:00
kevans
0d36a565ea lualoader: Attend to some 80-col issues, pointed out by luacheck
Graphics have a tendency to cause 80-col issues, so make an exception to our
standard indentation guidelines for these graphics. This does not hamper
readability too badly.

Two 40-column strings of spaces is trivially replaced with
string.rep(" ", 80)
2018-02-22 20:10:23 +00:00
gonzo
1fc1663e02 [chvgpio] add GPIO driver for Intel Z8xxx SoC family
Add chvgpio(4) driver for Intel Z8xxx SoC family. This product
was formerly known as Cherry Trail but Linux and OpenBSD drivers
refer to it as Cherry View. This driver is derived from OpenBSD
one so the name is kept for alignment with another BSD system.

Submitted by:	Tom Jones <tj@enoti.me>
Reviewed by:	gonzo, wblock(man page)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D13086
2018-02-22 19:12:32 +00:00
kevans
4a91d690c2 Fix userboot w/ ZFS after r329725
r329725 cleaned up ZFS commands duplicated in multiple places, but userboot
was not setting HAVE_ZFS when MK_ZFS != "no". This resulted in a failure to
boot (as seen in PR 226118) in bhyve, with the following message:

/boot/userboot.so: Undefined symbol "ldi_get_size"

PR:		226118
Glanced at by:	imp
2018-02-22 18:49:53 +00:00
asomers
59acad2e43 nvmecontrol: fix build on amd64/clang
Broken by:	329824
Sponsored by:	Spectra Logic Corp
2018-02-22 17:47:16 +00:00
vangyzen
f84320cfc8 sched_ule: update a comment to reflect reality
MFC after:	3 days
Sponsored by:	Dell EMC
2018-02-22 17:09:26 +00:00
kevans
ba4f462a09 nvme: Unbreak LE builds after r329824
The parameter 'p' is unused if _BYTE_ORDER == _LITTLE_ENDIAN. Add in a
(void)p to fix the build.
2018-02-22 16:16:49 +00:00
kevans
5c3354db03 lua-lint: Add note about luacheck in ports, silence warning
luacheck was added in ports r462609.

Silence warning about cli_execute -- it's non-standard, but for our setup it
will be a standard global.
2018-02-22 15:29:57 +00:00
hselasky
fdcd80b449 Return correct error code to user-space when a system call receives a
signal in the LinuxKPI.

The read(), write() and mmap() system calls can return either EINTR or
ERESTART upon receiving a signal. Add code to figure out the correct
return value by temporarily storing the return code from the relevant
FreeBSD kernel APIs in the Linux task structure.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2018-02-22 15:29:19 +00:00
wma
2858f9ff6e NVMe: Add big-endian support
Remove bitfields from defined structures as they are not portable.
Instead use shift and mask macros in the driver and nvmecontrol application.

NVMe is now working on powerpc64 host.

Submitted by:          Michal Stanek <mst@semihalf.com>
Obtained from:         Semihalf
Reviewed by:           imp, wma
Sponsored by:          IBM, QCM Technologies
Differential revision: https://reviews.freebsd.org/D13916
2018-02-22 13:32:31 +00:00
avg
1c2b9038f9 another rework of getzfsvfs / getzfsvfs_impl code
This change is designed to account for yet another difference between
illumos and FreeBSD VFS.  In FreeBSD a filesystem driver is supposed to
clean up mnt_data in its VFS_UNMOUNT method because it's the last call
into the driver before a struct mount object is destroyed.  The VFS
drains all references to the object before destroying it, but for the
driver it's already as good as gone.
In contrast, illumos VFS provides another method, VFS_FREEVFS, that is
called when all references are drained.  So, the driver can keep its
data after VFS_UNMOUNT and clean it up in VFS_FREEVFS after all
references are gone. This is what ZFS does on illumos.
So there a reference to a filesystem is sufficient to guarantee that the
ZFS specific data, aka zfsvfs_t, stays around (even if the filesystem
gets unmounted).  In FreeBSD we need to vfs_busy the filesystem to get
the same guarantee.  vfs_ref guarantees only that the struct mount is
kept.

The following rules should be observed in getzfsvfs / getzfsvfs_impl on
FreeBSD:
- if we need access to zfsvfs_t then we must use vfs_busy
- if only we need to access struct mount (aka vfs_t), then vfs_ref is
  enough
- when illumos code actually needs only the vfs_t, they still can pass
  the zfsvfs_t and get the vfs_t from it;  that can work in FreeBSD if
  the filesystem is busied, but when it's just referenced then we have
  to pass the vfs_t explicitly
- we cannot call vfs_busy while holding a dataset because that creates a
  LOR with dp_config_rwlock

As a result:
- getzfsvfs_impl now only references the filesystem, same as in illumos,
  but unlike illumos it has to return the vfs_t
- the consumers are updated to account for the change
- getzfsvfs busies the filesystem (and drops the reference from
  getzfsvfs_impl)

Also, zfs_unmount_snap() now gets a busied a filesystem, references it
and then unbusies it essentially reverting actions done in getzfsvfs.
This is needed because the code may perform some checks that require the
zfsvfs_t.  So, those are done before the unbusying.

MFC after:	2 weeks
2018-02-22 13:06:27 +00:00
wma
6efdd1d6ac Add bsdlabel and fdisk to powerpc64
Submitted by:          Wojciech Macek <wma@semihalf.org>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-02-22 12:31:28 +00:00
imp
172bf78e52 Revert r329814 as well. It should have been in r329819. 2018-02-22 11:51:50 +00:00
avg
688400242b followup to r329556, completely remove the covered vnode assert
vrele() acquires the vnode lock only if the hold count drops to zero.
In other scenarios it needs only the interlock.  So,
zfsctl_snapdir_lookup() can race with vfs_mount_destroy() -> vrele()
such that the lookup adds a new reference and then vrele() drops the
mountpoint's reference and only then we check the reference count.
It would be just one in this case.

In fact, the assert should have been removed in r323483 when the code
learned how to deal with the uncovered vnode.

PR:		225795
MFC after:	4 days
X-MFC with:	r329556
2018-02-22 11:41:00 +00:00
imp
56151d954c Backout r329818, r329816 and r329815.
These aren't the commits I thought I was testing prior to
commit. Revert until I can sort out what happened and fix it.
2018-02-22 11:18:33 +00:00
imp
5e23abf5f2 Fix typo in last commit after last rebase before commit... 2018-02-22 10:55:23 +00:00
araujo
f846b0afd1 The firewall_type is ignored if not set in rc.conf or rc.conf.local,
after r190575 there is an option to call rc.firewall with the firewall_type
passed in as an argument.

Submitted by:	David P. Discher <dpd@dpdtech.com>
MFC after:	3 weeks.
Sponsored by:	iXsystems Inc.
Differential Revision:	https://reviews.freebsd.org/D14286
2018-02-22 08:25:39 +00:00
imp
25e0879e2e Combine BIO_DELETE requests for nda devices
Now that we're queueing BIO_DELETE requests in the CAM I/O scheduler,
it make sense to try to combine as many as possible into a single
request to send down to hardware. Hopefully, lots of larger requests
like this are better than lots of individual transactions.

Note for future: need to limit based on total size of the trim
request. Should also collapse adjacent ranges where possible to
increase the size of the max payload.

Sponsored by: Netflix
2018-02-22 05:44:00 +00:00
imp
f92b0f08d7 Introduce capacity flags for periphs
Introduce flags word to describe the capacities of the peripheral.
First bit will describe if the periph driver allows multiple
outstanding TRIMS to be active in a device.

Modify the I/O scheduler so that the nda driver can queue trims
for a while after the first one arrives. We'll queue until we see
a I/O scheduler tick, then we'll schedule as many TRIMs as allowed
by other factors (currently this is slocts in the NVMe controller).
This mariginally helps the read latency issues we see with reads,
but sets the stage for the nda driver to do TRIM collapsing like the
da and ada drivers do today.

Sponsored by: Netflix
2018-02-22 05:43:55 +00:00
imp
b15cf48a39 Note when we tick.
To help implement a policy of 'queue all trims until next I/O sched
tick' policy to help coalesce them, note when we tick so we can do
something special on the first call after the tick to get more work.

Sponsored by: Netflix
2018-02-22 05:43:50 +00:00
imp
1b29936de7 Wrap an extra long line
This debugging line is too big for even my largest xterm. wrap it at
about 80 columns.

Sponsored by: Netflix
2018-02-22 05:43:45 +00:00
imp
31fe0d8c8c Don't sort TRIMs.
While the code for ada and da both assume that the trim list is
ordered when doing the coaleascing the TRIMs, it turns out that
creating the sorted list uses more resources than are saved by having
slightly fewer trims sent to the device.

Sponsored by: Netflix
2018-02-22 05:43:20 +00:00
kevans
a22c49eb0c lualoader: Clear up an empty conditional branch
We will likely uncomment this whole monster in the near future.
2018-02-22 04:30:52 +00:00
kevans
6c747fa217 Add script for linting stand/lua to tools/boot.
We require some --globals due to custom loader extensions in our
environment. Add everything required for this to tools/boot so that other
interested parties can get up and go with linting our scripts and not get a
bunch of false-positives.
2018-02-22 04:28:52 +00:00
kevans
13421cd581 lualoader: Address some 'luacheck' concerns
luacheck pointed out an assortment of issues, ranging from non-standard
globals being created as well as unused parameters, variables, and redundant
assignments.

Using '_' as a placeholder for values unused (whether it be parameters
unused or return values unused, assuming multiple return values) feels clean
and gets the point across, so I've adopted it. It also helps flag candidates
for cleanup later in some of the lambdas I've created, giving me an easy way
to re-evaluate later if we're still not using some of these features.
2018-02-22 04:15:02 +00:00
mav
a5f379e05c MFV r329807:
8940 Sending an intra-pool resumable send stream may result in EXDEV

illumos/illumos-gate@544132fce3

"zfs send -t <token>" for an incremental send should be able to resume
successfully when sending to the same pool: a subtle issue in
zfs_iter_children() doesn't currently allow this.

Because resuming from a token requires "guid" -> "dataset" mapping
(guid_to_name()), we have to walk the whole hierarchy to find the right
snapshots to send.
When resuming an incremental send both source and destination live in the
same pool and have the same guid: this is where zfs_iter_children() gets
confused and picks up the wrong snapshot, so we end up trying to send an
incremental "destination@snap1 -> source@snap2" stream instead of
"source@snap1 -> source@snap2": this fails with an "Invalid cross-device
link" (EXDEV) error.

Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Hans Rosenfeld <rosenfeld@grumpf.hope-2000.org>
Author: loli10K <ezomori.nozomu@gmail.com>
2018-02-22 04:01:55 +00:00
kevans
8031156523 lualoader: Consistently use double quotes 2018-02-22 03:55:02 +00:00