Commit Graph

238711 Commits

Author SHA1 Message Date
Andriy Gapon
f050611e7f MFV r342469: 9630 add lzc_rename and lzc_destroy to libzfs_core
illumos/illumos-gate@049ba636fa
049ba636fa

https://www.illumos.org/issues/9630
  Rename and destroy are very useful operations that deserve to be in
  libzfs_core.  And they are not hard to implement too.

MFC after:	2 weeks
Relnotes:	maybe
2018-12-26 10:37:41 +00:00
Andriy Gapon
48114cb5be 9630 add lzc_rename and lzc_destroy to libzfs_core
illumos/illumos-gate@049ba636fa
049ba636fa

https://www.illumos.org/issues/9630
  Rename and destroy are very useful operations that deserve to be in
  libzfs_core.  And they are not hard to implement too.

Reviewed by: Andy Stormont <astormont@racktopsystems.com>
Reviewed by: Matt Ahrens <matt@delphix.com>
Reviewed by: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>
2018-12-26 07:57:21 +00:00
Kyle Evans
7ce09314b2 bectl: use jail id as the default jail name for a boot environment
By default, bectl is setting the jail 'name' parameter to the boot
environment name, which causes an error when the boot environment name is
not a valid jail name. With the attached fix, when no name is supplied, the
default jail name will be the jail id - this is is the same behavior as the
jail command.

Additionally, this commit addresses two other bugs that prevented unjailing
in scenarios where the jail name does not match the boot environment name:

1. In 'bectl_locate_jail', 'mountpoint' is used to resolve the boot
  environment path, but really 'mounted' should be used. 'mountpoint' is the
  path where the zfs dataset will be mounted. 'mounted' is the path where
  the dataset is actually mounted.

2. in 'bectl_search_jail_paths', 'jail_getv' would fail after the first
  call. Which is fine, if the boot environment you're unjailing is the next
  one up. According to 'man jail_getv', it's expecting name and value
  strings. 'jail_getv' is being passed an integer for the lastjid, so amend
  that to use a string instead.

Test cases have been amended to reflect the bugs found.

PR:		233637
Submitted by:	Rob <rob.fx907_gmail.com>
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D18607
2018-12-25 15:18:41 +00:00
Hans Petter Selasky
a89c806508 Fix reading of USB sample rate descriptor for SPL Crimson Rev 1.
Read first one entry, then try to read the full rate descriptor table.

PR:			234380
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-12-25 10:15:48 +00:00
Alexander Motin
abeb9f61f9 Increase MTX_POOL_SLEEP_SIZE from 128 to 1024.
This value remained unchanged for 15 years, and now this bump reduces
lock spinning in GEOM and BIO layers while doing ~1.6M IOPS to 4 NVMe
on 72-core system from ~25% to ~5% by the cost of additional 28KB RAM.

While there, align struct mtx_pool fields to cache lines.

MFC after:	1 month
2018-12-24 23:52:35 +00:00
Alexander Motin
511662d083 Remove CAM SIM lock from NVMe SIM.
CAM does not require SIM lock since FreeBSD 10.4, and NVMe code never
required it at all, using per-queue locks instead.  This formally allows
parallel request submission in CAM mode as much as single per-device and
per-queue locks of CAM allow.

MFC after:	1 month
2018-12-24 23:28:11 +00:00
Conrad Meyer
1fa054c135 Enable sys/random.h #include from C++
And bump __FreeBSD_version, just in case.

PR:		234180
Submitted by:	Ralf van der Enden <tremere AT cainites.net>
MFC after:	5 days
2018-12-24 19:37:10 +00:00
Maxim Konovalov
d71831dbbb DragonFly 5.4.0, 5.4.1 and FreeBSD 12.0 releases added. 2018-12-24 16:36:39 +00:00
Rebecca Cran
a2581e8021 Activate support for efibootmgr(8) -b --bootnum parameter
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D18647
2018-12-24 15:38:36 +00:00
Chris Rees
261e62db4c Clarify kld_list format
PR:		docs/234248
Submitted by:	David Fiander
Submitted by:	Miroslav Lachman
2018-12-24 10:47:48 +00:00
Scott Long
46b9415ff9 Further refactoring for task management commands. Also fix a related
typo from the previous commit.
2018-12-24 06:14:32 +00:00
Scott Long
3921a9f75c Commands for user-initated device resets should come from the high-priority
allocator.  Prior to this change, they would leak from the normal allocator.
2018-12-24 05:54:36 +00:00
Scott Long
b7f1ee7970 First step in refactoring and fixing the error recovery and task management
code in the mpr and mps drivers.  Eliminate duplicated code and fix some
comments.
2018-12-24 05:05:38 +00:00
Cy Schubert
4db33e987e Remove an empty #if block.
The interesting thing is that looking through Darren's commit logs,
the line containing an extern ppsratecheck() definition was removed
from the v5-1-RELEASE branch but not from HEAD (I have taken his
CVS tree and converted it to GIT). There is a commit adding an
additional #if defined to the empty block. I can only assume that
this was intentional for something later. Looking through HEAD the
extern ppsratecheck() is there. However if we put it back it would
conflict with a static ppsratecheck() definition in fil.c when
building ipftest.

Therefore we remove this empty block.

ppsratecheck() is a function in the FreeBSD kernel. However ipftest
cannot call the ppsratecheck() in the kernel. Therefore one exists in
fil.c for use when building the userland ipftest utility which
approximates the packet filter in userland for testing of ipfilter
rules against packets captured with tcpdump.

MFC after:	1 week
2018-12-24 01:12:43 +00:00
Cy Schubert
3c65d0a054 Register a pre-commit review for ipfilter. 2018-12-24 01:12:22 +00:00
Pedro F. Giffuni
a5dabd6c3c Fix mismatch from r342379. 2018-12-23 20:51:13 +00:00
Konstantin Belousov
cbbdd28318 nvdimm SPA geom: Update bio fields needed for devstat_end_transaction_bio().
Reported by:	bde
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2018-12-23 19:14:31 +00:00
Konstantin Belousov
8690d4dea3 Allocate v_object for the new snapshot vnode.
The vnode is not opened, so it ends up with the malloced buffers otherwise.

Reported and tested by:	pho
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2018-12-23 18:54:09 +00:00
Konstantin Belousov
6c59824b31 Properly test for vmio buffer in bnoreuselist().
The presence of allocated v_object does not imply that the buffer is
necessary VMIO kind.  Buffer might has been allocated before the
object created, then the buffer is malloced.  Although we try to avoid
such situation, it seems to be still legitimate.

Reported and tested by:	pho
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2018-12-23 18:52:02 +00:00
Pedro F. Giffuni
09ed804717 gai_strerror() - Update string error messages according to RFC 3493.
Error messages in gai_strerror(3) vary largely among OSs.

For new software we largely replaced the obsoleted EAI_NONAME and
with EAI_NODATA but we never updated the corresponding message to better
match the intended use. We also have references to ai_flags and ai_family
which are not very descriptive for non-developer end users.

Bring new new error messages based on informational RFC 3493, which has
obsoleted RFC 2553, and make them consistent among the header adn
manpage.

MFC after:	1 month
Differentical Revision:	D18630
2018-12-23 18:15:48 +00:00
Cy Schubert
2686f69ed4 Remove NETBSD_PF. NETBSD_PF is a flag that defines whether the pfil(9)
framework is available. pfil(9) has been in FreeBSD since FreeBSD 5
and according to svn log was first committed to HEAD in 2000, therefore
it is safe to say the check is no longer needed in FreeBSD.

pfil(9) first appeared in NetBSD 1.3 (hence the name NETBSD_PF).
Therefore it is safe to say that it is supported by every NetBSD system
today. The framework also exists in illumos.

As ipfilter code is shared and exchanged between FreeBSD and NetBSD, and
at some point in the future illumos too, and as all three platforms have
pfil(9), the redundant NETBSD_PF #defines and #ifdefs are removed.

MFC after:	1 week
2018-12-23 05:10:36 +00:00
Simon J. Gerraty
dfd669ab38 Merge bmake-20181221 2018-12-23 01:05:52 +00:00
Bruce Evans
c907940bf9 Fix devstat on md devices, second attempt. r341765 depends on
g_io_deliver() finishing initialization of the bio, but g_io_deliver()
actually destroys the bio.  INVARIANTS makes the bug obvious by
overwriting the bio with garbage.

Restore the old order for calling devstat (except don't restore not calling
it for the error case), and translate to the devstat KPI so that this order
works.

Reviewed by:	kib
2018-12-22 22:59:11 +00:00
Cy Schubert
1e3ecb57e0 Remove the last vestiges of HP/UX from a FreeBSD-only ipfilter
source file.

MFC after:	1 week
2018-12-22 21:49:25 +00:00
Simon J. Gerraty
4e6c593faa Import bmake-20181221
o parse.c: ParseVErrorInternal use .PARSEDIR
  and apply if relative, and then use .PARSEFILE
  for consistent result.
o var.c: avoid SEGFAULT in .unexport-env
  when MAKELEVEL is not set
2018-12-22 21:32:17 +00:00
Vincenzo Maffione
58e185425a netmap: fix txsync check in netmap poll
To check if txsync can be skipped, it is necessary to look for
unseen TX space. However, this means comparing ring->cur
against ring->tail, rather than ring->head against ring->tail
(like nm_ring_empty() does).
This change also adds some more comments to explain the optimization
performed at the beginning of netmap_poll().

MFC after:	3 days
Sponsored by:	Sunny Valley Networks
2018-12-22 16:23:42 +00:00
Vincenzo Maffione
e1ed1fbdea netmap: fix bug in netmap_poll() optimization
The bug was introduced by r339639, although it is present in the upstream
netmap code since 2015. It is due to resetting the want_rx variable to
POLLIN, rather than resetting it to POLLIN|POLLRDNORM.
It only affects select(), which uses POLLRDNORM. poll() is not affected,
because it uses POLLIN.
Also, it only affects FreeBSD, because Linux skips the optimization
implemented by the piece of code where the bug occurs.

MFC after:	3 days
Sponsored by:	Sunny Valley Networks
2018-12-22 15:15:45 +00:00
Eugene Grosbein
8ebaf58450 ifconfig.4, lagg.4: fix documentation bug: -use_flowid needs to be used
to force local hash computation and disable usage of RSS hash
provided by driver.

PR:		234242
MFC after:	1 week
2018-12-22 11:38:54 +00:00
Bruce Evans
5ef4f86d7a Oops, rounddown() for the start was misspelled roundup() in r342295,
so only aligned starts worked.  This broke releasing caches in most
cases where the i/o size is smaller than the fs block size.
2018-12-22 09:31:55 +00:00
Kyle Evans
ac0a7e2a3c config(8): Remove all instances of an option when opting out
Quick follow-up to r342362: options can appear multiple times now, so
clean up all of them as needed. For non-OPTIONS options, this has no effect
since they're already de-duplicated.

MFC after:	1 week
X-MFC-With:	r342362
2018-12-22 06:08:06 +00:00
Kyle Evans
993e5c4fd2 config(8): Allow duplicate options to be specified
config(8)'s option handling has been written to allow duplicate options; if
the value changes, then the latest value is used and an informative message
is printed to stderr like so:

/usr/src/sys/amd64/conf/TEST: option "VERBOSE_SYSINIT" redefined from 0 to 1

Currently, this is only a possibility for cpu types, MAXUSERS, and
MACHINE_ARCH. Anything else duplicated in a config file will use the first
value set and error about duplicated options on subsequent appearances,
which is arguably unfriendly since one could specify:

include GENERIC
nooptions VERBOSE_SYSINIT
options VERBOSE_SYSINIT

to redefine the value later anyways.

Reported by:	mmacy
MFC after:	1 week
2018-12-22 06:02:34 +00:00
Warner Losh
9d0e9f8ef5 Try the first 256 units with nvmecontrol devlist.
The nvmecontrol code that did the devlist assumed that we had a
tightly-packed allocation of units. Since pci writing exists, this
isn't the case. Loop over the first 256 units, which is a reasonable
number of possible units.

Sponsored by: Netflix
2018-12-21 23:22:37 +00:00
Bruce Evans
416e232cc6 Fix clobbering of the fatchain cache for clustered i/o's when full
clustering is not done.  The bug caused extreme slowness for large
files in some cases.

There is no way to tell VOP_BMAP() how many blocks are wanted, so for
all file systems it has to waste time in some cases by searching for
more contiguous blocks than will be accessed.  For msdosfs, it also
clobbered the fatchain cache in these cases by advancing the cache to
point to the chain entry for block that won't be read.  This makes
the cache useless for the next sequential i/o (or VOP_BMAP()), so the
fat chain is searched from the beginning.  The cache only has 1 relevant
entry, so it is similarly useless for random i/o.

Fix this by only advancing the cache to point to the chain entry for
the first block that will be read.  Clustering uses results from
VOP_BMAP(), so when more than 1 block is read by clustering, the cache
is not advanced as optimally as before, but it is at most 1 cluster
size behind and searching the chain through the blocks for this cluster
doesn't take too long.
2018-12-21 21:17:45 +00:00
Navdeep Parhar
6c5c0137a9 Remove unused macros from t4_tom.h. 2018-12-21 20:46:45 +00:00
Conrad Meyer
86312e466c mps(4), mpr(4): remove SATA ID command cancellation hack
Add a generic mechanism to override mp?_wait_command's timeout behavior,
which continues to invoke reinit by default.  Invokers who set
cm_timeout_handler may avoid automatic reinit and do their own handling.

Adapt mp?sas_get_sata_identify to this mechanism and remove its callout
hack.

Reviewed by:	scottl
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D18614
2018-12-21 20:30:52 +00:00
Conrad Meyer
8277ce2b78 mps(4), mpr(4): Fix lifetime of command buffer for mp?sas_get_sata_identify
In the event that the ID command timed out, mps(4)/mpr(4) did not free the
command until it could be cancelled.  However, it freed the associated
buffer (cm_data).  Fix the lifetime issue by freeing the associated buffer
only after Abort Task or controller reset.

Reviewed by:	scottl
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D18612
2018-12-21 20:29:16 +00:00
Bruce Evans
8ec22c4d65 Quick fix for initialization of mnt_iosize_max. (This limit controls
mainly clustering and read-ahead.)  Copy the initialization from ffs,
and also copy a couple of lines of ffs's nearby style for initialization
order and whitespace.

A correct fix would de-duplicate the initialization and fix bitrot in it
instead of adding another instance of the duplication.  Complications to
use the size preferred by the device have been reduced to hard-coding
slightly pessimal and/or inconsistent defaults, using large code that was
almost needed to support the complications.

For msdosfs, the result was that mnt_iosize_max was DFTLPHYS (64K) but is
now MAXPHYS (128K).
2018-12-21 20:12:43 +00:00
Alexander Motin
1f03d0bae1 Fix passing wrong variables to nvlist_destroy() after r333446.
Reported by:	Alexander Fedorov (IT-Grad.ru)
MFC after:	5 days
2018-12-21 17:22:15 +00:00
Vincenzo Maffione
c2231fb0f8 netmap: update nmreplay(8)
Small modifications to the nmreplay man page.
Used igor and mandoc tools to fix warnings and errors.

Reviewed by:	bcr
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D18629
2018-12-21 14:45:10 +00:00
Vincenzo Maffione
4b45250941 netmap: nmreplay: import various fixes from upstream (2704a51839906)
Changelist:
    - General reformatting
    - Fix packet duplication in cons(). Whenever cons() reached the
      burst limit it would send all pending packets without advancing
      head. This caused the last injected packet to be sent again in
      the next round.
    - Fix full-speed transmissions after first loop.

MFC after:	3 days
2018-12-21 13:56:57 +00:00
Vincenzo Maffione
77a2baf551 netmap: move buf_size validation code to its own function
This code validates the netmap buf_size against the interface MTU
and maximum descriptor size, to make sure the values are consistent.
Moving this functionality to its own function is needed because this
function is also called by Linux-specific code.

MFC after:	3 days
2018-12-21 11:50:14 +00:00
Vincenzo Maffione
c52382bd40 netmap: pipes: make sure both ends use the same number of slots 2018-12-21 11:32:55 +00:00
Andrey V. Elsukov
a5178bca19 Allow use underscores and dots in service names without escaping.
PR:		234237
MFC after:	1 week
2018-12-21 10:41:45 +00:00
Bruce Evans
9e5ed8593f Use VOP_ADVISE() with POSIX_FADV_DONTNEED instead of IO_DIRECT to
implement not double-caching for reads from vnode-backed md devices.
Use VOP_ADVISE() similarly instead of !IO_DIRECT unsimilarly for writes.
Add a "cache" option to mdconfig to allow changing the default of not
caching.

This depends on a recent commit to fix VOP_ADVISE().  A previous version
had optimizations for sequential i/o's (merge the i/o's and only uncache
for discontiguous i/o's and for full blocks), but optimizations and
knowledge of block boundaries belong in VOP_ADVISE().  Read-ahead should
also be handled better, by supporting it in md and discarding it in
VOP_ADVISE().

POSIX_FADV_DONTNEED is ignored by zfs, but so is IO_DIRECT.

POSIX_FADV_DONTNEED works better than IO_DIRECT if it is not ignored,
since it only discards from the buffer cache immediately, while
IO_DIRECT also discards from the page cache immediately.

IO_DIRECT was not used for writes since it was claimed to be too slow,
but most of the slowness for writes is from doing them synchronously by
default.  Non-synchronous writes still deadlock in many cases.

IO_DIRECT only has a special implementation for ffs reads with DIRECTIO
configured.  Otherwise, if it is not ignored than it uses the buffer and
page caches normally except for discarding everything after each i/o,
and then it has much the same overheads as POSIX_FADV_DONTNEED.  The
overheads for reading with ffs and DIRECTIO were similar in tests of md.

Reviewed by:	kib
2018-12-21 08:15:31 +00:00
Bruce Evans
e6f6d8853c Fix missing (sub)options in usage message to prepare for adding a new one.
Reviewed by:	kib
2018-12-21 06:38:13 +00:00
Bruce Evans
2c0434acb0 Fix rounding in vop_stdadvise() for POSIX_FADV_NOREUSE (really
POSIX_FADV_DONTNEED).  The most broken case was for applications that
advise for the whole file and then do block-aligned i/o's 1 block at
a time.  Then advice is sent to VOP_ADVISE() 1 block at a time, but
in vop_stdadvise() the 1-block advice was turned into 0-block advice
for the buffer cache part.

The bugs were caused partly by callers representing the region as
(a_start, a_end), where a_end is actually the maximum, and everything
else representing the region as (start, end) where 'end' is actually
the end (1 after the maximum).  The maximum a_end must be rounded up,
but was rounded down.  Also, rounding to page boundaries was inconsistent.

The bugs and fixes have no effect for zfs and other file systems that
don't use the buffer cache or the page cache.  Most or all file systems
currently use the default VOP_FADVISE(), but it finds a null buffer cache
and a null page cache for file systems that don't use normal methods.

Reviewed by:	kib
2018-12-21 04:57:59 +00:00
Kirk McKusick
13c31c29ca Some filesystems (like cd9660 and ext3) require that VFS_STATFS()
be called before VFS_ROOT() is called. Move the call for VFS_STATFS()
so that it is done after VFS_MOUNT(), but before VFS_ROOT().
This change actually improves the robustness of the mount system
call because it returns an error rather than failing silently
when VFS_STATFS() returns failure.

Reported by:  Rebecca Cran <rebecca@bluestop.org>
Sponsored by: Netflix
2018-12-21 01:09:25 +00:00
Navdeep Parhar
ad025209ba cxgbe/iw_cxgbe: Remove redundant CTRs from c4iw_alloc/c4iw_rdev_open.
This information is readily available elsewhere.

Sponsored by:	Chelsio Communications
2018-12-20 22:39:58 +00:00
Navdeep Parhar
6bb034658d cxgbe/iw_cxgbe: Do not terminate CTRx messages with \n. 2018-12-20 22:31:07 +00:00
Rick Macklem
d493fe42f9 Add an UPDATING message for r342286. 2018-12-20 22:26:54 +00:00