220559 Commits

Author SHA1 Message Date
Jonathan T. Looney
e80039007a Do some minimal work to better conform to the 802.3ad (LACP) standard.
In particular, don't set the synchronized bit for the peer unless it truly
appears to be synchronized to us. Also, don't set our own synchronized bit
unless we have actually seen a remote system.

Prior to this change, we were seeing some strange behavior, such as:

1. We send an advertisement with the Activity, Aggregation, and Default
flags, followed by an advertisement with the Activity, Aggregation,
Synchronization, and Default flags. However, we hadn't seen an
advertisement from another peer and were still advertising the default
(NULL) peer. A closer examination of the in-kernel data structures (using
kgdb) showed that the system had added the default (NULL) peer as a valid
aggregator for the segment.
2. We were receiving an advertisement from a peer that included the
default (NULL) peer instead of including our system information. However,
we responded with an advertisement that included the Synchronization flag
for both our system and the peer. (Since the peer's advertisement did not
include our system information, we shouldn't add the synchronization bit
for the peer.)

This patch corrects those two items.

Reviewed by:	smh
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D9485
2017-02-26 00:19:02 +00:00
Warner Losh
2379d1d6ed Move inclusion of opt_printf.h around so that we can compile all the
SCSI modules outside of a sub-build from the kernel.

Differential Revision: https://reviews.freebsd.org/D9653
Sponsored by: Netflix
2017-02-25 22:11:10 +00:00
Edward Tomasz Napierala
e801ac7852 Fix linux_fstatfs() to return proper value for f_frsize. Without it,
linux df(1) binary from Xenial shows garbage.

Reviewed by:	dchagin
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9692
2017-02-25 20:32:37 +00:00
Josh Paetzel
b98d22744f MFV 314276
7570 tunable to allow zvol SCSI unmap to return on commit of txn to ZIL

illumos/illumos-gate@1c9272b861
1c9272b861

https://www.illumos.org/issues/7570

  Based on the discovery that every unmap waits for the commit of the txn to the ZIL,
  introducing a very high latency to unmap commands, this behavior was made into a
  tunable zvol_unmap_sync_enabled and set to false. The net impact of this change is
  that by default SCSI unmap commands will result in space being freed within the zvol
  (today they are ignored and returned with good status). However, unlike the code
  today, instead of 18+ms per unmap, they take about 30us.

  With the testing done on NTFS against a Win2k12 target, the new behavior should work
  seamlessly. Files on the zvol that have already been set with the zfree application
  will continue to write 0's when deleted, and any new files created since zvol
  creation will send unmap commands when deleted. This behavior exists today, but with
  this change the unmap commands will be processed and result in reclaim of space.

Author: Stephen Blinick <stephen.blinick@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Approved by: Robert Mustacchi <rm@joyent.com>
2017-02-25 20:01:17 +00:00
Josh Paetzel
dbac7ab9c0 7570 tunable to allow zvol SCSI unmap to return on commit of txn to ZIL
illumos/illumos-gate@1c9272b861
1c9272b861

https://www.illumos.org/issues/7570

  Based on the discovery that every unmap waits for the commit of the txn to the ZIL,
  introducing a very high latency to unmap commands, this behavior was made into a
  tunable zvol_unmap_sync_enabled and set to false. The net impact of this change is
  that by default SCSI unmap commands will result in space being freed within the zvol
  (today they are ignored and returned with good status). However, unlike the code
  today, instead of 18+ms per unmap, they take about 30us.

  With the testing done on NTFS against a Win2k12 target, the new behavior should work
  seamlessly. Files on the zvol that have already been set with the zfree application
  will continue to write 0's when deleted, and any new files created since zvol
  creation will send unmap commands when deleted. This behavior exists today, but with
  this change the unmap commands will be processed and result in reclaim of space.

Author: Stephen Blinick <stephen.blinick@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com>
Approved by: Robert Mustacchi <rm@joyent.com>
2017-02-25 19:16:20 +00:00
Mariusz Zaborski
85b92f82b6 Remove unused macro from common/drv.c.
When we was compering it to code from boot2 it also looks like
this code is buggy and boot2 was never updated to use this code.
USE_XREAD flag is unused in boot2, and common/drv.c was never
build with that flag.

Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D9780
2017-02-25 18:14:32 +00:00
Andriy Gapon
9211bb327f l2arc: try to fix write size calculation broken by Compressed ARC commit
While there, make a change to not evict a first buffer outside the
requested eviciton range.

To do:
- give more consistent names to the size variables
- upstream to OpenZFS

PR:		216178
Reported by:	lev
Tested by:	lev
MFC after:	2 weeks
2017-02-25 17:03:48 +00:00
Andriy Gapon
1e1065b60f zfs: call spa_deadman on a taskqueue thread
callout(9) prohibits callout functions from sleeping.
illumos mutexes are emulated using sx(9).
spa_deadman() calls vdev_deadman() and the latter acquires vq_lock.

As a result we can get a more confusing panic instead of a specific
panic or no panic:
sleepq_add: td 0xfffff80019669960 to sleep on wchan 0xfffff8001cff4d88 with sleeping prohibited

This change adds another level of indirection where the deadman
callout schedules spa_deadman() to be executed on taskqueue_thread.

While there, use callout_schedule(0 instead of callout_reset()
in spa_sync().

Discussed with:	mav
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D9762
2017-02-25 16:45:53 +00:00
Andriy Gapon
9b43bc27c4 call vm_lowmem hook in uma_reclaim_worker
A comment near kmem_reclaim() implies that we already did that.
Calling the hook is useful, because some handlers, e.g. ARC,
might be able to release significant amounts of KVA.

Now that we have more than one place where vm_lowmem hook is called,
use this change as an opportunity to introduce flags that describe
a reason for calling the hook.  No handler makes use of the flags yet.

Reviewed by:	markj, kib
MFC after:	1 week
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D9764
2017-02-25 16:39:21 +00:00
Andriy Gapon
253495d764 chromebook_platform: catch up with ig4iic -> ig4iic_pci in r310621
Reported by:	Wolfgang Zenker <wolfgang@lyxys.ka.sub.org>
Tested by:	Wolfgang Zenker <wolfgang@lyxys.ka.sub.org>
MFC after:	5 days
2017-02-25 15:55:46 +00:00
Andriy Gapon
845ab1f5a0 add chromebook_platform.4 to the list of manual pages
Reported by:	Wolfgang Zenker <wolfgang@lyxys.ka.sub.org>
Pointyhat to:	avg
MFC after:	5 days
2017-02-25 14:50:53 +00:00
Josh Paetzel
029c0bfdbd MFV 314243
6676 Race between unique_insert() and unique_remove() causes ZFS fsid change

illumos/illumos-gate@40510e8eba
40510e8eba

https://www.illumos.org/issues/6676

  The fsid of zfs filesystems might change after reboot or remount. The problem seems to
  be caused by a race between unique_insert() and unique_remove(). The unique_remove()
  is called from dsl_dataset_evict() which is now an asynchronous thread. In a case the
  dsl_dataset_evict() thread is very slow and calls unique_remove() too late we will end
  up with changed fsid on zfs mount.

  This problem is very likely caused by #5056.

  Steps to Reproduce
  Note: I'm able to reproduce this always on a single core (virtual) machine. On multicore
  machines it is not so easy to reproduce.

# uname -a
SunOS openindiana 5.11 illumos-633aa80 i86pc i386 i86pc Solaris
# zfs create rpool/TEST
# FS=$(echo ::fsinfo | mdb -k | grep TEST | awk '{print $1}')
# echo $FS::print vfs_t vfs_fsid | mdb -k
vfs_fsid = {
    vfs_fsid.val = [ 0x54d7028a, 0x70311508 ]
}
# zfs umount rpool/TEST
# zfs mount rpool/TEST
# FS=$(echo ::fsinfo | mdb -k | grep TEST | awk '{print $1}')
# echo $FS::print vfs_t vfs_fsid | mdb -k
vfs_fsid = {
    vfs_fsid.val = [ 0xd9454e49, 0x6b36d08 ]
}
#

  Impact
  The persistent fsid (filesystem id) is essential for proper NFS functionality.
  If the fsid of a filesystem changes on remount (or after reboot) the NFS
  clients might not be able to automatically recover from such event and the
  manual remount of the NFS filesystems on every NFS client might be needed.

Author: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Dan Vatca <dan.vatca@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
2017-02-25 14:45:54 +00:00
Alexander Motin
a9d2a1930b Add reporting SAS protocol, in case we ever have one.
MFC after:	2 weeks
2017-02-25 14:36:24 +00:00
Alexander Motin
ebaf2c29d7 Use ctl_queue_sense() to implement sense data reporting.
USB MS BBB transport does not support autosense, so we have to queue any
sense data back to CTL for later fetching via REQUEST SENSE.
2017-02-25 14:24:29 +00:00
Alexander Motin
6563855634 Reenable CTL_WITH_CA, optimizing it for lower memory usage.
This code was disabled due to its high memory usage.  But now we need this
functionality for cfumass(4) frontend, since USB MS BBB transport does not
support autosense.

MFC after:	2 weeks
2017-02-25 14:20:30 +00:00
Alexander Motin
5e70e673d9 Update kern_data_resid according to r312291.
This now mandatory for correct operation.
2017-02-25 12:11:07 +00:00
Konstantin Belousov
aca4bb9112 Do not leak mount references for dying threads.
Thread might create a condition for delayed SU cleanup, which creates
a reference to the mount point in td_su, but exit without returning
through userret(), e.g. when terminating due to single-threading or
process exit.  In this case, td_su reference is not dropped and mount
point cannot be freed.

Handle the situation by clearing td_su also in the thread destructor
and in exit1().  softdep_ast_cleanup() has to receive the thread as
argument, since e.g. thread destructor is executed in different
context.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-02-25 10:38:18 +00:00
Konstantin Belousov
d360b49b1d Do not use ULL suffix. Cast to uint64_t where the suffix is needed,
and just remove it in another place.

Requested by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-02-25 10:32:49 +00:00
Warner Losh
28586889c2 Convert PCIe Hot Plug to using pci_request_feature
Convert PCIe hot plug support over to asking the firmware, if any, for
permission to use the HotPlug hardware. Implement pci_request_feature
for ACPI. All other host pci connections to allowing all valid feature
requests.

Sponsored by: Netflix
2017-02-25 06:11:59 +00:00
Warner Losh
8a1926c5c1 Rename pci_pcie_intr to pci_pcie_intr_hotplug.
Sponsored by: Netflix
2017-02-25 06:11:50 +00:00
Warner Losh
4cb6772936 Create pcib_request_feature.
pcib_request_feature allows drivers to request the firmware (ACPI)
release certain features it may be using. ACPI normally manages things
like hot plug, advanced error reporting and other features until the
OS requests ACPI to relenquish control since it is taking over.

Sponsored by: Netflix
2017-02-25 06:11:36 +00:00
Alexander Motin
03ea6ef2db Axe out some forever disabled questionable functionality.
This code is complicated enough even in its base shape.

MFC after:	2 weeks
2017-02-25 04:24:51 +00:00
Alexander Motin
3afd480696 Improve CAM target frontend reference counting.
Before this change it was possible to trigger some use-after-free panics
by disabling LUNs/ports under heavy load.

MFC after:	2 weeks
2017-02-25 04:04:11 +00:00
Enji Cooper
edb4a773cb Fill MK_LIBTHR as far as lib/libthr is concerned
There are other areas of the tree that will need to be evaluated for sanity
if they're supposed to be conditionally compiled out of the build/install,
like libzpool

MFC after:	1 month
Relnotes:	yes (this might break someone's system if have the knob set)
Sponsored by:	Dell EMC Isilon
2017-02-25 03:44:51 +00:00
Enji Cooper
dc53eebd2c Remove MK_OBJC block
It is no longer represented via src.conf(5)

MFC after:	3 days
Sponsored by:	Dell EMC Isilon
2017-02-25 03:35:26 +00:00
Josh Paetzel
acf1cb602b 6676 Race between unique_insert() and unique_remove() causes ZFS fsid change
illumos/illumos-gate@40510e8eba
40510e8eba

https://www.illumos.org/issues/6676

  The fsid of zfs filesystems might change after reboot or remount. The problem seems to
  be caused by a race between unique_insert() and unique_remove(). The unique_remove()
  is called from dsl_dataset_evict() which is now an asynchronous thread. In a case the
  dsl_dataset_evict() thread is very slow and calls unique_remove() too late we will end
  up with changed fsid on zfs mount.

  This problem is very likely caused by #5056.

  Steps to Reproduce
  Note: I'm able to reproduce this always on a single core (virtual) machine. On multicore
  machines it is not so easy to reproduce.

# uname -a
SunOS openindiana 5.11 illumos-633aa80 i86pc i386 i86pc Solaris
# zfs create rpool/TEST
# FS=$(echo ::fsinfo | mdb -k | grep TEST | awk '{print $1}')
# echo $FS::print vfs_t vfs_fsid | mdb -k
vfs_fsid = {
    vfs_fsid.val = [ 0x54d7028a, 0x70311508 ]
}
# zfs umount rpool/TEST
# zfs mount rpool/TEST
# FS=$(echo ::fsinfo | mdb -k | grep TEST | awk '{print $1}')
# echo $FS::print vfs_t vfs_fsid | mdb -k
vfs_fsid = {
    vfs_fsid.val = [ 0xd9454e49, 0x6b36d08 ]
}
#

  Impact
  The persistent fsid (filesystem id) is essential for proper NFS functionality.
  If the fsid of a filesystem changes on remount (or after reboot) the NFS
  clients might not be able to automatically recover from such event and the
  manual remount of the NFS filesystems on every NFS client might be needed.

Author: Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: Dan Vatca <dan.vatca@gmail.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
2017-02-25 03:34:22 +00:00
Enji Cooper
31c3fb774a Remove MK_CRYPT stub
It doesn't directly control what gets installed today; it indirectly
pulls other knobs (like MK_KERBEROS, etc).

MFC after:	1 weeks
Sponsored by:	Dell EMC Isilon
2017-02-25 03:33:09 +00:00
Enji Cooper
f0e1405514 Fill in MK_RESCUE by finding paths in ${DESTDIR}/rescue and adding
them to OLD_FILES/OLD_DIRS, as necessary.

MFC after:	1 month
Sponsored by:	Dell EMC Isilon
2017-02-25 03:28:49 +00:00
Enji Cooper
5673173e11 Conditionally compile certain programs into rescue(8) if requested
MK_CCD - ccdconfig
MK_ROUTED - routed, rtquery

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-02-25 03:23:11 +00:00
Enji Cooper
6a6050accf Add shutdown/poweroff support to rescue(8)
shutdown is a safer way to power off than reboot (in general), because of
the added shutdown process that it executes via /etc/rc.shutdown . It was
odd that it was missing from rescue(8) since reboot and friends were
added in past commits.

While here, alias poweroff to shutdown for parity with sbin/shutdown/Makefile

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-02-25 03:11:08 +00:00
Andriy Voskoboinyk
d7de0a2c2a iwn: some initialization / RF switch state change fixes.
- Check return code from initialization path; otherwise, vap state
may be wrong after an error.
- Do not try to run iwn_stop() / iwn_init() multiple times.
- Merge iwn_radio_on/off() and move RFKILL bit check into the task.
- Try to handle possible RF switch state change in S3 state (PR 181694).

PR:		181694
Reviewed by:	adrian
Differential Revision:	https://reviews.freebsd.org/D9797
2017-02-25 00:40:50 +00:00
Enji Cooper
15f0eea574 Parameterize out the length of struct filed->f_lasttime as MAXDATELEN
This removes the hardcoded value for the field (16) and the equivalent
hardcoded lengths in logmsg(..).

This change is being done to help stage future work to add RFC5424/RFC5434
support to syslogd(8).

Obtained from:	Isilon OneFS (dcd33d13da) (as part of a larger change)
Submitted by:	John Bauman <john.bauman@isilon.com>
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-02-25 00:12:29 +00:00
Warner Losh
35a419a2cb Exit when we can't print a variable.
Exit after printing a message on stderr when we can't get a
message. This is slightly different than linux, but keeps shell
scripts from thinking the value of the variable is the error message
and so is a net win.

Sponsored by: Netflix
2017-02-25 00:09:26 +00:00
Warner Losh
589c673b32 Don't convert ENOENT to nothing for individual lookup, just for the
iterative get_next interface. This prevents efivar(3) from printing 4k
of 0's when a variable isn't set.

Sponsored by: Netflix
2017-02-25 00:09:21 +00:00
Warner Losh
d8fab838af Make nvmecontrol logpage -p help list known pages.
Make -p help and -v help list all the pages we know about.
Add -v to usage.
Update the man page.

Sponsored by: Netflix
2017-02-25 00:09:16 +00:00
Warner Losh
e71bab696f Exit with usage if argv[1] is NULL in dispatch. This fixes core dumps
when a command has subcommands, but the user doesn't give the
parameters on the command line.

Sponsored by: Netflix
2017-02-25 00:09:12 +00:00
Warner Losh
56e101b337 Fix typos in output.
Sponsored by: Netflix
2017-02-25 00:09:02 +00:00
Enji Cooper
8fe70bb8f2 Use SRCTOP instead of .CURDIR relative paths with ".."
This simplifies pathing in make/displayed output

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-02-24 21:35:59 +00:00
Mahdi Mokhtari
bd911530b7 Add linux_preadv() and linux_pwritev() syscalls to Linuxulator.
Reviewed by:	dchagin
Approved by:	dchagin, trasz (src committers)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D9722
2017-02-24 20:04:02 +00:00
Dmitry Chagin
8665c4d9cd Revert r314217. Commit is not match that I have approved. 2017-02-24 19:47:27 +00:00
Mahdi Mokhtari
21d23e3249 Add linux_preadv() and linux_pwritev() syscalls to Linuxulator.
Reviewed by:	dchagin
Approved by:	dchagin, trasz (src committers)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D9722
2017-02-24 19:22:17 +00:00
Jonathan T. Looney
a81ce6e9d5 We have seen several cases recently where we appear to get a double-fault:
We have an original panic. Then, instead of writing the core to the dump
device, the kernel has a second panic: "smp_targeted_tlb_shootdown:
interrupts disabled". This change is an attempt to fix that second panic.

When the other CPUs are stopped, we can't notify them of the TLB shootdown,
so we skip that operation. However, when the CPUs come back up, we
invalidate the TLB to ensure they correctly observe any changes to the
page mappings.

Reviewed by:	kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D9786
2017-02-24 18:56:00 +00:00
Hans Petter Selasky
0f32531a56 Implement more string functions in the LinuxKPI.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-24 17:36:55 +00:00
Hans Petter Selasky
5a5a8c8a17 Prototype device structure to ensure LinuxKPI header file can be
included standalone.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-24 17:03:14 +00:00
Allan Jude
f91d1c83cd Remove control+r handling from geliboot's pwgets()
pwgets() is based on ngets() from libstand, which includes a feature
that is not wanted in a very of the function designed for password
handling.

Pressing control+r echos out the entered string

This commit removes that feature from pwgets()

PR:		217298
Reported by:	ehaupt
Reviewed by:	kristof, tsoome, ehaupt
Sponsored by:	ScaleEngine Inc.
Differential Revision:	https://reviews.freebsd.org/D9782
2017-02-24 16:52:57 +00:00
Ruslan Bukin
e42e2a1edc Use correct macro for Synopsys UART driver declaration. 2017-02-24 16:37:35 +00:00
Konstantin Belousov
8cd5962571 Remove cpu_deepest_sleep variable.
On Core2 and older Intel CPUs, where TSC stops in C2, system does not
allow C2 entrance if timecounter hardware is TSC.  This is done by
tc_windup() which tests for TC_FLAGS_C2STOP flag of the new
timecounter and increases cpu_disable_c2_sleep if flag is set.  Right
now init_TSC_tc() only sets the flag if cpu_deepest_sleep >= 2, but
TSC is initialized too early for this variable to be set by
acpi_cpu.c.

There is no reason to require that ACPI reported C2 and deeper states
to set TC_FLAGS_C2STOP, so remove cpu_deepest_sleep test from
init_TSC_tc() condition.  And since this is the only use of the
variable, remove it at all.

Reported and submitted by:	Jia-Shiun Li <jiashiun@gmail.com>
Suggested by:	jhb
MFC after:	2 weeks
2017-02-24 16:11:55 +00:00
Adrian Chadd
9c8efa1dfb [iwm] add if_iwm_fw.c. 2017-02-24 15:17:43 +00:00
Alexander Motin
87de303c4a Respecting r314204 tighten ATIO cleanup requirements.
Every ATIO must complete with either successfully sent status or XPT_ABORT.

MFC after:	2 weeks
2017-02-24 14:48:17 +00:00
Hans Petter Selasky
959d6165a2 Implement srcu_dereference() macro in the LinuxKPI.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-24 14:40:15 +00:00