117872 Commits

Author SHA1 Message Date
Andriy Gapon
c6fb364293 MFV r322221: 7910 l2arc_write_buffers() may write beyond target_sz
FreeBSD note: the essence of this change was committed to FreeBSD in
r314274.  This commit catches up with differences between what was
committed to FreeBSD and what was committed to OpenZFS, mainly more
logical variable names.

illumos/illumos-gate@16a7e5ac11

https://www.illumos.org/issues/7910
  It seems that the change in issue #6950 resurrected the problem that was
  earlier fixed by the change in issue #5219.
  Please also see the following FreeBSD bug report:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216178

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Author: Andriy Gapon <avg@FreeBSD.org>

MFC after:	2 weeks
2017-08-08 10:43:41 +00:00
Mark Johnston
c0589825fd Add round_jiffies_up(), local_clock() and __setup_timer() to the LinuxKPI.
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11871
2017-08-08 04:34:02 +00:00
Mark Johnston
48dac28d63 Add macros for defining attribute groups and for WO and RW attributes.
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11872
2017-08-08 04:30:22 +00:00
Marius Strobl
79f39c6aa1 - If available, use TRIM instead of ERASE for implementing BIO_DELETE.
  This also involves adding a quirk table, though, as TRIM is broken for some
  Kingston eMMC devices. Compared to ERASE (declared "legacy"
  in the eMMC specification v5.1), TRIM has the advantage of operating
  on write sectors rather than on erase sectors, which typically are
  of a much larger size. Thus, employing TRIM, we don't need to fiddle
  with coalescing BIO_DELETE requests that are also of (write) sector
  units into erase sectors, which might not even add up in all cases.
- For some SanDisk iNAND devices, the CMD38 argument, e. g. ERASE,
  TRIM etc., has to be specified via EXT_CSD[113], which now is also
  handled via a quirk.
- My initial understanding was that for eMMC partitions, the granularity
  should be used as erase sector size, e. g. 128 KB for boot partitions.
  However, rereading the relevant parts of the eMMC specification v5.1,
  this isn't actually correct. So drop the code which used partition
  granularities for delmaxsize and stripesize. For the most part, this
  change is a NOP, though, because a) for ERASE, mmcsd_delete() used
  the erase sector size unconditionally for all partitions anyway and
  b) g_disk_limit() doesn't actually take the stripesize into account.
- Take some more advantage of mmcsd_errmsg() in mmcsd(4) for making
  error codes human readable.
2017-08-07 23:33:05 +00:00
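
The granularity argument in the first bullet of 79f39c6aa1 can be illustrated with a minimal sketch. The helper name and the sector counts used with it are hypothetical and are not taken from mmcsd(4); the function only shows how much of a delete request survives when the device accepts whole units of a given size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the mmcsd(4) code: given a delete request
 * [start, start + len) expressed in write sectors, return how many write
 * sectors can be discarded when the device only accepts multiples of
 * `unit` write sectors. unit == 1 models TRIM, unit > 1 models ERASE.
 */
uint64_t
deletable_sectors(uint64_t start, uint64_t len, uint64_t unit)
{
	uint64_t first, last;

	assert(unit > 0);
	/* Round the start up and the end down to unit boundaries. */
	first = (start + unit - 1) / unit * unit;
	last = (start + len) / unit * unit;
	return (last > first ? last - first : 0);
}
```

With an assumed erase sector of 1024 write sectors, a request for sectors 100 to 1099 cannot be erased at all, while the TRIM case (unit 1) covers the full request.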
Warner Losh
36d6e01474 Eliminate useless adjustments of aliased device.
No need to set any fields in the cloned device. devfs uses symlinks,
so the adev entries returned won't be presented to the drivers. Since
we don't save copies, nothing else will see them. This code came from
the old compat code, and it appears to be obsolete or never needed.

Submitted by: kib@
Differential Revision: https://reviews.freebsd.org/D11919
2017-08-07 22:42:46 +00:00
Warner Losh
d45e16744f Add nvd alias to nda nodes.
All ndaX and ndaXpY nodes will appear as nvdX and nvdXpY as well
(through symlinks in devfs via the normal disk aliasing mechanism in
GEOM).

Differential Revision: https://reviews.freebsd.org/D11873
2017-08-07 21:12:43 +00:00
Warner Losh
d3517d306c Expose API to allow disks to ask for alias names in devfs.
Implement disk_add_alias to allow aliases to be added to disks. A disk
that has a primary name (say "foo") can also have secondary names
(say "bar") such that all instances of "foo" also have a "bar"
alias. So if you have foo0, foo0p1, foo1, foo1s1 and foo1s1a nodes
created by the foo driver and gpart, device nodes bar0, bar0p1, bar1,
bar1s1 and bar1s1a will appear as symlinks back to the original nodes.
This generalizes to multiple aliases. However, since the unit number
follows the primary name, multiple device drivers can't create the
same aliases unless those drivers coordinate the unit number space (e.g.
you couldn't add an alias 'disk' to both 'da' and 'ada' because it's
possible to have both da0 and ada0, which would make 'disk0' ambiguous).

Differential Revision: https://reviews.freebsd.org/D11873
2017-08-07 21:12:38 +00:00
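
The naming rule described in d3517d306c can be sketched as a string rewrite. This is an illustration only: the function below is hypothetical and is not disk_add_alias or any GEOM/devfs code. It swaps the primary driver name for the alias and keeps the unit number and partition suffix unchanged.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical illustration of the naming rule: "foo1s1a" with primary
 * "foo" and alias "bar" becomes "bar1s1a". Returns 0 on success and -1
 * if the node does not start with the primary name or the buffer is
 * too small.
 */
int
alias_node_name(const char *node, const char *primary, const char *alias,
    char *out, size_t outlen)
{
	size_t plen;
	int n;

	assert(node != NULL && primary != NULL && alias != NULL && out != NULL);
	plen = strlen(primary);
	if (strncmp(node, primary, plen) != 0)
		return (-1);
	/* Keep everything after the primary name: unit, slice, partition. */
	n = snprintf(out, outlen, "%s%s", alias, node + plen);
	if (n < 0 || (size_t)n >= outlen)
		return (-1);
	return (0);
}
```

Because only the name prefix changes, two drivers sharing an alias would map da0 and ada0 to the same disk0, which is the ambiguity the commit message warns about.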
Warner Losh
5d7d13290a Add alias support to gpart.
When we're creating new providers for each of the partitions, add
aliases to the geom before we create the provider so when geom_dev
tastes the provider, the aliases are in place so the proper /dev
entries are created. So foo5p6 gets created as an alias for bar5p6
when foo is an alias for bar in the geom we're partitioning with
g_part. This also copies aliases from the container geom (eg disk) to
the label geom (the disk with GPT partitioning) so that aliases nest
properly.

Differential Revision: https://reviews.freebsd.org/D11873
2017-08-07 21:12:33 +00:00
Warner Losh
c624eb2598 Add aliasing concept to geom.
Add an alias name list to geoms. Use them in geom_dev to create
aliases. Previously, geom_dev would create a device node for the name
of the geom. Now, additional nodes are created pointing back to the
primary node with make_dev_alias_p. Aliases must be in place on the
geom before any tasting occurs.

Differential Revision: https://reviews.freebsd.org/D11873
2017-08-07 21:12:28 +00:00
Kirk McKusick
6c6118b390 gjournal is broken in handling its flush_queue. If we have 10 bio's
in the flush_queue:
         1 2 3 4 5 6 7 8 9 10
and another 10 bio's go into the flush queue after only the first five
bio's are removed from the flush queue, the queue should look like:
         6 7 8 9 10 11 12 13 14 15 16 17 18 19 20,
but because of the bug we end up with
         6 11 12 13 14 15 16 17 18 19 20 7 8 9 10.
So the sequence of the bio's is damaged in the flush queue (and
therefore in the journal on disk!). This error can be triggered by
ffs_snapshot() when a block is read with readblock() and gjournal finds
this block in the broken flush queue before it goes to the correct
active queue.

The fix is to place all new blocks at the end of the queue.

Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Discussed with: kib
MFC after: 1 week
2017-08-07 19:40:03 +00:00
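
The fix described in 6c6118b390, placing all new blocks at the end of the queue, can be modelled with a toy FIFO. The structure and function names below are hypothetical and this is not the gjournal data structure; the sketch only shows that tail insertion preserves the 6 ... 20 ordering from the example.

```c
#include <assert.h>
#include <stddef.h>

#define	QCAP 64

/* Toy FIFO of bio sequence numbers; not the gjournal implementation. */
struct flushq {
	int items[QCAP];
	size_t head;	/* index of the oldest entry */
	size_t count;	/* number of queued entries */
};

/* Append at the tail so existing entries keep their relative order. */
void
flushq_append(struct flushq *q, int bio)
{
	assert(q->count < QCAP);
	q->items[(q->head + q->count) % QCAP] = bio;
	q->count++;
}

/* Remove and return the oldest entry. */
int
flushq_remove_head(struct flushq *q)
{
	int bio;

	assert(q->count > 0);
	bio = q->items[q->head];
	q->head = (q->head + 1) % QCAP;
	q->count--;
	return (bio);
}
```

Queueing 1 to 10, removing five entries, and then queueing 11 to 20 drains in the order 6, 7, ..., 20.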
Kirk McKusick
683590b642 sysctl kern.geom.journal.cache.limit shows a negative value on
FreeBSD/amd64 systems with over 4GB of RAM. That's due to:

1) the limit being u_int instead of u_long like vm.kmem_size (the limit is
   half of vm.kmem_size by default for amd64);
2) sysctl handler g_journal_cache_limit_sysctl() using u_int instead of u_long.

The fix is to replace u_int with u_long for the kern.geom.journal.cache.limit
sysctl variable.

PR: 198500
Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Reported by: Eugene Grosbein
Discussed with: kib
MFC after: 1 week
2017-08-07 19:18:27 +00:00
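
A rough illustration of the symptom in 683590b642: a limit that fits in 32 unsigned bits can still print as negative when it is interpreted as a signed int. The function names and the 6 GiB kmem size are assumptions made for this sketch; this is not the g_journal sysctl handler.

```c
#include <assert.h>
#include <stdint.h>

/*
 * The default limit is half of vm.kmem_size. Return the number a reader
 * would see if that limit were held in 32 bits and shown as a signed int.
 */
int32_t
limit_as_shown_32(uint64_t kmem_size)
{
	uint32_t limit;

	/* Truncation is a separate problem; require the value to fit. */
	assert(kmem_size / 2 <= UINT32_MAX);
	limit = (uint32_t)(kmem_size / 2);
	/*
	 * Converting an out-of-range value to int32_t is
	 * implementation-defined; on two's complement targets the high
	 * bit becomes the sign bit.
	 */
	return ((int32_t)limit);
}

/* With a 64-bit type the same limit is kept intact. */
uint64_t
limit_64(uint64_t kmem_size)
{
	return (kmem_size / 2);
}
```

For an assumed kmem size of 6 GiB the 32-bit view yields -1073741824 on common platforms, while the 64-bit value is 3221225472.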
Konstantin Belousov
fe04f5e9d0 Avoid DI recursion when reclaim_pv_chunk() is called from
pmap_advise() or pmap_remove().

Reported and tested by:	pho (previous version)
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-07 17:29:54 +00:00
Konstantin Belousov
1a47eac0f5 Explain why delayed invalidation is not required in pmap_protect() and
pmap_remove_pages().

Submitted by:	alc
MFC after:	1 week
2017-08-07 17:23:10 +00:00
Alexander Motin
e1cf70fbab Fix hrtimer_active() in case of cancellation.
While there, switch to FreeBSD internal callout active status.

Reviewed by:	markj, hselasky
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D11900
2017-08-07 14:34:05 +00:00
Ruslan Bukin
ca20f8ec29 o Replace __riscv__ with __riscv
o Replace __riscv64 with (__riscv && __riscv_xlen == 64)

This is required to support the new GCC 7.1 compiler.
This is compatible with the current GCC 6.1 compiler.

RISC-V is an extensible ISA and the idea here is to have a built-in define
for each extension, so together with __riscv we will have some subset
of these as well (depending on the -march string passed to the compiler):

__riscv_compressed
__riscv_atomic
__riscv_mul
__riscv_div
__riscv_muldiv
__riscv_fdiv
__riscv_fsqrt
__riscv_float_abi_soft
__riscv_float_abi_single
__riscv_float_abi_double
__riscv_cmodel_medlow
__riscv_cmodel_medany
__riscv_cmodel_pic
__riscv_xlen

Reviewed by:	ngie
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11901
2017-08-07 14:09:57 +00:00
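
The macro replacement in ca20f8ec29 can be sketched as follows. The two function names are made up for this example; only __riscv and __riscv_xlen come from the commit message. On a non-RISC-V host both functions return 0.

```c
#include <assert.h>

#if defined(__riscv)
/* __riscv_xlen is predefined by the compiler on RISC-V targets. */
static_assert(__riscv_xlen == 32 || __riscv_xlen == 64,
    "unexpected XLEN for this sketch");
#endif

/* Register width in bits on RISC-V, or 0 on any other target. */
int
riscv_xlen(void)
{
#if defined(__riscv)
	return (__riscv_xlen);
#else
	return (0);
#endif
}

/* New-style spelling of what used to be an #ifdef __riscv64 check. */
int
is_riscv64(void)
{
#if defined(__riscv) && __riscv_xlen == 64
	return (1);
#else
	return (0);
#endif
}
```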
Navdeep Parhar
b96793ae43 cxgbe(4): Add the T6 and T5 Unified Wire configuration files to the
kernel, just like for T4, when the driver is compiled into the kernel.

Reported by:	mav@
MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-08-07 14:04:19 +00:00
Navdeep Parhar
cc2050c5eb cxgbe(4): Avoid a NULL dereference that would occur during module unload
if there were problems earlier during attach.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-08-06 19:45:59 +00:00
Konstantin Belousov
0b9b3897a8 Remove trivial comments. Remove and-ing with UINT_MAX for minor();
the cast to int already does the required truncation of significant bits.

Requested and reviewed by:	bde
Sponsored by:	The FreeBSD Foundation
2017-08-06 12:27:20 +00:00
Andrew Turner
1f15260790 Mark each cpu in the appropriate cpuset_domain set. This allows devices to
handle cases where they can only run on a single domain.

To allow all devices access to this set we need to move reading the domain
earlier in the boot. It was previously handled in the CPU driver, but
this is too late for the GICv3 ITS driver.

Sponsored by:	DARPA, AFRL
2017-08-05 20:57:34 +00:00
Jung-uk Kim
0105034487 Detect hypervisors early. We used to set lower hz on hypervisors by default
but this has been broken since r273800 (and r278522, its MFC to stable/10) because
identify_cpu() is called too late, i.e., after init_param1().

MFC after:	3 days
2017-08-05 06:56:46 +00:00
Toomas Soome
07672e9c19 libefi/time.c cstyle cleanup
libefi/time.c is a mix of different styles; this update cleans it up.
Also fix 0 versus NULL, and zero the tv structure in case we get an error
from the UEFI firmware.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D11861
2017-08-05 05:20:03 +00:00
Cy Schubert
a3329e0129 Fix matching of NATed ICMP queries (resolving NATed MTU discovery).
MFC after:	1 month
2017-08-05 00:28:42 +00:00
Warner Losh
9990efd2e2 Move EFI fmtdev functionality to libefi
This patch moves code necessary for the fmtdev functionality from
loader to libefi, allowing other applications to make use of it

Submitted by: Eric McCorkle
Differential Revision: https://reviews.freebsd.org/D11862
2017-08-04 16:33:36 +00:00
Navdeep Parhar
019c1a0111 cxgbe(4): Allow the TOE timer tunables to be set with microsecond
precision.  These timers are already displayed in microseconds in the
sysctl MIB.  Add variables to track these tunables while here.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-08-04 15:57:10 +00:00
Andrew Turner
49f347f450 Start to teach the GICv3 driver about NUMA. On ThunderX we may have
multiple ITS devices, however we only want a single ITS device to be
configured on each CPU. To fix this, only enable ITS when the node matches
the CPU's node.

Sponsored by:	DARPA, AFRL
2017-08-04 13:08:45 +00:00
Andrew Turner
dbba8930ce Read the numa-node-id property from each CPU node. This will initially be
used to support the dual package ThunderX where we need to send MSI/MSI-X
interrupts to the same package as the device the interrupt came from.

Sponsored by:	DARPA, AFRL
2017-08-04 10:33:22 +00:00
Konstantin Belousov
4e93dbdf47 Relax visibility for some termios symbols.
They are defined by XSI or newer SUS.
This is a follow-up to r318780.

Reported by:	jbeich
Obtained from:	DragonflyBSD commit e08b3836c962
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-04 09:45:40 +00:00
Alan Cox
ba98e6a2d7 In case readers are misled by expressions that combine multiplication and
division, add parentheses to make the precedence explicit.

Submitted by:	Doug Moore <dougm@rice.edu>
Requested by:	imp
Reviewed by:	imp
MFC after:	1 week
X-MFC after:	r321840
Differential Revision:	https://reviews.freebsd.org/D11815
2017-08-04 04:23:23 +00:00
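
The concern behind ba98e6a2d7 can be shown with a short integer example. The function names are hypothetical and the code is unrelated to the files touched by that commit. In C, multiplication and division share one precedence level and associate left to right, so a * b / c means (a * b) / c.

```c
#include <assert.h>
#include <stdint.h>

/* What a * b / c means in C: multiply first, then divide. */
uint32_t
scale_mul_first(uint32_t a, uint32_t b, uint32_t c)
{
	assert(c != 0);
	return ((a * b) / c);
}

/* Dividing first truncates earlier and can give a different result. */
uint32_t
scale_div_first(uint32_t a, uint32_t b, uint32_t c)
{
	assert(c != 0);
	return (a * (b / c));
}
```

For a = 10, b = 3, c = 4 the first form gives 7 and the second gives 0, which is why explicit parentheses help the reader.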
Warner Losh
f48bfebce1 Add EFI utility functions to libefi
This patch adds additional EFI utility functions to convert errno
values to EFI_STATUS errors, as well as EFI times to UNIX times.

Submitted by: Eric McCorkle
Differential Revision: https://reviews.freebsd.org/D11858
2017-08-04 04:20:11 +00:00
Warner Losh
457ea3bce3 Move EFI ZFS functions to libefi
This patch moves some EFI ZFS functions from loader to libefi,
allowing them to be used by anything that links against libefi.

Submitted by: Eric McCorkle
Differential Revision: https://reviews.freebsd.org/D11855
2017-08-04 04:20:06 +00:00
Warner Losh
acf82d2659 Add definitions and utilities for EFI drivers
This patch adds definitions and utility code for creating EFI drivers
using the EFI_DRIVER_BINDING_PROTOCOL.

Submitted by: Eric McCorkle
Differential Revision: https://reviews.freebsd.org/D11852
2017-08-04 04:16:41 +00:00
Warner Losh
8a5d94f94d Make nvd vs nda choice boot-time rather than build-time
Introduce hw.nvme.use_nvd tunable. This tunable allows both nvd and
nda to be installed in the kernel, while allowing only one of them to
create devices. This is an all-or-nothing setting, and you can't
change it after boot-time. However, it will allow easier A/B testing.

Differential Revision: https://reviews.freebsd.org/D11825
2017-08-04 03:40:01 +00:00
Navdeep Parhar
6320b0f850 cxgbe(4): Always use the first and not the last virtual interface
associated with a port in begin_synchronized_op.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-08-04 01:28:06 +00:00
Conrad Meyer
6f240e18b5 x86: Tag some intrinsics with __pure2
Some C wrappers for x86 instructions do not touch global memory and only act
on their arguments; they can be marked __pure2, aka __const__.  Without this
annotation, Clang 3.9.1 is not intelligent enough on its own to grok that
these functions are __const__.

Submitted by:	Anton Rang <anton.rang AT isilon.com>
Sponsored by:	Dell EMC Isilon
2017-08-03 22:28:30 +00:00
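
A small sketch of the annotation discussed in 6f240e18b5, assuming GCC or Clang. EX_PURE2 is a local stand-in for FreeBSD's __pure2, and ex_bsf is a hypothetical portable substitute for a bit-scan wrapper, not one of the x86 intrinsics changed by the commit.

```c
#include <assert.h>

/* Local stand-in for __pure2; this attribute spelling assumes GCC or Clang. */
#define	EX_PURE2 __attribute__((__const__))

/*
 * Index of the lowest set bit. The result depends only on the argument
 * and no memory is read or written. The caller must pass a nonzero mask.
 */
static inline EX_PURE2 unsigned
ex_bsf(unsigned mask)
{
	unsigned i = 0;

	while ((mask & 1u) == 0) {
		mask >>= 1;
		i++;
	}
	return (i);
}

/*
 * Two identical calls: with the attribute the compiler is allowed to
 * evaluate ex_bsf() once and reuse the result.
 */
unsigned
lowest_bit_twice(unsigned mask)
{
	assert(mask != 0);
	return (ex_bsf(mask) + ex_bsf(mask));
}
```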
Mark Johnston
a95435cfed Bump the maximum file name length in pseudofs filesystems to 48.
The previous limit of 24 was somewhat restrictive, and with this change
ceil(log2(sizeof(struct pfs_node))) is the same as before in both the ILP32
and LP64 models, so the malloc zone used for allocations of struct pfs_node
is the same as before.

Approved by:	des
2017-08-03 21:35:53 +00:00
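
The size argument in a95435cfed relies on ceil(log2(size)) staying the same. A minimal sketch of that calculation follows, assuming allocations are rounded up to a power-of-two bucket as the commit message implies; pow2_bucket is a hypothetical helper and the byte counts mentioned after it are invented, not the real sizeof(struct pfs_node).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest power of two that is >= n, i.e. 2^ceil(log2(n)). */
size_t
pow2_bucket(size_t n)
{
	size_t bucket = 1;

	/* Keep the shift below from overflowing. */
	assert(n > 0 && n <= (SIZE_MAX >> 1) + 1);
	while (bucket < n)
		bucket <<= 1;
	return (bucket);
}
```

For instance, a structure growing from an assumed 200 bytes to 224 bytes (24 more bytes of name) stays in the 256-byte bucket, while 257 bytes would move to 512.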
Mark Johnston
f2ec04a394 Add subsystem vendor and device ID fields to struct pci_dev.
MFC after:	1 week
2017-08-03 21:14:46 +00:00
Emmanuel Vadot
c31654c5b6 arm: Add a GENERIC-NODEBUG kernel config
Like amd64 or arm64, provide a GENERIC-NODEBUG configuration file that
removes WITNESS, INVARIANTS, etc.
2017-08-03 19:01:46 +00:00
Ian Lepore
ba60088b16 Add missing header file to SRCS.
Reported by:	manu@
2017-08-03 18:49:15 +00:00
Ian Lepore
094e5e7e12 Switch to iicdev_readfrom/writeto() to do xfers with proper bus ownership.
Tested by:	manu@
2017-08-03 18:43:54 +00:00
Ian Lepore
854519fdd9 Add an ahci driver for imx6.
This was submitted by Rogiel Sulzbach (thank you!) but has a few last-minute
changes by me, mostly where the code interfaces to my still-utterly-deficient
imx6_ccm clocks implementation.  So blame me for any mistakes.

Submitted by:	Rogiel Sulzbach <rogiel@rogiel.com>
Differential Revision:	https://reviews.freebsd.org/D11177
2017-08-03 14:43:41 +00:00
Navdeep Parhar
f856f099cb cxgbe(4): Initial import of the "collect" component of Chelsio unified
debug (cudbg) code, hooked up to the main driver via an ioctl.

The ioctl can be used to collect the chip's internal state in a
compressed dump file.  These dumps can be decoded with the "view"
component of cudbg.

Obtained from:	Chelsio Communications
MFC after:	2 months
Sponsored by:	Chelsio Communications
2017-08-03 14:43:30 +00:00
Enji Cooper
1ef2a611de Revert r321969
My change had good intentions, but the implementation was incorrect:
- printf was returning the number of characters in the format string
  plus the NUL, but failed in two regards implementation wise:
-- the pathological case, printf(""), wasn't being handled properly since
   the pointer is always incremented, so the value returned would be
   off-by-one.
-- printf(3) reports the number of characters printed post-conversion via
   vfprintf, etc.
- putchar(3) should return the character printed or EOF, not the number
  of characters output to the screen.

My goal in making the change (again) was to increase parity, but as bde
pointed out these are freestanding functions, so they don't have to
conform to libc/POSIX. I argued that the functions should be named
differently since the implementation is different enough to warrant it
and to allow boot2 code to be usable when linked against sys/boot and
libstand and other libraries in base. I have no interest in pushing
this change forward more though, as the original concern I had behind
the change with zfsboottest was resolved in r321849 and r321852. The
next person that updates the toolchain gets to deal with the
inconsistency if it's flagged by a newer compiler.

MFC after:	1 month
Reported by:	ed, markj
2017-08-03 13:50:46 +00:00
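
The libc behaviour that 1ef2a611de contrasts with the reverted change can be checked with the hosted C library. The wrapper names are made up for this sketch, which exercises the standard printf(3) and putchar(3) and not the freestanding boot code.

```c
#include <assert.h>
#include <stdio.h>

/* printf(3) reports the number of characters written after conversion. */
int
printed_length(int value)
{
	return (printf("%d", value));
}

/* Nothing is written, so 0 is reported; this mirrors the printf("") case. */
int
empty_length(void)
{
	return (printf("%s", ""));
}

/* putchar(3) returns the character written (or EOF), not a count. */
int
echoed(int c)
{
	assert(c >= 0 && c <= 255);
	return (putchar(c));
}
```

Assuming stdout is writable, printed_length(12345) reports 5, empty_length() reports 0, and echoed('A') reports 'A'.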
Hans Petter Selasky
b40951b8cd Change reject message type when destroying cm_id in ibcore.
This patch fixes an interoperability issue between FreeBSD and non-FreeBSD
systems when the connection establishment is aborted. Refer to the
initial commit in Linux, drivers/infiniband/core/cm.c,
for a more detailed description.

Obtained from:	Linux
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:31:10 +00:00
Hans Petter Selasky
44d8a0fc60 Ticks are 32-bit in FreeBSD.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:18:25 +00:00
Hans Petter Selasky
713dd5cb9e Resolve locking issue for non-sleepable context in the mlx5core.
Code inspection reveals the busdma unload and free functions
do not write to the dma tag they belong to and do not need to be
serialized. This allows mlx5_fwp_free() to be called from
software interrupt context.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:14:43 +00:00
Hans Petter Selasky
1d4905b5b0 Using GFP_ATOMIC with firmware commands is not supported after busdma was
introduced in the mlx5core, because busdma might sleep when loading memory
into DMA.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:11:51 +00:00
Mark Johnston
069555390b Remove D_TRACKCLOSE now that ksyms no longer has a close method.
Reported by:	jhb
X-MFC with:	r321963
2017-08-03 05:55:01 +00:00
Enji Cooper
b9fe1d4f15 Fix the return types for printf and putchar to match their libc and
POSIX equivalents

Both printf and putchar return int, not void.

This will allow code that leverages the libcalls and checks or relies on the
return type to be used interchangeably between loader code and non-loader
code.

MFC after:	1 month
2017-08-03 05:27:05 +00:00
Sepherosa Ziehau
fe167cce54 hyperv/kvp: Use proper size macro for adapter id.
Submitted by:	Christopher Ertl <Christopher.Ertl microsoft com>
MFC after:	3 days
Sponsored by:	Microsoft
2017-08-03 01:44:40 +00:00
Mark Johnston
22e406c80b Rework and simplify the ksyms(4) implementation.
- Store the symbol table contents in an anonymous swap-backed object. Have
  mmap(/dev/ksyms) map that object, and stop mapping the symbol table into
  the calling process in ksyms_open(). Previously we would cache a pointer
  to the pmap of the opening process, and mmap(/dev/ksyms) would create a
  mapping using the physical address found by a pmap lookup at the initial
  mapping address. However, this assumes that the cached pmap is valid,
  which may not be the case. [1]
- Remove the ksyms ioctl interface. It appears to have been added to work
  around a limitation in libelf that no longer exists; see r321842.
  Moreover, the interface is difficult to support and isn't present in
  illumos. Since ksyms was added specifically to support lockstat(1), it
  is expected that this removal won't have any real impact.
- Simplify ksyms_read() to avoid unnecessary copying.
- Don't call the device handle destructor if we fail to capture a snapshot
  of the kernel's symbol table. devfs will do that for us.

Reported by:	Ilja van Sprundel <ivansprundel@ioactive.com> [1]
Reviewed by:	kib (previous revision)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11789
2017-08-03 00:38:13 +00:00