117829 Commits

Author SHA1 Message Date
Hans Petter Selasky
44d8a0fc60 Ticks are 32-bit in FreeBSD.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:18:25 +00:00
Hans Petter Selasky
713dd5cb9e Resolve locking issue for non-sleepable context in the mlx5core.
Code inspection reveals the busdma unload and free functions
do not write to the belonging dma tag and does not need to be
serialized. This allows mlx5_fwp_free() to be called from
software interrupt context.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:14:43 +00:00
Hans Petter Selasky
1d4905b5b0 Using GFP_ATOMIC with firmware commands is not supported after busdma was
introduced in the mlx5core, because busdma might sleep when loading memory
into DMA.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:11:51 +00:00
Mark Johnston
069555390b Remove D_TRACKCLOSE now that ksyms no longer has a close method.
Reported by:	jhb
X-MFC with:	r321963
2017-08-03 05:55:01 +00:00
Enji Cooper
b9fe1d4f15 Fix the return types for printf and putchar to match their libc and
POSIX equivalents

Both printf and putchar return int, not void.

This will allow code that leverages the libcalls and checks/rely on the
return type to interchangeably between loader code and non-loader
code.

MFC after:	1 month
2017-08-03 05:27:05 +00:00
Sepherosa Ziehau
fe167cce54 hyperv/kvp: Use proper size macro for adapter id.
Submitted by:	Christopher Ertl <Christopher.Ertl microsoft com>
MFC after:	3 days
Sponsored by:	Microsoft
2017-08-03 01:44:40 +00:00
Mark Johnston
22e406c80b Rework and simplify the ksyms(4) implementation.
- Store the symbol table contents in an anonymous swap-backed object. Have
  mmap(/dev/ksyms) map that object, and stop mapping the symbol table into
  the calling process in ksyms_open(). Previously we would cache a pointer
  to the pmap of the opening process, and mmap(/dev/ksyms) would create a
  mapping using the physical address found by a pmap lookup at the initial
  mapping address. However, this assumes that the cached pmap is valid,
  which may not be the case. [1]
- Remove the ksyms ioctl interface. It appears to have been added to work
  around a limitation in libelf that no longer exists; see r321842.
  Moreover, the interface is difficult to support and isn't present in
  illumos. Since ksyms was added specifically to support lockstat(1), it
  is expected that this removal won't have any real impact.
- Simplify ksyms_read() to avoid unnecessary copying.
- Don't call the device handle destructor if we fail to capture a snapshot
  of the kernel's symbol table. devfs will do that for us.

Reported by:	Ilja van Sprundel <ivansprundel@ioactive.com> [1]
Reviewed by:	kib (previous revision)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11789
2017-08-03 00:38:13 +00:00
Marius Strobl
cd85acba1a - Correct the remainder of confusing and error prone mix-ups between
"br" or "bridge" where - according to the terminology outlined in
  comments of bridge.h and mmcbr_if.m  around since their addition in
  r163516 - the bus is meant and used instead. Some of these instances
  are also rather old, while those in e. g. mmc_subr.c are as new as
  r315430 and were caused by choosing mmc_wait_for_request(), i. e. the
  one pre-r315430 outliner existing in mmc.c, as template for function
  parameters in mmc_subr.c inadvertently. This correction translates to
  renaming "brdev" to "busdev" and "mmcbr" to "mmcbus" respectively as
  appropriate.
  While at it, also rename "reqdev" to just "dev" in mmc_subr.[c,h]
  for consistency with was already used in mmm.c pre-r315430, again
  modulo mmc_wait_for_request() that is.
- Remove comment lines from bridge.h incorrectly suggesting that there
  would be a MMC bridge base class driver.
- Update comments in bridge.h regarding the star topology of SD and SDIO;
  since version 3.00 of the SDHCI specification, for eSD and eSDIO bus
  topologies are actually possible in form of so called "shared buses"
  (in some subcontext later on renamed to "embedded" buses).
2017-08-02 21:11:51 +00:00
Emmanuel Vadot
48ee531892 arm64: Add Allwinner H5 SoC
Allwinner H5 is an H3 (arm32) with Cortex A53 cores.
Add support for it and enable it in GENERIC kernel config

Tested on: OrangePi PC2
2017-08-02 20:19:19 +00:00
Emmanuel Vadot
904581f050 allwiner: modclk: Do not try to enable parent clock if it doesn't exist 2017-08-02 20:17:04 +00:00
Ian Lepore
28926d45c9 Fix the interface to imx_iomux_gpr_get/set(). The functions were defined
as taking a register number, and that would get multiplied by 4 to make
a register address.  But the header file that consumers have to reference
this stuff publishes register addresses, not numbers.  So now everything
works in terms of register addresses.

Note that the HDMI init code was writing into the wrong register before
this change.  Apparently whatever it wrote to was harmless, and apparently
HDMI was working because uboot had set up the right bits.
2017-08-02 18:28:06 +00:00
Ian Lepore
c396301df0 Add missing ofw_bus_if.h src file. 2017-08-02 15:16:40 +00:00
Ian Lepore
7ad88b3ce0 The imx6_snvs driver is not strictly required for the system to run, so
change it from standard to optional and add a device statement for it so
that it's included unless someone uses nodevice to eliminate it.
2017-08-02 15:15:18 +00:00
Konstantin Belousov
dfcc612cb0 For makedev(), cast the minor argument to unsigned type explicitely,
avoiding possible sign propagation.

Submitted by:	hselasky
2017-08-02 14:54:54 +00:00
Hans Petter Selasky
2b79a966ab Fix LinuxKPI regression after r321920. The mda_unit and si_drv0 fields are not
wide enough to hold the full 64-bit dev_t. Instead use the "dev" field in
the "linux_cdev" structure to store and lookup this value.

While at it remove superfluous use of parenthesis inside the
MAJOR(), MINOR() and MKDEV() macros in the LinuxKPI.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-02 14:27:27 +00:00
Andrew Turner
28356527bf Fix the return type for get_cntxc(). The register is 64-bit on both arm
and arm64 so move any truncation to the caller.

Submitted by:	Mihai Carabas <mihai.carabas@gmail.com>
X-Differential Revision:	https://reviews.freebsd.org/D10213
2017-08-02 14:12:47 +00:00
Ed Schouten
c852847584 Keep top page on CloudABI to work around AMD Ryzen stability issues.
Similar to r321899, reduce sv_maxuser by one page inside of CloudABI.
This ensures that the stack, the vDSO and any allocations cannot touch
the top page of user virtual memory.

Considering that CloudABI userspace is completely oblivious to virtual
memory layout, don't bother making this conditional based on the CPU of
the running system.

Reviewed by:	kib, truckman
Differential Revision:	https://reviews.freebsd.org/D11808
2017-08-02 13:08:10 +00:00
Mateusz Guzik
fd1d4c8159 amd64: annotate the syscall return address check with __predict_false
before:
   0xffffffff80b03ebb <+2059>:	mov    0x460(%r14),%rax
   0xffffffff80b03ec2 <+2066>:	mov    0x98(%rax),%rax
   0xffffffff80b03ec9 <+2073>:	shr    $0x2f,%rax
   0xffffffff80b03ecd <+2077>:	je     0xffffffff80b03edd <amd64_syscall+2093>
   0xffffffff80b03ecf <+2079>:	mov    0x3f8(%r14),%rax
   0xffffffff80b03ed6 <+2086>:	orl    $0x1,0xc8(%rax)
   0xffffffff80b03edd <+2093>:	add    $0xf8,%rsp

after:
   0xffffffff80b03ebb <+2059>:	mov    0x460(%r14),%rax
   0xffffffff80b03ec2 <+2066>:	mov    0x98(%rax),%rax
   0xffffffff80b03ec9 <+2073>:	shr    $0x2f,%rax
   0xffffffff80b03ecd <+2077>:	jne    0xffffffff80b03eef <amd64_syscall+2111>
   0xffffffff80b03ecf <+2079>:	add    $0xf8,%rsp

Reviewed by:	kib
MFC after:	1 week
2017-08-02 11:25:38 +00:00
Konstantin Belousov
5dd371944e Change major()/minor() to work with 64bit dev_t.
Since traditional types for the macros values are int, remove the
cookie trick and just split the dev_t at the word boundary.

Reported by:	Victor Stinner <victor.stinner@gmail.com>
PR:	221048
Sponsored by:	The FreeBSD Foundation
2017-08-02 10:14:17 +00:00
Konstantin Belousov
6632a4330f Do not call trapsignal() after handling usermode fault or interrupt,
when a signal is not intended to be sent.

The variable holding the signal number to send is left uninitialized,
which sometimes triggers invalid signal checks.

For NMI, a return to usermode without ast processing is done.  On the
other hand, for spurious dtrace probe interrupt it is usermode which
triggered the interrupt, so handle it through userret() as any other
fault.

Reported by:	Nils Beyer <nbe@renzel.net>
PR:	221151
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-02 10:12:10 +00:00
Don Lewis
cd155b5603 Lower the amd64 shared page, which contains the signal trampoline,
from the top of user memory to one page lower on machines with the
Ryzen (AMD Family 17h) CPU.  This pushes ps_strings and the stack
down by one page as well.  On Ryzen there is some sort of interaction
between code running at the top of user memory address space and
interrupts that can cause FreeBSD to either hang or silently reset.
This sounds similar to the problem found with DragonFly BSD that
was fixed with this commit:
  https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/b48dd28447fc8ef62fbc963accd301557fd9ac20
but our signal trampoline location was already lower than the address
that DragonFly moved their signal trampoline to.  It also does not
appear to be related to SMT as described here:
  https://www.phoronix.com/forums/forum/hardware/processors-memory/955368-some-ryzen-linux-users-are-facing-issues-with-heavy-compilation-loads?p=955498#post955498

  "Hi, Matt Dillon here. Yes, I did find what I believe to be a
   hardware issue with Ryzen related to concurrent operations. In a
   nutshell, for any given hyperthread pair, if one hyperthread is
   in a cpu-bound loop of any kind (can be in user mode), and the
   other hyperthread is returning from an interrupt via IRETQ, the
   hyperthread issuing the IRETQ can stall indefinitely until the
   other hyperthread with the cpu-bound loop pauses (aka HLT until
   next interrupt). After this situation occurs, the system appears
   to destabilize. The situation does not occur if the cpu-bound
   loop is on a different core than the core doing the IRETQ. The
   %rip the IRETQ returns to (e.g. userland %rip address) matters a
   *LOT*. The problem occurs more often with high %rip addresses
   such as near the top of the user stack, which is where DragonFly's
   signal trampoline traditionally resides. So a user program taking
   a signal on one thread while another thread is cpu-bound can cause
   this behavior. Changing the location of the signal trampoline
   makes it more difficult to reproduce the problem. I have not
   been because the able to completely mitigate it. When a cpu-thread
   stalls in this manner it appears to stall INSIDE the microcode
   for IRETQ. It doesn't make it to the return pc, and the cpu thread
   cannot take any IPIs or other hardware interrupts while in this
   state."
since the system instability has been observed on FreeBSD with SMT
disabled.  Interrupts to appear to play a factor since running a
signal-intensive process on the first CPU core, which handles most
of the interrupts on my machine, is far more likely to trigger the
problem than running such a process on any other core.

Also lower sv_maxuser to prevent a malicious user from using mmap()
to load and execute code in the top page of user memory that was made
available when the shared page was moved down.

Make the same changes to the 64-bit Linux emulator.

PR:		219399
Reported by:	nbe@renzel.net
Reviewed by:	kib
Reviewed by:	dchagin (previous version)
Tested by:	nbe@renzel.net (earlier version)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11780
2017-08-02 01:43:35 +00:00
Mark Johnston
526b5fe16c Amend r321884 to check the refcount and update the class with w_mtx held.
Reviewed by:	jhb
X-MFC with:	r321884
2017-08-01 23:14:38 +00:00
Emmanuel Vadot
fad5dbf8d5 Allwinner dtb: Add NanoPi M1 to the build
It was tested on NanoPi M1 Plus.
2017-08-01 20:28:11 +00:00
Emmanuel Vadot
5393952249 Alwinner: nanopi-neo: Remove r_i2c node from DTS as it isn't used on the board 2017-08-01 19:22:00 +00:00
Emmanuel Vadot
3da70cfac3 Allwinner dtb: add link for NanoPi Neo
Reported by:	Richard Puga <richard@puga.net>
Tested by:	Richard Puga <richard@puga.net>, myself
2017-08-01 18:33:27 +00:00
Mark Johnston
57688b6e4e Fix a witness assertion that fires when a lock type's class changes.
When all instances of a lock type are destroyed (for example, after a
module unload), the corresponding witness entry remains associated with
that lock type. In this case, we shouldn't panic if a new instance of the
lock type is created and its lock class does not match that recorded in the
witness entry.

Reviewed by:	jhb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11788
2017-08-01 17:50:28 +00:00
Marcin Wojtas
7579dce2c6 Merge ena-com 1.1.4.2
Update ENA HAL after fixing gcc build in r321861.

Submitted by: rlibby
Reviewed by: cognet (mentor)
Approved by: cognet (mentor)
Differential Revision: https://reviews.freebsd.org/D11480
2017-08-01 11:00:04 +00:00
Roger Pau Monné
d4ed36cde2 pci: fix write order when sizing BARs
According to the PCI Local Specification rev. 3.0 in case of a 64-bit
BAR both the low and the high parts of the register should be set to
~0 before attempting to read back the size.

So far I have found no single device that has problems with the
previous approach, but I think it's better to stay on the safe size.

This commit should not introduce any functional change.

MFC after:		3 weeks
Sponsored by:		Citrix Systems R&D
Reviewed by:		jhb
Differential revision:	https://reviews.freebsd.org/D11750
2017-08-01 10:47:44 +00:00
Alexander Motin
40715e9ed7 Add explicit check for PCI bus to r321720.
Reported by:	jhb
MFC after:	6 days
2017-08-01 09:22:10 +00:00
Enji Cooper
20cce726e6 Standardize paths on SRCTOP instead of .CURDIR-relative paths
MFC after:	1 week
2017-08-01 05:39:40 +00:00
Mark Johnston
2375aaa8e9 Batch updates to v_wire_count when freeing page table pages on x86.
The removed release stores are not needed since stores are totally
ordered on i386 and amd64.

Reviewed by:	alc, kib (previous revision)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11790
2017-08-01 05:26:30 +00:00
Enji Cooper
88abac8f06 Clean up style in print_state(..) and pager_printf(..)
No functional change intended.

MFC after:	3 days
2017-08-01 05:16:14 +00:00
Ian Lepore
94759a2448 Add a driver for the Intersil ISL12xx family of i2c RTC chips.
Supports ISL1209, ISL1218, ISL1219, ISL1220, ISL1221 (just basic RTC
functionality, not all the other fancy stuff the chips can do).
2017-08-01 04:16:52 +00:00
Alan Cox
2ac0c7c37c The blist_meta_* routines that process a subtree take arguments 'radix' and
'skip', which denote, respectively, the largest number of blocks that can be
managed by a subtree of that height, and one less than the number of nodes
in a subtree of that height.  This change removes the 'skip' argument from
those functions because 'skip' can be trivially computed from 'radius'.
This change also redefines 'skip' so that it denotes the number of nodes in
the subtree, and so changes loop upper bound tests from '<= skip' to '<
skip' to account for the change.

The 'skip' field is also removed from the blist struct.

The self-test program is changed so that the print command includes the
cursor value in the output.

Submitted by:	Doug Moore <dougm@rice.edu>
MFC after:	1 week
2017-08-01 03:51:26 +00:00
Dmitry Chagin
77d3337c9f Implement proper Linux /dev/fd and /proc/self/fd behavior by adding
Linux specific things to the native fdescfs file system.

Unlike FreeBSD, the Linux fdescfs is a directory containing a symbolic
links to the actual files, which the process has open.
A readlink(2) call on this file returns a full path in case of regular file
or a string in a special format (type:[inode], anon_inode:<file-type>, etc..).
As well as in a FreeBSD, opening the file in the Linux fdescfs directory is
equivalent to duplicating the corresponding file descriptor.

Here we have mutually exclusive requirements:
- in case of readlink(2) call fdescfs lookup() method should return VLNK
vnode otherwise our kern_readlink() fail with EINVAL error;
- in the other calls fdescfs lookup() method should return non VLNK vnode.

For what new vnode v_flag VV_READLINK was added, which is set if fdescfs has beed
mounted with linrdlnk option an modified kern_readlinkat() to properly handle it.

For now For Linux ABI compatibility mount fdescfs volume with linrdlnk option:

    mount -t fdescfs -o linrdlnk null /compat/linux/dev/fd

Reviewed by:	kib@
MFC after:	1 week
Relnotes:	yes
2017-08-01 03:40:19 +00:00
Pedro F. Giffuni
d2ffc7af30 sys/net8021: Add missing braces in setcurchan().
Obtained from:	DragonFlyBSD (git c69e37d6)
MFC after:	3 days
2017-08-01 03:13:43 +00:00
Sepherosa Ziehau
f41e0df406 hyperv/hn: Add comment about ether_ifattach event subscription.
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11710
2017-08-01 02:55:43 +00:00
Sepherosa Ziehau
962f035786 hyperv/hn: Renaming and minor cleanup
This prepares for the upcoming transparent VF support.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11708
2017-08-01 02:45:54 +00:00
Ian Lepore
55b0d8a05a Build iicbus/{ds1307,ds3231,nxprtc} as modules. 2017-07-31 22:32:11 +00:00
Ian Lepore
c28ccaf0f2 Restructure the SUBDIR list as 1-per-line and alphabetize, so it will be
easier to add new things (and see what changed) in the future.
2017-07-31 22:26:30 +00:00
Ian Lepore
012e41a7a8 Bugfixes and enhancements...
Don't enable the oscillator when it is found to be stopped at init time,
just let the first setting of valid time start it.  But still report a dead
battery if it's stopped at init time.

Don't force the chip into 24hr mode, just cope with whatever mode it is
already in.

Schedule the clock_settime() callbacks to align the RTC clock to top of
second when setting it.
2017-07-31 22:00:00 +00:00
Ian Lepore
13fd9ff767 No need to call getnanotime() now that the waiting is done by the central
subr_rtc code, switch from CLOCKF_SETTIME_NO_TS to CLOCKF_SETTIME_NO_ADJ
so that we get fed a timestamp, but it's not adjusted to compensate for
inaccuracy in setting time.
2017-07-31 21:53:00 +00:00
Kirk McKusick
33bbdde01f Avoid reading a snapshot block when it is already in the cache.
Update the use of the B_CACHE flag (since the May 1999 commit
that made it the correct test here).

Reported by: Andreas Longwitz <longwitz@incore.de>
Reviewed by: kib
Tested by: Peter Holm
MFC after: 1 week
2017-07-31 20:41:45 +00:00
Mark Johnston
6c7ebc242b Batch v_wire_count decrements in vm_hold_free_pages().
Atomic updates to v_wire_count are a significant source of contention, so
combine multiple updates into one in this easy case. Also remove an old
printf that gets executed if the page is shared-busied, which is a case
that will lead to a panic anyway.

Reviewed by:	alc, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11791
2017-07-31 18:48:58 +00:00
Mark Johnston
17b5949a31 Don't trace running threads that have interrupts disabled.
In this case we shouldn't assume that the thread has a valid frame pointer.

Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11787
2017-07-31 17:57:54 +00:00
Scott Long
2068b2aa88 Fix a logic bug in the split PCI interrupt code that slipped through
Reported by:	Harry Schmalzbauer
2017-07-31 16:55:56 +00:00
Ian Lepore
def6c0c3ee Restore a few rather important lines of code that got fumbled in r321746. 2017-07-31 16:46:16 +00:00
Ian Lepore
bf9c1267f3 Check the clock-halted flag every time the clock is read, not just once
at startup.  The flag stays set until the clock is loaded with good time,
so we need to keep saying the time is invalid until that happens.
2017-07-31 15:24:40 +00:00
Alexander Motin
e9c9826673 Improve FHA locality control for NFS read/write requests.
This change adds two new tunables, allowing to control serialization for
read and write NFS requests separately.  It does not change the default
behavior since there are too many factors to consider, but gives additional
space for further experiments and tuning.

The main motivation for this change is very low write speed in case of ZFS
with sync=always or when NFS clients requests sychronous operation, when
every separate request has to be written/flushed to ZIL, and requests are
processed one at a time.  Setting vfs.nfsd.fha.write=0 in that case allows
to increase ZIL throughput by several times by coalescing writes and cache
flushes.  There is a worry that doing it may increase data fragmentation
on disks, but I suppose it should not happen for pool with SLOG.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2017-07-31 15:23:19 +00:00
Ian Lepore
ccf32b9887 Add a detach() method. 2017-07-31 14:58:01 +00:00