Commit Graph

122827 Commits

Author SHA1 Message Date
Mark Johnston
a8be239d69 Re-count available PV entries after reclaiming a PV chunk.
The call to reclaim_pv_chunk() in reserve_pv_entries() may free a
PV chunk with free entries belonging to the current pmap.  In this
case we must account for the free entries that were reclaimed, or
reserve_pv_entries() may return without having reserved the requested
number of entries.

Reviewed by:	alc, kib
Tested by:	pho (previous version)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D15911
2018-06-23 10:41:52 +00:00
Jeff Roberson
63b5557b2f Sort uma_zone fields according to 64 byte cache line with adjacent line
prefetch on 64bit architectures.  Prior to this, two lines were needed
for the fast path and each line may fetch an unused adjacent neighbor.
 - Move fields used by the fast path into a single line.
 - Move constants into the adjacent line which is mostly used for
   the spare bucket alloc 'medium path'.
 - Unpad the mtx which is only used by the fast path and place it in
   a line with rarely used data.  This aligns the cachelines better and
   eliminates 128 bytes of wasted space.

This gives a 45% improvement on a will-it-scale test on a 24 core machine.

Reviewed by:	mmacy
2018-06-23 08:10:09 +00:00
Matt Macy
0bcfb47363 epoch(9): Don't trigger taskq enqueue before the grouptaskqs are setup
If EARLY_AP_STARTUP is not defined it is possible for an epoch to be
allocated prior to it being possible to call epoch_call without
issue.

Based on patch by andrew@

PR:		229014
Reported by:	andrew
2018-06-23 07:14:08 +00:00
Gleb Smirnoff
a00f4ac22f Revert r334843, and partially revert r335180.
tcp_outflags[] were defined since 4BSD and are defined nowadays in
all its descendants. Removing them breaks third party application.
2018-06-23 06:53:53 +00:00
Justin Hibbits
ab42fbe2e9 powerpc64: Fix stack setup in dbtrap
r330610 relocated the DMAP from the base of memory to the base of the fourth
quadrant of memory.  This broke synthetic traps, such as KDB forced
breakpoints.  Use GET_TOCBASE() so the DMAP offset is handled.

Submitted by:	git_bdragon.rkt0.net
Differential Revision:	https://reviews.freebsd.org/D15973
2018-06-23 01:42:34 +00:00
Rick Macklem
9f4c522e6b Set the slotid and ND_HASSLOTID flag for NFSv4.1 sequenced operations.
Most NFSv4.1 compound RPCs start with a Sequence operation. For these
cases, save the slotid and note that it is saved by setting ND_HASSLOTID.
This is used by r335568 to free up the session slot and disable it.

MFC after:	2 weeks
2018-06-23 00:48:45 +00:00
Rick Macklem
b18130d330 Define ND_HASSLOTID needed by r335568.
r335568 uses a flag called ND_HASSLOTID to indicate that the slotid is set,
so it can free and invalidate it.
This flag needs to be set, which will be done in a subsequent commit.

MFC after:	2 weeks
2018-06-23 00:37:15 +00:00
Kristof Provost
150182e309 pf: Support "return" statements in passing rules when they fail.
Normally pf rules are expected to do one of two things: pass the traffic or
block it. Blocking can be silent - "drop", or loud - "return", "return-rst",
"return-icmp". Yet there is a 3rd category of traffic passing through pf:
Packets matching a "pass" rule but when applying the rule fails. This happens
when redirection table is empty or when src node or state creation fails. Such
rules always fail silently without notifying the sender.

Allow users to configure this behaviour too, so that pf returns an error packet
in these cases.

PR:		226850
Submitted by:	Kajetan Staszkiewicz <vegeta tuxpowered.net>
MFC after:	1 week
Sponsored by:	InnoGames GmbH
2018-06-22 21:59:30 +00:00
Rick Macklem
ba6cce3aea Fix the handling of NFSv4.1 sessions for "soft" mounts.
When a "soft" mount is used for NFSv4.1, an RPC that fails without completing
will leave a slot in the NFSv4.1 session in an indeterminate state.
As such, all that can be done is free up the slot while making is no longer
usable.
A "soft" NFSv4.1 mount is not recommended in general, since it will leave
Open/Lock state in an indeterminate state. An exception is a pNFS mount of
a DS, since there are no Opens/Locks done for them except file creates
where loss of the Open state does not matter.
The patch also makes connections to DSs soft, so that they will fail when
a DS is non-functional or network partitioned, allowing the pNFS MDS to disable
the DS for a mirrored configuration.
This patch should not affect normal "hard" NFSv4.1 mounts.

MFC after:	2 weeks
2018-06-22 21:37:20 +00:00
Rick Macklem
2e35b8fe24 Change the NFSv4.1 pNFS client so that it returns the DS error in layoutreturn.
When the NFSv4.1 pNFS client gets an error for a DS I/O operation using a
Flexible File layout, it returns the layout with an error.
This patch changes the code slightly, so that it returns the layout for all
errors except EACCES and lets the MDS decide what to do based on the error.
It also makes a couple of changes to nfscl_layoutrecall() to ensure that
the first layoutreturn(s) will have the error in the reply.
Plus, the patch adds a wakeup() so that the "nfscl" thread won't wait 1sec
before doing the LayoutReturn.
Tested against the pNFS service.
This patch should not affect non-pNFS use of the client.
The unused "dsp" argument will be used by a future patch that disables the
connection to the DS when possible.

MFC after:	2 weeks
2018-06-22 21:25:27 +00:00
Ian Lepore
ce0b0df7ab Add spigen(4) fdt data overlays for RPI-B, RPI-2.
By adding spigen-rpi{2,-b}.dtso to fdt_overlays= in loader.conf, the fdt data
will set up the correct pinmux and device nodes to create a spigen(4) device
for each available chipselect pin.

Submitted by:	Bob Frazier
Differential Revision:	https://reviews.freebsd.org/D15067
2018-06-22 20:45:40 +00:00
Ian Lepore
c5b7751fa2 Eliminate a spurious panic on non-SMP systems (occurred on shutdown/reboot). 2018-06-22 20:22:26 +00:00
Colin Percival
7e8db78116 Improve the accuracy of the POSIX "process CPU-time" clocks by adding the
used portion of the current thread's time slice if the current thread
belongs to the process being queried (i.e., if clock_gettime is invoked
with a clock ID of CLOCK_PROCESS_CPUTIME_ID or the value provided by
passing getpid(2) to clock_getcpuclockid(3)).

The CLOCK_VIRTUAL and CLOCK_PROF timers already make this adjustment via
long-standing code in calcru(), but since those timers are not specified
by POSIX it seems useful to add it here so that the higher accuracy is
available to code which aims to be portable.

PR:		228669
Reported by:	Graham Percival
Reviewed by:	kib
MFC after:	1 week
2018-06-22 10:23:32 +00:00
Rick Macklem
c16f407e31 Add a counter to limit the number of disabled DSs for a mirrored pNFS MDS.
This patch adds a counter that limits the number of disabled mirrored DSs
to mirror level - 1.  It also makes a small change that keeps a Write that
has failed with EACCES when attempted by a client to a DS from disabling
the DS.
This patch only affects the pNFS server.
2018-06-22 00:55:39 +00:00
Matt Macy
ae25f40b72 epoch(9): make non-preemptible variant work early boot 2018-06-22 00:47:18 +00:00
Chuck Tuffli
4c4317193b Fix output of linprocfs stat entry
The Linux /proc/stat entry has grown over time

 v2.5.41 <
   user, nice, system, idle
 v2.5.41
   user, nice, system, idle, iowait, irq
 v2.6.11
   user, nice, system, idle, iowait, irq, softirq, steal
 v2.6.24
   user, nice, system, idle, iowait, irq, softirq, steal, guest
 v2.6.32 >
   user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice

Some applications (e.g. nodejs) depend on the correct number of entries
and will abort otherwise.

Fix is to print the correct number of entries based on the value of
osrelease set either in sysctl or the jail settings. Change is similar
to approach used by illumos.

Reviewed by: emaste, imp (mentor)
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D15858
2018-06-22 00:02:05 +00:00
Chuck Tuffli
3575504976 Fix the Linux kernel version number calculation
The Linux compatibility code was converting the version number (e.g.
2.6.32) in two different ways and then comparing the results.

The linux_map_osrel() function converted MAJOR.MINOR.PATCH similar to
what FreeBSD does natively. I.e. where major=v0, minor=v1, and patch=v2
    v = v0 * 1000000 + v1 * 1000 + v2;

The LINUX_KERNVER() macro, on the other hand, converted the value with
bit shifts. I.e. where major=a, minor=b, and patch=c
    v = (((a) << 16) + ((b) << 8) + (c))

The Linux kernel uses the later format via the KERNEL_VERSION() macro in
include/generated/uapi/linux/version.h

Fix is to use the LINUX_KERNVER() macro in linux_map_osrel() as well as
in the .trans_osrel functions.

PR: 229209
Reviewed by: emaste, cem, imp (mentor)
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D15952
2018-06-22 00:02:03 +00:00
Kyle Evans
03d7aee8a7 subr_hints: Fix acpi unit hinting (at the very least)
The refactoring in r335479 overlooked the fact that the dynamic kenv can
also be switched to if hintmode == 0. This is problematic because the
checkmethod bits are only ever ran once, but it worked previously because
the use_kenv was a global state and the first lookup would enable it if
occurring after the dynamic environment has been setup.

Extending our local definition of use_kenv to include all non-STATIC
hintmodes as long as the dynamic_kenv is setup fixes this. We still have
potential issues if the dynamic kenv comes up while we're doing an anchored
search through the environment, but this is not much of a concern right now
because:

1.) The dynamic environment comes up super early in boot, just after kmem

2.) This is going to get rewritten to provide a safer mechanism for the
anchored searches, ensuring that we continue using the same environment
chain (dynamic env or static fallback) for all anchored search invocations

Reported by:	mmamcy
X-MFC-With: r335479
2018-06-21 21:50:00 +00:00
Ian Lepore
1fcf4de055 Incorporate bus and chip select numbers into spigen(4) cdev names. Rather
than assigning spigen device names in order of creation, this uses a device
name that corresponds to the owning spibus and chip-select index.

Example: /dev/spigen0.1 would be a child of spibus0, and use cs = 1

The intent is for systems like Raspberry Pi to have a consistent way of
using an SPI interface with a specific cs value from a user application.
Otherwise, there is no consistent way of knowing which cs pin will be
assigned to a particular spigen device. The alternative is to specify
everything in "the right order" in an overlay file, which is less than
ideal. Additionally, this duplicates (to some extent) the way Linux handles
a similar situation with their 'spidev' device, so it would be somewhat
familiar to those who also use Linux.

A new kernel config option, SPIGEN_LEGACY_CDEVNAME, causes the driver to
also create /dev/spigenN device name aliases, with N incrementing in the
order of device instantiation.  This is provided to ease the transition
for existing systems using the original naming convention (particularly
when these changes are MFC'd to stable branches).

Differential Revision:	https://reviews.freebsd.org/D15301
2018-06-21 21:16:26 +00:00
Konstantin Belousov
31665c1abb linux_clone_thread: mark new thread as TDB_BORN.
So that the ptrace code will catch it and report it to attached
debugger.  Enables debugging of threaded Linux binaries with FreeBSD
debugger.

Submitted by:	Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15880
2018-06-21 21:15:04 +00:00
Konstantin Belousov
6e22bbf66e fork: avoid endless wait with PTRACE_FORK and RFSTOPPED.
An RFSTOPPED thread can't clean TDB_STOPATFORK, which is done in the
fork_return() in its context, so parent is stuck forever.  Triggered
when trying to ptrace linux process.  Instead of waiting for the new
thread to clear TDB_STOPATFORK, tag it as traced and reparent to the
debugger in do_fork(), and let it only notify the debugger when run.

Submitted by:	Yanko Yankulov <yanko.yankulov@gmail.com>
Reviewed by:	jhb
MFC after:	1 week
X-MFC-Note:	keep p_dbgwait placeholder intact
Differential revision:	https://reviews.freebsd.org/D15857
2018-06-21 21:12:49 +00:00
Konstantin Belousov
ac4bc0c171 Update proc->p_ptevents annotation to reflect the actual locking.
Submitted by:	Yanko Yankulov <yanko.yankulov@gmail.com>
Reviewed by:	jhb
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15954
2018-06-21 21:07:25 +00:00
Randall Stewart
581a046a8b This adds in an optimization so that we only walk one
time through the mbuf chain during copy and TSO limiting.
It is used by both Rack and now the FreeBSD stack.
Sponsored by:	Netflix Inc
Differential Revision: https://reviews.freebsd.org/D15937
2018-06-21 21:03:58 +00:00
Matt Macy
e93fdbe212 raw_ip: validate inp in both loops
Continuation of r335497. Also move the lock acquisition up to
validate before referencing inp_cred.

Reported by:	pho
2018-06-21 20:18:23 +00:00
Matt Macy
3d348772e7 in_pcblookup_hash: validate inp before return
Post r335356 it is possible to have an inpcb on the hash lists that is
partially torn down. Validate before using. Also as a side effect of this
change the lock ordering issue between hash lock and inpcb no longer exists
allowing some simplification.

Reported by:	pho@
2018-06-21 18:40:15 +00:00
Conrad Meyer
e38b830780 Sync strlcpy with userland version, again
No functional change.

Please remember to update libkern copies of libc functions when you update
libc.

Sponsored by:	Dell EMC Isilon
2018-06-21 17:35:13 +00:00
Matt Macy
e5c331cf78 raw_ip: validate inp
Post r335356 it is possible to have an inpcb on the hash lists that is
partially torn down. Validate before using.

Reported by:	pho
2018-06-21 17:24:10 +00:00
Bryan Drewery
63c4317055 Minor comment fix d_namelen -> d_namlen 2018-06-21 16:40:07 +00:00
Justin Hibbits
88cafb5b91 Fix the build post-PMCR addition.
Submitted by:	lwhsu
2018-06-21 15:59:05 +00:00
Roger Pau Monné
de06f02ea4 xen: check if there are clients waiting in gnttab_end_foreign_access_references
Without a call to check_free_callbacks() clients waiting for grant
references would not be woken up even when there are sufficient grant
references available.

The check was likely left out as a mistake when the function was first
added.

Note that other functions used to free grant references already call
check_free_callbacks.

Submitted by:		pratyush
Reviewed by:		royger
Differential review:	https://reviews.freebsd.org/D15899
2018-06-21 15:47:47 +00:00
Ian Lepore
199b9ab84f Add a note about using option VERBOSE_SYSINIT=0 to get the verbose code
compiled in but disabled by default.
2018-06-21 14:59:23 +00:00
Justin Hibbits
b99540b655 Add the rest of the files for r335481
Missed hooking PMCR cpufreq(4) to the build, and adding the SPR to the header.
2018-06-21 14:30:14 +00:00
Justin Hibbits
22c1b4c0f1 Introduce PMCR-based cpufreq(4) driver, for IBM POWER8 and POWER9 systems
Summary: POWER8 and POWER9 use a single CPU register, per core, to change clock
speed.  Everything else is handled by the on-chip controller.  This change
necessitates a change to the cpufreq global kernel driver to bump supported
levels, as the device tree for these systems can have theoretically 256
different options.  On my POWER9 Talos, the list consists of 100 items.  At
16.67MHz intervals, that allows for a change of roughly 1.67GHz between lowest
and highest.

This has only been tested on the POWER9.  However, since they're similar, this
should work on POWER8 as well.

Reviewed By: nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15932
2018-06-21 14:26:43 +00:00
Kyle Evans
770488d202 subr_hints: simplify a little bit
Some complexity exists in these bits that isn't needed. The sysctl handler,
upon change to '2', runs through the current set of hints and sets them in
the kenv.

However, this isn't at all necessary if we're pulling hints from the kenv,
static or dynamic, as the former will get added to the latter in
init_dynamic_kenv (see: kern_environment.c). We can reduce this
configuration to just adding static_hints to the kenv if we were previously
using them.

The changes in res_find are minimal and based on the observation that once
use_kenv gets set to '1' it will never be reset to '0', and it gets set to
'1' as soon as we hit fallback mode. Later work will refactor res_find a
little bit and eliminate this now-local, because it's become clear that
there's some funkiness revolving around use_kenv=1 and it being used to
imply that we're certainly looking at the dynamic_kenv.

Reviewed by:	ray
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D15940
2018-06-21 14:04:02 +00:00
Ruslan Bukin
c43e3c8659 PLIC driver was sponsored by ECATS contract, not CTSRD one. 2018-06-21 11:52:09 +00:00
Ilya Bakulin
5e03278fee Add MMCCAM support to AllWinner MMC driver
Using MMCCAM on AllWinner boards is now possible, reaching highest
possible data transfer speed.

For now, MMCCAM doesn't scan cards on boot. This means that scanning
has to be done manually and that it's not possible to mount root FS
from MMC/SD card since there is no block device at the boot time.

For manually scanning the cards, run:
# camcontrol rescan X:0:0
Where X is the bus number (look at camcontrol devlist to determine
bus number assigned to the MMC controller).

Reviewed by:	manu
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D15891
2018-06-21 11:49:21 +00:00
Ruslan Bukin
b47999470d Fix uma_zalloc_pcpu_arg() operation in case of !SMP build.
Reviewed by:	mjg
Sponsored by:	DARPA, AFRL
2018-06-21 11:43:54 +00:00
Matt Macy
46374cbf54 udp_ctlinput: don't refer to unpcb after we drop the lock
Reported by: pho@
2018-06-21 06:10:52 +00:00
Eric Joyner
94c86dd038 ixl(4): Fix gcc build errors
By removing redundant function declarations.

Reported by:	ci.freebsd.org via Mark Millard <marklmi@yahoo.com>
MFC after:	1 month
2018-06-20 22:16:46 +00:00
Hans Petter Selasky
ce70c57262 Permit the kernel environment to set an array of numeric values for a single
sysctl(9) node.

Reviewed by:		kib@, imp@, jhb@
Differential Revision:	https://reviews.freebsd.org/D15802
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-06-20 20:04:20 +00:00
Kyle Evans
c7962400c9 Add debug.verbose_sysinit tunable for VERBOSE_SYSINIT
VERBOSE_SYSINIT is currently an all-or-nothing option. debug.verbose_sysinit
adds an option to have the code compiled in but quiet by default so that
getting this information from a device in the field doesn't necessarily
require distributing a recompiled kernel.

Its default is VERBOSE_SYSINIT's value as defined in the kernconf. As such,
the default behavior for simply omitting or including this option is
unchanged.

MFC after:	1 week
2018-06-20 19:23:56 +00:00
Emmanuel Vadot
78442297f5 Add pmap_mapdev_attr for arm64
This is needed for efifb.
arm and ricv pmap (the two arch with arm64 that uses subr_devmap) have very
different implementation so for now only add this for arm64.

Tested with efifb on Pine64 with a few other patches.

Reviewed by:	cognet
Differential Revision:	https://reviews.freebsd.org/D15294
2018-06-20 16:07:35 +00:00
Emmanuel Vadot
a2d7053188 if_rk_dwc: Disable setting delays for now
The values for tx/rx delays differs accross the different DTS.
Mainline Linux set it to 0x24/0x18
Mostly-Vendor u-boot (the one maintained and developped) to 0x18/0x18
Mostly-Vendor linux (the one maintained and developped) to 0x26/0x11

By experience only 0x18/0x18 works so until the issue is resolved rely on
the bootloader settings.
2018-06-20 15:27:09 +00:00
Emmanuel Vadot
5f35d609c7 rk_gpio: Read the correct register for gpio read
Reported by:	jmcneill
2018-06-20 14:46:07 +00:00
Emmanuel Vadot
e167047518 if_rk_dwc: Fix delays handling
The property are named {t,r}x_delay and not {t,r}-delay.
The upper bits of the register are a mask of which bits is allowed
to be written, set it otherwise we write nothing.
OF_getencprop returns <0 = for an error.

Pointy Hat: myself
Reported by:	jmcneill (delay and mask bits)
2018-06-20 14:45:26 +00:00
Justin Hibbits
d23d5d73e8 Attach dev.cpu nodes on powerpc SMT cores, using only the first found thread
Summary: In order to use cpufreq(4), a dev.cpu attachment must be created.  If
the IBM property is found denoting SMT, attach only to the first thread setup,
so that a cpufreq device can bind.

Reviewed by:	nwhitehorn
Differential Revision: https://reviews.freebsd.org/D15921
2018-06-20 13:30:35 +00:00
Bjoern A. Zeeb
7938a4425a Instead of using hand-rolled loops where not needed switch them
to FOREACH_PROC_IN_SYSTEM() to have a single pattern to look for.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D15916
2018-06-20 11:42:06 +00:00
Andrew Turner
c30e5927b6 Move the SYSINIT to allow userspace access to the ARM generic timer later
in the boot. It doesn't need to be early, so move it to the SI_ORDER_ANY
stage of SI_SUB_SMP.

Sponsored by:	DARPA, AFRL
2018-06-20 11:13:10 +00:00
Andrew Turner
1d802c646c Move the SMCCC SYSINIT later in the boot so the psci driver has attached.
Sponsored by:	DARPA, AFRL
2018-06-20 10:57:29 +00:00
Andrew Turner
95c4b3c735 Fix the SMCCC signatures, they are all 32-bit calls. This fixes SMCCC
version detection.

Sponsored by:	DARPA, AFRL
2018-06-20 10:02:50 +00:00