121621 Commits

Author SHA1 Message Date
Conrad Meyer
3b8d52d371 blake2: Disable warnings (not just error) for code we will not modify
Leave libb2 pristine and silence the warnings for mjg.
2018-04-21 02:08:56 +00:00
Emmanuel Vadot
bfeb2bd7ca regulator: Check status before disabling
When disabling regulator when they are unused, check before is they are
enabled.
While here don't check the enable_cnt on the regulator entry as it is
checked by regnode_stop.
This solve the panic on any board using a fixed regulator that is driven
by a gpio when the regulator is unused.

Tested On: OrangePi One
Pointy Hat to:	    myself
Reported by:	kevans, Milan Obuch (freebsd-arm@dino.sk)
2018-04-20 20:30:33 +00:00
Emmanuel Vadot
c712135a32 gnu/dts: Update our copy of arm dts from Linux 4.16 2018-04-20 19:37:08 +00:00
Konstantin Belousov
1302eea7bb Rename PROC_PDEATHSIG_SET -> PROC_PDEATHSIG_CTL and PROC_PDEATHSIG_GET
-> PROC_PDEATHSIG_STATUS for consistency with other procctl(2)
operations names.

Requested by:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	13 days
2018-04-20 15:19:27 +00:00
Andriy Gapon
f87beb93e8 call racct_proc_ucred_changed() under the proc lock
The lock is required to ensure that the switch to the new credentials
and the transfer of the process's accounting data from the old
credentials to the new ones is done atomically.  Otherwise, some updates
may be applied to the new credentials and then additionally transferred
from the old credentials if the updates happen after proc_set_cred() and
before racct_proc_ucred_changed().

The problem is especially pronounced for RACCT_RSS because
- there is a strict accounting for this resource (it's reclaimable)
- it's updated asynchronously by the vm daemon
- it's updated by setting an absolute value instead of applying a delta

I had to remove a call to rctl_proc_ucred_changed() from
racct_proc_ucred_changed() and make all callers of latter call the
former as well.  The reason is that rctl_proc_ucred_changed, as it is
implemented now, cannot be called while holding the proc lock, so the
lock is dropped after calling racct_proc_ucred_changed.  Additionally,
I've added calls to crhold / crfree around the rctl call, because
without the proc lock there is no gurantee that the new credentials,
owned by the process, will stay stable.  That does not eliminate a
possibility that the credentials passed to the rctl will get stale.
Ideally, rctl_proc_ucred_changed should be able to work under the proc
lock.

Many thanks to kib for pointing out the above problems.

PR:		222027
Discussed with:	kib
No comment:	trasz
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D15048
2018-04-20 13:08:04 +00:00
Rick Macklem
cb0d9834d4 Fix use of pointer after being set NULL.
Using a pointer after setting it NULL is probably not a good plan.
Spotted by inspection during changes for Flexible File Layout Ioerr handling.
This code path obviously isn't normally executed.

MFC after:	1 week
2018-04-20 11:38:29 +00:00
Andrey V. Elsukov
2b9600b449 Add dead_bpf_if structure, that should be used as fake bpf_if
during ifnet detach.

Since destroying interface is not atomic operation and due to the
lack of synhronization during destroy, it is possible, that in the
time between bpfdetach() and if_free() some queued on destroying
interface mbuf will be used by ether_input_internal() and
bpf_peers_present() can dereference NULL bpf_if pointer. To protect
from this, assign pointer to empty bpf_if_ext structure instead of
NULL pointer after bpfdetach().

Reviewed by:	melifaro, eugen
Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D15083
2018-04-20 09:57:31 +00:00
Justin Hibbits
567dd766f6 powerpc64: Set n_slbs = 32 for POWER9
Summary:
POWER9 also contains 32 slbs entries as explained by the POWER9 User Manual:

 "For HPT translation, the POWER9 core contains a unified (combined for both
   instruction and data), 32-entry, fully-associative SLB per thread"

Submitted by:	Breno Leitao
Differential Revision: https://reviews.freebsd.org/D15128
2018-04-20 03:23:19 +00:00
Justin Hibbits
2914706ab0 powerpc64: Add DSCR support
Summary:
Powerpc64 has support for a register called Data Stream Control Register
(DSCR), which basically controls how the hardware controls the caching and
prefetch for stream operations.

Since mfdscr and mtdscr are privileged instructions, we need to emulate them,
and
keep the custom DSCR configuration per thread.

The purpose of this feature is to change DSCR depending on the operation, set
to DSCR Default Prefetch Depth to deepest on string operations, as memcpy.

Submitted by:	Breno Leitao
Differential Revision: https://reviews.freebsd.org/D15081
2018-04-20 03:19:44 +00:00
Rick Macklem
6269d66373 Fix OpenDowngrade for NFSv4.1 if a client sets the OPEN_SHARE_ACCESS_WANT* bits.
The NFSv4.1 RFC specifies that the OPEN_SHARE_ACCESS_WANT bits can be set
in the OpenDowngrade share_access argument and are basically ignored.
I do not know of a extant NFSv4.1 client that does this, but this little
patch fixes it just in case.
It also changes the error from NFSERR_BADXDR to NFSERR_INVAL since the NFSv4.1
RFC specifies this as the error to be returned if bogus bits are set.
(The NFSv4.0 RFC didn't specify any error for this, so the error reply can
 be changed for NFSv4.0 as well.)
Found by inspection while looking at a problem with OpenDowngrade reported
for the ESXi 6.5 NFSv4.1 client.

Reported by:	andreas.nagy@frequentis.com
PR:		227214
MFC after:	1 week
2018-04-19 20:30:33 +00:00
Nathan Whitehorn
323e673945 Fix detection of memory overlap with the kernel in the case where a memory
region marked "available" by firmware is contained entirely in the kernel.

This had a tendency to happen with FDTs passed by loader, though could for
other reasons as well, and would result in the kernel slowly cannibalizing
itself for other purposes, eventually resulting in a crash.

A similar fix is needed for mmu_oea.c and should probably just be rolled
at that point into some generic code in platform.c for taking a mem_region
list and removing chunks.

PR:		226974
Submitted by:	leandro.lupori@gmail.com
Reviewed by:	jhibbits
Differential Revision:	D15121
2018-04-19 18:34:38 +00:00
Navdeep Parhar
8aa1c1d8b4 cxgbe(4): Fix bugs in the handling of COP rules that match on VLAN tag.
Retrieve the tag from the correct ifnet and use the provided tag
(instead of hardcoded 0xffff, implying no tag) in the routines that
process offload policy.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
2018-04-19 18:10:44 +00:00
Konstantin Belousov
38858594c1 Use symbolic constant, explaining the operation.
Sponsored by:	The FreeBSD Foundation
2018-04-19 18:08:46 +00:00
Warner Losh
e8bef32ce2 Reword comment to remove awkward constructs, including an "it's" that
shouldn't have been there at all (it wasn't a typo for its, rather a
left-over from an older revision of the comment).

Noticed by: many
2018-04-19 16:05:48 +00:00
John Baldwin
73c8686e91 Simplify the code to allocate stack for auxv, argv[], and environment vectors.
Remove auxarg_size as it was only used once right after a confusing
assignment in each of the variants of exec_copyout_strings().

Reviewed by:	emaste
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D15123
2018-04-19 16:00:34 +00:00
Warner Losh
b3e85e7a79 Intel drives have an optimal alignment for I/O. While they honor I/Os
that cross this boundary, they perform better when this isn't the
case. Intel uses the 3rd byte in the vendor specific area for
this. The DC P3500 was previously listed without any explanation. Add
the DC P3520 and DC P4500 to the list.

There won't be any others drives needing this quirk. Intel has
standardized a field in the namespace data in 1.3 (noiob).  A future
patch will use that if it exists, with fallback to this method.

Submitted by: Keith Busch
Reviewed by: jimharris@
2018-04-19 15:39:20 +00:00
Alexander Motin
596e6ade92 Release memory resource on cuda driver attach failure.
Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
2018-04-19 15:29:10 +00:00
Conrad Meyer
179b21e8b1 cryptosoft: Do not exceed crd_len around *crypt_multi
When a caller passes in a uio or mbuf chain that is longer than crd_len, in
tandem with a transform that supports the multi-block interface,
swcr_encdec() would process the entire mbuf or uio instead of just the
portion indicated by crd_len (+ crd_skip).

De/encryption are performed in-place, so this would trash subsequent uio or
mbuf contents.

This was introduced in r331639 (mea culpa).  It only affects the
{de,en}crypt_multi() family of interfaces.  That interface only has one
consumer transform in-tree (for now): Chacha20.

PR:		227605
Submitted by:	Valentin Vergez <valentin.vergez AT stormshield.eu>
2018-04-19 15:24:21 +00:00
Randall Stewart
fd389e7cd5 These two modules need the tcp_hpts.h file for
when the option is enabled (not sure how LINT/build-universe
missed this) opps.

Sponsored by:	Netflix Inc
2018-04-19 15:03:48 +00:00
Mark Johnston
64b3893010 Initialize marker pages in vm_page_domain_init().
They were previously initialized by the corresponding page daemon
threads, but for vmd_inacthead this may be too late if
vm_page_deactivate_noreuse() is called during boot.

Reported and tested by:	cperciva
Reviewed by:	alc, kib
MFC after:	1 week
2018-04-19 14:09:44 +00:00
Randall Stewart
3ee9c3c4eb This commit brings in the TCP high precision timer system (tcp_hpts).
It is the forerunner/foundational work of bringing in both Rack and BBR
which use hpts for pacing out packets. The feature is optional and requires
the TCPHPTS option to be enabled before the feature will be active. TCP
modules that use it must assure that the base component is compile in
the kernel in which they are loaded.

MFC after:	Never
Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D15020
2018-04-19 13:37:59 +00:00
Andriy Gapon
f3f6ecb450 set kdb_why to "trap" when calling kdb_trap from trap_fatal
This will allow to hook a ddb script to "kdb.enter.trap" event.
Previously there was no specific name for this event, so it could only
be handled by either "kdb.enter.unknown" or "kdb.enter.default" hooks.
Both are very unspecific.

Having a specific event is useful because the fatal trap condition is
very similar to panic but it has an additional property that the current
stack frame is the frame where the trap occurred.  So, both a register
dump and a stack bottom dump have additional information that can help
analyze the problem.

I have added the event only on architectures that have trap_fatal()
function defined.  I haven't looked at other architectures.  Their
maintainers can add support for the event later.

Sample script:
kdb.enter.trap=bt; show reg; x/aS $rsp,20; x/agx $rsp,20

Reviewed by:	kib, jhb, markj
MFC after:	11 days
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D15093
2018-04-19 05:06:56 +00:00
Konstantin Belousov
b940886338 Add PROC_PDEATHSIG_SET to procctl interface.
Allow processes to request the delivery of a signal upon death of
their parent process.  Supposed consumer of the feature is PostgreSQL.

Submitted by:	Thomas Munro
Reviewed by:	jilles, mjg
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D15106
2018-04-18 21:31:13 +00:00
Konstantin Belousov
919015a42e Fix pmap_trm_alloc(M_ZERO).
Sponsored by:	The FreeBSD Foundation
2018-04-18 20:09:26 +00:00
Konstantin Belousov
17cdb360cb For fatal traps other than pagefaults, print raw fault error codes.
For pagefaults, the error is already decoded and printed.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-04-18 20:07:47 +00:00
John Baldwin
f36411145e Fix two off-by-one errors when allocating MSI and MSI-X interrupts.
x86 enforces an (arbitray) limit on the number of available MSI and
MSI-X interrupts to simplify code (in particular, interrupt_source[]
is statically sized).  This means that an attempt to allocate an MSI
vector needs to fail if it would go beyond the limit, but the checks
for exceeding the limit had an off-by-one error.  In the case of MSI-X
which allocates interrupts one at a time this meant that IRQ 768 kept
getting handed out multiple times for msix_alloc() instead of failing
because all MSI IRQs were in use.

Tested by:	lidl
MFC after:	1 week
2018-04-18 18:45:34 +00:00
John Baldwin
eb33d8ce3e Workaround fixed I/O port resources encoded as I/O port ranges in _CRS.
ACPI I/O port descriptors use _MIN and _MAX fields to specify the set
of allowable base (start) addresses for an I/O port resource along with
a _LEN field specifying the length.  A fixed resource is supposed to be
encoded with _MIN == _MAX, but some buggy firmwares instead set _MAX to
the end of the fixed range.  Relocating I/O ranges only make sense in
_PRS (possible resource settings), not in _CRS (current resource settings),
so if an I/O port range with _MAX set set to the end of the range is
present in _CRS, treat it as a fixed I/O port resource starting at
_MIN.

PR:		224096
Submitted by:	Harald Böhm <harald@boehm.codes>
Pointy hat to:	jhb (taking so long to actually commit this)
MFC after:	1 week
2018-04-18 18:36:26 +00:00
Andriy Gapon
6d83b2e971 don't check for kdb reentry in trap_fatal(), it's impossible
trap() checks for it earlier and calls kdb_reentry().

Discussed with:	jhb
MFC after:	12 days
Sponsored by:	Panzura
2018-04-18 15:44:54 +00:00
Stephen Hurd
0b75ac7796 iflib: Fix queue distribution when there are no threads
Previously, if there are no threads, all queues which targeted
cores that share an L2 cache were bound to a single core. The intent is
to distribute them across these cores.

Reported by:	olivier
Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15120
2018-04-18 15:34:18 +00:00
Brooks Davis
edba22b0fb Remove references to fs_nofault_intr_begin/end.
These should have removed in r332656.

Reported by:	mjg, lidl
2018-04-17 22:30:00 +00:00
Mark Johnston
9de8fcfddf Ensure that m and skip_m belong to the same object.
Pages allocated from a given reservation may belong to different
objects. It is therefore possible for vm_page_ps_test() to be called
with the base page's object unlocked. Check for this case before
asserting that the object lock is held.

Reported by:	jhb
Reviewed by:	kib
MFC after:	1 week
2018-04-17 18:49:17 +00:00
John Baldwin
8ce99bb405 Properly do a deep copy of the ioctls capability array for fget_cap().
fget_cap() tries to do a cheaper snapshot of a file descriptor without
holding the file descriptor lock.  This snapshot does not do a deep
copy of the ioctls capability array, but instead uses a different
return value to inform the caller to retry the copy with the lock
held.  However, filecaps_copy() was returning 1 to indicate that a
retry was required, and fget_cap() was checking for 0 (actually
'!filecaps_copy()').  As a result, fget_cap() did not do a deep copy
of the ioctls array and just reused the original pointer.  This cause
multiple file descriptor entries to think they owned the same pointer
and eventually resulted in duplicate frees.

The only code path that I'm aware of that triggers this is to create a
listen socket that has a restricted list of ioctls and then call
accept() which calls fget_cap() with a valid filecaps structure from
getsock_cap().

To fix, change the return value of filecaps_copy() to return true if
it succeeds in copying the caps and false if it fails because the lock
is required.  I find this more intuitive than fixing the caller in
this case.  While here, change the return type from 'int' to 'bool'.

Finally, make filecaps_copy() more robust in the failure case by not
copying any of the source filecaps structure over.  This avoids the
possibility of leaking a pointer into a structure if a similar future
caller doesn't properly handle the return value from filecaps_copy()
at the expense of one more branch.

I also added a test case that panics before this change and now passes.

Reviewed by:	kib
Discussed with:	mjg (not a fan of the extra branch)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D15047
2018-04-17 18:07:40 +00:00
Brooks Davis
9c11d8d483 Remove the unused fuwintr() and suiwintr() functions.
Half of implementations always failed (returned (-1)) and they were
previously used in only one place.

Reviewed by:	kib, andrew
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15102
2018-04-17 18:04:28 +00:00
Brooks Davis
427be5bb3a Remove unused implementations of copyoutstr().
Also remove the commented out documentation.  The documentation arrived
with the import of the copy.9 manpage.  I suspect the implementations
came from NetBSD while bootstrapping the Arm and MIPS ports.

Reviewed by:	andrew, jmallett
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15108
2018-04-17 17:20:04 +00:00
Andrew Gallatin
bca3808097 Restore SIOCGI2C functionality to ixgbe
When ixgbe was converted to iflib, it lost the SIOCGI2C support
that allows ifconfig to print SFP state, optical light levels, etc.
Restore this by plugging in to the ifdi_i2c_req iflib method.  Note
that the sanity checking on dev_addr that used to be done in ixgbe is
now done in iflib.

Reviewed by:	erj, Matthew Macy <mmacy@mattmacy.io>
Sponsored by:	Netflix
2018-04-17 16:51:27 +00:00
Warner Losh
a397def9fb Add PNP info to the PCI attahement of the puc driver.
Adjust sys/conf/files and sys/modules/puc/Makefile to omit
pucdata.c now tht it's included by puc_pci.c.

Submitted by: Lakhan Shiva Kamireddy (with build fixes by me)
Pull Request: https://github.com/freebsd/freebsd/pull/136
2018-04-17 16:46:08 +00:00
Warner Losh
aeb291cc32 Add PNP info to the bce driver.
Submitted by: Lakhan Shiva Kamireddy
Pull Request: https://github.com/freebsd/freebsd/pull/136
2018-04-17 16:46:01 +00:00
Brooks Davis
cee61c8cac Stop using fuswintr() and suswintr() in the profiler.
Always take the AST path rather than calling MD functions which are
often implemented as always failing. The is the case on amd64, arm,
i386, and powerpc. This optimization (inherited from 4.4 Lite) is a
pessimization on those architectures and is the sole use of these
functions. They will be removed in a seperate commit.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15101
2018-04-17 16:36:53 +00:00
Warner Losh
3531bbb5ea Restore db_radix on parse error, otherwise we'll silently change it to
10 on a botched trace command.
2018-04-17 15:44:05 +00:00
Ram Kishore Vegesna
5eaf9435d0 Moved opts-stack.h include before all other includes.
PR: 227446
Approved by: ken
MFC after: 3 days
2018-04-17 15:29:32 +00:00
Alan Somers
52c0983128 lio_listio: return EAGAIN instead of EIO when out of resources
This behavior is already documented by the man page, and suggested by POSIX.

Reviewed by:	jhb
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D15099
2018-04-16 18:12:15 +00:00
Brooks Davis
b6cb3eab1e Remove unused badaddr() function.
Reviewed by:	jmallett
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15078
2018-04-16 17:43:26 +00:00
Warner Losh
8b999cc69b Use bool instead of boolean_t here. No reason to use boolean_t.
Also, stop passing FALSE to a bool parameter.
2018-04-16 13:52:40 +00:00
Warner Losh
9d89a9f326 No need to force md code to define a macro that's the same as
_BYTE_ORDER. Use that instead.
2018-04-16 13:52:23 +00:00
Justin Hibbits
3877c32ec9 Use a resource hint instead of environment variable for DIU mode
This makes it more consistent with FreeBSD norms, rather than using Linux's
norms.  Now, instead of needing an environment variable

  video-mode=fslfb:1280x1024@60

Now one would use a hint:

  hint.fb.0.mode=1280x1024@60
2018-04-16 04:02:53 +00:00
Alexander Motin
bbbac409fe 9433 Fix ARC hit rate
When the compressed ARC feature was added in commit d3c2ae1
the method of reference counting in the ARC was modified.  As
part of this accounting change the arc_buf_add_ref() function
was removed entirely.

This would have be fine but the arc_buf_add_ref() function
served a second undocumented purpose of updating the ARC access
information when taking a hold on a dbuf.  Without this logic
in place a cached dbuf would not migrate its associated
arc_buf_hdr_t to the MFU list.  This would negatively impact
the ARC hit rate, particularly on systems with a small ARC.

This change reinstates the missing call to arc_access() from
dbuf_hold() by implementing a new arc_buf_access() function.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2018-04-16 00:54:58 +00:00
Brooks Davis
5aafa305af Remove device cm which was removed in r332490. 2018-04-15 15:06:07 +00:00
Warner Losh
5bc896bcc2 Make first a 'bool' instead of a 'boolean_t'.
'bool' is preferred to 'boolean_t'. We only get the boolean_t
definition by header pollution (though the same is true for
bool). Since we use both, switch entirely to bool.

Note: We still have TRUE/FALSE instead of true/false in heavy use in
the rest of the file. These are with ints of various flavors, so
that's appropriate, even though we should eventually migrate to bool
and true/false (though the tables they are in are nicely packed with
short and wouldn't be so nicely packed with bool, another reason
to leave it alone for now).
2018-04-14 22:14:18 +00:00
Navdeep Parhar
1131c927c4 cxgbe(4): Add support for Connection Offload Policy (aka COP).
COP allows fine-grained control on whether to offload a TCP connection
using t4_tom, and what settings to apply to a connection selected for
offload.  t4_tom must still be loaded and IFCAP_TOE must still be
enabled for full TCP offload to take place on an interface.  The
difference is that IFCAP_TOE used to be the only knob and would enable
TOE for all new connections on the inteface, but now the driver will
also consult the COP, if any, before offloading to the hardware TOE.

A policy is a plain text file with any number of rules, one per line.
Each rule has a "match" part consisting of a socket-type (L = listen,
A = active open, P = passive open, D = don't care) and a pcap-filter(7)
expression, and a "settings" part that specifies whether to offload the
connection or not and the parameters to use if so.  The general format
of a rule is: [socket-type] expr => settings

Example.  See cxgbetool(8) for more information.
[L] ip && port http => offload
[L] port 443 => !offload
[L] port ssh => offload
[P] src net 192.168/16 && dst port ssh => offload !nagle !timestamp cong newreno
[P] dst port ssh => offload !nagle ecn cong tahoe
[P] dst port http => offload
[A] dst port 443 => offload tls
[A] dst net 192.168/16 => offload !timestamp cong highspeed

The driver processes the rules for each new listen, active open, or
passive open and stops at the first match.  There is an implicit rule at
the end of every policy that prohibits offload when no rule in the
policy matches:
[D] all => !offload

This is a reworked and expanded version of a patch submitted by
Krishnamraju Eraparaju @ Chelsio.

Sponsored by:	Chelsio Communications
2018-04-14 19:07:56 +00:00
Konstantin Belousov
23084818ff Set PG_G global mapping bit on the trampoline ptes.
Trampoline mappings are better treated as global since they are valid
in all address spaces, even for PTI.  pmap_invalidate_range() must work
on global mappings for pti since kernel_pmap invalidations are really
same as for non-PTI.

Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D15052
2018-04-14 17:33:16 +00:00