Commit Graph

119609 Commits

Author SHA1 Message Date
Michael Tuexen
ad15e1548f Unbreak compilation when using SCTP_DETAILED_STR_STATS option.
MFC after:	1 week
2017-11-24 12:18:48 +00:00
Hans Petter Selasky
8a53e1340f Merge ^/head r326132 through r326161. 2017-11-24 12:13:27 +00:00
Hans Petter Selasky
322f006ecc Implement atomic_fetchadd_64() for i386. This function is needed by the
atomic64 header file in the LinuxKPI for i386.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-11-24 12:10:42 +00:00
Hans Petter Selasky
c3125bc5bf Compile fixes for 32-bit architectures.
Sponsored by:	Mellanox Technologies
2017-11-24 12:08:50 +00:00
Hans Petter Selasky
e5806bf466 Compile fix for LINT-NOIP kernel target.
Sponsored by:	Mellanox Technologies
2017-11-24 12:05:49 +00:00
Andriy Gapon
a3bbbd5e40 amd-vi: a small whitespace cleanup
Reviewed by:	anish
2017-11-24 11:37:41 +00:00
Andriy Gapon
685c54fc6a amd-vi: use correct type for pci_rid, start_dev_rid, end_dev_rid sysctls
Previously, the values could look confusing because of unrelated bits from
adjacent memory.

Reviewed by:	anish
2017-11-24 11:36:35 +00:00
Andriy Gapon
eb6c9c128c amd-vi: small improvements to event printing
Ensure that an opening bracket always has a matching closing one.
Ensure that there is always a new-line at the end of a report line.
Also, add a space before the printed event flag.

Reviewed by:	anish
2017-11-24 11:35:43 +00:00
Andriy Gapon
dee38cdc2a amd-vi: print some additional details for INVALID_DEVICE_REQUEST event
Namely, the type of the hardware event and whether the transaction
was a translation request.

Reviewed by:	anish
2017-11-24 11:34:46 +00:00
Michael Tuexen
b7d2b5d5b1 Add SPDX line. 2017-11-24 11:25:53 +00:00
Andriy Gapon
53d580f984 amd-vi: fix up r326152, the new width requires a wider type
This is my brain-o from extending the width at the last moment.
2017-11-24 11:25:06 +00:00
Andriy Gapon
5a041f2183 amd-vi: fix and extend definition of Command and Event Status Register (0x2020)
The defined bits are the lower bits, not the higher ones.

Also, the specification has been extended to define bits 0:18 and they
all could potentially be interesting to us, so extend the width of the
field accordingly.

Reviewed by:	anish
2017-11-24 11:20:10 +00:00
Andriy Gapon
8523ad24ba vmm/amd: improve iteration over IVHD (type 10h) entries in IVRS table
Many 8-byte entries have zero at byte 4, so the second 4-byte part is
skipped as a 4-byte padding entry.  But not all 8-byte entries have that
property and they get misinterpreted.

A real example:
    48 00 00 00 ff 01 00 01
This an 8-byte ACPI_IVRS_TYPE_SPECIAL entry for IOAPIC with ID 255 (bogus).
It is reported as:
    ivhd0: Unknown dev entry:0xff
Fortunately, it was completely harmless.

Also, bail out early if we encounter an entry of a variable length type.
We do not have proper handling for those yet.

Reviewed by:	anish
2017-11-24 11:10:36 +00:00
Hans Petter Selasky
fd9e423c88 Make sure all tasks are cancelled synchronously in ipoib to avoid
use after free.

Sponsored by:	Mellanox Technologies
2017-11-24 09:55:20 +00:00
Hans Petter Selasky
68732efcd9 Build fix for ipoib when CONFIG_INFINIBAND_IPOIB_CM is defined.
Sponsored by:	Mellanox Technologies
2017-11-24 09:52:56 +00:00
Hans Petter Selasky
95ef56abc2 Build fix for kernel LINT target.
Sponsored by:	Mellanox Technologies
2017-11-24 09:12:13 +00:00
Ed Schouten
814629dd64 Don't let cpu_set_syscall_retval() clobber exec_setregs().
Upon successful completion, the execve() system call invokes
exec_setregs() to initialize the registers of the initial thread of the
newly executed process. What is weird is that when execve() returns, it
still goes through the normal system call return path, clobbering the
registers with the system call's return value (td->td_retval).

Though this doesn't seem to be problematic for x86 most of the times (as
the value of eax/rax doesn't matter upon startup), this can be pretty
frustrating for architectures where function argument and return
registers overlap (e.g., ARM). On these systems, exec_setregs() also
needs to initialize td_retval.

Even worse are architectures where cpu_set_syscall_retval() sets
registers to values not derived from td_retval. On these architectures,
there is no way cpu_set_syscall_retval() can set registers to the way it
wants them to be upon the start of execution.

To get rid of this madness, let sys_execve() return EJUSTRETURN. This
will cause cpu_set_syscall_retval() to leave registers intact. This
makes process execution easier to understand. It also eliminates the
difference between execution of the initial process and successive ones.
The initial call to sys_execve() is not performed through a system call
context.

Reviewed by:	kib, jhibbits
Differential Revision:	https://reviews.freebsd.org/D13180
2017-11-24 07:35:08 +00:00
Kyle Evans
e60d3b7ff4 Add ccu compat string for Allwinner a83t
A ccu driver was added for the a83t in r326114. Add compat string to
aw_ccung and register the clocks for the a83t upon attach.

Reviewed by:	manu
Approved by:	emaste (mentor, implicit)
Differential Revision:	https://reviews.freebsd.org/D13205
2017-11-24 02:39:38 +00:00
Andrew Turner
521018d379 Ensure we check the program state set in the trap frame on arm and arm64.
This value may be set by userspace so we need to check it before using it.
If this is not done correctly on exception return the kernel may continue
in kernel mode with all registers set to a userspace controlled value. Fix
this by moving the check into set_mcontext, and also add the missing
sanitisation from the arm64 set_regs.

Discussed with:	security-officer@
MFC after:	3 days
Sponsored by:	DARPA, AFRL
2017-11-23 17:40:40 +00:00
Mark Johnston
e9f63df76d Duplicate helpers after disabling inherited tracepoints during a fork.
We may create probes in the nascent child process, so we first need to
ensure that any inherited tracepoints are first removed. Otherwise the
probe sites will not be in the state expected by fasttrap, and it won't
be able to enable the probes.

MFC after:	2 weeks
2017-11-23 14:29:07 +00:00
Hans Petter Selasky
82725ba9bf Merge ^/head r325999 through r326131. 2017-11-23 14:28:14 +00:00
Mark Johnston
0349817103 Allow kern.geom.mirror.debug to be negative.
A negative value can be used to suppress all prints from the gmirror
kernel code, which can be useful when attempting to trigger race
conditions using stress tests.

MFC after:	1 week
2017-11-23 14:07:52 +00:00
Hans Petter Selasky
9ac7c5a64c Make sure the iSCSI I/O limits are set properly so that the ISCSIDSEND IOCTL
can be used prior to the ISCSIDHANDOFF IOCTL which set the negotiated values.
Else the login PDU will fail when passing the "-r" option to "iscsictl" which
means iSCSI over RDMA instead of TCP/IP.

Discussed with:	np@ and trasz@
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-23 13:57:44 +00:00
Hans Petter Selasky
8d73c9ba61 The __internal_mr is freed as part of the protection domain, pd.
There is no need to free this mr. This fixes an issue accessing
freed memory in ISER.

Sponsored by:	Mellanox Technologies
2017-11-23 12:25:11 +00:00
Konstantin Belousov
383f241dce Remove lint support from system headers and MD x86 headers.
Reviewed by:	dim, jhb
Discussed with:	imp
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D13156
2017-11-23 11:40:16 +00:00
Konstantin Belousov
ee50062cfb Kill all descendants of the reaper, even if they are descendants of a
subordinate reaper.

Also, mark reapers when listing pids.

Reported by:	Michael Zuo <muh.muhten@gmail.com>
PR:	223745
Reviewed by:	bapt
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13183
2017-11-23 11:25:11 +00:00
Andrew Turner
a72d6c8975 Zero struct efi_tm before setting the needed values. We don't use the dst
or timezone fields so ensure these are set.

Reported by:	emaste
Sponsored by:	DARPA, AFRL
2017-11-23 10:34:38 +00:00
Andrey V. Elsukov
1719df1bb4 Modify ipfw's dynamic states KPI.
Hide the locking logic used in the dynamic states implementation from
generic code. Rename ipfw_install_state() and ipfw_lookup_dyn_rule()
function to have similar names: ipfw_dyn_install_state() and
ipfw_dyn_lookup_state(). Move dynamic rule counters updating to the
ipfw_dyn_lookup_state() function. Now this function return NULL when
there is no state and pointer to the parent rule when state is found.
Thus now there is no need to return pointer to dynamic rule, and no need
to hold bucket lock for this state. Remove ipfw_dyn_unlock() function.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D11657
2017-11-23 08:02:02 +00:00
Andrey V. Elsukov
9d15540022 Check that address family of state matches address family of packet.
If it is not matched avoid comparing other state fields.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-11-23 07:05:25 +00:00
Andrey V. Elsukov
30df59d581 Move ipfw_send_pkt() from ip_fw_dynamic.c into ip_fw2.c.
It is not specific for dynamic states function and called also from
generic code.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-11-23 06:04:57 +00:00
Andrey V. Elsukov
288bf455bb Rework rule ranges matching. Use comparison rule id with UINT32_MAX to
match all rules with the same rule number.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-11-23 05:55:53 +00:00
Kyle Evans
c80eef0dc6 Allwinner a83t: add ccung bits
Upstream DTS has switched to using CCU rather than /clocks nodes. Add a CCU
driver for the a83t to bring us closer to upstream, but don't yet attach it
to ccu node.

Reviewed by:	manu
Approved by:	emaste (mentor)
Differential Revision:	https://reviews.freebsd.org/D12843
2017-11-23 05:54:04 +00:00
Kyle Evans
0b7a88e60d aw_ccung: changes to accommodate upcoming a83t support
Add a means to specify mask/value for the prediv condition instead of
shift/width/value for clocks that have a more complex mux scenario.

Specifically, ahb1 on the a83t has the prediv applied if mux is either b10
or b11.

Reviewed by:	manu
Approved by:	emaste (mentor)
Differential Revision:	https://reviews.freebsd.org/D12851
2017-11-23 05:43:44 +00:00
Mateusz Guzik
2d96bd8812 sx: unbreak debug after r326107
An assertion was modified to use the found value, but it was not updated to
handle a race where blocked threads appear after the entrance to the func.

Move the assertion down to the area protected with sleepq lock where the
lock is read anyway. This does not affect coverage of the assertion and
is consistent with what rw locks are doing.

Reported by:	Shawn Webb
2017-11-23 03:40:51 +00:00
Mateusz Guzik
62b0676cde rwlock: unbreak WITNESS builds after r326110
Reported by:	Shawn Webb
2017-11-23 03:20:12 +00:00
Mateusz Guzik
70502e39d3 rwlock: don't check for curthread's read lock count in the fast path 2017-11-22 23:52:05 +00:00
Landon J. Fuller
2f909a9f74 bhnd(4): Add a basic ChipCommon GPIO driver sufficient to support bwn(4)
The driver is functional on both BHND Wi-Fi adapters and MIPS SoCs, but
does not currently include support for features not required by bwn(4),
including GPIO interrupt handling.

Approved by:	adrian (mentor, implicit)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D12708
2017-11-22 23:10:20 +00:00
Mateusz Guzik
b584eb2e90 locks: pass the found lock value to unlock slow path
This avoids an explicit read later.

While here whack the cheaply obtainable 'tid' argument.
2017-11-22 22:04:04 +00:00
Mateusz Guzik
013c0b493f locks: remove the file + line argument from internal primitives when not used
The pair is of use only in debug or LOCKPROF kernels, but was passed (zeroed)
for many locks even in production kernels.

While here whack the tid argument from wlock hard and xlock hard.

There is no kbi change of any sort - "external" primitives still accept the
pair.
2017-11-22 21:51:17 +00:00
Landon J. Fuller
4e96bf3a37 bhnd(4): extend the PMU APIs to support bwn(4)
The bwn(4) driver requires a number of extensions to the bhnd(4) PMU
interface to support external configuration of PLLs, LDOs, and other
parameters that require chipset or PHY-specific workarounds.

These changes add support for:

- Writing raw voltage register values to PHY-specific LDO regulator
  registers (required by LP-PHY).
- Enabling/disabling PHY-specific LDOs (required by LP-PHY)
- Writing to arbitrary PMU chipctrl registers (required for common PHY PLL
  reset support).
- Requesting chipset/PLL-specific spurious signal avoidance modes.
- Querying clock frequency and latency.

Additionally, rather than updating legacy PWRCTL support to conform to the
new PMU interface:

- PWRCTL API is now provided by a bhnd_pwrctl_if.m interface.
- Since PWRCTL is only found in older SSB-based chipsets, translation from
  bhnd(4) bus APIs to corresponding PWRCTL operations is now handled
  entirely within the siba(4) driver.
- The PWRCTL-specific host bridge clock gating APIs in bhnd_bus_if.m have
  been lifted out into a standalone bhnd_pwrctl_hostb_if.m interface.

Approved by:	adrian (mentor, implicit)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D12664
2017-11-22 20:27:46 +00:00
Alan Somers
b0f662fed3 Always null-terminate CAM periph_name and dev_name
Reported by:	Coverity
CID:		1010039, 1010040, 1010041, 1010043
Reviewed by:	ken, imp
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D13194
2017-11-22 19:57:34 +00:00
Konstantin Belousov
9410cd7d9e Return different error code for the guard page layout violation.
On KERN_NO_SPACE error, as it is returned now, vm_map_find() continues
the loop searching for the suitable range for the requested mapping
with specific alignment.  Since the vm_map_findspace() succesfully
finds the same place, the loop never ends.

The errors returned from vm_map_stack() completely repeat the behavior
of vm_map_insert() now, as suggested by Alan.

Reported by:	Arto Pekkanen <aksyom@gmail.com>
PR:	223732
Reviewed by:	alc, markj
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D13186
2017-11-22 16:45:27 +00:00
Alan Cox
4d572bb3ed When vm_map_find(find_space = VMFS_OPTIMAL_SPACE) fails to find space, a
second scan of the address space with find_space = VMFS_ANY_SPACE is
performed.  Previously, vm_map_find() released and reacquired the map lock
between the first and second scans.  However, there is no compelling
reason to do so.  This revision modifies vm_map_find() to retain the map
lock.

Reviewed by:	jhb, kib, markj
MFC after:	1 week
X-Differential Revision:	https://reviews.freebsd.org/D13155
2017-11-22 16:39:24 +00:00
Mark Johnston
7a5c730561 Use the right variable for the IP header parameter to tcp:::send.
This addresses a regression from r311225.

MFC after:	1 week
2017-11-22 14:13:40 +00:00
Ruslan Bukin
0d4435dfab o Invalidate the correct page in pmap_protect().
With this bug fix we don't need to invalidate all the entries.
o Remove a call to pmap_invalidate_all(). This was never called
  as the anyvalid variable is never set.

Obtained from:	arm64/pmap.c (r322797, r322800)
Sponsored by:	DARPA, AFRL
2017-11-22 14:10:58 +00:00
Andrey V. Elsukov
7143bb7626 Add ipfw_add_protected_rule() function that creates rule with 65535
number in the reserved set 31. Use this function to create default rule.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-11-22 05:49:21 +00:00
Justin Hibbits
efa8edd5bb PowerPC has 12 artificial frames for the profiler
It may need to be different between AIM and Book-E, this was tested only on
Book-E (64- and 32-bit)

MFC after:	3 weeks
2017-11-22 01:53:59 +00:00
Landon J. Fuller
9ed453245b bhnd(4): Add support for querying DMA address translation parameters
BHND Wi-Fi chipsets and SoCs share a common DMA engine, operating within
backplane address space. To support host DMA on Wi-Fi chipsets, the bridge
core maps host address space onto the backplane; any host addresses must
be translated to their corresponding backplane address.


- Defines a new bhnd_get_dma_translation(9) API to support querying DMA
  address translation parameters from the bhnd(4) bus.
- Extends bhndb(4) to provide DMA translation descriptors from a DMA
  address translation table defined in the host bridge-specific
  bhndb_hwcfg.
- Defines bhndb(4) DMA address translation tables for all supported host
  bridge cores.
- Extends mips/broadcom's bhnd_nexus driver to return an identity (no-op)
  DMA translation descriptor; no translation is required when addressing
  the SoC backplane.

Approved by:	adrian (mentor)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D12582
2017-11-21 23:25:22 +00:00
Landon J. Fuller
caeff9a3c2 bhnd(4): implement MIPS and PCI(e) interrupt support
On BHND MIPS SoCs, this replaces the use of hard-coded MIPS IRQ#s in the
common bhnd(4) core drivers; we now register an INTRNG child PIC that
handles routing of backplane interrupt vectors via the MIPS core.

On BHND PCI devices, backplane interrupt vectors are now routed to the
PCI/PCIe host bridge core when bus_setup_intr() is called, where they are
dispatched by the PCI core via a host interrupt (e.g. INTx/MSI).

The bhndb(4) bridge driver tracks registered interrupt handlers for the
bridged bhnd(4) devices and manages backplane interrupt routing, while
delegating actual bus interrupt setup/teardown to the parent bus on behalf
of the bridged cores.

Approved by:	adrian (mentor, implicit)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D12518
2017-11-21 23:15:20 +00:00
Ed Schouten
0a9cc964a1 Import the latest CloudABI definitions, v0.18.
In addition to some small style fixes to the ARMv6 vDSO, this release
includes a new vDSO that can be used for the execution of ARMv6/ARMv7
code on 64-bit platforms.

Just like for i686 on x86-64, this new vDSO is responsible for padding
arguments and return values to 64-bit values, so that the kernel can
easily forward system calls to the native system calls.

Obtained from:	https://github.com/NuxiNL/cloudabi
2017-11-21 20:46:21 +00:00
Andriy Gapon
7bcc2cfc86 zfs_write: fix problem with writes appearing to succeed when over quota
The problem happens when the writes have offsets and sizes aligned with
a filesystem's recordsize (maximum block size).  In this scenario
dmu_tx_assign() would fail because of being over the quota, but the uio
would already be modified in the code path where we copy data from the
uio into a borrowed ARC buffer.  That makes an appearance of a partial
write, so zfs_write() would return success and the uio would be modified
consistently with writing a single block.

That bug can result in a data loss because the writes over the quota
would appear to succeed while the actual data is being discarded.

This commit fixes the bug by ensuring that the uio is not changed until
after all error checks are done.  To achieve that the code now uses
uiocopy() + uioskip() as in the original illumos design.  We can do that
now that uiocopy() has been updated in r326067 to use
vn_io_fault_uiomove().

Reported by:	mav
Analyzed by:	mav
Reviewed by:	mav
Pointyhat to:	avg (myself)
MFC after:	1 week
X-MFC after:	r326067
X-Erratum:	wanted
2017-11-21 18:28:14 +00:00
Andriy Gapon
bf71fe61af make illumos uiocopy use vn_io_fault_uiomove
uiocopy() is currently unused, its purpose is copy data from a uio
without modifying the uio.  It was in use before the vn_io_fault support
was added to ZFS, at which point our code diverged from the illumos code
a little bit.  Because ZFS is the only (potential) user of the function
we are free to modify it to better suit ZFS needs.

The intention behind this change is to remove the differences introduced
earlier in zfs_write().

While here, re-implement uioskip() using uiomove() with
uio_segflg == UIO_NOCOPY.
The story of uioskip is the same as with uiocopy.

Reviewed by:	mav
MFC after:	1 week
2017-11-21 18:01:43 +00:00
Andrew Turner
a3dff126f9 Add a driver for the EFI RTC. This uses the EFI Runtime Services to query
the system time.

As we seem to only read this time on boot, and this is the only source of
time on many arm64 machines we need to enable this by default there. As
this is not always the case with U-Boot firmware, or when we have been
booted from a non-UEFI environment we only enable the device driver when
the Runtime Services are present and reading the time doesn't result in an
error.

PR:		212185
Reviewed by:	imp, kib
Tested by:	emaste
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D12650
2017-11-21 17:23:16 +00:00
Alan Somers
e7dda951a4 Fix uninitialized variable from 326034
Reported by:	Coverity
CID:		1382887
MFC after:	20 days
X-MFC-With:	326034
Sponsored by:	Spectra Logic Corp
2017-11-21 16:38:30 +00:00
Mark Johnston
755230eb9f Clean up the SYSINIT_FLAGS definitions for rwlock(9) and rmlock(9).
Avoid duplication in their macro definitions, and document them. No
functional change intended.

MFC after:	1 week
2017-11-21 14:59:23 +00:00
Hans Petter Selasky
2b378326f8 Make sure all initialized mutexes are destroyed in the iser module,
else WITNESS will panic. Prefix all mutex names with "iser_" to
prevent future WITNESS issues.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-21 13:56:30 +00:00
Andrew Turner
2eb3e51e71 When fpcurthread is not the current thread it may be non-NULL. In this
case another thread has had the VFP unit enabled and will have its state
in the VFP registers along with it stored in memory. As such we don't need
to store the state, but do need to zero the fpcurthread pointer to stop
the VFP driver from using the enable fast path.

Reported by:	emaste
Sponsored by:	DARPA, AFRL
2017-11-21 13:19:38 +00:00
Mark Johnston
5070d56d41 Allow for fictitious physical pages in vm_page_scan_contig().
Some drm2 drivers will set PG_FICTITIOUS in physical pages in order to
satisfy the OBJT_MGTDEVICE object interface, so a scan may encounter
fictitous pages. For now, allow for this possibility; such pages will be
skipped later in the scan since they are wired.

Reported by:	avg
Reviewed by:	kib
MFC after:	1 week
2017-11-21 13:17:40 +00:00
Hans Petter Selasky
5cbcf51dce Compile fix for the mlx4 module.
Sponsored by:	Mellanox Technologies
2017-11-21 09:08:27 +00:00
Warner Losh
fdcf0c7477 While the EFI spec allows numbers to be in many forms, libefivar
produces hex numbers for the dsn. Since that come is from EDK2, change
this for symmetry, by generating the dsn as a hex number.

Noticed by: gpart list | grep efimedia | awk -F: '{print $2;}' | \
	sed -e 's/^ *//g;s/,,/,/' | grep MBR | efidp -p | efidp -f
Sponsored by: Netflix
2017-11-21 06:12:21 +00:00
Warner Losh
2ab9683565 Remove trailing whitespace (one I just introduced and a bunch of
others in the same directory).

Sponsored by: Netflix
2017-11-21 05:42:13 +00:00
Warner Losh
d65b6588d6 Implement efi media tagging for MBR partitioning types.
Sponsored by: Netflix
2017-11-21 05:35:21 +00:00
Justin Hibbits
1ccb14588b Check the page table before TLB1 in pmap_kextract()
The vast majority of pmap_kextract() calls are looking for a physical memory
address, not a device address.  By checking the page table first this saves
the formerly inevitable 64 (on e500mc and derivatives) iteration loop
through TLB1 in the most common cases.

Benchmarking this on the P5020 (e5500 core) yields a 300% throughput
improvement on dtsec(4) (115Mbit/s -> 460Mbit/s) measured with iperf.

Benchmarked on the P1022 (e500v2 core, 16 TLB1 entries) yields a 50%
throughput improvement on tsec(4) (~93Mbit/s -> 165Mbit/s) measured with
iperf.

MFC after:	1 week
Relnotes:	Maybe (significant performance improvement)
2017-11-21 03:12:16 +00:00
Landon J. Fuller
e60ff6a3b4 Preemptively map MIPS INTRNG interrupts on non-FDT MIPS targets
This replaces a partial workaround introduced in r305527 that was
incompatible with nested INTRNG interrupt controllers if not also using
FDT.

On non-FDT MIPS INTRNG targets, we now preemptively produce a set of fixed
mappings for the MIPS IRQ range during nexus attach. On FDT targets,
OFW_BUS_MAP_INTR() remains responsible for mapping the MIPS IRQs.

Approved by:	adrian (mentor)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D12385
2017-11-21 01:54:48 +00:00
Navdeep Parhar
8ee789e9fa cxgbe(4): Fix unsafe mailbox access in cudbg.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-11-21 01:18:58 +00:00
Alan Somers
a4557b0509 Quirk Seagate ST8000AS0003-2HH
Like its predecessor ST8000AS0002, this is a drive-managed SMR drive, but
doesn't declare that in its ATA identify data.

MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-11-20 23:45:42 +00:00
Alan Somers
c95dc95b35 da(4): Short-circuit unnecessary BIO_FLUSH commands
sys/cam/scsi/scsi_da.c
	Complete BIO_FLUSH commands immediately if the da(4) device hasn't
	been written to since the last flush. If we haven't written to the
	device, there is no reason to send a flush.

Submitted by:	gibbs
Reviewed by:	imp
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D13106
2017-11-20 22:27:33 +00:00
Brooks Davis
114aedb30e Remove a couple variables that are unused after r325790.
Reported by:	rpokala
2017-11-20 22:18:24 +00:00
Alan Somers
8a0a413e12 Fix multiple bugs in cam_strmatch
* Wrongly matches strings that are shorter than the pattern
* Fails to match negative character sets
* Fails to match character sets that aren't at the end of the pattern
* Fails to match character ranges

Reviewed by:	imp
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D13173
2017-11-20 22:01:45 +00:00
Stephen Hurd
7274b2f6be Fix off-by-one error in bit_nclear() usage
bit_nclear() takes the bit numbers for the start and end bits, not the start
and a count.  This was resulting in memory corruption past the end of the
bitstr_t.

Sponsored by:	Limelight Networks
2017-11-20 21:57:04 +00:00
Conrad Meyer
60965b7606 msdosfs(5): Reflect READONLY attribute in file mode
Msdosfs allows setting READONLY by clearing the owner write bit of the file
mode.  (While here, correct the misspelling of S_IWUSR as VWRITE.  No
functional change.)

In msdosfs_getattr, intuitively reflect that READONLY attribute to userspace
in the file mode.

Reported by:	Karl Denninger <karl AT denninger.net>
Sponsored by:	Dell EMC Isilon
2017-11-20 21:38:24 +00:00
Scott Long
cab229b2a6 Update a comment in brelse() to match reality. 2017-11-20 20:53:03 +00:00
Pedro F. Giffuni
981e34b9ca Indent protection and some other oops from the prvious commits. 2017-11-20 19:56:11 +00:00
Navdeep Parhar
8e628d6d65 cxgbe(4): Add a custom board to the device id list.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-11-20 19:50:48 +00:00
Pedro F. Giffuni
51369649b0 sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:43:44 +00:00
Pedro F. Giffuni
7282444b10 sys/dev: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:36:21 +00:00
Vladimir Kondratyev
3e10195c86 evdev: change USB scancode 0x54 from KEY_SLASH to KEY_KPSLASH
Submitted by:		dumbbell
Reviewed by:		gonzo, wulf
Approved by:		gonzo (mentor)
MFC after:		2 weeks
Differential Revision:	https://reviews.freebsd.org/D12983
2017-11-20 19:25:22 +00:00
Vladimir Kondratyev
b12ac17ef1 Fix evdev codes for slash and asterisk numpad keys of AT-keyboards
Reviewed by:	gonzo
Approved by:	gonzo (mentor)
MFC after:	2 weeks
2017-11-20 19:20:05 +00:00
Vladimir Kondratyev
303dbb854f evdev: Export EVDEV_SUPPORT kernel option through feature facility
Suggested by:	netchild
Reviewed by:	gonzo
Approved by:	gonzo (mentor)
MFC after:	1 week
2017-11-20 19:17:43 +00:00
Justin Hibbits
727ca2fdfd Eliminate 1 XX_VirtToPhys() and 2 XX_PhysToVirt() calls from if_dtsec(4)
XX_VirtToPhys(), by way of pmap_kextract(), is an expensive operation.
Profiling via dtrace during a series of iperf tests I found 16111 / 30432
stack frames were located in mmu_booke_kextract(), so eliminating this
expensive call should improve performance slightly.  XX_PhysToVirt() is not
as expensive, but redundant calls in this context is wasteful.
2017-11-20 04:32:01 +00:00
Hans Petter Selasky
937d37fc6c Merge ^/head r325842 through r325998. 2017-11-19 12:36:03 +00:00
Konstantin Belousov
319a53f5ad Fix build.
Sponsored by:	The FreeBSD Foundation
2017-11-19 11:21:16 +00:00
Xin LI
11d38f2ca0 Remove unused header. 2017-11-19 03:52:03 +00:00
Xin LI
eafe4941b3 Remove unused header. 2017-11-19 03:51:47 +00:00
Kyle Evans
c4717ac049 aw_nmi: add support for a31/a83t's r_intc
We currently support the a83t's r_intc in a somewhat hack-ish way; our .dts
describes it as nmi_intc, and uses a subset of the actual register space to
make it line up with a20/a31 nmi offsets.

This breaks with the recent 4.14 update describing r_intc using the full
register space, so update aw_nmi to use the correct register offsets with
the right compat data in a way that doesn't break our current dts with
nmi_intc or upstream with r_intc described.

Reviewed by:	manu
Approved by:	emaste (mentor)
Differential Revision:	https://reviews.freebsd.org/D13122
2017-11-19 03:14:10 +00:00
Ed Maste
4a8dea8cf9 ANSIfy sys/libkern
PR:		223641
Submitted by:	ota@j.email.ne.jp
MFC after:	1 week
2017-11-19 00:31:13 +00:00
Emmanuel Vadot
dab900583d dts: arm64: allwinner: Remove unused dts for pine64
Latest u-boot port provide the dts for pine64, remove our custom
and outdated dts for this board.
2017-11-18 21:39:54 +00:00
Emmanuel Vadot
3f9ade0643 if_awg: drain tx buffers and clear rx buffers when stopping
Stale packets should not be transmitted when the interface comes up after being down.
Count the successfully transmitted ones for statistics and drop the rest.

Submitted by:	Guy Yur <guyyur@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D12539
2017-11-18 21:12:06 +00:00
Emmanuel Vadot
bd9063297c if_awg: avoid hole in the rx ring buffer when mbuf allocation fails
Use a spare dma map when attempting to map a new mbuf on the rx path.
If the mbuf allocation fails or the dma map loading for the new mbuf fails just reuse the old mbuf
and increase the drop counter.

Submitted by:	Guy Yur <guyyur@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D12538
2017-11-18 21:08:18 +00:00
Emmanuel Vadot
337c6940a9 if_awg: rename tx functions to match other drivers and free mbuf on m_collapse failure
- use awg_encap and awg_txeof names to match iflib and other network drivers.
- handle m_collapse failure similarly by freeing the mbuf rather than reenqueuing it where it will continue to fail.

Submitted by:	Guy Yur <guyyur@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D13035
2017-11-18 21:04:39 +00:00
Emmanuel Vadot
0d2abe1e2b if_awg: don't process transmitted packets on TX_BUF_UA_INT, only on TX_INT
TX_BUF_UA_INT is set when there are no buffers to transmit and can
happen before hw.awg.tx_interval segments have been transmitted.

To reduce load, tx cleanup should be done in hw.awg.tx_interval intervals.

Submitted by:	Guy Yur <guyyur@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D13034
2017-11-18 20:59:20 +00:00
Emmanuel Vadot
f179ed0561 if_awg: replace multiple calls to if_setdrvflagbits with one call in awg_txintr
Small optimization

Submitted by:	Guy Yur <guyyur@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D13033
2017-11-18 20:55:37 +00:00
Emmanuel Vadot
09e2285c4c if_awg: only increment IFCOUNTER_OPACKETS when the last segment of a frame has been successfully transmitted
A packet may be built from multiple segments, don't increase the count for each segment

Submitted by:	Guy Yur <guyyur@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D13032
2017-11-18 20:50:31 +00:00
Emmanuel Vadot
fce9d29f8d if_awg: store mbuf and dma mapping in the last segment of a tx frame instead of the first
According to the datasheet, TX_DESC_CTL is cleared when whole frame is transmitted or all
data in the current descriptor's buffer are transmitted.
When the mbuf and mapping are stored in the first segment and in a scenario where a tx
completion interrupt arrives for a frame and only the start of the next frame was transmitted,
at the time of interrupt processing the mbuf and mapping will be freed when processing the
first segment of the next frame but the other untrasmitted segments still need to use them.

Submitted by:	Guy Yur <guyyur@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D13031
2017-11-18 20:46:31 +00:00
Emmanuel Vadot
c6110e7514 if_awg: mark the first tx descriptor as ready only after all the other tx descriptors are set up
In a multi segment frame, if the first tx descriptor is marked with TX_DESC_CTL
but not all tx descriptors for the other segments in the frame are set up,
the TX DMA may transmit an incomplete frame.
To prevent this, set TX_DESC_CTL for the first tx descriptor only when done
with all the other segments.

Also, don't bother cleaning transmitted tx descriptors since TX_DESC_CTL
is cleared for them by the hardware and they will be reprogrammed before
TX_DESC_CTL is reenabled for them.

Submitted by:	Guy Yur <guyyur@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D13030
2017-11-18 20:42:48 +00:00
Emmanuel Vadot
1ee5a3d3b2 if_awg: only request completion interrupt on the last descriptor of a tx frame
The hardware will not issue a completion interrupt for a descriptor
with TX_INT_CTL set if it doesn't also have TX_LAST_DESC set.

Submitted by:	 Guy Yur <guyyur_gmail.com>
Differential Revision:	https://reviews.freebsd.org/D13029
2017-11-18 20:38:05 +00:00
Hans Petter Selasky
b108f35740 Remove duplicate static function prototype to fix compilation of
mlx5_fs_tree.c after r325638 when using GCC.

Found by:	kib @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-11-18 20:32:09 +00:00
Bryan Drewery
b00ea69b51 Fix PORTS_MODULES+'make reinstallkernel' trying to run bogus 'make redeinstall'.
Also fix 'make installkernel' running 'make deinstall' twice.

PR:		201779
MFC after:	2 weeks
Sponsored by:	Dell
2017-11-18 20:01:05 +00:00
Emmanuel Vadot
7847756781 dts: Allwinner: Remove our last custom DTS
All Allwinner boards should use the upstream DTS so remove our
remaining custom ones.
2017-11-18 16:07:53 +00:00
Emmanuel Vadot
518cc4989e Update our copy of DTS from the ones from Linux 4.14 2017-11-18 15:46:48 +00:00
Pedro F. Giffuni
df57947f08 spdx: initial adoption of licensing ID tags.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.

Initially, only tag files that use BSD 4-Clause "Original" license.

RelNotes:	yes
Differential Revision:	https://reviews.freebsd.org/D13133
2017-11-18 14:26:50 +00:00
Mateusz Guzik
284194f183 locks: fix compilation issues without SMP or KDTRACE_HOOKS 2017-11-17 23:27:06 +00:00
Andrey V. Elsukov
66f84fabb3 Add comment for accidentally committed unrelated change in r325960.
Do not invoke IPv4 NAT handler for non IPv4 packets. Libalias expects
a packet is IPv4. And in case when it is IPv6, it just translates them
as IPv4. This leads to corruption and in some cases to panics.
In particular a panic can happen when value of ip6_plen modified to
something that leads to IP fragmentation, but actual packet length does
not match the IP length.

Packets that are not IPv4 will be dropped by NAT rule.

Reported by:	Viktor Dukhovni <freebsd at dukhovni dot org>
MFC after:	1 week
2017-11-17 23:25:06 +00:00
Navdeep Parhar
879462e916 cxgbe(4): Add core Vdd to the sysctl MIB.
Sponsored by:	Chelsio Communications
2017-11-17 23:22:39 +00:00
Andrey V. Elsukov
e11f0a0c4c Unconditionally enable support for O_IPSEC opcode.
IPsec support can be loaded as kernel module, thus do not depend from
kernel option IPSEC and always build O_IPSEC opcode implementation as
enabled.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2017-11-17 22:40:02 +00:00
Alan Somers
1a69c11aab Add assertion in probedone() that we're holding the device lock.
Submitted by:	ken
Reviewed by:	asomers
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-11-17 20:53:52 +00:00
Mateusz Guzik
18f23540d8 lockmgr: remove the ADAPTIVE_LOCKMGRS option
The code was never enabled and is very heavy weight.

A revamped adaptive spinning may show up at a later time.

Discussed with:	kib
2017-11-17 20:41:17 +00:00
Conrad Meyer
38d84d683e vfs_lookup: Allow PATH_MAX-1 symlinks
Previously, symlinks in FreeBSD were artificially limited to PATH_MAX-2.

Add a short test case to verify the change.

Submitted by:	Gaurav Gangalwar <ggangalwar AT isilon.com>
Reviewed by:	kib
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12589
2017-11-17 19:25:39 +00:00
Warner Losh
1cbb58886a Remove build system support for lint.
Differential Revision: https://reviews.freebsd.org/D13124
2017-11-17 18:16:46 +00:00
Ruslan Bukin
3b418d1b9a Add Intel Processor Trace registers for:
- CPUID
- Table of Physical Addresses (ToPA).

Sponsored by:	DARPA, AFRL
2017-11-17 17:54:10 +00:00
Alan Somers
602740b689 Fix potential NULL pointer dereference of device physical path
In scsi_dev_advinfo(), if the physical path is being stored and there is a
malloc failure (malloc(9) is called with M_NOWAIT), we could wind up in a
situation where the device's physpath_len is set to the length the user
provided, but the physpath itself is NULL.

If another context then comes in to fetch the physical path value, we would
wind up trying to memcpy a NULL pointer into the caller's buffer.

So, set the physpath_len to 0 when we free the physpath on entry into the
store case for the physical path.  Reset the length to a non-zero value only
after we've successfully malloced a buffer to hold it.

Submitted by:	ken
Reviewed by:	asomers
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-11-17 17:13:00 +00:00
Scott Long
7841fefb62 Rename P_OSREL_CK_CLYGRP to P_OSREL_CK_CYLGRP 2017-11-17 13:12:20 +00:00
Baptiste Daroussin
6247019ec2 Actually commit the right patch for r325929 2017-11-17 09:33:29 +00:00
Baptiste Daroussin
5cdf66d629 Do not remove the sources when zstd is called as zstdcat 2017-11-17 09:29:26 +00:00
Justin Hibbits
c3f3cd058f Add jumbo frame support to dtsec(4)
MFC after:	2 weeks
2017-11-17 04:29:32 +00:00
Justin Hibbits
bb7137e1a3 Stop special casing 32-bit AIM in memory parsing
There's no need to special case 32-bit AIM to short circuit processing.
Some AIM CPUs can handle 36 bit addresses, and 64-bit CPUs can run 32-bit
OSes, so this will allow us to expand for that in the future if we desire.
2017-11-17 04:10:52 +00:00
Mateusz Guzik
2ccee9cc52 mtx: add missing parts of the diff in r325920
Fixes build breakage.
2017-11-17 02:59:28 +00:00
Mateusz Guzik
32aef9ff05 sched: move panic handling code out of choosethread
This avoids jumps in the common case of the kernel not being panicked.
2017-11-17 02:45:38 +00:00
Mateusz Guzik
997131646f Check for PRS_NEW without locking the proc in sysctl_kern_proc 2017-11-17 02:29:06 +00:00
Mateusz Guzik
bc24577c25 sx: perform a minor cleanup of the unlock slowpath
No functional changes.
2017-11-17 02:27:04 +00:00
Mateusz Guzik
8fef6b2c67 rwlock: unlock before traversing threads to wake up
While here perform a minor cleanup of the unlock path.
2017-11-17 02:26:15 +00:00
Mateusz Guzik
8448e02081 mtx: unlock before traversing threads to wake up
This shortens the lock hold time while not affecting corretness.
All the woken up threads end up competing can lose the race against
a completely unrelated thread getting the lock anyway.
2017-11-17 02:25:04 +00:00
Mateusz Guzik
ae7d25a4d7 locks: pull up PMC_SOFT_CALLs out of slow path loops 2017-11-17 02:22:51 +00:00
Mateusz Guzik
3af300592c rwlock: avoid branches in the slow path if lockstat is disabled 2017-11-17 02:21:24 +00:00
Mateusz Guzik
e41d616684 sx: avoid branches if in the slow path if lockstat is disabled 2017-11-17 02:21:07 +00:00
Warner Losh
a3c15a4445 Only try to enable CK_CLYGRP if we're running on kernel newer than
1200046, the first version that supports this feature. If we set it,
then use an old kernel, we'll break the 'contract' of having
checksummed cylinder groups this flag signifies. To avoid creating
something with an inconsistent state, don't turn the flag on in these
cases. The first full fsck with a new kernel will turn this on.

Spnsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13114
2017-11-16 21:28:14 +00:00
Stephen Hurd
d27352644a Fix default numbers of iflib queue sets
The intent appears to be having one RX/TX queue set per core,
but since scctx->isc_n[tr]xqsets is set to max before calling
iflib_msix_init(), both end up being set to total number of cores.

Use ctx->ifc_sysctl_n[rt]xqs as the selected value and
scctx->isc_n[rt]xqsets as the max. This should result in what appears
to be the intended behaviour

Reviewed by:	sbruno
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D13096
2017-11-16 18:52:58 +00:00
Konstantin Belousov
4e421792ec Remove i386 XBOX support.
It is for console presented at 2001 and featuring Pentium III
processor.  Even if any of them are still alive and run FreeBSD, we do
not have any sign of life from their users.  While removing another
dozens of #ifdefs from the i386 sources reduces the aversion from
looking at the code and improves the platform vitality.

Reviewed by:	cem, pfg, rink (XBOX support author)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D13016
2017-11-16 14:27:02 +00:00
Hans Petter Selasky
41dbd9dd1d Update iser backend code to use new ibcore APIs.
Sponsored by:	Mellanox Technologies
2017-11-16 13:28:00 +00:00
Baptiste Daroussin
16f92ad234 Add some 4k quirks for Samsung pm863a SSDs
Submitted by:	Nikita Kozlov <nikita.kozlov at blade-group.com>
MFC after:	3 days
Sponsored by:	blade
Differential Revision:	https://reviews.freebsd.org/D13093
2017-11-16 10:15:17 +00:00
Mark Johnston
e9a2e17d1b Avoid holding the process in uread() and uwrite().
In general, higher-level code will atomically verify that the process
is not exiting and hold the process. In one case, we were using uwrite()
to copy a probed instruction to a per-thread scratch space block, but
copyout() can be used for this purpose instead; this change effectively
reverts r227291.

MFC after:	1 week
2017-11-16 07:25:12 +00:00
Navdeep Parhar
d3d5e96893 cxgbe(4): Remove rsrv_noflowq from intrs_and_queues structure as it does
not influence or get affected by the number of interrupts or queues.

Sponsored by:	Chelsio Communications
2017-11-16 02:42:37 +00:00
Navdeep Parhar
d406c47d71 cxgbe(4): Sanitize t4_num_vis during MOD_LOAD like all other t4_*
tunables.  Add num_vis to the intrs_and_queues structure as it affects
the number of interrupts requested and queues created.  In future
cfg_itype_and_nqueues might lower it incrementally instead of going
straight to 1 when enough interrupts aren't available.

Sponsored by:	Chelsio Communications
2017-11-16 01:33:53 +00:00
Navdeep Parhar
8c61c6bbda cxgbe(4): Combine all _10g and _1g tunables and drop the suffix from
their names.  The finer-grained knobs weren't practically useful.

Sponsored by:	Chelsio Communications
2017-11-15 23:48:02 +00:00
Conrad Meyer
f95f6841c8 ipsec: Use the same keysize values for HMAC as prior to r324017
The HMAC construction natively permits any key size between 0 and the input
block length. Before r324017, the auth_hash 'keysize' member was the hash
output length, which was used by ipsec for key sizes. (Non-ipsec consumers
need the ability to use other keysizes, hence, r324017.)

The ipsec SADB code blindly uses the auth_hash 'keysize' member for both
minimum and maximum key size, which is wrong (from an HMAC perspective).
For now, just switch it to 'hashsize', which matches the existing
expectations.

Instead it should probably use the range [0, keysize]. But there may be
other broken code in ipsec that rejects hashes with too small a minimum
key size.

Reported by:	olivier@
Reviewed by:	olivier, no objection from ae
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12770
2017-11-15 22:42:20 +00:00
Gordon Tetlow
edb01d11f8 Properly bzero kldstat structure to prevent kernel information leak.
Submitted by:	kib
Reported by:	TJ Corley
Security:	CVE-2017-1088
2017-11-15 22:30:21 +00:00
Michael Tuexen
3e87bccde3 Fix the handling of ERROR chunks which a lot of error causes.
While there, clean up the code.
Thanks to Felix Weinrank who found the bug by using fuzz-testing
the SCTP userland stack.

MFC after:	1 week
2017-11-15 22:13:10 +00:00
Alan Somers
30083240bc Remove a double free(9) in xpt_bus_register
In xpt_bus_register(), remove superfluous call to free().  This was mostly
benign since free(9) checks for NULL before doing anything, and
xpt_create_path() is nice enough to NULL out the pointer on failure.
However, it could've segfaulted if malloc(9) failed during
xpt_create_path().

Submitted by:	gibbs
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-11-15 15:52:06 +00:00
Konstantin Belousov
1c778d91b5 vmtotal: extend memory counters to accomodate for current and future
hardware sizes.

32bit counters already overflow on approachable virtual memory page
counts, and soon would overflow on the physical pages counts as well.
Bump sizes to 64bit types.  Bump __FreeBSD_version.

It is impossible to provide perfect backward ABI compat for this
change.  If a program requests an old structure, it can be detected by
size.  But if it queries the size first by passing NULL old req
pointer, there is almost nothing we can do to detect the desired ABI.
As a partial solution, check p_osrel of the quering process when
selecting the size to report.

Submitted by:	Pawel Biernacki <pawel.biernacki@gmail.com>
Differential revision:	https://reviews.freebsd.org/D13018
2017-11-15 13:41:03 +00:00
Baptiste Daroussin
c07d14deb5 remove the poor emulation of the IllumOS needfree global variable to prevent
the ARC reclaim thread running longer than needed.

Update the arc::needfree dtrace probe triggered in arc_lowmem() to also report
the value we may want to free.

Submitted by:	Nikita Kozlov <nikita.kozlov at blade-group.com>
Reviewed by:	avg
Approved by:	avg
MFC after:	3 weeks
Sponsored by:	blade
Differential Revision:	https://reviews.freebsd.org/D12163
2017-11-15 12:48:36 +00:00
Hans Petter Selasky
4f939024a8 Add full support for specifying IPv6 addresses to krping.
Sponsored by:	Mellanox Technologies
2017-11-15 11:35:02 +00:00
Hans Petter Selasky
55b1c6e7e4 Merge ^/head r325663 through r325841. 2017-11-15 11:28:11 +00:00
Hans Petter Selasky
c3191c2e2b Update the mlx4 core and mlx4en(4) modules towards Linux v4.9.
Background:
The coming ibcore update forces an update of mlx4ib(4) which in turn requires
an updated mlx4 core module. This also affects the mlx4en(4) module because
commonly used APIs are updated. This commit is a middle step updating the
mlx4 modules towards the new ibcore.

This change contains no major new features.

Changes in mlx4:
  a) Improved error handling when mlx4 PCI devices are
  detached inside VMs.
  b) Major update of codebase towards Linux 4.9.

Changes in mlx4ib(4):
  a) Minimal changes needed in order to compile using the
  updated mlx4 core APIs.

Changes in mlx4en(4):
  a) Update flow steering code in mlx4en to use new APIs for
  registering MAC addresses and IP addresses.
  b) Update all statistics counters to be 64-bit.
  c) Minimal changes needed in order to compile using the
  updated mlx4 core APIs.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-15 11:14:39 +00:00
Wojciech Macek
ec7f8d58b9 CXGBE: fix big-endian behaviour
The setbit/clearbit pair casts the bitfield pointer
to uint8_t* which effectively treats its contents as
little-endian variable. The ffs() function accepts int as
the parameter, which is big-endian. Use uint8_t here to
avoid mismatch, as we have only 4 doorbells.

Submitted by:          Wojciech Macek <wma@freebsd.org>
Reviewed by:           np
Obtained from:         Semihalf
Sponsored by:          QCM Technologies
Differential revision: https://reviews.freebsd.org/D13084
2017-11-15 06:45:33 +00:00
Warner Losh
9e45a31607 Fix SYSDIR path. After the move, we need to chop off a couple ../ from
the prior definition. But a safer definition is SRCTOP/sys, so use
that.

Sponsored by: Netflix
2017-11-15 03:46:59 +00:00
Warner Losh
eab9d0a85b Inline pcie_link_{status,caps} where needed. Remove them as they
aren't really needed and I don't want to document them.

Suggested by: jhb@
Sponsored by: Netflix
2017-11-15 02:24:47 +00:00
John Baldwin
b8db1c9fe0 Use #if instead of #ifdef for __BSD_VISIBLE tests.
__BSD_VISIBLE is always defined and it's value instead needs to be
tested via #if to determine if FreeBSD-specific APIs should be
exposed.

PR:		196226, 223481 (exp-run)
Submitted by:	pluknet
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D12977
2017-11-14 23:50:30 +00:00
Warner Losh
ca987d4641 Move sys/boot to stand. Fix all references to new location
Sponsored by:	Netflix
2017-11-14 23:02:19 +00:00
Warner Losh
2e36db147e Move sys/boot/fdt/dts to sys/dts and adjust scripts.
Sponsored by: Netflix
2017-11-14 21:03:57 +00:00
Ed Maste
81d606f52e disallow clock_settime too far in the future to avoid panic
clock_ts_to_ct has a KASSERT that the converted year fits into four
digits.  By default (sysctl debug.allow_insane_settime is 0) the kernel
disallows a time too far in the future, using a value of 9999 366-day
years.  However, clock_settime is epoch-relative and the assertion will
fail with a tv_sec corresponding to some 8030 years.

Avoid trying to be too clever, and just use a limit of 8000 365-day
years past the epoch.

Submitted by:	Heqing Yan <scottieyan@gmail.com>
Reported by:	Syzkaller (https://github.com/google/syzkaller)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2017-11-14 18:18:18 +00:00
Bjoern A. Zeeb
c04716b1ad Unbreak IPv6.
No longer return ENXIO when trying to send an IPv6 packet in
nicvf_sq_add_hdr_subdesc().
Restructure the code so that the upper layer protocol parts are
agnostic of the L3 protocol (and no longer specific to IPv4).
With this basic IPv6 packets go through.  We are still seeing
weird behaviour which needs further diagnosis.

PR:			223669
In collaboration with:	emaste
MFC after:		3 days
2017-11-14 16:47:05 +00:00
Ed Maste
3bfef74a1f vnic: report that the driver supports multicast
The driver is currently hardcoded to force promiscuous mode, so all of
the MAC filtering code is presently unused and multicast should "just
work."  Report to the higher layers that multicast is supported.

PR:		223573
Reported by:	bz
Sponsored by:	The FreeBSD Foundation, Packet.net (hardware)
2017-11-14 16:31:11 +00:00
Hans Petter Selasky
dd00abf2d7 Make sure the ib_wr_opcode enum is signed by adding a negative dummy element.
Different compilers may optimise the enum type in different ways. This ensures
coherency when range checking the value of enums in ibcore.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-14 14:51:37 +00:00
Hans Petter Selasky
3b8c4b3619 Make sure a valid VNET is set before trying to access the V_ip6_v6only
variable. Access the variable directly instead of going through the sysctl()
interface in the kernel.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-14 14:43:35 +00:00
Hans Petter Selasky
4591fd4e2a Set the default VNET in krping before calling ifunit_ref(). Else using IPv6
link-local addresses when VIMAGE is enabled will cause a so-called NULL
pointer dereferencing issue.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-14 14:39:37 +00:00
Navdeep Parhar
f93039d9de Fix iw_cxgbe build in the projects branch.
Submitted by:	Krishnamraju Eraparaju @ Chelsio
2017-11-14 07:04:06 +00:00
Warner Losh
5e8a39f6ab Properly decode NVMe state of the drive and print out the information
in the attach to more closely match what SCSI and ATA attached
storage provides.

Sponsored by: Netflix
2017-11-14 05:05:26 +00:00
Warner Losh
d1c9349f76 Belatedly add opt_nvme.h to fix building nvme.ko outside of a kernel
build.

Sponsored by: Netflix
2017-11-14 05:05:21 +00:00
Warner Losh
4e3b274457 Provide link speed data in XPT_GET_TRAN_SETTINGS. Provide full version
information for that and XPT_PATH_INQ. Provide macros to encode/decode
major/minor versions.  Read the link speed and lane count to compute
the base_transfer_speed for XPT_PATH_INQ.

Sponsored by: Netflix
2017-11-14 05:05:16 +00:00
Warner Losh
d505913c91 Provide pcie_link_status and pcie_link_cap convenience functions.
Sponsored by: Netflix
2017-11-14 05:05:05 +00:00
Warner Losh
0c16b53773 Move zstd from contrib to sys/contrib so it can be used in the
kernel. Adjust the Makefiles that referenced it to the new path.

Sponsored by: Netflix
OK'd by: cem@ and AllanJude@
2017-11-14 05:03:38 +00:00
Justin Hibbits
27353776c4 Expand the Freescale PCIe root complex driver with the ofw_pcib_pci
The interrupt map wasn't being allocated properly, preventing IRQs from being
allocated to children of the PCIe bus.  Fix this by cloning the ofw_pcib_pci
code, which handles all cases -- device tree and probed.

In the future this may become a subclass of the ofw_pcib_pci driver, but as
that's not an exported class, it's cloned for now.

MFC after:	3 weeks
2017-11-14 03:53:15 +00:00
Justin Hibbits
ce7fd13aff Convert BERI to use ofw_parse_bootargs()
Summary:
ofw_parse_bootargs() was added in r306065 as an attempt to unify the
various copies of the same code.  This simply migrates BERI to use it.

Reviewed By: brooks
Differential Revision: https://reviews.freebsd.org/D12962
2017-11-14 03:23:46 +00:00
Justin Hibbits
f8135d8282 Use the correct board name for the Ubiquiti Unifi Security Gateway 2017-11-14 03:21:39 +00:00
Michael Tuexen
d0f6ab7920 Simply the code and use the full buffer for contigous chunk representation.
MFC after:	1 week
2017-11-14 02:30:21 +00:00
Kevin Lo
ccaf7ad472 Add TP-LINK UE300.
Submitted by:	Kris G <netsick@gmail.com>
2017-11-14 01:57:54 +00:00
Warner Losh
48f1a4921e Add two new tunables / sysctls to controll reboot after panic:
kern.poweroff_on_panic which, when enabled, instructs a system to
power off on a panic instead of a reboot.

kern.powercyle_on_panic which, when enabled, instructs a system to
power cycle, if possible, on a panic instead of a reboot.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13042
2017-11-14 00:29:14 +00:00
Gleb Smirnoff
3e21cbc802 Style r320614: don't initialize at declaration, new line after declarations,
shorten variable name to avoid extra long lines.
No functional changes.
2017-11-13 22:16:47 +00:00
Warner Losh
60717f4460 Don't add /boot/dt*s* but /boot/dt*b*. Stupid think-o. /boot/dtb was
what was tested...
2017-11-13 21:25:57 +00:00
Warner Losh
4a7ef88393 Add /boot/dts to the list of default modules. The minimal arm and mips
loader.conf for uboot have this in the list, but the default one
didn't. Since there's no harm and it's a failsafe, add it to the list.

Sponsored by: Netflix
2017-11-13 21:23:26 +00:00
John Baldwin
7e3e36068b Move loop to clear TDB_SUSPEND into PT_DETACH case.
The PT_DETACH case above the sendsig: label already looped over all
threads clearing flags in td_dbgflags.  Reuse this loop to clear
TDB_SUSPEND and move the logic out of the sendsig: block.
2017-11-13 21:22:33 +00:00
John Baldwin
2a2b23cae2 Pull the PT_ATTACH case out of the 'sendsig:' block.
Most of the conditionals in the 'sendsig:' block are now only different
for PT_ATTACH vs other continue requests.  Pull the PT_ATTACH-specific
logic up into the PT_ATTACH case and simplify the 'sendsig:' block.  This
also permits moving the unlock of proctree_lock above the sendsig: label
since PT_KILL doesn't hold the lock and and the other cases all fall
through to the label.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D13073
2017-11-13 21:09:08 +00:00
Bryan Drewery
d7a699d3d8 Rework r325568 so all 'make LINT' targets work.
Reported by:	ngie
Sponsored by:	Dell EMC Isilon
2017-11-13 20:49:08 +00:00
Warner Losh
d6be6d496c Add loader.conf to the list of files that are MD.
loader.conf is also different between machines. On arm it's a minimal
one that's not quite compatible with the default one. On arm it's
minimal for speed, which is good, but there's also extra things in it
relative to the default on which break loading FDT which is bad. This
doesn't address that issue, but instead ensures the minimal one for
arm is used.

A similar issue for mips exists, but since we can have a beri variant
of /boot/loader and a uboot variant, I'm leaving that mess alone for
the moment.

Sponsored by: Netflix
2017-11-13 20:39:43 +00:00
John Baldwin
feeaec18d4 Only clear a pending thread event if one is pending.
This fixes a panic when attaching to an already-stopped process after
r325028.  While here, clean up a few other things in the control flow
of the 'sendsig' section:
- Only check for P_STOPPED_TRACE rather than either of P_STOPPED_SIG
  or P_STOPPED_TRACE for most ptrace requests.  The signal handling
  code in kern_sig.c never sets just P_STOPPED_SIG for a traced
  process, so if P_STOPPED_SIG is stopped, P_STOPPED_TRACE should be
  set anyway.  Remove a related debug printf.  Assuming P_STOPPED_TRACE
  permits simplifications in the 'sendsig:' block.
- Move the block to clear the pending thread state up into a new
  block conditional on P_STOPPED_TRACE and handle delivering pending
  signals to the reporting thread and clearing the reporting thread's
  state in this block.
- Consolidate case to send a signal to the process in a single case
  for PT_ATTACH.  The only case that could have been in the else before
  was a PT_ATTACH where P_STOPPED_SIG was not set, so both instances
  of kern_psignal() collapse down to just PT_ATTACH.

Reported by:	pho, mmel
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D12837
2017-11-13 19:58:58 +00:00
Emmanuel Vadot
f362a3985c arm: rpi2: Fix cpufreq(4)
Since r324184 the root node compatible for rpi2 is "brcm,bcm2836", add
it to the compatible list of bcm2835_cpufreq.

Tested On: RPI2 v1.1 RPI2 v1.2

Reported by:	many on freebsd-arm@
2017-11-13 18:53:41 +00:00
Hans Petter Selasky
0c19d06405 Properly handle the case where the linux_cdev_handle_insert() function
in the LinuxKPI returns NULL. This happens when the VM area's private
data handle already exists and could cause a so-called NULL pointer
dereferencing issue prior to this fix.

Found by:	greg@unrelenting.technology
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-11-13 18:16:26 +00:00
Eric van Gyzen
d63f2964ce Add __BEGIN_DECLS and __END_DECLS to <sys/umtx.h>
This allows C++ programs to call _umtx_op().

MFC after:	3 days
Sponsored by:	Dell EMC
2017-11-13 16:53:36 +00:00
Hans Petter Selasky
8dee9a7a44 Remove no longer supported mthca driver.
Sponsored by:	Mellanox Technologies
2017-11-13 10:59:38 +00:00
Hans Petter Selasky
8cc487045e Update mlx4ib(4) to Linux 4.9.
Sponsored by:	Mellanox Technologies
2017-11-13 10:49:18 +00:00
Konstantin Belousov
f4dd123e15 Do not leak PMC_PO_OWNS_LOGFILE on error.
Note that PMCLOG_RESERVE_WITH_ERROR() macro contains goto error;
statement and executed after the flag is set.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-11-13 10:45:31 +00:00
Konstantin Belousov
c9da263712 Style bug.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2017-11-13 10:43:31 +00:00
Xin LI
712dda7fb0 Be more careful when doing calculation with request from userland.
MFC after:	2 weeks
2017-11-13 07:47:43 +00:00
Warner Losh
2ce13699d7 Use proper include file. While <boot/userboot/userboot.h> works, it
only works because we have -Isys on the command line. We also have
-Isys/boot/userboot on the command line, so bring it in directly with
<userboot.h>. No functional change, but it removes one hard to see
dependency on the boot loader's location in sys/boot.

Sponsored by: Netflix
2017-11-13 00:30:38 +00:00
Ruslan Bukin
b510dab312 Add Intel Processor Trace (PT) MSRs.
Sponsored by:	DARPA, AFRL
2017-11-12 23:13:04 +00:00
Michael Tuexen
469a65d1f8 Cleanup the handling of control chunks. While there fix some minor
bug related to clearing the assoc retransmit counter and the dup TSN
handling of NR-SACK chunks.

MFC after:	3 days
2017-11-12 21:43:33 +00:00
Ed Maste
1ec7755286 boot1: also check for NULL device
r325681 fixed a NULL pointer dereference on RPi3 caused by a lack of
functionality in uboot's EFI implementation.  That rev checked the boot1
load path for NULL but not the load device.  In practice if the former
works the latter will as well, but improve correctness by checking each
separately.

Submitted by:	Keith White <kwhite@eecs.uottawa.ca>
Reported by:	jhb
MFC after:	5 days
MFC with:	r325681
Sponsored by:	The FreeBSD Foundation
2017-11-12 17:15:54 +00:00
Warner Losh
175748c982 Make sure the proper loader.rc gets installed.
There were two things wrong. First, the wrong path was listed in .PATH
statement. Second, BOOTSRC wasn't yet defined for the .PATH, so it
didn't properly add it. Third, even if these were right, . was in the
path before, so it wouldn't have worked. Replace this with a simple
loop so the proper loader.rc gets selected.

Noticed by: dhw@ (menus stopped working on boot)
Sponsored by: Netflix
2017-11-12 17:10:57 +00:00
Mateusz Guzik
ca0227933e amd64: stop nesting preemption counter in spinlock_enter
Discussed with:	jhb
2017-11-12 03:13:01 +00:00
Mateusz Guzik
fe7979a12c Use passed thread pointer instead of curthread in sys_sched_yield
No functional changes.
2017-11-12 02:34:33 +00:00
Mateusz Guzik
baaa6ec7ed Avoid locking and refing in sysctl_kern_proc_args if possible.
Turns out the sysctl is called a lot e.g. by pkg-static.
2017-11-11 22:39:33 +00:00
Mateusz Guzik
8b9817a443 sysctl: try to avoid malloc in name2oid
name2oid is called all the time and passed names are almost always very short
(< 16 characters).
2017-11-11 21:50:36 +00:00
Hans Petter Selasky
5c4155867a Implement missing KDGETMODE IOCTL in VT.
Obtained from:	Johannes Lundberg <yohanesu75@gmail.com>
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-11 20:12:48 +00:00
Mateusz Guzik
537d0fb138 Use pfind_any in linux_rt_sigqueueinfo and kern_sigqueue 2017-11-11 18:10:09 +00:00
Mateusz Guzik
6e1619dae3 Add pfind_any
It looks for both regular and zombie processes. This avoids allproc relocking
previously seen with pfind -> zpfind calls.
2017-11-11 18:04:39 +00:00
Mateusz Guzik
272640b7fc Avoid allproc lock in pfind if curproc->pid == pid 2017-11-11 18:03:26 +00:00
Mateusz Guzik
9b57bf75d0 Remove useless proc lookup from sysctl_out_proc 2017-11-11 18:02:23 +00:00
Hans Petter Selasky
ef9257491d Remove release and acquire semantics when accessing the "state" field of the
LinuxKPI task struct. Change type of "state" variable from "int" to
"atomic_t" to simplify code and avoid unneccessary casting.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-11-11 11:01:50 +00:00
Hans Petter Selasky
e0390735d3 Mask away return codes from del_timer() and del_timer_sync() because
they are not the same like in Linux.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-11-11 10:46:12 +00:00
Mateusz Guzik
c7e4e92ecd rwlock: use fcmpset for setting RW_LOCK_WRITE_SPINNER 2017-11-11 09:34:11 +00:00