Commit Graph

132800 Commits

Author SHA1 Message Date
Mark Johnston
79ddb55c39 Add SCTP_SUPPORT handling to config.mk.
Reviewed by:	jhb, tuexen
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25402
2020-06-25 15:25:00 +00:00
Mark Johnston
84242cf68a Call swap_pager_freespace() from vm_object_page_remove().
All vm_object_page_remove() callers, except
linux_invalidate_mapping_pages() in the LinuxKPI, free swap space when
removing a range of pages from an object.  The LinuxKPI case appears to
be an unintentional omission that could result in leaked swap blocks, so
unconditionally free swap space in vm_object_page_remove() to protect
against similar bugs in the future.

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25329
2020-06-25 15:21:21 +00:00
Conrad Meyer
4daa95f85d bhyve(8): For prototyping, reattempt decode in userspace
If userspace has a newer bhyve than the kernel, it may be able to decode
and emulate some instructions vmm.ko is unaware of.  In this scenario,
reset decoder state and try again.

Reviewed by:	grehan
Differential Revision:	https://reviews.freebsd.org/D24464
2020-06-25 00:18:42 +00:00
Vladimir Kondratyev
54cca285fc atkbd/evdev: recognize the Chromebook menu key as F13 like Linux does.
This is the key on the right side of the function keys, with the
"hamburger menu" icon on it.

Submitted by:		GregV <greg@unrelenting.technology>
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D25390
2020-06-25 00:09:43 +00:00
Mark Johnston
ddf1843203 acpi_ibm(4): Rename disengaged mode to unthrottled mode.
This mode was added in r362496.  Rename it to make the meaning more
clear.

PR:		247306
Suggested by:	rpokala
Submitted by:	Ali Abdallah <ali.abdallah@suse.com>
MFC with:	r362496
2020-06-24 19:51:03 +00:00
Enji Cooper
d6701b6c8c Add kern.features.witness
Adding `kern.features.witness` helps expose whether or not the kernel has
`options WITNESS` enabled, so the `feature_present(3)` API can be used
to query whether or not witness(9) is built into the kernel.

This support is helpful with userspace applications (generally speaking,
tests), as it can be queried to determine whether or not tests related
to WITNESS should be run.

MFC after:	1 week
Reviewed by: cem, darrick.freebsd_gmail.com
Differential Revision: https://reviews.freebsd.org/D25302
Sponsored by:	DellEMC Isilon
2020-06-24 18:51:01 +00:00
Mark Johnston
1388cfe1b5 ipfw(4): make O_IPVER/ipversion match IPv4 or 6, not just IPv4.
Submitted by:	Neel Chauhan <neel AT neelc DOT org>
Reviewed by:	Lutz Donnerhacke
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25227
2020-06-24 15:46:33 +00:00
Mitchell Horne
133b1f1461 Only invalidate the early DTB mapping if it exists
This temporary mapping will become optional. Booting via loader(8)
means that the DTB will have already been copied into the kernel's
staging area, and is therefore covered by the early KVA mappings.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D24911
2020-06-24 15:21:12 +00:00
Mitchell Horne
f7d2df2a8a Handle load from loader(8)
In locore, we must detect and handle different arguments passed by
loader(8) compared to what we recieve when booting directly via SBI
firmware. Currently we receive the hart ID in a0 and a pointer to the
device tree blob in a1. loader(8) provides only a pointer to its
metadata in a0.

The solution to this is to add an additional entry point, _alt_start.
This will be placed first in the .text section, so SBI firmware will
enter here, and jump to the common pagetable setup shortly after. Since
loader(8) understands our ELF kernel, it will enter at the ELF's entry
address, which points to _start. This approach leads to very little
guesswork as to which way we booted.

Fix-up initriscv() to parse the loader's metadata, continuing to use
fake_preload_metadata() in the SBI direct boot case.

Reviewed by:	markj, jrtc27 (asm portion)
Differential Revision:	https://reviews.freebsd.org/D24912
2020-06-24 15:20:00 +00:00
Michael Tuexen
132c073866 Fix the acconting for fragmented unordered messages when using
interleaving.
This was reported for the userland stack in
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=19321

MFC after:		1 week
2020-06-24 14:47:51 +00:00
Richard Scheffenegger
6e26dd0dbe TCP: fix cubic RTO reaction.
Proper TCP Cubic operation requires the knowledge
of the maximum congestion window prior to the
last congestion event.

This restores and improves a bugfix previously added
by jtl@ but subsequently removed due to a revert.

Reported by:	chengc_netapp.com
Reviewed by:	chengc_netapp.com, tuexen (mentor)
Approved by:	tuexen (mentor), rgrimes (mentor)
MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D25133
2020-06-24 13:52:53 +00:00
Richard Scheffenegger
9dc7d8a246 TCP: make after-idle work for transactional sessions.
The use of t_rcvtime as proxy for the last transmission
fails for transactional IO, where the client requests
data before the server can respond with a bulk transfer.

Set aside a dedicated variable to actually track the last
locally sent segment going forward.

Reported by:	rrs
Reviewed by:	rrs, tuexen (mentor)
Approved by:	tuexen (mentor), rgrimes (mentor)
MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D25016
2020-06-24 13:42:42 +00:00
Marcin Wojtas
fab2a758cc Fix AccessWidth and BitWidth parsing in SPCR table
The ACPI Specification defines a Generic Address Structure (GAS),
which is used to describe UART controller register layout in the
SPCR table. The driver responsible for parsing it (uart_cpu_acpi)
wrongly associates the Access Size field to the uart_bas's regshft
and the register BitWidth to the regiowidth - according to
the definitions it should be opposite.

This problem remained hidden most likely because the majority of platforms
use 32-bit registers (BitWidth) which are accessed with the according
size (Dword). However on Marvell Armada 8k / Cn913x platforms,
the 32-bit registers should be accessed with Byte granulity, which
unveiled the issue.

This patch fixes above by proper values assignment and slightly improved
parsing.

Note that handling of the AccessWidth set to EFI_ACPI_6_0_UNDEFINED is
needed to work around a buggy SPCR table on EC2 x86 "bare metal" instances.

Reviewed by: manu, imp, cperciva, greg_unrelenting.technology
Obtained from: Semihalf
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25373
2020-06-24 12:15:27 +00:00
Michael Tuexen
87c0bf77d9 Fix alignment issue manifesting in the userland stack.
MFC after:		1 wwek
2020-06-23 23:05:05 +00:00
Doug Moore
158c55a584 In r362552, RB_SET_PARENT is defined, and use in parens in
RB_CLEAR_NODE.  But it is not an expression, and ought not to be
enclosed in parens.  Remove them.

Approved by:	markj
Differential Revision:	https://reviews.freebsd.org/D25421
2020-06-23 22:47:54 +00:00
Kirk McKusick
9407f25df2 Optimize g_journal's superblock update by noting that the summary
information is neither read nor written so it need not be written
out when updating the superblock.

PR:           247425
Sponsored by: Netflix
2020-06-23 21:44:00 +00:00
Vincenzo Maffione
0ff2126795 iflib: netmap: fix rsync index overrun
In the current iflib_netmap_rxsync, there is nothing that prevents
kring->nr_hwtail to overrun kring->nr_hwcur during the descriptor
import phase. This may cause errors in netmap applications, such as:

em1 RX0: fail 'head < kring->nr_hwcur || head > kring->nr_hwtail'
    h 795 c 795 t 282 rh 795 rc 795 rt 282 hc 282 ht 282

Reviewed by:	gallatin
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25252
2020-06-23 20:23:56 +00:00
Doug Moore
4d56980017 Define RB_SET_PARENT to do all assignments to rb parent
pointers. Define RB_SWAP_CHILD to replace the child of a parent with
its twin, and use it in 4 places. Use RB_SET in rb_link_node to remove
the only linuxkpi reference to color, and then drop color- and
parent-related definitions that are defined and used only in rbtree.h.

This is intended to be entirely cosmetic, with no impact on program
behavior, and leave RB_PARENT and RB_SET_PARENT as the only ways to
read and write rb parent pointers.

Reviewed by:	markj, kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D25264
2020-06-23 20:02:55 +00:00
Conrad Meyer
9b6edf364e kmod.mk: Don't split out debug symbols if requested
Ports bsd.kmod.mk explicitly sets MK_KERNEL_SYMBOLS=no to prevent auto-
splitting of debuginfo from kernel modules.  If that knob is set, don't
split out a .ko.debug and .ko from .ko.full; just generate a .ko with
debuginfo and leave it be.

Otherwise, with DEBUG_FLAGS set and MK_KERNEL_SYMBOLS=no, we would helpfully
strip out the debuginfo from the .ko.full and then not install it.  That is
not the desired result a WITH_DEBUG port kmod build.

Reviewed by:	emaste, jhb
Differential Revision:	https://reviews.freebsd.org/D24835
2020-06-23 18:25:31 +00:00
Ed Maste
e46cf959d6 arm64 armreg.h: fix TCR_TBI1 definition
Submitted by:	Greg V <greg@unrelenting.technology>
Differential Revision:	https://reviews.freebsd.org/D25411
2020-06-23 15:32:05 +00:00
Tycho Nightingale
c774294c57 To avoid a startup script race change net.bpf.optimize_writers from
CTLFLAG_RW to CTLFLAG_RWTUN to allow it to be modified by a loader
tunable.

Sponsored by:	Dell EMC Isilon
2020-06-23 13:57:53 +00:00
Navdeep Parhar
0cadedfc46 cxgbe(4): Add a tx_len16_to_desc helper.
No functional change.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-06-23 07:33:29 +00:00
Toomas Soome
a14844e0d6 MFOpenZFS: Add basic zfs ioc input nvpair validation
We want newer versions of libzfs_core to run against an existing
zfs kernel module (i.e. a deferred reboot or module reload after
an update).

Programmatically document, via a zfs_ioc_key_t, the valid arguments
for the ioc commands that rely on nvpair input arguments (i.e. non
legacy commands from libzfs_core). Automatically verify the expected
pairs before dispatching a command.

This initial phase focuses on the non-legacy ioctls. A follow-on
change can address the legacy ioctl input from the zfs_cmd_t.

The zfs_ioc_key_t for zfs_keys_channel_program looks like:

static const zfs_ioc_key_t zfs_keys_channel_program[] = {
       {"program",     DATA_TYPE_STRING,               0},
       {"arg",         DATA_TYPE_UNKNOWN,              0},
       {"sync",        DATA_TYPE_BOOLEAN_VALUE,        ZK_OPTIONAL},
       {"instrlimit",  DATA_TYPE_UINT64,               ZK_OPTIONAL},
       {"memlimit",    DATA_TYPE_UINT64,               ZK_OPTIONAL},
};

Introduce four input errors to identify specific input failures
(in addition to generic argument value errors like EINVAL, ERANGE,
EBADF, and E2BIG).

ZFS_ERR_IOC_CMD_UNAVAIL the ioctl number is not supported by kernel
ZFS_ERR_IOC_ARG_UNAVAIL an input argument is not supported by kernel
ZFS_ERR_IOC_ARG_REQUIRED a required input argument is missing
ZFS_ERR_IOC_ARG_BADTYPE an input argument has an invalid type

Reviewed by:	allanjude
Obtained from:	OpenZFS
Sponsored by:	Netflix, Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D25393
2020-06-23 06:42:39 +00:00
Andriy Gapon
b40dd828bd teach ena driver about RSS kernel option
Networking is broken if the driver configures its (virtual) hardware to
use a hash algorithm (or a key) different from the one that the network
stack (software RSS) uses.  This can be seen with connections initiated
from the host.  The PCB will be placed into the hash table based on the
hash value calculated by the software.  The hardware-calculated hash
value in reponse packets will be different, so the PCB won't be found.

Tested with a kernel compiled with 'options RSS' on an instance with ena
driver.

Reviewed by:	mw, adrian
MFC after:	2 weeks
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D24733
2020-06-23 04:58:36 +00:00
John Baldwin
5b750b9a68 Store the AAD in a separate buffer for KTLS.
For TLS 1.2 this permits reusing one of the existing iovecs without
always having to duplicate both.

While here, only duplicate the output iovec for TLS 1.3 if it will be
used.

Reviewed by:	gallatin
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25291
2020-06-23 00:02:28 +00:00
John Baldwin
6deb4131b8 Add support for requests with separate AAD to ccr(4).
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25290
2020-06-22 23:41:33 +00:00
John Baldwin
604b021795 Add support for requests with separate AAD to aesni(4).
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25289
2020-06-22 23:22:13 +00:00
John Baldwin
9b774dc0c5 Add support to the crypto framework for separate AAD buffers.
This permits requests to provide the AAD in a separate side buffer
instead of as a region in the crypto request input buffer.  This is
useful when the main data buffer might not contain the full AAD
(e.g. for TLS or IPsec with ESN).

Unlike separate IVs which are constrained in size and stored in an
array in struct cryptop, separate AAD is provided by the caller
setting a new crp_aad pointer to the buffer.  The caller must ensure
the pointer remains valid and the buffer contents static until the
request is completed (e.g. when the callback routine is invoked).

As with separate output buffers, not all drivers support this feature.
Consumers must request use of this feature via a new session flag.

To aid in driver testing, kern.crypto.cryptodev_separate_aad can be
set to force /dev/crypto requests to use a separate AAD buffer.

Discussed with:	cem
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25288
2020-06-22 23:20:43 +00:00
Jung-uk Kim
450d86fc7f Assume all TSCs are synchronized for AMD Family 17h processors and later
when it has passed the synchronization test.

"Processor Programming Reference (PPR) for AMD Family 17h" states that
the TSC uses a common reference for all sockets, cores and threads.

MFC after:	1 month
2020-06-22 20:42:58 +00:00
Allan Jude
c5305bb50a MFOpenZFS: Add zio_ddt_free()+ddt_phys_decref() error handling
The assumption in zio_ddt_free() is that ddt_phys_select() must
always find a match.  However, if that fails due to a damaged
DDT or some other reason the code will NULL dereference in
ddt_phys_decref().

While this should never happen it has been observed on various
platforms.  The result is that unless your willing to patch the
ZFS code the pool is inaccessible.  Therefore, we're choosing
to more gracefully handle this case rather than leave it fatal.

http://mail.opensolaris.org/pipermail/zfs-discuss/2012-February/050972.html

5dc6af0eec

Reported by:	Pierre Beyssac
Obtained from:	OpenZFS
MFC after:	2 weeks
Sponsored by:	Klara Inc.
2020-06-22 19:03:02 +00:00
Michael Tuexen
b88082dd39 No need to include netinet/sctp_crc32.h twice. 2020-06-22 14:36:14 +00:00
Mark Johnston
e6db509d10 Move the definition of SCTP's system_base_info into sctp_crc32.c.
This file is the only SCTP source file compiled into the kernel when
SCTP_SUPPORT is configured.  sctp_delayed_checksum() references a couple
of counters defined in system_base_info, so the change allows these
counters to be referenced in a kernel compiled without "options SCTP".

Submitted by:	tuexen
MFC with:	r362338
2020-06-22 14:01:31 +00:00
Mark Johnston
9f763f0092 acpi_ibm(4): Add support for putting fans in disengaged mode.
PR:		247306
Submitted by:	Ali Abdallah <ali.abdallah@suse.com>
MFC after:	2 weeks
2020-06-22 12:36:05 +00:00
Andrew Turner
372c142b4f Translaate the PCI address when activating a resource
When the PCI address != physical address we need to translate from the
former to the latter before passing to the parent to map into the kernels
virtual address space.

Sponsored by:	Innovate UK
2020-06-22 10:49:50 +00:00
Andriy Gapon
f31030ba61 gpiobus_release_pin: remove incorrect prefix from error messages
It's interesting that similar messages from gpiobus_acquire_pin never
had any prefix while gpiobus_release_pin messages were prefixed with
"gpiobus_acquire_pin".
Anyway, the prefix is not that useful and can be deduced from context.

MFC after:	2 weeks
2020-06-22 10:32:41 +00:00
Doug Rabson
c07782e10e Add some missing parts for supporting va_birthtime.
Reviewed by:	rmacklem
2020-06-22 08:23:16 +00:00
Andrew Turner
fc0804f18b Fix reboot command on the Raspberry Pi series.
The Raspbery Pi computers do not properly implement PSCI. The canonical
way to reset them is to set a watchdog timer and allow it to expire.

Submitted by:	Robert Crowston <crowston_protonmail.com>
Differential Revision:	https://reviews.freebsd.org/D25268
2020-06-22 08:12:21 +00:00
Baptiste Daroussin
5b990a9463 Revert r362466
Such change should not have happen without prior discussion and review.

With hat:	transitioning core
2020-06-22 07:46:24 +00:00
Alexander V. Chernikov
b158cfb3fc Switch cxgbe interface lookup to use fibX_lookup() from older
fibX_lookup_nh_ext().

fibX_lookup_nh_ represents pre-epoch generation of fib kpi,
providing less guarantees over pointer validness and requiring
on-stack data copying.

Reviewed by:	np
Differential Revision:	https://reviews.freebsd.org/D24975
2020-06-22 07:35:23 +00:00
Michael Tuexen
c5d9e5c99e Cleanup the defintion of struct sctp_getaddresses. This stucture
is used by the IPPROTO_SCTP level socket options SCTP_GET_PEER_ADDRESSES
and SCTP_GET_LOCAL_ADDRESSES, which are used by libc to implement
sctp_getladdrs() and sctp_getpaddrs().
These changes allow an old libc to work on a newer kernel.
2020-06-21 23:12:56 +00:00
Bjoern A. Zeeb
e387af1fa8 Rather than zeroing MAXVIFS times size of pointer [r362289] (still better than
sizeof pointer before [r354857]), we need to zero MAXVIFS times the size of
the struct.  All good things come in threes; I hope this is it on this one.

PR:		246629, 206583
Reported by:	kib
MFC after:	ASAP
2020-06-21 22:09:30 +00:00
Matt Macy
9aeca21324 iflib: fix cloneattach fail and generalize pseudo device handling
- a cloneattach failure will not currently be handled correctly,
  jump to the right target

- pseudo devices are all treat as if they're ethernet devices -
  this often doesn't make sense

MFC after:	1 week
Sponsored by:	Netgate, Inc.
Differential Revision:	https://reviews.freebsd.org/D25083
2020-06-21 22:02:49 +00:00
Pawel Biernacki
9daf71541c net.link.generic.ifdata.<ifindex>.linkspecific: rework handler
This OID was added in r17352 but the write path of IFDATA_LINKSPECIFIC
seems unused as there are no in-base writers, and as far as I can tell
we had issues with this code before, see PR 219472.  Drop the write path
to make the handler read-only as described in comments and man-pages.
It can be marked as MPSAFE now.

Reviewed by:	bdragon, kib, melifaro, wollman
Approved by:	kib (mentor)
Sponsored by:	Mysterious Code Ltd.
Differential Revision:	https://reviews.freebsd.org/D25348
2020-06-21 18:40:17 +00:00
Hans Petter Selasky
7747001b12 Improve wording to be more precise and clear.
No functional change intended.

s/Master Boot/Main Boot/ (also called MBR)

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-21 13:34:08 +00:00
Edward Tomasz Napierala
5ac2674278 Adapt linuxulator syscalls.master files to the new layout.
No functional changes.

Reviewed by:	brooks
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25381
2020-06-21 10:09:34 +00:00
Michael Tuexen
171edd2110 Fix the build for an INET6 only configuration.
The fix from the last commit is actually needed twice...

MFC after:		1 week
2020-06-21 09:56:09 +00:00
Thomas Munro
f270658873 vfs: track sequential reads and writes separately
For software like PostgreSQL and SQLite that sometimes reads sequentially
while also writing sequentially some distance behind with interleaved
syscalls on the same fd, performance is better on UFS if we do
sequential access heuristics separately for reads and writes.

Patch originally by Andrew Gierth in 2008, updated and proposed by me with
his permission.

Reviewed by:	mjg, kib, tmunro
Approved by:	mjg (mentor)
Obtained from:	Andrew Gierth <andrew@tao11.riddles.org.uk>
Differential Revision:	https://reviews.freebsd.org/D25024
2020-06-21 08:51:24 +00:00
Jeff Roberson
03270b59ee Use zone nomenclature that is consistent with UMA. 2020-06-21 04:59:02 +00:00
Brandon Bergren
40b664f64b [PowerPC] More relocation fixes
It turns out relocating the symbol table itself can cause issues, like fbt
crashing because it applies the offsets to the kernel twice.

This had been previously brought up in rS333447 when the stoffs hack was
added, but I had been unaware of this and reimplemented symtab relocation.

Instead of relocating the symbol table, keep track of the relocation base
in ddb, so the ddb symbols behave like the kernel linker-provided symbols.

This is intended to be NFC on platforms other than PowerPC, which do not
use fully relocatable kernels. (The relbase will always be 0)

 * Remove the rest of the stoffs hack.
 * Remove my half-baked displace_symbol_table() function.
 * Extend ddb initialization to cope with having a relocation offset on the
   kernel symbol table.
 * Fix my kernel-as-initrd hack to work with booke64 by using a temporary
   mapping to access the data.
 * Fix another instance of __powerpc__ that is actually RELOCATABLE_KERNEL.
 * Change the behavior or X_db_symbol_values to apply the relocation base
   when updating valp, to match link_elf_symbol_values() behavior.

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D25223
2020-06-21 03:39:26 +00:00
Rick Macklem
b94b9a80b2 Fix up a comment added by r362455. 2020-06-21 02:49:56 +00:00