Commit Graph

132779 Commits

Author SHA1 Message Date
Navdeep Parhar
0cadedfc46 cxgbe(4): Add a tx_len16_to_desc helper.
No functional change.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-06-23 07:33:29 +00:00
Toomas Soome
a14844e0d6 MFOpenZFS: Add basic zfs ioc input nvpair validation
We want newer versions of libzfs_core to run against an existing
zfs kernel module (i.e. a deferred reboot or module reload after
an update).

Programmatically document, via a zfs_ioc_key_t, the valid arguments
for the ioc commands that rely on nvpair input arguments (i.e. non
legacy commands from libzfs_core). Automatically verify the expected
pairs before dispatching a command.

This initial phase focuses on the non-legacy ioctls. A follow-on
change can address the legacy ioctl input from the zfs_cmd_t.

The zfs_ioc_key_t for zfs_keys_channel_program looks like:

static const zfs_ioc_key_t zfs_keys_channel_program[] = {
       {"program",     DATA_TYPE_STRING,               0},
       {"arg",         DATA_TYPE_UNKNOWN,              0},
       {"sync",        DATA_TYPE_BOOLEAN_VALUE,        ZK_OPTIONAL},
       {"instrlimit",  DATA_TYPE_UINT64,               ZK_OPTIONAL},
       {"memlimit",    DATA_TYPE_UINT64,               ZK_OPTIONAL},
};

Introduce four input errors to identify specific input failures
(in addition to generic argument value errors like EINVAL, ERANGE,
EBADF, and E2BIG).

ZFS_ERR_IOC_CMD_UNAVAIL the ioctl number is not supported by kernel
ZFS_ERR_IOC_ARG_UNAVAIL an input argument is not supported by kernel
ZFS_ERR_IOC_ARG_REQUIRED a required input argument is missing
ZFS_ERR_IOC_ARG_BADTYPE an input argument has an invalid type

Reviewed by:	allanjude
Obtained from:	OpenZFS
Sponsored by:	Netflix, Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D25393
2020-06-23 06:42:39 +00:00
Andriy Gapon
b40dd828bd teach ena driver about RSS kernel option
Networking is broken if the driver configures its (virtual) hardware to
use a hash algorithm (or a key) different from the one that the network
stack (software RSS) uses.  This can be seen with connections initiated
from the host.  The PCB will be placed into the hash table based on the
hash value calculated by the software.  The hardware-calculated hash
value in reponse packets will be different, so the PCB won't be found.

Tested with a kernel compiled with 'options RSS' on an instance with ena
driver.

Reviewed by:	mw, adrian
MFC after:	2 weeks
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D24733
2020-06-23 04:58:36 +00:00
John Baldwin
5b750b9a68 Store the AAD in a separate buffer for KTLS.
For TLS 1.2 this permits reusing one of the existing iovecs without
always having to duplicate both.

While here, only duplicate the output iovec for TLS 1.3 if it will be
used.

Reviewed by:	gallatin
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25291
2020-06-23 00:02:28 +00:00
John Baldwin
6deb4131b8 Add support for requests with separate AAD to ccr(4).
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25290
2020-06-22 23:41:33 +00:00
John Baldwin
604b021795 Add support for requests with separate AAD to aesni(4).
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25289
2020-06-22 23:22:13 +00:00
John Baldwin
9b774dc0c5 Add support to the crypto framework for separate AAD buffers.
This permits requests to provide the AAD in a separate side buffer
instead of as a region in the crypto request input buffer.  This is
useful when the main data buffer might not contain the full AAD
(e.g. for TLS or IPsec with ESN).

Unlike separate IVs which are constrained in size and stored in an
array in struct cryptop, separate AAD is provided by the caller
setting a new crp_aad pointer to the buffer.  The caller must ensure
the pointer remains valid and the buffer contents static until the
request is completed (e.g. when the callback routine is invoked).

As with separate output buffers, not all drivers support this feature.
Consumers must request use of this feature via a new session flag.

To aid in driver testing, kern.crypto.cryptodev_separate_aad can be
set to force /dev/crypto requests to use a separate AAD buffer.

Discussed with:	cem
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25288
2020-06-22 23:20:43 +00:00
Jung-uk Kim
450d86fc7f Assume all TSCs are synchronized for AMD Family 17h processors and later
when it has passed the synchronization test.

"Processor Programming Reference (PPR) for AMD Family 17h" states that
the TSC uses a common reference for all sockets, cores and threads.

MFC after:	1 month
2020-06-22 20:42:58 +00:00
Allan Jude
c5305bb50a MFOpenZFS: Add zio_ddt_free()+ddt_phys_decref() error handling
The assumption in zio_ddt_free() is that ddt_phys_select() must
always find a match.  However, if that fails due to a damaged
DDT or some other reason the code will NULL dereference in
ddt_phys_decref().

While this should never happen it has been observed on various
platforms.  The result is that unless your willing to patch the
ZFS code the pool is inaccessible.  Therefore, we're choosing
to more gracefully handle this case rather than leave it fatal.

http://mail.opensolaris.org/pipermail/zfs-discuss/2012-February/050972.html

5dc6af0eec

Reported by:	Pierre Beyssac
Obtained from:	OpenZFS
MFC after:	2 weeks
Sponsored by:	Klara Inc.
2020-06-22 19:03:02 +00:00
Michael Tuexen
b88082dd39 No need to include netinet/sctp_crc32.h twice. 2020-06-22 14:36:14 +00:00
Mark Johnston
e6db509d10 Move the definition of SCTP's system_base_info into sctp_crc32.c.
This file is the only SCTP source file compiled into the kernel when
SCTP_SUPPORT is configured.  sctp_delayed_checksum() references a couple
of counters defined in system_base_info, so the change allows these
counters to be referenced in a kernel compiled without "options SCTP".

Submitted by:	tuexen
MFC with:	r362338
2020-06-22 14:01:31 +00:00
Mark Johnston
9f763f0092 acpi_ibm(4): Add support for putting fans in disengaged mode.
PR:		247306
Submitted by:	Ali Abdallah <ali.abdallah@suse.com>
MFC after:	2 weeks
2020-06-22 12:36:05 +00:00
Andrew Turner
372c142b4f Translaate the PCI address when activating a resource
When the PCI address != physical address we need to translate from the
former to the latter before passing to the parent to map into the kernels
virtual address space.

Sponsored by:	Innovate UK
2020-06-22 10:49:50 +00:00
Andriy Gapon
f31030ba61 gpiobus_release_pin: remove incorrect prefix from error messages
It's interesting that similar messages from gpiobus_acquire_pin never
had any prefix while gpiobus_release_pin messages were prefixed with
"gpiobus_acquire_pin".
Anyway, the prefix is not that useful and can be deduced from context.

MFC after:	2 weeks
2020-06-22 10:32:41 +00:00
Doug Rabson
c07782e10e Add some missing parts for supporting va_birthtime.
Reviewed by:	rmacklem
2020-06-22 08:23:16 +00:00
Andrew Turner
fc0804f18b Fix reboot command on the Raspberry Pi series.
The Raspbery Pi computers do not properly implement PSCI. The canonical
way to reset them is to set a watchdog timer and allow it to expire.

Submitted by:	Robert Crowston <crowston_protonmail.com>
Differential Revision:	https://reviews.freebsd.org/D25268
2020-06-22 08:12:21 +00:00
Baptiste Daroussin
5b990a9463 Revert r362466
Such change should not have happen without prior discussion and review.

With hat:	transitioning core
2020-06-22 07:46:24 +00:00
Alexander V. Chernikov
b158cfb3fc Switch cxgbe interface lookup to use fibX_lookup() from older
fibX_lookup_nh_ext().

fibX_lookup_nh_ represents pre-epoch generation of fib kpi,
providing less guarantees over pointer validness and requiring
on-stack data copying.

Reviewed by:	np
Differential Revision:	https://reviews.freebsd.org/D24975
2020-06-22 07:35:23 +00:00
Michael Tuexen
c5d9e5c99e Cleanup the defintion of struct sctp_getaddresses. This stucture
is used by the IPPROTO_SCTP level socket options SCTP_GET_PEER_ADDRESSES
and SCTP_GET_LOCAL_ADDRESSES, which are used by libc to implement
sctp_getladdrs() and sctp_getpaddrs().
These changes allow an old libc to work on a newer kernel.
2020-06-21 23:12:56 +00:00
Bjoern A. Zeeb
e387af1fa8 Rather than zeroing MAXVIFS times size of pointer [r362289] (still better than
sizeof pointer before [r354857]), we need to zero MAXVIFS times the size of
the struct.  All good things come in threes; I hope this is it on this one.

PR:		246629, 206583
Reported by:	kib
MFC after:	ASAP
2020-06-21 22:09:30 +00:00
Matt Macy
9aeca21324 iflib: fix cloneattach fail and generalize pseudo device handling
- a cloneattach failure will not currently be handled correctly,
  jump to the right target

- pseudo devices are all treat as if they're ethernet devices -
  this often doesn't make sense

MFC after:	1 week
Sponsored by:	Netgate, Inc.
Differential Revision:	https://reviews.freebsd.org/D25083
2020-06-21 22:02:49 +00:00
Pawel Biernacki
9daf71541c net.link.generic.ifdata.<ifindex>.linkspecific: rework handler
This OID was added in r17352 but the write path of IFDATA_LINKSPECIFIC
seems unused as there are no in-base writers, and as far as I can tell
we had issues with this code before, see PR 219472.  Drop the write path
to make the handler read-only as described in comments and man-pages.
It can be marked as MPSAFE now.

Reviewed by:	bdragon, kib, melifaro, wollman
Approved by:	kib (mentor)
Sponsored by:	Mysterious Code Ltd.
Differential Revision:	https://reviews.freebsd.org/D25348
2020-06-21 18:40:17 +00:00
Hans Petter Selasky
7747001b12 Improve wording to be more precise and clear.
No functional change intended.

s/Master Boot/Main Boot/ (also called MBR)

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-21 13:34:08 +00:00
Edward Tomasz Napierala
5ac2674278 Adapt linuxulator syscalls.master files to the new layout.
No functional changes.

Reviewed by:	brooks
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25381
2020-06-21 10:09:34 +00:00
Michael Tuexen
171edd2110 Fix the build for an INET6 only configuration.
The fix from the last commit is actually needed twice...

MFC after:		1 week
2020-06-21 09:56:09 +00:00
Thomas Munro
f270658873 vfs: track sequential reads and writes separately
For software like PostgreSQL and SQLite that sometimes reads sequentially
while also writing sequentially some distance behind with interleaved
syscalls on the same fd, performance is better on UFS if we do
sequential access heuristics separately for reads and writes.

Patch originally by Andrew Gierth in 2008, updated and proposed by me with
his permission.

Reviewed by:	mjg, kib, tmunro
Approved by:	mjg (mentor)
Obtained from:	Andrew Gierth <andrew@tao11.riddles.org.uk>
Differential Revision:	https://reviews.freebsd.org/D25024
2020-06-21 08:51:24 +00:00
Jeff Roberson
03270b59ee Use zone nomenclature that is consistent with UMA. 2020-06-21 04:59:02 +00:00
Brandon Bergren
40b664f64b [PowerPC] More relocation fixes
It turns out relocating the symbol table itself can cause issues, like fbt
crashing because it applies the offsets to the kernel twice.

This had been previously brought up in rS333447 when the stoffs hack was
added, but I had been unaware of this and reimplemented symtab relocation.

Instead of relocating the symbol table, keep track of the relocation base
in ddb, so the ddb symbols behave like the kernel linker-provided symbols.

This is intended to be NFC on platforms other than PowerPC, which do not
use fully relocatable kernels. (The relbase will always be 0)

 * Remove the rest of the stoffs hack.
 * Remove my half-baked displace_symbol_table() function.
 * Extend ddb initialization to cope with having a relocation offset on the
   kernel symbol table.
 * Fix my kernel-as-initrd hack to work with booke64 by using a temporary
   mapping to access the data.
 * Fix another instance of __powerpc__ that is actually RELOCATABLE_KERNEL.
 * Change the behavior or X_db_symbol_values to apply the relocation base
   when updating valp, to match link_elf_symbol_values() behavior.

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D25223
2020-06-21 03:39:26 +00:00
Rick Macklem
b94b9a80b2 Fix up a comment added by r362455. 2020-06-21 02:49:56 +00:00
Rick Macklem
4302e8b671 Modify the way the client side krpc does soreceive() for TCP.
Without this patch, clnt_vc_soupcall() first does a soreceive() for
4 bytes (the Sun RPC over TCP record mark) and then soreceive(s) for
the RPC message.
This first soreceive() almost always results in an mbuf allocation,
since having the 4byte record mark in a separate mbuf in the socket
rcv queue is unlikely.
This is somewhat inefficient and rather odd. It also will not work
for the ktls rx, since the latter returns a TLS record for each
soreceive().

This patch replaces the above with code similar to what the server side
of the krpc does for TCP, where it does a soreceive() for as much data
as possible and then parses RPC messages out of the received data.
A new field of the TCP socket structure called ct_raw is the list of
received mbufs that the RPC message(s) are parsed from.
I think this results in cleaner code and is needed for support of
nfs-over-tls.
It also fixes the code for the case where a server sends an RPC message
in multiple RPC message fragments. Although this is allowed by RFC5531,
no extant NFS server does this. However, it is probably good to fix this
in case some future NFS server does do this.
2020-06-21 00:06:04 +00:00
Michael Tuexen
5087b6e732 Set a variable also in the case of an INET6 only kernel
MFC after:		1 week
2020-06-20 23:48:57 +00:00
Xin LI
00e8fb8001 Bump __FreeBSD_version after making liblzma to use libmd implementation
of SHA256.

PR:		200142
2020-06-20 21:32:14 +00:00
Michael Tuexen
ed82c2edd6 Use a struct sockaddr_in pr struct sockaddr_in6 as the option value
for the IPPROTO_SCTP level socket options SCTP_BINDX_ADD_ADDR and
SCTP_BINDX_REM_ADDR. These socket option are intended for internal
use only to implement sctp_bindx().
This is one user of struct sctp_getaddresses less.
struct sctp_getaddresses is strange and will be changed shortly.
2020-06-20 21:06:02 +00:00
Doug Moore
bc1bed77a8 In concluding RB_REMOVE_COLOR, in the case when the sibling of the
root of the too-short tree is black and at least one of the children
of that sibling is red, either one or two rotations finish the
rebalancing. In the case when both of the children are red, the
current implementation uses two rotations where only one is
necessary. This change removes that extra rotation, and in that case
also removes a needless black-to-red-to-black recoloring.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D25335
2020-06-20 20:25:39 +00:00
Jeff Roberson
c8b0a88b8d Clarify some language. Favor primary where both master and primary were
used in conjunction with secondary.
2020-06-20 20:21:04 +00:00
Michael Tuexen
7621bd5ead Cleanup the adding and deleting of addresses via sctp_bindx().
There is no need to use the association identifier, so remove it.
While there, cleanup the code a bit.

MFC after:		1 week
2020-06-20 20:20:16 +00:00
Edward Tomasz Napierala
bafd96b8dd Regen after r362440.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-20 18:31:02 +00:00
Edward Tomasz Napierala
52c81be11a Add linux_madvise(2) instead of having Linux apps call the native
FreeBSD madvise(2) directly.  While some of the flag values match,
most don't.

PR:		kern/230160
Reported by:	markj
Reviewed by:	markj
Discussed with:	brooks, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25272
2020-06-20 18:29:22 +00:00
Conrad Meyer
b75a772875 oce(4): Account and trace mbufs before handing to hw
Once tx mbufs have been handed to hardware, nothing serializes the tx
path against completion and potential use-after-free of the outbound
mbuf.  Perform accounting and BPF tap before queueing to hardware to
avoid this race.

Submitted by:	Steve Wirtz <steve_wirtz AT dell.com>
Reviewed by:	markj, rstone
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25364
2020-06-20 17:22:46 +00:00
Hans Petter Selasky
75dc9c41ab Improve debug message to be more precise and clear.
For the sake of the record, this is the last use of the words master and slave
in the FreeBSD's USB stack, drivers and subsystems.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-20 14:16:24 +00:00
Toomas Soome
3830659e99 loader: create single zfs nextboot implementation
We should have nextboot feature implemented in libsa zfs code.
To get there, I have created zfs_nextboot() implementation based on
two sources, our current simple textual string based approach with added
structured boot label PAD structure from OpenZFS.

Secondly, all nvlist details are moved to separate source file and
restructured a bit. This is done to provide base support to add nvlist
add/update feature in followup updates.

And finally, the zfsboot/gptzfsboot disk access functions are swapped to use
libi386 and libsa.

Sponsored by:	Netflix, Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D25324
2020-06-20 06:23:31 +00:00
Kyle Evans
e245e555fa raspberry pi 4: cpufreq support
The submitter notes that the bcm2835_cpufreq driver really just needs the
rpi4 compat string added to it; powerd subsequently works and the dev.cpu.0
sysctl values look sane and can be successfully manipulated.

Submitted by:	James Mintram <me@jamesrm.com>
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D25349
2020-06-20 04:07:58 +00:00
Warner Losh
f66ca1b1eb Use the more descriptive src_ccb and dst_ccb for the two ccbs being merged.
MFC after: 1 week
2020-06-20 04:07:23 +00:00
Edward Tomasz Napierala
4afe4fae1b Add warnings for unsupported Linux clockids.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25322
2020-06-19 19:33:06 +00:00
Michal Meloun
b72e2878ab Improve if_dwc:
- refactorize packet receive path. Make sure that we don't leak mbufs
   and/or that we don't create holes in RX descriptor ring
 - slightly simplify handling with TX descriptors

MFC after:	4 weeks
2020-06-19 19:26:55 +00:00
Brandon Bergren
fb0543afa8 [PowerPC] Add virtio to GENERIC
Due to kldxref not being able to generate hints for nonnative platforms,
any cross built VM images do not have /boot/kernel/linker.hints.

This prevents the virtio modules from being loaded, as the fallback code
will always fail the version check when the hints are missing.

Since we want to be able to generate VM images for 32 bit powerpc, add the
virtio modules to GENERIC like we do on powerpc64.

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D25271
2020-06-19 18:43:13 +00:00
Brandon Bergren
8415f755f1 [PowerPC] Fix booke64 qemu infinite loop in L2 cache enable
Since qemu does not implement the L2 cache, we get stuck forever waiting
for a bit to be set when trying to invalidate it.

To prevent that, we should bail out if the L2 cache is missing.
One easy way to check this is L2CFG0 == 0 (since L2CSIZE always has at
least one bit set in a valid implementation)

(tested on qemu, rb800, and x5000)

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D25225
2020-06-19 18:40:39 +00:00
Brandon Bergren
37f530582d [PowerPC] De-giant powermac_nvram, update documentation
* Remove the giant lock requirement from powermac_nvram.
* Update manual pages to reflect current state.

Reviewed by:	bcr (manpages), jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D24812
2020-06-19 18:36:10 +00:00
Michal Meloun
188aee740f Finish renaming in if_dwc.
By using DWC TRM terminology, normal descriptor format should be named
extended and alternate descriptor format should be named normal.

Should not been functional change.

MFC after:	4 weeks
2020-06-19 18:34:27 +00:00
Michal Meloun
8d43a8685c Use naming nomenclature used in DesignWare TRM.
Use naming nomenclature used in DesignWare TRM.
This driver was written by using Altera (now Intel) documentation for Arria
FPGA manual. Unfortunately this manual used very different (and in some cases
opposite naming) for registers and descriptor fields. Unfortunately,
this makes future expansion extremely hard.

Should not been functional change.

MFC after:	4 weeks
2020-06-19 18:04:41 +00:00