Commit Graph

132804 Commits

Author SHA1 Message Date
Bjoern A. Zeeb
ce19cceb8d When converting the static arrays to mallocarray() in r356621 I missed
one place where we now need to multiply the size of the struct with the
number of entries.  This led to problems when restarting user space
daemons, as the cleanup was never properly done, resulting in MRT_ADD_VIF
EADDRINUSE.
Properly zero all array elements to avoid this problem.
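
The class of bug described above can be sketched in user space as follows. This is a minimal illustration, not the kernel code: `struct vif_entry`, `clear_entries_buggy()`, `clear_entries_fixed()` and `count_in_use()` are hypothetical names, and `memset()` stands in for whatever zeroing the kernel performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a dynamically allocated table. */
struct vif_entry {
	int in_use;
	unsigned long addr;
};

/* Buggy cleanup: the length covers one struct, so only entry 0 is zeroed. */
static void
clear_entries_buggy(struct vif_entry *tbl, size_t n)
{
	(void)n;	/* the forgotten multiplier */
	memset(tbl, 0, sizeof(*tbl));
}

/* Fixed cleanup: multiply the struct size by the number of entries. */
static void
clear_entries_fixed(struct vif_entry *tbl, size_t n)
{
	assert(tbl != NULL);
	memset(tbl, 0, n * sizeof(*tbl));
}

/* Count entries still marked in use after a cleanup. */
static size_t
count_in_use(const struct vif_entry *tbl, size_t n)
{
	size_t i, used = 0;

	for (i = 0; i < n; i++)
		if (tbl[i].in_use)
			used++;
	return (used);
}
```

With a static array, `sizeof(array)` already covers every element; once the storage comes from an allocator, the pointer carries no length and the element count has to be multiplied in by hand.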

PR:		246629, 206583
Reported by:	(many)
MFC after:	4 days
Sponsored by:	Rubicon Communications, LLC (d/b/a "Netgate")
2020-06-17 21:04:38 +00:00
Bjoern A. Zeeb
b7b3d237e7 The call into ifa_ifwithaddr() needs to be epoch protected; otherwise
we'll panic on an assertion.
While here, leave a comment that the ifp was never protected and stable
(as glebius pointed out) and this needs to be fixed properly.

Discovered while working on:	PR 246629
Reviewed by:	glebius
MFC after:	4 days
Sponsored by:	Rubicon Communications, LLC (d/b/a "Netgate")
2020-06-17 20:58:37 +00:00
Andrew Turner
9a7053ce96 Clean up the pci host generic driver
- Support Prefetchable Memory.
- Use the correct rman when allocating memory and ioports.
- Translate PCI addresses in bus_alloc_resource to allow physical
  addresses that are different than pci addresses.

Reviewed by:	Robert Crowston <crowston_protonmail.com>
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D25121
2020-06-17 19:56:17 +00:00
Andrew Turner
3a6413d81e Support pmap_extract_and_hold on arm64 stage 2 mappings
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D24469
2020-06-17 19:45:05 +00:00
Alexander Motin
550d5d64fe Fix admin qpair leak if detached during initial reset.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-06-17 17:51:40 +00:00
Alan Somers
eea79fde5a Remove vfs_statfs and vnode_mount macros from NFS
These macro definitions are no longer needed as the NFS OSX port is long
dead.  The vfs_statfs macro conflicts with the vfsops field of the same
name.

Submitted by:	shivank@
Reviewed by:	rmacklem
MFC after:	2 weeks
Sponsored by:	Google, Inc. (GSoC 2020)
Differential Revision:	https://reviews.freebsd.org/D25263
2020-06-17 16:20:19 +00:00
Ruslan Bukin
c9ea007c3b Complete the ACPI support for ARM Coresight:
o Parse the ACPI DSD (Device Specific Data) graph property and record
  device connections.
o Split-out FDT support to a separate file.
o Get the corresponding (FDT/ACPI) Coresight platform data in
  the device drivers.

Sponsored by:	DARPA, AFRL
2020-06-17 15:54:51 +00:00
Michael Tuexen
2d87bacde4 Allow the self reference to be NULL in case the timer was stopped.
Submitted by:		Timo Voelker
MFC after:		1 week
2020-06-17 15:27:45 +00:00
Tom Jones
d88fe3d964 Add header definition for RFC4340, Datagram Congestion Control Protocol
Add a header definition for DCCP as defined in RFC4340. This header definition
is required to perform validation when receiving and forwarding DCCP packets.
We do not currently support DCCP.

Reviewed by:	gallatin, bz
Approved by:	bz (co-mentor)
MFC after:	1 week
MFC with:	350749
Differential Revision:	https://reviews.freebsd.org/D21179
2020-06-17 13:27:13 +00:00
Andrew Turner
f3e9395d0c Add all the TCR_EL1 fields
These will be used when adding support for new Armv8 extensions.

Sponsored by:	Innovate UK
2020-06-17 11:56:10 +00:00
Hans Petter Selasky
11304ef50e Fix HW TLS offload regression issue after r359919, in mlx5en(4).
Changes in the mbuf layout regarding HW TLS, resulted in wrong detection
of starting mbuf. Use a boolean variable to handle this and pass m_adj()
the top mbuf, so that the packet header is adjusted correctly.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-17 11:14:54 +00:00
Hans Petter Selasky
a26df270c9 Allow multicast packets to be received in promiscuous mode, in mlx4en(4).
Make sure we disable the multicast filter in promiscuous mode as well as when
the all multicast flag is set.

MFC after:	1 week
Found by:	Tycho Nightingale <tychon@freebsd.org>
Sponsored by:	Mellanox Technologies
2020-06-17 11:12:10 +00:00
Vladimir Kondratyev
94811094f8 evdev: Add AT translated set1 scancodes for 'Eisu' & 'Kana' keys.
PR:		247292
Submitted by:	Yuichiro NAITO <naito.yuichiro@gmail.com>
MFC after:	1 week
2020-06-17 08:35:35 +00:00
Conrad Meyer
a116b5d3e4 vm: Drop vm_map_clip_{start,end} macro wrappers
No functional change.

Reviewed by:	dougm, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25282
2020-06-16 22:53:56 +00:00
Ryan Moeller
33b39b6615 Apply default security flavor in vfs_export
There may be some version of mountd out there that does not supply a default
security flavor when none is given for an export.

Set the default security flavor in vfs_export if none is given, and remove the
workaround for oexport compat.

Reported by:	npn
Reviewed by:	rmacklem
Approved by:	mav (mentor)
MFC after:	3 days
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25300
2020-06-16 21:30:30 +00:00
Randall Stewart
95ef69c63c So in doing final checks on OCA firmware with all the latest tweaks the dup-ack checking
packet drill script was failing with a number of unexpected acks. So it turns
out if you have the default recvwin set up to 1Meg (like OCA's do) and you
have no window scaling (like the dupack checking code) then we have another
case where we are always trying to update the rwnd and sending an
ack when we should not.

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D25298
2020-06-16 18:16:45 +00:00
Simon J. Gerraty
73845fdbd3 Make KENV_MVALLEN tunable
When doing secure boot, loader wants to export loader.ve.hashed
the value of which typically exceeds KENV_MVALLEN.

Replace use of KENV_MVALLEN with tunable kenv_mvallen.

Add getenv_string_buffer() for the case where a stack buffer cannot be
created and use uma_zone_t kenv_zone for suitably sized buffers.

Reviewed by:	stevek, kevans
Obtained from:	Abhishek Kulkarni <abkulkarni@juniper.net>
MFC after:	1 week
Sponsored by:	Juniper Networks
Differential Revision: https://reviews.freebsd.org//D25259
2020-06-16 17:02:56 +00:00
Randall Stewart
4d418f8da8 So it turns out rack has a shortcoming in dup-ack counting. It counts the dupacks but
then does not properly respond to them. This is because a few bits are missing.
BBR actually does properly respond (though it also sends a TLP which is interesting and
maybe something to fix).

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D25294
2020-06-16 12:26:23 +00:00
Rick Macklem
2ed5e42378 Expose UID_xxx and GID_xxx definitions to userspace.
This patch moves the UID_xxx and GID_xxx definitions out of the
#ifdef _KERNEL section, so that userspace programs like mountd
can use them.
There are a couple of userspace programs that do define UID_ROOT,
but they do not include sys/conf.h.  Since they are defined as
the same value, maybe they should be changed to include sys/conf.h.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D25281
2020-06-16 02:31:22 +00:00
Adrian Chadd
209be66e26 [rsu] Update wme ie API use.
Whoops, forgot to land this one too!
2020-06-16 01:11:40 +00:00
Adrian Chadd
bac852bbac [net80211] Add missing commit to previous-1 uapsd commit.
Whoops; somehow my big commit line didn't include this..  cue the tree breakage emails.
2020-06-16 00:28:45 +00:00
Adrian Chadd
8379e8db7a [net80211] Add initial U-APSD negotiation support.
U-APSD (unscheduled automatic power save delivery) is a power save method
that's a bit better than legacy PS-POLL - stations can mark frames with
an extra flag that tells the AP to leak out more frames after it sends
its own frames rather than needing to send a PS-POLL to get another frame
from the AP.

Now, this code just handles the negotiation bits; it doesn't actually
implement U-APSD.  That's up to drivers, and nothing in the tree yet
implements this.  I /may/ implement this for ath(4) if I eventually care
enough but right now I plan on just implementing it for firmware offload
based NICs that handle this in the NIC.

I'll commit the ifconfig bit after this and I may have some follow-up
commits as this gets used more by me in local testing.

This should be a glorious no-op for everyone else.  If things change
for anyone that isn't fixed by a complete recompile then please reach out
to me.
2020-06-16 00:27:32 +00:00
Edward Tomasz Napierala
3d8dd98381 Make Linux uname(2) return x86_64 to 32-bit apps. This helps Steam.
PR:		kern/240432
Analyzed by:	Alex S <iwtcex@gmail.com>
Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25248
2020-06-15 20:12:10 +00:00
Vincenzo Maffione
ef6fdb3312 if_vtnet: let vtnet_rx_vq_intr() and vtnet_rxq_tq_intr() share code
Since the two functions are similar, introduce a common function
(vtnet_rx_vq_process()) to share common code.
This also improves locking, by ensuring vrxs_rescheduled is accessed
under the RXQ lock, and taskqueue_enqueue() is not called under the
lock (therefore avoiding a spurious duplicate lock warning).

Reported by:	jrtc27
MFC after:	2 weeks
2020-06-15 19:46:34 +00:00
John Baldwin
ad54157b5e Simplify MACHINE_ARCH to be a single string.
Big endian and armv4 mean that we are now down to only two supported
variants.  A future change will use MACHINE_ARCH in assembly which
does not support C-style string concatenation and thus needs
MACHINE_ARCH defined as a single string.
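
As a small illustration of the point above: C joins adjacent string literals at compile time, so a macro built from two literals behaves like one string in C code, while an assembler would see two separate tokens. The macro names and values below are hypothetical and are not the actual MACHINE_ARCH definitions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macros, for illustration only. */
#define	ARCH_BASE	"example"
#define	ARCH_SUFFIX	"64"

/* C joins adjacent string literals at compile time ... */
#define	ARCH_CONCAT	ARCH_BASE ARCH_SUFFIX
/* ... but other consumers see two tokens, so a single literal is safer. */
#define	ARCH_SINGLE	"example64"

/* Return 1 if both spellings yield the same string in C. */
static int
arch_strings_match(void)
{
	return (strcmp(ARCH_CONCAT, ARCH_SINGLE) == 0);
}
```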

Reviewed by:	imp
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25211
2020-06-15 18:57:43 +00:00
Ryan Moeller
cbb9ccf735 Avoid trying to toggle TSO twice
Remove TSO from the toggle mask when automatically disabled by TXCKSUM* in
various NIC drivers.
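
The double toggle can be sketched roughly as below. This is a simplified model under stated assumptions, not code from any driver: `CAP_TXCSUM`, `CAP_TSO` and `apply_caps()` are hypothetical names standing in for the real capability flags and ioctl handling.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability bits; real drivers use their own flag values. */
#define	CAP_TXCSUM	0x1u
#define	CAP_TSO		0x2u

/*
 * Apply a capability change request.  'enabled' is the current set and
 * 'requested' is the desired set; the return value is the new set.
 */
static uint32_t
apply_caps(uint32_t enabled, uint32_t requested)
{
	uint32_t mask = enabled ^ requested;	/* bits asked to toggle */

	if (mask & CAP_TXCSUM) {
		enabled ^= CAP_TXCSUM;
		/* TSO depends on TX checksum offload: drop it as well. */
		if ((enabled & CAP_TXCSUM) == 0 && (enabled & CAP_TSO) != 0) {
			enabled &= ~CAP_TSO;
			mask &= ~CAP_TSO;	/* handled: do not toggle again */
		}
	}
	if (mask & CAP_TSO)
		enabled ^= CAP_TSO;
	return (enabled);
}
```

Without the `mask &= ~CAP_TSO` line, a request that turns off both features would disable TSO in the first branch and then flip it back on in the second.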

Reviewed by:	hselasky, np, gallatin, jpaetzel
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25120
2020-06-15 16:35:27 +00:00
Takanori Watanabe
ccb9fc3218 Update event masks constant to Bluetooth core spec V5.2
and add LE Events.

PR: 247257
Submitted by:	Marc Veldman
2020-06-15 14:58:40 +00:00
Jessica Clarke
576b099a5f vtnet: Fix regression introduced in r361944
For legacy devices that don't support MrgRxBuf (such as bhyve pre-r358180),
r361944 failed to update the receive handler to account for the additional
padding introduced by the unused num_buffers field that is now always present
in struct vtnet_rx_header. Thus, calculate the padding dynamically based on
vtnet_hdr_size.

PR:		247242
Reported by:	thj
Tested by:	thj
2020-06-14 22:39:34 +00:00
Vincenzo Maffione
0a182b4c63 iflib: netmap: enter/exit netmap mode after device stops
Avoid possible race conditions by calling nm_set_native_flags()
and nm_clear_native_flags() only after the device has been
stopped.

MFC after:	1 week
2020-06-14 21:07:12 +00:00
Vincenzo Maffione
16f224b5f8 netmap: vtnet: fix races in vtnet_netmap_reg()
The nm_register callback needs to call nm_set_native_flags()
or nm_clear_native_flags() once the device has been stopped.
However, in the current implementation this is not true,
as the device is stopped by vtnet_init_locked(). This causes
race conditions where the driver crashes as soon as it
dequeues netmap buffers assuming they are mbufs (or the other
way around).
To fix the issue, we extend vtnet_init_locked() with a second
argument that, if not zero, will set/clear the netmap flags.
This results in a huge simplification of the nm_register
callback itself.
Also, use netmap_reset() to check if a ring is going to be
re-initialized in netmap mode.

MFC after:	1 week
2020-06-14 20:47:31 +00:00
Brandon Bergren
a4ec123c56 [PowerPC] Fix scc z8530 driver
Parts of the z8530 driver were still using the SUN channel spacing.

This was invalid on PowerMac and QEMU, where the attachment was to escc,
not escc-legacy. This means the driver has apparently NEVER worked properly
on Macintosh hardware.

Add documentation for the channel spacing details, and change to using
driver-specific initialization instead of hardcoded spacing so either
spacing can be used.

Fixes boot hang in QEMU when using the serial console, and fixes use on
Xserve serial (and presumably PowerMacs that have a Stealth Serial port
or similar)

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D24661
2020-06-14 16:47:16 +00:00
Michael Tuexen
b231bff8b2 Allocate the mbuf for the signature in the COOKIE of the correct size.
While there, also do some cleanups.

MFC after:		1 week
2020-06-14 16:05:08 +00:00
Edward Tomasz Napierala
889cd28520 Make linux(4) warn about unsupported CMSG level/type.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25255
2020-06-14 14:38:40 +00:00
Doug Rabson
3900c11481 Add support for the timecreate attribute
This maps to the va_birthtime VFS attribute.
2020-06-14 11:41:57 +00:00
Michael Tuexen
4471043177 Cleanups, no functional change.
MFC after:		1 week
2020-06-14 09:50:00 +00:00
Toomas Soome
e7fd9688ea Move font related data structures to sys/font.c and update vtfontcvt
Prepare support to be able to handle font data in loader, consolidate
data structures to sys/font.h and update vtfontcvt.

The vtfontcvt update adds output of a set of glyphs in the form of C source;
the implementation allows output of compressed or uncompressed font
bitmaps.

Reviewed by:	bcr
Differential Revision:	https://reviews.freebsd.org/D24189
2020-06-14 06:58:58 +00:00
Rick Macklem
9d6fc9963e Oops, r362158 committed a duplicate definition of MAXSECFLAVORS.
This patch gets rid of the duplicate.
2020-06-14 01:22:19 +00:00
Adrian Chadd
e9efad4f9e [net80211] Treat frames without an rx status as not a decap'ed A-MSDU.
Drivers for NICs which do A-MSDU decap in hardware / driver will need to
set the rx status, so if it's missing then treat it as not a decap'ed
A-MSDU.
2020-06-14 00:23:06 +00:00
Adrian Chadd
1209ded2e1 [net80211] Also convert the ddb path
Whoops - this belonged in my previous commit.
2020-06-14 00:21:48 +00:00
Rick Macklem
3fa08158f7 Version bump for r362158, since the arguments for vfs_checkexp() changed.
2020-06-14 00:12:29 +00:00
Rick Macklem
1f7104d720 Fix export_args ex_flags field so that it is 64bits, the same as mnt_flags.
Since mnt_flags was upgraded to 64bits there has been a quirk in
"struct export_args", since it holds a copy of mnt_flags
in ex_flags, which is an "int" (32bits).
This happens to currently work, since all the flag bits used in ex_flags are
defined in the low order 32bits. However, new export flags cannot be defined.
Also, ex_anon is a "struct xucred", which limits it to 16 additional groups.
This patch revises "struct export_args" to make ex_flags 64bits and replaces
ex_anon with ex_uid, ex_ngroups and ex_groups (which points to a
groups list, so it can be malloc'd up to NGROUPS in size).
This requires that the VFS_CHECKEXP() arguments change, so I also modified the
last "secflavors" argument to be an array pointer, so that the
secflavors could be copied in VFS_CHECKEXP() while the export entry is locked.
(Without this patch VFS_CHECKEXP() returns a pointer to the secflavors
array and then it is used after being unlocked, which is potentially
a problem if the exports entry is changed.
In practice this does not occur when mountd is run with "-S",
but I think it is worth fixing.)
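
The ex_flags width problem described above can be sketched as follows. The structs and flag values are hypothetical and much smaller than the real "struct export_args"; the sketch uses `uint32_t` rather than `int` for the narrow field so that the truncation is well defined in portable C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define	FLAG_LOW	(UINT64_C(1) << 3)	/* fits in 32 bits */
#define	FLAG_HIGH	(UINT64_C(1) << 40)	/* needs 64 bits */

/* Old-style layout: the copy of the flags is only 32 bits wide. */
struct args32 { uint32_t ex_flags; };
/* New-style layout: the copy is as wide as the source. */
struct args64 { uint64_t ex_flags; };

/* Copy flags through the narrow field; bits above 31 are lost. */
static uint64_t
roundtrip32(uint64_t flags)
{
	struct args32 a = { .ex_flags = (uint32_t)flags };

	return (a.ex_flags);
}

/* Copy flags through the wide field; every bit survives. */
static uint64_t
roundtrip64(uint64_t flags)
{
	struct args64 a = { .ex_flags = flags };

	return (a.ex_flags);
}
```

This matches the observation in the message: the narrow copy works as long as every flag in use sits in the low 32 bits, and fails silently as soon as a higher bit is defined.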

This patch also deleted the vfs_oexport_conv() function, since
do_mount_update() does the conversion, as required by the old vfs_cmount()
calls.

Reviewed by:	kib, freqlabs
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D25088
2020-06-14 00:10:18 +00:00
Adrian Chadd
e81d909274 [net80211] Handle offloaded AMSDU in AMPDU reordering.
In the 11n world, most NICs did A-MPDU receive/transmit offloading but
not A-MSDU offloading.  So, the net80211 A-MPDU receive path would just
receive MPDUs, do the reordering bit, pass it up to the rest of
net80211 for crypto decap and then do A-MSDU decap before throwing ethernet
frames up to the rest of the system.

However 11ac and 11ax NICs are increasingly doing A-MSDU offload (and
newer 11ax stuff does socket offload, but hey I don't want to scare people
JUST yet) - so although A-MPDU reordering may be done in the OS, A-MSDUs
look like a normal MPDU.  This means that all the MSDUs are actually
faked into a set of MPDUs with matching 802.11 header - the sequence number,
QoS header and any encryption verification bits (like IV) are just copied.

This shows up as MASSIVE packet loss in net80211, cause after the first MPDU
we just toss the rest.

(And don't get me started about ethernet decap with A-MPDU host reordering;
we'll have to cross that bridge for later 11ac and 11ax bits too.)

Anyway, this work changes each A-MPDU reorder slot into an mbufq.
The mbufq is treated as a whole set of frames to pass up to the stack
and reordered/de-duped as a group.  The last frame in the reorder list
is checked to see if it's an A-MSDU final frame so any duplicates are
correctly tossed rather than double-received.  Other than that, the
rest of the logic is unchanged.

The previous commit did a small subset of this - if there wasn't any reordering
going on then it'd accept the A-MSDUs.  This is the rest of the needed work.

This is a no-op for 11n NICs doing A-MPDU reordering but needing software
A-MSDU decap - they aren't tagged as A-MSDU and so any subsequent
frames added to the reorder slot are tossed.

Tested:

* QCA9880 (ath10k/athp) - STA/AP mode;
* RT3593 (if_rsu) - 11n STA+DWDS mode (I'm committing through it rn);
* QCA9380 (if_ath) - STA/AP mode.
2020-06-13 23:35:22 +00:00
Adrian Chadd
ea3d5fd9df [net80211] separate out node allocation and node initialisation.
This is a new, optional (for now!) method that drivers can use to separate
node allocation and node initialisation.  Right now they're the same, and
drivers that need to do node allocation via firmware commands need to sleep
and thus they need to defer node allocation into an internal taskqueue.

Right now they're just separate but not deferred.  Later on if I get the time
we'll start deferring the node and key related operations but that requires
making a bunch of other stuff (notably things that generate frames!) also
async/deferred.

Tested:

* RT3593, STA/DWDS mode
* AR9380, STA/AP modes
* QCA9880 (athp) - STA/AP modes
2020-06-13 22:20:02 +00:00
Michael Tuexen
d60bdf8569 Remove usage of empty macro.
MFC after:		1 week
2020-06-13 21:23:26 +00:00
Michael Tuexen
64c8fc5de8 Simplify a condition, no functional change.
MFC after:		1 week
2020-06-13 18:38:59 +00:00
Conrad Meyer
8bc0d2b855 Fix !DEBUGNET build after r362138
X-MFC-With:	r362138
2020-06-13 03:16:09 +00:00
Conrad Meyer
508a6e84e7 Flip kern.tty_info_kstacks on by default
It's a useful debug aid for anyone using Ctrl-T today, and doesn't seem to be
widely known.  So, enable it out of the box to help people find it.

It's a tunable and sysctl, so if you don't like it, it's easy to disable
locally.

If people really hate it, we can always flip it back.

Reported by:	Daniel O'Connor
2020-06-13 03:04:40 +00:00
Doug Moore
9f1041dc2e Linuxkpi uses the rb-tree structures without using their interfaces,
making them break when the representation changes. Revert changes that
eliminated the color field from rb-trees, leaving everything as it was
before.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D25250
2020-06-13 01:54:09 +00:00
Conrad Meyer
479ab044c1 net80211: Add framework for debugnet(4) support
Allow net80211 drivers to register a small vtable of debugnet-related
methods.

This is not a functional change.  Driver support is needed, similar to
debugnet(4) for wired NICs.

Reviewed by:	adrian, markj (earlier version both)
Differential Revision:	https://reviews.freebsd.org/D17308
2020-06-13 00:59:36 +00:00
John Baldwin
d93010c598 Allow <sys/elf_common.h> to be used in assembly.
Hide C-only declarations under #ifndef LOCORE.  This will be used by
future changes to define ELF notes in assembly.

Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25211
2020-06-12 23:43:44 +00:00
John Baldwin
4f3c25bce0 Allow <sys/param.h> to be included from userland assembly files.
This will be used by future changes to define ELF notes in assembly.

Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25211
2020-06-12 23:42:36 +00:00
John Baldwin
26d292d3e2 Various optimizations to software AES-CCM and AES-GCM.
- Make use of cursors to avoid data copies for AES-CCM and AES-GCM.

  Pass pointers into the request's input and/or output buffers
  directly to the Update, encrypt, and decrypt hooks rather than
  always copying all data into a temporary block buffer on the stack.

- Move handling for partial final blocks out of the main loop.

  This removes branches from the main loop and permits using
  encrypt/decrypt_last which avoids a memset to clear the rest of the
  block on the stack.

- Shrink the on-stack buffers to assume AES block sizes and CCM/GCM
  tag lengths.

- For AAD data, pass larger chunks to axf->Update.  CCM can take each
  AAD segment in a single call.  GMAC can take multiple blocks at a
  time.

Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25058
2020-06-12 23:10:30 +00:00
John Baldwin
4e6a381306 Fix a regression in r361804 for TLS 1.3.
I was not including the record type stored in the first byte of the
trailer as part of the payload to be encrypted and hashed.

Sponsored by:	Netflix
2020-06-12 22:27:26 +00:00
Konstantin Belousov
17edf152e5 Control for Special Register Buffer Data Sampling mitigation.
New microcode update for Intel enables mitigation for SRBDS, which
slows down RDSEED and related instructions.  The update also provides
a control to limit the mitigation to SGX enclaves, which should
restore the speed of the random generator at the cost of potential
cross-core buffer sampling.

See https://software.intel.com/security-software-guidance/insights/deep-dive-special-register-buffer-data-sampling

Give the user control over it.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25221
2020-06-12 22:14:45 +00:00
Konstantin Belousov
958d257ed5 x86: add bits definitions for SRBDS mitigation control.
See https://software.intel.com/security-software-guidance/insights/deep-dive-special-register-buffer-data-sampling

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25221
2020-06-12 22:12:57 +00:00
Eric van Gyzen
8cc8c5864a Honor db_pager_quit in some vm_object ddb commands
These can be rather verbose.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2020-06-12 21:53:08 +00:00
Simon J. Gerraty
66d8bce379 mac_veriexec_fingerprint_check_vnode: v_writecount > 0 means active writers
v_writecount can actually be < 0 for text,
so check for v_writecount > 0
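
A minimal sketch of the distinction, assuming the earlier test was a plain non-zero check (the message does not say what the old condition was). Both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether a file has active writers from a
 * signed counter that may go negative.
 */
static bool
has_active_writers(int writecount)
{
	return (writecount > 0);
}

/* A weaker test that also treats a negative count as "being written". */
static bool
has_active_writers_nonzero(int writecount)
{
	return (writecount != 0);
}
```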

Reviewed by:	stevek
MFC after:	1 week
2020-06-12 21:51:20 +00:00
John Baldwin
b0b2161ce4 Fix AES-CCM requests with an AAD size smaller than a single block.
The amount to copy for the first block is the minimum of the size of
the AAD region or the remaining space in the first block.
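
The rule stated above amounts to a clamped length computation. The helper below is a hypothetical sketch, not the committed code; `used` models whatever already occupies the start of the first block.

```c
#include <assert.h>
#include <stddef.h>

#define	BLOCK_LEN	16	/* AES block size in bytes */

/*
 * Bytes of AAD to copy into the first block when 'used' bytes of that
 * block are already occupied.
 */
static size_t
first_block_copy_len(size_t aad_len, size_t used)
{
	size_t space;

	assert(used <= BLOCK_LEN);
	space = BLOCK_LEN - used;
	return (aad_len < space ? aad_len : space);
}
```

Copying `space` bytes unconditionally would read past the end of an AAD region shorter than the remaining space, which is the small-AAD case the fix addresses.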

Reported by:	cryptocheck -z
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25140
2020-06-12 21:33:02 +00:00
John Baldwin
822d2d6ac9 Various fixes to TLS for MIPS.
- Clear the current thread's TLS pointer on exec. Previously the TLS
  pointer (and register) remained unchanged.

- Explicitly clear the TLS pointer when new threads are created.

- Make md_tls_tcb_offset per-process instead of per-thread.

  The layout of the TLS and TCB are identical for all threads in a
  process, it is only the TLS pointer values themselves that vary by
  thread.  This also makes setting md_tls_tcb_offset in
  cpu_set_user_tls() redundant with the setting in exec_setregs(), so
  only set it in exec_setregs().

Submitted by:	Alfredo Mazzinghi (1)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24957
2020-06-12 21:21:18 +00:00
Eric van Gyzen
6fba90f201 FPU init: allocate initial state from UMA to ensure alignment
The Intel Instruction Set Reference says this about the XSAVE instruction:

    Use of a destination operand not aligned to 64-byte boundary
    (in either 64-bit or 32-bit modes) results in a general-protection
    (#GP) exception.

This alignment happens naturally when all malloc buckets are powers
of two.  However, this change is necessary on some systems when
certain non-power-of-two (and non-multiple of 64) malloc buckets
are defined.
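
The alignment argument can be checked with a little address arithmetic. The sketch below models a hypothetical slab that packs fixed-size items back to back from an aligned base; it is not how malloc(9) or UMA is implemented, and the item sizes used in the example are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define	XSAVE_ALIGN	64	/* required alignment in bytes */

/* True if 'addr' is a multiple of XSAVE_ALIGN. */
static bool
is_xsave_aligned(uintptr_t addr)
{
	return ((addr & (XSAVE_ALIGN - 1)) == 0);
}

/*
 * Address of item 'idx' in a hypothetical slab that starts at 'base' and
 * packs fixed-size items back to back.
 */
static uintptr_t
slab_item_addr(uintptr_t base, uintptr_t item_size, uintptr_t idx)
{
	return (base + item_size * idx);
}
```

With a page-aligned base, every item of size 1024 lands on a 64-byte boundary, while items of size 1056 (neither a power of two nor a multiple of 64) alternate between aligned and misaligned addresses.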

Reviewed by:	cem; kib; earlier version by jhb
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25098
2020-06-12 21:17:56 +00:00
Eric van Gyzen
701acc2fd8 FPU: make xsave_area_desc static
...because it can be.

Reviewed by:	cem kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25098
2020-06-12 21:12:26 +00:00
Eric van Gyzen
674cbe7908 FPU init: Do potentially blocking operations before disabling interrupts
In particular, uma_zcreate creates sysctl oids, which locks an sx lock,
which uses IPIs under contention.  IPIs tend not to work very well
when interrupts are disabled.  Who knew, right?

Reviewed by:	cem kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25098
2020-06-12 21:10:45 +00:00
Randall Stewart
f092a3c71c So it turns out with the right window scaling you can get the code in all stacks to
always want to do a window update, even when no data can be sent. Now in
cases where you are not pacing that's probably ok, you just send an extra
window update or two. However with bbr (and rack if it's paced) every time
the pacer goes off it's going to send a "window update".

Also in testing bbr I have found that if we are not responding to
data right away we end up staying in startup but incorrectly holding
a pacing gain of 192 (a loss). This is because the idle window code
does not restrict itself to only work with PROBE_BW. In all other
states you don't want it doing a PROBE_BW state change.

Sponsored by:	Netflix Inc.
Differential Revision: 	https://reviews.freebsd.org/D25247
2020-06-12 19:56:19 +00:00
Andrew Gallatin
6da16e3eb0 x86: Bump default msi/msix vector limit to 2048
Given that 64c/128t CPUs are currently available, and that many
devices (nvme, many NICs) desire to map 1 MSI-X vector per core,
or even 1 per-thread, it is becoming far easier to see MSI-X interrupt
setup fail due to msi vector exhaustion, and devices fail to attach at
boot on large systems.

This bump costs 12KB on amd64 (and 6KB on i386), which seems
worth the trade off for a better out of the box experience on
high end hardware.
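
The quoted sizes are consistent with one pointer-sized slot per vector and a previous limit of 512. Both of those are assumptions made for this sketch; the message itself states neither. `extra_table_bytes()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Extra bytes needed when a table with one pointer-sized slot per vector
 * grows from old_limit to new_limit entries.
 */
static size_t
extra_table_bytes(size_t old_limit, size_t new_limit, size_t ptr_size)
{
	assert(new_limit >= old_limit);
	return ((new_limit - old_limit) * ptr_size);
}
```

Under those assumptions, growing from 512 to 2048 entries adds 1536 slots: 12KB with 8-byte pointers and 6KB with 4-byte pointers.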

Reviewed by:	jhb
MFC after:	21 days
Sponsored by:	Netflix
2020-06-12 18:41:12 +00:00
Doug Moore
13dca1937f Revert r362108, as it breaks compilation.
2020-06-12 17:48:12 +00:00
Ruslan Bukin
72842e4697 Coresight replicator:
o Add a header file;
o Split-out FDT attachment to a separate file;
o Add ACPI attachment.

Sponsored by:	DARPA, AFRL
2020-06-12 17:31:38 +00:00
Doug Moore
3159ceca97 The linuxkpi code accesses left/right rb tree pointers without using
RB_LEFT or RB_RIGHT, so they aren't stripping off the color bit
encoded there. Strip off that bit for linuxkpi.
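
The hazard of reading such a field directly can be sketched with a generic tagged pointer. The node layout, `COLOR_BIT`, `set_left()` and `get_left()` are hypothetical and are not the tree(3) macros or the linuxkpi code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical node; the low bit of 'left' doubles as a color flag. */
struct node {
	uintptr_t left;		/* child pointer with color in bit 0 */
	int key;
};

#define	COLOR_BIT	((uintptr_t)1)

/* Store a child pointer together with a color flag. */
static void
set_left(struct node *n, struct node *child, int red)
{
	assert(((uintptr_t)child & COLOR_BIT) == 0);	/* needs aligned nodes */
	n->left = (uintptr_t)child | (red ? COLOR_BIT : 0);
}

/* Accessor that strips the color bit before using the pointer. */
static struct node *
get_left(const struct node *n)
{
	return ((struct node *)(n->left & ~COLOR_BIT));
}
```

Code that reads `left` without going through the accessor gets an address that is off by one whenever the flag is set, which is why bypassing the interface breaks when the representation changes.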

Reported by:	dch
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D25245
2020-06-12 16:51:55 +00:00
Michael Tuexen
3ee11586b2 Whitespace change due to upstream cleanup.
MFC after:		1 week
2020-06-12 16:40:10 +00:00
Michael Tuexen
2f9e6db0be More cleanups due to ifdef cleanup done upstream
MFC after:		1 week
2020-06-12 16:31:13 +00:00
Edward Tomasz Napierala
462171d9aa Add compat.linux.debug sysctl, to make it possible to silence
the debug messages. While here, clean up some variable naming.

Reviewed by:	bcr (manpages), emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25230
2020-06-12 14:37:50 +00:00
Edward Tomasz Napierala
599dadca55 Fix naming clash.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-12 14:31:19 +00:00
Edward Tomasz Napierala
34ff0c0e6a Make linux(4) warn about unsupported fcntls.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25231
2020-06-12 14:25:32 +00:00
Edward Tomasz Napierala
4beacc3b1d Minor code cleanup; no functional changes.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25232
2020-06-12 14:23:10 +00:00
Alexander Motin
92390644e3 Fix config_intrhook leak on initial reset failure.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-06-12 14:14:01 +00:00
Ruslan Bukin
a132ec9f8a ARM Coresight Trace Memory Controller (TMC):
o Split-out FDT attachment to a separate file;
o Add ACPI attachment.

Sponsored by:	DARPA, AFRL
2020-06-12 13:59:58 +00:00
Andrew Turner
400c0119a7 Teach the arm64 vfp.h about struct thread.
Ensure struct thread is defined in vfp.h. In some cases it is not and stops
the kernel from building.

Sponsored by:	Innovate UK
2020-06-12 10:43:21 +00:00
Michael Tuexen
306c2ba375 Small cleanup due to upstream ifdef cleanups.
MFC after:		1 week
2020-06-12 10:13:23 +00:00
Adrian Chadd
a67acf111f [net80211] First part of A-MSDU offload handling - don't bump A-MPDU reordering seqno
When doing A-MSDU offload handling the driver is required to mark
A-MSDUs from the same MPDU with the same sequence number.
It then tags them as AMSDU (if it's a decap'ed A-MSDU) and AMSDU_MORE
(saying there's more AMSDUs decapped in the same MSDU.)
This allows encryption and sequence number offload to work right.

In the A-MSDU path the sequence number check looks at the A-MSDU flags
in the frame to see whether it's part of the same seqno and will pass them
(ie, not increment rx_seq until the last A-MSDU is seen from the driver,
or a new seqno shows up.)

However, I did this work in the A-MSDU path but not the A-MSDU in A-MPDU path.
For the non A-MSDU offload case the A-MPDU receive reordering will do its
thing and then pass the MPDU up for decap - which then will see it's
an A-MSDU and decap each sub-frame.  But this isn't done for offloaded
A-MSDU frames.

This requires two parts:

* Don't bump the RX sequence number, same as above; and
* If frames go into the reordering buffer, they need to be added into the slot
  as a set of frames rather than a single frame, so once a new seqno shows up
  this slot can be marked as "full" and we can move on.

This patch does the first.  The latter requires that I find and commit
work to change rxa_m from an mbuf to an mbufq and then handle A-MSDU
there.  But, the first is enough to allow the normal case (ie, no or not
a lot of A-MPDU RX reordering) to work.

This allows the athp driver (QCA9880) throughput to go from VERY low
(like 5mbit TCP, 1/3-1/4 expected UDP throughput) to ~ 250mbit TCP
and > 300mbit UDP on a VHT/40 channel.  TCP sucks because, well, it
shows up as MASSIVE packet loss when all but one frame in a decap'ed
A-MSDU stream is dropped. Le whoops.

Now, where'd I put that laptop with the patch for rxa_m mbufq that
I wrote like in 2017...

Tested:

* AR9380, STA/AP mode (a big no-op, no A-MSDU hardware decap);
* if_run (RT3593), STA DWDS mode (A-MPDU / A-MSDU receive, but again
  no A-MSDU hardware decap);
* QCA9880, STA/AP mode (which is doing hardware A-MPDU/A-MSDU decap,
  but no A-MPDU reordering in the firmware.)
2020-06-12 04:19:03 +00:00
Ravi Pokala
2a73c8f5e1 Decode the "LACP Fast Timeout" LAGG option flag
r286700 added the "lacp_fast_timeout" option to `ifconfig', but we forgot to
include the new option in the string used to decode the option bits. Add
"LACP_FAST_TIMO" to LAGG_OPT_BITS.

Also, s/LAGG_OPT_LACP_TIMEOUT/LAGG_OPT_LACP_FAST_TIMO/g , to be clearer that
the flag indicates "Fast Timeout" mode.

Reported by:	Greg Foster <gfoster at panasas dot com>
Reviewed by:	jpaetzel
MFC after:	1 week
Sponsored by:	Panasas
Differential Revision:	https://reviews.freebsd.org/D25239
2020-06-11 22:46:08 +00:00
Ruslan Bukin
d06110e566 Shorten the filename of the coresight replicator driver.
Sponsored by:	DARPA, AFRL
2020-06-11 21:52:06 +00:00
Vincenzo Maffione
6682323732 netmap: introduce netmap_kring_on()
This function returns the kring identified by queue id and
direction if that ring is in netmap mode. Otherwise it
returns NULL.
Use this function to replace vtnet_netmap_queue_on().

MFC after:	1 week
2020-06-11 20:35:28 +00:00
Konstantin Belousov
e09fb42a9a Correct comment (this should have been committed with r362065).
Sponsored by:	The FreeBSD Foundation
MFC after:	13 days
2020-06-11 20:26:39 +00:00
Konstantin Belousov
e7a291f418 Restore TLB invalidations done before smp started.
In particular, invalidation of the preloaded modules text to allow
execution from it was broken after D25188/r362031.

Reviewed by:	markj
Reported by:	delphij, dhw
Sponsored by:	The FreeBSD Foundation
MFC after:	13 days
2020-06-11 17:25:20 +00:00
Eric Joyner
104d75a051 em(4): Always reinit interface when adding/removing VLAN
This partially reverts r361053 since there have been reports
by users that this breaks some functionality for em(4)
devices; it seems at first glance that some sort of interface
restart is required for those cards.

This isn't a proper fix; this unbreaks those users until a proper
fix is found for their issues.

PR:		240818
Reported by:	Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after:	3 days
2020-06-11 15:59:49 +00:00
Edward Tomasz Napierala
86e794eb65 Don't use newlines with linux_msg(). No functional changes.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-11 14:57:30 +00:00
Hans Petter Selasky
9c847ffd74 Add missing range checks when receiving USB ethernet packets.
Found by:	Ilja Van Sprundel, IOActive
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2020-06-11 14:31:51 +00:00
Edward Tomasz Napierala
bc8e281082 Replace LINUX_FASYNC with LINUX_O_ASYNC; no functional changes.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25218
2020-06-11 14:09:43 +00:00
Michael Tuexen
28397ac1ed Non-functional changes due to upstream cleanup.
MFC after:		1 week
2020-06-11 13:34:09 +00:00
Michal Meloun
3e13ea16a6 Fix grabbing of tegra uart.
An attempt to write to FCR register may corrupt transmit FIFO,
so we should wait for the FIFO to be empty before we can modify it.

MFC after:	1 week
2020-06-11 12:53:22 +00:00
Edward Tomasz Napierala
433d61a573 Improve the warnings.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-11 12:35:00 +00:00
Edward Tomasz Napierala
3bc69ad9b3 Make linux(4) handle SO_REUSEPORT.
Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25216
2020-06-11 12:25:49 +00:00
Andriy Gapon
04dc03e0fe fix up r362047: a call to zvol_*_minors() was not hidden from userland
Reported by:	CI/FreeBSD-head-powerpc64-build
MFC after:	5 weeks
X-MFC with:	r362047
2020-06-11 11:35:30 +00:00
Andriy Gapon
f51f07e1ec rework how ZVOLs are updated in response to DSL operations
With this change all ZVOL updates are initiated from the SPA sync
context instead of a mix of the sync and open contexts.  The updates are
queued to be applied by a dedicated thread in the original order.  This
should ensure that ZVOLs always accurately reflect the corresponding
datasets.  ZFS ioctl operations wait on the mentioned thread to complete
its work.  Thus, the illusion of the synchronous ZVOL update is
preserved.  At the same time, the SPA sync thread never blocks on ZVOL
related operations avoiding problems like reported in bug 203864.

This change is based on earlier work in the same direction: D7179 and
D14669 by Anthoine Bourgeois.  D7179 tried to perform ZVOL operations
in the open context and that opened races between them.  D14669 uses a
design very similar to this change but with different implementation
details.

This change also heavily borrows from similar code in ZoL, but there are
many differences too.  See:
- a0bd735adb
- https://github.com/zfsonlinux/zfs/issues/3681
- https://github.com/zfsonlinux/zfs/issues/2217

PR:		203864
MFC after:	5 weeks
Sponsored by:	CyberSecure
Differential Revision: https://reviews.freebsd.org/D23478
2020-06-11 10:41:31 +00:00
Hans Petter Selasky
6fe9e470bb Make sure packets generated by raw IP code are let through by mlx5en(4).
Allow the TCP header to reside in the mbuf following the IP header.
Else such packets will get dropped.

Backtrace:
mlx5e_sq_xmit()
mlx5e_xmit()
ether_output_frame()
ether_output()
ip_output_send()
ip_output()
rip_output()
sosend_generic()
sosend()
kern_sendit()
sendit()
sys_sendto()
amd64_syscall()
fast_syscall_common()

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-11 09:41:54 +00:00
Hans Petter Selasky
b63b61cc75 Extend use of unlikely() in the fast path, in mlx5en(4).
Typically the TCP/IP headers fit within the first mbuf and should not
trigger any of the error cases. Use unlikely() for these cases.

No functional change.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-11 09:38:51 +00:00
Hans Petter Selasky
9eb1e4aa21 Use const keyword when parsing the TCP/IP header in the fast path in mlx5en(4).
When parsing the TCP/IP header in the fast path, make it clear by using
the const keyword, no fields are to be modified inside the transmitted
packet.

No functional change.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-11 09:36:37 +00:00
Andriy Gapon
4b869dd71d iicbb: rebuild the bit-banging algorithms using different primitives
I2C_SET was quite inflexible, it used too long delays as well as some
unnecessary delays.  The new building blocks are iicbb_clockin and
iicbb_clockout.  The former sets SDA and starts the high period of SCL,
the latter executes the low period of SCL.  What happens during the high
phase depends on the operation.  For writes we just hold both lines, for
reads we poll SDA.  S, Sr and P change SDA in the middle of the high
period.

Also, the calculation of udelay has been updated, so that the resulting
period more closely corresponds to the requested bus frequency.  There is a
new knob, io_delay, that allows further adjustment of udelay based on the
estimated latency of pin toggling operations.

Finally, I slightly changed debug tracing and added error indicators to
it.  The debug prints are compiled in but disabled by default.  This can
be of use if there is any fallout from this change.

Some ideas for further improvements:
- add a function for sub-microsecond delays (e.g., in units of 1/10th of
  a microsecond) and use it for more precise timing of short delays;
- account for the actual time spent in the pin I/O.

Some sample debug output with the new code follows.

Reading temperature and humidity from HTU21 in the bus hold mode:
  <<w80+ we3+ <w81+ .....r6d+ rac+ r94- >>
  <<w80+ we5+ <w81+ .............r47+ re2+ r84- >>
where '<<' is S, '<' is Sr, '>>' is P, '.' is one millisecond of clock
stretching by the slave.

Reading temperature and humidity in the no-hold mode:
  <<w80+ wf3+ >>
  <<w81- >>
  <<w81+ r6d+ r54+ raf- >>
  <<w80+ wf5+ >>
  <<w81- >>
  <<w81+ r48+ r4e+ r9c- >>
where '+' is Ack and '-' is NoAck.
We see that first read attempts are not acknowledged.

MFC after:	4 weeks
Differential Revision: https://reviews.freebsd.org/D22206
2020-06-11 05:34:31 +00:00
Mark Johnston
a03c42bbef Hard-code the ice_ddp firmware version.
Like every other firmware image in the tree, the makefile will need to
be updated to point to the newest import.

Reviewed by:	erj, imp (previous version)
Differential Revision:	https://reviews.freebsd.org/D25222
2020-06-11 00:36:35 +00:00
Mark Johnston
479f70ef24 Fix a couple of nits in Linux sysinfo(2) emulation.
- Use the same definition of free memory as Linux.
- Rename the totalbig and freebig fields to match the corresponding
  names on Linux.

Discussed with:	alc
MFC after:	1 week
2020-06-10 23:52:50 +00:00
Mark Johnston
27e4374dd4 Add a comment reflecting the commit log for r361945.
Suggested by:	alc
Reviewed by:	alc
MFC with:	r361945
2020-06-10 23:52:39 +00:00
Mark Johnston
4f8ad92f36 Remove the FIRMWARE_MAX limit.
The firmware module arbitrarily limits us to at most 50 images.  It is
possible to hit this limit on platforms that preload many firmware
images, or link all of the firmware images for a set of devices into the
kernel.

Convert the table into a linked list, removing the limit.
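
For illustration only, the table-to-list conversion can be sketched in
userland C with the LIST_* macros from <sys/queue.h>.  The names fw_image,
fw_register and fw_lookup are hypothetical and are not the actual firmware(9)
code; the sketch only shows that a list grows with each registration instead
of failing at a fixed bound.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a registered firmware image. */
struct fw_image {
	const char *name;
	LIST_ENTRY(fw_image) link;
};

static LIST_HEAD(, fw_image) fw_list = LIST_HEAD_INITIALIZER(fw_list);

/* Register an image; this fails only if memory allocation fails. */
static int
fw_register(const char *name)
{
	struct fw_image *fp;

	fp = malloc(sizeof(*fp));
	if (fp == NULL)
		return (-1);
	fp->name = name;
	LIST_INSERT_HEAD(&fw_list, fp, link);
	return (0);
}

/* Look an image up by name; returns NULL if it is not registered. */
static struct fw_image *
fw_lookup(const char *name)
{
	struct fw_image *fp;

	LIST_FOREACH(fp, &fw_list, link) {
		if (strcmp(fp->name, name) == 0)
			return (fp);
	}
	return (NULL);
}
```

Lookup becomes a list walk rather than an array scan, which is acceptable
when registrations are rare.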

Reported by:	Steve Wheeler
Reviewed by:	rpokala
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC (Netgate)
Differential Revision:	https://reviews.freebsd.org/D25161
2020-06-10 23:52:29 +00:00
Justin Hibbits
ae672aa5e3 powerpc/pmap: Fix pte_find_next() iterators for booke64 pmap
After r361988 fixed the reference count leak on booke64, it became possible
for an iteration somewhere in the middle of a page to become stale, with the
page vanishing (correctly) due to all PTEs on that page going away.
pte_find_next() would start at that iterator, and move along 'higher' order
directory pages until it finds a valid one, without zeroing out the lower
order pages.  For instance:

	/* Find next pte at or above 0x10002000. */
	pte = pte_find_next(pmap, &(0x10002000));
	pte_remove(pmap, pte);
	/* This pte was the last reference in the page table page, page is
	 * gone.
	 */
	pte = pte_find_next(pmap, 0x10002000);
	/* pte_find_next will see 0x10002000's page is gone, and jump to the
	 * next one, but starting iteration at the '0x2000' slot, skipping
	 * 0x0000 and 0x1000.
	 */

This caused some processes, like git, to trip the KASSERT() in
pmap_release().

Fix this by zeroing all lower order iterators at each level.
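
A toy two-level version of the fix, assuming a much simpler layout than the
real booke64 page directories (toy_pmap, toy_find_next, NDIR and NPTE are all
hypothetical):

```c
#include <stddef.h>

#define	NDIR	4	/* upper-level slots (illustrative) */
#define	NPTE	4	/* lower-level slots per page (illustrative) */

/* Toy two-level table: a NULL dir[] slot means the PTE page is gone. */
struct toy_pmap {
	int *dir[NDIR];		/* each points at NPTE entries; 0 = invalid */
};

/*
 * Return the index (d * NPTE + p) of the first valid entry at or above
 * 'start', or -1 if there is none.  The lower-order iterator is reset to
 * zero whenever the search advances to another upper-level slot.
 */
static int
toy_find_next(const struct toy_pmap *pm, int start)
{
	int d, p;

	d = start / NPTE;
	p = start % NPTE;
	for (; d < NDIR; d++, p = 0) {	/* p = 0 is the zeroing step */
		if (pm->dir[d] == NULL)
			continue;
		for (; p < NPTE; p++) {
			if (pm->dir[d][p] != 0)
				return (d * NPTE + p);
		}
	}
	return (-1);
}
```

Without the "p = 0" reset, a search starting at slot 2 of a vanished page
would resume at slot 2 of the next page and miss its slots 0 and 1.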
2020-06-10 23:03:35 +00:00
Konstantin Belousov
4149c6a3ec Remove double-calls to tc_get_timecount() to warm timecounters.
It seems that second call does not add any useful state change for all
implemented timecounters.

Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2020-06-10 22:30:32 +00:00
Konstantin Belousov
3b23ffe271 amd64 pmap: reorder IPI send and local TLB flush in TLB invalidations.
Right now code first flushes all local TLB entries that need to be
flushed, then signals IPI to remote cores, and then waits for
acknowledgements while spinning idle.  In the VMWare article 'Don’t
shoot down TLB shootdowns!' it was noted that the time spent spinning
is lost, and can be more usefully used doing local TLB invalidation.

We could use the same invalidation handler for local TLB as for
remote, but typically for pmap == curpmap we can use INVLPG for locals
instead of INVPCID on remotes, since we cannot control context
switches on them.  Due to that, keep the local code and provide the
callbacks to be called from smp_targeted_tlb_shootdown() after IPIs
are fired but before spin wait starts.

Reviewed by:	alc, cem, markj, Anton Rang <rang at acm.org>
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D25188
2020-06-10 22:07:57 +00:00
Oleksandr Tymoshenko
da21a623dd Add mode selection to iMX6 IPU driver
- Configure ipu1_di0 to be sourced from the VIDEO_PLL(PLL5) and hardcode
  frequency to (455000000/3) Hz. This value, further divided, can yield
  frequencies close enough to support 1080p, 720p, 1024x768, and 640x480
  modes. This is not ideal but it's an improvement compared to the single
  hardcoded 1024x768 mode.

- Fix memory leaks if attach method failed
- Print EDID when -v passed to the kernel
2020-06-10 22:00:31 +00:00
Oleksandr Tymoshenko
cbc596d6bf Fix reading EDID on TVs/monitors without E-DDC support
Writing the segment id to I2C device 0x30 is only required if the segment is
non-zero. On devices without E-DDC support, writing to that address
fails and the whole transaction then fails too. To avoid this, do
not attempt to write to the segment selection device unless required.
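
The guard amounts to something like the following sketch.  The callback type,
edid_select_segment() and the always-failing fake device are hypothetical and
exist only to make the example self-contained; the real driver issues the
write as part of a larger I2C transfer.

```c
#include <stdint.h>

#define	EDID_SEGMENT_ADDR	0x30	/* segment pointer address, per the text */

/* Hypothetical transfer hook: returns 0 on success, non-zero on failure. */
typedef int (*i2c_write_fn)(uint8_t addr, uint8_t value);

/* Fake device that rejects every write, like a display without 0x30. */
static int
fake_nak_write(uint8_t addr, uint8_t value)
{
	(void)addr;
	(void)value;
	return (-1);
}

/*
 * Select the segment before an EDID read.  Segment 0 is the default, so
 * the write is skipped and devices that do not answer at 0x30 still work.
 */
static int
edid_select_segment(i2c_write_fn wr, uint8_t segment)
{
	if (segment == 0)
		return (0);
	return (wr(EDID_SEGMENT_ADDR, segment));
}
```

With the fake device, selecting segment 0 succeeds without touching the bus,
while a non-zero segment still reports the failure to the caller.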

MFC after:	2 weeks
2020-06-10 21:38:35 +00:00
John Baldwin
9b6b2f8608 Adjust crypto_apply function callbacks for OCF.
- crypto_apply() is only used for reading a buffer to compute a
  digest, so change the data pointer to a const pointer.

- To better match m_apply(), change the data pointer type to void *
  and the length from uint16_t to u_int.  The length field in
  particular matters as none of the apply logic was splitting requests
  larger than UINT16_MAX.

- Adjust the auth_xform Update callback to match the function
  prototype passed to crypto_apply() and crypto_apply_buf().  This
  removes the need for casts when using the Update callback.

- Change the Reinit and Setkey callbacks to also use a u_int length
  instead of uint16_t.

- Update auth transforms for the changes.  While here, use C99
  initializers for auth_hash structures and avoid casts on callbacks.
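
A minimal userland sketch of the callback shape described above: const data
pointer, unsigned int length, no casts at the call site.  The names
toy_digest, toy_update and toy_apply are hypothetical, and a byte sum stands
in for a real hash.

```c
#include <stddef.h>
#include <stdint.h>

/* Callback shape from the commit text: const data, unsigned int length. */
typedef int (*apply_fn)(void *ctx, const void *data, unsigned int len);

/* Hypothetical digest context: a running byte sum stands in for a hash. */
struct toy_digest {
	uint64_t sum;
	uint64_t nbytes;
};

static int
toy_update(void *ctx, const void *data, unsigned int len)
{
	struct toy_digest *d = ctx;
	const unsigned char *p = data;
	unsigned int i;

	for (i = 0; i < len; i++)
		d->sum += p[i];
	d->nbytes += len;
	return (0);
}

/* Feed buf[off, off + len) to the callback in a single call. */
static int
toy_apply(const void *buf, size_t off, unsigned int len, apply_fn f,
    void *ctx)
{
	return (f(ctx, (const unsigned char *)buf + off, len));
}
```

Because the length is an unsigned int, a 70000-byte buffer reaches the
callback intact; a uint16_t length would have truncated it.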

Reviewed by:	cem
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25171
2020-06-10 21:18:19 +00:00
Chuck Tuffli
f14f005113 pci: loosen PCIe hot-plug requirements
The original PCIe hot-plug code required a couple of things which cause
PCI probing errors on the QEMU Q35 system and possibly physical systems
(Dell R6515).

Allocate the hot-plug interrupt as shared to support INTx interrupts.
The hot-plug interrupt mechanism should normally be MSI as PCIe mandates
MSI support, but QEMU's Q35 bridge only provides INTx interrupts.

Second, the code required the Electromechanical Interlock (Slot Status
EIS) to be engaged if present (Slot Capability EIP). Some platforms
including QEMU Q35 set EIP but not EIS. Fix by deleting the check.

Reviewed by: imp, mav, jhb
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D24877
2020-06-10 20:12:45 +00:00
Adrian Chadd
ee424b7351 [net80211] ok ok if_xname won't ever be NULL.
Somewhere in net80211 if_xname is checked against NULL but it doesn't trigger
a compiler warning, but this does.  So DTRT for FreeBSD and the other if_xname
dereferences can be converted to this function at a later time.
2020-06-10 18:59:46 +00:00
Edward Tomasz Napierala
8c5059e9ea Make linux(4) set the openfiles soft resource limit to 1024 for Linux
applications, which often depend on this being the case.  There's a new
sysctl, compat.linux.default_openfiles, to control this behaviour.

Reviewed by:	kevans, emaste, bcr (manpages)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25177
2020-06-10 18:50:46 +00:00
Edward Tomasz Napierala
c31a6a6612 Support SO_SNDBUFFORCE/SO_RCVBUFFORCE by aliasing them to the
standard SO_SNDBUF/SO_RCVBUF.  Mostly cosmetics, to get rid
of the warning during 'apt upgrade'.
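
The aliasing can be pictured as a translation step like the one below.  The
TOY_LINUX_* constants carry the asm-generic Linux option numbers and are
defined locally for the sketch; toy_linux_to_native_sockopt() is a
hypothetical name, not the function in the linux(4) sources.

```c
#include <sys/socket.h>

/* Linux option numbers (asm-generic values), defined locally here. */
#define	TOY_LINUX_SO_SNDBUF		7
#define	TOY_LINUX_SO_RCVBUF		8
#define	TOY_LINUX_SO_SNDBUFFORCE	32
#define	TOY_LINUX_SO_RCVBUFFORCE	33

/* Map a Linux socket option to the native one; -1 if not handled. */
static int
toy_linux_to_native_sockopt(int opt)
{
	switch (opt) {
	case TOY_LINUX_SO_SNDBUF:
	case TOY_LINUX_SO_SNDBUFFORCE:	/* alias of the plain option */
		return (SO_SNDBUF);
	case TOY_LINUX_SO_RCVBUF:
	case TOY_LINUX_SO_RCVBUFFORCE:	/* alias of the plain option */
		return (SO_RCVBUF);
	default:
		return (-1);
	}
}
```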

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25173
2020-06-10 18:43:43 +00:00
Ed Maste
cff33fa8c8 Fix arm64 kernel build with DEBUG on
Submitted by:	Greg V <greg@unrelenting.technology>, andrew
Differential Revision:	https://reviews.freebsd.org/D24986
2020-06-10 16:00:43 +00:00
Ruslan Bukin
c7dada4c03 All the ARM Coresight interconnect devices set ResourceProducer on memory
resources, ignore it.

The devices are found in the ARM Neoverse N1 System Development Platform
(N1SDP).

Sponsored by:	DARPA, AFRL
2020-06-10 14:39:54 +00:00
Ruslan Bukin
5637d889e3 ARM Coresight Funnel device:
o Split-out FDT attachment to a separate file;
o Add ACPI attachment;
o Add support for the Static Funnel device.

Sponsored by:	DARPA, AFRL
2020-06-10 14:28:36 +00:00
Alexander V. Chernikov
a287a973e3 Switch rtsock code to using newly-created rib_action() KPI call.
This simplifies the code and allows further splitting of rtentry and nexthop,
 removing one of the blockers for multipath code introduction, described in
 D24141.

Reviewed by:	ae
Differential Revision:	https://reviews.freebsd.org/D25192
2020-06-10 07:46:22 +00:00
Richard Scheffenegger
2fda0a6f3a Prevent TCP Cubic from abruptly increasing cwnd after app-limited
Cubic calculates the new cwnd based on absolute time
elapsed since the start of an epoch. A cubic epoch is
started on congestion events, or once the congestion
avoidance phase is started, after slow-start has
completed.

When a sender is application limited for an extended
amount of time and subsequently a larger volume of data
becomes ready for sending, Cubic recalculates cwnd
with a lingering cubic epoch. This recalculation
of the cwnd can induce a massive increase in cwnd,
causing a burst of data to be sent at line rate by
the sender.

This adds a flag to reset the cubic epoch once a
session transitions from app-limited to cwnd-limited
to prevent the above effect.
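
As a rough sketch of the idea, and not the actual cc_cubic code, the flag
and the epoch reset could look like this.  The structure, field names and
tick units are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-connection state; not the real cc_cubic structure. */
struct toy_cubic {
	uint32_t epoch_start;	/* tick at which the current epoch began */
	bool	 app_limited;	/* sender had less data than cwnd allowed */
};

/* Record whether the sender is currently application limited. */
static void
toy_note_send(struct toy_cubic *c, bool cwnd_limited, uint32_t now)
{
	if (!cwnd_limited) {
		c->app_limited = true;
		return;
	}
	if (c->app_limited) {
		/* app-limited -> cwnd-limited: restart the epoch. */
		c->epoch_start = now;
		c->app_limited = false;
	}
}

/* Elapsed time that would be fed into the cubic window function. */
static uint32_t
toy_epoch_age(const struct toy_cubic *c, uint32_t now)
{
	return (now - c->epoch_start);
}
```

Without the reset, the elapsed time after a long idle stretch would include
the whole app-limited period, which is what inflates the recalculated cwnd.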

Reviewed by:	chengc_netapp.com, tuexen (mentor)
Approved by:	tuexen (mentor), rgrimes (mentor)
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D25065
2020-06-10 07:32:02 +00:00
Takanori Watanabe
7a33c92b43 Add LE events:
READ_REMOTE_FEATURES_COMPL
LONG_TERM_KEY_REQUEST
REMOTE_CONN_PARAM_REQUEST
DATA_LENGTH_CHANGE
READ_LOCAL_P256_PK_COMPL
GEN_DHKEY_COMPL
ENH_CONN_COMPL

PR: 247050
Submitted by:	Marc Veldman marc at bumblingdork.com
2020-06-10 04:54:02 +00:00
Justin Hibbits
46e8ab5aa1 powerpc/powernv: Don't use the vmem quantum cache for OPAL PCI MSI allocations
vmem quantum cache is only needed when doing a lot of concurrent allocations,
which doesn't happen when allocating MSIs.  This wastes memory for the cache
zones.  Avoid this waste and don't use the quantum cache.

Reported by:	markj
2020-06-10 04:08:16 +00:00
Justin Hibbits
76d5f5e22c powerpc/mpc85xx: Don't use the quantum cache in vmem for MPIC MSIs
The qcache is unnecessary for this purpose, it's only needed when there are
lots of concurrent allocations.

Reported by:	markj
2020-06-10 04:04:59 +00:00
Doug Moore
66959b4f5d Fixup r361997 by balancing parens. Duh.
2020-06-10 03:36:17 +00:00
Rick Macklem
84d746de21 Add two functions that create M_EXTPG mbufs with anonymous pages.
These two functions are needed by nfs-over-tls, but could also be
useful for other purposes.
mb_alloc_ext_plus_pages() - Allocates a M_EXTPG mbuf and enough anonymous
      pages to store "len" data bytes.
mb_mapped_to_unmapped() - Copies the data from a list of mapped (non-M_EXTPG)
      mbufs into a list of M_EXTPG mbufs allocated with anonymous pages.
      This is roughly the inverse of mb_unmapped_to_ext().

Reviewed by:	gallatin
Differential Revision:	https://reviews.freebsd.org/D25182
2020-06-10 02:51:39 +00:00
Doug Moore
61a7df230e Restore an RB_COLOR macro, for the benefit of a bit of DIAGNOSTIC code
that depends on it.

Reported by:	rpokala, mjguzik
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D25204
2020-06-10 02:50:25 +00:00
John Baldwin
1138b87ae6 Add some default cases for unreachable code to silence compiler warnings.
This was caused by r361481 when the buffer type was changed from an
int to an enum.

Reported by:	mjg, rpokala
Sponsored by:	Chelsio Communications
2020-06-10 00:09:31 +00:00
Mateusz Guzik
1724c563e6 cred: distribute reference count per thread
This avoids dirtying creds in the common case, see the comment in kern_prot.c
for details.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D24007
2020-06-09 23:03:48 +00:00
Eric Joyner
b4a7ce0690 ixl(4): Add FW recovery mode support and other things
Update the iflib version of ixl driver based on the OOT version ixl-1.11.29.

Major changes:

- Extract iflib specific functions from ixl_pf_main.c to ixl_pf_iflib.c
  to simplify code sharing between legacy and iflib version of driver

- Add support for most recent FW API version (1.10), which extends FW
  LLDP Agent control by user to X722 devices

- Improve handling of device global reset

- Add support for the FW recovery mode

- Use virtchnl function to validate virtual channel messages instead of
  using separate checks

- Fix MAC/VLAN filters accounting

Submitted by:	Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by:	erj@
Tested by:	Jeffrey Pieper <jeffrey.e.pieper@intel.com>
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D24564
2020-06-09 22:42:54 +00:00
John Baldwin
a3d565a118 Add a crypto capability flag for accelerated software drivers.
Use this in GELI to print out a different message when accelerated
software such as AESNI is used vs plain software crypto.

While here, simplify the logic in GELI a bit for determining which type
of crypto driver was chosen the first time by examining the
capabilities of the matched driver after a single call to
crypto_newsession rather than making separate calls with different
flags.

Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25126
2020-06-09 22:26:07 +00:00
John Baldwin
cea399ec0e Mark padlock(4) and cryptocteon(4) as software drivers.
Both already return the accelerated software priority from
cryptodev_probesession.

Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25125
2020-06-09 22:19:36 +00:00
Justin Hibbits
c8c5600701 powerpc/pmap: Fix wired memory leak in booke64 page directories
Properly handle reference counts in the 64-bit pmap page directories.
Otherwise all page table pages would leak due to over-referencing.  This
would cause a quick enter to swap on a desktop system (AmigaOne X5000) when
quitting and rerunning applications, or just building world.

Add an INVARIANTS check to validate no leakage at pmap release time.
2020-06-09 21:59:13 +00:00
Richard Scheffenegger
6907bbae18 Prevent TCP Cubic from abruptly increasing cwnd after slow-start
Introducing flags to track the initial Wmax dragging and exit
from slow-start in TCP Cubic. This prevents sudden jumps in the
calculated cwnd by cubic, especially when the flow is application
limited during slow start (cwnd can not grow as fast as expected).
The downside is that cubic may remain slightly longer in the
concave region before starting the convex region beyond Wmax again.

Reviewed by:	chengc_netapp.com, tuexen (mentor)
Approved by:	tuexen (mentor), rgrimes (mentor, blanket)
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D23655
2020-06-09 21:07:58 +00:00
Andreas Tobler
c76b8bda0b Fix boot of wandquad after DTS update
In the recent dts sync the name of the aips-bus@ changed to bus@. Reflect
this change and add an additional OF_finddevice in fix_fdt_interrupt_data()
and in fix_fdt_iomuxc_data() with bus@ only. Iow, keep the old naming for
compatibility.

Discussed with:	ian@
2020-06-09 20:27:35 +00:00
Doug Moore
36ba4b393f To reduce the size of an rb_node, drop the color field. Set the least
significant bit in the pointer to the node from its parent to indicate
that the node is red. Have the tree rotation macros leave the
old-parent/new-child node red and the new-parent/old-child node black.

This change makes RB_LEFT and RB_RIGHT no longer assignable, and
RB_COLOR no longer defined. Any code that modifies the tree or
examines a node color would have to be modified after this change.
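
The pointer-tagging trick can be shown in isolation.  This is a hypothetical
sketch (toy_node and the toy_* helpers are not the tree(3) macros) and it
assumes nodes are at least 2-byte aligned so the low bit of a pointer is
always free.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical node: no color field, only two tagged child links. */
struct toy_node {
	uintptr_t left;		/* child pointer | red bit */
	uintptr_t right;
};

#define	TOY_RED	((uintptr_t)1)

/* Build a link to 'child'; the low bit records that the child is red. */
static uintptr_t
toy_link(struct toy_node *child, bool red)
{
	return ((uintptr_t)child | (red ? TOY_RED : 0));
}

/* Strip the tag to recover the child pointer. */
static struct toy_node *
toy_child(uintptr_t link)
{
	return ((struct toy_node *)(link & ~TOY_RED));
}

/* The color of a node is read from the link that points at it. */
static bool
toy_is_red(uintptr_t link)
{
	return ((link & TOY_RED) != 0);
}
```

Since a link is no longer a plain pointer, it cannot be assigned or
dereferenced directly, which is why accessors are needed on both sides.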

Reviewed by:	markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D25105
2020-06-09 20:19:11 +00:00
Vincenzo Maffione
e136e9c88f iflib: netmap: honor netmap_rx_irq return values
In the receive interrupt routine, always call netmap_rx_irq().
The latter function will return NM_IRQ_PASS if netmap is not
active on that specific receive queue, so that the driver can go
on with iflib_rxeof(). Note that netmap supports partial opening,
where only a subset of the RX or TX rings can be open in netmap mode.
Checking the IFCAP_NETMAP flag is not enough to make sure that the
queue is indeed in netmap mode.
Moreover, in case netmap_rx_irq() returns NM_IRQ_RESCHED, it means
that netmap expects the driver to call netmap_rx_irq() again as soon
as possible. Currently, this may happen when the device is attached
to a VALE switch.

Reviewed by:	gallatin
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25167
2020-06-09 19:15:43 +00:00
Ruslan Bukin
b62d159cb3 Similar to UART on ThunderX2, the ARM Coresight (ETM component)
set ResourceProducer on memory resources: ignore it.

Tested on ARM N1SDP board.

Sponsored by:	DARPA, AFRL
2020-06-09 17:07:42 +00:00
John Baldwin
58b552dcec Refactor ptrace() ABI compatibility.
Add a freebsd32_ptrace() and move as many freebsd32 shims as possible
to freebsd32_ptrace().  Aside from register sets, freebsd32 passes
pointers to native structures to kern_ptrace() and converts to/from
native/32-bit structure formats in freebsd32_ptrace() outside of
kern_ptrace().

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25195
2020-06-09 16:43:23 +00:00
Ruslan Bukin
b6f7bae402 ARM Embedded Trace Macrocell v4.x driver:
o Split-out FDT attachment to a separate file;
o Add ACPI attachment.

Sponsored by:	DARPA, AFRL
2020-06-09 16:43:16 +00:00
Ruslan Bukin
b65c190c40 Fix style: wrap long lines.
Sponsored by:	DARPA, AFRL
2020-06-09 16:06:10 +00:00
Ruslan Bukin
b1670691e8 Rename coresight drivers: use underscores in filenames.
Sponsored by:	DARPA, AFRL
2020-06-09 15:56:41 +00:00
Mateusz Guzik
90a08d6cad Assert on pg_jobc state.
Stolen from NetBSD.
2020-06-09 15:17:23 +00:00
Mateusz Guzik
7ce3a31286 vm: rework swap_pager_status to execute in constant time
The lock-protected iteration is trivially avoidable.

This removes a serialisation point from Linux binaries (which end up calling
here from the sysinfo syscall).
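
One way to get a constant-time status query, sketched here under the
assumption that totals are maintained incrementally, is to update counters
when swap is added or pages are allocated and freed.  All names are
hypothetical, and a kernel version would need suitable synchronization for
the counters.

```c
#include <stdint.h>

/* Hypothetical global counters, updated whenever the state changes. */
static uint64_t toy_swap_total;	/* pages across all swap devices */
static uint64_t toy_swap_used;	/* pages currently allocated */

static void
toy_swapon(uint64_t npages)
{
	toy_swap_total += npages;
}

static void
toy_swap_alloc(uint64_t npages)
{
	toy_swap_used += npages;
}

static void
toy_swap_free(uint64_t npages)
{
	toy_swap_used -= npages;
}

/* O(1): no device list walk and no lock held across an iteration. */
static void
toy_swap_status(uint64_t *total, uint64_t *used)
{
	*total = toy_swap_total;
	*used = toy_swap_used;
}
```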
2020-06-09 14:16:18 +00:00
Emmanuel Vadot
4707401c75 cpufreq_dt: Rename DEBUG to DPRINTF
DEBUG is a kernel configuration flag and, if it is set, cpufreq_dt.c will
break the kernel build.

PR:		246867
Submitted by:	Oskar Holmlund (oskar.holmlund@ohdata.se)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25080
2020-06-09 09:42:39 +00:00
Mark Johnston
3e5fae34fc Stop computing a "sharedram" value when emulating Linux sysinfo(2).
The previous code was computing an incorrect value in a very expensive
manner.  "sharedram" is supposed to be the amount of memory used by
named swap objects, which on FreeBSD basically corresponds to memory
usage by shared memory objects (including, for example, GEM objects) and
tmpfs.  We currently have no cheap way to count such pages.  The
previous code tried to determine the number of copy-on-write pages
shared between processes.

Just replace the computed value with 0.  illumos reportedly does the
same thing.  Linux itself did not populate this field until a 2014
commit, "mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces".

Reported by:	mjg
MFC after:	1 week
2020-06-08 22:29:52 +00:00
Jessica Clarke
8c3988dff9 virtio: Support non-legacy network device and queue
The non-legacy interface always defines num_buffers in the header,
regardless of whether VIRTIO_NET_F_MRG_RXBUF is negotiated, just leaving it
unused otherwise. We also need to ensure our virtqueue doesn't filter out
VIRTIO_F_VERSION_1 during negotiation, as it supports non-legacy transports
just fine. This fixes network packet transmission on TinyEMU.

Reviewed by:	br, brooks (mentor), jhb (mentor)
Approved by:	br, brooks (mentor), jhb (mentor)
Differential Revision:	https://reviews.freebsd.org/D25132
2020-06-08 21:51:36 +00:00
Jessica Clarke
16ca3d0f59 virtio_mmio: Negotiate the upper half of the feature bits too
The feature bits are exposed as a 32-bit register with 2 banks, so we
should negotiate both halves. Notably, VIRTIO_F_VERSION_1 is in the
upper half, and will be used in an upcoming commit.
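
To illustrate the two-bank layout, here is a sketch against a simulated
device.  The toy_dev structure and accessors are hypothetical and stand in
for the real MMIO selector and feature registers; only the bit position of
VIRTIO_F_VERSION_1 (feature bit 32) is taken from the virtio specification.

```c
#include <stdint.h>

#define	TOY_F_VERSION_1	(1ULL << 32)	/* VIRTIO_F_VERSION_1 is feature bit 32 */

/* Simulated device: a selector picks which 32-bit bank a read returns. */
struct toy_dev {
	uint64_t features;
	uint32_t sel;
};

static void
toy_write_sel(struct toy_dev *d, uint32_t sel)
{
	d->sel = sel;
}

static uint32_t
toy_read_features(const struct toy_dev *d)
{
	return ((uint32_t)(d->features >> (32 * d->sel)));
}

/* Read both banks and combine them into one 64-bit feature mask. */
static uint64_t
toy_get_features(struct toy_dev *d)
{
	uint64_t lo, hi;

	toy_write_sel(d, 0);
	lo = toy_read_features(d);
	toy_write_sel(d, 1);
	hi = toy_read_features(d);
	return (lo | (hi << 32));
}
```

Reading only bank 0 would never see TOY_F_VERSION_1, which is the situation
the commit describes.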

The PCI bus driver also has this bug, but the legacy BAR layout did not
include selector registers and is rather different from the modern
layout, so it remains solely as legacy.

Reviewed by:	br, brooks (mentor), jhb (mentor)
Approved by:	br, brooks (mentor), jhb (mentor)
Differential Revision:	https://reviews.freebsd.org/D25131
2020-06-08 21:49:42 +00:00
Alexander Motin
9a4510ac32 Implement zero-copy iSCSI target transmission/read.
Add ICL_NOCOPY flag to icl_pdu_append_data(), specifying that the method
can just reference the data buffer instead of immediately copying it.

Extend the offload KPI with an optional PDU queue method, allowing a
completion callback to be specified, called when all the data referenced by
the above has been transferred and won't be accessed any more (the buffers
can be freed).

Implement the above functionality in software iSCSI driver using mbufs
with external storage and reference counter.  Note that some NICs (ixl(4))
may keep the mbuf in TX queue for a long time, so CTL has to be ready.

Add optional method to struct ctl_scsiio for buffer reference counting.
Implement it for CTL block backend, allowing to delay free of the struct
ctl_be_block_io and memory it references as needed.  In the first incarnation
of the patch I tried to delay the whole I/O as it is done for FibreChannel,
that was cleaner, but due to the above callback delays I had to rewrite
it this way to not leave LUN referenced potentially for hours or more.

All together on sequential read from ZFS ARC this saves about 30% of CPU
time and memory bandwidth by avoiding one of 3 memory copies (the other
two are from ZFS ARC to DMU cache and then from DMU cache to CTL buffers).
On tests with 2x Xeon Silver 4114 this allows reaching the full line rate of
a 100GigE NIC.  Tests with Gold CPUs and two 100GigE NICs are still TBD,
but expectations to saturate them are pretty high. ;)

Discussed with:	Chelsio
Sponsored by:	iXsystems, Inc.
2020-06-08 20:53:57 +00:00
Michael Tuexen
5fb132abbb Whitespace cleanups and removal of a stale comment.
MFC after:		1 week
2020-06-08 20:23:20 +00:00
Jessica Clarke
e28d8a5b26 riscv: Use SBI shutdown call to implement RB_POWEROFF
Currently we only call sbi_shutdown in cpu_reset, which means we reach
"Please press any key to reboot." even when RB_POWEROFF is set, and only
once the user presses a key do we then shutdown. Instead, register a
shutdown_final event handler and make an SBI shutdown call if
RB_POWEROFF is set.

Reviewed by:	br, jhb (mentor), kp
Approved by:	br, jhb (mentor), kp
Differential Revision:	https://reviews.freebsd.org/D25183
2020-06-08 17:57:21 +00:00
Gleb Smirnoff
953171ba9e Move MPASS() macros to systm.h. They are widely used all over
the kernel and aren't contained only to the locking code.

Reviewed by:	kib, mjg
Differential Revision:	https://reviews.freebsd.org/D23656
2020-06-08 17:40:39 +00:00
Randall Stewart
e854dd38ac An important statistic in determining if a server process (or client) is being delayed
is to know the time to first byte in and time to first byte out. Currently we
have no way to know these; all we have is t_starttime. That (t_starttime) tells us
what time the 3 way handshake completed. We don't know when the first
request came in or how quickly we responded. Nor from a client perspective
do we know how long from when we sent out the first byte before the
server responded.

This small change adds the ability to track the TTFB's. This will show up in
BB logging which then can be pulled for later analysis. Note that currently
the tracking is via the ticks variable of all three variables. This provides
a very rough estimate (with hz=1000 it's 1ms). A follow-on set of work will be
to change all three of these values into something with a much finer resolution
(either microseconds or nanoseconds), though we may want to make the resolution
configurable so that on lower powered machines we could still use the much
cheaper ticks variable.
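
For a feel of the resolution involved, a tick delta converts to milliseconds
as below.  This is a userland sketch with a hypothetical helper name; it uses
unsigned 32-bit counters for simplicity and does not mirror the types used in
the TCP stack.

```c
#include <stdint.h>

/* Convert a tick delta to milliseconds for a given hz (ticks per second). */
static uint32_t
toy_ticks_to_ms(uint32_t start, uint32_t now, uint32_t hz)
{
	uint32_t delta = now - start;	/* unsigned math survives wraparound */

	return ((uint32_t)(((uint64_t)delta * 1000) / hz));
}
```

At hz=1000 each tick is 1ms; at hz=100 the same measurement only resolves to
10ms steps, which is the motivation for a finer-grained clock later.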

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D24902
2020-06-08 11:48:07 +00:00
Alex Richardson
c98013c0b1 RISC-V: Check that the DTB doesn't overlap with kernel
This can happen with very large kernels (e.g. ones embedding a root
filesystem). The DTB written by OpenSBI/BBL is quite small so this is
unlikely to hit important data, but if it does this can result in very
confusing and hard-to-debug crashes. Add a KASSERT() and a verbose print
to catch this problem with debug kernels.

While this will not print any output by default if it fails (that would
depend on EARLY_PRINTF), at least the kernel now halts reliably instead
of randomly crashing.
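
The check itself is an interval-overlap test.  A generic sketch follows; the
helper name is hypothetical and the addresses a caller passes in would come
from the kernel image bounds and the DTB location and size.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if [a_start, a_end) and [b_start, b_end) share at least one byte. */
static bool
toy_ranges_overlap(uint64_t a_start, uint64_t a_end, uint64_t b_start,
    uint64_t b_end)
{
	return (a_start < b_end && b_start < a_end);
}
```

A debug kernel would assert that this returns false for the kernel and DTB
ranges before touching either.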

Reviewed By:	mhorne
Differential Revision: https://reviews.freebsd.org/D25153
2020-06-08 08:52:02 +00:00
Alex Richardson
f7910a3df9 sys/riscv: Remove debug printfs
They are only visible with EARLY_PRINTF so don't show up by default.

Reviewed By:	mhorne
Differential Revision: https://reviews.freebsd.org/D25152
2020-06-08 08:51:57 +00:00