Commit Graph

133082 Commits

Author SHA1 Message Date
Mateusz Guzik
a2de789ebb cred: add a prediction to crfree for td->td_realucred == cr
This matches crhold and eliminates an assembly maze in the common case.
2020-07-02 12:58:07 +00:00
Mateusz Guzik
d23850207b cache: add missing call to cache_ncp_invalid for negative hits
Note the dtrace probe can fire even if the entry is gone, but I don't think that's
worth fixing.
2020-07-02 12:56:20 +00:00
Mateusz Guzik
d129e0eba0 cache: fix misplaced fence in cache_ncp_invalidate
The intent was to mark the entry as invalid before cache_zap starts messing
with it.

While here add some comments.
2020-07-02 12:54:50 +00:00
Konstantin Belousov
92d8df2f37 mlx5_core: remove unnecessary LFENCE instruction.
Use fence instead of barrier, which is optimized to take advantage of
the x86 TSO memory model.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2020-07-02 10:44:45 +00:00
Konstantin Belousov
f334f212d9 linuxkpi: improvements for linux_pid_task() and linux_get_pid_task().
Unify function bodies.
Do not call tdfind() if pid is passed, and do not call pfind() if tid
is supplied.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25534
2020-07-02 10:42:58 +00:00
Konstantin Belousov
4bc5ce2c74 Use tdfind() in pget().
Reviewed by:	jhb, hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25532
2020-07-02 10:40:47 +00:00
Kristof Provost
b865714d95 riscv pmap: zero reserved pte bits in ppn
The top 10 bits of a pte are reserved by specification[1] and are not part of
the PPN.

[1] 'Volume II: RISC-V Privileged Architectures V20190608-Priv-MSU-Ratified',
'4.4.1 Addressing and Memory Protection', page 72: "The PTE format for Sv39 is
shown in Figure 4.18. ... Bits 63–54 are reserved for future use and must be
zeroed by software for forward compatibility."

Submitted by:	Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by:	kp, mhorne
Differential Revision:	https://reviews.freebsd.org/D25523
2020-07-01 19:15:43 +00:00
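
A minimal sketch of the idea behind b865714d95, assuming the Sv39 layout quoted in the commit message (bits 63-54 reserved, PPN in bits 53-10) and a 4 KiB page. The SK_ names and sk_pte_to_phys() are hypothetical and are not the identifiers used in the riscv pmap.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative Sv39 PTE constants; names are made up for this sketch. */
#define SK_PTE_PPN_SHIFT	10	/* PPN starts above the 10 flag/RSW bits */
#define SK_PTE_RSVD_SHIFT	54	/* bits 63-54 are reserved */
#define SK_PTE_RSVD_MASK	(UINT64_C(0x3ff) << SK_PTE_RSVD_SHIFT)
#define SK_PAGE_SHIFT		12	/* assumed 4 KiB pages */

/* Return the physical address encoded in a PTE, ignoring reserved bits. */
static inline uint64_t
sk_pte_to_phys(uint64_t pte)
{
	uint64_t ppn;

	/* Mask the reserved bits first so they never leak into the PPN. */
	ppn = (pte & ~SK_PTE_RSVD_MASK) >> SK_PTE_PPN_SHIFT;
	assert((ppn >> 44) == 0);	/* the PPN field is 44 bits wide */
	return (ppn << SK_PAGE_SHIFT);
}
```

Without the mask, a PTE whose reserved bits happen to be set would yield a physical address with stray high bits.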
Kristof Provost
6f11e59d72 riscv locore.S: load constant prior to loop
A very minor micro-optimization; t0 is not clobbered between the loop top and
bottom and there appear to be no other branches to this label.

Submitted by:	Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by:	mhorne
Differential Revision:	https://reviews.freebsd.org/D25524
2020-07-01 19:12:47 +00:00
Kristof Provost
d53a2816c7 riscv: Log missing registers in dump_regs()
If we panic we dump the registers for debugging. This is very useful, but it
missed several registers (ra, sp, gp and tp).

Log these as well. Especially the return address value is extremely useful.

Sponsored by:	Axiado
2020-07-01 19:11:02 +00:00
Michael Tuexen
e54b7cd007 Fix the cleanup handling in an error path for TCP BBR.
Reported by:		syzbot+df7899c55c4cc52f5447@syzkaller.appspotmail.com
Reviewed by:		rscheff
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D25486
2020-07-01 17:17:06 +00:00
Andrew Turner
e4fc3b653a Read the CPU 0 arm64 ID registers early in initarm
We also update the kernel view early in the boot. This will allow the
use of the common kernel view in ifunc resolvers.

Sponsored by:	Innovate UK
2020-07-01 16:57:57 +00:00
Andrew Turner
eeada9221b Move ID reading signatures to a better header
The functions to read the common user and kernel ID registers should be
in cpu.h rather than undefined.h as they are related to CPU details and
used by undefined instruction handlers.

Sponsored by:	Innovate UK
2020-07-01 16:17:51 +00:00
Mark Johnston
d16a2e4784 Fix a possible next-hop refcount leak when handling IPSec traffic.
It may be possible to fix this by deferring the lookup, but let's
keep the initial change simple to make MFCs easier.

PR:		246951
Reviewed by:	melifaro
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25519
2020-07-01 15:42:48 +00:00
Andrew Turner
9eb07d5627 Read the arm64 ID registers earlier in the boot process.
Also move parsing the registers to just after the secondary CPUs have
started. This means the kernel register view from all CPUs is available
after the CPU SYSINITs have finished, e.g. for use by ifunc resolvers.

Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D25505
2020-07-01 15:17:45 +00:00
Andrew Turner
ecc8ccb441 Simplify the flow when getting/setting an isrc
Rather than unlocking and returning we can just perform the needed action
only when the interrupt source is valid and reuse the unlock in both the
valid irq and invalid irq cases.

Sponsored by:	Innovate UK
2020-07-01 12:07:28 +00:00
Edward Tomasz Napierala
6d76adbb6d Rework linux accept(2). This makes the code flow easier to follow,
and fixes a bug where calling accept(2) could result in closing fd 0.

Note that the code still contains a number of problems: it makes
assumptions about l_sockaddr_in being the same as sockaddr_in,
the EFAULT-related code looks like it doesn't work at all, and the
socket type check is racy.  Those will be addressed later on;
I'm trying to work in small steps to avoid breaking one thing while
fixing another.

It fixes Redis, among other things.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25461
2020-07-01 10:37:08 +00:00
Hans Petter Selasky
9a4e535b39 The "pid" field in the LinuxKPI task struct is typically set to the thread ID
and not the process ID. Make sure the linux_task_exiting() function uses tdfind()
to look up the BSD procedure structure pointer by the "pid" field, and only
fall back to pfind() when no match is found! This makes linux_task_exiting()
in line with the rest of the code.

Differential Revision: https://reviews.freebsd.org/D25509
Submitted by:	Greg V <greg@unrelenting.technology>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-07-01 08:23:57 +00:00
Mateusz Guzik
5d1c042d32 cache: lockless forward lookup with smr
This eliminates the need to take bucket locks in the common case.

Concurrent lookup utilizing the same vnodes is still bottlenecked on referencing
and locking path components; this will be taken care of separately.

Reviewed by:	kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D23913
2020-07-01 05:59:08 +00:00
Mateusz Guzik
f8022be3e6 vfs: protect vnodes with smr
vget_prep_smr and vhold_smr can be used to ref a vnode while within vfs_smr
section, allowing consumers to get away without locking.

See vhold_smr and vdropl for comments explaining caveats.

Reviewed by:	kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D23913
2020-07-01 05:56:29 +00:00
Takanori Watanabe
263a104f43 Allow some Bluetooth LE related HCI requests for non-root users.
PR:	247588
Reported by:	Greg V (greg@unrelenting.technology)
Reviewed by:	emax
Differential Revision:	https://reviews.freebsd.org/D25516
2020-07-01 04:00:54 +00:00
Ryan Moeller
e5539fb618 libifconfig: Add function to get bridge status
The new function operates similarly to ifconfig_lagg_get_lagg_status and
likewise is accompanied by a function to free the bridge status data structure.

I have included in this patch the relocation of some strings describing STP
parameters and the PV2ID macro from ifconfig into net/if_bridgevar.h as they
are useful for consumers of libifconfig.

Reviewed by:	kp, melifaro, mmacy
Approved by:	mmacy (mentor)
MFC after:	1 week
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D25460
2020-07-01 02:32:41 +00:00
Conrad Meyer
64612d4e44 geom(4): Kill GEOM_PART_EBR_COMPAT option
Take advantage of Warner's nice new real GEOM aliasing system and use it for
aliased partition names that actually work.

Our canonical EBR partition name is the weird, not-default-on-x86-prior-to-
this-revision "da1p4+00001234."  However, if compatibility mode (tunable
kern.geom.part.ebr.compat_aliases) is enabled (1, default), we continue to
provide the alias names like "da1p5" in addition to the weird canonical
names.

Naming partition providers was just one aspect of the COMPAT knob; in
addition it limited mutability, in part because it did not preserve existing
EBR header content aside from that of LBA 0.  This change saves the EBR
header for LBA 0, as well as for every EBR partition encountered.  That way,
when we write out the EBR partition table on modification, we can restore
any bootloader or other metadata in both LBA0 (the first data-containing EBR
may start after 0) as well as every logical EBR we read from the disk, and
only update the geometry metadata and linked list pointers that describe the
actual partitioning.

(This change does not add support for the 'bootcode' verb to EBR.)

PR:		232463
Reported by:	Manish Jain <bourne.identity AT hotmail.com>
Discussed with:	ae (no objection)
Relnotes:	maybe
Differential Revision:	https://reviews.freebsd.org/D24939
2020-07-01 02:16:36 +00:00
Oleksandr Tymoshenko
94bc2117b4 Add i.MX 8M Quad support
- Add CCM driver and clocks implementations for i.MX 8M
- Add GPC driver for iMX8
- Add clock tree for i.MX 8M Quad
- Add clocks support and new compat strings (where required) for existing i.MX 6 UART, I2C, and GPIO drivers
- Enable aarch64-compatible drivers from i.MX 6 in arm64 GENERIC kernel config
- Add dtb/imx8 kernel module with DTBs for Nitrogen8M and iMX8MQ EVK

With this patch both Nitrogen8M and iMX8MQ EVK boot with NFS root up to multiuser login prompt

Reviewed by:	manu
Differential Revision:	https://reviews.freebsd.org/D25274
2020-07-01 00:33:16 +00:00
Adrian Chadd
39ca7ca568 [net80211] Commit files missing in the previous commit
These belong to my previous commit, but apparently I typed ieee80211_vhf.[ch]
and forgot ht.h.  Le oops.
2020-07-01 00:24:55 +00:00
Adrian Chadd
f1481c8d3b [net80211] Migrate HT/legacy protection mode and preamble calculation to per-VAP flags
The later firmware devices (including iwn!) support multiple configuration
contexts for a lot of things, leaving it up to the firmware to decide
which channel and vap is active.  This allows for things like off-channel
p2p sta/ap operation and other weird things.

However, net80211 is still focused on a "net80211 drives all" model when it comes to driving
the NIC, and as part of this history a lot of these options are global and not per-VAP.
This is fine when net80211 drives things and all VAPs share a single channel - these
parameters importantly really reflect the state of the channel! - but it will increasingly
be not fine when we start supporting more weird configurations and more recent NICs.
Yeah, recent like iwn/iwm.

Anyway - so, migrate all of the HT protection, legacy protection and preamble
stuff to be per-VAP.  The global flags are still there; they're now calculated
in a deferred taskqueue that mirrors the old behaviour.  Firmware based drivers
which have per-VAP configuration of these parameters can now just listen to the
per-VAP options.

What do I mean by per-channel? Well, the above configuration parameters really
are about interoperation with other devices on the same channel. Eg, HT protection
mode will flip to legacy/mixed if it hears ANY BSS that supports non-HT stations or
indicates it has non-HT stations associated.  So, these flags really should be
per-channel rather than per-VAP, and then things like "do I need short preamble
or long preamble?" turn into "do I need it for this current operating channel?".
Then any VAP using it can query the channel that it's on, reflecting the real
required state.

This patch does none of the above paragraph just yet.

I'm also cheating a bit - I'm currently not using separate taskqueues for
the beacon updates and the per-VAP configuration updates.  I can always further
split it later if I need to but I didn't think it was SUPER important here.

So:

* Create vap taskqueue entries for ERP/protection, HT protection and short/long
  preamble;
* Migrate the HT station count, short/long slot station count, etc - into per-VAP
  variables rather than global;
* Fix a bug with my WME work from a while ago which made it per-VAP - do the WME
  beacon update /after/ the WME update taskqueue runs, not before;
* Any time the HT protmode configuration changes or the ERP protection mode
  config changes - schedule the task, which will call the driver without the
  net80211 lock held and all correctly serialised;
* Use the global flags for beacon IEs and VAP flags for probe responses and
  other IE situations.

The primary consumer of this is ath10k.  iwn could use it when sending RXON,
but we don't support IBSS or AP modes on it yet, and I'm not yet sure whether
it's required in STA mode (ie whether the firmware parses beacons to change
protection mode or whether we need to.)

Tested:

* AR9280, STA/AP
* AR9380, DWDS STA+STA/AP
* ath10k work, STA/AP
* Intel 6235, STA
* Various rtwn / run NICs, DWDS STA and STA configurations
2020-07-01 00:23:49 +00:00
Mark Johnston
7290cb47fc Convert cryptostats to a counter_u64 array.
The global counters were not SMP-friendly.  Use per-CPU counters
instead.

Reviewed by:	jhb
Sponsored by:	Rubicon Communications, LLC (Netgate)
Differential Revision:	https://reviews.freebsd.org/D25466
2020-06-30 22:01:21 +00:00
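
A rough userspace sketch of the per-CPU counter idea from 7290cb47fc: each CPU updates its own slot and a reader sums all slots. This is not the kernel's counter_u64 implementation; SK_NCPU, the 64-byte padding, and every sk_ name are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SK_NCPU	4	/* hypothetical fixed CPU count for the sketch */

/* One slot per CPU, padded to 64 bytes (an assumed cache-line size)
 * so that writers on different CPUs touch different slots. */
struct sk_counter {
	struct {
		uint64_t val;
		char pad[64 - sizeof(uint64_t)];
	} pcpu[SK_NCPU];
};

/* Add to the slot owned by the given CPU; no global lock or atomic. */
static void
sk_counter_add(struct sk_counter *c, unsigned cpu, uint64_t inc)
{
	assert(cpu < SK_NCPU);
	c->pcpu[cpu].val += inc;
}

/* Sum every per-CPU slot to produce the value reported to readers. */
static uint64_t
sk_counter_fetch(const struct sk_counter *c)
{
	uint64_t sum = 0;
	unsigned i;

	for (i = 0; i < SK_NCPU; i++)
		sum += c->pcpu[i].val;
	return (sum);
}
```

The trade-off is that a read costs one pass over all slots, which suits statistics that are updated often and read rarely.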
Michael Tuexen
7a3f60e7f5 Fix a bug introduced in https://svnweb.freebsd.org/changeset/base/362173
Reported by:		syzbot+f3a6fccfa6ae9d3ded29@syzkaller.appspotmail.com
MFC after:		1 week
2020-06-30 21:50:05 +00:00
Edward Tomasz Napierala
c2da36fecd Make linprocfs(5) create the /proc/<PID>/task/ directories.
This is to silence down some Chromium assertions.

PR:		kern/240991
Analyzed by:	Alex S <iwtcex@gmail.com>
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25256
2020-06-30 16:24:28 +00:00
Edward Tomasz Napierala
9bc42c18cb Make linux(4) ignore SA_INTERRUPT. The zsh(1) binary from Bionic uses it.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25499
2020-06-30 16:18:09 +00:00
Andrew Turner
518da7ace8 Add dwc_otg_acpi
Create an acpi attachment for the DWC USB OTG device. This is present in
the Raspberry Pi 4 in the USB-C port normally used to power the board. Some
firmware presents the kernel with ACPI tables rather than FDT so we need
an ACPI attachment.

Submitted by:	Greg V <greg_unrelenting.technology>
Approved by:	hselasky (removal of All rights reserved)
Differential Revision:	https://reviews.freebsd.org/D25203
2020-06-30 15:58:29 +00:00
Mark Johnston
a5ae70f5a0 Remove unused 32-bit compatibility structures from cryptodev.
The counters are exported by a sysctl and have the same width on all
platforms anyway.

Reviewed by:	cem, delphij, jhb
Sponsored by:	Rubicon Communications, LLC (Netgate)
Differential Revision:	https://reviews.freebsd.org/D25465
2020-06-30 15:57:11 +00:00
Mark Johnston
a5c053f5a7 Remove CRYPTO_TIMING.
It was added a very long time ago.  It is single-threaded, so only
really useful for basic measurements, and in the meantime we've gotten
some more sophisticated profiling tools.

Reviewed by:	cem, delphij, jhb
Sponsored by:	Rubicon Communications, LLC (Netgate)
Differential Revision:	https://reviews.freebsd.org/D25464
2020-06-30 15:56:54 +00:00
Hans Petter Selasky
d326a6c7c1 Document the is_signed(), type_max() and type_min() function macros in the
LinuxKPI. Try to make the function argument more readable.

Suggested by:	several
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-30 08:41:33 +00:00
Andrew Gallatin
46cac10b3b Fix a panic when unloading firmware
LIST_FOREACH_SAFE() is not safe in the presence
of other threads removing list entries when a
mutex is released.

This is not in the critical path, so just restart
the scan each time we drop the lock, rather than
using a marker.

Reviewed by:	jhb, markj
Sponsored by:	Netflix
2020-06-29 21:35:50 +00:00
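
The restart-the-scan approach described in 46cac10b3b can be sketched roughly as follows. This is a single-threaded illustration with a stub lock, not the firmware(9) code: sk_lock()/sk_unlock(), the list layout, and the busy flag are all hypothetical, and the sketch assumes the entry being processed stays allocated while the lock is dropped.

```c
#include <assert.h>
#include <stddef.h>

struct sk_node {
	struct sk_node *next;
	int busy;		/* entry needs work done with the lock dropped */
};

static int sk_locked;		/* stub lock state, stands in for a mutex */

static void sk_lock(void)   { assert(!sk_locked); sk_locked = 1; }
static void sk_unlock(void) { assert(sk_locked);  sk_locked = 0; }

/* Work that must run unlocked; other threads could edit the list here. */
static void
sk_release(struct sk_node *n)
{
	assert(!sk_locked);
	(void)n;
}

/* Handle every busy entry; returns how many were processed. */
static int
sk_drain(struct sk_node *head)
{
	struct sk_node *n;
	int handled = 0;

	sk_lock();
restart:
	for (n = head; n != NULL; n = n->next) {
		if (!n->busy)
			continue;
		n->busy = 0;	/* claim the entry while still locked */
		sk_unlock();
		sk_release(n);
		handled++;
		sk_lock();
		/* Any cached next pointer may be stale now: start over. */
		goto restart;
	}
	sk_unlock();
	return (handled);
}
```

Restarting makes the walk quadratic in the worst case, which is acceptable off the critical path, as the commit message notes.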
Kyle Evans
97ce5033a8 linux: reposition the comment for bsd_to_linux_bits/linux_to_bsd_bits
rpokala notes that splitting the definitions like this is kind of silly,
since the comment applies to both.  Move the comment up (or the definition
down, depending on your perspective on life) accordingly.

Reported by:	rpokala
2020-06-29 17:47:00 +00:00
Conrad Meyer
8a64110e43 vm: Add missing WITNESS warnings for M_WAITOK allocation
vm_map_clip_{end,start} and lookup_clip_start allocate memory M_WAITOK
for !system_map vm_maps.  Add WITNESS warning annotation for !system_map
callers who may be holding non-sleepable locks.

Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25283
2020-06-29 16:54:00 +00:00
Hans Petter Selasky
d0eed838e3 Implement is_signed(), type_max() and type_min() function macros in the
LinuxKPI.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-29 13:08:40 +00:00
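
For readers unfamiliar with these helpers, here is one way such macros can be written in portable C. This is a sketch under stated assumptions, not the LinuxKPI definitions from d0eed838e3: the SK_ names are hypothetical, and the macros assume two's-complement integer types without padding bits.

```c
#include <assert.h>
#include <limits.h>

/* True for signed integer types: -1 converted to T stays below 1. */
#define SK_IS_SIGNED(T)	(((T)-1) < (T)1)

/* Highest power of two representable in T that is safe to add to itself - 1. */
#define SK_HALF_MAX(T) \
	((T)1 << (CHAR_BIT * sizeof(T) - 1 - SK_IS_SIGNED(T)))

/* Largest value of T, built without overflowing a signed type. */
#define SK_TYPE_MAX(T)	((T)((SK_HALF_MAX(T) - 1) + SK_HALF_MAX(T)))

/* Smallest value of T: -max - 1 for signed types, 0 for unsigned ones. */
#define SK_TYPE_MIN(T)	((T)((T)-SK_TYPE_MAX(T) - (T)1))

/* Cross-check the macros against <limits.h> for a few common types. */
static void
sk_check_limits(void)
{
	assert(SK_TYPE_MAX(int) == INT_MAX);
	assert(SK_TYPE_MIN(int) == INT_MIN);
	assert(SK_TYPE_MAX(unsigned int) == UINT_MAX);
	assert(SK_TYPE_MIN(unsigned int) == 0);
}
```

Taking a type name rather than a value lets the macros be used in constant expressions such as array bounds or overflow checks.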
Ruslan Bukin
1ea7952510 Coresight: provide device_attach method for FDT bus.
Sponsored by:	DARPA, AFRL
2020-06-29 12:59:09 +00:00
Andrew Turner
b639b3b195 Fix the spelling of identify in the arm64 identcpu code
Sponsored by:	Innovate UK
2020-06-29 09:37:07 +00:00
Andrew Turner
45e999d918 Create a kernel arm64 ID register view
In preparation for using ifuncs in the kernel it is useful to have a common
view of the arm64 ID registers across all CPUs. Add this and extract the
logic for finding the lower value of two fields to a new helper function.

Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D25463
2020-06-29 09:08:36 +00:00
Kyle Evans
5403f186a7 linuxolator: implement memfd_create syscall
This effectively mirrors our libc implementation, but with minor fudging --
name needs to be copied in from userspace, so we just copy it straight into
stack-allocated memfd_name into the correct position rather than allocating
memory that needs to be cleaned up.

The sealing-related fcntl(2) commands, F_GET_SEALS and F_ADD_SEALS, have
also been implemented now that we support them.

Note that this implementation is still not quite at feature parity w.r.t.
the actual Linux version; some caveats, from my foggy memory:

- Need to implement SHM_GROW_ON_WRITE, default for memfd (in progress)
- LTP wants the memfd name exposed to fdescfs
- Linux allows open() of an fdescfs fd with O_TRUNC to truncate after dup.
  (?)

Interested parties can install and run LTP from ports (devel/linux-ltp) to
confirm any fixes.

PR:		240874
Reviewed by:	kib, trasz
Differential Revision:	https://reviews.freebsd.org/D21845
2020-06-29 03:09:14 +00:00
Mark Johnston
8c277118d8 Fix UMA's first-touch policy on systems with empty domains.
Suppose a thread is running on a CPU in a NUMA domain with no physical
RAM.  When an item is freed to a first-touch zone, it ends up in the
cross-domain bucket.  When the bucket is full, it gets placed in another
domain's bucket queue.  However, when allocating an item, UMA will
always go to the keg upon a per-CPU cache miss because the empty
domain's bucket queue will always be empty.  This means that a non-empty
domain's bucket queues can grow very rapidly on such systems.  For
example, it can easily cause mbuf allocation failures when the zone
limit is reached.

Change cache_alloc() to follow a round-robin policy when running on an
empty domain.

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25355
2020-06-28 21:35:04 +00:00
Mark Johnston
3507b8d467 Remove some redundant assignments and computations.
Reported by:	alc
Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25400
2020-06-28 21:34:38 +00:00
Oleksandr Tymoshenko
4c95d46303 Configure rx_delay/tx_delay values for RK3399/RK3328 GMAC
For 1000Mb mode to work reliably TX/RX delays need to be configured
between the TX/RX clock and the respective signals on the PHY
to compensate for differing trace lengths on the PCB.

Reviewed by:	manu
MFC after:	1 week
2020-06-28 21:11:10 +00:00
Edward Tomasz Napierala
4fe5361cbe Make linux(4) support SO_PROTOCOL. Running Python test suite
with python3.8 from Focal triggers those.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25491
2020-06-28 18:56:32 +00:00
Andrew Turner
23e42a83c1 Use EFI memory map to determine attributes for Acpi mappings on arm64.
AcpiOsMapMemory is used for device memory when e.g. an _INI method wants
to access physical memory, however, aarch64 pmap_mapbios is hardcoded to
writeback. Search for the correct memory type to use in pmap_mapbios.

Submitted by:	Greg V <greg_unrelenting.technology>
Differential Revision:	https://reviews.freebsd.org/D25201
2020-06-28 15:03:07 +00:00
Michael Tuexen
e99ce3eac5 Don't send packets containing ERROR chunks in response to unknown
chunks when being in a state where the verification tag to be used
is not known yet.

MFC after:		1 week
2020-06-28 14:11:36 +00:00
Michael Tuexen
f2f66ef6d2 Don't check ch for not being NULL, since that is true.
MFC after:		1 week
2020-06-28 11:12:03 +00:00
Konstantin Belousov
557905569d amd64 pmap: explain ptepindex.
Reviewed by:	markj
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25187
2020-06-27 19:29:07 +00:00
Edward Tomasz Napierala
d5629eb216 Make linux(4) warn about unsupported SA_ flags.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25453
2020-06-27 15:50:35 +00:00
Edward Tomasz Napierala
a39cdcd7e7 Regen.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-27 14:43:29 +00:00
Edward Tomasz Napierala
308e194cbf Add proper types for linux message queue syscalls; mostly taken
from 32-bit Linuxulator.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25386
2020-06-27 14:42:08 +00:00
Edward Tomasz Napierala
36507f85dc Add syscall definitions for linux xattr syscalls.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25387
2020-06-27 14:39:44 +00:00
Edward Tomasz Napierala
8036e7876d Adjust types of linuxulator syscalls, to match include/linux/syscalls.h
in vanilla Linux git tree.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25385
2020-06-27 14:37:36 +00:00
Li-Wen Hsu
18db3c616f rtwn: Add a USB ID for Buffalo WI-U2-433DHP
PR:		247573
Submitted by:	HATANO Tomomi <hatanou@infolab.ne.jp>
MFC after:	1 week
2020-06-27 07:34:15 +00:00
Adrian Chadd
b5e7ee4718 [ath_hal] Add KeyMiss for AR5212/AR5416 series chips.
This is a flag from the MAC that says the received packet didn't match
a keycache slot.  This isn't technically a problem as WEP keys don't
match keycache slots (they're "global" keys), but it could be useful
for tracking down CCMP decryption failures.

Right now it's a no-op - it mirrors what the AR9300 HAL does and it
just increments a counter.  But, hey, maybe one day I'll use it for
diagnosing keycache/CCMP decrypt issues.
2020-06-27 02:59:51 +00:00
Konstantin Belousov
ee06cffcd2 vm_page_free_prep(): correct description of the required page and object state.
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D25482
2020-06-27 02:31:39 +00:00
Matt Macy
4dc16f4391 Fix "current" variable name conflict with openzfs
The variable "current" is an alias for curthread
in openzfs. Rename all variable uses of current
in dtrace.c to curstate.
2020-06-27 00:57:48 +00:00
Matt Macy
56e5ad5ff7 Rename nvpair.c to bsd_nvpair.c to not conflict with openzfs' version.
2020-06-27 00:55:03 +00:00
Alexander Motin
4cee4598e7 Add mostly dummy hw.pci.enable_aspm tunable.
The only thing this tunable enables now is reporting to ACPI _OSC that
Active State Power Management and Clock Power Management Capability are
"supported" by the OS.

I've found that at least some Supermicro server boards do not allow OS
to support native PCIe hot-plug unless it reports those capabilities.
After spending significant time in PCIe specs I have found very little
motivation for that, and none of it applies to those motherboards, not
enabling ASPM themselves.  So unless OS explicitly wants to save power,
I see nothing for it to do there actually.

I guess it may make sense to support ASPM when we get Thunderbolt support.
Otherwise I have no system with PCIe hot-plug where power saving matters.

It would be nice to enable this by default, but I worry that it affects
power saving of some laptops, even though I haven't noticed that myself.
2020-06-26 19:55:11 +00:00
Andriy Gapon
4302208388 sound/hda: fix interrupt handler endless loop after r362294
Not all interrupt sources that affect CIS bit were acknowledged.
Specifically, bits in STATESTS (aka WAKESTS) were left set.

The fix is to disable WAKEEN and clear STATESTS bits before the HDA
interrupt is enabled.  This way we should never get any STATESTS bits.

I also added placeholders for all event bits that we currently do not
enable, do not handle and do not clear.  This might become useful when / if
we enable any of them.

Reported by:	kib (Apollo Lake hardware)
Tested by:	kib (earlier, different change)
MFC after:	2 weeks
X-MFC with:	r362294
2020-06-26 09:46:03 +00:00
Andriy Gapon
8bf2c3c9f6 ena: fix module build after r362530
Somehow I missed the makefile when moving the change from phabricator to
svn.

MFC after:	1 week
X-MFC with:	r362530
2020-06-26 09:32:57 +00:00
Rick Macklem
db4b8d7e0d Bump the version since r362639 changed the internal API between the NFS
kernel modules so they must all be rebuilt.
2020-06-26 03:14:30 +00:00
Rick Macklem
4476c1def0 Add a boolean argument to nfscl_reqstart() to indicate that ext_pgs mbufs
should be used.

For KERN_TLS (and possibly some other future network interface) the mbuf
list passed into sosend() must be ext_pgs mbufs. The krpc could simply
copy all the mbuf data into ext_pgs mbufs before calling sosend(), but
that would be inefficient for large RPC messages.
This patch adds an argument to nfscl_reqstart() to indicate that it should
fill the RPC message into ext_pgs mbufs.
It also adds fields to "struct nfsrv_descript" needed for building NFS RPC
messages in ext_pgs mbufs, along with new flags for this.

Since the argument is always "false", this commit should not result in any
semantic change. However, this commit prepares the code
for future commits that will add support for building of NFS RPC messages
in ext_pgs mbufs.
2020-06-26 03:11:54 +00:00
John Baldwin
94578db218 Reduce contention on per-adapter lock.
- Move temporary sglists into the session structure and protect them
  with a per-session lock instead of a per-adapter lock.

- Retire an unused session field, and move a debugging field under
  INVARIANTS to avoid using the session lock for completion handling
  when INVARIANTS isn't enabled.

- Use counter_u64 for per-adapter statistics.

Note that this helps for cases where multiple sessions are used
(e.g. multiple IPsec SAs or multiple KTLS connections).  It does not
help for workloads that use a single session (e.g. a single GELI
volume).

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25457
2020-06-26 00:01:31 +00:00
John Baldwin
dae61c9d09 Simplify IPsec transform-specific teardown.
- Rename the teardown callback from 'zeroize' to 'cleanup' since
  this no longer zeroes keys.

- Change the callback return type to void.  Nothing checked the return
  value and it was always zero.

- Don't have esp call into ah since it no longer needs to depend on
  this to clear the auth key.  Instead, both are now private and
  self-contained.

Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25443
2020-06-25 23:59:16 +00:00
John Baldwin
f82eb2a6f0 Enter and exit the network epoch for async IPsec callbacks.
When an IPsec packet has been encrypted or decrypted, the next step in
the packet's traversal through the network stack is invoked from a
crypto worker thread, not from the original calling thread.  These
threads need to enter the network epoch before passing packets down to
IP output routines or up to transport protocols.

Reviewed by:	ae
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25444
2020-06-25 23:57:30 +00:00
Vincenzo Maffione
9503233f87 iflib: fix compilation issue introduced in r362621
The ifp local variable is useful even without netmap
and altq, as it is used to check for IFF_DRV_RUNNING.

MFC after:	2 weeks
2020-06-25 20:43:21 +00:00
John Baldwin
20869b25cc Use zfree() to explicitly zero IPsec keys.
Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25442
2020-06-25 20:31:06 +00:00
Mark Johnston
f4134e3d87 Implement an approximation of Linux MADV_DONTNEED semantics.
Linux MADV_DONTNEED is not advisory: it has side effects for anonymous
memory, and some system software depends on that.  In particular,
MADV_DONTNEED causes anonymous pages to be discarded.  If the mapping is
a private mapping of a named object then subsequent faults are to
repopulate the range from that object, otherwise pages will be
zero-filled.  For mappings of non-anonymous objects, Linux MADV_DONTNEED
can be implemented in the same way as our MADV_DONTNEED.

This implementation differs from Linux semantics in its handling of
private mappings, inherited through fork(), of non-anonymous objects.
After applying MADV_DONTNEED, subsequent faults will repopulate the
mapping from the parent object rather than the root of the shadow chain.

PR:		230160
Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25330
2020-06-25 20:30:30 +00:00
Alexander Motin
701267ad19 Fix a few panics when NVMe initialization requests time out.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-06-25 20:29:29 +00:00
John Baldwin
6572e5ff66 Use explicit_bzero() instead of bzero() for sensitive data.
Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25441
2020-06-25 20:25:35 +00:00
John Baldwin
9b6dc28176 Explicitly zero the temporary auth context used to generate HMAC state.
Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25439
2020-06-25 20:22:44 +00:00
John Baldwin
347c369294 Explicitly zero hash results and context in glxsb_authcompute().
Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25438
2020-06-25 20:21:34 +00:00
John Baldwin
b172f23dd7 Use zfree() instead of bzero() and free().
These bzero's should have been explicit_bzero's.

Reviewed by:	cem, delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25437
2020-06-25 20:20:22 +00:00
John Baldwin
17a831ea25 Zero the temporary HMAC key in hmac_init_pad().
Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25436
2020-06-25 20:18:55 +00:00
John Baldwin
4a711b8d04 Use zfree() instead of explicit_bzero() and free().
In addition to reducing lines of code, this also ensures that the full
allocation is always zeroed avoiding possible bugs with incorrect
lengths passed to explicit_bzero().

Suggested by:	cem
Reviewed by:	cem, delphij
Approved by:	csprng (cem)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25435
2020-06-25 20:17:34 +00:00
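
The point made in 4a711b8d04, that zeroing the whole allocation at free time avoids bugs from wrong lengths, can be illustrated with a userspace analogue. This is only a sketch: the kernel's zfree() learns the size from the allocator, whereas this version assumes a hypothetical size header in front of each buffer, and all sk_ names are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed before each buffer; the union keeps user data aligned. */
union sk_hdr {
	size_t size;
	max_align_t align;
};

/* Zero memory through a volatile pointer so the stores are not elided. */
static void
sk_wipe(void *p, size_t len)
{
	volatile unsigned char *vp = p;

	assert(p != NULL || len == 0);
	while (len-- > 0)
		*vp++ = 0;
}

/* Allocate size bytes and remember the size for the free path. */
static void *
sk_alloc(size_t size)
{
	union sk_hdr *h;

	if (size > (size_t)-1 - sizeof(*h))
		return (NULL);
	h = malloc(sizeof(*h) + size);
	if (h == NULL)
		return (NULL);
	h->size = size;
	return (h + 1);
}

/* Size recorded at allocation time. */
static size_t
sk_alloc_size(const void *p)
{
	return (((const union sk_hdr *)p - 1)->size);
}

/* Wipe the full recorded allocation, then free it; NULL is a no-op. */
static void
sk_zfree(void *p)
{
	if (p == NULL)
		return;
	sk_wipe(p, sk_alloc_size(p));
	free((union sk_hdr *)p - 1);
}
```

Because the caller never passes a length to sk_zfree(), there is no length to get wrong.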
Vincenzo Maffione
d8b2d26b15 iflib: netmap: add support for partial ring openings
Reviewed by:	gallatin
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25254
2020-06-25 19:44:24 +00:00
Vincenzo Maffione
88a688663a iflib: netmap: add per-tx-queue netmap support
Reviewed by:	gallatin
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25253
2020-06-25 19:35:43 +00:00
Mark Johnston
90297b6471 Add SCTP_SUPPORT to the default kernel options.
Otherwise out-of-tree module builds will be broken for a lack of a
definition of MK_SCTP_SUPPORT.

Reported by:	Michael Butler <imb@protected-networks.net>
MFC with:	r362614
Sponsored by:	The FreeBSD Foundation
2020-06-25 19:12:27 +00:00
Doug Moore
3a509754de Eliminate the color field from the RB element struct. Identify the
color of a node (or, really, the color of the link from the parent to
the node) by using one of the last two bits of the parent pointer in
that parent node. Adjust rebalancing methods to account for where
colors are stored, and the fact that null children have a color too.

Adjust RB_PARENT and RB_SET_PARENT to account for this change.

Reviewed by:	markj
Tested by:	pho, hselasky
Differential Revision:	https://reviews.freebsd.org/D25418
2020-06-25 17:44:14 +00:00
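A minimal sketch of the idea in 3a509754de: keep link colors in the two low bits of the parent-pointer word instead of a separate color field. This is not the tree.h code; the struct layout, the bit assignment, and all `*_sketch` names are assumptions for illustration, and it relies on nodes being at least 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignment: which child link of this node is red. */
#define	RB_RED_LEFT	((uintptr_t)1)
#define	RB_RED_RIGHT	((uintptr_t)2)
#define	RB_RED_MASK	(RB_RED_LEFT | RB_RED_RIGHT)

struct rb_node_sketch {
	struct rb_node_sketch	*left;
	struct rb_node_sketch	*right;
	uintptr_t		 parent_bits;	/* parent pointer | color bits */
};

/* Read the parent pointer, masking off the color bits. */
static struct rb_node_sketch *
rb_parent_sketch(const struct rb_node_sketch *n)
{
	return ((struct rb_node_sketch *)(n->parent_bits & ~RB_RED_MASK));
}

/* Replace the parent pointer while preserving the color bits. */
static void
rb_set_parent_sketch(struct rb_node_sketch *n, struct rb_node_sketch *p)
{
	assert(((uintptr_t)p & RB_RED_MASK) == 0);
	n->parent_bits = (uintptr_t)p | (n->parent_bits & RB_RED_MASK);
}

/* Is the link from n to the given child (left or right) red? */
static int
rb_link_is_red_sketch(const struct rb_node_sketch *n, uintptr_t which)
{
	return ((n->parent_bits & which) != 0);
}

/* Color the link from n to the given child; works for null children too. */
static void
rb_set_link_red_sketch(struct rb_node_sketch *n, uintptr_t which, int red)
{
	if (red)
		n->parent_bits |= which;
	else
		n->parent_bits &= ~which;
}
```

Since the color belongs to the link rather than to the child, a null child still has a color, which matches the note in the commit message about rebalancing.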
Navdeep Parhar
7c228be30b cxgbe(4): Add a pointer to the adapter softc in vi_info.
There were quite a few places where port_info was being accessed only to
get to the adapter.

Reviewed by:	jhb@
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25432
2020-06-25 17:04:22 +00:00
Mark Johnston
79ddb55c39 Add SCTP_SUPPORT handling to config.mk.
Reviewed by:	jhb, tuexen
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25402
2020-06-25 15:25:00 +00:00
Mark Johnston
84242cf68a Call swap_pager_freespace() from vm_object_page_remove().
All vm_object_page_remove() callers, except
linux_invalidate_mapping_pages() in the LinuxKPI, free swap space when
removing a range of pages from an object.  The LinuxKPI case appears to
be an unintentional omission that could result in leaked swap blocks, so
unconditionally free swap space in vm_object_page_remove() to protect
against similar bugs in the future.

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25329
2020-06-25 15:21:21 +00:00
Conrad Meyer
4daa95f85d bhyve(8): For prototyping, reattempt decode in userspace
If userspace has a newer bhyve than the kernel, it may be able to decode
and emulate some instructions vmm.ko is unaware of.  In this scenario,
reset decoder state and try again.

Reviewed by:	grehan
Differential Revision:	https://reviews.freebsd.org/D24464
2020-06-25 00:18:42 +00:00
Vladimir Kondratyev
54cca285fc atkbd/evdev: recognize the Chromebook menu key as F13 like Linux does.
This is the key on the right side of the function keys, with the
"hamburger menu" icon on it.

Submitted by:		GregV <greg@unrelenting.technology>
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D25390
2020-06-25 00:09:43 +00:00
Mark Johnston
ddf1843203 acpi_ibm(4): Rename disengaged mode to unthrottled mode.
This mode was added in r362496.  Rename it to make the meaning more
clear.

PR:		247306
Suggested by:	rpokala
Submitted by:	Ali Abdallah <ali.abdallah@suse.com>
MFC with:	r362496
2020-06-24 19:51:03 +00:00
Enji Cooper
d6701b6c8c Add kern.features.witness
Adding `kern.features.witness` helps expose whether or not the kernel has
`options WITNESS` enabled, so the `feature_present(3)` API can be used
to query whether or not witness(9) is built into the kernel.

This support is helpful with userspace applications (generally speaking,
tests), as it can be queried to determine whether or not tests related
to WITNESS should be run.

MFC after:	1 week
Reviewed by: cem, darrick.freebsd_gmail.com
Differential Revision: https://reviews.freebsd.org/D25302
Sponsored by:	DellEMC Isilon
2020-06-24 18:51:01 +00:00
Mark Johnston
1388cfe1b5 ipfw(4): make O_IPVER/ipversion match IPv4 or 6, not just IPv4.
Submitted by:	Neel Chauhan <neel AT neelc DOT org>
Reviewed by:	Lutz Donnerhacke
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25227
2020-06-24 15:46:33 +00:00
Mitchell Horne
133b1f1461 Only invalidate the early DTB mapping if it exists
This temporary mapping will become optional. Booting via loader(8)
means that the DTB will have already been copied into the kernel's
staging area, and is therefore covered by the early KVA mappings.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D24911
2020-06-24 15:21:12 +00:00
Mitchell Horne
f7d2df2a8a Handle load from loader(8)
In locore, we must detect and handle different arguments passed by
loader(8) compared to what we receive when booting directly via SBI
firmware. Currently we receive the hart ID in a0 and a pointer to the
device tree blob in a1. loader(8) provides only a pointer to its
metadata in a0.

The solution to this is to add an additional entry point, _alt_start.
This will be placed first in the .text section, so SBI firmware will
enter here, and jump to the common pagetable setup shortly after. Since
loader(8) understands our ELF kernel, it will enter at the ELF's entry
address, which points to _start. This approach leads to very little
guesswork as to which way we booted.

Fix-up initriscv() to parse the loader's metadata, continuing to use
fake_preload_metadata() in the SBI direct boot case.

Reviewed by:	markj, jrtc27 (asm portion)
Differential Revision:	https://reviews.freebsd.org/D24912
2020-06-24 15:20:00 +00:00
Michael Tuexen
132c073866 Fix the accounting for fragmented unordered messages when using
interleaving.
This was reported for the userland stack in
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=19321

MFC after:		1 week
2020-06-24 14:47:51 +00:00
Richard Scheffenegger
6e26dd0dbe TCP: fix cubic RTO reaction.
Proper TCP Cubic operation requires the knowledge
of the maximum congestion window prior to the
last congestion event.

This restores and improves a bugfix previously added
by jtl@ but subsequently removed due to a revert.

Reported by:	chengc_netapp.com
Reviewed by:	chengc_netapp.com, tuexen (mentor)
Approved by:	tuexen (mentor), rgrimes (mentor)
MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D25133
2020-06-24 13:52:53 +00:00
Richard Scheffenegger
9dc7d8a246 TCP: make after-idle work for transactional sessions.
The use of t_rcvtime as proxy for the last transmission
fails for transactional IO, where the client requests
data before the server can respond with a bulk transfer.

Set aside a dedicated variable to actually track the last
locally sent segment going forward.

Reported by:	rrs
Reviewed by:	rrs, tuexen (mentor)
Approved by:	tuexen (mentor), rgrimes (mentor)
MFC after:	2 weeks
Sponsored by:	NetApp, Inc.
Differential Revision:	https://reviews.freebsd.org/D25016
2020-06-24 13:42:42 +00:00
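The failure mode described in 9dc7d8a246 can be shown with a toy model. This is a sketch, not the TCP stack: the struct, its fields, and both helper names are hypothetical, and times are plain tick counts.

```c
#include <assert.h>

/* Hypothetical per-connection timestamps, in ticks. */
struct conn_sketch {
	unsigned	last_rcv;	/* last segment received */
	unsigned	last_snd;	/* last segment sent locally */
	unsigned	rto;		/* idle threshold */
};

/* Old proxy: treat the receive time as if it were the last send time. */
int
idle_by_rcvtime_sketch(const struct conn_sketch *c, unsigned now)
{
	return (now - c->last_rcv >= c->rto);
}

/* Dedicated variable: measure idleness from the last local transmission. */
int
idle_by_sndtime_sketch(const struct conn_sketch *c, unsigned now)
{
	return (now - c->last_snd >= c->rto);
}
```

In a transactional exchange the client's request arrives just before the server starts its bulk reply, so the receive timestamp is fresh even though the server has sent nothing for a long time; only the send-based check reports the connection as idle.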
Marcin Wojtas
fab2a758cc Fix AccessWidth and BitWidth parsing in SPCR table
The ACPI Specification defines a Generic Address Structure (GAS),
which is used to describe UART controller register layout in the
SPCR table. The driver responsible for parsing it (uart_cpu_acpi)
wrongly associates the Access Size field to the uart_bas's regshft
and the register BitWidth to the regiowidth - according to
the definitions it should be the opposite.

This problem remained hidden most likely because the majority of platforms
use 32-bit registers (BitWidth) which are accessed with the corresponding
size (Dword). However, on Marvell Armada 8k / Cn913x platforms,
the 32-bit registers should be accessed with Byte granularity, which
unveiled the issue.

This patch fixes the above by assigning the proper values and slightly
improving the parsing.

Note that handling of the AccessWidth set to EFI_ACPI_6_0_UNDEFINED is
needed to work around a buggy SPCR table on EC2 x86 "bare metal" instances.

Reviewed by: manu, imp, cperciva, greg_unrelenting.technology
Obtained from: Semihalf
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25373
2020-06-24 12:15:27 +00:00
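The corrected association in fab2a758cc (register BitWidth determines the register shift, Access Size determines the I/O width) might look roughly like the sketch below. It is not the uart_cpu_acpi code: the struct and function names are hypothetical, and the fallback used for an undefined access size is an assumption, not necessarily what the driver does.

```c
#include <assert.h>

/* Hypothetical subset of an ACPI Generic Address Structure. */
struct gas_sketch {
	unsigned	bit_width;	/* register width in bits */
	unsigned	access_size;	/* 0 undefined, 1 byte, 2 word, 3 dword, 4 qword */
};

/* Hypothetical subset of the UART bus access settings. */
struct bas_sketch {
	unsigned	regshft;	/* log2 of the register stride in bytes */
	unsigned	regiowidth;	/* bytes per access */
};

/* Returns 0 on success, -1 for values the sketch does not understand. */
int
gas_to_bas_sketch(const struct gas_sketch *gas, struct bas_sketch *bas)
{
	/* Register spacing comes from the register BitWidth. */
	switch (gas->bit_width) {
	case 8:
		bas->regshft = 0;
		break;
	case 16:
		bas->regshft = 1;
		break;
	case 32:
		bas->regshft = 2;
		break;
	case 64:
		bas->regshft = 3;
		break;
	default:
		return (-1);
	}

	/* Access granularity comes from the Access Size field. */
	switch (gas->access_size) {
	case 0:
		/* Assumption: fall back to the register width when undefined. */
		bas->regiowidth = gas->bit_width / 8;
		break;
	case 1:
	case 2:
	case 3:
	case 4:
		bas->regiowidth = 1u << (gas->access_size - 1);
		break;
	default:
		return (-1);
	}
	return (0);
}
```

For the Armada case in the commit message (32-bit registers, byte access) this yields a shift of 2 and an I/O width of 1, whereas the common 32-bit/Dword case yields 2 and 4, which is why swapping the two fields went unnoticed there.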
Michael Tuexen
87c0bf77d9 Fix alignment issue manifesting in the userland stack.
MFC after:		1 week
2020-06-23 23:05:05 +00:00
Doug Moore
158c55a584 In r362552, RB_SET_PARENT is defined, and used in parens in
RB_CLEAR_NODE.  But it is not an expression, and ought not to be
enclosed in parens.  Remove them.

Approved by:	markj
Differential Revision:	https://reviews.freebsd.org/D25421
2020-06-23 22:47:54 +00:00
Kirk McKusick
9407f25df2 Optimize g_journal's superblock update by noting that the summary
information is neither read nor written so it need not be written
out when updating the superblock.

PR:           247425
Sponsored by: Netflix
2020-06-23 21:44:00 +00:00
Vincenzo Maffione
0ff2126795 iflib: netmap: fix rxsync index overrun
In the current iflib_netmap_rxsync, there is nothing that prevents
kring->nr_hwtail to overrun kring->nr_hwcur during the descriptor
import phase. This may cause errors in netmap applications, such as:

em1 RX0: fail 'head < kring->nr_hwcur || head > kring->nr_hwtail'
    h 795 c 795 t 282 rh 795 rc 795 rt 282 hc 282 ht 282

Reviewed by:	gallatin
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25252
2020-06-23 20:23:56 +00:00
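The invariant behind 0ff2126795 is that the tail index of a circular ring must never pass the current index while descriptors are imported. A generic sketch of that bound follows; it is not the iflib_netmap_rxsync code, both function names are hypothetical, and the ring size of 1024 mentioned below is assumed, not taken from the log.

```c
#include <assert.h>

/*
 * Number of slots the tail may still advance before it would catch up
 * with cur. One slot is always left unused so that a full ring can be
 * told apart from an empty one.
 */
unsigned
ring_import_budget_sketch(unsigned hwcur, unsigned hwtail, unsigned num_slots)
{
	assert(num_slots > 1 && hwcur < num_slots && hwtail < num_slots);
	return ((hwcur + num_slots - 1 - hwtail) % num_slots);
}

/* Import up to avail descriptors, clamped to the budget; returns new tail. */
unsigned
ring_import_sketch(unsigned hwcur, unsigned hwtail, unsigned num_slots,
    unsigned avail)
{
	unsigned budget;

	budget = ring_import_budget_sketch(hwcur, hwtail, num_slots);
	if (avail > budget)
		avail = budget;
	return ((hwtail + avail) % num_slots);
}
```

With the indices from the error message above (cur 795, tail 282) and an assumed ring of 1024 slots, the budget is 512 slots; importing more than that would move the tail past cur and trip the consistency check.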
Doug Moore
4d56980017 Define RB_SET_PARENT to do all assignments to rb parent
pointers. Define RB_SWAP_CHILD to replace the child of a parent with
its twin, and use it in 4 places. Use RB_SET in rb_link_node to remove
the only linuxkpi reference to color, and then drop color- and
parent-related definitions that are defined and used only in rbtree.h.

This is intended to be entirely cosmetic, with no impact on program
behavior, and leave RB_PARENT and RB_SET_PARENT as the only ways to
read and write rb parent pointers.

Reviewed by:	markj, kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D25264
2020-06-23 20:02:55 +00:00
Conrad Meyer
9b6edf364e kmod.mk: Don't split out debug symbols if requested
Ports bsd.kmod.mk explicitly sets MK_KERNEL_SYMBOLS=no to prevent auto-
splitting of debuginfo from kernel modules.  If that knob is set, don't
split out a .ko.debug and .ko from .ko.full; just generate a .ko with
debuginfo and leave it be.

Otherwise, with DEBUG_FLAGS set and MK_KERNEL_SYMBOLS=no, we would helpfully
strip out the debuginfo from the .ko.full and then not install it.  That is
not the desired result for a WITH_DEBUG port kmod build.

Reviewed by:	emaste, jhb
Differential Revision:	https://reviews.freebsd.org/D24835
2020-06-23 18:25:31 +00:00
Ed Maste
e46cf959d6 arm64 armreg.h: fix TCR_TBI1 definition
Submitted by:	Greg V <greg@unrelenting.technology>
Differential Revision:	https://reviews.freebsd.org/D25411
2020-06-23 15:32:05 +00:00
Tycho Nightingale
c774294c57 To avoid a startup script race change net.bpf.optimize_writers from
CTLFLAG_RW to CTLFLAG_RWTUN to allow it to be modified by a loader
tunable.

Sponsored by:	Dell EMC Isilon
2020-06-23 13:57:53 +00:00
Navdeep Parhar
0cadedfc46 cxgbe(4): Add a tx_len16_to_desc helper.
No functional change.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-06-23 07:33:29 +00:00
Toomas Soome
a14844e0d6 MFOpenZFS: Add basic zfs ioc input nvpair validation
We want newer versions of libzfs_core to run against an existing
zfs kernel module (i.e. a deferred reboot or module reload after
an update).

Programmatically document, via a zfs_ioc_key_t, the valid arguments
for the ioc commands that rely on nvpair input arguments (i.e. non
legacy commands from libzfs_core). Automatically verify the expected
pairs before dispatching a command.

This initial phase focuses on the non-legacy ioctls. A follow-on
change can address the legacy ioctl input from the zfs_cmd_t.

The zfs_ioc_key_t for zfs_keys_channel_program looks like:

static const zfs_ioc_key_t zfs_keys_channel_program[] = {
       {"program",     DATA_TYPE_STRING,               0},
       {"arg",         DATA_TYPE_UNKNOWN,              0},
       {"sync",        DATA_TYPE_BOOLEAN_VALUE,        ZK_OPTIONAL},
       {"instrlimit",  DATA_TYPE_UINT64,               ZK_OPTIONAL},
       {"memlimit",    DATA_TYPE_UINT64,               ZK_OPTIONAL},
};

Introduce four input errors to identify specific input failures
(in addition to generic argument value errors like EINVAL, ERANGE,
EBADF, and E2BIG).

ZFS_ERR_IOC_CMD_UNAVAIL the ioctl number is not supported by kernel
ZFS_ERR_IOC_ARG_UNAVAIL an input argument is not supported by kernel
ZFS_ERR_IOC_ARG_REQUIRED a required input argument is missing
ZFS_ERR_IOC_ARG_BADTYPE an input argument has an invalid type

Reviewed by:	allanjude
Obtained from:	OpenZFS
Sponsored by:	Netflix, Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D25393
2020-06-23 06:42:39 +00:00
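The key-table check described in a14844e0d6 can be sketched independently of ZFS. The types below are simplified stand-ins, not `zfs_ioc_key_t` or nvlists; every `sk_*` / `SK_*` name is hypothetical, and only three of the four error cases are modeled (the ioctl-number check is outside the scope of a key table).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sk_type { SK_STRING, SK_UINT64, SK_BOOLEAN, SK_ANY };
enum sk_err { SK_OK, SK_ARG_UNAVAIL, SK_ARG_REQUIRED, SK_ARG_BADTYPE };

#define	SK_OPTIONAL	0x1u

struct sk_key {			/* one accepted argument */
	const char	*name;
	enum sk_type	 type;
	unsigned	 flags;
};

struct sk_pair {		/* one argument supplied by the caller */
	const char	*name;
	enum sk_type	 type;
};

static const struct sk_key *
sk_find_key(const struct sk_key *keys, size_t nkeys, const char *name)
{
	for (size_t i = 0; i < nkeys; i++)
		if (strcmp(keys[i].name, name) == 0)
			return (&keys[i]);
	return (NULL);
}

static int
sk_has_pair(const struct sk_pair *pairs, size_t npairs, const char *name)
{
	for (size_t i = 0; i < npairs; i++)
		if (strcmp(pairs[i].name, name) == 0)
			return (1);
	return (0);
}

enum sk_err
sk_validate(const struct sk_key *keys, size_t nkeys,
    const struct sk_pair *pairs, size_t npairs)
{
	const struct sk_key *key;

	/* Every supplied pair must be a known key of an acceptable type. */
	for (size_t i = 0; i < npairs; i++) {
		key = sk_find_key(keys, nkeys, pairs[i].name);
		if (key == NULL)
			return (SK_ARG_UNAVAIL);
		if (key->type != SK_ANY && key->type != pairs[i].type)
			return (SK_ARG_BADTYPE);
	}
	/* Every key not marked optional must be present. */
	for (size_t i = 0; i < nkeys; i++) {
		if ((keys[i].flags & SK_OPTIONAL) == 0 &&
		    !sk_has_pair(pairs, npairs, keys[i].name))
			return (SK_ARG_REQUIRED);
	}
	return (SK_OK);
}
```

Running the check before dispatch means a newer library talking to an older module gets a specific error for an unknown argument instead of having it silently ignored.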
Andriy Gapon
b40dd828bd teach ena driver about RSS kernel option
Networking is broken if the driver configures its (virtual) hardware to
use a hash algorithm (or a key) different from the one that the network
stack (software RSS) uses.  This can be seen with connections initiated
from the host.  The PCB will be placed into the hash table based on the
hash value calculated by the software.  The hardware-calculated hash
value in response packets will be different, so the PCB won't be found.

Tested with a kernel compiled with 'options RSS' on an instance with ena
driver.

Reviewed by:	mw, adrian
MFC after:	2 weeks
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D24733
2020-06-23 04:58:36 +00:00
John Baldwin
5b750b9a68 Store the AAD in a separate buffer for KTLS.
For TLS 1.2 this permits reusing one of the existing iovecs without
always having to duplicate both.

While here, only duplicate the output iovec for TLS 1.3 if it will be
used.

Reviewed by:	gallatin
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25291
2020-06-23 00:02:28 +00:00
John Baldwin
6deb4131b8 Add support for requests with separate AAD to ccr(4).
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25290
2020-06-22 23:41:33 +00:00
John Baldwin
604b021795 Add support for requests with separate AAD to aesni(4).
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25289
2020-06-22 23:22:13 +00:00
John Baldwin
9b774dc0c5 Add support to the crypto framework for separate AAD buffers.
This permits requests to provide the AAD in a separate side buffer
instead of as a region in the crypto request input buffer.  This is
useful when the main data buffer might not contain the full AAD
(e.g. for TLS or IPsec with ESN).

Unlike separate IVs which are constrained in size and stored in an
array in struct cryptop, separate AAD is provided by the caller
setting a new crp_aad pointer to the buffer.  The caller must ensure
the pointer remains valid and the buffer contents static until the
request is completed (e.g. when the callback routine is invoked).

As with separate output buffers, not all drivers support this feature.
Consumers must request use of this feature via a new session flag.

To aid in driver testing, kern.crypto.cryptodev_separate_aad can be
set to force /dev/crypto requests to use a separate AAD buffer.

Discussed with:	cem
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25288
2020-06-22 23:20:43 +00:00
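How a driver might locate the AAD once 9b774dc0c5 allows a side buffer can be sketched as follows. Only the `crp_aad` pointer is named in the commit message; the struct, the other field names, and `req_aad_sketch` are hypothetical and much simpler than struct cryptop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced crypto request. */
struct req_sketch {
	const unsigned char	*buf;		/* main data buffer */
	size_t			 aad_start;	/* AAD offset inside buf */
	size_t			 aad_len;	/* AAD length in bytes */
	const unsigned char	*crp_aad;	/* separate AAD, or NULL */
};

/*
 * Return where the AAD lives: the side buffer when the caller set one,
 * otherwise the region inside the main buffer. The caller owns the side
 * buffer and must keep it valid until the request completes.
 */
const unsigned char *
req_aad_sketch(const struct req_sketch *req)
{
	if (req->crp_aad != NULL)
		return (req->crp_aad);
	return (req->buf + req->aad_start);
}
```

Keeping the choice in one accessor lets the rest of a driver stay unaware of which layout the consumer picked.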
Jung-uk Kim
450d86fc7f Assume all TSCs are synchronized for AMD Family 17h processors and later
when it has passed the synchronization test.

"Processor Programming Reference (PPR) for AMD Family 17h" states that
the TSC uses a common reference for all sockets, cores and threads.

MFC after:	1 month
2020-06-22 20:42:58 +00:00
Allan Jude
c5305bb50a MFOpenZFS: Add zio_ddt_free()+ddt_phys_decref() error handling
The assumption in zio_ddt_free() is that ddt_phys_select() must
always find a match.  However, if that fails due to a damaged
DDT or some other reason the code will NULL dereference in
ddt_phys_decref().

While this should never happen it has been observed on various
platforms.  The result is that unless you're willing to patch the
ZFS code the pool is inaccessible.  Therefore, we're choosing
to more gracefully handle this case rather than leave it fatal.

http://mail.opensolaris.org/pipermail/zfs-discuss/2012-February/050972.html

5dc6af0eec

Reported by:	Pierre Beyssac
Obtained from:	OpenZFS
MFC after:	2 weeks
Sponsored by:	Klara Inc.
2020-06-22 19:03:02 +00:00
Michael Tuexen
b88082dd39 No need to include netinet/sctp_crc32.h twice.
2020-06-22 14:36:14 +00:00
Mark Johnston
e6db509d10 Move the definition of SCTP's system_base_info into sctp_crc32.c.
This file is the only SCTP source file compiled into the kernel when
SCTP_SUPPORT is configured.  sctp_delayed_checksum() references a couple
of counters defined in system_base_info, so the change allows these
counters to be referenced in a kernel compiled without "options SCTP".

Submitted by:	tuexen
MFC with:	r362338
2020-06-22 14:01:31 +00:00
Mark Johnston
9f763f0092 acpi_ibm(4): Add support for putting fans in disengaged mode.
PR:		247306
Submitted by:	Ali Abdallah <ali.abdallah@suse.com>
MFC after:	2 weeks
2020-06-22 12:36:05 +00:00
Andrew Turner
372c142b4f Translate the PCI address when activating a resource
When the PCI address != physical address we need to translate from the
former to the latter before passing to the parent to map into the kernel's
virtual address space.

Sponsored by:	Innovate UK
2020-06-22 10:49:50 +00:00
Andriy Gapon
f31030ba61 gpiobus_release_pin: remove incorrect prefix from error messages
It's interesting that similar messages from gpiobus_acquire_pin never
had any prefix while gpiobus_release_pin messages were prefixed with
"gpiobus_acquire_pin".
Anyway, the prefix is not that useful and can be deduced from context.

MFC after:	2 weeks
2020-06-22 10:32:41 +00:00
Doug Rabson
c07782e10e Add some missing parts for supporting va_birthtime.
Reviewed by:	rmacklem
2020-06-22 08:23:16 +00:00
Andrew Turner
fc0804f18b Fix reboot command on the Raspberry Pi series.
The Raspberry Pi computers do not properly implement PSCI. The canonical
way to reset them is to set a watchdog timer and allow it to expire.

Submitted by:	Robert Crowston <crowston_protonmail.com>
Differential Revision:	https://reviews.freebsd.org/D25268
2020-06-22 08:12:21 +00:00
Baptiste Daroussin
5b990a9463 Revert r362466
Such a change should not have happened without prior discussion and review.

With hat:	transitioning core
2020-06-22 07:46:24 +00:00
Alexander V. Chernikov
b158cfb3fc Switch cxgbe interface lookup to use fibX_lookup() from older
fibX_lookup_nh_ext().

fibX_lookup_nh_ represents the pre-epoch generation of the fib KPI,
providing fewer guarantees about pointer validity and requiring
on-stack data copying.

Reviewed by:	np
Differential Revision:	https://reviews.freebsd.org/D24975
2020-06-22 07:35:23 +00:00
Michael Tuexen
c5d9e5c99e Clean up the definition of struct sctp_getaddresses. This structure
is used by the IPPROTO_SCTP level socket options SCTP_GET_PEER_ADDRESSES
and SCTP_GET_LOCAL_ADDRESSES, which are used by libc to implement
sctp_getladdrs() and sctp_getpaddrs().
These changes allow an old libc to work on a newer kernel.
2020-06-21 23:12:56 +00:00
Bjoern A. Zeeb
e387af1fa8 Rather than zeroing MAXVIFS times size of pointer [r362289] (still better than
sizeof pointer before [r354857]), we need to zero MAXVIFS times the size of
the struct.  All good things come in threes; I hope this is it on this one.

PR:		246629, 206583
Reported by:	kib
MFC after:	ASAP
2020-06-21 22:09:30 +00:00
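The three sizes discussed in e387af1fa8 are easy to confuse when the table is reached through a pointer. A small sketch of the correct form follows; `struct vif_sketch`, `MAXVIFS_SKETCH`, and both helpers are hypothetical stand-ins, not the ip_mroute code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	MAXVIFS_SKETCH	32

/* Hypothetical table entry, larger than a pointer. */
struct vif_sketch {
	unsigned char	 v_flags;
	unsigned long	 v_pkt_in;
	unsigned long	 v_pkt_out;
	void		*v_ifp;
};

/* Size of the whole table: entries times the size of one struct. */
size_t
viftable_bytes_sketch(void)
{
	return (sizeof(struct vif_sketch) * MAXVIFS_SKETCH);
}

/*
 * Zero the whole table. sizeof(*table) is the size of one struct;
 * sizeof(table) would only be the size of the pointer, and
 * MAXVIFS_SKETCH * sizeof(table) would still cover too little.
 */
void
viftable_reset_sketch(struct vif_sketch *table)
{
	memset(table, 0, sizeof(*table) * MAXVIFS_SKETCH);
}
```

Writing the length as `sizeof(*table) * count` keeps it tied to the element type even if the table later changes from an array to an allocation.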
Matt Macy
9aeca21324 iflib: fix cloneattach fail and generalize pseudo device handling
- a cloneattach failure will not currently be handled correctly,
  jump to the right target

- pseudo devices are all treated as if they're ethernet devices -
  this often doesn't make sense

MFC after:	1 week
Sponsored by:	Netgate, Inc.
Differential Revision:	https://reviews.freebsd.org/D25083
2020-06-21 22:02:49 +00:00
Pawel Biernacki
9daf71541c net.link.generic.ifdata.<ifindex>.linkspecific: rework handler
This OID was added in r17352 but the write path of IFDATA_LINKSPECIFIC
seems unused as there are no in-base writers, and as far as I can tell
we had issues with this code before, see PR 219472.  Drop the write path
to make the handler read-only as described in comments and man-pages.
It can be marked as MPSAFE now.

Reviewed by:	bdragon, kib, melifaro, wollman
Approved by:	kib (mentor)
Sponsored by:	Mysterious Code Ltd.
Differential Revision:	https://reviews.freebsd.org/D25348
2020-06-21 18:40:17 +00:00
Hans Petter Selasky
7747001b12 Improve wording to be more precise and clear.
No functional change intended.

s/Master Boot/Main Boot/ (also called MBR)

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-21 13:34:08 +00:00
Edward Tomasz Napierala
5ac2674278 Adapt linuxulator syscalls.master files to the new layout.
No functional changes.

Reviewed by:	brooks
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25381
2020-06-21 10:09:34 +00:00
Michael Tuexen
171edd2110 Fix the build for an INET6 only configuration.
The fix from the last commit is actually needed twice...

MFC after:		1 week
2020-06-21 09:56:09 +00:00
Thomas Munro
f270658873 vfs: track sequential reads and writes separately
For software like PostgreSQL and SQLite that sometimes reads sequentially
while also writing sequentially some distance behind with interleaved
syscalls on the same fd, performance is better on UFS if we do
sequential access heuristics separately for reads and writes.

Patch originally by Andrew Gierth in 2008, updated and proposed by me with
his permission.

Reviewed by:	mjg, kib, tmunro
Approved by:	mjg (mentor)
Obtained from:	Andrew Gierth <andrew@tao11.riddles.org.uk>
Differential Revision:	https://reviews.freebsd.org/D25024
2020-06-21 08:51:24 +00:00
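The idea in f270658873 (one sequential-access counter per direction instead of one shared counter per descriptor) can be modeled in a few lines. This is a deliberately simplified sketch, not the VFS heuristic: the struct, the cap of 127, and `seq_update_sketch` are all assumptions for illustration.

```c
#include <assert.h>

enum io_dir_sketch { IO_READ_SKETCH = 0, IO_WRITE_SKETCH = 1 };

#define	SEQ_MAX_SKETCH	127	/* hypothetical cap on the counter */

/* Hypothetical per-descriptor state, tracked separately per direction. */
struct seq_state_sketch {
	long long	nextoff[2];	/* offset expected for the next I/O */
	int		seqcount[2];	/* consecutive sequential I/Os seen */
};

/* Record one I/O and return the updated sequential count for its direction. */
int
seq_update_sketch(struct seq_state_sketch *s, enum io_dir_sketch dir,
    long long off, long long len)
{
	if (off == s->nextoff[dir]) {
		if (s->seqcount[dir] < SEQ_MAX_SKETCH)
			s->seqcount[dir]++;
	} else {
		s->seqcount[dir] = 0;	/* pattern broken: start over */
	}
	s->nextoff[dir] = off + len;
	return (s->seqcount[dir]);
}
```

With a single shared counter, each write at a trailing offset would reset the count built up by the reads; with per-direction state the interleaved streams are each recognized as sequential.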
Jeff Roberson
03270b59ee Use zone nomenclature that is consistent with UMA.
2020-06-21 04:59:02 +00:00
Brandon Bergren
40b664f64b [PowerPC] More relocation fixes
It turns out relocating the symbol table itself can cause issues, like fbt
crashing because it applies the offsets to the kernel twice.

This had been previously brought up in rS333447 when the stoffs hack was
added, but I had been unaware of this and reimplemented symtab relocation.

Instead of relocating the symbol table, keep track of the relocation base
in ddb, so the ddb symbols behave like the kernel linker-provided symbols.

This is intended to be NFC on platforms other than PowerPC, which do not
use fully relocatable kernels. (The relbase will always be 0)

 * Remove the rest of the stoffs hack.
 * Remove my half-baked displace_symbol_table() function.
 * Extend ddb initialization to cope with having a relocation offset on the
   kernel symbol table.
 * Fix my kernel-as-initrd hack to work with booke64 by using a temporary
   mapping to access the data.
 * Fix another instance of __powerpc__ that is actually RELOCATABLE_KERNEL.
 * Change the behavior of X_db_symbol_values to apply the relocation base
   when updating valp, to match link_elf_symbol_values() behavior.

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D25223
2020-06-21 03:39:26 +00:00
Rick Macklem
b94b9a80b2 Fix up a comment added by r362455.
2020-06-21 02:49:56 +00:00
Rick Macklem
4302e8b671 Modify the way the client side krpc does soreceive() for TCP.
Without this patch, clnt_vc_soupcall() first does a soreceive() for
4 bytes (the Sun RPC over TCP record mark) and then soreceive(s) for
the RPC message.
This first soreceive() almost always results in an mbuf allocation,
since having the 4byte record mark in a separate mbuf in the socket
rcv queue is unlikely.
This is somewhat inefficient and rather odd. It also will not work
for the ktls rx, since the latter returns a TLS record for each
soreceive().

This patch replaces the above with code similar to what the server side
of the krpc does for TCP, where it does a soreceive() for as much data
as possible and then parses RPC messages out of the received data.
A new field of the TCP socket structure called ct_raw is the list of
received mbufs that the RPC message(s) are parsed from.
I think this results in cleaner code and is needed for support of
nfs-over-tls.
It also fixes the code for the case where a server sends an RPC message
in multiple RPC message fragments. Although this is allowed by RFC5531,
no extant NFS server does this. However, it is probably good to fix this
in case some future NFS server does do this.
2020-06-21 00:06:04 +00:00
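The approach described in 4302e8b671 (receive as much as possible, then parse RPC messages out of the buffered data) rests on the record marking from RFC 5531: a 4-byte big-endian header whose top bit flags the last fragment and whose low 31 bits give the fragment length. A sketch over a flat byte buffer follows; the real code works on mbuf chains, and the `rm_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rm_frag_sketch {
	uint32_t	len;	/* fragment payload length */
	int		last;	/* nonzero if this is the last fragment */
};

/* Decode one record mark; returns -1 if fewer than 4 bytes are buffered. */
int
rm_decode_sketch(const unsigned char *buf, size_t avail,
    struct rm_frag_sketch *out)
{
	uint32_t mark;

	if (avail < 4)
		return (-1);
	mark = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	    ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
	out->last = (mark & 0x80000000u) != 0;
	out->len = mark & 0x7fffffffu;
	return (0);
}

/*
 * Walk the fragments of the first record in buf. Returns the number of
 * bytes it spans (marks plus payload) and stores the payload total in
 * *payload, or returns 0 if the record is not yet completely buffered.
 */
size_t
rm_record_span_sketch(const unsigned char *buf, size_t avail, size_t *payload)
{
	struct rm_frag_sketch frag;
	size_t off = 0, total = 0;

	for (;;) {
		if (rm_decode_sketch(buf + off, avail - off, &frag) != 0)
			return (0);
		off += 4;
		if (frag.len > avail - off)
			return (0);
		off += frag.len;
		total += frag.len;
		if (frag.last) {
			*payload = total;
			return (off);
		}
	}
}
```

Because the walk continues until the last-fragment bit is seen, a message split across several fragments is handled the same way as a single-fragment one, which is the multi-fragment case the commit message mentions.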
Michael Tuexen
5087b6e732 Set a variable also in the case of an INET6 only kernel
MFC after:		1 week
2020-06-20 23:48:57 +00:00
Xin LI
00e8fb8001 Bump __FreeBSD_version after making liblzma to use libmd implementation
of SHA256.

PR:		200142
2020-06-20 21:32:14 +00:00
Michael Tuexen
ed82c2edd6 Use a struct sockaddr_in or struct sockaddr_in6 as the option value
for the IPPROTO_SCTP level socket options SCTP_BINDX_ADD_ADDR and
SCTP_BINDX_REM_ADDR. These socket options are intended for internal
use only to implement sctp_bindx().
This is one user of struct sctp_getaddresses less.
struct sctp_getaddresses is strange and will be changed shortly.
2020-06-20 21:06:02 +00:00
Doug Moore
bc1bed77a8 In concluding RB_REMOVE_COLOR, in the case when the sibling of the
root of the too-short tree is black and at least one of the children
of that sibling is red, either one or two rotations finish the
rebalancing. In the case when both of the children are red, the
current implementation uses two rotations where only one is
necessary. This change removes that extra rotation, and in that case
also removes a needless black-to-red-to-black recoloring.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D25335
2020-06-20 20:25:39 +00:00
Jeff Roberson
c8b0a88b8d Clarify some language. Favor primary where both master and primary were
used in conjunction with secondary.
2020-06-20 20:21:04 +00:00
Michael Tuexen
7621bd5ead Cleanup the adding and deleting of addresses via sctp_bindx().
There is no need to use the association identifier, so remove it.
While there, cleanup the code a bit.

MFC after:		1 week
2020-06-20 20:20:16 +00:00
Edward Tomasz Napierala
bafd96b8dd Regen after r362440.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-20 18:31:02 +00:00
Edward Tomasz Napierala
52c81be11a Add linux_madvise(2) instead of having Linux apps call the native
FreeBSD madvise(2) directly.  While some of the flag values match,
most don't.

PR:		kern/230160
Reported by:	markj
Reviewed by:	markj
Discussed with:	brooks, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25272
2020-06-20 18:29:22 +00:00
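Why 52c81be11a needs a translation layer can be seen from the advice constants themselves. The sketch below hard-codes a few values as I understand them from the Linux and FreeBSD headers; treat both the values and the policy of rejecting everything else as assumptions, and note that the real linux_madvise() may handle more cases or handle them differently. All `LX_*`, `BSD_*`, and function names are hypothetical.

```c
#include <assert.h>

/* Assumed Linux advice values. */
#define	LX_MADV_NORMAL		0
#define	LX_MADV_RANDOM		1
#define	LX_MADV_SEQUENTIAL	2
#define	LX_MADV_WILLNEED	3
#define	LX_MADV_DONTNEED	4
#define	LX_MADV_FREE		8

/* Assumed FreeBSD advice values. */
#define	BSD_MADV_NORMAL		0
#define	BSD_MADV_RANDOM		1
#define	BSD_MADV_SEQUENTIAL	2
#define	BSD_MADV_WILLNEED	3
#define	BSD_MADV_DONTNEED	4
#define	BSD_MADV_FREE		5

/* Map a Linux advice value to a native one; -1 means "not translated". */
int
linux_to_bsd_madvise_sketch(int behav)
{
	switch (behav) {
	case LX_MADV_NORMAL:
		return (BSD_MADV_NORMAL);
	case LX_MADV_RANDOM:
		return (BSD_MADV_RANDOM);
	case LX_MADV_SEQUENTIAL:
		return (BSD_MADV_SEQUENTIAL);
	case LX_MADV_WILLNEED:
		return (BSD_MADV_WILLNEED);
	case LX_MADV_DONTNEED:
		return (BSD_MADV_DONTNEED);
	case LX_MADV_FREE:
		return (BSD_MADV_FREE);
	default:
		return (-1);
	}
}
```

Passing the Linux value straight through would work for the first five constants but would misinterpret the rest, which is the mismatch the commit message describes.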
Conrad Meyer
b75a772875 oce(4): Account and trace mbufs before handing to hw
Once tx mbufs have been handed to hardware, nothing serializes the tx
path against completion and potential use-after-free of the outbound
mbuf.  Perform accounting and BPF tap before queueing to hardware to
avoid this race.

Submitted by:	Steve Wirtz <steve_wirtz AT dell.com>
Reviewed by:	markj, rstone
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25364
2020-06-20 17:22:46 +00:00
Hans Petter Selasky
75dc9c41ab Improve debug message to be more precise and clear.
For the record, this is the last use of the words master and slave
in FreeBSD's USB stack, drivers and subsystems.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-20 14:16:24 +00:00
Toomas Soome
3830659e99 loader: create single zfs nextboot implementation
We should have nextboot feature implemented in libsa zfs code.
To get there, I have created zfs_nextboot() implementation based on
two sources, our current simple textual string based approach with added
structured boot label PAD structure from OpenZFS.

Secondly, all nvlist details are moved to separate source file and
restructured a bit. This is done to provide base support to add nvlist
add/update feature in followup updates.

And finally, the zfsboot/gptzfsboot disk access functions are swapped to use
libi386 and libsa.

Sponsored by:	Netflix, Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D25324
2020-06-20 06:23:31 +00:00
Kyle Evans
e245e555fa raspberry pi 4: cpufreq support
The submitter notes that the bcm2835_cpufreq driver really just needs the
rpi4 compat string added to it; powerd subsequently works and the dev.cpu.0
sysctl values look sane and can be successfully manipulated.

Submitted by:	James Mintram <me@jamesrm.com>
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D25349
2020-06-20 04:07:58 +00:00
Warner Losh
f66ca1b1eb Use the more descriptive src_ccb and dst_ccb for the two ccbs being merged.
MFC after: 1 week
2020-06-20 04:07:23 +00:00
Edward Tomasz Napierala
4afe4fae1b Add warnings for unsupported Linux clockids.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25322
2020-06-19 19:33:06 +00:00
Michal Meloun
b72e2878ab Improve if_dwc:
 - refactor the packet receive path. Make sure that we don't leak mbufs
   and/or that we don't create holes in the RX descriptor ring
 - slightly simplify the handling of TX descriptors

MFC after:	4 weeks
2020-06-19 19:26:55 +00:00
Brandon Bergren
fb0543afa8 [PowerPC] Add virtio to GENERIC
Due to kldxref not being able to generate hints for nonnative platforms,
any cross built VM images do not have /boot/kernel/linker.hints.

This prevents the virtio modules from being loaded, as the fallback code
will always fail the version check when the hints are missing.

Since we want to be able to generate VM images for 32 bit powerpc, add the
virtio modules to GENERIC like we do on powerpc64.

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D25271
2020-06-19 18:43:13 +00:00
Brandon Bergren
8415f755f1 [PowerPC] Fix booke64 qemu infinite loop in L2 cache enable
Since qemu does not implement the L2 cache, we get stuck forever waiting
for a bit to be set when trying to invalidate it.

To prevent that, we should bail out if the L2 cache is missing.
One easy way to check this is L2CFG0 == 0 (since L2CSIZE always has at
least one bit set in a valid implementation)

(tested on qemu, rb800, and x5000)

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D25225
2020-06-19 18:40:39 +00:00
Brandon Bergren
37f530582d [PowerPC] De-giant powermac_nvram, update documentation
* Remove the giant lock requirement from powermac_nvram.
* Update manual pages to reflect current state.

Reviewed by:	bcr (manpages), jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D24812
2020-06-19 18:36:10 +00:00
Michal Meloun
188aee740f Finish renaming in if_dwc.
By using DWC TRM terminology, normal descriptor format should be named
extended and alternate descriptor format should be named normal.

Should not be a functional change.

MFC after:	4 weeks
2020-06-19 18:34:27 +00:00
Michal Meloun
8d43a8685c Use naming nomenclature used in DesignWare TRM.
This driver was written using the Altera (now Intel) documentation for the
Arria FPGA. Unfortunately, that manual uses very different (and in some cases
opposite) naming for registers and descriptor fields, which makes future
expansion extremely hard.

Should not be a functional change.

MFC after:	4 weeks
2020-06-19 18:04:41 +00:00
Andrew Turner
41b84341f5 Use the correct address when creating pci resources
When the PCI and CPU physical addresses are identical it doesn't matter
which is used to create the resources, however on some systems, e.g.
qemu armv7 virt, they are different. This leads to a panic as we try to
map the wrong physical address into the kernel address space.

Reported by:	Jenkins via trasz
Sponsored by:	Innovate UK
2020-06-19 18:00:20 +00:00
Allan Jude
9598fc63e6 ZFS: Allow setting checksum=skein on boot pools
PR:		245889
Reported by:	delphij
Sponsored by:	Klara Inc.
2020-06-19 17:59:55 +00:00
Michal Meloun
7f8437c353 Adapt ARMADA8k PCIe driver to newly imported 5.7 DT.
- temporarily disable handling with phy, we don't have driver for it yet
- always clear cause for administrative interrupt.
While I'm here, fix style(9) (mainly whitespace).

MFC after:	4 weeks
2020-06-19 17:33:54 +00:00
Michal Meloun
224c5a9ff3 Revert r362389, it was committed with <patch>.diff instead of <patch>.txt as
commit log.
2020-06-19 17:32:50 +00:00
Michal Meloun
7a5750fd2d diff --git a/sys/dev/pci/pci_dw_mv.c b/sys/dev/pci/pci_dw_mv.c
index 06a29fefbdd..571fc00f6c1 100644
--- a/sys/dev/pci/pci_dw_mv.c
+++ b/sys/dev/pci/pci_dw_mv.c
@@ -64,15 +64,11 @@ __FBSDID("$FreeBSD$");

 #define MV_GLOBAL_CONTROL_REG		0x8000
 #define PCIE_APP_LTSSM_EN		(1 << 2)
-//#define PCIE_DEVICE_TYPE_SHIFT		4
-//#define PCIE_DEVICE_TYPE_MASK		0xF
-//#define PCIE_DEVICE_TYPE_RC		0x4/

 #define MV_GLOBAL_STATUS_REG		0x8008
 #define	 MV_STATUS_RDLH_LINK_UP			(1 << 1)
 #define  MV_STATUS_PHY_LINK_UP			(1 << 9)

-
 #define MV_INT_CAUSE1			0x801C
 #define MV_INT_MASK1			0x8020
 #define  INT_A_ASSERT_MASK			(1 <<  9)
@@ -90,11 +86,7 @@ __FBSDID("$FreeBSD$");
 #define MV_ARUSER_REG			0x805C
 #define MV_AWUSER_REG			0x8060

-
-
 #define	MV_MAX_LANES	8
-
-
 struct pci_mv_softc {
 	struct pci_dw_softc	dw_sc;
 	device_t		dev;
@@ -112,7 +104,6 @@ static struct ofw_compat_data compat_data[] = {
 	{NULL,		 	  0},
 };

-
 static int
 pci_mv_phy_init(struct pci_mv_softc *sc)
 {
@@ -121,18 +112,23 @@ pci_mv_phy_init(struct pci_mv_softc *sc)
 	for (i = 0; i < MV_MAX_LANES; i++) {
 		rv =  phy_get_by_ofw_idx(sc->dev, sc->node, i, &(sc->phy[i]));
 		if (rv != 0 && rv != ENOENT) {
-	  		device_printf(sc->dev, "Cannot get phy[%d]\n", i);
-	  		goto fail;
-	  	}
-	  	if (sc->phy[i] == NULL)
-	  		continue;
-	  	rv = phy_enable(sc->phy[i]);
-	  	if (rv != 0) {
-	  		device_printf(sc->dev, "Cannot enable phy[%d]\n", i);
-	  		goto fail;
-	  	}
-	  }
-	  return (0);
+			device_printf(sc->dev, "Cannot get phy[%d]\n", i);
+/* XXX revert when phy driver will be implemented */
+#if 0
+		goto fail;
+#else
+		continue;
+#endif
+		}
+		if (sc->phy[i] == NULL)
+			continue;
+		rv = phy_enable(sc->phy[i]);
+		if (rv != 0) {
+			device_printf(sc->dev, "Cannot enable phy[%d]\n", i);
+			goto fail;
+		}
+	}
+	return (0);

 fail:
 	for (i = 0; i < MV_MAX_LANES; i++) {
@@ -173,13 +169,14 @@ pci_mv_init(struct pci_mv_softc *sc)
 	/* Enable local interrupts */
 	pci_dw_dbi_wr4(sc->dev, DW_MSI_INTR0_MASK, 0xFFFFFFFF);
 	pci_dw_dbi_wr4(sc->dev, MV_INT_MASK1, 0xFFFFFFFF);
-	pci_dw_dbi_wr4(sc->dev, MV_INT_MASK2, 0xFFFFFFFF);
+	pci_dw_dbi_wr4(sc->dev, MV_INT_MASK2, 0xFFFFFFFD);
 	pci_dw_dbi_wr4(sc->dev, MV_INT_CAUSE1, 0xFFFFFFFF);
 	pci_dw_dbi_wr4(sc->dev, MV_INT_CAUSE2, 0xFFFFFFFF);

 	/* Errors have own interrupt, not yet populated in DTt */
 	pci_dw_dbi_wr4(sc->dev, MV_ERR_INT_MASK, 0);
 }
+
 static int pci_mv_intr(void *arg)
 {
 	struct pci_mv_softc *sc = arg;
@@ -188,8 +185,6 @@ static int pci_mv_intr(void *arg)
 	/* Ack all interrups */
 	cause1 = pci_dw_dbi_rd4(sc->dev, MV_INT_CAUSE1);
 	cause2 = pci_dw_dbi_rd4(sc->dev, MV_INT_CAUSE2);
-	if (cause1 == 0 || cause2 == 0)
-		return(FILTER_STRAY);

 	pci_dw_dbi_wr4(sc->dev, MV_INT_CAUSE1, cause1);
 	pci_dw_dbi_wr4(sc->dev, MV_INT_CAUSE2, cause2);
2020-06-19 17:25:54 +00:00
Michal Meloun
a67687fcd8 Use native-sized accesses when accessing memory from kdb.
Not all MMIO mapped devices support byte access.

MFC after:	4 weeks
2020-06-19 16:26:42 +00:00
Michal Meloun
1f446a117e Improve DesignWare PCIe driver:
- only normal memory window is mandatory, prefetchable memory and
  I/O windows should be optional
- full PCIe configuration space is supported
- remove duplicated check from function for accessing configuration space.
  It is already contained in pci_dw_check_dev()

MFC after:	2 weeks
2020-06-19 16:15:06 +00:00
Michal Meloun
d5d4dd38b4 Add specific stub for ARMADA 8k SoC to Marvell RTC driver.
The AXI bridge is different between ARMADA 38x and 8K, and both platforms
need specific setup to mitigate HW issues with accessing RTC registers.

MFC after:	2 weeks
2020-06-19 15:32:55 +00:00
Michal Meloun
5e2e692c94 Add specialized gpio driver for ARMADA 8k SoC.
Older Marvell gpio blocks are too different for reusing/enhancing
existing drivers.

MFC after:	2 weeks
2020-06-19 15:21:33 +00:00
Michal Meloun
daa58c3472 Add DTB files for ARMADA 8040 based boards.
MFC after:	2 weeks
2020-06-19 14:28:56 +00:00
Michael Tuexen
7a9dbc33f9 Remove last argument of sctp_addr_mgmt_ep_sa(), since it is not used.
MFC after:		1 week
2020-06-19 12:35:29 +00:00
Mark Johnston
cdd02f43b9 Revert r362360.
This commit was simply wrong since two different objects are locked.

Reported by:	lwhsu, pho
Pointy hat:	markj
2020-06-19 11:04:49 +00:00
Mark Johnston
f034074034 Restore a check unintentionally dropped in r362361.
MFC with:	r362361
2020-06-19 04:18:20 +00:00
Mark Johnston
0f1e6ec591 Add a helper function for validating VA ranges.
Functions which take untrusted user ranges must validate against the
bounds of the map, and also check for wraparound.  Instead of having the
same logic duplicated in a number of places, add a function to check.
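
A minimal sketch of such a check, assuming a simplified map that only
records its bounds. The names va_t, struct va_map, va_range_valid() and
va_len_valid() are hypothetical and are not the helper added by this commit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t va_t;		/* hypothetical virtual address type */

struct va_map {			/* hypothetical: just the map bounds */
	va_t	min;
	va_t	max;
};

/*
 * Accept [start, end) only if it lies within the map bounds.
 * end < start also catches a range whose end wrapped around.
 */
static bool
va_range_valid(const struct va_map *map, va_t start, va_t end)
{
	assert(map != NULL);
	return (end >= start && start >= map->min && end <= map->max);
}

/*
 * Variant for callers holding (start, len) from userspace: the unsigned
 * addition may wrap, which the end >= start test above then rejects.
 */
static bool
va_len_valid(const struct va_map *map, va_t start, va_t len)
{
	return (va_range_valid(map, start, start + len));
}
```

Keeping the three comparisons in one place means every caller rejects the
same malformed ranges, which is the point of the helper.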

Reviewed by:	dougm, kib
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25328
2020-06-19 03:32:04 +00:00
Mark Johnston
61b006887e Fix a double object unlock in vm_object_backing_collapse_wait().
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25327
2020-06-19 03:31:46 +00:00
Kirk McKusick
93440bbefd The binary representation of the superblock (the fs structure) is written
out verbatim to the disk: see ffs_sbput() in sys/ufs/ffs/ffs_subr.c.
It contains a pointer to the fs_summary_info structure. This pointer
value inadvertently causes garbage to be stored. It is garbage because
the pointer to the fs_summary_info structure is an address in the
then-current stack or heap. Although a mere pointer does not reveal anything
useful (like a part of a private key) to an attacker, garbage output
deteriorates reproducibility.

This commit zeros out the pointer to the fs_summary_info structure
before writing out the superblock.
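
The idea can be sketched as follows, assuming a toy superblock layout.
struct superblock, struct summary_info and sb_put() are hypothetical
stand-ins, not the real struct fs or ffs_sbput():

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct summary_info {		/* hypothetical in-memory-only data */
	int	dummy;
};

struct superblock {		/* hypothetical on-disk layout */
	unsigned long		 magic;
	struct summary_info	*si;	/* only meaningful in memory */
};

/*
 * Serialize a copy of the superblock with the in-memory pointer cleared,
 * so no stack or heap address ends up in the output buffer. The caller's
 * structure is left untouched.
 */
static void
sb_put(const struct superblock *sb, unsigned char *buf)
{
	struct superblock copy;

	assert(sb != NULL && buf != NULL);
	memcpy(&copy, sb, sizeof(copy));
	copy.si = NULL;
	memcpy(buf, &copy, sizeof(copy));
}
```

Two runs that differ only in where the summary information was allocated
then produce identical bytes, which is what reproducible images need.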

Reviewed by:  kib
Tested by:    Peter Holm
PR:           246983
Sponsored by: Netflix
2020-06-19 01:04:25 +00:00
Kirk McKusick
34816cb9ae Move the pointers stored in the superblock into a separate
fs_summary_info structure. This change was originally done
by the CheriBSD project as they need larger pointers that
do not fit in the existing superblock.

This cleanup of the superblock eases the task of the commit
that immediately follows this one.

Suggested by: brooks
Reviewed by:  kib
PR:           246983
Sponsored by: Netflix
2020-06-19 01:02:53 +00:00
Mike Karels
349eddbd07 Add support for bcm54213PE in brgphy.
This chip is used in the Raspberry Pi 4, and is supported by the if_genet
driver. Currently we use the ukphy mii driver, this patch switches over
to the brgphy mii driver instead. To support the rgmii-rxid phy mode,
which is now the default in the Linux dtb, we add support for clock
skewing.

These changes are taken from OpenBSD and NetBSD, except for the bailout
in brgphy_bcm54xx_clock_delay() in rgmii mode, which was found necessary
after testing.

Submitted by:	Robert Crowston, crowston at protomail.com
Differential Revision:	https://reviews.freebsd.org/D25251
2020-06-18 23:57:10 +00:00
Pawel Biernacki
049264c5cc hw.bus.info: rework handler
hw.bus.info was added in r68522 as a node, but there was never anything
connected "behind" it.  Its only purpose is to return a struct u_businfo.
The only in-base consumers are devinfo(3)/devinfo(8).
Rewrite the handler as SYSCTL_PROC and mark it as MPSAFE and read-only
as there never was a writable path.

Reviewed by:	kib
Approved by:	kib (mentor)
Sponsored by:	Mysterious Code Ltd.
Differential Revision:	https://reviews.freebsd.org/D25321
2020-06-18 21:42:54 +00:00
Konstantin Belousov
8a15ac8378 Fix execution of linux binary from multithreaded non-Linux process.
If a multithreaded non-Linux process execs a Linux binary, then the
non-Linux threads other than the one that is execing are cleared by
single-threading at the boundary and then terminated in post_execve().
Since at that time the process has already switched to the Linux ABI, the
linuxolator is involved in clearing the threads at the boundary, but it
cannot find the emul data.

Handle it by pre-creating emuldata for all threads in the execing process.

Also remove the code in the linux_proc_exec() handler that cleared emul data
for other threads when execing from a multithreaded Linux process. It is
excessive.

PR:	247020
Reported by:	Martin FIlla <freebsd@sysctl.cz>
Reported by:	Henrique L. Amorim, Independent Security Researcher
Reported by:	Rodrigo Rubira Branco (BSDaemon), Amazon Web Services
Reviewed by:	markj
Tested by:	trasz
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25293
2020-06-18 20:49:56 +00:00
Mark Johnston
95033af923 Add the SCTP_SUPPORT kernel option.
This is in preparation for enabling a loadable SCTP stack.  Analogous to
IPSEC/IPSEC_SUPPORT, the SCTP_SUPPORT kernel option must be configured
in order to support a loadable SCTP implementation.

Discussed with:	tuexen
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-18 19:32:34 +00:00
Alexander Motin
ead7e10308 Make polled request timeout less invasive.
Instead of panicking after one second of polling, let the normal timeout
handler activate, reset the controller, and abort the outstanding
requests.  If none of that happens within 10 seconds, then something
in the driver is likely badly stuck and a panic is the only way out.

In particular this fixed device hot unplug during execution of those
polled commands, allowing clean device detach instead of panic.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-06-18 19:16:03 +00:00
Andrew Turner
c794cdc0a2 Stop assuming we can print rman_res_t with %lx
This is not the case on armv6 and armv7, where we also build this driver.
Fix by casting through uintmax_t and using %jx.
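
A minimal sketch of the pattern, assuming a stand-in type. res_t and
format_range() are hypothetical; the real type is rman_res_t, whose width
differs between platforms:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t res_t;		/* hypothetical stand-in for rman_res_t */

/*
 * Format a resource range portably: cast to uintmax_t and print with the
 * 'j' length modifier, so the format matches the argument regardless of
 * how wide res_t or long happen to be on the target.
 */
static int
format_range(char *buf, size_t len, res_t start, res_t end)
{
	assert(buf != NULL);
	return (snprintf(buf, len, "%#jx-%#jx",
	    (uintmax_t)start, (uintmax_t)end));
}
```

The cast costs nothing at run time and removes the assumption that the
type is the same size as long.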

Sponsored by:	Innovate UK
2020-06-18 06:21:00 +00:00
Andriy Gapon
4c7d1ab06d hdac_intr_handler: keep working until global interrupt status clears
It is plausible that the hardware interrupts a host only when GIS goes
from zero to one.  GIS is formed by OR-ing multiple hardware statuses,
so it's possible that a previously cleared status gets set again while
another status has not been cleared yet.  Thus, there will be no new
interrupt as GIS always stayed set.  If we don't re-examine GIS then we
can leave it set and never get another interrupt again.

Without this change I frequently saw a problem where snd_hda would stop
working.  Setting dev.hdac.1.polling=1 would bring it back to life and
afterwards I could set polling back to zero.  Sometimes the problem
started right after a boot, sometimes it happened after resuming from
S3, frequently it would occur when sound output and input are active
concurrently (such as during conferencing).  I looked at HDAC_INTSTS
while the sound was not working and I saw that both HDAC_INTSTS_GIS and
HDAC_INTSTS_CIS were set, but there were no interrupts.

I have collected some statistics over a period of several days about how
many loops (calls to hdac_one_intr) the new code did for a single
interrupt:
+--------+--------------+
|Loops   |Times Happened|
+--------+--------------+
|0       |301           |
|1       |12857746      |
|2       |280           |
|3       |2             |
|4+      |0             |
+--------+--------------+
I believe that previously the sound would get stuck each time we had to loop
more than once.
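
The looping behaviour can be modelled with a small simulation. This is a
sketch under stated assumptions, not the driver code: struct fake_hdac,
read_intsts(), one_intr() and intr_handler() are hypothetical, and the
"rearm" field stands in for a status that is raised again while the
handler is still running:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INTSTS_GIS	0x80000000u	/* hypothetical global status bit */

struct fake_hdac {
	uint32_t status;	/* pending per-source status bits */
	uint32_t rearm;		/* bits that get set again during handling */
};

/* GIS is the OR of all per-source statuses. */
static uint32_t
read_intsts(const struct fake_hdac *d)
{
	return (d->status != 0 ? (d->status | INTSTS_GIS) : 0);
}

/* Clear the sources seen in the snapshot; a source may re-assert meanwhile. */
static void
one_intr(struct fake_hdac *d, uint32_t intsts)
{
	d->status &= ~(intsts & ~INTSTS_GIS);
	d->status |= d->rearm;
	d->rearm = 0;
}

/* Keep working until GIS reads clear; return how many passes were needed. */
static int
intr_handler(struct fake_hdac *d)
{
	uint32_t intsts;
	int loops = 0;

	assert(d != NULL);
	while (((intsts = read_intsts(d)) & INTSTS_GIS) != 0) {
		one_intr(d, intsts);
		loops++;
	}
	return (loops);
}
```

A handler that made only one pass would return with the re-asserted bit
still pending and GIS still set, which is the stuck state described above.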

The tested hardware is:
hdac1: <AMD (0x15e3) HDA Controller> mem 0xfe680000-0xfe687fff at device 0.6 on pci4
hdacc1: <Realtek ALC269 HDA CODEC> at cad 0 on hdac1

No objections:	mav
MFC after:	5 weeks
Differential Revision: https://reviews.freebsd.org/D25128
2020-06-18 06:12:06 +00:00
Chuck Silvers
d9a8abf6c2 Move all of the functions in ffs_subr.c that are only used by the ufs kernel
module from that file into ffs_vfsops.c.  This fixes the build for kernel
configs that don't include FFS.

PR:		247256
Submitted by:	glebius
Reviewed by:	mckusick (earlier version)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25285
2020-06-17 23:39:52 +00:00
Bjoern A. Zeeb
ce19cceb8d When converting the static arrays to mallocarray() in r356621 I missed
one place where we now need to multiply the size of the struct with the
number of entries.  This led to problems when restarting user space
daemons, as the cleanup was never properly done, resulting in MRT_ADD_VIF
EADDRINUSE.
Properly zero all array elements to avoid this problem.
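
A minimal sketch of the difference, assuming a toy table. struct vif and
vif_table_reset() are hypothetical names, not the multicast routing code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct vif {			/* hypothetical table entry */
	int		in_use;
	unsigned long	addr;
};

/*
 * With a static array, sizeof(table) covered every entry. Once the table
 * is a pointer to allocated memory, the size has to be computed as
 * element size times element count.
 */
static void
vif_table_reset(struct vif *tbl, size_t n)
{
	assert(tbl != NULL);
	/* Buggy form: memset(tbl, 0, sizeof(*tbl)) clears only entry 0. */
	memset(tbl, 0, sizeof(*tbl) * n);
}
```

With the buggy form, entries 1..n-1 keep their old addresses, so a
restarted daemon finds them still marked as in use.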

PR:		246629, 206583
Reported by:	(many)
MFC after:	4 days
Sponsored by:	Rubicon Communications, LLC (d/b/a "Netgate")
2020-06-17 21:04:38 +00:00
Bjoern A. Zeeb
b7b3d237e7 The call into ifa_ifwithaddr() needs to be epoch protected; otherwise
we'll panic on an assertion.
While here, leave a comment that the ifp was never protected and stable
(as glebius pointed out) and this needs to be fixed properly.

Discovered while working on:	PR 246629
Reviewed by:	glebius
MFC after:	4 days
Sponsored by:	Rubicon Communications, LLC (d/b/a "Netgate")
2020-06-17 20:58:37 +00:00
Andrew Turner
9a7053ce96 Clean up the pci host generic driver
- Support Prefetchable Memory.
- Use the correct rman when allocating memory and ioports.
- Translate PCI addresses in bus_alloc_resource to allow physical
  addresses that are different than pci addresses.

Reviewed by:	Robert Crowston <crowston_protonmail.com>
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D25121
2020-06-17 19:56:17 +00:00
Andrew Turner
3a6413d81e Support pmap_extract_and_hold on arm64 stage 2 mappings
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D24469
2020-06-17 19:45:05 +00:00
Alexander Motin
550d5d64fe Fix admin qpair leak if detached during initial reset.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-06-17 17:51:40 +00:00
Alan Somers
eea79fde5a Remove vfs_statfs and vnode_mount macros from NFS
These macro definitions are no longer needed as the NFS OSX port is long
dead.  The vfs_statfs macro conflicts with the vfsops field of the same
name.

Submitted by:	shivank@
Reviewed by:	rmacklem
MFC after:	2 weeks
Sponsored by:	Google, Inc. (GSoC 2020)
Differential Revision:	https://reviews.freebsd.org/D25263
2020-06-17 16:20:19 +00:00
Ruslan Bukin
c9ea007c3b Complete the ACPI support for ARM Coresight:
o Parse the ACPI DSD (Device Specific Data) graph property and record
  device connections.
o Split-out FDT support to a separate file.
o Get the corresponding (FDT/ACPI) Coresight platform data in
  the device drivers.

Sponsored by:	DARPA, AFRL
2020-06-17 15:54:51 +00:00
Michael Tuexen
2d87bacde4 Allow the self reference to be NULL in case the timer was stopped.
Submitted by:		Timo Voelker
MFC after:		1 week
2020-06-17 15:27:45 +00:00
Tom Jones
d88fe3d964 Add header definition for RFC4340, Datagram Congestion Control Protocol
Add a header definition for DCCP as defined in RFC4340. This header definition
is required to perform validation when receiving and forwarding DCCP packets.
We do not currently support DCCP.

Reviewed by:	gallatin, bz
Approved by:	bz (co-mentor)
MFC after:	1 week
MFC with:	350749
Differential Revision:	https://reviews.freebsd.org/D21179
2020-06-17 13:27:13 +00:00
Andrew Turner
f3e9395d0c Add all the TCR_EL1 fields
These will be used when adding support for new Armv8 extensions.

Sponsored by:	Innovate UK
2020-06-17 11:56:10 +00:00
Hans Petter Selasky
11304ef50e Fix HW TLS offload regression issue after r359919, in mlx5en(4).
Changes in the mbuf layout regarding HW TLS, resulted in wrong detection
of starting mbuf. Use a boolean variable to handle this and pass m_adj()
the top mbuf, so that the packet header is adjusted correctly.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-17 11:14:54 +00:00
Hans Petter Selasky
a26df270c9 Allow multicast packets to be received in promiscuous mode, in mlx4en(4).
Make sure we disable the multicast filter in promiscuous mode as well as when
the all multicast flag is set.

MFC after:	1 week
Found by:	Tycho Nightingale <tychon@freebsd.org>
Sponsored by:	Mellanox Technologies
2020-06-17 11:12:10 +00:00
Vladimir Kondratyev
94811094f8 evdev: Add AT translated set1 scancodes for 'Eisu' & 'Kana' keys.
PR:		247292
Submitted by:	Yuichiro NAITO <naito.yuichiro@gmail.com>
MFC after:	1 week
2020-06-17 08:35:35 +00:00
Conrad Meyer
a116b5d3e4 vm: Drop vm_map_clip_{start,end} macro wrappers
No functional change.

Reviewed by:	dougm, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25282
2020-06-16 22:53:56 +00:00
Ryan Moeller
33b39b6615 Apply default security flavor in vfs_export
There may be some version of mountd out there that does not supply a default
security flavor when none is given for an export.

Set the default security flavor in vfs_export if none is given, and remove the
workaround for oexport compat.

Reported by:	npn
Reviewed by:	rmacklem
Approved by:	mav (mentor)
MFC after:	3 days
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25300
2020-06-16 21:30:30 +00:00
Randall Stewart
95ef69c63c So in doing final checks on OCA firmware with all the latest tweaks the dup-ack checking
packet drill script was failing with a number of unexpected acks. So it turns
out if you have the default recvwin set up to 1Meg (like OCA's do) and you
have no window scaling (like the dupack checking code) then we have another
case where we are always trying to update the rwnd and sending an
ack when we should not.

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D25298
2020-06-16 18:16:45 +00:00
Simon J. Gerraty
73845fdbd3 Make KENV_MVALLEN tunable
When doing secure boot, loader wants to export loader.ve.hashed
the value of which typically exceeds KENV_MVALLEN.

Replace use of KENV_MVALLEN with tunable kenv_mvallen.

Add getenv_string_buffer() for the case where a stack buffer cannot be
created and use uma_zone_t kenv_zone for suitably sized buffers.

Reviewed by:	stevek, kevans
Obtained from:	Abhishek Kulkarni <abkulkarni@juniper.net>
MFC after:	1 week
Sponsored by:	Juniper Networks
Differential Revision: https://reviews.freebsd.org/D25259
2020-06-16 17:02:56 +00:00
Randall Stewart
4d418f8da8 So it turns out rack has a shortcoming in dup-ack counting. It counts the dupacks but
then does not properly respond to them. This is because a few bits are missing.
BBR actually does properly respond (though it also sends a TLP which is interesting and
maybe something to fix).

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D25294
2020-06-16 12:26:23 +00:00
Rick Macklem
2ed5e42378 Expose UID_xxx and GID_xxx definitions to userspace.
This patch moves the UID_xxx and GID_xxx definitions out of the
#ifdef _KERNEL section, so that userspace programs like mountd
can use them.
There are a couple of userspace programs that do define UID_ROOT,
but they do not include sys/conf.h.  Since they are defined as
the same value, maybe they should be changed to include sys/conf.h.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D25281
2020-06-16 02:31:22 +00:00
Adrian Chadd
209be66e26 [rsu] Update wme ie API use.
Whoops, forgot to land this one too!
2020-06-16 01:11:40 +00:00
Adrian Chadd
bac852bbac [net80211] Add missing commit to previous-1 uapsd commit.
Whoops; somehow my big commit line didn't include this..  cue the tree breakage emails.
2020-06-16 00:28:45 +00:00
Adrian Chadd
8379e8db7a [net80211] Add initial U-APSD negotiation support.
U-APSD (unscheduled automatic power save delivery) is a power save method
that's a bit better than legacy PS-POLL - stations can mark frames with
an extra flag that tells the AP to leak out more frames after it sends
its own frames rather than needing to send a PS-POLL to get another frame
from the AP.

Now, this code just handles the negotiation bits; it doesn't actually
implement U-APSD.  That's up to drivers, and nothing in the tree yet
implements this.  I /may/ implement this for ath(4) if I eventually care
enough but right now I plan on just implementing it for firmware offload
based NICs that handle this in the NIC.

I'll commit the ifconfig bit after this and I may have some follow-up
commits as this gets used more by me in local testing.

This should be a glorious no-op for everyone else.  If things change
for anyone that isn't fixed by a complete recompile then please reach out
to me.
2020-06-16 00:27:32 +00:00
Edward Tomasz Napierala
3d8dd98381 Make Linux uname(2) return x86_64 to 32-bit apps. This helps Steam.
PR:		kern/240432
Analyzed by:	Alex S <iwtcex@gmail.com>
Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25248
2020-06-15 20:12:10 +00:00
Vincenzo Maffione
ef6fdb3312 if_vtnet: let vtnet_rx_vq_intr() and vtnet_rxq_tq_intr() share code
Since the two functions are similar, introduce a common function
(vtnet_rx_vq_process()) to share common code.
This also improves locking, by ensuring vrxs_rescheduled is accessed
under the RXQ lock, and taskqueue_enqueue() is not called under the
lock (therefore avoiding a spurious duplicate lock warning).

Reported by:	jrtc27
MFC after:	2 weeks
2020-06-15 19:46:34 +00:00
John Baldwin
ad54157b5e Simplify MACHINE_ARCH to be a single string.
Big endian and armv4 mean that we are now down to only two supported
variants.  A future change will use MACHINE_ARCH in assembly which
does not support C-style string concatenation and thus needs
MACHINE_ARCH defined as a single string.

Reviewed by:	imp
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25211
2020-06-15 18:57:43 +00:00
Ryan Moeller
cbb9ccf735 Avoid trying to toggle TSO twice
Remove TSO from the toggle mask when automatically disabled by TXCKSUM* in
various NIC drivers.

Reviewed by:	hselasky, np, gallatin, jpaetzel
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25120
2020-06-15 16:35:27 +00:00
Takanori Watanabe
ccb9fc3218 Update event masks constant to Bluetooth core spec V5.2
and add LE Events.

PR: 247257
Submitted by:	Marc Veldman
2020-06-15 14:58:40 +00:00
Jessica Clarke
576b099a5f vtnet: Fix regression introduced in r361944
For legacy devices that don't support MrgRxBuf (such as bhyve pre-r358180),
r361944 failed to update the receive handler to account for the additional
padding introduced by the unused num_buffers field that is now always present
in struct vtnet_rx_header. Thus, calculate the padding dynamically based on
vtnet_hdr_size.

PR:		247242
Reported by:	thj
Tested by:	thj
2020-06-14 22:39:34 +00:00
Vincenzo Maffione
0a182b4c63 iflib: netmap: enter/exit netmap mode after device stops
Avoid possible race conditions by calling nm_set_native_flags()
and nm_clear_native_flags() only after the device has been
stopped.

MFC after:	1 week
2020-06-14 21:07:12 +00:00
Vincenzo Maffione
16f224b5f8 netmap: vtnet: fix races in vtnet_netmap_reg()
The nm_register callback needs to call nm_set_native_flags()
or nm_clear_native_flags() once the device has been stopped.
However, in the current implementation this is not true,
as the device is stopped by vtnet_init_locked(). This causes
race conditions where the driver crashes as soon as it
dequeues netmap buffers assuming they are mbufs (or the other
way around).
To fix the issue, we extend vtnet_init_locked() with a second
argument that, if not zero, will set/clear the netmap flags.
This results in a huge simplification of the nm_register
callback itself.
Also, use netmap_reset() to check if a ring is going to be
re-initialized in netmap mode.

MFC after:	1 week
2020-06-14 20:47:31 +00:00
Brandon Bergren
a4ec123c56 [PowerPC] Fix scc z8530 driver
Parts of the z8530 driver were still using the SUN channel spacing.

This was invalid on PowerMac and QEMU, where the attachment was to escc,
not escc-legacy. This means the driver has apparently NEVER worked properly
on Macintosh hardware.

Add documentation for the channel spacing details, and change to using
driver-specific initialization instead of hardcoded spacing so either
spacing can be used.

Fixes boot hang in QEMU when using the serial console, and fixes use on
Xserve serial (and presumably PowerMacs that have a Stealth Serial port
or similar)

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D24661
2020-06-14 16:47:16 +00:00
Michael Tuexen
b231bff8b2 Allocate the mbuf for the signature in the COOKIE of the correct size.
While there, also do some cleanups.

MFC after:		1 week
2020-06-14 16:05:08 +00:00
Edward Tomasz Napierala
889cd28520 Make linux(4) warn about unsupported CMSG level/type.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25255
2020-06-14 14:38:40 +00:00
Doug Rabson
3900c11481 Add support for the timecreate attribute
This maps to the va_birthtime VFS attribute.
2020-06-14 11:41:57 +00:00
Michael Tuexen
4471043177 Cleanups, no functional change.
MFC after:		1 week
2020-06-14 09:50:00 +00:00
Toomas Soome
e7fd9688ea Move font related data structures to sys/font.c and update vtfontcvt
Prepare support to be able to handle font data in loader, consolidate
data structures to sys/font.h and update vtfontcvt.

The vtfontcvt update adds output of a set of glyphs in the form of C source;
the implementation can output compressed or uncompressed font bitmaps.

Reviewed by:	bcr
Differential Revision:	https://reviews.freebsd.org/D24189
2020-06-14 06:58:58 +00:00
Rick Macklem
9d6fc9963e Oops, r362158 committed a duplicate definition of MAXSECFLAVORS.
This patch gets rid of the duplicate.
2020-06-14 01:22:19 +00:00
Adrian Chadd
e9efad4f9e [net80211] Treat frames without an rx status as not a decap'ed A-MSDU.
Drivers for NICs which do A-MSDU decap in hardware / driver will need to
set the rx status, so if it's missing then treat it as not a decap'ed
A-MSDU.
2020-06-14 00:23:06 +00:00
Adrian Chadd
1209ded2e1 [net80211] Also convert the ddb path
Whoops - this belonged in my previous commit.
2020-06-14 00:21:48 +00:00
Rick Macklem
3fa08158f7 Version bump for r362158, since the arguments for vfs_checkexp() changed.
2020-06-14 00:12:29 +00:00
Rick Macklem
1f7104d720 Fix export_args ex_flags field so that it is 64bits, the same as mnt_flags.
Since mnt_flags was upgraded to 64bits there has been a quirk in
"struct export_args", since it holds a copy of mnt_flags
in ex_flags, which is an "int" (32bits).
This happens to currently work, since all the flag bits used in ex_flags are
defined in the low order 32bits. However, new export flags cannot be defined.
Also, ex_anon is a "struct xucred", which limits it to 16 additional groups.
This patch revises "struct export_args" to make ex_flags 64bits and replaces
ex_anon with ex_uid, ex_ngroups and ex_groups (which points to a
groups list, so it can be malloc'd up to NGROUPS in size).
This requires that the VFS_CHECKEXP() arguments change, so I also modified the
last "secflavors" argument to be an array pointer, so that the
secflavors could be copied in VFS_CHECKEXP() while the export entry is locked.
(Without this patch VFS_CHECKEXP() returns a pointer to the secflavors
array and then it is used after being unlocked, which is potentially
a problem if the exports entry is changed.
In practice this does not occur when mountd is run with "-S",
but I think it is worth fixing.)
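
The ex_flags quirk can be illustrated with a small sketch. Everything here
is hypothetical: the flag values, struct old_export_args, struct
new_export_args and the two helpers are illustrations, and the old field is
modelled as uint32_t (the real one is an int) to keep the narrowing
conversion well defined:

```c
#include <stdint.h>

/* Hypothetical flags: one in the low 32 bits, one above bit 31. */
#define FLAG_LOW	((uint64_t)0x80)
#define FLAG_HIGH	((uint64_t)1 << 32)

struct old_export_args {
	uint32_t ex_flags;	/* 32-bit copy of the 64-bit mount flags */
};

struct new_export_args {
	uint64_t ex_flags;	/* same width as the mount flags */
};

/* Copy the flags into the old layout, then test for a flag. */
static int
old_has_flag(uint64_t mnt_flags, uint64_t flag)
{
	struct old_export_args ea;

	ea.ex_flags = (uint32_t)mnt_flags;	/* high bits are dropped */
	return ((ea.ex_flags & flag) != 0);
}

/* Same test against the widened layout. */
static int
new_has_flag(uint64_t mnt_flags, uint64_t flag)
{
	struct new_export_args ea;

	ea.ex_flags = mnt_flags;
	return ((ea.ex_flags & flag) != 0);
}
```

Flags in the low 32 bits survive either way, which is why the old layout
happened to work; only a flag above bit 31 exposes the difference.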

This patch also deleted the vfs_oexport_conv() function, since
do_mount_update() does the conversion, as required by the old vfs_cmount()
calls.

Reviewed by:	kib, freqlabs
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D25088
2020-06-14 00:10:18 +00:00
Adrian Chadd
e81d909274 [net80211] Handle offloaded AMSDU in AMPDU reordering.
In the 11n world, most NICs did A-MPDU receive/transmit offloading but
not A-MSDU offloading.  So, the net80211 A-MPDU receive path would just
receive MPDUs, do the reordering bit, pass it up to the rest of
net80211 for crypto decap and then do A-MSDU decap before throwing ethernet
frames up to the rest of the system.

However 11ac and 11ax NICs are increasingly doing A-MSDU offload (and
newer 11ax stuff does socket offload, but hey I don't want to scare people
JUST yet) - so although A-MPDU reordering may be done in the OS, A-MSDUs
look like a normal MPDU.  This means that all the MSDUs are actually
faked into a set of MPDUs with matching 802.11 header - the sequence number,
QoS header and any encryption verification bits (like IV) are just copied.

This shows up as MASSIVE packet loss in net80211, cause after the first MPDU
we just toss the rest.

(And don't get me started about ethernet decap with A-MPDU host reordering;
we'll have to cross that bridge for later 11ac and 11ax bits too.)

Anyway, this work changes each A-MPDU reorder slot into an mbufq.
The mbufq is treated as a whole set of frames to pass up to the stack
and reordered/de-duped as a group.  The last frame in the reorder list
is checked to see if it's an A-MSDU final frame so any duplicates are
correctly tossed rather than double-received.  Other than that, the
rest of the logic is unchanged.

The previous commit did a small subset of this - if there wasn't any reordering
going on then it'd accept the A-MSDUs.  This is the rest of the needed work.

This is a no-op for 11n NICs doing A-MPDU reordering but needing software
A-MSDU decap - they aren't tagged as A-MSDU and so any subsequent
frames added to the reorder slot are tossed.

Tested:

* QCA9880 (ath10k/athp) - STA/AP mode;
* RT3593 (if_rsu) - 11n STA+DWDS mode (I'm committing through it rn);
* QCA9380 (if_ath) - STA/AP mode.
2020-06-13 23:35:22 +00:00
Adrian Chadd
ea3d5fd9df [net80211] separate out node allocation and node initialisation.
This is a new, optional (for now!) method that drivers can use to separate
node allocation and node initialisation.  Right now they're the same, and
drivers that need to do node allocation via firmware commands need to sleep
and thus they need to defer node allocation into an internal taskqueue.

Right now they're just separate but not deferred.  Later on if I get the time
we'll start deferring the node and key related operations but that requires
making a bunch of other stuff (notably things that generate frames!) also
async/deferred.

Tested:

* RT3593, STA/DWDS mode
* AR9380, STA/AP modes
* QCA9880 (athp) - STA/AP modes
2020-06-13 22:20:02 +00:00
Michael Tuexen
d60bdf8569 Remove usage of empty macro.
MFC after:		1 week
2020-06-13 21:23:26 +00:00
Michael Tuexen
64c8fc5de8 Simplify a condition, no functional change.
MFC after:		1 week
2020-06-13 18:38:59 +00:00
Conrad Meyer
8bc0d2b855 Fix !DEBUGNET build after r362138
X-MFC-With:	r362138
2020-06-13 03:16:09 +00:00
Conrad Meyer
508a6e84e7 Flip kern.tty_info_kstacks on by default
It's a useful debug aid for anyone using Ctrl-T today, and doesn't seem to be
widely known.  So, enable it out of the box to help people find it.

It's a tunable and sysctl, so if you don't like it, it's easy to disable
locally.

If people really hate it, we can always flip it back.

Reported by:	Daniel O'Connor
2020-06-13 03:04:40 +00:00
Doug Moore
9f1041dc2e Linuxkpi uses the rb-tree structures without using their interfaces,
making them break when the representation changes. Revert changes that
eliminated the color field from rb-trees, leaving everything as it was
before.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D25250
2020-06-13 01:54:09 +00:00
Conrad Meyer
479ab044c1 net80211: Add framework for debugnet(4) support
Allow net80211 drivers to register a small vtable of debugnet-related
methods.

This is not a functional change.  Driver support is needed, similar to
debugnet(4) for wired NICs.

Reviewed by:	adrian, markj (earlier version both)
Differential Revision:	https://reviews.freebsd.org/D17308
2020-06-13 00:59:36 +00:00
John Baldwin
d93010c598 Allow <sys/elf_common.h> to be used in assembly.
Hide C-only declarations under #ifndef LOCORE.  This will be used by
future changes to define ELF notes in assembly.

Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25211
2020-06-12 23:43:44 +00:00
John Baldwin
4f3c25bce0 Allow <sys/param.h> to be included from userland assembly files.
This will be used by future changes to define ELF notes in assembly.

Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25211
2020-06-12 23:42:36 +00:00
John Baldwin
26d292d3e2 Various optimizations to software AES-CCM and AES-GCM.
- Make use of cursors to avoid data copies for AES-CCM and AES-GCM.

  Pass pointers into the request's input and/or output buffers
  directly to the Update, encrypt, and decrypt hooks rather than
  always copying all data into a temporary block buffer on the stack.

- Move handling for partial final blocks out of the main loop.

  This removes branches from the main loop and permits using
  encrypt/decrypt_last which avoids a memset to clear the rest of the
  block on the stack.

- Shrink the on-stack buffers to assume AES block sizes and CCM/GCM
  tag lengths.

- For AAD data, pass larger chunks to axf->Update.  CCM can take each
  AAD segment in a single call.  GMAC can take multiple blocks at a
  time.

Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25058
2020-06-12 23:10:30 +00:00
John Baldwin
4e6a381306 Fix a regression in r361804 for TLS 1.3.
I was not including the record type stored in the first byte of the
trailer as part of the payload to be encrypted and hashed.

Sponsored by:	Netflix
2020-06-12 22:27:26 +00:00
Konstantin Belousov
17edf152e5 Control for Special Register Buffer Data Sampling mitigation.
New microcode update for Intel enables mitigation for SRBDS, which
slows down RDSEED and related instructions.  The update also provides
a control to limit the mitigation to SGX enclaves, which should
restore the speed of the random generator at the cost of potential
cross-core buffer sampling.

See https://software.intel.com/security-software-guidance/insights/deep-dive-special-register-buffer-data-sampling

Give the user control over it.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25221
2020-06-12 22:14:45 +00:00
Konstantin Belousov
958d257ed5 x86: add bits definitions for SRBDS mitigation control.
See https://software.intel.com/security-software-guidance/insights/deep-dive-special-register-buffer-data-sampling

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25221
2020-06-12 22:12:57 +00:00
Eric van Gyzen
8cc8c5864a Honor db_pager_quit in some vm_object ddb commands
These can be rather verbose.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2020-06-12 21:53:08 +00:00
Simon J. Gerraty
66d8bce379 mac_veriexec_fingerprint_check_vnode: v_writecount > 0 means active writers
v_writecount can actually be < 0 for text,
so check for v_writecount > 0

Reviewed by:	stevek
MFC after:	1 week
2020-06-12 21:51:20 +00:00
John Baldwin
b0b2161ce4 Fix AES-CCM requests with an AAD size smaller than a single block.
The amount to copy for the first block is the minimum of the size of
the AAD region or the remaining space in the first block.

Reported by:	cryptocheck -z
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25140
2020-06-12 21:33:02 +00:00
John Baldwin
822d2d6ac9 Various fixes to TLS for MIPS.
- Clear the current thread's TLS pointer on exec. Previously the TLS
  pointer (and register) remained unchanged.

- Explicitly clear the TLS pointer when new threads are created.

- Make md_tls_tcb_offset per-process instead of per-thread.

  The layout of the TLS and TCB are identical for all threads in a
  process; it is only the TLS pointer values themselves that vary by
  thread.  This also makes setting md_tls_tcb_offset in
  cpu_set_user_tls() redundant with the setting in exec_setregs(), so
  only set it in exec_setregs().

Submitted by:	Alfredo Mazzinghi (1)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24957
2020-06-12 21:21:18 +00:00
Eric van Gyzen
6fba90f201 FPU init: allocate initial state from UMA to ensure alignment
The Intel Instruction Set Reference says this about the XSAVE instruction:

    Use of a destination operand not aligned to 64-byte boundary
    (in either 64-bit or 32-bit modes) results in a general-protection
    (#GP) exception.

This alignment happens naturally when all malloc buckets are powers
of two.  However, this change is necessary on some systems when
certain non-power-of-two (and non-multiple of 64) malloc buckets
are defined.

Reviewed by:	cem; kib; earlier version by jhb
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25098
2020-06-12 21:17:56 +00:00
Eric van Gyzen
701acc2fd8 FPU: make xsave_area_desc static
...because it can be.

Reviewed by:	cem kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25098
2020-06-12 21:12:26 +00:00
Eric van Gyzen
674cbe7908 FPU init: Do potentially blocking operations before disabling interrupts
In particular, uma_zcreate creates sysctl oids, which locks an sx lock,
which uses IPIs under contention.  IPIs tend not to work very well
when interrupts are disabled.  Who knew, right?

Reviewed by:	cem kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25098
2020-06-12 21:10:45 +00:00
Randall Stewart
f092a3c71c So it turns out with the right window scaling you can get the code in all stacks to
always want to do a window update, even when no data can be sent. Now in
cases where you are not pacing that's probably ok; you just send an extra
window update or two. However, with bbr (and rack if it's paced) every time
the pacer goes off it's going to send a "window update".

Also in testing bbr I have found that if we are not responding to
data right away we end up staying in startup but incorrectly holding
a pacing gain of 192 (a loss). This is because the idle window code
does not restrict itself to only work with PROBE_BW. In all other
states you don't want it doing a PROBE_BW state change.

Sponsored by:	Netflix Inc.
Differential Revision: 	https://reviews.freebsd.org/D25247
2020-06-12 19:56:19 +00:00
Andrew Gallatin
6da16e3eb0 x86: Bump default msi/msix vector limit to 2048
Given that 64c/128t CPUs are currently available, and that many
devices (nvme, many NICs) desire to map 1 MSI-X vector per core,
or even 1 per-thread, it is becoming far easier to see MSI-X interrupt
setup fail due to msi vector exhaustion, and devices fail to attach at
boot on large systems.

This bump costs 12KB on amd64 (and 6KB on i386), which seems
worth the trade off for a better out of the box experience on
high end hardware.

Reviewed by:	jhb
MFC after:	21 days
Sponsored by:	Netflix
2020-06-12 18:41:12 +00:00
Doug Moore
13dca1937f Revert r362108, as it breaks compilation.
2020-06-12 17:48:12 +00:00
Ruslan Bukin
72842e4697 Coresight replicator:
o Add a header file;
o Split-out FDT attachment to a separate file;
o Add ACPI attachment.

Sponsored by:	DARPA, AFRL
2020-06-12 17:31:38 +00:00
Doug Moore
3159ceca97 The linuxkpi code accesses left/right rb tree pointers without using
RB_LEFT or RB_RIGHT, so they aren't stripping off the color bit
encoded there. Strip off that bit for linuxkpi.

Reported by:	dch
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D25245
2020-06-12 16:51:55 +00:00
Michael Tuexen
3ee11586b2 Whitespace change due to upstream cleanup.
MFC after:		1 week
2020-06-12 16:40:10 +00:00
Michael Tuexen
2f9e6db0be More cleanups due to ifdef cleanup done upstream
MFC after:		1 week
2020-06-12 16:31:13 +00:00
Edward Tomasz Napierala
462171d9aa Add compat.linux.debug sysctl, to make it possible to silence
the debug messages. While here, clean up some variable naming.

Reviewed by:	bcr (manpages), emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25230
2020-06-12 14:37:50 +00:00
Edward Tomasz Napierala
599dadca55 Fix naming clash.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-12 14:31:19 +00:00
Edward Tomasz Napierala
34ff0c0e6a Make linux(4) warn about unsupported fcntls.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25231
2020-06-12 14:25:32 +00:00