Commit Graph

132652 Commits

Author SHA1 Message Date
Oleksandr Tymoshenko
c847212986 Remove licenses
I haven't requested explicit permission from authors and shouldn't have
added BSDL headers without it.

Requestes by:	imp
2020-06-04 17:20:58 +00:00
Mark Johnston
e9ee2675cb Update vt(4) config option names to chase r303043.
PR:		246080
Submitted by:	David Marec <david@lapinbilly.eu>
MFC after:	1 week
2020-06-04 16:05:24 +00:00
Eugene Grosbein
47cb0632e8 ipfw: unbreak matching with big table type flow.
Test case:

# n=32769
# ipfw -q table 1 create type flow:proto,dst-ip,dst-port
# jot -w 'table 1 add tcp,127.0.0.1,' $n 1 | ipfw -q /dev/stdin
# ipfw -q add 5 unreach filter-prohib flow 'table(1)'

The rule 5 matches nothing without the fix if n>=32769.

With the fix, it works:
# telnet localhost 10001
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Permission denied
telnet: Unable to connect to remote host

MFC after:	2 weeks
Discussed with: ae, melifaro
2020-06-04 14:15:39 +00:00
Andriy Gapon
e84d431622 superio: do not assume that current LDN cannot change after config exit
That assumption should be true when superio(4) uses the hardware
exlusively.  But it turns out to not hold on some real systems.
So, err on the side of correctness rather than performance.
Clear current_ldn in sio_conf_exit.

Reported by:	bz
Tested by:	bz
MFC after:	1 week
2020-06-04 13:18:21 +00:00
Konstantin Belousov
7428630b75 UFS: write inode block for fdatasync(2) if pointers in inode where allocated
The fdatasync() description in POSIX specifies that
    all I/O operations shall be completed as defined for synchronized I/O
    data integrity completion.
and then the explanation of Synchronized I/O Data Integrity Completion says
    The write is complete only when the data specified in the write
    request is successfully transferred and all file system
    information required to retrieve the data is successfully
    transferred.

For UFS this means that all pointers must be on disk. Indirect
pointers already contribute to the list of dirty data blocks, so only
direct blocks and root pointers to indirect blocks, both of which
reside in the inode block, should be taken care of. In ffs_balloc(),
mark the inode with the new flag IN_IBLKDATA that specifies that
ffs_syncvnode(DATA_ONLY) needs a call to ffs_update() to flush the
inode block.

Reviewed by:	mckusick
Discussed with:	tmunro
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25072
2020-06-04 12:23:15 +00:00
Oleksandr Tymoshenko
3a3bc1b1fd Add copyright headers to spigen overlays for rpi3 and rpi4
Reported by:	Rodney W. Grimes <freebsd@gndrsh.dnsmgr.net> (for rpi4)
2020-06-04 02:36:41 +00:00
Ed Maste
4d13f78444 Correct terminology in vm.imply_prot_max sysctl description
As with r361769 (man page), PROT_* are properly called protections, not
permissions.

MFC after:	1 week
MFC with:	r361769
Sponsored by:	The FreeBSD Foundation
2020-06-04 01:49:29 +00:00
John Baldwin
8c27b7a98b Add opt_compat.h needed by r359374.
Reported by:	kevans
2020-06-03 23:21:44 +00:00
Adrian Chadd
e649b526cc [run] Fix up tx/rx frame size.
This specifically fixes that TX frames are large enough now to hold a 3900 odd
byte AMSDU (the little ones); me flipping it on earlier messed up transmit!

Tested:

* if_run, STA mode, TX/RX TCP/UDP iperf.  TCP is now back to normal and
  correctly does ~ 3200 byte AMSDU/fast frames (2x1600ish byte MSDUs).
2020-06-03 22:30:44 +00:00
John Baldwin
1a4a7e98eb Explicitly zero IVs on the stack.
Reviewed by:	delphij
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25057
2020-06-03 22:19:52 +00:00
John Baldwin
0065d9a47f Explicitly zero AES key schedules on the stack.
Reviewed by:	delphij
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25057
2020-06-03 22:18:21 +00:00
Oleksandr Tymoshenko
eb5e1004e2 Add spigen overlay for Raspberry Pi 4
Submitted by:	gergely.czuczy@harmless.hu
2020-06-03 22:18:15 +00:00
John Baldwin
66f2e4b620 Explicitly zero on-stack IVs, tags, and HMAC keys.
Reviewed by:	delphij
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25057
2020-06-03 22:15:11 +00:00
John Baldwin
20c128da91 Add explicit bzero's of sensitive data in software crypto consumers.
Explicitly zero IVs, block buffers, and hashes/digests.

Reviewed by:	delphij
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25057
2020-06-03 22:11:05 +00:00
Oleksandr Tymoshenko
0897babceb Add dtb for Firefly RK3399 to the list of Rockchip dtbs 2020-06-03 21:19:57 +00:00
Adrian Chadd
53652fb94e [otus] enable 802.11n for 2GHz and 5GHz.
This flips on basic 11n for 2GHz/5GHz station operation.

* It flips on HT20 and MCS rates;
* It enables A-MPDU decap - the payload format is a bit different;
* It does do some basic checks for HT40 but I haven't yet flipped on
  HT40 support;
* It enables software A-MSDU transmit; I honestly don't want to make
  A-MPDU TX work and there are apparently issues with QoS and A-MPDU TX.
  So I totally am ignoring A-MPDU TX;
* MCS rate transmit is fine.

I haven't:

* A-MPDU TX, as I said above;
* made radiotap work fully;
* HT40;
* short-GI support;
* lots of other stuff that honestly no-one is likely to use.

But! Hey, this is another ye olde 11n USB NIC that now works pretty OK
in 11n rates. A-MPDU receive seems fine enough given it's a draft-n
device from before 2010.

Tested:

* Ye olde UB82 Test NIC (AR9170 + AR9104) - 2GHz/5GHz
2020-06-03 20:25:02 +00:00
John Baldwin
093a8f8daf Revise r361712 to disable tcpmd5.ko for 'options TCP_SIGNATURE' 2020-06-03 18:42:28 +00:00
Vincenzo Maffione
e8c07b1246 netmap: vtnet: clean up rxsync disabled logs
MFC after:	1 week
2020-06-03 17:47:32 +00:00
Vincenzo Maffione
1b6d5a80a6 netmap: vtnet: fix race condition in rxsync
This change prevents a race that happens when rxsync dequeues
N-1 rx packets (with N being the size of the netmap rx ring).
In this situation, the loop exits without re-enabling the
rx interrupts, thus causing the VQ to stall.

MFC after:	1 week
2020-06-03 17:46:21 +00:00
Vincenzo Maffione
2d769e25b1 netmap: vtnet: add vtnrx_nm_refill index to receive queues
The new index tracks the next netmap slot that is going
to be enqueued into the virtqueue. The index is necessary
to prevent the receive VQ and the netmap rx ring from going
out of sync, considering that we never enqueue N slots, but
at most N-1. This change fixes a bug that causes the VQ
and the netmap ring to go out of sync after N-1 packets
have been received.

MFC after:	1 week
2020-06-03 17:42:17 +00:00
Ryan Moeller
78a3645fd2 Fix typo in previous commit
Applied the wrong patch

Reported by:	Michael Butler <imb@protected-networks.net>
Approved by:	mav (mentor)
Sponsored by:	iXsystems.com
2020-06-03 17:26:00 +00:00
Ryan Moeller
f057d56c6c scope6: Check for NULL afdata before dereferencing
Narrows the race window with if_detach.

Approved by:	mav (mentor)
MFC after:	3 days
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25017
2020-06-03 16:57:30 +00:00
Randall Stewart
2cf21ae559 We should never allow either the broadcast or IN_ADDR_ANY to be
connected to or sent to. This was fond when working with Michael
Tuexen and Skyzaller. Skyzaller seems to want to use either of
these two addresses to connect to at times. And it really is
an error to do so, so lets not allow that behavior.

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D24852
2020-06-03 14:16:40 +00:00
Randall Stewart
f1ea4e4120 This fixes a couple of skyzaller crashes. Most
of them have to do with TFO. Even the default stack
had one of the issues:

1) We need to make sure for rack that we don't advance
   snd_nxt beyond iss when we are not doing fast open. We
   otherwise can get a bunch of SYN's sent out incorrectly
   with the seq number advancing.
2) When we complete the 3-way handshake we should not ever
   append to reassembly if the tlen is 0, if TFO is enabled
   prior to this fix we could still call the reasemmbly. Note
   this effects all three stacks.
3) Rack like its cousin BBR should track if a SYN is on a
   send map entry.
4) Both bbr and rack need to only consider len incremented on a SYN
   if the starting seq is iss, otherwise we don't increment len which
   may mean we return without adding a sendmap entry.

This work was done in collaberation with Michael Tuexen, thanks for
all the testing!
Sponsored by:	Netflix Inc
Differential Revision:	https://reviews.freebsd.org/D25000
2020-06-03 14:07:31 +00:00
Michael Tuexen
d442a65733 Restrict enabling TCP-FASTOPEN to end-points in CLOSED or LISTEN state
Enabling TCP-FASTOPEN on an end-point which is in a state other than
CLOSED or LISTEN, is a bug in the application. So it should not work.
Also the TCP code does not (and needs not to) handle this.
While there, also simplify the setting of the TF_FASTOPEN flag.

This issue was found by running syzkaller.

Reviewed by:		rrs
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D25115
2020-06-03 13:51:53 +00:00
Andrey V. Elsukov
dd4490fdab Add if_reassing method to all tunneling interfaces.
After r339550 tunneling interfaces have started handle appearing and
disappearing of ingress IP address on the host system.
When such interfaces are moving into VNET jail, they lose ability to
properly handle ifaddr_event_ext event. And this leads to need to
reconfigure tunnel to make it working again.

Since moving an interface into VNET jail leads to removing of all IP
addresses, it looks consistent, that tunnel configuration should also
be cleared. This is what will do if_reassing method.

Reported by:	John W. O'Brien <john saltant com>
MFC after:	1 week
2020-06-03 13:02:31 +00:00
Ryan Moeller
693d10a291 tmpfs: Preserve alignment of struct fid fields
On 64-bit platforms, the two short fields in `struct tmpfs_fid` are padded to
the 64-bit alignment of the long field.  This pushes the offsets of the
subsequent fields by 4 bytes and makes `struct tmpfs_fid` bigger than
`struct fid`.  `tmpfs_vptofh()` casts a `struct fid *` to `struct tmpfs_fid *`,
causing 4 bytes of adjacent memory to be overwritten when the struct fields are
set.  Through several layers of indirection and embedded structs, the adjacent
memory for one particular call to `tmpfs_vptofh()` happens to be the stack
canary for `nfsrvd_compound()`.  Half of the canary ends up being clobbered,
going unnoticed until eventually the stack check fails when `nfsrvd_compound()`
returns and a panic is triggered.

Instead of duplicating fields of `struct fid` in `struct tmpfs_fid`, narrow the
struct to cover only the unique fields for tmpfs and assert at compile time
that the struct fits in the allotted space.  This way we don't have to
replicate the offsets of `struct fid` fields, we just use them directly.

Reviewed by:	kib, mav, rmacklem
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25077
2020-06-03 09:38:51 +00:00
Vincenzo Maffione
06f6997eb5 netmap: vale: fix disabled logs
MFC after:	1 week
2020-06-03 05:49:19 +00:00
Vincenzo Maffione
81d2cade1c netmap: vtnet: remove leftover memory barriers
MFC after:	1 week
2020-06-03 05:48:42 +00:00
Vincenzo Maffione
f0d8d352c0 netmap: vtnet: call netmap_rx_irq() under VQ lock
The netmap_rx_irq() function normally wakes up user-space threads
waiting for more packets. In this case, it is not necessary to
call it under the driver queue lock. However, if the interface is
attached to a VALE switch, netmap_rx_irq() ends up calling rxsync
on the interface (see netmap_bwrap_intr_notify()). Although
concurrent rxsyncs are serialized through the kring lock
(see nm_kr_tryget()), the lock acquire operation is not blocking.
As a result, it may happen that netmap_rx_irq() is called on
an RX ring while another instance is running, causing the
second call to fail, and received packets stall in the receive VQ.
We fix this issue by calling netmap_irx_irq() under the VQ lock.

MFC after:	1 week
2020-06-03 05:27:29 +00:00
Vincenzo Maffione
1b89d00bd4 netmap: vtnet: honor NM_IRQ_RESCHED
The netmap_rx_irq() function may return NM_IRQ_RESCHED to inform the
driver that more work is pending, and that netmap expects netmap_rx_irq()
to be called again as soon as possible.
This change implements this behaviour in the vtnet driver.

MFC after:	1 week
2020-06-03 05:09:33 +00:00
Jason A. Harmening
1dccf71b4b Remove unnecessary WITNESS check in x86 bus_dma
When I did some bus_dma cleanup in r320528, I brought forward some sketchy
WITNESS checks from the prior x86 busdma wrappers, instead of recognizing
them as technical debt and just dropping them.  Two of these were removed in
r346351 and r346851, but one remains in bounce_bus_dmamem_alloc(). This check
could be constrained to only apply in the BUS_DMA_NOWAIT case, but it's cleaner
to simply remove it and rely on the checks already present in the sleepable
allocation paths used by this function.

While here, remove another unnecessary witness check in bus_dma_tag_create
(the tag is always allocated with M_NOWAIT), and fix a couple of typos.

Reported by:	cem
Reviewed by:	kib, cem
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25107
2020-06-03 00:16:36 +00:00
Adrian Chadd
6bc40d8d83 [run] note that PHY_HT is for mixed mode.
Submitted by:	Ashish Gupta <ashishgu@andrew.cmu.edu>
Differential Revision:	https://reviews.freebsd.org/D25108
2020-06-02 22:37:53 +00:00
Adrian Chadd
bb7234be77 [run] Set the number of HT chains.
* Set the tx/rx chains based on the existing MIMO eeprom reads
* Add 3-chain rates

Tested:

* MAC/BBP RT5390 (rev 0x0502), RF RT5370 (MIMO 1T1R), 2g/5g STA
* MAC/BBP RT3593 (rev 0x0402), RF RT3053 (MIMO 3T3R), 2g/5g STA
2020-06-02 22:36:17 +00:00
Doug Moore
9062e428f8 Remove from RB_REMOVE_COLOR some null checks where the pointer checked
is provably never null.  Restructure the surrounding code just enough
to make the non-nullness obvious.

Reviewed by:	markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D25089
2020-06-02 17:18:16 +00:00
Adrian Chadd
8b05d37a76 [run] Add 11NA flags for 5G NICs that support HT.
Now that I'm a proud owner of an ASUS USB-N66, I can test 2G/5G and
3-stream configurations.

For now, just flip on 5G HT rates.  I've tested this in both
5G HT20 and 5G 11a modes.  It's still one stream for now until
we verify that the number of streams reported (ie the MIMO below)
is actually the number of 11n streams, NOT the number of antennas.
(They don't have to match! You can have more antennas than MIMO
streams!)

Tested:

* run0: MAC/BBP RT3593 (rev 0x0402), RF RT3053 (MIMO 3T3R)
2020-06-02 16:40:58 +00:00
Hans Petter Selasky
d053391cd7 Implement __is_constexpr() function macro in the LinuxKPI.
Bump the FreeBSD version.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-02 12:23:04 +00:00
Hans Petter Selasky
ef5f8c18b5 Implement struct_size() function macro in the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-02 10:19:45 +00:00
Hans Petter Selasky
c185f13b92 Implement BUILD_BUG_ON_ZERO() in the LinuxKPI.
Tested using gcc and clang.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-02 09:45:43 +00:00
Jason A. Harmening
ef1eabca5d vt(4): reset scrollback and cursor position after clearing history buffer
r361601 implemented basic support for cleaing the console history buffer.
But after clearing the history buffer, it's not especially useful to be
able to scroll back through that buffer, or for the cursor position to
remain at (very likely) the bottom of the screen.

PR:		224436
Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D25079
2020-06-02 01:21:48 +00:00
Vladimir Kondratyev
ec45be6c36 [psm] Workaround active PS/2 multiplexor hang
which happens on some laptops after returning to legacy multiplexing mode
at initialization stage.

PR:		242542
Reported by:	Felix Palmen <felix@palmen-it.de>
MFC after:	1 week
2020-06-02 01:04:49 +00:00
Vladimir Kondratyev
8137fb2e38 [psm] Do not disable trackpoint when hw.psm.elantech.touchpad_off is enabled
PR:		246117
Reported by:	Alexander Sieg <ports@xanderio.de>
MFC after:	1 week
2020-06-02 00:53:39 +00:00
Kyle Evans
f45b131296 modules: don't build ipsec/tcpmd5 if the kernel is configured for IPSEC
IPSEC_SUPPORT can currently only cope with either IPSEC || IPSEC_SUPPORT,
not both. Refrain from building if IPSEC is set, as the resulting module
won't be able to load anyways if it's built into the kernel.

KERN_OPTS is safe here; for tied modules, it will reflect the kernel
configuration. For untied modules, it will defer to whatever is set in
^/sys/conf/config.mk, which doesn't set IPSEC for modules. The latter
situation has some risk to it for uncommon scenarios, but such is the life
of untied kernel modules.

Reported by:	jenkins (a lot), O. Hartmann (once)
Generally discussed with:	imp, jhb
2020-06-02 00:32:36 +00:00
Rick Macklem
c13e414dc2 Fix build issue introduced by r361699.
Reported by:	cy (and others)
2020-06-02 00:03:26 +00:00
Alexander V. Chernikov
41e66f4eca Add rib subscription API.
Currently there is no easy way of subscribing for the routing table changes.
The only existing way is to set ifa_rtrequest callback in the each protocol
 ifaddr, which is not convenient or extandable.

This change provides generic notification subscription mechanism, that will
 replace current ifa_rtrequest one and allow other applications such as
 accelerated routing lookup modules subscribe for the changes.

In particular, this change provides 2 hooks: 1) synchronous one
 (RIB_NOTIFY_IMMEDIATE), called under RIB_WLOCK, which ensures exact
 ordering of the changes and 2) async one, (RIB_NOTIFY_DELAYED)
 that is called after the change w/o holding locks. The latter one does not
 provide any notification ordering guarantee.

Differential Revision:  https://reviews.freebsd.org/D25070
2020-06-01 21:52:24 +00:00
Alexander V. Chernikov
46cc6153d4 Finish r361706: add sys/net/route/route_ctl.h, missed in previous commit. 2020-06-01 21:51:20 +00:00
Alexander V. Chernikov
da187ddb3d * Add rib_<add|del|change>_route() functions to manipulate the routing table.
The main driver for the change is the need to improve notification mechanism.
Currently callers guess the operation data based on the rtentry structure
 returned in case of successful operation result. There are two problems with
 this appoach. First is that it doesn't provide enough information for the
 upcoming multipath changes, where rtentry refers to a new nexthop group,
 and there is no way of guessing which paths were added during the change.
 Second is that some rtentry fields can change during notification and
 protecting from it by requiring customers to unlock rtentry is not desired.

Additionally, as the consumers such as rtsock do know which operation they
 request in advance, making explicit add/change/del versions of the functions
 makes sense, especially given the functions don't share a lot of code.

With that in mind, introduce rib_cmd_info notification structure and
 rib_<add|del|change>_route() functions, with mandatory rib_cmd_info pointer.
 It will be used in upcoming generalized notifications.

* Move definitions of the new functions and some other functions/structures
 used for the routing table manipulation to a separate header file,
 net/route/route_ctl.h. net/route.h is a frequently used file included in
 ~140 places in kernel, and 90% of the users don't need these definitions.

Reviewed by:		ae
Differential Revision:	https://reviews.freebsd.org/D25067
2020-06-01 20:49:42 +00:00
Alexander V. Chernikov
e7403d0230 Revert r361704, it accidentally committed merged D25067 and D25070. 2020-06-01 20:40:40 +00:00
Alexander V. Chernikov
79674562b8 * Add rib_<add|del|change>_route() functions to manipulate the routing table.
The main driver for the change is the need to improve notification mechanism.
Currently callers guess the operation data based on the rtentry structure
 returned in case of successful operation result. There are two problems with
 this appoach. First is that it doesn't provide enough information for the
 upcoming multipath changes, where rtentry refers to a new nexthop group,
 and there is no way of guessing which paths were added during the change.
 Second is that some rtentry fields can change during notification and
 protecting from it by requiring customers to unlock rtentry is not desired.

Additionally, as the consumers such as rtsock do know which operation they
 request in advance, making explicit add/change/del versions of the functions
 makes sense, especially given the functions don't share a lot of code.

With that in mind, introduce rib_cmd_info notification structure and
 rib_<add|del|change>_route() functions, with mandatory rib_cmd_info pointer.
 It will be used in upcoming generalized notifications.

* Move definitions of the new functions and some other functions/structures
 used for the routing table manipulation to a separate header file,
 net/route/route_ctl.h. net/route.h is a frequently used file included in
 ~140 places in kernel, and 90% of the users don't need these definitions.

Reviewed by:	ae
Differential Revision: https://reviews.freebsd.org/D25067
2020-06-01 20:32:02 +00:00
Brandon Bergren
30dc2aebd7 [PowerPC] Fix build-id note on powerpc64 kernel
Due to the ordering of the powerpc64 linker script, we were discarding
all notes before emitting .note.gnu.build-id. This had the effect of
generating an empty build id section and breaking the kern.build_id
sysctl added in r348611.

powerpc and powerpcspe are uneffected.

PR:		246430
MFC after:	3 days
Sponsored by:	Tag1 Consulting, Inc.
2020-06-01 19:40:59 +00:00
Ryan Moeller
1cfffed85d Assign default security flavor when converting old export args
vfs_export requires security flavors be explicitly listed when
exporting as of r360900.

Use the default AUTH_SYS flavor when converting old export args to
ensure compatibility with the legacy mount syscall.

Reported by:	rmacklem
Reviewed by:	rmacklem
Approved by:	mav (mentor)
MFC after:	3 days
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25045
2020-06-01 18:43:51 +00:00
Vincenzo Maffione
9ec71596c0 netmap: if_vtnet: avoid netmap ring wraparound
netmap assumes the one "slot" is left unused to distinguish
the empty ring and full ring conditions. This assumption was
violated by vtnet_netmap_rxq_populate().

MFC after:	1 week
2020-06-01 16:14:29 +00:00
Vincenzo Maffione
36f2d67026 netmap: if_vtnet: replace vtnet_free_used()
The functionality contained in this function is duplicated,
as it is already available in vtnet_txq_free_mbufs()
and vtnet_rxq_free_mbufs().

MFC after:	1 week
2020-06-01 16:12:09 +00:00
Vincenzo Maffione
c9de157d36 netmap: vtnet: fix RX virtqueue initialization bug
The vtnet_netmap_rxq_populate() function erroneously assumed
that kring->nr_hwcur = 0, i.e. the kring was in the initial
state. However, this is not always the case: for example,
when a vtnet reinit is triggered by some changes in the
interface flags or capenable.
This patch changes the behaviour of vtnet_netmap_kring_refill()
so that it always starts publishing the netmap buffers starting
from the current value of kring->nr_hwcur.

MFC after:	1 week
2020-06-01 16:10:44 +00:00
Mateusz Guzik
b4ee4ce158 Remove ->f_label from struct file
The field was added in r141137 in 2005 and is unused.

It avoidably grows a struct which is NOFREE and easily gets hundreds of
thousands of instances.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D25036
2020-06-01 15:58:22 +00:00
Adrian Chadd
f6287cc63c [ath] Don't re-program the beacon timers if we miss a beacon in software-beacon STA mode.
This is something I added a few years ago to handle resyncing the beacon if
we miss a beacon or need to sync after association/reassociation/powersave.

However, if we're doing STA+AP mode (eg DWDS) then we don't want
to reprogram the beacons here; this may upset normal AP operation.
I missed checking for the sc->sc_swbmiss flag so I was reinitialising
the beacon timers after every beacon miss / TSFOOR option, and
that isn't likely good.

This plus ensuring that STA's are created with "-beacon" to disable
BMISS/TSFOOR processing will hopefully quieten some of the issues
I've seen with missed beacons / TSFOOR (out of range) interrupts
coming in when operating in STA mode.

Tested:

* AR9380/AR9580, STA+AP modes
2020-06-01 06:10:25 +00:00
Peter Wemm
694f3fc81c Clarify which hints file is the source of an error message.
PR:		246688
Submitted by:	Ashish Gupta <lrx337@gmail.com>
MFC after:	1 week
2020-06-01 03:37:58 +00:00
Matt Macy
1f93e931d9 Fix panics when using iflib pseudo device support
Reviewed by:	gallatin@, hselasky@
MFC after:	1 week
Sponsored by:	Netgate, Inc.
Differential Revision:	https://reviews.freebsd.org/D23710
2020-05-31 18:42:00 +00:00
Mark Johnston
0cfac4d5c6 Handle getcpu() calls in vsyscall emulation on amd64.
linux_getcpu() has been implemented since r356241.

PR:		246339
Submitted by:	John Hay <john@sanren.ac.za>
MFC after:	1 week
2020-05-31 18:20:20 +00:00
Mitchell Horne
cd9207569f Remove remnant of arm's ELF trampoline
The trampoline code used for loading gzipped a.out kernels on arm was
removed in r350436. A portion of this code allowed for DDB to find the
symbol tables when booting without loader(8), and some of this was
untouched in the removal. Remove it now.

Differential Revision:	https://reviews.freebsd.org/D24950
2020-05-31 14:43:04 +00:00
Li-Wen Hsu
b7596ac187 Fix directly building in sys/modules
Sponsored by:	The FreeBSD Foundation
2020-05-31 05:02:15 +00:00
Rick Macklem
c19cba61e9 Add the .h file that describes the operations for the rpctls_syscall.
This .h file will be used by the nfs-over-tls daemons to do the system
call that was added by r361599.
2020-05-31 01:12:52 +00:00
Ed Maste
36f0d336a7 elf_common.h: define DF_1_PIE
DF_1_PIE indicates that the object is a position-independent executable.

Reference:
https://docs.oracle.com/cd/E36784_01/html/E36857/chapter6-42444.html

MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2020-05-30 19:57:26 +00:00
Mike Karels
51cefda170 genet: workaround for problem with ICMPv6 echo replies
The ICMPv6 echo reply is constructed with the IPv6 header too close to
the beginning of a packet for an Ethernet header to be prepended, so we
end up with an mbuf containing just the Ethernet header.  The GENET
controller doesn't seem to handle this, with or without transmit checksum
offload.  At least until we have chip documentation, do a pullup to
satisfy the chip.  Hopefully this can be fixed properly in the future.
2020-05-30 02:09:36 +00:00
Mike Karels
0add2a5229 genet: fix issues with transmit checksum offload
Fix problem with ICMP echo replies: check only deferred data checksum
flags, and not the received checksum status bits, when checking whether
a packet has a deferred checksum; otherwise echo replies are corrupted
because the received checksum status bits are still present.

Fix some unhandled cases in packet shuffling for checksum offload.
2020-05-30 02:02:34 +00:00
Doug Moore
9628f49377 RB_REMOVE invokes RB_REMOVE_COLOR either when child is red or child is
null. In the first case, RB_REMOVE_COLOR just changes the child to
black and returns. With this change, RB_REMOVE handles that case, and
drops the child argument to RB_REMOVE_COLOR, since that value is
always null.

RB_REMOVE_COLOR is changed to remove a couple of unneeded tests, and
to eliminate some deep indentation.

RB_ISRED is defined to combine a null check with a test for redness,
to replace that combination in several places.

Reviewed by:	markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D25032
2020-05-30 01:48:12 +00:00
John Baldwin
1319a76179 Only build ipsec modules if the kernel includes IPSEC_SUPPORT.
Honoring the kernel-supplied opt_ipsec.h in r361632 causes builds of
ipsec modules to fail if the kernel doesn't include IPSEC_SUPPORT.
However, the module can never be loaded into such a kernel, so only
build the modules if the kernel includes IPSEC_SUPPORT.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D25059
2020-05-30 00:47:03 +00:00
Adrian Chadd
c775d4ac42 [run] Don't add 11ng channels (2GHz) for RF2020
Don't also add the 11ng channels if we're not in 11n mode or net80211 will
get super weird.
2020-05-30 00:07:42 +00:00
Adrian Chadd
700af579c5 [run] Set ampdu rxmax same as linux; RF2020 isn't an 11n NIC
This is from the linux driver:

* set the ampdu rx max to 32k for 1 stream devics like mine, and
  64k for larger ones
* Don't enable 11n bits for RF2020
2020-05-30 00:06:26 +00:00
Conrad Meyer
b71dc87559 geom_part: Dispatch to partitions to create providers and aliases
This allows partitions to create additional aliases of their own.  The
default method implementations preserve the existing behavior.

No functional change.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D24938
2020-05-29 19:44:18 +00:00
John Baldwin
6f9454895c Add opt_ipsec.h to fix standalone builds after r361633. 2020-05-29 19:29:10 +00:00
John Baldwin
28d2a72bbf Consistently include opt_ipsec.h for consumers of <netipsec/ipsec.h>.
This fixes ipsec.ko to include all of IPSEC_DEBUG.

Reviewed by:	imp
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25046
2020-05-29 19:22:40 +00:00
John Baldwin
4bcbd26ff8 Honor opt_ipsec.h from kernel builds.
To make this simpler, set the default contents of opt_ipsec.h
for standalone modules in sys/conf/config.mk.

Reviewed by:	imp
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25046
2020-05-29 19:21:35 +00:00
Alexander Motin
ec18cf79e6 Remove session locking from cfiscsi_pdu_update_cmdsn().
cs_cmdsn can be incremented with single atomic.  expcmdsn/maxcmdsn set in
cfiscsi_pdu_prepare() based on cs_cmdsn are not required to be updated
synchronously, only monotonically, that is achieved with lock there.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2020-05-29 17:52:20 +00:00
Adrian Chadd
f520d76129 [run] Add initial 802.11n support.
* Enable self-generated 11n frames
* add MCS rates for 1-stream and 2-stream rates; will do 3-stream
  once the rest of this tests out OK with other people.
* Hard-code 1 stream for now
* Add A-MPDU RX mbuf tagging
* RTS/CTS if doing RTSCTS in HT protmode as well as legacy; they're
  separate configuration flags
* Update the amrr rate index stuff - walk the rates array like others
  to find the right one - this now works for MCS and CCK/OFDM rates
* Add support for atheros fast frames/AMSDU support as we can generate
  those in net80211.

TODO:

* HT40 isn't enabled yet
* No A-MPDU support just yet; that requires some more firmware research
  and maybe porting some ath(4) A-MPDU support/tracking into net80211
* Short preamble flags aren't set yet for MCS; need to check the linux
  driver and see what's going on there
* Add 3x3 rates and set tx/rx stream configuration appropriately
* More 5GHz testing; I have a 3x3 dual band USB NIC coming soon that'll
  let me test this.
* Figure out why the RX path isn't performing as fast as it could -
  there's only a single buffer loaded at a time for the receive path
  in the USB bulk handler and this may not be super useful.

Tested:

* RT5390 usb, 1x1, RF5370 (2GHz radio), STA mode - A-MSDU TX, A-MPDU RX

Submitted by:	Ashish Gupta <ashishgu@andrew.cmu.edu>
Differential Revision:	https://reviews.freebsd.org/D22840
2020-05-29 15:56:44 +00:00
Alexander Motin
dbcf7598b0 Report STATUS_QUEUED/SENT in ctladm dumpooa output.
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2020-05-29 13:07:52 +00:00
Andrey V. Elsukov
e43ae8dcb5 Fix O_IP_FLOW_LOOKUP opcode handling.
Do not check table value matching when table lookup has failed.

Reported by:	Sergey Lobanov
MFC after:	1 week
2020-05-29 10:37:42 +00:00
Mateusz Guzik
1c58c09f5a uma: hide item_domain under ifdef NUMA
Fixes build warnings on mips.
2020-05-29 08:30:35 +00:00
Andriy Gapon
cffd37da23 do not enable pci bridge decoding on resume until I/O windows are restored
PCI bus driver restores most but not all of a child PCI-PCI bridge
configuration.  The bridge's I/O windows are restored by pcib driver and
that happens later in time.  This can be problematic because the Command
register is restored before the windows are restored.  If the firmware
programs the windows incorrectly or even does not program them at all,
then the bridge can start claiming I/O cycles that are not intended for
it.  This will continue until the correct windows are restored.

I have observed this problem with a buggy BIOS where after resuming from
S3 an I/O port window of a PCI-PCI bridge was configured with zero base
and limit causing the bridge to claim 0x0 - 0xFFF port range.  That
interfered with ACPI port access including ACPI PM Timer at port 0x808,
thus wreaking havoc in the time keeping.

The solution is to restore the Command register of PCI-PCI bridges after
the windows are restored in pcib driver.  While here, I decided that for
other PCI device types (normal and cardbus) it's better to restore the
Command register after their BARs are restored.

To do: per jhb's suggestion, move the window handling to pci driver.

Reviewed by:	imp, jhb, kib
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D25028
2020-05-29 07:50:55 +00:00
Andriy Gapon
e298466ee2 corefile_open_last: don't keep a locked vnode while locking other ones
Consider this scenario:
- kern.corefile=/var/coredumps/%N.%U.%I.core
- multiple processes with the same name crash at the same time

It's possible that one process selects existing file N as oldvp while it
keeps looking for an unused file number.  Another process scans through
files and stumbles upon N.  That process would be blocked on the vnode
lock while holding the directory vnode exclusively locked.  The first
process would, thus, get blocked on the directory's vnode lock.

More generally, holding a file's vnode lock (oldvp) while trying to lock
its directory (for the next lookup) is a violation of the vnode locking
order.

I have observed this deadlock in the wild.

So, the change to keep oldvp "opened" but unlocked and to lock it again
only if it's to be returned as the result.
As kib noted, an alternative would be to keep the directory locked and
to use VOP_LOOKUP directly for scanning through existing core files.

Reviewed by:	kib
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D25027
2020-05-29 07:44:02 +00:00
John Baldwin
4542cd9379 Increment the correct pointer when a crypto buffer spans an mbuf or iovec.
When a crypto_cursor_copyback() request spanned multiple mbufs or
iovecs, the pointer into the mbuf/iovec was incremented instead of the
pointer into the source buffer being copied from.

PR:		246737
Reported by:	Jenkins, ZFS test suite
Sponsored by:	Netflix
2020-05-29 05:41:21 +00:00
Alexander Motin
353c460050 Move EXPDATASN/R2TSN from PDU to CTL_PRIV_FRONTEND.
We any way have per-I/O space in CTL_PRIV_FRONTEND, while for PDU private
fields I have better use ideas.  Plus to me such use of PDU fields looked
a layering violation.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2020-05-29 02:32:48 +00:00
Justin Hibbits
c4bc4ae778 powerpc: Stop advertising that POWER8 and POWER9 support HTM
HTM is on the chopping block, doesn't work on FreeBSD, and has only token
support in PowerISA 3.1 and POWER10.  Don't advertise something we'll never
support.
2020-05-29 00:46:31 +00:00
Rick Macklem
3f4739659e Oops two, missed syscall.mk as well. 2020-05-29 00:10:19 +00:00
John Baldwin
2684603c5f Permit SO_NO_DDP and SO_NO_OFFLOAD to be read via getsockopt(2).
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24627
2020-05-29 00:09:12 +00:00
Adrian Chadd
5687a376df [mips] fix up the assembly generation of unaligned exception loads
I noticed that unaligned accesses were returning garbage values.

Give test data like this:

char testdata[] = { 0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde, 0xf1, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef, 0x5a };

Iterating through uint32_t space 1 byte at a time should
look like this:

freebsd-carambola2:/mnt# ./test
Hello, world!
offset 0 pointer 0x410b00 value 0x12345678 0x12345678
offset 1 pointer 0x410b01 value 0x3456789a 0x3456789a
offset 2 pointer 0x410b02 value 0x56789abc 0x56789abc
offset 3 pointer 0x410b03 value 0x789abcde 0x789abcde
offset 4 pointer 0x410b04 value 0x9abcdef1 0x9abcdef1
offset 5 pointer 0x410b05 value 0xbcdef123 0xbcdef123
offset 6 pointer 0x410b06 value 0xdef12345 0xdef12345
offset 7 pointer 0x410b07 value 0xf1234567 0xf1234567

.. but to begin with it looked like this:

offset 0 value 0x12345678
offset 1 value 0x00410a9a
offset 2 value 0x00419abc
offset 3 value 0x009abcde
offset 4 value 0x9abcdef1
offset 5 value 0x00410a23
offset 6 value 0x00412345
offset 7 value 0x00234567

The amusing reason? The compiler is generating the lwr/lwl incorrectly.
Here's an example after I tried to replace the two macros with a single
invocation and offset, rather than having the compiler compile in addiu
to s3 - but the bug is the same:

1044: 8a620003 lwl v0,0(s3)
1048: 9a730000 lwr s3,3(s3)

.. which is just totally trashy and wrong.

This explicitly tells the compiler to treat the output as being read
and written to, which is what lwl/lwr does with the destination
register.

I think a subsequent commit should unify these macros to skip an addiu,
but that can be a later commit.

Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D25040
2020-05-29 00:05:43 +00:00
Rick Macklem
f4903a79fb Oops, missed syscall.h and sysproto.h for r361602.
Pointy hat goes on me.
2020-05-28 23:57:50 +00:00
Alexander Motin
30a31f6c71 Remove PDU_TOTAL_TRANSFER_LEN() macro.
I don't see a point to copy io->scsiio.kern_total_len into the request
PDU private field.  The io is going to stay with us till the end, and
kern_total_len field is not changed after being first initialized.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2020-05-28 23:55:46 +00:00
Alexander Motin
767300e87a Make struct ctl_be_lun first element of struct ctl_be_*_lun.
It allows to remove some extra pointer dereferences and slightly tightens
up the code by unification.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2020-05-28 21:30:29 +00:00
Rick Macklem
c01cd3f558 Update the files created from the new syscalls.master from r361599.
Reviewed by:	brooks
Differential Revision:	https://reviews.freebsd.org/D24949
2020-05-28 21:23:02 +00:00
Jason A. Harmening
98f7cf022c vt(4): Add support for `vidcontrol -C'
Extract scrollback buffer initialization into a common routine, used both
during vt(4) init and in handling the CONS_CLRHIST ioctl.

PR:		224436
Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D24815
2020-05-28 21:22:30 +00:00
Jung-uk Kim
0b229c80ae MFV: r361597
Import ACPICA 20200528.
2020-05-28 21:19:44 +00:00
Rick Macklem
d9021e389a Add a syscall for the nfs-over-tls daemons to use.
The nfs-over-tls daemons need a system call to perform operations such as
associate a file descriptor with a krpc socket.
The daemons will not be in head for some time, but it will make it
easier for testers of nfs-over-tls to do testing if the system call
is in head (basically the stub for libc which will be commited soon).

Reviewed by:	brooks
Differential Revision:	https://reviews.freebsd.org/D24949
2020-05-28 21:06:10 +00:00
Mark Johnston
81302f1d77 Fix boot on systems where NUMA domain 0 is unpopulated.
- Add vm_phys_early_add_seg(), complementing vm_phys_early_alloc(), to
  ensure that segments registered during hammer_time() are placed in the
  right domain.  Otherwise, since the SRAT is not parsed at that point,
  we just add them to domain 0, which may be incorrect and results in a
  domain with only several MB worth of memory.
- Fix uma_startup1() to try allocating memory for zones from any domain.
  If domain 0 is unpopulated, the allocation will simply fail, resulting
  in a page fault slightly later during boot.
- Change _vm_phys_domain() to return -1 for addresses not covered by the
  affinity table, and change vm_phys_early_alloc() to handle wildcard
  domains.  This is necessary on amd64, where the page array is dense
  and pmap_page_array_startup() may allocate page table pages for
  non-existent page frames.

Reported and tested by:	Rafael Kitover <rkitover@gmail.com>
Reviewed by:	cem (earlier version), kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25001
2020-05-28 19:41:00 +00:00
Alexander Motin
0d7fed74c7 Remove ctl_free_beio() LUN and ctl_io dependencies.
This slightly simplifies the code, plus may be a ground for asynchronous
buffer free.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2020-05-28 18:12:05 +00:00
Mitchell Horne
dde3b16bbc Add macros simplifying the fake preload setup
This is in preparation for booting via loader(8). Lift these macros from arm64
so we don't need to worry about the size when inserting new elements. This
could have been done in r359673, but I didn't think I would be returning to
this function so soon.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D24910
2020-05-28 14:56:11 +00:00
Alexander V. Chernikov
9d5df78e64 Fix NOINET6 build broken by r361575.
Reported by:	ci, hps
2020-05-28 09:52:28 +00:00
Marcin Wojtas
ccbaa67d8b Change return types of hash update functions in SHA-NI
r359374 introduced crypto_apply function which takes as argument a function pointer
that is expected to return an int, however aesni hash update functions
return void.
Because of that the function pointer passed was simply cast with
its return value changed.
This resulted in undefined behavior, in particular when mbuf is used, (ipsec)
m_apply checks return value of function pointer passed to it
and in our case bogusly fails after calculating hash of the first mbuf
in chain.
Fix it by changing signatures of sha update routines in aesni and
dropping the casts.

Submitted by: Kornel Duleba
Reviewed by: jhb, cem
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D25030
2020-05-28 09:13:20 +00:00
Hans Petter Selasky
4ac6682cab Fix check for wMaxPacketSize in USB bluetooth driver,
in case device is not FULL speed.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2020-05-28 08:41:18 +00:00
Hans Petter Selasky
506a911bad Implement helper function, usbd_get_max_frame_length(), which allows kernel
device drivers to correctly predict the default USB transfer frame length.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2020-05-28 08:38:25 +00:00
Roger Pau Monné
06592d60f0 xen/control: short circuit xctrl_on_watch_event on spurious event
If there's no data to read from xenstore short-circuit
xctrl_on_watch_event to return early, there's no reason to continue
since the lack of data would prevent matching against any known event
type.

Sponsored by:	Citrix Systems R&D
MFC with:	r352925
MFC after:	1 week
2020-05-28 08:20:16 +00:00
Roger Pau Monné
3e7df58df2 xen/blkfront: use the correct type for disk sectors
The correct type to use to represent disk sectors is blkif_sector_t
(which is an uint64_t underneath). This avoid truncation of the disk
size calculation when resizing on i386, as otherwise the calculation
of d_mediasize in xbd_connect is truncated to the size of unsigned
long, which is 32bits on i386.

Note this issue didn't affect amd64, because the size of unsigned long
is 64bits there.

Sponsored by:	Citrix Systems R&D
MFC after:	1 week
2020-05-28 08:19:13 +00:00
Roger Pau Monné
a7752a2eda xenpv: do not use low 1MB for Xen mappings on i386
On amd64 we already avoid using memory below 4GB in order to prevent
clashes with MMIO regions, but i386 was allowed to use any hole in
the physical memory map in order to map Xen pages.

Limit this to memory above the 1MB boundary on i386 in order to avoid
clashes with the MMIO holes in that area.

Sponsored by:	Citrix Systems R&D
MFC after:	1 week
2020-05-28 08:18:34 +00:00
Hans Petter Selasky
5e0552018c Don't allow USB device drivers to parent own interface.
It will prevent proper USB device detach.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2020-05-28 08:05:46 +00:00
Alexander V. Chernikov
a37a5246ca Use fib[46]_lookup() in mtu calculations.
fib[46]_lookup_nh_ represents pre-epoch generation of fib api,
providing less guarantees over pointer validness and requiring
on-stack data copying.

Conversion is straight-forwarded, as the only 2 differences are
requirement of running in network epoch and the need to handle
RTF_GATEWAY case in the caller code.

Differential Revision:	https://reviews.freebsd.org/D24974
2020-05-28 08:00:08 +00:00
Alexander V. Chernikov
c74ce5cca3 Make NFS address selection use fib4_lookup().
fib4_lookup_nh_ represents pre-epoch generation of fib api,
providing less guarantees over pointer validness and requiring
on-stack data copying.
Switch call to use new fib4_lookup(), allowing to eventually
deprecate old api.

Differential Revision:	https://reviews.freebsd.org/D24977
2020-05-28 07:35:07 +00:00
Alexander V. Chernikov
3553b3007f Switch ip_output/icmp_reflect rt lookup calls with fib4_lookup.
fib4_lookup_nh_ represents pre-epoch generation of fib api,
providing less guarantees over pointer validness and requiring
on-stack data copying.

Conversion is straight-forwarded, as the only 2 differences are
requirement of running in network epoch and the need to handle
RTF_GATEWAY case in the caller code.

Reviewed by:	ae
Differential Revision:	https://reviews.freebsd.org/D24976
2020-05-28 07:31:53 +00:00
Alexander V. Chernikov
1483c1c508 Replace ip6_ouput fib6_lookup_nh_<ext|basic> calls with fib6_lookup().
fib6_lookup_nh_ represents pre-epoch generation of fib api,
providing less guarantees over pointer validness and requiring
on-stack data copying.

Conversion is straight-forwarded, as the only 2 differences are
requirement of running in network epoch and the need to handle
RTF_GATEWAY case in the caller code.

Reviewed by:	ae
Differential Revision:	https://reviews.freebsd.org/D24973
2020-05-28 07:29:44 +00:00
Alexander V. Chernikov
7bfc98af12 Switch gif(4) path verification to fib[46]_check_urfp().
fibX_lookup_nh_ represents pre-epoch generation of fib api,
providing less guarantees over pointer validness and requiring
on-stack data copying.
Use specialized fib[46]_check_urpf() from newer KPI instead,
to allow removal of older KPI.

Reviewed by:	ae
Differential Revision:	https://reviews.freebsd.org/D24978
2020-05-28 07:26:18 +00:00
Alexander V. Chernikov
cb86ca48bf Unlock rtentry before calling for epoch(9) destruction as the destruction
may happen immediately, leading to panic.

Reported by:	bdragon
2020-05-28 07:23:27 +00:00
Justin Hibbits
9a47dfc308 powerpc/pmap: Remove some debug from r361544 2020-05-28 03:08:50 +00:00
Brandon Bergren
9b51a10a9c [PowerPC] Fix radix crash when passing -1 from userspace
Found by running libc tests with radix enabled.

Detect unsigned integer wrapping with a postcondition.

Note: Radix MMU is not enabled by default yet.

Sponsored by:	Tag1 Consulting, Inc.
2020-05-28 00:49:02 +00:00
Rick Macklem
469f2e9e9a Fix sosend() for the case where mbufs are passed in while doing ktls.
For kernel tls, sosend() needs to call ktls_frame() on the mbuf list
to be sent.  Without this patch, this was only done when sosend()'s
arguments used a uio_iov and not when an mbuf list is passed in.
At this time, sosend() is never called with an mbuf list argument when
kernel tls is in use, but will be once nfs-over-tls has been incorporated
into head.

Reviewed by:	gallatin, glebius
Differential Revision:	https://reviews.freebsd.org/D24674
2020-05-27 23:20:35 +00:00
Adrian Chadd
9f716a645d [ath] Update ath_rate_sample to use the same base type as ticks.
Until net80211 grows a specific ticks type that matches the system,
manually use the same type as the kernel/net80211 'ticks' type
(signed int.)

Tested:

* AR9380, STA mode
2020-05-27 22:48:34 +00:00
Konstantin Belousov
fe0dcc402f Simplify the condition to enable superpage mappings in vm_fault_soft_fast().
The list of arches list there matches the list of arches where
default VM_NRESERVLEVEL > 0.  Before sparc64 removal, that was the
only arch that defined VM_NRESERVLEVEL > 0 to help with cache coloring,
but did not implemented superpages.  Now it can be simplified.

Submitted by:	alc
Reviewed by:	markj
2020-05-27 21:44:26 +00:00
Alan Somers
2a2306099d geli: fix a livelock during panic
During any kind of shutdown, kern_reboot calls geli's pre_sync event hook,
which tries to destroy all unused geli devices. But during a panic, geli
can't destroy any devices, because the scheduler is stopped, so it can't
switch threads. A livelock results, and the system never dumps core.

This commit fixes the problem by refusing to destroy any devices during
panic, used or otherwise.

PR:		246207
Reviewed by:	jhb
MFC after:	2 weeks
Sponsored by:	Axcient
Differential Revision:	https://reviews.freebsd.org/D24697
2020-05-27 19:13:26 +00:00
Adrian Chadd
67a26c98f2 [net80211] Fix interrupted scan logic and ticks comparison
The scan task refactoring stuff circa 2014-2016 broke the blocking task
 into a taskqueue with some async bits, but it apparently broke scans
 being interrupted by traffic.

Notably - the new "field" SCAN_PAUSE sets both SCAN_INTERRUPT and SCAN_CANCEL,
and a bunch of existing code was checking for SCAN_CANCEL only and breaking
the scan. Unfortunately it was then (a) cancelling the scan entirely and
(b) not notifying userland that scan was done.

So:
* Update the calls to scan_end() to only pass in 1 (saying the scan is complete)
  if SCAN_CANCEL is set WITHOUT SCAN_INTERRUPT. If both are set then yes,
  the scan is interrupted, but it isn't canceled - it's just paused.
* Update the "did the scan flags change whilst the driver was called" logic
  to check for canceled scans, not interrupted scans.
* The "scan done" logic now explicitly checks for either interrupted or
  completed scans. This accounts for the situation where a scan is being
  aborted via traffic but it ALSO happens to have finished (ie the last
  channel was checked.)

This doesn't ENTIRELY fix scanning as the resume function is broken
due to incorrect ticks math. Thus, the second half of this patch
changes the ieee80211_ticks_*() macros to use int instead of long,
matching the logic that the TCP code does with ticks and handles
wrapping / negative ticks values. If cast to long then the wrapping
math wouldn't work right (ie, if ticks was actually negative,
ie, after the system has been up for a while.)

This allows contbgscan() to correctly calculate if a scan should
continue based on ticks and ic->ic_lastdata .

Reviewed by:	bz
Differential Revision:	https://reviews.freebsd.org/D25031
2020-05-27 18:32:12 +00:00
Emmanuel Vadot
8e52f22b25 linuxkpi: Add kstrtou16
This function convert a char * to a u16.
Simply use strtoul and cast to compare for ERANGE

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24996
2020-05-27 11:42:09 +00:00
Emmanuel Vadot
9b703f3082 linuxkpi: Add rcu_swap_protected
This macros swap an rcu pointer with a normal pointer.
The condition only seems to be used for debug/warning under linux, ignore
for now.

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24954
2020-05-27 10:01:30 +00:00
Emmanuel Vadot
8287045dde linuxkpi: Add overflow.h
Only add check_add_overflow and check_mul_overflow as those are the only
two needed function by DRM v5.3.
Both gcc and clang have builtin to do this check so use them directly
but throw an error if the compiler/code checker doesn't support this builtin.

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselsasky
Differential Revision:	https://reviews.freebsd.org/D25015
2020-05-27 09:31:50 +00:00
Andrew Turner
2cb0e95f48 Support creating and using arm64 pmap at stage 2
Add minimal support for creating stage 2 IPA -> PA mappings. For this we
need to:

 - Create a new vmid set to allocate a vmid for each Virtual Machine
 - Add the missing stage 2 attributes
 - Use these in pmap_enter to create a new mapping
 - Handle stage 2 faults

The vmid set is based on the current asid set that was generalised in
r358328. It adds a function pointer for bhyve to use when the kernel needs
to reset the vmid set. This will need to call into EL2 and invalidate the
TLB.

The stage 2 attributes have been added. To simplify setting these fields
two new functions are added to get the memory type and protection fields.
These are slightly different on stage 1 and stage 2 tables. We then use
them in pmap_enter to set the new level 3 entry to be stored.

The D-cache on all entries is cleaned to the point of coherency. This is
to allow the data to be visible to the VM. To allow for userspace to load
code when creating a new executable entry an invalid entry is created. When
the VM tried to use it the I-cache is invalidated. As the D-cache has
already been cleaned this will ensure the I-cache is synchronised with the
D-cache.

When the hardware implements a VPIPT I-cache we need to either have the
correct VMID set or invalidate it from EL2. As the host kernel will have
the wrong VMID set we need to call into EL2 to clean it. For this a second
function pointer is added that is called when this invalidation is needed.

Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D23875
2020-05-27 08:00:38 +00:00
Adrian Chadd
69c41b7071 [ata_da] remove duplicate definition; it trips up ye olde gcc-6 on mips32
Checked first with: irc
2020-05-27 02:10:09 +00:00
Justin Hibbits
d4ed51f329 Properly sort ifdef archs in vm_fault_soft_fast superpage guards.
Sort broken in r360887.
2020-05-27 01:35:46 +00:00
Justin Hibbits
45b69dd63e powerpc/mmu: Convert PowerPC pmap drivers to ifunc from kobj
With IFUNC support in the kernel, we can finally get rid of our poor-man's
ifunc for pmap, utilizing kobj.  Since moea64 uses a second tier kobj as
well, for its own private methods, this adds a second pmap install function
(pmap_mmu_init()) to perform pmap 'post-install pre-bootstrap'
initialization, before the IFUNCs get initialized.

Reviewed by:	bdragon
2020-05-27 01:24:12 +00:00
Brandon Bergren
64cc3b0c28 [PowerPC] Fix invalid asm in trap code
In this context, 0 actually means 0 (i.e. this is a li instruction).

While most assemblers will ignore this, I did have a compile failure at one
point when using an external toolchain.

In the future, we should use the li syntax to make this clearer.

Sponsored by:	Tag1 Consulting, Inc.
2020-05-27 00:17:05 +00:00
Eric Joyner
71d104536b ice(4): Introduce new driver for Intel E800 Ethernet controllers
The ice(4) driver is the driver for the Intel E8xx series Ethernet
controllers; currently with codenames Columbiaville and
Columbia Park.

These new controllers support 100G speeds, as well as introducing
more queues, better virtualization support, and more offload
capabilities. Future work will enable virtual functions (like
in ixl(4)) and the other functionality outlined above.

For full functionality, the kernel should be compiled with
"device ice_ddp" like in the amd64 NOTES file, and/or
ice_ddp_load="YES" should be added to /boot/loader.conf so that
the DDP package file included in this commit can be downloaded
to the adapter. Otherwise, the adapter will fall back to a single
queue mode with limited functionality.

A man page for this driver will be forthcoming.

MFC after:	1 month
Relnotes:	yes
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D21959
2020-05-26 23:35:10 +00:00
Conrad Meyer
9d3b7f622a x86: Detect new feature bits
Fix an off-by-one in AVX512VPOPCNTDQ identification.  That was actually the
TME bit.

Reported by:	debdrup
2020-05-26 23:12:57 +00:00
Alexander Motin
2bbcd07e39 Properly check kern_sg_entries for S/G list.
ctl_data_print() is called in core context, so does not even know meaning
of ext_sg_entries.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-05-26 19:09:19 +00:00
Brandon Bergren
9941cb0657 [PowerPC] Fix atomic_cmpset_masked().
A recent kernel change caused the previously unused atomic_cmpset_masked() to
be used.

It had a typo in it.

Instead of reading the old value from an uninitialized variable, read it
from the passed-in pointer as intended.

This fixes crashes on 64 bit Book-E.

Obtained from:	jhibbits
2020-05-26 19:03:45 +00:00
Ruslan Bukin
d75038a0af Fix entering KDB with dtrace-enabled kernel.
Reviewed by:	markj, jhb
Differential Revision:	https://reviews.freebsd.org/D24018
2020-05-26 16:44:05 +00:00
Ruslan Bukin
43843cc281 Rename dmar_get_dma_tag() to acpi_iommu_get_dma_tag().
This is needed for a new IOMMU controller support.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D24943
2020-05-26 16:40:40 +00:00
Marcin Wojtas
2287afd818 Update ENA driver version to v2.2.0
Driver version upgrade is connected with support for the new device
fetures, like Tx drops reporting or disabling meta caching.

Moreover, the driver configuration from the sysctl was reworked to
provide safer and better flow for configuring:
* number of IO queues (new feature),
* drbr size on Tx,
* Rx queue size.

Moreover, a lot of minor bug fixes and improvements were added.

Copyright date in the license of the modified files in this release was
updated to 2020.

Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc.
2020-05-26 16:11:46 +00:00
Marcin Wojtas
7381a86f47 Refactor ena_tx_map_mbuf() function
There is no guarantee from bus_dmamap_load_mbuf_sg() for matching
mbuf chain segments to dma physical segments.

This patch ensure correctly mapping to LLQ header and DMA segments.

Submitted by: Ido Segev <idose@amazon.com>
Obtained from: Amazon, Inc.
2020-05-26 16:05:42 +00:00
Marcin Wojtas
9bf7da9517 Fix double-free bug within ena_detach()
There is ena_free_all_io_rings_resources() called twice on device
detach:

ena_detach():

ena_destroy_device():
/* First call */
ena_free_all_io_rings_resources()

/* Second call */
ena_free_all_io_rings_resources()

The double-free causes panic() on kldunload, for example.

As the ena_destroy_device() is also called by ena_reset_task() it is
better to stay unchanged. Thus, remove the "Second call" of the function.

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 16:02:10 +00:00
Marcin Wojtas
0b432b702e Allow disabling meta caching for ENA Tx path
Determined by a flag passed from the device. No metadata is set within
ena_tx_csum when caching is disabled.

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 16:00:30 +00:00
Marcin Wojtas
9762a033da Create ENA IO queues with optional backoff
If requested size of IO queues is not supported try to decrease it until
finding the highest value that can be satisfied.

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:58:48 +00:00
Marcin Wojtas
56d41ad5fe Add sysctl node for ENA IO queues number adjustment
By default, in ena_attach() the driver attempts to acquire
ena_adapter::max_num_io_queues MSI-X vectors for the purpose of IO
queues, however this is not guaranteed. The number of vectors acquired
depends also on system resources availability.

Regardless of that, enable the number of effectively used IO queues to
be further limited through the sysctl node.

Example: Assumming that there are 8 IO queues configured by default, the
command

$ sysctl dev.ena.0.io_queues_nb=4

will reduce the number of available IO queues to 4. Similarly, the value
can be also increased up to maximum supported value. A value higher than
maximum supported number of IO queues is ignored. Zero is ignored too.

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:57:02 +00:00
Marcin Wojtas
e2735b095b Fix assumptions about number of IO queues in the ENA
Make the ena_adapter::num_io_queues a number of effectively used IO
queues. While the ena_adapter::max_num_io_queues is an upper-bound
specified by the HW, the ena_adapter::num_io_queues may be lower than
that, depending on runtime system resources availability.

On reset, there are called ena_destroy_device() and then
ena_restore_device(). The latter calls, in turn, ena_enable_msix(),
which will attempt to re-acquire ena_adapter::max_num_io_queues of
MSIX vectors again.

Thus, the value of ena_adapter::num_io_queues may be different before
and after reset. For this reason, free the IO rings structures (drbr,
counters) in ena_destroy_device() and allocate again in
ena_restore_device().

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:54:32 +00:00
Marcin Wojtas
2182354622 Rework ENA Tx buffer ring size reconfiguration
This method has been aligned with the way how the Rx queue size is being
updated - so it's now done synchronously instead of resetting the
device.

Moreover, the input parameter is now being validated if it's a power of
2. Without this, it can cause kernel panic.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:50:30 +00:00
Marcin Wojtas
7d8c4fee95 Rework ENA Rx queue size configuration
This patch reworks how the Rx queue size is being reconfigured and how
the information from the device is being processed.

Reconfiguration of the queues and reset of the device in order to make
the changes alive isn't the best approach. It can be done synchronously
and it will let to pass information if the reconfiguration was
successful to the user. It now is done in the ena_update_queue_size()
function.

To avoid reallocation of the ring buffer, statistic counters and the
reinitialization of the mutexes when only new size has to be assigned,
the io queues initialization function has been split into 2 stages:
basic, which is just copying appropriate fields and the advanced, which
allocates and inits more advanced structures for the IO rings.

Moreover, now the max allowed Rx and Tx ring size is being kept
statically in the adapter and the size of the variables holding those
values has been changed to uint32_t everywhere.

Information about IO queues size is now being logged in the up routine
instead of the attach.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:48:06 +00:00
Marcin Wojtas
92dc69a727 Mark the ENA driver as epoch ready
Recent changes to the epoch requires driver to notify that they knows
epoch in order to prevent input packet function to enter epoch each
time the packet is received.

ENA is using NET_TASK for handling Rx, so it's entering epoch
automatically whenever this task is being executed.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:45:54 +00:00
Marcin Wojtas
579d23aa96 Improve indentation in ena_up() and ena_down()
If the conditional check for ENA_FLAG_DEV_UP is negated, the body of the
function can have smaller indentation and it makes the code cleaner.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:44:08 +00:00
Marcin Wojtas
02a2a7cea4 Expose argument names for non static ENA driver functions
As functions which are declared in the header files are intended to be
the interface and are going to be used by other files, it's better to
include argument names in the definition, so the caller won't have to
check the .c file in order to check their meaning and order.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:41:53 +00:00
Marcin Wojtas
6959869eae Use single global lock in the ENA driver
Currently, the driver had 2 global locks - one was sx lock used for
up/down synchronization and the second one was mutex, which was used
for link configuration and timer service callout.

It is better to have single lock for that. We cannot use mutex, as it
can sleep and cause witness errors in up/down configuration, so sx lock
seems to be the only choice.

Callout cannot use sx lock, but the timer service is MP safe, so we just
need to avoid race between ena_down() and ena_detach(). It can be
avoided by acquiring sx lock.

Simple macros were added that are encapsulating implementation of the
lock and makes the code cleaner.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:39:41 +00:00
Marcin Wojtas
7926bc4492 Add trigger reset function in the ENA driver
As the reset triggering is no longer a simple macro that was just
setting appropriate flag, the new function for triggering reset was
added. It improves code readability a lot, as we are avoiding additional
indentation.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:37:55 +00:00
Marcin Wojtas
8551662135 Provide ENA driver version in a sysctl node
Usage example: $ sysctl hw.ena.driver_version

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:35:22 +00:00
Marcin Wojtas
aa9c3226b9 Remove unused argument from static function in ena.c
The function ena_enable_msix_and_set_admin_interrupts takes two
arguments while the second is not used and so can be spared. This is a
static function, only ena.c is affected.

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:33:43 +00:00
Marcin Wojtas
6c84cec373 Enable Tx drops reporting in the ENA driver
Tx drops statistics are fetched from HW every ena_keepalive_wd() call
and are observable using one of the commands:
* sysctl dev.ena.0.hw_stats.tx_drops
* netstat -I ena0 -d

Submitted by:  Maciej Bielski <mba@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:31:28 +00:00
Marcin Wojtas
8483b844e7 Adjust ENA driver to the new HAL
* Removed adaptive interrupt moderation (not suported on FreeBSD).
* Use ena_com_free_q_entries instead of ena_com_free_desc.
* Don't use ENA_MEM_FREE outside of the ena_com.
* Don't use barriers before calling doorbells as it's already done in
  the HAL.
* Add function that generates random RSS key, common for all driver's
  interfaces.
* Change admin stats sysctls to U64.

Submitted by:  Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by:  Amazon, Inc.
2020-05-26 15:29:19 +00:00
Alexander Motin
3873b14991 Fix fallout of r319722 in CTL HA.
ha_lso is a listening socket (unless bind() has failed), so should use
solisten_upcall_set(NULL, NULL), not soupcall_clear().

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-05-26 15:08:35 +00:00
Marcin Wojtas
b01edfb515 Fix AES-CTR compatibility issue in ipsec
r361390 decreased blocksize of AES-CTR from 16 to 1.
Because of that ESP payload is no longer aligned to 16 bytes
before being encrypted and sent.
This is a good change since RFC3686 specifies that the last block
doesn't need to be aligned.
Since FreeBSD before r361390 couldn't decrypt partial blocks encrypted
with AES-CTR we need to enforce 16 byte alignment in order to preserve
compatibility.
Add a sysctl(on by default) to control it.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: jhb
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D24999
2020-05-26 14:16:26 +00:00
Marcin Wojtas
da6526096f Restore XHCI operation on Armada 38x
r347343 split generic xhci driver into three files.
Include generic_xhci_fdt.c when building kernel for Armada SoCs.
This brings back XHCI support on these platforms and also
others, which use GENERIC config.

Submitted by: Kornel Duleba
Obtained from: Semihalf
MFC after: 1 week
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D24944
2020-05-26 14:10:53 +00:00
Alexander Motin
fd10265cd2 Do not remove upcall if we haven't yet.
This fixes assertion if we failed to bind listening HA socket.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-05-26 13:57:14 +00:00
Roger Pau Monné
f44b653756 xen-locore: fix size in GDT descriptor
There was an off-by-one in the GDT descriptor size field used by the
early Xen boot code. The GDT descriptor size should be the size of the
GDT minus one. No functional change expected as a result of this
change.

Sponsored by:	Citrix Systems R&D
2020-05-26 10:24:06 +00:00
Hans Petter Selasky
4f3c0f3d1e Fix build issue after r360292 when using both RSS and KERN_TLS options.
Sponsored by:	Mellanox Technologies
2020-05-26 08:25:24 +00:00
Hans Petter Selasky
bf43f9812c Sync with Linux packet pacing enhancements in mlx5en(4).
Linux commit:
05d3ac978ed25b753bfe34fe76c50c31ee506a82

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-05-26 07:41:46 +00:00
Justin Hibbits
0aca9ecd85 powerpc/booke pmap: Fix iteration for 64-bit kernel page table creation
Kernel page tables actually start at index 4096, given kernel base address
of 0xc008000000000000, not index 0, which would yield 0xc000000000000000.
Fix this by indexing at the real base, instead of the assumed base.
2020-05-26 03:58:19 +00:00
Brandon Bergren
8ef0c667f4 [PowerPC] Ensure ppc32 cpu_switch routines set up Secure-PLT.
This is a correctness fix needed to enable the ifunc conversion of the pmap
in D24993.

Since we are making function calls that may need to go through the PLT, ensure
r30 is set up correctly.

This fixes crashes when booting with D24993 applied.

Reviewed by:	jhibbits (in IRC)
Sponsored by:	Tag1 Consulting, Inc.
2020-05-26 02:27:10 +00:00
John Baldwin
b04748bef2 Update cryptocteon(4) and nlmsec(4) for changes in r361481.
This does not add support for separate output buffers but updates the
drivers to cope with the changes.

Pointy hat to:	jhb
2020-05-25 23:49:46 +00:00
Chuck Silvers
d79ff54b5c This commit enables a UFS filesystem to do a forcible unmount when
the underlying media fails or becomes inaccessible. For example
when a USB flash memory card hosting a UFS filesystem is unplugged.

The strategy for handling disk I/O errors when soft updates are
enabled is to stop writing to the disk of the affected file system
but continue to accept I/O requests and report that all future
writes by the file system to that disk actually succeed. Then
initiate an asynchronous forced unmount of the affected file system.

There are two cases for disk I/O errors:

   - ENXIO, which means that this disk is gone and the lower layers
     of the storage stack already guarantee that no future I/O to
     this disk will succeed.

   - EIO (or most other errors), which means that this particular
     I/O request has failed but subsequent I/O requests to this
     disk might still succeed.

For ENXIO, we can just clear the error and continue, because we
know that the file system cannot affect the on-disk state after we
see this error. For EIO or other errors, we arrange for the geom_vfs
layer to reject all future I/O requests with ENXIO just like is
done when the geom_vfs is orphaned. In both cases, the file system
code can just clear the error and proceed with the forcible unmount.

This new treatment of I/O errors is needed for writes of any buffer
that is involved in a dependency. Most dependencies are described
by a structure attached to the buffer's b_dep field. But some are
created and processed as a result of the completion of the dependencies
attached to the buffer.

Clearing of some dependencies require a read. For example if there
is a dependency that requires an inode to be written, the disk block
containing that inode must be read, the updated inode copied into
place in that buffer, and the buffer then written back to disk.

Often the needed buffer is already in memory and can be used. But
if it needs to be read from the disk, the read will fail, so we
fabricate a buffer full of zeroes and pretend that the read succeeded.
This zero'ed buffer can be updated and written back to disk.

The only case where a buffer full of zeros causes the code to do
the wrong thing is when reading an inode buffer containing an inode
that still has an inode dependency in memory that will reinitialize
the effective link count (i_effnlink) based on the actual link count
(i_nlink) that we read. To handle this case we now store the i_nlink
value that we wrote in the inode dependency so that it can be
restored into the zero'ed buffer thus keeping the tracking of the
inode link count consistent.

Because applications depend on knowing when an attempt to write
their data to stable storage has failed, the fsync(2) and msync(2)
system calls need to return errors if data fails to be written to
stable storage. So these operations return ENXIO for every call
made on files in a file system where we have otherwise been ignoring
I/O errors.

Coauthered by: mckusick
Reviewed by:   kib
Tested by:     Peter Holm
Approved by:   mckusick (mentor)
Sponsored by:  Netflix
Differential Revision:  https://reviews.freebsd.org/D24088
2020-05-25 23:47:31 +00:00
John Baldwin
b02676a2cb Update sec(4) for separate output buffers changes in r361481.
This does not add support for separate output buffers but updates the
driver to cope with the changes.

Pointy hat to:	jhb
2020-05-25 23:20:33 +00:00
John Baldwin
72d874fa90 Update cesa(4) for separate output buffers changes in r361481.
This does not add support for separate output buffers but updates the
driver to cope with the changes.

Pointy hat to:	jhb
2020-05-25 23:12:49 +00:00
Adrian Chadd
8c01c3dc46 [ath] [ath_hal] Propagate the HAL_RESET_TYPE through to the chip reset; set it during ath_reset()
Although I added the reset type field to ath_hal_reset() years ago,
I never finished adding it both throughout the HALs and in if_ath.c.

This will eventually deprecate the ath_hal force_full_reset option
because it can be requested at the driver layer.

So:

* Teach ar5416ChipReset() and ar9300_chip_reset() about the HAL type
* Use it in ar5416Reset() and ar9300_reset() when doing a full chip reset
* Extend ath_reset() to include the HAL_RESET_TYPE parameter added in the above functions
* Use HAL_RESET_NORMAL in most calls to ath_reset()
* .. but use HAL_RESET_BBPANIC for the BB panics, and HAL_RESET_FORCE_COLD during fatal, beacon miss and other hardware related hangs.

This should be a glorified no-op outside of actual hardware issues.
I've tested things with ath_hal force_full_reset set to 1 for years now,
so I know that feature and a full reset works (albeit much slower than
a warm reset!) and it does unwedge hardware.

The eventual aim is to use this for all the places where the driver
detects a potential hang as well as if long calibration - ie, noise floor
calibration - fails to complete. That's one of the big hardware related
things that causes station mode operation to hang without easy recovery.

Differential Revision:	https://reviews.freebsd.org/D24981
2020-05-25 22:31:45 +00:00
John Baldwin
a639f9379b Support separate output buffers for aesni(4).
The backend routines aesni(4) call for specific encryption modes all
expect virtually contiguous input/output buffers.  If the existing
output buffer is virtually contiguous, always write to the output
buffer directly from the mode-specific routines.  If the output buffer
is not contiguous, then a temporary buffer is allocated whose output
is then copied to the output buffer.  If the input buffer is not
contiguous, then the existing buffer used to hold the input is also
used to hold temporary output.

Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D24545
2020-05-25 22:30:44 +00:00
John Baldwin
2adc3c9417 Support separate output buffers in ccr(4).
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D24545
2020-05-25 22:23:13 +00:00
John Baldwin
ba63e5e701 Add a sysctl knob to use separate output buffers for /dev/crypto.
This is a testing aid to permit using testing a driver's support of
separate output buffers via cryptocheck.

Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D24545
2020-05-25 22:21:09 +00:00
John Baldwin
33f3bad35e Export the _kern_crypto sysctl node from crypto.c.
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D24545
2020-05-25 22:18:33 +00:00
John Baldwin
9c0e3d3a53 Add support for optional separate output buffers to in-kernel crypto.
Some crypto consumers such as GELI and KTLS for file-backed sendfile
need to store their output in a separate buffer from the input.
Currently these consumers copy the contents of the input buffer into
the output buffer and queue an in-place crypto operation on the output
buffer.  Using a separate output buffer avoids this copy.

- Create a new 'struct crypto_buffer' describing a crypto buffer
  containing a type and type-specific fields.  crp_ilen is gone,
  instead buffers that use a flat kernel buffer have a cb_buf_len
  field for their length.  The length of other buffer types is
  inferred from the backing store (e.g. uio_resid for a uio).
  Requests now have two such structures: crp_buf for the input buffer,
  and crp_obuf for the output buffer.

- Consumers now use helper functions (crypto_use_*,
  e.g. crypto_use_mbuf()) to configure the input buffer.  If an output
  buffer is not configured, the request still modifies the input
  buffer in-place.  A consumer uses a second set of helper functions
  (crypto_use_output_*) to configure an output buffer.

- Consumers must request support for separate output buffers when
  creating a crypto session via the CSP_F_SEPARATE_OUTPUT flag and are
  only permitted to queue a request with a separate output buffer on
  sessions with this flag set.  Existing drivers already reject
  sessions with unknown flags, so this permits drivers to be modified
  to support this extension without requiring all drivers to change.

- Several data-related functions now have matching versions that
  operate on an explicit buffer (e.g. crypto_apply_buf,
  crypto_contiguous_subsegment_buf, bus_dma_load_crp_buf).

- Most of the existing data-related functions operate on the input
  buffer.  However crypto_copyback always writes to the output buffer
  if a request uses a separate output buffer.

- For the regions in input/output buffers, the following conventions
  are followed:
  - AAD and IV are always present in input only and their
    fields are offsets into the input buffer.
  - payload is always present in both buffers.  If a request uses a
    separate output buffer, it must set a new crp_payload_start_output
    field to the offset of the payload in the output buffer.
  - digest is in the input buffer for verify operations, and in the
    output buffer for compute operations.  crp_digest_start is relative
    to the appropriate buffer.

- Add a crypto buffer cursor abstraction.  This is a more general form
  of some bits in the cryptosoft driver that tried to always use uio's.
  However, compared to the original code, this avoids rewalking the uio
  iovec array for requests with multiple vectors.  It also avoids
  allocate an iovec array for mbufs and populating it by instead walking
  the mbuf chain directly.

- Update the cryptosoft(4) driver to support separate output buffers
  making use of the cursor abstraction.

Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D24545
2020-05-25 22:12:04 +00:00
Conrad Meyer
852c303b61 copystr(9): Move to deprecate (attempt #2)
This reapplies logical r360944 and r360946 (reverting r360955), with fixed
copystr() stand-in replacement macro.  Eventually the goal is to convert
consumers and kill the macro, but for a first step it helps if the macro is
correct.

Prior commit message:

Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults.  It's just
an older incarnation of the now-more-common strlcpy().

Add a coccinelle script to tools/ which can be used to mechanically
convert existing instances where replacement with strlcpy is trivial.
In the two cases which matched, fuse_vfsops.c and union_vfsops.c, the
code was further refactored manually to simplify.

Replace the declaration of copystr() in systm.h with a small macro
wrapper around strlcpy (with correction from brooks@ -- thanks).

Remove N redundant MI implementations of copystr.  For MIPS, this
entailed inlining the assembler copystr into the only consumer,
copyinstr, and making the latter a leaf function.

Reviewed by:		jhb (earlier version)
Discussed with:		brooks (thanks!)
Differential Revision:	https://reviews.freebsd.org/D24672
2020-05-25 16:40:48 +00:00
Marcin Wojtas
9085d7d6b8 Introduce a driver for NXP LS1046A SoC AHCI.
Implement support for AHCI controller found in
NXP QorIQ Layerscape SoCs.

Submitted by: Artur Rojek <ar@semihalf.com>
Reviewed by: manu
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D24466
2020-05-25 16:00:08 +00:00
Marcin Wojtas
d97d838569 Introduce support for Epson RX-8803 RTC.
This patch introduces support for Epson RX-8803 RTC controller accessible
over I2C bus. It has a resolution of 1 sec.
Support for interrupt based alarm was not implemented.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: manu
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D24364
2020-05-25 15:40:02 +00:00
Marcin Wojtas
7187ccccdc Add TCA6416 GPIO expander support.
Add basic TCA6416 GPIO expander support over I2C bus. The driver handles
enabling and disabling pins, setting pin mode to IN and OUT and
toggling the pins. External interrupts are not supported.

Submitted by: Dawid Gorecki <dgr@semihalf.com>
Reviewed by: manu, mmel
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D24363
2020-05-25 15:31:43 +00:00
Marcin Wojtas
1e6005d807 Introduce VF610 I2C controller support.
NXP LS1046A contains I2C controller compatible with Vybrid VF610.
Existing Vybrid MVF600 driver can be used to support it. For that purpose
declare driver as ofw_iicbus and add methods associated with ofw_iicbus.

For VF610 add dynamic clock prescaler calculation using clock information
from clock driver and clock frequency requested in device tree.

On the occasion add detach function and add additional error handling
in i2c_attach function.

Submitted by: Dawid Gorecki <dgr@semihalf.com>
Reviewed by: manu
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D24361
2020-05-25 15:21:38 +00:00
Marcin Wojtas
a5dfa67db1 Add GPIO support for QorIQ boards.
This patch adds a GPIO controller support targeted for NXP LS1046A
SoC. The driver implements the following features:
 * setting direction of each pin (IN or OUT)
 * setting the mode of output pins (PUSHPULL or OPENDRAIN)
 * setting the state of each output pin (1 or 0)
 * reading the state of each input pin (1 or 0)

Submitted by: Kamil Koczurek <kek@semihalf.com>
              Dawid Gorecki <dgr@semihalf.com>
Reviewed by: manu
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D24353
2020-05-25 14:55:37 +00:00
Marcin Wojtas
eacff8a248 Add LS1046A clockgen driver.
Driver provides probe and attach functions for LS1046A clockgen and passes
configuration information to QorIQ clockgen class. It may be used as
a reference implementation for different QorIQ clockgen devices.

Submitted by: Dawid Gorecki <dgr@semihalf.com>
Reviewed by: mmel, manu
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D24352
2020-05-25 14:45:18 +00:00
Marcin Wojtas
b8cb0864dc Add QorIQ platform clockgen driver.
This patch adds classes and functions that can be used with various NXP
QorIQ Layerscape SoCs.

As for the clock topology - there is single platform PLL, which supplies
clocks for the peripheral bus and additional PLLs for CPU cores. There
can be multiple core PLLs (For example - LS1046A has two PLLs - CGAPLL1
and CGAPLL2). Each PLL has fixed dividers on output. The core PLLs
are not accessible from dts.

This is a preparation patch for NXP LS1046A SoC support.

Submitted by: Dawid Gorecki <dgr@semihalf.com>
Reviewed by: mmel
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential Revision: https://reviews.freebsd.org/D24351
2020-05-25 14:31:32 +00:00
Emmanuel Vadot
42f0f394f7 linuxkpi: Fix mod_timer and del_timer_sync
mod_timer is supposed to return 1 if the modified timer was pending, which
is exactly what callout_reset does so return the value after checking
that it's a correct one in case the api change.
del_timer_sync returns int so add a function and handle that.

Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24983
2020-05-25 12:46:05 +00:00
Emmanuel Vadot
4efd5dd70d linuxkpi: Add refcount.h
Implement some refcount functions needed by drm.
Just use the atomic_t struct and functions from linuxkpi for simplicity.

Sponsored-by: The FreeBSD Foundation

Reviewed by:	hselsasky
Differential Revision:	https://reviews.freebsd.org/D24985
2020-05-25 12:44:07 +00:00
Emmanuel Vadot
93d70cd3fe linuxkpi: Add __same_type and __must_be_array macros
The same_type macro simply wraps around builtin_types_compatible_p which
exist for both GCC and CLANG, which returns 1 if both types are the same.
The __must_be_array macros returns 1 if the argument is an array.

This is needed for DRM v5.3

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24953
2020-05-25 12:42:55 +00:00
Mateusz Guzik
5a90435ccf proc: refactor clearing credentials into proc_unset_cred 2020-05-25 12:41:44 +00:00
Hans Petter Selasky
ce69b84204 Improve set progress parameters, SET PSV for HW TLS in mlx5en(4).
There is no need for a fence and there is no need to provide
the TCP sequence number.

Sponsored by:	Mellanox Technologies
2020-05-25 12:37:45 +00:00
Hans Petter Selasky
233a6665b6 Correctly set the initial vector for TLS v1.3 for mlx5en(4).
For TLS v1.3 the 12 bytes of the initial vector, IV, should just be copied
as-is from the kernel to the gcm_iv field, which hold the first 4 bytes,
and the remaining 8 bytes go to the subsequent implicit_iv field.
There is no need to consider the byte order on the 12 bytes of IV like
initially done.

Sponsored by:	Mellanox Technologies
2020-05-25 12:34:15 +00:00
Hans Petter Selasky
9550e3403e Update the TLS capability bit after recent PRM changes in mlx5en(4).
A CX6-DX firmware version equal to or newer than 12.27.0372 is
now required.

Sponsored by:	Mellanox Technologies
2020-05-25 12:31:48 +00:00
Mateusz Guzik
e3d16bb6a8 vfs: use atomic_{store,load}_long to manage f_offset
... instead of depending on the compiler not to mess them up
2020-05-25 04:57:57 +00:00
Mateusz Guzik
442e617fd2 vfs: restore mtx-protected foffset locking for 32 bit platforms
They depend on it to accurately read the offset.

The new code is not used as it would add an interrupt enable/disable
trip on top of the atomic.

This also fixes a bug where 32-bit nolock request would still lock the offset.

No changes for 64-bit.

Reported by:	emaste
2020-05-25 04:56:41 +00:00
Mateusz Guzik
3fc40153b2 vfs: scale foffset_lock by using atomics instead of serializing on mtx pool
Contending cases still serialize on sleepq (which would be taken anyway).

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D21626
2020-05-24 03:50:49 +00:00
Conrad Meyer
b6be5f6405 Unbreak ARM64 kernel build after r361426
X-MFC-With:	r361426
2020-05-23 23:10:03 +00:00
Conrad Meyer
37f1f2684f Update to Zstandard 1.4.5
As usual, the full release notes are found on Github:

  https://github.com/facebook/zstd/releases/tag/v1.4.5

Notable changes include:

* Improved decompress performance on amd64 and arm (5-10%
  and 15-50%, respectively).
* '--patch-from' zstd(1) CLI option, which provides something like a very fast
  version of bspatch(1) with slightly worse compression.  See release notes.

In this update, I dropped the 3-year old -O0 workaround for an LLVM ARM bug;
the bug was fixed in LLVM SVN in 2017, but we didn't remove this workaround
from our tree until now.

MFC after:	I won't, but feel free
Relnotes:	yes
2020-05-23 21:23:46 +00:00
Conrad Meyer
c0f419ec56 contrib/zstd: Revise Xlist for 1.4.5 import 2020-05-23 20:39:36 +00:00
Emmanuel Vadot
77c68315f6 bbr: Use arc4random_uniform from libkern.
This unbreak LINT build

Reported by:	jenkins, melifaro
2020-05-23 19:52:20 +00:00
Alexander V. Chernikov
4d2c2509f2 Move <add|del|change>_route() functions to route_ctl.c in preparation of
multipath control plane changed described in D24141.

Currently route.c contains core routing init/teardown functions, route table
 manipulation functions and various helper functions, resulting in >2KLOC
 file in total. This change moves most of the route table manipulation parts
 to a dedicated file, simplifying planned multipath changes and making
 route.c more manageable.

Differential Revision:	https://reviews.freebsd.org/D24870
2020-05-23 19:06:57 +00:00
Emmanuel Vadot
af300929f9 linuxkpi: Add prandom_u32_max
This is just a wrapper around arc4random_uniform
Needed by DRM v5.3

Sponsored-by: The FreeBSD Foundation
Reviewed by:	cem, hselasky
Differential Revision:	https://reviews.freebsd.org/D24961
2020-05-23 17:52:25 +00:00
Emmanuel Vadot
353d02e927 libkern: Add arc4random_uniform
This variant get a random number up to the limit passed as the argument.
This is simply a copy of the libc version.

Sponsored-by: The FreeBSD Foundation
Reviewed by:	cem, hselasky (previous version)
Differential Revision:	https://reviews.freebsd.org/D24962
2020-05-23 17:51:06 +00:00
Alexander V. Chernikov
a82f62ec2d Remove refcounting from rtentry.
After making rtentry reclamation backed by epoch(9) in r361409, there is
 no reason in keeping reference counting code.

Differential Revision:	https://reviews.freebsd.org/D24867
2020-05-23 12:15:47 +00:00
Dimitry Andric
d65cd7a57b Merge llvm, clang, compiler-rt, libc++, libunwind, lld, lldb and openmp
llvmorg-10.0.1-rc1-0-gf79cd71e145 (aka 10.0.1 rc1).

MFC after:	3 weeks
2020-05-23 10:32:18 +00:00
Alexander V. Chernikov
2bbab0af6d Use epoch(9) for rtentries to simplify control plane operations.
Currently the only reason of refcounting rtentries is the need to report
 the rtable operation details immediately after the execution.
Delaying rtentry reclamation allows to stop refcounting and simplify the code.
Additionally, this change allows to reimplement rib_lookup_info(), which
 is used by some of the customers to get the matching prefix along
 with nexthops, in more efficient way.

The change keeps per-vnet rtzone uma zone. It adds nh_vnet field to
 nhop_priv to be able to reliably set curvnet even during vnet teardown.
Rest of the reference counting code will be removed in the D24867 .

Differential Revision:	https://reviews.freebsd.org/D24866
2020-05-23 10:21:02 +00:00
John Baldwin
016fc6ddb3 Remove a workaround for GCM requests with an empty payload.
This was copied from ccr(4) (which does require the workaround), but
is reportedly not needed for ccp(4).

Discussed with:	cem
Sponsored by:	Netflix
2020-05-22 20:52:36 +00:00
Mitchell Horne
e1a6e0e33e Simplify the RISC-V kernel linker invocation
Remove our custom SYSTEM_LD definition. This generates program headers
that are more consistent with other architectures, and more importantly,
are in line with what loader(8) expects when loading a kernel.

As noted in https://reviews.freebsd.org/D22920, there is no apparent
reason why the kernel would need a writable text segment, so removal of
the -N flag isn't likely to cause issue.

Reviewed by:	kp, br
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D24909
2020-05-22 18:54:56 +00:00
Alan Somers
bfcb817bcd Fix issues with FUSE_ACCESS when default_permissions is disabled
This patch fixes two issues relating to FUSE_ACCESS when the
default_permissions mount option is disabled:

* VOP_ACCESS() calls with VADMIN set should never be sent to a fuse server
  in the form of FUSE_ACCESS operations. The FUSE protocol has no equivalent
  of VADMIN, so we must evaluate such things kernel-side, regardless of the
  default_permissions setting.

* The FUSE protocol only requires FUSE_ACCESS to be sent for two purposes:
  for the access(2) syscall and to check directory permissions for
  searchability during lookup. FreeBSD sends it much more frequently, due to
  differences between our VFS and Linux's, for which FUSE was designed. But
  this patch does eliminate several cases not required by the FUSE protocol:

  * for any FUSE_*XATTR operation
  * when creating a new file
  * when deleting a file
  * when setting timestamps, such as by utimensat(2).

* Additionally, when default_permissions is disabled, this patch removes one
  FUSE_GETATTR operation when deleting a file.

PR:		245689
Reported by:	MooseFS FreeBSD Team <freebsd@moosefs.pro>
Reviewed by:	cem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D24777
2020-05-22 18:11:17 +00:00
Alexander Motin
1f29b46c42 Do not try to fill socket send buffer to the last byte.
Setting so_snd.sb_lowat to at least 1/8 of the socket buffer size allows
send thread more actively use PDUs coalescing, that dramatically reduces
TCP lock congestion and number of context switches, when the socket is
full and PDUs are small.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-05-22 18:10:46 +00:00