Commit Graph

138664 Commits

Author SHA1 Message Date
glebius
4d54a39862 There is a long standing problem with multicast programming for NICs
and IPv6.  With IPv6 we may call if_addmulti() in context of processing
of an incoming packet.  Usually this is interrupt context.  While most
of the NIC drivers are able to reprogram multicast filters without
sleeping, some of them can't.  An example is e1000 family of drivers.
With iflib conversion the problem was somewhat hidden.  Iflib processes
packets in private taskqueue, so going to sleep doesn't trigger an
assertion.  However, the sleep would block operation of the driver and
following incoming packets would fill the ring and eventually would
start being dropped.  Enabling epoch for the full time of a packet
processing again started to trigger assertions for e1000.

Fix this problem once and for all using a general taskqueue to call
if_ioctl() method in all cases when if_addmulti() is called in a
non sleeping context.  Note that nobody cares about returned value.

Reviewed by:	hselasky, kib
Differential Revision:	  https://reviews.freebsd.org/D22154
2019-10-29 17:36:06 +00:00
glebius
8bb52bf920 Merge td_epochnest with td_no_sleeping.
Epoch itself doesn't rely on the counter and it is provided
merely for sleeping subsystems to check it.

- In functions that sleep use THREAD_CAN_SLEEP() to assert
  correctness.  With EPOCH_TRACE compiled print epoch info.
- _sleep() was a wrong place to put the assertion for epoch,
  right place is sleepq_add(), as there ways to call the
  latter bypassing _sleep().
- Do not increase td_no_sleeping in non-preemptible epochs.
  The critical section would trigger all possible safeguards,
  no sleeping counter is extraneous.

Reviewed by:	kib
2019-10-29 17:28:25 +00:00
glebius
426fb7a9ac Augment macros that manipulate td_no_sleeping with assertions to check
underleak and overflow of the counter.

Reviewed by:	kib
2019-10-29 17:19:36 +00:00
scottl
4f0893e589 Add device IDs for the next generation of Intel HDA audio.
MFC after:	3 days
2019-10-28 23:31:22 +00:00
vmaffione
999cebf707 netmap: enter NET_EPOCH on generic txsync
After r353292, netmap generic adapter on if_vlan interfaces panics on
asserting the NET_EPOCH. In more detail, this happens when
nm_os_generic_xmit_frame() is called, that is in the generic txsync
routine.
Fix the issue by entering the NET_EPOCH during the generic txsync.
We amortize the cost of entering/exiting over a whole batch of
transmissions.

PR:		241489
Reported by:	Aleksandr Fedorov <aleksandr.fedorov@itglobal.com>
2019-10-28 19:00:27 +00:00
kib
5a02b0c0de Fix reset of the kernel stack pointer in TSS for !PTI case on pmap activation
after r354095.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2019-10-28 10:50:37 +00:00
tsoome
33d5dc951f loader: zio_checksum_verify should check byteswap
We do have both native and byteswap checksum callbacks in place but the
selection is not wired.

MFC after:	1 week
2019-10-27 08:35:29 +00:00
kib
fd9fb6f4f4 Provide dummy definition of the amd64 struct pcb for -m32 compilation.
I do not see a need in the proper x86/include/pcb.h header.

Reported and tested by:	antoine
MFC after:	1 week
2019-10-26 18:22:52 +00:00
manu
fd529f0dcf arm64: rockchip: dts: Build the Khadas board DTS
We boot on thoses boards so build them.

Submitted by:	s199p.wa1k9r@gmail.com
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22158
2019-10-26 17:51:43 +00:00
asomers
22b0eb13b2 MFZoL: Avoid retrieving unused snapshot props
This patch modifies the zfs_ioc_snapshot_list_next() ioctl to enable it
to take input parameters that alter the way looping through the list of
snapshots is performed. The idea here is to restrict functions that
throw away some of the snapshots returned by the ioctl to a range of
snapshots that these functions actually use. This improves efficiency
and execution speed for some rollback and send operations.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Signed-off-by: Alek Pinchuk <apinchuk@datto.com>
Closes #8077
zfsonlinux/zfs@4c0883fb4a

MFC after:	2 weeks
2019-10-26 17:11:02 +00:00
np
bde6f75009 cxgbe(4): Use correct FetchBurstMin values for T6.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-10-25 21:53:05 +00:00
mhorne
f3a300c932 RISC-V: skip cpu-map when parsing elf_hwcap
The fill_elf_hwcap() function expects to find only cpu nodes under the
/cpus entry of the device tree. Newer versions of QEMU insert a cpu-map
node which describes the CPU topology, breaking this function. To fix
this, simply skip any non-cpu entries.

Reviewed by:	markj, kp, jhb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22151
2019-10-25 21:39:29 +00:00
gonzo
7c8bef6bea arm64: rk3399: add SPI driver and include it in GENERIC config
SPI driver for Rockchip's RK3399 SoC. Implements PIO mode, CS selection,
SPI mode and frequency configuration.

Reviewed by:	manu
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D22148
2019-10-25 21:38:38 +00:00
rpokala
12d01d18eb Args for buf_track() might be unused
If neither FULL_BUF_TRACKING nor BUF_TRACKING are defined, then the body of
buf_track() becomes empty. Mark the arguments with "__unused" so the
compiler doesn't complain about unused arguments in that case.

Reported by:	Bruce Leverett (Panasas)
Reviewed by:	cem (on IRC)
MFC after:	1 month
Sponsored by:	Panasas
2019-10-25 21:32:28 +00:00
gonzo
a642819db8 arm64: rk3399: Add clock and gate for SPI clocks
MFC after:	1 month
2019-10-25 21:21:21 +00:00
markj
784b469212 Apply kernel module linker scripts to firmware files.
Use a separate make variable to specify the linker script so that it is
only applied at link time and not during intermediate generation of .fwo
files.

This ensures that the .text padding inserted by the amd64 linker script
is applied to the stub module load handlers embedded in firmware
modules.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D22125
2019-10-25 20:15:04 +00:00
kib
b01d1a3a2f amd64: move pcb out of kstack to struct thread.
This saves 320 bytes of the precious stack space.

The only negative aspect of the change I can think of is that the
struct thread increased by 320 bytes obviously, and that 320 bytes are
not swapped out anymore. I believe the freed stack space is much more
important than that.  Also, current struct thread size is 1392 bytes
on amd64, so UMA will allocate two thread structures per (4KB) slab,
which leaves a space for pcb without increasing zone memory use.

Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D22138
2019-10-25 20:09:42 +00:00
peterj
2fc0413f87 Fix use of uninitialised variable.
The RK805 regs array was being allocated before it's required size was
known, causing the driver to use memory it didn't own.  That memory
was subsequently allocated and used elsewhere causing later fatal data
aborts in rk805_map().

Whilst I'm here, add a sanity check to catch unsupported PMICs (this
shouldn't ever get hit because the probe should have failed).

Reviewed by:	manu
MFC after:	1 week
Sponsored by:	Google
2019-10-25 19:38:02 +00:00
bz
d7908b9933 Properly set VNET when nuking recvif from fragment queues.
In theory the eventhandler invoke should be in the same VNET as
the the current interface. We however cannot guarantee that for
all cases in the future.

So before checking if the fragmentation handling for this VNET
is active, switch the VNET to the VNET of the interface to always
get the one we want.

Reviewed by:	hselasky
MFC after:	3 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D22153
2019-10-25 18:54:06 +00:00
manu
b2e908a725 arm64: rockchip: Add RK3399 TypeC phy driver
This is a driver for the USB3 PHY present in the RK3399.
While the phy support DP (Display Port) the driver doesn't has we have
no driver to test this with for now.
All the lane and pll configuration is just magic values from rockchip.
While the manual have some info on those registers it's really hard to
understand how to calculate those values (if there is a way).

MFC after:	1 month
2019-10-25 18:10:02 +00:00
manu
77f7cf2489 arm64: rockchip: Add rk_dwc3 driver
This is a simplebus like driver that attaches the dwc3 child node and
enable the clocks needed for the module.

MFC after:	1 month
2019-10-25 18:08:59 +00:00
manu
2ce78127ee arm64: rk3399: Add clock and gate for usb3 clocks
MFC after:	1 month
2019-10-25 18:08:25 +00:00
manu
5b55c17150 flash: Add GigaDevice gd25q128 flash
Add this flash chip which is a 128Mb spi flash.

MFC after:	1 week
2019-10-25 17:56:24 +00:00
glebius
fea9cffaa1 Add couple more assertions to vm_pager_assert_in(). The bogus page is
not allowed at ends of the request, and all non-bogus pages must be
consecutive.

Reviewed by:	kib
2019-10-25 16:59:54 +00:00
avg
a95b774891 superio: do not crash if failed to create the character device
MFC after:	1 week
2019-10-25 16:30:24 +00:00
bz
3f9b3c4810 frag6: do not leak counter in error cases
When allocating the IPv6 fragement packet queue entry we do checks
against counters and if we pass we increment one of the counters
to claim the spot.  Right after that we have two cases (malloc and MAC)
which can both fail in which case we free the entry but never released
our claim on the counter.  In theory this can lead to not accepting new
fragments after a long time, especially if it would be MAC "refusing"
them.
Rather than immediately subtracting the value in the error case, only
increment it after these two cases so we can no longer leak it.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-25 16:29:09 +00:00
avg
d54ea7b22f superio_io.h: fix the copyright
MFC after:	1 week
2019-10-25 16:28:39 +00:00
avg
89a5addf37 superio: add a simple ioctl interface
MFC after:	1 week
2019-10-25 16:07:24 +00:00
avg
1960ca08aa owc_gpiobus: add missing space in r354077 2019-10-25 15:46:54 +00:00
avg
57fc2e848c owc_gpiobus_read_data: add recovery time to the read slot
Reviewed by:	imp
MFC after:	2 weeks
2019-10-25 15:39:46 +00:00
avg
311e521a35 owc_gpiobus_read_data: compare times in sbintime_t units
Previously the code used sbttous() before microseconds comparison in one
place, sbttons() and nanoseconds in another, division by SBT_1US and
microseconds in yet another.

Now the code consistently uses multiplication by SBT_1US to convert
microseconds to sbintime_t before comparing them with periods between
calls to sbinuptime().  This is fast, this is precise enough (below
0.03%) and the periods defined by the protocol cannot overflow.

Reviewed by:	imp (D22108)
MFC after:	2 weeks
2019-10-25 15:38:09 +00:00
andrew
efe7a1e01f Remove the arm4 ID register masks, they are not needed after r353641.
Sponsored by:	DARPA, AFRL
2019-10-25 14:46:09 +00:00
andrew
7a2f3d8b41 Make special register names lowercase so they don't conflict with future
ID register macros.

MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-10-25 14:30:27 +00:00
avg
d91b70f305 owc_gpiobus_read_data: disable preemption earlier
Now this is done before starting the low pulse that has rather tight
timing.

Reviewed by:	imp (D22108)
MFC after:	2 weeks
2019-10-25 14:20:59 +00:00
avg
4fc619ef64 ow_temp: better scopes for the lock
The lock is used only for start / stop signaling.
It is used only for 'flags' field and the related condition variable.

This change is a follow-up to r354067, it was suggested by Warner in
D22107.

Suggested by:	imp
MFC after:	1 week
2019-10-25 13:47:17 +00:00
avg
c6cc52a36a ow_temp: drop the lock around a call that can sleep
This is similar to what is done around other calls that lead to
own_command_wait() that can sleep.

Reviewed by:	imp
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D22107
2019-10-25 13:42:36 +00:00
avg
5d234dc677 gpioiic: set output after switching to output mode if presetting it failed
Some controllers cannot preset future output value while the pin is in
input mode.  This adds a fallback for those controllers.  The new code
assumes that a controller reports an error in that case.

For example, all hardware supported by nctgpio behaves in that way.

This is a temporary measure.  In the future we will use
GPIO_PIN_PRESET_LOW / GPIO_PIN_PRESET_HIGH to preset the output either
in hardware, if supported, or in software (e.g., in
gpiobus_pin_setflags).

While here, I extracted common functionality of gpioiic_set{sda,scl} and
gpioiic_get{sda,scl} to gpioiic_setpin and gpioiic_getpin respectively.

MFC after:	2 weeks
2019-10-25 09:37:54 +00:00
brooks
60af079f33 nda(4): Remove unnecessary union and avoid Clang -Wsizeof-array-divwarning
Clang trunk recently gained this new warning, and complains about the
sizeof(trim->data) / sizeof(struct nvme_dsm_range) expression, since
the left hand side's element type (char) does not match the right hand
side's type. The byte buffer is unnecessary so we can remove it to clean
up the code and fix the warning at the same time.

No functional change.

Submitted by:	James Clarke <jrtc27@jrtc27.com>
Reviewed by:	imp
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D21912
2019-10-24 22:23:53 +00:00
bz
9589e8e651 frag6: prevent overwriting initial fragoff=0 packet meta-data.
When we receive the packet with the first fragmented part (fragoff=0)
we remember the length of the unfragmentable part and the next header
(and should probably also remember ECN) as meta-data on the reassembly
queue.
Someone replying this packet so far could change these 2 (3) values.
While changing the next header seems more severe, for a full size
fragmented UDP packet, for example, adding an extension header to the
unfragmentable part would go unnoticed (as the framented part would be
considered an exact duplicate) but make reassembly fail.
So do not allow updating the meta-data after we have seen the first
fragmented part anymore.

The frag6_20 test case is added which failed before triggering an
ICMPv6 "param prob" due to the check for each queued fragment for
a max-size violation if a fragoff=0 packet was received.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 22:07:45 +00:00
glebius
74a423d9ac Use THREAD_CAN_SLEEP() macro to check if thread can sleep. There is no
functional change.

Discussed with:	kib
2019-10-24 21:55:19 +00:00
mckusick
cc5bcc5d49 After the unlink() of one name of a file with multiple links, a
stat() of one of the remaining names of the file does not show an
updated ctime (inode modification time) until several seconds after
the unlink() completes. The problem only occurs when the filesystem
is running with soft updates enabled. When running with soft updates,
the ctime is not updated until the soft updates background process
has settled all the needed I/O operations.

This commit causes the ctime to be updated immediately during the
unlink(). A side effect of this change is that the ctime is updated
again when soft updates has finished its processing because that
is the time that is correct from the perspective of programs that
look at the disk (like dump). This change does not cause any extra
I/O to be done, it just ensures that stat() updates the ctime before
handing it back.

PR:           241373
Reported by:  Alan Somers
Tested by:    Alan Somers
MFC after:    3 days
Sponsored by: Netflix
2019-10-24 21:28:37 +00:00
bz
2230476674 frag6: handling of overlapping fragments to conform to RFC 8200
While the comment was updated in r350746, the code was not.
RFC8200 says that unless fragment overlaps are exact (same fragment
twice) not only the current fragment but the entire reassembly queue
for this packet must be silently discarded, which we now do if
fragment offset and fragment length do not match.

Obtained from:	jtl
MFC after:	3 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D16850
2019-10-24 20:22:52 +00:00
tuexen
56626fc5ad Ensure that the flags indicating IPv4/IPv6 are not changed by failing
bind() calls. This would lead to inconsistent state resulting in a panic.
A fix for stable/11 was committed in
https://svnweb.freebsd.org/base?view=revision&revision=338986
An accelerated MFC is planned as discussed with emaste@.

Reported by:		syzbot+2609a378d89264ff5a42@syzkaller.appspotmail.com
Obtained from:		jtl@
MFC after:		1 day
Sponsored by:		Netflix, Inc.
2019-10-24 20:05:10 +00:00
bz
a41c96bda5 frag6: export another counter read-only by sysctl
Similar to the system global counter also export the per-VNET counter
"frag6_nfragpackets" detailing the current number of fragment packets
in this VNET's reassembly queues.
The read-only counter is helpful for in-VNET statistical monitoring and
for test-cases.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 20:00:37 +00:00
bz
6b64e9286e frag6: fix counter leak in error case and optimise code
In case the first fragmented part (off=0) arrives we check for the
maximum packet size for each fragmented part we already queued with the
addition of the unfragmentable part from the first one.

For one we do not have to enter the loop at all if this is the first
fragmented part to arrive, and we can skip the check.

Should we encounter an error case we send an ICMPv6 message for any
fragment exceeding the maximum length limit.  While dequeueing the
original packet and freeing it, statistics were not updated and leaked
both the reassembly queue count for the fragment and the global
fragment count.  Found by code inspection and confirmed by tightening
test cases checking more statistical and system counters.

While here properly wrap a line.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 19:57:18 +00:00
bz
d4a9dd6f75 frag6.c: do not leak packet queue entry in error case
When we are checking for the maximum reassembled packet size of the
fragmentable part and run into the error case (packet too big),
we are leaking the packet queue enntry if this was a first fragment
to arrive.
Properly cleanup, removing the queue entry from the bucket, decrementing
counters, and freeing the memory.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 19:47:32 +00:00
mckusick
97c391a761 Soft updates needs to keep an on-disk linked list of inodes that
have been unlinked, but are still referenced by open file descriptors.
These inodes cannot be freed until the final file descriptor reference
has been closed. If the system crashes while they are still being
referenced, these inodes and their referenced blocks need to be
freed by fsck. By having them on a linked list with the head pointer
in the superblock, fsck can quickly find and process them rather
than having to check every inode in the filesystem to see if it is
unreferenced.

When updating the head pointer of this list of unlinked inodes in
the superblock, the superblock check-hash was not getting updated.
If the system crashed with the incorrect superblock check-hash, the
superblock would appear to be corrupted. This patch ensures that
the superblock check-hash is updated when updating the head pointer
of the unlinked inodes list.

There is no need to MFC as superblock check hashes first appeared in
13.0.

Tested by:    Peter Holm
Sponsored by: Netflix
2019-10-24 19:47:18 +00:00
gallatin
fb7539ef5c Add a tunable to set the pgcache zone's maxcache
When it is set to 0 (the default), a heavy Netflix-style web workload
suffers from heavy lock contention on the vm page free queue called from
vm_page_zone_{import,release}() as the buckets are frequently drained.
When setting the maxcache, this contention goes away.

We should eventually try to autotune this, as well as make this
zone eligable for uma_reclaim().

Reviewed by:	alc, markj
Not Objected to by: jeff
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D22112
2019-10-24 18:39:05 +00:00
jhb
7622bc9ddb Use a counter with a random base for explicit IVs in GCM.
This permits constructing the entire TLS header in ktls_frame() rather
than ktls_seq().  This also matches the approach used by OpenSSL which
uses an incrementing nonce as the explicit IV rather than the sequence
number.

Reviewed by:	gallatin
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D22117
2019-10-24 18:13:26 +00:00
bz
27aa98f3aa frag6: leave a note about upper layer header checks TBD
Per sepcification the upper layer header needs to be within the first
fragment.  The check was not done so far and there is an open review for
related work, so just leave a note as to where to put it.
Move the extraction of frag offset up to this as it is needed to determine
whether this is a first fragment or not.

MFC after:	3 weeks
Sponsored by:	Netflix
2019-10-24 12:16:15 +00:00