Commit Graph

131256 Commits

Author SHA1 Message Date
Konstantin Belousov
74cb9a5333 Fix a bug in r358168, do not call sigfastblock_setpend() under a mutex.
PR:	244250
Reported and tested by:	lwhsu
Sponsored by:	The FreeBSD Foundation
2020-02-20 21:25:12 +00:00
Kristof Provost
55cd93249b virtio: Pass the interrupt type in mmio mode
When we register an interrupt handler we need to pass the intr_type along in
bus_setup_intr().

The interrupt type matters because it is used to decide if we need to enter
NET_EPOCH. That meant that vtmmio-based if_vtnet did not, which led to panics
with INVARIANTS set.

Sponsored by:	Axiado
2020-02-20 17:26:08 +00:00
Emmanuel Vadot
1a7ba9a01c linuxkpi: Add str_has_prefix
This function tests whether the string str begins with the string pointed
to by prefix.

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D23767
2020-02-20 17:20:50 +00:00
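A minimal userland sketch of the prefix test described in the str_has_prefix commit above. It is not the committed linuxkpi code. It assumes the Linux convention of returning the prefix length on a match and 0 otherwise.

```c
#include <stddef.h>
#include <string.h>

/*
 * Sketch only: return the length of prefix when str begins with it,
 * and 0 otherwise.  An empty prefix therefore also yields 0.
 */
static inline size_t
str_has_prefix(const char *str, const char *prefix)
{
	size_t len = strlen(prefix);

	return (strncmp(str, prefix, len) == 0 ? len : 0);
}
```

Returning the length lets a caller skip past the prefix in one step, e.g. `name += str_has_prefix(name, "vtnet")`.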
Emmanuel Vadot
8f0c734385 linuxkpi: Add list_is_first function
This function just tests whether the element is the first of the list.

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D23766
2020-02-20 17:19:16 +00:00
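The check described in the list_is_first commit above can be sketched as follows. The `struct list_head` here is a minimal stand-in and `list_init_sketch()` and `list_add_tail_sketch()` are hypothetical helpers added only so the example is self-contained. The real linuxkpi definitions may differ.

```c
#include <stdbool.h>

/* Minimal stand-in for a circular doubly linked list head. */
struct list_head {
	struct list_head *next;
	struct list_head *prev;
};

/* Hypothetical helper: make head an empty circular list. */
static inline void
list_init_sketch(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Hypothetical helper: append entry just before head (at the tail). */
static inline void
list_add_tail_sketch(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* An entry is first when the element before it is the list head. */
static inline bool
list_is_first(const struct list_head *entry, const struct list_head *head)
{
	return (entry->prev == head);
}
```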
Konstantin Belousov
b08bdabee4 Add more values for PCI capabilities, PCIe extended capabilities, and subclasses.
Taken from
https://pcisig.com/sites/default/files/files/PCI_Code-ID_r_1_11__v24_Jan_2019.pdf

Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after:	1 week
2020-02-20 17:08:52 +00:00
Mateusz Guzik
65cdfb4caa make sysent for r358172 ("vfs: add realpathat syscall")
2020-02-20 16:58:57 +00:00
Mateusz Guzik
0573d0a9b8 vfs: add realpathat syscall
realpath(3) is used a lot e.g., by clang and is a major source of getcwd
and fstatat calls. This can be done more efficiently in the kernel.

This works by performing a regular lookup while saving the name and found
parent directory. If the terminal vnode is a directory we can resolve it using
usual means. Otherwise we can use the name saved by lookup and resolve the
parent.

See the review for sample syscall counts.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D23574
2020-02-20 16:58:19 +00:00
Michael Tuexen
64f29eb1df Remove an unused timer type.
MFC after:		1 week
2020-02-20 15:37:44 +00:00
Konstantin Belousov
a113b17f10 Do not read sigfastblock word on syscall entry.
On machines with SMAP, fueword executes two serializing instructions
which can be seen in microbenchmarks.

As a measure to restore microbenchmark numbers, only read the word on
the attempt to deliver signal in ast().  If the word is set, signal is
not delivered and word is kept, preventing interruption of
interruptible sleeps by signals until userspace calls
sigfastblock(UNBLOCK) which clears the word.

This way, the spurious EINTR that userspace can see while in critical
section is on first interruptible sleep, if a signal is pending, and
on signal posting.  It is believed that it is not important for rtld
and libthr critical sections.  It might be visible for the application
code e.g. for the callback of dl_iterate_phdr(3), but again the belief
is that the non-compliance is acceptable.  Most important is that the
retry of the sleeping syscall does not interrupt unless additional
signal is posted.

For now I added the knob kern.sigfastblock_fetch_always to enable the
word read on syscall entry to be able to diagnose possible issues due
to spurious EINTR.

While there, do some code restructuring to have all sigfastblock()
handling located in kern_sig.c.

Reviewed by:	jeff
Discussed with:	mjg
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D23622
2020-02-20 15:34:02 +00:00
Bjoern A. Zeeb
a1a6c01e41 ip6_output: improve extension header handling
Move IPv6 source address checks from after extension header handling
to the top of the function. If we do not pass these checks there is
no reason to do a lot of work upfront.

Fold extension header preparations and length calculations together into
a single branch and macro rather than doing them sequentially.
Likewise move extension header concatenation into a single branch block
only doing it if we recorded any extension header length.

Reviewed by:	melifaro (earlier version), markj, gallatin
Sponsored by:	Netflix (partially, originally)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D23740
2020-02-20 10:56:12 +00:00
Baptiste Daroussin
4b60673b0d Bump __FreeBSD_version after bumping ncurses shlib
2020-02-20 09:17:45 +00:00
Adrian Chadd
af2441fbc7 [ath] Attempt to fix epoch handling.
The epoch stuff with taskqueues works fine if the driver never calls
the receive path in other contexts, but this driver does.  If there was
a chip reset during active receive then part of the reset will call
the receive path to flush out any active packets before reinitialising
the receive queue and that needs to be done with the epoch held.

So:

* make the receive task a normal task again
* explicitly call epoch enter/exit around the legacy and newer DMA
  receive paths
* add a couple of epoch asserts to ensure that the receive packet
  path itself is called with epoch held.

This fixes it on my Atom eeepc laptop (circa 2010!) on which I did
all of my initial 802.11n work in this driver and net80211.

Tested:

* AR9285, STA mode

TODO:

* Test on EDMA chipset (AR9380)
* Test in AP/adhoc modes, just to be sure (eg for beacon
  receive processing in particular.)
2020-02-20 07:12:43 +00:00
Warner Losh
cafbf0c664 Don't convert all lower-layer errors to EIO.
Don't convert all lower layer errors to EIO. Instead, pass the actual error up
the stack. This will allow the upper layers that look for ENXIO to react
properly to that signal from the lower layers and, for UFS, unmount the
filesystem.

Reviewed by: kib@
Differential Revision:  https://reviews.freebsd.org/D23755
2020-02-20 01:33:01 +00:00
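A small sketch of why the error pass-through in the commit above matters. All three function names are hypothetical and do not come from the FreeBSD source. Only `EIO` and `ENXIO` are real errno values.

```c
#include <errno.h>

/* Old behaviour (hypothetical helper): every failure is reported as EIO. */
static int
report_error_collapsed(int lower_error)
{
	return (lower_error != 0 ? EIO : 0);
}

/* New behaviour (hypothetical helper): the lower-layer errno is kept. */
static int
report_error_passthrough(int lower_error)
{
	return (lower_error);
}

/* An upper layer that only reacts when it sees ENXIO (device gone). */
static int
device_is_gone(int error)
{
	return (error == ENXIO);
}
```

With the collapsed variant an `ENXIO` from the lower layer arrives as `EIO`, so `device_is_gone()` never fires. With the pass-through variant it does.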
Warner Losh
65252dc903 Don't spam the console with an additional, and useless, error message.
There's no need to spam the console with this error message. If there's an I/O
error, the disk/cam driver will report it at the lower levels. If that's an
actual problem, the upper layers will report that.

Reviewed by: kib@
Differential Revision:  https://reviews.freebsd.org/D23756
2020-02-20 00:34:46 +00:00
Jeff Roberson
4b3dac72b3 Silence a gcc warning about no return from a function that handles every
possible enum in a switch statement.  I verified that this emits nothing
as expected on clang.  radix relies on constant propagation to eliminate
any branching from these access routines.

Reported by:	lwhsu/tinderbox
2020-02-19 22:34:22 +00:00
Jeff Roberson
1ddda2eb24 Use SMR to provide a safe unlocked lookup for vm_radix.
The tree is kept correct for readers with store barriers and careful
ordering.  The existing object lock serializes writers.  Consumers
will be introduced in later commits.

Reviewed by:	markj, kib
Differential Revision:	https://reviews.freebsd.org/D23446
2020-02-19 19:58:31 +00:00
Jeff Roberson
83bf6ee49b Since r357940 it is no longer possible to use a single type cast for all
atomic_*_ptr functions.
2020-02-19 19:51:09 +00:00
Jeff Roberson
c6fd3e23f7 Use per-domain locks for the bucket cache.
This gives much better concurrency when there are a large number of
cores per-domain and multiple domains.  Avoid taking the lock entirely
if it will not be productive.  ROUNDROBIN domains will have mixed
memory in each domain and will load balance to all domains.

While here refactor the zone/domain separation and bucket limits to
simplify callers.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D23673
2020-02-19 18:48:46 +00:00
Jeff Roberson
e9ceb9dd11 Don't release xbusy on kmem pages. After lockless page lookup we will not
be able to guarantee that they can be reacquired without blocking.

Reviewed by:	kib
Discussed with:	markj
Differential Revision:	https://reviews.freebsd.org/D23506
2020-02-19 09:10:11 +00:00
Jeff Roberson
6c5f36ff30 Eliminate some unnecessary uses of UMA_ZONE_VM. Only zones involved in
virtual address or physical page allocation need to be marked with this
flag.

Reviewed by:	markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D23712
2020-02-19 08:17:27 +00:00
Jeff Roberson
bf7dba0b91 Type validating smr protected pointer accessors.
This API is intended to provide some measure of safety with SMR
protected pointers.  A struct wrapper provides type checking and
a guarantee that all access is mediated by the API unless abused.  All
modifying functions take an assert as an argument to guarantee that
the required synchronization is present.

Reviewed by:	kib, markj, mjg
Differential Revision:	https://reviews.freebsd.org/D23711
2020-02-19 08:15:20 +00:00
Hiroki Sato
294de6bbd6 Add _BIX (Battery Information Extended) object support.
ACPI Control Method Batteries have a _BIF and/or _BIX object which
provide static properties of the battery.  FreeBSD acpi_cmbat module
supported _BIF object only, which was deprecated as of ACPI 4.0.
_BIX is an extended version of _BIF defined in ACPI 4.0 or later.

As of writing, _BIX has two revisions.  One is in ACPI 4.0 (rev.0) and
another is in ACPI 6.0 (rev.1).  It seems that hardware vendors still
stick to _BIF only or _BIX rev.0 + _BIF for the maximum compatibility.
Microsoft requires _BIX rev.0 for Windows machines, so there are some
laptop machines with _BIX rev.0 only. In this case, FreeBSD does not
recognize the battery information.

After this change, the acpi_cmbat module gets battery information from
_BIX or _BIF object and internally uses _BIX rev.1 data structure as
the primary information store in the kernel.  ACPIIO_BATT_GET_BI[FX]
returns an acpi_bi[fx] structure built by using information obtained
from a _BIF or a _BIX object found on the system.  The revision number
field can be used to check which field is available.  The acpiconf(8)
utility will show additional information if _BIX is available.

Although ABIs of ACPIIO_BATT_* were changed, the existing APIs for
userland utilities are not changed and the backward-compatible ABIs
are provided.  This means that older versions of acpiconf(8) can also
work with the new kernel. The (union acpi_battery_ioctl_arg) was
padded to 256 byte long to avoid another ABI change in the future.
A _BIX object with its revision number >1 will be treated as
compatible with the rev.1 _BIX format.

Reviewed by:	takawata
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D23728
2020-02-19 06:28:55 +00:00
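The padding idea from the _BIX commit above can be sketched like this. The structure and member names are hypothetical stand-ins, not the real acpi_bif and acpi_bix layouts. The reason padding helps is that BSD ioctl command numbers encode the size of the argument, so a union whose size never changes keeps the command numbers stable when larger members are added later.

```c
/* Hypothetical, much smaller stand-ins for the real battery structures. */
struct bif_sketch {
	unsigned units;
	unsigned design_capacity;
};

struct bix_sketch {
	unsigned short revision;
	unsigned units;
	unsigned design_capacity;
	unsigned cycle_count;
};

/* The pad member fixes the union size at 256 bytes. */
union battery_ioctl_arg_sketch {
	int unit;
	struct bif_sketch bif;
	struct bix_sketch bix;
	char pad[256];
};

/* Fail the build if a future member grows past the reserved space. */
_Static_assert(sizeof(union battery_ioctl_arg_sketch) == 256,
    "ioctl argument size is part of the ABI");
```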
Ryan Libby
9fab908a79 powerpc: unconditionally mark SLB zones UMA_ZONE_CONTIG
PR:		244118
Reported by:	Francis Little <oggy at farscape.co.uk>
Tested by:	Francis Little, Mark Millard <marklmi at yahoo.com>
Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D23729
2020-02-19 04:46:41 +00:00
Justin Hibbits
478d3cf5b8 powerpc/amigaone: Fix license header formatting on cpld files
This should've been fixed before initial commit, but wasn't.  Not even sure
how it happened in the first place.
2020-02-19 03:39:11 +00:00
Navdeep Parhar
02cd773916 cxgbe(4): Congestion drops are maintained per E-channel and not per
buffer group.

This fixes a bug where congestion drops on port 1 of a T6 card would
incorrectly be counted as drops on port 0.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-19 00:48:58 +00:00
Kirk McKusick
98b6844690 Additional KASSERTs to ensure the consistency of the soft updates
indirdep structure. No functional change.

Tested by:    Peter Holm (as part of a larger patch)
Sponsored by: Netflix
2020-02-18 23:56:23 +00:00
Michael Tuexen
868b51f234 Epochify SCTP.
2020-02-18 21:25:17 +00:00
Navdeep Parhar
9a4a1be02c cxgbe/iw_cxgbe: correctly enforce the max reg_mr depth.
Reported by:	Andrew Zhu @ Netapp
Obtained from:	Chelsio Communications
MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-18 20:43:10 +00:00
Hans Petter Selasky
fbb890056e Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process
incoming packets in taskqueue context.

This patch extends r357772.

Differential Revision:	https://reviews.freebsd.org/D23742
Reviewed by:	glebius@
Sponsored by:	Mellanox Technologies
2020-02-18 19:53:36 +00:00
Michael Tuexen
ba0d525006 Remove unused function.
2020-02-18 19:41:55 +00:00
Dimitry Andric
2d8a0c01e5 Fix the following -Werror warning from clang 10.0.0:
sys/arm/allwinner/clkng/aw_clk_mipi.c:144:6: error: misleading indentation; statement is not part of the previous 'if' [-Werror,-Wmisleading-indentation]
                                        m++;
                                        ^
sys/arm/allwinner/clkng/aw_clk_mipi.c:142:5: note: previous statement is here
                                if (best == *fout)
                                ^

Move the increment operations into the for loop headers instead.

Discussed with:	manu
MFC after:	3 days
2020-02-18 17:55:24 +00:00
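A sketch of the kind of change described in the aw_clk_mipi.c commit above. The function and its search condition are hypothetical. Only the pattern (moving the increment from the loop body into the for header) follows the commit message.

```c
/*
 * Shape that triggers -Wmisleading-indentation (shown as a comment only):
 *
 *	for (m = 1; m <= max_m; ) {
 *		if (m * step >= target)
 *			break;
 *			m++;	// indented as if guarded by the if
 *	}
 *
 * Fixed shape: the increment lives in the for header.
 * Returns max_m + 1 when no multiplier reaches target.
 */
static unsigned
first_multiple_at_least(unsigned step, unsigned target, unsigned max_m)
{
	unsigned m;

	for (m = 1; m <= max_m; m++) {
		if (m * step >= target)
			break;
	}
	return (m);
}
```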
Fedor Uporov
3767ed5b11 Add an EXT2FS-specific implementation for lseek(SEEK_DATA).
The lseek(SEEK_DATA) optimization logic can simply be borrowed from the UFS side.
See https://reviews.freebsd.org/D19599.

Reviewed by:    pfg
MFC after:      1 week

Differential Revision:    https://reviews.freebsd.org/D23605
2020-02-18 16:39:57 +00:00
Bjoern A. Zeeb
7c1daefe2c ip6_output: update comments.
Clear up some comments and improve panic messages.

No functional changes.

MFC after:	3 days
2020-02-18 11:28:00 +00:00
Chuck Silvers
2272f66379 amd64: keep PTE bitmasks in sync with target pmap during pv reclaim
In reclaim_pv_chunk_domain(), when we switch to a new target pmap from which
we are trying to reclaim a pv chunk, always update the current PTE bitmasks
to match.

Reviewed by:	kib, markj
Approved by:	imp (mentor)
Sponsored by:	Netflix
2020-02-18 00:02:20 +00:00
Dimitry Andric
8a1e7a1d5f Merge r358034 from the clang1000-import branch:
Disable new clang 10.0.0 warnings about misleading indentation in
sys/contrib/ncsw/Peripherals/FM/fman_ncsw.c.

This is horribly formatted contributed code, and fixing it is not worth
the effort.

MFC after:	3 days
2020-02-17 20:23:26 +00:00
Dimitry Andric
b267558ca6 Merge r358030 from the clang1000-import branch:
Work around new clang 10.0.0 -Werror warning:

sys/arm/allwinner/aw_cir.c:208:41: error: converting the result of '<<' to a boolean; did you mean '((1 & 255) << 23) != 0'? [-Werror,-Wint-in-bool-context]
        active_delay = (AW_IR_ACTIVE_T + 1) * (AW_IR_ACTIVE_T_C ? 128 : 1);
                                               ^
sys/arm/allwinner/aw_cir.c:130:39: note: expanded from macro 'AW_IR_ACTIVE_T_C'
#define AW_IR_ACTIVE_T_C                ((1 & 0xff) << 23)
                                                    ^

Add the != 0 part to indicate that we indeed want to compare against
zero.

MFC after:	3 days
2020-02-17 20:22:10 +00:00
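The workaround in the aw_cir.c commit above can be sketched as follows. `SKETCH_ACTIVE_T_C` copies the shape of the macro quoted in the warning, while `SKETCH_ACTIVE_T` and its value are assumptions made for illustration.

```c
/* Same shape as AW_IR_ACTIVE_T_C in the quoted warning. */
#define	SKETCH_ACTIVE_T_C	((1 & 0xff) << 23)
/* Hypothetical value; the real AW_IR_ACTIVE_T is not shown in the log. */
#define	SKETCH_ACTIVE_T		99

static unsigned
active_delay_sketch(void)
{
	/*
	 * The explicit "!= 0" states that the shifted value is used as a
	 * flag, which is what silences -Wint-in-bool-context.
	 */
	return ((SKETCH_ACTIVE_T + 1) * (SKETCH_ACTIVE_T_C != 0 ? 128 : 1));
}
```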
Scott Long
332e6e31c2 Fix syntax error from r357647. Adjust a variable name to make the use more
clear.

Reported by:	dim
2020-02-17 20:12:34 +00:00
Dimitry Andric
816dab96c1 Disable new clang 10.0.0 warnings about misleading indentation in
sys/contrib/ncsw/Peripherals/FM/fman_ncsw.c.

This is horribly formatted contributed code, and fixing it is not worth
the effort.
2020-02-17 19:20:47 +00:00
Dimitry Andric
30882a7c88 Work around new clang 10.0.0 -Werror warning:
sys/arm/allwinner/aw_cir.c:208:41: error: converting the result of '<<' to a boolean; did you mean '((1 & 255) << 23) != 0'? [-Werror,-Wint-in-bool-context]
        active_delay = (AW_IR_ACTIVE_T + 1) * (AW_IR_ACTIVE_T_C ? 128 : 1);
                                               ^
sys/arm/allwinner/aw_cir.c:130:39: note: expanded from macro 'AW_IR_ACTIVE_T_C'
#define AW_IR_ACTIVE_T_C                ((1 & 0xff) << 23)
                                                    ^

Add the != 0 part to indicate that we indeed want to compare against
zero.
2020-02-17 18:37:15 +00:00
Dimitry Andric
05b1ae81f6 Tentatively apply D23730:
Fix compile errors in altera_sdcard_io.c after r357647

Summary:
After rS357647, building universe results in compilation errors for
_.mips.BERI_DE4_SDROOT:

```
sys/dev/altera/sdcard/altera_sdcard_io.c: In function 'altera_sdcard_io_start_internal':
sys/dev/altera/sdcard/altera_sdcard_io.c:299:13: error: '*bp' is a pointer; did you mean to use '->'?
  switch (*bp->bio_cmd) {
             ^~
             ->
sys/dev/altera/sdcard/altera_sdcard_io.c:301:38: error: '*bp' is a pointer; did you mean to use '->'?
   altera_sdcard_write_cmd_arg(sc, *bp->bio_pblkno *
                                      ^~
                                      ->
sys/dev/altera/sdcard/altera_sdcard_io.c:307:42: error: '*bp' is a pointer; did you mean to use '->'?
   altera_sdcard_write_rxtx_buffer(sc, *bp->bio_data,
                                          ^~
                                          ->
sys/dev/altera/sdcard/altera_sdcard_io.c:308:10: error: '*bp' is a pointer; did you mean to use '->'?
       *bp->bio_bcount);
          ^~
          ->
sys/dev/altera/sdcard/altera_sdcard_io.c:309:38: error: '*bp' is a pointer; did you mean to use '->'?
   altera_sdcard_write_cmd_arg(sc, *bp->bio_pblkno *
                                      ^~
                                      ->
sys/dev/altera/sdcard/altera_sdcard_io.c: In function 'altera_sdcard_io_start':
sys/dev/altera/sdcard/altera_sdcard_io.c:336:20: error: incompatible types when assigning to type 'struct bio *' from type 'struct bio'
  sc->as_currentbio = *bp;
                    ^
```

The first few are because `->` has a higher precedence than `*`, so the
expressions should use `(*bp)->foo` instead.  I also renamed the
variable to `bpp` to make it clearer that it is a pointer-to-pointer.

The last one is because `sc->as_currentbio` is already a `struct bio *`,
there is no need to dereference `bp` there.

Last but not least, I would really suggest rewriting the
`altera_sdcard_io_start_internal()` function to just return success or
failure, so the caller can decide to set `bp` to NULL.
2020-02-17 18:31:32 +00:00
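The precedence issue explained in the altera_sdcard_io.c commit above, as a small sketch. `struct bio_sketch` is a hypothetical one-field stand-in for `struct bio`.

```c
/* Hypothetical stand-in with a single field. */
struct bio_sketch {
	int bio_cmd;
};

static int
read_cmd(struct bio_sketch **bpp)
{
	/*
	 * "*bpp->bio_cmd" would parse as "*(bpp->bio_cmd)" because "->"
	 * binds tighter than unary "*", and bpp is not a struct pointer.
	 * The parentheses dereference bpp first.
	 */
	return ((*bpp)->bio_cmd);
}
```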
Michael Tuexen
a610bb2120 Fix the non-default stream schedulers such that they do not interleave
user messages when it is not allowed.

Thanks to Christian Wright for reporting the issue for the userland
stack and providing a fix for the priority scheduler.

MFC after:		1 week
2020-02-17 18:05:03 +00:00
Andrew Turner
334790ea6b Use EARLY_DRIVER_MODULE in the acpi bus.
We need this to use EARLY_DRIVER_MODULE in child drivers on arm64. This
should be a no-op on x86 as it has DRIVER_MODULE in the nexus driver making
all later drivers attach in the last pass.

Reviewed by:	imp
MFC after:	1 month
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D23717
2020-02-17 15:32:21 +00:00
Mark Johnston
34e2051faf Remove swblk_t.
It was used only to store the bounds of each swap device.  However,
since swblk_t is a signed 32-bit int and daddr_t is a signed 64-bit
int, swp_pager_isondev() may return an invalid result if swap devices
are repeatedly added and removed and sw_end for a device ends up
becoming a negative number.

Note that the removed comment about maximum swap size still applies.

Reviewed by:	jeff, kib
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23666
2020-02-17 15:11:07 +00:00
Mark Johnston
725b4ff001 Fix a swap block allocation race.
putpages' allocation of swap blocks is done under the global sw_dev
lock.  Previously it would drop that lock before inserting the allocated
blocks into the object's trie, creating a window in which swap blocks
are allocated but are not visible to swapoff.  This can cause
swp_pager_strategy() to fail and panic the system.

Fix the problem bluntly, by allocating swap blocks under the object
lock.

Reviewed by:	jeff, kib
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23665
2020-02-17 15:10:41 +00:00
Mark Johnston
c90d075be4 Fix object locking races in swapoff(2).
swap_pager_swapoff_object()'s goal is to allocate pages for all valid
swap blocks belonging to the object, for which there is no resident
page.  If the page corresponding to a block is already resident and
valid, the block can simply be discarded.

The existing implementation tries to minimize the number of I/Os used.
For each cluster of swap blocks, it finds maximal runs of valid swap
blocks not resident in memory, and valid resident pages.  During this
processing, the object lock may be dropped in several places: when
calling getpages, or when blocking on a busy page in
vm_page_grab_pages().  While the lock is dropped, another thread may
free swap blocks, causing getpages to page in stale data.

Fix the problem following a suggestion from Jeff: use getpages'
readahead capability to perform clustering rather than doing it
ourselves.  This simplifies the code a bit without reintroducing the old
behaviour of performing one I/O per page.

Reviewed by:	jeff
Reported by:	dhw, gallatin
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23664
2020-02-17 15:09:40 +00:00
Michael Tuexen
6b8fba3c5c Don't use uninitialised stack memory if the sysctl variable
net.inet.tcp.hostcache.enable is set to 0.
The bug could result in using an MSS value that was too small or wrong
initial retransmission timer settings. The value used for ssthresh was
possibly wrong as well.

Submitted by:		Richard Scheffenegger
Reviewed by:		Cheng Cui, rgrimes@, tuexen@
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D23687
2020-02-17 14:54:21 +00:00
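A sketch of the class of bug described in the hostcache commit above and one common way to avoid it. The structure, its fields, the function name, and the sample values are all hypothetical. The actual fix in the TCP host cache code may look different.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the metrics a cache lookup fills in. */
struct metrics_sketch {
	unsigned mss;
	unsigned rtt;
	unsigned ssthresh;
};

static void
cache_get_sketch(bool cache_enabled, struct metrics_sketch *out)
{
	/*
	 * Clear the caller's (typically stack-allocated) buffer first so
	 * that the early return below still leaves defined values behind.
	 */
	memset(out, 0, sizeof(*out));
	if (!cache_enabled)
		return;

	/* Sample values standing in for a real cache hit. */
	out->mss = 1460;
	out->rtt = 30;
	out->ssthresh = 65535;
}
```

A caller can then treat an all-zero result as "no cached data" regardless of whether the cache is enabled.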
Konstantin Belousov
eca86ffaa1 Fix typo.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2020-02-17 13:26:36 +00:00
Bjoern A. Zeeb
10108cb673 Partially revert VNET change and expand VNET structure.
Revert parts of r353274 replacing vnet_state with a shutdown flag.

Not having the state flag for the current SI_SUB_* makes it harder to debug
kernel or module panics related to VNET bringup or teardown.
Not having the state also does not allow us to check for other dependency
levels between components, e.g. for moving interfaces.

Expand the VNET structure with the new boolean flag indicating that we are
doing a shutdown of a given vnet and update the vnet magic cookie for the
change.

Update libkvm to compile with a bool in the kernel struct.

Bump __FreeBSD_version for (external) module builds to more easily detect
the change.

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D23097
2020-02-17 11:08:50 +00:00
Hans Petter Selasky
bacb11c9ed Fix kernel panic while trying to read multicast stream.
When VIMAGE is enabled make sure the "m_pkthdr.rcvif" pointer is set
for all mbufs being input by the IGMP/MLD6 code. Else there will be a
NULL-pointer dereference in the netisr code when trying to set the
VNET based on the incoming mbuf. Add an assert to catch this when
queueing mbufs on a netisr to make debugging of similar cases easier.

Found by:	Vladislav V. Prodan
PR:		244002
Reviewed by:	bz@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-02-17 09:46:32 +00:00
Jeff Roberson
ed581bf68f Add a simple accessor that returns the bytes of memory consumed by a zone.
2020-02-17 01:59:55 +00:00