125528 Commits

Author SHA1 Message Date
Andrew Turner
cd0c606fda Ensure the I-Cache is correctly handled in arm64_icache_sync_range
The cache_handle_range macro to handle the arm64 instruction and data
cache operations would return when it was complete. This causes problems
for arm64_icache_sync_range and arm64_icache_sync_range_checked as they
assume they can execute the i-cache handling instruction after it has been
called.

Fix this by making this assumption correct.

While here add missing instruction barriers and adjust the style to
match the rest of the assembly.

Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D18838
2019-01-15 09:48:18 +00:00
Kristof Provost
032dff662c pf: silence a runtime warning
Sometimes, for negated tables, pf can log 'pfr_update_stats: assertion failed'.
This warning does not clarify anything for users, so silence it, just as
OpenBSD has.

PR:		234874
MFC after:	1 week
2019-01-15 08:59:51 +00:00
Xin LI
305bb04ee4 Use TD_IS_IDLETHREAD instead of unrolled version.
MFC after:	2 weeks
2019-01-15 06:44:37 +00:00
Gleb Smirnoff
5a8eee2bb4 Fix compilation on 32-bit. 2019-01-15 03:43:46 +00:00
Gleb Smirnoff
756a541279 Allocate pager bufs from UMA instead of 80-ish mutex protected linked list.
o In vm_pager_bufferinit() create pbuf_zone and start accounting on how many
  pbufs we are going to have set.
  In various subsystems that are going to utilize pbufs create private zones
  via call to pbuf_zsecond_create(). The latter calls uma_zsecond_create(),
  and sets a limit on created zone. After startup preallocate pbufs according
  to requirements of all pbuf zones.

  Subsystems that used to have a private limit with old allocator now have
  private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS cluster, FFS,
  swap, vnode pager.

  The following subsystems use shared pbuf zone: cam(4), nvme(4), physio(9),
  aio(4). They should have their private limits, but changing that is out of
  scope of this commit.

o Fetch tunable value of kern.nswbuf from init_param2() and while here move
  NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that was holding only
  this option.
  Default values aren't touched by this commit, but they probably should be
  reviewed wrt modern hardware.

This change removes a tight bottleneck from sendfile(2) operation, that
uses pbufs in vnode pager. Other pagers also would benefit from faster
allocation.

Together with:	gallatin
Tested by:	pho
2019-01-15 01:02:16 +00:00
Oleksandr Tymoshenko
7c895edb66 [led] propagate error from set_led() to the caller
Do not lose error condition by always returning 0 from set_led.
None of the calls to set_led checks for return value at the moment so
none of API consumers in base is affected.

PR:		231567
Submitted by:	Bertrand Petit <bsdpr@phoe.frmug.org>
MFC after:	1 week
2019-01-15 00:52:41 +00:00
Oleksandr Tymoshenko
6534f93296 [mv_pci] Increase default PCI space size for mv_pci
mv_pci driver reads PCI memory window layout from DTB data and if the
data is incomplete falls back to default value. The value is too small
to fit two PCI spaces for mwlwifi devices on WRT3200ACM so the resource
allocation for them fails. Increase the default to 4Mb from 1Mb so
the devices can be properly attached.

MFC after:	1 week
2019-01-15 00:37:37 +00:00
Gleb Smirnoff
46713135ae Add flag LK_NEW for lockinit() that is converted to LO_NEW and passed
down to lock_init().  This allows for lockinit() on a not prezeroed
memory.
2019-01-15 00:35:19 +00:00
Gleb Smirnoff
bb15d1c778 o Move zone limit from keg level up to zone level. This means that now
  two zones sharing a keg may have different limits. Now this is going
  to work:

  zone = uma_zcreate();
  uma_zone_set_max(zone, limit);
  zone2 = uma_zsecond_create(zone);
  uma_zone_set_max(zone2, limit2);

  Kegs no longer have uk_maxpages field, but zones have uz_items. When
  set, it may be rounded up to minimum possible CPU bucket cache size.
  For small limits bucket cache can also be reconfigured to be smaller.
  Counter uz_items is updated whenever items transition from keg to a
  bucket cache or directly to a consumer. If zone has uz_maxitems set and
  it is reached, then we are going to sleep.

o Since new limits don't play well with multi-keg zones, remove them. The
  idea of multi-keg zones was introduced exactly 10 years ago, and has never
  had a practical use. In discussion with Jeff we came to a wild
  agreement that if we ever want to reintroduce the idea of a smart allocator
  that would be able to choose between two (or more) totally different
  backing stores, that choice should be made one level higher than UMA,
  e.g. in malloc(9) or in mget(), or whatever and choice should be controlled
  by the caller.

o Sleeping code is improved to account for the number of sleepers and wake
  them one by one, to avoid the thundering herd problem.

o Flag UMA_ZONE_NOBUCKETCACHE removed, instead uma_zone_set_maxcache()
  KPI added. Having no bucket cache basically means setting maxcache to 0.

o Now with many fields added and many removed (no multi-keg zones!) make
  sure that struct uma_zone is perfectly aligned.
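
The per-zone limit described in the first item can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct keg`, `struct zone`, `zone_take()` and `zone_give_back()` are hypothetical names, not the UMA implementation, and the model returns failure where the kernel would sleep.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-simplified stand-ins for UMA's keg and zone. */
struct keg {
	size_t item_size;		/* backing store shared by several zones */
};

struct zone {
	struct keg *keg;
	unsigned long items;		/* items currently handed out by this zone */
	unsigned long max_items;	/* 0 means unlimited */
};

/*
 * Account for one item leaving the keg through this zone.  Returns 1 on
 * success and 0 when the zone's own limit is reached; per the message
 * above, the kernel would sleep at that point instead of failing.
 */
static int
zone_take(struct zone *z)
{
	assert(z != NULL && z->keg != NULL);
	if (z->max_items != 0 && z->items >= z->max_items)
		return (0);
	z->items++;
	return (1);
}

/* Return one item to the zone, making room under its limit again. */
static void
zone_give_back(struct zone *z)
{
	assert(z != NULL && z->items > 0);
	z->items--;
}
```

Because the counter and the limit live in the zone rather than in the keg, two zones that point at the same keg can be limited independently.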

Reviewed by:	markj, jeff
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D17773
2019-01-15 00:02:06 +00:00
Konstantin Belousov
28b740da38 Handle overflow in calculating max kmem size.
vm_kmem_size is u_long, and it might be not capable of holding page
count times PAGE_SIZE, even when scaled down by VM_KMEM_SIZE_SCALE.  As
bde reported, 12G PAE config ends up with zero for kmem size.

Explicitly check for overflow and clamp kmem size at vm_kmem_size_max.
If we end up at zero size because VM_KMEM_SIZE_MAX is not defined,
panic with clear explanation rather than failing in a way which is
hard to relate.
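
A minimal sketch of the overflow check and clamp described above, assuming a hypothetical helper `kmem_size_clamped()` and plain `unsigned long` arithmetic; it is not the committed code, and it divides by the scale before multiplying, which may round differently from the kernel.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: compute (pages / scale) * page_size, clamping to
 * max_size when the multiplication would overflow unsigned long or when
 * the result exceeds the cap.  max_size == 0 means "no cap defined"; a
 * zero return in that case is what the caller would treat as fatal.
 */
static unsigned long
kmem_size_clamped(unsigned long pages, unsigned long page_size,
    unsigned long scale, unsigned long max_size)
{
	unsigned long size;

	assert(page_size != 0 && scale != 0);

	/* Scale down first so the scaling step cannot overflow. */
	pages /= scale;

	/* pages * page_size overflows iff pages > ULONG_MAX / page_size. */
	if (pages > ULONG_MAX / page_size)
		return (max_size);

	size = pages * page_size;
	if (max_size != 0 && size > max_size)
		size = max_size;
	return (size);
}
```

Testing for overflow with a division before multiplying avoids relying on the wrapped result, which is how a silent zero can appear.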

Reported by:	bde, pho
Tested by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D18767
2019-01-14 07:31:19 +00:00
Olivier Houchard
e8d5909c39 Don't forget to add the needed #includes.
Pointy hat to:	cognet
2019-01-13 23:41:56 +00:00
Olivier Houchard
9cd27257d5 Introduce cpu_icache_sync_range_checked(), that does the same thing as
cpu_icache_sync_range(), except that it sets pcb_onfault to catch any page
fault, as doing cache maintenance operations for non-mapped addresses generates a
data abort, and use it in freebsd32_sysarch(), so that a userland program
attempting to sync the icache with unmapped addresses doesn't crash the
kernel.

Spotted out by:	andrew
2019-01-13 23:29:46 +00:00
Jason A. Harmening
7dff7eda1a Handle SIGIO for listening sockets
r319722 separated struct socket and parts of the socket I/O path into
listening-socket-specific and dataflow-socket-specific pieces.  Listening
socket connection notifications are now handled by solisten_wakeup() instead
of sowakeup(), but solisten_wakeup() does not currently post SIGIO to the
owning process.

PR:	234258
Reported by:	Kenneth Adelman
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D18664
2019-01-13 20:33:54 +00:00
Olivier Houchard
8c9c3144cc Implement COMPAT_FREEBSD32 for arm64.
This is based on early work by andrew@.
2019-01-13 19:49:46 +00:00
Andriy Voskoboinyk
e42e878b35 net80211: provide rate validation for injected frames.
There may be various side effects (device timeout, firmware and / or
kernel panic) when an invalid (or inapplicable - e.g., an MCS rate
for 11g-only device) is set; check rates before sending the frame to
the driver.

How-to-reproduce:
Set an MCS (real or bogus - with 0x80 bit set) rate in ibp_rate0 field
for any device that uses ieee80211_isratevalid() for rate checks -
rum(4), run(4), ural(4), bwi(4) or ral(4); if kernel is compiled
with INVARIANTS the check will result in "rate %d is basic/mcs?" panic.

Tested with WUSB54GC (rum(4)), AP mode.

MFC after:	1 week
2019-01-13 06:01:36 +00:00
Justin Hibbits
2da4e52d79 powerpcspe: Correct SPE high-component loading
Don't clobber the low part of the register when restoring its high component.
This could lead to very bad behavior if it's an ABI-affected register.

While here, also mark the asm volatile in the SPE high save case, to match
the load case.

Reported by:	Brandon Bergren (git_bdragon.rtk0.net)
MFC after:	1 week
2019-01-13 04:51:24 +00:00
Justin Hibbits
02f2e80c3f Add AT_HWCAP / AT_HWCAP2 to elf64_sysvec_v2.
Summary:
I was working on implementing ifuncs on powerpc64 elfv2 today, and I suddenly
realized that the reason I was having so much trouble with AT_HWCAP and
AT_HWCAP2 is they are missing from the sysentvec.

After adding them, the auxv is being filled like it should.

Submitted by:	Brandon Bergren (git_bdragon.rtk0.net)
Differential Revision: https://reviews.freebsd.org/D18575
2019-01-13 02:28:37 +00:00
Olivier Houchard
21fb66241a Regenerate sysent files after having modified syscalls.master. 2019-01-13 00:38:55 +00:00
Olivier Houchard
2ca357528f amd64 is the only arch that doesn't require padding for 32bits syscalls, so
instead of listing every arch that requires it, just exclude amd64.
2019-01-13 00:37:31 +00:00
Olivier Houchard
7045ac437b Instead of using an incomplete list of platforms that uses 64bits time_t
in 32bits mode, special case amd64, as i386 is the only arch that still
uses 32bits time_t.
2019-01-13 00:19:15 +00:00
Conrad Meyer
e49ec46114 amdtemp(4): Add support for Family 15h, Model >=60h
Family 15h is a bit of an oddball.  Early models used the same temperature
register and spec (mostly[1]) as earlier CPU families.

Model 60h-6Fh and 70h-7Fh use something more like Family 17h's Service
Management Network, communicating with it in a similar fashion.  To support
them, add support for their version of SMU indirection to amdsmn(4) and use
it in amdtemp(4) on these models.

While here, clarify some of the deviceid macros in amdtemp(4) that were
added with arbitrary, incorrect family numbers, and remove ones that were
not used.  Additionally, clarify intent and condition of heterogeneous
multi-socket system detection.

[1]: 15h adds the "adjust range by -49°C if a certain condition is met,"
which previous families did not have.

Reported by:	D. C. <tjoard AT gmail.com>
PR:		234657
Tested by:	D. C. <tjoard AT gmail.com>
2019-01-12 22:36:33 +00:00
Justin Hibbits
431d31e0bf powerpc/pseries: Cache the IPI vector to avoid the common static lookup
The IPI vector is static, and happens to be the most common interrupt by far
on some systems.  Rather than searching for the interrupt every time, cache
the index.

This appears to yield a small performance boost, of about 8% reduction in
buildworld times, on my POWER9 system, when paired with r342975.
2019-01-12 22:10:31 +00:00
Justin Hibbits
56505ec016 powerpc: Add opaque 'private data' to interrupt vectors
The XICS and XIVE need extra data beyond irq and vector.  Rather than
performing a separate search, it's better for the general interrupt facility
to hold a private pointer, since the search already must be done anyway at
that level.
2019-01-12 22:05:42 +00:00
Andrew Turner
be860eae0f Fix the check for the offset of td_frame and td_emuldata in struct thread.
Pointy hat:	andrew
Sponsored by:	DARPA, AFRL
2019-01-12 20:41:57 +00:00
Andriy Voskoboinyk
4367c2d177 net80211: fix possible panic for some drivers after r342211
Check if rate control structures were allocated before trying to
access them in various places; this was possible before on
allocation failure (unlikely), but was revealed after r342211
where allocation was deferred.

If the driver uses wlan_amrr(4) and it is loaded, it is
possible to reproduce the panic via

sysctl net.wlan.<number>.rate_stats

(for wlan0 the number will be 0).

Tested with: RTL8188EE, AP mode + RTL8188CUS, STA mode.

MFC after:	3 days
2019-01-12 14:57:12 +00:00
Andrew Turner
b3c0d957a2 Add support for the Clang Coverage Sanitizer in the kernel (KCOV).
When building with KCOV enabled the compiler will insert function calls
to probes allowing us to trace the execution of the kernel from userspace.
These probes are on function entry (trace-pc) and on comparison operations
(trace-cmp).

Userspace can enable the use of these probes on a single kernel thread with
an ioctl interface. It can allocate space for the probe with KIOSETBUFSIZE,
then mmap the allocated buffer and enable tracing with KIOENABLE, with the
trace mode being passed in as the int argument. When complete KIODISABLE
is used to disable tracing.

The first item in the buffer is the number of trace events that have
happened. Userspace can write 0 to this to reset the tracing, and is
expected to do so on first use.

The format of the buffer depends on the trace mode. When in PC tracing just
the return address of the probe is stored. Under comparison tracing the
comparison type, the two arguments, and the return address are traced. The
former method uses one entry per trace event, while the latter uses 4. As
such they are incompatible so only a single mode may be enabled.
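
The buffer layout described above can be sketched as a small userspace reader. This is an illustration only: the mode names, the helper names, and the 64-bit entry width are assumptions, and no ioctl or mmap call is shown. The order of the fields in a comparison record follows the order listed in the paragraph above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mode names for this sketch; not the kernel's constants. */
enum trace_mode { MODE_PC, MODE_CMP };

/* Entries consumed per trace event: 1 for PC tracing, 4 for comparisons. */
static size_t
entries_per_event(enum trace_mode mode)
{
	return (mode == MODE_PC ? 1 : 4);
}

/*
 * Return the return address recorded for event 'idx', or 0 if idx is out
 * of range.  buf[0] holds the event count; records start at buf[1].  For
 * comparison records the assumed order is type, arg1, arg2, return address.
 */
static uint64_t
event_return_address(const uint64_t *buf, size_t nentries,
    enum trace_mode mode, size_t idx)
{
	size_t stride = entries_per_event(mode);
	size_t first;

	assert(buf != NULL && nentries >= 1);
	if (idx >= buf[0])
		return (0);
	first = 1 + idx * stride;
	if (first + stride > nentries)
		return (0);
	return (buf[first + stride - 1]);
}
```

Since the stride differs between the two modes, a reader has to know which mode produced the buffer, which is consistent with only one mode being enabled at a time.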

KCOV is expected to help fuzzing the kernel, and while in development has
already found a number of issues. It is required for the syzkaller system
call fuzzer [1]. Other kernel fuzzers could also make use of it, either
with the current interface, or by extending it with new modes.

A man page is currently being worked on and is expected to be committed
soon, however having the code in the kernel now is useful for other
developers to use.

[1] https://github.com/google/syzkaller

Submitted by:	Mitchell Horne <mhorne063@gmail.com> (Earlier version)
Reviewed by:	kib
Testing by:	tuexen
Sponsored by:	DARPA, AFRL
Sponsored by:	The FreeBSD Foundation (Mitchell Horne)
Differential Revision:	https://reviews.freebsd.org/D14599
2019-01-12 11:21:28 +00:00
Hans Petter Selasky
f4dbf0d82d snd_uaudio: Add quirks for Edirol UA-25EX in advanced driver mode.
Extend the vendor class USB audio quirk to cover devices without
the USB audio control descriptor.

PR:			234794
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-01-12 11:14:59 +00:00
Navdeep Parhar
1dca7005b1 cxgbe(4): Move some INTx specific code to a more appropriate place. 2019-01-12 04:44:25 +00:00
Ram Kishore Vegesna
b9732f789d Remove accessing remote node and domain objects while processing cam actions.
Issue:
  ocs_fc(4) driver panics. It's induced by setting the port_state
sysctl to offline, then online, then offline, then online, and so
forth and so on in rapid succession.

Reason:
  While we set the port_state to online, FC discovery starts and the OS
enumerates the target disks by calling ocs_action(); setting the port state
to "offline" then deletes domain/sport/nodes.

  In ocs_action()->XPT_GET_TRAN_SETTINGS we are accessing the remote
node which can be invalid to get the wwpn, wwnn and port.

Fix:
  Removed accessing of remote node and domain in some ocs_action() cases.
  Populated the required values from ocs_fcport.
  This removes the dependency of node and domain structures while
processing XPT_PATH_INQ and XPT_GET_TRAN_SETTINGS.
   We will invalidate the target entries after the device lost
timeout(30 seconds).

Approved by: ken, mav
MFC after: 3 weeks
2019-01-11 15:59:24 +00:00
Andrew Turner
80e21aabea Fix the location of td->td_frame at the top of the kernel stack.
In cpu_thread_alloc we would allocate space for the trap frame at the top of
the kernel stack. This is just below the pcb, however due to a missing cast
the pointer arithmetic would use the pcb size, not the trapframe size. As
the pcb is larger than the trapframe this is safe, however later in cpu_fork
we include the cast leading to the two disagreeing on the location.

Fix by using the same arithmetic in both locations.
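
The effect of the missing cast can be shown with two stand-in structures. This is a sketch, not the arm64 code: `struct trapframe` and `struct pcb` here are hypothetical placeholders with arbitrary sizes, and the two helper functions exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real structures are much larger. */
struct trapframe { uint64_t regs[4]; };
struct pcb { uint64_t regs[16]; };

/* The mismatch is harmless only while the pcb is the larger structure. */
static_assert(sizeof(struct pcb) > sizeof(struct trapframe),
    "pcb must be larger than trapframe");

/* Without the cast the subtraction steps back by sizeof(struct pcb). */
static struct trapframe *
frame_without_cast(struct pcb *pcb)
{
	return ((struct trapframe *)(pcb - 1));
}

/* With the cast the subtraction steps back by sizeof(struct trapframe). */
static struct trapframe *
frame_with_cast(struct pcb *pcb)
{
	return ((struct trapframe *)pcb - 1);
}
```

Both results lie below the pcb, so neither overlaps it, but they are different addresses; two code paths that each compute the frame location must use the same form.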

Found by:	An early KASAN patch
Sponsored by:	DARPA, AFRL
2019-01-11 11:32:46 +00:00
Emmanuel Vadot
1f041ae141 Import DTS from Linux 4.20
MFC after:	2 months
2019-01-11 09:40:34 +00:00
Emmanuel Vadot
3dd1a009cd Import DTS includes from 4.19
This was missed in r340337

MFC after:	3 days
2019-01-11 09:20:18 +00:00
Fedor Uporov
6651cf410c Fix errno values returned from DUMMY_XATTR linuxulator calls
Reported by: weiss@uni-mainz.de
Reviewed by: markj
MFC after: 1 day
Differential Revision: https://reviews.freebsd.org/D18812
2019-01-11 07:58:25 +00:00
Sean Eric Fagan
82e20c0a72 Change ZFS quotas to return EINVAL when not present (matches man page).
UFS will return EINVAL when quotas are not enabled on a filesystem; ZFS'
equivalent involves not having quotas (there is no way to enable or disable
quotas as such).  My initial implementation had it return ENOENT, but
quotactl(2) indicates EINVAL is more appropriate.

MFC after:	2 weeks
Approved by:	mav
Reviewed by:	markj
Reported by:	Emrion <kmachine@free.fr>
Sponsored by: iXsystems Inc
PR: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234413
2019-01-11 02:53:46 +00:00
Andrey V. Elsukov
48266154de Relax requirement to packet size of CARP protocol and remove version check.
CARP shares protocol number 112 with VRRP (RFC 5798). And the size of
VRRP packet may be smaller than CARP. ipfw_chk() does m_pullup() to at
least sizeof(struct carp_header) and can fail when packet is VRRP. This
leads to packet drop and message about failed pullup attempt.
Also, RFC 5798 defines version 3 of VRRP protocol; this version number is
also unsupported by CARP and such check leads to packet drop.

carp_input() does its own checks for protocol version and packet size,
so we can remove these checks to be able to pass VRRP packets.

PR:		234207
MFC after:	1 week
2019-01-11 01:54:15 +00:00
Emmanuel Vadot
775d35749d dtb: allwinner: Add orangepi-pc to the build
PR:		226011
Submitted by:	Greg V <greg@unrelenting.technology>
MFC after:	1 week
2019-01-11 01:42:47 +00:00
Gleb Smirnoff
cc31f9821c Remove recursive NET_EPOCH_ENTER() from sysctl_ifmalist(), missed in r342872. 2019-01-11 00:45:22 +00:00
Gleb Smirnoff
49d9607a12 Remove support for FreeBSD 9 kernel, which used to change byte order
of packet headers.
2019-01-10 23:27:29 +00:00
Andrew Turner
9c871ab54a Fix a comment, pushed onto is two words.
While here make the comments sentences.

Sponsored by:	DARPA, AFRL
2019-01-10 16:31:07 +00:00
Andriy Voskoboinyk
fdc9504f51 rtwn_usb(4): add IQ calibration support for RTL8192CU
The code is similar to the one for RTL8188E* and probably
should be shared with RTL8188CE (needs to be tested).

Checked with RTL8188CUS, STA mode.

MFC after:	5 days
2019-01-10 05:49:47 +00:00
Andrey V. Elsukov
3b1522c229 Fix the build with INVARIANTS.
MFC after:	1 month
2019-01-10 02:01:20 +00:00
Andrey V. Elsukov
1cdf23bc03 Reduce the size of struct ip_fw_args from 240 to 128 bytes on amd64.
And refactor the code to avoid unneeded initialization to reduce overhead
of per-packet processing.

ipfw(4) can be invoked by pfil(9) framework for each packet several times.
Each call uses on-stack variable of type struct ip_fw_args to keep the
state of ipfw(4) processing. Currently this variable has 240 bytes size
on amd64.  Each time ipfw(4) does bzero() on it, and then it initializes
some fields.

glebius@ has reported that they at Netflix discovered, that initialization
of this variable produces significant overhead on packet processing.
After patching I managed to increase performance of packet processing on
simple routing with ipfw(4) firewalling by about 11%, from 9.8Mpps up to
11Mpps (Xeon E5-2660 v4@ + Mellanox 100G card).

Introduced new field flags, it is used to keep track of which fields were
initialized. Some fields were moved into the anonymous union, to reduce
the size. They all are mutually exclusive. dummypar field was unused, and
therefore it is removed.  The hopstore6 field type was changed from
sockaddr_in6 to a bit smaller struct ip_fw_nh6. And now the size of struct
ip_fw_args is 128 bytes.

ipfw_chk() was modified to properly handle ip_fw_args.flags instead of
relying on checking for NULL pointers.
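
The flags-based approach can be sketched as follows. All names here (`struct args`, `ARGS_F_NEXTHOP4`, `ARGS_F_NEXTHOP6`, and the helpers) are hypothetical and chosen for illustration; the real struct ip_fw_args has different members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag and field names; the real struct ip_fw_args differs. */
#define ARGS_F_NEXTHOP4	0x01
#define ARGS_F_NEXTHOP6	0x02

struct args {
	uint32_t flags;		/* which members below hold valid data */
	union {			/* mutually exclusive, so they share storage */
		uint32_t nh4;
		uint8_t nh6[16];
	};
	char scratch[64];	/* deliberately left uninitialized */
};

/* Only 'flags' is cleared; there is no bzero() of the whole structure. */
static void
args_init(struct args *a)
{
	a->flags = 0;
}

/* Store an IPv4 next hop and mark it, and only it, as valid. */
static void
args_set_nh4(struct args *a, uint32_t addr)
{
	a->nh4 = addr;
	a->flags &= ~(uint32_t)ARGS_F_NEXTHOP6;
	a->flags |= ARGS_F_NEXTHOP4;
}

/* Validity comes from the flag, not from a NULL or zero check. */
static int
args_get_nh4(const struct args *a, uint32_t *addr)
{
	assert(a != NULL && addr != NULL);
	if ((a->flags & ARGS_F_NEXTHOP4) == 0)
		return (0);
	*addr = a->nh4;
	return (1);
}
```

With this layout the per-call setup cost is a single store to `flags`, and readers never look at a member unless its flag says it was written.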

Reviewed by:	gallatin
Obtained from:	Yandex LLC
MFC after:	1 month
Sponsored by:	Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D18690
2019-01-10 01:47:57 +00:00
Gleb Smirnoff
c962ca9f2d Remove unnecessary ifdef. Without INVARIANTS all KASSERTs are empty statements,
so won't be compiled in.
2019-01-10 00:52:06 +00:00
Gleb Smirnoff
7b7f772fa0 Bring the comment up to date. 2019-01-10 00:37:14 +00:00
Gleb Smirnoff
bcc3cec43c Simplify sosetopt() so that function has single return point. No
functional change.
2019-01-10 00:25:12 +00:00
Brooks Davis
4f4ef03f5f style(9): fix the indent of a return. 2019-01-09 17:23:59 +00:00
Mark Johnston
c540c6d92c Complete the removal of obsolete ioctl handlers.
PR:		234706
Reviewed by:	imp
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18778
2019-01-09 17:23:08 +00:00
Mark Johnston
72755d285f Stop setting if_linkmib in vlan(4) ifnets.
There are several reasons:
- The structure being exported via IFDATA_LINKSPECIFIC doesn't appear
  to be a standard MIB.
- The structure being exported is private to the kernel and always
  has been.
- No other drivers in common use set the if_linkmib field.
- Because IFDATA_LINKSPECIFIC can be used to overwrite the linkmib
  structure, a privileged user could use it to corrupt internal
  vlan(4) state. [1]

PR:		219472
Reported by:	CTurt <ecturt@gmail.com> [1]
Reviewed by:	kp (previous version)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18779
2019-01-09 16:47:16 +00:00
Hans Petter Selasky
ef0111fdf3 Fix loopback traffic when using non-lo0 link local IPv6 addresses.
The loopback interface can only receive packets with a single scope ID,
namely the scope ID of the loopback interface itself. To mitigate this
packets which use the scope ID are appearing as received by the real
network interface, see "origifp" in the patch. The current code would
drop packets which are designated for loopback which use a link-local
scope ID in the destination address or source address, because they
won't match the lo0's scope ID. To fix this restore the network
interface pointer from the scope ID in the destination address for
the problematic cases. See comments added in patch for a more detailed
description.

This issue was introduced with route caching (ae@).

Reviewed by:		bz (network)
Differential Revision:	https://reviews.freebsd.org/D18769
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-01-09 14:28:08 +00:00
Andriy Voskoboinyk
7071b803da net80211: fix panic when device is removed during initialization
if_dead() is called during device detach - check if interface
still exists before trying to refresh vap MAC address
(IF_LLADDR will trigger page fault otherwise).

MFC after:	5 days
2019-01-09 12:50:24 +00:00