Commit Graph

136068 Commits

Author SHA1 Message Date
cem
d19a408272 x86: Halt non-BSP CPUs on panic IPI_STOP
We may need the BSP to reboot, but we don't need any AP CPU that isn't the
panic thread.  Any CPU landing in this routine during panic isn't the panic
thread, so we can just detect !BSP && panic and shut down the logical core.

The savings can be demonstrated in a bhyve guest with multiple cores; before
this change, N guest threads would spin at 100% CPU.  After this change,
only one or two threads spin (depending on if the panicing CPU was the BSP
or not).

Konstantin points out that this may break any future patches which allow
switching ddb(4) CPUs after panic and examining CPU-local state that cannot
be inspected remotely.  In the event that such a mechanism is incorporated,
this behavior could be made configurable by tunable/sysctl.

Reviewed by:	kib
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20019
2019-04-24 18:24:22 +00:00
br
c8164f0828 Add support for Cadence network controller found in HiFive Unleashed board.
Reviewed by:	markj
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19798
2019-04-24 13:44:30 +00:00
br
dd64f72a22 Implement pic_pre_ithread(), pic_post_ithread().
Reviewed by:	markj
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19819
2019-04-24 13:41:46 +00:00
gallatin
7a82a42584 iflib: Add pfil hooks
As with mlx5en, the idea is to drop unwanted traffic as early
in receive as possible, before mbufs are allocated and anything
is passed up the stack.  This can save considerable CPU time
when a machine is under a flooding style DOS attack.

The major change here is to remove the unneeded abstraction where
callers of rxd_frag_to_sd() get back a pointer to the mbuf ring, and
are responsible for NULL'ing that mbuf themselves. Now this happens
directly in rxd_frag_to_sd(), and it returns an mbuf. This allows us
to use the decision (and potentially mbuf) returned by the pfil
hooks. The driver can now recycle mbufs to avoid re-allocation when
packets are dropped.

Reviewed by:	marius  (shurd and erj also provided feedback)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19645
2019-04-24 13:32:04 +00:00
ae
97ddb4fef9 Add GRE-in-UDP encapsulation support as defined in RFC8086.
This GRE-in-UDP encapsulation allows the UDP source port field to be
used as an entropy field for load-balancing of GRE traffic in transit
networks. Also most of multiqueue network cards are able distribute
incoming UDP datagrams to different NIC queues, while very little are
able do this for GRE packets.

When an administrator enables UDP encapsulation with command
`ifconfig gre0 udpencap`, the driver creates kernel socket, that binds
to tunnel source address and after udp_set_kernel_tunneling() starts
receiving of all UDP packets destined to 4754 port. Each kernel socket
maintains list of tunnels with different destination addresses. Thus
when several tunnels use the same source address, they all handled by
single socket.  The IP[V6]_BINDANY socket option is used to be able bind
socket to source address even if it is not yet available in the system.
This may happen on system boot, when gre(4) interface is created before
source address become available. The encapsulation and sending of packets
is done directly from gre(4) into ip[6]_output() without using sockets.

Reviewed by:	eugen
MFC after:	1 month
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D19921
2019-04-24 09:05:45 +00:00
jhibbits
a8ca8ce89b powerpc: Add a couple missing isyncs
mtmsr and mtsr require context synchronizing instructions to follow.  Without
a CSI, there's a chance for a machine check exception.  This reportedly does
occur on a MPC750 (PowerMac G3).

Reported by:	Mark Millard
2019-04-24 02:51:58 +00:00
kevans
21f6670fd2 fdt: stop installing FDT_DTS_FILE
r346307 inadvertently started installing FDT_DTS_FILE along with the kernel.
While this isn't necessarily bad, it was not intended or discussed and it
actively breaks some current setups that don't anticipate any .dtb being
installed when it's using static fdt. This change could be reconsidered down
the line, but it needs to be done with prior discussion.

Fix it by pushing FDT_DTS_FILE build down into the raw dtb.build.mk bits.
This technically allows modules building DTS to accidentally specify an
FDT_DTS_FILE that gets built but isn't otherwise useful (since it's not
installed), but I suspect this isn't a big deal and would get caught with
any kind of testing -- and perhaps this might end up useful in some other
way, for example by some module wanting to embed fdt in some other way than
our current/normal mechanism.

Reported by:	Mori Hiroki <yamori813@yahoo.co.jp>
MFC after:	3 days
X-MFC-With:	r346307
2019-04-24 01:11:50 +00:00
dchagin
5532ebf392 Since r339624 HEAD does not need for backslashes in syscalls.master,
however to make a merge r345471 to the stable add backslashes
to the syscalls.master.

MFC after:	3 days
2019-04-23 18:10:46 +00:00
kevans
82762c6336 tun(4): Defer clearing TUN_OPEN until much later
tun destruction will not continue until TUN_OPEN is cleared. There are brief
moments in tunclose where the mutex is dropped and we've already cleared
TUN_OPEN, so tun_destroy would be able to proceed while we're in the middle
of cleaning up the tun still. tun_destroy should be blocked until these
parts (address/route purges, mostly) are complete.

PR:		233955
MFC after:	2 weeks
2019-04-23 17:28:28 +00:00
cem
48c2cd9457 ip6_randomflowlabel: Avoid blocking if random(4) is not available
If kern.random.initial_seeding.bypass_before_seeding is disabled, random(4)
and arc4random(9) will block indefinitely until enough entropy is available
to initially seed Fortuna.

It seems that zero flowids are perfectly valid, so avoid blocking on random
until initial seeding takes place.

Discussed with:	bz (earlier revision)
Reviewed by:	thj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20011
2019-04-23 17:18:20 +00:00
luporl
94e1c75ad9 [PPC64] Fix wrong KASSERT in mphyp_pte_insert()
As mphyp_pte_unset() can also remove PTE entries, and as this can
happen in parallel with PTEs evicted by mphyp_pte_insert(), there
is a (rare) chance the PTE being evicted gets removed before
mphyp_pte_insert() is able to do so. Thus, the KASSERT should
check wether the result is H_SUCCESS or H_NOT_FOUND, to avoid
panics if the situation described above occurs.

More details about this issue can be found in PR 237470.

PR:		237470
Reviewed by:	jhibbits
Differential Revision:	https://reviews.freebsd.org/D20012
2019-04-23 17:11:45 +00:00
cem
16e165fb1d netdump: Fix !COMPAT_FREEBSD11 unused variable warning
Reported by:	Ralf Wenk <iz-rpi03_hs-karlsruhe.de>
Sponsored by:	Dell EMC Isilon
2019-04-23 17:05:57 +00:00
emaste
d359c6bb14 Enable Mellanox drivers (modules) on AArch64
Tested by Greg V with mlx5en on an Ampere eMAG instance at Packet.com on
c2.large.arm (with some additional uncommitted PCIe WIP).

PR:		237055
Submitted by:	Greg V <greg@unrelenting.technology>
Reviewed by:	hselasky
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D19983
2019-04-23 15:11:01 +00:00
kib
a7264374f6 poib: assign link-local address according to RFC
RFC 4391 specifies that the IB interface GID should be re-used as IPv6
link-local address.  Since the code in in6_get_hw_ifid() ignored
IFT_INFINIBAND case, ibX interfaces ended up with the local address
borrowed from some other interface, which is non-compliant.

Use lowest eight bytes from GID for filling the link-local address,
same as Linux.

Reviewed by:	bz (previous version), ae, hselasky, slavash,
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D20006
2019-04-23 12:23:44 +00:00
bz
87874d0b3e iFix udp_output() lock inconsistency.
In r297225 the initial INP_RLOCK() was replaced by an early
acquisition of an r- or w-lock depending on input variables
possibly extending the write locked area for reasons not entirely
clear but possibly to avoid a later case of unlock and relock
leading to a possible race condition and possibly in order to
allow the route cache to work for connected sockets.

Unfortunately the conditions were not 1:1 replicated (probably
because of the route cache needs). While this would not be a
problem the legacy IP code compared to IPv6 has an extra case
when dealing with IP_SENDSRCADDR. In a particular case we were
holding an exclusive inp lock and acquired the shared udbinfo
lock (now epoch).
When then running into an error case, the locking assertions
on release fired as the udpinfo and inp lock levels did not match.

Break up the special case and in that particular case acquire
and udpinfo lock depending on the exclusitivity of the inp lock.

MFC After:	9 days
Reported-by:	syzbot+1f5c6800e4f99bdb1a48@syzkaller.appspotmail.com
Reviewed by:	tuexen
Differential Revision:	https://reviews.freebsd.org/D19594
2019-04-23 10:12:33 +00:00
wma
c45cce178c This patch offers a workaround to buf_ring reordering
visible on armv7 and armv8. Similar issue to rS302292.

Obtained from:         Semihalf
Authored by:           Michal Krawczyk <mk@semihalf.com>
Approved by:           wma
Differential Revision: https://reviews.freebsd.org/D19932
2019-04-23 06:36:32 +00:00
jhibbits
4d72e303d4 [PowerPC64] pseries-llan: increment packet output counters on error and success
Summary: when using pseries-llan driver, Opkts and Oerrs counters (netstat
-i) are always zero. This patch adds an small error handling to increment
these counters.

Submitted by:	alfredo.junior_eldorado.org.br
Differential Revision: https://reviews.freebsd.org/D20009
2019-04-23 03:19:03 +00:00
jhibbits
405274bad5 powerpc64/pseries: Fix hypervisor call with extra arguments
Some hypervisor calls, such as H_SEND_LOGICAL_LAN, take more arguments than
are traditionally passed in registers.  The HCALL ABI will accept these
arguments in r11 and r12.  With ELFv2 ABI, these arguments are 2
double-words lower than ELFv1 ABI, as two double-words in the stack frame
are no longer used, and therefore removed from the frame.  Fix the offsets
for loading the registers for the HCALL.  This fixes the phyp_llan driver
with ELFv2 kernel.

Submitted by:	alfredo.junior_eldorado.org.br
Differential Revision:	https://reviews.freebsd.org/D20008
2019-04-23 03:05:26 +00:00
hselasky
8019d4dcb5 Revert r346530 until further.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-04-22 19:36:19 +00:00
gallatin
accdb3810d Track device's NUMA domain in ifnet & alloc ifnet from NUMA local memory
This commit adds new if_alloc_domain() and if_alloc_dev() methods to
allocate ifnets.  When called with a domain on a NUMA machine,
ifalloc_domain() will record the NUMA domain in the ifnet, and it will
allocate the ifnet struct from memory which is local to that NUMA
node.  Similarly, if_alloc_dev() is a wrapper for if_alloc_domain
which uses a driver supplied device_t to call ifalloc_domain() with
the appropriate domain.

Note that the new if_numa_domain field fits in an alignment pad in
struct ifnet, and so does not alter the size of the structure.

Reviewed by:	glebius, kib, markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19930
2019-04-22 19:24:21 +00:00
np
642cf308aa cxgbe/t4_tom: Add a "TCB history" feature that samples hardware state
for a tid and maintains a running history of some interesting events.

Service TCP_INFO queries from the history when the tid is being tracked
there.
2019-04-22 17:48:10 +00:00
np
6f5b0ae5af cxgbe(4): Make sure bundled_fw is always initialized before use.
This fixes a bug that prevented the driver from auto-flashing the
firmware when it didn't see one on the card.  This feature was
introduced in r321390 and this bug was introduced in r343269.

Reported by:	gallatin@
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-04-22 17:00:30 +00:00
bz
b1a35f1113 r297225 move the assignment of sin from add to the top of the function.
sin is not changed after the initial assignment, so no need to set it again.

MFC after:	10 days
2019-04-22 14:53:53 +00:00
bz
e6c9a8836e Remove some excessive brackets.
No functional change.

MFC after:	10 days
2019-04-22 14:20:49 +00:00
markj
473a07e47c Clarify the relationship between INVARIANTS and DIAGNOSTIC a bit.
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2019-04-22 11:31:13 +00:00
markj
056f72107f Disable vm map consistency checking by default on INVARIANTS kernels.
The checks are too expensive for a general-purpose kernel.  Enable the
checks when DIAGNOSTIC is defined and provide a sysctl to enable the
checks in a non-DIAGNOSTIC INVARIANTS kernel.

Reviewed by:	kib
Discussed with:	Doug Moore <dougm@rice.edu>
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19999
2019-04-22 11:23:35 +00:00
hselasky
8752233742 Fix build for mips and powerpc after r346530.
Need to include sys/kernel.h to define SYSINIT() which is used
by sys/eventhandler.h .

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-04-22 08:32:00 +00:00
hselasky
b65993070e Fix panic in network stack due to memory use after free in relation to
fragmented packets.

When sending IPv4 and IPv6 fragmented packets and a fragment is lost,
the mbuf making up the fragment will remain in the temporary hashed
fragment list for a while. If the network interface departs before the
so-called slow timeout clears the packet, the fragment causes a panic
when the timeout kicks in due to accessing a freed network interface
structure.

Make sure that when a network device is departing, all hashed IPv4 and
IPv6 fragments belonging to it, get freed.

Backtrace:
panic()
icmp6_reflect()

hlim = ND_IFINFO(m->m_pkthdr.rcvif)->chlim;
^^^^ rcvif->if_afdata[AF_INET6] is NULL.

icmp6_error()
frag6_freef()
frag6_slowtimo()
pfslowtimo()
softclock_call_cc()
softclock()
ithread_loop()

Differential Revision:	https://reviews.freebsd.org/D19622
Reviewed by:		bz (network), adrian
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-04-22 07:27:24 +00:00
cem
4120e886df gnop(8): Nopify configuration as a kernel dump device
As a dummy / no-op dump device, to facilitate dumpon(8) testing.

Reviewed by:	markj (earlier version)
Differential Revision:	https://reviews.freebsd.org/D19991
2019-04-22 03:25:49 +00:00
mav
b537b150e3 Report DIF protection type the disk is formatted with.
Some disks formatted with protection report errors if written without
protection used.  This should help to diagnose the problem.

MFC after:	2 weeks
2019-04-22 01:08:14 +00:00
rmacklem
20a126d20f Add #ifdef INET as requested by bz@. 2019-04-21 22:53:51 +00:00
mav
956606fe26 Polish SCSI sense data validity checks.
According to specs and common sense, all sense data reported in descriptor
format should be valid.  But practice shows different, some devices return
descriptors with invalid data, resulting in error messages looking worse.

Decouple block/stream commands sense data and information field printing.
Looking on present specs, there are much more cases when those fields are
not related, and incomplete old code was not printing valid sense data and
leaving empty lines for invalid.

MFC after:	2 weeks
2019-04-21 19:07:03 +00:00
ian
d8c1004be9 Move the reporting of spurious interrupts under bootverbose control, because
occasional spurious interrupts are a normal thing on this hardware.  Also,
change the name of the cpu-local interrupt controller driver from local_intc
to lintc, because the name gets built into interrupt names, which have to
fit into a 19-byte field for stats reporting (so this allows 5 more bytes
of the actual interrupt name to be displayed).
2019-04-21 17:39:01 +00:00
adrian
1205340c36 [ath] [ath_hal] [ath_hal_9300] Extend the start PCU receive to handle resetting ANI.
One of the fun issues with scanning has been how the existing
ANI values were programmed into the hardware when channels were
changed.  If you're on a really crappy channel and ANI has made
you deaf then when you scan you continue to be deaf on all channels.

This code passes in a flag to startpcureceive which in AR5416 and later
is also used to enable ANI.  This allows it to know if it's a normal
operation or a scan operation.

This fixes my situation at home where a temporary spot of a device
going deaf due to interference starts scanning and .. can't hear
anything until I restart.

Now, this isn't the full fix - ideally:

(a) all the ANI config and per-channel information would be migrated
     to the shared HAL stuff and enabled for all of the NICs;
(b) when a station reassociates and some other error conditions
    (like missed beacons, NF calibration failures, etc) a knob
    to reset ANI parameters would likely help recovery.

But hey, I'm committing bits of code again! woo!

Tested:

* AR9344 (2G), STA operation
2019-04-21 02:36:01 +00:00
wulf
e56fffc7f2 psm(4): give names to synaptics commands
Submitted by:	Ben LeMasurier <ben@crypt.ly>
MFC after:	2 weeks
2019-04-20 21:06:12 +00:00
wulf
b07da6e7b4 psm(4): respect tap_disabled configuration with enabled Extended support
This fixes a bug where, even when hw.psm.tap_enabled=0, touchpad taps
were processed.
tap_enabled has three states: unconfigured, disabled, and enabled (-1, 0, 1).
To respect PR kern/139272, taps are ignored only when explicity disabled.

Submitted by:	Ben LeMasurier <ben@crypt.ly> (initial version)
MFC after:	2 weeks
2019-04-20 21:04:56 +00:00
wulf
68284b4a20 psm(4): do not process gestures when palm is present
Ignoring of gesture processing when the palm is detected helps to reduce
some of the erratic pointer behavior.

This fixes regression introduced in r317814

Reported by:	Ben LeMasurier <ben@crypt.ly>
MFC after:	2 weeks
2019-04-20 21:02:41 +00:00
wulf
c9e5830e63 psm(4): Add support for 4 and 5 finger touches in synaptics driver
While 4-th and 5-th finger positions are not exported through PS/2
interface, total number of touches is reported by MT trackpads.

MFC after:	2 weeks
2019-04-20 21:00:44 +00:00
cem
926ea6e16b netdump: Fix 11 compatibility DIOCSKERNELDUMP ioctl
The logic was present for the 11 version of the DIOCSKERNELDUMP ioctl, but
had not been updated for the 12 ABI.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D19980
2019-04-20 16:07:29 +00:00
emaste
3ea680f19f Enable ioremap for aarch64 in the LinuxKPI
Required for Mellanox drivers (e.g. on Ampere eMAG at Packet.com).

PR:		237055
Submitted by:	Greg V <greg@unrelenting.technology>
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D19987
2019-04-20 15:57:05 +00:00
asomers
fdcd6f1673 Use symlinks for kernel modules rather than hardlinks
When aliasing a kernel module to a different name (ie if_igb for if_em),
it's better to use symlinks than hard links. kldxref will omit entries for
the links, ensuring that the loaded module has the correct name.

Reviewed by:	imp
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19979
2019-04-20 12:51:05 +00:00
markj
9b1b8919b2 Export cpu_core from opensolaris.ko.
It is referenced by dtrace*.ko.

PR:		191462
Submitted by:	me.freebsd@cgf.cx
MFC after:	1 week
2019-04-20 11:34:53 +00:00
ganbold
a7e393156a Add SY8106A Buck Regulator and Allwinner CIR devices to GENERIC arm64 kernel. 2019-04-20 03:21:47 +00:00
jhibbits
e878595678 powerpc64/powernv: Relax flash block write requirements
Since writes don't necessarily need to be on erase-block boundaries, we can
relax the block size and alignments down to sector size.  If it needs to be
erased, opalflash_erase() will check proper alignment and size.
2019-04-20 02:44:38 +00:00
rmacklem
1422b10126 Add support for the ModeSetMasked attribute to the NFSv4.1 server.
I do not know of an extant NFSv4.1 client that currently does a Setattr
operation for the ModeSetMasked, but it has been discussed on the linux-nfs
mailing list.
This patch adds support for doing a Setattr of ModeSetMasked, so that it
will work for any future NFSv4.1 client that chooses to do so.
Tested via a hacked FreeBSD NFSv4.1 client.

MFC after:	2 weeks
2019-04-19 23:35:08 +00:00
rmacklem
a359c7f26f Replace "vp" with NULL to make the code more readable.
At the time of this nfsv4_sattr() call, "vp == NULL", so this patch doesn't
change the semantics, but I think it makes the code more readable.
It also makes it consistent with the nfsv4_sattr() call a few lines above
this one. Found during code inspection.

MFC after:	2 weeks
2019-04-19 23:27:23 +00:00
cem
8cd2fbf3e0 Revert r346410 and r346411
libkern in .PATH has too many filename conflicts with libc and my -DNO_CLEAN
tinderbox didn't catch that ahead of time.  Mea culpa.
2019-04-19 22:08:17 +00:00
cem
3288ae0d72 kernel build: Disable unhelpful GCC warning (tripped after r346352)
-Wformat-zero-length does not highlight any particularly wrong code and it
is especially meaningless for device_printf().  Turn it off entirely to
remove a source of false positives.

Sponsored by:	Dell EMC Isilon
2019-04-19 20:08:45 +00:00
cem
f521967f0b Bump __FreeBSD_version after r346410 2019-04-19 20:06:22 +00:00
cem
316c180eb7 libkern: Bring in arc4random_uniform(9) from libc
It is a useful arc4random wrapper in the kernel for much the same reasons as
in userspace.  Move the source to libkern (because kernel build is
restricted to sys/, but userspace can include any file it likes) and build
kernel and libc versions from the same source file.

Copy the documentation from arc4random_uniform(3) to the section 9 page.

While here, add missing arc4random_buf(9) symlink.

Sponsored by:	Dell EMC Isilon
2019-04-19 20:05:47 +00:00
tuexen
a26060a087 When an IPv6 packet is received for a raw socket which has the
IPPROTO_IPV6 level socket option IPV6_CHECKSUM enabled and the
checksum check fails, drop the message. Without this fix, an
ICMP6 message was sent indicating a parameter problem.

Thanks to bz@ for suggesting a way to simplify this fix.

Reviewed by:		bz@
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D19969
2019-04-19 18:09:37 +00:00
adrian
b8eb7a0fb4 [ath] Fix return value check to not complain.
Compilers complain more about things, so let's keep them happy.
2019-04-19 18:00:33 +00:00
tuexen
36983a7b31 When a checksum has to be computed for a received IPv6 packet because it
is requested by the application using the IPPROTO_IPV6 level socket option
IPV6_CHECKSUM on a raw socket, ensure that the packet contains enough
bytes to contain the checksum at the specified offset.

Reported by:		syzbot+6295fcc5a8aced81d599@syzkaller.appspotmail.com
Reviewed by:		bz@
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D19968
2019-04-19 17:28:28 +00:00
tuexen
ac889cf71b Avoid a buffer overwrite in rip6_output() when computing the checksum
as requested by the user via the IPPROTO_IPV6 level socket option
IPV6_CHECKSUM. The check if there are enough bytes in the packet to
store the checksum at the requested offset was wrong by 1.

Reviewed by:		bz@
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D19967
2019-04-19 17:21:35 +00:00
tuexen
3e538ee797 Improve input validation for the socket option IPV6_CHECKSUM.
When using the IPPROTO_IPV6 level socket option IPV6_CHECKSUM on a raw
IPv6 socket, ensure that the value is either -1 or a non-negative even
number.

Reviewed by:		bz@, thj@
MFC after:		1 week
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D19966
2019-04-19 17:17:41 +00:00
thj
56e95f6462 Add stat counter for ipv6 atomic fragments
Add a stat counter to track ipv6 atomic fragments. Atomic fragments can be
generated in response to invalid path MTU values, but are also a potential
attack vector and considered harmful (see RFC6946 and RFC8021).

While here add tracking of the atomic fragment counter to netstat and systat.

Reviewed by:    tuexen, jtl, bz
Approved by:    jtl (mentor), bz (mentor)
Event:  Aberdeen hackathon 2019
Differential Revision:  https://reviews.freebsd.org/D17511
2019-04-19 17:06:43 +00:00
mav
17e894d1e9 Change the way FreeBSD GID inheritance is hacked.
I believe previous ifdef caused NULL dereference in later zfs_log_create()
on attempt to create file inside directory belonging to ephemeral group
created on illumos, trying to write to log information about GID domain
of the newly created file, inheriting the ephemeral GID.

This patch reuses original illumos SGID code with exception that due to
lack of ID mapping code on FreeBSD ephemeral GID will turn into GID_NOBODY
by another ifdef inside zfs_fuid_map_id().

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2019-04-19 15:44:45 +00:00
tychon
35c95d331a remove the 4GB boundary requirement on PCI DMA segments
Reviewed by:	kib
Discussed with:	jhb
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19867
2019-04-19 13:43:33 +00:00
rmacklem
9454609db3 Fix the NFSv4.0 server so that it does not support NFSv4.1 attributes.
During inspection of a packet trace, I noticed that an NFSv4.0 mount
reported that it supported attributes that are only defined for NFSv4.1.
In practice, this bug appears to be benign, since NFSv4.0 clients will
not use attributes that were added for NFSv4.1.
However, this was not correct and this patch fixes the NFSv4.0 server
so that it only supports attributes defined for NFSv4.0.
It also adds a definition for NFSv4.1 attributes that can only be set,
although it is only defined as 0 for now.
This is anticipation of the addition of support for the NFSv4.1 mode+mask
attribute soon.

MFC after:	2 weeks
2019-04-19 03:36:22 +00:00
cem
2d9b775121 Update to Zstandard 1.4.0
The full release notes can be found on Github:

  https://github.com/facebook/zstd/releases/tag/v1.4.0

Relnotes:	yes
2019-04-19 02:54:13 +00:00
jhibbits
b33106f398 powerpc/powernv: Make erasing before writes optional
If the OPAL flash driver supports writing without erase, it adds a
'no-erase' property to the flash device node.  Honor that property and don't
bother erasing if it exists.
2019-04-19 02:28:04 +00:00
jhb
b4bd29249c Push down INP_WLOCK slightly in tcp_ctloutput.
The inp lock is not needed for testing the V6 flag as that flag is set
once when the inp is created and never changes.  For non-TCP socket
options the lock is immediately dropped after checking that flag.
This just pushes the lock down to only be acquired for TCP socket
options.

This isn't a hot-path, more a cosmetic cleanup I noticed while reading
the code.

Reviewed by:	bz
MFC after:	1 month
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19740
2019-04-18 23:21:26 +00:00
imp
7b5943709e When parsing command line stuff, treat tabs and spaces the same.
When creating complex config files, people like to use tabs to offset
sections. Treat them the same as spaces for delimiters.
2019-04-18 22:52:12 +00:00
cem
4c79b5f69c random(4): Restore availability tradeoff prior to r346250
As discussed in that commit message, it is a dangerous default.  But the
safe default causes enough pain on a variety of platforms that for now,
restore the prior default.

Some of this is self-induced pain we should/could do better about; for
example, programmatic CI systems and VM managers should introduce entropy
from the host for individual VM instances.  This is considered a future work
item.

On modern x86 and Power9 systems, this may be wholly unnecessary after
D19928 lands (even in the non-ideal case where early /boot/entropy is
unavailable), because they have fast hardware random sources available early
in boot.  But D19928 is not yet landed and we have a host of architectures
which do not provide fast random sources.

This change adds several tunables and diagnostic sysctls, documented
thoroughly in UPDATING and sys/dev/random/random_infra.c.

PR:		230875 (reopens)
Reported by:	adrian, jhb, imp, and probably others
Reviewed by:	delphij, imp (earlier version), markm (earlier version)
Discussed with:	adrian
Approved by:	secteam(delphij)
Relnotes:	yeah
Security:	related
Differential Revision:	https://reviews.freebsd.org/D19944
2019-04-18 20:48:54 +00:00
hselasky
d9340a46aa Implement flag for telling cuse(3) clients if the peer is running in 32-bit
compat mode or not. This is useful when implementing compatibility ioctl(2)
handlers in userspace.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-04-18 19:04:07 +00:00
kib
870e4adb34 Use correct type name.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2019-04-18 15:31:03 +00:00
kib
9da7c1a266 Correct handling of RMRR during early enumeration stages.
On some machines, DMAR contexts must be created before all devices
under the scope of the corresponding DMAR unit are enumerated.
Current code has two problems with that:
- scope lookup returns NULL device_t, which causes to skip creating a
  context with RMRR, which is fatal for the affected device.
- calculation of the final pci dbsf address fails if any bridge in the
  scope is not yet enumerated, because code relies on pcib_get_bus().

Make creation of contexts work either with device_t, or with DMAR PCI
scope paths.  Scope provides enough information to infer context
address, and it is directly matched against DMAR tables scopes.

When calculating bus addresses for the scope or device, use direct
pci_cfgregread(PCIR_SECBUS_1) to get the secondary bus number, instead
of pcib_get_bus().

The issue was observed on HP Gen servers, where iLO PCI devices are
located behind south bridge switch.  Turning on translation without
satisfying RMRR requests caused iLO to mostly hang, up to the level of
being unusable to control the server.

While there, remove hw.dmar.dmar_match_verbose tunable, and make the
normal logging under bootverbose useful and sufficient to diagnose
DRHD and RMRR parsing and matching.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2019-04-18 14:18:06 +00:00
kib
1995e4dec9 Remove witness warning. dmar_bus_dmamap_create() does not sleep.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2019-04-18 14:03:59 +00:00
kib
f9ebd5245d Reduce verbosity, do not announce details of irte programming by default.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2019-04-18 14:02:33 +00:00
kp
d88fa777c9 pf: No need to M_NOWAIT in DIOCRSETTFLAGS
Now that we don't hold a lock during DIOCRSETTFLAGS memory allocation we can
use M_WAITOK.

MFC after:	1 week
Event:		Aberdeen hackathon 2019
Pointed out by:	glebius@
2019-04-18 11:37:44 +00:00
manu
bf879e13c9 arm: allwinner: Fix audio for Allwinner H3/H5
Due to three conditions the codec driver for Allwinner A10/A20 and H3/H5 did not work properly here:

    Wrong bit position for the analog audio reset
    Hardware Reset of codec was not de-asserted correctly
    Linux DTS file did not contain the address of the analog register the way as the driver was expecting it.

This patch proposes fixes for those three parts.

Submitted by:	freebsdnewbie@freenet.de (Manuel Stühn)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D19910
2019-04-17 21:45:19 +00:00
manu
c3ff664282 ofw_graph: Add functions for graph bindings
Those functions are helpers to work on graph bindings.
graphs are mostly use with video related devices.
See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/graph.txt?id=4436a3711e3249840e0679e92d3c951bcaf25515

MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D19877
2019-04-17 20:09:01 +00:00
kevans
377f5a8be5 Compile sha1.c when ether support is included
sha1 is used by ether_gen_addr after r346324. Perhaps in an ideal world we
could detect that the kernel's been compiled without sha1_* bits included
and silently fallback to arc4random instead because these platforms/kernel
configs are far and few between. It's fairly lightweight, though, so just
include it for now.
2019-04-17 18:08:28 +00:00
kevans
84233445c4 iflib: Use new ether_gen_addr, restricting addresses to that subset
Differential Revision:	https://reviews.freebsd.org/D19587
2019-04-17 17:19:54 +00:00
kevans
9902f46a44 net: adjust randomized address bits
Give devices that need a MAC a 16-bit allocation out of the FreeBSD
Foundation OUI range. Change the name ether_fakeaddr to ether_gen_addr now
that we're dealing real MAC addresses with a real OUI rather than random
locally-administered addresses.

Reviewed by:	bz, rgrimes
Differential Revision:	https://reviews.freebsd.org/D19587
2019-04-17 17:18:43 +00:00
kp
9fe8ed111f pf: Fix panic on invalid DIOCRSETTFLAGS
If during DIOCRSETTFLAGS pfrio_buffer is NULL copyin() will fault, which we're
not allowed to do with a lock held.
We must count the number of entries in the table and release the lock during
copyin(). Only then can we re-acquire the lock. Note that this is safe, because
pfr_set_tflags() will check if the table and entries exist.

This was discovered by a local syzcaller instance.

MFC after:	1 week
Event:		Aberdeen hackathon 2019
2019-04-17 16:42:54 +00:00
ian
ff419c455c Only set up the interrupts that will actually be used in arm generic_timer.
The code previously set up interrupt handlers for all the interrupt
resources available, including for timers that are not in use.  That could
lead to interrupt storms.  For example, if boot firmware enabled the virtual
timer but the kernel is using the physical timer, it could get flooded with
interrupts on the virtual timer which it cannot shut off.  By only setting
up an interrupt handler for the hardware that will actually be used, any
interrupts from other timer units will remain masked in the interrupt
controller.

Differential Revision:	https://reviews.freebsd.org/D19871
2019-04-17 15:27:11 +00:00
kevans
b658113064 fdt: further consolidate DTB building and revise manpage
FDT_DTS_FILE was built separately with a rule in sys/conf/files and
recreated the rules we used in dtb.mk. Now that we have other infrastructure
to build a DTB along with the kernel, fold FDT_DTS_FILE into that since it
doesn't have any special requirements.

fdt(4) never got revised to mention the DTS/DTSO make options, so do that
now.

Reviewed by:	imp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D19736
2019-04-17 03:29:16 +00:00
manu
788090ecde arm: allwinner: Makes more device optional
MFC after:	2 weeks
2019-04-16 22:42:50 +00:00
manu
e5bc73bddd arm: Order files.arm to have cloudabi and annapurna sections
MFC after:	2 weeks
2019-04-16 20:06:39 +00:00
manu
94008dd35f arm: Add kern_clocksource.c directly in files.arm
This files is needed and included in all our config so move it to a common
location.

MFC after:	2 weeks
2019-04-16 20:04:22 +00:00
kib
7fc477c2c0 Fix initial x87 state after r345562.
After the referenced commit, we did not set x87 and sse valid bits in
the xstate_bv bitmask for initial fpu state (stored in memory), when
using XSAVE.

The state is loaded into FPU register file to initialize the process
FPU state, and since both bits were clear, the default x87 and SSE
states were loaded.  By chance, FreeBSD ABI SSE2 state is same as FPU
initial state, so the bug is not visible for 64bit processes.  But on
i386, the precision control should be set to double (53bit mantissa),
instead of the default double extended (64bit mantissa). For 32bit
processes on amd64, kernel reloads control word with the right mask,
which only left native i386 and amd64 native but using x87 as
affected.

Fix it by setting minimal required xstate_bv mask.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-04-16 19:46:02 +00:00
manu
d6c1d3bf63 allwinner: clk: Garbage collect old clock implementation
The old clocks are disconneted from the build since r337344.
Remove all those pseudo drivers. The only one remaining is for gmac
(the ethernet controller) so move it to sys/arm/allwinner.
While here remove a83t support from gmacclk as it is unneeded since r326114.

MFC after:	1 month
2019-04-16 19:38:16 +00:00
cem
fbf32f5f72 stack_protector: Add tunable to bypass random cookies
This is a stopgap measure to unbreak installer/VM/embedded boot issues
introduced (or at least exposed by) in r346250.

Add the new tunable, "security.stack_protect.permit_nonrandom_cookies," in
order to continue boot with insecure non-random stack cookies if the random
device is unavailable.

For now, enable it by default.  This is NOT safe.  It will be disabled by
default in a future revision.

There is follow-on work planned to use fast random sources (e.g., RDRAND on
x86 and DARN on Power) to seed when the early entropy file cannot be
provided, for whatever reason.  Please see D19928.

Some better hacks may be used to make the non-random __stack_chk_guard
slightly less predictable (from delphij@ and mjg@); those suggestions are
left for a future revision.  I think it may also be plausible to move stack
guard initialization far later in the boot process; potentially it could be
moved all the way to just before userspace is started.

Reported by:	many
Reviewed by:	delphij, emaste, imp (all w/ caveat: this is a stopgap fix)
Security:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19927
2019-04-16 18:47:20 +00:00
cem
dec0f8b592 random(4): Add is_random_seeded(9) KPI
The imagined use is for early boot consumers of random to be able to make
decisions based on whether random is available yet or not.  One such
consumer seems to be __stack_chk_init(), which runs immediately after random
is initialized.  A follow-up patch will attempt to address that.

Reported by:	many
Reviewed by:	delphij (except man page)
Approved by:	secteam(delphij)
Differential Revision:	https://reviews.freebsd.org/D19926
2019-04-16 17:12:17 +00:00
gallatin
952f3ca34e Replace cosqos with numa_domain in mbuf pkthdr
The cosqos field was added nearly 6 years ago in r254804, and it is
still unused by any in-tree consumers.  I have a patchset that I'm
working on which aligns many network resources by NUMA domain,
including inps, inpcb lb group, tcp pacing, lagg output link
selection, backing pages for sendfile, and more.  It reduces
cross-domain traffic by roughly 50% for a real web workload.

This patchset relies on being able to store the numa domain in the
mbuf, and grabbing the unused cosqos field for this purpose is the
first step in starting to usptream it.

Reviewed by:	kib, markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19862
2019-04-16 16:49:34 +00:00
emaste
0ae08aa89a correct readlinkat(2) return type
r176215 corrected readlink(2)'s return type and the type of the last
argument.  readlink(2) was introduced in r177788 after being developed
as part of Google Summer of Code 2007; it appears to have inherited the
wrong return type.

Man pages and header files were already ssize_t; update syscalls.master
to match.

PR:		197915
Submitted by:	Henning Petersen <henning.petersen@t-online.de>
MFC after:	2 weeks
2019-04-16 13:26:31 +00:00
manu
b6f38bc718 aw_syscon: Add a new compatible
Since 5.0 DTS the syscon controller have a new compatible as it
exports new subnodes, we currently only use it as a syscon provider
so just add the new compatible.

Tested On:  H3
MFC after:	1 month
2019-04-16 12:40:49 +00:00
manu
5c94073f5b aw_rtc: Register the clocks
Since latest DTS update the rtc is supposed to register two clocks :

- osc32k (the 32k oscillator on the board that the RTC uses directly and
that other peripheral can use)
- iosc (the internal oscillator of the RTC when available which frequency
depend on the SoC revision)

Since we need the RTC before the proper clock control unit (because it uses
those clocks) attach it a BUS_PASS_BUS + MIDDLE and attach the clock control
unit at BUS_PASS_BUS + LAST for the SoC that requires it.

Tested On:	     A20, H3, A64

MFC after:	1 month
2019-04-16 12:39:31 +00:00
fsu
147d386c7c ext2fs: Initial version of DTrace support.
Commit forgotten file.

Reviewed by:    pfg, gnn
MFC after:      1 week

Differential Revision:    https://reviews.freebsd.org/D19848
2019-04-16 11:37:15 +00:00
fsu
c3ee715fe1 ext2fs: Initial version of DTrace support.
Reviewed by:    pfg, gnn
MFC after:      1 week

Differential Revision:    https://reviews.freebsd.org/D19848
2019-04-16 11:20:10 +00:00
peterj
3a30b2e1eb Specify correct Ethernet phy for RPI-B
Correct a typo in the RPI-B ethernet config - the RPi-B includes a
SMC LAN9512 USB bridge and Ethernet 10/100 NIC/phy.  The phy part of
this is supported by smscphy.

Tested On: RPi1 Model B

Approved by:	grog, jhb (mentors)
MFC after:	3 days
2019-04-16 09:44:46 +00:00
peterj
c485d101c5 Fix cpufreq(4) on RPI-B
Since r324184 the root node compatible for the original Raspberry Pi
is "brcm,bcm2835", add it to the compatible list of bcm2835_cpufreq.

Tested On: RPi1 Model B

Note that the default Das U-Boot FDT does not include a cpus clause
so actually adding a bcm2835_cpufreq device requires adding a FDT
overlay defining the cpu.

Approved by:	grog, jhb (mentors)
MFC after:	3 days
2019-04-16 09:42:42 +00:00
mw
faf2f0725c Improve tpm20 style
No functional changes to the code are applied.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
2019-04-16 02:46:21 +00:00
mw
79a9626e09 tpm: Prevent session hijack
Check caller thread id before allowing to read the buffer
to make sure that it can only be accessed by the thread that
did the associated write to the TPM.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: delphij
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D19713
2019-04-16 02:28:35 +00:00
cem
654aeb58dd random(4): Block read_random(9) on initial seeding
read_random() is/was used, mostly without error checking, in a lot of
very sensitive places in the kernel -- including seeding the widely used
arc4random(9).

Most uses, especially arc4random(9), should block until the device is seeded
rather than proceeding with a bogus or empty seed.  I did not spy any
obvious kernel consumers where blocking would be inappropriate (in the
sense that lack of entropy would be ok -- I did not investigate locking
angle thoroughly).  In many instances, arc4random_buf(9) or that family
of APIs would be more appropriate anyway; that work was done in r345865.

A minor cleanup was made to the implementation of the READ_RANDOM function:
instead of using a variable-length array on the stack to temporarily store
all full random blocks sufficient to satisfy the requested 'len', only store
a single block on the stack.  This has some benefit in terms of reducing
stack usage, reducing memcpy overhead and reducing devrandom output leakage
via the stack.  Additionally, the stack block is now safely zeroed if it was
used.

One caveat of this change is that the kern.arandom sysctl no longer returns
zero bytes immediately if the random device is not seeded.  This means that
FreeBSD-specific userspace applications which attempted to handle an
unseeded random device may be broken by this change.  If such behavior is
needed, it can be replaced by the more portable getrandom(2) GRND_NONBLOCK
option.

On any typical FreeBSD system, entropy is persisted on read/write media and
used to seed the random device very early in boot, and blocking is never a
problem.

This change primarily impacts the behavior of /dev/random on embedded
systems with read-only media that do not configure "nodevice random".  We
toggle the default from 'charge on blindly with no entropy' to 'block
indefinitely.'  This default is safer, but may cause frustration.  Embedded
system designers using FreeBSD have several options.  The most obvious is to
plan to have a small writable NVRAM or NAND to persist entropy, like larger
systems.  Early entropy can be fed from any loader, or by writing directly
to /dev/random during boot.  Some embedded SoCs now provide a fast hardware
entropy source; this would also work for quickly seeding Fortuna.  A 3rd
option would be creating an embedded-specific, more simplistic random
module, like that designed by DJB in [1] (this design still requires a small
rewritable media for forward secrecy).  Finally, the least preferred option
might be "nodevice random", although I plan to remove this in a subsequent
revision.

To help developers emulate the behavior of these embedded systems on
ordinary workstations, the tunable kern.random.block_seeded_status was
added.  When set to 1, it blocks the random device.

I attempted to document this change in random.4 and random.9 and ran into a
bunch of out-of-date or irrelevant or inaccurate content and ended up
rototilling those documents more than I intended to.  Sorry.  I think
they're in a better state now.

PR:		230875
Reviewed by:	delphij, markm (earlier version)
Approved by:	secteam(delphij), devrandom(markm)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D19744
2019-04-15 18:40:36 +00:00
hselasky
cc2d24c16f Remove superfluous USB keyword.
Discussed with:		danfe@
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-04-15 17:32:38 +00:00
gallatin
5e9e5fdf6f mlx5en: Enable new pfil(9) KPI ethernet filtering hooks
This allows efficient filtering at packet ingress on mlx5en.

Note that the packets are filtered (and potentially dropped) *before*
the driver has committed to (re)allocating an mbuf for the
packet. Dropped packets are treated essentially the same as an
error. Nothing is allocated, and the existing buffer is recycled. This
allows us to drop malicious packets at close to line rate with very
little CPU use.

Reviewed by:	hselasky, slavash, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19063
2019-04-15 17:14:50 +00:00
hselasky
ef58b1f602 Fix spelling.
Submitted by:		Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-04-15 14:32:19 +00:00
emaste
b17bf6cec4 Add quirk for ignoring SPCR AccessWidth values on the PL011 UART
The SPCR table on the Lenovo HR330A Ampere eMAG server indicates 8-bit
access, but 32-bit access is required for the PL011 to work.

PL011 on SBSA platforms always supports 32-bit access (and that was
hardcoded here before my EC2 fix), let's use 32-bit access for PL011
and 32BIT interface types.

Tested by emaste on Ampere eMAG and Cavium/Marvell ThunderX2.

Submitted by:	Greg V <greg@unrelenting.technology>
Reviewed by:	andrew, imp (earlier)
Differential Revision:	https://reviews.freebsd.org/D19507
2019-04-15 13:41:53 +00:00
rmacklem
eddbdfff07 Fix the NFSv4 client to safely find processes.
r340744 broke the NFSv4 client, because it replaced pfind_locked() with a
call to pfind(), since pfind() acquires the sx lock for the pid hash and
the NFSv4 already holds a mutex when it does the call.
The patch fixes the problem by recreating a pfind_any_locked() and adding the
functions pidhash_slockall() and pidhash_sunlockall to acquire/release
all of the pid hash locks.
These functions are then used by the NFSv4 client instead of acquiring
the allproc_lock and calling pfind().

Reviewed by:	kib, mjg
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19887
2019-04-15 01:27:15 +00:00
tuexen
18c75290c7 When sending a routing message, don't allow the user to set the
RTF_RNH_LOCKED flag in rtm_flags, since this flag is used only
internally.

Reported by:		syzbot+65c676f5248a13753ea0@syzkaller.appspotmail.com
Reviewed by:		ae@
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D19898
2019-04-14 10:18:14 +00:00
rmacklem
c5cfdafb1f Add support for INET6 addresses to the kernel code that dumps open/lock state.
PR#223036 reported that INET6 callback addresses were not printed by
nfsdumpstate(8). This kernel patch adds INET6 addresses to the dump structure,
so that nfsdumpstate(8) can print them out, post-r346190.
The patch also includes the addition of #ifdef INET, INET6 as requested
by bz@.

PR:		223036
Reviewed by:	bz, rgrimes
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19839
2019-04-13 22:00:09 +00:00
tuexen
485db168ce When sending IPv4 packets on a SOCK_RAW socket using the IP_HDRINCL option,
ensure that the ip_hl field is valid. Furthermore, ensure that the complete
IPv4 header is contained in the first mbuf. Finally, move the length checks
before relying on them when accessing fields of the IPv4 header.
Reported by:		jtl@
Reviewed by:		jtl@
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D19181
2019-04-13 10:47:47 +00:00
imp
3dcee6772d Move mpr/mps drivers from per-arch NOTES files into the MI notes
file. They are in more arches they they aren't. Add appropriate
nodevice directives in powerpc and arm.
2019-04-13 06:30:45 +00:00
imp
96ac98747a Fix sbttons for values > 2s
Add test against negative times. Add code to cope with larger values
properly.

Discussed with: bde@ (quite some time ago, for an earlier version)
2019-04-13 04:46:35 +00:00
jhibbits
7061ad58c2 Add NUMA support to powerpc
Summary:
Initial NUMA support:
    - associate CPU with domain
    - associate memory ranges with domain
    - identify domain for devices
    - limit device interrupt binding to appropriate domain

- Additionally fixes a bug in the setting of Maxmem which led to
  only memory attached to the first socket being enabled for DMA

A pmap variant can opt in to numa support by by calling `numa_mem_regions`
at the end of pmap_bootstrap - registering the corresponding ranges with the
VM.

This yields a ~20% improvement in build times of llvm on dual socket POWER9
over non-NUMA.

Original patch by mmacy.

Differential Revision: https://reviews.freebsd.org/D17933
2019-04-13 04:03:18 +00:00
jhibbits
81013cfae7 powerpc/dtrace: Fix dtrace powerpc asm, and simplify stack walking
Fix some execution bugs in the dtrace powerpc asm.  addme pulls in the carry
flag which we don't want, and the result wasn't recorded anyways, so the
following beq to check for exit condition wasn't checking the right
condition.

Simplify the stack walking in dtrace_isa.c, so there's only a single walker
that handles both pc and sp.  This should make it easier to follow, and any
bugfix that may be needed for walking only needs to be made in one place
instead of two now.

MFC after:	2 weeks
2019-04-13 03:32:21 +00:00
jhibbits
ef80efbacd powerpc: Add file forgotten in r346144
Forgot to add the changes for DELAY(), which lowers priority during the
delay period.  Also, mark the timebase read as volatile so newer GCC does
not optimize it away, as it reportedly does currently.

MFC after:	2 weeks
MFC with:	r346144
2019-04-13 02:29:30 +00:00
mav
44190ed378 Fix SCSI sense data pass through.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-04-12 18:54:09 +00:00
kib
333f08f7aa Ignore doomed vnodes in tmpfs_update_mtime().
Otherwise we might dereference NULL vp->v_data after
VP_TO_TMPFS_NODE().

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-04-12 17:11:50 +00:00
trasz
15e5c29cae Remove unneeded conditionals for sv_ functions - all the ABIs
(apart from null_sysvec) define them, so the 'else' branch is
never taken.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19889
2019-04-12 14:18:16 +00:00
tychon
e660248c13 for a cache-only zone the destructor tries to destroy a non-existent keg
Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19835
2019-04-12 12:46:25 +00:00
jhibbits
831bf1a1aa powerpc: Adjust priority NOPs, and make them functions
PowerISA 2.07 and PowerISA 3.0 both specify special NOPs for priority
adjustments, with "medium" priority being normal.  We had been setting
medium-low as our normal priority.  Rather than guess each time as to what
we want and the right NOP, wrap them in inline functions, and replace the
occurrances of the NOPs with the functions.  Also, make DELAY() drop to very
low priority while waiting, so we don't burn CPU.

Coupled with r346143, this shaves off a modest 5-8% on buildworld times with
-j72.  There may be more room for improvement with judicious use of these
NOPs.

MFC after:	2 weeks
2019-04-12 00:53:30 +00:00
jhibbits
c5657f49cc powerpc64: Increase the nap level on power9 idling
The POWER9 documentation specifies that levels 0-3 are the 'lightest' sleep
level, meaning lowest latency and with no state loss.  However, state 3 is
not implemented, and is instead reserved for future chips.  This now
properly configures the PSSCR, specifying state 2 as the lowest level to
enter, but request level 0 for quickest sleep level.  If the OCC determines
that the CPU can enter states 1 or 2 it will trigger the transition to those
states on demand.

MFC after:	1 week
2019-04-12 00:44:33 +00:00
tuexen
7186df98c8 Fix an SCTP related locking issue. Don't report that the TCB_SEND_LOCK
is owned, when it is not.

This issue was found by running syzkaller.
MFC after:		1 week
2019-04-11 20:39:12 +00:00
trasz
9e141477c1 Use shared vnode locks for the ELF interpreter.
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19874
2019-04-11 11:21:45 +00:00
markj
9abf4945e6 Reinitialize multicast source filter structures after invalidation.
When leaving a multicast group, a hole may be created in the inpcb's
source filter and group membership arrays.  To remove the hole, the
succeeding array elements are copied over by one entry.  The multicast
code expects that a newly allocated array element is initialized, but
the code which shifts a tail of the array was leaving stale data
in the final entry.  Fix this by explicitly reinitializing the last
entry following such a copy.

Reported by:	syzbot+f8c3c564ee21d650475e@syzkaller.appspotmail.com
Reviewed by:	ae
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19872
2019-04-11 08:00:59 +00:00
oshogbo
06483b0326 The nvlist_report_missing is also used by the cnvlist.
It can't be a static one.

Reported by:	jenkins
MFC after:	2 weeks
2019-04-11 04:24:41 +00:00
cy
145eb83ae0 Catch up to r343631: Avoid "pfil: duplicate hook" due to
ipf_check_wrapper and ipf_check_wrapper6 being registered
under the same pa_rulename.

MFC after:	3 days
2019-04-11 04:22:06 +00:00
oshogbo
c45b7353f3 libnv: fix compilation warnings
When building libnv without a debug those arguments are no longer used
because assertions will be changed to NOP.

Submitted by: Mindaugas Rasiukevicius <rmind@netbsd.org>
MFC after:    2 weeks
2019-04-11 04:21:58 +00:00
oshogbo
13c654428c libnv: fix compilation warnings
When building libnv without a debug those arguments are no longer used
because assertions will be changed to NOP.

Submitted by:	Mindaugas Rasiukevicius <rmind@netbsd.org>
MFC after:	2 weeks
2019-04-11 03:47:53 +00:00
kibab
af9a3c8f19 Add some CMD53-related definitions
In preparation to adding block mode functions, add necessary definitions.

Reviewed by:	bz
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D19832
2019-04-10 20:44:54 +00:00
manu
818002a905 arm: dtb: Compile the Linux DTS for pandaboards
Reported by:	ci.freebsd.org
2019-04-10 20:11:28 +00:00
kibab
49c5ad42bb Implement CMD53 block mode support for SDHCI and AllWinner-based boards
If a custom block size requested, use it, otherwise revert to the previous logic
of using just a data size if it's less than MMC_BLOCK_SIZE, and MMC_BLOCK_SIZE otherwise.

Reviewed by:	bz
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D19783
2019-04-10 19:53:36 +00:00
kibab
674b0be51d Add new fields to mmc_data in preparation to SDIO CMD53 block mode support
SDIO command CMD53 (IO_RW_EXTENDED) allows data transfers using blocks of 1-2048 bytes,
with a maximum of 511 blocks per request.
Extend mmc_data structure to properly describe such requests,
and initialize the new fields in kernel and userland consumers.

No actual driver changes happen yet, these will follow in the separate changes.

Reviewed by:	bz
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D19779
2019-04-10 19:49:35 +00:00
manu
ac7512a382 arm: kernel: Remove old kernel configs
Follow up to r346095
All those kernels are either not working or the release have switched
to GENERIC
2019-04-10 19:27:14 +00:00
manu
ea57e476f1 arm: dts: Remove some old DTS
RPI is using the firmware provided DTS since 12.0
Pandaboard works with the Linux DTS
RK* Exynos* and Meson*/Odroid* don't even work with current
source code, if someone wants to make them work again they
better use the Linux DTS.
2019-04-10 19:18:05 +00:00
rrs
5883516e75 Fix a small bug in the tcp_log_id where the bucket
was unlocked and yet the bucket-unlock flag was not
changed to false. This can cause a panic if INVARIANTS
is on and we go through the right path (though rare).
This fixes the correct bug :)

Reported by:	syzbot+179a1ad49f3c4c215fa2@syzkaller.appspotmail.com
Reviewed by:	tuexen@
2019-04-10 18:58:11 +00:00
manu
332b830883 Import DTS files from Linux 5.0
MFC after:	2 months
2019-04-10 18:15:36 +00:00
lwhsu
2bc49e10c6 Fix build in sys/modules/nfscommon
Sponsored by:	The FreeBSD Foundation
2019-04-10 16:48:45 +00:00
asomers
d0c51c9a66 fix cache_lookup's documentation
cache_lookup's documentation got dislocated by r324378. Relocate and expand
it.

Reviewed by:	jhb, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-04-10 13:02:33 +00:00
trasz
8fedfd26a7 Improve vnode lock assertions.
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-04-10 10:21:14 +00:00
avos
80646d8a6d urtw(4), otus(4), iwi(4): allow to set non-default MAC address via ifconfig(8)
Tested with Netgear WG111 v3 (RTL8187B, urtw(4)), STA mode.

MFC after:	1 week
2019-04-10 08:17:56 +00:00
glebius
3078edc62c Obvious comment correction. 2019-04-09 22:15:39 +00:00
jhb
2b1f6ab13c Refine r330113 to honor the ProducerConsumer flag most of the time.
While it is true that the ACPI spec says that the flag is only valid
on Extended Address Space Descriptors, examples of other descriptors
in the spec use the ProducerConsumer flag explicitly, and real
hardware uses it as well.  In fact, even in the ASL of the Thunder X2
for which r330113 was a workaround, some devices use this flag on
non-Extended Address Space Descriptors correctly.  Instead, only
ignore the flag for resources associated with the UART devices on the
Thunder X2 using the "ARMH0011" HID to identify these devices.

This should fix regressions from ignoring this flag in other contexts
such as Hyper-V.

PR:		235876
Reported by:	Wei Hu <weh@microsoft.com>
Tested by:	emaste (Thunder X2)
MFC after:	2 weeks
2019-04-09 21:18:02 +00:00
kib
5c087ad1bb Add vn_fsync_buf().
Provide a convenience function to avoid the hack with filling fake
struct vop_fsync_args and then calling vop_stdfsync().

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-04-09 20:20:04 +00:00
kib
dea9380e34 Fix dirty buf exhaustion easily triggered with msdosfs.
If truncate(2) is performed on msdosfs file, which extends the file by
system-depended large amount, fs creates corresponding amount of dirty
delayed-write buffers, which can consume all buffers.  Such buffers
cannot be flushed by the bufdaemon because the ftruncate() thread owns
the vnode lock.  So the system runs out of free buffers, and even
truncate() thread starves, which means deadlock because it owns the
vnode lock.

Fix this by doing vnode fsync in extendfile() when low memory or low
buffers condition detected, which flushes all dirty buffers belonging
to the file being extended.

Note that the more usual fallback to bawrite() does not work
acceptable in this situation, because it would only allow one buffer
to be recycled.  Other filesystems, most important UFS, do not allow
userspace to create arbitrary amount of dirty delayed-write buffers
without feedback, so bawrite() is good enough for them.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-04-09 19:55:02 +00:00
jhb
8350e99074 Don't pre-reserve resources for CPU devices when they are set.
CPUs can use shared (RF_SHAREABLE) resources for the I/O port used for
entering and exiting C states.  If this I/O port is included in an ACPI
system resource device, then this happens to still work, but if the port
wasn't part of a system resource device, only the first CPU could allocate
the I/O port and use C states since resource_list_reserve() was always
allocating the resource from nexus0 without RF_SHAREABLE.  By avoiding
the reservation, the flags from the bus_alloc_resource() in the CPU driver
(which include RF_SHAREABLE) are honored.

PR:		236513
Reported by:	stockhausen@collogia.de
Sleuthing by:	avg
Reviewed by:	avg
MFC after:	2 weeks
2019-04-09 19:22:08 +00:00
kib
fcf1189407 pci_cfgreg.c: Use io port config access for early boot time.
Some early PCIe chipsets are explicitly listed in the white-list to
enable use of the MMIO config space accesses, perhaps because ACPI
tables were not reliable source of the base MCFG address at that time.
For that chipsets, MCFG base was read from the known chipset MCFGbase
config register.

During very early stage of boot, when access to the PCI config space
is performed (see e.g. pci_early_quirks.c), we cannot map 255MB of
registers because the method used with pre-boot pmap overflows initial
kernel page tables.

Move fallback to read MCFGbase to the attachment method of the
x86/legacy device, which removes code duplication, and results in the
use of io accesses until MCFG is parsed or legacy attach called.

For amd64, pre-initialize cfgmech with CFGMECH_1, right now we
dynamically assign CFGMECH_1 to it anyway, and remove checks for
CFGMECH_NONE.

There is a mention in the Intel documentation for corresponding
chipsets that OS must use either io port or MMIO access method, but we
already break this rule by reading MCFGbase register, so one more
access seems to be innocent.

Reported by:	longwitz@incore.de
PR:	236838
Reviewed by:	avg (other version), jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19833
2019-04-09 18:07:17 +00:00
trasz
ca6bb3d6ec Factor out section loading into a separate function.
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19846
2019-04-09 15:24:38 +00:00
ganbold
8aba5de150 In some cases like NanoPI R1, its second USB ethernet
RTL8152 (chip version URE_CHIP_VER_4C10) doesn't
have hardwired MAC address, in other words, it is all zeros.
This commit fixes it by setting random MAC address
when MAC address is all zeros.

Reviewed by:	kevlo
Differential Revision:	https://reviews.freebsd.org/D19856
2019-04-09 13:54:08 +00:00
imp
5d511432d6 Style only change: Prefer $() to ``
$() is more modern and also nests. Convert the mix of styles to using
only the former (although the latter was more common). It's the more
dominant style in other shell scripts these days as well.

Differential Revision:  https://reviews.freebsd.org/D19840
2019-04-08 18:25:14 +00:00
kib
aeecbe07e1 Handle races when remounting UFS volume from ro to rw.
In particular, ensure that writers are not unleashed before SU
structures are initialized.  Also, correctly handle MNT_ASYNC before
this.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-04-08 15:20:05 +00:00
trasz
de8cb8b511 Refactor ELF interpreter loading into a separate function.
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19741
2019-04-08 14:31:07 +00:00
oshogbo
e5fad78020 In the unlinkat syscall, the operation is performed on the directory
descriptor, not the file descriptor. The file descriptor is used only for
verification so do not expect any additional capabilities on it.

Reported by:	antoine
Tested by:	antoine
Discussed with:	kib, emaste, bapt
Sponsored by:	Fudo Security
2019-04-08 14:23:52 +00:00
ganbold
7285d802f5 Fix URE_WDT6_SET_MODE value in the register definition.
Both linux and u-boot sources for RTL8152 driver has this value.
RTL8152 USB ethernet is used in NanoPI R1 board as second ethernet.
This fixes for me RTL8152 USB ethernet not detected problem after
reboot on NanoPI R1 board.

Both NetBSD and OpenBSD have a wrong value so far.
2019-04-08 13:40:46 +00:00
imp
f7689276a8 Make RELDATE be on a single line.
All variable assignments that start in column 1 have to be on a single
line for amd to build due to as weird dependency there (most likely it
can be fixed to use the new VARS_ONLY feature, but it isn't
today). usr.sbin/amd/include/Makefile calls
usr.sbin/amd/include/newvers.sh which does:
	eval `LC_ALL=C egrep '^[A-Z]+=' $1 | grep -v COPYRIGHT`
which is where that requirement comes from. It handles COPYRIGHT since
that's an exception. Rather than add additional exceptions, cope with
the long line in newvers.sh instead. Note: it no longer needs to
filter COPYRIGHT because the assignment doesn't start in column 1
anymore.

I had done a universe when I had an earlier version of r346018 that
had it as one line. When I changed it to multi-line as suggested in
the review, I only built kernels on a couple of architectures to make
sure it didn't break anything.

Add comment to newvers.sh noting this.

Obviously, this unbreaks the amd build.
2019-04-07 21:01:02 +00:00
mhorne
6de23af4cf RISC-V: initialize pcpu slightly earlier
In certain scenarios, it is possible for PCPU data to be
accessed before it has been initialized (e.g. during printf
if the kernel was built with the TSLOG option).

Initialize the PCPU pointer for hart 0 at the beginning of
initriscv() rather than near the end.

Reviewed by:		markj
Approved by:		markj (mentor)
Differential Revision:	https://reviews.freebsd.org/D19726
2019-04-07 20:12:24 +00:00
imp
7d87d74539 Use default shell assignment rather more complicated if then
construct.

Discussed with: emaste@, allanjude@ (changes (or not) based on their feedback)
Differential Revision: https://reviews.freebsd.org/D19797
2019-04-07 18:39:55 +00:00
ian
fa027ee465 Add g_label_flashmap.c to the module, should have been part of r345480.
Reported by:	Jia-Shiun Li <jiashiun@gmail.com>
2019-04-07 16:33:22 +00:00
oshogbo
99638c7483 Bump FreeBSD version after r345982.
Reported by:	Shawn Webb <shawn.webb@hardenedbsd.org>
Discussed with: imp, cy, rgrimes
2019-04-07 16:07:41 +00:00
markj
6e75ac3373 Set the p_oppid field of orphans when exiting.
Such processes will be reparented to the reaper when the current
parent is done with them (i.e., ptrace detached), so p_oppid must be
updated accordingly.

Add a regression test to exercise this code path.  Previously it
would not be possible to reap an orphan with a stale oppid.

Reviewed by:	kib, mjg
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19825
2019-04-07 14:26:14 +00:00
kib
587458f1cd Give new home to the comment from ppt_pci_reset(), explaining a nuance
of power reset.

Noted by:	soralx@cydem.org
Sponsored by:	Mellanox Technologies
MFC after:	12 days
2019-04-07 08:58:09 +00:00
cem
16397b2576 kern/subr_pctrie: Fix mismatched signedness in assertion comparison
'tos' is an index into an array and never holds a negative value.  Correct
its signedness to match PCTRIE_LIMIT, which it is compared to in assertions.

No functional change (kills a warning).
2019-04-06 21:56:24 +00:00
rmacklem
4bb25ea3ad Add INET6 support for the upcalls to the nfsuserd daemon.
The kernel code uses UDP to do upcalls to the nfsuserd(8) daemon to get
updates to the username<->uid and groupname<->gid mappings.
A change to AF_LOCAL last year had to be reverted, since it could result
in vnode locking issues on the AF_LOCAL socket.
This patch adds INET6 support and the required #ifdef INET and INET6
to the code.

Requested by:	bz
PR:		205193
Reviewed by:	bz, rgrimes
MFC after:	2 weeks
Differential Revision:	http://reviews.freebsd.org/D19218
2019-04-06 21:53:46 +00:00
cem
ecbc847507 kern/subr_pctrie: Convert old-style boolean_t to plain "bool"
No functional change.
2019-04-06 20:38:44 +00:00
asomers
5a6d620e2b fusefs: fix a panic on mount
Don't page fault if the file descriptor provided with "-o fd" is invalid.
This is a merge of r345419 from the projects/fuse2 branch.

Reviewed by:	ngie
Tested by:	Marek Zarychta <zarychtam@plan-b.pwste.edu.pl>
MFC after:	2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19836
2019-04-06 18:04:04 +00:00
oshogbo
def45a363e Regen after r345982. 2019-04-06 09:37:10 +00:00
oshogbo
20d273b44b Introduce funlinkat syscall that always us to check if we are removing
the file associated with the given file descriptor.

Reviewed by:	kib, asomers
Reviewed by:	cem, jilles, brooks (they reviewed previous version)
Discussed with:	pjd, and many others
Differential Revision:	https://reviews.freebsd.org/D14567
2019-04-06 09:34:26 +00:00
jkim
f5b6bd46f2 MFV: r345969
Import ACPICA 20190405.
2019-04-06 06:02:42 +00:00
jhibbits
90186d78c6 powerpc/powernv: Fix major bugs in opal_flash
* The BIO bio_data may not be page aligned.  Only the base address of each
  page worth of data is extracted to pass to OPAL.  Without page alignment
  it can scribble over random memory when finishing the page read.  Fix this
  by short-reading the first page to properly align for full page reads.
* Fix the definition of OPAL_FLASH_ERASE.
* Properly handle the async message result, as now returned from r345974.
2019-04-06 02:39:56 +00:00
jhibbits
3ef32d3557 powerpc/powernv: Fix issues in opal_async
* Properly return the full opal_msg from an async completion.
* Don't keep bugging OPAL, wait 100us or so.  With some minor changes to
  DELAY() to drop to very low priority, the thread won't hog the CPU while
  polling for the async completion.
2019-04-06 02:31:01 +00:00
kib
810f55efd4 Add DEV_RESET /dev/devctl2 ioctl.
It performs BUS_RESET_CHILD() on the parental bus and the specified
device.

Reviewed by:	imp (previous version), jhb (previous version)
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19646
2019-04-05 19:31:26 +00:00
kib
6eb7556345 Remove single-use DEV_RESET() macro.
It conflicts with the sys/bus.h DEV_XXX namespace.

Reviewed by:	imp (previous version), jhb (previous version)
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19646
2019-04-05 19:27:51 +00:00
kib
1f83f13cf3 Implement resets for PCI buses and PCIe bridges.
For PCI device (i.e. child of a PCI bus), reset tries FLR if
implemented and worked, and falls to power reset otherwise.

For PCIe bus (child of a PCIe bridge or root port), reset
disables PCIe link and then re-trains it, performing what is known as
link-level reset.

Reviewed by:	imp (previous version), jhb (previous version)
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19646
2019-04-05 19:25:26 +00:00
kib
3298fbd5b2 Provide newbus infrastructure for initiating device reset.
The methods BUS_RESET_PREPARE(), BUS_RESET(), and BUS_RESET_POST()
should be implemented by bus which can provide reset to a device.  The
methods are described in inline doxygen comments.

Code only provides BUS_RESET_PREPARE() and BUS_RESET_POST() helpers
instead of default implementation, because actual bus needs to handle
device state around reset, while helpers provide the other half of
typical prepare/post code.

Reviewed by:	imp (previous version), jhb (previous version)
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19646
2019-04-05 18:09:22 +00:00
kib
2a6eaf86a2 vn_vmap_seekhole(): align running offset to the block boundary.
Otherwise we might miss the last iteration where EOF appears below
unaligned noff.

Reported and reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19811
2019-04-05 16:14:16 +00:00
kib
3a48424552 Fix mis-merge.
Amusingly, it is nop.

Noted by:	trasz
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
X-MFC-rev:	r345702
2019-04-05 16:12:35 +00:00
manu
99cbdc45aa twsi: Use config_intrhook_oneshot instead of config_intrhook_establish
Suggested by:	ian
MFC after:	1 month
X-MFC-With:	345948
2019-04-05 15:53:27 +00:00
manu
3f5e78d74d twsi: Add interrupt mode
Add the ability to use interrupts for i2c message.
We still use polling for early boot i2c transfer (for PMIC
for example) but as soon as interrupts are available use them.
On Allwinner SoC >A20 is seems that polling mode is broken for some
reason, this is now fixed by using interrupt mode.
For Allwinner also fix the frequency calculation, the one in the code
was for when the APB frequency is at 48Mhz while it is at 24Mhz on most
(all?) Allwinner SoCs. We now support both cases.

While here add more debug info when it's compiled in.

Tested On: A20, H3, A64
MFC after:	1 month
2019-04-05 14:44:23 +00:00
imp
b8f858c079 Remove another instance of All Rights Reserved.
Remove the phrase from boilerplate copyright we stick on vers.c when
we can't find the template file. In practice, this won't change a
thing, except for the case of compiling the kernel standalone w/o the
rest of a tree on a system that doesn't have
/usr/share/examples/etc/bsd-copyright installed.
2019-04-05 14:27:48 +00:00
imp
e37799a8d3 Add mpr, mps, mpt to NOTES file
Add these to all the architectures that these are in the GENERIC
kernel.
2019-04-05 02:54:02 +00:00
rmacklem
185da79fc0 Revert r320698, since the related userland changes were reverted by r338192.
r338192 reverted the changes to nfsuserd so that it could use an AF_LOCAL
socket, since it resulted in a vnode locking panic().
Post r338192 nfsuserd daemons use the old AF_INET socket for upcalls and
do not use these kernel changes.
I left them in for a while, so that nfsuserd daemons built from head sources
between r320757 (Jul. 6, 2017) and r338192 (Aug. 22, 2018) would need them
by default.
This only affects head, since the changes were never MFC'd.
I will add an UPDATING entry, since an nfsuserd daemon built from head
sources between r320757 and r338192 will not run unless the "-use-udpsock"
option is specified. (This command line option is only in the affected
revisions of the nfsuserd daemon.)

I suspect few will be affected by this, since most who run systems built
from head sources (not stable or releases) will have rebuilt their nfsuserd
daemon from sources post r338192 (Aug. 22, 2018)

This is being reverted in preparation for an update to include AF_INET6
support to the code.
2019-04-04 23:30:27 +00:00
emaste
8087e3c8fb if_muge: use NULL not 0 for DRIVER_MODULE pointer args
Sponsored by:	The FreeBSD Foundation
2019-04-04 19:59:31 +00:00
rgrimes
cda8035706 Use IN_foo() macros from sys/netinet/in.h inplace of handcrafted code
There are a few places that use hand crafted versions of the macros
from sys/netinet/in.h making it difficult to actually alter the
values in use by these macros.  Correct that by replacing handcrafted
code with proper macro usage.

Reviewed by:		karels, kristof
Approved by:		bde (mentor)
MFC after:		3 weeks
Sponsored by:		John Gilmore
Differential Revision:	https://reviews.freebsd.org/D19317
2019-04-04 19:01:13 +00:00
rmacklem
e349a3602c Fix malloc stats for the RPCSEC_GSS server code when DEBUG is enabled.
The code enabled when "DEBUG" is defined uses mem_alloc(), which is a
malloc(.., M_RPC, M_WAITOK | M_ZERO), but then calls gss_release_buffer()
which does a free(.., M_GSSAPI) to free the memory.
This patch fixes the problem by replacing mem_alloc() with a
malloc(.., M_GSSAPI, M_WAITOK | M_ZERO).
This bug affects almost no one, since the sources are not normally built
with "DEBUG" defined.

Submitted by:	peter@ifm.liu.se
MFC after:	2 weeks
2019-04-04 01:23:06 +00:00
cem
33133b3b41 Replace read_random(9) with more appropriate arc4rand(9) KPIs
Reviewed by:	ae, delphij
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19760
2019-04-04 01:02:50 +00:00
pjd
c118ff293a Implement automatic online expansion of GELI providers - if the underlying
provider grows, GELI will expand automatically and will move the metadata
to the new location of the last sector.

This functionality is turned on by default. It can be turned off with the
-R flag, but it is not recommended - if the underlying provider grows and
automatic expansion is turned off, it won't be possible to attach this
provider again, as the metadata is no longer located in the last sector.

If the automatic expansion is turned off and the underlying provider grows,
GELI will only log a message with the previous size of the provider, so
recovery can be easier.

Obtained from:	Fudo Security
2019-04-03 23:57:37 +00:00
emaste
2ef79738b1 cpsw: use phy-handle in FDT to find PHY address
In r337703 DTS files were updated to Linux 4.18, including Linux commit
4d8b032d3c03f4e9788a18bbb51b10e6c9e8a56b which removed the `phy_id`
property from am335x-bone-common (as the property was deprecated).

Use `phy-handle` via fdt_get_phyaddr, keeping the existing code as a
fallback for old DTBs.

PR:		236624
Submitted by:	manu, Gerald Aryeetey <aryeeteygerald_rogers.com>
Reported by:	Gerald Aryeetey
Reviewed by:	manu
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19814
2019-04-03 21:01:53 +00:00
rrs
50c7932ba8 Undo my previous erroneous commit changing the tcp_output kassert.
Hmm now the question is where did the tcp_log_id change go :o
2019-04-03 19:35:07 +00:00
mav
6c0c4d730c Fix typos in r345849.
MFC after:	1 week
2019-04-03 18:35:13 +00:00
mav
295e1572d3 List few more ATA commands.
MFC after:	1 week
2019-04-03 18:27:54 +00:00
kib
36b7295410 msdosfs: zero tail of the last block on truncation for VREG vnodes as well.
Despite the call to vtruncbuf() from detrunc(), which results in
zeroing part of the partial page after EOF, there still is a
possibility to retain the stale data which is revived on file
enlargement.  If the filesystem block size is greater than the page
size, partial block might keep other after-EOF pages wired and they
get reused then.  Fix it by zeroing whole part of the partial buffer
after EOF, not relying on vnode_pager_setsize().

PR:	236977
Reported by:	asomers
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-04-03 17:02:18 +00:00
mw
77b8255fbc Add a cv_wait to the TPM2.0 harvesting function
Harvesting has to compete for the TPM chip with userspace.
Before this change the callout could hijack an unread buffer
causing a userspace call to the TPM to fail.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: delphij
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D19712
2019-04-03 08:22:58 +00:00
jhibbits
ca52774d85 powerpc: Allow emulating optional FPU instructions on CPUs with an FPU
The e5500 has an FPU, but lacks the optional fsqrt instruction.  This
instruction gets emulated in the kernel, but the emulation uses stale data,
from the last switch out, and does not return the result of the operation
immediately.  Fix both of these conditions by saving and restoring the FPRs
around the emulation point.

MFC after:	1 week
MFC with:	r345829
2019-04-03 04:01:08 +00:00
mw
7c5d4b81ab Create kernel module to parse Veriexec manifest based on envs
The current approach of injecting manifest into mac_veriexec is to
verify the integrity of it in userspace (veriexec (8)) and pass its
entries into kernel using a char device (/dev/veriexec).
This requires verifying root partition integrity in loader,
for example by using memory disk and checking its hash.
Otherwise if rootfs is compromised an attacker could inject their own data.

This patch introduces an option to parse manifest in kernel based on envs.
The loader sets manifest path and digest.
EVENTHANDLER is used to launch the module right after the rootfs is mounted.
It has to be done this way, since one might want to verify integrity of the init file.
This means that manifest is required to be present on the root partition.
Note that the envs have to be set right before boot to make sure that no one can spoof them.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: sjg
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D19281
2019-04-03 03:57:37 +00:00
jhibbits
acf1cd99b3 powerpc: Apply r178139 from sparc64 to powerpc's fpu_sqrt
This fix was committed less than 2 months after the code was forked into the
powerpc kernel.  Though powerpc doesn't use quad-precision floating point,
or need it for emulation, the changes do look like correctness fixes
overall.

This was found while trying to get fsqrt emulation working on e5500, which
does have a real FPU, but lacks the fsqrt instruction.  This is not the
complete fix, the rest is to be committed separately.

MFC after:	1 week
2019-04-03 03:54:30 +00:00
rmacklem
c5f8d6c34f Add a comment to the r345818 patch to explain why cl_refs is initialized to 2.
PR:		235582
MFC after:	2 weeks
2019-04-03 03:50:16 +00:00
rmacklem
bdb31b3b51 Fix a race in the RPCSEC_GSS server code that caused crashes.
When a new client structure was allocated, it was added to the list
so that it was visible to other threads before the expiry time was
initialized, with only a single reference count.
The caller would increment the reference count, but it was possible
for another thread to decrement the reference count to zero and free
the structure before the caller incremented the reference count.
This could occur because the expiry time was still set to zero when
the new client structure was inserted in the list and the list was
unlocked.

This patch fixes the race by initializing the reference count to two
and initializing all fields, including the expiry time, before inserting
it in the list.

Tested by:	peter@ifm.liu.se
PR:		235582
MFC after:	2 weeks
2019-04-02 23:51:08 +00:00
mav
dc10c69c7f Build NVMe CAM transport unrelated to NVMe SIM.
Before this I suppose it was impossible load CAM-based NVMe as module.
Plus this appeared to be needed to build r345815 without NVMe driver.

MFC after:	2 weeks
2019-04-02 20:27:56 +00:00
mav
66e1cda3f6 Make cam_error_print() decode NVMe commands.
MFC after:	2 weeks
2019-04-02 19:37:52 +00:00
tychon
6fcd6bd4ce ioat(4) should use bus_dma(9) for the operation source and destination
addresses

Reviewed by:	cem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19725
2019-04-02 19:08:06 +00:00
tychon
87228e148f ioatcontrol(8) could exercise 8k-aligned copy with page-break, crc and
crc-copy modes.

Reviewed by:	cem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19780
2019-04-02 19:06:25 +00:00
tychon
8b2669726f DMAR driver assumes all physical addresses are backed by a fully
initialized struct vm_page.

Reviewed by:	kib
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19753
2019-04-02 18:50:49 +00:00
np
b0756a4416 cxgbe(4): Add a flag to indicate that bits in interrupt cause but not in
interrupt enable are not fatal.

The firmware sets up all the interrupt enables based on run time
configuration, which means the information in the enables is more
accurate than what's compiled into the driver.  This change also allows
the fatal bits to be updated without any changes in the driver in some
cases.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-04-02 18:50:33 +00:00
mav
8d4daa603c Unify SCSI_STATUS_BUSY retry handling with other cases.
- Do not retry if periph was invalidated.
 - Do not decrement retry_count if already zero.
 - Report action_string when applicable.

MFC after:	2 weeks
2019-04-02 14:46:10 +00:00
kib
4853d54461 tmpfs: plug holes on rw->ro mount update.
In particular:
- suspend the mount around vflush() to avoid new writes come after the
  vnode is processed;
- flush pending metadata updates (mostly node times);
- remap all rw mappings of files from the mount into ro.

It is not clear to me how to handle writeable mappings on rw->ro for
tmpfs best.  Other filesystems, which use vnode vm object, call
vgone() on vnodes with writers, which sets the vm object type to
OBJT_DEAD, and keep the resident pages and installed ptes as is.  In
particular, the existing mappings continue to work as far as
application only accesses resident pages, but changes are not flushed
to file.

For tmpfs the vm object of VREG vnodes also serves as the data pages
container, giving single copy of the mapped pages, so it cannot be set
to OBJT_DEAD.  Alternatives for making rw mappings ro could be either
invalidating them at all, or marking as CoW.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19737
2019-04-02 13:59:04 +00:00
kib
01adc0720f tmpfs: ignore tmpfs_set_status() if mount point is read-only.
In particular, this fixes atimes still changing for ro tmpfs.
tmpfs_set_status() gains tmpfs_mount * argument.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19737
2019-04-02 13:49:32 +00:00
kib
3b825b8983 Block creation of the new nodes for read-only tmpfs mounts.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19737
2019-04-02 13:41:26 +00:00