126307 Commits

Author SHA1 Message Date
Hans Petter Selasky
7f36930024 Implement dev_err_once() function macro in the LinuxKPI.
Submitted by:		Johannes Lundberg <johalun0@gmail.com>
MFC after:		1 week
Sponsored by:		Limelight Networks
Sponsored by:		Mellanox Technologies
2019-03-13 17:46:05 +00:00
Hans Petter Selasky
8fdb5febfc Implement dma_set_mask_and_coherent() in the LinuxKPI.
Submitted by:		Johannes Lundberg <johalun0@gmail.com>
MFC after:		1 week
Sponsored by:		Limelight Networks
Sponsored by:		Mellanox Technologies
2019-03-13 17:42:31 +00:00
Navdeep Parhar
4a21f4c606 cxgbe(4): Update T4/5/6 firmwares to 1.23.0.0.
Obtained from:	Chelsio Communications
MFC after:	1 month
Sponsored by:	Chelsio Communications
2019-03-13 06:46:15 +00:00
Konstantin Belousov
22d7708455 hwpmc/core: Adapt to upcoming Skylake TSX errata.
The forthcoming microcode update will fix a TSX bug by clobbering PMC3
when TSX instructions are executed (even speculatively).  There is an
alternate mode where CPU executes all TSX instructions by aborting
them, in which case PMC3 is still available to OS.  Any code that
correctly uses TSX must be ready to handle abort anyway.

Since it is believed that FreeBSD population of hwpmc(4) users is
significantly larger than the population of TSX users, switch the
microcode into TSX abort mode whenever a pmc is allocated, and back to
bug avoidance mode when the last pmc is deallocated.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-03-12 19:33:25 +00:00
Kirk McKusick
3193b25a5a This is an additional fix for bug report 230962. When using
extended attributes, the kernel can panic with either "ffs_truncate3"
or with "softdep_deallocate_dependencies: dangling deps".

The problem arises because the flushbuflist() function which is
called to clear out buffers is passed either the V_NORMAL flag to
indicate that it should flush the buffers associated with the contents
of the file or the V_ALT flag to indicate that it should flush the
buffers associated with the extended attribute data. The buffers
containing the extended attribute data are identified by having
their BX_ALTDATA flag set in the buffer's b_xflags field. The
BX_ALTDATA flag is set on the buffer when the extended attribute
block is first allocated or when its contents are read in from the
disk.

On a busy system, a buffer may be reused for another purpose, but
the contents of the block that it contained continue to be held
in the main page cache. Each physical page is identified as holding
the contents of a logical block within a specified file (identified
by a vnode). When a request is made to read a file, the kernel first
looks for the block in the existing buffers.  If it is not found
there, it checks the page cache to see if it is still there. If
it is found in the page cache, then it is remapped into a new
buffer thus avoiding the need to read it in from the disk.

The bug is that when a buffer request made for an extended attribute
is fulfilled by reconstituting a buffer from the page cache rather
than reading it in from disk, the BX_ALTDATA flag was not being
set. Thus the flushbuflist() function would never clear it out and
the "ffs_truncate3" panic would occur because the vnode being cleared
still had buffers on its clean-buffer list. If the extended attribute
was being updated, it is first read, then updated, and finally
written. If the read is fulfilled by reconstituting the buffer
from the page cache the BX_ALTDATA flag was not set and thus the
dirty buffer would never be flushed by flushbuflist(). Eventually
the buffer would be recycled. Since it was never written it would
have an unfinished dependency which would trigger the
"softdep_deallocate_dependencies: dangling deps" panic.

The fix is to ensure that the BX_ALTDATA flag is set when a buffer
has been reconstituted from the page cache.
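
The selection rule described above can be sketched as a toy model. This is not the kernel code: the TOY_* constants, struct toy_buf, and both functions are hypothetical names with made-up values, chosen only to show why a reconstituted buffer without the flag is skipped by an extended-attribute flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; these are not the kernel's definitions. */
#define	TOY_V_NORMAL	0x1	/* flush file-content buffers */
#define	TOY_V_ALT	0x2	/* flush extended-attribute buffers */
#define	TOY_BX_ALTDATA	0x1	/* buffer holds extended-attribute data */

struct toy_buf {
	unsigned xflags;
};

/* Would a flush request with the given flags select this buffer? */
static bool
toy_flush_selects(const struct toy_buf *bp, int flags)
{
	assert(bp != NULL);
	if ((bp->xflags & TOY_BX_ALTDATA) != 0)
		return ((flags & TOY_V_ALT) != 0);
	return ((flags & TOY_V_NORMAL) != 0);
}

/*
 * Rebuild a buffer from the page cache. With fixed == false the flag is
 * left clear, which models the behaviour the commit message describes.
 */
static struct toy_buf
toy_reconstitute(bool for_extattr, bool fixed)
{
	struct toy_buf b = { 0 };

	if (for_extattr && fixed)
		b.xflags |= TOY_BX_ALTDATA;
	return (b);
}
```

In this model, a buffer from toy_reconstitute(true, false) is never selected by a TOY_V_ALT flush, while one from toy_reconstitute(true, true) is.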

PR:           230962
Reported by:  2t8mr7kx9f@protonmail.com
Reviewed by:  kib
Tested by:    Peter Holm
MFC after:    1 week
Sponsored by: Netflix
2019-03-12 19:08:41 +00:00
Konstantin Belousov
3dcf329ee5 Add register number, CPUID bits, and print identification for TSX
force abort errata.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2019-03-12 18:59:01 +00:00
Konstantin Belousov
45a2d058d2 Remove useless version check.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2019-03-12 18:57:11 +00:00
Konstantin Belousov
dad0dd21cc isci(4): Use controller->lock for busdma tags.
isci(4) uses deferred loading.  Typically on amd64 and i386 non-PAE
the tag does not create any restrictions, but on i386 PAE-tables but
non-PAE configs callbacks might be used.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2019-03-12 16:49:08 +00:00
Edward Tomasz Napierala
2df8bd90c8 Drop unused 'p' argument to nfsv4_strtogid().
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-03-12 15:07:47 +00:00
Edward Tomasz Napierala
c703cba811 Drop unused 'p' argument to nfsv4_gidtostr().
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-03-12 15:05:11 +00:00
Edward Tomasz Napierala
0658ac3943 Drop unused 'p' argument to nfsv4_strtouid().
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-03-12 15:02:52 +00:00
Edward Tomasz Napierala
0f86b94a56 Drop unused 'p' argument to nfsv4_uidtostr().
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-03-12 14:59:08 +00:00
Edward Tomasz Napierala
f32bf2922f Drop unused 'p' argument to nfsrv_getuser().
Reviewed by:	rmacklem
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19455
2019-03-12 14:53:53 +00:00
Kashyap D Desai
0298863e63 Update driver version to 07.709.04.00-fbsd
Submitted by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed by:  Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by:  Ken
MFC after:  3 days
Sponsored by:   Broadcom Inc
2019-03-12 09:29:46 +00:00
Kashyap D Desai
54f784f59a Allocated MFI frames should be same as MPT frames reserved for DCMDs
Submitted by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed by:  Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by:  Ken
MFC after:  3 days
Sponsored by:   Broadcom Inc
2019-03-12 09:29:01 +00:00
Kashyap D Desai
5437c8b88e "fw_outstanding" (outstanding IOs at firmware level) counter gets screwed up when R1 fastpath
writes are running. Some of the cases which are not handled properly in the driver are:

1. With R1 fastpath supported, a single write from the CAM layer can consume 2 MPT frames
at driver/firmware level for fastpath qualification (if fw_outstanding < controller Queue Depth).
Due to this, the driver has to throttle IOs coming from the CAM layer as well as the second fastpath
write (of an R1 write) against Adapter Queue Depth.
If "fw_outstanding" reaches the adapter queue depth, the driver should return IOs from the CAM layer with
device busy status. While allocating the second MPT frame (corresponding to the R1 FP write), the driver
should also ensure that fw_outstanding does not exceed the adapter QD.

2. On R1 fastpath write completion, the driver decrements the "fw_outstanding" counter without
really returning the MPT frame to the free pool. Under heavy IO that consumes the whole
adapter Queue Depth, IOs may then consume MPT frames reserved for DCMDs (management commands), and
DCMDs (internal and sent by applications) that cannot get an MPT frame will start failing.

Below is one test case to hit the issue described above-
1. Run heavy IOs (outstanding IOs should hit adapter Queue Depth).
2. Run management tool (Broadcom's storcli tool) querying adapter in loop (run command- "storcli64 /c0 show" in loop).
3. Management tool's requests would start failing due to non-availability of free MPT frames as all frames would be consumed by IOs.

Fix: Increment/decrement of "fw_outstanding" counter should be in sync with MPT frame get/return.
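
A minimal sketch of the invariant named in the fix, assuming a toy adapter with a fixed frame pool. The struct, the function names, and TOY_QUEUE_DEPTH are hypothetical and are not taken from the driver; the point is only that the counter changes in the same place where a frame leaves or re-enters the pool.

```c
#include <assert.h>
#include <stddef.h>

#define	TOY_QUEUE_DEPTH	4	/* hypothetical adapter queue depth */

struct toy_adapter {
	int fw_outstanding;	/* frames currently held by I/O */
	int free_frames;	/* frames left in the pool */
};

/*
 * Take one frame for I/O. Returns 0 on success, or -1 when the adapter is
 * at queue depth so the caller can report "device busy".
 */
static int
toy_frame_get(struct toy_adapter *sc)
{
	assert(sc != NULL);
	if (sc->fw_outstanding >= TOY_QUEUE_DEPTH || sc->free_frames == 0)
		return (-1);
	sc->free_frames--;
	sc->fw_outstanding++;
	return (0);
}

/* Return one frame; the counter drops only when the frame is really freed. */
static void
toy_frame_return(struct toy_adapter *sc)
{
	assert(sc != NULL && sc->fw_outstanding > 0);
	sc->free_frames++;
	sc->fw_outstanding--;
}
```

With a pool of six frames and a queue depth of four, I/O can never take the last two frames in this model, which is how frames stay available for management commands.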

Submitted by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed by:  Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by:  Ken
MFC after:  3 days
Sponsored by:   Broadcom Inc
2019-03-12 09:24:58 +00:00
Warner Losh
7e48d71151 Fix botched merge with 355066
When merging from Netflix's tree, resetting the carsize was dropped
accidentally. This fix fixes that revision by properly resetting how
many are in the car.

Noticed by: mav@
2019-03-12 05:10:41 +00:00
Warner Losh
329f0aa952 Kill tz_minuteswest and tz_dsttime.
Research Unix, 7th Edition introduced TIMEZONE and DSTFLAG
compile-time constants in sys/param.h to communicate these values for
the machine. 4.2BSD moved from compile time to run time,
introduced these variables, and used them for localtime() to return the
right offset from UTC (sometimes referred to as GMT, which for this purpose
is the same). 4.4BSD migrated to using the tzdata code/database and
these variables were basically unused.

FreeBSD removed the real need for these with adjkerntz in
1995. However, some RTC clocks continued to use these variables,
though they were largely unused otherwise.  Later, phk centralized
most of the uses in utc_offset, but left it using both tz_minuteswest
and adjkerntz.

POSIX (IEEE Std 1003.1-2017) states in the gettimeofday specification
"If tzp is not a null pointer, the behavior is unspecified" so there's
no standards reason to retain it anymore. In fact, gettimeofday has
been marked as obsolescent, meaning it could be removed from a future
release of the standard. It is the only interface defined in POSIX
that references these two values. All other references come from the
tzdata database via tzset().

These were used to more faithfully implement early unix ABIs which
have been removed from FreeBSD.  NetBSD completely eliminated
these variables years ago. Linux has migrated to tzdata as well,
though these variables technically still exist for compatibility
with unspecified older programs.

So, there's no real reason to have them these days. They are a
historical vestige that's no longer used in any meaningful way.

Reviewed By: jhb@, brooks@
Differential Revision: https://reviews.freebsd.org/D19550
2019-03-12 04:49:47 +00:00
Kirk McKusick
42a5a356a8 Add KASSERT to the softdep_disk_write_complete() function in the
soft dependency code to ensure that it will be able to avoid a
dangling dependency.

Sponsored by: Netflix
2019-03-12 00:10:31 +00:00
Kirk McKusick
3532718257 Give more complete information in INVARIANTS panic messages at end of
the ffs_truncate() function.

Sponsored by: Netflix
2019-03-11 23:53:56 +00:00
Kirk McKusick
c11cbfd957 Update the main loop in the flushbuflist() routine to properly select
buffers for flushing when requested to flush both normal and extended
attributes buffers.

Sponsored by: Netflix
2019-03-11 22:42:33 +00:00
Kirk McKusick
a9f59cc029 Augment the UFS filesystem specific print function (called by the
kernel vn_printf() routine when printing out vnodes associated with
a UFS filesystem) to also include the inode's link count, effective
link count, generation number, owner, group, flags, size, and for
UFS2 filesystems, the extent size.

Sponsored by: Netflix
2019-03-11 22:05:34 +00:00
Kirk McKusick
93fa5ae7f1 Augment DDB "show buffer" command to print the buffer's referenced
vnode pointer (b_vp). The value of b_vp can be used by "show vnode"
to print the vnode and "show vnodebufs" to print all the clean and
dirty buffers associated with the vnode (which should include this
buffer).

Sponsored by: Netflix
2019-03-11 21:49:44 +00:00
Warner Losh
3899afd370 Upgrade Chipfancier SLC quirk to all versions
The 16GB, 32GB and 128GB versions of this product all have the same
problem. For some reason, the RC10 size is correct, while the RC16
size is larger (oddly by the capacity size / 1024 bytes). Using the
RC16 size results in illegal LBA range errors when geom tastes the
device. So, expand the quirk to cover all versions of this chip.

Ideally, we'd get both READ CAPACITY 10 and READ CAPACITY 16 sizes and
print a warning if they differ and use the smaller of the two numbers,
though that may be problematical as well. Furthermore, SBC-4
encourages users to transition to RC16 only, which suggests that in the
future RC10 may disappear from some drives. It's unclear how to cope
with these drives generically.

PR: 234503
MFC After: 1 week
2019-03-11 20:57:54 +00:00
Simon J. Gerraty
f5fdf82d82 Add _PC_ACL_* to vop_stdpathconf
This avoids EINVAL from tmpfs etc.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D19512
2019-03-11 20:40:56 +00:00
Vladimir Kondratyev
76cefcd810 Fix amd64/i386 LINT build after r344982
Submitted by:	jkim
Reported by:	rpokala
MFC with:	r344982
2019-03-11 19:46:15 +00:00
Alexander Motin
aa8676f25d Revert minor part of r344934.
I tried to save some CPU time on hopeless aggregation attempts, but it seems
the condition I added is overly strict, blocking also aggregation of optional
I/Os in cases which previously were possible.  Revert just to be safe.

MFC after:	1 month
2019-03-11 17:39:09 +00:00
Hans Petter Selasky
c29a65e6ec Eliminate useless warning message when reading sysctl node in mlx4core.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-03-11 14:34:25 +00:00
Hans Petter Selasky
6f490688f5 Improve support for switching to and from command polling mode in mlx4core.
Make sure the enter and leave polling routines can be called multiple times
with same setting. Ignore setting polling or event mode twice. This fixes a
deadlock during shutdown if polling mode was already selected.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-03-11 14:29:50 +00:00
David Bright
2fb6802f27 Fix a scribbler in the PMS driver.
The ESGL bit was left uninitialized when executing the REPORT LUNS
ioctl. This could allow a zeroed data buffer to be treated as a
scatter/gather list. The firmware would eventually walk past the end
of the data buffer, potentially find what looked like a valid
address/length pair, and write the result to semi-random memory.

Obtained from:	Dell EMC Isilon
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19398
2019-03-11 14:26:45 +00:00
Kenneth D. Merry
6f9dbc0e6e Fix CRN resets in the isp(4) driver in certain situations.
The Command Reference Number (CRN) is part of the FC-Tape features
that we enable when talking to tape drives.  It starts at 1, and
goes to 255 and wraps around to 1.  There are a number of reset
type conditions that result in the CRN getting reset to 1.  These
are detailed in section 4.10 (table 8) of the FCP-4r02b specification.

One of the conditions is when a PRLI (Process Login) is sent by
the initiator, and the Establish Image Pair bit is set in Word 0
of the PRLI.

Previously, the isp(4) driver core sent a notification via
isp_async() that the target had changed or stayed in place, but
there was no indication of whether a PRLI was sent and whether the
Establish Image Pair bit was set.

The result of this was that in some situations, notably
switching back and forth between a direct connection and a switch
connection to a tape drive, the isp(4) driver would fail to reset
the CRN in situations that require it according to the spec.  When
the CRN isn't reset in a situation that requires it, the tape drive
then rejects every subsequent command that is sent to the drive.
It is assuming that the commands are being sent out of order.

So, modify the isp(4) driver to include Word 0 of the PRLI command
when it sends isp_async() notifications of target changes.  Look at
the Establish Image Pair bit, and reset the CRN if that bit is set.
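
The CRN rules described above can be sketched as two small helpers. The names are hypothetical, and the bit position used for Establish Image Pair is an assumption made for illustration; the driver's own definitions in isp_stds.h are authoritative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the Establish Image Pair bit in PRLI Word 0. */
#define	TOY_PRLI_WD0_EST_IMAGE_PAIR	(1u << 13)

/* Advance the CRN through 1..255, wrapping back to 1 (0 is never used). */
static uint8_t
toy_crn_next(uint8_t crn)
{
	assert(crn != 0);
	return (crn == 255 ? 1 : (uint8_t)(crn + 1));
}

/* A PRLI with Establish Image Pair set is one condition that resets the CRN. */
static bool
toy_crn_needs_reset(uint32_t prli_word0)
{
	return ((prli_word0 & TOY_PRLI_WD0_EST_IMAGE_PAIR) != 0);
}
```

A caller would check toy_crn_needs_reset() when a target change notification arrives and, if it returns true, restart the sequence at 1 before issuing the next command.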

With this change, I am able to switch a tape drive back and forth
between a direct connection and a switch connection, and the isp(4)
driver resets the CRN when it should.

sys/dev/isp_stds.h:
	Add bit definitions for PRLI Word 0.

sys/dev/ispmbox.h:
	Add PRLI Word 0 to the port database type, isp_pdb_t.

sys/dev/ispvar.h:
	Add PRLI Word 0 to fcportdb_t.

sys/dev/isp.c:
	Populate the new prli_word0 parameter in the port database.

	In isp_pdb_add_update(), add a check to see if the
	Establish Image Pair bit is set in PRLI Word 0.  If it is,
	then that is an additional reason to create a change
	notification.

sys/dev/isp_freebsd.c:
	In isp_async(), if the device changed or stayed, look at
	PRLI Word 0 to see if the Establish Image Pair bit is set.
	If it is, reset the CRN if we haven't already.

MFC after:	1 week
Sponsored by:	Spectra Logic
Differential Revision:	https://reviews.freebsd.org/D19472
2019-03-11 14:21:14 +00:00
Andrey V. Elsukov
ca0f03e808 Add IP_FW_NAT64 to codes that ipfw_chk() can return.
It will be used by upcoming NAT64 changes. We use separate code
to avoid propagating EACCES error code to user level applications
when NAT64 consumes a packet.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2019-03-11 10:42:09 +00:00
Andrey V. Elsukov
d76227959a Add NULL pointer check to nat64_output().
It is possible that a processed packet was originated by the local host;
in this case m->m_pkthdr.rcvif is NULL. Check and set it to V_loif to
avoid NULL pointer dereference in IP input code, since it is expected
that packet has valid receiving interface when netisr processes it.

Obtained from:	Yandex LLC
MFC after:	1 week
Sponsored by:	Yandex LLC
2019-03-11 10:33:32 +00:00
Andriy Voskoboinyk
589526906c iwm(4): use correct channel list source for Intel 3168
Intel 3168 uses another EEPROM section to store channel flags;
port missing bits from iwlwifi to make it work.

PR:		230750, 236235
Tested by:	Bert JW Regeer <xistence@0x58.com>
MFC after:	3 days
2019-03-11 08:30:29 +00:00
Ian Lepore
6daa4a4079 Mark the imx_spi device busy while transfers are in progress, so that the
module can't be unloaded while interrupts are pending.
2019-03-11 03:07:05 +00:00
Andriy Voskoboinyk
b3ec1ab8dc urtw(4): add promiscuous mode callback
Also, pass control frames to the host while in MONITOR mode and / or
when promiscuous mode is enabled.

Tested with Netgear WG111 v3 (RTL8187B), STA / MONITOR modes.

MFC after:	2 weeks
2019-03-11 02:02:04 +00:00
Andriy Voskoboinyk
786ac7035f Fix ieee80211_radiotap(9) usage in wireless drivers:
- Alignment issues:
 * Add missing __packed attributes + padding across all drivers; in
most places there was an assumption that padding will be always
minimally suitable; in few places - e.g., in urtw(4) / rtwn(4) -
padding was just missing.
 * Add __aligned(8) attribute for all Rx radiotap headers since they can
contain 64-bit TSF timestamp; it cannot appear in Tx radiotap headers, so
just drop the attribute here. Refresh ieee80211_radiotap(9) man page
accordingly.

- Since net80211 automatically updates channel frequency / flags in
ieee80211_radiotap_chan_change() drop duplicate setup for these fields
in drivers.

Tested with Netgear WG111 v3 (urtw(4)), STA mode.

MFC after:	2 weeks
2019-03-11 01:27:01 +00:00
Edward Tomasz Napierala
4f4463dfa3 Fix crash in low memory conditions. Usual backtrace looked
like this:

pqisrc_build_sgl() at pqisrc_build_sgl+0x8d/frame 0xfffffe009e8b7a00
pqisrc_build_raid_io() at pqisrc_build_raid_io+0x231/frame 0xfffffe009e8b7a40
pqisrc_build_send_io() at pqisrc_build_send_io+0x375/frame 0xfffffe009e8b7b00
pqi_request_map_helper() at pqi_request_map_helper+0x282/frame 0xfffffe009e8b7ba0
bus_dmamap_load_ccb() at bus_dmamap_load_ccb+0xd7/frame 0xfffffe009e8b7c00
pqi_map_request() at pqi_map_request+0x9b/frame 0xfffffe009e8b7c70
pqisrc_io_start() at pqisrc_io_start+0x55c/frame 0xfffffe009e8b7d50
smartpqi_cam_action() at smartpqi_cam_action+0xb8/frame 0xfffffe009e8b7de0
xpt_run_devq() at xpt_run_devq+0x30a/frame 0xfffffe009e8b7e40
xpt_action_default() at xpt_action_default+0x94b/frame 0xfffffe009e8b7e90
dastart() at dastart+0x33b/frame 0xfffffe009e8b7ee0
xpt_run_allocq() at xpt_run_allocq+0x1a2/frame 0xfffffe009e8b7f30
dastrategy() at dastrategy+0x71/frame 0xfffffe009e8b7f60
g_disk_start() at g_disk_start+0x351/frame 0xfffffe009e8b7fc0
g_io_request() at g_io_request+0x3cf/frame 0xfffffe009e8b8010
g_part_start() at g_part_start+0x120/frame 0xfffffe009e8b8090
g_io_request() at g_io_request+0x3cf/frame 0xfffffe009e8b80e0
zio_vdev_io_start() at zio_vdev_io_start+0x4b2/frame 0xfffffe009e8b8140
zio_execute() at zio_execute+0x17c/frame 0xfffffe009e8b8180
zio_nowait() at zio_nowait+0xc4/frame 0xfffffe009e8b81b0
vdev_queue_io_done() at vdev_queue_io_done+0x138/frame 0xfffffe009e8b81f0
zio_vdev_io_done() at zio_vdev_io_done+0x151/frame 0xfffffe009e8b8220
zio_execute() at zio_execute+0x17c/frame 0xfffffe009e8b8260
taskqueue_run_locked() at taskqueue_run_locked+0x10c/frame 0xfffffe009e8b82c0
taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe009e8b82f0
fork_exit() at fork_exit+0x84/frame 0xfffffe009e8b8330
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe009e8b8330

Reviewed by:	deepak.ukey_microsemi.com, sbruno
MFC after:	2 weeks
Sponsored by:	Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D19470
2019-03-10 23:05:38 +00:00
Vladimir Kondratyev
2b4ee39838 atrtc(4): install ACPI RTC/CMOS operation region handler
FreeBSD base system does not provide an ACPI handler for the PC/AT RTC/CMOS
device with PnP ID PNP0B00; on some HP laptops, the absence of this handler
causes suspend/resume and poweroff(8) to hang or fail [1], [2]. On these
laptops EC _REG method queries the RTC date/time registers via ACPI
before suspending/powering off. The handler should be registered before
acpi_ec driver is loaded.

This change adds handler to access CMOS RTC operation region described in
section 9.15 of ACPI-6.2 specification [3]. It is installed only for ACPI
version of atrtc(4) so it should not affect old ACPI-less i386 systems.

It is possible to disable the handler with loader tunable:
debug.acpi.disabled=atrtc

Informational debugging printf can be enabled by setting hw.acpi.verbose=1
in loader.conf

[1] https://wiki.freebsd.org/Laptops/HP_Envy_6Z-1100
[2] https://wiki.freebsd.org/Laptops/HP_Notebook_15-af104ur
[3] https://uefi.org/sites/default/files/resources/ACPI_6_2.pdf

PR:		207419, 213039
Submitted by:	Anthony Jenkins <Scoobi_doo@yahoo.com>
Reviewed by:	ian
Discussed on:	acpi@, 2013-2015, several threads
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19314
2019-03-10 20:19:43 +00:00
Ian Lepore
68dd779577 Give the mx25l device sole ownership of the name /dev/flash/spi* instead of
trying to use disk_add_alias() to make spi* an alias for mx25l*.  It turns
out disk_add_alias() works for partitions, but not slices, and that's hard
to fix.

This change is, in effect, a partial revert of r344526.

The mips world relies on the existence of flashmap names formatted as
/dev/flash/spi0s.name, whereas pretty much nothing relies on at45d devices
using the /dev/spi* names (because until recently the at45d driver didn't
even work reliably). So this change makes mx25l devices the sole owner of
the /dev/flash/spi* namespace, which actually makes some sense because it is
a SpiFlash(tm) device, so flash/spi isn't a horrible name.

Reported by:	Mori Hiroki <yamori813@yahoo.co.jp>
2019-03-10 18:48:08 +00:00
Gleb Smirnoff
c93410229c Most Ethernet drivers that potentially can run a pfil(9) hook with
PFIL_MEMPTR flag are intentionally providing a memory address that
isn't aligned to pointer alignment. This is done to align an IPv4
or IPv6 header that is expected to follow Ethernet header.

When we return PFIL_REALLOCED we store a pointer to allocated mbuf
at this address. With this change the KPI changes to store the pointer
at an aligned address, which usually yields +2 bytes.

Provide two inlines:

pfil_packet_align() to get aligned pfil_packet_t for a misaligned one
pfil_mem2mbuf() to read out mbuf pointer from misaligned pfil_packet_t

Provide function pfil_realloc(), not used yet, that would convert a
memory pfil_packet_t to an mbuf one.
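
The alignment step can be illustrated with a generic round-up helper. This is a sketch of the arithmetic only, not the body of pfil_packet_align(); toy_align_up() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align, which must be a power of two. */
static uintptr_t
toy_align_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr + align - 1) & ~(align - 1));
}
```

For an address that sits 2 bytes past a 4-byte boundary, as happens when a 14-byte Ethernet header is placed so that the IP header that follows is aligned, toy_align_up(addr, 4) moves the address forward by 2 bytes.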

Reported by:	hps
Reviewed by:	hps, gallatin
2019-03-10 17:20:09 +00:00
Gleb Smirnoff
b9fdb4b3a3 Properly handle a case when a first filter returns PFIL_REALLOCED, then
second one returns PFIL_PASS.
2019-03-10 17:08:05 +00:00
Justin Hibbits
093f7de620 powerpc: Print trap frame address in ddb backtraces
Registers visible from 'show reg' don't always match the registers from the
offending trap frame.  Knowing the frame address lets one examine the
registers manually.

MFC after:	1 week
2019-03-09 03:24:39 +00:00
Justin Hibbits
66306e6acd powerpc: Print trap frame address for fatal traps
MFC after:	1 week
2019-03-09 03:18:37 +00:00
Bjoern A. Zeeb
6d8b651c1e Add two more products found inside a T480 to usbdevs.
Add an Intel Bluetooth module.
Add Synaptics as a vendor with a fingerprint reader product.

MFC after:		2 weeks
2019-03-09 03:15:09 +00:00
Justin Hibbits
b2c820735a powerpc: Print data address register on alignment exceptions
MFC after:	1 week
2019-03-09 03:10:56 +00:00
Bjoern A. Zeeb
715eb7e7d5 Try to improve comment for socket state bits.
In r324227 the comment moved into socketvar.h originally from
sockstate.h r180948.  Try to improve English and as a consequence rewrap
the comment.

No functional changes.

Reviewed by:		jhb (a wording suggestion)
Differential Revision:	https://reviews.freebsd.org/D13865
2019-03-09 01:37:00 +00:00
Warner Losh
2ffd6fce5b Don't print all the I/O we abort on a reset, unless we're out of
retries.

When resetting the controller, we abort I/O. Prior to this fix, we
printed a ton of abort messages for I/O that we're going to
retry. This imparts no useful information. Stop printing them unless
our retry count is exhausted. Clarify code for when we don't retry,
and remove useless arg to a routine that's always called with it
as 'true'. All the other debug is still printed (including multiple
reset messages if we have multiple timeouts before the taskqueue
runs the actual reset) so that we know when we reset.

Reviewed by: jimharris@, chuck@
Differential Revision: https://reviews.freebsd.org/D19431
2019-03-09 01:18:16 +00:00
Bjoern A. Zeeb
b25d74e06c Improve ARP logging.
r344504 added an extra ARP_LOG() call in case of an if_output() failure.
It turns out IPv4 can be noisy. In order to not spam the console by default:
(a) add a counter for these events so people can keep better track of how
    often it happens, and
(b) add a sysctl to select the default ARP_LOG log level and set it to
    INFO avoiding the one (the new) DEBUG level by default.

Claim a spare (1st one after 10 years since the stats were added) in order
to not break netstat from FreeBSD 12->13 updates in the future.

Reviewed by:		karels
Differential Revision:	https://reviews.freebsd.org/D19490
2019-03-09 01:12:59 +00:00
Alexander Motin
5ca679e3c4 MFV/ZoL: Disable LBA weighting on files and SSDs
The LBA weighting makes sense on rotational media where the outer tracks
have twice the bandwidth of the inner tracks. However, it is detrimental
on nonrotational media such as solid state disks, where the only effect
is to ensure that metaslabs enter the best-fit allocation behavior
sooner, which is detrimental to performance. It also makes no sense on
files where the underlying filesystem can arrange things however it
wants.

Author: Richard Yao <ryao@gentoo.org>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3712
zfsonlinux/zfs@fb40095f5f

To reduce code divergence this merge replaces equivalent but different
FreeBSD code detecting non-rotating medium vdevs.

MFC after:	1 month
2019-03-08 21:13:45 +00:00
Alexander Motin
673544c3dd Add separate aggregation limit for non-rotating media.
Before sequential scrub patches ZFS never aggregated I/Os above 128KB.
Sequential scrub bumped that to 1MB, the motivation for which I understand for
spinning disks, since it should reduce the number of head seeks.  But for
SSDs it makes much less sense to me, especially on FreeBSD, where due
to the MAXPHYS limitation the device will likely still see a bunch of 128KB I/Os
instead of one large one.  Having a stricter aggregation limit avoids
the allocation of a large memory buffer and memcpy to/from it, which is
a serious problem when bandwidth reaches a few GB/s.

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2019-03-08 19:38:52 +00:00
Alexander Motin
3a3ba532e7 MFV/ZoL: Fix zfs_vdev_aggregation_limit bounds checking
Update the bounds checking for zfs_vdev_aggregation_limit so that
it has a floor of zero and a maximum value of the supported block
size for the pool.

Additionally add an early return when zfs_vdev_aggregation_limit
equals zero to disable aggregation.  For very fast solid state or
memory devices it may be more expensive to perform the aggregation
than to issue the IO immediately.
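
A minimal sketch of the bounds check and the zero-disables rule, assuming the tunable arrives as a signed value and the pool's maximum block size is known. Both function names are hypothetical and do not come from the ZFS sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Clamp a requested aggregation limit into [0, max_block_size]. */
static uint64_t
toy_clamp_aggregation_limit(int64_t limit, uint64_t max_block_size)
{
	assert(max_block_size <= (uint64_t)INT64_MAX);
	if (limit < 0)
		return (0);
	if ((uint64_t)limit > max_block_size)
		return (max_block_size);
	return ((uint64_t)limit);
}

/* A limit of zero means "do not aggregate", so the caller returns early. */
static bool
toy_aggregation_enabled(uint64_t clamped_limit)
{
	return (clamped_limit != 0);
}
```

An aggregation routine built on this would compute the clamped limit first and return immediately when toy_aggregation_enabled() is false, issuing each I/O on its own.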

Author: Brian Behlendorf <behlendorf1@llnl.gov>
zfsonlinux/zfs@a58df6f536

MFV/ZoL: Cap maximum aggregate IO size

Commit 8542ef8 allowed optional IOs to be aggregated beyond
the specified aggregation limit.  Since the aggregation limit
was also used to enforce the maximum block size, setting
`zfs_vdev_aggregation_limit=16777216` could result in an
attempt to allocate an ABD larger than 16M.

Author: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #6259
Closes #6270
zfsonlinux/zfs@2d678f779a
2019-03-08 18:49:27 +00:00
Michael Tuexen
3a35ad54a8 Fix locking bug.
MFC after:		3 days
2019-03-08 18:17:57 +00:00
Michael Tuexen
a458a6e620 Some cleanup and consistency improvements.
MFC after:		3 days
2019-03-08 18:16:19 +00:00
Kristof Provost
f8e7fe32a4 pf: Fix DIOCGETSRCNODES
r343295 broke DIOCGETSRCNODES by failing to reset 'nr' after counting the
number of source tracking nodes.
This meant that we never copied the information to userspace, leading to '? ->
?' output from pfctl.
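
The count-then-copy pattern behind this bug can be sketched as follows. This is a hypothetical stand-in that uses plain integers for the source nodes; it is not the pf ioctl code, and only the variable name 'nr' is borrowed from the message above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Export up to out_cap nodes. The same variable, nr, serves first as the
 * node count and then as the output index.
 */
static size_t
toy_export_nodes(const int *nodes, size_t n, int *out, size_t out_cap)
{
	size_t i, nr, total;

	assert(nodes != NULL && out != NULL);

	/* Pass 1: count the nodes. */
	nr = 0;
	for (i = 0; i < n; i++)
		nr++;
	total = (nr < out_cap) ? nr : out_cap;

	/* Without this reset, nr is already >= total and pass 2 copies nothing. */
	nr = 0;

	/* Pass 2: copy the nodes out. */
	for (i = 0; i < n && nr < total; i++)
		out[nr++] = nodes[i];
	return (nr);
}
```

Deleting the `nr = 0;` line in this sketch makes the function return without writing to `out`, which mirrors the empty output described in the commit message.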

PR:		236368
MFC after:	1 week
2019-03-08 09:33:16 +00:00
Hans Petter Selasky
6b94b89bdb Teardown ifnet after stopping port in the mlx4en(4) driver.
mlx4_en_stop_port() calls mlx4_en_put_qp() which can refer to the link level
address of the network interface, which in turn will be freed by the
network interface detach function. Make sure the port is stopped
before detaching the network interface.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-03-08 09:18:29 +00:00
Hans Petter Selasky
e0ba1be6d7 Don't hold state lock while detaching network device instance in mlx4en(4).
It can happen during shutdown that the lock will recurse when the mlx4en(4)
instance is part of a lagg interface. Call ether_ifdetach() unlocked.

Backtrace:
panic(): _sx_xlock_hard: recursed on non-recursive sx &mdev->state_lock
_sx_xlock_hard()
_sx_xlock()
mlx4_en_ioctl()
if_setlladdr()
lagg_port_destroy()
lagg_port_ifdetach()
if_detach()
mlx4_en_destroy_netdev()
mlx4_en_remove()
mlx4_remove_device()
mlx4_unregister_device()
mlx4_unload_one()
mlx4_shutdown()
linux_pci_shutdown()
bus_generic_shutdown()

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2019-03-08 09:16:29 +00:00
Justin Hibbits
1cd7081eb1 powerpc64: Fix early exit with invalid kernel SLB entries
The check for early exit should be checking the SLB entry itself.  As
currently written it was checking the address of the SLB, which is always
non-zero, so would go through the kernel SR restore loop regardless.

Submitted by:	mmacy
MFC after:	2 weeks
2019-03-08 04:20:33 +00:00
Justin Hibbits
9ffdae0fd7 powerpc: Fix cpufreq statement scoping
The second statements on the lines are not guarded by the `if' condition.
This triggers a warning with newer gcc.  It's relatively harmless given the
usage, but incorrect.  Instead, wrap the statements so they're properly
guarded.
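
The scoping problem is a general C pitfall and can be shown in isolation. The functions below are hypothetical and unrelated to the cpufreq sources; the misleading one-line form appears only in a comment so that the sketch itself compiles without the warning.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Written on one line as
 *	if (want) *a = 1; *b = 2;
 * only the first statement belongs to the `if'. The function below spells
 * out what the compiler actually sees.
 */
static void
toy_set_unguarded(int *a, int *b, int want)
{
	assert(a != NULL && b != NULL);
	if (want)
		*a = 1;
	*b = 2;		/* always runs, regardless of want */
}

/* Wrapping both statements in braces guards them together. */
static void
toy_set_guarded(int *a, int *b, int want)
{
	assert(a != NULL && b != NULL);
	if (want) {
		*a = 1;
		*b = 2;
	}
}
```

Calling toy_set_unguarded() with want == 0 still stores 2 through b, while toy_set_guarded() leaves both values untouched.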

Reported by:	powerpc64-gcc xtoolchain
MFC after:	1 week
2019-03-08 03:59:53 +00:00
Conrad Meyer
ab69c4858c Fortuna: Add Chacha20 as an alternative stream cipher
Chacha20 with a 256 bit key and 128 bit counter size is a good match for an
AES256-ICM replacement.

In userspace, Chacha20 is typically marginally slower than AES-ICM on
machines with AESNI intrinsics, but typically much faster than AES on
machines without special intrinsics.  ChaCha20 does well on typical modern
architectures with SIMD instructions, which includes most types of machines
FreeBSD runs on.

In the kernel, we can't (or don't) make use of AESNI intrinsics for
random(4) anyway.  So even on amd64, using Chacha provides a modest
performance improvement in random device throughput today.

This change makes the stream cipher used by random(4) configurable at boot
time with the 'kern.random.use_chacha20_cipher' tunable.

Very rough, non-scientific measurements at the /dev/random device, on a
GENERIC-NODEBUG amd64 VM with 'pv', show a factor of 2.2x higher throughput
for Chacha20 over the existing AES-ICM mode.

Reviewed by:	delphij, markm
Approved by:	secteam (delphij)
Differential Revision:	https://reviews.freebsd.org/D19475
2019-03-08 01:17:20 +00:00
Bjoern A. Zeeb
30b450774e Update for IETF draft-ietf-6man-ipv6only-flag.
When we roam between networks and our link-state goes down, automatically remove
the IPv6-Only flag from the interface.  Otherwise we might switch from an
IPv6-only to an IPv4-only network and the flag would stay and we would prevent
IPv4 from working.

While the actual function call to clear the flag is under EXPERIMENTAL,
the eventhandler is not, as we might want to re-use it for other
functionality on link-down events (such as re-calculating default routers,
for example, if there is more than one).

Reviewed by:	hrs
Differential Revision:	https://reviews.freebsd.org/D19487
2019-03-07 23:03:39 +00:00
Alexander Motin
ede8782611 Improve entropy for ZFS taskqueue selection.
I just found that, at least on Skylake CPUs, cpu_ticks() never returns odd
values, only even ones, and possibly has an even bigger step (176/2?).  That
makes its lower bits a very bad entropy source, leaving half of the
taskqueues unused.  Switching to sbinuptime(), which is closer to upstream,
mitigates the problem because the rate conversion works as a kind of hash
function.  In case that is somehow not enough (the timer rate is too low or
too divisible), mix in curcpu.
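
The effect of a coarse step on a modulo-based selection can be sketched as follows. The queue count and the helper name are assumptions made for illustration; the sketch only demonstrates the problem, not the sbinuptime()-based fix.

```c
#include <assert.h>
#include <stdint.h>

#define NQUEUES 8	/* hypothetical number of taskqueues */

/* Count how many of NQUEUES buckets are hit by n samples spaced by step. */
static int
buckets_used(uint64_t start, uint64_t step, int n)
{
	int used[NQUEUES] = { 0 };
	int count = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++)
		used[(start + (uint64_t)i * step) % NQUEUES] = 1;
	for (int i = 0; i < NQUEUES; i++)
		count += used[i];
	return (count);
}
```

With a step of 2, only every other bucket is ever selected; a step that is a multiple of NQUEUES selects a single bucket.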

MFC after:	1 week
2019-03-07 22:56:39 +00:00
Brooks Davis
9e23ca1c94 Correct my previous correction to the license. It now matches the text
in https://spdx.org/licenses/GPL-2.0.html
2019-03-07 22:34:45 +00:00
Brooks Davis
b1329b31f7 Correct license boilerplate, to match the SPDX tag.
The GPL-2.0 tag is a deprecated tag which means the same thing as
GPL-2.0-only.
2019-03-07 22:20:20 +00:00
Emmanuel Vadot
d83a581cad arm64: allwinner: a64: Add TCON clock
The tcon clock needs a mux table for its parent, for now just
list the parents twice.
2019-03-07 19:32:01 +00:00
Emmanuel Vadot
1788e14d92 arm64: allwinner: Add CCU DE2
The Display Engine 2 has its own Clock and Control Unit, add support
for it.
2019-03-07 19:30:37 +00:00
Emmanuel Vadot
2b0adb4404 arm: allwinner: Fix NM clock recalc
If the NM clock is using a fractional divider the formula isn't the same.
2019-03-07 19:28:47 +00:00
Andrey V. Elsukov
40025d42fd Fix typo.
MFC after:	1 week
2019-03-07 10:01:32 +00:00
Michael Tuexen
e6dcce69ca After removing an entry from the stream scheduler list, set the pointers
to NULL, since we are checking for it in case the element gets inserted
again.
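
A minimal sketch of the pattern, using the TAILQ macros from <sys/queue.h>. The element type and function names are hypothetical and are not the SCTP stream scheduler's own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

struct stream {				/* hypothetical element type */
	int id;
	TAILQ_ENTRY(stream) link;
};
TAILQ_HEAD(stream_list, stream);

/* An element counts as "not on the list" when both link pointers are NULL. */
static bool
on_list(const struct stream *s)
{
	assert(s != NULL);
	return (s->link.tqe_next != NULL || s->link.tqe_prev != NULL);
}

static void
sched_add(struct stream_list *l, struct stream *s)
{
	if (!on_list(s))
		TAILQ_INSERT_TAIL(l, s, link);
}

static void
sched_remove(struct stream_list *l, struct stream *s)
{
	if (!on_list(s))
		return;
	TAILQ_REMOVE(l, s, link);
	/* Without this reset, a later sched_add() would see stale pointers. */
	s->link.tqe_next = NULL;
	s->link.tqe_prev = NULL;
}
```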

This issue was found by running syzkaller.

MFC after:		3 days
2019-03-07 08:43:20 +00:00
Justin Hibbits
058250a8ab powerpc: Save stack pointer in savectx
This allows 'show acttrace' to show backtrace on processes currently running
on CPUs.

Reported by:	Brandon Bergren
MFC after:	1 week
2019-03-07 04:43:08 +00:00
Andrey V. Elsukov
83354acf5a Fix the problem with O_LIMIT states introduced in r344018.
dyn_install_state() uses the `rule` pointer when it creates a state.
For O_LIMIT states this pointer is actually not a struct ip_fw;
it is a pointer to the O_LIMIT_PARENT state, which keeps the actual pointer
to the ip_fw parent rule. Thus we need to cache the rule id and number
before calling dyn_get_parent_state(), so we can use them later
when the `rule` pointer is overridden.

PR:		236292
MFC after:	3 days
2019-03-07 04:40:44 +00:00
Matt Macy
8ea23c2b5b add GPL text in addition to SPDX tags as requested by core
MFC after:	1 week
2019-03-07 03:53:48 +00:00
Matt Macy
030963c090 add gcov to LINT build
MFC after:	1 week
2019-03-07 03:50:34 +00:00
Matt Macy
b02af3b2cf Add build time GPL warning when GCOV is enabled
MFC after:	1 week
2019-03-07 03:47:41 +00:00
Alexander Motin
551b7d3a29 Add respective tunables to few ZFS sysctls.
MFC after:	1 week
2019-03-07 01:24:08 +00:00
Conrad Meyer
9a6a45d850 fuse: switch from DFLTPHYS/MAXBSIZE to maxcachebuf
On GENERIC kernels with empty loader.conf, there is no functional change.
DFLTPHYS and MAXBSIZE are both 64kB at the moment.  This change allows
larger bufcache block sizes to be used when either MAXBSIZE (custom kernel)
or the loader.conf tunable vfs.maxbcachebuf (GENERIC) is adjusted higher
than the default.

Suggested by:	ken@
2019-03-07 00:55:49 +00:00
Bjoern A. Zeeb
21231a7aa6 Update for IETF draft-ietf-6man-ipv6only-flag.
All changes are hidden behind the EXPERIMENTAL option and are not compiled
in by default.

Add ND6_IFF_IPV6_ONLY_MANUAL to be able to set the interface into no-IPv4-mode
manually without router advertisement options.  This will allow developers to
test software for the appropriate behaviour even on dual-stack networks or
IPv6-Only networks without the option being set in RA messages.
Update ifconfig to allow setting and displaying the flag.

Update the checks for the filters to check for either the automatic or the manual
flag to be set.  Add REVARP to the list of filtered IPv4-related protocols and add
an input filter similar to the output filter.

Add a check, when receiving the IPv6-Only RA flag to see if the receiving
interface has any IPv4 configured.  If it does, ignore the IPv6-Only flag.

Add a per-VNET global sysctl, which is on by default, to not process the automatic
RA IPv6-Only flag.  This way an administrator (if this is compiled in) has control
over the behaviour in case the node still relies on IPv4.
2019-03-06 23:31:42 +00:00
Conrad Meyer
e7df98863b FUSE: Prevent trivial panic
When open(2) was invoked against a FUSE filesystem with an unexpected flags
value (no O_RDONLY / O_RDWR / O_WRONLY), an assertion fired, causing panic.

For now, prevent the panic by rejecting such VOP_OPENs with EINVAL.

This is not considered the correct long term fix, but does prevent an
unprivileged denial-of-service.

PR:		236329
Reported by:	asomers
Reviewed by:	asomers
Sponsored by:	Dell EMC Isilon
2019-03-06 22:56:49 +00:00
John Baldwin
2e43efd0bb Drop "All rights reserved" from my copyright statements.
Reviewed by:	rgrimes
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D19485
2019-03-06 22:11:45 +00:00
Mark Johnston
f3af92bd36 Reorder copyright lines to preserve the source of "All rights reserved."
Reported by:	rgrimes
MFC with:	r344829, r344830
2019-03-06 16:50:14 +00:00
Adrian Chadd
34d5464b85 [ath_hal_ar9300] Add the missing bits from the previous HAL commit.
Noticed by: 75+ emails telling me I messed up.
2019-03-06 08:52:02 +00:00
Adrian Chadd
7fbcfe69e7 [ath_hal] [ath_hal_ar9300] ANI fixes and preparation for userland control.
* The ani function bitmap was being badly used when determining if a command
  could be used.  In hostap modes only a couple of the ANI control parameters
  are enabled.

* The ani function bitmap was not being reset to HAL_ANI_ALL if transitioning
  from AP -> STA.

* Change mrcCckOff to mrcCck - 1 == on, rather than 1 == off.  This matches
  the API used to set the value from userland via the diagnostic API.

* Handle OFDM/CCK noise immunity level commands in ar9300_ani_control().
  These will only come from userland and it will go and program the rest of
  the ANI control parameters with the values in the ANI table.

* Ensure all of the ANI parameters can be tweaked at runtime, even if they're
  disabled.

Tested:

* carambola2 (AR9331), STA/AP modes
2019-03-06 07:54:29 +00:00
Mark Johnston
3b5b20292b Implement minidump support for RISC-V.
Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D18320
2019-03-06 00:01:06 +00:00
Mark Johnston
3a3dfb2815 Initialize dump_avail[] on riscv.
Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D19170
2019-03-05 23:58:16 +00:00
Mark Johnston
91c3fda00b Add pmap_get_tables() for riscv.
This mirrors the arm64 implementation and is for use in the minidump
code.

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D18321
2019-03-05 23:56:40 +00:00
Mark Johnston
6a85590370 Show wiring state of map entries in procstat -v.
Note that only entries wired by userspace are shown as such.  In
particular, entries transiently wired by sysctl_wire_old_buffer() are
not flagged as wired in procstat -v output.

Reviewed by:	kib (previous version)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19461
2019-03-05 19:45:37 +00:00
Eric Joyner
bc408c7d61 Remove references to CONTIGMALLOC_WORKS in iflib and em
From Jake:
"The iflib_fl_setup() function tries to pick various buffer sizes based
on the max_frame_size value defined by the parent driver. However, this
code was wrapped under CONTIGMALLOC_WORKS, which was never actually
defined anywhere.

This same code pattern was used in if_em.c, likely trying to match
what iflib uses.

Since CONTIGMALLOC_WORKS is not defined, remove this dead code from
iflib_fl_setup and if_em.c

Given that various iflib drivers appear to be using a similar
calculation, it might be worth making this buffer size a value that the
driver can peek at in the future."

Submitted by:	Jacob Keller <jacob.e.keller@intel.com>
Reviewed by:	shurd@
MFC after:	1 week
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D19199
2019-03-05 19:12:51 +00:00
Kristof Provost
5ea5849a7b tun: VIMAGE fix for if_tun cloner
The if_tun cloner is not virtualised, but if_clone_attach() does use a
virtualised list of cloners.
The result is that we can't find the if_tun cloner when we try to remove
a renamed tun interface. Virtualise the cloner, and move the final
cleanup into a sysuninit so that we're sure this happens after all of
the vnet_sysuninits

Note that we need unit numbers to be system-unique (rather than unique
per vnet, as is done by if_clone_simple()). The unit number is used to
create the corresponding /dev/tunX device node, and this node must match
with the interface.
Switch to if_clone_advanced() so that we have control over the unit
numbers.

Reproduction scenario:
	jail -c -n test persist vnet
	jexec test ifconfig tun create
	jexec test ifconfig tun0 name wg0
	jexec test ifconfig wg0 destroy

PR:		235704
Reviewed by:	bz, hrs, hselasky
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19248
2019-03-05 13:21:07 +00:00
Marcel Moolenaar
96937e3b23 Revert revision 254095
In revision 254095, gpt_entries is not set to match the on-disk
hdr_entries, but rather is computed based on available space.
There are 2 problems with this:

1.  The GPT backend respects hdr_entries and only reads and writes
    that number of partition entries.  On top of that, CRC32 is
    computed over the table that has hdr_entries elements.  When
    the common code works on what is possibly a larger number, the
    behaviour becomes inconsistent and problematic.  In particular,
    it would be possible to add a new partition that on a reboot
    isn't there anymore.
2.  The calculation of gpt_entries is based on flawed assumptions.
    The GPT specification does not dictate that sectors are laid
    out in a particular way that the available space can be
    determined by looking at LBAs.  In practice, implementations
    do the same thing, because there's no reason to do it any
    other way.  Still, GPT allows certain freedoms that can be
    exploited in some form or shape if the need arises.

PR:		229977
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19438
2019-03-05 04:15:34 +00:00
Alexander Motin
c3c93809f6 bridge: Fix spurious warnings about capabilities
Mask off the bits we don't care about when checking that capabilities
of the member interfaces have been disabled as intended.

Submitted by:	Ryan Moeller <ryan@ixsystems.com>
Reviewed by:	kristof, mav
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D18924
2019-03-04 22:01:09 +00:00
Dimitry Andric
1791078b17 Set tentative merge date, and bump __FreeBSD_version. 2019-03-04 19:23:11 +00:00
Edward Tomasz Napierala
01c27978f5 Don't pass td to nfsvno_open().
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-03-04 14:50:00 +00:00
Edward Tomasz Napierala
127152fe56 Don't pass td to nfsvno_createsub().
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-03-04 14:30:53 +00:00
Edward Tomasz Napierala
5edc9102dc Don't pass td to nfsd_fhtovp(), it's unused.
Reviewed by:	rmacklem (earlier version)
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19421
2019-03-04 13:18:04 +00:00
Edward Tomasz Napierala
af444b18ed Push down the thread argument in NFS server code, using curthread
instead of passing it explicitly. No functional changes

Reviewed by:	rmacklem (earlier version)
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19419
2019-03-04 13:12:23 +00:00
Edward Tomasz Napierala
113aa93390 Push down td in nfsrvd_dorpc() - make it use curthread instead
of it being explicitly passed as an argument. No functional changes.

The big picture here is that I want to get rid of the 'td' argument
being passed everywhere, and this is the first piece that affects
the NFS server.

Reviewed by:	rmacklem
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19417
2019-03-04 13:02:36 +00:00
Fedor Uporov
9441309ae0 Fix double free in case of mount error.
Reported by:    Christopher Krah <krah@protonmail.com>
Reported as:    FS-9-EXT3-2: Denial Of Service in nmount-5 (vm_fault_hold)
Reviewed by:    pfg
MFC after:      1 week

Differential Revision:    https://reviews.freebsd.org/D19385
2019-03-04 11:33:49 +00:00
Fedor Uporov
3eed9f20d4 Do not read the on-disk inode in case of vnode allocation.
Reported by:    Christopher Krah <krah@protonmail.com>
Reported as:    FS-6-EXT2-4: Denial Of Service in mkdir-0 (ext2_mkdir/vn_rdwr)
Reviewed by:    pfg
MFC after:      1 week

Differential Revision:    https://reviews.freebsd.org/D19327
2019-03-04 11:27:47 +00:00
Fedor Uporov
736da5176d Fix integer overflow possibility.
Reported by:    Christopher Krah <krah@protonmail.com>
Reported as:    FS-2-EXT2-1: Out-of-Bounds Write in nmount (ext2_vget)
Reviewed by:    pfg
MFC after:      1 week

Differential Revision:    https://reviews.freebsd.org/D19326
2019-03-04 11:19:21 +00:00
Fedor Uporov
4ff6603ab3 Do not panic if inode bitmap is corrupted.
admbug:         804
Reported by:    Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by:    pfg
MFC after:      1 week

Differential Revision:    https://reviews.freebsd.org/D19325
2019-03-04 11:12:19 +00:00
Fedor Uporov
80a4a9716b Validate block bitmaps.
Reviewed by:    pfg
MFC after:      1 week

Differential Revision:    https://reviews.freebsd.org/D19324
2019-03-04 11:01:23 +00:00
Fedor Uporov
daa2d62da2 Add additional on-disk inode checks.
Reviewed by:    pfg
MFC after:      1 week

Differential Revision:    https://reviews.freebsd.org/D19323
2019-03-04 10:55:01 +00:00
Fedor Uporov
6e38bf94e5 Make superblock reading logic more strict.
Add more on-disk superblock consistency checks to ext2_compute_sb_data() function.
It should decrease the probability of mounting filesystems with corrupted superblock data.

Reviewed by:    pfg
MFC after:      1 week

Differential Revision:    https://reviews.freebsd.org/D19322
2019-03-04 10:42:25 +00:00
Adrian Chadd
647915ff20 [ath_hal_ar9300] Add the extra ANI configuration fields for the AR93xx HAL.
Tested:

* Carambola2 (Ar9331), STA/AP modes
2019-03-04 06:43:00 +00:00
Adrian Chadd
dc5c74a6f4 [ath_hal] add extra ANI fields for the AR9300 HAL.
I'm trying to debug why reception upstairs here is so terrible and it
turns out ANI is buggy.  (Which is no surprise, ANI is always buggy.)

Tested:

* Carambola2 (AR9331), STA/AP modes
2019-03-04 06:42:06 +00:00
Andriy Voskoboinyk
7f74097165 rtwn_usb(4): fix Tx instability with RTL8192CU chipsets
- Fix data frames transmission via POWER_STATUS register setup -
it seems to be set by MACID_CONFIG firmware command, which was broken*
in r290439 and later disabled in r307529.

We can re-enable it later if / when firmware rate adaptation is
ready; however, this step will be required anyway - for firmware-less
builds.

- Force RTS / CTS protection frame rate to CCK1 (this rate works fine
without any additional setup; no better workaround is known yet).

The problem was not observed on channel 1 or with the CCK1 rate enforced
('ifconfig wlan0 ucastrate 1' for 11 b/g; not possible for 11n networks
due to an ifconfig(8) bug).

* I'm not sure if it worked before r290439 because - AFAIR - I have never seen
firmware rate adaptation working for 10-STABLE urtwn(4)
(it needs the EN_BCN bit set and RSSI updates at least).

Tested with RTL8188CUS in STA mode
(in regular mode and with disabled MRR - DARFRC*8 is set to 0)

PR:		233949
MFC after:	2 weeks
2019-03-04 03:02:14 +00:00
Andriy Voskoboinyk
d225ba4aa7 rtwn_usb(4): fix LED blinking for RTL8192CU during scanning
Tested with RTL8188CUS, STA mode.

MFC after:	5 days
2019-03-04 01:54:28 +00:00
Alexander Motin
053db1fefd Reduce CTL threads priority to about PUSER.
Since in most configurations CTL serves as a network service, we found
that this change improves local system interactivity under heavy load.
Priority of main threads is set slightly higher than worker taskqueues
to make them quickly sort incoming requests without creating bottlenecks,
while plenty of worker taskqueues should be less sensitive to latency.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2019-03-04 00:49:07 +00:00
Michael Tuexen
be62c88b80 Allocate an association id and register the stcb while holding the lock.
This avoids a race where stcbs can be found which are not completely
initialized.

This was found by running syzkaller.

MFC after:		3 days
2019-03-03 19:55:06 +00:00
Gleb Smirnoff
3fe00ac483 Remove bogus assert that I added in r319722. It is a legitimate case
to call soabort() on a newborn socket created by sonewconn() in case
further setup of the PCB failed. Code in sofree() handles such a socket
correctly.

Submitted by:	jtl, rrs
MFC after:	3 weeks
2019-03-03 18:57:48 +00:00
Warner Losh
95108cadbc Add ABORTED_BY_REQUEST to the list of things we look at DNR bit and tell why to comment (code already does this) 2019-03-03 03:36:33 +00:00
Ian Lepore
e70ece1297 Allow the sector size of the disk device to be configured using hints or
FDT data.  The sector size must be a multiple of the device's page size.
If not configured, use the historical default of the device page size.
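
The rule stated above can be sketched as a small helper. The function name and the convention that 0 means "not configured" are assumptions for illustration, not the driver's actual interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the sector size to use, or 0 if the configured value is invalid.
 * A configured value of 0 means "not configured" in this sketch.
 */
static uint32_t
choose_sector_size(uint32_t configured, uint32_t page_size)
{
	assert(page_size != 0);
	if (configured == 0)
		return (page_size);	/* historical default */
	if (configured % page_size != 0)
		return (0);		/* not a multiple of the page size */
	return (configured);
}
```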

Setting the disk sector size to 512 or 4096 allows a variety of standard
filesystems to be used on the device.  Of course you wouldn't want to be
writing frequently to a SPI flash chip like it was a disk drive, but for
data that gets written once (or rarely) and read often, using a standard
filesystem is a nice convenient thing.
2019-03-02 23:20:47 +00:00
Ian Lepore
2274a2f725 Add some comments. Give #define'd names to some scattered numbers. Change
some #define'd names to be more descriptive.  When reporting a post-write
compare failure, report the page number, not the byte address of the page.
The latter is the only functional change, it makes the number match the
words of the error message.
2019-03-02 22:28:43 +00:00
Justin Hibbits
83b009dab5 powerpc: fix 'show spr' for ELFv1 powerpc64
Update and flush the right cache range for the ELFv1 ABI.

MFC after:	1 week
2019-03-02 21:11:46 +00:00
Justin Hibbits
5b4c63b781 powerpc/booke: Depessimize MAS register updates even more
Remove isyncs between MAS register updates in the TLB miss handler, since
it's only needed before the TLB update instructions.
2019-03-02 20:59:18 +00:00
Ian Lepore
d4249d08d9 Bugfix: use a dummy buffer for the inactive side of a transfer.
This is especially important for writes.  SPI is inherently a bidirectional
bus; you receive data (even if it's garbage) while writing.  We should not
receive that data into the same buffer we're writing to the device.

When reading it doesn't matter what we send to the device, but using the
dummy buffer for that as well is pleasingly symmetrical.
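
A rough sketch of the idea, with a simulated full-duplex transfer. Both functions and the buffer limit are hypothetical; a real driver would hand the buffers to the SPI bus layer instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XFER_MAX 64	/* hypothetical upper bound on a transfer */

/* Simulated full-duplex transfer: every byte sent clocks one byte in. */
static void
spi_xfer(const uint8_t *tx, uint8_t *rx, size_t len)
{
	assert(tx != NULL && rx != NULL);
	for (size_t i = 0; i < len; i++)
		rx[i] = 0xff;	/* garbage received while the device listens */
}

/* Write len bytes; received bytes land in a scratch buffer, not in data. */
static int
spi_write(const uint8_t *data, size_t len)
{
	uint8_t dummy[XFER_MAX];

	if (len > sizeof(dummy))
		return (-1);
	spi_xfer(data, dummy, len);
	return (0);
}
```

Passing the caller's buffer as both tx and rx would overwrite the data being written with whatever the device returned.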
2019-03-02 20:58:51 +00:00
Michael Tuexen
5f98c80550 Remove debug output.
MFC after:		3 days
2019-03-02 16:10:11 +00:00
Michael Tuexen
bab9988af5 Allow SCTP stream reconfiguration operations only in ESTABLISHED
state.

This issue was found by running syzkaller.

MFC after:		3 days
2019-03-02 14:30:27 +00:00
Michael Tuexen
49f1449309 Handle the case when calling the IPPROTO_SCTP level socket option
SCTP_STATUS on an association with no primary path (early state).

This issue was found by running syzkaller.

MFC after:		3 days
2019-03-02 14:15:33 +00:00
Michael Tuexen
e57d481c5e Report the correct length when using the IPPROTO_SCTP level
socket options SCTP_GET_PEER_ADDRESSES and SCTP_GET_LOCAL_ADDRESSES.
2019-03-02 13:12:37 +00:00
Navdeep Parhar
b43e2d7de6 cxgbev(4): Enable 32b port capabilities in the VF driver.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-03-02 04:39:59 +00:00
Justin Hibbits
51244b1e46 powerpc: Scale intrcnt by mp_ncpus
On very large powerpc64 systems (2x22x4 power9) it's very easy to run out of
available IRQs and crash the system at boot.  Scale the count by mp_ncpus,
similar to x86, so this doesn't happen.  Scaling the I/O IRQs as well is
left for future work.

Submitted by:	mmacy
MFC after:	3 weeks
2019-03-02 01:51:41 +00:00
Conrad Meyer
7d93ab5e35 Embedded chacha: Add 0-bit iv + 128-bit counter mode
This mode might be suitable for a Fortuna keystream primitive.

Reviewed by:	markm
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19410
2019-03-01 23:30:23 +00:00
Conrad Meyer
e66ccbeaa3 fortuna: Deduplicate kernel vs user includes
No functional change.

Reviewed by:	markj, markm
Approved by:	secteam (delphij), core (brooks)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19409
2019-03-01 22:51:45 +00:00
John Baldwin
2c352feb3b Fix missed posted interrupts in VT-x in bhyve.
When a vCPU is HLTed, interrupts with a priority below the processor
priority (PPR) should not resume the vCPU while interrupts at or above
the PPR should.  With posted interrupts, bhyve maintains a bitmap of
pending interrupts in the PIR descriptor along with a single 'pending'
bit.  A CPU running in guest mode checks this bit at various places to
determine if the bitmap should be checked.  In addition, another CPU
can force a CPU in guest mode to check for pending interrupts by
sending an IPI to a special IDT vector reserved for this purpose.

bhyve had a bug in that it would only notify a guest vCPU of an
interrupt (e.g. by sending the special IPI or by resuming it if it was
idle due to HLT) if an interrupt arrived that was higher priority than
PPR and no interrupts were currently pending.  This assumed that if
the 'pending' bit was set, any needed notification was already in
progress.  However, if the first interrupt sent to a HLTed vCPU was
lower priority than PPR and the second was higher than PPR, the first
interrupt would set 'pending' but not notify the vCPU, and the second
interrupt would not notify the vCPU because 'pending' was already set.
To fix this, track the priority of pending interrupts in a separate
per-vCPU bitmask and notify a vCPU anytime an interrupt arrives that
is above PPR and higher than any previously-received interrupt.

This was found and debugged in the bhyve port to SmartOS maintained by
Joyent.  Relevant SmartOS bugs with more background:

https://smartos.org/bugview/OS-6829
https://smartos.org/bugview/OS-6930
https://smartos.org/bugview/OS-7354

Submitted by:	Patrick Mooney <pmooney@pfmooney.com>
Reviewed by:	tychon, rgrimes
Obtained from:	SmartOS / Joyent
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19299
2019-03-01 20:43:48 +00:00
Conrad Meyer
51c68d18e2 Fortuna: push CTR-mode loop down into randomdev hash.h interface
As a step towards adding other potential streaming ciphers.  As well as just
pushing the loop down into the rijndael APIs (basically 128-bit wide AES-ICM
mode) to eliminate some excess explicit_bzero().

No functional change intended.

Reviewed by:	delphij, markm
Approved by:	secteam (delphij)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19411
2019-03-01 19:21:45 +00:00
Michael Tuexen
20ab225b61 Honor the memory limits provided when processing the IPPROTO_SCTP
level socket option SCTP_GET_LOCAL_ADDRESSES in a getsockopt() call.

Thanks to Thomas Barabosch for reporting the issue which was found by
running syzkaller.

MFC after:		3 days
2019-03-01 18:47:41 +00:00
Edward Tomasz Napierala
1699546def Remove sv_pagesize, originally introduced with r100384.
In all of the architectures we have today, we always use PAGE_SIZE.
While in theory one could define different things, none of the
current architectures do, even the ones that have transitioned from
32-bit to 64-bit like i386 and arm. Some ancient mips binaries on
other systems used 8k instead of 4k, but we don't support running
those and likely never will due to their age and obscurity.

Reviewed by:	imp (who also contributed the commit message)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19280
2019-03-01 16:16:38 +00:00
Michael Tuexen
3aee58ca76 Improve consistency, no functional change.
MFC after:		3 days
2019-03-01 15:57:55 +00:00
Alexander Motin
5a62e92f44 There is no device atacard but there is device atapccard.
Reported by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after:	1 week
2019-03-01 15:00:13 +00:00
Bjoern A. Zeeb
4974f2b172 Add ushort and ulong to linux/types.h.
When porting code once written for Linux we find not only uints but also ushort and ulong.
Provide central typedefs as part of the linuxkpi for those as well.

Reviewed by:	hselasky, emaste
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19405
2019-03-01 14:33:20 +00:00
Emmanuel Vadot
a5120cf00c arm64: rockchip: rk3399_pll: Fix the recalc function
The PLL frequencies are now correctly calculated in fractional mode
and integer mode.
While here, add some debug printfs (disabled by default).
Tested with powerd on the little cluster on a RockPro64.

MFC after:	1 week
2019-03-01 13:05:37 +00:00
Kristof Provost
6f4909de5f pf: IPv6 fragments with malformed extension headers could be erroneously passed by pf or cause a panic
We mistakenly used the extoff value from the last packet to patch the
next_header field. If a malicious host sends a chain of fragmented packets
where the first packet and the final packet have different lengths or number of
extension headers we'd patch the next_header at the wrong offset.
This can potentially lead to panics or rule bypasses.

Security:       CVE-2019-5597
Obtained from:  OpenBSD
Reported by:    Corentin Bayet, Nicolas Collignon, Luca Moro at Synacktiv
2019-03-01 07:37:45 +00:00
Pawel Jakub Dawidek
b8da50d526 Improve readability of the code by making it explicit where the 'c' variable
starts. It is also more consistent with similar code in this file.
2019-03-01 05:54:13 +00:00
Justin Hibbits
6775dfdf54 powerpc/powernv: Add OPAL flash device driver
Firmware needed by petitboot, for example, GPU firmware, can be installed to
a partition in the flash filesystem.  This driver exposes the full flash
given by the device tree, letting the user manage firmware, etc, from
FreeBSD.

To use the partitions provided by the flash module, the fdt_slicer module is
needed, but the module isn't needed for raw access, so there's no direct
dependency link in here.

MFC after:	2 weeks
2019-03-01 04:36:55 +00:00
Ian Lepore
f266da5c28 Add another required header file.
For some reason this seems to be required on aarch64, but I can build armv7
from clean without needing this in the list.  (The file does get included,
so the mystery is why armv7 works.)
2019-03-01 04:17:43 +00:00
Ian Lepore
fd6bb0db87 Add required header file to SRCS. 2019-03-01 03:09:43 +00:00
Ian Lepore
608accbf19 Undo accidental part of r344681.
I think I must have accidentally mouse-click pasted while scrolling and
didn't notice it.

Reported by:	jhibbits@
2019-03-01 02:53:54 +00:00
Justin Hibbits
dac618a648 powerpc/powernv: Add asynchronous token management for powernv
The OPAL firmware only supports a finite number of in-flight asynchronous
operations.  Rather than have each subsystem try to manage its own, use a
central management service to hand out tokens.

More work can be done to improve asynchronous behavior, such as funneling
things through a future OPAL heartbeat handler, but capabilities will be
added as needed.

Augment the existing consumers (i2c and sensors) to use this new API.

MFC after:	4 weeks
2019-03-01 02:49:47 +00:00
Navdeep Parhar
41dda0d9eb cxgbe(4): Don't forget to report link state to the kernel if the link is
already up at attach.

Reported by:	Fabrice Bruel @ Orange Business Service
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-03-01 02:43:30 +00:00
Ian Lepore
238eb01b5c Build fdt support modules on systems that use fdt data.
kern.opts.mk sets make var OPT_FDT to a non-empty value if platform.h
contains OPT_FDT.
2019-03-01 02:31:43 +00:00
Justin Hibbits
f476f0add8 Revert r344675
It's an incorrect approach to solve the problem.  We already have a
fdt/fdt_slicer module, it just needs to be wired into the build.
2019-03-01 02:08:12 +00:00
Conrad Meyer
3948ad29e9 cxgb(4): Netdump: only reference allocated qsets
SGE_QSETS is an upper bound -- fewer qsets may be allocated depending on
the number of CPUs.

Reviewed by:	markj, np, vangyzen
X-MFC-With:	r333288
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D17274
2019-03-01 01:57:22 +00:00
Marcin Wojtas
e3431664df Prevent detaching driver if the attach is not finished
When the device is in the attaching state, detach should return
EBUSY instead of success. Otherwise, there could be a race
between attach and detach during driver unloading.

If the driver goes to sleep and releases the Giant lock during attach,
a module unload could start. In that case, when attach continues
after the module unload, a page fault ("supervisor read instruction,
page not present") occurs.
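
The guard described above might look roughly like this. The state names, struct, and function are simplified, hypothetical stand-ins and are not the newbus implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified device states. */
enum dev_state_sketch { ST_NOTPRESENT, ST_ATTACHING, ST_ATTACHED };

struct device_sketch {
	enum dev_state_sketch state;
};

/* Refuse to detach while attach is still in progress. */
static int
detach_sketch(struct device_sketch *dev)
{
	assert(dev != NULL);
	if (dev->state == ST_ATTACHING)
		return (EBUSY);
	if (dev->state != ST_ATTACHED)
		return (0);	/* nothing to do */
	dev->state = ST_NOTPRESENT;
	return (0);
}
```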

This patch works around the real issue, which is a locking
deficiency of the busses.

Submitted by: Rafal Kozik <rk@semihalf.com>
Reviewed by: imp
Obtained from: Semihalf
MFC after: 2 weeks
Sponsored by: Amazon, Inc.
Differential Revision: https://reviews.freebsd.org/D19375
2019-03-01 01:18:39 +00:00
Justin Hibbits
c8e720aaae GEOM: Add fdt_slicer to the GEOM flashmap module for fdt-based platforms
geom_flashmap depends on a slicer being available in order to do any
work.  On fdt platforms this is provided by fdt_slicer, but this needs
to be available.  Often it's compiled into the kernel for platforms that
boot from the relevant media, but this is not always the case.  Add the
file to the geom_flashmap module so that it can be used on platforms
which don't always need this functionality available.
2019-02-28 23:00:47 +00:00
John Baldwin
d18e541983 Don't assume all children of a nexus are ports.
Specifically, ccr(4) devices are also children of cxgbe nexus devices.
Rather than making assumptions about the child device's softc, walk
the list of ports from the nexus' softc to determine if a child is a
port in t4_child_location_str().  This fixes a panic when detaching a
ccr device.

Reviewed by:	np
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D19399
2019-02-28 22:10:19 +00:00
Mark Johnston
2b64ab22e8 Allow FIONBIO and FIOASYNC ioctls on POSIX shm descriptors.
They have no effect, as with filesystem file descriptors.
This improves compatibility with some existing userspace code.

Submitted by:	Greg V <greg@unrelenting.technology>
Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D19330
2019-02-28 22:00:36 +00:00
Alexander Motin
1f8c4546a8 Limit 24xx adapters to only MSI interrupts by default.
This was actually the known good configuration we used before.
Single MSI-X configuration doesn't even work there in my tests; due to
the lack of documentation, I am not sure whether that is by design or
whether I am doing something wrong.

PR:		233654
MFC after:	1 week
2019-02-28 21:07:16 +00:00
Konstantin Belousov
bced332adf Invalidate cache for the PDPTE page when using PAE paging but PAT is
not supported.

According to SDM rev. 69 vol. 3, for PDPTE register loads:
- when PAT is not supported, access to the PDPTE page is performed as
  UC, see 4.9.1;
- when PAT is supported, the access is WB, see 4.9.2.

So the CPU might potentially load stale memory as PDPTEs if neither PAT
nor self-snoop is implemented.  To be safe, add a total local cache
flush to pmap_cold() before the initial load of cr3, and flush the PDPTE
page in pmap_pinit(), if PAT is not implemented.
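
The two conditions in this description can be written out as predicates. This is only a sketch of the reasoning: the `sketch_` structure and functions are hypothetical, and the real code tests CPU feature bits rather than a struct of booleans.

```c
#include <stdbool.h>

/* Hypothetical summary of the CPU features the message refers to. */
struct sketch_cpu_features {
	bool pae;		/* PAE paging in use */
	bool pat;		/* PAT supported */
	bool self_snoop;	/* self-snoop supported */
};

/*
 * Stale PDPTEs are possible only when the PDPTE page is read as UC
 * (no PAT) and the CPU does not snoop its own caches.
 */
static bool
sketch_pdpte_may_be_stale(const struct sketch_cpu_features *f)
{
	return (f->pae && !f->pat && !f->self_snoop);
}

/*
 * The conservative policy described above: flush whenever PAT is
 * missing under PAE, without also checking for self-snoop.
 */
static bool
sketch_pdpte_flush_wanted(const struct sketch_cpu_features *f)
{
	return (f->pae && !f->pat);
}
```

The flush predicate is deliberately broader than the staleness predicate, matching the "to be safe" wording of the message.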

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19365
2019-02-28 19:19:02 +00:00
Enji Cooper
1ece6232d2 Remove references to pdwait4(2) and CAP_PDWAIT from rights(4)
@cem removed references to pdwait4(2) (a nonexistent syscall) in
r320058.

This change removes references to pdwait4(2) and `CAP_PDWAIT` in
rights(4) to not mislead the user into thinking that pdwait4(2)/`CAP_PDWAIT` is
actually implemented in the stock FreeBSD kernel.

The goal of this functionality was to simplify monitoring/manipulating
processes started with `pdfork`, et al, and avoid races with waiting on pids.
The syscall was never completed though--just discussed on the capsicum mailing
list back in 2015:
https://lists.cam.ac.uk/pipermail/cl-capsicum-discuss/2015-May/msg00012.html
That being said, there are members of the project (@rwatson, etc.) who
have long-term goals to implement this syscall to better secure pdfork(2)
calls.

PR:		235871
Reviewed by:	emaste
Discussed with:	rwatson
Approved by:	emaste (mentor)
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D18950
2019-02-28 18:12:14 +00:00