Commit Graph

118374 Commits

Author SHA1 Message Date
Konstantin Belousov
5a21cd1941 The nvme module should explicitly declare dependency on the cam.
If both nvme and cam are compiled as modules, nvme cannot be kldloaded
otherwise.

Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-31 14:21:32 +00:00
Alexander Motin
b6c46372af Fix port control for PEX 8749.
That chip has three Station Ports, so previous address math was incorrect.

MFC after:	13 days
Sponsored by:	iXsystems, Inc.
2017-08-31 13:41:44 +00:00
Baptiste Daroussin
91f1caeccb Add sysctls for arc shrinking and growing values
The default value for arc_no_grow_shift may not be optimal when using
several GiB ARC. Expose it via sysctl allows users to tune it easily.

Also expose arc_grow_retry via sysctl for the same reason. The default value of
60s might, in case of intensive load, be too long.

Submitted by:	Nikita Kozlov <nikita.kozlov@blade-group.com>
Reviewed by:	mav, manu, bapt
MFC after:	2 weeks
Sponsored by:	blade
Differential Revision:	https://reviews.freebsd.org/D12144
2017-08-31 13:02:17 +00:00
Alexander Motin
c7dabb6563 Make ntb_set_ctx() always generate fake link event.
It allows application driver get initial link state without racing with
hardware interrupts, thanks to the context rmlock held here.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2017-08-31 10:59:39 +00:00
Alexander Motin
ba4b25cbba Make ntb_transport(4) ready receive early link events.
Those events may be reported as soon as callback is registered, if the link
is enabled by hardware or some other application.

While there, clean link_is_up variable on link down event.

MFC after:	1 week
2017-08-31 10:53:10 +00:00
Navdeep Parhar
2f318252cb cxgbe(4): Add two new debug flags -- one to allow manual firmware
install after full initialization, and another to disable the TCB
cache (T6+).  The latter works as a tunable only.

Note that debug_flags are for debugging only and should not be set
normally.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-30 23:41:04 +00:00
Ed Maste
118ee8bdea xls_ehci: eliminate 'format string is not a string literal' warning
Reported by:	Clang
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2017-08-30 22:58:11 +00:00
Ed Maste
7a8d38a717 octeon_ebt3000_cf: eliminate 'format string is not a string literal' warning
Reported by:	Clang
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2017-08-30 22:54:11 +00:00
Alexander Motin
ed9652da5f Add NTB driver for PLX/Avago/Broadcom PCIe switches.
This driver supports both NTB-to-NTB and NTB-to-Root Port modes (though
the second with predictable complications on hot-plug and reboot events).
I tested it with PEX 8717 and PEX 8733 chips, but expect it should work
with many other compatible ones too.  It supports up to two NT bridges
per chip, each of which can have up to 2 64-bit or 4 32-bit memory windows,
6 or 12 scratchpad registers and 16 doorbells.  There are also 4 DMA engines
in those chips, but they are not yet supported.

While there, rename Intel NTB driver from generic ntb_hw(4) to more specific
ntb_hw_intel(4), so now it is on par with this new ntb_hw_plx(4) driver and
alike to Linux naming.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2017-08-30 21:16:32 +00:00
John Baldwin
b8b3a64000 Apply 64k padding to stack pointer for 32-bit processes.
In particular, MIPS now has COMPAT_FREEBSD32 for n64 kernels so this
cannot be ignored for n64.  On the other hand, it is unneeded for o32
MIPS kernels as the issue is only present when using 64-bit registers,
so remove the workaround from o32 kernels.

Reviewed by:	jmallett
Sponsored by:	DARPA / AFRL
2017-08-30 19:21:11 +00:00
Sean Bruno
a969350226 Revert r323008 and its conversion of e1000/iflib to using SX locks.
This seems to be missing something on the 82574L causing NFS root mounts
to hang.

Reported by:	kib
2017-08-30 18:56:24 +00:00
Navdeep Parhar
e3d00a3e03 cxgbe(4): Zero out the memory allocated for the debug dump.
cudbg_collect seems to expect it this way.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-30 18:46:38 +00:00
Konstantin Belousov
405058aa7a Only make the if_ix module depend on netmap when netmap is configured.
Approved by:	erj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-30 18:19:25 +00:00
Ed Maste
53d534724a bhnd: initialize variable before use
Reported by:	Clang
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2017-08-30 17:39:51 +00:00
Ed Maste
3460f018f9 arge: correct bzero sizeof (pointed-to object, not pointer)
Reported by:	Clang
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2017-08-30 17:38:55 +00:00
Maxim Sobolev
f76de5dd51 Add proper support for the md_label into md(4) ioctl compat layer.
While I am here, declare struct md_ioctl32 as packed which allows
us to stop playing tricks with sizeof(md_ioctl32)+y as well as
simplifies md_pad handling. Both were necessary because of different
alignment preferences on amd64 vs i386.

MFC after:	4 weeks
2017-08-30 15:07:10 +00:00
Konstantin Belousov
35872e79b7 Adjust interface of swapon_check_swzone() to its actual usage.
The function return value is not used.  Its argument is always
swap_total/PAGE_SIZE, so make it not take any arguments.

Submitted by:	ota@j.email.ne.jp
PR:	221356
MFC after:	1 week
2017-08-30 10:17:00 +00:00
Konstantin Belousov
f08b30995a Make the swap_pager_full variable static.
r290920 removed the use of the variable from vm/vm_pageout.c.

Submitted by:	ota@j.email.ne.jp
PR:	221356
MFC after:	1 week
2017-08-30 09:44:05 +00:00
Ed Schouten
b53b978a6c Complete the CloudABI networking refactoring.
Now that all of the packaged software has been adjusted to either use
Flower (https://github.com/NuxiNL/flower) for making incoming/outgoing
network connections or can have connections injected, there is no longer
need to keep accept() around. It is now a lot easier to write networked
services that are address family independent, dual-stack, testable, etc.

Remove all of the bits related to accept(), but also to
getsockopt(SO_ACCEPTCONN).
2017-08-30 07:30:06 +00:00
Ed Maste
89fdc67c9c usb: Add external "Intenso Memory" disk UQ_MSC_NO_INQUIRY quirk
PR:		221852
Submitted by:	Fabian Keil
Reviewed by:	hselasky
Obtained from:	ElectroBSD
MFC after:	1 week
2017-08-30 01:44:11 +00:00
Sean Bruno
e17e5b4134 Continuation of lock cleanup in e1000.
Post-cold sleep instead of DELAY when waiting for firmware.

Convert softc mutex to an SX lock.  Change all waits to sleeps
once interrupts are enabled (and it is safe to sleep).

Submitted by:	Matt Macy <matt@mattmacy.io>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12101
2017-08-30 00:20:43 +00:00
John Baldwin
cccc713394 Allow execution of FreeBSD/mips shared objects.
This was applied to all other FreeBSD architectures in r169846 but wasn't
included in the intial MIPS import.

Sponsored by:	DARPA / AFRL
2017-08-30 00:14:36 +00:00
Navdeep Parhar
fc740a161b cxgbe(4): Update T6/T5/T4 firmwares to 1.16.59.0.
These firmwares come from a pre-release snapshot.  The final firmwares
in this Chelsio release cycle will likely be .61.0 or later and those
will be the next "long lived" firmwares in FreeBSD head and stable
branches.  .59 is being provided in head (only) for wider test exposure.

Obtained from:	Chelsio Communications
Sponsored by:	Chelsio Communications
2017-08-29 23:37:26 +00:00
Ed Maste
3c3d2ba6fe zfs: do not advertise edonr which is not yet supported
illumos 4185 ("add new cryptographic checksums to ZFS: SHA-512,
Skein, Edon-R") was intentionally merged only partially in r289422,
without adding support for skein, sha512 and edonr on FreeBSD.

Support for skein and sha512 was added later on, but edonr is still not
implemented in FreeBSD.

Prior to this commit zfs(8) correctly rejected edonr, but with an error
message that claimed support:

fk@r500 ~ $zfs set checksum=edonr tank
cannot set property for 'tank': 'checksum' must be one of 'on | off | fletcher2 | fletcher4 | sha256 | sha512 | skein | edonr'

PR:		204055
Submitted by:	Fabian Keil
Approved by:	allanjude
Obtained from:	ElectroBSD
MFC after:	1 week
2017-08-29 22:24:22 +00:00
Warner Losh
a3cf03a519 Add missing test for NVME CCBs for nvme passthru support.
Submitted by: Chuck Tuffli
2017-08-29 21:04:29 +00:00
Warner Losh
9f8ed7e40b Fix NVMe's use of XPT_GDEV_TYPE
This patch changes the way XPT_GDEV_TYPE works for NVMe. The current
ccb_getdev structure includes pointers to the NVMe Identify Controller
and Namespace structures, but these are kernel virtual addresses which
are not accessible from user space.

As an alternative, the patch changes the pointers into padding in
ccb_getdev and adds two new types to ccb_dev_advinfo to retrieve the
Identify Controller (CDAI_TYPE_NVME_CNTRL) and Namespace
(CDAI_TYPE_NVME_NS) data structures.

Reviewed By: rpokala, imp
Differential Revision: https://reviews.freebsd.org/D10466
Submitted by: Chuck Tuffli
2017-08-29 17:03:30 +00:00
Warner Losh
c2005bba77 Fix a few overlooked spots where the coded uses 16-bit NSIDs. Chuck
Tuffli had submitted a more thorough patch that I was unaware of when
I did my work and this brings in the bits I missed from that patch.

PR: 220267
Submitted by: Chuck Tuffli
2017-08-29 15:46:34 +00:00
Warner Losh
519772814d Add CAM/NVMe support for CAM_DATA_SG
This adds support in pass(4) for data to be described with a
scatter-gather list (sglist) to augment the existing (single) virtual
address.

Differential Revision: https://reviews.freebsd.org/D11361
Submitted by: Chuck Tuffli
Reviewed by: imp@, scottl@, kenm@
2017-08-29 15:29:57 +00:00
Warner Losh
850564b948 Add new compile-time option NVME_USE_NVD that sets the default value
of the runtime hw.nvme.use_vnd tunable. We still default to nvd unless
otherwise requested.

Sponsored by: Netflix
2017-08-28 23:54:25 +00:00
Warner Losh
c02565f9fa Set the max transactions for NVMe drives better.
Provided a better estimate for the number of transactions that can be
pending at one time. This will be number of queues * number of
trackers / 4, as suggested by Jim Harris. This gives a better estimate
of the number of transactions that CAM should queue before applying
back pressure. This should be revisted when we have real multi-queue
support in CAM and the upper layers of the I/O stack.

Sponsored by: Netflix
2017-08-28 23:54:20 +00:00
Warner Losh
be650b3469 Add nvme_sim.c since that's not runtime switchable.
Sponsored by: Netflix
2017-08-28 23:54:16 +00:00
Navdeep Parhar
1ba8c29ca8 cxgbe(4): Do not access the mailbox without appropriate locks while
creating hardware VIs.

This fixes a bad race on systems with hw.cxgbe.num_vis > 1.

Reported by:	olivier@
MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-28 22:41:15 +00:00
Conrad Meyer
2744a0b69b Drop CACHE_LINE_SIZE to 64 bytes on x86
The actual cache line size has always been 64 bytes.

The 128 number arose as an optimization for Core 2 era Intel processors.  By
default (configurable in BIOS), these CPUs would prefetch adjacent cache
lines unintelligently.  Newer CPUs prefetch more intelligently.

The latest Core 2 era CPU was introduced in September 2008 (Xeon 7400
series, "Dunnington").  If you are still using one of these CPUs, especially
in a multi-socket configuration, consider locating the "adjacent cache line
prefetch" option in BIOS and disabling it.

Reported by:	mjg
Reviewed by:	np
Discussed with:	jhb
Sponsored by:	Dell EMC Isilon
2017-08-28 22:28:41 +00:00
Andriy Voskoboinyk
0cc18edf18 rtwn(4): some initial preparations for (basic) VHT support.
Rename RTWN_RIDX_MCS to RTWN_RIDX_HT_MCS before adding 802.11ac
MCS rate indexes (they have different offset).

No functional change intended.
2017-08-28 22:14:16 +00:00
Mark Johnston
aed9aaaa76 Synchronize page laundering with pmap_extract_and_hold().
Before r207410, the hold count of a page in a page queue was protected
by the queue lock, and, before laundering a page, the page daemon
removed managed writeable mappings of the page before releasing the
queue lock. This ensured that other threads could not concurrently
create transient writeable mappings using pmap_extract_and_hold() on a
user map, as is done for example by vmapbuf(). With that revision,
however, a race can allow the creation of such a mapping, meaning that
the page might be modified as it is being laundered, potentially
resulting in it being marked clean when its contents do not match
those given to the pager. Close the race by using the page lock to
synchronize the hold count check in vm_pageout_cluster() with the
removal of writeable managed mappings.

Reported by:	alc
Reviewed by:	alc, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D12084
2017-08-28 22:10:15 +00:00
Marius Strobl
e023501c6a Don't set any WOL enabling hardware bits if WOL isn't requested
according to the enabled interface capability bits. Also remove
some dead code, which tried to preserve already set contents of
E1000_WUC while that register is completely overwritten shortly
after in all cases.
2017-08-28 22:09:12 +00:00
Navdeep Parhar
7023d9d4c6 cxgbe(4): Maintain one ifmedia per physical port instead of one per
Virtual Interface (VI).  All autonomous VIs that share a port share the
same media.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-28 21:44:25 +00:00
Konstantin Belousov
4eeec01fee Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-08-28 21:04:56 +00:00
Konstantin Belousov
fbcbbe78dc Verify that the BPB media descriptor and FAT ID match.
FAT specification requires that for valid FAT, FAT cluster 0 has a
specific value derived from the BPB media descriptor.  The lowest
(little-endian) byte must be equal to bpb.bpbMedia, other bits in the
cluster number must be all 1's.  Implement the check to reduce the
chance of the randomly corrupted FAT to pass the mount attempt.

Submitted by:	Siva Mahadevan <smahadevan@freebsdfoundation.org>
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D12124
2017-08-28 20:52:32 +00:00
Alexander Motin
546ec4e544 Mask doorbells while processing them.
This fixes interrupt storms on hardware using legacy level-triggered
interrupts, since doorbell processing could take time after interrupt
handler completion, that triggered extra interrupts in a loop.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2017-08-28 20:00:21 +00:00
Alexander Motin
e6f000753e Fix fake interrupt when set doorbell is unmasked.
Since the doorbell bit is already set when interrupt handler is called,
the event was not propagated to upper layer.  It was working normally
because present code was not using masking actively, but that is going
to change.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2017-08-28 19:52:57 +00:00
Bryan Drewery
8359a6b7b3 Allow vdrop() of a vnode not yet on the per-mount list after r306512.
The old code allowed calling vdrop() before insmntque() to place the vnode back
onto the freelist for later recycling.  Some downstream consumers may rely on
this support.  Normally insmntque() failing is fine since is uses vgone() and
immediately frees the vnode rather than attempting to add it to the freelist if
vdrop() were used instead.

Also assert that vhold() cannot be used on such a vnode.

Reviewed by:	kib, cem, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12126
2017-08-28 19:29:51 +00:00
Warner Losh
9754579b01 Add comment about where we need to place this routine, and why.
Sponsored by: Netflix
2017-08-28 19:27:33 +00:00
Warner Losh
43effc8ca8 Add comment about where we need to place this routine, and why.
Sponsored by: Netflix
2017-08-28 19:25:49 +00:00
Alan Cox
ee620ea47d Update a couple vm_object lock assertions in the swap pager to reflect the
new use of the vm_object's lock to synchronize updates to a radix trie
mapping per-vm object page indices to on-disk swap blocks.

Fix a typo in a nearby comment.

Reviewed by:	kib, markj
X-MFC with:	r322913
Differential Revision:	https://reviews.freebsd.org/D12134
2017-08-28 17:02:25 +00:00
Alan Cox
d5efa0a475 Switching from a global hash table to per-vm_object radix tries for mapping
vm_object page indices to on-disk swap space (r322913) has changed the
synchronization requirements for a couple swap pager functions.  Whereas
before a read lock on the vm object sufficed because of the global mutex
on the hash table, a write lock on the vm object may now be required.  In
particular, calls to vm_pager_page_unswapped() now require a write lock on
the vm_object.  Consequently, vm_fault()'s fast path cannot call
vm_pager_page_unswapped().  The swap space will have to be released at a
later point.

Reviewed by:	kib, markj
X-MFC with:	r322913
Differential Revision:	https://reviews.freebsd.org/D12134
2017-08-28 16:55:43 +00:00
Maxim Sobolev
f7ca2bbe44 Add ability to label md(4) devices.
This feature comes from the fact that we rely memory-backed md(4)
in our build process heavily. However, if the build goes haywire
the allocated resources (i.e. swap and memory-backed md(4)'s) need
to be purged. It is extremely useful to have ability to attach
arbitrary labels to each of the virtual disks so that they can
be identified and GC'ed if neecessary.

MFC after:	4 weeks
Differential Revision:	https://reviews.freebsd.org/D10457
2017-08-28 15:54:07 +00:00
Michael Tuexen
3d5af7a127 Fix blackhole detection.
There were two bugs related to the blackhole detection:
* The smalles size was tried more than two times.
* The restored MSS was not the original one, but the second
  candidate.

MFC after:	1 week
Sponsored by:	Netflix, Inc.
2017-08-28 11:41:18 +00:00
Ed Schouten
5d17a1d629 Make _Static_assert() work with GCC in older C++ standards.
GCC only activates C11 keywords in C mode, not C++ mode. This means
that when targeting an older C++ standard, we cannot fall back to using
_Static_assert(). In this case, do define _Static_assert() as a macro
that uses a typedef'ed array.

Discussed in:	r322875 commit thread
Reported by:	Mark MIllard
MFC after:	1 month
2017-08-28 09:35:17 +00:00
Navdeep Parhar
573443c542 cxgbe(4): vi_mac_funcs should include the base Ethernet function. It is
already used in the driver as if it does.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-08-28 07:50:54 +00:00
Navdeep Parhar
358346b456 cxgbe(4): Remove write only variable from t4_port_init.
MFC after:	3 days
2017-08-28 04:06:40 +00:00
Navdeep Parhar
0fa7560dfd cxgbe(4): Fix some assertions during driver detach. The netmap queues
can't be initialized if the VI isn't.

MFC after:	3 days
2017-08-28 03:25:41 +00:00
Navdeep Parhar
f6d9d14b93 cxgbe(4): Verify that the driver accesses the firmware mailbox in a
thread-safe manner.

MFC after:	3 days
2017-08-28 03:13:16 +00:00
Andriy Voskoboinyk
191ccdf545 net80211: fix a typo (premable -> preamble). 2017-08-27 22:13:03 +00:00
Conrad Meyer
4ae2ade114 Enhance debugibility of sysctl leaf re-use warnings
Print the full conflicting oid path, and include the function name in the
warning so it is clear that the warnings are sysctl-related.

PR:		221853
Submitted by:	Fabian Keil <fk AT fabiankeil.de> (earlier version)
Sponsored by:	Dell EMC Isilon
2017-08-27 17:12:30 +00:00
Andriy Voskoboinyk
ac1b5d811f rtwn(4): deduplicate r92c_write_txpower(). 2017-08-27 13:02:51 +00:00
Andriy Voskoboinyk
5c7083ce99 rtwn(4): change type for Tx power values (RTL8192C / RTL8188EU).
Tx power values can easily fit into uint8_t + only 8 bits are written
to registers; values may overflow only in case if ROM contains
malformed data (but limit is checked anyway).

Tested with RTL8188CUS, dev.rtwn.1.debug=0x2000 (no changes).
2017-08-27 12:44:56 +00:00
Konstantin Belousov
58d8f357c7 Let g_access() log the actual error number.
Submitted by:	 Fabian Keil <fk@fabiankeil.de>
PR:	221855
MFC after:	1 week
2017-08-27 12:24:25 +00:00
Konstantin Belousov
c9dad45f69 Add PCI Id for MosChip MCS9900.
Submitted by:	Robert Clausecker <fuz@fuz.su>
PR:	214670
MFC after:	1 week
2017-08-27 11:37:07 +00:00
Scott Long
757ff64216 Start overhauling debug printing in the MPS and MPR drivers. The focus of this
commit it to make initiazation less chatty in the normal case, and more useful
and informative when real debugging is turned on.

Reviewed by:	ken (earlier version)
Sponsored by:	Netflix
2017-08-27 06:24:06 +00:00
Conrad Meyer
eee87314d3 Improve scheduler performance
Improve scheduler performance by flattening nonsensical topology layers
(layers with only one child don't serve any purpose).

This is especially relevant on non-AMD Zen systems after r322776.  On my
dual core Intel laptop, this brings the kern.sched.topology_spec table down
from three levels to two.

Submitted by:	jeff
Reviewed by:	attilio
Sponsored by:	Dell EMC Isilon
2017-08-27 05:14:48 +00:00
Warner Losh
b0fc288b1a Eliminate redunant device path matching.
Use efi_devpath_match instead of device_paths_match. They are
functionally the same. Remove device_paths_match from boot1.c and call
efi_devpath_match instead.

Sponsored by: Netflix
2017-08-27 03:10:16 +00:00
Ryan Libby
0d4e7ec5f3 amd64: drop q suffix from rd[fg]sbase for gas compatibility
Reviewed by:	kib
Approved by:	markj (mentor)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12133
2017-08-26 23:13:18 +00:00
Warner Losh
4532588625 Use efi_devpath_str for debug path info.
Kill our own hand-rolled (and somewhat flawed) devpath_str in favor of
the recently added efi_devpath_str in libefi. This gives us much
better names at the expense of not being able to debug on EFI 1.2
machines (since the UEFI protocol efi_devpath_str depends on was added
in UEFI 2.0). However, this isn't the first thing that requires newer
than EFI 1.2, so it's quite possible that this doesn't change the
universe of machines we can EFI boot from. This will now give us the
full UEFI path, even for devices we don't yet know about. More
importantly, it gives us the full HD(...) part of the path, which is
sufficient by itself to locate disks that follow the rules (dd one
disk (but not partition) to another still needs the rest of the path
to disambiguate, but that isn't following the rules that require every
GPT table to have globally unique GUIDs for every partion).

This also has the side effect of shrinking boot1.efi by ~3k.

Sponsored by: Netflix
2017-08-26 23:04:19 +00:00
Warner Losh
57d7e4f502 Link in libefi for boot1
Add libefi to the list of libraries we'll link in. Move EFI table
definitions back to libefi so we don't have drift between the two
efi_main routines.

Sponsored by: Netflix
2017-08-26 18:30:14 +00:00
Warner Losh
abc3c1089d Forward declare struct dsk to avoid warnings when building libi386.
Sponsored by: Netflix
2017-08-26 18:30:08 +00:00
Warner Losh
868cb4f19e Remove useless 'static' for an enum definition.
Sponsored by: Netflix
2017-08-26 18:30:03 +00:00
Warner Losh
f4380fc45d Fix warnings due to type mismatch.
Cast ctxp to caddr_t to pass data as expected. While void * is a
universal type, char * isn't (and that's what caddr_t is defined as).
One could argue these prototypes should take void * rather than
caddr_t, but changing that is much more invasive.

Sponsored by: Netflix
2017-08-26 18:29:58 +00:00
Warner Losh
6fefa52c48 _STAND is sometimes defined on the command line. Make the define here match.
Sponsored by: Netflix
2017-08-26 18:29:53 +00:00
Warner Losh
7b900dc2be No need for MK_ZFS around these: they are by their nature only active
when MK_ZFS is true.

Sponsored by: Netflix
2017-08-26 18:29:48 +00:00
Warner Losh
0414cb13ec Use the loader.efi conventions for the various EFI tables.
Sponsored by: Netflix
2017-08-26 18:29:43 +00:00
Warner Losh
194748f6a3 Cleanup efi_main return type
Make the return type of efi_main uniform. Declare the Exit() function
as not returning. Move efi_main's declaration to the proper header.

Sponsored by: Netflix
2017-08-26 18:29:37 +00:00
Warner Losh
c40241db0f Move efi_main into efi/loader
Move the efi_main routine out of libefi into sys/boot/efi/loader.
Since boot1 has its own efi_main routine, this effectively prevents
boot1 from linking with libefi. By moving it out, we can share code
better (though though some refactoring with boot1's efi_main and
loader.efi's efi_main is definitely in order).

Sponsored by: Netflix
2017-08-26 18:29:24 +00:00
Konstantin Belousov
5e52cd4324 MFamd64 r322720, r322723:
Simplify i386 trap().

- Use more relevant name 'signo' instead of 'i' for the local variable
  which contains a signal number to send for the current exception.
- Eliminate two labels 'userout' and 'out' which point to the very end
  of the trap() function.  Instead use return directly.
- Re-indent the prot_fault_translation block by reducing if() nesting.
- Some more monor style changes.

Reviewed by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-26 18:12:25 +00:00
Konstantin Belousov
aefdf821cd Remove unused code.
The machdep.uprintf_signal sysctl replaced it in more convenient way,
not requiring recompilation to use and providing more information on
fault.

Reviewed by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-26 18:09:27 +00:00
Konstantin Belousov
325b406c49 MFamd64 r322718:
Use ANSI C declaration for trap_pfault().  Style.

Reviewed by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-26 18:06:29 +00:00
Konstantin Belousov
8ce5520255 MFamd64 r322719:
Trim excessive 'extern'.

Reviewed by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-26 18:04:29 +00:00
Mariusz Zaborski
3453dc72ad Hide length of geli passphrase during boot.
Introduce additional flag to the geli which allows to restore previous
behavior.

Reviewed by:	AllanJude@, cem@ (previous version)
MFC:		1 month
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D11751
2017-08-26 14:07:24 +00:00
Conrad Meyer
70dd00d6c0 Fix limits.h constants to have correct type on MIPS
Use correctly typed constants to avoid bogus errors like
https://lists.freebsd.org/pipermail/svn-src-all/2017-August/150400.html
(like x86 _limits.h).

Reported by:	bde, asomers
Reviewed by:	mjoras, tinderbox
Sponsored by:	Dell EMC Isilon
2017-08-26 03:21:12 +00:00
Navdeep Parhar
7cb574039b cxgbe(4): Dump the mailbox contents in the same format as CH_DUMP_MBOX.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-08-25 23:31:15 +00:00
Konstantin Belousov
f425ab8e50 Replace global swhash in swap pager with per-object trie to track swap
blocks assigned to the object pages.

- The global swhash_mtx is removed, trie is synchronized by the
  corresponding object lock.
- The swp_pager_meta_free_all() function used during object
  termination is optimized by only looking at the trie instead of
  having to search whole hash for the swap blocks owned by the object.
- On swap_pager_swapoff(), instead of iterating over the swhash,
  global object list have to be inspected. There, we have to ensure
  that we do see valid trie content if we see that the object type is
  swap.
Sizing of the swblk zone is same as for swblock zone, each swblk maps
SWAP_META_PAGES pages.

Proposed by:	alc
Reviewed by:	alc, markj (previous version)
Tested by:	alc, pho (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D11435
2017-08-25 23:13:21 +00:00
John Baldwin
12fb14f36d Don't grab SOCK_LOCK for soref() when queuing an AIO request.
The AIO job holds a reference on the associated file descriptor, so the
socket's count should already be > 0.  This fixes a LOR with the socket
buffer lock after recent socket locking changes in HEAD.

Sponsored by:	Chelsio Communications
2017-08-25 23:10:27 +00:00
Sean Bruno
7d0d648446 Add a different #define for the maximum number of transmit and
recieve descriptors for the igb(4) class of devices.  This will
allow a better definition for maximum going forward.  Some igb(4)
devices support more than the default 4K.

Reported by:	Jason (j@nitrology.com)
Sponsored by:	Limelight Networks
2017-08-25 22:38:55 +00:00
Warner Losh
030edcce02 Fill in reserved areas from NVMe spec in the IDENTIFY structure
(struct nvme_controller_data) as defined in the NVM Express
specification, revsion 1.3.

Sponsored by: Netflix
2017-08-25 21:38:43 +00:00
Warner Losh
696c950297 NVME Namespace ID is 32-bits, so widen interface to reflect that.
Sponsored by: Netflix
2017-08-25 21:38:38 +00:00
Warner Losh
223a9b93ac Add feature codes from NVMe 1.3 specification:
o Automomous Power State Transition
o Host Memory Buffer
o Timestamp
o Keep Alive Timer
o Host Controlled Thermal Management
o Non-Operational Power State Config

Also note that feature codes 0x78-0x7f are reserved for the NVMe
Management Interface.

Sponsored by: Netflix
2017-08-25 21:38:29 +00:00
Sean Bruno
32a04bb81d Use counter(9) for PLPMTUD counters.
Remove unused PLPMTUD sysctl counters.

Bump UPDATING and FreeBSD Version to indicate a rebuild is required.

Submitted by:	kevin.bowling@kev009.com
Reviewed by:	jtl
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D12003
2017-08-25 19:41:38 +00:00
Alan Cox
a0ae476b7e Correct a regression in the previous change, r322459. Specifically, the
removal of the "blk" parameter from blst_meta_alloc() had the unintended
effect of generating an out-of-range allocation when the cursor reaches
the end of the tree if the number of managed blocks in the tree equals
the so-called "radix" (which in the blist code is not the standard notion
of what a radix is but rather the maximum number of leaves in a tree of
the current height.)  In other words, only certain swap configurations
were affected, which is why earlier testing did not reveal the problem.

Submitted by:	Doug Moore <dougm@rice.edu>
Reported by:	pho, kib
Tested by:	pho
X-MFC with:	r322459
Differential Revision:	https://reviews.freebsd.org/D12106
2017-08-25 18:47:23 +00:00
Maxim Sobolev
c7e10205ae Make spinconsole platform independent and hook it up into EFI loader on
i386 and amd64. Not enabled on ARMs, those are lacking timer routines.

MFC after:	2 moths
Sponsored by:	Sippy Software, Inc.
2017-08-25 17:29:48 +00:00
Ed Schouten
8212ad9a99 Sync CloudABI compatibility against the latest upstream version (v0.13).
With Flower (CloudABI's network connection daemon) becoming more
complete, there is no longer any need for creating any unconnected
sockets. Socket pairs in combination with file descriptor passing is all
that is necessary, as that is what is used by Flower to pass network
connections from the public internet to listening processes.

Remove all of the kernel bits that were used to implement socket(),
listen(), bindat() and connectat(). In principle, accept() and
SO_ACCEPTCONN may also be removed, but there are still some consumers
left.

Obtained from:	https://github.com/NuxiNL/cloudabi
MFC after:	1 month
2017-08-25 11:01:39 +00:00
Bruce Evans
f7eb827c48 Fix bugs in (mostly) not-yet-activated parts of early/emergency output:
- map the hard-coded frame buffer address above KERNBASE.  Using the
  physical address only worked because of larger mapping bugs.

  The hard-coded frame buffer address only works on x86.  Use messy ifdefs
  to try to avoid warnings about unused code for other arches.

- remove the sysctl for reading and writing the table kernel console
  attributes.  Writing only worked for emergency output since normal
  output uses unalterd copies.

- fix the test for the emergency console being usable

- explain why a hard-coded attribute is used very early.  Emergency output
  works on x86 even before the pcpu pointer is initialized.
2017-08-25 10:57:17 +00:00
Konstantin Belousov
2f9d88c7ae Protect v_rdev dereference with the vnode interlock instead of the
vnode lock.

Caller of softdep_count_dependencies() may own a buffer lock, which
might conflict with the lock order.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	10 days
2017-08-25 09:51:22 +00:00
Bruce Evans
9bc7c36337 Support setting the colors of cursors for the VGA renderer.
Advertise this by changing the defaults to mostly red.  If you don't like
this, change them (almost) back using:
   vidcontrol -c charcolors,base=7,height=0
   vidcontrol -c mousecolors,base=0[,height=15]

The (graphics mode only) mouse cursor colors were hard-coded to a black
border and lightwhite interior.  Black for the border is the worst
possible default, since it is the same as the default black background
and not good for any dark background.  Reversing this gives the better
default of X Windows.  Coloring everything works better still.  Now
the coloring defaults to a lightwhite border and red interior.

Coloring for the character cursor is more complicated and mode
dependent.  The new coloring doesn't apply for hardware cursors.  For
non-block cursors, it only applies in graphics mode.  In text mode,
the cursor color was usually a hard-coded (dull)white for the background
only, unless the foreground was white when it was a hard-coded black
for the background only, unless the foreground was white and the
background was black it was reverse video.  In graphics mode, it was
always reverse video for the block cursor.  Reverse video is worse,
especially over cutmarking regions, since cutmarking still uses simple
reverse video (nothing better is possible in text mode) and double
reverse video for the cursor gives normal video.  Now, graphics mode
uses the same algorithm as the best case for text mode in all cases
for graphics mode.  The hard-coded sequence { white, black, } for the
background is now { red, white, blue, } where the first 2 colors can
be configured.  The blue color at the end is a sentinel which prevents
reverse video being used in most cases but breaks the compatibility
setting for white on black and black on white characters.  This will
be fixed later.  The compatibility setting is most needed for mono modes.

The previous commit to syscons.c changed sc_cnterm() to be more careful.
It followed null pointers in some cases.  But sc_cnterm() has been
unreachable for 15+ years since changes for multiple consoles turned
off calls to the the cnterm destructor for all console drivers.  Before
them, it was only called at boot time.  So no driver with an attached
console has ever been unloadable and not even the non-console destructors
have been tested much.
2017-08-25 07:04:41 +00:00
Warner Losh
0012e436e3 Use _Static_assert
These files are compiled in userland too, so we can't use sys/systm.h
and rely on CTASSERT. Switch to using _Static_assert instead.

MFC After: 3 days
Sponsored by: Netflix
2017-08-25 04:33:06 +00:00
Warner Losh
0c26c1992f Sanity check sizes
Add compile time sanity checks to make sure that packed structures are
the proper size, typically as defined in the NVMe standard.
2017-08-25 04:05:53 +00:00
Warner Losh
abb61405a6 Enable bus mastering on the device before resetting the device. The
card has to do PCIe transactions to complete the reset process, but
can't do them, per the PCIe spec, unless bus mastering is enabled.

Submitted by: Kinjal Patel
PR: 22166
2017-08-25 03:15:18 +00:00
Bruce Evans
0c7a1e15ad Oops, the previous commit was missing 1 line. 2017-08-25 02:41:01 +00:00
Bruce Evans
7db291d201 Fix missing switching of the terminal emulator when switching the
terminal state for kernel console output.

r56043 in 2000 added many complications to support dynamic selection
of the terminal emulator using modules and the ioctl CONS_SETTERM.
This was never completed.  There are still no modules, but it is easy
to restore the scterm and dumb emulators at compile time.  Then
boot-time configuration for the preferred one doesn't work right, but
CONS_SETTERM almost works after fixing this bug.  CONS_SETTERM only
switches the emulator for the user state, leaving the kernel state(s)
still using the boot-time emulator.  The fix is especially important
when switching from sc to scteken, since the scteken state has pointers
in it.

Rename kernel_console_ts to sc_kts.
2017-08-25 02:37:32 +00:00
Gleb Smirnoff
dcfa556b02 Garbage collect RT_NORTREF, which is no longer in use after FLOWTABLE removal. 2017-08-24 23:08:12 +00:00
Eric Joyner
b82c9ba1c4 ixv(4): Add more robust mailbox API negotiation
The previous update to the driver to 3.2.12-k changed the VF's API version
to 1.2, but did not let the VF fall back to 1.1 or 1.0 versions. So, this
patch tries 1.2 first, then the older versions in succession if that fails.

This should allow the VF driver to negotiate 1.1 and work with older PF
drivers, such as the one used in Amazon's EC2 service.

PR:		220872
Submitted by:	Jeb Cramer <jeb.j.cramer@intel.com>
MFC after:	1 week
Sponsored by:	Intel Corporation
2017-08-24 22:56:22 +00:00
Warner Losh
08fc2f23b3 Expand the latency tracking array from 1.024s to 8.192s to help track
extreme outliers from dodgy drives. Adjust comments to reflect this,
and make sure that the number of latency buckets match in the two
places where it matters.
2017-08-24 22:11:10 +00:00
Warner Losh
e4c9cba71f Fix 32-bit overflow on latency measurements
o Allow I/O scheduler to gather times on 32-bit systems. We do this by shifting
  the sbintime_t over by 8 bits and truncating to 32-bits. This gives us 8.24
  time. This is sufficient both in range (256 seconds is about 128x what current
  users need) and precision (60ns easily meets the 1ms smallest bucket size
  measurements). 64-bit systems are unchanged. Centralize all the time math so
  it's easy to tweak tha range / precision tradeoffs in the future.
o While I'm here, the I/O scheduler should be using periph_data rather than
  sim_data since it is operating on behalf of the periph.

Differential Review: https://reviews.freebsd.org/D12119
2017-08-24 22:10:58 +00:00
Gleb Smirnoff
555b3e2f2c Third take on the r319685 and r320480. Actually allow for call soisconnected()
via soisdisconnected(), and in the earlier unlock earlier to avoid lock
recursion.

This fixes a situation when a socket on accept queue is reset before being
accepted.

Reported by:	Jason Eggleston <jeggleston llnw.com>
2017-08-24 20:49:19 +00:00
Conrad Meyer
d2e155a4f0 Remove unused declaration and update ddb.4
A follow-up to r322836.

Warnings for the unused declaration were breaking some second tier
architectures, but did not show up in Clang on x86.

Reported by:	markj (ddb.4), emaste (declaration)
Sponsored by:	Dell EMC Isilon
2017-08-24 19:16:25 +00:00
David C Somayajulu
b1c4d096e6 Fix qlnx_tso_check() so that every window of
(ETH_TX_LSO_WINDOW_BDS_NUM - nbds_in_hdr) has atleast
ETH_TX_LSO_WINDOW_MIN_LEN bytes

MFC after:5 days
2017-08-24 19:09:42 +00:00
Conrad Meyer
0c1d923efb Merge print_lockchain and print_sleepchain
When debugging a deadlock, it is useful to follow the full chain of locks as
far as possible.

Reviewed by:	jhb
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12115
2017-08-24 15:12:16 +00:00
Konstantin Belousov
2624320fcc Stop masking FSGSBASE and SMEP features under monitors.
Not enabling FSGSBASE in %cr4 does not prevent reporting of the
feature by the CPUID instruction (blame Int*l).  As result, kernels
which were run under monitors pretended that usermode cannot modify
TLS base without the syscall, while libc noted right combination of
capable CPU and the new kernel version, trying to use the WRFSBASE
instruction.

Really old hypervisors that cannot handle enablement of these features
in %cr4 would require the manual configuration, by setting the loader
tunable hw.cpu_stdext_disable=0x81

Reported by:	lwhsu, mjoras
Sponsored by:	The FreeBSD Foundation
MFC after:	18 days
2017-08-24 10:57:34 +00:00
Konstantin Belousov
9fc847133e Save KGSBASE in pcb before overriding it with the guest value.
Reported by:	lwhsu, mjoras
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	18 days
2017-08-24 10:49:53 +00:00
Hans Petter Selasky
8ac4c959ab Compile fixes for LINT on 32-bit platforms.
MFC after:		2 weeks
Sponsored by:		Mellanox Technologies
2017-08-24 08:09:42 +00:00
Oleksandr Tymoshenko
00120249a7 Add "xlnx,zynq-7000" to zedboard and zybo compatible property
This property is required to boot CURRENT on zedboard and zybo

PR:		221208
Submitted by:	Thomas Skibo <thoma555-bsd@yahoo.com>
2017-08-24 02:08:52 +00:00
Sean Bruno
21e10b16e0 iflib: call device's if_init function during vlan initialization.
Submitted by:	bhargava.marreddy@broadcom.com
Reviewed by:	shurd
Sponsored by:	Broadcom
Differential Revision:	https://reviews.freebsd.org/D12098
2017-08-23 21:49:56 +00:00
Alexander Motin
b4af3e8a85 Add missing restart_queue initialization.
MFC after:	1 week
2017-08-23 19:00:06 +00:00
Mark Johnston
c72fadf8f0 Set the bus number field when attaching a PCI device.
MFC after:	1 week
2017-08-23 16:50:10 +00:00
Michael Tuexen
e8aba2eb21 Avoid TCP log messages which are false positives.
This is https://svnweb.freebsd.org/changeset/base/322812, just for
alternate TCP stacks.

XMFC with: 	322812
2017-08-23 15:08:51 +00:00
Michael Tuexen
4da9052a05 Avoid TCP log messages which are false positives.
The check for timestamps are too early to handle SYN-ACK correctly.
So move it down after the corresponing processing has been done.

PR:		216832
Obtained from:	antonfb@hesiod.org
MFC after:	1 week
2017-08-23 14:50:08 +00:00
Hans Petter Selasky
1251590741 Add new mlx5ib(4) driver to the kernel source tree which supports
Remote DMA over Converged Ethernet, RoCE, for the ConnectX-4 series of
PCI express network cards.

There is currently no user-space support and this driver only supports
kernel side non-routable RoCE V1. The krping kernel module can be used
to test this driver. Full user-space support including RoCE V2 will be
added as part of the ongoing upgrade to ibcore from Linux 4.9. Otherwise
this driver is feature equivalent to mlx4ib(4). The mlx5ib(4) kernel
module will only be built when WITH_OFED=YES is specified.

MFC after:		2 weeks
Sponsored by:		Mellanox Technologies
2017-08-23 12:09:37 +00:00
Jung-uk Kim
a1d0659ca9 Fix size to copyout(9) for cpuset_getid(2).
MFC after:	3 days
2017-08-22 20:46:29 +00:00
Alexander Motin
ffc7e53a65 Fix off-by-one error when parsing SRAT table.
Reviewed by:	jhb
MFC after:	1 week
2017-08-22 19:56:30 +00:00
Andrew Turner
dab076c004 Remove an unneeded call to pmap_invalidate_all. This was never called as
the anyvalid variable is never set.

MFC after:	1 week
Sponsored by:	DARPA, AFRL
2017-08-22 18:20:25 +00:00
Konstantin Belousov
761fb3ef29 Ensure that fs/gs bases are stored in pcb before copying the pcb for
new process or thread.

Reported and tested by:	ae, dhw
Sponsored by:	The FreeBSD Foundation
MFC after:	20 days
2017-08-22 18:15:47 +00:00
Ed Maste
495947bcfa newvers.sh: accommodate git worktree
newvers.sh looks for a .vcs subdirectory (e.g. .git, .svn) to determine
which vcs info tool to run (e.g., git rev-parse, svn info).

(As of r308789 if a .vcs subdirectory is not found at ${TOPDIR} then
newvers.sh walks up successive parent directories, testing for the .vcs
subdirectory at each step.  This is done in case the FreeBSD source is
built in a subdirectory as part of some larger project, but either way
newvers.sh still tests for the .vcs subdirectory.)

However, when using git worktree there is no .git subdirectory but
rather a plain text .git file which contains a reference to the main
working tree.

Change findvcs() to test that the .vcs entry exists, regardless of type.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2017-08-22 17:57:34 +00:00
Andrew Turner
012faa32f9 Fix a bug in pmap_protect where we invalidate the wrong page. With this we
can now remove an unneeded call to invalidate all entries.

MFC after:	1 week
Sponsored by:	DARPA, AFRL
2017-08-22 17:38:06 +00:00
Mark Johnston
7e1a02baa5 Add some miscellaneous definitions to support the DRM drivers.
MFC after:	1 week
2017-08-22 17:13:28 +00:00
Andrew Turner
e8e65e28ca Fix a comment on uncommitted work. 2017-08-22 13:53:53 +00:00
Andrew Turner
6683b30c03 Move the l0 pagetable address to struct mdproc. It is a property of the
whole process so should live there.

Sponsored by:	DARPA, AFRL
2017-08-22 13:16:14 +00:00
Conrad Meyer
bb14d5643b subr_smp: Clean up topology analysis, add additional layers
Rather than repeatedly nesting loops, separate concerns with a single loop
per call stack level.  Use a table to drive the recursive routine.  Handle
missing topology layers more gracefully (infer a single unit).

Analyze some additional optional layers which may be present on e.g. AMD Zen
systems (groups, aka dies, per package; and cachegroups, aka CCXes, per
group).

Display that additional information in the boot-time topology information,
when it is relevent (non-one).

Reviewed by:	markj@, mjoras@ (earlier version)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12019
2017-08-22 00:10:15 +00:00
John Baldwin
3c0e63a4c4 Enable hardfloat CPU instructions in the FP exception handler.
This permits compiling with clang's integrated assembler.

Sponsored by:	DARPA / AFRL
2017-08-21 21:48:24 +00:00
David C Somayajulu
7fb518469e Upgrade FW to 5.4.66
sysctls to display stats, stats polled every 2 seconds
Modify QLA_LOCK()/QLA_UNLOCK() to not sleep after acquiring mtx_lock
Add support to turn OFF/ON error recovery following heartbeat failure for
debug purposes.
Set default max values to 32 Tx/Rx/SDS rings

MFC after:5 days
2017-08-21 20:27:45 +00:00
Andrew Turner
cbf2160e81 Improve the performance of the arm64 thread switching code.
The full system memory barrier around a TLB invalidation is stricter than
required. It needs to wait on accesses to main memory, with just the weaker
store variant before the invalidate. As such use the dsb istst, tlbi, dlb
ish sequence already used in pmap.

The tlbi instruction in this sequence is also unnecessarily using a
broadcast invalidate when it just needs to invalidate the local CPUs TLB.
Switch to a non-broadcast variant of this instruction.

Sponsored by:	DARPA, AFRL
2017-08-21 18:12:32 +00:00
John Baldwin
ac3b479ec8 Add a guard around _ILP32 for mips.
This is already done for other architectures in this file and fixes the
build with clang.
2017-08-21 17:45:06 +00:00
Konstantin Belousov
3e902b3d76 Make WRFSBASE and WRGSBASE instructions functional.
Right now, we enable the CR4.FSGSBASE bit on CPUs which support the
facility (Ivy and later), to allow usermode to read fs and gs bases
without syscalls. This bit also controls the write access to bases
from userspace, but WRFSBASE and WRGSBASE instructions currently
cannot be used, because return path from both exceptions or interrupts
overrides bases with the values from pcb.

Supporting the instructions is useful because this means that usermode
can implement green-threads completely in userspace without issuing
syscalls to change all of the machine context.

Support is implemented by saving the fs base and user gs base when
PCB_FULL_IRET flag is set. The flag is set on the context switch,
which potentially causes clobber of the bases due to activation of
another context, and when explicit modification of the user context by
a syscall or exception handler is performed. In particular, the patch
moves setting of the flag before syscalls change context.

The changes to doreti_exit and PUSH_FRAME to clear PCB_FULL_IRET on
entry from userspace can be considered a bug fixes on its own.

Reviewed by:	jhb (previous version)
Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
Differential revision:	https://reviews.freebsd.org/D12023
2017-08-21 17:38:02 +00:00
Konstantin Belousov
f0d5223230 Avoid dereferencing potentially freed workitem in
softdep_count_dependencies().

Buffer's b_dep list is protected by the SU mount lock.  Owning the
buffer lock is not enough to guarantee the stability of the list.

Calculation of the UFS mount owning the workitems from the buffer must
be much more careful to not dereference the work item which might be
freed meantime.  To get to ump, use the pointers chain which does not
involve workitems at all.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-08-21 16:23:44 +00:00
Konstantin Belousov
b5f2560d09 Style.
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2017-08-21 16:16:02 +00:00
Andrey V. Elsukov
c32396978e Remove stale comments.
MFC after:	1 week
2017-08-21 13:54:29 +00:00
Andrey V. Elsukov
22bbefb2c9 Fix the regression introduced in r275710.
When a security policy should match TCP connection with specific ports,
the SYN+ACK segment send by syncache_respond() is considered as forwarded
packet, because at this moment TCP connection does not have PCB structure,
and ip_output() is called without inpcb pointer. In this case SPIDX filled
for SP lookup will not contain TCP ports and security policy will not
be found. This can lead to unencrypted SYN+ACK on the wire.

This patch restores the old behavior, when ports will not be filled only
for forwarded packets.

Reported by:	Dewayne Geraghty <dewayne.geraghty at heuristicsystems.com.au>
MFC after:	1 week
2017-08-21 13:52:21 +00:00
Hans Petter Selasky
714ed5b27b Fix for deadlock situation in the LinuxKPI's RCU synchronize API.
Deadlock condition:
The return value of TDQ_LOCKPTR(td) is the same for two threads.

1) The first thread signals a wakeup while keeping the rcu_read_lock().
This invokes sched_add() which in turn will try to lock TDQ_LOCK().

2) The second thread is calling synchronize_rcu() calling mi_switch() over
and over again trying to yield(). This prevents the first thread from running
and releasing the RCU reader lock.

Solution:
Release the thread lock while yielding to allow other threads to acquire the
lock pointed to by TDQ_LOCKPTR(td).

Found by:	KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-21 11:51:40 +00:00
Konstantin Belousov
9ed84d55c1 Simplify the code.
Noted by:	Oliver Pinter
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-20 11:18:16 +00:00
Konstantin Belousov
e5cffdd34b Do not drop NFS vnode lock when performing consistency checks.
Currently several paths in the NFS client upgrade the shared vnode
lock to exclusive, which might cause temporal dropping of the lock.
This action appears to be fatal for nullfs mounts over NFS. If the
operation is performed over nullfs vnode, then bypassed down to NFS
VOP, and the lock is dropped, other thread might reclaim the upper
nullfs vnode.  Since on reclaim the nullfs vnode lock and NFS vnode
lock are split, the original lock state of the nullfs vnode is not
restored.  As result, VFS operations receive not locked vnode after a
VOP call.

Stop upgrading the vnode lock when we check the consistency or flush
buffers as result of detected inconsistency.  Instead, allocate a new
lockmgr lock for each NFS node, which is locked exclusive instead of
the vnode lock upgrade.  In other words, the other parallel
modification of the vnode are excluded by either vnode lock conflict
or exclusivity of the new lock when the vnode lock is shared.

Also revert r316529 because now the vnode cannot be reclaimed during
ncl_vinvalbuf().

In collaboration with:	pho
Reviewed by:	rmacklem
Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D12083
2017-08-20 10:08:45 +00:00
Konstantin Belousov
b59ea73029 Allow vinvalbuf() to operate with the shared vnode lock.
This mode allows other clean buffers to arrive while we flush the buf
lists for the vnode, which is fine for the targeted use.  We only need
that all buffers existed at the time of the function start were
flushed.  In fact, only one assert has to be relaxed.

In collaboration with:	pho
Reviewed by:	rmacklem
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
X-Differential revision:	https://reviews.freebsd.org/D12083
2017-08-20 10:07:45 +00:00
Konstantin Belousov
43b7b1f29b Simplify amd64 trap().
- Use more relevant name 'signo' instead of 'i' for the local variable
  which contains a signal number to send for the current exception.
- Eliminate two labels 'userout' and 'out' which point to the very end
  of the trap() function.  Instead use return directly.
- Re-indent the prot_fault_translation block by reducing if() nesting.
- Some more monor style changes.

Requested and reviewed by:	bde
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-20 09:52:25 +00:00
Konstantin Belousov
4031ebef84 Trim excessive 'extern' and remove unused declaration.
Reviewed by:	bde
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-20 09:42:09 +00:00
Konstantin Belousov
dad2e0e420 Use ANSI C declaration for trap_pfault(). Style.
Reviewed by:	bde
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-20 09:39:10 +00:00
Mark Johnston
00d47e02ac Define prefetch() only if it hasn't already been defined.
MFC after:	1 week
2017-08-20 01:42:01 +00:00
Mark Johnston
faf7a6e18c Add a couple of trivial headers to the LinuxKPI.
MFC after:	1 week
2017-08-20 01:40:24 +00:00
Conrad Meyer
c768afe370 hwpstate: Add support for family 17h pstate info from MSRs
This information is normally available via acpi_perf, but in case it is not,
add support for fetching the information via MSRs on AMD family 17h (Zen)
processors.  Zen uses a slightly different formula than previous generation
AMD CPUs.

This was inspired by, but does not fix, PR 221621.

Reported by:	Sean P. R. <seanpr AT swbell.net>
Reviewed by:	mjoras@
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12082
2017-08-20 00:41:49 +00:00
Bruce Evans
36e19a0f8c Fix setting of defaults for the text cursor.
There was already a per-vty defaults field, but it was useless since it was
only initialized when propagating the global settings and thus no different
from the current global settings and not per-vty.  The global defaults field
was also invariant after boot time, but not quite so useless.

Fix this by adding a second selection bit the the control flags of the
relevant ioctl().  vidcontrol doesn't support this yet.  Setting either
default propagates the change to the current setting for the same level
and then to all lower levels.

Improve the 3-way escape sequence used by termcap to control the cursor.
The "normal" (ve) case has always used reset, so the user could set
it to anything, but since the reset is to a global value this is not
very useful, especially since the "very visible" (vs) case doesn't
reset but inconsistently forces to a blinking block.  Change vs to
first reset and then XOR the blinking bit so that it is predictably
different from ve.
2017-08-19 23:13:33 +00:00
Bruce Evans
4ea1f4f5ea Rename curr_curs_attr to base_curr_attr. The actual current cursor
attribute field is curs_attr.  The base field holds user data translated
in a reversible way and is needed because current field holds this in
an irreversible way for efficiency.

Factor out some common code for the reversible translation.  This is
slightly simpler now, and much easier to expand.

Translate the magic flags value -1 to a single control flag internally
up front so other flags can be trusted later.  This can be used for the
relevant ioctl() too.

Remove CONS_CURSOR_FLAGS which contained all the control flags.  It was
unused and not useful.  After adding more flags, there will be tests on
a couple at a time but never on them all.  This API should have used this
to disallow unknown flags.
2017-08-19 21:40:42 +00:00
Konstantin Belousov
cef5e2bd91 Use the known valid segment when accessing memory in #UD handler.
Make sure that %eflags.D flag is cleared for hook.
Improve comments.

When #UD dtrace code checks for a registered hook before checking that
the exception was raised from kernel mode, we might run with the user
%ds, trapping on access.  Exception entry from userspace automatically
load valid %ss, which we can use there instead.

Noted and reviewed by:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2017-08-19 21:00:02 +00:00
Bruce Evans
7692d200c1 Use better hard-coded defaults for the cursor shape, and remove nearby
redundant initializations.

Hard-code base = 0, height = (approx. 1/8 of the boot-time font height)
in all cases, and remove the BIOS/MD support for setting these values.
This asks for an underline cursor sized for the boot-time font instead
of various less hard-coded but worse values.  I used that think that
the x86 BIOS always gave the same values as the above hard-coding, but
on 1 of my systems it gives the wrong value of base = 1.

The remaining BIOS fields are shift_state and bell_pitch.  These are now
consistently not explicitly reinitialized to 0.  All sc_get_bios_value()
functions except x86's are now empty, and the only useful thing that x86
returns is shift_state.  This really belongs in atkbdc, but heavier
use of the BIOS to read the more useful typematic rate has been removed
there.  fb still makes much heavier use of the BIOS.
2017-08-19 19:33:16 +00:00
Andrew Turner
43f0edd4e7 Remove redundant declarations. Newer gcc has a warning for these so will
fail when building with -Werror.

Sponsored by:	DARPA, AFRL
2017-08-19 17:18:27 +00:00
Andrew Turner
3f32b92b1d Use armv8-a in -march, it is accepted by both clang and gcc.
Sponsored by:	DARPA, AFRL
2017-08-19 17:15:40 +00:00
Vladimir Kondratyev
76136d200d Add support for generic MS Windows 7/8/10-compatible USB HID touchscreens
found in many laptops.

Reviewed by:		hps, gonzo, bcr (manpages)
Approved by:		gonzo (mentor)
Differential Revision:	https://reviews.freebsd.org/D12017
2017-08-19 17:00:10 +00:00
Emmanuel Vadot
d5d62a7a89 RPI DTS: Add value previously set by VideoCore and DTB links
Using latest U-Boot for RPI 1 or 2 the DTB loaded by the firmware is discarded.
The DTB was previously patched by the firmware to contain the DMA channel mask.
DTB provided by the rpi firmware or DTS in the Linux tree contain the raw value
directly. Do the same for our DTS as we cannot switch to the upstream ones yet.
Not having the DMA channel mask setup properly cause mmc not to be detected
(and probably other problems on driver using DMA).

Also, add links for rpi dtb to the name used by u-boot. This way the dtb can be
loaded by ubldr using the U-Boot env variable fdtfile.

Tested On: RPI B Rev2, RPI Zero, RPI 2 v1.1 RPI 2 v1.2

Thanks to Sylvain Garrigues <sylvain@sylvaingarrigues.com> for the help.

PR:		218344
2017-08-19 14:27:11 +00:00
Ed Maste
49c6edfc84 sys/modules: don't build qlxgbe if the user objects to sourceless ucode
PR:		204749
Submitted by:	Fabian Keil
Obtained from:	ElectroBSD
MFC after:	1 week
2017-08-19 01:12:05 +00:00
Ed Maste
722f80aeb5 sys/modules: don't build bxe if the user objects to sourceless ucode
PR:		204747
Submitted by:	Fabian Keil
Obtained from:	ElectroBSD
MFC after:	1 week
2017-08-19 00:45:29 +00:00
Conrad Meyer
9657edd793 Move some other SI_SUB_INIT_IF initializations to SI_SUB_TASKQ
Drop the EARLY_AP_STARTUP gtaskqueue code, as gtaskqueues are now
initialized before APs are started.

Reviewed by:	hselasky@, jhb@
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12054
2017-08-18 18:55:07 +00:00
Konstantin Belousov
828304899d When checking that #UD comes from kernel mode, check that the
exception did not happen in vm86 mode.  A vm86 userland process could
have a %cs that matches GSEL_KPL, while dtrace cannot hook it.

Submitted by:	Maxime Villard <max@m00nbsd.net>
MFC after:	3 days
2017-08-18 17:11:15 +00:00
Ed Maste
0b4060b073 cam iosched: fix typos in comments
PR:		220947
Submitted by:	Fabian Keil
Obtained from:	ElectroBSD
2017-08-18 16:38:33 +00:00
Bruce Evans
15e0c6511a Fix syscons escape sequence for setting the local cursor type. This sequence
was aliased to a vt sequence, causing and fixing various bugs.

For syscons, this restores support for arg 2 which sets blinking block
too forcefully, and restores bugs for arg 0 and 1.  Arg 2 is used for
vs in the cons25 entry in termcap, but I've never noticed an application
that uses this.  The bugs involve replacing local settings by global
ones and need better handling of defaults to fix.

For vt, this requires moving the aliasing code from teken to vt where
it belongs.  This sequences is very important for cons25 compatibility
in vt since it is used by the cons25 termcap entries for ve, vi and
vs.  vt can't properly support vs for either cons25 or xterm since it
doesn't support blinking.  For xterm, the termcap entry for vs asks
for something different using 12;25h instead of 25h.

Rename C25CURS for this to C25LCT and change its description to be closer
to echoing the old comment about it.  CURS is too generic.

Fix missing syscons escape sequence for setting the global cursor shape
(and type).  Only support this in syscons since vt can't emulate anything
in it.
2017-08-18 15:40:40 +00:00
Ruslan Bukin
5651294282 Fix module unload when SGX support is not present in CPU.
Sponsored by:	DARPA, AFRL
2017-08-18 14:47:06 +00:00
Gleb Smirnoff
03f55691de Fix cut and paste typo that prevented T5 firmware to be compiled in.
Reviewed by:	np
2017-08-18 14:30:12 +00:00
Bruce Evans
97933a41fb Improve names for cons25 sequences.
In a recent commit, I forgot to expand an X to an abbreviation of "BORDER".
Fix this and some nearby bad names.

The descriptions were copied from comments in scterm-sc.c, but some
of these are bad.  The border [color] was inconsistently described as
a property of the "display", but I had changed this to "adapter" to
match the descriptions for other color settings.  All colors supported
by the cons25 sequences are actually properties of the current vty and
that should not be described.  But the other colors are defaults.
Change "adapter" to "default" for them and remove "adapter" for the
border.  Reduce the verbosity of the abbreviation from AD to D.
2017-08-18 14:04:14 +00:00
Bruce Evans
e4501d816b Fix vt100 escape sequence for showing and hiding the cursor in syscons.
It should toggle between 2 states, but it used a cut-down version of
support for a related 3-state syscons escape sequence and inherited
bugs from that.  The usual misbehaviour was that hiding and showing
the cursor reset it to a global default.

Support for the 3-state sequence remains broken by aliasing to the 2-state
sequence.  This works better but incompatibly for the 2 cases that it
supports.
2017-08-18 12:45:00 +00:00
Bruce Evans
dd833891de Fix missing syscons escape sequence for setting the border color. 2017-08-18 10:38:49 +00:00
Ryan Libby
19a9c3df5e safe: quiet -Wtautological-compare
Code was testing that an unsigned type was >= 0.

Reviewed by:	markj
Approved by:	markj (mentor)
Sponsored by:	Dell EMC Isilon
2017-08-18 08:05:33 +00:00
Michael Tuexen
63ec505a4f Ensure inp_vflag is consistently set for TCP endpoints.
Make sure that the flags INP_IPV4 and INP_IPV6 are consistently set
for inpcbs used for TCP sockets, no matter if the setting is derived
from the net.inet6.ip6.v6only sysctl or the IPV6_V6ONLY socket option.
For UDP this was already done right.

PR:		221385
MFC after:	1 week
2017-08-18 07:27:15 +00:00
Mark Johnston
e9666bf645 Remove some unneeded subroutines for padding writes to dump devices.
Right now we only need to pad when writing kernel dump headers, so
flatten three related subroutines into one. The encrypted kernel dump
code already writes out its key in a dumper.blocksize-sized block.

No functional change intended.

Reviewed by:	cem, def
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11647
2017-08-18 04:07:25 +00:00
Mark Johnston
01938d3666 Rename mkdumpheader() and group EKCD functions in kern_shutdown.c.
This helps simplify the code in kern_shutdown.c and reduces the number
of globally visible functions.

No functional change intended.

Reviewed by:	cem, def
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11603
2017-08-18 04:04:09 +00:00
Mark Johnston
50ef60dabe Factor out duplicated kernel dump code into dump_{start,finish}().
dump_start() and dump_finish() are responsible for writing kernel dump
headers, optionally writing the key when encryption is enabled, and
initializing the initial offset into the dump device.

Also remove the unused dump_pad(), and make some functions static now that
they're only called from kern_shutdown.c.

No functional change intended.

Reviewed by:	cem, def
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11584
2017-08-18 03:52:35 +00:00
Lawrence Stewart
9a61faf67d An off-by-one error exists in sbuf_vprintf()'s use of SBUF_HASROOM() when an
sbuf is filled to capacity by vsnprintf(), the loop exits without error, and
the sbuf is not marked as auto-extendable.

SBUF_HASROOM() evaluates true if there is room for one or more non-NULL
characters, but in the case that the sbuf was filled exactly to capacity,
SBUF_HASROOM() evaluates false. Consequently, sbuf_vprintf() incorrectly
assigns an ENOMEM error to the sbuf when in fact everything is fine, in turn
poisoning the buffer for all subsequent operations.

Correct by moving the ENOMEM assignment into the loop where it can be made
unambiguously.

As a related safety net change, explicitly check for the zero bytes drained
case in sbuf_drain() and set EDEADLK as the error. This avoids an infinite loop
in sbuf_vprintf() if a drain function were to inadvertently return a value of
zero to sbuf_drain().

Reviewed by:	cem, jtl, gallatin
MFC after:	2 weeks
Sponsored by:	Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D8535
2017-08-18 02:06:28 +00:00
Oleg Bulyzhin
2052157d77 Fix BSD label partition end sector calculation.
Reviewed by:	ae
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D12066
2017-08-17 19:39:42 +00:00
Ed Maste
e7b993842e arm64: return error instead of panic in unimplemented ptrace ops
We don't need a panic as a reminder that these need to be implemented.

Reported by:	Shawn Webb
MFC after:	3 week
Sponsored by:	The FreeBSD Foundation
2017-08-17 19:16:23 +00:00
Conrad Meyer
0b53ecd1d7 Discover CPU topology on multi-die AMD Zen systems
The Nodes per Processor topology information determines how many bits of the
APIC ID represent the Node (Zeppelin die, on Zen systems) ID.  Documented in
Ryzen and Epyc Processor Programming Reference (PPR).

Correct topology information enables the scheduler to make better decisions
on this hardware.

Reviewed by:	kib@
Tested by:	jeff@ (earlier version)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11801
2017-08-17 16:54:37 +00:00
Lawrence Stewart
a8ec96af28 Implement simple record boundary tracking in sbuf(9) to avoid record splitting
during drain operations. When an sbuf is configured to use this feature by way
of the SBUF_DRAINTOEOR sbuf_new() flag, top-level sections started with
sbuf_start_section() create a record boundary marker that is used to avoid
flushing partial records.

Reviewed by:	cem,imp,wblock
MFC after:	2 weeks
Sponsored by:	Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D8536
2017-08-17 07:20:09 +00:00
Conrad Meyer
35d87c7e96 Fix unused varable warning in !SMP case
Fallout from r322588.  I'm not sure why !SMP is a knob we have, but, we have
it.

Reported by:	Michael Butler <imb AT protected-networks.net>
Sponsored by:	Dell EMC Isilon
2017-08-17 04:37:27 +00:00
John Baldwin
b99836cea9 Mark ZFS ABD inline functions static.
When built with -fno-inline-functions zfs.ko contains undefined references
to these functions if they are only marked inline.

Reviewed by:	avg (earlier version)
MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-16 23:40:32 +00:00
Ryan Libby
d395fd0d46 aesni: quiet -Wcast-qual
Reviewed by:	delphij
Approved by:	markj (mentor)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12021
2017-08-16 22:54:35 +00:00
Conrad Meyer
fe0593d72a Add SI_SUB_TASKQ after SI_SUB_INTR and move taskqueue initialization there for EARLY_AP_STARTUP
This fixes a regression accidentally introduced in r322588, due to an
interaction with EARLY_AP_STARTUP.

Reviewed by:	bdrewery@, jhb@
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12053
2017-08-16 21:42:27 +00:00
Warner Losh
0362fb3a9c Define proposed GUID for FreeBSD boot loader variables. 2017-08-16 20:09:39 +00:00
Warner Losh
d9983db1f6 Remove unused defines. 2017-08-16 20:06:38 +00:00
Kristof Provost
9ce40d321d bpf: Fix incorrect cleanup
Cleaning up a bpf_if is a two stage process. We first move it to the
bpf_freelist (in bpfdetach()) and only later do we actually free it (in
bpf_ifdetach()).

We cannot set the ifp->if_bpf to NULL from bpf_ifdetach() because it's
possible that the ifnet has already gone away, or that it has been assigned
a new bpf_if.
This can lead to a struct ifnet which is up, but has if_bpf set to NULL,
which will panic when we try to send the next packet.

Keep track of the pointer to the bpf_if (because it's not always
ifp->if_bpf), and NULL it immediately in bpfdetach().

PR:		213896
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11782
2017-08-16 19:40:07 +00:00
Conrad Meyer
dc6a82801d x86: Add dynamic interrupt rebalancing
Add an option to dynamically rebalance interrupts across cores
(hw.intrbalance); off by default.

The goal is to minimize preemption. By placing interrupt sources on distinct
CPUs, ithreads get preferentially scheduled on distinct CPUs.  Overall
preemption is reduced and latency is reduced. In our workflow it reduced
"fighting" between two high-frequency interrupt sources.  Reduced latency
was proven by, e.g., SPEC2008.

Submitted by:	jeff@ (earlier version)
Reviewed by:	kib@
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D10435
2017-08-16 18:48:53 +00:00
Bryan Drewery
96dd05dd7d Quote ${MAKE} when passing in env in case it contains spaces.
Downstream we are wrapping MAKE with a limits(1) call which
interferes with these non-quoted cases.

Sponsored by:	Dell EMC Isilon
2017-08-16 17:54:24 +00:00
Ian Lepore
ce44a73667 Fix compile error with option DEBUG. This is fallout from some long-ago
INTRNG refactoring that didn't get caught at the time because code in a
debugf() statement isn't compiled unless DEBUG is defined.

PR:		221557
2017-08-16 16:51:55 +00:00
Ruslan Bukin
7dea76609b Rename macro DEBUG to SGX_DEBUG.
This fixes LINT kernel build.

Reported by:	lwhsu
Sponsored by:	DARPA, AFRL
2017-08-16 13:44:46 +00:00
Bruce Evans
60e47915b4 Undeprecate the CONS_CURSORTYPE ioctl. It was "deprecated" in 2001,
but it was actually extended then and it is still used (just once) in
/usr/src by its primary user (vidcontrol), while its replacement is
still not used in /usr/src.

yokota became inactive soon after deprecating CONS_CURSORTYPE (this
was part of a large change to make cursor attributes per-vty).

vidcontrol has incomplete support even for the old ioctl.  I will
update it soon.  Then there are many broken escape sequences to fix.
This is just to prepare for setting cursor colors using vidcontrol.
2017-08-16 10:59:37 +00:00
Ruslan Bukin
2164af29a0 Add support for Intel Software Guard Extensions (Intel SGX).
Intel SGX allows to manage isolated compartments "Enclaves" in user VA
space. Enclaves memory is part of processor reserved memory (PRM) and
always encrypted. This allows to protect user application code and data
from upper privilege levels including OS kernel.

This includes SGX driver and optional linux ioctl compatibility layer.
Intel SGX SDK for FreeBSD is also available.

Note this requires support from hardware (available since late Intel
Skylake CPUs).

Many thanks to Robert Watson for support and Konstantin Belousov
for code review.

Project wiki: https://wiki.freebsd.org/Intel_SGX.

Reviewed by:	kib
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11113
2017-08-16 10:38:06 +00:00
Ruslan Bukin
7bbdb843b6 Add OBJ_PG_DTOR flag to VM object.
Setting this flag allows us to skip pages removal from VM object queue
during object termination and to leave that for cdev_pg_dtor function.

Move pages removal code to separate function vm_object_terminate_pages()
as comments does not survive indentation.

This will be required for Intel SGX support where we will have to remove
pages from VM object manually.

Reviewed by:	kib, alc
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11688
2017-08-16 08:49:11 +00:00
Mark Johnston
1f1c4ea123 Add device resource management fields to struct device.
MFC after:	1 week
2017-08-16 06:33:48 +00:00
Navdeep Parhar
b58fd65490 cxgbe/t4_tom: Use correct name for the ISS-valid bit in options2.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-08-15 19:21:27 +00:00
Matt Joras
d148c2a2b1 Rework vlan(4) locking.
Previously the locking of vlan(4) interfaces was not very comprehensive.
Particularly there was very little protection against the destruction of
active vlan(4) interfaces or concurrent modification of a vlan(4)
interface. The former readily produced several different panics.

The changes can be summarized as using two global vlan locks (an
rmlock(9) and an sx(9)) to protect accesses to the if_vlantrunk field of
struct ifnet, in addition to other places where global exclusive access
is required. vlan(4) should now be much more resilient to the destruction
of active interfaces and concurrent calls into the configuration path.

PR:	220980
Reviewed by:	ae, markj, mav, rstone
Approved by:	rstone (mentor)
MFC after:	4 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11370
2017-08-15 17:52:37 +00:00
Mark Johnston
33fff5d536 Add vm_page_alloc_after().
This is a variant of vm_page_alloc() which accepts an additional parameter:
the page in the object with largest index that is smaller than the requested
index. vm_page_alloc() finds this page using a lookup in the object's radix
tree, but in some cases its identity is already known, allowing the lookup
to be elided.

Modify kmem_back() and vm_page_grab_pages() to use vm_page_alloc_after().
vm_page_alloc() is converted into a trivial wrapper of
vm_page_alloc_after().

Suggested by:	alc
Reviewed by:	alc, kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11984
2017-08-15 16:39:49 +00:00
Alan Somers
69b14f7acd Fix some ZFS debugging messages
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
	Be more careful about the use of provider names vs vdev names in
	ZFS_LOG statements.

MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-08-15 15:20:04 +00:00
Toomas Soome
fb46c37596 loader.efi: repace XXX with real comments in trap.c
There are two missing comments marked as XXX in trap.c, fix this.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D12035
2017-08-15 14:03:26 +00:00
Hans Petter Selasky
48a597ef59 Add new USB quirk.
Submitted by:		devel@stasyan.com
PR:			221328
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-08-15 08:44:36 +00:00
Xin LI
e6f9d1ce45 Plug memory leak in arge_encap().
Reported by:	Ilja Van Sprundel <ivansprundel ioactive.com>
Submitted by:	Domagoj Stolfa <domagoj.stolfa gmail.com>
Reviewed by:	adrian
MFC after:	3 days
2017-08-15 06:01:36 +00:00
Conrad Meyer
f3fed04372 Fix a couple of comment typos
No functional change.

Submitted by:	Anton Rang <anton.rang AT isilon.com>
Sponsored by:	Dell EMC Isilon
2017-08-15 02:21:02 +00:00
Konstantin Belousov
baec5778ed Print whole machine state on double fault.
It is quite useful when double fault is not caused by a stack overflow.

Tested by:	pho (as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-14 11:23:07 +00:00
Konstantin Belousov
0fd7ea1f21 Add {rd,wr}{fs,gs}base C wrappers for instructions.
Tested by:	pho (as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-14 11:20:54 +00:00
Konstantin Belousov
7bf0049e48 Style.
Tested by:	pho (as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2017-08-14 11:20:10 +00:00
Sepherosa Ziehau
93b4e111bb hyperv: Update copyright for the files changed in 2017
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11982
2017-08-14 06:00:50 +00:00
Sepherosa Ziehau
d0cd8231e0 hyperv/hn: Re-set datapath after synthetic parts reattached.
Do this even for non-transparent mode VF. Better safe than sorry.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11981
2017-08-14 05:55:16 +00:00
Sepherosa Ziehau
c2d50b263f hyperv/hn: Minor cleanup
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11979
2017-08-14 05:46:50 +00:00
Sepherosa Ziehau
a97fff1913 hyperv/hn: Fix/enhance receiving path when VF is activated.
- Update hn(4)'s stats properly for non-transparent mode VF.
- Allow BPF tapping to hn(4) for non-transparent mode VF.
- Don't setup mbuf hash, if 'options RSS' is set.
  In Azure, when VF is activated, TCP SYN and SYN|ACK go through hn(4)
  while the rest of segments and ACKs belonging to the same TCP 4-tuple
  go through the VF.  So don't setup mbuf hash, if a VF is activated
  and 'options RSS' is not enabled.  hn(4) and the VF may use neither
  the same RSS hash key nor the same RSS hash function, so the hash
  value for packets belonging to the same flow could be different!
- Disable LRO.
  hn(4) will only receive broadcast packets, multicast packets, TCP
  SYN and SYN|ACK (in Azure), LRO is useless for these packet types.
  For non-transparent, we definitely _cannot_ enable LRO at all, since
  the LRO flush will use hn(4) as the receiving interface; i.e.
  hn_ifp->if_input(hn_ifp, m).

While I'm here, remove unapplied comment and minor style change.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11978
2017-08-14 05:40:52 +00:00
Sepherosa Ziehau
3bed4e54f8 hyperv/hn: Update VF's ibytes properly under transparent VF mode.
While, I'm here add comment about why updating VF's imcast stat is
not necessary.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11948
2017-08-14 05:30:02 +00:00
Ian Lepore
a6e709f29c Add hinted attachment for non-FDT systems. Also, print a message if
setting up the timer fails, because on some types of chips that's the
first attempt to access the device.  If the chip is missing/non-responsive
then you'd get a driver that attached and didn't register the rtc, with
no clue about why.  On other chip types there are inits that come before
timer setup, and they already print messages about errors.
2017-08-14 02:23:10 +00:00
Ian Lepore
1dc8df138d Add back the drivers for Dallas/Maxim ds13xx and Seiko S35390x now that
they've been rewritten/fixed to not cause panics by doing i2c transfers
before interrupts are available.

PR:		221227
2017-08-14 00:12:14 +00:00
Ian Lepore
098f6cb6e6 Minor fixes and enhancements for the s35390a i2c RTC driver...
- Add FDT probe code.
- Do i2c transfers with exclusive bus ownership.
- Use config_intrhook_oneshot() to defer chip setup because some i2c
  busses can't do transfers without interrupts.
- Add a detach() routine.
- Add to module build.
2017-08-14 00:00:24 +00:00
Ian Lepore
90cff13c3c Remove the old ds1374 driver and use the ds13rtc driver instead. Adjust
several mips config files accordingly.
2017-08-13 22:07:42 +00:00
Ian Lepore
3777ed4378 Change "chiptype" to "compatible". Making the hint name the same as the FDT
property name should make it easier to document the list of names accepted
by both configuration mechanisms.
2017-08-13 21:45:46 +00:00
Ian Lepore
bb2e8108e1 Add a new driver, ds13rtc, that handles all DS13xx series i2c RTC chips.
This driver supports only basic timekeeping functionality.  It completely
replaces the ds133x driver.  It can also replace the ds1374 driver, but that
will take a few other changes in MIPS code and config, and will be committed
separately.  It does NOT replace the existing ds1307 driver, which provides
access to some of the extended features on the 1307 chip, such as controlling
the square wave output signal.  If both ds1307 and ds13rtc drivers are
present, the ds1307 driver will outbid and win control of the device.

This driver can be configured with FDT data, or by using hints on non-FDT
systems.  In addition to the standard hints for i2c devices, it requires
a "chiptype" string of the form "dallas,ds13xx" where 'xx' is the chip id
(i.e., the same format as FDT compat strings).
2017-08-13 21:02:40 +00:00
Andrew Turner
062c276886 Add support for multiple GICv3 ITS devices. For this we add sc_irq_base
and sc_irq_length to the softc to handle the base number of IRQs available,
make gicv3_get_nirqs return the number of available interrupt IDs, and
limit which CPUs we send interrupts to based on the numa domain.

The last point is only strictly needed on a dual socket ThunderX where we
are unable to send MSI/MSI-X interrupts between sockets.

Sponsored by:	DARPA, AFRL
2017-08-13 18:54:51 +00:00
Ian Lepore
2db14f97de Add config_intrhook_oneshot(): schedule an intrhook function and unregister
it automatically after it runs.

The config_intrhook mechanism allows a driver to stall the boot process
until device(s) required for booting are available, by not allowing system
inits to proceed until all intrhook functions have been unregistered.
Virtually all existing code simply unregisters from within the hook function
when it gets called.

This new function makes that common usage more convenient. Instead of
allocating and filling in a struct, passing it to a function that might (in
theory) fail, and checking the return code, now a driver can simply call
this cannot-fail routine, passing just the intrhook function and its arg.

Differential Revision:	https://reviews.freebsd.org/D11963
2017-08-13 18:10:24 +00:00
Kirk McKusick
037331ddbd When read requests are sent from a filesystem running above g_journal,
the g_journal level needs to check whether it is holding a newer
copy of the block than that which exists on the disk. If so, it
needs to return its copy. If not, it should pass the request down
to the disk to fulfill. It currently considers six queues:

0) delayed queue,
1) unsent (current queue),
2) in-flight to the journal (flush queue),
3) active journal (active queue),
4) inactive journal (inactive queue), and
5) inflight to the disk (copy queue).

Checking on two of these queues is unnecessary:

0) The delayed requests should not be used for reads because they
   have not yet been entered into the journal, so their value should
   reflect the disk contents, not the future contents that are not
   yet committed.

2) Because all the bio's in the flush queue are also found on the
   active queue, there is no need to inspect the flush queue for
   reads since they will be found when searching the active queue.

Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Discussed with: kib
MFC after: 1 week
2017-08-13 18:09:22 +00:00
Kirk McKusick
8fccf8ffd7 Eliminate a variable that is only ever set.
Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Discussed with: kib
MFC after: 1 week
2017-08-13 18:06:38 +00:00
Alan Cox
bee93d3cf0 The *_meta_* functions include a radix parameter, a blk parameter, and
another parameter that identifies a starting point in the memory address
block.  Radix is a power of two, blk is a multiple of radix, and the
starting point is in the range [blk, blk+radix), so that blk can always be
computed from the other two.  This change drops the blk parameter from the
meta functions and computes it instead.  (On amd64, for example, this
change reduces subr_blist.o's text size by 7%.)

It also makes the radix parameters unsigned to address concerns that the
calculation of '-radix' might overflow without the -fwrapv option.  (See
https://reviews.freebsd.org/D11819.)

Submitted by:	Doug Moore <dougm@rice.edu>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11964
2017-08-13 16:39:49 +00:00
Roger Pau Monné
72446721e4 srat: use pmap_unmapbios
To match the pmap_mapbios.

Reported by:	jhb
MFC with:	r322403
2017-08-13 14:50:38 +00:00
Nathan Whitehorn
c670f31f19 Move NVME controller shutdown from being called as part of module unloading
to being called through the newbus DEVICE_SHUTDOWN() path. This ensures that
the NVME controller gets shut down before the device and bus disappear
and prevents data corruption on shutdown on at least Samsung EVO 960 SSDs.

PR:		kern/211852
Reviewed by:	imp
MFC after:	2 weeks
2017-08-12 22:13:06 +00:00
John Baldwin
992029ba10 Reliably enable debug exceptions on all CPUs.
Previously, debug exceptions were only enabled on the boot CPU if
DDB was enabled in the dbg_monitor_init() function.  APs also called
this function, but since mp_machdep.c doesn't include opt_ddb.h, the
APs ended up calling an empty stub defined in <machine/debug_monitor.h>
instead of the real function.  Also, if DDB was not enabled in the kernel,
the boot CPU would not enable debug exceptions.

Fix this by adding a new dbg_init() function that always clears the OS
lock to enable debug exceptions which the boot CPU and the APs call.
This function also calls dbg_monitor_init() to enable hardware breakpoints
from DDB on all CPUs if DDB is enabled.  Eventually base support for
hardware breakpoints/watchpoints will need to move out of the DDB-only
debug_monitor.c for use by userland debuggers.

Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D12001
2017-08-12 18:42:54 +00:00
John Baldwin
c9ee3caf19 Don't panic for PT_GETFPREGS.
Only fetch the VFP state from the CPU if the thread whose registers are
being requested is the current thread.  If a stopped thread's registers
are being fetched by a debugger, the saved state in the PCB is already
valid.

Reviewed by:	andrew
MFC after:	1 week
2017-08-12 18:38:18 +00:00
Ian Lepore
4541b9aab6 Bid for the device with BUS_PROBE_GENERIC, because this is very much a
generic driver with minimal feature support for a large number of chips.
More featureful per-chip drivers might exist (especially out-of-tree) and
those should win the bidding even if they use BUS_PROBE_DEFAULT.
2017-08-12 17:39:32 +00:00
Navdeep Parhar
5d973bad2a cxgbe(4): Save the last reported link parameters and compare them with
the current state to determine whether to generate a link-state change
notification.  This fixes a bug introduced in r321063 that caused the
driver to sometimes skip these notifications.

Reported by:	Jason Eggleston @ LLNW
MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-08-12 14:02:19 +00:00
John Baldwin
319e8f3e36 Fix a typo. 2017-08-11 22:47:32 +00:00
Ed Maste
67d4951ddf arm: enable ARM_MANY_BOARD in NOTES for LINT build
Added in r238189, ARM_MANY_BOARD adds support for multiple ARM boards in
a single kernel. Include it for LINT builds to avoid duplicate symbol
errors when linking with lld.

Sponsored by:	The FreeBSD Foundation
2017-08-11 19:49:29 +00:00
Mark Johnston
d055a46cda Bump KERNELDUMP_BUFFER_SIZE to 4096.
The encrypted kernel dump code writes data in blocks of this size. A buffer
size of 4096 allows encrypted dumps to work with 4Kn drives.

Reviewed by:	cem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11870
2017-08-11 19:24:08 +00:00
Ian Lepore
c82d887d47 Stop calling atrtc_set() from the xen timer clock_settime() method. That
removes the only reference to atrtc_set() from outside of atrtc.c, so make
it static.

The xen timer driver registers as a realtime clock with 1us resolution.  In
the past that resulted in only the xen timer's clock_settime() getting
called, so it would call atrtc_set() to set the hardware clock as well.  As
of r32090, the clock_settime() method of all registered realtime clocks gets
called, so the xen driver no longer needs to chain-call the lower-resolution
driver.

Thanks to royger@ for talking me through the xen stuff, and for testing.
2017-08-11 19:02:11 +00:00
Ed Maste
9432a9bd9f Rename at91_pmc's M_PMC malloc type to avoid duplicate definition
M_PMC is defined in sys/dev/hwpmc/hwpmc_mod.c, and the LINT kernel build
fails when linking with lld due to a duplicate symbol error.

Sponsored by:	The FreeBSD Foundation
2017-08-11 18:09:26 +00:00
David C Somayajulu
45f1312387 Performance enhancements to reduce CPU utililization for large number of
TCP connections (order of tens of thousands), with predominantly Transmits.

Choice to perform receive operations either in IThread or Taskqueue Thread.

Submitted by:Vaishali.Kulkarni@cavium.com
MFC after:5 days
2017-08-11 17:43:25 +00:00
Ryan Libby
9de5f67de2 x86/crc32_sse42.c: quiet unused function warning
Reviewed by:	cem
Approved by:	markj (mentor)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11980
2017-08-11 17:05:31 +00:00
Mark Johnston
af0460beda Have sendfile_swapin() use vm_page_grab_pages().
Reviewed by:	alc, kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11942
2017-08-11 16:32:24 +00:00
Mark Johnston
9df950b35d Modify vm_page_grab_pages() to handle VM_ALLOC_NOWAIT.
This will allow its use in sendfile_swapin().

Reviewed by:	alc, kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11942
2017-08-11 16:29:22 +00:00
Alan Cox
6921451dab An invalid page can't be dirty.
Reviewed by:	kib
MFC after:	1 week
2017-08-11 16:27:54 +00:00
Roger Pau Monné
c642d2f5b5 acpi/srat: fix build without DMAP
Use pmap_mapbios to map memory used to store the cpus array.

Reported by:	lwhsu
X-MFC-with:	r322348
2017-08-11 14:19:55 +00:00
Andrew Turner
a92a2f00b1 Only return the current cpu if it's in the cpumask. When we restrict the
cpumask it probably means we are unable to sent interrupts to CPUs outside
the map. As such only return the current CPU when it's within the mask
otherwise return the first valid CPU.

This is needed on ThunderX as, in a dual socket configuration, we are
unable to send MSI/MSI-X interrupts between sockets.

Reviewed by:	mmel
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11957
2017-08-11 12:45:58 +00:00
Hans Petter Selasky
ebf854802d Make sure the "vm_flags" and "vm_page_prot" fields get set correctly
in the VM area structure in the LinuxKPI when doing mmap() and that
unsupported bits are masked away.

While at it fix some redundant use of parenthesing inside some related
macros.

Found by:	KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-11 10:44:40 +00:00
Mark Johnston
0b7bd01a82 Add a specialized function for DRM drivers to register themselves.
Such drivers attach to a vgapci bus rather than directly to a pci bus. For
the rest of the LinuxKPI to work correctly in this case, we override the
vgapci bus' ivars with those of the grandparent.

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11932
2017-08-11 03:59:48 +00:00
Mark Johnston
7e05ffa6e6 Micro-optimize kmem_unback().
We can remove some unnecessary object radix tree lookups by using the
object memq to iterate over pages in the specified range. This does not,
however, eliminate the lookup needed in vm_page_free_toq() to remove each
tree entry.

Reviewed by:	alc, kib (previous revision)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11945
2017-08-11 03:09:11 +00:00
Mark Johnston
2c642ec1e7 Make vm_page_sunbusy() assert that the page is unlocked.
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11946
2017-08-10 22:43:38 +00:00
Ian Lepore
c89cfb4377 Ensure the clocks driver is attached before any drivers that need to enable
clocks in their attach().
2017-08-10 19:42:30 +00:00
Roger Pau Monné
3f0a9fe06c mptable: fix i386 build failure
Reported by:	emaste
X-MFC-with:	r322347
2017-08-10 17:46:57 +00:00
Kenneth D. Merry
6d4ffcb4ac Changes to make mps(4) and mpr(4) handle reinit with reallocation.
When the mps(4) and mpr(4) drivers need to reinitialize the
firmware, they sometimes need to reallocate all of the memory
allocated by the driver.  The reallocation happens whenever the IOC
Facts change.  That should only happen after a firmware upgrade.

If the reinitialization happens as a result of a timed out command
sent to the card, the command that timed out and triggered the
reinit may have been freed if iocfacts_allocate() reallocated all
memory.  If the caller attempts to access the command after that,
the kernel will panic because the caller will be dereferencing
freed memory.

The solution is to set a flag in the softc when we reallocate,
and avoid dereferencing the command strucure if we've reallocated.

The changes are largely the same in both drivers, since mpr(4) is a
derivative of mps(4).

 o In iocfacts_allocate(), if the IOC Facts have changed and we
   need to reallocate, set the REALLOCATED flag in the softc.

 o Change wait_command() to take a struct mps_command ** instead of
   a struct mps_command *.  This allows us to NULL out the caller's
   command pointer if we have to reinit the controller and the data
   structures get reallocated.  (The REALLOCATED flag will be set
   in the softc if that has happened.)

 o In every place that calls wait_command(), make sure we handle
   the case where the command is NULL after the call.

 o The mpr(4) driver has mpr_request_polled() which can also
   reinitialize the card.  Also check for reallocation there.

Reviewed by:	scottl, slm
MFC after:	1 week
Sponsored by:	Spectra Logic
2017-08-10 14:59:17 +00:00
Sean Bruno
5f593927a8 Purge deprecated locking macros.
Submitted by:	Matt Macy <matt@mattmacy.io>
Sponsored by:	Limelight Networks
2017-08-10 14:54:36 +00:00
Ruslan Bukin
af19cc59ca Support for v1.10 (latest) of RISC-V privilege specification.
New version is not compatible on supervisor mode with v1.9.1
(previous version).

Highlights:
    o BBL (Berkeley Boot Loader) provides no initial page tables
      anymore allowing us to choose VM, to build page tables manually
      and enable MMU in S-mode.
    o SBI interface changed.
    o GENERIC kernel.
      FDT is now chosen standard for RISC-V hardware description.
      DTB is now provided by Spike (golden model simulator). This
      allows us to introduce GENERIC kernel. However, description
      for console and timer devices is not provided in DTB, so move
      these devices temporary to nexus bus.
    o Supervisor can't access userspace by default. Solution is to
      set SUM (permit Supervisor User Memory access) bit in sstatus
      register.
    o Compressed extension is now turned on by default.
    o External GCC 7.1 compiler used.
    o _gp renamed to __global_pointer$
    o Compiler -march= string is now in use allowing us to choose
      required extensions (compressed, FPU, atomic, etc).

Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11800
2017-08-10 14:18:09 +00:00
Marcin Wojtas
e2b9d20234 Enable OF_setprop API function to add property in FDT
This patch modifies function ofw_fdt_setprop (called by OF_setprop),
so that it can add property, when replacing is not possible.
Adding property is needed to fixup FDT's that have missing
properties.

Submitted by: Patryk Duda <pdk@semihalf.com>
Reviewed by: nwhitehorn, cognet (mentor)
Approved by: cognet (mentor)
Obtained from: Semihalf
Differential Revision: https://reviews.freebsd.org/D11879
2017-08-10 13:45:56 +00:00
Hans Petter Selasky
f6800be3ce Use integer type to pass around jiffies and/or ticks values in the
LinuxKPI because in FreeBSD ticks are 32-bit.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-10 13:05:40 +00:00
Hans Petter Selasky
4ef8a6301f Fixes for wait event in the LinuxKPI. These are regression issues
after r319757.

1) Correct the return value from __wait_event_common() from 1 to 0 in
case the timeout is specified as MAX_SCHEDULE_TIMEOUT. In the other
case __ret is zero and will be substituted in the last part of the
macro with the appropriate value before return.

2) Make sure the "timeout" argument is casted to "int" before
evaluating negativity. Else the signedness of a "long" might be
checked instead of the signedness of an integer.

3) The wait_event() function should not have a return value.

Found by:	KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-10 13:00:10 +00:00
Hans Petter Selasky
8ea4441598 Make sure the linux_wait_event_common() function in the LinuxKPI properly
handles a timeout value of MAX_SCHEDULE_TIMEOUT which basically means there
is no timeout. This is a regression issue after r319757.

While at it change the type of returned variable from "long" to "int" to
match the actual return type.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-10 12:51:04 +00:00
Roger Pau Monné
a74bb29ada x86: bump MAX_APIC_ID to 512
Introduce a new define to take int account the xAPIC ID limit, for
systems where x2APIC is not available/reliable.

Also change some of the usages of the APIC ID to use an unsigned int
(which is the correct storage type to deal with x2APIC IDs as found in
x2APIC MADT entries).

This allows booting FreeBSD on a box with 256 CPUs and APIC IDs up to
295:

FreeBSD/SMP: Multiprocessor System Detected: 256 CPUs
FreeBSD/SMP: 1 package(s) x 64 core(s) x 4 hardware threads
Package HW ID = 0
	Core HW ID = 0
		CPU0 (BSP): APIC ID: 0
		CPU1 (AP/HT): APIC ID: 1
		CPU2 (AP/HT): APIC ID: 2
		CPU3 (AP/HT): APIC ID: 3
[...]
	Core HW ID = 73
		CPU252 (AP): APIC ID: 292
		CPU253 (AP/HT): APIC ID: 293
		CPU254 (AP/HT): APIC ID: 294
		CPU255 (AP/HT): APIC ID: 295

Submitted by:		kib (previous version)
Relnotes:		yes
MFC after:		1 month
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D11913
2017-08-10 09:16:40 +00:00
Roger Pau Monné
84525e55c1 x86: make the arrays that depend on MAX_APIC_ID dynamic
So that MAX_APIC_ID can be bumped without wasting memory.

Note that the usage of MAX_APIC_ID in the SRAT parsing forces the
parser to allocate memory directly from the phys_avail physical memory
array, which is not the best approach probably, but I haven't found
any other way to allocate memory so early in boot. This memory is not
returned to the system afterwards, but at least it's sized according
to the maximum APIC ID found in the MADT table.

Sponsored by:		Citrix Systems R&D
MFC after:		1 month
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D11912
2017-08-10 09:16:03 +00:00
Roger Pau Monné
fd1f83fb45 apic_enumerator: only set mp_ncpus and mp_maxid at probe cpus phase
Populate the lapics arrays and call cpu_add/lapic_create in the setup
phase instead. Also store the max APIC ID found in the newly
introduced max_apic_id global variable.

This is a requirement in order to make the static arrays currently
using MAX_LAPIC_ID dynamic.

Sponsored by:		Citrix Systems R&D
MFC after:		1 month
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D11911
2017-08-10 09:15:18 +00:00