Commit Graph

118010 Commits

Author SHA1 Message Date
Ryan Libby
19a9c3df5e safe: quiet -Wtautological-compare
Code was testing that an unsigned type was >= 0.

Reviewed by:	markj
Approved by:	markj (mentor)
Sponsored by:	Dell EMC Isilon
2017-08-18 08:05:33 +00:00
Michael Tuexen
63ec505a4f Ensure inp_vflag is consistently set for TCP endpoints.
Make sure that the flags INP_IPV4 and INP_IPV6 are consistently set
for inpcbs used for TCP sockets, no matter if the setting is derived
from the net.inet6.ip6.v6only sysctl or the IPV6_V6ONLY socket option.
For UDP this was already done right.

PR:		221385
MFC after:	1 week
2017-08-18 07:27:15 +00:00
Mark Johnston
e9666bf645 Remove some unneeded subroutines for padding writes to dump devices.
Right now we only need to pad when writing kernel dump headers, so
flatten three related subroutines into one. The encrypted kernel dump
code already writes out its key in a dumper.blocksize-sized block.

No functional change intended.

Reviewed by:	cem, def
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11647
2017-08-18 04:07:25 +00:00
Mark Johnston
01938d3666 Rename mkdumpheader() and group EKCD functions in kern_shutdown.c.
This helps simplify the code in kern_shutdown.c and reduces the number
of globally visible functions.

No functional change intended.

Reviewed by:	cem, def
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11603
2017-08-18 04:04:09 +00:00
Mark Johnston
50ef60dabe Factor out duplicated kernel dump code into dump_{start,finish}().
dump_start() and dump_finish() are responsible for writing kernel dump
headers, optionally writing the key when encryption is enabled, and
initializing the initial offset into the dump device.

Also remove the unused dump_pad(), and make some functions static now that
they're only called from kern_shutdown.c.

No functional change intended.

Reviewed by:	cem, def
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11584
2017-08-18 03:52:35 +00:00
Lawrence Stewart
9a61faf67d An off-by-one error exists in sbuf_vprintf()'s use of SBUF_HASROOM() when an
sbuf is filled to capacity by vsnprintf(), the loop exits without error, and
the sbuf is not marked as auto-extendable.

SBUF_HASROOM() evaluates true if there is room for one or more non-NULL
characters, but in the case that the sbuf was filled exactly to capacity,
SBUF_HASROOM() evaluates false. Consequently, sbuf_vprintf() incorrectly
assigns an ENOMEM error to the sbuf when in fact everything is fine, in turn
poisoning the buffer for all subsequent operations.

Correct by moving the ENOMEM assignment into the loop where it can be made
unambiguously.

As a related safety net change, explicitly check for the zero bytes drained
case in sbuf_drain() and set EDEADLK as the error. This avoids an infinite loop
in sbuf_vprintf() if a drain function were to inadvertently return a value of
zero to sbuf_drain().

Reviewed by:	cem, jtl, gallatin
MFC after:	2 weeks
Sponsored by:	Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D8535
2017-08-18 02:06:28 +00:00
Oleg Bulyzhin
2052157d77 Fix BSD label partition end sector calculation.
Reviewed by:	ae
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D12066
2017-08-17 19:39:42 +00:00
Ed Maste
e7b993842e arm64: return error instead of panic in unimplemented ptrace ops
We don't need a panic as a reminder that these need to be implemented.

Reported by:	Shawn Webb
MFC after:	3 week
Sponsored by:	The FreeBSD Foundation
2017-08-17 19:16:23 +00:00
Conrad Meyer
0b53ecd1d7 Discover CPU topology on multi-die AMD Zen systems
The Nodes per Processor topology information determines how many bits of the
APIC ID represent the Node (Zeppelin die, on Zen systems) ID.  Documented in
Ryzen and Epyc Processor Programming Reference (PPR).

Correct topology information enables the scheduler to make better decisions
on this hardware.

Reviewed by:	kib@
Tested by:	jeff@ (earlier version)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11801
2017-08-17 16:54:37 +00:00
Lawrence Stewart
a8ec96af28 Implement simple record boundary tracking in sbuf(9) to avoid record splitting
during drain operations. When an sbuf is configured to use this feature by way
of the SBUF_DRAINTOEOR sbuf_new() flag, top-level sections started with
sbuf_start_section() create a record boundary marker that is used to avoid
flushing partial records.

Reviewed by:	cem,imp,wblock
MFC after:	2 weeks
Sponsored by:	Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D8536
2017-08-17 07:20:09 +00:00
Conrad Meyer
35d87c7e96 Fix unused varable warning in !SMP case
Fallout from r322588.  I'm not sure why !SMP is a knob we have, but, we have
it.

Reported by:	Michael Butler <imb AT protected-networks.net>
Sponsored by:	Dell EMC Isilon
2017-08-17 04:37:27 +00:00
John Baldwin
b99836cea9 Mark ZFS ABD inline functions static.
When built with -fno-inline-functions zfs.ko contains undefined references
to these functions if they are only marked inline.

Reviewed by:	avg (earlier version)
MFC after:	1 week
Sponsored by:	Chelsio Communications
2017-08-16 23:40:32 +00:00
Ryan Libby
d395fd0d46 aesni: quiet -Wcast-qual
Reviewed by:	delphij
Approved by:	markj (mentor)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12021
2017-08-16 22:54:35 +00:00
Conrad Meyer
fe0593d72a Add SI_SUB_TASKQ after SI_SUB_INTR and move taskqueue initialization there for EARLY_AP_STARTUP
This fixes a regression accidentally introduced in r322588, due to an
interaction with EARLY_AP_STARTUP.

Reviewed by:	bdrewery@, jhb@
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12053
2017-08-16 21:42:27 +00:00
Warner Losh
0362fb3a9c Define proposed GUID for FreeBSD boot loader variables. 2017-08-16 20:09:39 +00:00
Warner Losh
d9983db1f6 Remove unused defines. 2017-08-16 20:06:38 +00:00
Kristof Provost
9ce40d321d bpf: Fix incorrect cleanup
Cleaning up a bpf_if is a two stage process. We first move it to the
bpf_freelist (in bpfdetach()) and only later do we actually free it (in
bpf_ifdetach()).

We cannot set the ifp->if_bpf to NULL from bpf_ifdetach() because it's
possible that the ifnet has already gone away, or that it has been assigned
a new bpf_if.
This can lead to a struct ifnet which is up, but has if_bpf set to NULL,
which will panic when we try to send the next packet.

Keep track of the pointer to the bpf_if (because it's not always
ifp->if_bpf), and NULL it immediately in bpfdetach().

PR:		213896
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11782
2017-08-16 19:40:07 +00:00
Conrad Meyer
dc6a82801d x86: Add dynamic interrupt rebalancing
Add an option to dynamically rebalance interrupts across cores
(hw.intrbalance); off by default.

The goal is to minimize preemption. By placing interrupt sources on distinct
CPUs, ithreads get preferentially scheduled on distinct CPUs.  Overall
preemption is reduced and latency is reduced. In our workflow it reduced
"fighting" between two high-frequency interrupt sources.  Reduced latency
was proven by, e.g., SPEC2008.

Submitted by:	jeff@ (earlier version)
Reviewed by:	kib@
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D10435
2017-08-16 18:48:53 +00:00
Bryan Drewery
96dd05dd7d Quote ${MAKE} when passing in env in case it contains spaces.
Downstream we are wrapping MAKE with a limits(1) call which
interferes with these non-quoted cases.

Sponsored by:	Dell EMC Isilon
2017-08-16 17:54:24 +00:00
Ian Lepore
ce44a73667 Fix compile error with option DEBUG. This is fallout from some long-ago
INTRNG refactoring that didn't get caught at the time because code in a
debugf() statement isn't compiled unless DEBUG is defined.

PR:		221557
2017-08-16 16:51:55 +00:00
Ruslan Bukin
7dea76609b Rename macro DEBUG to SGX_DEBUG.
This fixes LINT kernel build.

Reported by:	lwhsu
Sponsored by:	DARPA, AFRL
2017-08-16 13:44:46 +00:00
Bruce Evans
60e47915b4 Undeprecate the CONS_CURSORTYPE ioctl. It was "deprecated" in 2001,
but it was actually extended then and it is still used (just once) in
/usr/src by its primary user (vidcontrol), while its replacement is
still not used in /usr/src.

yokota became inactive soon after deprecating CONS_CURSORTYPE (this
was part of a large change to make cursor attributes per-vty).

vidcontrol has incomplete support even for the old ioctl.  I will
update it soon.  Then there are many broken escape sequences to fix.
This is just to prepare for setting cursor colors using vidcontrol.
2017-08-16 10:59:37 +00:00
Ruslan Bukin
2164af29a0 Add support for Intel Software Guard Extensions (Intel SGX).
Intel SGX allows to manage isolated compartments "Enclaves" in user VA
space. Enclaves memory is part of processor reserved memory (PRM) and
always encrypted. This allows to protect user application code and data
from upper privilege levels including OS kernel.

This includes SGX driver and optional linux ioctl compatibility layer.
Intel SGX SDK for FreeBSD is also available.

Note this requires support from hardware (available since late Intel
Skylake CPUs).

Many thanks to Robert Watson for support and Konstantin Belousov
for code review.

Project wiki: https://wiki.freebsd.org/Intel_SGX.

Reviewed by:	kib
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11113
2017-08-16 10:38:06 +00:00
Ruslan Bukin
7bbdb843b6 Add OBJ_PG_DTOR flag to VM object.
Setting this flag allows us to skip pages removal from VM object queue
during object termination and to leave that for cdev_pg_dtor function.

Move pages removal code to separate function vm_object_terminate_pages()
as comments does not survive indentation.

This will be required for Intel SGX support where we will have to remove
pages from VM object manually.

Reviewed by:	kib, alc
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11688
2017-08-16 08:49:11 +00:00
Mark Johnston
1f1c4ea123 Add device resource management fields to struct device.
MFC after:	1 week
2017-08-16 06:33:48 +00:00
Navdeep Parhar
b58fd65490 cxgbe/t4_tom: Use correct name for the ISS-valid bit in options2.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2017-08-15 19:21:27 +00:00
Matt Joras
d148c2a2b1 Rework vlan(4) locking.
Previously the locking of vlan(4) interfaces was not very comprehensive.
Particularly there was very little protection against the destruction of
active vlan(4) interfaces or concurrent modification of a vlan(4)
interface. The former readily produced several different panics.

The changes can be summarized as using two global vlan locks (an
rmlock(9) and an sx(9)) to protect accesses to the if_vlantrunk field of
struct ifnet, in addition to other places where global exclusive access
is required. vlan(4) should now be much more resilient to the destruction
of active interfaces and concurrent calls into the configuration path.

PR:	220980
Reviewed by:	ae, markj, mav, rstone
Approved by:	rstone (mentor)
MFC after:	4 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11370
2017-08-15 17:52:37 +00:00
Mark Johnston
33fff5d536 Add vm_page_alloc_after().
This is a variant of vm_page_alloc() which accepts an additional parameter:
the page in the object with largest index that is smaller than the requested
index. vm_page_alloc() finds this page using a lookup in the object's radix
tree, but in some cases its identity is already known, allowing the lookup
to be elided.

Modify kmem_back() and vm_page_grab_pages() to use vm_page_alloc_after().
vm_page_alloc() is converted into a trivial wrapper of
vm_page_alloc_after().

Suggested by:	alc
Reviewed by:	alc, kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11984
2017-08-15 16:39:49 +00:00
Alan Somers
69b14f7acd Fix some ZFS debugging messages
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
	Be more careful about the use of provider names vs vdev names in
	ZFS_LOG statements.

MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2017-08-15 15:20:04 +00:00
Toomas Soome
fb46c37596 loader.efi: repace XXX with real comments in trap.c
There are two missing comments marked as XXX in trap.c, fix this.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D12035
2017-08-15 14:03:26 +00:00
Hans Petter Selasky
48a597ef59 Add new USB quirk.
Submitted by:		devel@stasyan.com
PR:			221328
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-08-15 08:44:36 +00:00
Xin LI
e6f9d1ce45 Plug memory leak in arge_encap().
Reported by:	Ilja Van Sprundel <ivansprundel ioactive.com>
Submitted by:	Domagoj Stolfa <domagoj.stolfa gmail.com>
Reviewed by:	adrian
MFC after:	3 days
2017-08-15 06:01:36 +00:00
Conrad Meyer
f3fed04372 Fix a couple of comment typos
No functional change.

Submitted by:	Anton Rang <anton.rang AT isilon.com>
Sponsored by:	Dell EMC Isilon
2017-08-15 02:21:02 +00:00
Konstantin Belousov
baec5778ed Print whole machine state on double fault.
It is quite useful when double fault is not caused by a stack overflow.

Tested by:	pho (as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-14 11:23:07 +00:00
Konstantin Belousov
0fd7ea1f21 Add {rd,wr}{fs,gs}base C wrappers for instructions.
Tested by:	pho (as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-08-14 11:20:54 +00:00
Konstantin Belousov
7bf0049e48 Style.
Tested by:	pho (as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2017-08-14 11:20:10 +00:00
Sepherosa Ziehau
93b4e111bb hyperv: Update copyright for the files changed in 2017
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11982
2017-08-14 06:00:50 +00:00
Sepherosa Ziehau
d0cd8231e0 hyperv/hn: Re-set datapath after synthetic parts reattached.
Do this even for non-transparent mode VF. Better safe than sorry.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11981
2017-08-14 05:55:16 +00:00
Sepherosa Ziehau
c2d50b263f hyperv/hn: Minor cleanup
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11979
2017-08-14 05:46:50 +00:00
Sepherosa Ziehau
a97fff1913 hyperv/hn: Fix/enhance receiving path when VF is activated.
- Update hn(4)'s stats properly for non-transparent mode VF.
- Allow BPF tapping to hn(4) for non-transparent mode VF.
- Don't setup mbuf hash, if 'options RSS' is set.
  In Azure, when VF is activated, TCP SYN and SYN|ACK go through hn(4)
  while the rest of segments and ACKs belonging to the same TCP 4-tuple
  go through the VF.  So don't setup mbuf hash, if a VF is activated
  and 'options RSS' is not enabled.  hn(4) and the VF may use neither
  the same RSS hash key nor the same RSS hash function, so the hash
  value for packets belonging to the same flow could be different!
- Disable LRO.
  hn(4) will only receive broadcast packets, multicast packets, TCP
  SYN and SYN|ACK (in Azure), LRO is useless for these packet types.
  For non-transparent, we definitely _cannot_ enable LRO at all, since
  the LRO flush will use hn(4) as the receiving interface; i.e.
  hn_ifp->if_input(hn_ifp, m).

While I'm here, remove unapplied comment and minor style change.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11978
2017-08-14 05:40:52 +00:00
Sepherosa Ziehau
3bed4e54f8 hyperv/hn: Update VF's ibytes properly under transparent VF mode.
While, I'm here add comment about why updating VF's imcast stat is
not necessary.

MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D11948
2017-08-14 05:30:02 +00:00
Ian Lepore
a6e709f29c Add hinted attachment for non-FDT systems. Also, print a message if
setting up the timer fails, because on some types of chips that's the
first attempt to access the device.  If the chip is missing/non-responsive
then you'd get a driver that attached and didn't register the rtc, with
no clue about why.  On other chip types there are inits that come before
timer setup, and they already print messages about errors.
2017-08-14 02:23:10 +00:00
Ian Lepore
1dc8df138d Add back the drivers for Dallas/Maxim ds13xx and Seiko S35390x now that
they've been rewritten/fixed to not cause panics by doing i2c transfers
before interrupts are available.

PR:		221227
2017-08-14 00:12:14 +00:00
Ian Lepore
098f6cb6e6 Minor fixes and enhancements for the s35390a i2c RTC driver...
- Add FDT probe code.
- Do i2c transfers with exclusive bus ownership.
- Use config_intrhook_oneshot() to defer chip setup because some i2c
  busses can't do transfers without interrupts.
- Add a detach() routine.
- Add to module build.
2017-08-14 00:00:24 +00:00
Ian Lepore
90cff13c3c Remove the old ds1374 driver and use the ds13rtc driver instead. Adjust
several mips config files accordingly.
2017-08-13 22:07:42 +00:00
Ian Lepore
3777ed4378 Change "chiptype" to "compatible". Making the hint name the same as the FDT
property name should make it easier to document the list of names accepted
by both configuration mechanisms.
2017-08-13 21:45:46 +00:00
Ian Lepore
bb2e8108e1 Add a new driver, ds13rtc, that handles all DS13xx series i2c RTC chips.
This driver supports only basic timekeeping functionality.  It completely
replaces the ds133x driver.  It can also replace the ds1374 driver, but that
will take a few other changes in MIPS code and config, and will be committed
separately.  It does NOT replace the existing ds1307 driver, which provides
access to some of the extended features on the 1307 chip, such as controlling
the square wave output signal.  If both ds1307 and ds13rtc drivers are
present, the ds1307 driver will outbid and win control of the device.

This driver can be configured with FDT data, or by using hints on non-FDT
systems.  In addition to the standard hints for i2c devices, it requires
a "chiptype" string of the form "dallas,ds13xx" where 'xx' is the chip id
(i.e., the same format as FDT compat strings).
2017-08-13 21:02:40 +00:00
Andrew Turner
062c276886 Add support for multiple GICv3 ITS devices. For this we add sc_irq_base
and sc_irq_length to the softc to handle the base number of IRQs available,
make gicv3_get_nirqs return the number of available interrupt IDs, and
limit which CPUs we send interrupts to based on the numa domain.

The last point is only strictly needed on a dual socket ThunderX where we
are unable to send MSI/MSI-X interrupts between sockets.

Sponsored by:	DARPA, AFRL
2017-08-13 18:54:51 +00:00
Ian Lepore
2db14f97de Add config_intrhook_oneshot(): schedule an intrhook function and unregister
it automatically after it runs.

The config_intrhook mechanism allows a driver to stall the boot process
until device(s) required for booting are available, by not allowing system
inits to proceed until all intrhook functions have been unregistered.
Virtually all existing code simply unregisters from within the hook function
when it gets called.

This new function makes that common usage more convenient. Instead of
allocating and filling in a struct, passing it to a function that might (in
theory) fail, and checking the return code, now a driver can simply call
this cannot-fail routine, passing just the intrhook function and its arg.

Differential Revision:	https://reviews.freebsd.org/D11963
2017-08-13 18:10:24 +00:00
Kirk McKusick
037331ddbd When read requests are sent from a filesystem running above g_journal,
the g_journal level needs to check whether it is holding a newer
copy of the block than that which exists on the disk. If so, it
needs to return its copy. If not, it should pass the request down
to the disk to fulfill. It currently considers six queues:

0) delayed queue,
1) unsent (current queue),
2) in-flight to the journal (flush queue),
3) active journal (active queue),
4) inactive journal (inactive queue), and
5) inflight to the disk (copy queue).

Checking on two of these queues is unnecessary:

0) The delayed requests should not be used for reads because they
   have not yet been entered into the journal, so their value should
   reflect the disk contents, not the future contents that are not
   yet committed.

2) Because all the bio's in the flush queue are also found on the
   active queue, there is no need to inspect the flush queue for
   reads since they will be found when searching the active queue.

Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Discussed with: kib
MFC after: 1 week
2017-08-13 18:09:22 +00:00