Commit Graph

224401 Commits

Author SHA1 Message Date
kib
e56264ca17 Implement userspace gettimeofday(2) with HPET timecounter.
Right now, userspace (fast) gettimeofday(2) on x86 only works for
RDTSC.  For older machines, like Core2, where RDTSC is not C2/C3
invariant, and which fall to HPET hardware, this means that the call
has both the penalty of the syscall and of the uncached hw behind the
QPI or PCIe connection to the sought bridge.  Nothing can me done
against the access latency, but the syscall overhead can be removed.
System already provides mappable /dev/hpetX devices, which gives
straight access to the HPET registers page.

Add yet another algorithm to the x86 'vdso' timehands. Libc is updated
to handle both RDTSC and HPET.  For HPET, the index of the hpet device
to mmap is passed from kernel to userspace, index might be changed and
libc invalidates its mapping as needed.

Remove cpu_fill_vdso_timehands() KPI, instead require that
timecounters which can be used from userspace, to provide
tc_fill_vdso_timehands{,32}() methods.  Merge i386 and amd64
libc/<arch>/sys/__vdso_gettc.c into one source file in the new
libc/x86/sys location.  __vdso_gettc() internal interface is changed
to move timecounter algorithm detection into the MD code.

Measurements show that RDTSC even with the syscall overhead is faster
than userspace HPET access.  But still, userspace HPET is three-four
times faster than syscall HPET on several Core2 and SandyBridge
machines.

Tested by:	Howard Su <howard0su@gmail.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D7473
2016-08-17 09:52:09 +00:00
kib
591244e54c By default, allow all to read the HPET registers pages. At the same
time, by, by default disallow writes to the mmaped HPET pages.

Intent is to allow userspace to use HPET as fast (i.e. no-syscall)
timecounter for gettimeofday(2).  Unfortunately, the permission model
does not make it possible to safely unhide /dev/hpet in the jails even
if default mode is set to 0444, because untrusted jailed root may
change device permissions to writeable.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2016-08-17 09:20:04 +00:00
sephe
c24d51a7f2 hyperv/util: Factor out helper for IC device_probe DEVMETHOD
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7530
2016-08-17 08:38:49 +00:00
manu
50ac04f160 Correctly print and cast u_int64_t and off_t.
Reported by:	ed, imp
MFC after:	1 week
2016-08-17 08:29:30 +00:00
sephe
6b1ea3b570 hyperv/util: Don't reference hn_softc in KVP
hn_softc is private data struct.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7519
2016-08-17 08:26:08 +00:00
kevlo
e9605dafbc - Add the 'restrict' type qualifier to match function prototype.
- Use .Lb libc rather than libpthread.

Reviewed by:	delphij
2016-08-17 07:25:50 +00:00
sephe
4561a8d9c2 hyperv/hn: Get rid of unused bits
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7518
2016-08-17 05:57:10 +00:00
sephe
93e9768e25 hyperv/hn: Remove reference to nvsp_status
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7517
2016-08-17 05:45:57 +00:00
sephe
ee90d2aeb3 hyperv/hn: Remove reference to nvsp_msg
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7516
2016-08-17 05:34:02 +00:00
sephe
3d5002ca01 hyperv/hn: Simplify RNDIS RX packets acknowledgement.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7515
2016-08-17 05:25:47 +00:00
sephe
4dccd19522 hyperv/hn: Ignore the useless TX table.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7514
2016-08-17 05:14:26 +00:00
sephe
336a9723e0 hyperv/storvsc: Deliver CAM_SEL_TIMEOUT upon SRB status error.
SRB status is set to 0x20 by the hypervisor, if the specified LUN is
unaccessible, and even worse the INQUIRY response will not be set by
the hypervisor at all under this situation.  Additionally, SRB status
is 0x20 too, for TUR on an unaccessible LUN.

Deliver CAM_SEL_TIMEOUT to CAM upon SRB status errors as suggested by
Scott Long, other values seems improper.

This commit fixes the Hyper-V disk hotplug support.

Submitted by:	Hongjiang Zhang <honzhan microsoft com>
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7521
2016-08-17 05:02:18 +00:00
davidcs
f2caa4fd8b Add support for set/get cam search mode
MFC after: 5 days
2016-08-17 02:40:17 +00:00
davidcs
becbecbec4 Add ql_minidump.h
MFC after:5 days
2016-08-17 01:57:58 +00:00
davidcs
dd07c34fd6 Upgrade fw, bootloader and minidump template to version 5.4.58
Add minidump retrieval code

MFC after: 5 days
2016-08-17 01:56:37 +00:00
vangyzen
899bddbae3 PCIe HotPlug: Detect bridges that are not really HotPlug capable
Some devices report that they have an MRL when they actually
do not.  Since they always report that the MRL is open, child
devices would be ignored.  Try to detect these devices and
ignore their claim of HotPlug support.  Specifically,
if there is an open MRL but the Data Link Layer is active,
the MRL is not real.

Revert r303645 to re-enable HotPlug support for slots with
power controllers, since it works correctly in my testing.

Start the DLL state-change timer if Presence /or/ MRL state changes,
along with other conditions.  Previously, we started the timer iff
Presence changed.  If there is an MRL, it must be closed for power
to be turned on, so Presence is unlikely to change on an MRL-close event.

Add a printf() of interesting registers on HotPlug interrupts and
commands (one from erj@).  These were very useful for debugging.
Guard them with bootverbose, since they're spam in normal operation.

In collaboration with:	jhb
Reviewed by:	jhb
MFC after:	1 day
Relnotes:	yes (re-enable HotPlug support for slots with power controllers)
Sponsored by:	Dell Inc.
Differential Revision:	https://reviews.freebsd.org/D7509
2016-08-17 01:24:34 +00:00
glebius
0d6b2c40a8 Fix a stupid typo (or copy/paste buffer malfunction). 2016-08-16 23:00:22 +00:00
glebius
0495c9a2f3 We should not be allowing a timeout to reset when a drain is in progress on
it (either async or sync drain).

At this moment the only user of drain is TCP, but TCP wouldn't reschedule a
callout after it has drained it, since it drains only when a tcpcb is closed.
This for now the problem isn't observed.

Submitted by:	rrs
2016-08-16 21:55:34 +00:00
landonf
44059bbc95 bhnd(4): Implement NVRAM support required for PMU bring-up.
- Added a generic bhnd_nvram_parser API, with support for the TLV format
  used on WGT634U devices, the standard BCM NVRAM format used on most
  modern devices, and the "board text file" format used on some hardware
  to supply external NVRAM data at runtime (e.g. via an EFI variable).

- Extended the bhnd_bus_if and bhnd_nvram_if interfaces to support both
  string-based and primitive data type variable access, required for
  common behavior across both SPROM and NVRAM data sources.
- Extended the existing SPROM implementation to support the new
  string-based NVRAM APIs.

- Added an abstract bhnd_nvram driver, implementing the bhnd_nvram_if
  atop the bhnd_nvram_parser API.
- Added a CFE-based bhnd_nvram driver to provide read-only access to
  NVRAM data on MIPS SoCs, pending implementation of a flash-aware
  bhnd_nvram driver.

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D7489
2016-08-16 21:32:05 +00:00
landonf
298dc9cdcd bhndb(4): Drop MIPS-incompatible __builtin_ctz dependency.
This replaces the bitfield representation of the bhndb register window
freelist with the bitstring API, eliminating a dependency on
(MIPS-unsupported) __builtin_ctz().

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D7495
2016-08-16 21:20:05 +00:00
mckusick
d1132818f9 Bug 211013 reports that a write error to a UFS filesystem running
with softupdates panics the kernel. The problem that has been pointed
out is that when there is a transient write error on certain metadata
blocks, specifically directory blocks (PAGEDEP), inode blocks
(INODEDEP), indirect pointer blocks (INDIRDEPS), and cylinder group
(BMSAFEMAP, but only when journaling is enabled), we get a panic
in one of the routines called by softdep_disk_io_initiation that
the I/O is "already started" when we retry the write.

These dependency types potentially need to do roll-backs when called
by softdep_disk_io_initiation before doing a write and then a
roll-forward when called by softdep_disk_write_complete after the
I/O completes.  The panic happens when there is a transient error.
At the top of softdep_disk_write_complete we check to see if the
write had an error and if an error occurred we just return.  This
return is correct most of the time because the main role of the routines
called by softdep_disk_write_complete is to process the now-completed
dependencies so that the next I/O steps can happen.

But for the four types listed above, they do not get to do their
rollback operations. This causes the panic when softdep_disk_io_initiation
gets called on the second attempt to do the write and the roll-back
routines find that the roll-backs have already been done. As an
aside I note that there is also the problem that the buffer will
have been unlocked and thus made visible to the filesystem and to
user applications with the roll-backs in place.

The way to resolve the problem is to add a flag to the routines called
by softdep_disk_write_complete for the four dependency types noted
that indicates whether the write was successful (WRITESUCCEEDED).
If the write does not succeed, they do just the roll-backs and then
return. If the write was successful they also do their usual
processing of the now-completed dependencies.

The fix was tested by selectively injecting write errors for buffers
holding dependencies of each of the four types noted above and then
verifying that the kernel no longer paniced and that following the
successful retry of the write that the filesystem could be unmounted
and successfully checked cleanly.

PR: 211013
Reviewed by: kib
2016-08-16 21:02:30 +00:00
ngie
1e7f2891b9 Only expect :encode_tv_random_million to fail on 64-bit platforms
It passes on i386

MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2016-08-16 20:35:36 +00:00
markj
347d2809fd Remove prototypes missed in r303951. 2016-08-16 19:43:17 +00:00
kib
73de606a86 In UFS_BALLOC(), invalidate pages of indirect buffers on failed block
allocation unwinding.

Dandling buffers are released on UFS_BALLOC() failure to ensure that
later attempt to allocate blocks in close range do not find the blocks
with invalid content, since possible partial block allocations are
unwound.  As such, it is not enough to just release the buffers, the
pages must also invalidated and removed from the vnode vm_object
queue.  Otherwise the pages might be found later and used to
reconstruct indirect buffers when doing allocations at offset close to
the failure point, and their stale content compromise the filesystem
integrity.

Note that just marking the buffer as B_INVAL is not enough, B_NOCACHE
is required.  To be sure, clear the B_CACHE flag as well.  This
complements the r174973, which started releasing buffers.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-16 17:30:58 +00:00
kib
fd0eaea7aa On unwind after failed block allocation in ffs_balloc_ufs{1,2}, assert
that recorded allocated blocks numbers match the physical block
numbers of dandling buffers which are released.

When finally freeing the blocks during unwind, assert that dandling
buffers where not re-allocated.  They shouldn't, because the vnode lock
is owned exclusive.

Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-16 17:18:38 +00:00
mckusick
0aeae89e96 Add two new macros, SLIST_CONCAT and LIST_CONCAT. Note in both the
queue.h header file and in the queue.3 manual page that they are O(n)
so should be used only in low-usage paths with short lists (otherwise
an STAILQ or TAILQ should be used).

Reviewed by: kib
2016-08-16 17:07:48 +00:00
kib
e037bf91c2 When looking up dandling buffers for unwing after failing block
allocation in UFS_BALLOC(), there is no need to map them.

Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-16 17:05:15 +00:00
kib
4e01efcc0c When block allocation fails in UFS_BALLOC(), and the volume does not
have SU enabled, there is no point in calling softdep_request_cleanup().

The call cannot produce free blocks, but we unecessarily lock ufsmount
and do inode block write.  Usual point of not doing optimizations for
the corner case of the full volume is not applicable there, the work
is easily avoidable, and the addition CPU and disk io load do not lead
to succeeding retry.

Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-16 16:50:48 +00:00
kib
03c6968a37 In ffs_balloc_ufs{1,2} routines, assert that unwind records do not
overflow local arrays.  This is not immediately obvious from the
static code inspection, due to retry logic.

Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-16 16:49:56 +00:00
araujo
cc1da273bd Use nitems() from sys/param.h.
MFC after:	2 weeks.
2016-08-16 15:53:05 +00:00
araujo
26a21fb116 Use nitems() from sys/param.h.
MFC after:	2 weeks.
2016-08-16 15:52:10 +00:00
rrs
c9af4f2541 A few more wording tweaks as suggested (with some modifications
as well) by Ravi Pokala. Thanks for the comments :-)
Sponsored by: Netflix Inc.
2016-08-16 15:17:36 +00:00
rrs
4d7e0cd8cd Here we update the modular tcp to be able to switch to an
alternate TCP stack in other then the closed state (pre-listen/connect).
The idea is that *if* that is supported by the alternate stack, it
is asked if its ok to switch. If it approves the "handoff" then we
allow the switch to happen. Also the fini() function now gets a flag
to tell if you are switching away *or* the tcb is destroyed. The
init() call into the alternate stack is moved to the end so the
tcb is more fully formed before the init transpires.

Sponsored by:	Netflix Inc.
Differential Revision:	D6790
2016-08-16 15:11:46 +00:00
manu
f13741f250 Only use WaitForKeys event if it exists, this is not the case in u-boot efi implementation.
Reviewed by:	jhb, emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D6781
2016-08-16 14:33:25 +00:00
manu
7019b55f50 Use %ju modifier for u_int64_t and %jd modifier for off_t.
off_t is long long on arm32 and long on amd64

MFC after:	1 week
2016-08-16 14:23:35 +00:00
sbz
2f07869625 tty: Use proper definition of exit status code and stdin macro
Reviewed by:	bapt, bdrewery
Differential Revision:	https://reviews.freebsd.org/D6828
2016-08-16 14:15:09 +00:00
rrs
30758bfe03 Comments describing how to properly use the new lock_add functions
and its respective companion.

Sponsored by:	Netflix Inc.
2016-08-16 13:08:03 +00:00
rrs
fae35efb7f This cleans up the timer code in TCP and also makes it so we do not
take the INFO lock *unless* we are really going to delete the TCB.

Differential Revision:	D7136
2016-08-16 12:40:56 +00:00
bdrewery
8543217336 Trim unneeded bootstrap after r301470 made 9.1 the minimum supported release.
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2016-08-16 12:13:12 +00:00
brooks
237cd762eb Don't conflate enum nss_status return values values with int (NS_SUCCESS,
NS_RETURN) values.

Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D6046
2016-08-16 11:38:45 +00:00
kib
e04600d300 The fdatasync(2) call must be cancellation point.
Sponsored by:	The FreeBSD Foundation
MFC after:	13 days
2016-08-16 08:27:03 +00:00
adrian
77dd7a47e3 [mips] fix use-before-initialised.
Found by: gcc-5.3
2016-08-16 07:51:05 +00:00
sephe
e1f56cc39c hyperv/hn: Simplify RNDIS message checks on RX path.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7502
2016-08-16 07:45:35 +00:00
sephe
86f8b8d983 hyperv/hn: Simplify RNDIS NVS message sending.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7501
2016-08-16 07:37:02 +00:00
sephe
b2c45dd531 hyperv/hn: Factor out hn_nvs_send/hn_nvs_send_sglist
Avoid unnecessary message type setting and centralize the send context
to transaction id cast.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7500
2016-08-16 07:26:53 +00:00
sephe
779c070ec2 tcp/lro: Make # of LRO entries tunable
Reviewed by:	hps, gallatin
Obtained from:	rrs, gallatin
MFC after:	2 weeks
Sponsored by:	Netflix (rrs, gallatin), Microsoft (sephe)
Differential Revision:	https://reviews.freebsd.org/D7499
2016-08-16 06:40:27 +00:00
markj
31eeb8a691 Regenerate DTrace tests. 2016-08-16 02:34:25 +00:00
markj
46fb7ce066 MFV r304057:
7085 add support for "if" and "else" statements in dtrace

illumos/illumos-gate@c3bd3abd88

Add syntactic sugar to dtrace: "if" and "else" statements. The sugar is
baked down to standard dtrace features by adding additional clauses with
the appropriate predicates.

Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Bryan Cantrill <bryan@joyent.com>
Approved by: Richard Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>

MFC after:	2 weeks
Relnotes:	yes
2016-08-16 02:30:19 +00:00
markj
b1b1e507c6 MFV r301526:
7035 string-related subroutines should validate input earlier

Reviewed by: Alex Wilson <alex.wilson@joyent.com>
Reviewed by: Bryan Cantrill <bryan@joyent.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Patrick Mooney <pmooney@pfmooney.com>

illumos/illumos-gate@771e39c3b1

MFC after:	2 weeks
2016-08-16 02:25:19 +00:00
markj
231aa84ab7 MFV r301525:
7033 ustack helper should fault on bad return values

Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
Reviewed by: Bryan Cantrill <bryan@joyent.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author: Alex Wilson <alex.wilson@joyent.com>

illumos/illumos-gate@a2f72b65eb

MFC after:	2 weeks
2016-08-16 02:20:02 +00:00