215201 Commits

Author SHA1 Message Date
Konstantin Belousov
e2a18110f0 Remove duplicated code.
aio_aqueue() calls aio_init_aioinfo() as the first action. There is no
need to duplicate the code in kern_aio_fsync().

Also fix indent for aio_aqueue() definition.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D7523
2016-08-17 10:14:22 +00:00
Konstantin Belousov
1680854946 Implement userspace gettimeofday(2) with HPET timecounter.
Right now, userspace (fast) gettimeofday(2) on x86 only works for
RDTSC.  For older machines, like Core2, where RDTSC is not C2/C3
invariant, and which fall to HPET hardware, this means that the call
has both the penalty of the syscall and of the uncached hw behind the
QPI or PCIe connection to the south bridge.  Nothing can be done
against the access latency, but the syscall overhead can be removed.
The system already provides mappable /dev/hpetX devices, which give
straight access to the HPET registers page.

Add yet another algorithm to the x86 'vdso' timehands. Libc is updated
to handle both RDTSC and HPET.  For HPET, the index of the hpet device
to mmap is passed from kernel to userspace, index might be changed and
libc invalidates its mapping as needed.

Remove cpu_fill_vdso_timehands() KPI, instead require timecounters
that can be used from userspace to provide
tc_fill_vdso_timehands{,32}() methods.  Merge i386 and amd64
libc/<arch>/sys/__vdso_gettc.c into one source file in the new
libc/x86/sys location.  __vdso_gettc() internal interface is changed
to move timecounter algorithm detection into the MD code.

Measurements show that RDTSC even with the syscall overhead is faster
than userspace HPET access.  But still, userspace HPET is three-four
times faster than syscall HPET on several Core2 and SandyBridge
machines.

Tested by:	Howard Su <howard0su@gmail.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D7473
2016-08-17 09:52:09 +00:00
Konstantin Belousov
51c762d172 By default, allow all to read the HPET registers pages. At the same
time, by default, disallow writes to the mmaped HPET pages.

Intent is to allow userspace to use HPET as fast (i.e. no-syscall)
timecounter for gettimeofday(2).  Unfortunately, the permission model
does not make it possible to safely unhide /dev/hpet in the jails even
if default mode is set to 0444, because untrusted jailed root may
change device permissions to writeable.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
2016-08-17 09:20:04 +00:00
Sepherosa Ziehau
c4b78a2628 hyperv/util: Factor out helper for IC device_probe DEVMETHOD
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7530
2016-08-17 08:38:49 +00:00
Emmanuel Vadot
393fb50cbc Correctly print and cast u_int64_t and off_t.
Reported by:	ed, imp
MFC after:	1 week
2016-08-17 08:29:30 +00:00
Sepherosa Ziehau
eacbe92463 hyperv/util: Don't reference hn_softc in KVP
hn_softc is private data struct.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7519
2016-08-17 08:26:08 +00:00
Kevin Lo
0de6c9d651 - Add the 'restrict' type qualifier to match function prototype.
- Use .Lb libc rather than libpthread.

Reviewed by:	delphij
2016-08-17 07:25:50 +00:00
Sepherosa Ziehau
bf965e6dee hyperv/hn: Get rid of unused bits
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7518
2016-08-17 05:57:10 +00:00
Sepherosa Ziehau
85c8f64b0b hyperv/hn: Remove reference to nvsp_status
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7517
2016-08-17 05:45:57 +00:00
Sepherosa Ziehau
8b55644bae hyperv/hn: Remove reference to nvsp_msg
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7516
2016-08-17 05:34:02 +00:00
Sepherosa Ziehau
5f0dee26f1 hyperv/hn: Simplify RNDIS RX packets acknowledgement.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7515
2016-08-17 05:25:47 +00:00
Sepherosa Ziehau
46911ec745 hyperv/hn: Ignore the useless TX table.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7514
2016-08-17 05:14:26 +00:00
Sepherosa Ziehau
86afc9b625 hyperv/storvsc: Deliver CAM_SEL_TIMEOUT upon SRB status error.
SRB status is set to 0x20 by the hypervisor, if the specified LUN is
inaccessible, and even worse the INQUIRY response will not be set by
the hypervisor at all in this situation.  Additionally, SRB status
is 0x20 too, for TUR on an inaccessible LUN.

Deliver CAM_SEL_TIMEOUT to CAM upon SRB status errors as suggested by
Scott Long, other values seem improper.

This commit fixes the Hyper-V disk hotplug support.

Submitted by:	Hongjiang Zhang <honzhan microsoft com>
MFC after:	3 days
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7521
2016-08-17 05:02:18 +00:00
David C Somayajulu
00caeec74a Add support for set/get cam search mode
MFC after: 5 days
2016-08-17 02:40:17 +00:00
David C Somayajulu
80e3aec68c Add ql_minidump.h
MFC after:	5 days
2016-08-17 01:57:58 +00:00
David C Somayajulu
6a62bec0cd Upgrade fw, bootloader and minidump template to version 5.4.58
Add minidump retrieval code

MFC after: 5 days
2016-08-17 01:56:37 +00:00
Eric van Gyzen
991d431fa8 PCIe HotPlug: Detect bridges that are not really HotPlug capable
Some devices report that they have an MRL when they actually
do not.  Since they always report that the MRL is open, child
devices would be ignored.  Try to detect these devices and
ignore their claim of HotPlug support.  Specifically,
if there is an open MRL but the Data Link Layer is active,
the MRL is not real.

Revert r303645 to re-enable HotPlug support for slots with
power controllers, since it works correctly in my testing.

Start the DLL state-change timer if Presence /or/ MRL state changes,
along with other conditions.  Previously, we started the timer iff
Presence changed.  If there is an MRL, it must be closed for power
to be turned on, so Presence is unlikely to change on an MRL-close event.

Add a printf() of interesting registers on HotPlug interrupts and
commands (one from erj@).  These were very useful for debugging.
Guard them with bootverbose, since they're spam in normal operation.

In collaboration with:	jhb
Reviewed by:	jhb
MFC after:	1 day
Relnotes:	yes (re-enable HotPlug support for slots with power controllers)
Sponsored by:	Dell Inc.
Differential Revision:	https://reviews.freebsd.org/D7509
2016-08-17 01:24:34 +00:00
Gleb Smirnoff
dc4ee9a895 Fix a stupid typo (or copy/paste buffer malfunction).
2016-08-16 23:00:22 +00:00
Gleb Smirnoff
c0f50fa012 We should not be allowing a timeout to reset when a drain is in progress on
it (either async or sync drain).

At this moment the only user of drain is TCP, but TCP wouldn't reschedule a
callout after it has drained it, since it drains only when a tcpcb is closed.
Thus, for now, the problem isn't observed.

Submitted by:	rrs
2016-08-16 21:55:34 +00:00
Landon J. Fuller
1728aef23d bhnd(4): Implement NVRAM support required for PMU bring-up.
- Added a generic bhnd_nvram_parser API, with support for the TLV format
  used on WGT634U devices, the standard BCM NVRAM format used on most
  modern devices, and the "board text file" format used on some hardware
  to supply external NVRAM data at runtime (e.g. via an EFI variable).

- Extended the bhnd_bus_if and bhnd_nvram_if interfaces to support both
  string-based and primitive data type variable access, required for
  common behavior across both SPROM and NVRAM data sources.
- Extended the existing SPROM implementation to support the new
  string-based NVRAM APIs.

- Added an abstract bhnd_nvram driver, implementing the bhnd_nvram_if
  atop the bhnd_nvram_parser API.
- Added a CFE-based bhnd_nvram driver to provide read-only access to
  NVRAM data on MIPS SoCs, pending implementation of a flash-aware
  bhnd_nvram driver.

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D7489
2016-08-16 21:32:05 +00:00
Landon J. Fuller
9dfeb4140c bhndb(4): Drop MIPS-incompatible __builtin_ctz dependency.
This replaces the bitfield representation of the bhndb register window
freelist with the bitstring API, eliminating a dependency on
(MIPS-unsupported) __builtin_ctz().

Approved by:	adrian (mentor)
Differential Revision:	https://reviews.freebsd.org/D7495
2016-08-16 21:20:05 +00:00
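To illustrate the idea in the commit above, here is a minimal sketch of a free list kept as an array of words and searched with a plain loop, so that no __builtin_ctz() is needed. It is not the bhndb code and does not use the bitstring(3) macros; the type, the function names, and the size of 64 entries are assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define FREELIST_BITS	64	/* hypothetical number of register windows */
#define WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define FREELIST_WORDS	((FREELIST_BITS + WORD_BITS - 1) / WORD_BITS)

/* One bit per window; a set bit means the window is free. */
struct freelist {
	unsigned long words[FREELIST_WORDS];
};

/* Mark window 'bit' as free. */
static void
freelist_mark_free(struct freelist *fl, size_t bit)
{
	assert(bit < FREELIST_BITS);
	fl->words[bit / WORD_BITS] |= 1UL << (bit % WORD_BITS);
}

/*
 * Claim the lowest-numbered free window and return its index, or -1 if
 * none is free.  The linear scan replaces a count-trailing-zeros builtin.
 */
static int
freelist_alloc(struct freelist *fl)
{
	for (size_t bit = 0; bit < FREELIST_BITS; bit++) {
		unsigned long mask = 1UL << (bit % WORD_BITS);

		if (fl->words[bit / WORD_BITS] & mask) {
			fl->words[bit / WORD_BITS] &= ~mask;
			return ((int)bit);
		}
	}
	return (-1);
}
```

The scan is slower than a single instruction, but for a small free list the cost is likely negligible and the code builds on any target.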
Kirk McKusick
988fd417a0 Bug 211013 reports that a write error to a UFS filesystem running
with softupdates panics the kernel. The problem that has been pointed
out is that when there is a transient write error on certain metadata
blocks, specifically directory blocks (PAGEDEP), inode blocks
(INODEDEP), indirect pointer blocks (INDIRDEPS), and cylinder group
(BMSAFEMAP, but only when journaling is enabled), we get a panic
in one of the routines called by softdep_disk_io_initiation that
the I/O is "already started" when we retry the write.

These dependency types potentially need to do roll-backs when called
by softdep_disk_io_initiation before doing a write and then a
roll-forward when called by softdep_disk_write_complete after the
I/O completes.  The panic happens when there is a transient error.
At the top of softdep_disk_write_complete we check to see if the
write had an error and if an error occurred we just return.  This
return is correct most of the time because the main role of the routines
called by softdep_disk_write_complete is to process the now-completed
dependencies so that the next I/O steps can happen.

But for the four types listed above, they do not get to do their
rollback operations. This causes the panic when softdep_disk_io_initiation
gets called on the second attempt to do the write and the roll-back
routines find that the roll-backs have already been done. As an
aside I note that there is also the problem that the buffer will
have been unlocked and thus made visible to the filesystem and to
user applications with the roll-backs in place.

The way to resolve the problem is to add a flag to the routines called
by softdep_disk_write_complete for the four dependency types noted
that indicates whether the write was successful (WRITESUCCEEDED).
If the write does not succeed, they do just the roll-backs and then
return. If the write was successful they also do their usual
processing of the now-completed dependencies.

The fix was tested by selectively injecting write errors for buffers
holding dependencies of each of the four types noted above and then
verifying that the kernel no longer panicked and that, following the
successful retry of the write, the filesystem could be unmounted
and checked cleanly.

PR: 211013
Reviewed by: kib
2016-08-16 21:02:30 +00:00
Enji Cooper
1b8cc98d83 Only expect :encode_tv_random_million to fail on 64-bit platforms
It passes on i386

MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2016-08-16 20:35:36 +00:00
Enji Cooper
ef5c8d5460 Only expect :encode_tv_random_million to fail on 64-bit platforms
It passes on i386

MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2016-08-16 20:32:08 +00:00
Mark Johnston
915a263ea2 Remove prototypes missed in r303951.
2016-08-16 19:43:17 +00:00
Enji Cooper
532c3cde6a MFhead @ r304232
2016-08-16 18:32:01 +00:00
Konstantin Belousov
948137db10 In UFS_BALLOC(), invalidate pages of indirect buffers on failed block
allocation unwinding.

Dangling buffers are released on UFS_BALLOC() failure to ensure that
later attempts to allocate blocks in close range do not find the blocks
with invalid content, since possible partial block allocations are
unwound.  As such, it is not enough to just release the buffers, the
pages must also be invalidated and removed from the vnode vm_object
queue.  Otherwise the pages might be found later and used to
reconstruct indirect buffers when doing allocations at offsets close to
the failure point, and their stale content would compromise the
filesystem integrity.

Note that just marking the buffer as B_INVAL is not enough, B_NOCACHE
is required.  To be sure, clear the B_CACHE flag as well.  This
complements r174973, which started releasing buffers.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-16 17:30:58 +00:00
Konstantin Belousov
2da4bcb82b On unwind after failed block allocation in ffs_balloc_ufs{1,2}, assert
that recorded allocated block numbers match the physical block
numbers of dangling buffers which are released.

When finally freeing the blocks during unwind, assert that dangling
buffers were not re-allocated.  They shouldn't be, because the vnode lock
is owned exclusive.

Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-16 17:18:38 +00:00
Kirk McKusick
65d3199746 Add two new macros, SLIST_CONCAT and LIST_CONCAT. Note in both the
queue.h header file and in the queue.3 manual page that they are O(n)
so should be used only in low-usage paths with short lists (otherwise
an STAILQ or TAILQ should be used).

Reviewed by: kib
2016-08-16 17:07:48 +00:00
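The O(n) cost mentioned in the commit above follows from a singly-linked list having only a head pointer: the tail must be found by walking the list before anything can be appended. The sketch below shows this with a hand-written list; it is not the queue.h macro, and the node type and function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; the real macros operate on SLIST_ENTRY fields. */
struct node {
	int value;
	struct node *next;
};

/*
 * Append list 'src' to the end of list 'dst' and leave 'src' empty.
 * With only a head pointer, reaching the last element of 'dst'
 * means walking all of it, hence the O(n) cost.
 */
static void
list_concat(struct node **dst, struct node **src)
{
	struct node *cur;

	assert(dst != NULL && src != NULL);
	if (*dst == NULL) {
		*dst = *src;
	} else {
		for (cur = *dst; cur->next != NULL; cur = cur->next)
			continue;
		cur->next = *src;
	}
	*src = NULL;
}
```

A tail queue keeps a pointer to its last element, which is why the commit message recommends STAILQ or TAILQ for longer lists or hot paths.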
Konstantin Belousov
39f7cbe9ab When looking up dangling buffers for unwinding after failed block
allocation in UFS_BALLOC(), there is no need to map them.

Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-16 17:05:15 +00:00
Konstantin Belousov
554248587e When block allocation fails in UFS_BALLOC(), and the volume does not
have SU enabled, there is no point in calling softdep_request_cleanup().

The call cannot produce free blocks, but we unnecessarily lock ufsmount
and do an inode block write.  The usual point of not doing optimizations
for the corner case of the full volume is not applicable here: the work
is easily avoidable, and the additional CPU and disk io load do not lead
to a successful retry.

Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-16 16:50:48 +00:00
Konstantin Belousov
f1503c5a49 In ffs_balloc_ufs{1,2} routines, assert that unwind records do not
overflow local arrays.  This is not immediately obvious from the
static code inspection, due to retry logic.

Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-08-16 16:49:56 +00:00
Marcelo Araujo
4b61b26b28 Use nitems() from sys/param.h.
MFC after:	2 weeks.
2016-08-16 15:53:05 +00:00
Marcelo Araujo
4c59a8860a Use nitems() from sys/param.h.
MFC after:	2 weeks.
2016-08-16 15:52:10 +00:00
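For readers unfamiliar with nitems(), here is a minimal sketch of how it is typically used to bound a loop over a fixed-size array. The fallback definition is included only so the example compiles outside FreeBSD, where <sys/param.h> may not provide the macro; the table and function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in; on FreeBSD the macro comes from <sys/param.h>. */
#ifndef nitems
#define nitems(x)	(sizeof((x)) / sizeof((x)[0]))
#endif

/* Hypothetical table, used only for illustration. */
static const int sample_ports[] = { 22, 80, 443, 8080 };

/* Sum every element, letting nitems() supply the loop bound. */
static int
sum_sample_ports(void)
{
	int total = 0;

	assert(nitems(sample_ports) > 0);
	for (size_t i = 0; i < nitems(sample_ports); i++)
		total += sample_ports[i];
	return (total);
}
```

Note that nitems() only works on true arrays; applied to a pointer it silently yields the wrong count.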
Randall Stewart
eadd00f81a A few more wording tweaks as suggested (with some modifications
as well) by Ravi Pokala. Thanks for the comments :-)
Sponsored by: Netflix Inc.
2016-08-16 15:17:36 +00:00
Randall Stewart
587d67c008 Here we update the modular tcp to be able to switch to an
alternate TCP stack in other than the closed state (pre-listen/connect).
The idea is that *if* that is supported by the alternate stack, it
is asked if it's ok to switch. If it approves the "handoff" then we
allow the switch to happen. Also the fini() function now gets a flag
to tell if you are switching away *or* the tcb is destroyed. The
init() call into the alternate stack is moved to the end so the
tcb is more fully formed before the init transpires.

Sponsored by:	Netflix Inc.
Differential Revision:	D6790
2016-08-16 15:11:46 +00:00
Emmanuel Vadot
858a3f496f Only use WaitForKeys event if it exists; this is not the case in the U-Boot EFI implementation.
Reviewed by:	jhb, emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D6781
2016-08-16 14:33:25 +00:00
Emmanuel Vadot
5f306327fe Use %ju modifier for u_int64_t and %jd modifier for off_t.
off_t is long long on arm32 and long on amd64

MFC after:	1 week
2016-08-16 14:23:35 +00:00
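The pattern named in the commit above can be sketched as follows: cast to uintmax_t or intmax_t so the argument always matches the %ju or %jd conversion, regardless of whether off_t is long or long long on the target. The function name and output format are hypothetical, and uint64_t stands in for the BSD spelling u_int64_t.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Format a 64-bit unsigned value and an off_t portably.  The casts
 * make each argument match its conversion specifier on every ABI.
 * Returns the snprintf() result.
 */
static int
format_sizes(char *buf, size_t len, uint64_t size, off_t offset)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "size=%ju offset=%jd",
	    (uintmax_t)size, (intmax_t)offset));
}
```

Without the casts, a format such as %ld would be correct on amd64 but mismatched on arm32, which is the kind of problem the commit addresses.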
Sofian Brabez
80a2905883 tty: Use proper definition of exit status code and stdin macro
Reviewed by:	bapt, bdrewery
Differential Revision:	https://reviews.freebsd.org/D6828
2016-08-16 14:15:09 +00:00
Randall Stewart
0fa047b98c Comments describing how to properly use the new lock_add functions
and its respective companion.

Sponsored by:	Netflix Inc.
2016-08-16 13:08:03 +00:00
Randall Stewart
b07fef500b This cleans up the timer code in TCP and also makes it so we do not
take the INFO lock *unless* we are really going to delete the TCB.

Differential Revision:	D7136
2016-08-16 12:40:56 +00:00
Bryan Drewery
0295d98c14 Trim unneeded bootstrap after r301470 made 9.1 the minimum supported release.
MFC after:	3 days
Sponsored by:	EMC / Isilon Storage Division
2016-08-16 12:13:12 +00:00
Brooks Davis
3bb0c17d3e Don't conflate enum nss_status return values with int (NS_SUCCESS,
NS_RETURN) values.

Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D6046
2016-08-16 11:38:45 +00:00
Konstantin Belousov
1c1cc89580 The fdatasync(2) call must be a cancellation point.
Sponsored by:	The FreeBSD Foundation
MFC after:	13 days
2016-08-16 08:27:03 +00:00
Adrian Chadd
b0754a31cf [mips] fix use-before-initialised.
Found by: gcc-5.3
2016-08-16 07:51:05 +00:00
Sepherosa Ziehau
de56155fe0 hyperv/hn: Simplify RNDIS message checks on RX path.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7502
2016-08-16 07:45:35 +00:00
Sepherosa Ziehau
62c4e6e992 hyperv/hn: Simplify RNDIS NVS message sending.
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7501
2016-08-16 07:37:02 +00:00
Sepherosa Ziehau
5ac4acb202 hyperv/hn: Factor out hn_nvs_send/hn_nvs_send_sglist
Avoid unnecessary message type setting and centralize the send context
to transaction id cast.

MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D7500
2016-08-16 07:26:53 +00:00
Sepherosa Ziehau
8452c1b345 tcp/lro: Make # of LRO entries tunable
Reviewed by:	hps, gallatin
Obtained from:	rrs, gallatin
MFC after:	2 weeks
Sponsored by:	Netflix (rrs, gallatin), Microsoft (sephe)
Differential Revision:	https://reviews.freebsd.org/D7499
2016-08-16 06:40:27 +00:00
Mark Johnston
5968c00154 Regenerate DTrace tests.
2016-08-16 02:34:25 +00:00