Commit Graph

200895 Commits

Author SHA1 Message Date
delphij
cd85118a8b MFV r268714:
Improve extreme rewind import.

When doing an "extreme rewind" import ("zpool import -XF"), we attempt
to verify all data in the pool, essentially scrubbing the entire pool.
The problem is that spa_load_verify_cb() issues an unbounded number of
concurrent scrub i/os.  This can lead to all of memory being used for
these zio's, wedging the system. Like normal scrub, we need to put a
cap on the number of outstanding i/os, and have the traverse thread
block when we reach this cap.

For this purpose the cap can be very large (10,000) to optimize the
elevator algorithm.  Three kernel tunables have been added:

	vfs.zfs.spa_load_verify_maxinflight
	vfs.zfs.spa_load_verify_metadata
	vfs.zfs.spa_load_verify_data

The latter two tunables control whether metadata and/or user data is
verified when doing extreme rewind.

Make 'zpool import -T' imply scrub.

Make zpool import -T <txg> accept hexadecimal values for the txg when
prefixed with 0x.

Skip txg's for which there is no uberblock when doing extreme rewind.

Skip reading all user data twice by skipping prefetches when doing
extreme rewinds as we do not access via the ARC.

Illumos issues:
  4970 need controls on i/o issued by zpool import -XF
  4971 zpool import -T should accept hex values
  4972 zpool import -T implies extreme rewind, and thus a scrub
  4973 spa_load_retry retries the same txg
  4974 spa_load_verify() reads all data twice

MFC after:	2 weeks
2014-07-15 22:44:04 +00:00
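
The in-flight cap described in cd85118a8b (block the traversing thread once too many scrub i/os are outstanding) can be sketched in userland C with a mutex and a condition variable. This is only an illustration of the technique, not the ZFS code: struct io_throttle and the throttle_*() names are hypothetical, and pthreads stands in for the kernel primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical throttle; field and function names are illustrative only. */
struct io_throttle {
	pthread_mutex_t	lock;
	pthread_cond_t	cv;
	unsigned	inflight;	/* i/os issued but not yet completed */
	unsigned	maxinflight;	/* cap on outstanding i/os */
};

void
throttle_init(struct io_throttle *t, unsigned maxinflight)
{
	assert(maxinflight > 0);
	pthread_mutex_init(&t->lock, NULL);
	pthread_cond_init(&t->cv, NULL);
	t->inflight = 0;
	t->maxinflight = maxinflight;
}

/* Non-blocking variant: take a slot only if one is free. */
bool
throttle_try_enter(struct io_throttle *t)
{
	bool ok;

	pthread_mutex_lock(&t->lock);
	ok = t->inflight < t->maxinflight;
	if (ok)
		t->inflight++;
	pthread_mutex_unlock(&t->lock);
	return (ok);
}

/* Called by the traversing thread before issuing an i/o; blocks at the cap. */
void
throttle_enter(struct io_throttle *t)
{
	pthread_mutex_lock(&t->lock);
	while (t->inflight >= t->maxinflight)
		pthread_cond_wait(&t->cv, &t->lock);
	t->inflight++;
	pthread_mutex_unlock(&t->lock);
}

/* Called from the i/o completion path; wakes one blocked issuer. */
void
throttle_exit(struct io_throttle *t)
{
	pthread_mutex_lock(&t->lock);
	assert(t->inflight > 0);
	t->inflight--;
	pthread_cond_signal(&t->cv);
	pthread_mutex_unlock(&t->lock);
}
```

With a large cap, the issuer rarely blocks, yet memory use for pending i/os stays bounded.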
bdrewery
d543f02759 Document the 'show bio' command added in 2009.
While here also reword 'show buffer' to have an 'addr' argument and to
match other struct documentation.

MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2014-07-15 21:13:08 +00:00
delphij
1fb217334d MFV r268702:
Add missing *_destroy() calls in various places with ZFS.

Illumos issue:
  4975 missing mutex_destroy() calls in zfs

MFC after:	2 weeks
2014-07-15 20:32:23 +00:00
kib
24a1fa13b7 Followup to r268466.
- Move the code to calculate resident count into separate function.
  It reduces the indent level and makes the operation of
  vmmap_skip_res_cnt tunable more clear.
- Optimize the calculation of the resident page count for map entry.
  Skip directly to the next lowest available index and page among the
  whole shadow chain.
- Restore the use of pmap_incore(9), only to verify that current
  mapping is indeed superpage.
- Note the issue with the invalid pages.

Suggested and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-07-15 19:57:03 +00:00
kib
f65dceba26 Change the calculation of the kinfo_vmentry field kve_private_resident
to reflect its name.

Noted and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-07-15 19:49:00 +00:00
np
a4e0ce3808 cxgbe(4): Display CF facility correctly in the device log.
MFC after:	3 days
2014-07-15 18:24:41 +00:00
neel
eb07e4ed55 Add support for operand size and address size override prefixes in bhyve's
instruction emulation [1].

Fix bug in emulation of opcode 0x8A where the destination is a legacy high
byte register and the guest vcpu is in 32-bit mode. Prior to this change
instead of modifying %ah, %bh, %ch or %dh the emulation would end up
modifying %spl, %bpl, %sil or %dil instead.

Add support for moffsets by treating it as a 2, 4 or 8 byte immediate value
during instruction decoding.

Fix bug in verify_gla() where the linear address computed after decoding
the instruction was not being truncated to the effective address size [2].

Tested by:	Leon Dang [1]
Reported by:	Peter Grehan [2]
Sponsored by:	Nahanni Systems
2014-07-15 17:37:17 +00:00
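
The high-byte register bug fixed in eb07e4ed55 comes from how x86 encodes 8-bit registers: without a REX prefix, register numbers 4-7 name %ah, %ch, %dh and %bh, while with a REX prefix the same numbers name %spl, %bpl, %sil and %dil. A minimal sketch of that lookup follows; byte_reg_name() is a hypothetical helper and not part of bhyve.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return the name of the 8-bit register selected by a register number.
 * Without a REX prefix, numbers 4-7 select the legacy high-byte registers;
 * with a REX prefix they select the low bytes of %rsp, %rbp, %rsi, %rdi.
 */
const char *
byte_reg_name(unsigned reg, bool rex_present)
{
	static const char *const legacy[8] = {
		"%al", "%cl", "%dl", "%bl", "%ah", "%ch", "%dh", "%bh"
	};
	static const char *const rex[16] = {
		"%al", "%cl", "%dl", "%bl", "%spl", "%bpl", "%sil", "%dil",
		"%r8b", "%r9b", "%r10b", "%r11b",
		"%r12b", "%r13b", "%r14b", "%r15b"
	};

	if (!rex_present) {
		/* Registers 8-15 are only reachable through REX. */
		assert(reg < 8);
		return (legacy[reg]);
	}
	assert(reg < 16);
	return (rex[reg]);
}
```

A 32-bit guest cannot use REX prefixes, so an emulator must take the legacy path for register numbers 4-7 there.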
alc
9e8e83700c Actually set the "no execute" bit on 1 MB page mappings in pmap_protect().
Previously, the "no execute" bit was being set directly in the PTE, instead
of the local variable in which the new PTE value is being constructed.  So,
when the local variable was finally assigned to the PTE, the "no execute"
bit setting was lost.
2014-07-15 17:16:06 +00:00
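
The bug pattern described in 9e8e83700c can be reduced to a few lines. The sketch below is not the pmap code: the PTE_NX and PTE_RW bit values and both function names are made up for illustration.

```c
#include <stdint.h>

#define PTE_NX	0x010u	/* hypothetical "no execute" bit */
#define PTE_RW	0x400u	/* hypothetical "writable" bit */

/* Buggy pattern: NX is set in the live entry, then overwritten. */
void
protect_buggy(uint32_t *ptep)
{
	uint32_t npte = *ptep;	/* new value is built in a local */

	npte &= ~PTE_RW;
	*ptep |= PTE_NX;	/* set directly in the PTE ... */
	*ptep = npte;		/* ... and lost by the final store */
}

/* Fixed pattern: every change goes into the local before the store. */
void
protect_fixed(uint32_t *ptep)
{
	uint32_t npte = *ptep;

	npte &= ~PTE_RW;
	npte |= PTE_NX;
	*ptep = npte;
}
```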
jhb
17d78db27b Fix build with SMP disabled.
CR:		https://phabric.freebsd.org/D407
Reviewed by:	royger
2014-07-15 15:40:33 +00:00
bapt
7eba29e60e Add a comment to explain the EAGAIN is only there for POSIX compliance
Requested by:	kib
Reviewed by:	des
2014-07-15 15:29:43 +00:00
kib
1ecfe30151 Make amd64 pmap_copy_pages() functional for pages not mapped by DMAP.
Requested and reviewed by:	royger
Tested by:	pho, royger
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-07-15 09:30:43 +00:00
alc
80950c5bb8 Eliminate repeated calculation of next_bucket in pmap_protect() and
pmap_remove().  Eliminate an unnecessary variable from pmap_remove() and
pmap_advise().
2014-07-15 05:34:27 +00:00
marcel
ca1d5922c5 Add image_data() for checking whether a sequence of blocks has data.
Use this for VHD and VMDK to avoid allocating space in the image
for empty sectors.

Note that this negatively affects performance because mkimg uses a
temporary file for the intermediate storage. When mkimg has better
internal book keeping, performance can be significantly improved.
2014-07-15 04:39:23 +00:00
pfg
b2edcea2f0 libc/stdlib: Minor cleanups to code originating in NetBSD
Mostly ANSIfication and typos.

Obtained from:	NetBSD
MFC after:	5 days
2014-07-15 03:28:37 +00:00
kevlo
6a45f468c1 Document that listen(2) can fail with EDESTADDRREQ.
2014-07-15 02:21:51 +00:00
pfg
93698c5f59 libc/gen: small updates to code originating at OpenBSD
arc4random.c
- CVS rev. 1.22
Change arc4random_uniform() to calculate ``2**32 % upper_bound'' as
``-upper_bound % upper_bound''. Simplifies the code and makes it the
same on both ILP32 and LP64 architectures, and also slightly faster on
LP64 architectures by using a 32-bit remainder instead of a 64-bit
remainder.
- CVS rev. 1.23
Spacing

readpassphrase.c
- CVS rev. 1.24
most obvious unsigned char casts for ctype

Obtained from:	OpenBSD
MFC after:	5 days
2014-07-15 02:21:35 +00:00
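
The arc4random_uniform() identity mentioned in 93698c5f59 works because unsigned negation wraps: for a 32-bit unsigned value, -upper_bound equals 2**32 - upper_bound, which leaves the same remainder as 2**32 when divided by upper_bound. A small check of the two forms follows; min_wide() and min_narrow() are illustrative names, and the sketch assumes a platform where int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* 2**32 % upper_bound computed with a 64-bit remainder. */
uint32_t
min_wide(uint32_t upper_bound)
{
	assert(upper_bound >= 2);
	return ((uint32_t)((UINT64_C(1) << 32) % upper_bound));
}

/*
 * The same value with 32-bit arithmetic only: -upper_bound wraps to
 * 2**32 - upper_bound, which is congruent to 2**32 modulo upper_bound.
 */
uint32_t
min_narrow(uint32_t upper_bound)
{
	assert(upper_bound >= 2);
	return (-upper_bound % upper_bound);
}
```

Random values below this minimum are the ones a uniform generator rejects to avoid modulo bias.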
dteske
bf1e81a37a Fix an issue with service(8) where utilities such as screen(1) and tmux(1)
would behave differently when the rc-script was invoked manually vs. via
service(8). The issue is that these utilities require the TERM environment
variable to be set and service(8) was not passing it down.

Reported by:	Michael Dexter <editor@callfortesting.org>
PR:		bin/191869
Reviewed by:	allanjude
MFC after:	3 days
X-MFC-to:	stable/10, stable/9
2014-07-15 02:18:55 +00:00
np
d6f3d65931 Allow multi-byte reads in the private CHELSIO_T4_GET_I2C ioctl. The
firmware allows up to 48B to be read this way but the driver limits
itself to 8B at a time to remain compatible with old cxgbetool
binaries.

MFC after:	1 week
2014-07-15 01:03:29 +00:00
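
Given the 8-byte per-request limit described in d6f3d65931, a caller that wants more data has to loop. The sketch below shows such a loop over a stand-in read function; i2c_read_fn, fake_i2c_read() and i2c_read_all() are hypothetical and do not reflect the actual ioctl interface or cxgbetool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define I2C_MAX_PER_CALL 8	/* per-request limit kept by the driver */

/* Hypothetical single-request read: 0 on success, nonzero on error. */
typedef int (*i2c_read_fn)(unsigned offset, uint8_t *buf, size_t len);

/* Stand-in device for the example: the byte at offset N has value N. */
int
fake_i2c_read(unsigned offset, uint8_t *buf, size_t len)
{
	assert(len <= I2C_MAX_PER_CALL);
	for (size_t i = 0; i < len; i++)
		buf[i] = (uint8_t)(offset + i);
	return (0);
}

/* Read len bytes by issuing requests of at most I2C_MAX_PER_CALL bytes. */
int
i2c_read_all(i2c_read_fn rd, unsigned offset, uint8_t *buf, size_t len)
{
	while (len > 0) {
		size_t n = len < I2C_MAX_PER_CALL ? len : I2C_MAX_PER_CALL;
		int rc = rd(offset, buf, n);

		if (rc != 0)
			return (rc);
		offset += (unsigned)n;
		buf += n;
		len -= n;
	}
	return (0);
}
```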
grehan
bffc595f8b Use the blockif CHS routine to create fake CHS values,
and then populate them in the identity page.

This fixes a divide-by-zero error at probe time with NetBSD.

MFC after:	1 week.
2014-07-15 00:27:08 +00:00
grehan
4632b82c93 Add a call to synthesize a C/H/S value for block emulations
that require it (ahci). The algorithm used is from the VHD
specification.
2014-07-15 00:25:54 +00:00
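
For reference, the CHS calculation commonly reproduced from the VHD specification looks roughly like the sketch below. It is written from that published algorithm and is not the blockif routine added in 4632b82c93; struct chs and vhd_chs() are illustrative names.

```c
#include <stdint.h>

struct chs {
	uint32_t cylinders;
	uint32_t heads;
	uint32_t sectors;	/* sectors per track */
};

/* Synthesize a C/H/S geometry from a total sector count. */
struct chs
vhd_chs(uint64_t total_sectors)
{
	struct chs g;
	uint64_t cth;		/* cylinders times heads */

	/* Clamp to the largest geometry that can be expressed. */
	if (total_sectors > (uint64_t)65535 * 16 * 255)
		total_sectors = (uint64_t)65535 * 16 * 255;

	if (total_sectors >= (uint64_t)65535 * 16 * 63) {
		g.sectors = 255;
		g.heads = 16;
		cth = total_sectors / g.sectors;
	} else {
		g.sectors = 17;
		cth = total_sectors / g.sectors;
		g.heads = (uint32_t)((cth + 1023) / 1024);
		if (g.heads < 4)
			g.heads = 4;
		if (cth >= (uint64_t)g.heads * 1024 || g.heads > 16) {
			g.sectors = 31;
			g.heads = 16;
			cth = total_sectors / g.sectors;
		}
		if (cth >= (uint64_t)g.heads * 1024) {
			g.sectors = 63;
			g.heads = 16;
			cth = total_sectors / g.sectors;
		}
	}
	g.cylinders = (uint32_t)(cth / g.heads);
	return (g);
}
```

The resulting geometry never describes more sectors than the device has, so rounding may leave a few trailing sectors unaddressable through CHS.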
gavin
79ac32d748 fortune(8): Search fortune files installed by ports/packages as well as
those supplied by the base system.

PR:		191800
Submitted by:	Andy Kosela
MFC after:	1 week
2014-07-14 23:25:29 +00:00
mjg
b55267950c Plug p_pptr null test in do_execve. It is always true.
2014-07-14 22:40:46 +00:00
pfg
ce102225bd newfs_msdosfs: Respect FSFIXFAT
Fix some whitespace issues while here.

Obtained from:	NetBSD (rev. 1.9)
MFC after:	3 days
2014-07-14 21:32:40 +00:00
mjg
fe1391d5a5 Manage struct sigacts refcnt with atomics instead of a mutex.
MFC after:	1 week
2014-07-14 21:12:59 +00:00
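
The general idea behind fe1391d5a5, a reference count maintained with atomic operations rather than under a mutex, can be sketched with C11 atomics. This is not the kernel code: struct shared_obj and the obj_*() functions are hypothetical, and the kernel has its own primitives for this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared object with an atomically managed refcount. */
struct shared_obj {
	atomic_uint refcnt;
};

/* A new object starts with one reference, owned by the creator. */
void
obj_init(struct shared_obj *o)
{
	atomic_init(&o->refcnt, 1);
}

/* Take an additional reference; the caller must already hold one. */
void
obj_hold(struct shared_obj *o)
{
	unsigned old;

	old = atomic_fetch_add_explicit(&o->refcnt, 1, memory_order_relaxed);
	assert(old > 0);
	(void)old;
}

/* Drop a reference; returns true when the caller dropped the last one. */
bool
obj_release(struct shared_obj *o)
{
	unsigned old;

	old = atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_acq_rel);
	assert(old > 0);
	return (old == 1);
}
```

The caller that sees obj_release() return true is the only one allowed to free the object.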
ian
a99cc79448 Fix the Zedboard/Zynq ethernet driver to handle media speed changes so
that it can connect to switches at speeds other than 1gb.

This requires changing the reference clock speed.  Since we still don't
have a general clock API that lets a SoC-independent driver manipulate its
own clocks, this change includes a weak reference to a routine named
cgem_set_ref_clk().  The default implementation is a no-op; SoC-specific
code can provide an implementation that actually changes the speed.

Submitted by:	Thomas Skibo <ThomasSkibo@sbcglobal.net>
2014-07-14 20:58:57 +00:00
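
The weak-reference arrangement described in a99cc79448 can be sketched with the GCC/Clang weak attribute. The function name comes from the commit message, but the signature, parameter names and return convention below are assumptions made for illustration.

```c
/*
 * Default implementation: a no-op. Because the symbol is weak, SoC-specific
 * code that links in a normal (strong) definition of the same function
 * replaces this one without any change to the driver.
 * The (unit, frequency) parameters are assumed, not taken from the driver.
 */
__attribute__((weak)) int
cgem_set_ref_clk(int unit, int frequency)
{
	(void)unit;
	(void)frequency;
	return (0);
}
```

The driver simply calls the function on a media speed change; whether the clock actually changes depends on which definition the linker picked.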
pfg
375de29597 msdosfs: Assorted fixes from other BSDs.
When truncating cluster chains fix the length of the cluster head.
http://marc.info/?t=140304310700005&r=1&w=2

Avoid infinite loops in cluster chain linked lists.
http://marc.info/?l=openbsd-tech&m=140275150804337&w=2

Avoid off-by-one on FAT12 filesystems.
http://marc.info/?l=openbsd-tech&m=140234174104724&w=2

Obtained from:	NetBSD (from OpenBSD)
MFC after:	1 week
2014-07-14 20:58:02 +00:00
pfg
7716a7c65e fsck_msdosfs: be a bit more permissive
The free space value in the FSInfo block is merely uninitialized when it is
0xffffffff. This fixes a bug found in NetBSD.

It must be noted that we never supported all the checks that NetBSD does
as some of them would cause failures with a freshly created FAT32
from MS-Windows.

While here, bring some space fixes.

Obtained from:	NetBSD (rev. 1.22)
MFC after:	3 days
2014-07-14 20:17:09 +00:00
pfg
cbc817e242 Minor (mostly cosmetic) cleanups
Several whitespace fixes
convert *rootDir from external to static.

Obtained from:	NetBSD, OpenBSD (partial)
MFC after:	3 days
2014-07-14 19:16:49 +00:00
delphij
ba32c2f8ac Bump mdoc date after r268621.
X-MFC-With:	r268621
2014-07-14 17:54:36 +00:00
nwhitehorn
ea8d777f28 On my Lenovo laptop, the firmware maps the EFI framebuffer with MTRRs set
to uncacheable. This leads to execrable console performance. Once PMAP is
up, remap the framebuffer as write-combining. This reduces boot time on my
laptop by 60% when booting with EFI.

MFC after:	2 weeks
2014-07-14 17:42:22 +00:00
alc
95f74559b9 Eliminate dead code. There is no direct map. This code was cut-and-pasted
from amd64.
2014-07-14 17:16:09 +00:00
smh
5563475471 Don't report non-native block-size pools under zpool status -x
zpool status -x is used to identify pools that are exhibiting
errors or are otherwise unavailable, therefore non-native
block-size pools shouldn't be reported.

Also update man page to clarify other additional conditions
which won't cause a pool to be displayed under zpool status -x.

Sponsored by:	Multiplay
2014-07-14 14:33:03 +00:00
jmmv
3b9fdabf1f Make generation of nslexer.c more robust.
Ensure that lex errors fail the build instead of being silently ignored
due to the piped call.  Also postpone the update of the nslexer.c file
until we are sure we have generated it properly.

These changes fix some very obscure build failures I encountered while
building FreeBSD within a chroot that did not have devfs mounted. The
specific errors looked like:

.../libc.so.7: undefined reference to `_nsyyerror'
.../libc.so.7: undefined reference to `_nsyyin'
.../libc.so.7: undefined reference to `_nsyylex'
.../libc.so.7: undefined reference to `_nsyylineno'
.../libc.so.7: undefined reference to `_nsyytext'

and were caused due to a mangled nslexer.c being linked into libc.
2014-07-14 13:53:10 +00:00
gahr
e896978a89 Unbreak the build by re-enabling exceptions.
Disabling them breaks build on archs using GCC. The problem is at line 156 of
bits/basic_ios.h:

	if (this->exceptions() & __state)
		__throw_exception_again;

With exceptions disabled __throw_exception_again is defined as

#define __throw_exception_again

at line 45 of exception_defines.h and the code results in an empty if body,
which fails because of -Werror.

Approved by:	cognet
2014-07-14 12:24:38 +00:00
kib
7eb2d27b75 Rework the tmpfs unmount.
- Suspend filesystem for unmount.  This prevents new tmpfs nodes from
  instantiating, and also ensures that only unmount thread can destroy
  nodes.

- Do not start tmpfs node deletion until all vnodes are reclaimed,
  which guarantees that no thread can access tmpfs data.  For this,
  call vflush() in a loop while the mnt_nvnodelistsize is non-zero.
  Note that after mnt_nvnodelistsize becomes 0, insmntque() blocks
  insertion of a vnode germ into the mount list of vnodes.

- Fail node allocation when the filesystem is being unmounted.  This
  is race-free due to the vflush() call in loop.  This is mostly
  cosmetic, avoiding some more work which might be done until
  suspension in unmount is started.

Note that there is currently no way to prevent new vnode instantiation
from readers during the unmount.  Due to this, forced unmount might
live-lock if vflush() loop cannot get to the zero vnode count due to
races with readers.  The unmount would proceed after the load is
lifted.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:52:33 +00:00
kib
2488a7f8be Change forgotten in r268615. Set the OBJ_TMPFS_NODE flag for
vm_object of VREG tmpfs node.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:35:14 +00:00
kib
8664d64bc3 The OBJ_TMPFS flag of vm_object means that there is unreclaimed tmpfs
vnode for the tmpfs node owning this object.  The flag is currently
used for two purposes.  First, it allows correct handling of VV_TEXT
for tmpfs vnode when the ref count on the object is decremented to 1,
similar to vnode_pager_dealloc() for regular filesystems.  Second, it
prevents some operations, which are done on OBJT_SWAP vm objects
backing user anonymous memory, but are incorrect for the object owned
by tmpfs node.

The second kind of use of the OBJ_TMPFS flag is incorrect, since the
vnode might be reclaimed, which clears the flag, but vm object
operations must still be disallowed.

Introduce one more flag, OBJ_TMPFS_NODE, which is permanently set on
the object for VREG tmpfs node, and used instead of OBJ_TMPFS to test
whether vm object collapse and similar actions should be disabled.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:30:37 +00:00
kib
e152d89652 Use tmpfs_vn_get_ino_gen() to handle the races with reclaim in tmpfs
dotdot lookup.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:16:55 +00:00
kib
1ae17f77a1 Style. Add comment about lock mode.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:13:56 +00:00
kib
a698488f0f Extract the code to put a filesystem into the suspended state (at the
unmount time) in the helper vfs_write_suspend_umnt().  Use it instead
of two inline copies in FFS.

Fix the bug in the FFS unmount, when suspension failed, the ufs
extattrs were not reinitialized.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:10:00 +00:00
kib
a3c7856b26 In tmpfs_alloc_file(), code after the 'out' label does only 'return
error;'.  Replace goto's with the return.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:02:40 +00:00
kib
2aa8688209 Add convenience macro to assert tmpfs node lock.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 08:59:25 +00:00
kib
7dd9ab980a Add some assertions for the code handling vm_object for tmpfs vnode.
In particular, vnode must be exclusively locked when the tmpfs vnode
and object are divorced.  When the vnode is opened, the object must be
still alive, since only live vnode can be opened, and the tmpfs node
owns a reference on the object.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 08:55:02 +00:00
kib
8542c8b735 The tmpfs_link() must not dereference the filesystem-specific data for
a vnode until it is verified that the vnode indeed belongs to tmpfs
mount.  Otherwise, it might access random memory, at least in the
debug kernel.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 08:45:29 +00:00
kib
7b68dd9333 In kern_linkat(), avoid passing doomed vnode to the VOP.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 08:41:13 +00:00
kib
2f41c9023e Generalize vn_get_ino() to allow filesystems to use custom vnode
producer, instead of hard-coding VFS_VGET().  New function, which
takes callback, is called vn_get_ino_gen(), standard callback for
vn_get_ino() is provided.

Convert inline copies of vn_get_ino() in msdosfs and cd9660 into the
uses of vn_get_ino_gen().

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 08:34:54 +00:00
kib
d84e7ad50c Remove code separator lines which do not conform to style(9).
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 08:17:11 +00:00
kevlo
c4fd7b3987 Make bind(2) and connect(2) return EAFNOSUPPORT for AF_UNIX on wrong
address family.

See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191586 for the
original discussion.

Reviewed by:	terry
2014-07-14 06:00:01 +00:00
markj
880dd1a983 Invoke the DTrace trap handler before calling trap() on amd64. This matches
the upstream implementation and helps ensure that a trap induced by tracing
fbt::trap:entry is handled without recursively generating another trap.

This makes it possible to run most (but not all) of the DTrace tests under
common/safety/ without triggering a kernel panic.

Submitted by:	Anton Rang <anton.rang@isilon.com> (original version)
Phabric:	D95
2014-07-14 04:38:17 +00:00
jmmv
0b964c341a Explicitly disable the build of tests when building bmake.
During "make buildworld", building bmake is (one of) the very first steps
and we should not be building any of its tests.  Conceptually, this is the
right thing to do 1) for build simplicity reasons and 2) because there is
no need to build any tests this early on.

In practice, this fixes tinderbox builds of CURRENT from 9.x when MK_TESTS
is enabled.  This is because bsd.test.mk needs some modern bmake features
not present in 9.x (:tW) and tinderbox is forcing the build to use the
CURRENT share/mk files from the very beginning (see r266617).  By skipping
the build of the tests when still using the host make, we avoid the problem.
Arguably, what tinderbox is doing is wrong and needs to be addressed, but
that is a separate issue.
2014-07-13 23:53:41 +00:00