freebsd-dev/sys/cddl
Pawel Jakub Dawidek 3a482ccc5e MFC r203504,r204067,r204073,r204101,r204804,r205079,r205080,r205132,r205133,
r205134,r205231,r205253,r205264,r205346,r206051,r206667,r206792,r206793,
    r206794,r206795,r206796,r206797:

r203504:

Open provider for writting when we find the right one. Opening too much
providers for writing provokes huge traffic related to taste events send
by GEOM on close. This can lead to various problems with opening GEOM
providers that are created on top of other GEOM providers.

Reorted by:	Kurt Touet <ktouet@gmail.com>, mr
Tested by:	mr, Baginski Darren <kickbsd@ya.ru>

r204067:

Update comment. We also look for GPT partitions.

r204073:

Add tunable and sysctl to skip hostid check on pool import.

r204101:

Don't set f_bsize to recordsize. It might confuse some software (like squid).

Submitted by:	Alexander Zagrebin <alexz@visp.ru>

r204804:

Remove racy assertion.

Reported by:	Attila Nagy <bra@fsn.hu>
Obtained from:	OpenSolaris, Bug ID 6827260

r205079:

Remove bogus assertion.

Reported by:	Johan Ström <johan@stromnet.se>
Obtained from:	OpenSolaris, Bug ID 6920880

r205080:

Force commit to correct Bug ID:

Obtained from:	OpenSolaris, Bug ID 6920880

r205132:

Don't bottleneck on acquiring the stream locks - this avoids a massive
drop off in throughput with large numbers of simultaneous reads

r205133:

fix compilation under ZIO_USE_UMA

r205134:

make UMA the default allocator for ZFS buffers - this avoids
a great deal of contention in kmem_alloc

r205231:

- reduce contention by breaking up ARC state locks in to 16 for data
  and 16 for metadata
- export L2ARC tunables as sysctls
- add several kstats to track L2ARC state more precisely
- avoid holding a contended lock when atomically incrementing a
  contended counter (no lock protection needed for atomics)

r205253:

use CACHE_LINE_SIZE instead of hardcoding 128 for lock pad

pointed out by Marius Nuennerich and jhb@

r205264:

- cache line align arcs_lock array (h/t Marius Nuennerich)
- fix ARCS_LOCK_PAD to use architecture defined CACHE_LINE_SIZE
- cache line align buf_hash_table ht_locks array

r205346:

The same code is used to import and to create pool.
The order of operations is the following:
1. Try to open vdev by remembered path and guid.
2. If 1 failed, try to find vdev which guid matches and ignore the path.
3. If 2 failed this means either that the vdev we're looking for is gone
   or that pool is being created and vdev doesn't contain proper guid yet.
   To be able to handle pool creation we open vdev by path anyway.

Because of 3 it is possible that we open wrong vdev on import which can lead to
confusions.

The solution for this is to check spa_load_state. On pool creation it will be
equal to SPA_LOAD_NONE and we can open vdev only by path immediately and if it
is not equal to SPA_LOAD_NONE we first open by path+guid and when that fails,
we open by guid. We no longer open wrong vdev on import.

r206051:

IOCPARM_MAX defines maximum size of a structure that can be passed
directly to ioctl(2). Because of how ioctl command is build using _IO*()
macros we have only 13 bits to encode structure size. So the structure
can be up to 8kB-1.

Currently we define IOCPARM_MAX as PAGE_SIZE.

This is IMHO wrong for three main reasons:

1. It is confusing on archs with page size larger than 8kB (not really
   sure if we support such archs (sparc64?)), as even if PAGE_SIZE is
   bigger than 8kB, we won't be able to encode anything larger in ioctl
   command.

2. It is a waste. Why the structure can be only 4kB on most archs if we
   have 13 bits dedicated for that, not 12?

3. It shouldn't depend on architecture and page size. My ioctl command
   can work on one arch, but can't on the other?

Increase IOCPARM_MAX to 8kB and make it independed of PAGE_SIZE and
architecture it is compiled for. This allows to use all the bits on all the
archs for size. Note that this doesn't mean we will copy more on every ioctl(2)
call. No. We still copyin(9)/copyout(9) only exact number of bytes encoded in
ioctl command.

Practical use for this change is ZFS. zfs_cmd_t structure used for ZFS
ioctls is larger than 4kB.

Silence on:	arch@

r206667:

Fix 3-way deadlock that can happen because of ZFS and vnode lock
order reversal.

thread0 (vfs_fhtovp)	thread1 (vop_getattr)	thread2 (zfs_recv)
--------------------	---------------------	------------------
			vn_lock
rrw_enter_read
						rrw_enter_write (hangs)
			rrw_enter_read (hangs)
vn_lock (hangs)

Reported by:	Attila Nagy <bra@fsn.hu>

r206792:

Set ARC_L2_WRITING on L2ARC header creation.

Obtained from:	OpenSolaris

r206793:

Remove racy assertion.

Obtained from:	OpenSolaris

r206794:

Extend locks scope to match OpenSolaris.

r206795:

Add missing list and lock destruction.

r206796:

Style fixes.

r206797:

Restore previous order.
2010-04-18 21:36:34 +00:00
..
boot/zfs MFC r201684. 2010-02-23 16:46:34 +00:00
compat/opensolaris MFC r202964: 2010-02-25 00:46:51 +00:00
contrib/opensolaris MFC r203504,r204067,r204073,r204101,r204804,r205079,r205080,r205132,r205133, 2010-04-18 21:36:34 +00:00
dev dtrace_gethrtime: improve scaling of TSC ticks to nanoseconds 2009-07-15 17:07:39 +00:00