freebsd-nq

Author	SHA1	Message	Date
Pawel Jakub Dawidek	3a482ccc5e	MFC r203504,r204067,r204073,r204101,r204804,r205079,r205080,r205132,r205133, r205134,r205231,r205253,r205264,r205346,r206051,r206667,r206792,r206793, r206794,r206795,r206796,r206797: r203504: Open provider for writing when we find the right one. Opening too many providers for writing provokes huge traffic related to taste events sent by GEOM on close. This can lead to various problems with opening GEOM providers that are created on top of other GEOM providers. Reported by: Kurt Touet <ktouet@gmail.com>, mr Tested by: mr, Baginski Darren <kickbsd@ya.ru> r204067: Update comment. We also look for GPT partitions. r204073: Add tunable and sysctl to skip hostid check on pool import. r204101: Don't set f_bsize to recordsize. It might confuse some software (like squid). Submitted by: Alexander Zagrebin <alexz@visp.ru> r204804: Remove racy assertion. Reported by: Attila Nagy <bra@fsn.hu> Obtained from: OpenSolaris, Bug ID 6827260 r205079: Remove bogus assertion. Reported by: Johan Ström <johan@stromnet.se> Obtained from: OpenSolaris, Bug ID 6920880 r205080: Force commit to correct Bug ID: Obtained from: OpenSolaris, Bug ID 6920880 r205132: Don't bottleneck on acquiring the stream locks - this avoids a massive drop off in throughput with large numbers of simultaneous reads r205133: fix compilation under ZIO_USE_UMA r205134: make UMA the default allocator for ZFS buffers - this avoids a great deal of contention in kmem_alloc r205231: - reduce contention by breaking up ARC state locks into 16 for data and 16 for metadata - export L2ARC tunables as sysctls - add several kstats to track L2ARC state more precisely - avoid holding a contended lock when atomically incrementing a contended counter (no lock protection needed for atomics) r205253: use CACHE_LINE_SIZE instead of hardcoding 128 for lock pad pointed out by Marius Nuennerich and jhb@ r205264: - cache line align arcs_lock array (h/t Marius Nuennerich) - fix ARCS_LOCK_PAD to use architecture
defined CACHE_LINE_SIZE - cache line align buf_hash_table ht_locks array r205346: The same code is used to import and to create a pool. The order of operations is the following: 1. Try to open vdev by remembered path and guid. 2. If 1 failed, try to find a vdev whose guid matches and ignore the path. 3. If 2 failed this means either that the vdev we're looking for is gone or that the pool is being created and the vdev doesn't contain a proper guid yet. To be able to handle pool creation we open vdev by path anyway. Because of 3 it is possible that we open the wrong vdev on import, which can lead to confusion. The solution for this is to check spa_load_state. On pool creation it will be equal to SPA_LOAD_NONE and we can open vdev only by path immediately and if it is not equal to SPA_LOAD_NONE we first open by path+guid and when that fails, we open by guid. We no longer open the wrong vdev on import. r206051: IOCPARM_MAX defines the maximum size of a structure that can be passed directly to ioctl(2). Because of how the ioctl command is built using _IO*() macros we have only 13 bits to encode structure size. So the structure can be up to 8kB-1. Currently we define IOCPARM_MAX as PAGE_SIZE. This is IMHO wrong for three main reasons: 1. It is confusing on archs with page size larger than 8kB (not really sure if we support such archs (sparc64?)), as even if PAGE_SIZE is bigger than 8kB, we won't be able to encode anything larger in ioctl command. 2. It is a waste. Why can the structure be only 4kB on most archs if we have 13 bits dedicated for that, not 12? 3. It shouldn't depend on architecture and page size. My ioctl command can work on one arch, but can't on the other? Increase IOCPARM_MAX to 8kB and make it independent of PAGE_SIZE and the architecture it is compiled for. This allows using all the bits on all the archs for size. Note that this doesn't mean we will copy more on every ioctl(2) call. No. We still copyin(9)/copyout(9) only the exact number of bytes encoded in the ioctl command.
Practical use for this change is ZFS. zfs_cmd_t structure used for ZFS ioctls is larger than 4kB. Silence on: arch@ r206667: Fix 3-way deadlock that can happen because of ZFS and vnode lock order reversal. thread0 (vfs_fhtovp) thread1 (vop_getattr) thread2 (zfs_recv) -------------------- --------------------- ------------------ vn_lock rrw_enter_read rrw_enter_write (hangs) rrw_enter_read (hangs) vn_lock (hangs) Reported by: Attila Nagy <bra@fsn.hu> r206792: Set ARC_L2_WRITING on L2ARC header creation. Obtained from: OpenSolaris r206793: Remove racy assertion. Obtained from: OpenSolaris r206794: Extend locks scope to match OpenSolaris. r206795: Add missing list and lock destruction. r206796: Style fixes. r206797: Restore previous order.	2010-04-18 21:36:34 +00:00
Xin LI	b1ebb318cb	MFC r201690: Space cleanup for revision 202669 committed separately for easier review. This commit is purely space changes. Submitted by: Matt Reimer Sponsored by: VPOP Technologies, Inc.	2010-01-20 01:14:54 +00:00
Xin LI	ac07939f0e	MFC r201689: Instead of assuming all vdevs are healthy, check the newest vdev label for each vdev's status. Booting from a degraded vdev should now be more robust. Submitted by: Matt Reimer <mattjreimer at gmail.com> Sponsored by: VPOP Technologies, Inc.	2010-01-20 01:13:52 +00:00
John Baldwin	5e05dbe9bd	MFC 200309: - Port bios_getmem() from libi386 to {gpt,}zfsboot() and use it to safely allocate a heap region above 1MB. This enables {gpt,}zfsboot() to allocate much larger buffers than before. - Use a larger buffer (1MB instead of 128K) for temporary ZFS buffers. This allows more reliable reading of compressed files in a raidz/raidz2 pool.	2009-12-18 21:01:56 +00:00
Robert Noland	262b2ce076	MFC 198420 Correct some issues with zfs boot. - Teach it to read gang blocks. (essentially untested) If you see "ZFS: gang block detected!", please let me know, so we can either remove the printf if it works, or fix it if it doesn't. - If multiple partitions exist on a disk, probe them all. We also need to reset dsk->start to 0 to read the right sector here. - With GPT, we can have 128 partitions. - If the bootfs property has ever been set on a pool it seems that it never goes away. zpool won't allow you to add to the pool with the bootfs property set. However, if you clear the property back to default we end up getting 0 for the object number and read a bogus block pointer and fail to boot. - Fix some error printfs. The printf in the loader is only capable of c,s and u formats. - Teach printf how to display %llu	2009-11-21 15:02:35 +00:00
Doug Rabson	e1899ef6c8	Add support for booting from raidz1 and raidz2 pools.	2009-05-16 10:48:20 +00:00
Doug Rabson	7b3569ff05	Use full 64bit arithmetic when converting file offsets to block numbers - fixes booting on filesystems with inode numbers with values above 4194304. Submitted by: ps	2008-12-17 18:12:01 +00:00
Paul Saab	5ee5aed0a3	Fix a leak introduced in r185902. We should free the devspec if we've successfully found a zfs pool.	2008-12-11 16:48:35 +00:00
Paul Saab	390edcc5b9	Avoid a double free in devopen by not freeing the device structure in zfs_dev_open. This stops a panic in the loader when trying to read from a zfs device and no zfs devices exist.	2008-12-11 02:23:49 +00:00
Doug Rabson	937a012e5d	Don't get confused if we encounter a device which is part of a raidz or raidz2 pool while probing for vdevs. PR: 129539 Submitted by: Paul Wootton (paul at fletchermoorland dot co dot uk)	2008-12-10 10:46:34 +00:00
Paul Saab	8f6a8ed553	Correct include path for i386 specific includes. This allows zfs to boot on systems where the loader is built on amd64 systems.	2008-12-06 14:45:03 +00:00
Doug Rabson	ebd4055a33	Fix amd64 build and re-enable gptzfsboot.	2008-11-22 14:24:55 +00:00
Doug Rabson	0d16312b46	Some zfsboot fixes from Norikatsu Shigemura: 1. zfsboot2 (boot2) doesn't support %d (printf), so change %d to %u. 2. chase new zpool versioning as SPA_VERSION. Obtained from: sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h Submitted by: nork	2008-11-19 16:59:19 +00:00
Doug Rabson	51f0d2e192	Add a GPT-aware variant of zfsboot which should be used in a similar manner to gptboot, i.e. installed in a freebsd-boot partition using /sbin/gpart or /sbin/gpt. Tweak the /boot/loader ZFS support so that it can find ZFS pools that are contained in GPT partitions.	2008-11-19 16:39:01 +00:00
Pawel Jakub Dawidek	1ba4a712dd	Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes. This brings a huge amount of changes, so I'll enumerate only user-visible changes: - Delegated Administration: Allows regular users to perform ZFS operations, like file system creation, snapshot creation, etc. - L2ARC: Level 2 cache for ZFS - allows using additional disks for cache. Huge performance improvements, mostly for random reads of mostly static content. - slog: Allows using additional disks for the ZFS Intent Log to speed up operations like fsync(2). - vfs.zfs.super_owner: Allows regular users to perform privileged operations on files stored on ZFS file systems they own. Be very careful with this one. - chflags(2): Not all the flags are supported. This still needs work. - ZFSBoot: Support for booting off of a ZFS pool. Not finished, AFAIK. Submitted by: dfr - Snapshot properties - New failure modes: Previously, if a write request failed, the system panicked. Now one can select one of three failure modes: - panic - panic on write error - wait - wait for disk to reappear - continue - serve read requests if possible, block write requests - Refquota, refreservation properties: Just quota and reservation properties, but they don't count space consumed by children file systems, clones and snapshots. - Sparse volumes: ZVOLs that don't reserve space in the pool. - External attributes: Compatible with extattr(2). - NFSv4-ACLs: Not sure about the status, might not be complete yet. Submitted by: trasz - Creation-time properties - Regression tests for zpool(8) command. Obtained from: OpenSolaris	2008-11-17 20:49:29 +00:00
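
Note on r206051 (IOCPARM_MAX): the size limit described in that entry follows from how the argument length is packed into the ioctl command word. The sketch below is only an illustration in Python, not the kernel's macros. The 13-bit size field comes from the commit message; the position of the field, the `IOC_IN` value, the helper names `encode_ioctl` and `ioctl_len`, and the 5000-byte structure size are assumptions made for the example.

```python
# Illustrative model of a BSD-style ioctl command word (not kernel code).
# From the commit message: 13 bits are available to encode the structure size.
IOCPARM_SHIFT = 13
IOCPARM_MASK = (1 << IOCPARM_SHIFT) - 1  # largest encodable size: 8kB-1

# Assumptions for this sketch: the size field sits above a 16-bit
# group/number field, and IOC_IN is a direction flag above the size field.
IOC_IN = 0x80000000
PAGE_SIZE = 4096  # typical page size, i.e. the old IOCPARM_MAX on most archs


def encode_ioctl(direction, group, num, length):
    """Pack direction, group letter, command number and size into one word."""
    return direction | ((length & IOCPARM_MASK) << 16) | (ord(group) << 8) | num


def ioctl_len(cmd):
    """Recover the structure size that was encoded in the command word."""
    return (cmd >> 16) & IOCPARM_MASK


# Hypothetical structure larger than one 4kB page but below 8kB.
big_struct_size = 5000
cmd = encode_ioctl(IOC_IN, "Z", 0, big_struct_size)
print(ioctl_len(cmd), ioctl_len(cmd) > PAGE_SIZE)
```

A size of 8192 or more does not fit in 13 bits and would be truncated by the mask, which is why the entry gives the limit as 8kB-1 regardless of page size.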

15 Commits
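
Note on r205346 (vdev open order): the lookup rule described in that entry can be sketched as a small Python function. This is a simplified model, not the ZFS code. `SPA_LOAD_NONE` is named in the commit message; the `open_vdev` function, the `devices` mapping of path to on-disk guid, and the `"import"` state value are hypothetical stand-ins.

```python
# Simplified model of the vdev lookup described in r205346 (not ZFS code).
SPA_LOAD_NONE = "none"  # stand-in for the state seen during pool creation


def open_vdev(devices, path, guid, load_state):
    """Return the path of the device to open, or None if nothing matches.

    devices maps a device path to the guid found in its label
    (None when the device carries no label yet).
    """
    if load_state == SPA_LOAD_NONE:
        # Pool creation: the label has no proper guid yet, so open by path only.
        return path if path in devices else None

    # Import, step 1: the remembered path must also carry the expected guid.
    if path in devices and devices[path] == guid:
        return path

    # Import, step 2: ignore the path and search for a matching guid.
    for candidate, found_guid in devices.items():
        if found_guid == guid:
            return candidate

    # No fallback to path-only on import, so the wrong vdev is never opened.
    return None


# Hypothetical example: the disks were renumbered since the pool was created.
devices = {"/dev/ada0": 111, "/dev/ada1": 222}
print(open_vdev(devices, "/dev/ada0", 222, "import"))
```

In this model the path-only fallback is reachable only during creation, which matches the entry's point that import no longer opens a vdev whose guid does not match.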