freebsd-dev

Author	SHA1	Message	Date
Konstantin Belousov	f82360acf2	Existing VOP_VPTOCNP() interface has a fatal flaw that is critical for nullfs. The problem is that resulting vnode is only required to be held on return from the successful call to vop, instead of being referenced. Nullfs VOP_INACTIVE() method reclaims the vnode, which in combination with the VOP_VPTOCNP() interface means that the directory vnode returned from VOP_VPTOCNP() is reclaimed in advance, causing vn_fullpath() to error with EBADF or the like. Change the interface for VOP_VPTOCNP(), now the dvp must be referenced. Convert all in-tree implementations of VOP_VPTOCNP(), which is trivial, because vhold(9) and vref(9) are similar in the locking prerequisites. Out-of-tree fs implementation of VOP_VPTOCNP(), if any, should have no trouble with the fix. Tested by: pho Reviewed by: mckusick MFC after: 3 weeks (subject of re approval)	2011-11-19 07:50:49 +00:00
Pawel Jakub Dawidek	df663c3dd3	Correct typo in comment. Reported by: Fabian Keil <fk@fabiankeil.de> MFC after: 3 days	2011-11-05 16:44:25 +00:00
Pawel Jakub Dawidek	98dd1c40c4	In zvol_open() if the spa_namespace_lock is already held, it means that ZFS is trying to open and taste ZVOL as its VDEV. This is not supported, so return an error instead of panicking on spa_namespace_lock recursion. Reported by: Robert Millan <rmh@debian.org> PR: kern/162008 MFC after: 3 days	2011-11-05 16:29:03 +00:00
Martin Matuska	e1d4b72a2e	Fix typo in copyright notice introduced in r226724 (missing character in e-mail address) Reported by: pjd MFC after: 3 days	2011-10-25 13:52:38 +00:00
Martin Matuska	571e19b341	Update copyright information in several ZFS files, as the clause 3.3 of the CDDL licence explicitly requires every Contributor to add a copyright notice. This also reflects the copyright notices for the changes recently added by Illumos. MFC after: 3 days	2011-10-25 08:35:30 +00:00
Pawel Jakub Dawidek	9782a86c85	- Use better naming now that we allow to rename any mounted file system (not only legacy). - Update copyright to include myself. MFC after: 2 weeks	2011-10-24 21:31:53 +00:00
Pawel Jakub Dawidek	649bbd1cd0	Don't forget to rename mounted snapshots of the file system being renamed. MFC after: 2 weeks	2011-10-24 20:41:31 +00:00
Pawel Jakub Dawidek	27fbc05657	Include <sys/zfs_vfsops.h> only when compiling kernel module. MFC after: 2 weeks	2011-10-24 05:26:40 +00:00
Pawel Jakub Dawidek	497b7ef946	Allow to rename file systems without remounting if it is possible. It is possible for file systems with 'mountpoint' property set to 'legacy' or 'none' - we don't have to change mount directory for them. Currently such file systems are unmounted on rename and not even mounted back. This introduces layering violation, as we need to update 'f_mntfromname' field in statfs structure related to mountpoint (for the dataset we are renaming and all its children). In my opinion it is worth it, as it allows updating FreeBSD in an even cleaner way - in ZFS-only configuration root file system is ZFS file system with 'mountpoint' property set to 'legacy'. If root dataset is named system/rootfs, we can snapshot it (system/rootfs@upgrade), clone it (system/oldrootfs), update FreeBSD and if it doesn't boot we can boot back from system/oldrootfs and rename it back to system/rootfs while it is mounted as /. Before it was not possible, because unmounting / was not possible. MFC after: 2 weeks	2011-10-24 00:38:09 +00:00
Pawel Jakub Dawidek	72b880fa83	Update per-thread I/O statistics collection in ZFS. This allows to see processes I/O activity in 'top -m io' output. PR kern/156218 Reported by: Marcus Reid <marcus@blazingdot.com> Patch by: avg MFC after: 3 days	2011-10-21 21:49:34 +00:00
Pawel Jakub Dawidek	b39ba076ec	zfs vdev_file_io_start: validate vdev before using vdev_tsd vdev_tsd can be NULL for certain vdev states. At least in userland testing with ztest. Submitted by: avg MFC after: 3 days	2011-10-21 14:00:48 +00:00
Martin Matuska	ceac02f8e6	Import fix for Illumos bug #1475 to reduce diff against upstream. Panic caused by this bug was already partially fixed by pjd@ in p4 CH 185940 and 185942. Reference: 1475 zfs spill block hold can access invalid spill blkptr https://www.illumos.org/issues/1475 Reviewed by: delphij Obtained from: Illumos (issue 1475, changeset 13469:b8e89e5c4167) MFC after: 1 week	2011-10-18 13:58:22 +00:00
Xin LI	4aadb12e0b	Fix a bug in sa_find_sizes() which could lead to panic: When calculating space needed for SA_BONUS buffers, hdrsize is always rounded up to next 8-aligned boundary. However, in two places the round up was done against sum of 'total' plus hdrsize. On the other hand, hdrsize increments by 4 each time, which means in certain conditions, we would end up returning with will_spill == 0 and (total + hdrsize) larger than full_space, leading to a failed assertion because it's invalid for dmu_set_bonus. Sponsored by: iXsystems, Inc. Reviewed by: mm MFC after: 3 days	2011-10-17 22:23:27 +00:00
Konstantin Belousov	3407fefef6	Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags. Document the changes to flags field to only require the page lock. Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced. Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)	2011-09-06 10:30:11 +00:00
Martin Matuska	82378711f9	Generalize ffs_pages_remove() into vn_pages_remove(). Remove mapped pages for all dataset vnodes in zfs_rezget() using new vn_pages_remove() to fix mmapped files changed by zfs rollback or zfs receive -F. PR: kern/160035, kern/156933 Reviewed by: kib, pjd Approved by: re (kib) MFC after: 1 week	2011-08-25 08:17:39 +00:00
Pawel Jakub Dawidek	4969b96e57	We need to unlock and destroy vnode attached to znode which we are freeing. Reviewed by: kib Approved by: re (bz) MFC after: 1 week	2011-08-24 22:07:38 +00:00
Martin Matuska	6e1f1d4690	zfs_ioctl.c: improve code readability in zfs_ioc_dataset_list_next() zvol.c: fix calling of dmu_objset_prefetch() in zvol_create_minors() by passing full instead of relative dataset name and prefetching all visible datasets to be processed later instead of just the pool name Reviewed by: pjd Approved by: re (kib) MFC after: 1 week	2011-08-13 21:35:22 +00:00
Martin Matuska	cc82ff1c96	Fix race between dmu_objset_prefetch() invoked from zfs_ioc_dataset_list_next() and dsl_dir_destroy_check() indirectly invoked from dmu_recv_existing_end() via dsl_dataset_destroy() by not prefetching temporary clones, as these count as always inconsistent. In addition, do not prefetch hidden datasets at all as we are not going to process these later. Filed as Illumos Bug #1346 PR: kern/157728 Tested by: Borja Marcos <borjam@sarenet.es>, mm Reviewed by: pjd Approved by: re (kib) MFC after: 1 week	2011-08-13 10:58:53 +00:00
Pawel Jakub Dawidek	7b1085ba55	Eliminate the zfsdev_state_lock entirely and replace it with the spa_namespace_lock. This fixes LOR between the spa_namespace_lock and spa_config lock. LOR can cause deadlock on vdevs removal/insertion. Reported by: gibbs, delphij Tested by: delphij Approved by: re (kib) MFC after: 1 week	2011-08-12 07:04:16 +00:00
Martin Matuska	d32cac295c	Fix panic in zfs_read() if IO_SYNC flag supplied by checking for zfsvfs->z_log before calling zil_commit(). [1] Do not call zfs_read() from zfs_getextattr() with the IO_SYNC flag. Submitted by: Alexander Zagrebin <alex@zagrebin.ru> [1] Reviewed by: pjd@ Approved by: re (kib) MFC after: 3 days	2011-08-02 11:28:33 +00:00
Martin Matuska	ad4887a72a	Fix integer overflow in txg_delay() by initializing the variable "timeout" as clock_t. Filed as Illumos Bug #1313 Reviewed by: avg Approved by: re (kib) MFC after: 3 days	2011-08-01 14:50:31 +00:00
Martin Matuska	4e1407c428	Fix serious bug in ZIL that can lead to pool corruption in the case of a held dataset during remount. Detailed description is available at: https://www.illumos.org/issues/883 illumos-gate revision: 13380:161b964a0e10 Reviewed by: pjd Approved by: re (kib) Obtained from: Illumos (Bug #883) MFC after: 3 days	2011-07-30 19:00:31 +00:00
Xin LI	101b7b5daa	Bring the code more in-line with OpenSolaris source to ease future port. Reviewed by: pjd, mm Approved by: re (kib)	2011-07-21 20:02:22 +00:00
Xin LI	b447d101fa	A different implementation of r224231 proposed by pjd@, which does not require change in the znode structure. Specifically, it queries rdev from the znode in the same sa_bulk_lookup already done in zfs_getattr(). Submitted by: pjd (with some revisions) Reviewed by: pjd, mm Approved by: re (kib)	2011-07-21 20:01:51 +00:00
Xin LI	b1ad061e42	Add a new field to in-core znode, z_rdev, to represent device nodes. PR: kern/159010 Reviewed by: mm@ Approved by: re (kib) MFC after: 2 weeks	2011-07-20 16:53:32 +00:00
Martin Matuska	1bc399c4b1	ZFS tries to allocate blocks evenly across all devices. This means when devices are imbalanced zfs will spend lots of CPU searching for space on devices which tend to be pretty full. It should instead fail quickly on the full devices and move onto devices which have more availability. New loader tunable: vfs.zfs.mg_alloc_failures (min = 8) Illumos-gate changeset: 13379:4df42cc92254 Obtained from: Illumos (Bug #1051) MFC after: 2 weeks	2011-07-18 08:29:49 +00:00
Martin Matuska	3ded43e7b7	Resurrect the ZFS "aclmode" property Change default of "aclmode" to "discard". Illumos-gate changeset: 13370:8c04143bd318 Obtained from: Illumos (Feature #742) MFC after: 2 weeks	2011-07-18 07:16:44 +00:00
Martin Matuska	fbfed0cda6	Add a new "REFCOMPRESSRATIO" property. For snapshots, this is the same as COMPRESSRATIO, but for filesystems/volumes, the COMPRESSRATIO is based on the data "USED" (ie, includes blocks in children, but not blocks shared with the origin). This is needed to figure out how much space a filesystem would use if it were not compressed (ignoring snapshots). Illumos-gate revision: 13387 Obtained from: Illumos (Feature #1092) MFC after: 2 weeks	2011-06-28 07:52:01 +00:00
Martin Matuska	85a418012f	Disable vdev cache (readahead) by default. The vdev cache is very underutilized (hit ratio 30%-70%) and may consume excessive memory on systems with many vdevs. Illumos-gate revision: 13346 Obtained from: Illumos (Bug #175) MFC after: 1 week	2011-06-28 06:32:35 +00:00
Justin T. Gibbs	1c3bf59584	Remove C constructs that are incompatible with C++ from various OpenSolaris and ZFS header files. These changes are sufficient to allow a C++ program to use the libzfs library. Note: The majority of these files already included 'extern "C"' declarations, so the intention of providing C++ compatibility already existed even if it wasn't provided. cddl/compat/opensolaris/include/assert.h: Wrap our compatibility assert implementation in 'extern "C"'. Since this is a compatibility header I matched the Solaris style of doing this explicitly rather than rely on FreeBSD's __BEGIN/END_DECLS macro. sys/cddl/compat/opensolaris/sys/kstat.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/ddt.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa.h: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h: Rename parameters in function declarations that conflict with C++ keywords. This was the solution preferred by members of the Illumos community. sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_ioctl.h: In C, nested structures are visible in the global namespace, but in C++, they take on the namespace of the structure in which they are contained. Flatten nested structure definitions within struct zfs_cmd so these structures are visible in the global namespace when compiled in both languages. Sponsored by: Spectra Logic Corporation	2011-06-10 20:10:30 +00:00
Martin Matuska	baa256da8c	Silence notice on pool creation, import and access. Suggested by: Jeremy Chadwick (freebsd-stable@) Discussed with: pjd MFC after: 1 week	2011-06-07 20:46:31 +00:00
Pawel Jakub Dawidek	b5a060dd8b	Don't pass pointer to name buffer which is on the stack to another thread, because the stack might be paged out once the other thread tries to use the data. Instead, just allocate memory. MFC after: 2 weeks	2011-05-24 20:10:12 +00:00
Pawel Jakub Dawidek	541c60d988	Don't access task structure once we call task function. The task structure might be no longer available. This also eliminates the need for two tasks in the zio structure. Submitted by: anonymous MFC after: 2 weeks	2011-05-24 20:07:15 +00:00
Rick Macklem	965e561750	Fix the zfs file system so that it uses the lock flags argument added to VFS_FHTOVP() by r222167. Reviewed by: pjd	2011-05-22 21:04:32 +00:00
Rick Macklem	694a586a43	Add a lock flags argument to the VFS_FHTOVP() file system method, so that callers can indicate the minimum vnode locking requirement. This will allow some file systems to choose to return a LK_SHARED locked vnode when LK_SHARED is specified for the flags argument. This patch only adds the flag. It does not change any file system to use it and all callers specify LK_EXCLUSIVE, so file system semantics are not changed. Reviewed by: kib	2011-05-22 01:07:54 +00:00
Martin Matuska	a5c44f92bf	Restore old (v15) behaviour for a recursive snapshot destroy. (zfs destroy -r pool/dataset@snapshot) To destroy all descendent snapshots with the same name the top level snapshot was not required to exist. So if the top level snapshot does not exist, check permissions of the parent dataset instead. Filed as Illumos Bug #1043 Reviewed by: delphij Approved by: pjd MFC after: together with v28	2011-05-18 07:37:02 +00:00
Marius Strobl	edd870e447	Convert the last use of xcopyout() to ddi_copyout() and remove the now unused xcopyin() as well as xcopyout(). MFC together with r219089. Approved by: mm	2011-05-03 20:13:27 +00:00
Martin Matuska	29bf94b8d8	Fix deduplicated zfs receive (dmu_recv_stream builds incomplete guid_to_ds_map) Illumos-gate changeset: 13329:c48b8bf84ab7 MFC together with v28 Approved by: pjd Obtained from: Illumos (Bug #755)	2011-04-30 14:52:49 +00:00
Pawel Jakub Dawidek	65612637e8	Checking file access on size change is bogus. The checks are done earlier by VFS where we know if this is truncate(2) or ftruncate(2). If this is the latter we should depend on the mode the file was opened and not on the current permission. PR: standards/154873 Reported by: Mark Martinec <Mark.Martinec@ijs.si> Discussed with: Eric Schrock <eric.schrock@delphix.com> Discussed with: Mark Maybee <Mark.Maybee@Oracle.COM> MFC after: 1 month	2011-03-24 20:28:09 +00:00
Pawel Jakub Dawidek	d7d23301ae	Fix potential panic in dbuf_sync_list() related to spill blocks handling. Obtained from: IllumOS MFC after: 1 month	2011-03-14 11:07:12 +00:00
Pawel Jakub Dawidek	cae905e5d0	Correct readdir over ZFS handling. Reported by: Pierre Beyssac <pb@fasterix.frmug.org> MFC after: 1 month	2011-03-08 18:39:41 +00:00
Pawel Jakub Dawidek	a96e8e86f0	Fix libzpool build. MFC after: 1 month	2011-03-06 01:22:14 +00:00
Pawel Jakub Dawidek	2348f1110e	Make renaming of a ZVOL, ZVOL's parent directory and ZVOL snapshot work. Reported by: avg MFC after: 1 month	2011-03-05 22:31:03 +00:00
Pawel Jakub Dawidek	5bf0660559	Simplify zvol_remove_minors() a bit. MFC after: 1 month	2011-03-05 22:24:31 +00:00
Pawel Jakub Dawidek	10b9d77bf1	Finally... Import the latest open-source ZFS version - (SPA) 28. Few new things available from now on: - Data deduplication. - Triple parity RAIDZ (RAIDZ3). - zfs diff. - zpool split. - Snapshot holds. - zpool import -F. Allows to rewind corrupted pool to earlier transaction group. - Possibility to import pool in read-only mode. MFC after: 1 month	2011-02-27 19:41:40 +00:00
Konstantin Belousov	ca67168159	For UIO_NOCOPY case of reading request on zfs vnode, which has vm object attached, activate the page after the successful read, and free the page if read was unsuccessful. Freshly allocated page is not on any queue yet, and not activating (or deactivating) the page leaves it on no queue, excluding the page from pagedaemon scans and making the memory disappear until the vnode is reclaimed. Reviewed by: avg MFC after: 1 week	2011-02-11 10:46:15 +00:00
Edward Tomasz Napierala	dc7a965673	Make it impossible to clear the MNT_NFS4ACLS flag on ZFS filesystem by using "mount -uw". Reviewed by: pjd MFC after: 2 weeks	2011-02-06 23:34:09 +00:00
Andrey V. Elsukov	459d0e830d	vdev's sectorsize should not be greater than 8 Kbytes and also it should be power of 2. This prevents non-aligned access while probing vdev's labels. PR: kern/147852 Reviewed by: pjd MFC after: 1 week	2011-02-04 15:22:56 +00:00
Edward Tomasz Napierala	7a93bf9a69	Add MNT_NFS4ACLS to ZFS mount flags. It's not conditional, since there is no way to disable NFSv4 ACLs in ZFS. This should make it easier for the NFS server to figure out whether the exported filesystem supports ACLs or not. Reviewed by: pjd MFC after: 2 weeks	2011-01-19 17:11:52 +00:00
Matthew D Fleming	e704482d43	Re-commit the zfs sysctl(9) type-safety changes. Thanks to dim and pjd for the pointer to zfs_context.h for building userland.	2011-01-13 18:20:19 +00:00
Matthew D Fleming	374a993a88	Revert cddl changes for sysctl(9) until I understand why this isn't building on universe.	2011-01-12 23:06:38 +00:00
Matthew D Fleming	4a2ce5903f	sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly. Commit the zfs piece.	2011-01-12 19:53:30 +00:00
Martin Matuska	df06a59a77	MFp4 r186485, r186859: Fix a race by defining two tasks in the zio structure as we can still be returning from issue task when interrupt task is used. Tested by: pjd Approved by: pjd, delphij (mentor) MFC after: 3 days	2011-01-03 12:57:07 +00:00
Pawel Jakub Dawidek	8735863465	Remove redundant semicolon and empty line.	2010-12-11 13:35:25 +00:00
Ivan Voras	d7ccd95be8	Undo r216230: the interaction between saved ashift in metadata and detected ashift does not support this. With this change, pools created while stripesize=512 could not be imported when stripesize becomes larger (on the same drive). Noticed by: pjd	2010-12-07 15:24:08 +00:00
Ivan Voras	8b08562112	Use GEOM stripesize field when calculating ashift. This will enable correct alignment on drives with large sector sizes (e.g. 4 KiB) but the implementation might need to be revisited if devices with large stripesizes appear (e.g. if RAID controllers or flash drives start using the field), probably by introducing a physsectorsize field in GEOM providers. Discussed with: mav, mostly silence on freebsd-geom@ and freebsd-fs@	2010-12-06 12:18:02 +00:00
Andriy Gapon	c59690f249	zfs+sendfile: populate all requested pages, not just those already cached kern_sendfile() uses vm_rdwr() to read-ahead blocks of data to populate page cache. When sendfile stumbles upon a page that is not populated yet, it sends out all the mbufs that it collected so far. This resulted in very poor performance with ZFS when file data is not in the page cache, because ZFS vop_read for UIO_NOCOPY case populated only those pages that are already in cache, but not valid. Which means that most of the time it populated only the first requested page in the described above scenario. Reported by: Alexander Zagrebin <alexz@visp.ru> Tested by: Alexander Zagrebin <alexz@visp.ru>, Artemiev Igor <ai@kliksys.ru> MFC after: 12 days	2010-11-16 15:53:44 +00:00
Andriy Gapon	f9e2e99d5d	fix misspelling in a comment Reported by: Daniel Braniss <danny@cs.huji.ac.il> MFC after: 3 days	2010-11-16 12:30:47 +00:00
Martin Matuska	8db47aa15e	Disable VFS_HOLD placed on mnt_vnodecovered during the mount of a snapshot and VFS_RELE on a non-existing hold on snapshot parent's z_vfs. This disables the changes from OpenSolaris onnv-revision 9234:bffdc4fc05c4 (bug IDs: 6792139, 6794830) - not applicable to FreeBSD. This fixes the process hang if umounting a manually mounted snapshot. Reported by: Alexander Zagrebin <alexz@visp.ru> Approved by: delphij (mentor) MFC after: 1 week	2010-11-13 21:09:18 +00:00
Xin LI	b97a9057c2	Validate whether the zfs_cmd_t submitted from userland is not smaller than what we have. Without the check the kernel could access memory that does not belong to the request struct. Note that we do not test if the struct equals in size at this time, which may facilitate forward compatibility with newer binaries. Reviewed by: pjd at MeetBSD CA '2010 MFC after: 1 week	2010-11-05 22:18:09 +00:00
Martin Matuska	e25376bdd0	Bugfix merge from OpenSolaris: OpenSolaris onnv-revision: 10209:91f47f0e7728 6830541 zfs_get_data_trips on a verify 6696242 multiple zfs_fillpage() zfs: accessing past end of object panics 6785914 zfs fails to drop dn_struct_rwlock in recovery code path Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6830541, 6696242, 6785914) MFC after: 2 weeks	2010-10-26 15:48:03 +00:00
Andriy Gapon	23a1bcf8c6	zfs: add vop_getpages method implementation This should make vnode_pager_getpages path a bit shorter and clearer. Also this should eliminate problems with partially valid pages. Having this method opens room for future optimizations. To do: try to satisfy other pages besides the required one taking into account tradeoffs between number of page faults, read throughput and read latency. Also, eventually vop_putpages should be added too. Reviewed by: kib, mm, pjd MFC after: 3 weeks	2010-10-16 20:43:05 +00:00
Rui Paulo	6e634bb80f	In zfs_post_common(), use %d instead of %hhu. Found with: clang	2010-10-13 17:12:23 +00:00
Andriy Gapon	f6bb41924c	zfs + sendfile: do not produce partially valid pages for vnode's tail Since r212650 and before this change sendfile(2) could produce a partially valid page for a trailing portion of a ZFS vnode. vm_fault() always wants to see a fully valid page even if it's the last page that partially extends beyond vnode's end. Otherwise it calls vop_getpages() to bring in the page. In the case of ZFS this means that the data is read from the page into the same page and this breaks checks in ZFS mappedread() - a thread that set VPO_BUSY on the page in vm_fault() will get blocked forever waiting for it to be cleared. Many thanks to Kai and Jeremy for reproducing the issue and providing important debugging information and help. Reported by: Kai Gallasch <gallasch@free.de>, Jeremy Chadwick <freebsd@jdc.parodius.com> Tested by: Kai Gallasch <gallasch@free.de>, Jeremy Chadwick <freebsd@jdc.parodius.com> Reviewed by: kib MFC after: 3 days To-Do: apply the same treatment to tmpfs + sendfile	2010-10-12 17:04:21 +00:00
Pawel Jakub Dawidek	19ebc67beb	Provide internal ioflags() function that converts ioflag provided by FreeBSD's VFS to OpenSolaris-specific ioflag expected by ZFS. Use it for read and write operations. Reviewed by: mm MFC after: 1 week	2010-10-10 20:49:33 +00:00
Martin Matuska	a362d75576	Change FAPPEND to IO_APPEND as this is a ioflag and not a fflag. This corrects writing to append-only files on ZFS. PR: kern/149495 [1], kern/151082 [2] Submitted by: Daniel Zhelev <daniel@zhelev.biz> [1], Michael Naef <cal@linu.gs> [2] Approved by: delphij (mentor) MFC after: 1 week	2010-10-08 23:01:38 +00:00
Martin Matuska	aa007a9f0e	Properly handle IO with B_FAILFAST Retry IO once with ZIO_FLAG_TRYHARD before declaring a pool faulted OpenSolaris revision and Bug IDs: 9725:0bf7402e8022 6843014 ZFS B_FAILFAST handling is broken Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6843014) MFC after: 3 weeks	2010-09-27 09:42:31 +00:00
Martin Matuska	96a1a6a568	Enable offlining of log devices. OpenSolaris revision and Bug IDs: 9701:cc5b64682e64 6803605 should be able to offline log devices 6726045 vdev_deflate_ratio is not set when offlining a log device 6599442 zpool import has faults in the display Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6803605, 6726045, 6599442) MFC after: 3 weeks	2010-09-27 09:05:51 +00:00
Andriy Gapon	68653c3bd6	zfs_map_page/zfs_unmap_page: do not use sched_pin() and SFB_CPUPRIVATE zfs_map_page/zfs_unmap_page are mostly called around potential I/O paths and it seems to be a not very good idea to do cpu pinning there. Suggested by: kib MFC after: 2 weeks	2010-09-21 05:58:45 +00:00
Andriy Gapon	ff5e15a487	zfs_vnops: use zfs_map_page/zfs_unmap_page helper functions in another place MFC after: 2 weeks	2010-09-21 05:54:36 +00:00
Andriy Gapon	9d5eb9aa5d	zfs arc_reclaim_needed: fix typo in mismerge in r212780 PR: kern/146410, kern/138790 MFC after: 3 weeks X-MFC with: r212780	2010-09-17 07:34:50 +00:00
Andriy Gapon	921d3fd122	zfs+sendfile: advance uio_offset upon reading as well Picked from analogous code in tmpfs. MFC after: 1 week	2010-09-17 07:20:20 +00:00
Andriy Gapon	44532bc5cd	zfs arc_reclaim_needed: remove redundant checks for arc_c_max and arc_c_max Those checks are not present in upstream code and they are enforced in actual calculations of delta by which ARC size can be grown or should be reduced. MFC after: 3 weeks	2010-09-17 07:17:38 +00:00
Andriy Gapon	7c1353491f	zfs arc_reclaim_needed: more reasonable threshold for available pages vm_paging_target() is not a trigger of any kind for pagedaemon, but rather a "soft" target for it when it's already triggered. Thus, trying to keep 2048 pages above that level at the expense of ARC was simply driving ARC size into the ground even with normal memory loads. Instead, use a threshold at which a pagedaemon scan is triggered, so that ARC reclaiming helps with pagedaemon's task, but the latter still recycles active and inactive pages. PR: kern/146410, kern/138790 MFC after: 3 weeks	2010-09-17 07:14:07 +00:00
Martin Matuska	d1ee63f836	Fix kernel panic when moving a file to .zfs/shares Fix possible loss of correct error return code in ZFS mount OpenSolaris revisions and Bug IDs: 11824:53128e5db7cf 6863610 ZFS mount can lose correct error return 12079:13822b941977 6939941 problem with moving files in zfs (142901-12) Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6863610, 6939941) MFC after: 3 days	2010-09-15 19:55:26 +00:00
Andriy Gapon	8a3883cfb7	zfs vn_has_cached_data: take into account v_object->cache != NULL This mirrors code in tmpfs. This change shouldn't affect much read path, it may cause unnecessary vm_page_lookup calls in the case where v_object has no active or inactive pages but has some cache pages. I believe this situation to be non-essential. In write path this change should allow us to properly detect the above case and free a cache page when we write to a range that corresponds to it. If this situation is undetected then we could have a discrepancy between data in page cache and in ARC or on disk. This change allows us to re-enable vn_has_cached_data() check in zfs_write. NOTE: strictly speaking resident_page_count and cache fields of v_object should be examined under VM_OBJECT_LOCK, but for this particular usage we may get away with it. Discussed with: alc, kib Approved by: pjd Tested with: tools/regression/fsx MFC after: 3 weeks	2010-09-15 11:05:41 +00:00
Andriy Gapon	0b1ca38a69	zfs mappedread, update_pages: use int for offset and length within a page uint64_t, int64_t were redundant there Approved by: pjd Tested by: tools/regression/fsx MFC after: 2 weeks	2010-09-15 10:48:16 +00:00
Andriy Gapon	c002c3e8c2	zfs mappedread: use uiomove_fromphys where possible Reviewed by: alc Approved by: pjd Tested by: tools/regression/fsx MFC after: 2 weeks	2010-09-15 10:44:20 +00:00
Andriy Gapon	fbbdb19dcd	zfs: catch up with vm_page_sleep_if_busy changes Reviewed by: alc Approved by: pjd Tested by: tools/regression/fsx MFC after: 2 weeks	2010-09-15 10:39:21 +00:00
Andriy Gapon	21bd3e2576	tmpfs, zfs + sendfile: mark page bits as valid after populating it with data Otherwise, adding insult to injury, in addition to double-caching of data we would always copy the data into a vnode's vm object page from backend. This is specific to sendfile case only (VOP_READ with UIO_NOCOPY). PR: kern/141305 Reported by: Wiktor Niesiobedzki <bsd@vink.pl> Reviewed by: alc Tested by: tools/regression/sockets/sendfile MFC after: 2 weeks	2010-09-15 10:31:27 +00:00
Martin Matuska	9a13d2e1b3	Remove duplicated VFS_HOLD due to a mismerge. PR: kern/150544 Approved by: delphij (mentor) MFC after: 1 day	2010-09-14 12:12:18 +00:00
Martin Matuska	4eeef2e44a	Add missing vop_vector zfsctl_ops_shares Add missing locks around VOP_READDIR and VOP_GETATTR with z_shares_dir PR: kern/150544 Approved by: delphij (mentor) Obtained from: perforce (pjd) MFC after: 1 day	2010-09-14 10:27:32 +00:00
Pawel Jakub Dawidek	3c907063e9	Remove the page queues lock around vm_page_undirty() - it is no longer needed. Reviewed by: alc	2010-09-13 19:47:09 +00:00
Matthew D Fleming	4d369413e1	Replace sbuf_overflowed() with sbuf_error(), which returns any error code associated with overflow or with the drain function. While this function is not expected to be used often, it produces more information in the form of an errno than sbuf_overflowed() did.	2010-09-10 16:42:16 +00:00
Pawel Jakub Dawidek	86b19d1861	On FreeBSD we can boot from pools that have multiple top-level vdevs or log vdevs, so don't deny adding new vdevs if bootfs property is set. MFC after: 2 weeks	2010-09-09 21:20:18 +00:00
Justin T. Gibbs	f03f7a0ca3	Correct bioq_disksort so that bioq_insert_tail() offers barrier semantic. Add the BIO_ORDERED flag for struct bio and update bio clients to use it. The barrier semantics of bioq_insert_tail() were broken in two ways: o In bioq_disksort(), an added bio could be inserted at the head of the queue, even when a barrier was present, if the sort key for the new entry was less than that of the last queued barrier bio. o The last_offset used to generate the sort key for newly queued bios did not stay at the position of the barrier until either the barrier was de-queued, or a new barrier (which updates last_offset) was queued. When a barrier is in effect, we know that the disk will pass through the barrier position just before the "blocked bios" are released, so using the barrier's offset for last_offset is the optimal choice. sys/geom/sched/subr_disk.c: sys/kern/subr_disk.c: o Update last_offset in bioq_insert_tail(). o Only update last_offset in bioq_remove() if the removed bio is at the head of the queue (typically due to a call via bioq_takefirst()) and no barrier is active. o In bioq_disksort(), if we have a barrier (insert_point is non-NULL), set prev to the barrier and cur to its next element. Now that last_offset is kept at the barrier position, this change isn't strictly necessary, but since we have to take a decision branch anyway, it does avoid one, no-op, loop iteration in the while loop that immediately follows. o In bioq_disksort(), bypass the normal sort for bios with the BIO_ORDERED attribute and instead insert them into the queue with bioq_insert_tail(). bioq_insert_tail() not only gives the desired command order during insertion, but also provides barrier semantics so that commands disksorted in the future cannot pass the just enqueued transaction. sys/sys/bio.h: Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio. sys/cam/ata/ata_da.c: sys/cam/scsi/scsi_da.c Use an ordered command for SCSI/ATA-NCQ commands issued in response to bios with the BIO_ORDERED flag set. sys/cam/scsi/scsi_da.c Use an ordered tag when issuing a synchronize cache command. Wrap some lines to 80 columns. sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c sys/geom/geom_io.c Mark bios with the BIO_FLUSH command as BIO_ORDERED. Sponsored by: Spectra Logic Corporation MFC after: 1 month	2010-09-02 19:40:28 +00:00
Jaakko Heinonen	de478dd4b4	execve(2) has a special check for file permissions: a file must have at least one execute bit set, otherwise execve(2) will return EACCES even for a user with PRIV_VFS_EXEC privilege. Add the check also to vaccess(9), vaccess_acl_nfs4(9) and vaccess_acl_posix1e(9). This makes access(2) agree better with execve(2). Because ZFS doesn't use vaccess(9) for VEXEC, add the check to zfs_freebsd_access() too. There may be other file systems which are not using vaccess*() functions and need to be handled separately. PR: kern/125009 Reviewed by: bde, trasz Approved by: pjd (ZFS part)	2010-08-30 16:30:18 +00:00
Pawel Jakub Dawidek	b8a4becc2d	Return NULL pointer instead of B_FALSE as it is done in the vendor code. Obtained from: //depot/user/pjd/zfs/...	2010-08-28 19:29:06 +00:00
Martin Matuska	8d87b396f8	Import changes from OpenSolaris that provide - better ACL caching and speedup of ACL permission checks - faster handling of stat() - lowered mutex contention in the read/writer lock (rrwlock) - several related bugfixes Detailed information (OpenSolaris onnv changesets and Bug IDs): 9749:105f407a2680 6802734 Support for Access Based Enumeration (not used on FreeBSD) 6844861 inconsistent xattr readdir behavior with too-small buffer 9866:ddc5f1d8eb4e 6848431 zfs with rstchown=0 or file_chown_self privilege allows user to "take" ownership 9981:b4907297e740 6775100 stat() performance on files on zfs should be improved 6827779 rrwlock is overly protective of its counters 10143:d2d432dfe597 6857433 memory leaks found at: zfs_acl_alloc/zfs_acl_node_alloc 6860318 truncate() on zfsroot succeeds when file has a component of its path set without access permission 10232:f37b85f7e03e 6865875 zfs sometimes incorrectly giving search access to a dir 10250:b179ceb34b62 6867395 zpool_upgrade_007_pos testcase panic'd with BAD TRAP: type=e (#pf Page fault) 10269:2788675568fd 6868276 zfs_rezget() can be hazardous when znode has a cached ACL 10295:f7a18a1e9610 6870564 panic in zfs_getsecattr Approved by: delphij (mentor) Obtained from: OpenSolaris (multiple Bug IDs) MFC after: 2 weeks	2010-08-28 09:24:11 +00:00
Martin Matuska	abe5837f7c	Update ZFS metaslab code from OpenSolaris. This provides a noticeable write speedup, especially on pools with less than 30% of free space. Detailed information (OpenSolaris onnv changesets and Bug IDs): 11146:7e58f40bcb1c 6826241 Sync write IOPS drops dramatically during TXG sync 6869229 zfs should switch to shiny new metaslabs more frequently 11728:59fdb3b856f6 6918420 zdb -m has issues printing metaslab statistics 12047:7c1fcc8419ca 6917066 zfs block picking can be improved Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6826241, 6869229, 6918420, 6917066) MFC after: 2 weeks	2010-08-28 08:59:55 +00:00
Pawel Jakub Dawidek	4e52cdd0f7	Use ZFS_CTLDIR_NAME instead of hardcoding ".zfs".	2010-08-27 21:31:15 +00:00
Pawel Jakub Dawidek	8733ff6e11	Update comment now that I finally committed r211854. MFC after: 1 month	2010-08-26 23:44:32 +00:00
Andriy Gapon	694a0a8717	zfs arc_reclaim_thread: no need to call arc_reclaim_needed when resetting needfree needfree is checked at the very start of arc_reclaim_needed. This change makes code easier to follow and maintain in face of potential changes in arc_reclaim_needed. Also, put the whole sub-block under _KERNEL because needfree can be set only in kernel code. To do: rename needfree to something else to avoid confusion with OpenSolaris global variable of the same name which is used in the same code, but has different meaning (page deficit). Note: I have an impression that locking around accesses to this variable as well as mutual notifications between arc_reclaim_thread and arc_lowmem are not proper. MFC after: 1 week	2010-08-24 17:48:22 +00:00
Pawel Jakub Dawidek	8dc7024be4	In FreeBSD we use 'jailed' property. MFC after: 2 weeks	2010-08-07 10:23:54 +00:00
Martin Matuska	f4e7a6c3f1	Import two changesets from OpenSolaris to make future updates easier. The changes do not affect FreeBSD code because zfs_znode_move(), cleanlocks() and cleanshares() are not used. OpenSolaris onnv changeset: 9788:f660bc44f2e8, 9909:aa280f585a3e Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6843700, 6790232) MFC after: 7 weeks	2010-07-25 15:17:24 +00:00
Martin Matuska	34f56898a1	Consider snapshots as descendants via zfs allow -d OpenSolaris onnv changeset: 9847:2f3ba86e857a Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6809340) MFC after: 1 week	2010-07-24 22:28:29 +00:00
Andriy Gapon	a85d8d8acc	zfs arc_memory_throttle: available memory is free + cache OpenSolaris freemem has the same meaning as our v_free_count + v_cache_count. Obtained from: Artem Belevich <fbsdlist@src.cx>, Peter Jeremy <peterjeremy@acm.org> Discussed with: pjd MFC after: 2 weeks	2010-07-23 17:44:01 +00:00
Martin Matuska	2bacd082bd	Enable fake resolving of SMB RIDs by using nulldomain and UID_NOBODY - fixes panics when Solaris/OpenSolaris pools that contain files uploaded with the SMB protocol are accessed Enable setting/unsetting the sharesmb property (dummy action) - allows users who import pools from Solaris/OpenSolaris to unset the sharesmb property and get rid of annoying messages PR: kern/145778, kern/148709 Approved by: pjd, delphij (mentor) MFC after: 7 weeks	2010-07-22 23:30:24 +00:00
Martin Matuska	f926b455e7	To improve latency, lower default vfs.zfs.vdev.max_pending from 35 to 10 OpenSolaris onnv changeset (partial): 10801:e0bf032e8673 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6891731) MFC after: 1 week	2010-07-20 05:22:14 +00:00
Nathan Whitehorn	04bcbbf81e	Increase stack size for ZFS sync thread. This is required to make ZFS function on 64-bit PowerPC. Reviewed by: pjd Obtained from: OpenSolaris changeset 14653:7cf402a7f374	2010-07-17 13:31:27 +00:00
John Baldwin	61e1c19319	Revert the previous commit. The race is not applicable to the lockmgr implementation in 8.0 and later as its flags field does not hold dynamic state such as waiters flags, but is only modified in lockinit() aside from VN_LOCK_*(). Discussed with: attilio	2010-07-16 19:52:03 +00:00
John Baldwin	dbfcf8cfea	When the MNTK_EXTENDED_SHARED mount option was added, some filesystems were changed to defer the setting of VN_LOCK_ASHARE() (which clears LK_NOSHARE in the vnode lock's flags) until after they had determined if the vnode was a FIFO. This occurs after the vnode has been inserted a VFS hash or some similar table, so it is possible for another thread to find this vnode via vget() on an i-node number and block on the vnode lock. If the lockmgr interlock (vnode interlock for vnode locks) is not held when clearing the LK_NOSHARE flag, then the lk_flags field can be clobbered. As a result the thread blocked on the vnode lock may never get woken up. Fix this by holding the vnode interlock while modifying the lock flags in this case. MFC after: 3 days	2010-07-16 19:20:20 +00:00
Martin Matuska	8fc257994d	Merge ZFS version 15 and almost all OpenSolaris bugfixes referenced in Solaris 10 updates 141445-09 and 142901-14. Detailed information: (OpenSolaris revisions and Bug IDs, Solaris 10 patch numbers) 7844:effed23820ae 6755435 zfs_open() and zfs_close() needs to use ZFS_ENTER/ZFS_VERIFY_ZP (141445-01) 7897:e520d8258820 6748436 inconsistent zpool.cache in boot_archive could panic a zfs root filesystem upon boot-up (141445-01) 7965:b795da521357 6740164 zpool attach can create an illegal root pool (141909-02) 8084:b811cc60d650 6769612 zpool_import() will continue to write to cachefile even if altroot is set (N/A) 8121:7fd09d4ebd9c 6757430 want an option for zdb to disable space map loading and leak tracking (141445-01) 8129:e4f45a0bfbb0 6542860 ASSERT: reason != VDEV_LABEL_REMOVE||vdev_inuse(vd, crtxg, reason, 0) (141445-01) 8188:fd00c0a81e80 6761100 want zdb option to select older uberblocks (141445-01) 8190:6eeea43ced42 6774886 zfs_setattr() won't allow ndmp to restore SUNWattr_rw (141445-01) 8225:59a9961c2aeb 6737463 panic while trying to write out config file if root pool import fails (141445-01) 8227:f7d7be9b1f56 6765294 Refactor replay (141445-01) 8228:51e9ca9ee3a5 6572357 libzfs should do more to avoid mnttab lookups (141909-01) 6572376 zfs_iter_filesystems and zfs_iter_snapshots get objset stats twice (141909-01) 8241:5a60f16123ba 6328632 zpool offline is a bit too conservative (141445-01) 6739487 ASSERT: txg <= spa_final_txg due to scrub/export race (141445-01) 6767129 ASSERT: cvd->vdev_isspare, in spa_vdev_detach() (141445-01) 6747698 checksum failures after offline -t / export / import / scrub (141445-01) 6745863 ZFS writes to disk after it has been offlined (141445-01) 6722540 50% slowdown on scrub/resilver with certain vdev configurations (141445-01) 6759999 resilver logic rewrites ditto blocks on both source and destination (141445-01) 6758107 I/O should never suspend during spa_load() (141445-01) 6776548 codereview(1) runs off 
the page when faced with multi-line comments (N/A) 6761406 AMD errata 91 workaround doesn't work on 64-bit systems (141445-01) 8242:e46e4b2f0a03 6770866 GRUB/ZFS should require physical path or devid, but not both (141445-01) 8269:03a7e9050cfd 6674216 "zfs share" doesn't work, but "zfs set sharenfs=on" does (141445-01) 6621164 $SRC/cmd/zfs/zfs_main.c seems to have a syntax error in the translation note (141445-01) 6635482 i18n problems in libzfs_dataset.c and zfs_main.c (141445-01) 6595194 "zfs get" VALUE column is as wide as NAME (141445-01) 6722991 vdev_disk.c: error checking for ddi_pathname_to_dev_t() must test for NODEV (141445-01) 6396518 ASSERT strings shouldn't be pre-processed (141445-01) 8274:846b39508aff 6713916 scrub/resilver needlessly decompress data (141445-01) 8343:655db2375fed 6739553 libzfs_status msgid table is out of sync (141445-01) 6784104 libzfs unfairly rejects numerical values greater than 2^63 (141445-01) 6784108 zfs_realloc() should not free original memory on failure (141445-01) 8525:e0e0e525d0f8 6788830 set large value to reservation cause core dump (141445-01) 6791064 want sysevents for ZFS scrub (141445-01) 6791066 need to be able to set cachefile on faulted pools (141445-01) 6791071 zpool_do_import() should not enable datasets on faulted pools (141445-01) 6792134 getting multiple properties on a faulted pool leads to confusion (141445-01) 8547:bcc7b46e5ff7 6792884 Vista clients cannot access .zfs (141445-01) 8632:36ef517870a3 6798384 It can take a village to raise a zio (141445-01) 8636:7e4ce9158df3 6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage() (141909-01) 6504953 zfs_getpage() misunderstands VOP_GETPAGE() interface (141909-01) 6702206 ZFS read/writer lock contention throttles sendfile() benchmark (141445-01) 6780491 Zone on a ZFS filesystem has poor fork/exec performance (141445-01) 6747596 assertion failed: DVA_EQUAL(BP_IDENTITY(&zio->io_bp_orig), BP_IDENTITY(zio->io_bp))); (141445-01) 8692:692d4668b40d 
6801507 ZFS read aggregation should not mind the gap (141445-01) 8697:e62d2612c14d 6633095 creating a filesystem with many properties set is slow (141445-01) 8768:dfecfdbb27ed 6775697 oracle crashes when overwriting after hitting quota on zfs (141909-01) 8811:f8deccf701cf 6790687 libzfs mnttab caching ignores external changes (141445-01) 6791101 memory leak from libzfs_mnttab_init (141445-01) 8845:91af0d9c0790 6800942 smb_session_create() incorrectly stores IP addresses (N/A) 6582163 Access Control List (ACL) for shares (141445-01) 6804954 smb_search - shortname field should be space padded following the NULL terminator (N/A) 6800184 Panic at smb_oplock_conflict+0x35() (N/A) 8876:59d2e67b4b65 6803822 Reboot after replacement of system disk in a ZFS mirror drops to grub> prompt (141445-01) 8924:5af812f84759 6789318 coredump when issue zdb -uuuu poolname/ (141445-01) 6790345 zdb -dddd -e poolname coredump (141445-01) 6797109 zdb: 'zdb -dddddd pool_name/fs_name inode' coredump if the file with inode was deleted (141445-01) 6797118 zdb: 'zdb -dddddd poolname inum' coredump if I miss the fs name (141445-01) 6803343 shareiscsi=on failed, iscsitgtd failed request to share (141445-01) 9030:243fd360d81f 6815893 hang mounting a dataset after booting into a new boot environment (141445-01) 9056:826e1858a846 6809691 'zpool create -f' no longer overwrites ufs infomation (141445-01) 9179:d8fbd96b79b3 6790064 zfs needs to determine uid and gid earlier in create process (141445-01) 9214:8d350e5d04aa 6604992 forced unmount + being in .zfs/snapshot/<snap1> = not happy (141909-01) 6810367 assertion failed: dvp->v_flag & VROOT, file: ../../common/fs/gfs.c, line: 426 (141909-01) 9229:e3f8b41e5db4 6807765 ztest_dsl_dataset_promote_busy needs to clean up after ENOSPC (141445-01) 9230:e4561e3eb1ef 6821169 offlining a device results in checksum errors (141445-01) 6821170 ZFS should not increment error stats for unavailable devices (141445-01) 6824006 need to increase issue and interrupt 
taskqs threads in zfs (141445-01) 9234:bffdc4fc05c4 6792139 recovering from a suspended pool needs some work (141445-01) 6794830 reboot command hangs on a failed zfs pool (141445-01) 9246:67c03c93c071 6824062 System panicked in zfs_mount due to NULL pointer dereference when running btts and svvs tests (141909-01) 9276:a8a7fc849933 6816124 System crash running zpool destroy on broken zpool (141445-03) 9355:09928982c591 6818183 zfs snapshot -r is slow due to set_snap_props() doing txg_wait_synced() for each new snapshot (141445-03) 9391:413d0661ef33 6710376 log device can show incorrect status when other parts of pool are degraded (141445-03) 9396:f41cf682d0d3 (part already merged) 6501037 want user/group quotas on ZFS (141445-03) 6827260 assertion failed in arc_read(): hdr == pbuf->b_hdr (141445-03) 6815592 panic: No such hold X on refcount Y from zfs_znode_move (141445-03) 6759986 zfs list shows temporary %clone when doing online zfs recv (141445-03) 9404:319573cd93f8 6774713 zfs ignores canmount=noauto when sharenfs property != off (141445-03) 9412:4aefd8704ce0 6717022 ZFS DMU needs zero-copy support (141445-03) 9425:e7ffacaec3a8 6799895 spa_add_spares() needs to be protected by config lock (141445-03) 6826466 want to post sysevents on hot spare activation (141445-03) 6826468 spa 'allowfaulted' needs some work (141445-03) 6826469 kernel support for storing vdev FRU information (141445-03) 6826470 skip posting checksum errors from DTL regions of leaf vdevs (141445-03) 6826471 I/O errors after device remove probe can confuse FMA (141445-03) 6826472 spares should enjoy some of the benefits of cache devices (141445-03) 9443:2a96d8478e95 6833711 gang leaders shouldn't have to be logical (141445-03) 9463:d0bd231c7518 6764124 want zdb to be able to checksum metadata blocks only (141445-03) 9465:8372081b8019 6830237 zfs panic in zfs_groupmember() (141445-03) 9466:1fdfd1fed9c4 6833162 phantom log device in zpool status (141445-03) 9469:4f68f041ddcd 6824968 add ZFS 
userquota support to rquotad (141445-03) 9470:6d827468d7b5 6834217 godfather I/O should reexecute (141445-03) 9480:fcff33da767f 6596237 Stop looking and start ganging (141909-02) 9493:9933d599bc93 6623978 lwb->lwb_buf != NULL, file ../../../uts/common/fs/zfs/zil.c, line 787, function zil_lwb_commit (141445-06) 9512:64cafcbcc337 6801810 Commit of aligned streaming rewrites to ZIL device causes unwanted disk reads (N/A) 9515:d3b739d9d043 6586537 async zio taskqs can block out userland commands (142901-09) 9554:787363635b6a 6836768 zfs_userspace() callback has no way to indicate failure (N/A) 9574:1eb6a6ab2c57 6838062 zfs panics when an error is encountered in space_map_load() (141909-02) 9583:b0696cd037cc 6794136 Panic BAD TRAP: type=e when importing degraded zraid pool. (141909-03) 9630:e25a03f552e0 6776104 "zfs import" deadlock between spa_unload() and spa_async_thread() (141445-06) 9653:a70048a304d1 6664765 Unable to remove files when using fat-zap and quota exceeded on ZFS filesystem (141445-06) 9688:127be1845343 6841321 zfs userspace / zfs get userused@ doesn't work on mounted snapshot (N/A) 6843069 zfs get userused@S-1-... 
doesn't work (N/A) 9873:8ddc892eca6e 6847229 assertion failed: refcount_count(&tx->tx_space_written) + delta <= tx->tx_space_towrite in dmu_tx.c (141445-06) 9904:d260bd3fd47c 6838344 kernel heap corruption detected on zil while stress testing (141445-06) 9951:a4895b3dd543 6844900 zfs_ioc_userspace_upgrade leaks (N/A) 10040:38b25aeeaf7a 6857012 zfs panics on zpool import (141445-06) 10000:241a51d8720c 6848242 zdb -e no longer works as expected (N/A) 10100:4a6965f6bef8 6856634 snv_117 not booting: zfs_parse_bootfs: error2 (141445-07) 10160:a45b03783d44 6861983 zfs should use new name <-> SID interfaces (N/A) 6862984 userquota commands can hang (141445-06) 10299:80845694147f 6696858 zfs receive of incremental replication stream can dereference NULL pointer and crash (N/A) 10302:a9e3d1987706 6696858 zfs receive of incremental replication stream can dereference NULL pointer and crash (fix lint) (N/A) 10575:2a8816c5173b (partial merge) 6882227 spa_async_remove() shouldn't do a full clear (142901-14) 10800:469478b180d9 6880764 fsync on zfs is broken if writes are greater than 32kb on a hard crash and no log attached (142901-09) 6793430 zdb -ivvvv assertion failure: bp->blk_cksum.zc_word[2] == dmu_objset_id(zilog->zl_os) (N/A) 10801:e0bf032e8673 (partial merge) 6822816 assertion failed: zap_remove_int(ds_next_clones_obj) returns ENOENT (142901-09) 10810:b6b161a6ae4a 6892298 buf->b_hdr->b_state != arc_anon, file: ../../common/fs/zfs/arc.c, line: 2849 (142901-09) 10890:499786962772 6807339 spurious checksum errors when replacing a vdev (142901-13) 11249:6c30f7dfc97b 6906110 bad trap panic in zil_replay_log_record (142901-13) 6906946 zfs replay isn't handling uid/gid correctly (142901-13) 11454:6e69bacc1a5a 6898245 suspended zpool should not cause rest of the zfs/zpool commands to hang (142901-10) 11546:42ea6be8961b (partial merge) 6833999 3-way deadlock in dsl_dataset_hold_ref() and dsl_sync_task_group_sync() (142901-09) Discussed with: pjd Approved by: delphij (mentor) 
Obtained from: OpenSolaris (multiple Bug IDs) MFC after: 2 months	2010-07-12 23:49:04 +00:00
Martin Matuska	d3cf8f4b68	Import latest ARC change from OpenSolaris: - large ghost eviction causes high write latency - arc_adjust might adjust MRU unnecessarily - arc_adapt can lead to wild arc_p adjustment OpenSolaris onnv-revision: 12636:13b5d698941e Submitted by: avg Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6950219, 6953403, 6951024) MFC after: 1 month	2010-06-17 22:47:44 +00:00
Pawel Jakub Dawidek	653e034db5	Turn off UMA allocations on all archs by default. It isn't stable even on amd64. Reported by: many MFC after: 3 days	2010-06-17 17:41:42 +00:00
Pawel Jakub Dawidek	fcc7888f82	Remove redundant assignment. MFC after: 3 days	2010-06-16 12:42:20 +00:00
Martin Matuska	bc5752e811	Fix arc_read_done may try to byteswap undefined data (sparc related) OpenSolaris onnv-revision: 10839:cf83b553a2ab Obtained from: OpenSolaris (Bug ID 6836714) Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-06-12 11:28:46 +00:00
Martin Matuska	726db0af89	Fix panic in zfs_getsecattr OpenSolaris onnv-revision: 10295:f7a18a1e9610 Obtained from: OpenSolaris (Bug ID 6870564) Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-06-12 11:27:10 +00:00
Martin Matuska	9dac494ce6	Fix possible zfs panic on zpool import OpenSolaris onnv-revision: 10040:38b25aeeaf7a Obtained from: OpenSolaris (Bug ID 6857012) Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-06-12 11:25:57 +00:00
Martin Matuska	16547ea20b	Fix zpool resilver stalls with spa_scrub_thread in a 3 way deadlock OpenSolaris onnv-revision: 9997:174d75a29a1c Obtained from: OpenSolaris (Bug ID 6843235) Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-06-12 11:24:10 +00:00
Martin Matuska	072b6fc60e	Fix ZFS panic deadlock: cycle in blocking chain via zfs_zget OpenSolaris onnv-revision: 9774:0bb234ab2287 Obtained from: OpenSolaris (Bug ID 6788152) Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-06-12 11:22:45 +00:00
Martin Matuska	1aa2ebdd23	Fix vdev_probe() starvation brings txg train to a screeching halt OpenSolaris onnv-revision: 9722:e3866bad4e96 Obtained from: OpenSolaris (Bug ID 6844069) Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-06-12 11:21:37 +00:00
Martin Matuska	a2d6c8d15b	Fix incomplete resilvering after disk replacement (raidz) OpenSolaris onnv-revision: 9434:3bebded7c76a Obtained from: OpenSolaris (Bug ID 6794570) Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-06-12 11:20:50 +00:00
Martin Matuska	b90308c521	Fix zfs destroy fails to free object in open context, stops up txg train OpenSolaris onnv-revision: 9409:9dc3f17354ed Obtained from: OpenSolaris (Bug ID 6809683) Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-06-12 11:19:51 +00:00
Martin Matuska	62c55b6d08	Fix unable to remove a file over NFS after hitting refquota limit OpenSolaris onnv-revision: 8890:8c2bd5f17bf2 Obtained from: OpenSolaris (Bug ID 6798878) Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-06-12 11:18:29 +00:00
Martin Matuska	711bf9bcf1	Fix freeing space after deleting large files with holes. OpenSolaris onnv revision: 9950:78fc41aa9bc5 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6792701) MFC after: 3 days	2010-06-03 11:08:46 +00:00
Martin Matuska	dc5d34e454	Fix ZIL close when doing zfs rollback or zfs receive on a mounted dataset. The fix is a partial import and merge of OpenSolaris onnv revisions 8227:f7d7be9b1f56. and 9292:e112194b5b73 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6798298) MFC after: 3 days	2010-06-01 08:43:46 +00:00
Pawel Jakub Dawidek	510ec358c5	Fix a bug where resilver is not started automatically on pool import or load. If a disk was missing on pool load or import and on the next pool load or import it was present, resilver wasn't started automatically and ZFS reported all disks as ONLINE and healthy. Then, when another disk died, the pool became inaccessible, because if it was a 2-way mirror or RAIDZ1 two vdevs were out of sync. To fix the problem, start resilver automatically on pool load or import. Obtained from: OpenSolaris MFC after: 3 days	2010-05-31 23:17:45 +00:00
Pawel Jakub Dawidek	b1c7417cd8	Fix panic when reading label from provider with non power of 2 sector size. Reported by: James R. Van Artsdalen <james-freebsd-fs2@jrv.org> MFC after: 3 days	2010-05-31 23:11:43 +00:00
Martin Matuska	dd85b12982	Remove kstat.zfs.arcstats.l2_write_bytes_written The arcstats.l2_write_bytes_written kstat counter introduced in r205231 was a duplicate of vendor's arcstats.l2_write_bytes counter imported in r208373 (OpenSolaris revision 8582:df9361868dbe) Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-05-23 21:16:34 +00:00
Martin Matuska	5b170d55ae	Fix zfs receive temporarily changing unchanged stream properties. Fix possible panic with zfs_enable_datasets. OpenSolaris onnv revision: 8536:33bd5de3260e Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6748561, 6757075) MFC after: 3 days	2010-05-23 21:02:43 +00:00
Pawel Jakub Dawidek	4e8c7af455	Create UMA zones unconditionally. MFC after: 3 days	2010-05-23 19:10:06 +00:00
Pawel Jakub Dawidek	a95add4cf8	Remove ZIO_USE_UMA from arc.c as well. MFC after: 3 days	2010-05-23 18:42:33 +00:00
Martin Matuska	55a381515b	Fix kernel panic when calling spa_tryimport() on a corrupted pool. OpenSolaris onnv revision: 8680:005fe27123ba Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6786321) MFC after: 1 day	2010-05-23 10:13:11 +00:00
Martin Matuska	e3fffd1a9f	Fix mutex_exit misorder that can cause a kernel panic. OpenSolaris onnv revision: 8667:5c308a17eb7c Approved by: delphij (mentor) Obtained from: OpenSolaris (Bug ID 6795440) MFC after: 1 day	2010-05-23 10:08:05 +00:00
Martin Matuska	7838815ebb	Update L2ARC code and fix several bugs. - improve ARC memory consumption (Bug ID 6488341) - ARC/L2ARC metadata accounting (Bug ID 6748019) - L2ARC turbo warmup (Bug ID 6748023) - kstats for ARC content (Bug ID 6748023) - kstats for evicted bytes from ARC by L2ARC state (Bug ID 6871680) - fix panic on i386 systems (Bug ID 6821260) OpenSolaris onnv revisions: 8582:df9361868dbe, 8628:97dcded6e556, 9215:7c4584f76b47, 9274:a10f8bd993c1, 10357:29060492b29d OpenSolaris Bug IDs: 6748019, 6748023, 6748030, 6488341, 6798268, 6821260, 6790261, 6871680 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (multiple bug IDs) MFC after: 3 days	2010-05-21 09:52:49 +00:00
Martin Matuska	370227d241	Reorder some already introduced locking variables. OpenSolaris onnv revision: 8214:d7abf7c1f1c1 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6747934) MFC after: 3 days	2010-05-21 09:35:28 +00:00
Martin Matuska	911e1f9b1d	Fix stack overflow in zfs send. OpenSolaris onnv-revision: 8012:8ea30813950f Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6765626) MFC after: 3 days	2010-05-21 08:55:18 +00:00
Martin Matuska	8b2bc083b9	Fix: vdev_reopen() can lead to failed allocations OpenSolaris onnv-revision: 7980:589f37f25048 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6764914) MFC after: 3 days	2010-05-21 08:50:34 +00:00
Pawel Jakub Dawidek	2b3d97b81d	Fix userland build by making io_task available only for the kernel and by providing taskq_dispatch_safe() macro. MFC after: 1 week	2010-05-16 19:44:08 +00:00
Pawel Jakub Dawidek	ed3c664257	Allow to configure UMA usage for ZIO data via loader and turn it on by default for amd64. On i386 I saw performance degradation when UMA was used, but for amd64 it should help. MFC after: 3 days	2010-05-16 15:14:59 +00:00
Pawel Jakub Dawidek	cfb3e98d37	Add task structure to zio and use it instead of allocating one. This eliminates the only place where we can sleep when calling zio_interrupt(). As a side-effect this can actually improve performance a little as we allocate one less thing for every I/O. Prodded by: kib MFC after: 1 week	2010-05-16 15:12:34 +00:00
Pawel Jakub Dawidek	ea478cb1da	The whole point of having dedicated worker thread for each leaf VDEV was to avoid calling zio_interrupt() from geom_up thread context. It turns out that when provider is forcibly removed from the system and we kill worker thread there can still be some ZIOs pending. To complete pending ZIOs when there is no worker thread anymore we still have to call zio_interrupt() from geom_up context. To avoid this race just remove use of worker threads altogether. This should be more or less fine, because I also thought that zio_interrupt() does more work, but it only makes small UMA allocation with M_WAITOK. It also saves one context switch per I/O request. PR: kern/145339 Reported by: Alex Bakhtin <Alex.Bakhtin@gmail.com> MFC after: 1 week	2010-05-16 11:56:42 +00:00
Martin Matuska	ee56d88b76	Fix deadlock between zfs_dirent_lock and zfs_rmdir OpenSolaris onnv revision: 11321:506b7043a14c Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6847615) MFC after: 3 days	2010-05-16 07:46:03 +00:00
Martin Matuska	db708a6e2c	Fix performance problem with ZFS prefetch caching [1] Add statistics for ZFS prefetch (sysctl kstat.zfs.misc.zfetchstats) Partial import of OpenSolaris onnv revision 10474:0e96dd3b905a Reported by: jhell@dataix.net (private e-mail) [1] Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6859997, 6868951) MFC after: 3 days	2010-05-16 07:16:28 +00:00
Martin Matuska	bef629c14d	Fix ZIL-related panic on zfs rollback. OpenSolaris onnv-revision: 8746:e1d96ca6808c Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6796377) MFC after: 1 week	2010-05-13 20:55:58 +00:00
Martin Matuska	c43d127a9a	Import OpenSolaris revision 7837:001de5627df3 It includes the following changes: - parallel reads in traversal code (Bug ID 6333409) - faster traversal for zfs send (Bug ID 6418042) - traversal code cleanup (Bug ID 6725675) - fix for two scrub related bugs (Bug ID 6729696, 6730101) - fix assertion in dbuf_verify (Bug ID 6752226) - fix panic during zfs send with i/o errors (Bug ID 6577985) - replace P2CROSS with P2BOUNDARY (Bug ID 6725680) List of OpenSolaris Bug IDs: 6333409, 6418042, 6757112, 6725668, 6725675, 6725680, 6725698, 6729696, 6730101, 6752226, 6577985, 6755042 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (multiple Bug IDs) MFC after: 1 week	2010-05-13 20:32:56 +00:00
Edward Tomasz Napierala	4e28b70950	Add missing check to prevent local users from panicking the kernel by trying to set malformed ACL. MFC after: 3 days	2010-05-13 15:31:00 +00:00
Martin Matuska	f2d1218cbe	Fix possible hang when replaying large truncations. OpenSolaris onnv revision: 7904:6a124a4ca9c5 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6761624) MFC after: 3 days	2010-05-12 09:51:57 +00:00
Pawel Jakub Dawidek	8423e00b36	Even though r203504 eliminates taste traffic provoked by vdev_geom.c, ZFS still likes to open all vdevs, close them and open them again, which in turn provokes taste traffic anyway. I don't know of any clean way to fix it, so do it the hard way - if we can't open provider for writing just retry 5 times with 0.5 pauses. This should eliminate accidental races caused by other classes tasting providers created on top of our vdevs. MFC after: 3 days Reported by: James R. Van Artsdalen <james-freebsd-fs2@jrv.org> Reported by: Yuri Pankov <yuri.pankov@gmail.com>	2010-05-11 22:29:00 +00:00
Pawel Jakub Dawidek	204b20d932	Add missing new line characters to the warnings. MFC after: 3 days	2010-05-11 22:23:35 +00:00
Martin Matuska	8c04b2242e	Fix failed assertion on destroying datasets from an older pool version. OpenSolaris onnv revision: 9390:887948510f80 PR: kern/146471 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6826861) MFC after: 3 days	2010-05-11 09:26:46 +00:00
Martin Matuska	431905576e	Fix possible panic with zfs destroy. OpenSolaris onnv revision: 8779:f164e0e90508 PR: kern/146471 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6784924) MFC after: 3 days	2010-05-11 09:23:46 +00:00
Martin Matuska	bb8b966850	Fix zfs rename (may occasionally fail with dataset busy). OpenSolaris onnv revision: 8517:41a0783dde17 PR: kern/146471 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6784757) MFC after: 3 days	2010-05-11 09:19:41 +00:00
Martin Matuska	dbbd1505bf	Fix endianness bug in ZFS intent log (ZIL). OpenSolaris onnv revision: 8109:6147a1bdd359 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6760048) MFC after: 3 days	2010-05-11 07:25:13 +00:00
Edward Tomasz Napierala	dc510c105f	Enforce RLIMIT_FSIZE in ZFS. Reviewed by: pjd@	2010-05-07 14:30:21 +00:00
Marius Strobl	626b7c61f8	- Fix broken symlinks on cross platform zfs send/recv. [1] - Enable zfs_ace_byteswap() on FreeBSD as it works just fine (tested between amd64 and sparc64 in both directions by Michael Moll). PR: 146272 Approved by: mm, pjd Obtained from: OpenSolaris (onnv rev. 8283:1ca59f393041; Bug ID 6764193) [1] MFC after: 3 days	2010-05-05 22:15:20 +00:00
Martin Matuska	d75554ec04	Introduce hardforce export option (-F) for "zpool export". When exporting with this flag, zpool.cache remains untouched. OpenSolaris onnv revision: 8211:32722be6ad3b Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID: 6775357)	2010-05-05 18:22:29 +00:00
Martin Matuska	7d4daf9a10	Speed up ZFS list operation with objset prefetching. Partial import of OpenSolaris onnv revisions: 8415:8809e849f63e, 10474:0e96dd3b905a PR: kern/146297 Submitted by: myself Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6386929, 6755389, 6847118) MFC after: 2 weeks	2010-05-04 17:40:24 +00:00
Martin Matuska	77a7f64749	Fix deadlock during zfs receive. OpenSolaris onnv revision: 9299:8809e849f63e PR: kern/146296 Submitted by: myself Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris (Bug ID 6783818, 6826836) MFC after: 1 week	2010-05-04 17:30:07 +00:00
Martin Matuska	df04ddbaa6	Add sysctl and loader tunable vfs.zfs.txg.write_limit_override. This tunable improves fine-tuning of ZFS write throttling. PR: kern/146108 Suggested by: Nikolay Denev <ndenev at gmail.com> Approved by: pjd, delphij (mentor) MFC after: 2 weeks	2010-05-01 20:44:37 +00:00
Martin Matuska	9ccdc9600e	Change description of tunable group vfs.zfs.txg to be more understandable. Approved by: pjd, delphij (mentor) MFC after: 3 days	2010-05-01 19:53:15 +00:00
Martin Matuska	d8665eb1f6	Fix improper pool write throughput calculation. OpenSolaris onnv revision: 9366:17553395a745 PR: kern/146108 Approved by: pjd, delphij (mentor) Obtained from: OpenSolaris, Bug ID 6817339 MFC after: 2 weeks	2010-04-30 07:48:29 +00:00
Pawel Jakub Dawidek	b19b0de471	Backport fix for 'zfs_znode_dmu_init: existing znode for dbuf' panic from OpenSolaris. PR: kern/144402 Reported by: Alex Bakhtin <alex.bakhtin@gmail.com> Tested by: Alex Bakhtin <alex.bakhtin@gmail.com> Obtained from: OpenSolaris, Bug ID 6895088 MFC after: 3 days	2010-04-28 18:29:48 +00:00
Pawel Jakub Dawidek	7af9c09a61	Allow modification of a directory's content even if the ZFS_NOUNLINK (SF_NOUNLINK, sunlnk) flag is set. We only deny the directory's removal or rename. PR: kern/143343 Reported by: marck MFC after: 3 days	2010-04-22 18:47:23 +00:00
Pawel Jakub Dawidek	bd7226a572	Restore previous order.	2010-04-18 12:43:33 +00:00
Pawel Jakub Dawidek	224329fb6b	Style fixes.	2010-04-18 12:36:53 +00:00
Pawel Jakub Dawidek	eb998be67d	Add missing list and lock destruction.	2010-04-18 12:27:07 +00:00
Pawel Jakub Dawidek	ad3cb80827	Extend locks scope to match OpenSolaris.	2010-04-18 12:25:40 +00:00
Pawel Jakub Dawidek	57a81a8bbc	Remove racy assertion. Obtained from: OpenSolaris	2010-04-18 12:21:52 +00:00
Pawel Jakub Dawidek	5195ca2307	Set ARC_L2_WRITING on L2ARC header creation. Obtained from: OpenSolaris	2010-04-18 12:20:33 +00:00
Pawel Jakub Dawidek	40c7da090f	Fix 3-way deadlock that can happen because of ZFS and vnode lock order reversal. thread0 (vfs_fhtovp) thread1 (vop_getattr) thread2 (zfs_recv) -------------------- --------------------- ------------------ vn_lock rrw_enter_read rrw_enter_write (hangs) rrw_enter_read (hangs) vn_lock (hangs) Submitted by: Attila Nagy <bra@fsn.hu> MFC after: 3 days	2010-04-15 16:40:54 +00:00
Pawel Jakub Dawidek	58804a192e	The same code is used to import and to create pool. The order of operations is the following: 1. Try to open vdev by remembered path and guid. 2. If 1 failed, try to find vdev which guid matches and ignore the path. 3. If 2 failed this means either that the vdev we're looking for is gone or that pool is being created and vdev doesn't contain proper guid yet. To be able to handle pool creation we open vdev by path anyway. Because of 3 it is possible that we open wrong vdev on import which can lead to confusions. The solution for this is to check spa_load_state. On pool creation it will be equal to SPA_LOAD_NONE and we can open vdev only by path immediately and if it is not equal to SPA_LOAD_NONE we first open by path+guid and when that fails, we open by guid. We no longer open wrong vdev on import. MFC after: 2 weeks	2010-03-19 20:14:27 +00:00
Kip Macy	e577b0b2e3	- cache line align arcs_lock array (h/t Marius Nuennerich) - fix ARCS_LOCK_PAD to use architecture defined CACHE_LINE_SIZE - cache line align buf_hash_table ht_locks array MFC after: 7 days	2010-03-17 21:10:09 +00:00
Kip Macy	07c5b1686e	use CACHE_LINE_SIZE instead of hardcoding 128 for lock pad pointed out by Marius Nuennerich and jhb@	2010-03-17 20:00:22 +00:00
Kip Macy	285738b6ad	- reduce contention by breaking up ARC state locks in to 16 for data and 16 for metadata - export L2ARC tunables as sysctls - add several kstats to track L2ARC state more precisely - avoid holding a contended lock when atomically incrementing a contended counter (no lock protection needed for atomics)	2010-03-16 22:17:21 +00:00
Kip Macy	03af82ac5e	fix compilation under ZIO_USE_UMA	2010-03-13 21:52:21 +00:00
Kip Macy	181c6ae3f0	Don't bottleneck on acquiring the stream locks - this avoids a massive drop off in throughput with large numbers of simultaneous reads MFC after: 7 days	2010-03-13 21:41:52 +00:00
Pawel Jakub Dawidek	3a98b0c4df	Remove bogus assertion. Reported by: Johan Ström <johan@stromnet.se> Obtained from: OpenSolaris, Bug ID 6827260 MFC after: 1 week	2010-03-12 12:07:21 +00:00
Pawel Jakub Dawidek	5b2e8d582f	Remove racy assertion. Reported by: Attila Nagy <bra@fsn.hu> Obtained from: OpenSolaris, Bug ID 6827260 MFC after: 1 week	2010-03-06 20:03:26 +00:00
Pawel Jakub Dawidek	251294bca9	Don't set f_bsize to recordsize. It might confuse some software (like squid). Submitted by: Alexander Zagrebin <alexz@visp.ru> MFC after: 2 weeks	2010-02-19 20:18:16 +00:00
Pawel Jakub Dawidek	bbd388268e	Add tunable and sysctl to skip hostid check on pool import.	2010-02-18 22:31:43 +00:00
Pawel Jakub Dawidek	9d3f36a309	Open provider for writing when we find the right one. Opening too many providers for writing provokes huge traffic related to taste events sent by GEOM on close. This can lead to various problems with opening GEOM providers that are created on top of other GEOM providers. Reported by: Kurt Touet <ktouet@gmail.com>, mr Tested by: mr, Baginski Darren <kickbsd@ya.ru> MFC after: 2 weeks	2010-02-04 21:11:44 +00:00
Xin LI	51b1ec310c	Report ZFS filesystem version instead of the zpool version when we say it. Reported by: Yuri Pankov (on -fs@) Submitted by: delphij Approved by: pjd MFC after: 1 week	2010-01-11 23:15:11 +00:00
Xin LI	017b01f662	Re-apply onnv-gate revisions 7994 and 8986 (corresponds to FreeBSD revision 200726 and 200727). It looks like the two revisions were not applied in the right sequence; I found this when comparing with the OpenSolaris code. MFC after: 3 days Reviewed by: mm@	2010-01-07 20:10:22 +00:00
Xin LI	1ee5de4482	Reduce diff against OpenSolaris - move Giant acquire/release to zfs_znode.c. As a side effect this also eliminates two potential Giant leaks. Approved by: pjd MFC after: 1 month	2010-01-02 23:38:03 +00:00
Xin LI	9189129097	Apply OpenSolaris revision 8012 which brings our zpool to version 14, making it possible for zpools created on OpenSolaris 2009.06 to be used on FreeBSD. PR: kern/141800 Submitted by: mm Reviewed by: pjd, trasz Obtained from: OpenSolaris MFC after: 2 weeks	2009-12-28 22:15:11 +00:00
Xin LI	dd0c145752	Apply fix for Solaris bug 6462803: zfs snapshot -r failed because filesystem was busy (onnv revision 8989) Submitted by: mm Approved by: pjd Obtained from: OpenSolaris MFC after: 2 weeks	2009-12-19 11:49:20 +00:00
Xin LI	24a41d7ec6	Apply fix for Solaris bug 6801979: zfs recv can fail with E2BIG (onnv revision 8986) Requested by: mm Submitted by: pjd Obtained from: OpenSolaris MFC after: 2 weeks	2009-12-19 11:47:22 +00:00
Xin LI	775f802393	Apply fix Solaris bug 6462803 zfs snapshot -r failed because filesystem was busy. Submitted by: mm Approved by: pjd MFC after: 2 weeks	2009-12-19 11:43:39 +00:00
Konstantin Belousov	88f2d72947	Change VOP_FSYNC for zfs vnode from VOP_PANIC to zfs_freebsd_fsync(), both to not panic when fsync(2) is called for fifo on zfs filedescriptor, and to actually fsync fifo inode to permanent storage. PR: kern/141177 Reviewed by: pjd MFC after: 1 week	2009-12-05 20:36:42 +00:00
Pawel Jakub Dawidek	dfb903e852	We have to eventually look for provider without checking guid as this is needed for attaching when there is no metadata yet. Before r200125 the order of looking for providers was wrong. It was: 1. Find provider by name. 2. Find provider by guid. 3. Find provider by name and guid. Where it should have been: 1. Find provider by name and guid. 2. Find provider by guid. 3. Find provider by name. MFC after: 1 week	2009-12-05 20:16:28 +00:00
Pawel Jakub Dawidek	6468cb2ce0	Fix deadlock when ZVOLs are present and we are replacing dead component or calling scrub when pool is in a degraded state. It will try to taste ZVOLs, which will lead to deadlock, as ZVOL will try to acquire the same locks as replace/scrub is holding already. We can't simply skip providers based on their GEOM class, because ZVOL can have providers built on top of it and we need to skip those as well. We do it by asking for ZFS::iszvol attribute. Any ZVOL-based provider will give us positive answer and we have to skip those providers. This way we remove possibility to create ZFS pools on top of ZVOLs, but it is not very useful anyway. I believe deadlock is still possible in some very complex situations like when we have MD provider on top of UFS file on top of ZVOL. When we try to replace dead component in the pool mentioned ZVOL is based on, there might be a deadlock when ZFS will try to taste MD provider. There is no easy way to detect that, but it isn't very common. MFC after: 1 week	2009-12-05 14:33:11 +00:00
Pawel Jakub Dawidek	ccba826977	Always check guid when opening by path, because we may end up with provider that does have the same name, but only by accident. MFC after: 1 week	2009-12-05 14:24:22 +00:00
Pawel Jakub Dawidek	29c8c85594	Avoid using additional variable for storing an error if we are not going to do anything with it.	2009-12-05 14:21:42 +00:00
Pawel Jakub Dawidek	fd9ee28bfc	Be careful which vattr fields are set during setattr replay. Without this fix strange things can appear after unclean shutdown like files with mode set to 07777. Reported by: des MFC after: 3 days	2009-11-10 22:27:33 +00:00
Pawel Jakub Dawidek	56697614cc	Avoid passing invalid mountpoint to getnewvnode(). Reported by: rwatson Tested by: rwatson MFC after: 3 days	2009-11-10 22:25:46 +00:00
Pawel Jakub Dawidek	fd66267ffb	- zfs_zaccess() can handle VAPPEND too, so map V_APPEND to VAPPEND and call zfs_access() instead of vaccess() in this case as well. - If VADMIN is specified with another V* flag (unlikely) call both zfs_access() and vaccess() after splitting V* flags. This fixes "dirtying snapshot!" panic. PR: kern/139806 Reported by: Carl Chave <carl@chave.us> In co-operation with: jh MFC after: 3 days	2009-10-30 23:33:06 +00:00
Pawel Jakub Dawidek	c217b20ef6	Allow file system owner to modify system flags if securelevel permits. MFC after: 3 days	2009-10-08 16:05:17 +00:00
Pawel Jakub Dawidek	3a6c0cbf26	On FreeBSD it is enough to report provider removal when orphan event is received, we don't have to do it on every ENXIO error in I/O path. Solaris has no GEOM so they have to handle it in a less clean way. MFC after: 3 days	2009-10-07 20:56:15 +00:00
Pawel Jakub Dawidek	2ada529a14	Fix white-spaces. MFC after: 3 days	2009-10-07 20:54:07 +00:00
Pawel Jakub Dawidek	c0103003c0	Fix situation where Mac OS X NFS client creates a file and when it tries to set ownership and mode in the same setattr operation, the mode was overwritten by secpolicy_vnode_setattr(). PR: kern/118320 Submitted by: Mark Thompson <info-gentoo@mark.thompson.bz> MFC after: 3 days	2009-10-07 12:38:19 +00:00
Kip Macy	e6b112e274	Prevent paging pressure from draining arc too much - always drain arc if above arc_c_max - never drain arc if arc is below arc_c_max MFC after: 3 days	2009-10-06 21:40:50 +00:00
Xin LI	6f62807611	Return EOPNOTSUPP instead of EINVAL when doing chflags(2) over an old format ZFS, as defined in the manual page. Submitted by: pjd (response of my original patch but bugs are mine) MFC after: 3 days	2009-10-01 18:58:26 +00:00
Pawel Jakub Dawidek	ab711589df	Handle cases where virtual (GFS) vnodes are referenced when doing forced unmount. In that case we cannot depend on the proper order of invalidating vnodes, so we have to free resources when we have a chance. PR: kern/139062 Reported by: trasz MFC after: 3 days	2009-09-26 00:10:45 +00:00
Pawel Jakub Dawidek	a0b238644a	On lookup error VFS expects *vpp to be set to NULL, be sure to do that. MFC after: 3 days	2009-09-26 00:08:44 +00:00
Pawel Jakub Dawidek	a99aaff645	Use traverse() function to find and return mount point's vnode instead of covered vnode when snapshot is already mounted. MFC after: 3 days	2009-09-26 00:07:14 +00:00
Pawel Jakub Dawidek	1aba32d9b4	- Don't depend on value returned by gfs_*_inactive(), it doesn't work well with forced unmounts when GFS vnodes are referenced. - Make other preparations to GFS for forced unmounts. PR: kern/139062 Reported by: trasz MFC after: 3 days	2009-09-26 00:04:30 +00:00
Pawel Jakub Dawidek	86758476b4	Switch to fletcher4 as the default checksum algorithm. Fletcher2 was proven to be a bit weak and OpenSolaris also switched to fletcher4. PR: kern/139072 Reported by: Daniel Grund <bugs@dgrund.de> MFC after: 3 days	2009-09-25 18:19:50 +00:00
Pawel Jakub Dawidek	ad8294cf98	Before calling vflush(FORCECLOSE) mark file system as unmounted so the following vnops will fail. This is very important, because without this change vnode could be reclaimed at any point, even if we increased usecount. The only way to ensure that vnode won't be reclaimed was to lock it, which would be very hard to do in ZFS without changing a lot of code. With this change simply increasing usecount is enough to be sure vnode won't be reclaimed from under us. To be precise it can still be reclaimed but we won't be able to see it, because every try to enter ZFS through VFS will result in EIO. The only function that cannot return EIO, because it is needed for vflush() is zfs_root(). Introduce ZFS_ENTER_NOERROR() macro that only locks z_teardown_lock and never returns EIO. MFC after: 3 days	2009-09-24 15:56:26 +00:00
Pawel Jakub Dawidek	ab9bbf4a2b	Close race in zfs_zget(). We have to increase usecount first and then check for VI_DOOMED flag. Before this change vnode could be reclaimed between checking for the flag and increasing usecount. MFC after: 3 days	2009-09-24 15:49:15 +00:00
Edward Tomasz Napierala	c40502ccd0	In VOP_SETACL(9) and VOP_GETACL(9), specifying wrong ACL type should result in EINVAL, not EOPNOTSUPP.	2009-09-23 15:09:34 +00:00
Pawel Jakub Dawidek	eb03c3cdfb	Restore BSD behaviour - when creating new directory entry use parent directory gid to set group ownership and not process gid. This was overlooked during v6 -> v13 switch. PR: kern/139076 Reported by: Sean Winn <sean@gothic.net.au> MFC after: 3 days	2009-09-23 09:18:16 +00:00
Pawel Jakub Dawidek	c4be11d7fc	Purge namecache in the same place OpenSolaris does.	2009-09-20 13:28:29 +00:00
Pawel Jakub Dawidek	5469543c92	Purge file system namecache when receiving incremental stream and rolling back to it. MFC after: 3 days	2009-09-17 15:14:28 +00:00
Pawel Jakub Dawidek	3282c51713	Purge namecache for the file system being rolled back, so it doesn't point at invalid vnodes after the rollback resulting in EIO errors when trying to access files which are in the namecache. Reported by: des MFC after: 3 days	2009-09-17 14:58:21 +00:00
Pawel Jakub Dawidek	95f08808b6	Forced unmounts work just fine in my tests under heavy load. There might still be a problem, but it isn't worth a warning.	2009-09-15 11:42:08 +00:00
Pawel Jakub Dawidek	a4e6b460d3	We believe ZFS is ready for production use. Remove a warning about it being experimental. :)	2009-09-15 11:34:53 +00:00
Pawel Jakub Dawidek	63e1d3df27	- Mount ZFS snapshots with MNT_IGNORE flag, so they are not visible in regular df(1) and mount(8) output. This is a bit similar to OpenSolaris and follows ZFS route of not listing snapshots by default with 'zfs list' command. - Add UPDATING entry to note that ZFS snapshots are no longer visible in mount(8) and df(1) output by default. Reviewed by: kib MFC after: 3 days	2009-09-14 21:10:40 +00:00
Pawel Jakub Dawidek	85c171b2e1	Support both cases: when the snapshot is already mounted and when it is not yet mounted. MFC after: 3 days	2009-09-13 21:40:36 +00:00
Pawel Jakub Dawidek	8a2c4db0fe	Add missing \n. Reported by: marck	2009-09-13 17:30:56 +00:00
Pawel Jakub Dawidek	7746b6461d	Work-around READDIRPLUS problem with .zfs/ and .zfs/snapshot/ directories by just returning EOPNOTSUPP. This will allow NFS server to fall back to regular READDIR. Note that converting inode number to snapshot's vnode is expensive operation. Snapshots are stored in AVL tree, but based on their names, not inode numbers, so to convert inode to snapshot vnode we have to iterate over all snapshots. This is not a problem in OpenSolaris, because in their READDIRPLUS implementation they use VOP_LOOKUP() on d_name, instead of VFS_VGET() on d_fileno as we do. PR: kern/125149 Reported by: Weldon Godfrey <wgodfrey@ena.com> Analysis by: Jaakko Heinonen <jh@saunalahti.fi> MFC after: 3 days	2009-09-13 16:05:20 +00:00
Pawel Jakub Dawidek	7b4a12379b	When zfs.ko is compiled with debug, make sure that znode and vnode point at each other. MFC after: 3 days	2009-09-13 10:33:51 +00:00
Pawel Jakub Dawidek	33a0ef82f2	Extend scope of the z_teardown_lock lock for consistency and "just in case". MFC after: 3 days	2009-09-13 10:29:51 +00:00
Pawel Jakub Dawidek	7dae3c4faf	Be sure not to overflow struct fid. MFC after: 3 days	2009-09-13 10:25:33 +00:00
Pawel Jakub Dawidek	f53901193d	There is a bug where mze_insert() can trigger an assert() of inserting the same entry twice. This bug is not fixed yet, but it leads to a situation where the kernel will panic when we try to access a corrupted directory. Until the bug is properly fixed, try to recover from it and log that it happened. Reported by: marck OpenSolaris bug: 6709336 MFC after: 3 days	2009-09-13 10:12:29 +00:00
Pawel Jakub Dawidek	f5516e3d1d	- Protect reclaim with z_teardown_inactive_lock. - Be prepared for dbuf to disappear in zfs_reclaim_complete() and check if z_dbuf field is NULL - this might happen in case of rollback or forced unmount between zfs_freebsd_reclaim() and zfs_reclaim_complete(). - On forced unmount wait for all znodes to be destroyed - destruction can be done asynchronously via zfs_reclaim_complete(). MFC after: 1 week	2009-09-12 19:53:31 +00:00
Pawel Jakub Dawidek	2a8e7dad33	Tighten up the check for race in zfs_zget() - ZTOV(zp) can not only contain NULL, but also can point to dead vnode, take that into account. PR: kern/132068 Reported by: Edward Fisk <7ogcg7g02@sneakemail.com>, kris Fix based on patch from: Jaakko Heinonen <jh@saunalahti.fi> MFC after: 1 week	2009-09-12 19:27:54 +00:00
Pawel Jakub Dawidek	3770996142	Only log successful commands! Without this fix we log even unsuccessful commands executed by unprivileged users. Action is not really taken, but it is logged to pool history, which might be confusing. Reported by: Denis Ahrens <denis@h3q.com> MFC after: 3 days	2009-09-08 16:40:08 +00:00
Pawel Jakub Dawidek	d6b8039292	We don't export individual snapshots, so mnt_export field in snapshot's mount point is NULL. That's why when we try to access snapshots over NFS use mnt_export field from the parent file system. MFC after: 1 week	2009-09-08 15:57:03 +00:00
Pawel Jakub Dawidek	f148fd9a4a	When we automatically mount snapshot we want to return vnode of the mount point from the lookup and not covered vnode. This is one of the fixes for using .zfs/ over NFS. MFC after: 1 week	2009-09-08 15:51:40 +00:00
Pawel Jakub Dawidek	2391003912	On FreeBSD we don't have to look for snapshot's mount point, because fhtovp method is already called with proper mount point. MFC after: 1 week	2009-09-08 15:42:55 +00:00
Pawel Jakub Dawidek	6f8e88e1da	Call ZFS_EXIT() after locking the vnode. MFC after: 1 week	2009-09-08 15:37:01 +00:00
Pawel Jakub Dawidek	1ea3566294	Fix reference count leak for a case where snapshot's mount point is updated. Such situation is not supported. This problem was triggered by something like this: # zpool create tank da0 # zfs snapshot tank@snap # cd /tank/.zfs/snapshot/snap (this will mount the snapshot) # cd # mount -u nosuid /tank/.zfs/snapshot/snap (refcount leak) # zpool export tank cannot export 'tank': pool is busy MFC after: 1 week	2009-09-08 08:54:15 +00:00
Pawel Jakub Dawidek	28e449adf2	If we have to use avl_find(), optimize a bit and use avl_insert() instead of avl_add() (the latter is actually a wrapper around avl_find() + avl_insert()). Fix similar case in the code that is currently commented out.	2009-09-07 21:58:54 +00:00
Pawel Jakub Dawidek	3f6043a57d	When snapshot mount point is busy (for example we are still in it) we will fail to unmount it, but it won't be removed from the tree, so in that case there is no need to reinsert it. This fixes a panic reproducible in the following steps: # zfs create tank/foo # zfs snapshot tank/foo@snap # cd /tank/foo/.zfs/snapshot/snap # umount /tank/foo panic: avl_find() succeeded inside avl_add() Reported by: trasz MFC after: 3 days	2009-09-07 21:46:51 +00:00
Edward Tomasz Napierala	343775c0b4	Enable NFSv4 ACL support in ZFS. Reviewed by: pjd	2009-09-07 19:43:13 +00:00
Pawel Jakub Dawidek	c739b7b22b	Don't recheck ownership on update mount. This will eliminate LOR between vfs_busy() and mount mutex. We check ownership in vfs_domount() anyway. Noticed by: kib Reviewed by: kib MFC after: 1 week	2009-09-07 18:54:55 +00:00
Edward Tomasz Napierala	900b1670c4	Prevent the line from wrapping.	2009-09-07 16:56:41 +00:00
Pawel Jakub Dawidek	841bcfea21	Changing provider size is not really supported by GEOM, but doing so when provider is closed should be ok. When administrator requests to change ZVOL size do it immediately if ZVOL is closed or do it on last ZVOL close. PR: kern/136942 Requested by: Bernard Buri <bsd@ask-us.at> MFC after: 1 week	2009-09-07 14:16:50 +00:00
Pawel Jakub Dawidek	5e65224daf	bzero() on-stack argument, so mutex_init() won't misinterpret that the lock is already initialized if we have some garbage on the stack. PR: kern/135480 Reported by: Emil Mikulic <emikulic@gmail.com> MFC after: 3 days	2009-09-07 11:38:43 +00:00
Edward Tomasz Napierala	a41422a93e	Improve wording. Discussed with: pjd, cperciva, rink, wkoszek and des, in order of appearance.	2009-09-05 15:08:58 +00:00
Pawel Jakub Dawidek	26d0605727	Backport the 'dirtying dbuf' panic fix from newer ZFS version. Reported by: Thomas Backman <serenity@exscape.org> MFC after: 1 week	2009-08-31 16:27:00 +00:00
Pawel Jakub Dawidek	575c1d371c	Add missing mountpoint vnode locking. This fixes panic on assertion with DEBUG_VFS_LOCKS and vfs.usermount=1 when regular user tries to mount dataset owned by him. MFC after: 1 week	2009-08-30 21:03:40 +00:00
Pawel Jakub Dawidek	5d5535163a	- Hide ZFS kernel threads under zfskern process. - Use better (shorter) threads names: 'zvol:worker zvol/tank/vol00' -> 'zvol tank/vol00' 'vdev:worker da0' -> 'vdev da0'	2009-08-23 11:33:46 +00:00
Pawel Jakub Dawidek	4ec2b0e7ce	Set priority of vdev_geom threads and zvol threads to PRIBIO.	2009-08-23 11:27:08 +00:00
Pawel Jakub Dawidek	8e9fd65fbf	getcwd() (when __getcwd() fails) works by stating current directory, going up (..), calling readdir and looking for previous directory inode. In case of .zfs/ directory this doesn't work, because .zfs/ is hidden by default, so it won't be visible in readdir output. Fix this by implementing VPTOCNP for snapshot directories, so __getcwd() doesn't fail and getcwd() doesn't have to use readdir method. This fixes /bin/pwd from within .zfs/snapshot/<name>/. Suggested by: kib Approved by: re (rwatson)	2009-08-17 10:00:18 +00:00
Pawel Jakub Dawidek	8461b0f043	Manage asynchronous vnode release just like Solaris. Discussed with: kmacy Approved by: re (kib)	2009-08-17 09:48:34 +00:00
Pawel Jakub Dawidek	e35eb914f4	- Reduce z_teardown_lock lock scope a bit. - The error variable is int, not bool. - Convert spaces to tabs where needed. Approved by: re (kib)	2009-08-17 09:28:15 +00:00
Pawel Jakub Dawidek	0330a5dc10	If z_buf is NULL, we should free znode immediately. Noticed by: avg Approved by: re (kib)	2009-08-17 09:25:37 +00:00
Pawel Jakub Dawidek	d83cfc37a4	- We need to recycle vnode instead of freeing znode. Submitted by: avg - Add missing vnode interlock unlock. - Remove redundant znode locking. Approved by: re (kib)	2009-08-17 09:21:39 +00:00
Pawel Jakub Dawidek	f820bc079f	Fix panic in zfs recv code. The last vnode (mountpoint's vnode) can have 0 usecount. Reported by: Thomas Backman <serenity@exscape.org> Approved by: re (kib)	2009-08-17 09:13:22 +00:00
Pawel Jakub Dawidek	159ef108e1	Remove OpenSolaris taskq port (it performs very poorly in our kernel) and replace it with wrappers around our taskqueue(9). To make it possible implement taskqueue_member() function which returns 1 if the given thread was created by the given taskqueue. Approved by: re (kib)	2009-08-17 09:01:20 +00:00
Pawel Jakub Dawidek	fddc954016	- Fix a race where /dev/zfs control device is created before ZFS is fully initialized. Also destroy /dev/zfs before doing other deinitializations. - Initialization through taskq is no longer needed and there is a race where one of the zpool/zfs command loads zfs.ko and tries to do some work immediately, but /dev/zfs is not there yet. Reported by: pav Approved by: re (kib)	2009-08-17 08:36:41 +00:00
Pawel Jakub Dawidek	abd8353f5d	We don't support ephemeral IDs in FreeBSD and without this fix ZFS can panic in zfs_fuid_create_cred() when userid is negative. It is converted to an unsigned value, which makes the IS_EPHEMERAL() macro incorrectly report that this is an ephemeral ID. The most reasonable solution for now is to always report that the given ID is not ephemeral. PR: kern/132337 Submitted by: Matthew West <freebsd@r.zeeb.org> Tested by: Thomas Backman <serenity@exscape.org>, Michael Reifenberger <mike@reifenberger.com> Approved by: re (kib) MFC after: 2 weeks	2009-07-27 14:52:34 +00:00
Edward Tomasz Napierala	d2ceff236a	Fix extattr_list_file(2) on ZFS in case the attribute directory doesn't exist and user doesn't have write access to the file. Without this fix, it returns bogus value instead of 0. For some reason this didn't manifest on my kernel compiled with -O0. PR: kern/136601 Submitted by: Jaakko Heinonen <jh at saunalahti dot fi> Approved by: re (kib)	2009-07-22 15:15:58 +00:00
Edward Tomasz Napierala	65588fd503	Fix permission handling for extended attributes in ZFS. Without this change, ZFS uses SunOS Alternate Data Streams semantics - each EA has its own permissions, which are set at EA creation time and - unlike SunOS - invisible to the user and impossible to change. From the user point of view, it's just broken: sometimes access is granted when it shouldn't be, sometimes it's denied when it shouldn't be. This patch makes it behave just like UFS, i.e. depend on current file permissions. Also, it fixes returned error codes (ENOATTR instead of ENOENT) and makes listextattr(2) return 0 instead of EPERM where there is no EA directory (i.e. the file never had any EA). Reviewed by: pjd (idea, not actual code) Approved by: re (kib)	2009-07-20 19:16:42 +00:00
Konstantin Belousov	e0c161b89c	Add another flags argument to vn_open_cred. Use it to specify that some vn_open_cred invocations shall not audit namei path. In particular, specify VN_OPEN_NOAUDIT for dotdot lookup performed by default implementation of vop_vptocnp, and for the open done for core file. vn_fullpath is called from the audit code, and vn_open there need to disable audit to avoid infinite recursion. Core file is created on return to user mode, that, in particular, happens during syscall return. The creation of the core file is audited by direct calls, and we do not want to overwrite audit information for syscall. Reported, reviewed and tested by: rwatson	2009-06-21 13:41:32 +00:00
Jamie Gritton	c1f192193d	Rename the host-related prison fields to be the same as the host.* parameters they represent, and the variables they replaced, instead of abbreviated versions of them. Approved by: bz (mentor)	2009-06-13 15:39:12 +00:00
Kip Macy	f0c6b798a3	pjd has requested that I keep the tunable as zfs_prefetch_disable to minimize gratuitous differences with Opensolaris' ZFS Sorry for the churn	2009-06-11 22:24:08 +00:00
Kip Macy	e4e5e663e0	check against prefetch_enable	2009-06-11 09:51:21 +00:00
Kip Macy	3fa5485637	use default policy for enabling prefetching unless the TUNABLE is set	2009-06-10 21:05:37 +00:00
Kip Macy	107b659450	As far as I can tell systems that have less than 4GB are more often hurt by prefetch than helped. On i386 systems and systems with less than 4GB, prefetch is now disabled by default. I've added a prefetch enable tunable, to enable prefetching for those systems. The prefetch disable tunable will continue to unconditionally disable prefetching.	2009-06-10 01:21:32 +00:00
Paul Saab	a6d545d8ed	Support shared vnode locks for write operations when the offset is provided on filesystems that support it. This really improves mysql + innodb performance on ZFS. Reviewed by: jhb, kmacy, jeffr	2009-06-04 16:18:07 +00:00
Doug Rabson	8be608b58c	Allow the bootfs property to be set for raidz pools on FreeBSD. Reviewed by: pjd	2009-05-31 11:59:32 +00:00
Kip Macy	762169b50a	fix xdrmem_control to be safe in an if statement fix zfs to depend on krpc remove xdr from zfs makefile Submitted by: dchagin@freebsd.org	2009-05-30 22:23:58 +00:00
Kip Macy	139ccddec0	work around snapshot shutdown race reported by Henri Hennebert	2009-05-30 19:26:35 +00:00
Edward Tomasz Napierala	0970b4bae0	MFp4 changes necessary for NFSv4 ACLs support in ZFS. This is mostly about removing a few #ifdefs and providing compatibility wrappers and VOP implementations to get and set an ACL; ZFS does ACL enforcement all by itself. Note that the VOPs are ifdefed out for now, so this change should be a no-op. Reviewed by: pjd	2009-05-26 08:21:59 +00:00
Edward Tomasz Napierala	194f4d42de	Fix comment.	2009-05-24 15:48:48 +00:00
Kip Macy	e95d34711b	- back out direct map hack - it is no longer needed	2009-05-19 01:14:37 +00:00
Kip Macy	ea41c77517	SAVESTART implies SAVENAME	2009-05-17 01:31:28 +00:00
Kip Macy	be08aa8b59	- allow forced unmounts - don't assume snapshot was auto-mounted	2009-05-16 20:33:13 +00:00
Kip Macy	71bc1ce36e	only use direct map if system has more than 2GB	2009-05-16 20:09:07 +00:00
Kip Macy	32237d8492	apply band-aid to x86_64 systems with more physical memory than kmem by allocating from the direct map	2009-05-16 19:17:15 +00:00
Attilio Rao	dfd233edd5	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavily changed by this commit so third-party modules need to be recompiled. Bump __FreeBSD_version in order to signal such situation.	2009-05-11 15:33:26 +00:00
Kip Macy	a6827463ad	don't call vn_rele_async_fini in the !_KERNEL case	2009-05-07 23:34:41 +00:00
Kip Macy	6ef1a81d6e	avoid LOR and gratuitous extra lock acquisitions by moving user_evict list buffers to a temporary list	2009-05-07 21:51:13 +00:00
Kip Macy	77d0162c70	Allow the VM to provide backpressure on the ARC cache as it does on Solaris.	2009-05-07 20:57:06 +00:00
Kip Macy	62fa227ccd	Asynchronously release vnodes to avoid blocking on range locks when calling back in to zfs. This is based on a fix that went in to opensolaris on March 9th. However, it uses a dedicated thread instead of a Solaris' taskq to avoid doing a blocking memory allocation with the vnode interlock held. This fixes a long-time deadlock in ZFS. This is not, strictly speaking, an LOR. The spa_zio thread releases a vnode, this calls in to vn_reclaim which in turn needs to acquire range locks to sync dirty data out to disk. The range locks are already held by a user-level process waiting on a condition variable that it expects a spa_zio thread to signal. The process could not be signalled because the spa_zio thread could not proceed. The nature of this problem was not apparent due to ZFS locks opting out of witness which meant that DDB did not know about the locks that were held by ZFS. Reviewed by: pjd MFC after: 7 days	2009-05-07 20:28:06 +00:00
Robert Watson	885868cd8f	Remove VOP_LEASE and supporting functions. This hasn't been used since the removal of NQNFS, but was left in in case it was required for NFSv4. Since our new NFSv4 client and server can't use it for their requirements, GC the old mechanism, as well as other unused lease- related code and interfaces. Due to its impact on kernel programming and binary interfaces, this change should not be MFC'd. Proposed by: jeff Reviewed by: jeff Discussed with: rmacklem, zach loafman @ isilon	2009-04-10 10:52:19 +00:00
Andrew Thompson	853a10a581	Revert r190676,190677 The geom and CAM changes for root_hold are the wrong solution for USB design quirks. Requested by: scottl	2009-04-10 04:08:34 +00:00
Andrew Thompson	626fc9fe3d	Add a how argument to root_mount_hold() so it can be passed NOWAIT and be called in situations where sleeping isn't allowed.	2009-04-03 19:46:12 +00:00
John Baldwin	9fca7a854c	The zfs_get_xattrdir() function is used to find the extended attribute directory for a znode. When the directory already exists, it returns a referenced but unlocked vnode. When a directory does not yet exist, it calls zfs_make_xattrdir() to create a new one. zfs_make_xattrdir() returns the vnode both referenced and locked and zfs_get_xattrdir() was leaking this vnode lock to its callers. Fix this by dropping the vnode lock if zfs_make_xattrdir() successfully creates a new extended attribute directory. Reviewed by: pjd	2009-03-18 16:19:44 +00:00
John Baldwin	33fc362512	Add a new internal mount flag (MNTK_EXTENDED_SHARED) to indicate that a filesystem supports additional operations using shared vnode locks. Currently this is used to enable shared locks for open() and close() of read-only file descriptors. - When an ISOPEN namei() request is performed with LOCKSHARED, use a shared vnode lock for the leaf vnode only if the mount point has the extended shared flag set. - Set LOCKSHARED in vn_open_cred() for requests that specify O_RDONLY but not O_CREAT. - Use a shared vnode lock around VOP_CLOSE() if the file was opened with O_RDONLY and the mountpoint has the extended shared flag set. - Adjust md(4) to upgrade the vnode lock on the vnode it gets back from vn_open() since it now may only have a shared vnode lock. - Don't enable shared vnode locks on FIFO vnodes in ZFS and UFS since FIFO's require exclusive vnode locks for their open() and close() routines. (My recent MPSAFE patches for UDF and cd9660 already included this change.) - Enable extended shared operations on UFS, cd9660, and UDF. Submitted by: ups Reviewed by: pjd (ZFS bits) MFC after: 1 month	2009-03-11 14:13:47 +00:00
John Baldwin	ea77ff0a15	Use shared vnode locks when invoking VOP_READDIR(). MFC after: 1 month	2009-02-13 18:18:14 +00:00
Ed Schouten	a4611ab612	Last step of splitting up minor and unit numbers: remove minor(). Inside the kernel, the minor() function was responsible for obtaining the device minor number of a character device. Because we made device numbers dynamically allocated and independent of the unit number passed to make_dev() a long time ago, it was actually a misnomer. If you really want to obtain the device number, you should use dev2udev(). We already converted all the drivers to use dev2unit() to obtain the device unit number, which is still used by a lot of drivers. I've noticed not a single driver passes NULL to dev2unit(). Even if they would, its behaviour would make little sense. This is why I've removed the NULL check. This commit removes minor(), minor2unit() and unit2minor() from the kernel. Because there was a naming collision with uminor(), we can rename umajor() and uminor() back to major() and minor(). This means that the makedev(3) manual page also applies to kernel space code now. I suspect umajor() and uminor() aren't used that often in external code, but to make it easier for other parties to port their code, I've increased __FreeBSD_version to 800062.	2009-01-28 17:57:16 +00:00
Edward Tomasz Napierala	38cc5da78e	MFp4: We don't support TX_CREATE_ACL_ATTR nor TX_MKDIR_ACL_ATTR; code found in zfs_replay.c will panic if it encounters transactions of this type. Make sure we don't put these into the ZIL. Approved by: rwatson (mentor), pjd	2008-11-25 23:05:46 +00:00
Pawel Jakub Dawidek	ad35ee04f4	Fix locking (file descriptor table and Giant around VFS). Most submitted by: kib Reviewed by: kib	2008-11-25 21:14:00 +00:00
Pawel Jakub Dawidek	bcfbcdca9c	IFp4: Don't rely on disk IDs and always use vdev guids, which means always looking up components by reading metadata. This might be slower when there is a large number of disks in the system, but is definitely more reliable.	2008-11-22 13:33:06 +00:00
Pawel Jakub Dawidek	74303ba55c	IFp4: Finish implementation of chflags(2) for ZFS. While doing this I found that zfs_access() can only handle VREAD, VWRITE and VEXEC, for the rest we need to use vaccess(9).	2008-11-22 13:24:44 +00:00
Pawel Jakub Dawidek	5189bf22c0	IFp4: Don't free pathname too soon, debugging code is still using it.	2008-11-22 13:22:24 +00:00
Pawel Jakub Dawidek	1ba4a712dd	Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes. This bring huge amount of changes, I'll enumerate only user-visible changes: - Delegated Administration Allows regular users to perform ZFS operations, like file system creation, snapshot creation, etc. - L2ARC Level 2 cache for ZFS - allows to use additional disks for cache. Huge performance improvements mostly for random read of mostly static content. - slog Allow to use additional disks for ZFS Intent Log to speed up operations like fsync(2). - vfs.zfs.super_owner Allows regular users to perform privileged operations on files stored on ZFS file systems owned by him. Very careful with this one. - chflags(2) Not all the flags are supported. This still needs work. - ZFSBoot Support to boot off of ZFS pool. Not finished, AFAIK. Submitted by: dfr - Snapshot properties - New failure modes Before if write requested failed, system paniced. Now one can select from one of three failure modes: - panic - panic on write error - wait - wait for disk to reappear - continue - serve read requests if possible, block write requests - Refquota, refreservation properties Just quota and reservation properties, but don't count space consumed by children file systems, clones and snapshots. - Sparse volumes ZVOLs that don't reserve space in the pool. - External attributes Compatible with extattr(2). - NFSv4-ACLs Not sure about the status, might not be complete yet. Submitted by: trasz - Creation-time properties - Regression tests for zpool(8) command. Obtained from: OpenSolaris	2008-11-17 20:49:29 +00:00
Edward Tomasz Napierala	4bdaada206	Require write access on a directory being moved from one parent directory to another in ZFS. Approved by: rwatson (mentor), pjd	2008-11-08 19:56:32 +00:00
Edward Tomasz Napierala	36d227d9ed	Backoff the last patch. It was overly restrictive - we want to check for write permission on target only when moving the target between two directories. Approved by: rwatson (mentor)	2008-11-06 22:28:04 +00:00
Edward Tomasz Napierala	b92eda309d	Change ZFS behaviour to match UFS: when moving (rename(2)) a subdirectory from one parent directory to another, in addition to the usual access checks one also needs write access to the subdirectory being moved. Approved by: rwatson (mentor), pjd	2008-11-06 19:17:58 +00:00
Edward Tomasz Napierala	15bc6b2bd8	Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
Attilio Rao	0d7935fd01	Remove the struct thread unuseful argument from bufobj interface. In particular following functions KPI results modified: - bufobj_invalbuf() - bufsync() and BO_SYNC() "virtual method" of the buffer objects set. Main consumers of bufobj functions are affected by this change too and, in particular, functions which changed their KPI are: - vinvalbuf() - g_vfs_close() Due to the KPI breakage, __FreeBSD_version will be bumped in a later commit. As a side note, please consider just temporary the 'curthread' argument passing to VOP_SYNC() (in bufsync()) as it will be axed out ASAP Reviewed by: kib Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-10-10 21:23:50 +00:00
Pawel Jakub Dawidek	062ea27ee4	Add missing ZFS_EXIT(). PR: kern/124899 Submitted by: Masakazu Asama <m-asama@ginzado.ne.jp>	2008-09-15 11:27:25 +00:00
Edward Tomasz Napierala	dfa7fd1d70	Remove VSVTX, VSGID and VSUID. This should be a no-op, as VSVTX == S_ISVTX, VSGID == S_ISGID and VSUID == S_ISUID. Approved by: rwatson (mentor)	2008-09-10 13:16:41 +00:00
Pawel Jakub Dawidek	1b856fa491	Initialize vp, so we don't call VOP_UNLOCK() with NULL vnode pointer. Confirmed by: marcus	2008-09-07 07:55:12 +00:00
Pawel Jakub Dawidek	433751bb50	Lock vnode exclusively around insmntque().	2008-09-06 17:24:07 +00:00
Pawel Jakub Dawidek	7fa1f32a7e	Catch up after last insmntque() changes: - The vnode has to be locked exclusively before calling insmntque(). - Until I find a way to handle insmntque() failures use VV_FORCEINSMQ flag to force insmntque() to always succeed. Reported by: kris, trasz, des, others Suggested by: kib Tested by: trasz	2008-09-05 07:00:40 +00:00
Attilio Rao	0359a12ead	Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-08-28 15:23:18 +00:00
Pawel Jakub Dawidek	37876323b1	We want to use LBOLT instead of lbolt on FreeBSD. I've this already fixed in p4, but the fix was never integrated into HEAD. Reported by: ed	2008-07-21 14:35:48 +00:00
Ed Schouten	3f7eea97fd	Remove the $FreeBSD$ tag again, now I know fbsd:nokeywords exists. Requested by: pjd Approved by: philip (mentor)	2008-06-12 08:53:54 +00:00
Ed Schouten	0f03ce1bb8	Turn dev2unit(), minor(), unit2minor() and minor2unit() into macros. Now that we got rid of the minor-to-unit conversion and the constraints on device minor numbers, we can convert the functions that operate on minor and unit numbers to simple macros. The unit2minor() and minor2unit() macros are now no-ops. The ZFS code also defined a macro named `minor'. Change the ZFS code to use umajor() and uminor() here, as it is the correct approach to do this. Also add $FreeBSD$ to keep SVN happy. Approved by: philip (mentor), pjd	2008-06-12 08:30:54 +00:00
Pawel Jakub Dawidek	ed5a2ac45c	Fix namespace collision after src/sys/sys/file.h:1.78.	2008-05-25 22:34:17 +00:00
John Birrell	8fc6245976	Make the zfs module depend on the opensolaris module in preparation for it to shared stuff with the DTrace modules.	2008-05-24 06:43:55 +00:00
Konstantin Belousov	eab626f110	Move the head of byte-level advisory lock list from the filesystem-specific vnode data to the struct vnode. Provide the default implementation for the vop_advlock and vop_advlockasync. Purge the locks on the vnode reclaim by using the lf_purgelocks(). The default implementation is augmented for the nfs and smbfs. In the nfs_advlock, push the Giant inside the nfs_dolock. Before the change, the vop_advlock and vop_advlockasync have taken the unlocked vnode and dereferenced the fs-private inode data, racing with the vnode reclamation due to forced unmount. Now, the vop_getattr under the shared vnode lock is used to obtain the inode size, and later, in the lf_advlockasync, after locking the vnode interlock, the VI_DOOMED flag is checked to prevent an operation on the doomed vnode. The implementation of the lf_purgelocks() is submitted by dfr. Reported by: kris Tested by: kris, pho Discussed with: jeff, dfr MFC after: 2 weeks	2008-04-16 11:33:32 +00:00
Doug Rabson	dfdcada31e	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. 
* Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00


641 Commits