Commit Graph

715 Commits

Author SHA1 Message Date
Marius Strobl
edd870e447 Convert the last use of xcopyout() to ddi_copyout() and remove the now
unused xcopyin() as well as xcopyout().
MFC together with r219089.

Approved by:	mm
2011-05-03 20:13:27 +00:00
Martin Matuska
29bf94b8d8 Fix deduplicated zfs receive
(dmu_recv_stream builds incomplete guid_to_ds_map)

Illumos-gate changeset:	13329:c48b8bf84ab7
MFC together with v28

Approved by:	pjd
Obtained from:	Illumos (Bug #755)
2011-04-30 14:52:49 +00:00
Marcel Moolenaar
8d098dc0c4 Fix copy-paste bug. 2011-04-27 04:03:04 +00:00
Martin Matuska
8b2aa22d8f Partially fix ZFS compat code for sparc64.
Some endianess bugs still need to be resolved.

Submitted by:	marius (parts of the fix)
MFC after:	1 month
2011-04-08 11:08:26 +00:00
Artem Belevich
7a3f3cabb1 Stripped '32' suffix from linux systrace module name on i386.
Approved by: avg
2011-04-08 06:27:43 +00:00
Jung-uk Kim
3453537fa5 Use atomic load & store for TSC frequency. It may be overkill for amd64 but
safer for i386 because it can be easily over 4 GHz now.  More worse, it can
be easily changed by user with 'machdep.tsc_freq' tunable (directly) or
cpufreq(4) (indirectly).  Note it is intentionally not used in performance
critical paths to avoid performance regression (but we should, in theory).
Alternatively, we may add "virtual TSC" with lower frequency if maximum
frequency overflows 32 bits (and ignore possible incoherency as we do now).
2011-04-07 23:28:28 +00:00
Pawel Jakub Dawidek
65612637e8 Checking file access on size change is bogus. The checks are done earlier by
VFS where we know if this is truncate(2) or ftruncate(2). If this is the
latter we should depend on the mode the file was opened and not on the current
permission.

PR:		standards/154873
Reported by:	Mark Martinec <Mark.Martinec@ijs.si>
Discussed with:	Eric Schrock <eric.schrock@delphix.com>
Discussed with:	Mark Maybee <Mark.Maybee@Oracle.COM>
MFC after:	1 month
2011-03-24 20:28:09 +00:00
Pawel Jakub Dawidek
d7d23301ae Fix potential panic in dbuf_sync_list() relate to spill blocks handling.
Obtained from:	IllumOS
MFC after:	1 month
2011-03-14 11:07:12 +00:00
Andriy Gapon
308bce2a0e add DTrace systrace support for linux32 and freebsd32 on amd64 syscalls
Add systrace_linux32 and systrace_freebsd32 modules which provide
support for tracing compat system calls in addition to native system
call tracing provided by systrace module.

Provided that all the systrace modules are loaded now you can select
what syscalls to trace in the following manner:

syscall::xxx:yyy - work on all system calls that match the specification
syscall:freebsd:xxx:yyy - only native system calls
syscall:linux32:xxx:yyy - linux32 compat system calls
syscall:freebsd32:xxx:yyy - freebsd32 compat system calls on amd64

PR:		kern/152822
Submitted by:	Artem Belevich <fbsdlist@src.cx>
Reviewed by:	jhb (earlier version)
MFC after:	3 weeks
2011-03-12 09:09:25 +00:00
Pawel Jakub Dawidek
cae905e5d0 Correct readdir over ZFS handling.
Reported by:	Pierre Beyssac <pb@fasterix.frmug.org>
MFC after:	1 month
2011-03-08 18:39:41 +00:00
Pawel Jakub Dawidek
a96e8e86f0 Fix libzpool build.
MFC after:	1 month
2011-03-06 01:22:14 +00:00
Pawel Jakub Dawidek
2348f1110e Make renaming of a ZVOL, ZVOL's parent directory and ZVOL snapshot work.
Reported by:	avg
MFC after:	1 month
2011-03-05 22:31:03 +00:00
Pawel Jakub Dawidek
5bf0660559 Simplify zvol_remove_minors() a bit.
MFC after:	1 month
2011-03-05 22:24:31 +00:00
Pawel Jakub Dawidek
2fbdb9c0a0 Use proper lock in assertion.
MFC after:	1 month
2011-02-28 05:45:31 +00:00
Pawel Jakub Dawidek
10b9d77bf1 Finally... Import the latest open-source ZFS version - (SPA) 28.
Few new things available from now on:

- Data deduplication.
- Triple parity RAIDZ (RAIDZ3).
- zfs diff.
- zpool split.
- Snapshot holds.
- zpool import -F. Allows to rewind corrupted pool to earlier
  transaction group.
- Possibility to import pool in read-only mode.

MFC after:	1 month
2011-02-27 19:41:40 +00:00
Rebecca Cran
6bccea7c2b Fix typos - remove duplicate "the".
PR:	bin/154928
Submitted by:	Eitan Adler <lists at eitanadler.com>
MFC after: 	3 days
2011-02-21 09:01:34 +00:00
Marcel Moolenaar
6e23016fd7 Use the preload_fetch_addr() and preload_fetch_size() convenience
functions to obtain the address and size of the preloaded pool
configuration file/repository.

Sponsored by: Juniper Networks.
2011-02-13 19:46:55 +00:00
Konstantin Belousov
ca67168159 For UIO_NOCOPY case of reading request on zfs vnode, which has vm object
attached, activate the page after the successful read, and free the page
if read was unsuccessfull.

Freshly allocated page is not on any queue yet, and not activating (or
deactivating) the page leaves it on no queue, excluding the page from
pagedaemon scans and making the memory disappeared until the vnode
reclaimed.

Reviewed by:	avg
MFC after:	1 week
2011-02-11 10:46:15 +00:00
Edward Tomasz Napierala
dc7a965673 Make it impossible to clear the MNT_NFS4ACLS flag on ZFS filesystem
by using "mount -uw".

Reviewed by:	pjd
MFC after:	2 weeks
2011-02-06 23:34:09 +00:00
Andrey V. Elsukov
459d0e830d vdev's sectorsize should not be greater than 8 Kbytes and also
it should be power of 2. This prevents non-aligned access while
probing vdev's labels.

PR:		kern/147852
Reviewed by:	pjd
MFC after:	1 week
2011-02-04 15:22:56 +00:00
Martin Matuska
5c92680fa9 Recommit r218169, enclosing with #ifdef _KERNEL
This change is sufficient for the ZFS kernel module.

Discussed with:	pjd
MFC after:	1 week
2011-02-01 23:12:13 +00:00
Alexander Kabaev
a9c28a203d Revert r218169 until it can be tested and fixed properly. 2011-02-01 21:15:35 +00:00
Martin Matuska
4530e5f790 For ZFS, change the type of clock_t to int64_t.
The clock_t type in OpenSolaris is long (int64_t on amd64).
On FreeBSD clock_t is int32_t. The clock_t type is used in several places
in the ZFS code to store system uptime in milliseconds ("seconds * hz").

With hz=1000 we have a 32-bit integer overflow in 24 days, 20 hours,
31 minutes and 23.648 seconds. This has a user reported negative impact
on l2arc_feed_thread() and may cause unexpected results from other functions
using clock_t.

Reported by:	Artem Belevich <fbsdlist@src.cx> on freebsd-fs@
MFC after:	1 week
2011-02-01 14:28:50 +00:00
Jayachandran C.
baa8c35cb4 CDDL fixes for MIPS n32.
Provide 64 bit atomic ops, and use 32 bit pointer.
2011-01-28 06:12:59 +00:00
Matthew D Fleming
cbc134ad03 Introduce signed and unsigned version of CTLTYPE_QUAD, renaming
existing uses.  Rename sysctl_handle_quad() to sysctl_handle_64().
2011-01-19 23:00:25 +00:00
Edward Tomasz Napierala
7a93bf9a69 Add MNT_NFS4ACLS to ZFS mount flags. It's not conditional, since there
is no way to disable NFSv4 ACLs in ZFS.  This should make it easier
for the NFS server to figure out whether the exported filesystem supports
ACLs or not.

Reviewed by:	pjd
MFC after:	2 weeks
2011-01-19 17:11:52 +00:00
Matthew D Fleming
e704482d43 Re-commit the zfs sysctl(9) type-safety changes.
Thanks to dim and pjd for the pointer to zfs_context.h for building
userland.
2011-01-13 18:20:19 +00:00
Matthew D Fleming
374a993a88 Revert cddl changes for sysctl(9) until I understand why this isn't
building on universe.
2011-01-12 23:06:38 +00:00
Matthew D Fleming
4a2ce5903f sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly.
Commit the zfs piece.
2011-01-12 19:53:30 +00:00
Martin Matuska
df06a59a77 MFp4 r186485, r186859:
Fix a race by defining two tasks in the zio structure
as we can still be returning from issue task when interrupt task is used.

Tested by:	pjd
Approved by:	pjd, delphij (mentor)
MFC after:	3 days
2011-01-03 12:57:07 +00:00
Andriy Gapon
dfe3a1b374 cyclic xcall: use smp_no_rendevous_barrier as setup function parameter
In this case we call target function only on a single CPU and do not
need any synchronization at the setup stage.

It's a bit non-obvious but setup function of NULL means that
smp_rendezvous_cpus waits for all CPUs to arrive at the rendezvous
point, but without doing any actual setup.  While using
smp_no_rendevous_barrier means that each CPU proceeds on its own
schedule without any synchronization whatsoever.

MFC after:	3 weeks
2010-12-17 18:22:50 +00:00
Pawel Jakub Dawidek
8735863465 Remove redundant semicolon and empty like. 2010-12-11 13:35:25 +00:00
Ivan Voras
d7ccd95be8 Undo r216230: the interaction between saved ashift in metadata and
detected ashift does not support this. With this change, pools
created while stripesize=512 could not be imported when stripesize
becomes larger (on the same drive).

Noticed by:	pjd
2010-12-07 15:24:08 +00:00
Andriy Gapon
58f61ce4eb opensolaris cyclic: fix deadlock and make a little bit closer to upstream
The dealock was caused in the following way:
- thread T1 on CPU C1 holds a spin mutex, IPIs CPU C2 and waits for the
  IPI to be handled
- C2 executes timer interrupt filter, thus has interrupts disabled, and
  gets blocked on the spin mutex held by T1
The problem seems to have been introduced by simplifications made to
OpenSolaris code during porting.
The problem is fixed by reorganizing the code to more closely resemble
the upstream version.  Interrupt filter (cyclic_fire) now doesn't
acquire any locks, all per-CPU data accesses are performed on a
target CPU with preemption and interrupts disabled thus precluding
concurrent access to the data.
cyp_mtx spin mutex is used to disable preemtion and interrupts; it's not
used for classical mutual exclusion, because xcall already serializes
calls to a CPU.  It's an emulation of OpenSolaris
cyb_set_level(CY_HIGH_LEVEL) call, the spin mutexes could probably be
reduced to just a spinlock_enter()/_exit() pair.

Diff with upstream version is now reduced by ~500 lines, however it still
remains quite large - many things that are not needed (at the moment) or
are irrelevant on FreeBSD were simply ripped out during porting.
Examples of such things:
- support for CPU onlining/offlining
- support for suspend/resume
- support for running callouts at soft interrupt levels
- support for callout rebinding from CPU to CPU
- support for CPU partitions

Tested by:	Artem Belevich <fbsdlist@src.cx>
MFC after:	3 weeks
X-MFC with:	r216252
2010-12-07 12:25:26 +00:00
Andriy Gapon
a10b0e67d9 opensolaris cyclic xcall: no need for special handling of curcpu
smp_rendezvous_cpus already properly handles current CPU case
and non-SMP case.

MFC after:	3 weeks
2010-12-07 12:04:06 +00:00
Andriy Gapon
fe8c7b3d77 dtrace_xcall: no need for special handling of curcpu
smp_rendezvous_cpus alreadt does the right thing in a very similar
fashion, so the code was kind of duplicating that.

MFC after:	3 weeks
2010-12-07 09:19:47 +00:00
Andriy Gapon
7becfa95b9 dtrace_gethrtime_init: pin to master while examining other CPUs
Also use pc_cpumask to be future-friendly.

Reviewed by:	jhb
MFC after:	2 weeks
2010-12-07 09:03:17 +00:00
Ivan Voras
8b08562112 Use GEOM stripesize field when calculating ashift. This will enable correct
alignment on drives with large sector sizes (e.g. 4 KiB) but the
implementation might need to be revisited if devices with large stripesizes
appear (e.g. if RAID controllers or flash drives start using the field),
probably by introducing a physsectorsize field in GEOM providers.

Discussed with: mav, mostly silence on freebsd-geom@ and freebsd-fs@
2010-12-06 12:18:02 +00:00
Edward Tomasz Napierala
de2a57325d Don't panic when we read an empty ACL from ZFS. Apparently this may happen
with filesystems created under MacOS X ZFS port.  This is kind of filesystem
corruption (we don't allow for setting empty ACLs), so make acl_get_file(3)
and related syscalls fail with EINVAL in that case.  In theory, we could
return empty ACL to userland, but I'm afraid this would break some code.

MFC after:	3 days
2010-11-30 21:04:05 +00:00
Andriy Gapon
c59690f249 zfs+sendfile: populate all requested pages, not just those already cached
kern_sendfile() uses vm_rdwr() to read-ahead blocks of data to populate
page cache.  When sendfile stumbles upon a page that is not populated
yet, it sends out all the mbufs that it collected so far.  This
resulted in very poor performance with ZFS when file data is not in the
page cache, because ZFS vop_read for UIO_NOCOPY case populated only
those pages that are already in cache, but not valid.  Which means that
most of the time it populated only the first requested page in the
described above scenario.

Reported by:	Alexander Zagrebin <alexz@visp.ru>
Tested by:	Alexander Zagrebin <alexz@visp.ru>,
		Artemiev Igor <ai@kliksys.ru>
MFC after:	12 days
2010-11-16 15:53:44 +00:00
Andriy Gapon
f9e2e99d5d fix misspelling in a comment
Reported by:	Daniel Braniss <danny@cs.huji.ac.il>
MFC after:	3 days
2010-11-16 12:30:47 +00:00
Martin Matuska
8db47aa15e Disable VFS_HOLD placed on mnt_vnodecovered during the mount of a snapshot
and VFS_RELE on a non-existing hold on snapshot parent's z_vfs.

This disables the changes from OpenSolaris onnv-revision 9234:bffdc4fc05c4
(bug IDs: 6792139, 6794830) - not applicable to FreeBSD.

This fixes the process hang if umounting a manually mounted snapshot.

Reported by:	Alexander Zagrebin <alexz@visp.ru>
Approved by:	delphij (mentor)
MFC after:	1 week
2010-11-13 21:09:18 +00:00
Xin LI
b97a9057c2 Validate whether the zfs_cmd_t submitted from userland is not smaller than
what we have.  Without the check the kernel could accessing memory that
does not belong to the request struct.

Note that we do not test if the struct equals in size at this time, which
may faciliate forward compatibility with newer binaries.

Reviewed by:	pjd at MeetBSD CA '2010
MFC after:	1 week
2010-11-05 22:18:09 +00:00
Martin Matuska
e25376bdd0 Bugfix merge from OpenSolaris:
OpenSolaris onnv-revision:	10209:91f47f0e7728
6830541	zfs_get_data_trips on a verify
6696242	multiple zfs_fillpage() zfs: accessing past end of object panics
6785914	zfs fails to drop dn_struct_rwlock in recovery code path

Approved by:	delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6830541, 6696242, 6785914)
MFC after:	2 weeks
2010-10-26 15:48:03 +00:00
Andriy Gapon
23a1bcf8c6 zfs: add vop_getpages method implementation
This should make vnode_pager_getpages path a bit shorter and clearer.
Also this should eliminate problems with partially valid pages.
Having this method opens room for future optimizations.

To do: try to satisfy other pages besides the required one taking into
account tradeofs between number of page faults, read throughput and read
latency.  Also, eventually vop_putpages should be added too.

Reviewed by:	kib, mm, pjd
MFC after:	3 weeks
2010-10-16 20:43:05 +00:00
Rui Paulo
910a5e18ba Pass a format string to panic() and to taskqueue_start_threads().
Found with:	clang
2010-10-13 17:13:43 +00:00
Rui Paulo
6e634bb80f In zfs_post_common(), use %d instead of %hhu.
Found with:	clang
2010-10-13 17:12:23 +00:00
Andriy Gapon
f6bb41924c zfs + sendfile: do not produce partially valid pages for vnode's tail
Since r212650 and before this change sendfile(2) could produce
a partially valid page for a trailing portion of a ZFS vnode.
vm_fault() always wants to see a fully valid page even if it's the last
page that partially extends beyond vnode's end.  Otherwise it calls
vop_getpages() to bring in the page.  In the case of ZFS this means
that the data is read from the page into the same page and this breaks
checks in ZFS mappedread() - a thread that set VPO_BUSY on the page in
vm_fault() will get blocked forever waiting for it to be cleared.

Many thanks to Kai and Jeremy for reproducing the issue and providing
important debugging information and help.

Reported by:	Kai Gallasch <gallasch@free.de>,
		Jeremy Chadwick <freebsd@jdc.parodius.com>
Tested by:	Kai Gallasch <gallasch@free.de>,
		Jeremy Chadwick <freebsd@jdc.parodius.com>
Reviewed by:	kib
MFC after:	3 days
To-Do:		apply the same treatment to tmpfs + sendfile
2010-10-12 17:04:21 +00:00
Pawel Jakub Dawidek
19ebc67beb Provide internal ioflags() function that converts ioflag provided by FreeBSD's
VFS to OpenSolaris-specific ioflag expected by ZFS. Use it for read and write
operations.

Reviewed by:	mm
MFC after:	1 week
2010-10-10 20:49:33 +00:00
Martin Matuska
a362d75576 Change FAPPEND to IO_APPEND as this is a ioflag and not a fflag.
This corrects writing to append-only files on ZFS.

PR:		kern/149495 [1], kern/151082 [2]
Submitted by:	Daniel Zhelev <daniel@zhelev.biz> [1], Michael Naef <cal@linu.gs> [2]
Approved by:	delphij (mentor)
MFC after:	1 week
2010-10-08 23:01:38 +00:00
Andriy Gapon
6c6aca1203 opensolaris_kmem kmem_size(): report lesser of vm_kmem_size and available
physical memory

This is needed to correctly autotune ZFS ARC size when vm_kmem_size is
set to value larger than available physical memory.

MFC after:	2 weeks
2010-10-07 18:16:14 +00:00
Martin Matuska
aa007a9f0e Properly handle IO with B_FAILFAST
Retry IO once with ZIO_FLAG_TRYHARD before declaring a pool faulted

OpenSolaris revision and Bug IDs:

9725:0bf7402e8022
6843014 ZFS B_FAILFAST handling is broken

Approved by:	delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6843014)
MFC after:	3 weeks
2010-09-27 09:42:31 +00:00
Martin Matuska
96a1a6a568 Enable offlining of log devices.
OpenSolaris revision and Bug IDs:

9701:cc5b64682e64
6803605	should be able to offline log devices
6726045	vdev_deflate_ratio is not set when offlining a log device
6599442	zpool import has faults in the display

Approved by:	delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6803605, 6726045, 6599442)
MFC after:	3 weeks
2010-09-27 09:05:51 +00:00
Andriy Gapon
68653c3bd6 zfs_map_page/zfs_unmap_page: do not use sched_pin() and SFB_CPUPRIVATE
zfs_map_page/zfs_unmap_page are mostly called around potential I/O paths
and it seems to be a not very good idea to do cpu pinning there.

Suggested by:	kib
MFC after:	2 weeks
2010-09-21 05:58:45 +00:00
Andriy Gapon
ff5e15a487 zfs_vnops: use zfs_map_page/zfs_unmap_page helper functions in another place
MFC after:	2 weeks
2010-09-21 05:54:36 +00:00
Andriy Gapon
9d5eb9aa5d zfs arc_reclaim_needed: fix typo in mismerge in r212780
PR:		kern/146410, kern/138790
MFC after:	3 weeks
X-MFC with:	r212780
2010-09-17 07:34:50 +00:00
Andriy Gapon
921d3fd122 zfs+sendfile: advance uio_offset upon reading as well
Picked from analogous code in tmpfs.

MFC after:	1 week
2010-09-17 07:20:20 +00:00
Andriy Gapon
44532bc5cd zfs arc_reclaim_needed: remove redundant checks for arc_c_max and arc_c_max
Those checks are not present in upstream code and they are enforced in
actual calculations of delta by which ARC size can be grown or should be
reduced.

MFC after:	3 weeks
2010-09-17 07:17:38 +00:00
Andriy Gapon
7c1353491f zfs arc_reclaim_needed: more reasonable threshold for available pages
vm_paging_target() is not a trigger of any kind for pageademon, but
rather a "soft" target for it when it's already triggered.
Thus, trying to keep 2048 pages above that level at the expense of ARC
was simply driving ARC size into the ground even with normal memory
loads.
Instead, use a threshold at which a pagedaemon scan is triggered, so
that ARC reclaiming helps with pagedaemon's task, but the latter still
recycles active and inactive pages.

PR:		kern/146410, kern/138790
MFC after:	3 weeks
2010-09-17 07:14:07 +00:00
Martin Matuska
d1ee63f836 Fix kernel panic when moving a file to .zfs/shares
Fix possible loss of correct error return code in ZFS mount

OpenSolaris revisions and Bug IDs:

11824:53128e5db7cf
6863610	ZFS mount can lose correct error return

12079:13822b941977
6939941	problem with moving files in zfs (142901-12)

Approved by:	delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6863610, 6939941)
MFC after:	3 days
2010-09-15 19:55:26 +00:00
Andriy Gapon
8a3883cfb7 zfs vn_has_cached_data: take into account v_object->cache != NULL
This mirrors code in tmpfs.
This changge shouldn't affect much read path, it may cause unnecessary
vm_page_lookup calls in the case where v_object has no active or inactive
pages but has some cache pages.  I believe this situation to be non-essential.

In write path this change should allow us to properly detect the above
case and free a cache page when we write to a range that corresponds to it.
If this situation is undetected then we could have a discrepancy between
data in page cache and in ARC or on disk.

This change allows us to re-enable vn_has_cached_data() check in zfs_write.

NOTE: strictly speaking resident_page_count and cache fields of v_object
should be exmined under VM_OBJECT_LOCK, but for this particular usage
we may get away with it.

Discussed with:	alc, kib
Approved by:	pjd
Tested with:	tools/regression/fsx
MFC after:	3 weeks
2010-09-15 11:05:41 +00:00
Andriy Gapon
0b1ca38a69 zfs mappedread, update_pages: use int for offset and length within a page
uint64_t, int64_t were redundant there

Approved by:	pjd
Tested by:	tools/regression/fsx
MFC after:	2 weeks
2010-09-15 10:48:16 +00:00
Andriy Gapon
c002c3e8c2 zfs mappedread: use uiomove_fromphys where possible
Reviewed by:	alc
Approved by:	pjd
Tested by:	tools/regression/fsx
MFC after:	2 weeks
2010-09-15 10:44:20 +00:00
Andriy Gapon
fbbdb19dcd zfs: catch up with vm_page_sleep_if_busy changes
Reviewed by:	alc
Approved by:	pjd
Tested by:	tools/regression/fsx
MFC after:	2 weeks
2010-09-15 10:39:21 +00:00
Andriy Gapon
21bd3e2576 tmpfs, zfs + sendfile: mark page bits as valid after populating it with data
Otherwise, adding insult to injury, in addition to double-caching of data
we would always copy the data into a vnode's vm object page from backend.
This is specific to sendfile case only (VOP_READ with UIO_NOCOPY).

PR:		kern/141305
Reported by:	Wiktor Niesiobedzki <bsd@vink.pl>
Reviewed by:	alc
Tested by:	tools/regression/sockets/sendfile
MFC after:	2 weeks
2010-09-15 10:31:27 +00:00
Martin Matuska
9a13d2e1b3 Remove duplicated VFS_HOLD due to a mismerge.
PR:		kern/150544
Approved by:	delphij (mentor)
MFC after:	1 day
2010-09-14 12:12:18 +00:00
Martin Matuska
4eeef2e44a Add missing vop_vector zfsctl_ops_shares
Add missing locks around VOP_READDIR and VOP_GETATTR with z_shares_dir

PR:		kern/150544
Approved by:	delphij (mentor)
Obtained from:	perforce (pjd)
MFC after:	1 day
2010-09-14 10:27:32 +00:00
Pawel Jakub Dawidek
3c907063e9 Remove the page queues lock around vm_page_undirty() - it is no longer needed.
Reviewed by:	alc
2010-09-13 19:47:09 +00:00
Rui Paulo
47047e3418 Revamp locking a bit. This fixes three problems:
* processes now can't go away while we are inserting probes (fixes a panic)
* if a trap happens, we won't be holding the process lock (fixes a hang)
* fix a LOR between the process lock and the fasttrap bucket list lock

Thanks to kib for pointing some problems.
Sponsored by:	The FreeBSD Foundation
2010-09-12 14:12:16 +00:00
Rui Paulo
eae81e9501 Avoid a LOR (sleepable after non-sleepable) in
fasttrap_tracepoint_enable().

Sponsored by:	The FreeBSD Foundation
2010-09-11 12:58:31 +00:00
Matthew D Fleming
4d369413e1 Replace sbuf_overflowed() with sbuf_error(), which returns any error
code associated with overflow or with the drain function.  While this
function is not expected to be used often, it produces more information
in the form of an errno that sbuf_overflowed() did.
2010-09-10 16:42:16 +00:00
Pawel Jakub Dawidek
6a85b5e08a Forgot to commit this file. Add ZPOOL_CONFIG_IS_LOG.
Reported by:	keramida
MFC after:	2 weeks
2010-09-10 04:44:13 +00:00
Pawel Jakub Dawidek
86b19d1861 On FreeBSD we can log from pool that have multiple top-level vdevs or log
vdevs, so don't deny adding new vdevs if bootfs property is set.

MFC after:	2 weeks
2010-09-09 21:20:18 +00:00
Rui Paulo
d3555b6fc2 Fix two bugs in DTrace:
* when the process exits, remove the associated USDT probes
* when the process forks, duplicate the USDT probes.

Sponsored by:	The FreeBSD Foundation
2010-09-09 09:58:05 +00:00
Justin T. Gibbs
f03f7a0ca3 Correct bioq_disksort so that bioq_insert_tail() offers barrier semantic.
Add the BIO_ORDERED flag for struct bio and update bio clients to use it.

The barrier semantics of bioq_insert_tail() were broken in two ways:

 o In bioq_disksort(), an added bio could be inserted at the head of
   the queue, even when a barrier was present, if the sort key for
   the new entry was less than that of the last queued barrier bio.

 o The last_offset used to generate the sort key for newly queued bios
   did not stay at the position of the barrier until either the
   barrier was de-queued, or a new barrier (which updates last_offset)
   was queued.  When a barrier is in effect, we know that the disk
   will pass through the barrier position just before the
   "blocked bios" are released, so using the barrier's offset for
   last_offset is the optimal choice.

sys/geom/sched/subr_disk.c:
sys/kern/subr_disk.c:
	o Update last_offset in bioq_insert_tail().

	o Only update last_offset in bioq_remove() if the removed bio is
	  at the head of the queue (typically due to a call via
	  bioq_takefirst()) and no barrier is active.

	o In bioq_disksort(), if we have a barrier (insert_point is non-NULL),
	  set prev to the barrier and cur to it's next element.  Now that
	  last_offset is kept at the barrier position, this change isn't
	  strictly necessary, but since we have to take a decision branch
	  anyway, it does avoid one, no-op, loop iteration in the while
	  loop that immediately follows.

	o In bioq_disksort(), bypass the normal sort for bios with the
	  BIO_ORDERED attribute and instead insert them into the queue
	  with bioq_insert_tail().  bioq_insert_tail() not only gives
	  the desired command order during insertion, but also provides
	  barrier semantics so that commands disksorted in the future
	  cannot pass the just enqueued transaction.

sys/sys/bio.h:
	Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio.

sys/cam/ata/ata_da.c:
sys/cam/scsi/scsi_da.c
	Use an ordered command for SCSI/ATA-NCQ commands issued in
	response to bios with the BIO_ORDERED flag set.

sys/cam/scsi/scsi_da.c
	Use an ordered tag when issuing a synchronize cache command.

	Wrap some lines to 80 columns.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
sys/geom/geom_io.c
	Mark bios with the BIO_FLUSH command as BIO_ORDERED.

Sponsored by:	Spectra Logic Corporation
MFC after:	1 month
2010-09-02 19:40:28 +00:00
Rui Paulo
ea950d20f6 Make the /dev/dtrace/helper node have the mode 0660. This allows
programs that refuse to run as root (pgsql) to install probes when their
user is part of the wheel group.

Sponsored by:	The FreeBSD Foundation
2010-09-01 12:08:32 +00:00
Jaakko Heinonen
de478dd4b4 execve(2) has a special check for file permissions: a file must have at
least one execute bit set, otherwise execve(2) will return EACCES even
for an user with PRIV_VFS_EXEC privilege.

Add the check also to vaccess(9), vaccess_acl_nfs4(9) and
vaccess_acl_posix1e(9). This makes access(2) to better agree with
execve(2). Because ZFS doesn't use vaccess(9) for VEXEC, add the check
to zfs_freebsd_access() too. There may be other file systems which are
not using vaccess*() functions and need to be handled separately.

PR:		kern/125009
Reviewed by:	bde, trasz
Approved by:	pjd (ZFS part)
2010-08-30 16:30:18 +00:00
Pawel Jakub Dawidek
b8a4becc2d Return NULL pointer instead of B_FALSE as it is done in the vendor code.
Obtained from:	//depot/user/pjd/zfs/...
2010-08-28 19:29:06 +00:00
Pawel Jakub Dawidek
3e9e888541 Move ZUT_OBJS in the same place that is used in vendor code.
Obtained from:	//depot/user/pjd/zfs/...
2010-08-28 19:28:12 +00:00
Martin Matuska
8d87b396f8 Import changes from OpenSolaris that provide
- better ACL caching and speedup of ACL permission checks
- faster handling of stat()
- lowered mutex contention in the read/writer lock (rrwlock)
- several related bugfixes

Detailed information (OpenSolaris onnv changesets and Bug IDs):

9749:105f407a2680
6802734	Support for Access Based Enumeration (not used on FreeBSD)
6844861	inconsistent xattr readdir behavior with too-small buffer

9866:ddc5f1d8eb4e
6848431	zfs with rstchown=0 or file_chown_self privilege allows user to "take" ownership

9981:b4907297e740
6775100	stat() performance on files on zfs should be improved
6827779	rrwlock is overly protective of its counters

10143:d2d432dfe597
6857433	memory leaks found at: zfs_acl_alloc/zfs_acl_node_alloc
6860318	truncate() on zfsroot succeeds when file has a component of its path set without access permission

10232:f37b85f7e03e
6865875	zfs sometimes incorrectly giving search access to a dir

10250:b179ceb34b62
6867395	zpool_upgrade_007_pos testcase panic'd with BAD TRAP: type=e (#pf Page fault)

10269:2788675568fd
6868276	zfs_rezget() can be hazardous when znode has a cached ACL

10295:f7a18a1e9610
6870564	panic in zfs_getsecattr

Approved by:	delphij (mentor)
Obtained from:	OpenSolaris (multiple Bug IDs)
MFC after:	2 weeks
2010-08-28 09:24:11 +00:00
Martin Matuska
abe5837f7c Update ZFS metaslab code from OpenSolaris.
This provides a noticeable write speedup, especially on pools with
less than 30% of free space.

Detailed information (OpenSolaris onnv changesets and Bug IDs):

11146:7e58f40bcb1c
6826241	Sync write IOPS drops dramatically during TXG sync
6869229	zfs should switch to shiny new metaslabs more frequently

11728:59fdb3b856f6
6918420	zdb -m has issues printing metaslab statistics

12047:7c1fcc8419ca
6917066	zfs block picking can be improved

Approved by:	delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6826241, 6869229, 6918420, 6917066)
MFC after:	2 weeks
2010-08-28 08:59:55 +00:00
Rui Paulo
4d02a00a57 Remove debugging.
Sponsored by:	The FreeBSD Foundation
2010-08-28 08:39:37 +00:00
Rui Paulo
9f4ee6172d Replace a memory barrier with a mutex barrier.
Sponsored by:	The FreeBSD Foundation
2010-08-28 08:13:38 +00:00
Pawel Jakub Dawidek
4e52cdd0f7 Use ZFS_CTLDIR_NAME instead of hardcoding ".zfs". 2010-08-27 21:31:15 +00:00
Pawel Jakub Dawidek
8733ff6e11 Update comment now that I finally committed r211854.
MFC after:	1 month
2010-08-26 23:44:32 +00:00
Andriy Gapon
694a0a8717 zfs arc_reclaim_thread: no need to call arc_reclaim_needed when
resetting needfree

needfree is checked at the very start of arc_reclaim_needed.
This change makes code easier to follow and maintain in face of
potential changed in arc_reclaim_needed.

Also, put the whole sub-block under _KERNEL because needfree can be set
only in kernel code.

To do: rename needfree to something else to aovid confusion with
OpenSolaris global variable of the same name which is used in the same
code, but has different meaning (page deficit).

Note: I have an impression that locking around accesses to this variable
as well as mutual notifications between arc_reclaim_thread and
arc_lowmem are not proper.

MFC after:	1 week
2010-08-24 17:48:22 +00:00
Rui Paulo
5caab9d4cd Replace a pksignal() call with tdksignal().
Pointed out by:	kib
2010-08-24 12:12:03 +00:00
Rui Paulo
625564de63 MD fasttrap implementation.
Sponsored by:	The FreeBSD Foundation
2010-08-24 12:05:58 +00:00
Rui Paulo
8605d1ae99 Port the fasttrap provider to FreeBSD. This provider is responsible for
injecting debugging probes in the userland programs and is the basis for
the pid provider and the usdt provider.

Sponsored by:	The FreeBSD Foundation
2010-08-24 11:11:58 +00:00
Rui Paulo
4e41f3537a Port this to FreeBSD. We miss some suword functions, so we use copyout.
Sponsored by:	The FreeBSD Foundation
2010-08-22 11:41:06 +00:00
Rui Paulo
de788cde7b Destroy the helper device when unloading.
Sponsored by:	The FreeBSD Foundation
2010-08-22 11:05:37 +00:00
Rui Paulo
6c44520886 Add more compatibility structure members needed by the upcoming fasttrap
DTrace device.

Sponsored by:	The FreeBSD Foundation
2010-08-22 11:04:43 +00:00
Rui Paulo
c6f5742f90 Kernel DTrace support for:
o uregs  (sson@)
o ustack (sson@)
o /dev/dtrace/helper device (needed for USDT probes)

The work done by me was:
Sponsored by:	The FreeBSD Foundation
2010-08-22 10:53:32 +00:00
Rui Paulo
58f668bba5 Add a function compatibility function dtrace_instr_size_isa() that on
FreeBSD does the same as dtrace_dis_isize().

Sponsored by:	The FreeBSD Foundation
2010-08-22 10:40:15 +00:00
Rui Paulo
5e3caca7f6 Add the FreeBSD definition for the fasttrap ioctls.
Sponsored by:	The FreeBSD Foundation
2010-08-22 10:13:56 +00:00
Rui Paulo
cd306d6fa1 Add a sysname char * to struct opensolaris_utsname.
Sponsored by:	The FreeBSD Foundation
2010-08-21 14:09:24 +00:00
Rui Paulo
e60cf26f21 Port the DTrace helper ioctls to FreeBSD and add a helper member to
dof_helper_t (needed by drti.o).

Sponsored by:	The FreeBSD Foundation
2010-08-21 11:58:08 +00:00
Rui Paulo
e0be1c75f0 Add sysname to struct opensolaris_utsname. This is needed by one DTrace
test.

Sponsored by:	The FreeBSD Foundation
2010-08-21 11:41:32 +00:00
Warner Losh
5c0e643e73 First cut at mips n64 ABI support 2010-08-19 03:31:26 +00:00
Pawel Jakub Dawidek
8dc7024be4 In FreeBSD we use 'jailed' property.
MFC after:	2 weeks
2010-08-07 10:23:54 +00:00
Martin Matuska
f4e7a6c3f1 Import two changesets from OpenSolaris to make future updates easier.
The changes do not affect FreeBSD code because
zfs_znode_move(), cleanlocks() and cleanshares() are not used.

OpenSolaris onnv changeset:	9788:f660bc44f2e8, 9909:aa280f585a3e

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6843700, 6790232)
MFC after:	7 weeks
2010-07-25 15:17:24 +00:00
Martin Matuska
34f56898a1 Consider snapshots as descendants via zfs allow -d
OpenSolaris onnv changeset:	9847:2f3ba86e857a

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6809340)
MFC after:	1 week
2010-07-24 22:28:29 +00:00
Andriy Gapon
a85d8d8acc zfs arc_memory_throttle: available memory is free + cache
OpenSolaris freemem has the same meaning as our v_free_count +
v_cache_count.

Obtained from:	Artem Belevich <fbsdlist@src.cx>,
		Peter Jeremy <peterjeremy@acm.org>
Discussed with:	pjd
MFC after:	2 weeks
2010-07-23 17:44:01 +00:00
Martin Matuska
2bacd082bd Enable fake resolving of SMB RIDs by using nulldomain and UID_NOBODY
- fixes panics when Solaris/OpenSolaris pools that contain files
uploaded with the SMB protocol are accessed

Enable seting/unsetting the sharesmb property (dummy action)
- allows users who import pools from Solaris/Opensolaris to unset
the sharesmb property and get rid of annoying messages

PR:		kern/145778, kern/148709
Approved by:	pjd, delphij (mentor)
MFC after:	7 weeks
2010-07-22 23:30:24 +00:00
Martin Matuska
f926b455e7 To improve latency, lower default vfs.zfs.vdev.max_pending from 35 to 10
OpenSolaris onnv changeset (partial):	10801:e0bf032e8673

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6891731)
MFC after:	1 week
2010-07-20 05:22:14 +00:00
Nathan Whitehorn
2785677d3d Add OpenSolaris atomics for powerpc64 and connect ZFS to the build on
this platform.

Reviewed by:	pjd
2010-07-17 13:34:01 +00:00
Nathan Whitehorn
04bcbbf81e Increase stack size for ZFS sync thread. This is required to make ZFS
function on 64-bit PowerPC.

Reviewed by:	pjd
Obtained from:	OpenSolaris changeset 14653:7cf402a7f374
2010-07-17 13:31:27 +00:00
John Baldwin
61e1c19319 Revert the previous commit. The race is not applicable to the lockmgr
implementation in 8.0 and later as its flags field does not hold dynamic
state such as waiters flags, but is only modified in lockinit() aside
from VN_LOCK_*().

Discussed with:	attilio
2010-07-16 19:52:03 +00:00
John Baldwin
dbfcf8cfea When the MNTK_EXTENDED_SHARED mount option was added, some filesystems were
changed to defer the setting of VN_LOCK_ASHARE() (which clears LK_NOSHARE
in the vnode lock's flags) until after they had determined if the vnode was
a FIFO.  This occurs after the vnode has been inserted a VFS hash or some
similar table, so it is possible for another thread to find this vnode via
vget() on an i-node number and block on the vnode lock.  If the lockmgr
interlock (vnode interlock for vnode locks) is not held when clearing the
LK_NOSHARE flag, then the lk_flags field can be clobbered.  As a result
the thread blocked on the vnode lock may never get woken up.  Fix this by
holding the vnode interlock while modifying the lock flags in this case.

MFC after:	3 days
2010-07-16 19:20:20 +00:00
Martin Matuska
8fc257994d Merge ZFS version 15 and almost all OpenSolaris bugfixes referenced
in Solaris 10 updates 141445-09 and 142901-14.

Detailed information:
(OpenSolaris revisions and Bug IDs, Solaris 10 patch numbers)

7844:effed23820ae
6755435	zfs_open() and zfs_close() needs to use ZFS_ENTER/ZFS_VERIFY_ZP (141445-01)

7897:e520d8258820
6748436	inconsistent zpool.cache in boot_archive could panic a zfs root filesystem upon boot-up (141445-01)

7965:b795da521357
6740164	zpool attach can create an illegal root pool (141909-02)

8084:b811cc60d650
6769612	zpool_import() will continue to write to cachefile even if altroot is set (N/A)

8121:7fd09d4ebd9c
6757430	want an option for zdb to disable space map loading and leak tracking (141445-01)

8129:e4f45a0bfbb0
6542860	ASSERT: reason != VDEV_LABEL_REMOVE||vdev_inuse(vd, crtxg, reason, 0) (141445-01)

8188:fd00c0a81e80
6761100	want zdb option to select older uberblocks (141445-01)

8190:6eeea43ced42
6774886	zfs_setattr() won't allow ndmp to restore SUNWattr_rw (141445-01)

8225:59a9961c2aeb
6737463	panic while trying to write out config file if root pool import fails (141445-01)

8227:f7d7be9b1f56
6765294	Refactor replay (141445-01)

8228:51e9ca9ee3a5
6572357	libzfs should do more to avoid mnttab lookups (141909-01)
6572376	zfs_iter_filesystems and zfs_iter_snapshots get objset stats twice (141909-01)

8241:5a60f16123ba
6328632	zpool offline is a bit too conservative (141445-01)
6739487	ASSERT: txg <= spa_final_txg due to scrub/export race (141445-01)
6767129	ASSERT: cvd->vdev_isspare, in spa_vdev_detach() (141445-01)
6747698	checksum failures after offline -t / export / import / scrub (141445-01)
6745863	ZFS writes to disk after it has been offlined (141445-01)
6722540	50% slowdown on scrub/resilver with certain vdev configurations (141445-01)
6759999	resilver logic rewrites ditto blocks on both source and destination (141445-01)
6758107	I/O should never suspend during spa_load() (141445-01)
6776548	codereview(1) runs off the page when faced with multi-line comments (N/A)
6761406	AMD errata 91 workaround doesn't work on 64-bit systems (141445-01)

8242:e46e4b2f0a03
6770866	GRUB/ZFS should require physical path or devid, but not both (141445-01)

8269:03a7e9050cfd
6674216	"zfs share" doesn't work, but "zfs set sharenfs=on" does (141445-01)
6621164	$SRC/cmd/zfs/zfs_main.c seems to have a syntax error in the translation note (141445-01)
6635482	i18n problems in libzfs_dataset.c and zfs_main.c (141445-01)
6595194	"zfs get" VALUE column is as wide as NAME (141445-01)
6722991	vdev_disk.c: error checking for ddi_pathname_to_dev_t() must test for NODEV (141445-01)
6396518	ASSERT strings shouldn't be pre-processed (141445-01)

8274:846b39508aff
6713916	scrub/resilver needlessly decompress data (141445-01)

8343:655db2375fed
6739553	libzfs_status msgid table is out of sync (141445-01)
6784104	libzfs unfairly rejects numerical values greater than 2^63 (141445-01)
6784108	zfs_realloc() should not free original memory on failure (141445-01)

8525:e0e0e525d0f8
6788830	set large value to reservation cause core dump (141445-01)
6791064	want sysevents for ZFS scrub (141445-01)
6791066	need to be able to set cachefile on faulted pools (141445-01)
6791071	zpool_do_import() should not enable datasets on faulted pools (141445-01)
6792134	getting multiple properties on a faulted pool leads to confusion (141445-01)

8547:bcc7b46e5ff7
6792884	Vista clients cannot access .zfs (141445-01)

8632:36ef517870a3
6798384	It can take a village to raise a zio (141445-01)

8636:7e4ce9158df3
6551866	deadlock between zfs_write(), zfs_freesp(), and zfs_putapage() (141909-01)
6504953	zfs_getpage() misunderstands VOP_GETPAGE() interface (141909-01)
6702206	ZFS read/writer lock contention throttles sendfile() benchmark (141445-01)
6780491	Zone on a ZFS filesystem has poor fork/exec performance (141445-01)
6747596	assertion failed: DVA_EQUAL(BP_IDENTITY(&zio->io_bp_orig), BP_IDENTITY(zio->io_bp))); (141445-01)

8692:692d4668b40d
6801507	ZFS read aggregation should not mind the gap (141445-01)

8697:e62d2612c14d
6633095	creating a filesystem with many properties set is slow (141445-01)

8768:dfecfdbb27ed
6775697	oracle crashes when overwriting after hitting quota on zfs (141909-01)

8811:f8deccf701cf
6790687	libzfs mnttab caching ignores external changes (141445-01)
6791101	memory leak from libzfs_mnttab_init (141445-01)

8845:91af0d9c0790
6800942	smb_session_create() incorrectly stores IP addresses (N/A)
6582163	Access Control List (ACL) for shares (141445-01)
6804954	smb_search - shortname field should be space padded following the NULL terminator (N/A)
6800184	Panic at smb_oplock_conflict+0x35() (N/A)

8876:59d2e67b4b65
6803822	Reboot after replacement of system disk in a ZFS mirror drops to grub> prompt (141445-01)

8924:5af812f84759
6789318	coredump when issue zdb -uuuu poolname/ (141445-01)
6790345 zdb -dddd -e poolname coredump (141445-01)
6797109 zdb: 'zdb -dddddd pool_name/fs_name inode' coredump if the file with inode was deleted (141445-01)
6797118 zdb: 'zdb -dddddd poolname inum' coredump if I miss the fs name (141445-01)
6803343 shareiscsi=on failed, iscsitgtd failed request to share (141445-01)

9030:243fd360d81f
6815893	hang mounting a dataset after booting into a new boot environment (141445-01)

9056:826e1858a846
6809691	'zpool create -f' no longer overwrites ufs infomation (141445-01)

9179:d8fbd96b79b3
6790064	zfs needs to determine uid and gid earlier in create process (141445-01)

9214:8d350e5d04aa
6604992	forced unmount + being in .zfs/snapshot/<snap1> = not happy (141909-01)
6810367	assertion failed: dvp->v_flag & VROOT, file: ../../common/fs/gfs.c, line: 426 (141909-01)

9229:e3f8b41e5db4
6807765	ztest_dsl_dataset_promote_busy needs to clean up after ENOSPC (141445-01)

9230:e4561e3eb1ef
6821169	offlining a device results in checksum errors (141445-01)
6821170	ZFS should not increment error stats for unavailable devices (141445-01)
6824006	need to increase issue and interrupt taskqs threads in zfs (141445-01)

9234:bffdc4fc05c4
6792139	recovering from a suspended pool needs some work (141445-01)
6794830	reboot command hangs on a failed zfs pool (141445-01)

9246:67c03c93c071
6824062	System panicked in zfs_mount due to NULL pointer dereference when running btts and svvs tests (141909-01)

9276:a8a7fc849933
6816124	System crash running zpool destroy on broken zpool (141445-03)

9355:09928982c591
6818183	zfs snapshot -r is slow due to set_snap_props() doing txg_wait_synced() for each new snapshot (141445-03)

9391:413d0661ef33
6710376	log device can show incorrect status when other parts of pool are degraded (141445-03)

9396:f41cf682d0d3 (part already merged)
6501037	want user/group quotas on ZFS (141445-03)
6827260	assertion failed in arc_read(): hdr == pbuf->b_hdr (141445-03)
6815592	panic: No such hold X on refcount Y from zfs_znode_move (141445-03)
6759986	zfs list shows temporary %clone when doing online zfs recv (141445-03)

9404:319573cd93f8
6774713	zfs ignores canmount=noauto when sharenfs property != off (141445-03)

9412:4aefd8704ce0
6717022	ZFS DMU needs zero-copy support (141445-03)

9425:e7ffacaec3a8
6799895	spa_add_spares() needs to be protected by config lock (141445-03)
6826466	want to post sysevents on hot spare activation (141445-03)
6826468	spa 'allowfaulted' needs some work (141445-03)
6826469	kernel support for storing vdev FRU information (141445-03)
6826470	skip posting checksum errors from DTL regions of leaf vdevs (141445-03)
6826471	I/O errors after device remove probe can confuse FMA (141445-03)
6826472	spares should enjoy some of the benefits of cache devices (141445-03)

9443:2a96d8478e95
6833711	gang leaders shouldn't have to be logical (141445-03)

9463:d0bd231c7518
6764124	want zdb to be able to checksum metadata blocks only (141445-03)

9465:8372081b8019
6830237	zfs panic in zfs_groupmember() (141445-03)

9466:1fdfd1fed9c4
6833162	phantom log device in zpool status (141445-03)

9469:4f68f041ddcd
6824968	add ZFS userquota support to rquotad (141445-03)

9470:6d827468d7b5
6834217	godfather I/O should reexecute (141445-03)

9480:fcff33da767f
6596237	Stop looking and start ganging (141909-02)

9493:9933d599bc93
6623978	lwb->lwb_buf != NULL, file ../../../uts/common/fs/zfs/zil.c, line 787, function zil_lwb_commit (141445-06)

9512:64cafcbcc337
6801810	Commit of aligned streaming rewrites to ZIL device causes unwanted disk reads (N/A)

9515:d3b739d9d043
6586537	async zio taskqs can block out userland commands (142901-09)

9554:787363635b6a
6836768	zfs_userspace() callback has no way to indicate failure (N/A)

9574:1eb6a6ab2c57
6838062	zfs panics when an error is encountered in space_map_load() (141909-02)

9583:b0696cd037cc
6794136	Panic BAD TRAP: type=e when importing degraded zraid pool. (141909-03)

9630:e25a03f552e0
6776104	"zfs import" deadlock between spa_unload() and spa_async_thread() (141445-06)

9653:a70048a304d1
6664765	Unable to remove files when using fat-zap and quota exceeded on ZFS filesystem (141445-06)

9688:127be1845343
6841321	zfs userspace / zfs get userused@ doesn't work on mounted snapshot (N/A)
6843069	zfs get userused@S-1-... doesn't work (N/A)

9873:8ddc892eca6e
6847229	assertion failed: refcount_count(&tx->tx_space_written) + delta <= tx->tx_space_towrite in dmu_tx.c (141445-06)

9904:d260bd3fd47c
6838344	kernel heap corruption detected on zil while stress testing (141445-06)

9951:a4895b3dd543
6844900	zfs_ioc_userspace_upgrade leaks (N/A)

10040:38b25aeeaf7a
6857012	zfs panics on zpool import (141445-06)

10000:241a51d8720c
6848242	zdb -e no longer works as expected (N/A)

10100:4a6965f6bef8
6856634	snv_117 not booting: zfs_parse_bootfs: error2 (141445-07)

10160:a45b03783d44
6861983	zfs should use new name <-> SID interfaces (N/A)
6862984	userquota commands can hang (141445-06)

10299:80845694147f
6696858	zfs receive of incremental replication stream can dereference NULL pointer and crash (N/A)

10302:a9e3d1987706
6696858	zfs receive of incremental replication stream can dereference NULL pointer and crash (fix lint) (N/A)

10575:2a8816c5173b (partial merge)
6882227 spa_async_remove() shouldn't do a full clear (142901-14)

10800:469478b180d9
6880764	fsync on zfs is broken if writes are greater than 32kb on a hard crash and no log attached (142901-09)
6793430 zdb -ivvvv assertion failure: bp->blk_cksum.zc_word[2] == dmu_objset_id(zilog->zl_os) (N/A)

10801:e0bf032e8673 (partial merge)
6822816 assertion failed: zap_remove_int(ds_next_clones_obj) returns ENOENT (142901-09)

10810:b6b161a6ae4a
6892298 buf->b_hdr->b_state != arc_anon, file: ../../common/fs/zfs/arc.c, line: 2849 (142901-09)

10890:499786962772
6807339	spurious checksum errors when replacing a vdev (142901-13)

11249:6c30f7dfc97b
6906110 bad trap panic in zil_replay_log_record (142901-13)
6906946 zfs replay isn't handling uid/gid correctly (142901-13)

11454:6e69bacc1a5a
6898245 suspended zpool should not cause rest of the zfs/zpool commands to hang (142901-10)

11546:42ea6be8961b (partial merge)
6833999 3-way deadlock in dsl_dataset_hold_ref() and dsl_sync_task_group_sync() (142901-09)

Discussed with:	pjd
Approved by:	delphij (mentor)
Obtained from:	OpenSolaris (multiple Bug IDs)
MFC after:	2 months
2010-07-12 23:49:04 +00:00
Rui Paulo
6e19f4de12 Merge from vendor-sys/opensolaris:
* add fasttrap files
2010-07-06 10:28:19 +00:00
Martin Matuska
d3cf8f4b68 Import latest ARC change from OpenSolaris:
- large ghost eviction causes high write latency
- arc_adjust might adjust MRU unnecessarily
- arc_adapt can lead to wild arc_p adjustment

OpenSolaris onnv-revision:	12636:13b5d698941e

Submitted by:	avg
Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6950219, 6953403, 6951024)
MFC after:	1 month
2010-06-17 22:47:44 +00:00
Pawel Jakub Dawidek
653e034db5 Turn off UMA allocations on all archs by default. It isn't stable even on
amd64.

Reported by:	many
MFC after:	3 days
2010-06-17 17:41:42 +00:00
Pawel Jakub Dawidek
fcc7888f82 Remove redundant assignment.
MFC after:	3 days
2010-06-16 12:42:20 +00:00
Martin Matuska
bc5752e811 Fix arc_read_done may try to byteswap undefined data (sparc related)
OpenSolaris onnv-revision:	10839:cf83b553a2ab

Obtained from:	OpenSolaris (Bug ID 6836714)
Approved by:	pjd, delphij (mentor)
MFC after:	3 days
2010-06-12 11:28:46 +00:00
Martin Matuska
726db0af89 Fix panic in zfs_getsecattr
OpenSolaris onnv-revision:	10295:f7a18a1e9610

Obtained from:	OpenSolaris (Bug ID 6870564)
Approved by:	pjd, delphij (mentor)
MFC after:	3 days
2010-06-12 11:27:10 +00:00
Martin Matuska
9dac494ce6 Fix possible zfs panic on zpool import
OpenSolaris onnv-revision:	10040:38b25aeeaf7a

Obtained from:	OpenSolaris (Bug ID 6857012)
Approved by:	pjd, delphij (mentor)
MFC after:	3 days
2010-06-12 11:25:57 +00:00
Martin Matuska
16547ea20b Fix zpool resilver stalls with spa_scrub_thread in a 3 way deadlock
OpenSolaris onnv-revision:	9997:174d75a29a1c

Obtained from:	OpenSolaris (Bug ID 6843235)
Approved by:	pjd, delphij (mentor)
MFC after:	3 days
2010-06-12 11:24:10 +00:00
Martin Matuska
072b6fc60e Fix ZFS panic deadlock: cycle in blocking chain via zfs_zget
OpenSolaris onnv-revision:	9774:0bb234ab2287

Obtained from:	OpenSolaris (Bug ID 6788152)
Approved by:	pjd, delphij (mentor)
MFC after:	3 days
2010-06-12 11:22:45 +00:00
Martin Matuska
1aa2ebdd23 Fix vdev_probe() starvation brings txg train to a screeching halt
OpenSolaris onnv-revision:	9722:e3866bad4e96

Obtained from:	OpenSolaris (Bug ID 6844069)
Approved by:	pjd, delphij (mentor)
MFC after:	3 days
2010-06-12 11:21:37 +00:00
Martin Matuska
a2d6c8d15b Fix incomplete resilvering after disk replacement (raidz)
OpenSolaris onnv-revision:	9434:3bebded7c76a

Obtained from:	OpenSolaris (Bug ID 6794570)
Approved by:	pjd, delphij (mentor)
MFC after:	3 days
2010-06-12 11:20:50 +00:00
Martin Matuska
b90308c521 Fix zfs destroy fails to free object in open context, stops up txg train
OpenSolaris onnv-revision:	9409:9dc3f17354ed

Obtained from:	OpenSolaris (Bug ID 6809683)
Approved by:	pjd, delphij (mentor)
MFC after:	3 days
2010-06-12 11:19:51 +00:00
Martin Matuska
62c55b6d08 Fix unable to remove a file over NFS after hitting refquota limit
OpenSolaris onnv-revision:	8890:8c2bd5f17bf2

Obtained from:	OpenSolaris (Bug ID 6798878)
Approved by:	pjd, delphij (mentor)
MFC after:	3 days
2010-06-12 11:18:29 +00:00
John Baldwin
3aa6d94e0c Update several places that iterate over CPUs to use CPU_FOREACH(). 2010-06-11 18:46:34 +00:00
Martin Matuska
711bf9bcf1 Fix freeing space after deleting large files with holes.
OpenSolaris onnv revision:	9950:78fc41aa9bc5

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6792701)
MFC after:	3 days
2010-06-03 11:08:46 +00:00
Martin Matuska
dc5d34e454 Fix ZIL close when doing zfs rollback or zfs receive on a mounted dataset.
The fix is a partial import and merge of OpenSolaris onnv revisions
8227:f7d7be9b1f56. and 9292:e112194b5b73

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6798298)
MFC after:	3 days
2010-06-01 08:43:46 +00:00
Pawel Jakub Dawidek
510ec358c5 Fix a bug where resilver is not started automatically on pool import or load.
If disk was missing on pool load or import and on next pool load or import
it was present, resilver wasn't started automatically and ZFS reported all disks
as ONLINE and healthy. Then, when another disk died, pool became unaccessible,
because if it was 2-way mirror or RAIDZ1 two vdevs were out of sync.

To fix the problem, start resilver automatically on pool load or import.

Obtained from:	OpenSolaris
MFC after:	3 days
2010-05-31 23:17:45 +00:00
Pawel Jakub Dawidek
b1c7417cd8 Fix panic when reading label from provider with non power of 2 sector size.
Reported by:	James R. Van Artsdalen <james-freebsd-fs2@jrv.org>
MFC after:	3 days
2010-05-31 23:11:43 +00:00
Martin Matuska
dd85b12982 Remove kstat.zfs.arcstats.l2_write_bytes_written
The arcstats.l2_write_bytes_written kstat counter introduced
in r205231 was duplicite with vendor's arcstats.l2_write_bytes counter
imported in r208373 (OpenSolaris revision 8582:df9361868dbe)

Approved by:	pjd, delphij (mentor)
MFC after:	3 days
2010-05-23 21:16:34 +00:00
Martin Matuska
5b170d55ae Fix zfs receive temporarily changing unchanged stream properties.
Fix possible panic with zfs_enable_datasets.

OpenSolaris onnv revision:	8536:33bd5de3260e

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6748561, 6757075)
MFC after:	3 days
2010-05-23 21:02:43 +00:00
Pawel Jakub Dawidek
4e8c7af455 Create UMA zones unconditionally.
MFC after:	3 days
2010-05-23 19:10:06 +00:00
Pawel Jakub Dawidek
a95add4cf8 Remove ZIO_USE_UMA from arc.c as well.
MFC after:	3 days
2010-05-23 18:42:33 +00:00
Konstantin Belousov
afe1a68827 Reorganize syscall entry and leave handling.
Extend struct sysvec with three new elements:
sv_fetch_syscall_args - the method to fetch syscall arguments from
  usermode into struct syscall_args. The structure is machine-depended
  (this might be reconsidered after all architectures are converted).
sv_set_syscall_retval - the method to set a return value for usermode
  from the syscall. It is a generalization of
  cpu_set_syscall_retval(9) to allow ABIs to override the way to set a
  return value.
sv_syscallnames - the table of syscall names.

Use sv_set_syscall_retval in kern_sigsuspend() instead of hardcoding
the call to cpu_set_syscall_retval().

The new functions syscallenter(9) and syscallret(9) are provided that
use sv_*syscall* pointers and contain the common repeated code from
the syscall() implementations for the architecture-specific syscall
trap handlers.

Syscallenter() fetches arguments, calls syscall implementation from
ABI sysent table, and set up return frame. The end of syscall
bookkeeping is done by syscallret().

Take advantage of single place for MI syscall handling code and
implement ptrace_lwpinfo pl_flags PL_FLAG_SCE, PL_FLAG_SCX and
PL_FLAG_EXEC. The SCE and SCX flags notify the debugger that the
thread is stopped at syscall entry or return point respectively.  The
EXEC flag augments SCX and notifies debugger that the process address
space was changed by one of exec(2)-family syscalls.

The i386, amd64, sparc64, sun4v, powerpc and ia64 syscall()s are
changed to use syscallenter()/syscallret(). MIPS and arm are not
converted and use the mostly unchanged syscall() implementation.

Reviewed by:	jhb, marcel, marius, nwhitehorn, stas
Tested by:	marcel (ia64), marius (sparc64), nwhitehorn (powerpc),
	stas (mips)
MFC after:	1 month
2010-05-23 18:32:02 +00:00
Martin Matuska
55a381515b Fix kernel panic when calling spa_tryimport() on a corrupted pool.
OpenSolaris onnv revision:	8680:005fe27123ba

Approved by:	delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6786321)
MFC after:	1 day
2010-05-23 10:13:11 +00:00
Martin Matuska
e3fffd1a9f Fix mutex_exit misorder that can cause a kernel panic.
OpenSolaris onnv revision:	8667:5c308a17eb7c

Approved by:	delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6795440)
MFC after:	1 day
2010-05-23 10:08:05 +00:00
Martin Matuska
7838815ebb Update L2ARC code and fix several bugs.
- improve ARC memory consumption (Bug ID 6488341)
- ARC/L2ARC metadata accounting (Bug ID 6748019)
- L2ARC turbo warmup (Bud ID 6748023)
- kstats for ARC content (Bug ID 6748023)
- kstats for evicted bytes from ARC by L2ARC state (Bud ID 6871680)
- fix panic on i386 systems (Bug ID 6821260)

OpenSolaris onnv revisions:
8582:df9361868dbe, 8628:97dcded6e556, 9215:7c4584f76b47,
9274:a10f8bd993c1, 10357:29060492b29d

OpenSolaris Bug IDs:
6748019, 6748023, 6748030, 6488341, 6798268, 6821260, 6790261, 6871680

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSlaris (multiple bug IDs)
MFC after:	3 days
2010-05-21 09:52:49 +00:00
Martin Matuska
370227d241 Reorder some already introduced locking variables.
OpenSolaris onnv revision:	8214:d7abf7c1f1c1

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6747934)
MFC after:	3 days
2010-05-21 09:35:28 +00:00
Martin Matuska
911e1f9b1d Fix stack overflow in zfs send.
OpenSolaris onnv-revision: 8012:8ea30813950f

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6765626)
MFC after:	3 days
2010-05-21 08:55:18 +00:00
Martin Matuska
8b2bc083b9 Fix: vdev_reopen() can lead to failed allocations
OpenSolaris onnv-revision: 7980:589f37f25048

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6764914)
MFC after:	3 days
2010-05-21 08:50:34 +00:00
Pawel Jakub Dawidek
2b3d97b81d Fix userland build by making io_task available only for the kernel and by
providing taskq_dispatch_safe() macro.

MFC after:	1 week
2010-05-16 19:44:08 +00:00
Pawel Jakub Dawidek
ed3c664257 Allow to configure UMA usage for ZIO data via loader and turn it on by
default for amd64. On i386 I saw performance degradation when UMA was used,
but for amd64 it should help.

MFC after:	3 days
2010-05-16 15:14:59 +00:00
Pawel Jakub Dawidek
cfb3e98d37 Add task structure to zio and use it instead of allocating one.
This eliminates the only place where we can sleep when calling zio_interrupt().
As a side-effect this can actually improve performance a little as we
allocate one less thing for every I/O.

Prodded by:	kib
MFC after:	1 week
2010-05-16 15:12:34 +00:00
Pawel Jakub Dawidek
ea478cb1da The whole point of having dedicated worker thread for each leaf VDEV was to
avoid calling zio_interrupt() from geom_up thread context. It turns out that
when provider is forcibly removed from the system and we kill worker thread
there can still be some ZIOs pending. To complete pending ZIOs when there is
no worker thread anymore we still have to call zio_interrupt() from geom_up
context. To avoid this race just remove use of worker threads altogether.
This should be more or less fine, because I also thought that zio_interrupt()
does more work, but it only makes small UMA allocation with M_WAITOK.
It also saves one context switch per I/O request.

PR:		kern/145339
Reported by:	Alex Bakhtin <Alex.Bakhtin@gmail.com>
MFC after:	1 week
2010-05-16 11:56:42 +00:00
Martin Matuska
ee56d88b76 Fix deadlock between zfs_dirent_lock and zfs_rmdir
OpenSolaris onnv revision:	11321:506b7043a14c

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6847615)
MFC after:	3 days
2010-05-16 07:46:03 +00:00
Martin Matuska
db708a6e2c Fix perfomance problem with ZFS prefetch caching [1]
Add statistics for ZFS prefetch (sysctl kstat.zfs.misc.zfetchstats)

Partial import of OpenSolaris onnv revision 10474:0e96dd3b905a

Reported by:	jhell@dataix.net (private e-mail) [1]
Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6859997, 6868951)
MFC after:	3 days
2010-05-16 07:16:28 +00:00
Martin Matuska
bef629c14d Fix ZIL-related panic on zfs rollback.
OpenSolaris onnv-revision: 8746:e1d96ca6808c

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6796377)
MCF after:	1 week
2010-05-13 20:55:58 +00:00
Martin Matuska
c43d127a9a Import OpenSolaris revision 7837:001de5627df3
It includes the following changes:
- parallel reads in traversal code (Bug ID 6333409)
- faster traversal for zfs send (Bug ID 6418042)
- traversal code cleanup (Bug ID 6725675)
- fix for two scrub related bugs (Bug ID 6729696, 6730101)
- fix assertion in dbuf_verify (Bug ID 6752226)
- fix panic during zfs send with i/o errors (Bug ID 6577985)
- replace P2CROSS with P2BOUNDARY (Bug ID 6725680)

List of OpenSolaris Bug IDs:
6333409, 6418042, 6757112, 6725668, 6725675, 6725680,
6725698, 6729696, 6730101, 6752226, 6577985, 6755042

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (multiple Bug IDs)
MFC after:	1 week
2010-05-13 20:32:56 +00:00
Edward Tomasz Napierala
4e28b70950 Add missing check to prevent local users from panicing the kernel by trying
to set malformed ACL.

MFC after:	3 days
2010-05-13 15:31:00 +00:00
Martin Matuska
f2d1218cbe Fix possible hang when replaying large truncations.
OpenSolaris onnv revision:	7904:6a124a4ca9c5

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6761624)
MFC after:	3 days
2010-05-12 09:51:57 +00:00
Pawel Jakub Dawidek
c60c36a745 I added vfs_lowvnodes event, but it was only used for a short while and now
it is totally unused. Remove it.

MFC after:	3 days
2010-05-11 22:46:36 +00:00
Pawel Jakub Dawidek
8423e00b36 Eventhough r203504 eliminates taste traffic provoked by vdev_geom.c,
ZFS still like to open all vdevs, close them and open them again,
which in turn provokes taste traffic anyway.

I don't know of any clean way to fix it, so do it the hard way - if we can't
open provider for writing just retry 5 times with 0.5 pauses. This should
elimitate accidental races caused by other classes tasting providers created on
top of our vdevs.

MFC after:	3 days
Reported by:	James R. Van Artsdalen <james-freebsd-fs2@jrv.org>
Reported by:	Yuri Pankov <yuri.pankov@gmail.com>
2010-05-11 22:29:00 +00:00
Pawel Jakub Dawidek
204b20d932 Add missing new line characters to the warnings.
MFC after:	3 days
2010-05-11 22:23:35 +00:00
Martin Matuska
8c04b2242e Fix failed assertion on destroying datasets from an older pool version.
OpenSolaris onnv revision:	9390:887948510f80

PR:		kern/146471
Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6826861)
MFC after:	3 days
2010-05-11 09:26:46 +00:00
Martin Matuska
431905576e Fix possible panic with zfs destroy.
OpenSolaris onnv revision:	8779:f164e0e90508

PR:		kern/146471
Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6784924)
MFC after:	3 days
2010-05-11 09:23:46 +00:00
Martin Matuska
bb8b966850 Fix zfs rename (may occasionally fail with dataset busy).
OpenSolaris onnv revision:	8517:41a0783dde17

PR:		kern/146471
Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6784757)
MFC after:	3 days
2010-05-11 09:19:41 +00:00
Martin Matuska
dbbd1505bf Fix endianess bug in ZFS intent log (ZIL).
OpenSolaris onnv revision:	8109:6147a1bdd359

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6760048)
MFC after:	3 days
2010-05-11 07:25:13 +00:00
Edward Tomasz Napierala
dc510c105f Enforce RLIMIT_FSIZE in ZFS.
Reviewed by:	pjd@
2010-05-07 14:30:21 +00:00
Marius Strobl
626b7c61f8 - Fix broken symlinks on cross platform zfs send/recv. [1]
- Enable zfs_ace_byteswap() on FreeBSD as it works just fine (tested between
  amd64 and sparc64 in both directions by Michael Moll).

PR:		146272
Approved by:	mm, pjd
Obtained from:	OpenSolaris (onnv rev. 8283:1ca59f393041; Bug ID 6764193) [1]
MFC after:	3 days
2010-05-05 22:15:20 +00:00
Martin Matuska
d75554ec04 Introduce hardforce export option (-F) for "zpool export".
When exporting with this flag, zpool.cache remains untouched.

OpenSolaris onnv revision: 8211:32722be6ad3b

Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID: 6775357)
2010-05-05 18:22:29 +00:00
Martin Matuska
7d4daf9a10 Speed up ZFS list operation with objset prefetching.
Partial import of OpenSolaris onnv revisions:
8415:8809e849f63e, 10474:0e96dd3b905a

PR:		kern/146297
Submitted by:	myself
Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6386929, 6755389, 6847118)
MFC after:	2 weeks
2010-05-04 17:40:24 +00:00
Martin Matuska
77a7f64749 Fix deadlock during zfs receive.
OpenSolaris onnv revision:	9299:8809e849f63e

PR:		kern/146296
Submitted by:	myself
Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris (Bug ID 6783818, 6826836)
MFC after:	1 week
2010-05-04 17:30:07 +00:00
Martin Matuska
df04ddbaa6 Add sysctl and loader tunable vfs.zfs.txg.write_limit_override.
This tunable improves fine-tuning of ZFS write throttling.

PR:		kern/146108
Suggested by:	Nikolay Denev <ndenev at gmail.com>
Approved by:	pjd, delphij (mentor)
MFC after:	2 weeks
2010-05-01 20:44:37 +00:00
Martin Matuska
9ccdc9600e Change description of tunable group vfs.zfs.txg to be more
understandable.

Approved by:	pjd, delphij (mentor)
MFC after:	3 days
2010-05-01 19:53:15 +00:00
Martin Matuska
d8665eb1f6 Fix improper pool write throughput calculation.
OpenSolaris onnv revision:	9366:17553395a745

PR:		kern/146108
Approved by:	pjd, delphij (mentor)
Obtained from:	OpenSolaris, Bug ID 6817339
MFC after:	2 weeks
2010-04-30 07:48:29 +00:00
Pawel Jakub Dawidek
b19b0de471 Backport fix for 'zfs_znode_dmu_init: existing znode for dbuf' panic from OpenSolaris.
PR:		kern/144402
Reported by:	Alex Bakhtin <alex.bakhtin@gmail.com>
Tested by:	Alex Bakhtin <alex.bakhtin@gmail.com>
Obtained from:	OpenSolaris, Bug ID 6895088
MFC after:	3 days
2010-04-28 18:29:48 +00:00
Pawel Jakub Dawidek
7af9c09a61 Allow to modify directory's content even if the ZFS_NOUNLINK (SF_NOUNLINK,
sunlnk) flag is set. We only deny dirctory's removal or rename.

PR:		kern/143343
Reported by:	marck
MFC after:	3 days
2010-04-22 18:47:23 +00:00
Rui Paulo
ff569d8436 Rename the cyclic global variable lapic_cyclic_clock_func to just
cyclic_clock_func. This will make more sense when we start developing non
x86 cyclic version.
2010-04-20 17:03:30 +00:00
Rui Paulo
b527a2a4dc The amd64 version of the cyclic dtrace module is a verbatim copy of the
i386 version, so instead having a copy of the same file, use Makefile
foo to include the i386 version on amd64.
2010-04-20 16:30:17 +00:00
Xin LI
0e568ab25c Partially MFp4 #176265 by pjd@:
- Properly initialize and destroy system_taskq.
 - Add a dummy implementation of taskq_create_proc().

Note: We do not currently use system_taskq in ZFS so this is mostly a
no-op at this time.  Proper system_taskq initialization is required
by newer ZFS code.

Ok'ed by:	pjd
MFC after:	2 weeks
2010-04-19 09:03:36 +00:00
Pawel Jakub Dawidek
bd7226a572 Restore previous order. 2010-04-18 12:43:33 +00:00
Pawel Jakub Dawidek
224329fb6b Style fixes. 2010-04-18 12:36:53 +00:00
Pawel Jakub Dawidek
eb998be67d Add missing list and lock destruction. 2010-04-18 12:27:07 +00:00
Pawel Jakub Dawidek
ad3cb80827 Extend locks scope to match OpenSolaris. 2010-04-18 12:25:40 +00:00
Pawel Jakub Dawidek
57a81a8bbc Remove racy assertion.
Obtained from:	OpenSolaris
2010-04-18 12:21:52 +00:00
Pawel Jakub Dawidek
5195ca2307 Set ARC_L2_WRITING on L2ARC header creation.
Obtained from:	OpenSolaris
2010-04-18 12:20:33 +00:00
Pawel Jakub Dawidek
40c7da090f Fix 3-way deadlock that can happen because of ZFS and vnode lock
order reversal.

thread0 (vfs_fhtovp)	thread1 (vop_getattr)	thread2 (zfs_recv)
--------------------	---------------------	------------------
			vn_lock
rrw_enter_read
						rrw_enter_write (hangs)
			rrw_enter_read (hangs)
vn_lock (hangs)

Submitted by:	Attila Nagy <bra@fsn.hu>
MFC after:	3 days
2010-04-15 16:40:54 +00:00
Pawel Jakub Dawidek
58804a192e The same code is used to import and to create pool.
The order of operations is the following:
1. Try to open vdev by remembered path and guid.
2. If 1 failed, try to find vdev which guid matches and ignore the path.
3. If 2 failed this means either that the vdev we're looking for is gone
   or that pool is being created and vdev doesn't contain proper guid yet.
   To be able to handle pool creation we open vdev by path anyway.

Because of 3 it is possible that we open wrong vdev on import which can lead to
confusions.

The solution for this is to check spa_load_state. On pool creation it will be
equal to SPA_LOAD_NONE and we can open vdev only by path immediately and if it
is not equal to SPA_LOAD_NONE we first open by path+guid and when that fails,
we open by guid. We no longer open wrong vdev on import.

MFC after:	2 weeks
2010-03-19 20:14:27 +00:00
Kip Macy
e577b0b2e3 - cache line align arcs_lock array (h/t Marius Nuennerich)
- fix ARCS_LOCK_PAD to use architecture defined CACHE_LINE_SIZE
- cache line align buf_hash_table ht_locks array

MFC after:	7 days
2010-03-17 21:10:09 +00:00
Kip Macy
07c5b1686e use CACHE_LINE_SIZE instead of hardcoding 128 for lock pad
pointed out by Marius Nuennerich and jhb@
2010-03-17 20:00:22 +00:00
Kip Macy
285738b6ad - reduce contention by breaking up ARC state locks in to 16 for data
and 16 for metadata
- export L2ARC tunables as sysctls
- add several kstats to track L2ARC state more precisely
- avoid holding a contended lock when atomically incrementing a
  contended counter (no lock protection needed for atomics)
2010-03-16 22:17:21 +00:00
Kip Macy
03af82ac5e fix compilation under ZIO_USE_UMA 2010-03-13 21:52:21 +00:00
Kip Macy
181c6ae3f0 Don't bottleneck on acquiring the stream locks - this avoids a massive
drop off in throughput with large numbers of simultaneous reads

MFC after:	7 days
2010-03-13 21:41:52 +00:00
Pawel Jakub Dawidek
3a98b0c4df Remove bogus assertion.
Reported by:	Johan Ström <johan@stromnet.se>
Obtained from:	OpenSolaris, Bug ID 6827260
MFC after:	1 week
2010-03-12 12:07:21 +00:00
Pawel Jakub Dawidek
5b2e8d582f Remove racy assertion.
Reported by:	Attila Nagy <bra@fsn.hu>
Obtained from:	OpenSolaris, Bug ID 6827260
MFC after:	1 week
2010-03-06 20:03:26 +00:00
Marcel Moolenaar
d49ebec408 Use mf and not mf.a. The latter doesn't force memory ordering and
applies to sequential memory.
2010-02-22 01:24:34 +00:00
Pawel Jakub Dawidek
251294bca9 Don't set f_bsize to recordsize. It might confuse some software (like squid).
Submitted by:	Alexander Zagrebin <alexz@visp.ru>
MFC after:	2 weeks
2010-02-19 20:18:16 +00:00
Pawel Jakub Dawidek
bbd388268e Add tunable and sysctl to skip hostid check on pool import. 2010-02-18 22:31:43 +00:00
Xin LI
d0d444da07 Remove two files that are not needed by FreeBSD.
Approved by:	pjd
MFC after:	2 weeks
2010-02-05 23:17:59 +00:00
Pawel Jakub Dawidek
9d3f36a309 Open provider for writting when we find the right one. Opening too much
providers for writing provokes huge traffic related to taste events send
by GEOM on close. This can lead to various problems with opening GEOM
providers that are created on top of other GEOM providers.

Reorted by:	Kurt Touet <ktouet@gmail.com>, mr
Tested by:	mr, Baginski Darren <kickbsd@ya.ru>
MFC after:	2 weeks
2010-02-04 21:11:44 +00:00
Xin LI
63243c5c71 On FreeBSD, time_t is 64-bit for all platforms except i386 and powerpc,
where the type is 32-bit.  ZFS can handle 64-bit timestamp internally
but zfs_setattr() would check if the time value can fit, we change the
checking macros to match 64-bit timestamp if the platform supports it.

This change has some downsides like, while you can import zfs on 32-bit
platforms, the timestamp would overflow if they are out of the range.

This fixes the Y2.038K issue on platforms using 64-bit timestamps.

Reviewed by:	pjd
MFC after:	1 month
2010-01-25 07:52:54 +00:00
Xin LI
51b1ec310c Report ZFS filesystem version instead of the zpool version when we say it.
Reported by:	Yuri Pankov (on -fs@)
Submitted by:	delphij
Approved by:	pjd
MFC after:	1 week
2010-01-11 23:15:11 +00:00
Xin LI
017b01f662 Re-apply onnv-gate revisions 7994 and 8986 (corresponds to FreeBSD
revision 200726 and 200727).  It looks like that the two revisions
were not applied in the right sequence, I found this when comparing
with the OpenSolaris code.

MFC after:	3 days
Reviewed by:	mm@
2010-01-07 20:10:22 +00:00
Xin LI
03ec6213ee Instead of assuming all vdevs are healthy, check the newest vdev label
for each vdev's status.  Booting from a degraded vdev should now be
more robust.

Submitted by:	Matt Reimer <mattjreimer at gmail.com>
Sponsored by:	VPOP Technologies, Inc.
MFC after:	2 weeks
2010-01-06 23:09:23 +00:00
Pawel Jakub Dawidek
49335ba853 Teach the (gpt)zfsboot and zfsloader raidz code to use its buffers
more efficiently.

Before this patch, in the worst case memory use would increase
exponentially on the number of drives in the raidz vdev.

Submitted by:	Matt Reimer <mattjreimer@gmail.com>
Sponsored by:	VPOP Technologies, Inc.
Silence from:	dfr
2010-01-06 22:39:40 +00:00
Xin LI
1ee5de4482 Reduce diff against OpenSolaris - move Giant acquire/release to
zfs_znode.c.  As a side effect this also eliminates two potential
Giant leaks.

Approved by:	pjd
MFC after:	1 month
2010-01-02 23:38:03 +00:00
Xin LI
9189129097 Apply OpenSolaris revision 8012 which brings our zpool to version 14,
making it possible for zpools created on OpenSolaris 2009.06 be used
on FreeBSD.

PR:		kern/141800
Submitted by:	mm
Reviewed by:	pjd, trasz
Obtained from:	OpenSolaris
MFC after:	2 weeks
2009-12-28 22:15:11 +00:00
Xin LI
dd0c145752 Apply fix for Solaris bug 6462803: zfs snapshot -r failed because
filesystem was busy
(onnv revision 8989)

Submitted by:	mm
Approved by:	pjd
Obtained from:	OpenSolaris
MFC after:	2 weeks
2009-12-19 11:49:20 +00:00
Xin LI
24a41d7ec6 Apply fix for Solaris bug 6801979: zfs recv can fail with E2BIG
(onnv revision 8986)

Requested by:	mm
Submitted by:	pjd
Obtained from:	OpenSolaris
MFC after:	2 weeks
2009-12-19 11:47:22 +00:00
Xin LI
775f802393 Apply fix Solaris bug 6462803 zfs snapshot -r failed because
filesystem was busy.

Submitted by:	mm
Approved by:	pjd
MFC after:	2 weeks
2009-12-19 11:43:39 +00:00
Konstantin Belousov
88f2d72947 Change VOP_FSYNC for zfs vnode from VOP_PANIC to zfs_freebsd_fsync(),
both to not panic when fsync(2) is called for fifo on zfs
filedescriptor, and to actually fsync fifo inode to permanent storage.

PR:	kern/141177
Reviewed by:	pjd
MFC after:	1 week
2009-12-05 20:36:42 +00:00
Pawel Jakub Dawidek
dfb903e852 We have to eventually look for provider without checking guid as this is need
for attaching when there is no metadata yet.

Before r200125 the order of looking for providers was wrong. It was:
1. Find provider by name.
2. Find provider by guid.
3. Find provider by name and guid.

Where it should have been:
1. Find provider by name and guid.
2. Find provider by guid.
3. Find provider by name.

MFC after:	1 week
2009-12-05 20:16:28 +00:00
Pawel Jakub Dawidek
6468cb2ce0 Fix deadlock when ZVOLs are present and we are replacing dead component or
calling scrub when pool is in a degraded state. It will try to taste ZVOLs,
which will lead to deadlock, as ZVOL will try to acquire the same locks as
replace/scrub is holding already.

We can't simply skip provider based on their GEOM class, because ZVOL can have
providers build on top of it and we need to skip those as well.

We do it by asking for ZFS::iszvol attribute. Any ZVOL-based provider will give
us positive answer and we have to skip those providers.

This way we remove possibility to create ZFS pools on top of ZVOLs, but it is
not very useful anyway.

I believe deadlock is still possible in some very complex situations like when
we have MD provider on top of UFS file on top of ZVOL. When we try to replace
dead component in the pool mentioned ZVOL is based on, there might be a
deadlock when ZFS will try to taste MD provider. There is no easy way to detect
that, but it isn't very common.

MFC after:	1 week
2009-12-05 14:33:11 +00:00
Pawel Jakub Dawidek
ccba826977 Always check guid when opening by path, because we may end up with provider
that does have the same name, but only by accident.

MFC after:	1 week
2009-12-05 14:24:22 +00:00
Pawel Jakub Dawidek
29c8c85594 Avoid using additional variable for storing an error if we are not going
to do anything with it.
2009-12-05 14:21:42 +00:00
Paul Saab
e44920da1a Correct another case of not doing 64bit math. This allows mine and
other raidz2 volumes to boot.

Submitted by:	Matt Reimer <mattjreimer@gmail.com>
2009-11-13 02:50:50 +00:00
Pawel Jakub Dawidek
fd9ee28bfc Be careful which vattr fields are set during setattr replay.
Without this fix strange things can appear after unclean shutdown like
files with mode set to 07777.

Reported by:	des
MFC after:	3 days
2009-11-10 22:27:33 +00:00
Pawel Jakub Dawidek
56697614cc Avoid passing invalid mountpoint to getnewvnode().
Reported by:	rwatson
Tested by:	rwatson
MFC after:	3 days
2009-11-10 22:25:46 +00:00
Pawel Jakub Dawidek
fd66267ffb - zfs_zaccess() can handle VAPPEND too, so map V_APPEND to VAPPEND and call
zfs_access() instead of vaccess() in this case as well.
- If VADMIN is specified with another V* flag (unlikely) call both
  zfs_access() and vaccess() after spliting V* flags.

This fixes "dirtying snapshot!" panic.

PR:		kern/139806
Reported by:	Carl Chave <carl@chave.us>
In co-operation with:	jh
MFC after:	3 days
2009-10-30 23:33:06 +00:00
Robert Noland
14c436e101 Correct some issues with zfs boot.
- Teach it to read gang blocks. (essentially untested)
   If you see "ZFS: gang block detected!", please let
   me know, so we can either remove the printf if it
   works, or fix it if it doesn't.

 - If multiple partitions exist on a disk, probe them all.
   We also need to reset dsk->start to 0 to read the right
   sector here.

 - With GPT, we can have 128 partitions.

 - If the bootfs property has ever been set on a pool
   it seems that it never goes away.  zpool won't allow
   you to add to the pool with the bootfs property set.
   However, if you clear the property back to default
   we end up getting 0 for the object number and read
   a bogus block pointer and fail to boot.

 - Fix some error printfs. The printf in the loader is
   only capable of c,s and u formats.

 - Teach printf how to display %llu

Reviewed by:	dfr, jhb
MFC after:	2 weeks
2009-10-23 18:44:53 +00:00
Pawel Jakub Dawidek
c217b20ef6 Allow file system owner to modify system flags if securelevel permits.
MFC after:	3 days
2009-10-08 16:05:17 +00:00
Pawel Jakub Dawidek
68c53ef849 File system owner is when uid matches and jail matches.
MFC after:	3 days
2009-10-08 16:03:19 +00:00
Pawel Jakub Dawidek
3a6c0cbf26 On FreeBSD it is enough to report provider removal when orphan event is
received, we don't have to do it on every ENXIO error in I/O path.
Solaris has no GEOM so they have to handle it in a less clean way.

MFC after:	3 days
2009-10-07 20:56:15 +00:00
Pawel Jakub Dawidek
2ada529a14 Fix white-spaces.
MFC after:	3 days
2009-10-07 20:54:07 +00:00
Pawel Jakub Dawidek
c0103003c0 Fix situation where Mac OS X NFS client creates a file and when it tries
to set ownership and mode in the same setattr operation, the mode was
overwritten by secpolicy_vnode_setattr().

PR:		kern/118320
Submitted by:	Mark Thompson <info-gentoo@mark.thompson.bz>
MFC after:	3 days
2009-10-07 12:38:19 +00:00
Kip Macy
e6b112e274 Prevent paging pressure from draining arc too much
- always drain arc if above arc_c_max - never drain arc if arc is below arc_c_max

MFC after:	3 days
2009-10-06 21:40:50 +00:00
Xin LI
6f62807611 Return EOPNOTSUPP instead of EINVAL when doing chflags(2) over an old
format ZFS, as defined in the manual page.

Submitted by:	pjd (response of my original patch but bugs are mine)
MFC after:	3 days
2009-10-01 18:58:26 +00:00
Pawel Jakub Dawidek
ab711589df Handle cases where virtual (GFS) vnodes are referenced when doing forced
unmount. In that case we cannot depend on the proper order of invalidating
vnodes, so we have to free resources when we have a chance.

PR:		kern/139062
Reported by:	trasz
MFC after:	3 days
2009-09-26 00:10:45 +00:00
Pawel Jakub Dawidek
a0b238644a On lookup error VFS expects *vpp to be set to NULL, be sure to do that.
MFC after:	3 days
2009-09-26 00:08:44 +00:00
Pawel Jakub Dawidek
a99aaff645 Use traverse() function to find and return mount point's vnode instead of
covered vnode when snapshot is already mounted.

MFC after:	3 days
2009-09-26 00:07:14 +00:00
Pawel Jakub Dawidek
1aba32d9b4 - Don't depend on value returned by gfs_*_inactive(), it doesn't work
well with forced unmounts when GFS vnodes are referenced.
- Make other preparations to GFS for forced unmounts.

PR:		kern/139062
Reported by:	trasz
MFC after:	3 days
2009-09-26 00:04:30 +00:00
Pawel Jakub Dawidek
86758476b4 Switch to fletcher4 as the default checksum algorithm. Fletcher2 was proven to
be a bit weak and OpenSolaris also switched to fletcher4.

PR:		kern/139072
Reported by:	Daniel Grund <bugs@dgrund.de>
MFC after:	3 days
2009-09-25 18:19:50 +00:00
Pawel Jakub Dawidek
ad8294cf98 Before calling vflush(FORCECLOSE) mark file system as unmounted so the
following vnops will fail. This is very important, because without this change
vnode could be reclaimed at any point, even if we increased usecount. The only
way to ensure that vnode won't be reclaimed was to lock it, which would be very
hard to do in ZFS without changing a lot of code. With this change simply
increasing usecount is enough to be sure vnode won't be reclaimed from under
us. To be precise it can still be reclaimed but we won't be able to see it,
because every try to enter ZFS through VFS will result in EIO.

The only function that cannot return EIO, because it is needed for vflush() is
zfs_root(). Introduce ZFS_ENTER_NOERROR() macro that only locks
z_teardown_lock and never returns EIO.

MFC after:	3 days
2009-09-24 15:56:26 +00:00
Pawel Jakub Dawidek
ab9bbf4a2b Close race in zfs_zget(). We have to increase usecount first and then
check for VI_DOOMED flag. Before this change vnode could be reclaimed
between checking for the flag and increasing usecount.

MFC after:	3 days
2009-09-24 15:49:15 +00:00
Edward Tomasz Napierala
c40502ccd0 In VOP_SETACL(9) and VOP_GETACL(9), specifying wrong ACL type should result
in EINVAL, not EOPNOTSUPP.
2009-09-23 15:09:34 +00:00
Pawel Jakub Dawidek
eb03c3cdfb Restore BSD behaviour - when creating new directory entry use parent directory
gid to set group ownership and not process gid.

This was overlooked during v6 -> v13 switch.

PR:		kern/139076
Reported by:	Sean Winn <sean@gothic.net.au>
MFC after:	3 days
2009-09-23 09:18:16 +00:00
Pawel Jakub Dawidek
c4be11d7fc Purge namecache in the same place OpenSolaris does. 2009-09-20 13:28:29 +00:00
Pawel Jakub Dawidek
5469543c92 Purge file system namecache when receiving incremental stream and rolling back
to it.

MFC after:	3 days
2009-09-17 15:14:28 +00:00
Pawel Jakub Dawidek
3282c51713 Purge namecache for the file system being rolled back, so it doesn't point at
invalid vnodes after the rollback resulting in EIO errors when trying to access
files which are in the namecache.

Reported by:	des
MFC after:	3 days
2009-09-17 14:58:21 +00:00
Pawel Jakub Dawidek
95f08808b6 Forced unmounts work just fine in my tests under heavy load. There might
still be a problem, but it isn't worth a warning.
2009-09-15 11:42:08 +00:00
Pawel Jakub Dawidek
a4e6b460d3 We believe ZFS is ready for production use. Remove a warning about it being
experimental. :)
2009-09-15 11:34:53 +00:00
Pawel Jakub Dawidek
63e1d3df27 - Mount ZFS snapshots with MNT_IGNORE flag, so they are not visible in regular
df(1) and mount(8) output. This is a bit smilar to OpenSolaris and follows
  ZFS route of not listing snapshots by default with 'zfs list' command.
- Add UPDATING entry to note that ZFS snapshots are no longer visible in
  mount(8) and df(1) output by default.

Reviewed by:	kib
MFC after:	3 days
2009-09-14 21:10:40 +00:00
Pawel Jakub Dawidek
85c171b2e1 Support both case: when snapshot is already mounted and when it is not yet
mounted.

MFC after:	3 days
2009-09-13 21:40:36 +00:00
Pawel Jakub Dawidek
8a2c4db0fe Add missing \n.
Reported by:	marck
2009-09-13 17:30:56 +00:00
Pawel Jakub Dawidek
7746b6461d Work-around READDIRPLUS problem with .zfs/ and .zfs/snapshot/ directories
by just returning EOPNOTSUPP. This will allow NFS server to fall back to
regular READDIR.

Note that converting inode number to snapshot's vnode is expensive operation.
Snapshots are stored in AVL tree, but based on their names, not inode numbers,
so to convert inode to snapshot vnode we have to interate over all snalshots.

This is not a problem in OpenSolaris, because in their READDIRPLUS
implementation they use VOP_LOOKUP() on d_name, instead of VFS_VGET() on
d_fileno as we do.

PR:		kern/125149
Reported by:	Weldon Godfrey <wgodfrey@ena.com>
Analysis by:	Jaakko Heinonen <jh@saunalahti.fi>
MFC after:	3 days
2009-09-13 16:05:20 +00:00
Pawel Jakub Dawidek
7b4a12379b When zfs.ko is compiled with debug, make sure that znode and vnode point at
each other.

MFC after:	3 days
2009-09-13 10:33:51 +00:00
Pawel Jakub Dawidek
33a0ef82f2 Extend scope of the z_teardown_lock lock for consistency and "just in case".
MFC after:	3 days
2009-09-13 10:29:51 +00:00
Pawel Jakub Dawidek
7dae3c4faf Be sure not to overflow struct fid.
MFC after:	3 days
2009-09-13 10:25:33 +00:00
Pawel Jakub Dawidek
f53901193d There is a bug where mze_insert() can trigger an assert() of inserting
the same entry twice. This bug is not fixed yet, but leads to situation
where when try to access corrupted directory the kernel will panic.
Until the bug is properly fixed, try to recover from it and log that it
happened.

Reported by:	marck
OpenSolaris bug:	6709336
MFC after:	3 days
2009-09-13 10:12:29 +00:00
Pawel Jakub Dawidek
f5516e3d1d - Protect reclaim with z_teardown_inactive_lock.
- Be prepared for dbuf to disappear in zfs_reclaim_complete() and check if
  z_dbuf field is NULL - this might happen in case of rollback or forced
  unmount between zfs_freebsd_reclaim() and zfs_reclaim_complete().
- On forced unmount wait for all znodes to be destroyed - destruction can be
  done asynchronously via zfs_reclaim_complete().

MFC after:	1 week
2009-09-12 19:53:31 +00:00
Pawel Jakub Dawidek
2a8e7dad33 Tighten up the check for race in zfs_zget() - ZTOV(zp) can not only contain
NULL, but also can point to dead vnode, take that into account.

PR:		kern/132068
Reported by:	Edward Fisk" <7ogcg7g02@sneakemail.com>, kris
Fix based on patch from:	Jaakko Heinonen <jh@saunalahti.fi>
MFC after:	1 week
2009-09-12 19:27:54 +00:00
Pawel Jakub Dawidek
3770996142 Only log successful commands! Without this fix we log even unsuccessful
commands executed by unprivileged users. Action is not really taken, but it is
logged to pool history, which might be confusing.

Reported by:	Denis Ahrens <denis@h3q.com>
MFC after:	3 days
2009-09-08 16:40:08 +00:00
Pawel Jakub Dawidek
d6b8039292 We don't export individual snapshots, so mnt_export field in snapshot's
mount point is NULL. That's why when we try to access snapshots over NFS
use mnt_export field from the parent file system.

MFC after:	1 week
2009-09-08 15:57:03 +00:00
Pawel Jakub Dawidek
f148fd9a4a When we automatically mount snapshot we want to return vnode of the mount point
from the lookup and not covered vnode. This is one of the fixes for using .zfs/
over NFS.

MFC after:	1 week
2009-09-08 15:51:40 +00:00
Pawel Jakub Dawidek
2391003912 On FreeBSD we don't have to look for snapshot's mount point,
because fhtovp method is already called with proper mount point.

MFC after:	1 week
2009-09-08 15:42:55 +00:00
Pawel Jakub Dawidek
6f8e88e1da Call ZFS_EXIT() after locking the vnode.
MFC after:	1 week
2009-09-08 15:37:01 +00:00
Konstantin Belousov
211ddddce7 Lock Giant around vn_open_cred().
Remove innocent unnecessary call to NDFREE().

Reported by:	marcel
Reviewed and tested by:	pjd
MFC after:	3 days
2009-09-08 09:17:34 +00:00
Pawel Jakub Dawidek
1ea3566294 Fix reference count leak for a case where snapshot's mount point is updated.
Such situation is not supported.

This problem was triggered by something like this:

	# zpool create tank da0
	# zfs snapshot tank@snap
	# cd /tank/.zfs/snapshot/snap  (this will mount the snapshot)
	# cd
	# mount -u nosuid /tank/.zfs/snapshot/snap  (refcount leak)
	# zpool export tank
	cannot export 'tank': pool is busy

MFC after:	1 week
2009-09-08 08:54:15 +00:00
Pawel Jakub Dawidek
28e449adf2 If we have to use avl_find(), optimize a bit and use avl_insert() instead of
avl_add() (the latter is actually a wrapper around avl_find() + avl_insert()).

Fix similar case in the code that is currently commented out.
2009-09-07 21:58:54 +00:00
Pawel Jakub Dawidek
3f6043a57d When snapshot mount point is busy (for example we are still in it)
we will fail to unmount it, but it won't be removed from the tree,
so in that case there is no need to reinsert it.

This fixes a panic reproducable in the following steps:

	# zfs create tank/foo
	# zfs snapshot tank/foo@snap
	# cd /tank/foo/.zfs/snapshot/snap
	# umount /tank/foo
	panic: avl_find() succeeded inside avl_add()

Reported by:	trasz
MFC after:	3 days
2009-09-07 21:46:51 +00:00
Edward Tomasz Napierala
343775c0b4 Enable NFSv4 ACL support in ZFS.
Reviewed by:	pjd
2009-09-07 19:43:13 +00:00