freebsd-skq

Author	SHA1	Message	Date
Bjoern A. Zeeb	9c759b587f	Try to unbreak the build after r300611 by including the header defining VM_MIN_KERNEL_ADDRESS. Sponsored by: DARPA/AFRL	2016-05-24 17:38:27 +00:00
Ruslan Bukin	fed1ca4b71	Add initial DTrace support for RISC-V. Sponsored by: DARPA, AFRL Sponsored by: HEIF5	2016-05-24 16:41:37 +00:00
Andrew Turner	0d0da76911	Mark all memory before the kernel as toxic to DTrace. Obtained from: ABT Systems Ltd Sponsored by: The FreeBSD Foundation	2016-05-24 13:57:23 +00:00
Andriy Gapon	fabe7e4ecc	add vop_print methods to vnode operations of various zfsctl node types This should help with diagnostics of zfsctl problems. MFC after: 2 weeks	2016-05-18 13:21:29 +00:00
Andriy Gapon	e34c8d727b	move zfsctl_freebsd_root_lookup right next to zfsctl_root_lookup That makes it easier to reason about the code. MFC after: 5 weeks	2016-05-18 08:29:39 +00:00
Andriy Gapon	a4bbed22d2	zfsctl_common_fid: remove redundant assignment "Reinterpret cast" to zfid_short_t and assignment of zf_len do the job already. MFC after: 1 week	2016-05-18 08:26:09 +00:00
Andriy Gapon	e6d4eefe2a	zfsctl: tighten an assertion and remove an unused definition There are only two entries under .zfs and 'shares' has an ID of a special persistent object in its filesystem. MFC after: 1 week	2016-05-18 08:23:39 +00:00
Andriy Gapon	439e9b6804	zfs_root: no need to set the root flag here That was both redundant as zfs_znode_sa_init() already does the job and insufficient as the root vnode can be reached via other means. MFC after: 1 week	2016-05-18 08:19:41 +00:00
Andriy Gapon	74a3df2b1f	zfsctl_freebsd_root_lookup: gfs_vop_lookup may return a doomed vnode gfs code is (almost) completely agnostic of FreeBSD VFS locking, so it does not handle doomed but not yet dead vnodes and may return them. Check for those vnodes here and retry a lookup. Note that ZFS and gfs have additional protections that ensure that a parent vnode of the current vnode is never doomed. The fixed problem is an occasional failure to look up the 'snapshot' or 'shares' directories under .zfs. Note that for the above reason all uses of zfsctl_root_lookup() are better replaced with VOP_LOOKUP. MFC after: 5 weeks	2016-05-18 08:02:49 +00:00
Alan Somers	5f7b3969e9	Speed up vdev_geom_open_by_guids Speedup is hard to measure because the only time vdev_geom_open_by_guids gets called on many drives at the same time is during boot. But with vdev_geom_open hacked to always call vdev_geom_open_by_guids, operations like "zpool create" speed up by 65%. sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c * Read all of a vdev's labels in parallel instead of sequentially. * In vdev_geom_read_config, don't read the entire label, including the uberblock. That's a waste of RAM. Just read the vdev config nvlist. Reduces the IO and RAM involved with tasting from 1MB to 448KB. Reviewed by: avg MFC after: 4 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D6153	2016-05-17 15:17:23 +00:00
Andriy Gapon	857a214d03	zfs_ioc_rename: fix a reversed condition FreeBSD zfs_ioc_rename() has an option, not present upstream, that allows to rename snapshots without unmounting them first. I am not sure what is a rationale for that option, but its actual behavior was the opposite of the intended behavior. That is, by default the snapshots were not unmounted. The option was introduced as part of a large update from upstream in r248498. One of the consequences was a havoc under .zfs/snapshot after the rename. The snapshots got new names but were mounted on top of directories with old names, so readdir would list the new names, but lookup would still find the old mounts. PR: 209093 Reported by: Frédéric VANNIÈRE <f.vanniere@planet-work.com> MFC after: 5 days	2016-05-17 07:56:05 +00:00
Andriy Gapon	afe674f089	do not destroy 'snapdir' when it becomes inactive That was just wrong. In fact, we can safely keep this static entry when it's inactive. Now the destructive action is moved to the reclaim method and the function is renamed from zfsctl_snapdir_inactive() to zfsctl_snapdir_reclaim(). Also, we can use gfs_vop_reclaim() instead of gfs_dir_inactive() + kmem_free(). Lastly, we can just assert that the node does not have any children when it is reclaimed, even on the force unmount. That's because zfs_umount() does an extra vflush() pass which should destroy all snapshot-mountpoint vnodes that are the snapdir's children. MFC after: 5 weeks	2016-05-16 15:48:56 +00:00
Andriy Gapon	9c3e205296	try to recycle "snap" vnodes as soon as possible Those vnodes should not linger. "Stale" nodes may get out of synchronization with actual snapshots. For example if we destroy a snapshot and create a new one with the same name. Or when we rename a snapshot. While there fix the argument type for zfsctl_snapshot_reclaim(). Also, its original argument can be passed to gfs_vop_reclaim() directly. Bug 209093 could be related although I have not specifically verified that. Referencing just in case. PR: 209093 MFC after: 5 weeks	2016-05-16 15:37:41 +00:00
Andriy Gapon	0ab1aa90fa	fix locking in zfsctl_root_lookup Dropping the root vnode's lock after VFS_ROOT() didn't really help the fact that we acquired the lock while holding its child's, .zfs, lock while performing the operation. So, directly use zfs_zget() to get the root vnode. While there simplify the code in zfsctl_freebsd_root_lookup. We know that .zfs is always exclusively locked. We know that there is already a reference on *vpp, so no need for an extra one. Account for the fact that .. lookup may ask for a different lock type, not necessarily LK_EXCLUSIVE. And handle a possible failure to acquire the lock given the lock flags. MFC after: 5 weeks	2016-05-16 15:28:39 +00:00
Andriy Gapon	705e6b8170	gfs_lookup_dot() does not have to acquire any locks In fact, that was dangerous. For example, zfsctl_snapshot_reclaim() calls gfs_dir_lookup() on ".." path and that ends up calling gfs_lookup_dot() which violated locking order by acquiring the parent's directory vnode lock after the child's vnode lock. Also, the previous behavior was inconsistent as gfs_dir_lookup() returned a locked vnode for . and .. lookups, but not for any other. Now gfs_lookup_dot() just references a resulting vnode and the locking is done in its consumers, where necessary. Note that we do not enable shared locking support for any gfs / zfsctl vnodes. This commit partially reverts r273641. MFC after: 5 weeks	2016-05-16 15:13:16 +00:00
Andriy Gapon	7223645bd1	avoid deadlock between zfsctl_snapdir_lookup and zfsctl_snapshot_reclaim The former acquired a snap vnode lock while holding sd_lock while the latter does the opposite. The solution is to drop sd_lock before acquiring the vnode lock. That should be okay as we are still holding a lock on the 'snapshot' directory in the exclusive mode. That lock ensures that there are no concurrent lookups in the directory and thus no concurrent mount attempts. But now we have to account for the possibility that the snap vnode might get reclaimed after we drop sd_lock and before we can get the node lock. So, check for that case and retry. MFC after: 5 weeks	2016-05-16 15:03:52 +00:00
Andriy Gapon	c6cd01d924	fix a vnode reference leak caused by illumos compat traverse() This commit partially reverts r273641 which introduced the leak. It did so to accommodate some consumers of traverse() that expected the starting vnode to stay as-is. But that introduced the leak in the case when a mounted filesystem was found and its root vnode was returned. r299914 removed the troublesome consumers and now there is no reason to keep the starting vnode. So, now the new rules are: - if there is no mounted filesystem, then nothing is changed - otherwise the starting vnode is always released - the root vnode of the mounted filesystem is returned locked and referenced in the case of success MFC after: 5 weeks X-MFC after: r299914	2016-05-16 12:15:19 +00:00
Andriy Gapon	20ec8b0f9b	fix up r299902: mount_snapshot requires that the covered vnode is locked Previously that was not strictly enforced. MFC after: 4 weeks X-MFC with: r299902	2016-05-16 11:48:43 +00:00
Andriy Gapon	cf7aa80bbd	zfsctl_ops_snapshot: remove methods that should never be called We pretend that snapshots mounted under .zfs are part of the original filesystem and we try very hard to hide vnodes on top of which the snapshots are mounted. Given that, I believe that the removed operations should never be called. They might have been called previously because of issues fixed in r299906, r299908 and r299913. MFC after: 5 weeks	2016-05-16 07:24:30 +00:00
Andriy Gapon	cb68fd3513	zfsctl_snapdir_lookup: always clear VV_ROOT flag of snapshot's root VV_ROOT Previously we did that only if the snapshot was mounted earlier, its root vnode got recycled and then we accessed it again. We never cleared the flag for a freshly mounted snapshot. That was very inconsistent and probably a source of some bugs. Or maybe that painted over some bugs which might get revealed now. We should consistently clear the flag because we try very hard to pretend that snapshots auto-mounted under .zfs are part of their original filesystem. In other words, we try to hide the fact that they are different filesystems / mountpoints. MFC after: 5 weeks	2016-05-16 06:49:09 +00:00
Andriy Gapon	4df590b5b6	add zfs_vptocnp with special handling for snapshots under .zfs The logic is similar to that already present in zfs_dirlook() to handle a dot-dot lookup on a root vnode of a snapshot mounted under .zfs/snapshot/. illumos does not have an equivalent of vop_vptocnp, so there only the lookup had to be patched up. MFC after: 4 weeks	2016-05-16 06:40:51 +00:00
Andriy Gapon	a03fb1cf6a	mount_snapshot: consolidate all error handling This makes sure that the original vnode is always unlocked and released if any error happens. MFC after: 4 weeks	2016-05-16 06:30:25 +00:00
Andriy Gapon	3055925d42	zfsctl: fix several problems with reference counts * Remove excessive references on a snapshot mountpoint vnode. zfsctl_snapdir_lookup() called VN_HOLD() on a vnode returned from zfsctl_snapshot_mknode() and the latter also had a call to VN_HOLD() on the same vnode. On top of that gfs_dir_create() already returns the vnode with the use count of 1 (set in getnewvnode). So there were 3 references on the vnode. * mount_snapshot() should keep a reference to a covered vnode. That reference is owned by the mountpoint (mounted snapshot filesystem). * Remove cryptic manipulations of a covered vnode in zfs_umount(). FreeBSD dounmount() already does the right thing and releases the covered vnode. PR: 207464 Reported by: dustinwenz@ebureau.com Tested by: Howard Powell <hpowell@lighthouseinstruments.com> MFC after: 3 weeks	2016-05-16 06:24:04 +00:00
John Baldwin	fdce57a042	Add an EARLY_AP_STARTUP option to start APs earlier during boot. Currently, Application Processors (non-boot CPUs) are started by MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until SI_SUB_SMP at which point they are released to run kernel threads. SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter the scheduler and start running threads until fairly late in the boot. This change moves SI_SUB_SMP up to just before software interrupt threads are created allowing the APs to start executing kernel threads much sooner (before any devices are probed). This allows several initialization routines that need to perform initialization on all CPUs to now perform that initialization in one step rather than having to defer the AP initialization to a second SYSINIT run at SI_SUB_SMP. It also permits all CPUs to be available for handling interrupts before any devices are probed. This last feature fixes a problem with interrupt vector exhaustion. Specifically, in the old model all device interrupts were routed onto the boot CPU during boot. Later after the APs were released at SI_SUB_SMP, interrupts were redistributed across all CPUs. However, several drivers for multiqueue hardware allocate N interrupts per CPU in the system. In a system with many CPUs, just a few drivers doing this could exhaust the available pool of interrupt vectors on the boot CPU as each driver was allocating N * mp_ncpu vectors on the boot CPU. Now, drivers will allocate interrupts on their desired CPUs during boot meaning that only N interrupts are allocated from the boot CPU instead of N * mp_ncpu. Some other bits of code can also be simplified as smp_started is now true much earlier and will now always be true for these bits of code. This removes the need to treat the single-CPU boot environment as a special case. As a transition aid, the new behavior is available under a new kernel option (EARLY_AP_STARTUP). This will allow the option to be turned off if need be during initial testing. I plan to enable this on x86 by default in a followup commit in the next few days and to have all platforms moved over before 11.0. Once the transition is complete, the option will be removed along with the !EARLY_AP_STARTUP code. These changes have only been tested on x86. Other platform maintainers are encouraged to port their architectures over as well. The main things to check for are any uses of smp_started in MD code that can be simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in the EARLY_AP_STARTUP case (e.g. the interrupt shuffling). PR: kern/199321 Reviewed by: markj, gnn, kib Sponsored by: Netflix	2016-05-14 18:22:52 +00:00
Enji Cooper	622282b3b8	Include arpa/inet.h to get the htonl(3) definition MFC after: 2 weeks Reported by: clang Sponsored by: EMC / Isilon Storage Division	2016-05-13 11:15:33 +00:00
Conrad Meyer	3b56262303	compat/opensolaris: Don't redefine off64_t if already defined A follow-up to r299456. Reported by: gjb Sponsored by: EMC / Isilon Storage Division	2016-05-11 16:05:32 +00:00
Alexander Motin	c59a902fa3	MFV r299453: 6765 zfs_zaccess_delete() comments do not accurately reflect delete permissions for ACLs Reviewed by: Gordon Ross <gwr@nexenta.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Author: Kevin Crowe <kevin.crowe@nexenta.com> openzfs/openzfs@a40149b935	2016-05-11 13:53:29 +00:00
Alexander Motin	0eb65a5367	MFV r299451: 6764 zfs issues with inheritance flags during chmod(2) with aclmode=passthrough Reviewed by: Gordon Ross <gwr@nexenta.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Author: Albert Lee <trisk@nexenta.com> openzfs/openzfs@1bcf0d240b	2016-05-11 13:50:34 +00:00
Alexander Motin	85a69dbf66	MFV r299449: 6763 aclinherit=restricted masks inherited permissions by group perms (groupmask) Reviewed by: Gordon Ross <gwr@nexenta.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Author: Albert Lee <trisk@nexenta.com> openzfs/openzfs@eebb483d0c	2016-05-11 13:48:15 +00:00
Alexander Motin	2a219f349e	MFV r299442: 6762 POSIX write should imply DELETE_CHILD on directories - and some additional considerations Reviewed by: Gordon Ross <gwr@nexenta.com> Reviewed by: Yuri Pankov <yuri.pankov@nexenta.com> Author: Kevin Crowe <kevin.crowe@nexenta.com> openzfs/openzfs@d316fffc9c	2016-05-11 13:43:20 +00:00
Alexander Motin	42a54f9745	MFV r299440: 6736 ZFS per-vdev ZAPs Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: John Kennedy <john.kennedy@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Don Brady <don.brady@intel.com> Reviewed by: Dan McDonald <danmcd@omniti.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Joe Stein <joe.stein@delphix.com> openzfs/openzfs@215198a6ad	2016-05-11 12:54:00 +00:00
Alexander Motin	7d54dbae83	MFV r299438: 6842 Fix empty xattr dir causing lockup Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Dan McDonald <danmcd@omniti.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Chunwei Chen <tuxoko@gmail.com> openzfs/openzfs@02525cd08f	2016-05-11 12:46:07 +00:00
Alexander Motin	d7ff478705	MFV r299436: 6843 Make xattr dir truncate and remove in one tx Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Dan McDonald <danmcd@omniti.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Chunwei Chen <tuxoko@gmail.com> openzfs/openzfs@399cc7d5d9	2016-05-11 12:43:54 +00:00
Alexander Motin	0b99ac761e	MFV r299434: 6841 Undirty freed spill blocks Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed by: Dan McDonald <danmcd@omniti.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Robert Mustacchi <rm@joyent.com> Author: Tim Chase <tim@chase2k.com> openzfs/openzfs@445e67805d	2016-05-11 12:38:07 +00:00
Ruslan Bukin	d7dc6bae03	Implement FBT provider (MD part) for DTrace on MIPS. Tested on MIPS64. Sponsored by: DARPA, AFRL Sponsored by: HEIF5	2016-05-05 13:54:50 +00:00
Alan Somers	c9a807447d	Fix a use-after-free when "zpool import" fails clear vd->vdev_tsd in vdev_geom_close_locked instead of vdev_geom_detach. In the latter function, it would fail to happen in certain circumstances where cp->private was unset. Ideally, the latter should never happen, but it can happen when vdev open fails, or where spares are involved. MFC after: 4 weeks X-MFC-With: 298786 Sponsored by: Spectra Logic Corp	2016-04-29 21:29:37 +00:00
Andriy Gapon	27b6c49726	add invpcid instruction to i386 dtrace disassembler tables MFC after: 2 weeks	2016-04-29 15:45:22 +00:00
Alan Somers	663f649ff6	Refactor vdev_geom_attach and friends to reduce code duplication sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c Move checks for provider's sectorsize and mediasize into a single location in vdev_geom_attach. Remove the zfs::vdev::taste class; it's ok to use the regular vdev class for tasting. Consolidate guid checks into a single location in vdev_attach_ok. Consolidate some error handling code from vdev_geom_attach into vdev_geom_detach, closing a resource leak of geom consumers in the process. Reviewed by: avg MFC after: 4 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D5974	2016-04-29 15:23:51 +00:00
Mark Johnston	676a03fa6a	Increase DTRACE_FUNCNAMELEN from 128 to 192. This allows for the long function components encountered in www/firefox. This constant is part of DTrace's userland ABI, so this change may not be MFC'ed. PR: 207735	2016-04-25 18:44:11 +00:00
Mark Johnston	328d8adb9b	Allow DOF sections with excessively long probe function components. Without this change, DTrace will refuse to load a DOF section if the function component of any of its probes exceeds DTRACE_FUNCNAMELEN (128). Probes in C++ programs can have very long function components. Rather than rejecting all probes if a single probe exceeds the limit, simply skip the invalid probe and emit a warning. This ensures that valid probes are instantiated. PR: 207735 MFC after: 2 weeks	2016-04-25 18:40:57 +00:00
Mark Johnston	cd8bbc382d	Add a kern.dtrace.err_verbose sysctl to control dtrace_err_verbose. When this flag is turned on, DOF and DIF validation errors are printed to the kernel message buffer. This is useful for debugging. Also remove the debug.dtrace.debug sysctl, which has no effect.	2016-04-25 18:09:36 +00:00
Andriy Gapon	2d69831b85	lahf/sahf are supported on some amd64 processors While the instructions were not included into the original instruction set, their support can be indicated by a special feature bit. For example: CPU: AMD Phenom(tm) II X4 955 Processor (3214.71-MHz K8-class CPU) ... AMD Features2=0x37ff<LAHF, ...> Clang 3.8 uses lahf/sahf as a faster alternative to pushf/popf where possible. MFC after: 2 weeks	2016-04-22 13:44:12 +00:00
Andriy Gapon	dbbcddb426	MFV r298471: 6052 decouple lzc_create() from the implementation details illumos/illumos-gate@26455f9efc https://www.illumos.org/issues/6052 At the moment type parameter of lzc_create() is of dmu_objset_type_t type. That exposes an implementation detail and requires sys/fs/zfs.h to be included in libzfs_core.h creating unnecessary coupling between libzfs_core interface and ZFS internals. I think that dmu_objset_type_t should be replaced with a libzfs_core enumeration of supported dataset types. For ABI reasons the new enumeration could be bit-compatible with dmu_objset_type_t. For example: typedef enum { LZC_DST_ZFS = 2, LZC_DST_ZVOL } lzc_dataset_type_t; Reviewed by: Matthew Ahrens <mahrens@delphix.com> Approved by: Richard Lowe <richlowe@richlowe.net> Author: Andriy Gapon <andriy.gapon@clusterhq.com> MFC after: 2 weeks Sponsored by: ClusterHQ	2016-04-22 13:00:27 +00:00
Mark Johnston	6c2806594b	Make the second argument of dtrace_invop() a trapframe pointer. Currently this argument is a pointer into the stack which is used by FBT to fetch the first five probe arguments. On all non-x86 architectures it's simply the trapframe address, so this change has no functional impact. On amd64 it's a pointer into the trapframe such that stack[1 .. 5] gives the first five argument registers, which are deliberately grouped together in the amd64 trapframe definition. A trapframe argument simplifies the invop handlers on !x86 and makes the x86 FBT invop handler easier to understand. Moreover, it allows for invop handlers that may want to modify the register set of the interrupted thread.	2016-04-17 23:08:47 +00:00
Andriy Gapon	e01dd79f9a	zfs_rezget: z_vnode can not be NULL if zp is valid MFC after: 3 weeks	2016-04-16 07:41:56 +00:00
Andriy Gapon	c2d36fc5cd	zfs: enable vn_io_fault support Note that now we have to account for possible partial writes in dmu_write_uio_dbuf(). It seems that on illumos either all or none of the data are expected to be written. But the partial writes are quite expected when vn_io_fault support is enabled. Reviewed by: kib MFC after: 7 weeks Differential Revision: https://reviews.freebsd.org/D2790	2016-04-16 07:35:53 +00:00
Alan Somers	739f4ae3b1	Don't corrupt ZFS label's physpath attribute when booting while a disk is missing Prior to this change, vdev_geom_open_by_path would call vdev_geom_attach prior to verifying the device's GUIDs. vdev_geom_attach calls vdev_geom_attrchange to set the physpath in the vdev object. The result is that if the disk could not be found, then the labels for other disks in the same TLD would overwrite the missing disk's physpath with the physpath of whichever disk currently has the same devname as the missing one used to have. MFC after: 4 weeks Sponsored by: Spectra Logic Corp	2016-04-15 16:36:17 +00:00
Alan Somers	c29088b5c7	Add more debugging statements in vdev_geom.c Log a debugging message whenever geom functions fail in vdev_geom_attach. Printing these messages is controlled by vfs.zfs.debug MFC after: 4 weeks Sponsored by: Spectra Logic Corp	2016-04-14 23:14:41 +00:00
Alan Somers	f0ac053088	Update a debugging message in vdev_geom_open_by_guids for consistency with similar messages elsewhere in the file. MFC after: 4 weeks Sponsored by: Spectra Logic Corp	2016-04-14 19:20:31 +00:00
Alan Somers	4e3ab010a2	Fix rare double free in vdev_geom_attrchanged sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c Don't drop the g_topology_lock before freeing old_physpath. That opens up a race where one thread can call vdev_geom_attrchanged, set old_physpath, drop the g_topology_lock, then block trying to acquire the SCL_STATE lock. Then another thread can come into vdev_geom_attrchanged, set old_physpath to the same value, and proceed to free it. When the first thread resumes, it will free the same location. It turns out that the SCL_STATE lock isn't needed. It was originally added by gibbs to protect vd->vdev_physpath while updating the same. However, the update process subsequently was switched to an atomic operation (a pointer swap). Now, there is no need for the SCL_STATE lock, and hence no need to drop the g_topology_lock. Reviewed by: delphij MFC after: 4 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D5413	2016-04-12 19:11:14 +00:00

1513 Commits