freebsd-skq

Author	SHA1	Message	Date
Kip Macy	ea41c77517	SAVESTART implies SAVENAME	2009-05-17 01:31:28 +00:00
Kip Macy	2e9c90d55b	enable adaptive spinning on zfs locks	2009-05-16 23:56:45 +00:00
Kip Macy	be08aa8b59	- allow forced unmounts - don't assume snapshot was auto-mounted	2009-05-16 20:33:13 +00:00
Kip Macy	71bc1ce36e	only use direct map if system has more than 2GB	2009-05-16 20:09:07 +00:00
Kip Macy	32237d8492	apply band-aid to x86_64 systems with more physical memory than kmem by allocating from the direct map	2009-05-16 19:17:15 +00:00
Doug Rabson	e1899ef6c8	Add support for booting from raidz1 and raidz2 pools.	2009-05-16 10:48:20 +00:00
Attilio Rao	dfd233edd5	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavily changed by this commit so third-party modules need to be recompiled. Bump __FreeBSD_version in order to signal such situation.	2009-05-11 15:33:26 +00:00
Kip Macy	469ef3e563	rename xdr support files to avoid conflicts when linking in to the kernel	2009-05-11 04:18:58 +00:00
Kip Macy	8569258bf8	- rename atomic.S and crc32.c to avoid collisions when linking zfs in to the kernel - update Makefile - ifdef out acl_{alloc, free}, they aren't used by zfs and conflict with existing in-kernel routines	2009-05-09 01:45:55 +00:00
Marko Zec	29b02909eb	Introduce a new virtualization container, provisionally named vprocg, to hold virtualized instances of hostname and domainname, as well as a new top-level virtualization struct vimage, which holds pointers to struct vnet and struct vprocg. Struct vprocg is likely to become replaced in the near future with a new jail management API import. As a consequence of this change, change struct ucred to point to a struct vimage, instead of directly pointing to a vnet. Merge vnet / vimage / ucred refcounting infrastructure from p4 / vimage branch. Permit kldload / kldunload operations to be executed only from the default vimage context. This change should have no functional impact on nooptions VIMAGE kernel builds. Reviewed by: bz Approved by: julian (mentor)	2009-05-08 14:11:06 +00:00
Kip Macy	a6827463ad	don't call vn_rele_async_fini in the !_KERNEL case	2009-05-07 23:34:41 +00:00
Kip Macy	c20fd07777	move VN_RELE_ASYNC to the compatibility layer with the rest of the VN_* defines	2009-05-07 23:02:15 +00:00
Kip Macy	6ef1a81d6e	avoid LOR and gratuitous extra lock acquisitions by moving user_evict list buffers to a temporary list	2009-05-07 21:51:13 +00:00
Kip Macy	77d0162c70	Allow the VM to provide backpressure on the ARC cache as it does on Solaris.	2009-05-07 20:57:06 +00:00
Kip Macy	62fa227ccd	Asynchronously release vnodes to avoid blocking on range locks when calling back in to zfs. This is based on a fix that went in to opensolaris on March 9th. However, it uses a dedicated thread instead of a Solaris taskq to avoid doing a blocking memory allocation with the vnode interlock held. This fixes a long-time deadlock in ZFS. This is not, strictly speaking, an LOR. The spa_zio thread releases a vnode, this calls in to vn_reclaim which in turn needs to acquire range locks to sync dirty data out to disk. The range locks are already held by a user-level process that is waiting on a condition variable for a spa_zio thread to signal it. The process could not be signalled because the spa_zio thread could not proceed. The nature of this problem was not apparent due to ZFS locks opting out of witness which meant that DDB did not know about the locks that were held by ZFS. Reviewed by: pjd MFC after: 7 days	2009-05-07 20:28:06 +00:00
Jamie Gritton	b38ff370e4	Introduce the extensible jail framework, using the same "name=value" interface as nmount(2). Three new system calls are added: * jail_set, to create jails and change the parameters of existing jails. This replaces jail(2). * jail_get, to read the parameters of existing jails. This replaces the security.jail.list sysctl. * jail_remove to kill off a jail's processes and remove the jail. Most jail parameters may now be changed after creation, and jails may be set to exist without any attached processes. The current jail(2) system call still exists, though it is now a stub to jail_set(2). Approved by: bz (mentor)	2009-04-29 21:14:15 +00:00
Robert Watson	885868cd8f	Remove VOP_LEASE and supporting functions. This hasn't been used since the removal of NQNFS, but was left in in case it was required for NFSv4. Since our new NFSv4 client and server can't use it for their requirements, GC the old mechanism, as well as other unused lease- related code and interfaces. Due to its impact on kernel programming and binary interfaces, this change should not be MFC'd. Proposed by: jeff Reviewed by: jeff Discussed with: rmacklem, zach loafman @ isilon	2009-04-10 10:52:19 +00:00
Andrew Thompson	853a10a581	Revert r190676,190677 The geom and CAM changes for root_hold are the wrong solution for USB design quirks. Requested by: scottl	2009-04-10 04:08:34 +00:00
Andrew Thompson	626fc9fe3d	Add a how argument to root_mount_hold() so it can be passed NOWAIT and be called in situations where sleeping isn't allowed.	2009-04-03 19:46:12 +00:00
Robert Watson	455f3aa24f	Move dtnfsclient.c in the cddl tree to nfs_kdtrace.c in the nfsclient directory, since it's under a BSD license, and this keeps NFS internals- aware tracing parts close to NFS. MFC after: 1 month Suggested by: jhb	2009-03-25 17:47:22 +00:00
Robert Watson	10263f0832	Add DTrace probes to the NFS access and attribute caches. Access cache events are: nfsclient:accesscache:flush:done nfsclient:accesscache:get:hit nfsclient:accesscache:get:miss nfsclient:accesscache:load:done They pass the vnode, uid, and requested or loaded access mode (if any); the load event may also report a load error if the RPC fails. The attribute cache events are: nfsclient:attrcache:flush:done nfsclient:attrcache:get:hit nfsclient:attrcache:get:miss nfsclient:attrcache:load:done They pass the vnode, optionally the vattr if one is present (hit or load), and in the case of a load event, also a possible RPC error. MFC after: 1 month Sponsored by: Google, Inc.	2009-03-24 17:14:34 +00:00
Robert Watson	47294818f9	Add dtnfsclient, a first cut at an NFSv2/v3 client request DTrace provider. The NFS client exposes 'start' and 'done' probes for NFSv2 and NFSv3 RPCs when using the new RPC implementation, passing in the vnode, mbuf chain, credential, and NFSv2 or NFSv3 procedure number. For 'done' probes, the error number is also available. Probes are named in the following way: ... nfsclient:nfs2:write:start nfsclient:nfs2:write:done ... nfsclient:nfs3:access:start nfsclient:nfs3:access:done ... Access to the unmarshalled arguments is not easily available at this point in the stack, but the passed probe arguments are sufficient to do a lot of interesting things in practice. Technically, these probes may cover multiple RPC retransmits, and even transactions if the transaction ID changes as a result of authentication failure or a jukebox error from the server, but usefully capture the intent of a single NFS request, such as access, getattr, write, etc. Typical use might involve profiling RPC latency by system call, number of RPCs, how often a getattr leads to a call to access, when failed access control checks occur, etc. More detailed RPC information might best be provided by adding a krpc provider. It would also be useful to add NFS client probes for events such as the access cache or attribute cache satisfying requests without an RPC. Sponsored by: Google, Inc. MFC after: 1 month	2009-03-22 22:07:52 +00:00
John Baldwin	9fca7a854c	The zfs_get_xattrdir() function is used to find the extended attribute directory for a znode. When the directory already exists, it returns a referenced but unlocked vnode. When a directory does not yet exist, it calls zfs_make_xattrdir() to create a new one. zfs_make_xattrdir() returns the vnode both referenced and locked and zfs_get_xattrdir() was leaking this vnode lock to its callers. Fix this by dropping the vnode lock if zfs_make_xattrdir() successfully creates a new extended attribute directory. Reviewed by: pjd	2009-03-18 16:19:44 +00:00
John Baldwin	33fc362512	Add a new internal mount flag (MNTK_EXTENDED_SHARED) to indicate that a filesystem supports additional operations using shared vnode locks. Currently this is used to enable shared locks for open() and close() of read-only file descriptors. - When an ISOPEN namei() request is performed with LOCKSHARED, use a shared vnode lock for the leaf vnode only if the mount point has the extended shared flag set. - Set LOCKSHARED in vn_open_cred() for requests that specify O_RDONLY but not O_CREAT. - Use a shared vnode lock around VOP_CLOSE() if the file was opened with O_RDONLY and the mountpoint has the extended shared flag set. - Adjust md(4) to upgrade the vnode lock on the vnode it gets back from vn_open() since it now may only have a shared vnode lock. - Don't enable shared vnode locks on FIFO vnodes in ZFS and UFS since FIFO's require exclusive vnode locks for their open() and close() routines. (My recent MPSAFE patches for UDF and cd9660 already included this change.) - Enable extended shared operations on UFS, cd9660, and UDF. Submitted by: ups Reviewed by: pjd (ZFS bits) MFC after: 1 month	2009-03-11 14:13:47 +00:00
Jamie Gritton	f86bce5ed0	Extend the "vfsopt" mount options for more general use. Make struct vfsopt and the vfs_buildopts function public, and add some new fields to struct vfsopt (pos and seen), and new functions vfs_getopt_pos and vfs_opterror. Further extend the interface to allow reading options from the kernel in addition to sending them to the kernel, with vfs_setopt and related functions. While this allows the "name=value" option interface to be used for more than just FS mounts (planned use is for jails), it retains the current "vfsopt" name and <sys/mount.h> requirement. Approved by: bz (mentor)	2009-03-02 23:26:30 +00:00
Ed Schouten	802cb57e34	Add memmove() to the kernel, making the kernel compile with Clang. When copying big structures, LLVM generates calls to memmove(), because it may not be able to figure out whether structures overlap. This caused linker errors to occur. memmove() is now implemented using bcopy(). Ideally it would be the other way around, but that can be solved in the future. On ARM we don't add anything, because it already has memmove(). Discussed on: arch@ Reviewed by: rdivacky	2009-02-28 16:21:25 +00:00
John Baldwin	ea77ff0a15	Use shared vnode locks when invoking VOP_READDIR(). MFC after: 1 month	2009-02-13 18:18:14 +00:00
Ed Schouten	a4611ab612	Last step of splitting up minor and unit numbers: remove minor(). Inside the kernel, the minor() function was responsible for obtaining the device minor number of a character device. Because we made device numbers dynamically allocated and independent of the unit number passed to make_dev() a long time ago, it was actually a misnomer. If you really want to obtain the device number, you should use dev2udev(). We already converted all the drivers to use dev2unit() to obtain the device unit number, which is still used by a lot of drivers. I've noticed not a single driver passes NULL to dev2unit(). Even if they would, its behaviour would make little sense. This is why I've removed the NULL check. This commit removes minor(), minor2unit() and unit2minor() from the kernel. Because there was a naming collision with uminor(), we can rename umajor() and uminor() back to major() and minor(). This means that the makedev(3) manual page also applies to kernel space code now. I suspect umajor() and uminor() aren't used that often in external code, but to make it easier for other parties to port their code, I've increased __FreeBSD_version to 800062.	2009-01-28 17:57:16 +00:00
Warner Losh	78bc7eec0d	Put the MIPS support back in after it was removed in r185029.	2008-12-04 16:31:08 +00:00
Pawel Jakub Dawidek	35a15332f3	MFp4: Remove assertion that is no longer valid - we now use VOP_CLOSE() in more places (ie vdev_file.c).	2008-11-29 12:32:42 +00:00
Edward Tomasz Napierala	38cc5da78e	MFp4: We don't support TX_CREATE_ACL_ATTR nor TX_MKDIR_ACL_ATTR; code found in zfs_replay.c will panic if it encounters transactions of this type. Make sure we don't put these into the ZIL. Approved by: rwatson (mentor), pjd	2008-11-25 23:05:46 +00:00
Pawel Jakub Dawidek	ad35ee04f4	Fix locking (file descriptor table and Giant around VFS). Most submitted by: kib Reviewed by: kib	2008-11-25 21:14:00 +00:00
Ganbold Tsagaankhuu	79dae0aa0b	Remove unused variable. Found with: Coverity Prevent(tm) CID: 3669,3671 Approved by: jb	2008-11-25 19:25:54 +00:00
Pawel Jakub Dawidek	83080c1ece	Don't use PRIV_ROOT. Here we check if user can share ZFS file system, so PRIV_NFS_DAEMON seems best choice. Discussed with: rwatson	2008-11-23 20:14:19 +00:00
Pawel Jakub Dawidek	bcfbcdca9c	IFp4: Don't rely on disk IDs and always use vdev guids, which means always look up for components by reading metadata. This might be slower when there is a big number of disks in the system, but is definitely more reliable.	2008-11-22 13:33:06 +00:00
Pawel Jakub Dawidek	74303ba55c	IFp4: Finish implementation of chflags(2) for ZFS. While doing this I found that zfs_access() can only handle VREAD, VWRITE and VEXEC, for the rest we need to use vaccess(9).	2008-11-22 13:24:44 +00:00
Pawel Jakub Dawidek	5189bf22c0	IFp4: Don't free pathname too soon, debugging code is still using it.	2008-11-22 13:22:24 +00:00
Doug Rabson	786895f6ba	Add definitions for ZFS pool version 13.	2008-11-21 09:10:35 +00:00
Doug Rabson	0d16312b46	Some zfsboot fixes from Norikatsu Shigemura: 1. zfsboot2 (boot2) doesn't %d (printf), so change %d to %u. 2. chase new zpool versioning as SPA_VERSION. Obtained from: sys/cddl/contrib/opensolaris/uts/common/sys/fs/zfs.h Submitted by: nork	2008-11-19 16:59:19 +00:00
Pawel Jakub Dawidek	1ba4a712dd	Update ZFS from version 6 to 13 and bring some FreeBSD-specific changes. This brings a huge amount of changes, I'll enumerate only user-visible changes: - Delegated Administration Allows regular users to perform ZFS operations, like file system creation, snapshot creation, etc. - L2ARC Level 2 cache for ZFS - allows to use additional disks for cache. Huge performance improvements mostly for random read of mostly static content. - slog Allow to use additional disks for ZFS Intent Log to speed up operations like fsync(2). - vfs.zfs.super_owner Allows regular users to perform privileged operations on files stored on ZFS file systems owned by him. Very careful with this one. - chflags(2) Not all the flags are supported. This still needs work. - ZFSBoot Support to boot off of ZFS pool. Not finished, AFAIK. Submitted by: dfr - Snapshot properties - New failure modes Before, if a write request failed, the system panicked. Now one can select from one of three failure modes: - panic - panic on write error - wait - wait for disk to reappear - continue - serve read requests if possible, block write requests - Refquota, refreservation properties Just quota and reservation properties, but don't count space consumed by children file systems, clones and snapshots. - Sparse volumes ZVOLs that don't reserve space in the pool. - External attributes Compatible with extattr(2). - NFSv4-ACLs Not sure about the status, might not be complete yet. Submitted by: trasz - Creation-time properties - Regression tests for zpool(8) command. Obtained from: OpenSolaris	2008-11-17 20:49:29 +00:00
Edward Tomasz Napierala	4bdaada206	Require write access on a directory being moved from one parent directory to another in ZFS. Approved by: rwatson (mentor), pjd	2008-11-08 19:56:32 +00:00
Edward Tomasz Napierala	36d227d9ed	Backoff the last patch. It was overly restrictive - we want to check for write permission on target only when moving the target between two directories. Approved by: rwatson (mentor)	2008-11-06 22:28:04 +00:00
Edward Tomasz Napierala	b92eda309d	Change ZFS behaviour to match UFS: when moving (rename(2)) a subdirectory from one parent directory to another, in addition to the usual access checks one also needs write access to the subdirectory being moved. Approved by: rwatson (mentor), pjd	2008-11-06 19:17:58 +00:00
Craig Rodrigues	6a73ed4f46	Remove definition of KMEM_DEBUG accidentally brought in by latest DTrace import. Noticed by: thompsa	2008-11-05 20:32:13 +00:00
Craig Rodrigues	f5a97d1bcb	Merge latest DTrace changes from Perforce.	2008-11-05 19:39:11 +00:00
Edward Tomasz Napierala	15bc6b2bd8	Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
Attilio Rao	0d7935fd01	Remove the struct thread unuseful argument from bufobj interface. In particular following functions KPI results modified: - bufobj_invalbuf() - bufsync() and BO_SYNC() "virtual method" of the buffer objects set. Main consumers of bufobj functions are affected by this change too and, in particular, functions which changed their KPI are: - vinvalbuf() - g_vfs_close() Due to the KPI breakage, __FreeBSD_version will be bumped in a later commit. As a side note, please consider just temporary the 'curthread' argument passing to VOP_SYNC() (in bufsync()) as it will be axed out ASAP Reviewed by: kib Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-10-10 21:23:50 +00:00
John Birrell	fd4cdfbf46	Disable use of the user credentials until there is code to set the levels that DTrace uses. This fixes a bug that would have affected kernels built with MAC and all kernels built after the mpsafetty integration. The bug will be apparent in RELENG7 on MAC kernels. Reported by: kan	2008-09-27 17:52:48 +00:00
Ed Schouten	6bfa9a2d66	Replace all calls to minor() with dev2unit(). After I removed all the unit2minor()/minor2unit() calls from the kernel yesterday, I realised calling minor() everywhere is quite confusing. Character devices now only have the ability to store a unit number, not a minor number. Remove the confusion by using dev2unit() everywhere. This commit could also be considered as a bug fix. A lot of drivers call minor(), while they should actually be calling dev2unit(). In -CURRENT this isn't a problem, but it turns out we never had any problem reports related to that issue in the past. I suspect not many people connect more than 256 pieces of the same hardware. Reviewed by: kib	2008-09-27 08:51:18 +00:00
Ed Schouten	d3ce832719	Remove unit2minor() use from kernel code. When I changed kern_conf.c three months ago I made device unit numbers equal to (unneeded) device minor numbers. We used to require bitshifting, because there were eight bits in the middle that were reserved for a device major number. Not very long after I turned dev2unit(), minor(), unit2minor() and minor2unit() into macro's. The unit2minor() and minor2unit() macro's were no-ops. We'd better not remove these four macro's from the kernel, because there is a lot of (external) code that may still depend on them. For now it's harmless to remove all invocations of unit2minor() and minor2unit(). Reviewed by: kib	2008-09-26 14:19:52 +00:00
Warner Losh	6e1a9d1739	Mips needs the same treatment for atomic_or_8 as the other RISCy architectures.	2008-09-18 19:57:06 +00:00
Pawel Jakub Dawidek	062ea27ee4	Add missing ZFS_EXIT(). PR: kern/124899 Submitted by: Masakazu Asama <m-asama@ginzado.ne.jp>	2008-09-15 11:27:25 +00:00
Edward Tomasz Napierala	dfa7fd1d70	Remove VSVTX, VSGID and VSUID. This should be a no-op, as VSVTX == S_ISVTX, VSGID == S_ISGID and VSUID == S_ISUID. Approved by: rwatson (mentor)	2008-09-10 13:16:41 +00:00
Pawel Jakub Dawidek	1b856fa491	Initialize vp, so we don't call VOP_UNLOCK() with NULL vnode pointer. Confirmed by: marcus	2008-09-07 07:55:12 +00:00
Pawel Jakub Dawidek	433751bb50	Lock vnode exclusively around insmntque().	2008-09-06 17:24:07 +00:00
Pawel Jakub Dawidek	7fa1f32a7e	Catch up after last insmntque() changes: - The vnode has to be locked exclusively before calling insmntque(). - Until I find a way to handle insmntque() failures use VV_FORCEINSMQ flag to force insmntque() to always succeed. Reported by: kris, trasz, des, others Suggested by: kib Tested by: trasz	2008-09-05 07:00:40 +00:00
Attilio Rao	59d4932531	Decontextualize vfs_busy(), vfs_unbusy() and vfs_mount_alloc() functions. Manpages are updated accordingly. Tested by: Diego Sardina <siarodx at gmail dot com>	2008-08-31 14:26:08 +00:00
Scott Long	a25cb00747	Ensure that the padding calculation doesn't return a negative value. Submitted by: kib Approved by: jb	2008-08-29 15:55:49 +00:00
Attilio Rao	0359a12ead	Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-08-28 15:23:18 +00:00
Warner Losh	e6b3a7a9c1	Add MIPS support. Reviewed by: jb@	2008-08-23 04:58:11 +00:00
John Birrell	ac80559536	Add calls to callout_drain() to ensure the callouts are flushed before we free memory from underneath them. This fixes an occasional panic I've been seeing in softclock() where a bad pointer would be encountered when pushing DTrace hard.	2008-08-19 21:28:58 +00:00
Pawel Jakub Dawidek	37876323b1	We want to use LBOLT instead of lbolt on FreeBSD. I've this already fixed in p4, but the fix was never integrated into HEAD. Reported by: ed	2008-07-21 14:35:48 +00:00
Pawel Jakub Dawidek	28814ddbe8	We want to check new options given, not the current ones. This fixes 'zpool import -o <mntopt> <name>' not working properly.	2008-07-21 09:45:44 +00:00
Ed Schouten	3f7eea97fd	Remove the $FreeBSD$ tag again, now I know fbsd:nokeywords exists. Requested by: pjd Approved by: philip (mentor)	2008-06-12 08:53:54 +00:00
Ed Schouten	0f03ce1bb8	Turn dev2unit(), minor(), unit2minor() and minor2unit() into macro's. Now that we got rid of the minor-to-unit conversion and the constraints on device minor numbers, we can convert the functions that operate on minor and unit numbers to simple macro's. The unit2minor() and minor2unit() macro's are now no-ops. The ZFS code also defined a macro named `minor'. Change the ZFS code to use umajor() and uminor() here, as it is the correct approach to do this. Also add $FreeBSD$ to keep SVN happy. Approved by: philip (mentor), pjd	2008-06-12 08:30:54 +00:00
Ed Schouten	29d4cb241b	Don't enforce unique device minor number policy anymore. Except for the case where we use the cloner library (clone_create() and friends), there is no reason to enforce a unique device minor number policy. There are various drivers in the source tree that allocate unr pools and such to provide minor numbers, without using them themselves. Because we still need to support unique device minor numbers for the cloner library, introduce a new flag called D_NEEDMINOR. All cdevsw's that are used in combination with the cloner library should be marked with this flag to make the cloning work. This means drivers can now freely use si_drv0 to store their own flags and state, making it effectively the same as si_drv1 and si_drv2. We still keep the minor() and dev2unit() routines around to make drivers happy. The NTFS code also used the minor number in its hash table. We should not do this anymore. If the si_drv0 field would be changed, it would no longer end up in the same list. Approved by: philip (mentor)	2008-06-11 18:55:19 +00:00
John Birrell	4ca07625aa	Merge a recent change from the OpenSolaris source tree. (Don't ask for a vendor import of this yet, we're in the early days of svn) Instead of using cyclic timers to call the state clean and deadman callbacks, use a callout on FreeBSD to avoid the deadlock on FreeBSD due to trying to send interprocessor interrupts with interrupts disabled. Reported by: ps, jhb, peter, thompsa	2008-06-01 01:46:37 +00:00
Pawel Jakub Dawidek	ed5a2ac45c	Fix namespace collision after src/sys/sys/file.h:1.78.	2008-05-25 22:34:17 +00:00
John Birrell	727acbb41b	Comment out the code that breaks with invariants. This is stuff that is still WIP along with the lockstat provider, so there is no harm leaving it out for now.	2008-05-25 20:24:07 +00:00
Bjoern A. Zeeb	079d3bfcfb	Remove redundant redeclaration of 'zone_drain'.	2008-05-24 19:30:38 +00:00
John Birrell	8fc6245976	Make the zfs module depend on the opensolaris module in preparation for it to share stuff with the DTrace modules.	2008-05-24 06:43:55 +00:00
John Birrell	25f292128c	Messing with the endian defines breaks the use of other FreeBSD headers.	2008-05-23 23:03:17 +00:00
John Birrell	fd930d81d8	Delete a couple of OpenSolaris headers which get in the way of our implementation.	2008-05-23 22:40:58 +00:00
John Birrell	8599306711	OpenSolaris kernel module compatibility sources.	2008-05-23 22:39:28 +00:00
John Birrell	5a3c3bfaa3	The cyclic timer device. This is a cut down version of the one in OpenSolaris. We don't have the lock levels that they do, so this is just hooked into clock interrupts.	2008-05-23 22:21:58 +00:00
John Birrell	91eaf3e183	Custom DTrace kernel module files plus FreeBSD-specific DTrace providers.	2008-05-23 05:59:42 +00:00
John Birrell	32a109c1d8	A 'special' compatibility header to plug OpenSolaris code.	2008-05-22 09:08:41 +00:00
John Birrell	4706efa4f6	Additional compatibility headers.	2008-05-22 08:35:03 +00:00
John Birrell	1583a68737	Compatibility stuff for DTrace.	2008-05-22 08:33:24 +00:00
John Birrell	5a1b490d50	FreeBSD changes to vendor source.	2008-05-22 07:33:39 +00:00
John Birrell	cd844e7a7d	This commit was generated by cvs2svn to compensate for changes in r179193, which included commits to RCS files with non-trunk default branches.	2008-05-22 07:04:10 +00:00
Attilio Rao	295624f56a	LO_ENROLLPEND is no more existing so just axe it (it was left out by the original commit axing it).	2008-05-16 02:09:13 +00:00
John Birrell	db612abe8d	Add FreeBSD IDs to files that originate in FreeBSD.	2008-04-22 07:43:00 +00:00
Konstantin Belousov	eab626f110	Move the head of byte-level advisory lock list from the filesystem-specific vnode data to the struct vnode. Provide the default implementation for the vop_advlock and vop_advlockasync. Purge the locks on the vnode reclaim by using the lf_purgelocks(). The default implementation is augmented for the nfs and smbfs. In the nfs_advlock, push the Giant inside the nfs_dolock. Before the change, the vop_advlock and vop_advlockasync have taken the unlocked vnode and dereferenced the fs-private inode data, racing with the vnode reclamation due to forced unmount. Now, the vop_getattr under the shared vnode lock is used to obtain the inode size, and later, in the lf_advlockasync, after locking the vnode interlock, the VI_DOOMED flag is checked to prevent an operation on the doomed vnode. The implementation of the lf_purgelocks() is submitted by dfr. Reported by: kris Tested by: kris, pho Discussed with: jeff, dfr MFC after: 2 weeks	2008-04-16 11:33:32 +00:00
Marius Strobl	5b20de10b9	Add atomic operations for ZFS/sparc64. Approved by: core, pjd Obtained from: OpenSolaris (w/ adaptations) MFC after: 2 weeks	2008-04-11 22:59:33 +00:00
Marius Strobl	20a8e8d594	- Fix the path encoded in the multiple inclusion protection. - GCC uses 32-byte function alignment for UltraSPARC CPUs. - Remove code duplication. Approved by: core, pjd MFC after: 2 weeks	2008-04-11 22:53:06 +00:00
Doug Rabson	dfdcada31e	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
Robert Watson	237fdd787b	In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr. MFC after: 1 month Discussed with: imp, rink	2008-03-16 10:58:09 +00:00
Pawel Jakub Dawidek	2b1c6615bc	Fix mmap(2) on ZFS after some changes in VM subsystem. Submitted by: alc Reported by: kris (originally) and many others Tested with: fsx MFC after: 1 week	2008-03-15 23:23:04 +00:00
Attilio Rao	81c794f998	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-25 18:45:57 +00:00
Attilio Rao	628f51d275	Introduce some functions in the vnode locks namespace and in the ffs namespace in order to handle lockmgr fields in a controlled way instead than spreading all around bogus stubs: - VN_LOCK_AREC() allows lock recursion for a specified vnode - VN_LOCK_ASHARE() allows lock sharing for a specified vnode In FFS land: - BUF_AREC() allows lock recursion for a specified buffer lock - BUF_NOREC() disallows recursion for a specified buffer lock Side note: union_subr.c::unionfs_node_update() is the only other function directly handling lockmgr fields. As this is not simple to fix, it has been left behind as "sole" exception.	2008-02-24 16:38:58 +00:00
Pawel Jakub Dawidek	79bc018dd7	- Reduce how much ZFS caches by default. This is another change to mitigate 'kmem_map too small panics'. - Print two warnings if there is not enough memory and not enough address space. - Improve comment.	2008-01-24 11:24:16 +00:00
Pawel Jakub Dawidek	44ce1efd91	Change type of kmem_used() and kmem_size() functions to uint64_t, so it doesn't overflow in arc.c in this check: if (kmem_used() > (kmem_size() * 4) / 5) return (1); With this bug ZFS almost doesn't cache. Only 32bit machines are affected that have vm.kmem_size set to values >=1GB. Reported by: David Taylor <davidt@yadt.co.uk>	2008-01-24 11:21:54 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the unneeded extra argument and pass curthread explicitly to lower layer functions, when necessary. The KPI is broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Attilio Rao	cb05b60a89	vn_lock() is currently only used with 'curthread' passed as argument. Remove this argument and pass curthread directly to the underlying VOP_LOCK1() VFS method. This change makes the code cleaner and in particular removes an annoying dependency, helping the next lockmgr() cleanup. The KPI, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, the next commits will address a similar cleanup of VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
John Birrell	35a04710d7	Remove some compatibility stuff that we now get from the Solaris header.	2007-11-29 00:15:08 +00:00
John Birrell	b468fe2bce	* Check endianness the FreeBSD way. * Use LBOLT rather than lbolt to avoid a clash with a FreeBSD global variable.	2007-11-28 22:16:00 +00:00
John Birrell	9587fed572	Fix a prototype definition.	2007-11-28 22:13:28 +00:00
John Birrell	da9085a1c0	Check endianness the FreeBSD way.	2007-11-28 22:12:21 +00:00
John Birrell	47b288c152	Include an extra header to get this to compile cleanly.	2007-11-28 22:11:39 +00:00
John Birrell	57438287ab	Add more OpenSolaris compatibility headers.	2007-11-28 21:50:40 +00:00
John Birrell	eca148b637	Remove an extern that is defined elsewhere.	2007-11-28 21:50:05 +00:00
John Birrell	edadde229a	Add compatibility cruft moved from under _SOLARIS_C_SOURCE in sys/types.h	2007-11-28 21:49:16 +00:00
John Birrell	35ba7f225f	Remove a typedef which was just a hack to avoid including vmem.h. That typedef breaks other Solaris code.	2007-11-28 21:48:25 +00:00
John Birrell	773f4e3849	Add a missing volatile so that the code compiles cleanly.	2007-11-28 21:47:09 +00:00
John Birrell	4fc8feafc7	Rename the definition of lbolt to LBOLT to avoid a clash with a global variable in FreeBSD. Until now lbolt in sys/proc.h has been #ifdef'ed out based on _SOLARIS_C_SOURCE, but that is going away now.	2007-11-28 21:44:17 +00:00
Pawel Jakub Dawidek	4d4daf5901	Warn if kmem_map size is set to less than 512MB. Previous warning was a bit pointless, because default is set to something around 300MB and also insufficient. MFC after: 3 days	2007-11-07 14:44:31 +00:00
Pawel Jakub Dawidek	232a80f675	Remove unused header. MFC after: 3 days	2007-11-05 22:18:34 +00:00
Pawel Jakub Dawidek	a33b7a8f5f	If setting a state to anything but open state, close access to vdev. This fixes replacing drive in place, eg. zpool replace tank da1 da1. Before it complained that device is already open. MFC after: 1 week	2007-11-05 21:30:48 +00:00
Pawel Jakub Dawidek	171eb887e9	Remove "zfs:" prefix from lock and condvar names and also skip non-letter characters (mostly "&"). Because top(1) shows only first six characters of wait channel, without this change we saw only one meaningful character. Requested by: kris & others MFC after: 1 week	2007-11-05 18:40:55 +00:00
Ulf Lilleengen	6509baf851	- Add sysctl for sizeof(znode_t), which will be used by fstat(1). Approved by: pjd (mentor)	2007-11-02 00:35:05 +00:00
Pawel Jakub Dawidek	ef2d58b58f	Call zil_commit() (if ZIL is not disabled) after every non-read request (BIO_WRITE and BIO_FLUSH) as it is done in Solaris. The difference is that Solaris calls it only for sync requests, but we can't say in GEOM if the request is sync or async, so we do it for every request. MFC after: 1 week	2007-11-01 11:04:21 +00:00
Pawel Jakub Dawidek	4f2398ea17	- Move crfree() outside MNT_ILOCK()/MNT_IUNLOCK() to eliminate a LOR: 1st 0xc4cea568 struct mount mtx (struct mount mtx) @ /usr/src/sys/modules/zfs/../../compat/opensolaris/kern/opensolaris_vfs.c:209 2nd 0xc3ee9010 sleep mtxpool (sleep mtxpool) @ /usr/src/sys/kern/kern_resource.c:1266 - Move crdup() outside MNT_ILOCK()/MNT_IUNLOCK(), as it can sleep. Reported by: Olli Hauer <ohauer@gmx.de> MFC after: 3 days	2007-11-01 08:58:29 +00:00
Julian Elischer	3745c395ec	Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. This makes way for us to add REAL kthread_create() and friends that actually make threads. It turns out that most of these calls actually end up being moved back to the thread version when it's added, but we need to make this cosmetic change first. I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.	2007-10-20 23:23:23 +00:00
Andrew Thompson	1fe1be1535	ZFS_LOG adds a newline by itself. Pointed out by: pjd	2007-10-14 16:14:32 +00:00
Andrew Thompson	9528621759	Print the ZFS ereport to the console if vfs.zfs.debug is set to help diagnose problems with zfs-on-root, since devd isn't running yet. Reviewed by: pjd	2007-10-14 07:58:50 +00:00
Pawel Jakub Dawidek	e8bd23b460	Fix lock leak leading to the 'System call <name> returning with 1 locks held' panic. Reported by: kris Approved by: re (kensmith)	2007-10-04 17:51:59 +00:00
Pawel Jakub Dawidek	a95a61fc19	Now that we have CDDLed code in the tree, add CDDL license. Discussed with: core Approved by: re (kensmith)	2007-09-23 07:04:50 +00:00
Pawel Jakub Dawidek	a3c8c2e60f	Reduce the limit of vnodes on i386 when ZFS is loaded to 3/4 of the original value, so we don't run out of KVA. The default vnodes limit fits better for UFS, but ZFS allocates more file system specific memory for a vnode than UFS. Don't touch the vnodes limit if we detect it was tuned by the system administrator and restore the original value when ZFS is unloaded. This isn't the final fix, but before we implement something better, this will help to stabilize ZFS under heavy load on i386. Approved by: re (bmah)	2007-09-10 19:58:14 +00:00
Pawel Jakub Dawidek	ef0ffc1c6f	After dfr@ vnode leak fix, we can allow ARC to consume more memory. Tested by: kris Approved by: re (bmah)	2007-09-10 18:12:27 +00:00
Pawel Jakub Dawidek	6bc581fcf0	Use CTLFLAG_RDTUN for tunable sysctls. Approved by: re (bmah)	2007-09-01 06:23:42 +00:00
Pawel Jakub Dawidek	70eaa4219c	Some ZFS threads need a stack larger than the default 8kB, so use 16kB of alternate stack if the default is smaller than 16kB. Approved by: re (rwatson)	2007-08-16 20:33:20 +00:00
Pawel Jakub Dawidek	aa222db26f	Update assertion after revision 1.23. Reviewed by: dfr Approved by: re (rwatson)	2007-07-24 15:00:43 +00:00
Doug Rabson	2dc26b36c8	Correct a reference-counting mistake in the ZFS code which led to abnormal memory usage and pessimal cache performance. Reviewed by: pjd Approved by: re (rwatson)	2007-07-09 09:03:49 +00:00
Doug Rabson	7761242694	In zfs_vget, if we fail to translate an inode number to the corresponding vnode, make sure we return an error code to the caller. Reviewed by: pjd Approved by: re	2007-06-27 12:00:24 +00:00
Robert Watson	32f9753cfb	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
Marcel Moolenaar	6d63683c41	Add my copyright. Requested by: pjd@	2007-06-08 16:20:03 +00:00
Pawel Jakub Dawidek	3b7917d766	- Reduce number of atomic operations needed to be implemented in asm by implementing some of them using existing ones. - Allow to compile ZFS on all archs and use atomic operations surrounded by global mutex on archs we don't have or can't have all atomic operations needed by ZFS.	2007-06-08 12:35:47 +00:00
Pawel Jakub Dawidek	083c4dd695	Missing atomic operations for ZFS/ia64. Submitted by: marcel	2007-06-08 12:26:30 +00:00
David Malone	041b706b2f	Despite several examples in the kernel, the third argument of sysctl_handle_int is not sizeof the int type you want to export. The type must always be an int or an unsigned int. Remove the instances where a sizeof(variable) is passed to stop people accidentally cutting and pasting these examples. In a few places sysctl_handle_int was being used on 64 bit types, which would truncate the value to be exported. In these cases use sysctl_handle_quad to export them and change the format to Q so that sysctl(1) can still print them.	2007-06-04 18:25:08 +00:00
Pawel Jakub Dawidek	b166b92692	Reimplement traverse() helper function: 1. Pass locking flags to VFS_ROOT(). 2. Check v_mountedhere while the vnode is locked. 3. Always return locked vnode on success. Change 1 fixes the problem reported by Stephen M. Rumble - after the zfs_vfsops.c,1.9 change, zfs_root() no longer locks the vnode unconditionally and traverse() didn't pass the right lock type to VFS_ROOT(). The result was that the kernel panicked when the .zfs/ directory was accessed via NFS.	2007-06-04 11:31:46 +00:00
Konstantin Belousov	9e223287c0	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
Pawel Jakub Dawidek	5750956634	Adjust va_mask for setattr. FreeBSD doesn't have va_mask, so we initialize it based on individual fields being set. This doesn't work for setattr replay, because va_type is set there, so we add AT_TYPE flag to va_mask, which won't be accepted by zfs_setattr(). Reported by: kris	2007-05-28 02:37:43 +00:00
Pawel Jakub Dawidek	a906fff9c5	Because we allocate componentname structures on stack, bzero() them before use just in case.	2007-05-28 00:26:20 +00:00
Pawel Jakub Dawidek	0d99488ded	There are too many false positive LORs reported by WITNESS, so when ZFS debug is turned off, initialize locks with NOWITNESS flag. At some point I'll get back to them, we would probably need BLESSING functionality, which is currently turned off by default.	2007-05-26 21:37:14 +00:00
Pawel Jakub Dawidek	fbd08bbe6a	DNLC_NO_VNODE can't be NULL. Reported by: ru	2007-05-24 13:44:45 +00:00
Pawel Jakub Dawidek	f92dd5c2d9	Initialize ZFS a bit earlier and block root mounting until initialization is complete. This fixes some root-on-ZFS configurations. Reported by: Bruno Damour <freebsd.ruomad@free.fr> Tested by: Bruno Damour <freebsd.ruomad@free.fr>	2007-05-24 07:43:00 +00:00
Pawel Jakub Dawidek	d4c4dfe96f	FreeBSD's namecache works quite well with ZFS, so remove DNLC.	2007-05-23 21:33:02 +00:00
Pawel Jakub Dawidek	4282c449dc	All objects we create using GFS are directories, so initialize d_type properly, but add XXX comment saying that it can eventually change in the future.	2007-05-23 21:27:47 +00:00
Pawel Jakub Dawidek	124427f96d	Lock vnode on lookup. This fixes ZIL replay for rmdir/unlink/rename. Reported by: des	2007-05-22 21:22:25 +00:00
Pawel Jakub Dawidek	68e752c31c	Increase debug level - this message is not that important.	2007-05-09 22:32:49 +00:00
Pawel Jakub Dawidek	6a7309390f	- Add missing lock destruction and remove duplicate initializations. With this change it is possible to unload zfs.ko module from WITNESS-enabled kernel. - Remove bogus comment.	2007-05-06 19:05:37 +00:00
Pawel Jakub Dawidek	7baf73a6c2	Use provider's ident to handle situations when disks are moved around and show up with different names: first try to open provider using remembered name and compare its ident, if equal, this is our provider, if not equal or there is no provider with such name, find provider with remembered ident and don't care about the name.	2007-05-06 01:39:39 +00:00
Pawel Jakub Dawidek	fab3f4465e	MFp4: We don't need to cover vnode_pager_setsize() with the z_map_lock.	2007-05-06 01:27:54 +00:00
Pawel Jakub Dawidek	57504dcfaf	Share-lock a vnode where possible.	2007-05-02 01:03:10 +00:00
Pawel Jakub Dawidek	5bec66402b	When parent directory has to be unlocked, lock it back with the same lock type. Before this change, if directory was shared-locked, it was relocked exclusively.	2007-05-02 00:41:44 +00:00
Pawel Jakub Dawidek	9167141244	Lock vnode using cn_lkflags in case the caller wants the vnode to be shared-locked.	2007-05-02 00:39:52 +00:00
Pawel Jakub Dawidek	04748b1b2e	The getnewvnode() function sets LK_NOSHARE by default, so if we want to support shared vnodes locking, we need to remove that flag. Also add LK_CANRECURSE flag as found in nfsclient.	2007-05-02 00:22:12 +00:00
Pawel Jakub Dawidek	0775674bbc	ZFS should update timestamps upon the creat() of an existing file. Obtained from: OpenSolaris Bug: http://bugs.opensolaris.org/view_bug.do?bug_id=6465105	2007-05-02 00:18:22 +00:00
Pawel Jakub Dawidek	6de6bff649	- Lock vnode with flags passed in as argument in zfs_vget() and zfs_root(). Pointed out by: ups Also reported by: kris - Add comments where I'm not sure if LK_RETRY should be used.	2007-05-02 00:09:34 +00:00

311 Commits