freebsd-skq

Author	SHA1	Message	Date
pjd	2a3cf7f364	- Make 'flags' argument to chflags(2), fchflags(2) and lchflags(2) of type u_long. Before this change it was of type int for syscalls, but prototypes in sys/stat.h and documentation for chflags(2) and fchflags(2) (but not for lchflags(2)) stated that it was u_long. Now some related functions use u_long type for flags (strtofflags(3), fflagstostr(3)). - Make path argument of type 'const char *' for consistency. Discussed on: arch Sponsored by: The FreeBSD Foundation	2013-03-21 22:44:33 +00:00
kib	1b20f7cc18	Remove negative name cache entry pointing to the target name, which could be instantiated while tdvp was unlocked. Reported by: Rick Miller <vmiller at hostileadmin com> Tested by: pho MFC after: 1 week	2013-03-17 15:11:37 +00:00
attilio	15bf891afe	Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() to their "write" versions. Sponsored by: EMC / Isilon storage division	2013-02-20 12:03:20 +00:00
attilio	658534ed5a	Switch vm_object lock to be a rwlock. * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitve to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h	2013-02-20 10:38:34 +00:00
attilio	cde657f5b1	Remove a racy checks on resident and cached pages for tmpfs_mapped{read, write}() functions: - tmpfs_mapped{read, write}() are only called within VOP_{READ, WRITE}(), which check before-hand to work only on valid VREG vnodes. Also the vnode is locked for the duration of the work, making vnode reclaiming impossible, during the operation. Hence, vobj can never be NULL. - Currently check on resident pages and cached pages without vm object lock held is racy and can do even more harm than good, as a page could be transitioning between these 2 pools and then be skipped entirely. Skip the checks as lookups on empty splay trees are very cheap. Discussed with: alc Tested by: flo MFC after: 2 weeks	2013-02-10 01:04:10 +00:00
gleb	97e76936ec	tmpfs: Replace directory entry linked list with RB-Tree. Use file name hash as a tree key, handle duplicate keys. Both VOP_LOOKUP and VOP_READDIR operations utilize same tree for search. Directory entry offset (cookie) is either file name hash or incremental id in case of hash collisions (duplicate-cookies). Keep sorted per directory list of duplicate-cookie entries to facilitate cookie number allocation. Don't fail if previous VOP_READDIR() offset is no longer valid, start with next dirent instead. Other file system handle it similarly. Workaround race prone tn_readdir_last[pn] fields update. Add tmpfs_dir_destroy() to free all dirents. Set NFS cookies in tmpfs_dir_getdents(). Return EJUSTRETURN from tmpfs_dir_getdents() instead of hard coded -1. Mark directory traversal routines static as they are no longer used outside of tmpfs_subr.c	2013-01-06 22:15:44 +00:00
attilio	d5d551ec46	Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag. Porters should refer to __FreeBSD_version 1000021 for this change as it may have happened at the same timeframe.	2012-11-09 18:02:25 +00:00
mdf	394f27b845	Fix up kernel sources to be ready for a 64-bit ino_t. Original code by: Gleb Kurtsou	2012-09-27 23:30:49 +00:00
kib	cac2fe116f	After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. Other big dependency of vm_page.h on vm_param.h are PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitely for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks	2012-08-05 14:11:42 +00:00
trasz	023bd7c6bf	Remove unused thread argument to vrecycle(). Reviewed by: kib	2012-04-23 14:10:34 +00:00
jh	ce0372c7aa	Return EOPNOTSUPP rather than EPERM for the SF_SNAPSHOT flag because tmpfs doesn't support snapshots. Suggested by: bde	2012-04-18 15:22:08 +00:00
jh	734fbc687a	Sync tmpfs_chflags() with the recent changes to UFS: - Add a check for unsupported file flags. - Return EPERM when an user without PRIV_VFS_SYSFLAGS privilege attempts to toggle SF_SETTABLE flags.	2012-04-16 18:10:34 +00:00
jh	e8d1b1d0ce	tmpfs: Allow update mounts only for certain options. Since r230208 update mounts were allowed if the list of mount options contained the "export" option. This is not correct as tmpfs doesn't really support updating all options. Reviewed by: kevlo, trociny	2012-04-16 18:07:42 +00:00
gleb	748c9a0d1c	Provide better description for vfs.tmpfs.memory_reserved sysctl. Suggested by: Anton Yuzhaninov <citrin@citrin.ru>	2012-04-15 21:59:28 +00:00
attilio	628004ddfb	- Introduce a cache-miss optimization for consistency with other accesses of the cache member of vm_object objects. - Use novel vm_page_is_cached() for checks outside of the vm subsystem. Reviewed by: alc MFC after: 2 weeks X-MFC: r234039	2012-04-09 17:05:18 +00:00
gleb	75744aafa6	tmpfs supports only INT_MAX nodes due to limitations of unit number allocator. Replace UINT32_MAX checks with INT_MAX. Keeping more than 2^31 nodes in memory is not likely to become possible in foreseeable feature and would require new unit number allocator. Discussed with: delphij MFC after: 2 weeks	2012-04-07 15:30:46 +00:00
gleb	fb452e77b0	Add vfs_getopt_size. Support human readable file system options in tmpfs. Increase maximum tmpfs file system size to 4GB*PAGE_SIZE on 32 bit archs. Discussed with: delphij MFC after: 2 weeks	2012-04-07 15:27:34 +00:00
gleb	c4ba84f86e	Add reserved memory limit sysctl to tmpfs. Cleanup availble and used memory functions. Check if free pages available before allocating new node. Discussed with: delphij	2012-04-07 15:23:51 +00:00
gleb	cc93f6fa4e	Prevent tmpfs_rename() deadlock in a way similar to UFS Unlock vnodes and try to lock them one by one. Relookup fvp and tvp. Approved by: mdf (mentor)	2012-03-14 09:15:50 +00:00
gleb	fa1fc18612	Don't enforce LK_RETRY to get existing vnode in tmpfs_alloc_vp() Doomed vnode is hardly of any use here, besides all callers handle error case. vfs_hash_get() does the same. Don't mess with vnode holdcount, vget() takes care of it already. Approved by: mdf (mentor)	2012-03-14 08:29:21 +00:00
kib	8adabb0356	Remove fifo.h. The only used function declaration from the header is migrated to sys/vnode.h. Submitted by: gianni	2012-03-11 12:19:58 +00:00
jhb	7bf627fd71	Similar to the fixes in 226967 and 226987, purge any name cache entries associated with the previous vnode (if any) associated with the target of a rename(). Otherwise, a lookup of the target pathname concurrent with a rename() could re-add a name cache entry after the namei(RENAME) lookup in kern_renameat() had purged the target pathname. MFC after: 2 weeks	2012-03-02 18:55:19 +00:00
tijl	fec991ff0a	Replace PRIdMAX with "jd" in a printf call. Cast the corresponding value to intmax_t instead of uintmax_t, because the original type is off_t.	2012-02-14 11:24:24 +00:00
kevlo	1a0c72b3fe	Return EOPNOTSUPP since we only support update mounts for NFS export. Spotted by: trociny	2012-01-17 01:25:53 +00:00
kevlo	ba45f0caad	Add nfs export support to tmpfs(5) Reviewed by: kib	2012-01-16 10:25:22 +00:00
alc	213db2103e	When tmpfs_write() resets an extended file to its original size after an error, we want tmpfs_reg_resize() to ignore I/O errors and unconditionally update the file's size. Reviewed by: kib MFC after: 3 weeks	2012-01-16 00:26:49 +00:00
alc	413b89a7e9	Neither tmpfs_nocacheread() nor tmpfs_mappedwrite() needs to call vm_object_pip_{add,subtract}() on the swap object because the swap object can't be destroyed while the vnode is exclusively locked. Moreover, even if the swap object could have been destroyed during tmpfs_nocacheread() and tmpfs_mappedwrite() this code is broken because vm_object_pip_subtract() does not wake up the sleeping thread that is trying to destroy the swap object. Free invalid pages after an I/O error. There is no virtue in keeping them around in the swap object creating more work for the page daemon. (I believe that any non-busy page in the swap object will now always be valid.) vm_pager_get_pages() does not return a standard errno, so its return value should not be returned by tmpfs without translation to an errno value. There is no reason for the wakeup on vpg in tmpfs_mappedwrite() to occur with the swap object locked. Eliminate printf()s from tmpfs_nocacheread() and tmpfs_mappedwrite(). (The swap pager already spam your console if data corruption is imminent.) Reviewed by: kib MFC after: 3 weeks	2012-01-14 23:04:27 +00:00
alc	d27a56b062	Correct an error of omission in the implementation of the truncation operation on POSIX shared memory objects and tmpfs. Previously, neither of these modules correctly handled the case in which the new size of the object or file was not a multiple of the page size. Specifically, they did not handle partial page truncation of data stored on swap. As a result, stale data might later be returned to an application. Interestingly, a data inconsistency was less likely to occur under tmpfs than POSIX shared memory objects. The reason being that a different mistake by the tmpfs truncation operation helped avoid a data inconsistency. If the data was still resident in memory in a PG_CACHED page, then the tmpfs truncation operation would reactivate that page, zero the truncated portion, and leave the page pinned in memory. More precisely, the benevolent error was that the truncation operation didn't add the reactivated page to any of the paging queues, effectively pinning the page. This page would remain pinned until the file was destroyed or the page was read or written. With this change, the page is now added to the inactive queue. Discussed with: jhb Reviewed by: kib (an earlier version) MFC after: 3 weeks	2012-01-08 20:09:26 +00:00
alc	9976fae042	Don't pass VM_ALLOC_ZERO to vm_page_grab() in tmpfs_mappedwrite() and tmpfs_nocacheread(). It is both unnecessary and a pessimization. It results in either the page being zeroed twice or zeroed first and then overwritten by an I/O operation. MFC after: 3 weeks	2012-01-03 03:29:01 +00:00
ivoras	6123cbfb2e	Avoid panics from recursive rename operations. Not a perfect patch but good enough for now. PR: kern/159418 Submitted by: Gleb Kurtsou Reviewed by: kib MFC after: 1 month	2011-11-22 16:18:12 +00:00
delphij	b669bf1954	Improve the way to calculate available pages in tmpfs: - Don't deduct wired pages from total usable counts because it does not make any sense. To make things worse, on systems where swap size is smaller than physical memory and use a lot of wired pages (e.g. ZFS), tmpfs can suddenly have free space of 0 because of this; - Count cached pages as available; [1] - Don't count inactive pages as available, technically we could but that might be too aggressive; [1] [1] Suggested by kib@ MFC after: 1 week	2011-11-21 20:26:22 +00:00
marcel	7952c4cb03	Don astbestos garment and remove the warning about TMPFS being experimental -- highly experimental even. So far the closest to a bug in TMPFS that people have gotten to relates to how ZFS can take away from the memory that TMPFS needs. One can argue that such is not a bug in TMPFS. Irrespective, even if there is a bug here and there in TMPFS, it's not in our own advantage to scare people away from using TMPFS. I for one have been using it, even with ZFS, very successfully.	2011-11-07 16:21:50 +00:00
pho	d08a02f709	Added missing cache purge of from argument for rename(). Reported by: Anton Yuzhaninov <citrin citrin ru> In collaboration with: kib MFC after: 1 week	2011-11-01 12:33:06 +00:00
kib	a9d505a22a	Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify afalgs. Document the changes to flags field to only require the page lock. Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced. Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)	2011-09-06 10:30:11 +00:00
alc	21902be08c	Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages. This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages. Update all of the existing assertions on pmap_remove_all() to reflect this change. Reviewed by: kib	2011-06-29 16:40:41 +00:00
rmacklem	fbb8a5e8ec	Add a lock flags argument to the VFS_FHTOVP() file system method, so that callers can indicate the minimum vnode locking requirement. This will allow some file systems to choose to return a LK_SHARED locked vnode when LK_SHARED is specified for the flags argument. This patch only adds the flag. It does not change any file system to use it and all callers specify LK_EXCLUSIVE, so file system semantics are not changed. Reviewed by: kib	2011-05-22 01:07:54 +00:00
alc	f5f2bab600	Eliminate two dubious attempts at optimizing the implementation of a file's last accessed, modified, and changed times: TMPFS_NODE_ACCESSED and TMPFS_NODE_CHANGED should be set unconditionally in tmpfs_remove() without regard to the number of hard links to the file. Otherwise, after the last directory entry for a file has been removed, a process that still has the file open could read stale values for the last accessed and changed times with fstat(2). Similarly, tmpfs_close() should update the time-related fields even if all directory entries for a file have been removed. In this case, the effect is that the time-related fields will have values that are later than expected. They will correspond to the time at which fstat(2) is called. In collaboration with: kib MFC after: 1 week	2011-02-22 14:47:10 +00:00
alc	927e4eb5e4	tmpfs_remove() isn't modifying the file's data, so it shouldn't set TMPFS_NODE_MODIFIED on the node. PR: 152488 Submitted by: Anton Yuzhaninov Reviewed by: kib MFC after: 1 week	2011-02-19 21:04:36 +00:00
alc	b90f855a9d	Further simplify tmpfs_reg_resize(). Also, update its comments, including style fixes.	2011-02-14 15:36:38 +00:00
alc	333b3f4277	Eliminate tn_reg.tn_aobj_pages. Instead, correctly maintain the vm object's size field. Previously, that field was always zero, even when the object tn_reg.tn_aobj contained numerous pages. Apply style fixes to tmpfs_reg_resize(). In collaboration with: kib	2011-02-13 14:46:39 +00:00
kib	5c60dad772	In tmpfs_readdir(), normalize handling of the directory entries that either overflow the supplied buffer, or cause uiomove fail. Do not advance cached de when directory entry was not copied out. Do not return EOF when no entries could be copied due to first entry too large for supplied buffer, signal EINVAL instead. Reported by: Beat G?tzi <beat chruetertee ch> MFC after: 1 week	2011-01-20 09:39:16 +00:00
avg	f7a1d6d2b8	tmpfs + sendfile: do not produce partially valid pages for vnode's tail See r213730 for details of analogous change in ZFS. MFC after: 3 days	2010-10-12 17:16:51 +00:00
avg	65a73d5f0b	tmpfs, zfs + sendfile: mark page bits as valid after populating it with data Otherwise, adding insult to injury, in addition to double-caching of data we would always copy the data into a vnode's vm object page from backend. This is specific to sendfile case only (VOP_READ with UIO_NOCOPY). PR: kern/141305 Reported by: Wiktor Niesiobedzki <bsd@vink.pl> Reviewed by: alc Tested by: tools/regression/sockets/sendfile MFC after: 2 weeks	2010-09-15 10:31:27 +00:00
ivoras	6803865312	Avoid "Entry can disappear before we lock fdvp" panic. PR: 150143 Submitted by: Gleb Kurtsou <gk at FreeBSD.org> Pretty sure it won't blow up: mckusick MFC after: 2 weeks	2010-09-07 22:40:45 +00:00
ed	7efa097b64	Add support for whiteouts on tmpfs. Right now unionfs only allows filesystems to be mounted on top of another if it supports whiteouts. Even though I have sent a patch to daichi@ to let unionfs work without it, we'd better also add support for whiteouts to tmpfs. This patch implements .vop_whiteout and makes necessary changes to lookup() and readdir() to take them into account. We must also make sure that when adding or removing a file, we honour the componentname's DOWHITEOUT and ISWHITEOUT, to prevent duplicate filenames. MFC after: 1 month	2010-08-22 05:36:06 +00:00
alc	534c9ebdcf	Eliminate unnecessary page queues locking.	2010-06-16 00:41:21 +00:00
trasz	e6f92048fa	Style fixes and removal of unneeded variable. Submitted by: bde@	2010-05-06 18:43:19 +00:00
trasz	402e3baade	Move checking against RLIMIT_FSIZE into one place, vn_rlimit_fsize(). Reviewed by: kib	2010-05-05 16:44:25 +00:00
alc	ea7b6345be	Push down the acquisition of the page queues lock into vm_page_unwire(). Update the comment describing which lock should be held on entry to vm_page_wire(). Reviewed by: kib	2010-05-05 03:45:46 +00:00
alc	1923b6ded3	Acquire the page lock around vm_page_unwire() and vm_page_wire(). Reviewed by: kib	2010-05-03 16:41:11 +00:00
alc	f35e97166b	It makes no sense for vm_page_sleep_if_busy()'s helper, vm_page_sleep(), to unconditionally set PG_REFERENCED on a page before sleeping. In many cases, it's perfectly ok for the page to disappear, i.e., be reclaimed by the page daemon, before the caller to vm_page_sleep() is reawakened. Instead, we now explicitly set PG_REFERENCED in those cases where having the page persist until the caller is awakened is clearly desirable. Note, however, that setting PG_REFERENCED on the page is still only a hint, and not a guarantee that the page should persist.	2010-05-02 17:33:46 +00:00
jh	cf48230780	Add "maxfilesize" mount option for tmpfs to allow specifying the maximum file size limit. Default is UINT64_MAX when the option is not specified. It was useless to set the limit to the total amount of memory and swap in the system. Use tmpfs_mem_info() rather than get_swpgtotal() in tmpfs_mount() to check if there is enough memory available. Remove now unused get_swpgtotal(). Reviewed by: Gleb Kurtsou Approved by: trasz (mentor)	2010-01-29 12:09:14 +00:00
jh	c1e8eaa2ba	- Change the type of nodes_max to u_int and use "%u" format string to convert its value. [1] - Set default tm_nodes_max to min(pages + 3, UINT32_MAX). It's more reasonable than the old four nodes per page (with page size 4096) because non-empty regular files always use at least one page. This fixes possible overflow in the calculation. [2] - Don't allow more than tm_nodes_max nodes allocated in tmpfs_alloc_node(). PR: kern/138367 Suggested by: bde [1], Gleb Kurtsou [2] Approved by: trasz (mentor)	2010-01-20 16:56:20 +00:00
jh	37c0c9cce1	- Fix some style bugs in tmpfs_mount(). [1] - Remove a stale comment about tmpfs_mem_info() 'total' argument. Reported by: bde [1]	2010-01-13 14:17:21 +00:00
jh	0f220ff694	- Change the type of size_max to u_quad_t because its value is converted with vfs_scanopt(9) using the "%qu" format string. - Limit the maximum value of size_max to (SIZE_MAX - PAGE_SIZE) to prevent overflow in howmany() macro. PR: kern/141194 Approved by: trasz (mentor) MFC after: 2 weeks	2010-01-08 07:57:43 +00:00
alc	33ab51801e	There is no need to "busy" a page when the object is locked for the duration of the operation.	2009-10-26 18:02:05 +00:00
delphij	3af844143a	Add locking around access to parent node, and bail out when the parent node is already freed rather than panicking the system. PR: kern/122038 Submitted by: gk Tested by: pho MFC after: 1 week	2009-10-11 07:03:56 +00:00
delphij	9fa7d8b89b	Add a special workaround to handle UIO_NOCOPY case. This fixes data corruption observed when sendfile() is being used. PR: kern/127213 Submitted by: gk MFC after: 2 weeks	2009-10-07 23:17:15 +00:00
delphij	df8f450648	Fix a bug that causes the fsx test case of mmap'ed page being out of sync of read/write, inspired by ZFS's counterpart. PR: kern/139312 Submitted by: gk@ MFC after: 1 week	2009-10-04 10:38:04 +00:00
kib	fa686c638e	Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid. The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup. The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped. The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation if performed by kernel, e.g. md(4). Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced. In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)	2009-06-23 20:45:22 +00:00
alc	9ec14df4c8	Eliminate unnecessary variables.	2009-06-13 20:21:08 +00:00
alc	095c1f19b6	Eliminate redundant setting of a page's valid bits and pointless clearing of the same page's dirty bits.	2009-05-27 18:12:10 +00:00
attilio	1dcb84131b	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavilly changed by this commit so thirdy parts modules needs to be recompiled. Bump __FreeBSD_version in order to signal such situation.	2009-05-11 15:33:26 +00:00
alc	a8c3c977f2	Use uiomove_fromphys() instead of the combination of sf_buf and uiomove(). This is not only shorter; it also eliminates unnecessary thread pinning on architectures that implement a direct map. MFC after: 3 weeks	2009-02-22 19:50:09 +00:00
alc	f736b7c6db	Simplify the unwiring and activation of pages. MFC after: 1 week	2009-02-22 18:15:17 +00:00
kib	8571ced24c	Lookup up the directory entry for the tmpfs node that are deleted by both node pointer and name component. This does the right thing for hardlinks to the same node in the same directory. Submitted by: Yoshihiro Ota <ota j email ne jp> PR: kern/131356 MFC after: 2 weeks	2009-02-08 19:18:33 +00:00
bz	9fd12161ca	Remove unused local variables. Submitted by: Christoph Mallon christoph.mallon@gmx.de Reviewed by: kib MFC after: 2 weeks	2009-01-31 17:36:22 +00:00
trasz	0ad8692247	Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
obrien	d31fa36475	The kernel implemented 'memcmp' is an alias for 'bcmp'. However, memcmp and bcmp are not the same thing. 'man bcmp' states that the return is "non-zero" if the two byte strings are not identical. Where as, 'man memcmp' states that the return is the "difference between the first two differing bytes (treated as unsigned char values" if the two byte strings are not identical. So provide a proper memcmp(9), but it is a C implementation not a tuned assembly implementation. Therefore bcmp(9) should be preferred over memcmp(9).	2008-09-23 14:45:10 +00:00
kib	a127656dea	fdescfs, devfs, mqueuefs, nfs, portalfs, pseudofs, tmpfs and xfs initialize the vattr structure in VOP_GETATTR() with VATTR_NULL(), vattr_null() or by zeroing it. Remove these to allow preinitialization of fields work in vn_stat(). This is needed to get birthtime initialized correctly. Submitted by: Jaakko Heinonen <jh saunalahti fi> Discussed on: freebsd-fs MFC after: 1 month	2008-09-20 19:50:52 +00:00
kib	ef4f1dc9c7	Initialize va_rdev to NODEV instead of 0 or VNOVAL in VOP_GETATTR(). NODEV is more appropriate when va_rdev doesn't have a meaningful value. Submitted by: Jaakko Heinonen <jh saunalahti fi> Suggested by: bde Discussed on: freebsd-fs MFC after: 1 month	2008-09-20 19:49:15 +00:00
kib	81d455e702	Initialize va_flags and va_filerev properly in VOP_GETATTR(). Don't initialize va_vaflags and va_spare because they are not part of the VOP_GETATTR() API. Also don't initialize birthtime to ctime or zero. Submitted by: Jaakko Heinonen <jh saunalahti fi> Reviewed by: bde Discussed on: freebsd-fs MFC after: 1 month	2008-09-20 19:46:45 +00:00
delphij	0791911684	Reflect license change of NetBSD code. Obtained from: NetBSD MFC after: 3 days	2008-09-03 18:53:48 +00:00
attilio	dbf35e279f	Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-08-28 15:23:18 +00:00
kib	591c809288	Do not redo the vnode tear-down work already done by insmntque() when vnode cannot be put on the vnode list for mount. Reported and tested by: marck Guilty party: me MFC after: 3 days	2008-06-15 18:40:58 +00:00
kib	52243403eb	Move the head of byte-level advisory lock list from the filesystem-specific vnode data to the struct vnode. Provide the default implementation for the vop_advlock and vop_advlockasync. Purge the locks on the vnode reclaim by using the lf_purgelocks(). The default implementation is augmented for the nfs and smbfs. In the nfs_advlock, push the Giant inside the nfs_dolock. Before the change, the vop_advlock and vop_advlockasync have taken the unlocked vnode and dereferenced the fs-private inode data, racing with with the vnode reclamation due to forced unmount. Now, the vop_getattr under the shared vnode lock is used to obtain the inode size, and later, in the lf_advlockasync, after locking the vnode interlock, the VI_DOOMED flag is checked to prevent an operation on the doomed vnode. The implementation of the lf_purgelocks() is submitted by dfr. Reported by: kris Tested by: kris, pho Discussed with: jeff, dfr MFC after: 2 weeks	2008-04-16 11:33:32 +00:00
dfr	79d2dfdaa6	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
attilio	4014b55830	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-25 18:45:57 +00:00
attilio	71b7824213	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
attilio	18d0a0dd51	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
delphij	e93a39275f	Turn MPASS(0) into panic with more obvious reason why the assertion is failed.	2007-12-07 00:00:21 +00:00
delphij	17d10d92fe	size_max should be unsigned, as such, use size_t here.	2007-12-06 23:19:05 +00:00
wkoszek	411cf00f62	Explicitly initialize 'error' to 0 (two places). It lets one to build tmpfs from the latest source tree with older compiler--gcc3. Reviewed by: kib@ (on freebsd-current@) Approved by: cognet@ (mentor)	2007-12-04 20:14:15 +00:00
delphij	c5d6580fd4	MFp4: Several fixes to tmpfs which makes it to survive from pho@'s strees2 suite, to quote his letter, this change: 1. It removes the tn_lookup_dirent stuff. I think this cannot be fixed, because nothing protects vnode/tmpfs node between lookup is done, and actual operation is performed, in the case the vnode lock is dropped. At least, this is the case with the from vnode for rename. For now, we do the linear lookup in the parent node. This has its own drawbacks. Not mentioning speed (that could be fixed by using hash), the real problem is the situation where several hardlinks exist in the dvp. But, I think this is fixable. 2. The patch restores the VV_ROOT flag on the root vnode after it became reclaimed and allocated again. This fixes MPASS assertion at the start of the tmpfs_lookup() reported by many. Submitted by: kib	2007-11-18 04:52:40 +00:00
delphij	b0d971c673	MFp4: Fix several style(9) bugs. Submitted by: des	2007-11-18 04:40:42 +00:00
delphij	99be157da9	Correct a stack overflow which will trigger panics when mode= is specified, caused by incorrect format string specified to vfs_scanopt() and subsequently vsscanf(). Pointed out by: kib Submitted by: des	2007-11-12 18:57:33 +00:00
delphij	0c91cfd26b	MFp4: Provide a dummy verb "export" to shut up the message showed up at start when NFS is enabled. Reported by: rafan Approved by: re (tmpfs blanket)	2007-10-04 17:11:48 +00:00
delphij	679cdcf0e4	Additional work is still needed before we can claim that tmpfs is stable enough for production usage. Warn user upon mount. Approved by: re (tmpfs blanket)	2007-10-04 17:08:46 +00:00
delphij	e83de305a6	MFp4: rework tmpfs_readdir() logic in terms of correctness. Approved by: re (tmpfs blanket) Tested with: fstest, fsx	2007-08-16 11:00:07 +00:00
delphij	5496743409	MFp4: - LK_RETRY prohibits vget() and vn_lock() to return error. Remove associated code. [1] - Properly use vhold() and vdrop() instead of their unlocked versions, we are guaranteed to have the vnode's interlock unheld. [1] - Fix a pseudo-infinite loop caused by 64/32-bit arithmetic with the same way used in modern NetBSD versions. [2] - Reorganize tmpfs_readdir to reduce duplicated code. Submitted by: kib [1] Obtained from: NetBSD [2] Approved by: re (tmpfs blanket)	2007-08-10 11:00:30 +00:00
delphij	1e2d5f7f4a	MFp4: - Respect cnflag and don't lock vnode always as LK_EXCLUSIVE [1] - Properly lock around tn_vnode to avoid NULL deference - Be more careful handling vnodes () () This is a WIP [1] by pjd via howardsu Thanks kib@ for his valuable VFS related comments. Tested with: fsx, fstest, tmpfs regression test set Found by: pho's stress2 suite Approved by: re (tmpfs blanket)	2007-08-10 05:24:49 +00:00
delphij	8f39689f22	MFp4 - Refine locking to eliminate some potential race/panics: - Copy before testing a pointer. This closes a race window. - Use msleep with the node interlock instead of tsleep. - Do proper locking around access to tn_vpstate. - Assert vnode VOP lock for dir_{atta,de}tach to capture inconsistent locking. Suggested by: kib Submitted by: delphij Reviewed by: Howard Su Approved by: re (tmpfs blanket)	2007-08-03 06:24:31 +00:00
delphij	41fd4384d1	MFp4: Force 64-bit arithmatic when caculating the maximum file size. This fixes tmpfs caculations on 32-bit systems equipped with more than 4GB swap. Reported by: Craig Boston <craig xfoil gank org> PR: kern/114870 Approved by: re (tmpfs blanket)	2007-07-24 17:14:53 +00:00
delphij	0321a712a7	MFp4: When swapping is not enabled, allow creating files by taking physical memory pages into account for tm_maxfilesize. Reported by: Dominique Goncalves <dominique.goncalves gmail.com> Submitted by: Howard Su Approved by: re (tmpfs blanket)	2007-07-23 06:54:58 +00:00
delphij	d6ea8c65f3	MFp4: Rework on tmpfs's mapped read/write procedures. This should finally fix fsx test case. The printf's added here would be eventually turned into assertions. Submitted by: Mingyan Guo (mostly) Approved by: re (tmpfs blanket)	2007-07-19 03:34:50 +00:00
delphij	ff9bf4b792	MFp4: Make use of the kernel unit number allocation facility for tmpfs nodes. Submitted by: Mingyan Guo <guomingyan gmail com> Approved by: re (tmpfs blanket)	2007-07-11 14:26:27 +00:00
delphij	b4ff595242	MFp4: - Plug memory leak. - Respect underlying vnode's properties rather than assuming that the user want root:wheel + 0755. Useful for using tmpfs(5) for /tmp. - Use roundup2 and howmany macros instead of rolling our own version. - Try to fix fsx -W -R foo case. - Instead of blindly zeroing a page, determine whether we need a pagein order to prevent data corruption. - Fix several bugs reported by Coverity. Submitted by: Mingyan Guo <guomingyan gmail com>, Howard Su, delphij Coverity ID: CID 2550, 2551, 2552, 2557 Approved by: re (tmpfs blanket)	2007-07-08 15:56:12 +00:00
delphij	f5523801a3	MFp4: - Remove unnecessary NULL checks after M_WAITOK allocations. - Use VOP_ACCESS instead of hand-rolled suser_cred() calls. [1] - Use malloc(9) KPI to allocate memory for string. The optimization taken from NetBSD is not valid for FreeBSD because our malloc(9) already act that way. [2] Requested by: rwatson [1] Submitted by: Howard Su [2] Approved by: re (tmpfs blanket)	2007-06-29 05:23:15 +00:00
delphij	66a8b537d6	Space/style cleanups after last set of commits. Approved by: re (tmpfs blanket)	2007-06-28 02:39:31 +00:00
delphij	17bb881f75	Staticify most of fifo/vn operations, they should not be directly exposed outside. Approved by: re (tmpfs blanket)	2007-06-28 02:36:41 +00:00

1 2 3 4

158 Commits