freebsd-skq

Author	SHA1	Message	Date
attilio	15bf891afe	Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() to their "write" versions. Sponsored by: EMC / Isilon storage division	2013-02-20 12:03:20 +00:00
attilio	658534ed5a	Switch vm_object lock to be a rwlock. * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitve to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h	2013-02-20 10:38:34 +00:00
gleb	97e76936ec	tmpfs: Replace directory entry linked list with RB-Tree. Use file name hash as a tree key, handle duplicate keys. Both VOP_LOOKUP and VOP_READDIR operations utilize same tree for search. Directory entry offset (cookie) is either file name hash or incremental id in case of hash collisions (duplicate-cookies). Keep sorted per directory list of duplicate-cookie entries to facilitate cookie number allocation. Don't fail if previous VOP_READDIR() offset is no longer valid, start with next dirent instead. Other file system handle it similarly. Workaround race prone tn_readdir_last[pn] fields update. Add tmpfs_dir_destroy() to free all dirents. Set NFS cookies in tmpfs_dir_getdents(). Return EJUSTRETURN from tmpfs_dir_getdents() instead of hard coded -1. Mark directory traversal routines static as they are no longer used outside of tmpfs_subr.c	2013-01-06 22:15:44 +00:00
kib	cac2fe116f	After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. Other big dependency of vm_page.h on vm_param.h are PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitely for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks	2012-08-05 14:11:42 +00:00
jh	ce0372c7aa	Return EOPNOTSUPP rather than EPERM for the SF_SNAPSHOT flag because tmpfs doesn't support snapshots. Suggested by: bde	2012-04-18 15:22:08 +00:00
jh	734fbc687a	Sync tmpfs_chflags() with the recent changes to UFS: - Add a check for unsupported file flags. - Return EPERM when an user without PRIV_VFS_SYSFLAGS privilege attempts to toggle SF_SETTABLE flags.	2012-04-16 18:10:34 +00:00
gleb	748c9a0d1c	Provide better description for vfs.tmpfs.memory_reserved sysctl. Suggested by: Anton Yuzhaninov <citrin@citrin.ru>	2012-04-15 21:59:28 +00:00
gleb	c4ba84f86e	Add reserved memory limit sysctl to tmpfs. Cleanup availble and used memory functions. Check if free pages available before allocating new node. Discussed with: delphij	2012-04-07 15:23:51 +00:00
gleb	cc93f6fa4e	Prevent tmpfs_rename() deadlock in a way similar to UFS Unlock vnodes and try to lock them one by one. Relookup fvp and tvp. Approved by: mdf (mentor)	2012-03-14 09:15:50 +00:00
gleb	fa1fc18612	Don't enforce LK_RETRY to get existing vnode in tmpfs_alloc_vp() Doomed vnode is hardly of any use here, besides all callers handle error case. vfs_hash_get() does the same. Don't mess with vnode holdcount, vget() takes care of it already. Approved by: mdf (mentor)	2012-03-14 08:29:21 +00:00
alc	213db2103e	When tmpfs_write() resets an extended file to its original size after an error, we want tmpfs_reg_resize() to ignore I/O errors and unconditionally update the file's size. Reviewed by: kib MFC after: 3 weeks	2012-01-16 00:26:49 +00:00
alc	413b89a7e9	Neither tmpfs_nocacheread() nor tmpfs_mappedwrite() needs to call vm_object_pip_{add,subtract}() on the swap object because the swap object can't be destroyed while the vnode is exclusively locked. Moreover, even if the swap object could have been destroyed during tmpfs_nocacheread() and tmpfs_mappedwrite() this code is broken because vm_object_pip_subtract() does not wake up the sleeping thread that is trying to destroy the swap object. Free invalid pages after an I/O error. There is no virtue in keeping them around in the swap object creating more work for the page daemon. (I believe that any non-busy page in the swap object will now always be valid.) vm_pager_get_pages() does not return a standard errno, so its return value should not be returned by tmpfs without translation to an errno value. There is no reason for the wakeup on vpg in tmpfs_mappedwrite() to occur with the swap object locked. Eliminate printf()s from tmpfs_nocacheread() and tmpfs_mappedwrite(). (The swap pager already spam your console if data corruption is imminent.) Reviewed by: kib MFC after: 3 weeks	2012-01-14 23:04:27 +00:00
alc	d27a56b062	Correct an error of omission in the implementation of the truncation operation on POSIX shared memory objects and tmpfs. Previously, neither of these modules correctly handled the case in which the new size of the object or file was not a multiple of the page size. Specifically, they did not handle partial page truncation of data stored on swap. As a result, stale data might later be returned to an application. Interestingly, a data inconsistency was less likely to occur under tmpfs than POSIX shared memory objects. The reason being that a different mistake by the tmpfs truncation operation helped avoid a data inconsistency. If the data was still resident in memory in a PG_CACHED page, then the tmpfs truncation operation would reactivate that page, zero the truncated portion, and leave the page pinned in memory. More precisely, the benevolent error was that the truncation operation didn't add the reactivated page to any of the paging queues, effectively pinning the page. This page would remain pinned until the file was destroyed or the page was read or written. With this change, the page is now added to the inactive queue. Discussed with: jhb Reviewed by: kib (an earlier version) MFC after: 3 weeks	2012-01-08 20:09:26 +00:00
alc	21902be08c	Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages. This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages. Update all of the existing assertions on pmap_remove_all() to reflect this change. Reviewed by: kib	2011-06-29 16:40:41 +00:00
alc	b90f855a9d	Further simplify tmpfs_reg_resize(). Also, update its comments, including style fixes.	2011-02-14 15:36:38 +00:00
alc	333b3f4277	Eliminate tn_reg.tn_aobj_pages. Instead, correctly maintain the vm object's size field. Previously, that field was always zero, even when the object tn_reg.tn_aobj contained numerous pages. Apply style fixes to tmpfs_reg_resize(). In collaboration with: kib	2011-02-13 14:46:39 +00:00
kib	5c60dad772	In tmpfs_readdir(), normalize handling of the directory entries that either overflow the supplied buffer, or cause uiomove fail. Do not advance cached de when directory entry was not copied out. Do not return EOF when no entries could be copied due to first entry too large for supplied buffer, signal EINVAL instead. Reported by: Beat G?tzi <beat chruetertee ch> MFC after: 1 week	2011-01-20 09:39:16 +00:00
ed	7efa097b64	Add support for whiteouts on tmpfs. Right now unionfs only allows filesystems to be mounted on top of another if it supports whiteouts. Even though I have sent a patch to daichi@ to let unionfs work without it, we'd better also add support for whiteouts to tmpfs. This patch implements .vop_whiteout and makes necessary changes to lookup() and readdir() to take them into account. We must also make sure that when adding or removing a file, we honour the componentname's DOWHITEOUT and ISWHITEOUT, to prevent duplicate filenames. MFC after: 1 month	2010-08-22 05:36:06 +00:00
jh	c1e8eaa2ba	- Change the type of nodes_max to u_int and use "%u" format string to convert its value. [1] - Set default tm_nodes_max to min(pages + 3, UINT32_MAX). It's more reasonable than the old four nodes per page (with page size 4096) because non-empty regular files always use at least one page. This fixes possible overflow in the calculation. [2] - Don't allow more than tm_nodes_max nodes allocated in tmpfs_alloc_node(). PR: kern/138367 Suggested by: bde [1], Gleb Kurtsou [2] Approved by: trasz (mentor)	2010-01-20 16:56:20 +00:00
alc	33ab51801e	There is no need to "busy" a page when the object is locked for the duration of the operation.	2009-10-26 18:02:05 +00:00
delphij	3af844143a	Add locking around access to parent node, and bail out when the parent node is already freed rather than panicking the system. PR: kern/122038 Submitted by: gk Tested by: pho MFC after: 1 week	2009-10-11 07:03:56 +00:00
kib	fa686c638e	Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid. The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup. The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped. The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation if performed by kernel, e.g. md(4). Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced. In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)	2009-06-23 20:45:22 +00:00
attilio	1dcb84131b	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavilly changed by this commit so thirdy parts modules needs to be recompiled. Bump __FreeBSD_version in order to signal such situation.	2009-05-11 15:33:26 +00:00
kib	8571ced24c	Lookup up the directory entry for the tmpfs node that are deleted by both node pointer and name component. This does the right thing for hardlinks to the same node in the same directory. Submitted by: Yoshihiro Ota <ota j email ne jp> PR: kern/131356 MFC after: 2 weeks	2009-02-08 19:18:33 +00:00
bz	9fd12161ca	Remove unused local variables. Submitted by: Christoph Mallon christoph.mallon@gmx.de Reviewed by: kib MFC after: 2 weeks	2009-01-31 17:36:22 +00:00
obrien	d31fa36475	The kernel implemented 'memcmp' is an alias for 'bcmp'. However, memcmp and bcmp are not the same thing. 'man bcmp' states that the return is "non-zero" if the two byte strings are not identical. Where as, 'man memcmp' states that the return is the "difference between the first two differing bytes (treated as unsigned char values" if the two byte strings are not identical. So provide a proper memcmp(9), but it is a C implementation not a tuned assembly implementation. Therefore bcmp(9) should be preferred over memcmp(9).	2008-09-23 14:45:10 +00:00
delphij	0791911684	Reflect license change of NetBSD code. Obtained from: NetBSD MFC after: 3 days	2008-09-03 18:53:48 +00:00
kib	591c809288	Do not redo the vnode tear-down work already done by insmntque() when vnode cannot be put on the vnode list for mount. Reported and tested by: marck Guilty party: me MFC after: 3 days	2008-06-15 18:40:58 +00:00
attilio	4014b55830	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-25 18:45:57 +00:00
attilio	18d0a0dd51	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
delphij	e93a39275f	Turn MPASS(0) into panic with more obvious reason why the assertion is failed.	2007-12-07 00:00:21 +00:00
delphij	c5d6580fd4	MFp4: Several fixes to tmpfs which makes it to survive from pho@'s strees2 suite, to quote his letter, this change: 1. It removes the tn_lookup_dirent stuff. I think this cannot be fixed, because nothing protects vnode/tmpfs node between lookup is done, and actual operation is performed, in the case the vnode lock is dropped. At least, this is the case with the from vnode for rename. For now, we do the linear lookup in the parent node. This has its own drawbacks. Not mentioning speed (that could be fixed by using hash), the real problem is the situation where several hardlinks exist in the dvp. But, I think this is fixable. 2. The patch restores the VV_ROOT flag on the root vnode after it became reclaimed and allocated again. This fixes MPASS assertion at the start of the tmpfs_lookup() reported by many. Submitted by: kib	2007-11-18 04:52:40 +00:00
delphij	5496743409	MFp4: - LK_RETRY prohibits vget() and vn_lock() to return error. Remove associated code. [1] - Properly use vhold() and vdrop() instead of their unlocked versions, we are guaranteed to have the vnode's interlock unheld. [1] - Fix a pseudo-infinite loop caused by 64/32-bit arithmetic with the same way used in modern NetBSD versions. [2] - Reorganize tmpfs_readdir to reduce duplicated code. Submitted by: kib [1] Obtained from: NetBSD [2] Approved by: re (tmpfs blanket)	2007-08-10 11:00:30 +00:00
delphij	1e2d5f7f4a	MFp4: - Respect cnflag and don't lock vnode always as LK_EXCLUSIVE [1] - Properly lock around tn_vnode to avoid NULL deference - Be more careful handling vnodes () () This is a WIP [1] by pjd via howardsu Thanks kib@ for his valuable VFS related comments. Tested with: fsx, fstest, tmpfs regression test set Found by: pho's stress2 suite Approved by: re (tmpfs blanket)	2007-08-10 05:24:49 +00:00
delphij	8f39689f22	MFp4 - Refine locking to eliminate some potential race/panics: - Copy before testing a pointer. This closes a race window. - Use msleep with the node interlock instead of tsleep. - Do proper locking around access to tn_vpstate. - Assert vnode VOP lock for dir_{atta,de}tach to capture inconsistent locking. Suggested by: kib Submitted by: delphij Reviewed by: Howard Su Approved by: re (tmpfs blanket)	2007-08-03 06:24:31 +00:00
delphij	ff9bf4b792	MFp4: Make use of the kernel unit number allocation facility for tmpfs nodes. Submitted by: Mingyan Guo <guomingyan gmail com> Approved by: re (tmpfs blanket)	2007-07-11 14:26:27 +00:00
delphij	b4ff595242	MFp4: - Plug memory leak. - Respect underlying vnode's properties rather than assuming that the user want root:wheel + 0755. Useful for using tmpfs(5) for /tmp. - Use roundup2 and howmany macros instead of rolling our own version. - Try to fix fsx -W -R foo case. - Instead of blindly zeroing a page, determine whether we need a pagein order to prevent data corruption. - Fix several bugs reported by Coverity. Submitted by: Mingyan Guo <guomingyan gmail com>, Howard Su, delphij Coverity ID: CID 2550, 2551, 2552, 2557 Approved by: re (tmpfs blanket)	2007-07-08 15:56:12 +00:00
delphij	f5523801a3	MFp4: - Remove unnecessary NULL checks after M_WAITOK allocations. - Use VOP_ACCESS instead of hand-rolled suser_cred() calls. [1] - Use malloc(9) KPI to allocate memory for string. The optimization taken from NetBSD is not valid for FreeBSD because our malloc(9) already act that way. [2] Requested by: rwatson [1] Submitted by: Howard Su [2] Approved by: re (tmpfs blanket)	2007-06-29 05:23:15 +00:00
delphij	66a8b537d6	Space/style cleanups after last set of commits. Approved by: re (tmpfs blanket)	2007-06-28 02:39:31 +00:00
delphij	ee97250932	Use vfs_timestamp instead of nanotime when obtaining a timestamp for use with timekeeping. Approved by: re (tmpfs blanket)	2007-06-28 02:34:32 +00:00
delphij	a78e2646a2	MFp4: Several clean-ups and improvements over tmpfs: - Remove tmpfs_zone_xxx KPI, the uma(9) wrapper, since they does not bring any value now. - Use \|= instead of = when applying VV_ROOT flag. - Remove tm_avariable_nodes list. Use uma to hold the released nodes. - init/destory interlock mutex of node when init/fini instead of ctor/dtor. - Change memory computing using u_int to fix negative value in 2G mem machine. - Remove unnecessary bzero's - Rely uma logic to make file id allocation harder to guess. - Fix some unsigned/signed related things. Make sure we respect -o size=xxxx - Use wire instead of hold a page. - Pass allocate_zero to obtain zeroed pages upon first use. Submitted by: Howard Su Approved by: re (tmpfs blanket, kensmith)	2007-06-25 18:46:13 +00:00
delphij	1da67b5003	Use vfs_timestamp() instead of nanotime() - make it up to the user to make decisions about how detail they wanted timestamps to have.	2007-06-18 14:40:19 +00:00
delphij	d50b261fe6	MFp4: fix two locking problems: - Hold TMPFS_LOCK while updating tm_pages_used. - Hold vm page while doing uiomove. This will hopefully fix all known panics. Submitted by: Howard Su	2007-06-18 01:43:13 +00:00
delphij	ef518e4122	MFp4: Add tmpfs, an efficient memory file system. Please note that, this is currently considered as an experimental feature so there could be some rough edges. Consult http://wiki.freebsd.org/TMPFS for more information. For now, connect tmpfs to build on i386 and amd64 architectures only. Please let us know if you have success with other platforms. This work was developed by Julio M. Merino Vidal for NetBSD as a SoC project; Rohit Jalan ported it from NetBSD to FreeBSD. Howard Su and Glen Leeder are worked on it to continue this effort. Obtained from: NetBSD via p4 Submitted by: Howard Su (with some minor changes) Approved by: re (kensmith)	2007-06-16 01:56:05 +00:00

44 Commits