freebsd-skq

Author	SHA1	Message	Date
Gleb Kurtsou	ca846258e2	Don't enforce LK_RETRY to get existing vnode in tmpfs_alloc_vp() Doomed vnode is hardly of any use here, besides all callers handle error case. vfs_hash_get() does the same. Don't mess with vnode holdcount, vget() takes care of it already. Approved by: mdf (mentor)	2012-03-14 08:29:21 +00:00
Konstantin Belousov	b80dcb55aa	Remove fifo.h. The only used function declaration from the header is migrated to sys/vnode.h. Submitted by: gianni	2012-03-11 12:19:58 +00:00
John Baldwin	58d65e8031	Similar to the fixes in 226967 and 226987, purge any name cache entries associated with the previous vnode (if any) associated with the target of a rename(). Otherwise, a lookup of the target pathname concurrent with a rename() could re-add a name cache entry after the namei(RENAME) lookup in kern_renameat() had purged the target pathname. MFC after: 2 weeks	2012-03-02 18:55:19 +00:00
Tijl Coosemans	0662ee9826	Replace PRIdMAX with "jd" in a printf call. Cast the corresponding value to intmax_t instead of uintmax_t, because the original type is off_t.	2012-02-14 11:24:24 +00:00
Kevin Lo	e0d3195bd6	Return EOPNOTSUPP since we only support update mounts for NFS export. Spotted by: trociny	2012-01-17 01:25:53 +00:00
Kevin Lo	57eb5548c9	Add nfs export support to tmpfs(5) Reviewed by: kib	2012-01-16 10:25:22 +00:00
Alan Cox	0b05cac3d2	When tmpfs_write() resets an extended file to its original size after an error, we want tmpfs_reg_resize() to ignore I/O errors and unconditionally update the file's size. Reviewed by: kib MFC after: 3 weeks	2012-01-16 00:26:49 +00:00
Alan Cox	93431cb74c	Neither tmpfs_nocacheread() nor tmpfs_mappedwrite() needs to call vm_object_pip_{add,subtract}() on the swap object because the swap object can't be destroyed while the vnode is exclusively locked. Moreover, even if the swap object could have been destroyed during tmpfs_nocacheread() and tmpfs_mappedwrite() this code is broken because vm_object_pip_subtract() does not wake up the sleeping thread that is trying to destroy the swap object. Free invalid pages after an I/O error. There is no virtue in keeping them around in the swap object creating more work for the page daemon. (I believe that any non-busy page in the swap object will now always be valid.) vm_pager_get_pages() does not return a standard errno, so its return value should not be returned by tmpfs without translation to an errno value. There is no reason for the wakeup on vpg in tmpfs_mappedwrite() to occur with the swap object locked. Eliminate printf()s from tmpfs_nocacheread() and tmpfs_mappedwrite(). (The swap pager already spam your console if data corruption is imminent.) Reviewed by: kib MFC after: 3 weeks	2012-01-14 23:04:27 +00:00
Alan Cox	2971897d51	Correct an error of omission in the implementation of the truncation operation on POSIX shared memory objects and tmpfs. Previously, neither of these modules correctly handled the case in which the new size of the object or file was not a multiple of the page size. Specifically, they did not handle partial page truncation of data stored on swap. As a result, stale data might later be returned to an application. Interestingly, a data inconsistency was less likely to occur under tmpfs than POSIX shared memory objects. The reason being that a different mistake by the tmpfs truncation operation helped avoid a data inconsistency. If the data was still resident in memory in a PG_CACHED page, then the tmpfs truncation operation would reactivate that page, zero the truncated portion, and leave the page pinned in memory. More precisely, the benevolent error was that the truncation operation didn't add the reactivated page to any of the paging queues, effectively pinning the page. This page would remain pinned until the file was destroyed or the page was read or written. With this change, the page is now added to the inactive queue. Discussed with: jhb Reviewed by: kib (an earlier version) MFC after: 3 weeks	2012-01-08 20:09:26 +00:00
Alan Cox	04f883d798	Don't pass VM_ALLOC_ZERO to vm_page_grab() in tmpfs_mappedwrite() and tmpfs_nocacheread(). It is both unnecessary and a pessimization. It results in either the page being zeroed twice or zeroed first and then overwritten by an I/O operation. MFC after: 3 weeks	2012-01-03 03:29:01 +00:00
Ivan Voras	6e92aee4e2	Avoid panics from recursive rename operations. Not a perfect patch but good enough for now. PR: kern/159418 Submitted by: Gleb Kurtsou Reviewed by: kib MFC after: 1 month	2011-11-22 16:18:12 +00:00
Xin LI	296a25a245	Improve the way to calculate available pages in tmpfs: - Don't deduct wired pages from total usable counts because it does not make any sense. To make things worse, on systems where swap size is smaller than physical memory and use a lot of wired pages (e.g. ZFS), tmpfs can suddenly have free space of 0 because of this; - Count cached pages as available; [1] - Don't count inactive pages as available, technically we could but that might be too aggressive; [1] [1] Suggested by kib@ MFC after: 1 week	2011-11-21 20:26:22 +00:00
Marcel Moolenaar	82543c5928	Don astbestos garment and remove the warning about TMPFS being experimental -- highly experimental even. So far the closest to a bug in TMPFS that people have gotten to relates to how ZFS can take away from the memory that TMPFS needs. One can argue that such is not a bug in TMPFS. Irrespective, even if there is a bug here and there in TMPFS, it's not in our own advantage to scare people away from using TMPFS. I for one have been using it, even with ZFS, very successfully.	2011-11-07 16:21:50 +00:00
Peter Holm	948fa27d49	Added missing cache purge of from argument for rename(). Reported by: Anton Yuzhaninov <citrin citrin ru> In collaboration with: kib MFC after: 1 week	2011-11-01 12:33:06 +00:00
Konstantin Belousov	3407fefef6	Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify afalgs. Document the changes to flags field to only require the page lock. Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced. Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)	2011-09-06 10:30:11 +00:00
Alan Cox	6bbee8e28a	Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages. This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages. Update all of the existing assertions on pmap_remove_all() to reflect this change. Reviewed by: kib	2011-06-29 16:40:41 +00:00
Rick Macklem	694a586a43	Add a lock flags argument to the VFS_FHTOVP() file system method, so that callers can indicate the minimum vnode locking requirement. This will allow some file systems to choose to return a LK_SHARED locked vnode when LK_SHARED is specified for the flags argument. This patch only adds the flag. It does not change any file system to use it and all callers specify LK_EXCLUSIVE, so file system semantics are not changed. Reviewed by: kib	2011-05-22 01:07:54 +00:00
Alan Cox	4d2f3d2cde	Eliminate two dubious attempts at optimizing the implementation of a file's last accessed, modified, and changed times: TMPFS_NODE_ACCESSED and TMPFS_NODE_CHANGED should be set unconditionally in tmpfs_remove() without regard to the number of hard links to the file. Otherwise, after the last directory entry for a file has been removed, a process that still has the file open could read stale values for the last accessed and changed times with fstat(2). Similarly, tmpfs_close() should update the time-related fields even if all directory entries for a file have been removed. In this case, the effect is that the time-related fields will have values that are later than expected. They will correspond to the time at which fstat(2) is called. In collaboration with: kib MFC after: 1 week	2011-02-22 14:47:10 +00:00
Alan Cox	7ded42ba28	tmpfs_remove() isn't modifying the file's data, so it shouldn't set TMPFS_NODE_MODIFIED on the node. PR: 152488 Submitted by: Anton Yuzhaninov Reviewed by: kib MFC after: 1 week	2011-02-19 21:04:36 +00:00
Alan Cox	4673c751f8	Further simplify tmpfs_reg_resize(). Also, update its comments, including style fixes.	2011-02-14 15:36:38 +00:00
Alan Cox	b10d1d5d60	Eliminate tn_reg.tn_aobj_pages. Instead, correctly maintain the vm object's size field. Previously, that field was always zero, even when the object tn_reg.tn_aobj contained numerous pages. Apply style fixes to tmpfs_reg_resize(). In collaboration with: kib	2011-02-13 14:46:39 +00:00
Konstantin Belousov	9fb9c623a6	In tmpfs_readdir(), normalize handling of the directory entries that either overflow the supplied buffer, or cause uiomove fail. Do not advance cached de when directory entry was not copied out. Do not return EOF when no entries could be copied due to first entry too large for supplied buffer, signal EINVAL instead. Reported by: Beat G?tzi <beat chruetertee ch> MFC after: 1 week	2011-01-20 09:39:16 +00:00
Andriy Gapon	e07b64c567	tmpfs + sendfile: do not produce partially valid pages for vnode's tail See r213730 for details of analogous change in ZFS. MFC after: 3 days	2010-10-12 17:16:51 +00:00
Andriy Gapon	21bd3e2576	tmpfs, zfs + sendfile: mark page bits as valid after populating it with data Otherwise, adding insult to injury, in addition to double-caching of data we would always copy the data into a vnode's vm object page from backend. This is specific to sendfile case only (VOP_READ with UIO_NOCOPY). PR: kern/141305 Reported by: Wiktor Niesiobedzki <bsd@vink.pl> Reviewed by: alc Tested by: tools/regression/sockets/sendfile MFC after: 2 weeks	2010-09-15 10:31:27 +00:00
Ivan Voras	b2143ecb99	Avoid "Entry can disappear before we lock fdvp" panic. PR: 150143 Submitted by: Gleb Kurtsou <gk at FreeBSD.org> Pretty sure it won't blow up: mckusick MFC after: 2 weeks	2010-09-07 22:40:45 +00:00
Ed Schouten	99d57a6bd8	Add support for whiteouts on tmpfs. Right now unionfs only allows filesystems to be mounted on top of another if it supports whiteouts. Even though I have sent a patch to daichi@ to let unionfs work without it, we'd better also add support for whiteouts to tmpfs. This patch implements .vop_whiteout and makes necessary changes to lookup() and readdir() to take them into account. We must also make sure that when adding or removing a file, we honour the componentname's DOWHITEOUT and ISWHITEOUT, to prevent duplicate filenames. MFC after: 1 month	2010-08-22 05:36:06 +00:00
Alan Cox	8393d186b9	Eliminate unnecessary page queues locking.	2010-06-16 00:41:21 +00:00
Edward Tomasz Napierala	307d88b787	Style fixes and removal of unneeded variable. Submitted by: bde@	2010-05-06 18:43:19 +00:00
Edward Tomasz Napierala	b5f770bd86	Move checking against RLIMIT_FSIZE into one place, vn_rlimit_fsize(). Reviewed by: kib	2010-05-05 16:44:25 +00:00
Alan Cox	e3ef0d2fcf	Push down the acquisition of the page queues lock into vm_page_unwire(). Update the comment describing which lock should be held on entry to vm_page_wire(). Reviewed by: kib	2010-05-05 03:45:46 +00:00
Alan Cox	c5a648516e	Acquire the page lock around vm_page_unwire() and vm_page_wire(). Reviewed by: kib	2010-05-03 16:41:11 +00:00
Alan Cox	b88b6c9d80	It makes no sense for vm_page_sleep_if_busy()'s helper, vm_page_sleep(), to unconditionally set PG_REFERENCED on a page before sleeping. In many cases, it's perfectly ok for the page to disappear, i.e., be reclaimed by the page daemon, before the caller to vm_page_sleep() is reawakened. Instead, we now explicitly set PG_REFERENCED in those cases where having the page persist until the caller is awakened is clearly desirable. Note, however, that setting PG_REFERENCED on the page is still only a hint, and not a guarantee that the page should persist.	2010-05-02 17:33:46 +00:00
Jaakko Heinonen	dec3772ee4	Add "maxfilesize" mount option for tmpfs to allow specifying the maximum file size limit. Default is UINT64_MAX when the option is not specified. It was useless to set the limit to the total amount of memory and swap in the system. Use tmpfs_mem_info() rather than get_swpgtotal() in tmpfs_mount() to check if there is enough memory available. Remove now unused get_swpgtotal(). Reviewed by: Gleb Kurtsou Approved by: trasz (mentor)	2010-01-29 12:09:14 +00:00
Jaakko Heinonen	189ee6be40	- Change the type of nodes_max to u_int and use "%u" format string to convert its value. [1] - Set default tm_nodes_max to min(pages + 3, UINT32_MAX). It's more reasonable than the old four nodes per page (with page size 4096) because non-empty regular files always use at least one page. This fixes possible overflow in the calculation. [2] - Don't allow more than tm_nodes_max nodes allocated in tmpfs_alloc_node(). PR: kern/138367 Suggested by: bde [1], Gleb Kurtsou [2] Approved by: trasz (mentor)	2010-01-20 16:56:20 +00:00
Jaakko Heinonen	5364a38dba	- Fix some style bugs in tmpfs_mount(). [1] - Remove a stale comment about tmpfs_mem_info() 'total' argument. Reported by: bde [1]	2010-01-13 14:17:21 +00:00
Jaakko Heinonen	720c50b339	- Change the type of size_max to u_quad_t because its value is converted with vfs_scanopt(9) using the "%qu" format string. - Limit the maximum value of size_max to (SIZE_MAX - PAGE_SIZE) to prevent overflow in howmany() macro. PR: kern/141194 Approved by: trasz (mentor) MFC after: 2 weeks	2010-01-08 07:57:43 +00:00
Alan Cox	4afcae9ba3	There is no need to "busy" a page when the object is locked for the duration of the operation.	2009-10-26 18:02:05 +00:00
Xin LI	82cf92d483	Add locking around access to parent node, and bail out when the parent node is already freed rather than panicking the system. PR: kern/122038 Submitted by: gk Tested by: pho MFC after: 1 week	2009-10-11 07:03:56 +00:00
Xin LI	3fa0694aaa	Add a special workaround to handle UIO_NOCOPY case. This fixes data corruption observed when sendfile() is being used. PR: kern/127213 Submitted by: gk MFC after: 2 weeks	2009-10-07 23:17:15 +00:00
Xin LI	7441ac4618	Fix a bug that causes the fsx test case of mmap'ed page being out of sync of read/write, inspired by ZFS's counterpart. PR: kern/139312 Submitted by: gk@ MFC after: 1 week	2009-10-04 10:38:04 +00:00
Konstantin Belousov	3364c323e6	Implement global and per-uid accounting of the anonymous memory. Add rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved for the uid. The accounting information (charge) is associated with either map entry, or vm object backing the entry, assuming the object is the first one in the shadow chain and entry does not require COW. Charge is moved from entry to object on allocation of the object, e.g. during the mmap, assuming the object is allocated, or on the first page fault on the entry. It moves back to the entry on forks due to COW setup. The per-entry granularity of accounting makes the charge process fair for processes that change uid during lifetime, and decrements charge for proper uid when region is unmapped. The interface of vm_pager_allocate(9) is extended by adding struct ucred *, that is used to charge appropriate uid when allocation if performed by kernel, e.g. md(4). Several syscalls, among them is fork(2), may now return ENOMEM when global or per-uid limits are enforced. In collaboration with: pho Reviewed by: alc Approved by: re (kensmith)	2009-06-23 20:45:22 +00:00
Alan Cox	47f11d9a46	Eliminate unnecessary variables.	2009-06-13 20:21:08 +00:00
Alan Cox	e2d0be0172	Eliminate redundant setting of a page's valid bits and pointless clearing of the same page's dirty bits.	2009-05-27 18:12:10 +00:00
Attilio Rao	dfd233edd5	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavilly changed by this commit so thirdy parts modules needs to be recompiled. Bump __FreeBSD_version in order to signal such situation.	2009-05-11 15:33:26 +00:00
Alan Cox	2984411d7b	Use uiomove_fromphys() instead of the combination of sf_buf and uiomove(). This is not only shorter; it also eliminates unnecessary thread pinning on architectures that implement a direct map. MFC after: 3 weeks	2009-02-22 19:50:09 +00:00
Alan Cox	2ade0391c2	Simplify the unwiring and activation of pages. MFC after: 1 week	2009-02-22 18:15:17 +00:00
Konstantin Belousov	e3c7e75305	Lookup up the directory entry for the tmpfs node that are deleted by both node pointer and name component. This does the right thing for hardlinks to the same node in the same directory. Submitted by: Yoshihiro Ota <ota j email ne jp> PR: kern/131356 MFC after: 2 weeks	2009-02-08 19:18:33 +00:00
Bjoern A. Zeeb	7956d34b95	Remove unused local variables. Submitted by: Christoph Mallon christoph.mallon@gmx.de Reviewed by: kib MFC after: 2 weeks	2009-01-31 17:36:22 +00:00
Edward Tomasz Napierala	15bc6b2bd8	Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
David E. O'Brien	ae72afe0f2	The kernel implemented 'memcmp' is an alias for 'bcmp'. However, memcmp and bcmp are not the same thing. 'man bcmp' states that the return is "non-zero" if the two byte strings are not identical. Where as, 'man memcmp' states that the return is the "difference between the first two differing bytes (treated as unsigned char values" if the two byte strings are not identical. So provide a proper memcmp(9), but it is a C implementation not a tuned assembly implementation. Therefore bcmp(9) should be preferred over memcmp(9).	2008-09-23 14:45:10 +00:00

1 2

89 Commits