freebsd-nq

Author	SHA1	Message	Date
David Greenman	6cde7a165f	Fixed two potentially serious classes of bugs: 1) The vnode pager wasn't properly tracking the file size due to "size" being page rounded in some cases and not in others. This sometimes resulted in corrupted files. First noticed by Terry Lambert. Fixed by changing the "size" pager_alloc parameter to be a 64bit byte value (as opposed to a 32bit page index) and changing the pagers and their callers to deal with this properly. 2) Fixed a bogus type cast in round_page() and trunc_page() that caused some 64bit offsets and sizes to be scrambled. Removing the cast required adding casts at a few dozen callers. There may be problems with other bogus casts in close-by macros. A quick check seemed to indicate that those were okay, however.	1998-10-13 08:24:45 +00:00
Doug Rabson	e69763a315	Cosmetic changes to the PAGE_XXX macros to make them consistent with the other objects in vm.	1998-09-04 08:06:57 +00:00
Doug Rabson	069e9bc1b4	Change various syscalls to use size_t arguments instead of u_int. Add some overflow checks to read/write (from bde). Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags and vm_object::paging_in_progress to use operations which are not interruptable. Reviewed by: Bruce Evans <bde@zeta.org.au>	1998-08-24 08:39:39 +00:00
Bruce Evans	a23d65bfc8	Cast pointers to uintptr_t/intptr_t instead of to u_long/long, respectively. Most of the longs should probably have been u_longs, but this changes is just to prevent warnings about casts between pointers and integers of different sizes, not to fix poorly chosen types.	1998-07-15 02:32:35 +00:00
Doug Rabson	711458e3e9	Don't truncate the return value of mmap to sizeof(int).	1998-07-05 11:56:52 +00:00
Bruce Evans	be160d60ab	Removed unused includes.	1998-06-21 18:02:50 +00:00
Doug Rabson	ecbb00a262	This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us inline with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change. The prototype FreeBSD/alpha machdep will follow in a couple of days time.	1998-06-07 17:13:14 +00:00
Peter Wemm	4183b6b622	Make the previous commit compile..	1998-05-19 07:13:21 +00:00
Guido van Rooij	05feb99ff1	Plug hole reported on Bugtraq: do not allow mmap with WRITE privs for append-only and immutable files. Obtained from: OpenBSD (partly)	1998-05-18 18:26:27 +00:00
Guido van Rooij	c8bdd56b3a	Fix for mmap of char devices bug as described in OpenBSD advisory of 1998/02/20 Reviewed by: John Dyson Submitted by: "Cy Schubert" <cschuber@uumail.gov.bc.ca>	1998-03-12 19:36:18 +00:00
John Dyson	8f9110f6a1	This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifest themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistancy, and as a side-effect broke the vfs.ioopt code. This code might have been committed seperately, but almost everything is interrelated. 1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid. 2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them. 3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state. 4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances. 5) Remove a gratuitious buffer wakeup in vfs_vmio_release. 6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version. 7) The page busy count wakeups assocated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation. 8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed. 9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals. 10) Correct and clean-up spec_getpages. 11) Implement a fully functional nfs_getpages, nfs_putpages. 12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.) 13) Properly support MS_INVALIDATE on NFS. 14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean. 15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads. 16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.) 17) Correct the design and usage of vm_page_sleep. It didn't handle consistancy problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify it's status (always.) 18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up. 19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush. 20) Fix vm_pager_put_pages and it's descendents to support an int flag instead of a boolean, so that we can pass down the invalidate bit.	1998-03-07 21:37:31 +00:00
Eivind Eklund	0b08f5f737	Back out DIAGNOSTIC changes.	1998-02-06 12:14:30 +00:00
Eivind Eklund	47cfdb166d	Turn DIAGNOSTIC into a new-style option.	1998-02-04 22:34:03 +00:00
Alexander Langer	651bb81717	caddr_t --> void *	1997-12-31 02:35:29 +00:00
Eivind Eklund	5591b823d1	Make COMPAT_43 and COMPAT_SUNOS new-style options.	1997-12-16 17:40:42 +00:00
Poul-Henning Kamp	cb226aaa62	Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead. This fixes a boatload of compiler warning, and removes a lot of cruft from the sources. I have not removed the /ARGSUSED/, they will require some looking at. libkvm, ps and other userland struct proc frobbing programs will need recompiled.	1997-11-06 19:29:57 +00:00
Bruce Evans	79624e2147	Removed unused #includes.	1997-09-01 03:17:34 +00:00
Peter Wemm	54f42e4ba0	Allow non-page aligned file offset mmap's, providing that the system is allowed to choose the address, or that the MAP_FIXED address has the same remainder when modulo PAGE_SIZE as the file offset. Apparently this is posix1003.1b specified behavior. SVR4 and the other *BSD's allow it too. It costs us nothing to support and means we don't get EINVAL on some mmap code that works perfectly elsewhere. Obtained from: NetBSD	1997-08-30 18:50:06 +00:00
Bruce Evans	b9dcd593ff	Fixed type mismatches for functions with args of type vm_prot_t and/or vm_inherit_t. These types are smaller than ints, so the prototypes should have used the promoted type (int) to match the old-style function definitions. They use just vm_prot_t and/or vm_inherit_t. This depends on gcc features to work. I fixed the definitions since this is easiest. The correct fix may be to change the small types to u_int, to optimize for time instead of space.	1997-08-25 22:15:31 +00:00
John Dyson	0a0a85b3e0	Add support for 4MB pages. This includes the .text, .data, .data parts of the kernel, and also most of the dynamic parts of the kernel. Additionally, 4MB pages will be allocated for display buffers as appropriate (only.) The 4MB support for SMP isn't complete, but doesn't interfere with operation either.	1997-07-17 04:34:03 +00:00
John Dyson	4a40e3d42f	Correct the return code for the mlock system call. Also add the stubs for mlockall and munlockall.	1997-06-15 23:35:32 +00:00
Bruce Evans	3ac4d1ef0c	Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined. Fixed everything that depended on getting fcntl.h stuff from the wrong place. Most things don't depend on file.h stuff at all.	1997-03-23 03:37:54 +00:00
Peter Wemm	6875d25465	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.	1997-02-22 09:48:43 +00:00
John Dyson	996c772f58	This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>	1997-02-10 02:22:35 +00:00
John Dyson	afa07f7e83	Change the map entry flags from bitfields to bitmasks. Allows for some code simplification.	1997-01-16 04:16:22 +00:00
Jordan K. Hubbard	1130b656e5	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.	1997-01-14 07:20:47 +00:00
John Dyson	9b5a5d81be	Prepare better for multi-platform by eliminating another required pmap routine (pmap_is_referenced.) Upper level recoded to use pmap_ts_referenced.	1997-01-11 07:19:02 +00:00
John Dyson	d0aea04fe0	Let the VM system know that on certain arch's that VM_PROT_READ also implies VM_PROT_EXEC. We support it that way for now, since the break system call by default gives VM_PROT_ALL. Now we have a better chance of coalesing map entries when mixing mmap/break type operations. This was contributing to excessive numbers of map entries on the modula-3 runtime system. The problem is still not "solved", but the situation makes more sense. Eventually, when we work on architectures where VM_PROT_READ is orthogonal to VM_PROT_EXEC, we will have to visit this issue carefully (esp. regarding security issues.)	1996-12-30 05:31:21 +00:00
John Dyson	94328e9057	The code unnecessarily created an object with no handle up-front, which has the negative effect of disabling some map optimizations. This patch defers the creation of the object until it needs to be at fault time. Submitted by: Alan Cox <alc@cs.rice.edu>	1996-12-28 22:40:44 +00:00
Joerg Wunsch	e9822d926c	Make DFLDSIZ and MAXDSIZ fully-supported options. "Don't forget to do a ``make depend''" :-)	1996-12-22 23:17:09 +00:00
John Dyson	7aaaa4fd5d	Implement closer-to POSIX mlock semantics. The major difference is that we do allow mlock to span unallocated regions (of course, not mlocking them.) We also allow mlocking of RO regions (which the old code couldn't.) The restriction there is that once a RO region is wired (mlocked), it cannot be debugged (or EVER written to.) Under normal usage, the new mlock code will be a significant improvement over our old stuff.	1996-12-14 17:54:17 +00:00
John Dyson	851c12ff1d	Change mmap to use OBJT_DEFAULT instead of OBJT_SWAP by default for anonymous objects. The system will automatically change the type to SWAP if needed (for size or pageout reasons.)	1996-10-29 22:07:11 +00:00
John Dyson	fcae040bc0	Remove a bogus optimization in the mmap code. It is superfluous, and at best is the same speed as the unoptimized code. At worst, it slows down trivial programs.	1996-10-24 02:56:23 +00:00
Poul-Henning Kamp	1111860c29	Remove a stale comment.	1996-10-13 07:16:50 +00:00
David Greenman	cd6eea255f	Fixed bug with reversed trunc/round_page() in madvise...start must be trunced, end must be rounded.	1996-09-19 10:12:41 +00:00
John Dyson	67bf686897	Backed out the recent changes/enhancements to the VM code. The problem with the 'shell scripts' was found, but there was a 'strange' problem found with a 486 laptop that we could not find. This commit backs the code back to 25-jul, and will be re-entered after the snapshot in smaller (more easily tested) chunks.	1996-07-30 03:08:57 +00:00
David Greenman	0f281c28fa	Slight performance tweak for previous commit.	1996-07-28 02:54:09 +00:00
John Dyson	bf6dfc7b35	Allow sequentially created mmap'ed anonymous regions to coalesce. There is little or no reason to create a swap pager for small mmap's. The vm_map_insert code will automatically create a swap pager if the object becomes too large. This fix, per a request from phk.	1996-07-27 17:21:41 +00:00
John Dyson	feb32a8fa9	Remove experimental header file. My test-build must have picked it up in an unexpected place. Submitted by: jkh	1996-07-27 04:06:11 +00:00
John Dyson	4f4d35edf0	This commit is meant to solve a couple of VM system problems or performance issues. 1) The pmap module has had too many inlines, and so the object file is simply bigger than it needs to be. Some common code is also merged into subroutines. 2) Removal of some evil PHYS_TO_VM_PAGE macro calls. Unfortunately, a few have needed to be added also. The removal caused the need for more vm_page_lookups. I added lookup hints to minimize the need for the page table lookup operations. 3) Removal of some bogus performance improvements, that mostly made the code more complex (tracking individual page table page updates unnecessarily). Those improvements actually hurt 386 processors perf (not that people who worry about perf use 386 processors anymore :-)). 4) Changed pv queue manipulations/structures to be TAILQ's. 5) The pv queue code has had some performance problems since day one. Some significant scalability issues are resolved by threading the pv entries from the pmap AND the physical address instead of just the physical address. This makes certain pmap operations run much faster. This does not affect most micro-benchmarks, but should help loaded system performance significantly. DG helped and came up with most of the solution for this one. 6) Most if not all pmap bit operations follow the pattern: pmap_test_bit(); pmap_clear_bit(); That made for twice the necessary pv list traversal. The pmap interface now supports only pmap_tc_bit type operations: pmap_[test/clear]_modified, pmap_[test/clear]_referenced. Additionally, the modified routine now takes a vm_page_t arg instead of a phys address. This eliminates a PHYS_TO_VM_PAGE operation. 7) Several rewrites of routines that contain redundant code to use common routines, so that there is a greater likelihood of keeping the cache footprint smaller.	1996-07-27 03:24:10 +00:00
John Dyson	f35329ac0f	This commit is dual-purpose, to fix more of the pageout daemon queue corruption problems, and to apply Gary Palmer's code cleanups. David Greenman helped with these problems also. There is still a hang problem using X in small memory machines.	1996-05-31 00:38:04 +00:00
John Dyson	867a482d66	Initial support for mincore and madvise. Both are almost fully supported, except madvise does not page in with MADV_WILLNEED, and MADV_DONTNEED doesn't force dirty pages out.	1996-05-19 07:36:50 +00:00
John Dyson	b18bfc3da7	This set of commits to the VM system does the following, and contain contributions or ideas from Stephen McKay <syssgm@devetir.qld.gov.au>, Alan Cox <alc@cs.rice.edu>, David Greenman <davidg@freebsd.org> and me: More usage of the TAILQ macros. Additional minor fix to queue.h. Performance enhancements to the pageout daemon. Addition of a wait in the case that the pageout daemon has to run immediately. Slightly modify the pageout algorithm. Significant revamp of the pmap/fork code: 1) PTE's and UPAGES's are NO LONGER in the process's map. 2) PTE's and UPAGES's reside in their own objects. 3) TOTAL elimination of recursive page table pagefaults. 4) The page directory now resides in the PTE object. 5) Implemented pmap_copy, thereby speeding up fork time. 6) Changed the pv entries so that the head is a pointer and not an entire entry. 7) Significant cleanup of pmap_protect, and pmap_remove. 8) Removed significant amounts of machine dependent fork code from vm_glue. Pushed much of that code into the machine dependent pmap module. 9) Support more completely the reuse of already zeroed pages (Page table pages and page directories) as being already zeroed. Performance and code cleanups in vm_map: 1) Improved and simplified allocation of map entries. 2) Improved vm_map_copy code. 3) Corrected some minor problems in the simplify code. Implemented splvm (combo of splbio and splimp.) The VM code now seldom uses splhigh. Improved the speed of and simplified kmem_malloc. Minor mod to vm_fault to avoid using pre-zeroed pages in the case of objects with backing objects along with the already existant condition of having a vnode. (If there is a backing object, there will likely be a COW... With a COW, it isn't necessary to start with a pre-zeroed page.) Minor reorg of source to perhaps improve locality of ref.	1996-05-18 03:38:05 +00:00
Poul-Henning Kamp	aa8de40ae5	Another sweep over the pmap/vm macros, this time with more focus on the usage. I'm not satisfied with the naming, but now at least there is less bogus stuff around.	1996-05-03 21:01:54 +00:00
David Greenman	8f2ec877b8	Force device mappings to always be shared. It doesn't make sense for them to ever be COW and we need the mappings to be shared for backward compatibilty. Reviewed by: dyson	1996-03-16 15:00:05 +00:00
John Dyson	5850152d95	Allow mmap'ed devices to work correctly across forks. The sanest solution appeared to be to allow the child to maintain the same mapping as the parent.	1996-03-12 02:27:20 +00:00
Peter Wemm	9154ee6aec	Oops.. I nearly forgot the actual core of the length/rounding/etc fixes that Bruce asked for. These still are not quite perfect, and in particular, it can get upset on extreme boundary cases (addr = 0xfff, len = 0xffffffff, which would end up mapping a single page rather than failing), but this is better code that I committed before. (note, the VM system does not (apparently) support single mmap segment sizes above 0x80000000 anyway)	1996-03-02 17:14:09 +00:00
John Dyson	de5f6a7765	1) Eliminate unnecessary bzero of UPAGES. 2) Eliminate unnecessary copying of pages during/after forks. 3) Add user map simplification.	1996-03-02 02:54:24 +00:00
Peter Wemm	dabee6fecc	kern_descrip.c: add fdshare()/fdcopy() kern_fork.c: add the tiny bit of code for rfork operation. kern/sysv_: shmfork() takes one less arg, it was never used. sys/shm.h: drop "isvfork" arg from shmfork() prototype sys/param.h: declare rfork args.. (this is where OpenBSD put it..) sys/filedesc.h: protos for fdshare/fdcopy. vm/vm_mmap.c: add minherit code, add rounding to mmap() type args where it makes sense. vm/: drop unused isvfork arg. Note: this rfork() implementation copies the address space mappings, it does not connect the mappings together. ie: once the two processes have split, the pages may be shared, but the address space is not. If one does a mmap() etc, it does not appear in the other. This makes it not useful for pthreads, but it is useful in it's own right for having light-weight threads in a static shared address space. Obtained from: Original by Ron Minnich, extended by OpenBSD	1996-02-23 18:49:25 +00:00
John Dyson	bd7e5f992e	Eliminated many redundant vm_map_lookup operations for vm_mmap. Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache. Efficiency improvement for vfs_cluster. It used to do alot of redundant calls to cluster_rbuild. Correct the ordering for vrele of .text and release of credentials. Use the selective tlb update for 486/586/P6. Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers. Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs. Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash. Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines. Eliminate even more unnecessary vm_page_protect operations. Significantly speed up process forks. Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30seconds. Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async. Fix a panic with busy pages when write clustering is done for non-VMIO buffers.	1996-01-19 04:00:31 +00:00

1 2

84 Commits