Introduce a modified allocation mechanism for mbufs and mbuf clusters; one
which can scale under SMP and which offers the possibility of implementing
resource reclamation in the future. Notable advantages:
o Reduce contention for SMP by offering per-CPU pools and locks.
o Better use of data cache due to per-CPU pools.
o Much less code cache pollution from the previously excessively large
allocation macros.
o Framework for `grouping' objects from the same page together, so that it
becomes possible to free wired-down pages back to the system once the
network stacks no longer need them.
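A rough sketch of the per-CPU pool idea follows; the structure and function
names here (mb_pcpu_list, mb_alloc_from_pool) are purely illustrative and do
not necessarily match the committed code:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>
    #include <sys/mbuf.h>

    /* Hypothetical per-CPU free list, each guarded by its own mutex. */
    struct mb_pcpu_list {
        struct mtx   mb_lock;    /* protects this CPU's free list */
        struct mbuf *mb_head;    /* singly-linked list of free mbufs */
        u_int        mb_count;   /* number of free mbufs on the list */
    };

    static struct mb_pcpu_list mb_pools[MAXCPU];

    /* Take one mbuf from a given pool, or NULL if that pool is empty. */
    static struct mbuf *
    mb_alloc_from_pool(struct mb_pcpu_list *pool)
    {
        struct mbuf *m;

        mtx_lock(&pool->mb_lock);
        if ((m = pool->mb_head) != NULL) {
            pool->mb_head = m->m_next;
            pool->mb_count--;
        }
        mtx_unlock(&pool->mb_lock);
        return (m);
    }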
Additional things changed with this addition:
- Moved some mbuf specific declarations and initializations from
sys/conf/param.c into mbuf-specific code where they belong.
- m_getclr() has been renamed to m_get_clrd() because the old name is really
confusing. m_getclr() HAS been preserved, though, and is defined to the new
name (a compatibility define along the lines sketched after this list). No
tree sweep has been done "to change the interface," as the old name will
continue to be supported and is not deprecated. The change was made merely
because m_getclr() sounds too much like "m_get a cluster."
- TEMPORARILY disabled the display of mbtypes statistics in netstat(1) and
systat(1) (see TODO below).
- Fixed systat(1) to display number of "free mbufs" based on new per-CPU
stat structures.
- Fixed netstat(1) to display new per-CPU stats based on sysctl-exported
per-CPU stat structures. All of this information is fetched via sysctl.
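The old m_getclr() name can be kept alive with a one-line compatibility
definition along these lines (illustrative; the exact form used in the
header may differ):

    /* Preserve the old name as an alias for the new, clearer one. */
    #define m_getclr(how, type)    m_get_clrd((how), (type))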
TODO (in order of priority):
- Re-enable mbtypes statistics in both netstat(1) and systat(1) after
introducing an SMP friendly way to collect the mbtypes stats under the
already introduced per-CPU locks (i.e. hopefully don't use atomic() - it
seems too costly for a mere stat update, especially when other locks are
already present).
- Optionally have systat(1) display not only "total free mbufs" but also
"total free mbufs per CPU pool."
- Fix minor length-fetching issues in netstat(1) related to the recently
re-enabled option to read mbuf stats from a core file.
- Move the reference counters, at least for mbuf clusters, into an unused
portion of the cluster itself, to save space and avoid having to allocate a
separate counter (one possible layout is sketched after this TODO list).
- Look into introducing resource freeing possibly from a kproc.
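One possible, purely hypothetical, layout for the reference-counter item
above: reserve the trailing word of each cluster for the counter, shrinking
the usable data area by sizeof(u_int):

    #include <sys/param.h>

    /* Usable bytes if the last word of the cluster holds the refcount. */
    #define MCL_USABLE    (MCLBYTES - sizeof(u_int))

    /* Hypothetical helper: locate the embedded reference counter. */
    static __inline u_int *
    mcl_refcnt(caddr_t cluster)
    {
        return ((u_int *)(void *)(cluster + MCL_USABLE));
    }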
Reviewed by (in parts): jlemon, jake, silby, terry
Tested by: jlemon (Intel & Alpha), mjacob (Intel & Alpha)
Preliminary performance measurements: jlemon (and me, obviously)
URL: http://people.freebsd.org/~bmilekic/mb_alloc/
and initialized during boot. This avoids bloating sizeof(struct lock).
As a side effect, it is no longer necessary to enforce the assumption that
lockinit()/lockdestroy() calls are paired, so the LK_VALID flag has been
removed.
Idea taken from: BSD/OS.
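The BSD/OS-style pool can be pictured roughly as below; the names, pool size
and hashing scheme are illustrative only, not the committed implementation:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/lock.h>
    #include <sys/mutex.h>

    #define LOCK_POOL_SIZE    128    /* hypothetical pool size */

    /* A boot-time array of interlocks shared by all lockmgr locks. */
    static struct mtx lock_pool[LOCK_POOL_SIZE];

    /*
     * Pick an interlock by hashing the lock's address, so that struct
     * lock itself never has to embed (or validate) one of its own.
     */
    static __inline struct mtx *
    lock_pool_find(const void *lkp)
    {
        return (&lock_pool[((uintptr_t)lkp >> 8) % LOCK_POOL_SIZE]);
    }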
Remove evil allocation macros from machdep.c (why was that there???) and
use malloc() instead.
Move parameters out of param.h and into the code itself.
Move a bunch of internal definitions from public sys/*.h headers (without
#ifdef _KERNEL even) into the code itself.
I had hoped to make some of this more dynamic, but the cost of doing
wakeups on all sleeping processes on old arrays was too frightening.
The other possibility is to initialize on the first use, and allow
dynamic sysctl changes to parameters right until that point. That would
allow /etc/rc.sysctl to change SEM* and MSG* defaults as we presently
do with SHM*, but without the nightmare of changing a running system.
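A sketch of that "tunable until first use" idea, using a hypothetical SysV
semaphore limit as the example (names and the default are placeholders, not
the real code); the handler simply refuses changes once the arrays have been
allocated:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/errno.h>
    #include <sys/kernel.h>
    #include <sys/sysctl.h>

    static int semmni = 10;    /* example limit; placeholder default */
    static int sem_inited;     /* set once the arrays are allocated */

    static int
    sysctl_semmni(SYSCTL_HANDLER_ARGS)
    {
        int error, value;

        value = semmni;
        error = sysctl_handle_int(oidp, &value, 0, req);
        if (error != 0 || req->newptr == NULL)
            return (error);
        if (sem_inited)
            return (EBUSY);    /* too late to change it */
        semmni = value;
        return (0);
    }
    SYSCTL_PROC(_kern_ipc, OID_AUTO, semmni,
        CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, NULL, 0,
        sysctl_semmni, "I", "Number of semaphore identifiers");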
via sysctl. It's done pretty simply but it should be quite adequate.
Also move SHMMAXPGS from $machine/include/vmparam.h, as the comments that
went with it were wrong... we don't allocate KVM space for the pages, so
that comment is bogus. The only practical limit is how much physical
ram you want to lock up, as this stuff isn't paged out or swap-backed.
means that running out of mbuf space isn't a panic anymore, and code
which runs out of network memory will sleep to wait for it.
Submitted by: Bosko Milekic <bmilekic@dsuper.net>
Reviewed by: green, wollman
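For illustration, the difference shows up in how a caller asks for an mbuf:
with this change an M_WAIT allocation sleeps until memory is available
instead of panicking the system. The sketch below uses the flag names of the
time (later kernels renamed them) and a made-up wrapper function:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/mbuf.h>

    static struct mbuf *
    grab_mbuf(int canwait)
    {
        struct mbuf *m;

        if (canwait)
            m = m_get(M_WAIT, MT_DATA);      /* may sleep for memory */
        else
            m = m_get(M_DONTWAIT, MT_DATA);  /* may return NULL */
        return (m);
    }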
into uipc_mbuf.c. This reduces three sets of identical tunable code to
one set, and puts the initialisation with the mbuf code proper.
Make NMBUFs tunable as well.
Move the nmbclusters sysctl here as well.
Move the initialisation of maxsockets from param.c to uipc_socket2.c,
next to its corresponding sysctl.
Use the new tunable macros for the kern.vm.kmem.size tunable (this should have
been in a separate commit, whoops).
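The consolidated pattern looks roughly like the following pairing of a
loader tunable with its sysctl; the MIB string, default value, flags and
SYSINIT placement here are assumptions for the sketch, not the committed
code:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/kernel.h>
    #include <sys/sysctl.h>

    static int nmbclusters = 1024;    /* placeholder default */

    SYSCTL_INT(_kern_ipc, OID_AUTO, nmbclusters, CTLFLAG_RD,
        &nmbclusters, 0, "Maximum number of mbuf clusters");

    static void
    mbuf_tunable_init(void *dummy __unused)
    {
        /* Pick up a value set at the loader prompt, if any. */
        TUNABLE_INT_FETCH("kern.ipc.nmbclusters", &nmbclusters);
    }
    SYSINIT(mbuf_tun, SI_SUB_TUNABLES, SI_ORDER_ANY, mbuf_tunable_init,
        NULL);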
file to a stream socket. sendfile(2) is similar to implementations in
HP-UX, Linux, and other systems, but the API is more extensive and
addresses many of the complaints that the Apache Group and others have
had with those other implementations. Thanks to Marc Slemko of the
Apache Group for helping me work out the best API for this.
Anyway, this has the "net" result of speeding up sends of files over
TCP/IP sockets by about 10X (that is to say, uses 1/10th of the CPU
cycles) when compared to a traditional read/write loop.
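A minimal userland usage sketch (error handling trimmed), where fd is an
open regular file and s a connected SOCK_STREAM socket; the helper name is
made up for illustration:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    static int
    send_whole_file(int fd, int s, off_t filesize)
    {
        off_t sbytes = 0;

        /* NULL hdtr: no prepended headers or appended trailers. */
        if (sendfile(fd, s, 0, (size_t)filesize, NULL, &sbytes, 0) == -1)
            return (-1);
        return (0);    /* sbytes reports how much was actually sent */
    }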
which makes adjtime(2) useless and confuses xntpd(8) into refusing
to start even when it would use the kernel PLL instead of adjtime().
The result is the same as recommended by tickadj(8), at least when
HZ divides 10^6. Of course, you wouldn't want to actually use
adjtime() when HZ is large. In the silly boundary case of HZ == 10^6,
tickadj == tick == 1 so the clock stops while adjtime() is active.
Define a parameter which indicates the maximum number of sockets in a
system, and use this to size the zone allocators used for sockets and
for certain PCBs.
Convert PF_LOCAL PCB structures to be type-stable and add a version number.
Define an external format for information about socket structures and use
it in several places.
Define a mechanism to get all PF_LOCAL and PF_INET PCB lists through
sysctl(3) without blocking network interrupts for an unreasonable
length of time. This probably still has some bugs and/or race
conditions, but it seems to work well enough on my machines.
It is now possible for `netstat' to get almost all of its information
via the sysctl(3) interface rather than reading kmem (changes to follow).
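From userland the pattern is the usual two-step sysctl fetch; the MIB name
net.inet.tcp.pcblist below is given only as an example of the kind of node
this exports, and the helper is illustrative:

    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <stdlib.h>

    /* Fetch a variable-length PCB list; the caller frees the buffer. */
    static void *
    fetch_pcblist(const char *mib, size_t *lenp)
    {
        void *buf;
        size_t len = 0;

        /* First call sizes the buffer, second call fills it. */
        if (sysctlbyname(mib, NULL, &len, NULL, 0) == -1)
            return (NULL);
        if ((buf = malloc(len)) == NULL)
            return (NULL);
        if (sysctlbyname(mib, buf, &len, NULL, 0) == -1) {
            free(buf);
            return (NULL);
        }
        *lenp = len;
        return (buf);
    }

    /* Example: data = fetch_pcblist("net.inet.tcp.pcblist", &len); */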
variable `kern.maxvnodes' which gives much better control over vnode
allocation than EXTRAVNODES (except in -current between 1995/10/28 and
1996/11/12, kern.maxvnodes was read-only and thus useless).
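Usage from userland is straightforward; a small sketch (hypothetical helper
name) that reads the current value and raises it if needed:

    #include <sys/types.h>
    #include <sys/sysctl.h>

    static int
    bump_maxvnodes(int new_max)
    {
        int cur;
        size_t len = sizeof(cur);

        if (sysctlbyname("kern.maxvnodes", &cur, &len, NULL, 0) == -1)
            return (-1);
        if (new_max <= cur)
            return (0);    /* nothing to do */
        return (sysctlbyname("kern.maxvnodes", NULL, NULL,
            &new_max, sizeof(new_max)));
    }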
when allocating memory for network buffers at interrupt time. This is due
to inadequate checking for the new mcl_map. Fixed by merging mb_map and
mcl_map into a single mb_map.
Reviewed by: wollman
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
to be allocated at boot time. This is an expensive option, as they
consume physical ram and are not pageable etc. In certain situations,
this kind of option is quite useful, especially for news servers that
access a large number of directories at random and torture the name cache.
Defining 5000 or 10000 extra vnodes should cut down the amount of vnode
recycling somewhat, which should allow better name and directory caching
etc.
This is a "your mileage may vary" option, with no real indication of
what works best for your machine except trial and error. Too many will
cost you ram that you could otherwise use for disk buffers etc.
This is based on something John Dyson mentioned to me a while ago.
in machdep.c (it should use the global nmbclusters). Moved the calculation
of nmbclusters into conf/param.c (same place where nmbclusters has always
been assigned), and made the calculation include an extra amount based
on "maxusers". NMBCLUSTERS can still be overridden in the kernel config
file as always, but this change will make that generally unnecessary. This
fixes the "bug" reports from people who have misconfigured kernels seeing
the network "hang" when the mbuf cluster pool runs out.
Reviewed by: John Dyson
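The computation now has roughly this shape in conf/param.c (the constants
shown are illustrative, not necessarily the committed values); a config-file
NMBCLUSTERS still takes precedence:

    /*
     * Scale the mbuf cluster pool with maxusers unless the kernel
     * config file sets NMBCLUSTERS explicitly.
     */
    #ifndef NMBCLUSTERS
    #define NMBCLUSTERS    (512 + MAXUSERS * 16)
    #endif
    int    nmbclusters = NMBCLUSTERS;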
via sysctl(8). The initial value of maxprocperuid is maxproc-1,
that of maxfilesperproc is maxfiles (until maxfiles disappears).
Now it is at least possible to prevent a single user from opening maxfiles
files.
-Guido
Submitted by:
Obtained from:
much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.
vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme. The scheme is almost fully compatible with the old filesystem
interface. Significant improvement in the number of opportunities for write
clustering.
vfs_cluster.c, vfs_subr.c:
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c:
Yet more improvements in the collapse code. Elimination of some windows that
can cause list corruption.
vm_pageout.c:
Fixed it; it really works better now. Somehow in 2.0, some "enhancements"
broke the code. This code has been reworked from the ground up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c:
Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c, vm_map.c:
Dynamic kernel VM size; now we don't have to pre-allocate excessive numbers of
kernel PTs.
vm_glue.c:
Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h:
Fixed the problem that the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c:
Removal of the old vfs_bio cruft that supported the past pseudo-coherency;
the code doesn't need it anymore.
machdep.c:
Changes to better support the parameter values for the merged VM/buffer cache
scheme.
machdep.c, kern_exec.c, vm_glue.c:
Implemented a separate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c:
Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on
busy buffers.
Submitted by: John Dyson and David Greenman