freebsd-nq

Author	SHA1	Message	Date
Doug Rabson	e69763a315	Cosmetic changes to the PAGE_XXX macros to make them consistent with the other objects in vm.	1998-09-04 08:06:57 +00:00
Garrett Wollman	9898afa1f1	Bow to tradition and correctly implement the bogus-but-hallowed semantics of getsockopt never telling how much it might have copied if only the buffer were big enough.	1998-08-31 18:07:23 +00:00
Garrett Wollman	d224dbc106	Correctly set the return length regardless of the relative size of the user's buffer. Simplify the logic a bit. (Can we have a version of min() for size_t?)	1998-08-31 15:34:55 +00:00
KATO Takenori	582e52862a	- hw.machine_arch returns cpu architecture type. - moved definition of MACHINE_ARCH from cpu.h to param.h as alpha. - Added definitions of _MACHINE and _MACHINE_ARCH. - Added hw.ispc98. The hw.ispc98 is 1 in PC98 kernel and is 0 in IBM-PC kernel. Discussed with: John Birrell <jb@FreeBSD.ORG>	1998-08-31 08:41:58 +00:00
Bruce Evans	f5ce675296	Oops, the previous revision unconfigured too much pre-Lite2 compatibility cruft. At least lsvfs(1) was broken.	1998-08-29 13:13:10 +00:00
Luoqi Chen	ddae3cb9a0	Close a race window for getnewbuf() between shared lock holders of the vnode. Reviewed by: Mike Smith	1998-08-28 20:07:13 +00:00
Matthew Dillon	8e519d1f35	priority comparison in maybe_resched() didn't work properly if current and chk process were on different scheduler queues. Fixed.	1998-08-26 05:27:42 +00:00
Poul-Henning Kamp	12e14047a4	Fix DDB's printing of buf-flags after I changed them yesterday.	1998-08-25 14:41:42 +00:00
Poul-Henning Kamp	1d9b3ba13d	Remove the last remaining evidence of B_TAPE. Reclaim 3 unused bits in b_flags	1998-08-24 17:47:25 +00:00
Doug Rabson	069e9bc1b4	Change various syscalls to use size_t arguments instead of u_int. Add some overflow checks to read/write (from bde). Change all modifications to vm_page::flags, vm_page::busy, vm_object::flags and vm_object::paging_in_progress to use operations which are not interruptible. Reviewed by: Bruce Evans <bde@zeta.org.au>	1998-08-24 08:39:39 +00:00
Doug Rabson	c49265d091	Regenerate.	1998-08-24 08:32:19 +00:00
Doug Rabson	2e83b28161	Fix a few syscall arguments to use size_t instead of u_int.	1998-08-24 08:29:52 +00:00
Doug Rabson	a4f6773848	Add partial KLD support for ELF. The module loading is not written yet.	1998-08-24 08:25:26 +00:00
Bruce Evans	00671271c3	Fixed printf format errors. Only one left in LINT on i386's.	1998-08-24 02:28:16 +00:00
Poul-Henning Kamp	be18fc123b	remove bdevsw arg from dsopen(); Forgotten by: julian Reviewed by: bde	1998-08-23 20:16:35 +00:00
Dag-Erling Smørgrav	70d154a652	Don't check minor number of dump device at all. Discussed-with: Jörg Wunsch	1998-08-23 14:18:08 +00:00
Bruce Evans	1fcee46997	Fixed printf format errors.	1998-08-23 10:16:26 +00:00
Bruce Evans	cf8c7b0963	Added D_TTY to the cdevswitch flags for all tty drivers. This is required for the Lite2 fix for always returning EIO in dead_read(). Cleaned up the cdevswitch initializers for all tty drivers. Removed explicit calls to ttsetwater() from all (tty) drivers. ttsetwater() is now called centrally for opens, not just for parameter changes.	1998-08-23 08:26:42 +00:00
Garrett Wollman	cfe8b629f1	Yow! Completely change the way socket options are handled, eliminating another specialized mbuf type in the process. Also clean up some of the cruft surrounding IPFW, multicast routing, RSVP, and other ill-explored corners.	1998-08-23 03:07:17 +00:00
Bruce Evans	5879dcdb05	Moved `nx' functions to the one place where they are used (su.c). They shouldn't be used there either. They should have gone away about 3 years ago when the statically initialized devswitches went away, but su.c unfortunately still frobs the cdevswitch in the old way.	1998-08-20 06:10:42 +00:00
Dag-Erling Smørgrav	9103e8640c	Include opt_devfs.h which defines SLICE, to make previous commit meaningful. Pointed out by: Luoqi Chen	1998-08-19 20:20:52 +00:00
Søren Schmidt	e620a1cbed	Make struct buf->b_offset reflect the real byte offset which got in via the uio struct. This enables device drivers to use != DEV_BSIZE blocking on devices with weird sector/block sizes (i.e. CDROMs).	1998-08-19 10:50:32 +00:00
Bruce Evans	5cf40a698b	A limit of 200000 for the output buffer high watermark was excessive, since (hardware) ttys have too low a bandwidth to benefit significantly from large buffers. Use twice the old limit for the new-default case and 8 times the old limit for the driver-specifies-watermark case. Nothing uses these cases yet. Removed related debugging code.	1998-08-19 04:01:00 +00:00
Mike Smith	287e61c39f	Presently there is only one `currentldt' variable for all cpus in a SMP system. Unexpected things could happen if each cpu has a different ldt setting and one cpu tries to use value of currentldt set by another cpu. The fix is to move currentldt to the per-cpu area. It includes patches I filed in PR i386/6219 which are also user ldt related. PR: i386/7591, i386/6219 Submitted by: Luoqi Chen <luoqi@watermarkgroup.com>	1998-08-18 07:47:12 +00:00
Bruce Evans	2d2f8ae7ad	Fixed nonsense overflow checking (checking that a long variable is less than INT_MAX after it has possibly overflowed). Removed an unused variable and its associated 2 style bugs. Removed unused includes.	1998-08-17 17:28:10 +00:00
Dag-Erling Smørgrav	d08b9c139f	Enable kernel dumps on SLICE systems.	1998-08-16 11:27:19 +00:00
John Polstra	317c91f4d4	Make ELF kernels build again.	1998-08-16 04:19:03 +00:00
Bruce Evans	86a14a7a0a	Use [u]intptr_t instead of [u_]long for casts between pointers and integers. Don't forget to cast to (void *) as well.	1998-08-16 01:21:52 +00:00
Bruce Evans	69ed480f48	pmap.c: Cast pointers to (vm_offset_t) instead of to (u_long) (as before) or to (uintptr_t)(void *) (as would be more correct). Don't cast vm_offset_t's to (u_long) just to do arithmetic on them. mp_machdep.c: Cast pointers to (uintptr_t) instead of to (u_long). Don't forget to cast pointers to (void *) first or to recover from possible integral promotions, although this is too much work for machine-dependent code. vm code generally avoids warnings for pointer vs long size mismatches by using vm_offset_t to represent pointers; pmap.c often uses plain `unsigned int' instead of vm_offset_t and didn't use u_long elsewhere, but this style was messed up by code apparently imported from mp_machdep.c.	1998-08-16 00:41:40 +00:00
Bruce Evans	160bd4c62f	Oops, the printf format error fixes confused curp->area with a pointer.	1998-08-15 22:42:20 +00:00
Doug Rabson	7032ad107e	Protect all modifications to v_numoutput with splbio().	1998-08-13 08:09:08 +00:00
Bruce Evans	13950bd2ed	Don't configure compatibility code for pre-Lite2 mount() calls by default. This code should go away soon.	1998-08-12 20:17:42 +00:00
Doug Rabson	a2c99e3e72	Modify the internal interfaces to the kernel linker to make it possible for DDB to use its symbol tables.	1998-08-12 08:44:21 +00:00
Bruce Evans	18c5a6c435	Implemented dynamic registration of software interrupt handlers. Not used yet. Use dummy SWI handlers to avoid some checks for null pointers.	1998-08-11 15:08:13 +00:00
Bruce Evans	c41141b002	Fixed the formatting of some tables (mainly the one produced by ps in ddb) which I broke by changing %8[l]x to %8p. Hacked the central printf routine to not add an "0x" prefix for %p formats if the field width is nonzero. The tables are still horribly misformatted on 64-bit machines. Use %p instead of %8p to print pointers when the field width isn't important.	1998-08-10 14:27:34 +00:00
Poul-Henning Kamp	22126f4208	The machine dependent disk slice manager does not recognize DOS partition type 15 (Extended DOS, LBA) as a container for DOS logical volumes, so the appropriate slices (e.g. sd1s5) are not initialized. PR: 7549 PR: 4120 Reviewed by: phk Submitted by: Jim Mattson <jmattson@sonic.net>	1998-08-10 07:22:14 +00:00
Doug Rabson	d474eaaa5f	Protect all modifications to paging_in_progress with splvm(). The i386 managed to avoid corruption of this variable by luck (the compiler used a memory read-modify-write instruction which wasn't interruptible) but other architectures cannot. With this change, I am now able to 'make buildworld' on the alpha (sfx: the crowd goes wild...)	1998-08-06 08:33:19 +00:00
Bruce Evans	6360628342	Removed unused function hzto().	1998-08-05 18:06:40 +00:00
David Greenman	760c5490ee	Move assignment of cur_rlp to after the acquisition of the list lock. PR: 7496 Submitted by: Stefan Eggers <seggers@semyam.dinoco.de>	1998-08-05 14:06:04 +00:00
Poul-Henning Kamp	205d5ed6ff	remove nonsense code. PR: 7482 Reviewed by: phk Submitted by: Stefan Eggers <seggers@semyam.dinoco.de>	1998-08-04 09:21:04 +00:00
Bruce Evans	34e9dea435	Added a flags arg to dsopen() and updated drivers. The DSO_ONESLICE and DSO_NOLABELS flags prevent searching for slices and labels respectively. Current drivers don't set these flags. When DSO_NOLABELS is set, the in-core label for the whole disk is cloned to create an in-core label for each slice. This gives the correct result (a good in-core label for the compatibility slice) if DSO_ONESLICE is set or only one slice is found, but usually gives broken labels otherwise, so DSO_ONESLICE should be set if DSO_NOLABELS is set.	1998-07-30 15:16:06 +00:00
Doug Rabson	8a8a13c8f0	Only access an int for READU/WRITEU since that is what ptrace is declared to return.	1998-07-29 18:41:30 +00:00
Doug Rabson	a9d81f7c5c	Default to FreeBSD if no brand detected. This makes life easier when bootstrapping from NetBSD/alpha.	1998-07-29 18:39:35 +00:00
Bruce Evans	d974cf4dda	Fixed printf format errors.	1998-07-29 17:38:14 +00:00
Bruce Evans	f9a9c96c25	Centralized and optimized handling of large sectors. Centralized checking of transfer sizes and alignments. Old version tested with 2K-sectors on od disks by: Shunsuke Akiyama <akiyama@kme.mei.co.jp>.	1998-07-29 11:15:54 +00:00
Bruce Evans	ea0823f2c9	Use the slice-relative blkno in all parts of the label write protection checks. Using the partition-relative blkno in some parts broke the write protection for partitions at unusual offsets (only for partitions at offset 1 on i386's).	1998-07-29 08:24:23 +00:00
Joerg Wunsch	57308494ec	Make the logging of abnormally exiting processes optional by a sysctl. PR: kern/1711 Submitted by: Nick Sayer <nsayer@kfu.com>	1998-07-28 22:34:12 +00:00
Bruce Evans	1733a6c1df	Set bp->b_resid for failed transfers in dscheck(). This is the best place to set it, and the wd and wfd strategy routines don't set it (for failed transfers) because they expect dscheck() to initialize everything necessary. dscheck() has always set B_ERROR, but this is not quite sufficient, because b_resid is used by physio() to decide how much of a B_ERROR'ed i/o was done.	1998-07-28 19:39:09 +00:00
Bruce Evans	aa6db4230d	Used daddr_t's, not ints, to store disk block numbers. Updated printf formats and args to match. Fixed old printf format errors (all related; most were hidden by calling printf indirectly). This change somehow avoids compiler bugs for 64-bit longs on i386's, although it increases the number of 64-bit calculations.	1998-07-28 18:25:51 +00:00
Bruce Evans	bc9e7c3b42	Fixed double counting of runtime after a process exits. The last timeslice of the exiting process was counted for both the exiting process and the next process to run if the next process runs immediately. Broken in: mostly in kern_clock.c rev.1.70 (1998/05/28)	1998-07-27 19:16:21 +00:00
David Greenman	01ddfa33e6	Only call m_reclaim() if M_WAIT since calling it from an interrupt can cause problems. PR: 7403	1998-07-27 03:59:48 +00:00
Bruce Evans	f69c53b019	Don't pass the label to diskerr(), since the label is being constructed and may be invalid. In particular, d_secpercyl may be 0, and diskerr() divides by it.	1998-07-25 16:35:06 +00:00
Doug Rabson	0bf030847d	Add some very simple support for a compiled in (from config(8)) resource database.	1998-07-22 08:35:52 +00:00
Bruce Evans	7cb743ae28	Initialize more defaults for the in-core label for the whole disk. Callers only need to initialize d_secperunit now, but should initialize d_type (to reduce the IDE/SCSI confusion), d_typename (put the disk model in it) and geometry info (if it isn't completely fictitious). Callers will soon need to initialize d_secsize.	1998-07-20 14:35:27 +00:00
Bruce Evans	f21b93a0f7	Cleaned up rev.1.39 - the shadowing variable should have just gone away.	1998-07-20 13:51:11 +00:00
Bruce Evans	92d1f65ed2	Moved allocation of the slices struct to the right place. Initialize everything in it (the devsw pointers were not initialized early or at all for the !DEVFS case, but this was harmless on i386's).	1998-07-20 13:39:45 +00:00
Bruce Evans	1e550e3809	Backed out rev.1.43 (removed nonsense SLICE ifdef). SLICE is normally only defined in opt_devfs.h, so testing it before including anything is normally a no-op. Undef'ing DEVFS before including opt_devfs.h is similarly useless. OTOH, DEVFS support for sliced but not SLICEd (despite defined(SLICE)) devices is either harmless (if there are no such devices, then nothing in this file is used) or necessary (otherwise). It even seems to work for sliced cd devices.	1998-07-20 12:37:59 +00:00
Bill Fenner	0c495036b4	Undo rev 1.41 until we get more details about why it makes some systems fail.	1998-07-18 18:48:45 +00:00
Bruce Evans	18da528d41	Changed %n to %r in devfs name format strings. %n has almost gone away.	1998-07-15 12:18:34 +00:00
Bruce Evans	30166fabb6	Cast between longs and pointers via intptr_t. There shouldn't be nearly so many casts here. Casting a pointer that was an integer back to an integer just to compare it with -1 is bad, and casting it back just to compare it with NULL is just wrong.	1998-07-15 06:51:14 +00:00
Bruce Evans	d4d88b1e4d	Cast between u_longs and object pointers via uintptr_t. Access the entry address as a uintfptr_t, not as a long, and not necessarily as what modload(8) passes (it takes a u_long from the exec header and passes a u_int).	1998-07-15 06:39:12 +00:00
Bruce Evans	aae0aa4593	Cast between longs and pointers via intptr_t. The results of fuword() should be checked before casting. The results of suword() should be checked.	1998-07-15 06:19:33 +00:00
Bruce Evans	1ede4662be	Cast longs to intptr_t before casting them to pointers. Fixed bitrot in pseudo-declaration of `struct fcntl_args'. fcntl() is now broken in some cases when ints are larger than longs.	1998-07-15 06:10:16 +00:00
Bruce Evans	c2da0fd903	Cast pointers to intptr_t instead of or before casting to long. Fixed bitrot in K&R support (suword() now takes a long word). Didn't fix corresponding bitrot in store.9 and fetch.9. The correct types for the store and fetch families are problematic. The `word' functions are unfortunately named and need to be split to handle ints/longs/object pointers/function pointers. Storing argv[] as longs is quite broken when longs are longer than pointers, but usually works because it clobbers variables that will soon be reinitialized.	1998-07-15 05:21:48 +00:00
Bruce Evans	7cd99438f8	Cast u_longs to uintptr_t before casting them to pointers. Don't attempt to even partially support systems with function pointers larger than object pointers.	1998-07-15 05:00:26 +00:00
Bruce Evans	6a206dd96a	Cast function pointers to uintfptr_t before casting them to u_long. Hopefully caddr_t is large enough to hold function pointers. Cast object pointers to uintptr_t before casting them to u_long. Types are wronger than usual for the PT_READ_U case. ptrace() can only return ints, but longs are accessed.	1998-07-15 04:43:49 +00:00
Bruce Evans	a23d65bfc8	Cast pointers to uintptr_t/intptr_t instead of to u_long/long, respectively. Most of the longs should probably have been u_longs, but this change is just to prevent warnings about casts between pointers and integers of different sizes, not to fix poorly chosen types.	1998-07-15 02:32:35 +00:00
Bruce Evans	37889b394a	Changed to the C9x draft spelling of the (unsigned) integral type suitable for holding object pointers (ptrint_t -> uintptr_t). Added corresponding signed type (intptr_t). Changed/added corresponding non-C9x types for function pointers to match. Don't use nonstandard types to implement these types, and don't comment on them in <machine/types.h>.	1998-07-14 05:09:48 +00:00
Bruce Evans	9f14a215f4	Fixed printf format errors.	1998-07-13 07:05:55 +00:00
Doug Rabson	7a6c46b55a	Initialise all the fields separately in vattr_null since on the alpha they are not all the same width.	1998-07-12 16:45:39 +00:00
Doug Rabson	45c95fa1d6	Change interrupt api to be closer to intr_create/intr_connect.	1998-07-12 16:20:52 +00:00
Bruce Evans	bef7db2e66	Moved definition of fscale from param.c to kern_synch.c where it should always have been (it has no user-serviceable parts even at compile time) and staticized it.	1998-07-11 13:06:41 +00:00
Bruce Evans	2f18a2801b	Fixed printf format errors.	1998-07-11 10:45:45 +00:00
Bruce Evans	ed62fb52ec	Fixed printf format errors.	1998-07-11 10:28:47 +00:00
Bruce Evans	ac1e407b32	Fixed printf format errors.	1998-07-11 07:46:16 +00:00
Bruce Evans	e0c38587af	Fixed (un)sign extension bugs in %+n format. -4 became (long)(u_long)(u_int)-4 = 0x00000000fffffffc on machines with 32-bit ints and 64-bit longs. Restored %z format for printing signed hex. %+x shouldn't have been used since it is an error in userland. Prepared to nuke %n format by cloning it to %r. %n shouldn't have been used because it means something completely different in userland. Now %+r is equivalent to ddb's original %r, and %r is equivalent to ddb's original %n. Ignore '+' flag in combination with unsigned formats %{o,p,u,x}.	1998-07-08 10:41:32 +00:00
Sean Eric Fagan	c5edb423c6	Add support for run-time configuration of core file names. In a nutshell, you can specify the corefile name by using: sysctl -w kern.corefile="format" where format is a pathname (relative or absolute -- default is "%N.core"), with "%N" (process name), "%P" (process ID), and "%U" (user ID) formats. Reviewed by: Mike Smith, with strong requests by Julian :)	1998-07-08 06:38:39 +00:00
Julian Elischer	6deaf84b1f	Catch a few corner cases where FreeBSD differs enough from BSD 4.4 to confuse Soft updates.. Should solve several "dangling deps" panics.	1998-07-08 01:04:33 +00:00
Bruce Evans	c4ebf24f6e	Don't depend on gcc's feature of casting lvalues.	1998-07-07 04:36:23 +00:00
Bill Fenner	dece5b6a43	Introduce (fairly hacky) workaround for odd TCP behavior with application writes of size (100,208]+N*MCLBYTES. The bug: sosend() hands each mbuf off to the protocol output routine as soon as it has copied it, in the hopes of increasing parallelism (see http://www.kohala.com/~rstevens/vanj.88jul20.txt ). This works well for TCP as long as the first mbuf handed off is at least the MSS. However, when doing small writes (between MHLEN and MINCLSIZE), the transaction is split into 2 small MBUF's and each is individually handed off to TCP. TCP assumes that the first small mbuf is the whole transaction, so sends a small packet. When the second small mbuf arrives, Nagle prevents TCP from sending it so it must wait for a (potentially delayed) ACK. This sends throughput down the toilet. The workaround: Set the "atomic" flag when we're doing small writes. The "atomic" flag has two meanings: 1. Copy all of the data into a chain of mbufs before handing off to the protocol. 2. Leave room for a datagram header in said mbuf chain. TCP wants the first but doesn't want the second. However, the second simply results in some memory wastage (but is why the workaround is a hack and not a fix). The real fix: The real fix for this problem is to introduce something like a "requested transfer size" variable in the socket->protocol interface. sosend() would then accumulate an mbuf chain until it exceeded the "requested transfer size". TCP could set it to the TCP MSS (note that the current interface causes strange TCP behaviors when the MSS > MCLBYTES; nobody notices because MCLBYTES > ethernet's MTU).	1998-07-06 19:27:14 +00:00
Julian Elischer	596f8506ad	fix braino from yesterday's megacommit. Not sure of the result of it (may or may not affect anything) but it's fixed now. (found by: comparing what cvsup sent back to me with what I tested..)	1998-07-05 20:33:18 +00:00
Julian Elischer	f7ea2f55d1	There is no such thing any more as "struct bdevsw". There is only cdevsw (which should be renamed in a later edit to deventry or something). cdevsw contains the union of what were in both bdevsw and cdevsw entries. The bdevsw[] table still exists and is a second pointer to the cdevsw entry of the device. its major is in d_bmaj rather than d_maj. some cleanup still to happen (e.g. dsopen now gets two pointers to the same cdevsw struct instead of one to a bdevsw and one to a cdevsw). rawread()/rawwrite() went away as part of this though it's not strictly the same patch, just that it involves all the same lines in the drivers. cdroms no longer have write() entries (they did have rawwrite (?)). tapes no longer have support for bdev operations. Reviewed by: Eivind Eklund and Mike Smith Changes suggested by eivind.	1998-07-04 22:30:26 +00:00
Julian Elischer	fd5d1124e2	VOP_STRATEGY grows an (struct vnode *) argument as the value in b_vp is often not really what you want. (and needs to be frobbed). more cleanups will follow this. Reviewed by: Bruce Evans <bde@freebsd.org>	1998-07-04 20:45:42 +00:00
Poul-Henning Kamp	52f8e5d672	Hmm, braino in last commit.	1998-07-04 19:29:15 +00:00
Poul-Henning Kamp	0edd53d22a	Change the sign on a race-condition, so that instead of ending up several tens of milliseconds out in the future we end up the right place with a subweeniesecond error.	1998-07-04 19:12:21 +00:00
Poul-Henning Kamp	3e5e083cb7	Update M_EXT support in m_copypacket(). PR: 7122 Reviewed by: phk Submitted by: Castor Fu <castor@geocast.com> Originally forgotten by: julian	1998-07-03 08:36:48 +00:00
David Greenman	e25169f239	Reset MNT_ASYNC flag if needed if unmount() should fail. Submitted by: Paul Saab <paul@mu.org>	1998-07-03 03:47:24 +00:00
Poul-Henning Kamp	6ca4ca2476	When we transfer time from one timecounter to the next, use nanouptime(), not nanotime(); Otherwise we end up in 2026... Fix the arg to dummy_get_timecount()	1998-07-02 21:35:02 +00:00
Poul-Henning Kamp	8cb5266728	Add 3 sysctl variables for future use by ps(1).	1998-06-30 21:25:58 +00:00
Bruce Evans	673796a715	Nuked opt_defunct.h and kern_opt.c. config(8) now generates good enough warnings about all unknown options.	1998-06-30 14:43:04 +00:00
Poul-Henning Kamp	67f4e2ed05	Add trailing newline to sys/syscall.mk so that diff doesn't choke on it.	1998-06-28 10:01:52 +00:00
David Greenman	c87e2930e6	Added a sysctl variable kern.sugid_coredump for controlling coredump behavior of setuid/setgid binaries that defaults to 0 (coredump disabled).	1998-06-28 08:37:45 +00:00
Poul-Henning Kamp	c259b8dd2b	Report the mode as the result of the VOP_GETATTR rather than the vnodes type, they may not correspond.	1998-06-27 06:43:09 +00:00
Poul-Henning Kamp	7c281842e3	Remove isdisk() hacks.	1998-06-26 18:14:25 +00:00
Poul-Henning Kamp	b62591052c	Remove bdevsw_add(), change the only two users to use bdevsw_add_generic(). Extend cdevsw to be superset of bdevsw. Remove non-functional bdev lkm support. Teach wcd what the open() args mean.	1998-06-25 11:28:07 +00:00
Bruce Evans	be160d60ab	Removed unused includes.	1998-06-21 18:02:50 +00:00
Bruce Evans	e5b19842ef	Removed unused includes.	1998-06-21 14:53:44 +00:00
Bruce Evans	df471779ea	Round tickadj up. This prevents tickadj from being 0 when HZ > 500, which makes adjtime(2) useless and confuses xntpd(8) into refusing to start even when it would use the kernel PLL instead of adjtime(). The result is the same as recommended by tickadj(8), at least when HZ divides 10^6. Of course, you wouldn't want to actually use adjtime() when HZ is large. In the silly boundary case of HZ == 10^6, tickadj == tick == 1 so the clock stops while adjtime() is active.	1998-06-21 12:22:35 +00:00
Bruce Evans	316bbd5c6f	Converted add_interrupt_randomness() to take a `void *' arg. Rewrote mmioctl() to fix hundreds of style bugs and a few error handling bugs (don't check for superuser privilege for inappropriate ioctls, don't check the input arg for the output-only MEM_RETURNIRQ ioctl, and don't return EPERM for null changes).	1998-06-21 11:33:32 +00:00
Bruce Evans	9a2daf9190	Changed the type of an isa/general interrupt handler to take a `void *' arg. Fixed or hid most of the resulting type mismatches. Handlers can now be updated locally (except for reworking their global declarations in isa_device.h).	1998-06-18 15:32:09 +00:00
Bruce Evans	f95ac73519	Use copyout() instead of bcopy() to copy the image to user space. bcopy() caused panics under heavy paging (not quite as suspected - the kernel stack seemed to get corrupted). Fixed long lines. Reviewed by: phk	1998-06-16 14:36:40 +00:00
Doug Rabson	b1bf661000	[Add missing files from previous commit] Major changes to the generic device framework for FreeBSD/alpha: * Eliminate bus_t and make it possible for all devices to have attached children. * Support dynamically extendable interfaces for drivers to replace both the function pointers in driver_t and bus_ops_t (which has been removed entirely). Two system defined interfaces have been defined, 'device' which is mandatory for all devices and 'bus' which is recommended for all devices which support attached children. * In addition, the alpha port defines two simple interfaces 'clock' for attaching various real time clocks to the system and 'mcclock' for the many different variations of mc146818 clocks which can be attached to different alpha platforms. This eliminates two more function pointer tables in favour of the generic method dispatch system provided by the device framework. Future device interfaces may include: * cdev and bdev interfaces for devfs to use in replacement for specfs and the fixed interfaces bdevsw and cdevsw. * scsi interface to replace struct scsi_adapter (not sure how this works in CAM but I imagine there is something similar there). * various tailored interfaces for different bus types such as pci, isa, pccard etc.	1998-06-14 13:53:12 +00:00
Doug Rabson	99d11cde56	Major changes to the generic device framework for FreeBSD/alpha: * Eliminate bus_t and make it possible for all devices to have attached children. * Support dynamically extendable interfaces for drivers to replace both the function pointers in driver_t and bus_ops_t (which has been removed entirely). Two system defined interfaces have been defined, 'device' which is mandatory for all devices and 'bus' which is recommended for all devices which support attached children. * In addition, the alpha port defines two simple interfaces 'clock' for attaching various real time clocks to the system and 'mcclock' for the many different variations of mc146818 clocks which can be attached to different alpha platforms. This eliminates two more function pointer tables in favour of the generic method dispatch system provided by the device framework. Future device interfaces may include: * cdev and bdev interfaces for devfs to use in replacement for specfs and the fixed interfaces bdevsw and cdevsw. * scsi interface to replace struct scsi_adapter (not sure how this works in CAM but I imagine there is something similar there). * various tailored interfaces for different bus types such as pci, isa, pccard etc.	1998-06-14 13:46:10 +00:00
Poul-Henning Kamp	938ee3ce4d	Introduce std_pps_ioctl() to automagically DTRT. Add scaling capability to timex.offset, ntpd-4.0.73 will support this.	1998-06-13 09:30:26 +00:00
Doug Rabson	3900ddb2dc	Only build this on i386 for now. I may use it for the alpha later but currently it doesn't compile.	1998-06-11 07:23:59 +00:00
Julian Elischer	32f5d4d843	Replace 'sleep()' with 'tsleep()' Accidentally imported from Kirk's codebase. Pointed out by: various.	1998-06-10 22:02:14 +00:00
Julian Elischer	28913ebe4e	Submitted by: Kirk McKusick <mckusick@McKusick.COM> Fix for potential hang when trying to reboot the system or to forcibly unmount a soft update enabled filesystem. FreeBSD already handled the reboot case differently, this is however a better fix.	1998-06-10 18:13:19 +00:00
Doug Rabson	897cd717a5	Add initial support for the FreeBSD/alpha kernel. This is very much a work in progress and has never booted a real machine. Initial development and testing was done using SimOS (see http://simos.stanford.edu for details). On the SimOS simulator, this port successfully reaches single-user mode and has been tested with loads as high as one copy of /bin/ls :-). Obtained from: partly from NetBSD/alpha	1998-06-10 10:57:29 +00:00
Doug Rabson	8c12612cf6	64bit fixes: don't cast pointers to int.	1998-06-10 10:31:08 +00:00
Doug Rabson	2b605d0804	64bit fixes: don't cast p->p_retval to an int*.	1998-06-10 10:30:23 +00:00
Doug Rabson	831b9ef2be	64bit fixes: use u_long not int for ioctl command.	1998-06-10 10:29:31 +00:00
Doug Rabson	10d4743f6f	64bit fixes: use size_t not u_int for sizes.	1998-06-10 10:28:29 +00:00
Doug Rabson	2ef49ddfcb	64bit fixes: p->p_retval is a register_t[] not an int[].	1998-06-10 10:27:43 +00:00
Poul-Henning Kamp	a58f0f8e66	Add a tc_ prefix to struct timecounter members. Urged by: bde	1998-06-09 13:10:54 +00:00
Bruce Evans	1afde994e9	Pass lists of possible root devices and their names up to the machine-independent code and try mounting the devices in the lists instead of guessing alternative root devices in a machine- dependent way. autoconf.c: Reject preposterous slice numbers instead of silently converting them to COMPATIBILITY_SLICE. Don't forget to force slice = COMPATIBILITY_SLICE in the floppy device name. Eliminated most magic numbers and magic device names in setroot(). Fixed dozens of style bugs. vfs_conf.c: Put the actual root device name instead of "root_device" in the mount struct if the actual name is available. This is useful after booting with -s. If it were set in all cases then it could be used to do mount(8)'s ROOTSLICE_HUNT and fsck(8)'s hotroot guess better.	1998-06-09 12:52:35 +00:00
Bruce Evans	e7c1c309fa	Don't generate COMPAT_43 cruft if there are no COMPAT_43 syscalls. In particular, don't generate an include of "opt_compat.h" if it wouldn't affect anything we create. This will fix recent breakage of the ibcs2 LKM. The ibcs2 syscall files were not regenerated properly, so the LKM didn't break immediately when we started generating this extraneous include.	1998-06-09 03:32:05 +00:00
John Dyson	0d3dd8fbc5	Remove some junk left over from a previous commit. Submitted by: phk	1998-06-08 18:18:28 +00:00
Bruce Evans	414c93f3aa	Updated generated files.	1998-06-08 11:08:35 +00:00
Bruce Evans	bf0955a99d	Fixed some style bugs in output (missing tabs and unparenthesized macros). Fixed some style bugs in source (mostly, superfluous backslashes).	1998-06-08 11:02:00 +00:00
Doug Rabson	2e91d07af9	Fix a typo which prevented i386 elf from working at all (including Linux emulated elf binaries).	1998-06-08 09:19:35 +00:00
Poul-Henning Kamp	48115288df	Add one more member function to the timecounters, this one is for use with latch based PPS implementations. The client that uses it will be committed after more testing.	1998-06-07 20:36:55 +00:00
Doug Rabson	ecbb00a262	This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us in line with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change. The prototype FreeBSD/alpha machdep will follow in a couple of days time.	1998-06-07 17:13:14 +00:00
Poul-Henning Kamp	dbb3475507	Add a "this" style argument and a "void *private" so timecounters can figure out which instance to work with.	1998-06-07 08:40:53 +00:00
Bruce Evans	e3a03f0cfb	Don't attempt to copy the whole slices "struct" for DIOCGSLICEINFO. The slices "struct" isn't really a struct; we allocate only part of it in the fully dangerously dedicated case. Since the "struct" is malloced, the page beyond it may not be mapped, so attempts to copy it would crash. This problem became larger when the full struct was bloated from < 1K to > 3K by the addition of (mostly unused) DEVFS tokens some time before 2.2.0 was released.	1998-06-06 03:06:55 +00:00
David Greenman	b5afad7198	Moved limit frobbing (and the resulting limcopy()) that occurs for accounting to the accounting function so that this isn't needlessly done for some process exits. Reviewed by: bde,phk	1998-06-05 21:44:20 +00:00
David Greenman	9523f5c199	If we are out of mb_map space and we failed to m_reclaim() anything and the alloc is not M_DONTWAIT, then panic with "Out of mbuf clusters". Callers that specify M_WAIT can't deal with getting a NULL buffer, so this is a more graceful failure than randomly page faulting in the socket code or elsewhere.	1998-06-05 21:41:48 +00:00
John Dyson	e8f367853b	Correct sleep priority.	1998-06-02 05:39:13 +00:00
Peter Dufault	ce47711dee	Set PAGE_SIZE for _SC_PAGESIZE sysconf().	1998-06-01 21:54:43 +00:00
Peter Wemm	4dc75870b2	Have the wakeup routine do the upcall if needed. Obtained from: NetBSD	1998-05-31 18:38:43 +00:00
Poul-Henning Kamp	e796e00de3	Some cleanups related to timecounters and weird ifdefs in <sys/time.h>. Clean up (or if antipodic: down) some of the msgbuf stuff. Use an inline function rather than a macro for timecounter delta. Maintain process "on-cpu" time as 64 bits of microseconds to avoid needless second rollover overhead. Avoid calling microuptime the second time in mi_switch() if we do not pass through _idle in cpu_switch() This should reduce our context-switch overhead a bit, in particular on pre-P5 and SMP systems. WARNING: Programs which muck about with struct proc in userland will have to be fixed. Reviewed, but found imperfect by: bde	1998-05-28 09:30:28 +00:00
John Dyson	cf2819ccb8	Make flushing dirty pages work correctly on filesystems that unexpectedly do not complete writes even with sync I/O requests. This should help the behavior of mmaped files when using softupdates (and perhaps in other circumstances also.)	1998-05-21 07:47:58 +00:00
Peter Dufault	aebde78243	1. Add new defs for mins and maxs for the POSIX flavor priorities. They end up being the same, but it doesn't look like you're comparing apples and oranges. 2. Use need_resched instead of reset_priority. This isn't right either, since for example you'll round-robin against equal priority FIFO processes when lowering the priority of another process, but this works better and a real fix needs to be in kern_synch and not out here. 3. This is not a device driver: copyin/copyout the structure.	1998-05-19 21:11:53 +00:00
Poul-Henning Kamp	579f4456b9	Change a data type internal to the timecounters, and remove the "delta" function. Reviewed, but not entirely approved by: bde	1998-05-19 18:55:02 +00:00
Poul-Henning Kamp	58067a9909	Make the size of the msgbuf (dmesg) a "normal" option.	1998-05-19 08:58:53 +00:00
Tor Egge	afc6ea238f	Disallow reading the current kernel stack. Only the user structure and the current registers should be accessible. Reviewed by: David Greenman <dg@root.com>	1998-05-19 00:00:14 +00:00
Peter Dufault	2a61a11038	1. Don't use "nosys" and generate coredumps for unconfigured system calls - return ENOSYS per the spec. 2. Fix interface stub to set priority properly.	1998-05-18 12:53:45 +00:00
Tor Egge	2f1e70693d	Add forwarding of roundrobin to other cpus. This gives a more regular update of cpu usage as shown by top when one process is cpu bound (no system calls) while the system is otherwise idle (except for top). Don't attempt to switch to the BSP in boot(). If the system was idle when an interrupt caused a panic, this won't work. Instead, switch to the BSP in cpu_reset. Remove some spurious forward_statclock/forward_hardclock warnings.	1998-05-17 22:12:14 +00:00
Bruce Evans	ee002b68d1	Fixed interval calculation in realitimexpire() again. Obtained from: rev.1.9. Broken in: rev.1.50. Fixed a spelling error. Obtained from: Lite2.	1998-05-17 20:13:01 +00:00
Bruce Evans	c8b4782815	Fixed stale references to hzto() in comments.	1998-05-17 20:08:05 +00:00
Tor Egge	cb87a87c16	Supply the correct process argument to dounmount when possible.	1998-05-17 19:38:55 +00:00
Tor Egge	5931a9c24e	For SMP, use prv_PPAGE1/prv_PMAP1 instead of PADDR1/PMAP1. get_ptbase and pmap_pte_quick no longer generates IPIs. This should reduce the number of IPIs during heavy paging.	1998-05-17 18:53:19 +00:00
Poul-Henning Kamp	c21410e119	s/nanoruntime/nanouptime/g s/microruntime/microuptime/g Reviewed by: bde	1998-05-17 11:53:46 +00:00
Garrett Wollman	98271db4d5	Convert socket structures to be type-stable and add a version number. Define a parameter which indicates the maximum number of sockets in a system, and use this to size the zone allocators used for sockets and for certain PCBs. Convert PF_LOCAL PCB structures to be type-stable and add a version number. Define an external format for information about socket structures and use it in several places. Define a mechanism to get all PF_LOCAL and PF_INET PCB lists through sysctl(3) without blocking network interrupts for an unreasonable length of time. This probably still has some bugs and/or race conditions, but it seems to work well enough on my machines. It is now possible for `netstat' to get almost all of its information via the sysctl(3) interface rather than reading kmem (changes to follow).	1998-05-15 20:11:40 +00:00
Peter Wemm	9c4aed2ed7	Nuke signanosleep(). (I've left nanosleep1() separate to nanosleep() as I don't want to mess with the multiple returns)	1998-05-14 11:31:08 +00:00
Peter Wemm	06b6493558	regen after signanosleep nuke	1998-05-14 11:29:06 +00:00
Peter Wemm	786cf38a29	deep-six signanosleep(). It sounded like a good idea at the time.	1998-05-14 11:28:11 +00:00
Peter Wemm	1973d51bfb	Commit an old change that has been sitting around for a long while. signanosleep() did not deal with signal masks properly. This change was based on a discussion with bde some time ago (at least 6 months or more). signanosleep() should probably go away since it was never really used for more than a few weeks and doesn't appear in released code. It should probably be killed before somebody uses it and it becomes a gratuitous nonstandard feature.	1998-05-14 10:38:52 +00:00
Bruce Evans	b322fb5d76	Backed out previous commit. It is invalid to call d_ioctl() on possibly non-open devices, and we don't want to restrict dumping to swap devices anyway. It is especially invalid to call d_ioctl() in non-process context for panics. d_psize() can be called on non-open devices, at least on non-SLICED ones that support d_dump(), and setdumpdev() has depended on this for a long time although it is probably wrong, but even d_psize() can't be called in non-process context - that's why dumpsys() depends on previously computed values although these values may be stale. The historical restriction to devices with dkpart(dev) == SWAP_PART should go away.	1998-05-12 17:34:02 +00:00
John Dyson	1f56217280	Fix the futimes/undelete/utrace conflict with other BSD's. Note that the only common usage of utrace (the possible problem with this commit) is with malloc, so this should be a real problem. Add the various NetBSD syscalls that allow full emulation of their development environment.	1998-05-11 03:55:28 +00:00
John Dyson	f0175db1ee	Attempt to set write combining mode for graphics devices.	1998-05-11 01:06:08 +00:00
Mike Smith	7be2d30077	In the words of the submitter: --------- Make callers of namei() responsible for releasing references or locks instead of having the underlying filesystems do it. This eliminates redundancy in all terminal filesystems and makes it possible for stacked transport layers such as umapfs or nullfs to operate correctly. Quality testing was done with testvn, and lat_fs from the lmbench suite. Some NFS client testing courtesy of Patrik Kudo. vop_mknod and vop_symlink still release the returned vpp. vop_rename still releases 4 vnode arguments before it returns. These remaining cases will be corrected in the next set of patches. --------- Submitted by: Michael Hancock <michaelh@cet.co.jp>	1998-05-07 04:58:58 +00:00
Julian Elischer	7f2f1b784e	Add dump support to the DEVFS/slice code. now we can actually catch our crashes :-) Submitted by: Luoqi Chen <luoqi@chen.ml.org> (the man who's everywhere)	1998-05-06 22:14:48 +00:00
Mike Smith	79cc756d8b	As described by the submitter: Reverse the VFS_VRELE patch. Reference counting of vnodes does not need to be done per-fs. I noticed this while fixing vfs layering violations. Doing reference counting in generic code is also the preference cited by John Heidemann in recent discussions with him. The implementation of alternative vnode management per-fs is still a valid requirement for some filesystems but will be revisited sometime later, most likely using a different framework. Submitted by: Michael Hancock <michaelh@cet.co.jp>	1998-05-06 05:29:41 +00:00
John Dyson	96fb8cf258	Fix the shm panic. I mistakenly used the shadow_count to keep the object from being split, and instead added an OBJ_NOSPLIT.	1998-05-04 17:12:53 +00:00
John Dyson	cbd8ec0902	Work around some VM bugs, the worst being an overly aggressive swap space free calculation. More complete fixes will be forthcoming, in a week.	1998-05-04 03:01:44 +00:00
Bruce Evans	77849078bf	Oops, the previous commit should have changed `i386' to `__i386__', not `__i386'.	1998-05-01 16:40:21 +00:00
Bruce Evans	809e3a8464	Partially fixed write clustering for cases where cluster_wbuild() is called from vfs_bio_awrite() without going through cluster_write() or ufs_bmaparray(), in particular for all writes to block disk devices. Only ufs_bmaparray() sets vp->v_maxio in a correct way, and it doesn't seem to be called early enough even for regular files.	1998-05-01 16:29:27 +00:00
Peter Wemm	b1951f4028	vm_page_is_valid() wasn't expecting a large offset argument, it's expecting a sub-page offset. We were passing the file position, and vm_page_bits() could do some interesting things when base was larger than PAGE_SIZE. if (size > PAGE_SIZE - base) size = PAGE_SIZE - base; is interesting when (PAGE_SIZE - base) is negative. I could imagine that this could have interesting consequences for memory page -> device block bit validation.	1998-05-01 15:10:59 +00:00
Peter Wemm	f806d5a257	Fix one problem with NFSv3 > 2GB file support. Submitted by: bde	1998-05-01 15:04:35 +00:00
Eivind Eklund	288078be0f	Translate T_PROTFLT to SIGSEGV instead of SIGBUS when running under Linux emulation. This make Allegro Common Lisp 4.3 work under FreeBSD! Submitted by: Fred Gilham <gilham@csl.sri.com> Commented on by: bde, dg, msmith, tg Hoping he got everything right: eivind	1998-04-28 18:15:08 +00:00
David E. O'Brien	cbcfa1ba6a	Discussed with: bde	1998-04-24 11:50:30 +00:00
David E. O'Brien	8f89f24fc3	Create virgin disklabels with 8 (MAXPARTITIONS) partitions rather than three (RAW_PART + 1); This makes ``disklabel -Brw sdN auto'' do the Right Thing.	1998-04-24 11:49:57 +00:00
David Greenman	9351a2295a	Added kern.ipc.nmbclusters	1998-04-24 04:15:52 +00:00
Julian Elischer	c0bab11dfe	Make the devfs SLICE option a standard type option. (hopefully it will go away eventually anyhow)	1998-04-20 03:57:41 +00:00
Julian Elischer	3e425b968d	Add changes and code to implement a functional DEVFS. This code will be turned on with the TWO options DEVFS and SLICE. (see LINT) Two labels PRE_DEVFS_SLICE and POST_DEVFS_SLICE will delineate these changes. /dev will be automatically mounted by init (thanks phk) on bootup. See /sys/dev/slice/slice.4 for more info. All code should act the same without these options enabled. Mike Smith, Poul Henning Kamp, Soeren, and a few dozen others This code does not support the following: bad144 handling. Persistence. (My head is still hurting from the last time we discussed this) ATAPI floppies are not handled by the SLICE code yet. When this code is running, all major numbers are arbitrary and COULD be dynamically assigned. (this is not done, for POLA only) Minor numbers for disk slices ARE arbitrary and dynamically assigned.	1998-04-19 23:32:49 +00:00
Dag-Erling Smørgrav	59bad7c53b	Backed out lseek changes.	1998-04-19 22:20:32 +00:00
Dag-Erling Smørgrav	25096724e8	Return EINVAL and do not change file pointer if resulting offset is negative. PR: kern/6184	1998-04-18 19:24:44 +00:00
Peter Wemm	37b8ccd37a	In vfs_msync(), test to see if the vnode being examined is "interesting" (ie: it has a vm_object attached and is marked as OBJ_MIGHTBEDIRTY) before attempting to lock it. This should reduce the cpu hit that is incurred when doing a sync(2) and when the syncer process is doing the 30-second writeback of dirty mmap() data to disk. Skip this speedup if we are doing an unmount() to be sure to get everything - we can afford to occasionally miss a msync while the system is running, but not at unmount. I'm not sure about the VXLOCK and MNT_WAIT case, it seems a bit odd to skip doing a page_clean at unmount time just because a vnode is VXLOCKed, but that's what was being done before...	1998-04-18 06:26:16 +00:00
Dag-Erling Smørgrav	dc73342347	Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108.	1998-04-17 22:37:19 +00:00
Bruce Evans	ab36c3d3e7	Really finish supporting compiling with `gcc -ansi'.	1998-04-17 04:53:44 +00:00
Peter Wemm	efdc5523c0	When the softdep conversion took place, the periodic vfs_msync() from update got lost. This is responsible for ensuring that dirty mmap() pages get periodically written to disk. Without it, long time mmap's might not have their dirty pages written out at all if the system crashes or isn't cleanly shut down. This could be nasty if you've got a long-running writing via mmap(), dirty pages used to get written to disk within 30 seconds or so.	1998-04-16 03:31:26 +00:00
Tor Egge	71033a8c50	Unlock mountlist_slock if the mount point was busy (unmount in progress) during the attempt at lazy fsync.	1998-04-15 18:37:49 +00:00
Bruce Evans	c1087c1324	Support compiling with `gcc -ansi'.	1998-04-15 17:47:40 +00:00
Poul-Henning Kamp	115facb29d	Fix a minor mbuf leak created by the previous change. Reviewed by: phk Submitted by: pb@fasterix.freenix.org (Pierre Beyssac)	1998-04-14 06:24:43 +00:00
Poul-Henning Kamp	aba558930b	setsockopt() transports user option data in an mbuf. if the user data is greater than MLEN, setsockopt is unable to pass it onto the protocol handler. Allocate a cluster in such case. PR: 2575 Reviewed by: phk Submitted by: Julian Assange proff@iq.org	1998-04-11 20:31:46 +00:00
Poul-Henning Kamp	a2481bbe8e	When pmap_pinit0() allocates a page for proc0's page directory, the kernel page table may need to be extended. But while growing the kernel page table (pmap_growkernel()), newly allocated kernel page table pages are entered into every process' page directory. For proc0, the page directory is not allocated yet, and results in a page fault. Eventually, the machine panics with "lockmgr: not holding exclusive lock". PR: 5458 Reviewed by: phk Submitted by: Luoqi Chen <luoqi@luoqi.watermarkgroup.com>	1998-04-11 17:24:06 +00:00
Alexander Langer	7c2e3d329a	Grammar police.	1998-04-10 00:09:04 +00:00
Wolfram Schneider	5ddc8ded1d	New mount option nosymfollow. If enabled, the kernel lookup() function will not follow symbolic links on the mounted file system and return EACCES (Permission denied).	1998-04-08 18:31:59 +00:00
Poul-Henning Kamp	5f88ec3625	Minor adjustments to the timecounting and proc0. Mostly Submitted by: bde	1998-04-08 09:01:53 +00:00
Peter Wemm	100ceca222	Today is not my lucky day. Fix missing brace and I got a request to use EMLINK instead.	1998-04-06 19:32:37 +00:00
Peter Wemm	193afe0189	Use a different errno (ELOOP (as sef mentioned) since the text that goes with the error sounds ok for the condition) if O_NOFOLLOW gets a link.	1998-04-06 18:43:28 +00:00
Peter Wemm	0fdc628b41	Rather than let users get fd's to symlink files, make O_NOFOLLOW cause an error if it gets a link (like it does if it gets a socket). The implications of letting users try and do file operations on symlinks themselves were too worrying.	1998-04-06 18:25:21 +00:00
Peter Wemm	7e3426aa1f	Implement a new open(2) flag: O_NOFOLLOW. This will instruct open to not follow symlinks, but to open a handle on the link itself(!). As strange as this might sound, it has several useful applications safe race-free ways of opening files in hostile areas (eg: /tmp, a mode 1777 /var/mail, etc). It also would allow things like fchown() to work on the link rather than having to implement a new syscall specifically for that task. Reviewed by: phk	1998-04-06 17:38:43 +00:00
Peter Wemm	aacdc613e5	curproc is initialized in locore at the same time for both SMP and UP now.	1998-04-06 15:51:22 +00:00
Peter Wemm	cf34ef61ee	Use real types for the SMP pages being allocated rather than arrays of ints. Remove some no longer needed casts. Initialize the per-cpu global data area using the structs rather than knowing too much about layout, alignment, etc.	1998-04-06 15:48:30 +00:00
Poul-Henning Kamp	2eeb0e2ea0	Make read_random() take a (void *) argument instead of (char *)	1998-04-06 09:30:42 +00:00
Poul-Henning Kamp	4cf41af3d4	Make a kernel version of the timer* functions called timerval* to be more consistent. OK'ed by: bde	1998-04-06 08:26:08 +00:00
Poul-Henning Kamp	5704ba6a06	More fixes for the iterative case of nanosleep1 from bruce. I hate the 2-arg time{spec|val}{add|sub} functions!	1998-04-05 12:10:41 +00:00
Poul-Henning Kamp	bfe6c9fabf	Make the dummy timecounter run at 1 MHz rather than 100kHz (noticed by bde) fix the itimer(REAL) handling.	1998-04-05 11:49:36 +00:00
Peter Wemm	d59fbbf6c8	If there is no error code, don't copyout the remaining time. (As documented in the man page and the standards). (and besides, nanosleep1 isn't setting it in this case at present anyway, so we'd be copying junk).	1998-04-05 11:17:19 +00:00
Poul-Henning Kamp	338418263d	Fix nanosleep1 based on Bruces suggestion.	1998-04-05 10:28:01 +00:00
Andrey A. Chernov	80a39463c9	Remove unused atv.tv_usec = 0; from select/poll code	1998-04-05 10:03:52 +00:00
Peter Wemm	2257b488b9	tsleep() returns EWOULDBLOCK if the timeout expired. Don't return this to usermode, otherwise sleep(3) fails, cron doesn't work, etc etc etc.	1998-04-05 07:31:44 +00:00
Peter Wemm	b90dcc0c5d	Fix previous commit. Don't people read compiler messages or something??	1998-04-05 02:59:10 +00:00
Poul-Henning Kamp	91ad39c6b3	Handle double fraction overflow in nano & microtime functions (spotted by Bruce) Use tvtohz() a place where it fits.	1998-04-04 18:46:13 +00:00
Poul-Henning Kamp	00af9731c9	Time changes mark 2: * Figure out UTC relative to boottime. Four new functions provide time relative to boottime. * move "runtime" into struct proc. This helps fix the calcru() problem in SMP. * kill mono_time. * add timespec{add|sub|cmp} macros to time.h. (XXX: These may change!) * nanosleep, select & poll takes long sleeps one day at a time Reviewed by: bde Tested by: ache and others	1998-04-04 13:26:20 +00:00
John Dyson	aec0bcdf5b	Perhaps fix a problem that some drivers have that they don't properly initialize the b_kvasize element. This might fix some of the split I/O requests that some people have.	1998-04-04 05:55:05 +00:00
Poul-Henning Kamp	4ff16568be	Try to fix poll & select after I broke them.	1998-04-02 07:22:17 +00:00
Tor Egge	5758c2de94	Add two workarounds for broken MP tables: - Attempt to handle PCI devices where the interrupt is an ISA/EISA interrupt according to the mp table. - Attempt to handle multiple IO APIC pins connected to the same PCI or ISA/EISA interrupt source. Print a warning if this happens, since performance is suboptimal. This workaround is only used for PCI devices. With these two workarounds, the -SMP kernel is capable of running on my Asus P/I-P65UP5 motherboard when version 1.4 of the MP table is disabled.	1998-04-01 21:07:37 +00:00
Poul-Henning Kamp	460608e768	Fix an off by 1<<32 error.	1998-03-31 10:47:01 +00:00
Poul-Henning Kamp	75da0aa298	Add a dummy timecounter until we find the real thing(s).	1998-03-31 10:44:56 +00:00
Poul-Henning Kamp	227ee8a188	Eradicate the variable "time" from the kernel, using various measures. "time" wasn't an atomic variable, so splfoo() protection was needed around any access to it, unless you just wanted the seconds part. Most uses of time.tv_sec now use the new variable time_second instead. gettime() changed to getmicrotime(). Remove a couple of unneeded splfoo() protections, the new getmicrotime() is atomic, (until Bruce sets a breakpoint in it). A couple of places needed random data, so use read_random() instead of mucking about with time which isn't random. Add a new nfs_curusec() function. Mark a couple of bogosities involving the now disappeared time variable. Update ffs_update() to avoid the weird "== &time" checks, by fixing the one remaining call that passed &time as args. Change profiling in ncr.c to use ticks instead of time. Resolution is the same. Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call hzto() which subtracts time" sequences. Reviewed by: bde	1998-03-30 09:56:58 +00:00
John Dyson	006b9b7df9	Correct a significant problem with the softupdates port. Allow fsync to work properly within the softupdates framework, and thereby eliminate some unfortunate panics.	1998-03-29 18:23:44 +00:00
Poul-Henning Kamp	934f5f3306	Export MD5Transform in md5.c and remove a private version in random_machdep.c md5 is standard as a consequence of this.	1998-03-29 11:55:06 +00:00
Peter Dufault	7c9f6f8f8b	Remove duplicate comment	1998-03-28 18:16:29 +00:00
Peter Dufault	38c76440b8	Include sys/resource.h to get PRIO_MAX.	1998-03-28 14:49:47 +00:00
Bruce Evans	3c1300a6b3	Removed unused #includes.	1998-03-28 13:25:01 +00:00
Bruce Evans	771b51ef7b	Don't depend on <sys/mount.h> including <sys/socket.h>.	1998-03-28 12:04:40 +00:00
Peter Dufault	8a6472b723	Finish _POSIX_PRIORITY_SCHEDULING. Needs P1003_1B and _KPOSIX_PRIORITY_SCHEDULING options to work. Changes: Change all "posix4" to "p1003_1b". Misnamed files are left as "posix4" until I'm told if I can simply delete them and add new ones; Add _POSIX_PRIORITY_SCHEDULING system calls for FreeBSD and Linux; Add man pages for _POSIX_PRIORITY_SCHEDULING system calls; Add options to LINT; Minor fixes to P1003_1B code during testing.	1998-03-28 11:51:01 +00:00
Bruce Evans	08637435f2	Moved some #includes from <sys/param.h> nearer to where they are actually used.	1998-03-28 10:33:27 +00:00
Poul-Henning Kamp	c6bcf724da	Split the padding out into a separate function. Synchronize the kernel and libmd versions of md5c.c PR: misc/6127 Reviewed by: phk Submitted by: Ari Suutari <ari@suutari.iki.fi>	1998-03-27 10:23:00 +00:00
John Dyson	f9be84912c	Correct a problem where buffers might not be zeroed when needed. The B_MALLOC buffers might not have been properly zeroed.	1998-03-27 06:48:24 +00:00
Poul-Henning Kamp	a0502b19d4	Add two new functions, get{micro|nano}time. They are atomic, but return in essence what is in the "time" variable. gettime() is now a macro front for getmicrotime(). Various patches to use the two new functions instead of the various hacks used in their absence. Some punctuation and grammar patches from Bruce. A couple of XXX comments.	1998-03-26 20:54:05 +00:00
Jonathan Lemon	640c4313af	Add the ability to make real-mode BIOS calls from the kernel. Currently, everything is contained inside #ifdef VM86, so this option must be present in the config file to use this functionality. Thanks to Tor Egge, these changes should work on SMP machines. However, it may not be thoroughly SMP-safe. Currently, the only BIOS calls made are memory-sizing routines at bootup, these replace reading the RTC values.	1998-03-23 19:52:59 +00:00
John Dyson	52c64c95c5	In kern_physio.c fix tsleep priority messup. In vfs_bio.c, remove b_generation count usage, remove redundant reassignbuf, remove redundant spl(s), manage page PG_ZERO flags more correctly, utilize an invalid value for b_offset until it is properly initialized. Add asserts for #ifdef DIAGNOSTIC, when b_offset is improperly used. when a process is not performing I/O, and just waiting on a buffer generally, make the sleep priority low. only check page validity in getblk for B_VMIO buffers. In vfs_cluster, add b_offset asserts, correct pointer calculation for clustered reads. Improve readability of certain parts of the code. Remove redundant spl(s). In vfs_subr, correct usage of vfs_bio_awrite (From Andrew Gallatin <gallatin@cs.duke.edu>). More vtruncbuf problems fixed.	1998-03-19 22:48:16 +00:00
John Dyson	1c77c6b7b0	Fix an embarrassing problem in vtruncbuf.	1998-03-19 18:46:58 +00:00
John Dyson	4641c8ac1d	Correct a problem where data OR metadata could be thrown away if a buffer is grown.	1998-03-17 17:36:05 +00:00
KATO Takenori	f1aca9c33f	Deleted PC-98 code because (1) machine dependent code should not be in here, and (2) the flag used in PC-98 code has been assigned to another purpose.	1998-03-17 08:41:28 +00:00
John Dyson	2deb5d0417	Correct a severely evil bug in the vtruncbuf code. It didn't cause me any problems until after the previous commit. This problem then caused a severe case of creeping crud on my diskdrive, and hosed my system so bad, that I needed to do a complete reinstall. Sorry!!! I assume that others have manifest this bug.	1998-03-17 06:30:52 +00:00
Julian Elischer	c2a94b7a3c	Remove a soft-update hook that was accidentally added to the READ path. also add some comments, and a couple of very minor cosmetic changes.	1998-03-16 18:39:41 +00:00
Poul-Henning Kamp	b05dcf3c2f	A bunch of BNN (Bruce Normal Nits) from bde: Bring back the softclock inlining save a couple of <<32's many white-space shuffles.	1998-03-16 10:19:12 +00:00
John Dyson	e85c1afb7c	Allow vfs_ioopt to be enabled with a (temporary) config option.	1998-03-16 02:13:03 +00:00
John Dyson	bef608bd7e	Some VM improvements, including elimination of alot of Sig-11 problems. Tor Egge and others have helped with various VM bugs lately, but don't blame him -- blame me!!! pmap.c: 1) Create an object for kernel page table allocations. This fixes a bogus allocation method previously used for such, by grabbing pages from the kernel object, using bogus pindexes. (This was a code cleanup, and perhaps a minor system stability issue.) pmap.c: 2) Pre-set the modify and accessed bits when prudent. This will decrease bus traffic under certain circumstances. vfs_bio.c, vfs_cluster.c: 3) Rather than calculating the beginning virtual byte offset multiple times, stick the offset into the buffer header, so that the calculated offset can be reused. (Long long multiplies are often expensive, and this is a probably unmeasurable performance improvement, and code cleanup.) vfs_bio.c: 4) Handle write recursion more intelligently (but not perfectly) so that it is less likely to cause a system panic, and is also much more robust. vfs_bio.c: 5) getblk incorrectly wrote out blocks that are incorrectly sized. The problem is fixed, and writes blocks out ONLY when B_DELWRI is true. vfs_bio.c: 6) Check that already constituted buffers have fully valid pages. If not, then make sure that the B_CACHE bit is not set. (This was a major source of Sig-11 type problems.) vfs_bio.c: 7) Fix a potential system deadlock due to an incorrectly specified sleep priority while waiting for a buffer write operation. The change that I made opens the system up to serious problems, and we need to examine the issue of process sleep priorities. vfs_cluster.c, vfs_bio.c: 8) Make clustered reads work more correctly (and more completely) when buffers are already constituted, but not fully valid. (This was another system reliability issue.) vfs_subr.c, ffs_inode.c: 9) Create a vtruncbuf function, which is used by filesystems that can truncate files. 
The vinvalbuf forced a file sync type operation, while vtruncbuf only invalidates the buffers past the new end of file, and also invalidates the appropriate pages. (This was a system reliability and performance issue.) 10) Modify FFS to use vtruncbuf. vm_object.c: 11) Make the object rundown mechanism for OBJT_VNODE type objects work more correctly. Included in that fix, create pager entries for the OBJT_DEAD pager type, so that paging requests that might slip in during race conditions are properly handled. (This was a system reliability issue.) vm_page.c: 12) Make some of the page validation routines be a little less picky about arguments passed to them. Also, support page invalidation change the object generation count so that we handle generation counts a little more robustly. vm_pageout.c: 13) Further reduce pageout daemon activity when the system doesn't need help from it. There should be no additional performance decrease even when the pageout daemon is running. (This was a significant performance issue.) vnode_pager.c: 14) Teach the vnode pager to handle race conditions during vnode deallocations.	1998-03-16 01:56:03 +00:00
John Dyson	26300b34f1	Disable the vfs.ioopt option for now, so that we don't get gratuitous bugreports. I might not be able to fix the problems before 3.0, due to other, more important things.	1998-03-14 19:50:36 +00:00
Tor Egge	8293f20aee	Don't misuse vnode interlocks in routines that can be called from interrupts. PR: 5893	1998-03-14 02:55:01 +00:00
Peter Dufault	917827427e	idprio processes must be preempted as soon as anything is runnable.	1998-03-11 20:50:42 +00:00
Mike Smith	617dd81f70	If the root mount fails from a device that is not the compatibility slice of a disk, because that slice does not exist, try again mounting from the compatibility slice. This handles the case where a disk has been initialised by 'disklabel auto', which places a bogus and invalid slice entry on the disk. The bootstrap is not smart enough to reject this slice, and pretends to boot from it. Believing the bootstrap at this point is unwise. Booting from non-'wd' disks thus prepared is still broken, as 'disklabel -rwB xdN auto' does not initialise the disk type field, and the bootstrap mistakenly claims that the disk is handled by 'wd'. Behaviour is now consistent with DEVFS expected characteristics.	1998-03-11 00:10:31 +00:00
John Birrell	cbe0799aaf	Add statements to generate a sys/syscall.mk file for inclusion during the libc/libc_r to automatically pick up syscall names on the assumption that default asm code needs to generated for them. In the up-coming changes to the libc makefiles, there is the option to provide a machine dependent asm source file which will turn off the automatic generation of the default. There is also an option to just stop code being generated for a syscall. In most cases, though, the default asm code is all that is required, so this change makes that the most convenient was to do business. Idea suggested by: bde	1998-03-09 04:00:42 +00:00
Julian Elischer	b1897c197c	Reviewed by: dyson@freebsd.org (John Dyson), dg@root.com (David Greenman) Submitted by: Kirk McKusick (mcKusick@mckusick.com) Obtained from: Whistle development tree	1998-03-08 09:59:44 +00:00
John Dyson	eed2412e5a	Free the first page also if it is not valid.	1998-03-08 06:21:33 +00:00
John Dyson	8f9110f6a1	This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifested themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistency, and as a side-effect broke the vfs.ioopt code. This code might have been committed separately, but almost everything is interrelated. 1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid. 2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them. 3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state. 4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances. 5) Remove a gratuitous buffer wakeup in vfs_vmio_release. 6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version. 7) The page busy count wakeups associated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation. 8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed. 9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals. 10) Correct and clean-up spec_getpages. 11) Implement a fully functional nfs_getpages, nfs_putpages. 12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.) 13) Properly support MS_INVALIDATE on NFS. 
14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean. 15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads. 16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.) 17) Correct the design and usage of vm_page_sleep. It didn't handle consistency problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify its status (always.) 18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up. 19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush. 20) Fix vm_pager_put_pages and its descendants to support an int flag instead of a boolean, so that we can pass down the invalidate bit.	1998-03-07 21:37:31 +00:00
Tor Egge	1146c3560f	The APs now reload the interrupt descriptor table pointer after f00f_hack has run. Use the global r_idt descriptor in f00f_hack when in SMP mode, so the APs find the relocated interrupt descriptor table. Submitted by: Partially from David A Adkins <adkin003@tc.umn.edu>	1998-03-07 20:16:49 +00:00
John Dyson	9b2e5bad34	Some kern_lock code improvements. Add missing wakeup, and enable disabling some diagnostics when memory or speed is at a premium.	1998-03-07 19:25:34 +00:00
Bruce Evans	0b8a3ff790	Set the input and output buffer sizes and the input buffer watermarks dynamically depending on the line speed(s). This should give the old sizes and watermarks until drivers are changed. Display the input watermarks in pstat and sicontrol.	1998-03-07 15:36:29 +00:00
Peter Dufault	917e476dad	Reviewed by: msmith, bde long ago POSIX.4 headers and sysctl variables. Nothing should change unless POSIX4 is defined or _POSIX_VERSION is set to 199309.	1998-03-04 10:27:00 +00:00
Peter Dufault	644d85f4ca	Reviewed by: msmith, bde long ago Fix for RTPRIO scheduler to eliminate invalid context switches. POSIX.4 headers and sysctl variables. Nothing should change unless POSIX4 is defined or _POSIX_VERSION is set to 199309.	1998-03-04 10:25:55 +00:00
John Dyson	a638dbdbf4	Fix a rounding error for the NFS buffer validend. Submitted by: John W. De Boskey <jwd@unx.sas.com>	1998-03-04 03:17:30 +00:00
Tor Egge	02c1dc3bbc	When entering the apic version of slow interrupt handler, level interrupts are masked, and EOI is sent iff the corresponding ISR bit is set in the local apic. If the CPU cannot obtain the interrupt service lock (currently the global kernel lock) the interrupt is forwarded to the CPU holding that lock. Clock interrupts now have higher priority than other slow interrupts.	1998-03-03 22:56:30 +00:00
Tor Egge	3163861c7b	Forward the signal if the process runs on a different CPU. This reduces the signal handling latency for cpu-bound processes that perform very few system calls. The IPI for forcing an additional software trap is no longer dependent upon BETTER_CLOCK being defined.	1998-03-03 20:55:26 +00:00
Tor Egge	fe9cd27373	Reduce timeout before assuming that forwarding of hardclock or softclock failed. Don't complain on forwarding failure, unless BETTER_CLOCK_DIAGNOSTIC is defined.	1998-03-03 20:09:14 +00:00
Peter Wemm	c8a7999933	Update the ELF image activator to use some of the exec resources rather than rolling its own. This means that it now uses the "safe" exec_map_first_page() to get the ld.so headers rather than risking a panic on a page fault failure (eg: NFS server goes down). Since all the ELF tools go to a lot of trouble to make sure everything lives in the first page for executables, this is a win. I have not seen any ELF executable on any system where all the headers didn't fit in the first page with lots of room to spare. I have been running variations of this code for some time on my pure ELF systems.	1998-03-02 05:47:58 +00:00
John Dyson	59228495d7	Change vfs.ioopt default back to '0'.	1998-03-01 23:07:45 +00:00
Mike Smith	34bdbbd0de	The intent is to get rid of WILLRELE in vnode_if.src by making a complement to all ops that return a vpp, VFS_VRELE. This is initially only for file systems that implement the following ops that do a WILLRELE: vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link, vop_rename, vop_mkdir, vop_rmdir, vop_symlink This is initial DNA that doesn't do anything yet. VFS_VRELE is implemented but not called. A default vfs_vrele was created for fs implementations that use the standard vnode management routines. VFS_VRELE implementations were made for the following file systems: Standard (vfs_vrele) ffs mfs nfs msdosfs devfs ext2fs Custom union umapfs Just EOPNOTSUPP fdesc procfs kernfs portal cd9660 These implementations may change as VOP changes are implemented. In the next phase, in the vop implementations calls to vrele and the vrele part of vput will be moved to the top layer vfs_vnops and made visible to all layers. vput will be replaced by unlock in these cases. Unlocking will still be done in the per fs layer but the refcount decrement will be triggered at the top because it doesn't hurt to hold a vnode reference a little longer. This will have minimal impact on the structure of the existing code. This will only be done for vnode arguments that are released by the various fs vop implementations. Wider use of VFS_VRELE will likely require restructuring of the code. Reviewed by: phk, dyson, terry et. al. Submitted by: Michael Hancock <michaelh@cet.co.jp>	1998-03-01 22:46:53 +00:00
Guido van Rooij	4049a04253	Make sure that you can only bind a more specific address when it is done by the same uid. Obtained from: OpenBSD	1998-03-01 19:39:29 +00:00
John Dyson	ffc82b0a70	1) Use a more consistent page wait methodology. 2) Do not unnecessarily force page blocking when paging pages out. 3) Further improve swap pager performance and correctness, including fixing the paging in progress deadlock (except in severe I/O error conditions.) 4) Enable vfs_ioopt=1 as a default. 5) Fix and enable the page prezeroing in SMP mode. All in all, SMP systems especially should show a significant improvement in "snappyness."	1998-03-01 04:18:54 +00:00
Guido van Rooij	d3c0af6943	Raise ncallout from NPROC + 16 to NPROC + 16 + MAXFILES. This should prevent a possible DoS attack. The proper fix (to dynamically grow the callout list) is in the make. Submitted by: Paul Traina	1998-02-27 19:58:29 +00:00
Bruce Evans	5132080e71	Removed unused #includes.	1998-02-25 13:08:07 +00:00
Bruce Evans	57518a4e83	Removed a stale comment and staler code.	1998-02-25 06:30:15 +00:00
Bruce Evans	79aa4f4704	Don't depend on "implicit int" or bloat the data section in the declaration of ptc_devsw_installed. Fixed a spelling error.	1998-02-25 06:19:15 +00:00
Bruce Evans	2094493a6c	Don't depend on "implicit int".	1998-02-25 06:16:37 +00:00
Bruce Evans	8f03c6f18f	Declare function pointer args as pointers, not as functions.	1998-02-25 06:13:32 +00:00
Bruce Evans	a0d38b495a	Fixed a missing newline in a debugging printf. Fixed punctuation in some comments.	1998-02-25 06:04:46 +00:00
Bruce Evans	6b16931c00	Removed unused #includes.	1998-02-25 05:58:50 +00:00
Bruce Evans	9c8fff87fc	Fixed the calculation of `delta' in settime(). We once set all times consistently wrong (up to 1 tick too late), but recent changes fixed the setting of the main clock, making other times inconsistent. The inconsistencies tended to show up as a negative resource usage for the process that set the time. Fixed the check for setting the clock backwards. A stale timestamp (`time') was checked, so it was possible to set the clock backwards by up to almost 1 tick. Until recently, this bug was compensated for by setting the clock consistently wrong. Merged the comment about setting the clock backwards from Lite2. Removed latency micro-optimizations/speed pessimizations in settime(). microtime() and set_timecounter() are relatively expensive, and they must be called together with clock updates blocked to get a consistent `delta', so significant latency optimizations are not possible. Removed some stale comments.	1998-02-25 04:10:32 +00:00
John Dyson	8a58a9f6c9	Try to dynamically size the VM_KMEM_SIZE (but it can still be overridden in the same way as before.) I had problems with the system properly handling the number of vnodes when there is a lot of system memory, and the default VM_KMEM_SIZE. Two new options "VM_KMEM_SIZE_SCALE" and "VM_KMEM_SIZE_MAX" have been added to support better auto-sizing for systems with greater than 128MB.	1998-02-23 07:41:23 +00:00
John Dyson	64d3c7e32d	Clean-up the vget mechanism by permanently attaching VM objects to vnodes, therefore vget doesn't need to do so anymore. Other minor improvements include the temp free vnode queue obeying the VAGE flag and a printf that warns of to-be-removed code being executed.	1998-02-23 06:59:52 +00:00
Poul-Henning Kamp	7ec73f6417	Replace TOD clock code with more systematic approach. Highlights: * Simple model for underlying hardware. * Hardware basis for timekeeping can be changed on the fly. * Only one hardware clock responsible for TOD keeping. * Provides a real nanotime() function. * Time granularity: .232E-18 seconds. * Frequency granularity: .238E-12 s/s * Frequency adjustment is continuous in time. * Less overhead for frequency adjustment. * Improves xntpd performance. Reviewed by: bde, bde, bde	1998-02-20 16:36:17 +00:00
Bruce Evans	876a94ee2c	Staticized. Don't depend on "implicit int".	1998-02-20 13:52:15 +00:00
Bruce Evans	e31abede1f	Don't depend on "implicit int" or bloat the data section in the declaration of xxx_devsw_installed.	1998-02-20 13:46:58 +00:00
Bruce Evans	d68fa50ccb	Don't depend on "implicit int".	1998-02-20 13:37:40 +00:00
Bruce Evans	39e4376ba7	Removed unused #includes.	1998-02-20 13:11:54 +00:00
Bill Fenner	92f57d003c	Revert sosend() to its behavior from 4.3-Tahoe and before: if so_error is set, clear it before returning it. The behavior introduced in 4.3-Reno (to not clear so_error) causes potentially transient errors (e.g. ECONNREFUSED if the other end hasn't opened its socket yet) to be permanent on connected datagram sockets that are only used for writing. (soreceive() clears so_error before returning it, as does getsockopt(...,SO_ERROR,...).) Submitted by: Van Jacobson <van@ee.lbl.gov>, via a comment in the vat sources.	1998-02-19 19:38:20 +00:00
Eivind Eklund	d94f38ace2	Add HW_WDOG to LINT, and turn it into a new-style option.	1998-02-16 23:57:49 +00:00
Poul-Henning Kamp	15b7a47005	A bunch of nits from bde.	1998-02-15 14:15:21 +00:00
Poul-Henning Kamp	c7c9a816a1	Add a nanotime() function so that we can start to use this call.	1998-02-15 13:55:06 +00:00
Poul-Henning Kamp	9ada5a50f3	unifdef -UEXT_CLOCK, it is irrelevant. Fix a couple of nits from bde while here anyway.	1998-02-15 13:50:12 +00:00
Bruce Evans	338ca54caf	Fixed an aliasing bug. It was too easy to defeat the check for moving or shrinking an open partition (by changing the label for a compatibility slice while partitions on the corresponding real slice are open, or vice versa).	1998-02-15 05:41:31 +00:00
John Dyson	9f24f214c3	Make the rootdir handling more consistent. Now, processes always have a root vnode associated with them, and no special checks for the null case are needed. Submitted by: terry@freebsd.org	1998-02-15 04:17:09 +00:00
Eivind Eklund	be41061b8b	Make NO_LKM a new-style option. Forgotten by: dima	1998-02-12 18:02:07 +00:00
Dima Ruban	bd45deefaa	I'm not sure whether this is a correct way to do it, but here's a new kernel option - "NO_LKM" If anyone has better ideas - please let me know.	1998-02-11 20:47:55 +00:00
David Greenman	c78ab18a81	Fix a && that should be an &. Reviewed by: "John S. Dyson" <dyson@FreeBSD.ORG> Submitted by: jwd@unx.sas.com (John W. DeBoskey)	1998-02-11 20:06:48 +00:00
Eivind Eklund	e9fe146bb4	Include SIMPLELOCK_DEBUG functions even if SMP if compiling LINT; give an error for the combination if _not_ compiling LINT.	1998-02-11 00:05:26 +00:00
Eivind Eklund	cfa5673efd	Move include of <machine/ipl.h> inside ifndef SMP where it is used, to avoid getting 'unused include file' warnings in the SMP case.	1998-02-10 17:10:23 +00:00
KATO Takenori	1b11919b2b	Fixed vnode interlock handling. Reviewed by: Bruce Evans <bde@zeta.org.au> Tor Egge <Tor.Egge@idi.ntnu.no>	1998-02-10 02:54:24 +00:00
Eivind Eklund	303b270b0a	Staticize.	1998-02-09 06:11:36 +00:00
John Dyson	3217023e7c	Fix a problem with vn_lock in fsync.	1998-02-08 01:41:33 +00:00
KATO Takenori	16e3b0b67a	When the vp is locked, vget() calls vfs_object_create() with waslocked = TRUE. This change may fix lockmgr panic in umapfs/nullfs. PR: 5634 Reviewed by: "John S. Dyson" <toor@dyson.iquest.net> Suggested by: Bruce Evans <bde@zeta.org.au>	1998-02-07 08:44:31 +00:00
Eivind Eklund	0b08f5f737	Back out DIAGNOSTIC changes.	1998-02-06 12:14:30 +00:00
John Dyson	95461b450d	1) Start using a cleaner and more consistent page allocator instead of the various ad-hoc schemes. 2) When bringing in UPAGES, the pmap code needs to do another vm_page_lookup. 3) When appropriate, set the PG_A or PG_M bits a priori to both avoid some processor errata, and to minimize redundant processor updating of page tables. 4) Modify pmap_protect so that it can only remove permissions (as it originally supported.) The additional capability is not needed. 5) Streamline read-only to read-write page mappings. 6) For pmap_copy_page, don't enable write mapping for source page. 7) Correct and clean-up pmap_incore. 8) Cluster initial kern_exec pagein. 9) Removal of some minor lint from kern_malloc. 10) Correct some ioopt code. 11) Remove some dead code from the MI swapout routine. 12) Correct vm_object_deallocate (to remove backing_object ref.) 13) Fix dead object handling, that had problems under heavy memory load. 14) Add minor vm_page_lookup improvements. 15) Some pages are not in objects, and make sure that the vm_page.c can properly support such pages. 16) Add some more page deficit handling. 17) Some minor code readability improvements.	1998-02-05 03:32:49 +00:00
Eivind Eklund	47cfdb166d	Turn DIAGNOSTIC into a new-style option.	1998-02-04 22:34:03 +00:00
David Greenman	1540674007	Restrict idleprio to superuser: Realtime priority has to be restricted for reasons which should be obvious. However, for idle priority, there is a potential for system deadlock if an idleprio process gains a lock on a resource that other processes need (and the idleprio process can't run due to a CPU-bound normal process). Fix me! XXX PR: 5639	1998-02-04 18:43:10 +00:00
Bruce Evans	a3bdf7a34c	Fixed staticization.	1998-02-03 21:41:12 +00:00
Bruce Evans	a92ae47539	Updated generated files.	1998-02-03 17:52:21 +00:00
Bruce Evans	14f1d4260d	Fixed type of mincore().	1998-02-03 17:45:43 +00:00
Bruce Evans	125ff6b079	Generate a forward declaration of `struct proc' in <sys/sysproto.h>. Removed extra args to a printf. Fixed some style inconsistencies (unnecessary parentheses for printf). awk is not C.	1998-02-03 17:39:13 +00:00
John Dyson	5abb66d243	Return the vm_map in the eproc structure, so we can support more accurate VSZ display in PS.	1998-02-02 05:14:03 +00:00
John Dyson	eaf13dd73a	Change the busy page mgmt, so that when pages are freed, they MUST be PG_BUSY. It is bogus to free a page that isn't busy, because it is in a state of being "unavailable" when being freed. The additional advantage is that the page_remove code has a better cross-check that the page should be busy and unavailable for other use. There were some minor problems with the collapse code, and this plugs those subtle "holes." Also, the vfs_bio code wasn't checking correctly for PG_BUSY pages. I am going to develop a more consistent scheme for grabbing pages, busy or otherwise. For now, we are stuck with the current morass.	1998-01-31 11:56:53 +00:00
Eivind Eklund	3f2076daf5	Make the debug options new-style. This also zaps a DPT option from lint; it wasn't referenced from anywhere.	1998-01-31 07:23:16 +00:00
Eivind Eklund	e0d781f3a5	Make POWERFAIL_NMI, PPS_SYNC and NATM new style options. This also fixes a couple of defunct options; submitted by bde.	1998-01-31 05:00:21 +00:00
Tor Egge	d09a16d804	Update freevnodes when adding a vnode to the head of the free list.	1998-01-31 01:17:58 +00:00
Poul-Henning Kamp	c5b193bfba	Retire LFS. If you want to play with it, you can find the final version of the code in the repository under the tag LFS_RETIREMENT. If somebody makes LFS work again, adding it back is certainly desirable, but as it is now nobody seems to care much about it, and it has suffered considerable bitrot since its somewhat haphazard integration. R.I.P	1998-01-30 11:34:06 +00:00
Steve Price	694ad0a9b1	Fix a couple of operator precedence bugs. PR: 5450 Submitted by: Sakari Jalovaara <sja@tekla.fi>	1998-01-25 17:25:41 +00:00
John Dyson	33b90a70cd	Various NFS fixes: Make vfs_bio buffer mgmt work better. Buffers were being used after brelse. Make nfs_getpages work independently of other NFS interfaces. This eliminates some difficult recursion problems and decreases pagefault overhead. Remove an erroneous vfs_unbusy_pages. Fix a reentrancy problem, with nfs_vinvalbuf when vnode is already being rundown. Reassignbuf wasn't being called when needed under certain circumstances. (Thanks to Bill Paul for help.)	1998-01-25 06:24:09 +00:00
Eivind Eklund	7b778b5e61	Make all file-system (MFS, FFS, NFS, LFS, DEVFS) related options new-style. This introduces an xxxFS_BOOT for each of the rootable filesystems. (Presently not required, but encouraged to allow a smooth move of option *FS to opt_dontuse.h later.) LFS is temporarily disabled, and will be re-enabled tomorrow.	1998-01-24 02:54:56 +00:00
John Dyson	50ce7ff499	Add better support for larger I/O clusters, including larger physical I/O. The support is not mature yet, and some of the underlying implementation needs help. However, support does exist for IDE devices now.	1998-01-24 02:01:46 +00:00
John Dyson	2d8acc0f4a	VM level code cleanups. 1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous. 2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts. 3) Remove the "wired" vm_map nonsense. 4) No need to keep a cache of kernel stack kva's. 5) Get rid of strange looking ++var, and change to var++. 6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory. 7) Keep ioopt disabled for now. 8) Remove the now bogus "single use" map concept. 9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur. 10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.) 11) Fix some vnode locking problems. (From Tor, I think.) 12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.) 13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneeded collapse operations, and clean up the cluster code. 14) Make vm_zone more suitable for TSM. This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.) This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)	1998-01-22 17:30:44 +00:00
Bruce Evans	ffbb164e19	Set p_retval for the correct process in getpriority(). This fixes a null pointer panic when the pointer for the incorrect process is NULL. getpriority() was broken in rev.1.27. Rev.1.28 broke the warning instead of fixing the problem. PR: 5495	1998-01-19 12:39:00 +00:00
John Dyson	4722175765	Tie up some loose ends in vnode/object management. Remove an unneeded config option in pmap. Fix a problem with faulting in pages. Clean-up some loose ends in swap pager memory management. The system should be much more stable, but all subtle bugs aren't fixed yet.	1998-01-17 09:17:02 +00:00
Poul-Henning Kamp	6f70df1587	Move almost all the ntp related stuff from kern_clock.c to kern_ntptime.c. The only bit left over is that which is executed in all calls to hardclock(). Various cleanups and staticizing along the road.	1998-01-14 20:48:16 +00:00
Poul-Henning Kamp	7907a6bc55	Make softticks static. Remove unneeded stuff.	1998-01-14 19:42:47 +00:00
John Dyson	53f6f08545	Fix another vnode leak.	1998-01-12 03:15:01 +00:00
John Dyson	925a3a419a	Fix some vnode management problems, and better mgmt of vnode free list. Fix the UIO optimization code. Fix an assumption in vm_map_insert regarding allocation of swap pagers. Fix an spl problem in the collapse handling in vm_object_deallocate. When pages are freed from vnode objects, and the criteria for putting the associated vnode onto the free list is reached, either put the vnode onto the list, or put it onto an interrupt safe version of the list, for further transfer onto the actual free list. Some minor syntax changes changing pre-decs, pre-incs to post versions. Remove a bogus timeout (that I added for debugging) from vn_lock. PHK will likely still have problems with the vnode list management, and so do I, but it is better than it was.	1998-01-12 01:46:33 +00:00
John Dyson	1616db3cf8	Implement the first page access for object type determination more VM clean. Also, use vm_map_insert instead of vm_mmap. Reviewed by: dg@freebsd.org	1998-01-11 21:35:38 +00:00
Poul-Henning Kamp	bb303fe246	Try to solve timeout race by not touching softtics here.	1998-01-11 19:07:58 +00:00
Poul-Henning Kamp	55c449bc0f	Fix softclock calling so we don't lose timeouts (I broke this ~10h ago)	1998-01-11 00:44:31 +00:00
Poul-Henning Kamp	eeb355f73f	Whoops. softclock is called from doreti_swi as well. Abandon call from hardclock(). Forgot this: Pointed hat sent by: bd	1998-01-10 14:55:14 +00:00
Poul-Henning Kamp	a50ec50568	Effect the divorce of kern_clock.c and kern_timeout.c (which was repository copied from kern_clock.c)	1998-01-10 13:16:26 +00:00
Eivind Eklund	e4f4247a08	Make the BOOTP family new-style options (in opt_bootp.h)	1998-01-09 03:21:07 +00:00
Poul-Henning Kamp	9cf25fbec7	Improve hardpps readability a bit: * Rename usec to p_usec so you can search for it. * Macroize the huge median_of_3_samples if statement.	1998-01-07 12:29:17 +00:00
John Dyson	857d737ed6	Disable io optimizations again, minor bug found, and will be fixed in a few days.	1998-01-07 09:26:29 +00:00
John Dyson	95e5e988e0	Make our v_usecount vnode reference count work identically to the original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does. When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex. Vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore. A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes. Please be careful with your system while running this code, but I would greatly appreciate feedback as soon as reasonably possible.	1998-01-06 05:26:17 +00:00
Alexander Langer	de17eb59b4	Added missing caddr_t --> void * conversions for sys/mman.h functions. Submitted by: bde	1998-01-01 17:07:46 +00:00
Bruce Evans	b1679c0f7e	Use a real malloc type for M_LINKER instead of #defining it as M_TEMP. Fixed a comment.	1998-01-01 08:56:24 +00:00
John Dyson	483140ead1	Add the vnode interlock back around vref.	1997-12-29 16:54:03 +00:00
Bruce Evans	b2ef07b7a2	Fixed style bugs in previous commit.	1997-12-29 08:54:52 +00:00
John Dyson	60f8d46448	Fix the decl of vfs_ioopt, allow LFS to compile again, fix a minor problem with the object cache removal.	1997-12-29 01:03:55 +00:00
John Dyson	2be70f79f6	Lots of improvements, including restructuring the caching and management of vnodes and objects. There are some metadata performance improvements that come along with this. There are also a few prototypes added when the need is noticed. Changes include: 1) Cleaning up vref, vget. 2) Removal of the object cache. 3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore. 4) Correct some missing LK_RETRY's in vn_lock. 5) Correct the page range in the code for msync. Be gentle, and please give me feedback asap.	1997-12-29 00:25:11 +00:00
Bruce Evans	f82057be9e	Handle "%...p" as "%#...x" instead of "0x%...x". This is a quick fix for field widths being 2 larger than specified for "%<number>p". Only printing of null pointers is "wrong" now (it is actually "right", but inconsistent with printf(3)).	1997-12-28 05:03:33 +00:00
Bruce Evans	64cfdf460e	Restored used include of <sys/malloc.h>. malloc() is not used here, but kmem_malloc() is used and it takes the same "flags" as malloc(). Use the mbuf allocation "flags" M_WAIT and M_DONTWAIT consistently. There is really only one boolean flag, M_DONTWAIT, but the "flags" were always treated as enum-like values, except in some places here where the values are tacitly converted to boolean flags. Treat them as enum-like values everywhere, except where we tacitly assume that there are only two values in order to convert them to the corresponding two kmem_malloc() "flags".	1997-12-28 01:01:13 +00:00
Bruce Evans	675ea6f083	Unspammed nested include of <vm/vm_zone.h>.	1997-12-27 02:56:39 +00:00
Poul-Henning Kamp	71f461f86a	Rename "i586_ctr" to "tsc" (both upper and lower case instances). Fix a couple of printfs too. Warning: This changes the names of a couple of kernel options!	1997-12-26 20:42:37 +00:00
Gary Palmer	b3b84d9b17	Make kern.ncpu report the number of detected processors when running with a SMP kernel.	1997-12-25 13:14:21 +00:00
Nate Williams	e1d6dc656d	This patch causes the "calltodo" timer list to be decremented by the amount of time that the laptop was suspending. Thus, select() calls that might have suspended rather than firing at 1hr + "time suspended" since the timer was posted. Adding: options APM_FIXUP_CALLTODO to the kernel config enables the patch. [ This patch was slightly modified to use a consistent indent style and I removed some unused local variables. After this has been tested a few weeks we'll make the options the default, so for now I'm now documenting it in LINT. Mike can later if he wants. ] Reviewed by: Mike Smith <msmith@freebsd.org> Submitted by: Ken Key <key@cs.utk.edu>	1997-12-23 16:32:35 +00:00
John Dyson	6d94bea461	Improve my copyright.	1997-12-22 11:54:00 +00:00
Sean Eric Fagan	d5f81602a7	Clear the p_stops field on change of user/group id, unless the correct flag is set in the p_pfsflags field. This, essentially, prevents an SUID program from hanging after being traced. (E.g., "truss /usr/bin/rlogin" would fail, but leave rlogin in a stopevent state.) Yet another case where procctl is (hopefully ;)) no longer needed in the general case. Reviewed by: bde (thanks bruce :))	1997-12-20 03:05:47 +00:00
Bruce Evans	214279cec9	Use __inline instead of inline to prevent pedantic compiler warnings.	1997-12-19 23:25:16 +00:00
Bruce Evans	1aa9ea7cb9	Removed some bogus casts.	1997-12-19 23:18:37 +00:00
John Dyson	1efb74fbcc	Some performance improvements, and code cleanups (including changing our expensive OFF_TO_IDX to btoc whenever possible.)	1997-12-19 09:03:37 +00:00
Garrett Wollman	e51d1e8707	Revert poll() for UFS files to traditional behavior where polling for read- or writability always returns true. This works around bugs in netscape and squid, at a minimum.	1997-12-17 14:44:23 +00:00
Eivind Eklund	4027779c22	Regenerate after changing makesyscalls.sh.	1997-12-16 22:27:22 +00:00
Eivind Eklund	0b3c4d50c2	Move around opt_compat include to accommodate Linulator brokenness (for the time being).	1997-12-16 18:51:45 +00:00
Eivind Eklund	5591b823d1	Make COMPAT_43 and COMPAT_SUNOS new-style options.	1997-12-16 17:40:42 +00:00
David Greenman	c7ce9e2634	Fix bug where a struct buf was free()'d back to the system malloc pool. Quite amazing that the system runs at all with this bug. Also present in 2.2.5. The bug appears to have come in with changes in rev 1.53. PR: might fix PR#5313 Submitted by: bde	1997-12-16 15:40:29 +00:00
Garrett Wollman	1cbbd625cc	Add support for poll(2) on files. vop_nopoll() now returns POLLNVAL if one of the new poll types is requested; hopefully this will not break any existing code. (This is done so that programs have a dependable way of determining whether a filesystem supports the extended poll types or not.) The new poll types added are: POLLWRITE - file contents may have been modified POLLNLINK - file was linked, unlinked, or renamed POLLATTRIB - file's attributes may have been changed POLLEXTEND - file was extended Note that the internal operation of poll() means that it is impossible for two processes to reliably poll for the same event (this could be fixed but may not be worth it), so it is not possible to rewrite `tail -f' to use poll at this time.	1997-12-15 03:09:59 +00:00
Mike Smith	0bec68bf7c	Consult sa_len before trampling it with MSG_COMPAT set. PR: kern/5291 Submitted by: pb@fasterix.freenix.org (Pierre Beyssac)	1997-12-15 02:29:11 +00:00
Tor Egge	5c623cb649	Add support for low resolution SMP kernel profiling. - A nonprofiling version of s_lock (called s_lock_np) is used by mcount. - When profiling is active, more registers are clobbered in seemingly simple assembly routines. This means that some callers needed to save/restore extra registers. - The stack pointer must have space for a 'fake' return address in idle, to avoid stack underflow.	1997-12-15 02:18:35 +00:00
Tor Egge	549a42942d	Don't forward hardclock or statclock to stopped cpus. Disable forwarding when a panic has occurred.	1997-12-15 01:14:10 +00:00
John Polstra	a8e99ec909	Make gzipped dynamically linked executables work again. There was an old bug here that failed to copy the a.out header into memory properly. It didn't matter until changes were made recently to the dynamic linker.	1997-12-14 19:36:24 +00:00
Mike Smith	5af7db2b73	As described by the submitter: ... fix a bug with orecvfrom() or recvfrom() called with the MSG_COMPAT flag on kernels compiled with the COMPAT_43 option. The symptom is that the fromaddr is not correctly returned. This affects the Linux emulator. Submitted by: pb@fasterix.freenix.org (Pierre Beyssac)	1997-12-14 03:15:21 +00:00
John Dyson	8256655132	After one of my analysis passes to evaluate methods for SMP TLB mgmt, I noticed some major enhancements available for UP situations. The number of UP TLB flushes is decreased much more than significantly with these changes. Since a TLB flush appears to cost minimally approx 80 cycles, this is a "nice" enhancement, equiv to eliminating between 40 and 160 instructions per TLB flush. Changes include making sure that kernel threads all use the same PTD, and eliminate unneeded PTD switches at context switch time.	1997-12-14 02:11:23 +00:00
Tor Egge	80db913bd6	Add needed #include. Problem found by: Bruce Evans <bde@zeta.org.au>	1997-12-12 21:45:23 +00:00
John Dyson	74b2192ae6	We have had support for running the kernel daemons as threads for quite a while, but forgot to do so. For now, this code supports most daemons running as kernel threads in UP kernels, and as full processes in SMP. We will soon be able to run them as threads in SMP, but not yet.	1997-12-12 04:00:59 +00:00
John Dyson	648899413d	Quiet some lint.	1997-12-10 04:14:23 +00:00
Steve Passe	eae8fc2c8a	The improvements to clock statistics by Tor Egge. Wrappered and enabled by the define BETTER_CLOCK (on by default in smptests.h) Reviewed by: smp@csn.net Submitted by: Tor Egge <Tor.Egge@idi.ntnu.no>	1997-12-08 23:00:24 +00:00
John-Mark Gurney	0c2a51b49e	add process id to tmp files... this prevents two runs from stomping over each other's tmp files... (usr.bin/truss uncovered this bug)	1997-12-08 09:00:47 +00:00
John Dyson	78922e413c	Correct prototypes to match POSIX. Correct return code for aio_cancel. Submitted by: Alex Nash <nash@mcs.com>	1997-12-08 02:18:25 +00:00
Sean Eric Fagan	847e5f5f9a	Use at_exit() to invoke procfs_exit() instead of calling it directly. Note that an unload facility should be used to call rm_at_exit() (if procfs is being loaded as an LKM and is subsequently removed), but it was non-obvious how to do this in the VFS framework. Reviewed by: Julian Elischer	1997-12-08 01:06:36 +00:00
Sean Eric Fagan	ed1b05436a	Surround the call to procfs_exit() by #ifdef PROCFS/#endif -- much to my surprise, procfs actually is optional, and some people truly do generate kernels without it. Wow. I built a kernel without 'options PROCFS' and it compiled and linked.	1997-12-07 18:16:43 +00:00
John Dyson	f2e6e69d92	Slight performance improvement, removal of unneeded SPLs.	1997-12-07 04:06:41 +00:00
Bruce Evans	df1c78063c	Use ENOIOCTL instead of -1 (= ERESTART) for diskslice ioctls that are not handled at a particular level.	1997-12-06 14:27:56 +00:00
Bruce Evans	239b7b699e	Use ENOIOCTL instead of -1 (= ERESTART) for tty ioctls that are not handled at a particular level. This fixes mainly restarting of interrupted TIOCDRAINs and TIOCSETA{W,F}s.	1997-12-06 13:25:01 +00:00
Sean Eric Fagan	2a024a2b05	Changes to allow event-based process monitoring and control.	1997-12-06 04:11:14 +00:00
Bruce Evans	1cd52ec333	Don't include <sys/lock.h> in headers when only `struct simplelock' is required. Fixed everything that depended on the pollution.	1997-12-05 19:55:52 +00:00
John Dyson	d4060a8751	Some fixes from John Hood: 1) Fix the initialization of malloc structure that changed due to perf opt. 2) Remove unneeded include. 3) An initialization assert added to malloc. Submitted by: John Hood <cgull@smoke.marlboro.vt.us>	1997-12-05 05:36:58 +00:00
John-Mark Gurney	4d9deedb49	document and make the NO_F00F_HACK a proper option... also, sort some option includes while I'm here.. Forgotten by: sef	1997-12-04 21:21:26 +00:00
Jordan K. Hubbard	e41b6f2db7	After consultation with David, change #ifndef NO_F00F_HACK to #if defined(I586_CPU) && !defined(NO_F00F_HACK)	1997-12-04 14:35:40 +00:00
Sean Eric Fagan	c4fbf2774d	Work around for the Intel Pentium F00F bug; this is Intel's recommended workaround. Note that this currently eats up two pages extra in the system; this could be alleviated by aligning idt correctly, and then only dealing with that (as opposed to the current method of allocated two pages and copying the IDT table to that, and then setting that to be the IDT table).	1997-12-03 02:45:50 +00:00
Poul-Henning Kamp	ab3f746966	In all such uses of struct buf: 's/b_un.b_addr/b_data/g'	1997-12-02 21:07:20 +00:00
Bruce Evans	52aef196f7	Cleaned up __getcwd(). This should be cosmetic except disabled calls are now counted. Reviewed by: phk	1997-12-02 10:32:21 +00:00
John Dyson	b4b3edc1f4	Fix a serious problem during resizing buffers where old buffers address space wasn't being properly reclaimed. Submitted by: Bruce Evans <bde@freebsd.org>	1997-12-01 19:04:00 +00:00
John Dyson	e499ed6f86	Fix a problem when creating a new kernel thread. In some cases, aio_read or aio_write can return the pid of the new thread. This is due to the way that return values from system calls are now passed by side-effect in the proc structure. This commit fixes the problem with aio_read and aio_write.	1997-12-01 18:41:08 +00:00
Julian Elischer	1b0493988c	Cleanup my last patch here Reviewed by: sef@kthrup.com and phk@freebsd.org	1997-12-01 11:34:41 +00:00
John Dyson	11783b142b	Fix error handling for VCHR type I/O. Also, fix another spl problem, and remove a lot of overly verbose debugging statements.	1997-12-01 07:01:45 +00:00
John Dyson	f4f0ecefab	Correct a last minute code change. Would have been an infinite loop under certain error conditions. Submitted by: pst@shockwave.com	1997-11-30 23:21:08 +00:00
John Dyson	c5efdcbdec	Fix an spl nit.	1997-11-30 21:47:36 +00:00
John Dyson	84af4da65a	Finish up the vast majority of the AIO/LIO functionality. Proper signal support was missing in the previous version of the AIO code. More tunables added, and very efficient support for VCHR files has been added. Kernel threads are not used for VCHR files, all work for such files is done for the requesting process directly. Some attempt has been made to charge the requesting process for resource utilization, but more work is needed. aio_fsync is still missing (but the original fsync system call can be used for now.) aio_cancel is essentially a noop, but that is okay per POSIX. More aio_cancel functionality can be added later, if it is found to be needed. The functions implemented include: aio_read, aio_write, lio_listio, aio_error, aio_return, aio_cancel, aio_suspend. The code has been implemented to support the POSIX spec 1003.1b (formerly known as POSIX 1003.4 spec) features of the above. The async I/O features are truly async, with the VCHR mode of operation being essentially the same as physio (for appropriate files) for maximum efficiency. This code also supports the signal capability, is highly tunable, allowing management of resource usage, and has been written to allow a per process usage quota. Both the O'Reilly POSIX.4 book and the actual POSIX 1003.1b document were the reference specs used. Any filedescriptor can be used with these new system calls. I know of no exceptions where these system calls will not work. (TTY's will also probably work.)	1997-11-30 04:36:31 +00:00
John Dyson	f4feb04e1f	Disable the VCHR optimization for AIO until I have implemented it. Just in case anyone wants to play with the POSIX AIO/LIO stuff. (As it is, it should work with ANY vnode, on UP systems only, for now.)	1997-11-29 02:57:46 +00:00
John Dyson	fd3bf77574	Fix and complete the AIO syscalls. There are some performance enhancements coming up soon, but the code is functional. Docs will be forthcoming.	1997-11-29 01:33:10 +00:00
Julian Elischer	95802bf803	Shift a few SYSINIT() calls around. This results in a few functions becoming static, and the SYSINITs being close to the code they are related to. Setting up the dump device is with dumpsys() and kicking off the scheduler is with the scheduler. Mounting root is with the code that does it. Reviewed by: phk	1997-11-25 07:07:48 +00:00
Bruce Evans	c463cf1cae	Fixed multiple definitions of boothowto. Fixed bitrot in the read-only access to kern.boottime.	1997-11-24 18:35:04 +00:00
Bruce Evans	b672aa4ba6	Removed all traces of P_IDLEPROC. It was tested but never set.	1997-11-24 15:15:33 +00:00
Bruce Evans	21e5241572	Fixed some #include messes. Hid the check of the user %cs in syscall() under `#ifdef DIAGNOSTIC'.	1997-11-24 13:25:37 +00:00
John Dyson	4ced7dd5bf	Avoid manipulating the buffer map at interrupt time by deferring bfreekva to getnewbuf, and remove from brelse. Reviewed by: dg@root.com	1997-11-24 06:18:27 +00:00
John Dyson	289500ad9e	Fix the buffer flag frobbing. Note: It is invalid to gratuitously modify b_flags, and this patch removes unneeded modifications. Only the needed b_flags bits are modified now. (Specifically, it is usually wrong to zero b_flags.) Submitted by: bde@freebsd.org	1997-11-24 04:14:21 +00:00
Bruce Evans	a3c78a768e	Fixed a missing conversion of retval to p_retval in disabled code. Fixed overflow of FFLAGS() in fcntl(F_SETFL, ...). This was not a security hole, but gave wrong results for silly flags values. E.g., it made fcntl(F_SETFL, -1) equivalent to fcntl(F_SETFL, 0). POSIX requires ignoring the open mode bits in fcntl() (even if they would be invalid for open()).	1997-11-23 12:24:59 +00:00
Bruce Evans	d826c47904	Fixed duplicate definitions of M_FILE (one static).	1997-11-23 10:43:49 +00:00
Bruce Evans	2087c8967c	Fixed some style bugs in the poll() code. Removed dead code to "Avoid inadvertently sleeping forever". hzto() never returns 0.	1997-11-23 10:30:50 +00:00
Bruce Evans	cb451ebdbd	Staticized.	1997-11-22 08:35:46 +00:00
Bruce Evans	865737f450	Staticized. Use OID_AUTO instead of a magic number for the debug.syncprt sysctl. (This sysctl doesn't actually work. FreeBSD nuked it, but parts of it were mismerged from Lite2. It is not very good, but better than nothing.)	1997-11-22 06:41:21 +00:00
Bruce Evans	d02601f8cf	Fixed rev.1.81. mp->mnt_kern_flag was restored in the non-error case of `mount -u'. This only matters for `mount -u' competing with unmounts. If I understand the locking correctly: if mount() blocks, then unmount() may run and set mp->kern_flag for the same mp. Then unmount() blocks waiting for mount() to finish. When unmount() continues, its MNTK flags (MNTK_UNMOUNT and MNTK_MWAIT) may have been clobbered. Didn't fix old bugs: - restoring mp->mnt_kern_flag is wrong for the same reasons in the error case. - the error case of unmount() seems to be broken too: (a) MNTK_UNMOUNT gets clobbered, although another unmount() may have set it. Perhaps it shouldn't be set until after the full lock is acquired. (b) MNTK_MWAIT isn't honoured. Fixed a nearby style bug.	1997-11-22 06:10:36 +00:00
Bruce Evans	6881d20fbb	Const poisoning from ks_shortdesc.	1997-11-21 11:37:03 +00:00
Bruce Evans	d73424aa6b	Fixed a sloppy common-style definitions.	1997-11-20 20:07:59 +00:00
Bruce Evans	01166e9245	Avoid passing a `retval' to wait1() Disallow wait options that are not a combination of the standard POSIX options WUNTRACED and WNOHANG, as is required by POSIX. BSD doesn't have any extensions here, but the code was `#ifdef notyet' for some reason.	1997-11-20 19:09:43 +00:00
Bruce Evans	be67169a57	Removed unused includes. Staticized. Avoid passing a `retval' to fork1(). Fixed some style bugs.	1997-11-20 16:36:17 +00:00
Bruce Evans	d96bc99dac	Removed unused #includes. Ifdefed a conditionally used #include. Fixed nonblocking mode. It was per-device instead of per-file. Don't depend on gcc's misfeature of rewriting char args in old-style function definitions to match wrong prototypes. Break K&R1 support to fix this quickly.	1997-11-18 16:12:51 +00:00
Bruce Evans	fc8f7066d2	Get buffer stuff by #including <sys/buf.h> instead of <sys/vnode.h>. Staticized boot(). Fixed a gratuitous ANSIism.	1997-11-18 15:16:43 +00:00
Bruce Evans	d662024a0d	Removed an unused #include. Fixed a style bug (one of many KNF breakages in vfs_subr.c moved here).	1997-11-18 13:03:48 +00:00
Bruce Evans	1fe5398ce7	Get tty ioctl numbers by #including <sys/ttycom.h> instead of <sys/tty.h>. Don't #include <sys/fcntl.h> (the select -> poll changes removed all dependencies on it).	1997-11-18 12:59:09 +00:00
Bruce Evans	b7f5d5b54d	Removed an unused #include. Added an unsed #include of <sys/ucred.h> to prepare for not including it in <sys/param.h>. Moved conditionally used #includes inside an ifdef.	1997-11-18 12:52:10 +00:00
Bruce Evans	e4474ce8bf	Removed an unused #include. Ifdefed a conditionally used #include.	1997-11-18 12:43:41 +00:00
Bruce Evans	99af8d4e43	Removed unused #include.	1997-11-18 12:24:22 +00:00
Bruce Evans	fdebd4f0fd	Get locking stuff by #including <sys/lock.h> instead of <sys/vnode.h>.	1997-11-18 10:02:40 +00:00
Peter Wemm	6245570f67	Don't generate new prototype files with the extra int retval[] arg at the end since phk deleted them. Forgotten by: phk	1997-11-18 03:34:39 +00:00
Julian Elischer	52bf64c787	Reviewed by: hackers@freebsd.org in general Obtained from: Whistle Communications tree Add an option to the way UFS works dependent on the SUID bit of directories. This change makes things a whole lot simpler on systems running as fileservers for PCs and MACS. To enable the new code you must 1/ enable option SUIDDIR on the kernel. 2/ mount the filesystem with option suiddir. Hopefully this makes it difficult enough for people to do this accidentally. See the new chmod(2) man page for detailed info.	1997-11-13 00:28:51 +00:00
Tor Egge	d72ec6655e	Set return value for the correct process in ptrace().	1997-11-12 12:28:12 +00:00
Julian Elischer	b1f4a44b03	Reviewed by: various. Ever since I first saw the way the mount flags were used I've hated the fact that modes, and events, internal and exported, and short-term and long term flags are all thrown together. Finally it's annoyed me enough.. This patch to the entire FreeBSD tree adds a second mount flag word to the mount struct. it is not exported to userspace. I have moved some of the non exported flags over to this word. this means that we now have 8 free bits in the mount flags. There are another two that might well move over, but which I'm not sure about. The only user visible change would have been in pstat -v, except that davidg has disabled it anyhow. I'd still like to move the state flags and the 'command' flags apart from each other.. e.g. MNT_FORCE really doesn't have the same semantics as MNT_RDONLY, but that's left for another day.	1997-11-12 05:42:33 +00:00
Jordan K. Hubbard	64bd2f7b57	MF22: MSG_EOR bug fix. Submitted by: wollman	1997-11-09 05:07:40 +00:00
Tor Egge	31e5225482	Use UPAGES when setting up private pages for SMP (which includes idle stack).	1997-11-07 19:58:34 +00:00
Poul-Henning Kamp	0abc78a697	Rename some local variables to avoid shadowing other local variables. Found by: -Wshadow	1997-11-07 09:21:01 +00:00
Poul-Henning Kamp	4a11ca4e29	Remove a bunch of variables which were unused both in GENERIC and LINT. Found by: -Wunused	1997-11-07 08:53:44 +00:00

2258 Commits