8463 Commits

Author SHA1 Message Date
des
a032109782 Move the definition of PT_[GS]ET{,DB,FP}REGS from the MD ptrace.h to the
MI ptrace.h, since all platforms define them.  Keep the MD ptrace.h around
for FIX_SSTEP (which is currently only needed on Alpha).
2002-03-16 00:25:53 +00:00
alfred
2c16fbdd2a Fixes to make select/poll mpsafe.
Problem:
  selwakeup required calling pfind which would cause lock order
  reversals with the allproc_lock and the per-process filedesc lock.
Solution:
  Instead of recording the pid of the select()'ing process into the
  selinfo structure, actually record a pointer to the thread.  To
  avoid dereferencing a bad address all the selinfo structures that
  are in use by a thread are kept in a list hung off the thread
  (protected by sellock).  When a selwakeup occurs the selinfo is
  removed from that threads list, it is also removed on the way out
  of select or poll where the thread will traverse its list removing
  all the selinfos from its own list.

Problem:
  Previously the PROC_LOCK was used to provide the mutual exclusion
  needed to ensure proper locking, this couldn't work because there
  was a single condvar used for select and poll and condvars can
  only be used with a single mutex.
Solution:
  Introduce a global mutex 'sellock' which is used to provide mutual
  exclusion when recording events to wait on as well as performing
  notification when an event occurs.

Interesting note:
  schedlock is required to manipulate the per-thread TDF_SELECT
  flag, however if given its own field it would not need schedlock,
  also because TDF_SELECT is only manipulated under sellock one
  doesn't actually use schedlock for syncronization, only to protect
  against corruption.

Proc locks are no longer used in select/poll.

Portions contributed by: davidc
2002-03-14 01:32:30 +00:00
phk
b32f8d14b1 Add commented out GEOM line to NOTES 2002-03-11 08:27:23 +00:00
luigi
ad77debfc5 Export a (machine dependent) kernel variable bootdev as
machdep.guessed_bootdev, and add code to sysctl to parse its value
and give a (not necessarily correct) name to the device we booted
from (the main motivation for this code is to use the info in the
PicoBSD boot scripts, and the impact on the kernel is minimal).

NOTE: the information available in bootdev is not always reliable,
so you should not trust it too much.  The parsing code is the same
as in boot2.c, and cannot cover all cases -- as it is, it seems to
work fine with floppies and IDE disks recognised by the BIOS. It
_should_ work as well with SCSI disks recognised by the BIOS.
Booting from a CDROM in floppy emulation will return /dev/fd0 (because
this is what the BIOS tells us).
Booting off the network (e.g. with etherboot) leaves bootdev unset so
the value will be printed as "invalid (0xffffffff)".

Finally, this feature might go away at some point, hopefully when we
have a more reliable way to get the same information.

MFC-after: 5 days
2002-03-10 20:08:44 +00:00
alc
50cc1db981 Condition the compilation of trapwrite() on I386_CPU. 2002-03-10 02:11:38 +00:00
mike
b8cc0d1207 o Don't require long long support in bswap64() functions.
o In i386's <machine/endian.h>, macros have some advantages over
  inlines, so change some inlines to macros.
o In i386's <machine/endian.h>, ungarbage collect word_swap_int()
  (previously __uint16_swap_uint32), it has some uses on i386's with
  PDP endianness.

Submitted by:	bde

o Move a comment up in <machine/endian.h> that was accidentially moved
  down a few revisions ago.
o Reenable userland's use of optimized inline-asm versions of
  byteorder(3) functions.
o Fix ordering of prototypes vs. redefinition of byteorder(3)
  functions, so that the non-GCC (libc asm) case has proper
  prototypes.
o Add proper prototypes for byteorder(3) functions in <sys/param.h>.
o Prevent redundant duplicate prototypes by making use of the
  _BYTEORDER_PROTOTYPED define.
o Move the bswap16(), bswap32(), bswap64() C functions into MD space
  for platforms in which asm versions don't exist.  This significantly
  reduces the complexity of some things at the cost of duplicate code.

Reviewed by:	bde
2002-03-09 21:02:16 +00:00
hm
15b1f704b5 after joerg's recent merge of i4b's sppp with the main sppp, remove
now obsolete file.
2002-03-09 13:31:56 +00:00
luigi
a10c2f3265 Enable DEVICE_POLLING in LINT now that it is safe to compile it there. 2002-03-09 08:04:58 +00:00
hm
1a9cc50908 make pcvt compile again without "options XSERVER".
PR: 35577
2002-03-08 19:06:46 +00:00
rwatson
f3b2ed4175 Note that several of the recently documented clock-related kernel options
are for debugging purposes only.

Suggested by:	bde
2002-03-08 18:59:05 +00:00
rwatson
01c622cea5 Apply a bit more of the patch from conf/35674: document the various
clock options in more detail.

PR:	conf/35674
Submitted by:	Hiten Pandya <hiten@uk.FreeBSD.org>
2002-03-08 18:50:07 +00:00
rwatson
846a5e02c0 Apply part of the patch from conf/35674 to move the PFIL_HOOKS option
to somewhere more useful, and improve documentation of it.

PR:	conf/35674
Submitted by:	Hiten Pandya <hiten@uk.FreeBSD.org>
2002-03-08 18:47:32 +00:00
rwatson
34c01fe419 Synchronize NOTES with -STABLE LINT with respects to the placement
and commenting of NETSMB, NETWMBCRYPTO, and SMBFS.  In NOTES, they
had all floated to the bottom of the file with the list of seemingly
random and unclassified kernel options.  This change moves them back
up to the network protocol and file system areas, and also documents
the dependencies.
2002-03-08 15:34:23 +00:00
phk
23d68e31d6 #include <machine/smp.h> in the SMP case.
don't include <sys/smp.h> at all.

Fallout from:	probably something jake did.
Hint by:	jhb
2002-03-08 14:31:12 +00:00
jake
96a2ab54a0 Add needed includes of machine/smp.h, remove nested include in sys/smp.h
so that inlines in machine/smp.h can use variables declared in sys/smp.h.
2002-03-07 04:43:51 +00:00
cjc
bacc847dbf Sync with GENERIC. WITNESS_SKIPSPIN, and the ciss and iir devices. 2002-03-06 19:44:08 +00:00
jeff
f25b3f9f5b Add a new variable mp_maxid. This is used so that per cpu datastructures may
be allocated as arrays indexed by the cpu id.  Previously the only reliable
way to know the max cpu id was through MAXCPU. mp_ncpus isn't useful here
because cpu ids may be sparsely mapped, although x86 and alpha do not do this.

Also, call cpu_mp_probe much earlier so the max cpu id is known before the VM
starts up.  This is intended to help support per cpu queues for the new
allocator, but may be useful elsewhere.

Reviewed by:	jake
Approved by:	jake
2002-03-05 10:01:46 +00:00
iwasaki
3f245d8dd1 Add generalized power profile code.
This makes other power-management system (APM for now) to be able to
generate power profile change events (ie. AC-line status changes), and
other kernel components, not only the ACPI components, can be notified
the events.

 - move subroutines in acpi_powerprofile.c (removed) to kern/subr_power.c
 - call power_profile_set_state() also from APM driver when AC-line
   status changes
 - add call-back function for Crusoe LongRun controlling on power
   profile changes for a example
2002-03-04 18:46:13 +00:00
alfred
199a58a697 Support for USB fm radio.
Submitted by: David Yeske <dyeske@yahoo.com>
2002-03-04 03:51:21 +00:00
arr
ed36876e15 - Move a comment from being on the same line as a #ifdef to the line
following it.  This should have gone in the previous commit, but
  misviewed Bruce's patch.

Requested by: bde
2002-02-28 21:52:08 +00:00
markm
875d7f4ce6 Make it a bit clearer where this file is to be used and where it
should not be. (Comments only)

Inspired by:	bde
2002-02-28 18:26:30 +00:00
arr
8cb475ee0d - trap -> trap() in panic() string.
- Translate the message into some sort of understandable english.
- Fix a couple near-by style nits.

Submitted by: bde
2002-02-28 08:13:55 +00:00
silby
c58cf9d742 Fix a minor swap leak.
Previously, the UPAGES/KSTACK area of processes/threads would leak memory
at the time that a previously swapped process was terminated.  Lukcily, the
leak was only 12K/proc, so it was unlikely to be a major problem unless you
had an undersized swap partition.

Submitted by:	dillon
Reviewed by:	silby
MFC after:	1 week
2002-02-28 07:41:12 +00:00
bmilekic
a3cf10660c Make MPLOCKED work again in asm files and stringify it explicitly
where necessary.

Reviewed by: jake
2002-02-28 06:17:05 +00:00
peter
6f849e9fc8 Fix warning (const lost in assignment), harmless in this case. 2002-02-28 03:13:47 +00:00
peter
a7a7b48006 Fix warnings (prototype for nonexisting static function) 2002-02-28 03:12:00 +00:00
peter
0535cd31ee Fix warnings.. bootpc_init() and related. 2002-02-28 03:07:35 +00:00
peter
d61500d1a7 Fix format warning.
Submitted by:	LINT, -Werror
2002-02-27 23:21:46 +00:00
jhb
7ec87b04c7 Back out part of KSE/M2 that snuck in under the radar: changing the
prototype of bzero() on the i386 to have a volatile first argument.

Requested by:	bde, jake
2002-02-27 22:12:29 +00:00
arr
d6f6c63d78 - Insert a space in the panic() string in order more clearly show the
message.
2002-02-27 20:20:42 +00:00
jhb
522b437ab4 Use td_ucred and thus remove now unneeded proc lock acquire and release. 2002-02-27 19:09:30 +00:00
jhb
3706cd3509 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
silby
230f96f3ce Fix a horribly suboptimal algorithm in the vm_daemon.
In order to determine what to page out, the vm_daemon checks
reference bits on all pages belonging to all processes.  Unfortunately,
the algorithm used reacted badly with shared pages; each shared page
would be checked once per process sharing it; this caused an O(N^2)
growth of tlb invalidations.  The algorithm has been changed so that
each page will be checked only 16 times.

Prior to this change, a fork/sleepbomb of 1300 processes could cause
the vm_daemon to take over 60 seconds to complete, effectively
freezing the system for that time period.  With this change
in place, the vm_daemon completes in less than a second.  Any system
with hundreds of processes sharing pages should benefit from this change.

Note that the vm_daemon is only run when the system is under extreme
memory pressure.  It is likely that many people with loaded systems saw
no symptoms of this problem until they reached the point where swapping
began.

Special thanks go to dillon, peter, and Chuck Cranor, who helped me
get up to speed with vm internals.

PR:		33542, 20393
Reviewed by:	dillon
MFC after:	1 week
2002-02-27 18:03:02 +00:00
tmm
3ed05b7b89 Add the following functions/macros to support byte order conversions and
device drivers for bus system with other endinesses than the CPU (using
interfaces compatible to NetBSD):

- bwap16() and bswap32(). These have optimized implementations on some
  architectures; for those that don't, there exist generic implementations.
- macros to convert from a certain byte order to host byte order and vice
  versa, using a naming scheme like le16toh(), htole16().
  These are implemented using the bswap functions.
- stream bus space access functions, which do not perform a byte order
  conversion (while the normal access functions would if the bus endianess
  differs from the CPU endianess).

htons(), htonl(), ntohs() and ntohl() are implemented using the new
functions above for kernel usage. None of the above interfaces is currently
exported to user land.

Make use of the new functions in a few places where local implementations
of the same functionality existed.

Reviewed by:	mike, bde
Tested on alpha by:	mike
2002-02-27 17:16:18 +00:00
robert
e9d577adb2 Use the updated getcredhostname() function. 2002-02-27 16:55:30 +00:00
robert
27b6691da3 - Use the new getcredhostname function in xenix_utsname(),
ibcs2_getipdomainname(), and ibcs2_utssys().

Reviewed by:	phk
2002-02-27 15:23:01 +00:00
peter
fd5a3329ee Re-fix a pointer/integer warning. 2002-02-27 09:58:06 +00:00
peter
f2dee2e96f Back out all the pmap related stuff I've touched over the last few days.
There is some unresolved badness that has been eluding me, particularly
affecting uniprocessor kernels.  Turning off PG_G helped (which is a bad
sign) but didn't solve it entirely.  Userland programs still crashed.
2002-02-27 09:51:33 +00:00
peter
1b62fa07e3 Bandaid for the Uniprocessor kernel exploding. This makes a UP kernel
boot and run (and indeed I am committing from it) instead of exploding
during the int 0x15 call from inside the atkbd driver to get the keyboard
repeat rates.
2002-02-27 06:05:24 +00:00
alfred
d9b9538950 clarify panic message 2002-02-27 04:27:36 +00:00
peter
2051cd9c72 Jake further reduced IPI shootdowns on sparc64 in loops by using ranged
shootdowns in a couple of key places.  Do the same for i386.  This also
hides some physical addresses from higher levels and has it use the
generic vm_page_t's instead.  This will help for PAE down the road.

Obtained from:	jake (MI code, suggestions for MD part)
2002-02-27 02:14:58 +00:00
dillon
8718c47431 didn't quite undo the last reversion. This gets it. 2002-02-27 01:48:17 +00:00
dillon
95be92ead0 revert compatibility fix temporarily (thought it would not break anything
leaving it in).
2002-02-26 20:34:52 +00:00
dillon
996781f17a revert last commit temporarily due to whining on the lists. 2002-02-26 20:33:41 +00:00
dillon
5052e3a1dc Make peter's commit compatible with interrupt-enabled critical_enter()
and exit(), which has already solved the problem in regards to deadlocked
IPI's.
2002-02-26 18:08:54 +00:00
dillon
57b097e18c STAGE-1 of 3 commit - allow (but do not require) interrupts to remain
enabled in critical sections and streamline critical_enter() and
critical_exit().

This commit allows an architecture to leave interrupts enabled inside
critical sections if it so wishes.  Architectures that do not wish to do
this are not effected by this change.

This commit implements the feature for the I386 architecture and provides
a sysctl, debug.critical_mode, which defaults to 1 (use the feature).  For
now you can turn the sysctl on and off at any time in order to test the
architectural changes or track down bugs.

This commit is just the first stage.  Some areas of the code, specifically
the MACHINE_CRITICAL_ENTER #ifdef'd code, is strictly temporary and will
be cleaned up in the STAGE-2 commit when the critical_*() functions are
moved entirely into MD files.

The following changes have been made:

	* critical_enter() and critical_exit() for I386 now simply increment
	  and decrement curthread->td_critnest.  They no longer disable
	  hard interrupts.  When critical_exit() decrements the counter to
	  0 it effectively calls a routine to deal with whatever interrupts
	  were deferred during the time the code was operating in a critical
	  section.

	  Other architectures are unaffected.

	* fork_exit() has been conditionalized to remove MD assumptions for
	  the new code.  Old code will still use the old MD assumptions
	  in regards to hard interrupt disablement.  In STAGE-2 this will
	  be turned into a subroutine call into MD code rather then hardcoded
	  in MI code.

	  The new code places the burden of entering the critical section
	  in the trampoline code where it belongs.

	* I386: interrupts are now enabled while we are in a critical section.
	  The interrupt vector code has been adjusted to deal with the fact.
	  If it detects that we are in a critical section it currently defers
	  the interrupt by adding the appropriate bit to an interrupt mask.

	* In order to accomplish the deferral, icu_lock is required.  This
	  is i386-specific.  Thus icu_lock can only be obtained by mainline
	  i386 code while interrupts are hard disabled.  This change has been
	  made.

	* Because interrupts may or may not be hard disabled during a
	  context switch, cpu_switch() can no longer simply assume that
	  PSL_I will be in a consistent state.  Therefore, it now saves and
	  restores eflags.

	* FAST INTERRUPT PROVISION.  Fast interrupts are currently deferred.
	  The intention is to eventually allow them to operate either while
	  we are in a critical section or, if we are able to restrict the
	  use of sched_lock, while we are not holding the sched_lock.

	* ICU and APIC vector assembly for I386 cleaned up.  The ICU code
	  has been cleaned up to match the APIC code in regards to format
	  and macro availability.  Additionally, the code has been adjusted
	  to deal with deferred interrupts.

	* Deferred interrupts use a per-cpu boolean int_pending, and
	  masks ipending, spending, and fpending.  Being per-cpu variables
	  it is not currently necessary to lock; bus cycles modifying them.

	  Note that the same mechanism will enable preemption to be
	  incorporated as a true software interrupt without having to
	  further hack up the critical nesting code.

	* Note: the old critical_enter() code in kern/kern_switch.c is
	  currently #ifdef to be compatible with both the old and new
	  methodology.  In STAGE-2 it will be moved entirely to MD code.

Performance issues:

	One of the purposes of this commit is to enhance critical section
	performance, specifically to greatly reduce bus overhead to allow
	the critical section code to be used to protect per-cpu caches.
	These caches, such as Jeff's slab allocator work, can potentially
	operate very quickly making the effective savings of the new
	critical section code's performance very significant.

	The second purpose of this commit is to allow architectures to
	enable certain interrupts while in a critical section.  Specifically,
	the intention is to eventually allow certain FAST interrupts to
	operate rather then defer.

	The third purpose of this commit is to begin to clean up the
	critical_enter()/critical_exit()/cpu_critical_enter()/
	cpu_critical_exit() API which currently has serious cross pollution
	in MI code (in fork_exit() and ast() for example).

	The fourth purpose of this commit is to provide a framework that
	allows kernel-preempting software interrupts to be implemented
	cleanly.  This is currently used for two forward interrupts in I386.
	Other architectures will have the choice of using this infrastructure
	or building the functionality directly into critical_enter()/
	critical_exit().

	Finally, this commit is designed to greatly improve the flexibility
	of various architectures to manage critical section handling,
	software interrupts, preemption, and other highly integrated
	architecture-specific details.
2002-02-26 17:06:21 +00:00
bde
1db1715215 Initialize a variable bogusly to avoid a gcc bug that causes a spurious
warning.
2002-02-26 17:04:29 +00:00
peter
c3e9a433a0 Fix a warning. useracc() should take a const pointer argument. 2002-02-26 01:00:39 +00:00
peter
748d0e1167 Work-in-progress commit syncing up pmap cleanups that I have been working
on for a while:
- fine grained TLB shootdown for SMP on i386
- ranged TLB shootdowns.. eg: specify a range of pages to shoot down with
  a single IPI, since the IPI is very expensive.  Adjust some callers
  that used to trigger this inside tight loops to do a ranged shootdown
  at the end instead.
- PG_G support for SMP on i386 (options ENABLE_PG_G)
- defer PG_G activation till after we decide what we are going to do with
  PSE and the 4MB pages at the start of the kernel.  This should solve
  some rumored strangeness about stale PG_G entries getting stuck
  underneath the 4MB pages.
- add some instrumentation for the fine TLB shootdown
- convert some asm instruction wrappers from functions to inlines.  gcc
  seems to do a fair bit better with this.
- [temporarily!] pessimize the tlb shootdown IPI handlers.  I will fix
  this again shortly.

This has been working fairly well for me for a while, but I have tweaked
it again prior to commit since my last major testing round.  The only
outstanding problem that I know of is PG_G related, which is why there
is an option for it (not on by default for SMP).  I have seen a world
speedups by a few percent (as much as 4 or 5% in one case) but I have
*not* accurately measured this - I am a bit sceptical of these numbers.
2002-02-25 23:49:51 +00:00
peter
f9abbb56a5 Tidy up some warnings 2002-02-25 21:42:23 +00:00