10896 Commits

Author SHA1 Message Date
jeff
4385f99f93 - Use the correct test in the ipi bitmask handler for IPI_PREEMPT so that
we actually issue preemptions.
 - Remove the #ifdef IPI_PREEMPTION so it is always compiled in.  Leave
   the option which optionally enables support in sched_4bsd.  sched_ule.c
   will soon use this functionality as a run time rather than compile time
   option.
 - Compare against the idlethread rather than the priority.  There are some
   idle prio tasks that we can preempt.

Discussed with:	ups
Tested on:	i386, amd64
2007-01-11 00:17:02 +00:00
jkim
24024d4f50 Add SSSE3 extensions and correct CNXT-ID spelling for Intel processors. 2007-01-09 19:23:22 +00:00
netchild
977ef4a8bc MFp4 (112498):
Rename the locking flags to EMUL_DOLOCK and EMUL_DONTLOCK to prevent confusion.

Submitted by:	rdivacky
2007-01-07 19:00:38 +00:00
netchild
f87d2b65bb regen after addition of linux_utimes and linux_rt_sigtimedwait 2006-12-31 13:20:31 +00:00
netchild
33166d619b MFp4 (111746, 108671, 108945, 112352):
- add linux utimes syscall [1]
 - add linux rt_sigtimedwait syscall [2]

Submitted by:	"Scot Hetzel" <swhetzel@gmail.com> [1]
Submitted by:	Bruce Becker <hostmaster@whois.gts.net> [2]
PR:		93199 [2]
2006-12-31 13:16:00 +00:00
bde
c4404408fa Fix oops in previous commit. 2006-12-29 15:48:18 +00:00
bde
b616a20d08 Fixed some style bugs (mainly assorted errors in comments, and inconsistent
spelling of `result').
2006-12-29 15:29:49 +00:00
bde
0a6fe7fb48 Fixed some style bugs (whitespace only). 2006-12-29 14:28:23 +00:00
bde
6a28e42ead Try harder to garbage-collect the "LOCORE" (really asm) version of
MPLOCKED.  The cleaning in rev.1.25 was supposed to have been undone
by rev.1.26, but 1.26 could never have actually affected asm files
since atomic.h is full of C declarations so including it in asm files
would just give syntax errors.  The asm MPLOCKED is even less needed
than when misplaced definitions of it were first removed, and is now
unused in any asm file in the src tree except in anachronismns in
sys/i386/i386/support.s.
2006-12-29 13:36:26 +00:00
bde
9d0b590514 Avoid an instruction in atomic_cmpset_{int_long)() in most cases.
These functions are used a lot for mutexes, so this reduces the text
size of an average kernel by about 0.75%.  This wasn't intended to
be a significant optimization, but it somehow increased the maximum
number of packets per second that can be transmitted by my bge hardware
from 320000 to 460000 (this benchmark is CPU-bound and remarkably
sensitive to changes in the text section).

Details: we would prefer to leave the result of the cmpxchg in %al,
but cannot tell gcc that it is there, so we have to convert it to an
integer register.  We converted  to %al, then to %[re]ax, but the
latter step is usually wasted since gcc usually only wants the condition
code and can recover it from %al just as easily as from %[re]ax.  Let
gcc promote %al in the few cases where this is needed.

Nearby style fixes;
- let gcc manage the load of `res', and don't abuse `res' for a copy of `exp'
- don't echo `res's name in comments
- consistently spell the condition code as 'e' after comparison for equality
- don't hard-code %al anywhere except in constraints
- for the version that doesn't use cmpxchg, there is no requirement to use
  %al anywhere, so don't hard-code it in the constraints either.

Style non-fix:
- for the versions that use cmpxchg, keep using "a" (was %[re]ax, now %al)
  for the main output operand, although this is not required.  The input
  and output operands that use the "a" constraint are now decoupled, and
  this makes things clearer except for the reason that the output register
  is hard-coded.  It is now just a hack to tell gcc that the input "a" has
  been clobbered without increasing the number of operands.
2006-12-27 20:26:00 +00:00
jkim
6397f13732 Regen (just to fix 'generated from' line from the previous commit). 2006-12-20 20:42:58 +00:00
jkim
da8c5f8136 Add linux_nanosleep() and regen. 2006-12-20 20:21:48 +00:00
jkim
3b05cb0c58 MFP4: 109655
- Move linux_nanosleep() from src/sys/amd64/linux32/linux32_machdep.c to
src/sys/compat/linux/linux_time.c.
- Validate timespec ranges before use as Linux kernel does.
- Fix l_timespec structure.
- Clean up style(9) nits.
2006-12-20 20:17:35 +00:00
davidxu
5a984630fa Add a lwpid field into per-cpu structure, the lwpid represents current
running thread's id on each cpu. This allow us to add in-kernel adaptive
spin for user level mutex. While spinning in user space is possible,
without correct thread running state exported from kernel, it hardly
can be implemented efficiently without wasting cpu cycles, however
exporting thread running state unlikely will be implemented soon as
it has to design and stablize interfaces. This implementation is
transparent to user space, it can be disabled dynamically. With this
change, mutex ping-pong program's performance is improved massively on
SMP machine. performance of mysql super-smack select benchmark is increased
about 7% on Intel dual dual-core2 Xeon machine, it indicates on systems
which have bunch of cpus and system-call overhead is low (athlon64, opteron,
and core-2 are known to be fast), the adaptive spin does help performance.

Added sysctls:
    kern.threads.umtx_dflt_spins
        if the sysctl value is non-zero, a zero umutex.m_spincount will
        cause the sysctl value to be used a spin cycle count.
    kern.threads.umtx_max_spins
        the sysctl sets upper limit of spin cycle count.

Tested on: Athlon64 X2 3800+, Dual Xeon 5130
2006-12-20 04:40:39 +00:00
kmacy
ee539b99dc Evidently FreeBSD has long relied on the compiler to treat structures
passed by value (trap frames) as if they were in fact being passed by
reference. For better or worse, this incorrect behaviour is no longer
present in gcc 4.1. In this patch I convert all trapframe arguments to
be explicitly pass by reference. I also remove vm86_initflags, pushing
the very little work that it actually does up into vm86_prepcall.

Reviewed by: kan
Tested by: kan
2006-12-17 05:07:01 +00:00
kmacy
4f9056e412 vm86_initflags was causing gcc41 and even gcc346 to get rather confused
- de-obfuscate

Suggested by: kan
Reviewed by: kan
Tested by: kan
2006-12-17 03:17:46 +00:00
n_hibma
c98f016084 Align the interfaces for the various watchdogs and make the interface
behave as expected.

Also:
- Return an error if WD_PASSIVE is passed in to the ioctl as only
  WD_ACTIVE is implemented at the moment. See sys/watchdog.h for an
  explanation of the difference between WD_ACTIVE and WD_PASSIVE.
- Remove the I_HAVE_TOTALLY_LOST_MY_SENSE_OF_HUMOR define. If you've
  lost your sense of humor, than don't add a define.

Specific changes:

i80321_wdog.c
  Don't roll your own passive watchdog tickle as this would defeat the
  purpose of an active (userland) watchdog tickle.

ichwd.c / ipmi.c:
  WD_ACTIVE means active patting of the watchdog by a userland process,
  not whether the watchdog is active. See sys/watchdog.h.

kern_clock.c:
  (software watchdog) Remove a check for WD_ACTIVE as this does not make
  sense here. This reverts r1.181.
2006-12-15 21:44:49 +00:00
yongari
12c67e24f1 Add msk(4) to the list of drivers supported by GENERIC kernel. 2006-12-13 03:41:47 +00:00
jhb
dfb2d953f1 Give Host-PCI bridge drivers their own pcib_alloc_msi() and
pcib_alloc_msix() methods instead of using the method from the generic
PCI-PCI bridge driver as the PCI-PCI methods will be gaining some PCI-PCI
specific logic soon.
2006-12-12 19:27:01 +00:00
jhb
f74d404e39 Sort function prototypes. 2006-12-12 19:24:45 +00:00
jhb
ea27014d6f Replace a few magic numbers. 2006-12-12 19:23:52 +00:00
jhb
65d8bd30a0 Add a function to return the MD interrupt source cookie associated with
an interrupt event.  Use this in the x86 code to fixup the intrcnt names
when an interrupt handler is removed.
2006-12-12 19:20:19 +00:00
sobomax
8563c840b5 Allow machdep.cpu_idle_hlt to be set from the loader. This should allow
to workaround the problem with SMP kernels on Turion64 X2 processors
described in kern/104678 and may be useful in other situations too.

MFC after:	3 days
2006-12-06 18:27:17 +00:00
julian
396ed947f6 Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent..
Specifically, remove:
Any reference of the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs.  Libpthread processes will already
do this to some extent and libthr processes already disable it.

Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.

The ULE scheduler compiles again but I have no idea if it works.

The 4bsd scheduler still reqires a little cleaning and some functions that now do
ALMOST nothing will go away, but I thought I'd do that as a separate commit.

Tested by David Xu, and Dan Eischen using libthr and libpthread.
2006-12-06 06:34:57 +00:00
bde
852d9800f0 Optimized RTC accesses by avoiding null writes to the index register
and by only delaying when an RTC register is written to.  The delay
after writing to the data register is now not just a workaround.

This reduces the number of ISA accesses in the usual case from 4 to
1.  The usual case is 2 rtcin()'s for each RTC interrupt.  The index
register is almost always RTC_INTR for this.  The 3 extra ISA accesses
were 1 for writing the index and 2 for delays.  Some delays are needed
in theory, but in practice they now just slow down slow accesses some
more since almost eveyone including us does them wrong so modern systems
enforce sufficient delays in hardware.  I used to have the delays ifdefed
out, but with the index register optimization the delays are rarely
executed so the old magic ones can be kept or even implemented non-
magically without significant cost.

Optimizing RTC interrupt handling is more interesting than it used to
be because RTC interrupts are currently needed to fix the more efficient
apic timer interrupts on some systems.  apic_timer_hz is normally 2000
so the RTC interrupt rate needs to be 2048 to keep the apic timer
firing on such systems.  Without these changes, each RTC interrupt
normally took 10 ISA accesses (2 PIC accesses and 2 sets of 4 RTC
accesses).  Each ISA access takes 1-1.5uS so 10 of then at 2048 Hz
takes 2-3% of a CPU.  Now 4 of them take 0.8-1.2% of a CPU.
2006-12-03 03:49:28 +00:00
jb
da35e3e55f Turn console printf buffering into a kernel option and only on
by default for sun4v where it is absolutely required.

This change moves the buffer from struct pcpu to the stack to avoid
using the critical section which created a LOR in a couple of cases
due to interaction with the tty code and kqueue. The LOR can't be
fixed with the critical section and the pcpu buffer can't be used
without the critical section.

Putting the buffer on the stack was my initial solution, but it was
pointed out that the stress on the stack might cause problems
depending on the call path. We don't have a way of creating tests
for those possible cases, so it's best to leave this as an option
for the time being. In time we may get enough data to enable this
option more generally.
2006-11-30 04:17:05 +00:00
ru
41b733910f Tweak the comment about mapping a kernel using large pages. 2006-11-25 23:00:46 +00:00
alc
b2f49e89df The global variable avail_end is redundant and only used once. Eliminate
it.  Make avail_start static to the pmap on amd64.  (It no longer exists
on other architectures.)
2006-11-19 20:54:58 +00:00
jhb
0133dd1b2e - Add macro constants for the various fields in %dr7 and use them in place
of various scattered magic values.
- Pretty print the address of hardware watchpoints in 'show watch' rather
  than just displaying hex.
- Expand address field width on amd64 for 64-bit pointers.
2006-11-17 19:20:32 +00:00
jhb
84450a55fb Trim some noise from bootverbose:
- Drop the printf in intr_machdep.c when we assign an interrupt souce to
  a CPU.  Each source already has a more detailed printf.
- Don't output a line for each ioapic pin showing its initial state, this
  has outlived its usefulness.
- When an APIC enumerator sets the bus, polarity, or trigger mode of an
  ioapic pin, just return success without printing anything if the new
  value matches the current one.

MFC after:	2 weeks
2006-11-17 16:41:03 +00:00
jhb
375950dbc5 A few more style fixes. 2006-11-17 16:37:35 +00:00
maxim
11bd032934 o Make pv_maxchunks no less than maxproc. This helps to survive a
forkbomb explosion.

Reviewed by:	alc
Security:	local DoS
X-MFC atfer:	RELENG_6 is not affected due to a different pv_entry
		allocation code.
2006-11-16 11:46:24 +00:00
jhb
7bbb24af70 Various whitespace and style fixes. 2006-11-15 19:53:48 +00:00
jhb
7613d4fe06 Fix a typo that broke MSI (MSI-X worked fine) in the later revisions of
the MSI patches.
2006-11-15 18:40:00 +00:00
jhb
fa70d01397 MD support for PCI Message Signalled Interrupts on amd64 and i386:
- Add a new apic_alloc_vectors() method to the local APIC support code
  to allocate N contiguous IDT vectors (aligned on a M >= N boundary).
  This function is used to allocate IDT vectors for a group of MSI
  messages.
- Add MSI and MSI-X PICs.  The PIC code here provides methods to manage
  edge-triggered MSI messages as x86 interrupt sources.  In addition to
  the PIC methods, msi.c also includes methods to allocate and release
  MSI and MSI-X messages.  For x86, we allow for up to 128 different
  MSI IRQs starting at IRQ 256 (IRQs 0-15 are reserved for ISA IRQs,
  16-254 for APIC PCI IRQs, and IRQ 255 is reserved).
- Add pcib_(alloc|release)_msi[x]() methods to the MD x86 PCI bridge
  drivers to bubble the request up to the nexus driver.
- Add pcib_(alloc|release)_msi[x]() methods to the x86 nexus drivers that
  ask the MSI PIC code to allocate resources and IDT vectors.

MFC after:	2 months
2006-11-13 22:23:34 +00:00
ru
9c4d5bd7a3 Fix NKPT comments to match reality. Note that the current value
of NKPT is no longer enough to run amd64 with 16G of RAM, as it
doesn't have space for mapping a kernel (16M kernel would require
additionally 8 page tables).
2006-11-13 20:33:54 +00:00
ru
8beeb4382c Fix a comment. 2006-11-13 06:26:57 +00:00
alc
6093953d36 Make pmap_enter() responsible for setting PG_WRITEABLE instead
of its caller.  (As a beneficial side-effect, a high-contention
acquisition of the page queues lock in vm_fault() is eliminated.)
2006-11-12 21:48:34 +00:00
trhodes
58cca8458a Merge posix4/* into normal kernel hierarchy.
Reviewed by:	glanced at by jhb
Approved by:	silence on -arch@ and -standards@
2006-11-11 16:26:58 +00:00
jhb
fe20b44cca Don't dump the $PIR table under bootverbose. The pirtool program in
src/tools/tools works fine, and dumping this table can add a lot of noise.

MFC after:	1 week
2006-11-09 18:03:36 +00:00
ru
794ab15a99 Spelling. 2006-11-07 21:57:18 +00:00
jhb
bfa0cb7ef2 Remove old XXX comment about possibly adding a print_Intel_info() function
to dump CPUID level=2 stuff.  A print_INTEL_info() function that does just
that was added a while ago.
2006-11-07 18:48:18 +00:00
jhb
2a9e42c005 Remove duplicate IDTVEC macro definition, it's already defined in
<machine/intr_machdep.h>.
2006-11-07 18:46:33 +00:00
rwatson
10d0d9cf47 Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
jb
0ba5da19a5 Remove the KDTRACE option again because of the complaints about having
it as a default.

For the record, the KDTRACE option caused _no_ additional source files
to be compiled in; certainly no CDDL source files. All it did was to
allow existing BSD licensed kernel files to include one or more CDDL
header files.

By removing this from DEFAULTS, the onus is on a kernel builder to add
the option to the kernel config, possibly by including GENERIC and
customising from there. It means that DTrace won't be a feature
available in FreeBSD by default, which is the way I intended it to be.

Without this option, you can't load the dtrace module (which contains
the dtrace device and the DTrace framework). This is equivalent to
requiring an option in a kernel config before you can load the linux
emulation module, for example.

I think it is a mistake to have DTrace ported to FreeBSD, but not
to have it available to everyone, all the time. The only exception
to this is the companies which distribute systems with FreeBSD embedded.
Those companies will customise their systems anyway. The KDTRACE
option was intended for them, and only them.
2006-11-04 23:50:12 +00:00
jb
f7bc0a87d6 Build in kernel support for loading DTrace modules by default. This
adds the hooks that DTrace modules register with, and adds a few functions
which have the dtrace_ prefix to allow the DTrace FBT (function boundary
trace) provider to avoid tracing because they are called from the DTtrace
probe context.

Unlike other forms of tracing and debug, DTrace support in the kernel
incurs negligible run-time cost.

I think the only reason why anyone wouldn't want to have kernel support
enabled for DTrace would be due to the license (CDDL) under which DTrace
is released.
2006-11-04 04:58:10 +00:00
jb
d2bd807356 Add a cnputs() function to write a string to the console with
a lock to prevent interspersed strings written from different CPUs
at the same time.

To avoid putting a buffer on the stack or having to malloc one,
space is incorporated in the per-cpu structure. The buffer
size if 128 bytes; chosen because it's the next power of 2 size
up from 80 characters.

String writes to the console are buffered up the end of the line
or until the buffer fills. Then the buffer is flushed to all
console devices.

Existing low level console output via cnputc() is unaffected by
this change. ithread calls to log() are also unaffected to avoid
blocking those threads.

A minor change to the behaviour in a panic situation is that
console output will still be buffered, but won't be written to
a tty as before. This should prevent interspersed panic output
as a number of CPUs panic before we end up single threaded
running ddb.

Reviewed by:	scottl, jhb
MFC after:	2 weeks
2006-11-01 04:54:51 +00:00
takawata
1fdb1394dc Fix Typo.
Pointed out by: ru
2006-10-31 07:22:24 +00:00
takawata
74d3a43566 Add conf file entries for acpi_aiboost drivers. 2006-10-30 05:51:54 +00:00
netchild
248a5cec67 regen after linux_io_* backout 2006-10-29 14:12:44 +00:00