Commit Graph

1520 Commits

Author SHA1 Message Date
marcel
11092d3888 Include <sys/sched.h> for sched_throw(). 2007-06-06 04:44:19 +00:00
jeff
91d1501790 Commit 14/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
   sychronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-05 00:00:57 +00:00
jeff
8297f778b9 Commit 13/14 of sched_lock decomposition.
- Add a new parameter to cpu_switch() that is used to release the lock on
   the outgoing thread and properly acquire the lock on the incoming
   thread.  This parameter is not required for schedulers that don't do
   per-cpu locking and architectures which do not support it may continue
   to use the 4BSD scheduler.  This feature is presently not supported
   on ia64

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:58:47 +00:00
jeff
be3241715a - Change comments and asserts to reflect the removal of the global
scheduler lock.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:57:32 +00:00
jeff
20e0f793e8 Commit 10/14 of sched_lock decomposition.
- Use sched_throw() rather than replicating the same cpu_throw() code for
   each architecture.  This also allows the scheduler to use any locking it
   may want to.
 - Use the thread_lock() rather than sched_lock when preempting.
 - The scheduler lock is not required to synchronize release_aps.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:56:08 +00:00
attilio
e333d0ff0e Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
  given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:38:48 +00:00
attilio
7dd8ed88a9 Revert VMCNT_* operations introduction.
Probabilly, a general approach is not the better solution here, so we should
solve the sched_lock protection problems separately.

Requested by: alc
Approved by: jeff (mentor)
2007-05-31 22:52:15 +00:00
piso
42dfc78150 In some particular cases (like in pccard and pccbb), the real device
handler is wrapped in a couple of functions - a filter wrapper and an
ithread wrapper. In this case (and just in this case), the filter
wrapper could ask the system to schedule the ithread and mask the
interrupt source if the wrapped handler is composed of just an ithread
handler: modify the "old" interrupt code to make it support
this situation, while the "new" interrupt code is already ok.

Discussed with: jhb
2007-05-31 19:25:35 +00:00
yongari
f53195d29a Honor maxsegsz of less than a page size in a DMA tag. Previously it
used to return PAGE_SIZE without respect to restrictions of a DMA tag.
This affected all of the busdma load functions that use
_bus_dmamap_loader_buffer() as their back-end.

Reviewed by:	scottl
2007-05-29 06:30:26 +00:00
alc
a530caef2a Eliminate an unused definition. 2007-05-27 20:34:26 +00:00
marcel
df27a8ac99 Have the processor defer all faults and exceptions for control
speculative loads. This at least makes control speculative loads
work. In the future we should analyze which faults/exceptions
we want to handle rather than defer to avoid having to call the
recovery code when it's not strictly necessary.
2007-05-27 19:02:47 +00:00
kan
4c2d706212 Allow FreeBSD's native ELF image activators to execute shared libraries the
same way it was enabled for Linux binares in linuxulator.

This allows binaries built with -pie. Many ports auto-detect -fPIE support
in GCC 4.2 and build binaries FreeBSD was unable to run.
2007-05-22 02:22:58 +00:00
marcel
4e45a73058 When speculation fails (as determined by the chk instruction) the
processor is to jump to recovery code. This branching behaviour
may not be implemented by the processor and a Speculative Operation
fault is raised. The OS is responsible to emulate the branch.
Implement this, because GCC 4.2 uses advanced loads regularly.
2007-05-21 05:11:43 +00:00
marcel
2a4c24267a Fix GCC warning: va = va += PAGE_SIZE contains pointless operation
va = va. Fix white space in nearby lines.
2007-05-19 18:25:14 +00:00
marcel
797bdcc549 Add a level of indirection to the kernel PTE table. The old
scheme allowed for 1024 PTE pages, each containing 256 PTEs.
This yielded 2GB of KVA. This is not enough to boot a kernel
on a 16GB box and in general too low for a 64-bit machine.
By adding a level of indirection we now have 1024 2nd-level
directory pages, each capable of supporting 2GB of KVA. This
brings the grand total to 2TB of KVA.
2007-05-19 13:11:27 +00:00
marcel
a94f1d6237 Account for the fact that contigmalloc(9) can return a NULL pointer.
Fix the flags argument: M_WAITOK is not a valid flag. Its presence
leaves the indication that contigmalloc(9) will not return a NULL
pointer.

The use of contigmalloc(9) in this place is probably not a good idea
given the constraints. It's probably better to lift the constraints
and instead add a permanent mapping to the ITR. It's possible that
the first 256MB of memory is exhausted when we get here.

This fixes a kernel panic on a 16GB rx3600.
2007-05-19 12:50:12 +00:00
jeff
e1996cb960 - define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating
vmcnts.  This can be used to abstract away pcpu details but also changes
   to use atomics for all counters now.  This means sched lock is no longer
   responsible for protecting counts in the switch routines.

Contributed by:		Attilio Rao <attilio@FreeBSD.org>
2007-05-18 07:10:50 +00:00
alc
b34f6f7ab1 Define every architecture as either VM_PHYSSEG_DENSE or
VM_PHYSSEG_SPARSE depending on whether the physical address space is
densely or sparsely populated with memory.  The effect of this
definition is to determine which of two implementations of
vm_page_array and PHYS_TO_VM_PAGE() is used.  The legacy
implementation is obtained by defining VM_PHYSSEG_DENSE, and a new
implementation that trades off time for space is obtained by defining
VM_PHYSSEG_SPARSE.  For now, all architectures except for ia64 and
sparc64 define VM_PHYSSEG_DENSE.  Defining VM_PHYSSEG_SPARSE on ia64
allows the entirety of my Itanium 2's memory to be used.  Previously,
only the first 1 GB could be used.  Defining VM_PHYSSEG_SPARSE on
sparc64 allows USIIIi-based systems to boot without crashing.

This change is a combination of Nathan Whitehorn's patch and my own
work in perforce.

Discussed with: kmacy, marius, Nathan Whitehorn
PR:		112194
2007-05-05 19:50:28 +00:00
sepotvin
a1e73b1eaf Add support for specifying a minimal size for vm.kmem_size in the loader via
vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will
be at least 256mb (for example) without forcing a particular value via vm.kmem_size.

Approved by: njl (mentor)
Reviewed by: alc
2007-04-21 01:14:48 +00:00
pjd
f4e110ebf2 Remove trailing '.' for consistency! 2007-04-10 21:40:13 +00:00
pjd
b159725895 Add UFS_GJOURNAL options to the GENERIC kernel.
Approved by:	re (kensmith)
2007-04-10 16:49:41 +00:00
jkim
c06098a406 Catch up with ACPI-CA 20070320 import. 2007-03-22 18:16:43 +00:00
jhb
8b3222b80b Change the amd64, i386, and ia64 nexus drivers to setup bus space tags and
handles when activating a resource via bus_activate_resource() rather than
doing some of the work in bus_alloc_resource() and some of it in
bus_activate_resource().

One note is that when using isa_alloc_resourcev() on PC-98, drivers now
need to just use bus_release_resource() without explicitly calling
bus_deactivate_resource() first.  nyan@ has already fixed all of the PC-98
drivers.
2007-03-21 15:36:38 +00:00
alc
b03ddb707b Push down the implementation of PCPU_LAZY_INC() into the machine-dependent
header file.  Reimplement PCPU_LAZY_INC() on amd64 and i386 making it
atomic with respect to interrupts.

Reviewed by: bde, jhb
2007-03-11 05:54:29 +00:00
mohans
a332cb00d5 Over NFS, an open() call could result in multiple over-the-wire
GETATTRs being generated - one from lookup()/namei() and the other
from nfs_open() (for cto consistency). This change eliminates the
GETATTR in nfs_open() if an otw GETATTR was done from the namei()
path. Instead of extending the vop interface, we timestamp each attr
load, and use this to detect whether a GETATTR was done from namei()
for this syscall. Introduces a thread-local variable that counts the
syscalls made by the thread and uses <pid, tid, thread syscalls> as
the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on
thread state that could be used as the timestamp with minimal overhead.
2007-03-09 04:02:38 +00:00
scottl
32acf7e446 Don't increment total_bounced when doing no-op dmamap_sync ops. 2007-03-06 18:28:43 +00:00
piso
b44ce1c4db Updated ia64 isa support with the new bus_setup_intr() syntax.
Approved by: re (implicit?)
2007-02-24 16:56:22 +00:00
piso
6a2ffa86e5 o break newbus api: add a new argument of type driver_filter_t to
bus_setup_intr()

o add an int return code to all fast handlers

o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@
2007-02-23 12:19:07 +00:00
alc
989e3abb2c Change pmap_protect() so that execute access can be removed without
simultaneously removing write access.
2007-02-21 06:00:46 +00:00
alc
cc7fb68847 Eliminate some acquisitions and releases of the page queues lock that are
no longer necessary.
2007-02-18 06:33:02 +00:00
marcel
e43343208a Now that the free page queue mutex is a sleep mutex, we cannot call
vm_page_alloc() from within a critical section in pmap_growkernel().
Since the need for a critical section may never have existed in the
first place, simply get rid of it.

Discussed with: alc@
2007-02-11 02:52:54 +00:00
brooks
beaea8e48e Include GEOM_LABEL in GENERIC. It's very useful and not well publicized
enough.

Approved by:	pjd
2007-02-09 19:03:18 +00:00
marcel
0245423ad8 Evolve the ctlreq interface added to geom_gpt into a generic
partitioning class that supports multiple schemes. Current
schemes supported are APM (Apple Partition Map) and GPT.
Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM
and GEOM_PART_GPT (resp).

The ctlreq interface supports verbs to create and destroy
partitioning schemes on a disk; to add, delete and modify
partitions; and to commit or undo changes made.
2007-02-07 18:55:31 +00:00
imp
9109b1ceb8 Remove 3rd clause, renumber, ok per email 2007-01-12 07:26:21 +00:00
davidxu
5a984630fa Add a lwpid field into per-cpu structure, the lwpid represents current
running thread's id on each cpu. This allow us to add in-kernel adaptive
spin for user level mutex. While spinning in user space is possible,
without correct thread running state exported from kernel, it hardly
can be implemented efficiently without wasting cpu cycles, however
exporting thread running state unlikely will be implemented soon as
it has to design and stablize interfaces. This implementation is
transparent to user space, it can be disabled dynamically. With this
change, mutex ping-pong program's performance is improved massively on
SMP machine. performance of mysql super-smack select benchmark is increased
about 7% on Intel dual dual-core2 Xeon machine, it indicates on systems
which have bunch of cpus and system-call overhead is low (athlon64, opteron,
and core-2 are known to be fast), the adaptive spin does help performance.

Added sysctls:
    kern.threads.umtx_dflt_spins
        if the sysctl value is non-zero, a zero umutex.m_spincount will
        cause the sysctl value to be used a spin cycle count.
    kern.threads.umtx_max_spins
        the sysctl sets upper limit of spin cycle count.

Tested on: Athlon64 X2 3800+, Dual Xeon 5130
2006-12-20 04:40:39 +00:00
julian
396ed947f6 Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent..
Specifically, remove:
Any reference of the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs.  Libpthread processes will already
do this to some extent and libthr processes already disable it.

Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.

The ULE scheduler compiles again but I have no idea if it works.

The 4bsd scheduler still reqires a little cleaning and some functions that now do
ALMOST nothing will go away, but I thought I'd do that as a separate commit.

Tested by David Xu, and Dan Eischen using libthr and libpthread.
2006-12-06 06:34:57 +00:00
marcel
aee58e2da0 Since printf also has at least one critical section, we need to
initialize pc_curthread. While here, rename early_pcpu to pcpu0
to be conistent (compare thread0 and proc0).
2006-11-18 23:15:25 +00:00
marcel
a76d88c55c Now that printf() needs the PCPU, set it up before we call printf().
Change the pc_pcb field from a pointer to struct pcb to struct pcb
so that sizeof(struct pcb) includes the PCB we use for IPI_STOP.
Statically declare early_pcb so that we don't have to allocate the
PCB for thread0. This way we can setup the PCPU before cninit()
and thus before we use printf().
2006-11-18 21:52:26 +00:00
marcel
08b7cebe65 Revert previous commit. PC_CONS_BUFR is not used nor needed by
assembly.
2006-11-18 21:48:13 +00:00
ru
8beeb4382c Fix a comment. 2006-11-13 06:26:57 +00:00
alc
6093953d36 Make pmap_enter() responsible for setting PG_WRITEABLE instead
of its caller.  (As a beneficial side-effect, a high-contention
acquisition of the page queues lock in vm_fault() is eliminated.)
2006-11-12 21:48:34 +00:00
rwatson
2e2a6d6a66 Add missing includes of priv.h. 2006-11-06 17:43:10 +00:00
rwatson
10d0d9cf47 Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
jb
0ba5da19a5 Remove the KDTRACE option again because of the complaints about having
it as a default.

For the record, the KDTRACE option caused _no_ additional source files
to be compiled in; certainly no CDDL source files. All it did was to
allow existing BSD licensed kernel files to include one or more CDDL
header files.

By removing this from DEFAULTS, the onus is on a kernel builder to add
the option to the kernel config, possibly by including GENERIC and
customising from there. It means that DTrace won't be a feature
available in FreeBSD by default, which is the way I intended it to be.

Without this option, you can't load the dtrace module (which contains
the dtrace device and the DTrace framework). This is equivalent to
requiring an option in a kernel config before you can load the linux
emulation module, for example.

I think it is a mistake to have DTrace ported to FreeBSD, but not
to have it available to everyone, all the time. The only exception
to this is the companies which distribute systems with FreeBSD embedded.
Those companies will customise their systems anyway. The KDTRACE
option was intended for them, and only them.
2006-11-04 23:50:12 +00:00
jb
f7bc0a87d6 Build in kernel support for loading DTrace modules by default. This
adds the hooks that DTrace modules register with, and adds a few functions
which have the dtrace_ prefix to allow the DTrace FBT (function boundary
trace) provider to avoid tracing because they are called from the DTtrace
probe context.

Unlike other forms of tracing and debug, DTrace support in the kernel
incurs negligible run-time cost.

I think the only reason why anyone wouldn't want to have kernel support
enabled for DTrace would be due to the license (CDDL) under which DTrace
is released.
2006-11-04 04:58:10 +00:00
marcel
c4a6a4c4a7 Make sure kern_envp is never NULL. If we don't get a pointer to
the environment from the loader, use the static environment.
2006-11-03 04:06:17 +00:00
jb
d2bd807356 Add a cnputs() function to write a string to the console with
a lock to prevent interspersed strings written from different CPUs
at the same time.

To avoid putting a buffer on the stack or having to malloc one,
space is incorporated in the per-cpu structure. The buffer
size if 128 bytes; chosen because it's the next power of 2 size
up from 80 characters.

String writes to the console are buffered up the end of the line
or until the buffer fills. Then the buffer is flushed to all
console devices.

Existing low level console output via cnputc() is unaffected by
this change. ithread calls to log() are also unaffected to avoid
blocking those threads.

A minor change to the behaviour in a panic situation is that
console output will still be buffered, but won't be written to
a tty as before. This should prevent interspersed panic output
as a number of CPUs panic before we end up single threaded
running ddb.

Reviewed by:	scottl, jhb
MFC after:	2 weeks
2006-11-01 04:54:51 +00:00
jb
d20db01f97 Remove the KSE option now that it's in DEFAULTS on these arches/machines.
The 'nooption' kernel config entry has to be used to turn KSE off now.
This isn't my preferred way of dealing with this, but I'll defer to
scottl's experience with the io/mem kernel option change and the grief
experienced over that.

Submitted by:	scottl@
2006-10-26 22:11:35 +00:00
jb
3a250c3e16 Add 'options KSE' to the kernel config DEFAULTS on all arches/machines
except sun4v.

This change makes the transition from a default to an option more
transparent and is an attempt to head off all the compliants that are
likely from people who don't read UPDATING, based on experience with
the io/mem change.

Submitted by:	scottl@
2006-10-26 22:05:25 +00:00
jb
f82c799735 Make KSE a kernel option, turned on by default in all GENERIC
kernel configs except sun4v (which doesn't process signals properly
with KSE).

Reviewed by:	davidxu@
2006-10-26 21:42:22 +00:00