Commit Graph

1698 Commits

Author SHA1 Message Date
John Baldwin
d66ff27773 Change the amd64, i386, and ia64 nexus drivers to setup bus space tags and
handles when activating a resource via bus_activate_resource() rather than
doing some of the work in bus_alloc_resource() and some of it in
bus_activate_resource().

One note is that when using isa_alloc_resourcev() on PC-98, drivers now
need to just use bus_release_resource() without explicitly calling
bus_deactivate_resource() first.  nyan@ has already fixed all of the PC-98
drivers.
2007-03-21 15:36:38 +00:00
Alan Cox
c640357f04 Push down the implementation of PCPU_LAZY_INC() into the machine-dependent
header file.  Reimplement PCPU_LAZY_INC() on amd64 and i386 making it
atomic with respect to interrupts.

Reviewed by: bde, jhb
2007-03-11 05:54:29 +00:00
Mohan Srinivasan
f9bb753844 Over NFS, an open() call could result in multiple over-the-wire
GETATTRs being generated - one from lookup()/namei() and the other
from nfs_open() (for cto consistency). This change eliminates the
GETATTR in nfs_open() if an otw GETATTR was done from the namei()
path. Instead of extending the vop interface, we timestamp each attr
load, and use this to detect whether a GETATTR was done from namei()
for this syscall. Introduces a thread-local variable that counts the
syscalls made by the thread and uses <pid, tid, thread syscalls> as
the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on
thread state that could be used as the timestamp with minimal overhead.
2007-03-09 04:02:38 +00:00
Scott Long
94e6bc303f Don't increment total_bounced when doing no-op dmamap_sync ops. 2007-03-06 18:28:43 +00:00
Paolo Pisati
531db8b139 Updated ia64 isa support with the new bus_setup_intr() syntax.
Approved by: re (implicit?)
2007-02-24 16:56:22 +00:00
Paolo Pisati
ef544f6312 o break newbus api: add a new argument of type driver_filter_t to
bus_setup_intr()

o add an int return code to all fast handlers

o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@
2007-02-23 12:19:07 +00:00
Alan Cox
5f9e5adf8b Change pmap_protect() so that execute access can be removed without
simultaneously removing write access.
2007-02-21 06:00:46 +00:00
Alan Cox
ae0663a383 Eliminate some acquisitions and releases of the page queues lock that are
no longer necessary.
2007-02-18 06:33:02 +00:00
Marcel Moolenaar
0b098709b0 Now that the free page queue mutex is a sleep mutex, we cannot call
vm_page_alloc() from within a critical section in pmap_growkernel().
Since the need for a critical section may never have existed in the
first place, simply get rid of it.

Discussed with: alc@
2007-02-11 02:52:54 +00:00
Brooks Davis
983f970981 Include GEOM_LABEL in GENERIC. It's very useful and not well publicized
enough.

Approved by:	pjd
2007-02-09 19:03:18 +00:00
Marcel Moolenaar
1d3aed33e8 Evolve the ctlreq interface added to geom_gpt into a generic
partitioning class that supports multiple schemes. Current
schemes supported are APM (Apple Partition Map) and GPT.
Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM
and GEOM_PART_GPT (resp).

The ctlreq interface supports verbs to create and destroy
partitioning schemes on a disk; to add, delete and modify
partitions; and to commit or undo changes made.
2007-02-07 18:55:31 +00:00
Warner Losh
fed32d7544 Remove 3rd clause, renumber, ok per email 2007-01-12 07:26:21 +00:00
David Xu
4e32b7b3cc Add a lwpid field into per-cpu structure, the lwpid represents current
running thread's id on each cpu. This allow us to add in-kernel adaptive
spin for user level mutex. While spinning in user space is possible,
without correct thread running state exported from kernel, it hardly
can be implemented efficiently without wasting cpu cycles, however
exporting thread running state unlikely will be implemented soon as
it has to design and stablize interfaces. This implementation is
transparent to user space, it can be disabled dynamically. With this
change, mutex ping-pong program's performance is improved massively on
SMP machine. performance of mysql super-smack select benchmark is increased
about 7% on Intel dual dual-core2 Xeon machine, it indicates on systems
which have bunch of cpus and system-call overhead is low (athlon64, opteron,
and core-2 are known to be fast), the adaptive spin does help performance.

Added sysctls:
    kern.threads.umtx_dflt_spins
        if the sysctl value is non-zero, a zero umutex.m_spincount will
        cause the sysctl value to be used a spin cycle count.
    kern.threads.umtx_max_spins
        the sysctl sets upper limit of spin cycle count.

Tested on: Athlon64 X2 3800+, Dual Xeon 5130
2006-12-20 04:40:39 +00:00
Julian Elischer
ad1e7d285a Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent..
Specifically, remove:
Any reference of the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs.  Libpthread processes will already
do this to some extent and libthr processes already disable it.

Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.

The ULE scheduler compiles again but I have no idea if it works.

The 4bsd scheduler still reqires a little cleaning and some functions that now do
ALMOST nothing will go away, but I thought I'd do that as a separate commit.

Tested by David Xu, and Dan Eischen using libthr and libpthread.
2006-12-06 06:34:57 +00:00
Marcel Moolenaar
a602be7b07 Since printf also has at least one critical section, we need to
initialize pc_curthread. While here, rename early_pcpu to pcpu0
to be conistent (compare thread0 and proc0).
2006-11-18 23:15:25 +00:00
Marcel Moolenaar
77121031e7 Now that printf() needs the PCPU, set it up before we call printf().
Change the pc_pcb field from a pointer to struct pcb to struct pcb
so that sizeof(struct pcb) includes the PCB we use for IPI_STOP.
Statically declare early_pcb so that we don't have to allocate the
PCB for thread0. This way we can setup the PCPU before cninit()
and thus before we use printf().
2006-11-18 21:52:26 +00:00
Marcel Moolenaar
2fd31a5e0d Revert previous commit. PC_CONS_BUFR is not used nor needed by
assembly.
2006-11-18 21:48:13 +00:00
Ruslan Ermilov
26af9ac7d0 Fix a comment. 2006-11-13 06:26:57 +00:00
Alan Cox
44b8bd66f9 Make pmap_enter() responsible for setting PG_WRITEABLE instead
of its caller.  (As a beneficial side-effect, a high-contention
acquisition of the page queues lock in vm_fault() is eliminated.)
2006-11-12 21:48:34 +00:00
Robert Watson
43593547a3 Add missing includes of priv.h. 2006-11-06 17:43:10 +00:00
Robert Watson
acd3428b7d Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
John Birrell
8391a99bf7 Remove the KDTRACE option again because of the complaints about having
it as a default.

For the record, the KDTRACE option caused _no_ additional source files
to be compiled in; certainly no CDDL source files. All it did was to
allow existing BSD licensed kernel files to include one or more CDDL
header files.

By removing this from DEFAULTS, the onus is on a kernel builder to add
the option to the kernel config, possibly by including GENERIC and
customising from there. It means that DTrace won't be a feature
available in FreeBSD by default, which is the way I intended it to be.

Without this option, you can't load the dtrace module (which contains
the dtrace device and the DTrace framework). This is equivalent to
requiring an option in a kernel config before you can load the linux
emulation module, for example.

I think it is a mistake to have DTrace ported to FreeBSD, but not
to have it available to everyone, all the time. The only exception
to this is the companies which distribute systems with FreeBSD embedded.
Those companies will customise their systems anyway. The KDTRACE
option was intended for them, and only them.
2006-11-04 23:50:12 +00:00
John Birrell
1f80cd9398 Build in kernel support for loading DTrace modules by default. This
adds the hooks that DTrace modules register with, and adds a few functions
which have the dtrace_ prefix to allow the DTrace FBT (function boundary
trace) provider to avoid tracing because they are called from the DTtrace
probe context.

Unlike other forms of tracing and debug, DTrace support in the kernel
incurs negligible run-time cost.

I think the only reason why anyone wouldn't want to have kernel support
enabled for DTrace would be due to the license (CDDL) under which DTrace
is released.
2006-11-04 04:58:10 +00:00
Marcel Moolenaar
5910f6cc85 Make sure kern_envp is never NULL. If we don't get a pointer to
the environment from the loader, use the static environment.
2006-11-03 04:06:17 +00:00
John Birrell
3d068827c2 Add a cnputs() function to write a string to the console with
a lock to prevent interspersed strings written from different CPUs
at the same time.

To avoid putting a buffer on the stack or having to malloc one,
space is incorporated in the per-cpu structure. The buffer
size if 128 bytes; chosen because it's the next power of 2 size
up from 80 characters.

String writes to the console are buffered up the end of the line
or until the buffer fills. Then the buffer is flushed to all
console devices.

Existing low level console output via cnputc() is unaffected by
this change. ithread calls to log() are also unaffected to avoid
blocking those threads.

A minor change to the behaviour in a panic situation is that
console output will still be buffered, but won't be written to
a tty as before. This should prevent interspersed panic output
as a number of CPUs panic before we end up single threaded
running ddb.

Reviewed by:	scottl, jhb
MFC after:	2 weeks
2006-11-01 04:54:51 +00:00
John Birrell
3750d1ecad Remove the KSE option now that it's in DEFAULTS on these arches/machines.
The 'nooption' kernel config entry has to be used to turn KSE off now.
This isn't my preferred way of dealing with this, but I'll defer to
scottl's experience with the io/mem kernel option change and the grief
experienced over that.

Submitted by:	scottl@
2006-10-26 22:11:35 +00:00
John Birrell
013d6d8cb4 Add 'options KSE' to the kernel config DEFAULTS on all arches/machines
except sun4v.

This change makes the transition from a default to an option more
transparent and is an attempt to head off all the compliants that are
likely from people who don't read UPDATING, based on experience with
the io/mem change.

Submitted by:	scottl@
2006-10-26 22:05:25 +00:00
John Birrell
8460a577a4 Make KSE a kernel option, turned on by default in all GENERIC
kernel configs except sun4v (which doesn't process signals properly
with KSE).

Reviewed by:	davidxu@
2006-10-26 21:42:22 +00:00
Ruslan Ermilov
837f167eb2 Move "device splash" back to MI NOTES and "files", it's MI. 2006-10-23 13:23:14 +00:00
Marcel Moolenaar
5118cd1e4a o Eliminate nexus_print_resources(). Use resource_list_print_type()
instead.
o  Eliminate nexus_print_all_resources(). Inline the function body
   in nexus_print_child().
2006-10-23 00:38:58 +00:00
Alan Cox
43200cd3ed Eliminate unnecessary PG_BUSY tests. 2006-10-22 04:18:01 +00:00
Dag-Erling Smørgrav
c43ac89acc Move more MD devices and options out of MI NOTES. 2006-10-20 09:52:27 +00:00
Dag-Erling Smørgrav
c276283866 The VGA_DEBUG option only exists on {amd64,i386,ia64}.
Also remove 'device io' from amd64 NOTES; DEFAULTS takes care of it.
2006-10-20 08:56:26 +00:00
Marcel Moolenaar
d9bbf2a8a9 Fix previous revision:
o  day and mday are the same. No need to subtract 1 from mday.
o  Set dow to -1 as clock_ct_to_ts() checks this field and
   returns EINVAL on any day of the week but Sunday.
2006-10-19 00:53:35 +00:00
David Xu
5f641fc0fb o Add keyword volatile for user mutex owner field.
o Fix type consistent problem by using type long for old
  umtx and wait channel.
o Rename casuptr to casuword.
2006-10-17 02:24:47 +00:00
Hiroki Sato
b84baf83b9 Add a newline to the printf().
Spotted by:	Peter Carah <pete@altadena.net>
MFC after:	3 days
2006-10-15 16:52:59 +00:00
Marcel Moolenaar
84ee7f6ba2 Include freebsd32_signal.h now that signal-related definitions are
moved there.

Found by: ia64 tinderbox.
2006-10-06 19:33:44 +00:00
Simon L. B. Nielsen
4517aab293 - Remove SCHED_ULE from GENERIC to better avoid foot-shooting by
unsuspecting users.
- Add a comment in NOTES about experimental status of SCHED_ULE.
- Make warning about experimental status in sched_ule(4) a bit
  stronger.

Suggested and reviewed by:	dougb
Discussed on:			developers
MFC after:			3 days
2006-10-05 20:31:58 +00:00
John Birrell
6825d60738 PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from:
MFC after:
Security:
Move the relocation definitions to the common elf header so that DTrace
can use them on one architecture targeted to a different one.

Add the additional ELF types defines in Sun's "Linker and Libraries"
manual.
2006-10-04 21:37:10 +00:00
Poul-Henning Kamp
60e88b878c Use calendrical calculations from subr_clock.c instead of home-rolled. 2006-10-02 16:32:36 +00:00
Poul-Henning Kamp
b69f71eb29 Second part of a little cleanup in the calendar/timezone/RTC handling.
Split subr_clock.c in two parts (by repo-copy):
   subr_clock.c contains generic RTC and calendaric stuff. etc.
   subr_rtc.c contains the newbus'ified RTC interface.

Centralize the machdep.{adjkerntz,disable_rtc_set,wall_cmos_clock}
sysctls and associated variables into subr_clock.c.  They are
not machine dependent and we have generic code that relies on being
present so they are not even optional.
2006-10-02 15:42:02 +00:00
Poul-Henning Kamp
f645b0b51c First part of a little cleanup in the calendar/timezone/RTC handling.
Move relevant variables to <sys/clock.h> and fix #includes as necessary.

Use libkern's much more time- & spamce-efficient BCD routines.
2006-10-02 12:59:59 +00:00
Ruslan Ermilov
6c9fdda750 Added COMPAT_FREEBSD6 option. 2006-09-26 12:36:34 +00:00
Alexander Kabaev
d9cb97ff9d Use __builtin_va_start instead of __builtin_stdarg_start. GCC4 obsoletes
the former and  __builtin_va_start was present in all GCC version 3.1 and
later.
2006-09-21 01:37:02 +00:00
Robert Watson
ef0d1723db Add audit hooks for ppc, ia64 system call paths.
Reviewed by:	marcel (ia64)
Obtained from:	TrustedBSD Project
MFC after:	3 days
2006-09-16 17:03:02 +00:00
David Xu
66e1c26dba Implement casuword32, compare and set user integer, thank Marcel Moolenarr
who wrote the IA64 version of casuword32.
2006-08-28 02:28:15 +00:00
Alan Cox
b554f899bd Eliminate unused definitions. (They came from NetBSD.)
Discussed with: cognet, grehan, marcel
2006-08-25 23:51:11 +00:00
John Baldwin
7e9f73f3ed First pass at allowing memory to be mapped using cache modes other than
WB (write-back) on x86 via control bits in PTEs and PDEs (including making
use of the PAT MSR).  Changes include:
- A new pmap_mapdev_attr() function for amd64 and i386 which takes an
  additional parameter (relative to pmap_mapdev()) specifying the cache
  mode for this mapping.  Note that on amd64 only WB mappings are done with
  the direct map, all other modes result in a private mapping.
- pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached)
  mappings rather than WB.  Previously we relied on the BIOS setting up
  MTRR's to enforce memio regions being treated as UC.  This might make
  hw.cbb_start_memory unnecessary in some cases now for example.
- A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places
  that used pmap_mapdev() to map non-device memory (such as ACPI tables)
  to do so using WB as before.
- A new pmap_change_attr() function for amd64 and i386 that changes the
  caching mode for a range of KVA.

Reviewed by:	alc
2006-08-11 19:22:57 +00:00
Alan Cox
78985e424a Complete the transition from pmap_page_protect() to pmap_remove_write().
Originally, I had adopted sparc64's name, pmap_clear_write(), for the
function that is now pmap_remove_write().  However, this function is more
like pmap_remove_all() than like pmap_clear_modify() or
pmap_clear_reference(), hence, the name change.

The higher-level rationale behind this change is described in
src/sys/amd64/amd64/pmap.c revision 1.567.  The short version is that I'm
trying to clean up and fix our support for execute access.

Reviewed by: marcel@ (ia64)
2006-08-01 19:06:06 +00:00
Marcel Moolenaar
302981e72a Remove sio(4) and related options from MI files to amd64, i386
and pc98 MD files. Remove nodevice and nooption lines specific
to sio(4) from ia64, powerpc and sparc64 NOTES. There were no
such lines for arm yet.
sio(4) is usable on less than half the platforms, not counting
a future mips platform. Its presence in MI files is therefore
increasingly becoming a burden.
2006-07-29 18:38:54 +00:00
John Baldwin
cb76d9b05c Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is
now back to just being an argument count.
2006-07-28 20:22:58 +00:00
John Baldwin
af5bf12239 Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to
mark system calls as being MPSAFE:
- Stop conditionally acquiring Giant around system call invocations.
- Remove all of the 'M' prefixes from the master system call files.
- Remove support for the 'M' prefix from the script that generates the
  syscall-related files from the master system call files.
- Don't explicitly set SYF_MPSAFE when registering nfssvc.
2006-07-28 19:05:28 +00:00
John Baldwin
22ea1bc57a Unify the checking for lock misbehavior in the various syscall()
implementations and adjust some of the checks while I'm here:
- Add a new check to make sure we don't return from a syscall in a critical
  section.
- Add a new explicit check before userret() to make sure we don't return
  with any locks held.  The advantage here is that we can include the
  syscall number and name in syscall() whereas that info is not available
  in userret().
- Drop the mtx_assert()'s of sched_lock and Giant.  They are replaced by
  the more general checks just added.

MFC after:	2 weeks
2006-07-27 22:32:30 +00:00
John Baldwin
0c5d1dbd43 Add KTR_SYSC tracing to the syscall() implementations that didn't have it
yet.

MFC after:	1 week
2006-07-27 21:25:50 +00:00
John Baldwin
00f1856905 Add missing ptrace(2) system-call stops to various syscall()
implementations.

MFC after:	1 week
2006-07-27 19:50:16 +00:00
Marcel Moolenaar
4b8d8ccc23 Move default GEOM classes from files.ia64, where they were marked
standard, to the DEFAULTS file.
2006-07-17 20:02:51 +00:00
John Baldwin
19e9205a23 Simplify the pager support in DDB. Allowing different db commands to
install custom pager functions didn't actually happen in practice (they
all just used the simple pager and passed in a local quit pointer).  So,
just hardcode the simple pager as the only pager and make it set a global
db_pager_quit flag that db commands can check when the user hits 'q' (or a
suitable variant) at the pager prompt.  Also, now that it's easy to do so,
enable paging by default for all ddb commands.  Any command that wishes to
honor the quit flag can do so by checking db_pager_quit.  Note that the
pager can also be effectively disabled by setting $lines to 0.

Other fixes:
- 'show idt' on i386 and pc98 now actually checks the quit flag and
  terminates early.
- 'show intr' now actually checks the quit flag and terminates early.
2006-07-12 21:22:44 +00:00
Matt Jacob
086ba9f74f Make the firmware assist driver resident in
preparation for isp using it.
2006-07-09 16:40:31 +00:00
Bruce Evans
9366bb574f Fixed FP_R*. fp{get_set}round() apparently never worked on ia64, since
the alpha values were used and are quite different.

Fixed some style bugs by copying from the i386 version where it is better.
2006-07-05 06:10:21 +00:00
Marcel Moolenaar
559adb10ad Partial support for branch long emulation. This only emulates the
branch long jump and not the branch long call. Support for that is
forthcoming.
2006-06-29 19:59:18 +00:00
Alan Cox
9b123ca12a Make several changes to pmap_enter_quick_locked():
1. Make the caller responsible for performing pmap_install().  This reduces
the number of times that pmap_install() is performed by
pmap_enter_object() from twice per page to twice overall.

2. Don't block if pmap_find_pte() is unable to allocate a PTE.  If it did
block, then it might wind up mapping a cache page.  Specifically, if
pmap_enter_quick_locked() slept when called from pmap_enter_object(), the
page daemon could change an active or inactive page into a cache page just
before it was to be mapped.

3. Bail out of pmap_enter_quick_locked() if pv entries aren't plentiful.
In other words, don't force the allocation of a pv entry if they aren't
readily available.

Reviewed by: marcel@
2006-06-27 05:05:05 +00:00
Sergey Babkin
d81175c738 Backed out the change by request from rwatson.
PR:		kern/14584
2006-06-26 22:03:22 +00:00
Sergey Babkin
7a799f1ef0 The common UID/GID space implementation. It has been discussed on -arch
in 1999, and there are changes to the sysctl names compared to PR,
according to that discussion. The description is in sys/conf/NOTES.
Lines in the GENERIC files are added in commented-out form.
I'll attach the test script I've used to PR.

PR:		kern/14584
Submitted by:	babkin
2006-06-25 18:37:44 +00:00
Marcel Moolenaar
0705941372 Update to SDM 2.2:
o  Add tf (test feature) instruction,
o  Add vmsw (VM switch) instruction.

While here, update copyright.

MFC after: 1 week
2006-06-24 19:21:11 +00:00
Marcel Moolenaar
a4b606cd18 Sync up with SDM 2.1:
o  Add nop/hint formats F16, I18, M48 and X5,
o  Add format M47 for ptc.e,
o  Add hint instruction,
o  Fix decoding of cmp8xchg16.
2006-06-24 01:19:52 +00:00
Marcel Moolenaar
99ab1812b9 Identify the cual-core Montecito.
MFC after: 3 days
2006-06-22 00:56:58 +00:00
Alexander Leidinger
28a3ae7f88 Remove COMPAT_43 from GENERIC (and other kernel configs). For amd64 there's
an explicit comment that it's needed for the linuxolator. This is not the
case anymore. For all other architectures there was only a "KEEP THIS".
I'm (and other people too) running a COMPAT_43-less kernel since it's not
necessary anymore for the linuxolator. Roman is running such a kernel for a
for longer time. No problems so far. And I doubt other (newer than ia32
or alpha) architectures really depend on it.

This may result in a small performance increase for some workloads.

If the removal of COMPAT_43 results in a not working program, please
recompile it and all dependencies and try again before reporting a
problem.

The only place where COMPAT_43 is needed (as in: does not compile without
it) is in the (outdated/not usable since too old) svr4 code.

Note: this does not remove the COMPAT_43TTY option.

Nagging by:	rdivacky
2006-06-15 19:58:53 +00:00
Stephan Uphoff
2053c12705 Remove mpte optimization from pmap_enter_quick().
There is a race with the current locking scheme and removing
it should have no measurable performance impact.
This fixes page faults leading to panics in pmap_enter_quick_locked()
on amd64/i386.

Reviewed by: alc,jhb,peter,ps
2006-06-15 01:01:06 +00:00
Warner Losh
78878cef94 Add the ability to subset the devices that UART pulls in. This allows
the arm to compile without all the extras that don't appear, at least
not in the flavors of ARM I deal with.  This helps us save about 100k.

If I've botched the available devices on a platform, please let me
know and I'll correct ASAP.
2006-06-12 04:21:50 +00:00
Alan Cox
ce142d9ec0 Introduce the function pmap_enter_object(). It maps a sequence of resident
pages from the same object.  Use it in vm_map_pmap_enter() to reduce the
locking overhead of premapping objects.

Reviewed by: tegge@
2006-06-05 20:35:27 +00:00
Warner Losh
35988e2d46 EISA bus ia64 systems don't exist in reality. I'm told they may exist in
theory, but that it was OK to remove from NOTES.

OK'd by: marcel
2006-06-02 04:46:26 +00:00
Alan Cox
98c8f52baf Correct a syntax error in the previous revision. 2006-06-01 19:23:45 +00:00
Mike Silbersack
f25d341cfb After much discussion with mjacob and scottl, change bus_dmamem_alloc so
that it just warns the user with a printf when it misaligns a piece
of memory that was requested through a busdma tag.

Some drivers (such as mpt, and probably others) were asking for alignments
that could not be satisfied, but as far as driver operation was concerned,
that did not matter.  In the theory that other drivers will fall into
this same category, we agreed that panicing or making the allocation
fail will cause more hardship than is necessary.  The printf should
be sufficient motivation to get the driver glitch fixed.
2006-06-01 04:49:29 +00:00
Matt Jacob
87ba3ce6f4 Since it's to all intents and purposes identical
code to amd64 && i386, match the recent changes
to bus_dmamem_alloc here.
2006-05-31 00:38:53 +00:00
Marcel Moolenaar
4f0289b073 Unbreak after previous commit. While here, improve function naming
consistency by s/ssc/ssc_/g.
2006-05-27 17:52:08 +00:00
Poul-Henning Kamp
05c3592e13 Update to new console api. 2006-05-26 18:25:34 +00:00
Marius Strobl
8df071afd9 Add le(4). I could actually only test it on alpha, i386 and sparc64 but
given that this includes the more problematic platforms I see no reason
why it shouldn't also work on amd64 and ia64.
2006-05-17 20:45:45 +00:00
Poul-Henning Kamp
c40da00ca3 Since DELAY() was moved, most <machine/clock.h> #includes have been
unnecessary.
2006-05-16 14:37:58 +00:00
Marcel Moolenaar
d793542b06 Fix braino in previous commit: Don't redefine OID_AUTO to something
not equal to -1, or at all for that matter.
2006-05-11 22:49:31 +00:00
Poul-Henning Kamp
99ab8292c7 Remove more straggling CPU_ macro references 2006-05-11 17:53:26 +00:00
Poul-Henning Kamp
5405ab4889 Clean out sysctl machdep.* related defines.
The cmos clock related stuff should really be in MI code.
2006-05-11 17:29:25 +00:00
Marcel Moolenaar
64220a7e28 Rewrite of puc(4). Significant changes are:
o  Properly use rman(9) to manage resources. This eliminates the
   need to puc-specific hacks to rman. It also allows devinfo(8)
   to be used to find out the specific assignment of resources to
   serial/parallel ports.
o  Compress the PCI device "database" by optimizing for the common
   case and to use a procedural interface to handle the exceptions.
   The procedural interface also generalizes the need to setup the
   hardware (program chipsets, program clock frequencies).
o  Eliminate the need for PUC_FASTINTR. Serdev devices are fast by
   default and non-serdev devices are handled by the bus.
o  Use the serdev I/F to collect interrupt status and to handle
   interrupts across ports in priority order.
o  Sync the PCI device configuration to include devices found in
   NetBSD and not yet merged to FreeBSD.
o  Add support for Quatech 2, 4 and 8 port UARTs.
o  Add support for a couple dozen Timedia serial cards as found
   in Linux.
2006-04-28 21:21:53 +00:00
Marcel Moolenaar
c4830f0477 In nexus_teardown_intr(), actually remove the handler.
MFC after: 1 day
2006-04-21 16:12:28 +00:00
Warner Losh
80837e3924 Set the rid of the resource obtained from rman_reserve_resource. 2006-04-20 04:18:30 +00:00
Alan Cox
826c207263 Retire pmap_track_modified(). We no longer need it because we do not
create managed mappings within the clean submap.  To prevent regressions,
add assertions blocking the creation of managed mappings within the clean
submap.

Reviewed by: tegge
2006-04-12 04:22:52 +00:00
Marcel Moolenaar
d6d31e0900 Improve handling of IPI_STOP:
o  use atomic operations to fiddle with stopped_cpus and started_cpus.
o  disable interrupts while we're waiting to be started.
o  remove logic relating to cpustop_restartfunc as it's not used.
2006-04-03 23:56:40 +00:00
Marcel Moolenaar
bfcdefd8aa Eliminate HAVE_STOPPEDPCBS. On ia64 the PCPU holds a pointer to the
PCB in which the context of stopped CPUs is stored. To access this
PCB from KDB, we introduce a new define, called KDB_STOPPEDPCB. The
definition, when present, lives in <machine/kdb.h> and abstracts
where MD code saves the context. Define KDB_STOPPEDPCB on i386,
amd64, alpha and sparc64 in accordance to previous code.
2006-04-03 22:51:47 +00:00
Peter Wemm
b9eee07e36 Remove the unused sva and eva arguments from pmap_remove_pages(). 2006-04-03 21:16:10 +00:00
John Baldwin
06ad42b2f7 Close some races between procfs/ptrace and exit(2):
- Reorder the events in exit(2) slightly so that we trigger the S_EXIT
  stop event earlier.  After we have signalled that, we set P_WEXIT and
  then wait for any processes with a hold on the vmspace via PHOLD to
  release it.  PHOLD now KASSERT()'s that P_WEXIT is clear when it is
  invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops
  to zero.
- Change proc_rwmem() to require that the processing read from has its
  vmspace held via PHOLD by the caller and get rid of all the junk to
  screw around with the vmspace reference count as we no longer need it.
- In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it
  doesn't exist.
- Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers
  FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem()
  to clear an earlier single-step simualted via a breakpoint).  We only
  do one to avoid races.  Also, by making the EINVAL error for unknown
  requests be part of the default: case in the switch, the various
  switch cases can now just break out to return which removes a _lot_ of
  duplicated PRELE and proc unlocks, etc.  Also, it fixes at least one bug
  where a LWP ptrace command could return EINVAL with the proc lock still
  held.
- Changed the locking for ptrace_single_step(), ptrace_set_pc(), and
  ptrace_clear_single_step() to always be called with the proc lock
  held (it was a mixed bag previously).  Alpha and arm have to drop
  the lock while the mess around with breakpoints, but other archs
  avoid extra lock release/acquires in ptrace().  I did have to fix a
  couple of other consumers in kern_kse and a few other places to
  hold the proc lock and PHOLD.

Tested by:	ps (1 mostly, but some bits of 2-4 as well)
MFC after:	1 week
2006-02-22 18:57:50 +00:00
John Baldwin
414c4ab4c5 Fix the hw.realmem sysctl. The global realmem variable is a count of
pages, not a count of bytes.  The sysctl handler for hw.realmem already
uses ctob() to convert realmem from pages to bytes.  Thus, on archs that
were storing a byte count in the realmem variable, hw.realmem was inflated.

Reported by:	Valerio daelli valerio dot daelli at gmail dot com (alpha)
MFC after:	3 days
2006-02-14 14:50:11 +00:00
Marcel Moolenaar
e13946c127 Correct the spinlock nesting of the idle thread of the APs before we
save the MCA state of the AP. Saving the MCA state of the AP requires
us to allocate memory, which uses sleep locks.
Now that we correct the spinlock nesting of the AP without having
schedlock, avoid calling spinlock_exit(). Instead call critical_exit()
and manually clear the MD spinlock count.

MFC after: 3 days
2006-02-11 19:55:18 +00:00
Poul-Henning Kamp
eb2da9a51f Simplify system time accounting for profiling.
Rename struct thread's td_sticks to td_pticks, we will need the
other name for more appropriately named use shortly.  Reduce it
from uint64_t to u_int.

Clear td_pticks whenever we enter the kernel instead of recording
its value as reference for userret().  Use the absolute value of
td->pticks in userret() and eliminate third argument.
2006-02-08 08:09:17 +00:00
Poul-Henning Kamp
5b1a8eb397 Modify the way we account for CPU time spent (step 1)
Keep track of time spent by the cpu in various contexts in units of
"cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t
only when somebody wants to inspect the numbers.

For now "cputicks" are still derived from the current timecounter
and therefore things should by definition remain sensible also on
SMP machines.  (The main reason for this first milestone commit is
to verify that hypothesis.)

On slower machines, the avoided multiplications to normalize timestams
at every context switch, comes out as a 5-7% better score on the
unixbench/context1 microbenchmark.  On more modern hardware no change
in performance is seen.
2006-02-07 21:22:02 +00:00
Marcel Moolenaar
f9d7b4d515 Allocate memory for the MCA state information with M_NOWAIT. We can
get a MCA event at any moment and it may not be safe to sleep.

MFC after: 3 days
2006-02-07 02:02:14 +00:00
Marcel Moolenaar
8c4f6925c4 Remove devices acpi & mem, as they are in defaults already. 2006-02-02 23:41:08 +00:00
Marcel Moolenaar
05157fa0a1 s/DT_IA64_PLT_RESERVE/DT_IA_64_PLT_RESERVE/ 2006-01-28 17:58:22 +00:00
Marcel Moolenaar
7ee3d29ed6 o Add missing relocations.
o  Minor white-space fixups.
2006-01-18 01:45:57 +00:00
Marcel Moolenaar
853b7411b6 s/R_IA64_/R_IA_64_/g as per the ia64 psABI. 2006-01-17 21:03:22 +00:00
Poul-Henning Kamp
d3e64681d6 Move the old BSD4.3 tty compatibility from (!BURN_BRIDGES && COMPAT_43)
to COMPAT_43TTY.

Add COMPAT_43TTY to NOTES and */conf/GENERIC

Compile tty_compat.c only under the new option.

Spit out
	#warning "Old BSD tty API used, please upgrade."
if ioctl_compat.h gets #included from userland.
2006-01-10 09:19:10 +00:00
Warner Losh
d5e61c97a6 By popular demand, move __HAVE_ACPI and __PCI_REROUTE_INTERRUPT into
param.h.  Per request, I've placed these just after the
_NO_NAMESPACE_POLLUTION ifndef.  I've not renamed anything yet, but
may since we don't need the __.

Submitted by: bde, jhb, scottl, many others.
2006-01-09 06:05:57 +00:00
Poul-Henning Kamp
8c92c2096d Use ttyalloc() instead of ttymalloc() 2006-01-04 09:46:20 +00:00
Warner Losh
501755f4f6 Define __HAVE_ACPI and/or __PCI_REROUTE_INTERRUPT, as appropriate for
each platform.  These will be used in the pci code in preference to
the complicated #ifdefs we have there now.
2006-01-01 20:59:28 +00:00
Alexander Leidinger
ef39c05baa MI changes:
- provide an interface (macros) to the page coloring part of the VM system,
   this allows to try different coloring algorithms without the need to
   touch every file [1]
 - make the page queue tuning values readable: sysctl vm.stats.pagequeue
 - autotuning of the page coloring values based upon the cache size instead
   of options in the kernel config (disabling of the page coloring as a
   kernel option is still possible)

MD changes:
 - detection of the cache size: only IA32 and AMD64 (untested) contains
   cache size detection code, every other arch just comes with a dummy
   function (this results in the use of default values like it was the
   case without the autotuning of the page coloring)
 - print some more info on Intel CPU's (like we do on AMD and Transmeta
   CPU's)

Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue"
and report if the cache* values are zero (= bug in the cache detection code)
or not.

Based upon work by:	Chad David <davidc@acns.ab.ca> [1]
Reviewed by:		alc, arch (in 2004)
Discussed with:		alc, Chad David, arch (in 2004)
2005-12-31 14:39:20 +00:00
Maxim Sobolev
900b28f9f6 Remove kern.elf32.can_exec_dyn sysctl. Instead extend Brandinfo structure
with flags bitfield and set BI_CAN_EXEC_DYN flag for all brands that usually
allow executing elf dynamic binaries (aka shared libraries). When it is
requested to execute ET_DYN elf image check if this flag is on after we
know the elf brand allowing execution if so.

PR:		kern/87615
Submitted by:	Marcin Koziej <creep@desk.pl>
2005-12-26 21:23:57 +00:00
John Baldwin
b439e431bf Tweak how the MD code calls the fooclock() methods some. Instead of
passing a pointer to an opaque clockframe structure and requiring the
MD code to supply CLKF_FOO() macros to extract needed values out of the
opaque structure, just pass the needed values directly.  In practice this
means passing the pair (usermode, pc) to hardclock() and profclock() and
passing the boolean (usermode) to hardclock_cpu() and hardclock_process().
Other details:
- Axe clockframe and CLKF_FOO() macros on all architectures.  Basically,
  all the archs were taking a trapframe and converting it into a clockframe
  one way or another.  Now they can just extract the PC and usermode values
  directly out of the trapframe and pass it to fooclock().
- Renamed hardclock_process() to hardclock_cpu() as the latter is more
  accurate.
- On Alpha, we now run profclock() at hz (profhz == hz) rather than at
  the slower stathz.
- On Alpha, for the TurboLaser machines that don't have an 8254
  timecounter, call hardclock() directly.  This removes an extra
  conditional check from every clock interrupt on Alpha on the BSP.
  There is probably room for even further pruning here by changing Alpha
  to use the simplified timecounter we use on x86 with the lapic timer
  since we don't get interrupts from the 8254 on Alpha anyway.
- On x86, clkintr() shouldn't ever be called now unless using_lapic_timer
  is false, so add a KASSERT() to that affect and remove a condition
  to slightly optimize the non-lapic case.
- Change prototypeof  arm_handler_execute() so that it's first arg is a
  trapframe pointer rather than a void pointer for clarity.
- Use KCOUNT macro in profclock() to lookup the kernel profiling bucket.

Tested on:	alpha, amd64, arm, i386, ia64, sparc64
Reviewed by:	bde (mostly)
2005-12-22 22:16:09 +00:00
Marcel Moolenaar
757686b115 Make our ELF64 type definitions match standards. In particular this
means:
o  Remove Elf64_Quarter,
o  Redefine Elf64_Half to be 16-bit,
o  Redefine Elf64_Word to be 32-bit,
o  Add Elf64_Xword and Elf64_Sxword for 64-bit entities,
o  Use Elf_Size in MI code to abstract the difference between
   Elf32_Word and Elf64_Word.
o  Add Elf_Ssize as the signed counterpart of Elf_Size.

MFC after: 2 weeks
2005-12-18 04:52:37 +00:00
John Baldwin
696effb697 - Cleanup whitespace and extra ()s in vtophys() macros.
- Move vtophys() macros next to vtopte() where vtopte() exists to match
  comments above vtopte().
- Remove references to the alternate address space in the comment above
  vtopte().  amd64 never had the alternate address space, and i386 lost it
  prior to PAE support being added.
- s/entires/entries/ in comments.

Reviewed by:	alc
2005-12-06 21:09:01 +00:00
Ruslan Ermilov
224d140293 Drop _MACHINE_ARCH and _MACHINE defines (not to be confused with
MACHINE_ARCH and MACHINE).  Their purpose was to be able to test
in cpp(1), but cpp(1) only understands integer type expressions.
Using such unsupported expressions introduced a number of subtle
bugs, which were discovered by compiling with -Wundef.
2005-12-06 13:27:21 +00:00
Ruslan Ermilov
44e09d2fa2 Fix -Wundef warnings from compiling GENERIC and LINT kernels of
all architectures.
2005-12-06 11:19:37 +00:00
Ruslan Ermilov
6646524f34 - Allow duplicate "machine" directives with the same arguments.
- Move existing "machine" directives to DEFAULTS.
2005-11-27 23:17:00 +00:00
John Baldwin
7417e80b4e Don't enable PUC_FASTINTR by default in the source. Instead, enable it
via the DEFAULTS kernel configs.  This allows folks to turn it that option
off in the kernel configs if desired without having to hack the source.
This is especially useful since PUC_FASTINTR hangs the kernel boot on my
ultra60 which has two uart(4) devices hung off of a puc(4) device.

I did not enable PUC_FASTINTR by default on powerpc since powerpc does not
currently allow sharing of INTR_FAST with non-INTR_FAST like the other
archs.
2005-11-21 20:22:35 +00:00
John Baldwin
d0750fb9b0 Create DEFAULTS files for alpha, ia64, powerpc, and sparc64 and move
'device mem' over from GENERIC to DEFAULTS to be consistent with i386 and
amd64.  Additionally, on ia64 enable ACPI by default since ia64 requires
acpi.
2005-11-21 20:17:46 +00:00
Alan Cox
97a0c226d6 Eliminate pmap_init2(). It's no longer used. 2005-11-20 06:09:49 +00:00
Alan Cox
65336314cf In get_pv_entry() use PMAP_LOCK() instead of PMAP_TRYLOCK() when deadlock
cannot possibly occur.
2005-11-13 02:17:05 +00:00
Alan Cox
7a35a21e7b Reimplement the reclamation of PV entries. Specifically, perform
reclamation synchronously from get_pv_entry() instead of
asynchronously as part of the page daemon.  Additionally, limit the
reclamation to inactive pages unless allocation from the PV entry zone
or reclamation from the inactive queue fails.  Previously, reclamation
destroyed mappings to both inactive and active pages.  get_pv_entry()
still, however, wakes up the page daemon when reclamation occurs.  The
reason being that the page daemon may move some pages from the active
queue to the inactive queue, making some new pages available to future
reclamations.

Print the "reclaiming PV entries" message at most once per minute, but
don't stop printing it after the fifth time.  This way, we do not give
the impression that the problem has gone away.

Reviewed by: tegge
2005-11-09 08:19:21 +00:00
Alan Cox
e9cb1037da Begin and end the initialization of pvzone in pmap_init().
Previously, pvzone's initialization was split between pmap_init() and
pmap_init2().  This split initialization was the underlying cause of
some UMA panics during initialization.  Specifically, if the UMA boot
pages was exhausted before the pvzone was fully initialized, then UMA,
through no fault of its own, would use an inappropriate back-end
allocator leading to a panic.  (Previously, as a workaround, we have
increased the UMA boot pages.)  Fortunately, there is no longer any
reason that pvzone's initialization cannot be completed in
pmap_init().

Eliminate a check for whether pv_entry_high_water has been initialized
or not from get_pv_entry().  Since pvzone's initialization is
completed in pmap_init(), this check is no longer needed.

Use cnt.v_page_count, the actual count of available physical pages,
instead of vm_page_array_size to compute the maximum number of pv
entries.

Introduce the vm.pmap.pv_entries tunable on alpha and ia64.

Eliminate some unnecessary white space.

Discussed with: tegge (item #1)
Tested by: marcel (ia64)
2005-11-04 18:03:24 +00:00
Alan Cox
fcf67b0496 Remove the remaining spl*() calls. Add some assertions. Eliminate some
excessive white space.
2005-11-03 07:51:02 +00:00
Robert Watson
5bb84bc84b Normalize a significant number of kernel malloc type names:
- Prefer '_' to ' ', as it results in more easily parsed results in
  memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
  as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
  memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
  attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion.  Similar changes are required for UMA zone names.
2005-10-31 15:41:29 +00:00
Marcel Moolenaar
6739824a02 Remove a stray return statement in the interrupt dispatch function
that caused a premature exit after calling a fast interrupt handler
and bypassing a much needed critical_exit() and the scheduling of
the interrupt thread for non-fast handlers. In short: unbreak :-)
2005-10-30 17:23:01 +00:00
John Baldwin
e0f66ef861 Reorganize the interrupt handling code a bit to make a few things cleaner
and increase flexibility to allow various different approaches to be tried
in the future.
- Split struct ithd up into two pieces.  struct intr_event holds the list
  of interrupt handlers associated with interrupt sources.
  struct intr_thread contains the data relative to an interrupt thread.
  Currently we still provide a 1:1 relationship of events to threads
  with the exception that events only have an associated thread if there
  is at least one threaded interrupt handler attached to the event.  This
  means that on x86 we no longer have 4 bazillion interrupt threads with
  no handlers.  It also means that interrupt events with only INTR_FAST
  handlers no longer have an associated thread either.
- Renamed struct intrhand to struct intr_handler to follow the struct
  intr_foo naming convention.  This did require renaming the powerpc
  MD struct intr_handler to struct ppc_intr_handler.
- INTR_FAST no longer implies INTR_EXCL on all architectures except for
  powerpc.  This means that multiple INTR_FAST handlers can attach to the
  same interrupt and that INTR_FAST and non-INTR_FAST handlers can attach
  to the same interrupt.  Sharing INTR_FAST handlers may not always be
  desirable, but having sio(4) and uhci(4) fight over an IRQ isn't fun
  either.  Drivers can always still use INTR_EXCL to ask for an interrupt
  exclusively.  The way this sharing works is that when an interrupt
  comes in, all the INTR_FAST handlers are executed first, and if any
  threaded handlers exist, the interrupt thread is scheduled afterwards.
  This type of layout also makes it possible to investigate using interrupt
  filters ala OS X where the filter determines whether or not its companion
  threaded handler should run.
- Aside from the INTR_FAST changes above, the impact on MD interrupt code
  is mostly just 's/ithread/intr_event/'.
- A new MI ddb command 'show intrs' walks the list of interrupt events
  dumping their state.  It also has a '/v' verbose switch which dumps
  info about all of the handlers attached to each event.
- We currently don't destroy an interrupt thread when the last threaded
  handler is removed because it would suck for things like ppbus(8)'s
  braindead behavior.  The code is present, though, it is just under
  #if 0 for now.
- Move the code to actually execute the threaded handlers for an interrrupt
  event into a separate function so that ithread_loop() becomes more
  readable.  Previously this code was all in the middle of ithread_loop()
  and indented halfway across the screen.
- Made struct intr_thread private to kern_intr.c and replaced td_ithd
  with a thread private flag TDP_ITHREAD.
- In statclock, check curthread against idlethread directly rather than
  curthread's proc against idlethread's proc. (Not really related to intr
  changes)

Tested on:	alpha, amd64, i386, sparc64
Tested on:	arm, ia64 (older version of patch by cognet and marcel)
2005-10-25 19:48:48 +00:00
Ade Lovett
8d228514fb Specifically panic() in the case where pmap_insert_entry() fails to
get a new pv under high system load where the available pv entries
have been exhausted before the pagedaemon has a chance to wake up
to reclaim some.

Prior to this, the NULL pointer dereference ended up causing
secondary panics with rather less than useful resulting tracebacks.

Reviewed by:	alc, jhb
MFC after:	1 week
2005-10-21 19:42:43 +00:00
Poul-Henning Kamp
7423b2b40c Make ttyconsolemode() call ttsetwater() so that drivers don't have to. 2005-10-16 20:58:22 +00:00
David Xu
9104847f21 1. Change prototype of trapsignal and sendsig to use ksiginfo_t *, most
changes in MD code are trivial, before this change, trapsignal and
   sendsig use discrete parameters, now they uses member fields of
   ksiginfo_t structure. For sendsig, this change allows us to pass
   POSIX realtime signal value to user code.

2. Remove cpu_thread_siginfo, it is no longer needed because we now always
   generate ksiginfo_t data and feed it to libpthread.

3. Add p_sigqueue to proc structure to hold shared signals which were
   blocked by all threads in the proc.

4. Add td_sigqueue to thread structure to hold all signals delivered to
   thread.

5. i386 and amd64 now return POSIX standard si_code, other arches will
   be fixed.

6. In this sigqueue implementation, pending signal set is kept as before,
   an extra siginfo list holds additional siginfo_t data for signals.
   kernel code uses psignal() still behavior as before, it won't be failed
   even under memory pressure, only exception is when deleting a signal,
   we should call sigqueue_delete to remove signal from sigqueue but
   not SIGDELSET. Current there is no kernel code will deliver a signal
   with additional data, so kernel should be as stable as before,
   a ksiginfo can carry more information, for example, allow signal to
   be delivered but throw away siginfo data if memory is not enough.
   SIGKILL and SIGSTOP have fast path in sigqueue_add, because they can
   not be caught or masked.
   The sigqueue() syscall allows user code to queue a signal to target
   process, if resource is unavailable, EAGAIN will be returned as
   specification said.
   Just before thread exits, signal queue memory will be freed by
   sigqueue_flush.
   Current, all signals are allowed to be queued, not only realtime signals.

Earlier patch reviewed by: jhb, deischen
Tested on: i386, amd64
2005-10-14 12:43:47 +00:00
Poul-Henning Kamp
2628fdabad Eliminate need for __RMAN_RESOURCE_VISIBLE
Reviewed by:	marcel@
2005-10-06 17:39:18 +00:00
Robert Watson
5f419982c2 Back out alpha/alpha/trap.c:1.124, osf1_ioctl.c:1.14, osf1_misc.c:1.57,
osf1_signal.c:1.41, amd64/amd64/trap.c:1.291, linux_socket.c:1.60,
svr4_fcntl.c:1.36, svr4_ioctl.c:1.23, svr4_ipc.c:1.18, svr4_misc.c:1.81,
svr4_signal.c:1.34, svr4_stat.c:1.21, svr4_stream.c:1.55,
svr4_termios.c:1.13, svr4_ttold.c:1.15, svr4_util.h:1.10,
ext2_alloc.c:1.43, i386/i386/trap.c:1.279, vm86.c:1.58,
unaligned.c:1.12, imgact_elf.c:1.164, ffs_alloc.c:1.133:

Now that Giant is acquired in uprintf() and tprintf(), the caller no
longer leads to acquire Giant unless it also holds another mutex that
would generate a lock order reversal when calling into these functions.
Specifically not backed out is the acquisition of Giant in nfs_socket.c
and rpcclnt.c, where local mutexes are held and would otherwise violate
the lock order with Giant.

This aligns this code more with the eventual locking of ttys.

Suggested by:	bde
2005-09-28 07:03:03 +00:00
Peter Wemm
add121a476 Implement 32 bit getcontext/setcontext/swapcontext on amd64. I've added
stubs for ia64 to keep it compiling.  These are used by 32 bit apps such
as gdb.
2005-09-27 18:04:20 +00:00
John Baldwin
3c2bc2bf26 Add a new atomic_fetchadd() primitive that atomically adds a value to a
variable and returns the previous value of the variable.

Tested on:	i386, alpha, sparc64, arm (cognet)
Reviewed by:	arch@
Submitted by:	cognet (arm)
MFC after:	1 week
2005-09-27 17:39:11 +00:00
Robert Watson
84d2b7df26 Add GIANT_REQUIRED and WITNESS sleep warnings to uprintf() and tprintf(),
as they both interact with the tty code (!MPSAFE) and may sleep if the
tty buffer is full (per comment).

Modify all consumers of uprintf() and tprintf() to hold Giant around
calls into these functions.  In most cases, this means adding an
acquisition of Giant immediately around the function.  In some cases
(nfs_timer()), it means acquiring Giant higher up in the callout.

With these changes, UFS no longer panics on SMP when either blocks are
exhausted or inodes are exhausted under load due to races in the tty
code when running without Giant.

NB: Some reduction in calls to uprintf() in the svr4 code is probably
desirable.

NB: In the case of nfs_timer(), calling uprintf() while holding a mutex,
or even in a callout at all, is a bad idea, and will generate warnings
and potential upset.  This needs to be fixed, but was a problem before
this change.

NB: uprintf()/tprintf() sleeping is generally a bad ideas, as is having
non-MPSAFE tty code.

MFC after:	1 week
2005-09-19 16:51:43 +00:00
Christian S.J. Peron
33cdc78d01 Introduce a kernel config for the Mandatory Access Control framework.
This kernel config briefly describes some of the major MAC policies
available on FreeBSD. The hope is that this will raise the awareness
about MAC and get more people interested.

Discussed with:	scottl
2005-09-18 03:15:36 +00:00
Alan Cox
ac31d065a6 Eliminate unused definitions. 2005-09-11 20:51:15 +00:00
David E. O'Brien
2a191126de Canonize the include of acpi.h. 2005-09-11 18:39:03 +00:00
Marcel Moolenaar
8115693121 Merge db_interface.c and db_trace.c into db_machdep.c. 2005-09-10 03:18:51 +00:00
Marcel Moolenaar
216e80c2ba Move the prototypes of db_md_set_watchpoint(), db_md_clr_watchpoint()
and db_md_list_watchpoints() to ddb/ddb.h.
2005-09-10 03:01:25 +00:00
Marcel Moolenaar
464d16ddf0 Move the ia32_sigcode structure from ia32_sigtramp.c to ia32_signal.c.
It's a bit excessive to have it in a file of its own.
2005-09-10 02:12:49 +00:00
Marcel Moolenaar
0522a40412 Remove redundant $FreeBSD$ 2005-09-10 01:13:33 +00:00
Marcel Moolenaar
87a59250b5 Change the High FP lock from a sleep lock to a spin lock. We can
take the lock from interrupt context, which causes an implicit
lock order reversal. We've been using the lock carefully enough
that making it a spin lock should not be harmful.
2005-09-09 19:18:36 +00:00
Marcel Moolenaar
cca2e0f1cc Milestone: enable SMP by default. 2005-09-05 21:36:28 +00:00
Marcel Moolenaar
ab870058d7 o In pmap_remove_pte: always invalidate the page. Previously the page
was not invalidated if the PTE was not actually being removed.  In
   an UP kernel this didn't cause problems, because the new mapping
   would preempt the old one. In an SMP kernel this could lead to the
   use of stale translations when processes move between CPUs at the
   "right" moment.  This fixes the last of the obvious SMP problems
   and it should be safe to enable SMP by default now.
o  In pmap_remove_pte: minor code refactoring to avoid duplication.
o  Test all PTE pointers against NULL. Don't use implicit boolean
   tests.
2005-09-05 21:32:02 +00:00
Marcel Moolenaar
5280c8c2ab o s/vhpt_size/pmap_vhpt_log2size/g
o  s/vhpt_base/pmap_vhpt_base/g
o  s/vhpt_bucket/pmap_vhpt_bucket/g
o  Declare the above in <machine/pmap.h>
o  Move the vm.stats.vhpt.* sysctls to machdep.vhpt.*
o  Create a tunable machdep.vhpt.log2size, with corresponding sysctl.
   The tunable allows the user to specify the VHPT size from the loader.
o  Don't keep track of the number of PTEs in the VHPT. Calculate the
   population when necessary by iterating the buckets and summing up
   the length of the buckets.
o  Don't perform the tpa instruction with a bucket lock held. The
   instruction can (theoretically) fault and locking is not needed.
2005-09-03 23:53:50 +00:00
Marcel Moolenaar
43be3aac7a Fix collision chain termination checks. The result of IA64_PHYS_TO_RR7
is never 0, so one cannot test for a NULL pointer after a physical
address is translated into a virtual pointer with said macro. Instead,
keep the physical address around and test it against 0. Note that
this obviously implies that a PTE can never be allocated at physical
address 0. This isn't exactly guaranteed, but hasn't been a problem
so far. We test the physical address against 0 for as long as the ia64
port exists...
2005-09-03 19:43:15 +00:00
Alan Cox
ba8bca610c Pass a value of type vm_prot_t to pmap_enter_quick() so that it determine
whether the mapping should permit execute access.
2005-09-03 18:20:20 +00:00
Stefan Farfeleder
a1f85d7f83 Move MINSIGSTKSZ from <machine/signal.h> to <machine/_limits.h> and rename
it to __MINSIGSTKSZ.  Define MINSIGSTKSZ in <sys/signal.h>.

This is done in order to use MINSIGSTKSZ for the macro PTHREAD_STACK_MIN
in <pthread.h> (soon <limits.h>) without having to include the whole
<sys/signal.h> header.

Discussed with:		bde
2005-08-20 16:44:41 +00:00
Marcel Moolenaar
d41a7ed490 Remove the execute permission for stacks. 2005-08-14 23:17:59 +00:00
Marcel Moolenaar
a812f8435a o s/pmap_lpte_/pmap_/g
o  Remove pmap_is_referenced(). It was already compiled-out.
2005-08-13 21:16:38 +00:00
Marcel Moolenaar
86257f240a Fix the problem with the IPI for the lazy context switching of the
high FP registers. It was not that the IPI got lost due to the
perceived unreliability of the IPI delivery, but rather that the
IPI was not assigned a vector (ugh). Sending a 0 vector to a CPU
results in a stray external interrupt.
Add a KASSERT to ipi_send() to catch this. The initialization of
the IPIs could be better, but it's not at all sure what the future
of the code is. Avoid wasting a lot of time on something that is
going to be rewritten anyway.
2005-08-13 21:08:32 +00:00
Marcel Moolenaar
4630415a47 Improve SMP support:
o  Allocate a VHPT per CPU. The VHPT is a hash table that the CPU
   uses to look up translations it can't find in the TLB. As such,
   the VHPT serves as a level 1 cache (the TLB being a level 0 cache)
   and best results are obtained when it's not shared between CPUs.
   The collision chain (i.e. the hash bucket) is shared between CPUs,
   as all buckets together constitute our collection of PTEs. To
   achieve this, the collision chain does not point to the first PTE
   in the list anymore, but to a hash bucket head structure. The
   head structure contains the pointer to the first PTE in the list,
   as well as a mutex to lock the bucket. Thus, each bucket is locked
   independently of each other. With at least 1024 buckets in the VHPT,
   this provides for sufficiently finei-grained locking to make the
   ssolution scalable to large SMP machines.
o  Add synchronisation to the lazy FP context switching. We do this
   with a seperate per-thread lock. On SMP machines the lazy high FP
   context switching without synchronisation caused inconsistent
   state, which resulted in a panic. Since the use of the high FP
   registers is not common, it's possible that races exist. The ia64
   package build has proven to be a good stress test, so this will
   get plenty of exercise in the near future.
o  Don't use the local ID of the processor we want to send the IPI to
   as the argument to ipi_send(). use the struct pcpu pointer instead.
   The reason for this is that IPI delivery is unreliable. It has been
   observed that sending an IPI to a CPU causes it to receive a stray
   external interrupt. As such, we need a way to make the delivery
   reliable. The intended solution is to queue requests in the target
   CPU's per-CPU structure and use a single IPI to inform the CPU that
   there's a new entry in the queue. If that IPI gets lost, the CPU
   can check it's queue at any convenient time (such as for each
   clock interrupt). This also allows us to send requests to a CPU
   without interrupting it, if such would be beneficial.

With these changes SMP is almost working. There are still some random
process crashes and the machine can hang due to having the IPI lost
that deals with the high FP context switch.

The overhead of introducing the hash bucket head structure results
in a performance degradation of about 1% for UP (extra pointer
indirection). This is surprisingly small and is offset by gaining
reasonably/good scalable SMP support.
2005-08-06 20:28:19 +00:00
Marcel Moolenaar
045f23cd0d Reduce the default MAXCPU from 16 to 4. This is in preparation of
allocating a VHPT per CPU. Since we don't yet know how many CPUs
are actually in the system at the time we need to allocate the
VHPTs, we allocate for MAXCPU processors. This can result in a
lot of wasted space for 2-way machines. So, for now, limit MAXCPU
to something smaller until we have something more dynamic.
2005-08-06 19:59:23 +00:00
Marcel Moolenaar
cbef4d0edc For ia64_ptc_{e,g,ga,l}(), use instruction serialization. We
typically don't know what the TLB described and need to assume
that it affects the fetching of instructions.
2005-08-06 19:54:31 +00:00
Jeff Roberson
8d511e2a05 - Add support for saving stack traces and displaying them via printf(9)
and KTR.

Contributed by:		Antoine Brodin <antoine.brodin@laposte.net>
Concept code from:	Neal Fachan <neal@isilon.com>
2005-08-03 04:27:40 +00:00
John Baldwin
122eceef61 Convert the atomic_ptr() operations over to operating on uintptr_t
variables rather than void * variables.  This makes it easier and simpler
to get asm constraints and volatile keywords correct.

MFC after:	3 days
Tested on:	i386, alpha, sparc64
Compiled on:	ia64, powerpc, amd64
Kernel toolchain busted on:	arm
2005-07-15 18:17:59 +00:00
Ken Smith
22e59cec3b Add recently invented COMPAT_FREEBSD5 option.
MFC after:	3 days
2005-07-14 15:39:06 +00:00
David Xu
740fd64d65 Validate if the value written into {FS,GS}.base is a canonical
address, writting non-canonical address can cause kernel a panic,
by restricting base values to 0..VM_MAXUSER_ADDRESS, ensuring
only canonical values get written to the registers.

Reviewed by: peter, Josepha Koshy < joseph.koshy at gmail dot com >
Approved by: re (scottl)
2005-07-10 23:31:11 +00:00
Marcel Moolenaar
7906787a5f Enhance ia64_flush_dirty() to handle the case in which td != curthread.
This case is triggered with ptrace(2) and the PT_SETREGS function.
Change the return type of the function to int so that errors can be
passed on to the caller.

Approved by: re (scottl)
2005-07-05 17:12:18 +00:00
Marcel Moolenaar
a2aeb24eff Implement functions calls from within DDB on ia64. On ia64 a function
pointer doesn't point to the first instruction of that function, but
rather to a descriptor. The descriptor has the address of the first
instruction, as well as the value of the global pointer. The symbol
table doesn't know anything about descriptors, so if you lookup the
name of a function you get the address of the first instruction. The
cast from the address, which is the result of the symbol lookup, to a
function pointer as is done in db_fncall is therefore invalid.
Abstract this detail behind the DB_CALL macro. By default DB_CALL is
defined as db_fncall_generic, which yields the old behaviour. On ia64
the macro is defined as db_fncall_ia64, in which a descriptor is
constructed to yield a valid function pointer.

While here, introduce DB_MAXARGS. DB_MAXARGS replaces the existing
(local) MAXARGS. The DB_MAXARGS macro can be defined by platforms to
create a convenient maximum. By default this will be the legacy 10.
On ia64 we define this macro to be 8, for 8 is the maximum number of
arguments that can be passed in registers. This avoids having to
implement spilling of arguments on the memory stack.

Approved by: re (dwhite)
2005-07-02 23:52:37 +00:00
Marcel Moolenaar
5116398a06 Fix a buglet that was present in the ia64 code and that got inherited
by amd64 and i386: For buffered writes we collect data and write it
out a ${DEV_BSIZE}-sized block at a time. The fragsz variable is used
to keep track of how much data we have collected in the buffer so far
and it's reset to zero immediately after writing a block to the dump
device.
When the last, possibly partially filled buffer is flushed, we didn't
reset fragsz to 0 and as such would stop reflecting reality. Since we
currently only need to do buffered writes once, this isn't a problem.
However, when kernel dumps are made by hand (say by callling doadump
from within DDB), the improperly cleared state from the first call to
dumpsys causes the next call to dumpsys to create an invalid code file.
This change resets fragsz after flushing the partially filled buffer so
that it fixes the two problems at once.

Approved by: re (scottl)
2005-07-02 19:57:31 +00:00
Peter Wemm
62919d788b Jumbo-commit to enhance 32 bit application support on 64 bit kernels.
This is good enough to be able to run a RELENG_4 gdb binary against
a RELENG_4 application, along with various other tools (eg: 4.x gcore).
We use this at work.

ia32_reg.[ch]: handle the 32 bit register file format, used by ptrace,
	procfs and core dumps.
procfs_*regs.c: vary the format of proc/XXX/*regs depending on the client
	and target application.
procfs_map.c: Don't print a 64 bit value to 32 bit consumers, or their
	sscanf fails.  They expect an unsigned long.
imgact_elf.c: produce a valid 32 bit coredump for 32 bit apps.
sys_process.c: handle 32 bit consumers debugging 32 bit targets.  Note
	that 64 bit consumers can still debug 32 bit targets.

IA64 has got stubs for ia32_reg.c.

Known limitations: a 5.x/6.x gdb uses get/setcontext(), which isn't
implemented in the 32/64 wrapper yet.  We also make a tiny patch to
gdb pacify it over conflicting formats of ld-elf.so.1.

Approved by:	re
2005-06-30 07:49:22 +00:00
Marcel Moolenaar
c31450b00d Handle B-unit break instructions. The break.b is unique in that the
immediate is not saved by the architecture. Any of the break.{mifx}
instructions have their immediate saved in cr.iim on interruption.
Consequently, when we handle the break interrupt, we end up with a
break value of 0 when it was a break.b. The immediate is important
because it distinguishes between different uses of the break and
which are defined by the runtime specification.
The bottomline is that when the GNU debugger replaces a B-unit
instruction with a break instruction in the inferior, we would not
send the process a SIGTRAP when we encounter it, because the value
is not one we recognize as a debugger breakpoint.

This change adds logic to decode the bundle in which the break
instruction lives whenever the break value is 0. The assumption
being that it's a break.b and we fetch the immediate directly out
of the instruction. If the break instruction was not a break.b,
but any of break.{mifx} with an immediate of 0, we would be doing
unnecessary work. But since a break 0 is invalid, this is not a
problem and it will still result in a SIGILL being sent to the
process.

Approved by: re (scottl)
2005-06-27 23:51:38 +00:00
Marcel Moolenaar
fc37111e5d Replace the existing copyright notice with my own. Over the years I've
changed this file so much that it's equivalent to a rewrite, and I'm not
talking about any of the cosmetic changes of course.

Approved by: re (scottl)
2005-06-27 23:34:35 +00:00
Marcel Moolenaar
9701d67eb8 Cosmetic: s/u_int64_t/uint64_t/g
Approved by: re (scottl)
2005-06-27 23:29:06 +00:00
David E. O'Brien
c3e0dfa1f8 Add .cvsignore files just like in sys/<arch>/compiled, this keeps CVS from
questing kernel config files not in CVS.

Approved by:	re(kensmith)
2005-06-20 16:52:59 +00:00
Marcel Moolenaar
442add308f Define IPI_PREEMPT. Update a nearby comment while I'm here. 2005-06-12 19:03:01 +00:00
Alan Cox
1c245ae7d1 Introduce a procedure, pmap_page_init(), that initializes the
vm_page's machine-dependent fields.  Use this function in
vm_pageq_add_new_page() so that the vm_page's machine-dependent and
machine-independent fields are initialized at the same time.

Remove code from pmap_init() for initializing the vm_page's
machine-dependent fields.

Remove stale comments from pmap_init().

Eliminate the Boolean variable pmap_initialized from the alpha, amd64,
i386, and ia64 pmap implementations.  Its use is no longer required
because of the above changes and earlier changes that result in physical
memory that is being mapped at initialization time being mapped without
pv entries.

Tested by: cognet, kensmith, marcel
2005-06-10 03:33:36 +00:00
Joseph Koshy
f263522a45 MFP4:
- Implement sampling modes and logging support in hwpmc(4).

- Separate MI and MD parts of hwpmc(4) and allow sharing of
  PMC implementations across different architectures.
  Add support for P4 (EMT64) style PMCs to the amd64 code.

- New pmcstat(8) options: -E (exit time counts) -W (counts
  every context switch), -R (print log file).

- pmc(3) API changes, improve our ability to keep ABI compatibility
  in the future.  Add more 'alias' names for commonly used events.

- bug fixes & documentation.
2005-06-09 19:45:09 +00:00
Marcel Moolenaar
470cd51ee6 Create nexus in configure_first() instead of in configure(). This
makes sure that sysinit tasks that run after configure_first(),
but before configure() have a nexus to hang devices off.
2005-05-29 23:44:22 +00:00
Marcel Moolenaar
a0c51afb16 Call cninit_finish() in configure_final(). 2005-05-29 22:48:41 +00:00
Yoshihiro Takahashi
d4fcf3cba5 Remove bus_{mem,p}io.h and related code for a micro-optimization on i386
and amd64.  The optimization is a trivial on recent machines.

Reviewed by:	-arch (imp, marcel, dfr)
2005-05-29 04:42:30 +00:00
Yoshihiro Takahashi
b22bf66063 - Move bus dependent defines to {isa,cbus}_dmareg.h.
- Use isa/isareg.h rather than <arch>/isa/isa.h.

Tested on: i386, pc98
2005-05-14 10:14:56 +00:00
Marcel Moolenaar
6fab4fece2 Don't define _MACHINE_BUS_MEMIO_H_ nor _MACHINE_BUS_PIO_H_. 2005-05-10 02:59:24 +00:00
David Xu
21fc316430 Change cpu_set_kse_upcall to more generic style, so we can reuse it
in other codes. Add cpu_set_user_tls, use it to tweak user register
and setup user TLS. I ever wanted to merge it into cpu_set_kse_upcall,
but since cpu_set_kse_upcall is also used by M:N threads which may
not need this feature, so I wrote a separated cpu_set_user_tls.
2005-04-23 02:32:32 +00:00
Marcel Moolenaar
8773a80baf Sanity the RTC code:
o  Remove the clock interface. Not only does it conflict with the MI
   version when device genclock is added to the kernel, it was also
   not possible to have more than 1 clock device. This of course would
   have been a problem if we actually had more than 1 clock device.
   In short: we don't need a clock interface and if we do eventually,
   we should be using the MI one.
o  Rewrite inittodr() and resettodr() to take into account that:
   1)  We use the EFI interface directly.
   2)  time_t is 64-bit and we do need to make sure we can determine
       leap years from year 2100 and on. Add a nice explanation of
       where leap years come from and why.
   3)  This rewrite happened in 2005 so any date prior to 1/1/2005
       (either M/D/Y or D/M/Y) is bogus. Reprogram the EFI clock with
       1/1/2005 in that case.
   4)  The EFI clock has a high probability of being correct, so
       only (further) correct the EFI clock when the file system time
       is larger. That should never happen in a time-synchronised world.
       Complain when EFI lost 2 days or more.

Replace the copyright notice now that I (pretty much) rewrote all of
this file.
2005-04-22 05:04:58 +00:00
Marcel Moolenaar
ff7125a623 Add empty header (except of the multiple-inclusion protection) to
get hwpmc(4) to compile on this platform.
2005-04-20 18:44:53 +00:00
Warner Losh
06db52b609 Break out the definition of bus_space_{tag,handle}_t and a few other types
into _bus.h to help with name space polution from including all of bus.h.
In a few days, I'll commit changes to the MI code to take advantage of thse
sepration (after I've made sure that these changes don't break anything in
the main tree, I've tested in my trees, but you never know...).

Suggested by: bde (in 2002 or 2003 I think)
Reviewed in principle by: jhb
2005-04-18 21:45:34 +00:00
Marcel Moolenaar
02b47ea204 Add a kpte command to DDB. It dumps the PTE of a KVA. This helps
to analyze faults and TLB/VHPT inconsistencies.
2005-04-16 23:38:32 +00:00
Marcel Moolenaar
e190f6efc8 Return better "error" values for UWX_BOTTOM and UWX_ABI_FRAME in
unw_step(). Both errors denote the end of a stack trace (i.e. no
prior frame), but are otherwise not error conditions.
Have db_trace() return 0 when the trace ends due to one of these
return codes as they are really normal termination conditions.

This change especially improves the output of the "show thread"
command in DDB when there are threads in fork_trampoline() and
previously db_trace() would return an error, causing the show
command to emit '***'.
2005-04-16 05:38:59 +00:00
Marcel Moolenaar
64c92ba929 Initialize curthread before we save the APs MCA state. Saving the
MCA state requires a spin lock, which requires a valid curthread.
This change allows SMP kernels to boot into multi-user again.

While here, update the copyright notice and use __FBSDID for the
revision string.
2005-04-15 00:21:23 +00:00
John Baldwin
aa9aa68d2f Use PCPU_LAZY_INC() for cnt.v_{intr,trap,syscalls} rather than atomic
operations in some places and simple non-per CPU math in others.
2005-04-12 23:18:54 +00:00
Marcel Moolenaar
a08d773359 Dot the i's:
1  Move the debug.clock_adjust_* sysctls to debug.clock.adjust_* to
   make it easier to get only the clock statistics.
2  Make the sysctls read-only [suggested by Marius].
3  When determining the new clock adjustment, we checked for an error
   either larger than 12.5% or smaller than 12.5%. We left out an error
   of exactly 12.5%. For errors larger than 12.5% we adjust the clock
   reload value in such a way that the next clock interrupt would be
   early (as in premature). For errors less than 12.5% we stopped the
   adjustment.
   The current algorithm doesn't benefit from excluding an error of
   exactly 12.5%. Change the code to stop adjusting the clock if the
   error is *not* larger than 12.5% [suggested by Marius].

Discussed with: marius@
2005-04-12 18:50:57 +00:00
John Baldwin
c6a37e8413 Divorce critical sections from spinlocks. Critical sections as denoted by
critical_enter() and critical_exit() are now solely a mechanism for
deferring kernel preemptions.  They no longer have any affect on
interrupts.  This means that standalone critical sections are now very
cheap as they are simply unlocked integer increments and decrements for the
common case.

Spin mutexes now use a separate KPI implemented in MD code: spinlock_enter()
and spinlock_exit().  This KPI is responsible for providing whatever MD
guarantees are needed to ensure that a thread holding a spin lock won't
be preempted by any other code that will try to lock the same lock.  For
now all archs continue to block interrupts in a "spinlock section" as they
did formerly in all critical sections.  Note that I've also taken this
opportunity to push a few things into MD code rather than MI.  For example,
critical_fork_exit() no longer exists.  Instead, MD code ensures that new
threads have the correct state when they are created.  Also, we no longer
try to fixup the idlethreads for APs in MI code.  Instead, each arch sets
the initial curthread and adjusts the state of the idle thread it borrows
in order to perform the initial context switch.

This change is largely a big NOP, but the cleaner separation it provides
will allow for more efficient alternative locking schemes in other parts
of the kernel (bare critical sections rather than per-CPU spin mutexes
for per-CPU data for example).

Reviewed by:	grehan, cognet, arch@, others
Tested on:	i386, alpha, sparc64, powerpc, arm, possibly more
2005-04-04 21:53:56 +00:00
Maxim Sobolev
6bcf003260 Add USB Communication Device Class Ethernet driver. Originally written for
FreeBSD based on aue(4) it was picked by OpenBSD, then from OpenBSD ported
to NetBSD and finally NetBSD version merged with original one goes into
FreeBSD.

Obtained from:  http://www.gank.org/freebsd/cdce/
                NetBSD
                OpenBSD
2005-03-22 14:52:40 +00:00
Nate Lawson
ac5f2dab74 s/SLIST/STAILQ to catch up with changes to resource lists.
Missed by:	imp
2005-03-20 06:55:49 +00:00
Murray Stokely
991f5121f0 Add a comment to note that pseudo-device bpf is required for DHCP.
This is mentioned in the Handbook but it is not as obvious to new
users why bpf is needed compared to the other largely self-explanatory
items in GENERIC.

PR:		conf/40855
MFC after:	1 week
2005-03-18 15:24:00 +00:00
Ian Dowse
60719a1a44 Split configure() into 3 separate steps like we do on other
architectures. This makes it possible to insert hooks before and
after the device attachment step.

Tested thanks to:	marcel
2005-03-18 09:45:43 +00:00
Scott Long
5974e5c71c Refactor the bus_dma header files so that the interface is described in
sys/bus_dma.h instead of being copied in every single arch.  This slightly
reorders a flag that was specific to AXP and thus changes the ABI there.
The interface still relies on bus_space definitions found in <machine/bus.h>
so it cannot be included on its own yet, but that will be fixed at a later
date.  Add an MD <machine/bus_dma.h> for ever arch for consistency and to
allow for future MD augmentation of the API.  sparc64 makes heavy use of
this right now due to its different bus_dma implemenation.
2005-03-14 16:46:28 +00:00
Scott Long
8bf0837c7a Remove dead code. 2005-03-07 02:18:52 +00:00
Joerg Wunsch
a5f50ef9e4 netchild's mega-patch to isolate compiler dependencies into a central
place.

This moves the dependency on GCC's and other compiler's features into
the central sys/cdefs.h file, while the individual source files can
then refer to #ifdef __COMPILER_FEATURE_FOO where they by now used to
refer to #if __GNUC__ > 3.1415 && __BARC__ <= 42.

By now, GCC and ICC (the Intel compiler) have been actively tested on
IA32 platforms by netchild.  Extension to other compilers is supposed
to be possible, of course.

Submitted by:	netchild
Reviewed by:	various developers on arch@, some time ago
2005-03-02 21:33:29 +00:00
Marcel Moolenaar
f685f62c98 Make sure fpswa_iface equals NULL when bootinfo.bi_fpswa equals 0.
We need to be able to test for the (possible) non-existence of the
FPSWA code.

PR: ia64/77591
Submitted by: Christian Kandeler (christian dot kandeler at hob dot de)
MFC after: 1 day
2005-03-02 20:29:04 +00:00
Wes Peters
95e2054492 Attempt to doff the pointy hat: implement 'hw.realmem' on remaining
architectures.  Pointed out by O'Brien, ScottL via email.

Reviewed by:	obrien (various)
2005-03-01 21:55:27 +00:00
Xin LI
130d7d9ffb Remove acpi_perf from {ARCH}/conf/NOTES, to make tinderbox happy.
Reported by:	tinderbox
Inspired by:	acpi_perf build structure removal commit
2005-02-25 07:10:37 +00:00
Ruslan Ermilov
3971d2cf5e Use a common multi-inclusion protection, and add such a
protection to alpha/include/exec.h.
2005-02-19 21:16:48 +00:00
Marcel Moolenaar
3ec2e857c1 s/descr/oid_descr/ 2005-02-09 04:48:23 +00:00
Poul-Henning Kamp
0c3c54da63 Since we are quite unlikely to ever face another platform which
uses the i8237 without trying to emulate the PC architecture move
the register definitions for the i8237 chip into the central include
file for the chip, except for the PC98 case which is magic.

Add new isa_dmatc() function which tells us as cheaply as possible
if the terminal count has been reached for a given channel.
2005-02-06 13:46:39 +00:00
Nate Lawson
3888a87205 Finish the job of sorting all includes and fix the build by including
malloc.h before proc.h on sparc64.  Noticed by das@

Compiled on:	alpha, amd64, i386, pc98, sparc64
2005-02-06 01:55:08 +00:00
Nate Lawson
69bc96f231 Build cpufreq and acpi_perf on platforms that are likely to be able to
use them.
2005-02-05 21:01:09 +00:00
Marcel Moolenaar
6fb59928a6 Include sys/bus.h before sys/cpu.h. The latter needs device_t. 2005-02-04 06:38:58 +00:00
Nate Lawson
4c4381e288 Add an implementation of cpu_est_clockrate(9). This function estimates the
current clock frequency for the given CPU id in units of Hz.
2005-02-04 05:32:56 +00:00
Warner Losh
1f0ce611b3 nit in /*- 2005-01-31 08:16:45 +00:00
Marcel Moolenaar
b71bca0f84 Fix handling of post increment: Either the first or second operand
is the register with the memory address, and it's that register's
value we need to increment or decrement.

MFC after: 3 days
2005-01-27 06:01:44 +00:00
Scott Long
9580b6766e Fix compile errors. Bah. 2005-01-18 11:06:34 +00:00
Scott Long
a9e9b7e47b Fix an assignment that I missed in the last commit. 2005-01-15 20:03:59 +00:00
Scott Long
33072f4de7 Add bus_dmamap_load_mbuf_sg() to ia64 2005-01-15 19:26:17 +00:00
John Baldwin
f4ef9cec40 - Remove some OBE comments regarding cpu_exit(). cpu_exit() is no longer
the last action of kern_exit().  Instead, it is a MD callout to cleanup
  per-process state during exit.
- Add notes of concern to Alpha and ia64 about the possible need to drop
  fp state in cpu_thread_exit() rather than in cpu_exit() since it is
  per-thread state rather than per-process.
2005-01-14 20:13:04 +00:00
Warner Losh
86cb007f9f /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 22:18:23 +00:00
Marcel Moolenaar
2fa9a15eca Further enhance the handling of misaligned loads and stores:
o  implement double-extended and single precision loads and stores,
o  implement double precision stores,
o  replace the machdep.unaligned_print sysctl with debug.unaligned_print
   and change the default value to 0,
o  replace the machdep.unaligned_sigbus sysctl with debug.unaligned_test,
o  Remmove the fillfd() function. The function is trvial enough for
   inline assembly.

The debug.unaligned_test sysctl is used to test the emulation of
misaligned loads and stores. When PSR.ac is 0, the CPU will handle
misaligned memory accesses itselfi and we don't get an exception
for it. When PSR.ac is 1, the process needs to be signalled and we
should not emulate. The sysctl takes effect when PSR.ac is 1 and
tells us that we should emulate and not send a signal.

PR: 72268
MFC after: 1 week
2005-01-02 00:20:54 +00:00
Alan Cox
1f70d62298 Modify pmap_enter_quick() so that it expects the page queues to be locked
on entry and it assumes the responsibility for releasing the page queues
lock if it must sleep.

Remove a bogus comment from pmap_enter_quick().

Using the first change, modify vm_map_pmap_enter() so that the page queues
lock is acquired and released once, rather than each time that a page
is mapped.
2004-12-23 20:16:11 +00:00
Alan Cox
85f5b24573 In the common case, pmap_enter_quick() completes without sleeping.
In such cases, the busying of the page and the unlocking of the
containing object by vm_map_pmap_enter() and vm_fault_prefault() is
unnecessary overhead.  To eliminate this overhead, this change
modifies pmap_enter_quick() so that it expects the object to be locked
on entry and it assumes the responsibility for busying the page and
unlocking the object if it must sleep.  Note: alpha, amd64, i386 and
ia64 are the only implementations optimized by this change; arm,
powerpc, and sparc64 still conservatively busy the page and unlock the
object within every pmap_enter_quick() call.

Additionally, this change is the first case where we synchronize
access to the page's PG_BUSY flag and busy field using the containing
object's lock rather than the global page queues lock.  (Modifications
to the page's PG_BUSY flag and busy field have asserted both locks for
several weeks, enabling an incremental transition.)
2004-12-15 19:55:05 +00:00
Marcel Moolenaar
7126a966b3 Fix the last of the instability and the cause of the annoying
"vm_fault: fault on nofault entry, addr: %lx" panic. The problem was a
stale PTE in the TLB that marked the page as not present, even though
we had a good PTE in the VHPT. We typically don't yet insert PTEs in
the TLB. We do that lazily. The CPU will look for the PTE in the VHPT
when there's no PTE in the TLB. Unfortunately this doesn't handle the
case of the stale PTE in the TLB. The quick fix is to invalidate the
TLB (sloppily) when the VHPT doesn't contain a valid PTE. This is also
the only case that may cause a PTE in the TLB that marks a page as
non-present.
2004-12-12 19:27:58 +00:00
Marcel Moolenaar
3579953091 Use primitive types to avoid creating an artificial header dependency:
o  s/u_long/unsigned long/
o  s/uint32_t/unsigned int/g
o  s/uint64_t/unsigned long/g

Trigger case: multimedia/mpeg2codec
2004-12-11 06:15:12 +00:00
Marcel Moolenaar
f5929532f1 Don't obtain the HCDP address directly from the bootinfo structure.
Use a function to keep the details at arms length from uart(4).
2004-12-08 05:46:54 +00:00
Marcel Moolenaar
bcc5241c43 Change gdb_cpu_setreg() to not take the value to which to set the
specified register, but a pointer to the in-memory representation of
that value. The reason for this is twofold:
1. Not all registers can be represented by a register_t. In particular
   FP registers fall in that category. Passing the new register value
   by reference instead of by value makes this point moot.
2. When we receive a G or P packet, both are for writing a register,
   the packet will have the register value in target-byte order and
   in the memory representation (modulo the fact that bytes are sent
   as 2 printable hexadecimal numbers of course). We only need to
   decode the packet to have a pointer to the register value.

This change fixes the bug of extracting the register value of the P
packet as a hexadecimal number instead of as a bit array. The quick
(and dirty) fix to bswap the register value in gdb_cpu_setreg() as
it has been added on i386 and amd64 can therefore be removed and has
in fact been that.

Tested on: alpha, amd64, i386, ia64, sparc64
2004-12-01 06:40:35 +00:00
Marcel Moolenaar
c0678028d7 Whitespace fixes:
o  Remove a bogus comment that relates to alpha.
o  s/u_int64_t/uint64_t/g
o  Add bi_spare2 to make the internal padding explicit.
o  Move BOOTINFO_MAGIC after the field it applies to.
2004-11-28 04:34:17 +00:00
David Schultz
6004362e66 Don't include sys/user.h merely for its side-effect of recursively
including other headers.
2004-11-27 06:51:39 +00:00
Marcel Moolenaar
2ba0042660 Remove struct ia64_itir and use a plain old uint64_t instead. 2004-11-21 21:40:08 +00:00
David Schultz
ab44ebf537 Remove UAREA_PAGES.
Reviewed by:	arch@
2004-11-20 02:29:50 +00:00
David Schultz
11111b709f U areas are going away, so don't allocate one for process 0.
Reviewed by:	arch@
2004-11-20 02:29:25 +00:00
David Schultz
ff3fd2e764 user.h is included only to get pcb.h, so use the latter directly instead. 2004-11-20 02:28:14 +00:00
Marcel Moolenaar
a0c88fb1d9 Remove the BR tag. When the machine doesn't have the DIG64 HCDP
table with console settings, we now only need to know at which
address the UART lives. Leaving the baudrate unspecified results
in us using the baudrate at which the UART operates. This removes
one parameter that can interfere with a successful installation
out of the box.
2004-11-14 23:42:48 +00:00
John Baldwin
d39d4a6e64 - Change the ddb paging "support" to use a variable (db_lines_per_page) to
control the number of lines per page rather than a constant.  The variable
  can be examined and changed in ddb as '$lines'.  Setting the variable to
  0 will effectively turn off paging.
- Change db_putchar() to force out pending whitespace before outputting
  newlines and carriage returns so that one can rub out content on the
  current line via '\r     \r' type strings.
- Change the simple pager to rub out the --More-- prompt explicitly when
  the routine exits.
- Add some aliases to the simple pager to make it more compatible with
  more(1): 'e' and 'j' do a single line.  'd' does half a page, and
  'f' does a full page.

MFC after:	1 month
Inspired by:	kris
2004-11-01 22:15:15 +00:00
Poul-Henning Kamp
32204b9721 Use bioq_takefirst() 2004-10-23 12:44:19 +00:00
Poul-Henning Kamp
95bc568977 Add new function ttyinitmode() which sets our systemwide default
modes on a tty structure.

Both the ".init" and the current settings are initialized allowing
the function to be used both at attach and open time.

The function takes an argument to decide if echoing should be enabled.
Echoing should not be enabled for regular physical serial ports
unless they are consoles, in which case they should be configured
by ttyconsolemode() instead.

Use the new function throughout.
2004-10-18 21:51:27 +00:00
Nate Lawson
8f528832e5 Print flags in the nexus for child devices. 2004-10-14 22:36:47 +00:00
Nate Lawson
31ad3b8802 Move the code for halting the CPU (acpi_cpu_c1) into machdep files.
This removes the last MD portion of acpi_cpu.c.

MFC after:	2 weeks
2004-10-11 05:39:15 +00:00
Marcel Moolenaar
07cf947238 Add the Madison II, which is the second generation Madison. The Madison II
is model 2 in the Itanium 2 family and has up to 9MB of L3 cache and clocks
higher than 1.5Ghz. There's no LV variant AFAICT.
2004-10-06 02:43:28 +00:00
Alan Cox
8ceb3dcb60 The physical address stored in the vm_page is page aligned. There is no
need to mask off the page offset bits.  (This operation made some sense
prior to i386/i386/pmap.c revision 1.254 when we passed a physical address
rather than a vm_page pointer to pmap_enter().)
2004-10-03 00:16:43 +00:00
Alan Cox
07b3303943 Eliminate unnecessary uses of PHYS_TO_VM_PAGE() from pmap_enter(). These
uses predate the change in the pmap_enter() interface that replaced the
page's physical address by the address of its vm_page structure.  The
PHYS_TO_VM_PAGE() was being used to compute the address of the same vm_page
structure that was being passed in.
2004-10-02 07:34:58 +00:00
Marcel Moolenaar
287e12f172 ...And fix WITNESS builds: declare syscallnames. 2004-09-26 20:39:56 +00:00
Marcel Moolenaar
feae534e49 Fix INVARIANTS build: Include <machine/cpu.h>. 2004-09-26 00:38:56 +00:00
Marcel Moolenaar
03bfdd1362 Move the IA-32 trap handling from trap() to ia32_trap(). Move the
ia32_syscall() function along with it to ia32_trap.c. When COMPAT_IA32
is not defined, we'll raise SIGEMT instead.
2004-09-25 04:27:44 +00:00
Marcel Moolenaar
0c32530bb7 Redefine a PTE as a 64-bit integral type instead of a struct of
bit-fields. Unify the PTE defines accordingly and update all
uses.
2004-09-23 00:05:20 +00:00
Marcel Moolenaar
0675c65d6b s/u_int#_t/uint#_t/g 2004-09-22 23:12:46 +00:00
Marcel Moolenaar
08d3edb315 For the atomic_{add|clear|set|subtract} family of inlines, return the
old or previous value instead of void. This is not as is documented
in atomic(9), but is API (and ABI) compatible and simply makes sense.
This feature will primarily be used for atomic PTE updates in PMAP/ng.
2004-09-22 19:58:43 +00:00
Marcel Moolenaar
5c48823c36 MFp4: various style fixes, including
o  s/u_int/uint/g
o  s/#define<sp>/#define<tab>/g
o  indent macro definitions
o  Improve vertical spacing
o  Globally align line continuation character
2004-09-22 19:47:42 +00:00
John Baldwin
76764432e4 - Add support for "paging" in stack trace output. That is, when you do
a stack trace from ddb, the output will pause with a '--More--' prompt
  every 18 lines.  If you hit Enter, it will print another line and prompt
  again.  If you hit space it will output another page and then prompt.
  If you hit 'q' or 'x' it will abort the rest of the stack trace.
- Fix the sparc64 userland stack trace to honor the total count of lines
  to print.  This is useful if your trace happens to walk back onto
  0xdeadc0de and gets stuck in an endless loop.

MFC after:	1 month
Tested on:	i386, alpha, sparc64
2004-09-20 19:05:32 +00:00
Marcel Moolenaar
13e6668525 MFp4:
Completely remove the remaining EFI includes and add our own (type)
definitions instead. While here, abstract more of the internals by
providing interface functions.
2004-09-19 03:50:46 +00:00
Alan Cox
7580b56bdc Release the page queues lock earlier in pmap_protect() and pmap_remove() in
order to reduce contention.
2004-09-18 22:56:58 +00:00
Marcel Moolenaar
9f9ae8ebb7 Provide our own FPSWA definitions, instead of depending on the Intel
EFI headers and put them all in <machine/fpu.h>. The Intel EFI headers
conflict with the Intel ACPI headers (duplicate type definitions), so
are being phased out in the kernel.
2004-09-17 22:19:41 +00:00
Marcel Moolenaar
cba8d3ae49 Remove useless inclusion of <machine/fpu.h> 2004-09-17 20:42:45 +00:00
Poul-Henning Kamp
7ce1979be6 Add new a function isa_dma_init() which returns an errno when it fails
and which takes a M_WAITOK/M_NOWAIT flag argument.

Add compatibility isa_dmainit() macro which whines loudly if
isa_dma_init() fails.

Problem uncovered by:	tegge
2004-09-15 12:09:50 +00:00
Marcel Moolenaar
fa3b7cae8d Catch up with other platforms: switch the default scheduler to 4BSD. 2004-09-12 05:50:32 +00:00
Scott Long
50736a153b Fix a problem with tag->boundary inheritence that has existed since day one
and was propagated to nearly every platform.  The boundary of the child needs
to consider the boundary of the parent and pick the minimum of the two, not
the maximum.  However, if either is 0 then pick the appropriate one.
This bug was exposed by a recent change to ATA, which should now be fixed by
this change.  The alignment and maxsegsz tag attributes likely also need
a similar review in the near future.

This is a MT5 candidate.

Reviewed by: marcel
Submitted by: sos (in part)
2004-09-08 04:54:19 +00:00
Marcel Moolenaar
566d143be0 Sync the busdma code with i386. The most tangible upshot is that
the alignment and boundary constraints are being respected, which
fixes the reported ATA problems with SiI chips.
I consider the busdma implementation worrisome nonetheless. Not
only is there too much MI code duplicated in MD files, there's a
lot of questionable code. I smell a wholesale, cross-platform
overhaul coming...

MT5 candidate.
2004-09-08 02:55:04 +00:00
Julian Elischer
ed062c8d66 Refactor a bunch of scheduler code to give basically the same behaviour
but with slightly cleaned up interfaces.

The KSE structure has become the same as the "per thread scheduler
private data" structure. In order to not make the diffs too great
one is #defined as the other at this time.

The KSE (or td_sched) structure is  now allocated per thread and has no
allocation code of its own.

Concurrency for a KSEGRP is now kept track of via a simple pair of counters
rather than using KSE structures as tokens.

Since the KSE structure is different in each scheduler, kern_switch.c
is now included at the end of each scheduler. Nothing outside the
scheduler knows the contents of the KSE (aka td_sched) structure.

The fields in the ksegrp structure that are to do with the scheduler's
queueing mechanisms are now moved to the kg_sched structure.
(per ksegrp scheduler private data structure). In other words how the
scheduler queues and keeps track of threads is no-one's business except
the scheduler's. This should allow people to write experimental
schedulers with completely different internal structuring.

A scheduler call sched_set_concurrency(kg, N) has been added that
notifies teh scheduler that no more than N threads from that ksegrp
should be allowed to be on concurrently scheduled. This is also
used to enforce 'fainess' at this time so that a ksegrp with
10000 threads can not swamp a the run queue and force out a process
with 1 thread, since the current code will not set the concurrency above
NCPU, and both schedulers will not allow more than that many
onto the system run queue at a time. Each scheduler should eventualy develop
their own methods to do this now that they are effectively separated.

Rejig libthr's kernel interface to follow the same code paths as
linkse for scope system threads. This has slightly hurt libthr's performance
but I will work to recover as much of it as I can.

Thread exit code has been cleaned up greatly.
exit and exec code now transitions a process back to
'standard non-threaded mode' before taking the next step.
Reviewed by:	scottl, peter
MFC after:	1 week
2004-09-05 02:09:54 +00:00
Marcel Moolenaar
44af2aa001 Add aac(4) and aacp(4). The driver is 64-bit clean for roughly a year
now and has been mentioned on the freebsd-ia64 list.
2004-09-02 18:05:26 +00:00
Julian Elischer
5995adc206 Remove an unneeded argument..
The removed argument could trivially be derived from the remaining one.
That in turn should be the same as curthread, but it is possible that curthread could be expensive to derive on some syste,s so leave it as an argument.
Having both proc and thread as an argumen tjust gives an opportunity for
them to get out sync.

MFC after:	3 days
2004-08-31 07:34:54 +00:00
Julian Elischer
99e9dcb817 Remove sched_free_thread() which was only used
in diagnostics. It has outlived its usefulness and has started
causing panics for people who turn on DIAGNOSTIC, in what is otherwise
good code.

MFC after:	2 days
2004-08-31 06:12:13 +00:00
Alan Cox
bfa15df9ba Remove unnecessary check for curthread == NULL. 2004-08-30 03:52:05 +00:00
Marcel Moolenaar
224407d6a8 s/ENTRY/ENTRY_NOPROFILE/g for particular functions that do not follow
the C calling convention or are otherwise not regular functions. This
allows us to boot a profiling kernel.
2004-08-30 01:32:28 +00:00
Marcel Moolenaar
82ecff453f Catch up with the drive-by renaming of IA32 to COMPAT_IA32. Missed
11 days ago when all the other places were fixed and finally caught
by the tinderbox run...
2004-08-27 21:57:00 +00:00
Marcel Moolenaar
0f2fe153bc Move the kernel-specific logic to adjust frompc from MI to MD. For
these two reasons:
1. On ia64 a function pointer does not hold the address of the first
   instruction of a functions implementation. It holds the address
   of a function descriptor. Hence the user(), btrap(), eintr() and
   bintr() prototypes are wrong for getting the actual code address.
2. The logic forces interrupt, trap and exception entry points to
   be layed-out contiguously. This can not be achieved on ia64 and is
   generally just bad programming.

The MCOUNT_FROMPC_USER macro is used to set the frompc argument to
some kernel address which represents any frompc that falls outside
the kernel text range. The macro can expand to ~0U to bail out in
that case.
The MCOUNT_FROMPC_INTR macro is used to set the frompc argument to
some kernel address to represent a call to a trap or interrupt
handler. This to avoid that the trap or interrupt handler appear to
be called from everywhere in the call graph. The macro can expand
to ~0U to prevent adjusting frompc. Note that the argument is selfpc,
not frompc.

This commit defines the macros on all architectures equivalently to
the original code in sys/libkern/mcount.c. People can take it from
here...

Compile-tested on: alpha, amd64, i386, ia64 and sparc64
Boot-tested on: i386
2004-08-27 19:42:35 +00:00
Alan Cox
8991a235cb The machine-independent parts of the virtual memory system always pass a
valid pmap to the pmap functions that require one.  Remove the checks for
NULL.  (These checks have their origins in the Mach pmap.c that was
integrated into BSD.  None of the new code written specifically for
FreeBSD included them.)
2004-08-27 19:06:17 +00:00
Andre Oppermann
c21fd23260 Always compile PFIL_HOOKS into the kernel and remove the associated kernel
compile option.  All FreeBSD packet filters now use the PFIL_HOOKS API and
thus it becomes a standard part of the network stack.

If no hooks are connected the entire packet filter hooks section and related
activities are jumped over.  This removes any performance impact if no hooks
are active.

Both OpenBSD and DragonFlyBSD have integrated PFIL_HOOKS permanently as well.
2004-08-27 15:16:24 +00:00