Commit Graph

1698 Commits

Author SHA1 Message Date
marius
e150d8eb07 Additionally clear the STICK bit in the SOFTINT register when
receiving a PIL_TICK interrupt. This change was erroneously
omitted in r182730.
2008-09-03 21:48:12 +00:00
marius
a030d21a6f - USIII-based machines can consist of CPUs running at different
frequencies (and having different cache sizes) so use the STICK
  (System TICK) timer, which was introduced due to this and is
  driven by the same frequency across all CPUs, instead of the
  TICK timer, whose frequency varies with the CPU clock, to drive
  hardclock. We try to use the STICK counter with all CPUs that are
  USIII or beyond, even when not necessary due to identical CPUs,
  as we can can also avoid the workaround for the BlackBird erratum
  #1 there. Unfortunately, using the STICK counter currently causes
  a hang with USIIIi MP machines for reasons unknown, so we still
  use the TICK timer there (which is okay as they can only consist
  of identical CPUs).
- Given that we only (try to) synchronize the (S)TICK timers of APs
  with the BSP during startup, we could end up spinning forever in
  DELAY(9) if that function is migrated to another CPU while we're
  spinning due to clock drift afterwards, so pin to the CPU in order
  to avoid migration. Unfortunately, pinning doesn't work at the
  point DELAY(9) is required by the low-level console drivers, yet,
  so switch to a function pointer, which is updated accordingly, for
  implementing DELAY(9). For USIII and beyond, this would also allow
  to easily use the STICK counter instead of the TICK one here,
  there's no benefit in doing so however.
  While at it, use cpu_spinwait(9) for spinning in the delay-
  functions. This currently is a NOP though.
- Don't set the TICK timer of the BSP to 0 during at startup as
  there's no need to do so.
- Implement cpu_est_clockrate().
- Unfortunately, USIIIi-based machines don't provide a timecounter
  device besides the STICK and TICK counters (well, in theory the
  Tomatillo bridges have a performance counter that can be (ab)used
  as timecounter by configuring it to count bus cycles, though unlike
  the performance counter of Schizo bridges, the Tomatillo one is
  broken and counts Sun knows what in this mode). This means that
  we've to use a (S)TICK counter for timecounting, which has the old
  problem of not being in sync across CPUs, so provide an additional
  timecounter function which binds itself to the BSP but has an
  adequate low priority.
2008-09-03 17:39:19 +00:00
obrien
babf0085b2 ahc(4) work better in Sparc64 with AHC_ALLOW_MEMIO.
Submitted by:	Nathan Whitehorn <nwhitehorn@freebsd.org>
2008-09-02 21:46:17 +00:00
marius
6b0f6beaa5 - USIII-based machines can consist of CPUs having different cache
sizes (and running at different frequencies) so move the cacheinfo
  to the PCPU data. While at it, remove some redundant and/or unused
  members from struct cacheinfo.
- In sparc64_init don't assume the first CPU node we find in the OFW
  device tree is the BSP.
2008-09-02 21:13:54 +00:00
marius
6a6f30db5a Bypass isa_probe_children(9) and directly call bus_generic_attach(9)
in order to avoid the invasive probes done by identify-routines of
ISA drivers, which may access unassigned addresses or those of
unrelated devices and thus in turn can trigger master/target aborts
as revealed by r182108 and ahc(4). I think that this is also the
cause of the hang previously seen on B100 blades during boot.
Bypassing isa_probe_children(9) also avoids adding ISA hints, which
just can be wrong for sparc64.

Reported by:	gavin
2008-09-02 21:06:28 +00:00
marius
e95621f381 There's a race in kmem(4) between checking whether a page is resident
in the kernel and copying it out, causing a panic when faulting on a
nofault entry. Handle this case gracefully by letting the kernel copy
functions return EFAULT instead. As such this change addresses the
same problem as r154721 does for i386.

MFC after:	3 days
2008-08-24 20:53:36 +00:00
marius
504a0eb6a5 MFamd64: r133413
In syscall, always make a copy of parameters from trapframe, this
becauses some syscalls using set_mcontext can sneakily change
parameters and later when those syscalls references parameters,
they will wrongly use register values in mcontext_t.

PR:		72998
MFC after:	3 days
2008-08-24 20:02:18 +00:00
marius
b170b03c8e Announce the speed of the PCI bus for informational purpose.
MFC after:	3 days
2008-08-24 16:22:04 +00:00
marius
5e0a924fec The PCI specifications don't explain the details on how to calculate
the latency based on the Min_Gnt register so use the algorithm found
in OpenSolaris as they probably know how to interpret the value Sun
puts into these registers (previously, the latency calculated for
66MHz was most likely wrong) and for bridges additionally set up the
secondary latency register. Also set up the bridge control register
the way it's done in OpenSolaris. As the latency register don't apply
to PCI-Express and the bridge control setup wasn't tested on sun4v
(besides most likely not being needed), expand the #ifndef SUN4V
accordingly.

MFC after:	3 days
2008-08-24 15:05:46 +00:00
marius
5f4692e243 Update the comment regarding the workaround for the BlackBird
TICK_COMPARE bug and the instruction alignment used for it based
on information found in the OpenSolaris source.

MFC after:	3 days
2008-08-23 20:53:27 +00:00
marius
9e48c37294 - Provide and consume module dependency information.
- Fix whitespace bugs.

MFC after:	3 days
2008-08-23 16:07:20 +00:00
marius
ccd3e368a5 - Removed unused sc_node.
- Provide module dependency information.
- Static'ize ebus_release_resource() in order to match prototype.
- Remove outdated and/or obsolete comments.
- Fix whitespace bugs.

MFC after:	3 days
2008-08-23 15:44:13 +00:00
marius
36dc0db8e1 Provide and consume module dependency information.
MFC after:	3 days
2008-08-23 15:20:33 +00:00
marius
7103197d9e Remove clkbrd(4) as a separate device and compile it solely based
on the presence of fhc(4) instead; we by far don't support all of
the functionality provide by the clock board but in general it's
an integral part of FireHose-based systems which shouldn't be
possible to omit.
2008-08-23 14:28:44 +00:00
marius
e7f32da60c - Add kbdmux(4); since sunkbd(4) was tought to emulate atkbd(4) like
ukbd(4) does and that emulation was enabled by default, all three of
  them work together with kbdmux(4) out of the box just fine.
- Fix some whitespace bugs.

MFC after:	3 days
2008-08-23 14:17:00 +00:00
marius
b09c8c9fe7 cosmetic changes and style fixes 2008-08-22 20:28:19 +00:00
marius
fc32a2339d Avoid misaligned access of struct frame.
MFC after:	3 days
2008-08-22 19:05:47 +00:00
ed
cc3116a938 Integrate the new MPSAFE TTY layer to the FreeBSD operating system.
The last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:

- Improved driver model:

  The old TTY layer has a driver model that is not abstract enough to
  make it friendly to use. A good example is the output path, where the
  device drivers directly access the output buffers. This means that an
  in-kernel PPP implementation must always convert network buffers into
  TTY buffers.

  If a PPP implementation would be built on top of the new TTY layer
  (still needs a hooks layer, though), it would allow the PPP
  implementation to directly hand the data to the TTY driver.

- Improved hotplugging:

  With the old TTY layer, it isn't entirely safe to destroy TTY's from
  the system. This implementation has a two-step destructing design,
  where the driver first abandons the TTY. After all threads have left
  the TTY, the TTY layer calls a routine in the driver, which can be
  used to free resources (unit numbers, etc).

  The pts(4) driver also implements this feature, which means
  posix_openpt() will now return PTY's that are created on the fly.

- Improved performance:

  One of the major improvements is the per-TTY mutex, which is expected
  to improve scalability when compared to the old Giant locking.
  Another change is the unbuffered copying to userspace, which is both
  used on TTY device nodes and PTY masters.

Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.

Obtained from:		//depot/projects/mpsafetty/...
Approved by:		philip (ex-mentor)
Discussed:		on the lists, at BSDCan, at the DevSummit
Sponsored by:		Snow B.V., the Netherlands
dcons(4) fixed by:	kan
2008-08-20 08:31:58 +00:00
jhb
d90774443d Export 'struct pcpu' to userland w/o requiring _KERNEL. A few ports
already define _KERNEL to get to this and I'm about to add hooks to
libkvm to access per-CPU data.

MFC after:	1 week
2008-08-19 19:53:52 +00:00
bz
1021d43b56 Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from:	//depot/projects/vimage-commit2/...
Reviewed by:	brooks, des, ed, mav, julian,
		jamie, kris, rwatson, zec, ...
		(various people I forgot, different versions)
		md5 (with a bit of help)
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
X-MFC after:	never
V_Commit_Message_Reviewed_By:	more people than the patch
2008-08-17 23:27:27 +00:00
marius
18594e5bbc cosmetic changes and style fixes 2008-08-13 20:30:28 +00:00
marius
61c5134502 Assume OpenSolaris knows better and use their value for VM_MAX_PROM_ADDRESS. 2008-08-12 20:00:28 +00:00
marius
2f734060b9 - Add sys_tick and the USIII and beyond sys_tick_cmpr to state_regs[].
- Const'ify and static'ize as appropriate.
- Use __FBSDID().
2008-08-12 19:43:36 +00:00
marius
d335d6871a - Reimplement {d,i}tlb_enter() and {d,i}tlb_va_to_pa() in C. There's
no particular reason for them to be implemented in assembler and
  having them in C allows easier extension as well as using more C
  macros and {d,i}tlb_slot_max rather than hard-coding magic (and
  actually spitfire-only) values.
- Fix the compilation of pmap_print_tte().
- Change pmap_print_tlb() to use ldxa() rather than re-rolling it
  inline as well as TLB_DAR_SLOT and {d,i}tlb_slot_max rather than
  hardcoding magic (and actually spitfire-only) values.
- While at it, suffix the above mentioned functions with "_sun4u" to
  underline they're architecture-specific.
- Use __FBSDID and macros instead of magic values in locore.S.
- Remove unused includes and smp_stack in locore.S.
2008-08-07 22:46:25 +00:00
ed
7237d2d9a2 Disconnect drivers that haven't been ported to MPSAFE TTY yet.
As clearly mentioned on the mailing lists, there is a list of drivers
that have not been ported to the MPSAFE TTY layer yet. Remove them from
the kernel configuration files. This means people can now still use
these drivers if they explicitly put them in their kernel configuration
file, which is good.

People should keep in mind that after August 10, these drivers will not
work anymore. Even though owners of the hardware are capable of getting
these drivers working again, I will see if I can at least get them to a
compilable state (if time permits).
2008-08-03 10:32:17 +00:00
marius
a94ec32bd7 - Remove redundant inclusion of opt_global.h.
- Use __FBSDID in autoconf.c.

MFC after:	3 days
2008-07-21 17:15:51 +00:00
delphij
cb283fcdf7 Add HWPMC_HOOKS to GENERIC kernels, this makes hwpmc.ko work out
of the box.
2008-07-07 22:55:11 +00:00
marius
54ef085aee - Merge macros depending on the flags being preserved between calls
into a single "__asm"-statement as GCC doesn't guarantee their
  consecutive output even when using consecutive "__asm __volatile"-
  statement for them. Remove the otherwise unnecessary "__volatile". [1]
- The inline assembler instructions used here alter the condition
  codes so add them to the clobber list accordingly.
- The inline assembler instructions used here uses output operands
  before all input operands are consumed so add appropriate modifiers.

Pointed out by:	bde [1]
MFC after:	2 weeks
2008-07-05 15:44:56 +00:00
marius
6960811ea0 - Fix spelling and style.
- Use __FBSDID.
2008-07-05 15:30:07 +00:00
marius
55c1025147 Revert the addition of "__volatile" to "__asm" done in r180011, since
the condition codes where added to the clobber lists in r180073 the
former is unnecessary.
2008-07-05 15:28:30 +00:00
marius
935cdf2ac6 Improve r180011 by explicitly adding the condition codes to the
clobber list.

Suggested by:	Christoph Mallon
2008-06-27 22:17:14 +00:00
marius
762f29e950 Use "__asm __volatile" rather than "__asm" for instruction sequences
that modify condition codes (the carry bit, in this case). Without
"__volatile", the compiler might add the inline assembler instructions
between unrelated code which also uses condition codes, modifying the
latter.
This prevents the TCP pseudo header checksum calculation done in
tcp_output() from having effects on other conditions when compiled
with GCC 4.2.1 at "-O2" and "options INET6" left out. [1]

Reported & tested by:	Boris Kochergin [1]
MFC after:		3 days
2008-06-25 21:04:59 +00:00
ed
4d6a9685e8 Remove the unused major/minor numbers from iodev and memdev.
Now that st_rdev is being automatically generated by the kernel, there
is no need to define static major/minor numbers for the iodev and
memdev. We still need the minor numbers for the memdev, however, to
distinguish between /dev/mem and /dev/kmem.

Approved by:	philip (mentor)
2008-06-25 07:45:31 +00:00
alc
964def13e2 The VM system no longer uses setPQL2(). Remove it and its helpers. 2008-05-23 04:03:54 +00:00
alc
a8f81206ad Retire pmap_addr_hint(). It is no longer used. 2008-05-18 04:16:57 +00:00
remko
91e9f2c6be Resort the if_ti driver to match the PCI Network cards instead of placing
it under the mii devices list.

PR:		kern/123147
Submitted by:	gavin
Approved by:	imp (mentor, implicit)
MFC after:	3 days
2008-05-17 23:50:00 +00:00
alc
783a45362f Add a stub for pmap_align_superpage() on machines that don't (yet)
implement pmap-level support for superpages.
2008-05-09 23:31:42 +00:00
marius
28f285493a - Remove the BUS_HANDLE_MIN checking in the __BUS_DEBUG_ACCESS macro;
for UPA it should have fulfilled its purpose by now and Fireplane-
  and JBus-based machines are way to messy in organization to implement
  something equivalent.
- Fix a bunch of style(9) bugs.
2008-05-08 21:10:39 +00:00
marius
5d531f6d43 Remove #if 0'ed code referencing no longer existent ecache_flush(). 2008-05-08 21:02:07 +00:00
marius
5f200e421a Use <machine/intr_machdep.h> directly instead of depending on header
pollution in the otherwise unused <sys/pcpu.h>.
2008-05-08 20:57:08 +00:00
marius
57e87ccecb - Use the name returned by device_get_nameunit(9) for the name of the
counter-timer timecounter so the associated SYSCTL nodes don't clash on
  machines having multiple U2P and U2S bridges as well as establishing a
  clear mapping between these bridges and their timecounter device.
- Don't bother setting up a "nice" name for the IOMMU, just use the name
  returned by device_get_nameunit(9), too.
- Fix some minor style(9) bugs.
- Use __FBSDID in counter.c

MFC after:	1 week
2008-05-07 21:22:15 +00:00
sam
39c0719a2e enable IEEE80211_DEBUG and IEEE80211_AMPDU_AGE by default 2008-05-03 17:05:38 +00:00
marius
5c5505b2be Remove an header which is unused for sun4v.
MFC after:	3 days
2008-05-02 17:44:18 +00:00
marius
cea060d682 Remove the MD isa_irq_pending() and the underlying PCI-specific
infrastructure. Its only consumer ever was sio(4) and thus was
unused on sparc64 since removing the last traces of sio(4) in
sparc64 configuration files in favor for uart(4) over three
years ago. If similar functionality is required again it should
be brought back as an MD intr_pending() which works for all
busses by using for example interrupt controller hooks.
2008-04-26 11:01:38 +00:00
jeff
14b586bf96 - Add an integer argument to idle to indicate how likely we are to wake
from idle over the next tick.
 - Add a new MD routine, cpu_wake_idle() to wakeup idle threads who are
   suspended in cpu specific states.  This function can fail and cause the
   scheduler to fall back to another mechanism (ipi).
 - Implement support for mwait in cpu_idle() on i386/amd64 machines that
   support it.  mwait is a higher performance way to synchronize cpus
   as compared to hlt & ipis.
 - Allow selecting the idle routine by name via sysctl machdep.idle.  This
   replaces machdep.cpu_idle_hlt.  Only idle routines supported by the
   current machine are permitted.

Sponsored by:	Nokia
2008-04-25 05:18:50 +00:00
marius
792d568d6d - Include <machine/utrap.h> so this header doesn't have an MD
dependency.
- Make prototypes style(9) compliant.

MFC after:	1 week
2008-04-23 20:38:37 +00:00
marius
f7fcfdc595 o Rename ic_eoi to ic_clear to emphasize the functions it points
don't send and EOI which works like on amd64/i386 and blocks all
  interrupts on the relevant interrupt controller.
o Replace the post_filter and post_inthread hooks registered when
  creating the interrupt events with just ic_clear as on sparc64 we
  don't need to do any disable->EOI->enable dance to unblock all but
  the relevant interrupt while running the filter or handler; just
  not clearing the interrupt already has the same effect.
o Merge from amd64/i386:
  - Split the intr_table_lock into an sx lock used for most things,
    and a spin lock to protect intrcnt_index.
  - Add support for binding interrupts to CPUs, including for the
    bus_bind_intr(9) interface, a assign_cpu hook and initially
    shuffling interrupts arround in a round-robin fashion.

Reviewed by:	jhb
MFC after:	1 month
2008-04-23 20:04:38 +00:00
phk
bbf813673e Make genclock standard on all platforms.
Thanks to: grehan & marcel for platform support on ia64 and ppc.
2008-04-21 10:09:55 +00:00
sam
3569e353ca Multi-bss (aka vap) support for 802.11 devices.
Note this includes changes to all drivers and moves some device firmware
loading to use firmware(9) and a separate module (e.g. ral).  Also there
no longer are separate wlan_scan* modules; this functionality is now
bundled into the wlan module.

Supported by:	Hobnob and Marvell
Reviewed by:	many
Obtained from:	Atheros (some bits)
2008-04-20 20:35:46 +00:00
marius
3215bc9d5f On sparc64 machines with multiple host-PCI-bridges these bridges
have separate configuration spaces so by definition they implement
different PCI domains. Thus change psycho(4) to use PCI domains
instead of reenumerating all PCI busses so they have globally unique
bus numbers and drop support for reenumerating busses in the OFW PCI
code.
According to CVS history reenumeration was also required in order to
get some E450 to boot but given that no other open source kernel
changes the PCI bus numbers assigned by the firmware I believe the
real problem was that the old code used the bus number as the device
number for the PCI busses and unlike most of the other machines the
firmwares of the problematic ones don't use disjoint PCI bus numbers
across the host-PCI-bridges.

MFC after:	1 month
2008-04-17 12:38:00 +00:00
jeff
8efb03d60e - Add the interrupt vector number to intr_event_create so MI code can
lookup hard interrupt events by number.  Ignore the irq# for soft intrs.
 - Add support to cpuset for binding hardware interrupts.  This has the
   side effect of binding any ithread associated with the hard interrupt.
   As per restrictions imposed by MD code we can only bind interrupts to
   a single cpu presently.  Interrupts can be 'unbound' by binding them
   to all cpus.

Reviewed by:	jhb
Sponsored by:	Nokia
2008-04-11 03:26:41 +00:00
marius
90e772efe6 - Add support for IPI_PREEMPT. [1]
- Add my copyright to mp_machdep.c for having implemented support for
  USIII and up and some fixes.

Obtained from:	sun4v (modulo style(9) bugs) [1]
2008-04-09 21:14:01 +00:00
jhb
79918c45a6 Add a MI intr_event_handle() routine for the non-INTR_FILTER case. This
allows all the INTR_FILTER #ifdef's to be removed from the MD interrupt
code.
- Rename the intr_event 'eoi', 'disable', and 'enable' hooks to
  'post_filter', 'pre_ithread', and 'post_ithread' to be less x86-centric.
  Also, add a comment describe what the MI code expects them to do.
- On amd64, i386, and powerpc this is effectively a NOP.
- On arm, don't bother masking the interrupt unless the ithread is
  scheduled in the non-INTR_FILTER case to match what INTR_FILTER did.
  Also, don't bother unmasking the interrupt in the post_filter case if
  we never masked it.  The INTR_FILTER case had been doing this by having
  arm_unmask_irq for the post_filter (formerly 'eoi') hook.
- On ia64, stray interrupts are now masked for the non-INTR_FILTER case.
  They were already masked in the INTR_FILTER case.
- On sparc64, use the a NULL pre_ithread hook and use intr_enable_eoi() for
  both the 'post_filter' and 'post_ithread' hooks to match what the
  non-INTR_FILTER code did.
- On sun4v, retire the ithread wrapper hack by using an appropriate
  'post_ithread' hook instead (it's what 'post_ithread'/'enable' was
  designed to do even in 5.x).

Glanced at by:	piso
Reviewed by:	marius
Requested by:	marius [1], [5]
Tested on:	amd64, i386, arm, sparc64
2008-04-05 19:58:30 +00:00
dfr
dc98ee4196 Add kernel module support for nfslockd and krpc. Use the module system
to detect (or load) kernel NLM support in rpc.lockd. Remove the '-k'
option to rpc.lockd and make kernel NLM the default. A user can still
force the use of the old user NLM by building a kernel without NFSLOCKD
and/or removing the nfslockd.ko module.
2008-03-27 11:54:20 +00:00
jb
34e730ca27 When building a kernel module, define MAXCPU the same as SMP so
that modules work with and without SMP.
2008-03-27 05:03:26 +00:00
phk
fa71439e44 The "free-lance" timer in the i8254 is only used for the speaker
these days, so de-generalize the acquire_timer/release_timer api
to just deal with speakers.

The new (optional) MD functions are:
	timer_spkr_acquire()
	timer_spkr_release()
and
	timer_spkr_setfreq()

the last of which configures the timer to generate a tone of a given
frequency, in Hz instead of 1/1193182th of seconds.

Drop entirely timer2 on pc98, it is not used anywhere at all.

Move sysbeep() to kern/tty_cons.c and use the timer_spkr*() if
they exist, and do nothing otherwise.

Remove prototypes and empty acquire-/release-timer() and sysbeep()
functions from the non-beeping archs.

This eliminate the need for the speaker driver to know about
i8254frequency at all.  In theory this makes the speaker driver MI,
contingent on the timer_spkr_*() functions existing but the driver
does not know this yet and still attaches to the ISA bus.

Syscons is more tricky, in one function, sc_tone(), it knows the hz
and things are just fine.

In the other function, sc_bell() it seems to get the period from
the KDMKTONE ioctl in terms if 1/1193182th second, so we hardcode
the 1193182 and leave it at that.  It's probably not important.

Change a few other sysbeep() uses which obviously knew that the
argument was in terms of i8254 frequency, and leave alone those
that look like people thought sysbeep() took frequency in hertz.

This eliminates the knowledge of i8254_freq from all but the actual
clock.c code and the prof_machdep.c on amd64 and i386, where I think
it would be smart to ask for help from the timecounters anyway [TBD].
2008-03-26 20:09:21 +00:00
marius
dd3d50596e - Const'ify the bus_stream_asi and bus_type_asi arrays.
- Replace hard-coded functions names missed in bus_machdep.c rev. 1.44
  with __func__.
- Break some long lines.

MFC after:	1 month
2008-03-24 17:57:01 +00:00
pjd
2023a8c5fd Oops. Use atomic_add_long() for atomic_fetchadd_long() (not atomic_add_int())
for sparc64 and sun4v.

Noticed by:	marius
2008-03-19 07:27:24 +00:00
jhb
c04bb048f6 Simplify the interrupt code a bit:
- Always include the ie_disable and ie_eoi methods in 'struct intr_event'
  and collapse down to one intr_event_create() routine.  The disable and
  eoi hooks simply aren't used currently in the !INTR_FILTER case.
- Expand 'disab' to 'disable' in a few places.
- Use function casts for arm and i386:intr_eoi_src() instead of wrapper
  routines since to trim one extra indirection.

Compiled on:	{arm,amd64,i386,ia64,ppc,sparc64} x {FILTER, !FILTER}
Tested on:	{amd64,i386} x {FILTER, !FILTER}
2008-03-17 22:42:01 +00:00
pjd
ea49d310bf Implement atomic_fetchadd_long() for all architectures and document it.
Reviewed by:	attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)
2008-03-16 21:20:50 +00:00
rwatson
877d7c65ba In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation.  This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after:	1 month
Discussed with:	imp, rink
2008-03-16 10:58:09 +00:00
jhb
9c113163fb Add preliminary support for binding interrupts to CPUs:
- Add a new intr_event method ie_assign_cpu() that is invoked when the MI
  code wishes to bind an interrupt source to an individual CPU.  The MD
  code may reject the binding with an error.  If an assign_cpu function
  is not provided, then the kernel assumes the platform does not support
  binding interrupts to CPUs and fails all requests to do so.
- Bind ithreads to CPUs on their next execution loop once an interrupt
  event is bound to a CPU.  Only shared ithreads are bound.  We currently
  leave private ithreads for drivers using filters + ithreads in the
  INTR_FILTER case unbound.
- A new intr_event_bind() routine is used to bind an interrupt event to
  a CPU.
- Implement binding on amd64 and i386 by way of the existing pic_assign_cpu
  PIC method.
- For x86, provide a 'intr_bind(IRQ, cpu)' wrapper routine that looks up
  an interrupt source and binds its interrupt event to the specified CPU.
  MI code can currently (ab)use this by doing:

	intr_bind(rman_get_start(irq_res), cpu);

  however, I plan to add a truly MI interface (probably a bus_bind_intr(9))
  where the implementation in the x86 nexus(4) driver would end up calling
  intr_bind() internally.

Requested by:	kmacy, gallatin, jeff
Tested on:	{amd64, i386} x {regular, INTR_FILTER}
2008-03-14 19:41:48 +00:00
jeff
acb93d599c Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to
FreeBSD, the M:N approach taken by the kse library was never developed
to its full potential.  Backwards compatibility will be provided via
libmap.conf for dynamically linked binaries and static binaries will
be broken.
2008-03-12 10:12:01 +00:00
yongari
4e37a0a72b Uncomment vr(4), vr(4) should work on all architectures. 2008-03-11 05:09:03 +00:00
marius
16f60e5947 - Fix some style bugs.
- Replace hard-coded functions names missed in rev. 1.44 with __func__.

MFC after:	1 week
2008-03-09 17:09:15 +00:00
marius
f77ece5c7e - Do as the comment in pmap_bootstrap() suggests and flush all non-locked
TLB entries possibly left over by the firmware and also do so while
  bootstrapping APs.
- Use __FBSDID.

MFC after:	1 month
2008-03-09 15:53:34 +00:00
jeff
ad2a31513f - Remove the old smp cpu topology specification with a new, more flexible
tree structure that encodes the level of cache sharing and other
   properties.
 - Provide several convenience functions for creating one and two level
   cpu trees as well as a default flat topology.  The system now always
   has some topology.
 - On i386 and amd64 create a seperate level in the hierarchy for HTT
   and multi-core cpus.  This will allow the scheduler to intelligently
   load balance non-uniform cores.  Presently we don't detect what level
   of the cache hierarchy is shared at each level in the topology.
 - Add a mechanism for testing common topologies that have more information
   than the MD code is able to provide via the kern.smp.topology tunable.
   This should be considered a debugging tool only and not a stable api.

Sponsored by:	Nokia
2008-03-02 07:58:42 +00:00
marius
e3f122733a The Sun disk label only uses 16-bit fields for cylinders, heads and
sectors so the geometry of large IDE disks has to be adjusted. This
corresponds to what the OpenSolaris dad(7D) driver does except that
the latter only tweaks sectors and effectively limits the mediasize
to 128GB so the cylinders and heads fields won't ever overflow. Not
limiting the mediasize is a compromise between allowing to use Sun
disk label as far as possible and being able to use the entire disk
with another disk label.
This allows to use the full capacity of large IDE disks if they were
not labeled under (Open)Solaris (in both ways of the meaning).

MFC after:	2 weeks
2008-02-11 21:40:22 +00:00
ru
910410640b Add a wrapper function that bound checks writes to the dump device. 2008-01-28 19:04:07 +00:00
yongari
c4d5fd8820 Uncomment sf(4), sf(4) should work on all architectures. 2008-01-21 06:51:25 +00:00
jhb
c7e0e41f73 Add COMPAT_FREEBSD7 and enable it in configs that have COMPAT_FREEBSD6. 2008-01-07 21:40:11 +00:00
alc
545d26e30b Add an access type parameter to pmap_enter(). It will be used to implement
superpage promotion.

Correct a style error in kmem_malloc(): pmap_enter()'s last parameter is
a Boolean.
2008-01-03 07:34:34 +00:00
alc
37cdbd87f5 Add configuration knobs for the superpage reservation system. Initially,
the reservation will only be enabled on amd64.
2007-12-27 16:45:39 +00:00
alc
811e8e3c3f Update two tracepoints, i.e., CTRx() invocations, to reflect the demise of
page coloring a few months ago.
2007-12-27 03:52:14 +00:00
rwatson
bdee30611d Add a new 'why' argument to kdb_enter(), and a set of constants to use
for that argument.  This will allow DDB to detect the broad category of
reason why the debugger has been entered, which it can use for the
purposes of deciding which DDB script to run.

Assign approximate why values to all current consumers of the
kdb_enter() interface.
2007-12-25 17:52:02 +00:00
jkoshy
39d4b4accf Add stubs to unbreak LINT. 2007-12-07 13:45:47 +00:00
rwatson
99285f7544 Break out stack(9) from ddb(4):
- Introduce per-architecture stack_machdep.c to hold stack_save(9).
- Introduce per-architecture machine/stack.h to capture any common
  definitions required between db_trace.c and stack_machdep.c.
- Add new kernel option "options STACK"; we will build in stack(9) if it is
  defined, or also if "options DDB" is defined to provide compatibility
  with existing users of stack(9).

Add new stack_save_td(9) function, which allows the capture of a stacktrace
of another thread rather than the current thread, which the existing
stack_save(9) was limited to.  It requires that the thread be neither
swapped out nor running, which is the responsibility of the consumer to
enforce.

Update stack(9) man page.

Build tested:	amd64, arm, i386, ia64, powerpc, sparc64, sun4v
Runtime tested:	amd64 (rwatson), arm (cognet), i386 (rwatson)
2007-12-02 20:40:35 +00:00
marius
5d3e8757cd Fix a non-fatal off-by-one error in the previous revision. 2007-12-01 19:42:33 +00:00
marius
edd33ddaa9 - Add the PCI side of the HOST-PCI bridge itself to the bus. This
is required by the X.Org PCI domains code and additionally needs
  a workaround for Hummingbird and Sabre bridges as these don't
  allow their config headers to be read at any width, which is an
  unusual behavior.
- In psycho(4) take advantage of DEFINE_CLASS_0 and use more
  appropriate types for some softc members.

MFC after:	3 days
2007-11-30 23:02:42 +00:00
attilio
2562874cb6 Make ADAPTIVE_GIANT as the default in the kernel and remove the option.
Currently, Giant is not too much contented so that it is ok to treact it
like any other mutexes.

Please don't forget to update your own custom config kernel files.

Approved by:	cognet, marcel (maintainers of arches where option is
		not enabled at the moment)
2007-11-28 05:50:45 +00:00
scottl
b607c8d8ad Extend critical section coverage in the low-level interrupt handlers to
include the ithread scheduling step.  Without this, a preemption might
occur in between the interrupt getting masked and the ithread getting
scheduled.  Since the interrupt handler runs in the context of curthread,
the scheudler might see it as having a such a low priority on a busy system
that it doesn't get to run for a _long_ time, leaving the interrupt stranded
in a disabled state.  The only way that the preemption can happen is by
a fast/filter handler triggering a schduling event earlier in the handler,
so this problem can only happen for cases where an interrupt is being
shared by both a fast/filter handler and an ithread handler.  Unfortunately,
it seems to be common for this sharing to happen with network and USB
devices, for example.  This fixes many of the mysterious TCP session
timeouts and NIC watchdogs that were being reported.  Many thanks to Sam
Lefler for getting to the bottom of this problem.

Reviewed by: jhb, jeff, silby
2007-11-21 04:03:51 +00:00
marius
825e639df6 Let sunkbd(4) emulate an AT keyboard by default.
This has the following benefits:
- allows to use the AT keyboard maps in share/syscons/keymaps with
  sunkbd(4),
- allows to use kbdmux(4) with sunkbd(4),
- allows Sun RS232 keyboards to be configured and used the same
  way as Sun USB keyboards driven by ukbd(4) (which also does AT
  keyboard emulation) with X.Org, putting an end to the problem
  of native support for the former in X.Org being broken over and
  over again.

MFC after:	3 days
2007-11-18 18:11:16 +00:00
alc
d1ab859bdc Prevent the leakage of wired pages in the following circumstances:
First, a file is mmap(2)ed and then mlock(2)ed.  Later, it is truncated.
Under "normal" circumstances, i.e., when the file is not mlock(2)ed, the
pages beyond the EOF are unmapped and freed.  However, when the file is
mlock(2)ed, the pages beyond the EOF are unmapped but not freed because
they have a non-zero wire count.  This can be a mistake.  Specifically,
it is a mistake if the sole reason why the pages are wired is because of
wired, managed mappings.  Previously, unmapping the pages destroys these
wired, managed mappings, but does not reduce the pages' wire count.
Consequently, when the file is unmapped, the pages are not unwired
because the wired mapping has been destroyed.  Moreover, when the vm
object is finally destroyed, the pages are leaked because they are still
wired.  The fix is to reduce the pages' wired count by the number of
wired, managed mappings destroyed.  To do this, I introduce a new pmap
function pmap_page_wired_mappings() that returns the number of managed
mappings to the given physical page that are wired, and I use this
function in vm_object_page_remove().

Reviewed by: tegge
MFC after: 6 weeks
2007-11-17 22:52:29 +00:00
marcel
1e7c4f0a3f o Rename cpu_thread_setup() to cpu_thread_alloc() to better
communicate that it relates to (is called by) thread_alloc()
o  Add cpu_thread_free() which is called from thread_free()
   to counter-act cpu_thread_alloc().

i386:	Have cpu_thread_free() call cpu_thread_clean() to
	preserve behaviour.
ia64:	Have cpu_thread_free() call mtx_destroy() for the
	mutex initialized in cpu_thread_alloc().

PR: ia64/118024
2007-11-14 20:21:54 +00:00
kib
9ae733819b Fix for the panic("vm_thread_new: kstack allocation failed") and
silent NULL pointer dereference in the i386 and sparc64 pmap_pinit()
when the kmem_alloc_nofault() failed to allocate address space. Both
functions now return error instead of panicing or dereferencing NULL.

As consequence, vmspace_exec() and vmspace_unshare() returns the errno
int. struct vmspace arg was added to vm_forkproc() to avoid dealing
with failed allocation when most of the fork1() job is already done.

The kernel stack for the thread is now set up in the thread_alloc(),
that itself may return NULL. Also, allocation of the first process
thread is performed in the fork1() to properly deal with stack
allocation failure. proc_linkup() is separated into proc_linkup()
called from fork1(), and proc_linkup0(), that is used to set up the
kernel process (was known as swapper).

In collaboration with:	Peter Holm
Reviewed by:	jhb
2007-11-05 11:36:16 +00:00
marius
03771327ba - Make failure to route a ISA interrupt non fatal. Apparently the
Blade 1500/SX1500 boards have inherited the firmware bug of the
  AX1105 mainboards to not include an interrupt map entry for the
  parallel port controller (for the AX1105 the heuristic code for
  E450s probably erroneously kicks in and guesses an interrupt).
- Take advantage of bus_generic_setup_intr(9).
- Fix some whitespace bugs.
2007-10-28 22:08:37 +00:00
marius
a060d7dfbc - Fix the handling of R_SPARC_OLO10, which is a bit of a special case
in the way we implement handling of relocations.
  As for the kernel part this fixes the loading of lots of modules,
  which failed to load due to unresolvable symbols when built after
  the GCC 4.2.0 import. This wasn't due to a change in GCC itself
  though but one of several changes in configuration done along the
  import. Specfically, HAVE_AS_REGISTER_PSEUDO_OP, which causes GCC
  to denote global registers used for scratch purposes and in turn
  GAS uses R_SPARC_OLO10 relocations for, is now defined.
  While at it replace some more ELF_R_TYPE which should have been
  ELF64_R_TYPE_ID but didn't cause problems so far.
- Sync a sanity check between kernel and rtld(1) and change it to be
  maintenance free regarding the type used for the lookup table.
- Sprinkle const on lookup tables.
- Use __FBSDID.

Reported and tested by:	yongari
MFC after:		5 days
2007-10-16 19:17:48 +00:00
alc
19c4fce2e3 Correct a lock assertion failure in sparc64's pmap_page_is_mapped() that is
a consequence of sparc64/sparc64/vm_machdep.c revision 1.76.  It occurs
when uma_small_free() frees a page.  The solution has two parts: (1) Mark
pages allocated with VM_ALLOC_NOOBJ as PG_UNMANAGED.  (2) Defer the lock
assertion in pmap_page_is_mapped() until after PG_UNMANAGED is tested.
This is safe because both PG_UNMANAGED and PG_FICTITIOUS are immutable
flags, i.e., they do not change state between the time that a page is
allocated and freed.

Approved by:	re (kensmith)
PR:		116794
2007-10-07 18:03:03 +00:00
marius
d60b8a3096 Make the PCI code aware of PCI domains (aka PCI segments) so we can
support machines having multiple independently numbered PCI domains
and don't support reenumeration without ambiguity amongst the
devices as seen by the OS and represented by PCI location strings.
This includes introducing a function pci_find_dbsf(9) which works
like pci_find_bsf(9) but additionally takes a domain number argument
and limiting pci_find_bsf(9) to only search devices in domain 0 (the
only domain in single-domain systems). Bge(4) and ofw_pcibus(4) are
changed to use pci_find_dbsf(9) instead of pci_find_bsf(9) in order
to no longer report false positives when searching for siblings and
dupe devices in the same domain respectively.
Along with this change the sole host-PCI bridge driver converted to
actually make use of PCI domain support is uninorth(4), the others
continue to use domain 0 only for now and need to be converted as
appropriate later on.
Note that this means that the format of the location strings as used
by pciconf(8) has been changed and that consumers of <sys/pciio.h>
potentially need to be recompiled.

Suggested by:	jhb
Reviewed by:	grehan, jhb, marcel
Approved by:	re (kensmith), jhb (PCI maintainer hat)
2007-09-30 11:05:18 +00:00
marius
6cdd0265f8 - Use the actual clock frequency of the PCI bus instead of assuming
33MHz for calculating the latency timer values for its children.
  Inspired by NetBSD doing the same and Linux as well as OpenSolaris
  using a similar approach.
  While at it rename a variable and change its type to be more
  appropriate fuer values of PCI properties so the variable can be
  more easily reused.
- Initialize the cache line size register of PCI devices to a
  legal value; the cache line size is limited to 64 bytes by the
  Fireplane/Safari, JBus and UPA interconnection busses. Setting
  it to an unsupported value caused bad performance at least with
  GEM as it causes them to not do cache line bursts and to not
  issue cache line commands on the PCI bus.

Approved by:	re (kensmith)
MFC after:	1 week
2007-09-26 20:10:36 +00:00
brueffer
26461bf019 Use the correct expanded name for SCTP.
PR:		116496
Submitted by:	koitsu
Reviewed by:	rrs
Approved by:	re (kensmith)
2007-09-26 20:05:07 +00:00
alc
d1bce06c64 Change the management of cached pages (PQ_CACHE) in two fundamental
ways:

(1) Cached pages are no longer kept in the object's resident page
splay tree and memq.  Instead, they are kept in a separate per-object
splay tree of cached pages.  However, access to this new per-object
splay tree is synchronized by the _free_ page queues lock, not to be
confused with the heavily contended page queues lock.  Consequently, a
cached page can be reclaimed by vm_page_alloc(9) without acquiring the
object's lock or the page queues lock.

This solves a problem independently reported by tegge@ and Isilon.
Specifically, they observed the page daemon consuming a great deal of
CPU time because of pages bouncing back and forth between the cache
queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE).  The source of
this problem turned out to be a deadlock avoidance strategy employed
when selecting a cached page to reclaim in vm_page_select_cache().
However, the root cause was really that reclaiming a cached page
required the acquisition of an object lock while the page queues lock
was already held.  Thus, this change addresses the problem at its
root, by eliminating the need to acquire the object's lock.

Moreover, keeping cached pages in the object's primary splay tree and
memq was, in effect, optimizing for the uncommon case.  Cached pages
are reclaimed far, far more often than they are reactivated.  Instead,
this change makes reclamation cheaper, especially in terms of
synchronization overhead, and reactivation more expensive, because
reactivated pages will have to be reentered into the object's primary
splay tree and memq.

(2) Cached pages are now stored alongside free pages in the physical
memory allocator's buddy queues, increasing the likelihood that large
allocations of contiguous physical memory (i.e., superpages) will
succeed.

Finally, as a result of this change long-standing restrictions on when
and where a cached page can be reclaimed and returned by
vm_page_alloc(9) are eliminated.  Specifically, calls to
vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and
return a formerly cached page.  Consequently, a call to malloc(9)
specifying M_NOWAIT is less likely to fail.

Discussed with: many over the course of the summer, including jeff@,
   Justin Husted @ Isilon, peter@, tegge@
Tested by: an earlier version by kris@
Approved by: re (kensmith)
2007-09-25 06:25:06 +00:00
jeff
3fc0f8b973 - Move all of the PS_ flags into either p_flag or td_flags.
- p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or
   previously the sched_lock.  These bugs have existed for some time.
 - Allow swapout to try each thread in a process individually and then
   swapin the whole process if any of these fail.  This allows us to move
   most scheduler related swap flags into td_flags.
 - Keep ki_sflag for backwards compat but change all in source tools to
   use the new and more correct location of P_INMEM.

Reported by:	pho
Reviewed by:	attilio, kib
Approved by:	re (kensmith)
2007-09-17 05:31:39 +00:00
alc
20b10da706 It has been observed on the mailing lists that the different categories
of pages don't sum to anywhere near the total number of pages on amd64.
This is for the most part because uma_small_alloc() pages have never been
counted as wired pages, like their kmem_malloc() brethren.  They should
be.  This changes fixes that.

It is no longer necessary for the page queues lock to be held to free
pages allocated by uma_small_alloc().  I removed the acquisition and
release of the page queues lock from uma_small_free() on amd64 and ia64
weeks ago.  This patch updates the other architectures that have
uma_small_alloc() and uma_small_free().

Approved by: re (kensmith)
2007-09-15 18:47:02 +00:00
marius
be8d1ddc2e o Revamp the sparc64 interrupt code in order to be able to interface
with the INTR_FILTER-enabled MI code. Basically this consists of
  registering an interrupt controller (of which there can be multiple
  and optionally different ones either per host-to-foo bridge or shared
  amongst host-to-foo bridges in any one machine) along with an interrupt
  vector as specific argument for all the interrupt vectors used by a
  given host-to-foo bridge (roughly similar to registering interrupt
  sources on amd64 and i386), providing functions to enable, clear and
  disable the interrupts of the children beneath the bridge.
  This also includes:
  - No longer entering a critical section in tl0_intr() and tl1_intr()
    for executing interrupt handlers but rather let the handlers enter
    it themselves so in the case of intr_event_handle() we don't enter
    a nested critical section.
  - Adding infrastructure for binding delivery of interrupt vectors to
    specific CPUs which later on can be interfaced with the code from
    amd64/i386 for binding interrupts to specific CPUs.
  - Getting rid of the wrapper hack introduced along the lines of the
    API changes for INTR_FILTER which as a side-effect caused interrupts
    associated with ithread handlers only to get the elevated priority
    of those associated with filters ("fast handlers") (this removes the
    hack also in the non-INTR_FILTER case).
  - Disabling (by not clearing) an interrupt in the interrupt controller
    until all associated handlers have been executed, which is crucial
    for the typical locking strategy of NIC drivers in order to work
    correctly in case of shared interrupts. This was a more or less
    theoretical problem on sparc64 though, as shared interrupts are
    rather uncommon there except for the on-board SCCs and UARTs.
  Note that due to the behavior of at least of some of the interrupt
  controllers used on sparc64 an enable+EOI instead of a disable+EOI
  approach (as implied by the INTR_FILTER MI code and implemented on
  other architectures) is used as the latter can cause lost interrupts
  or in the worst case interrupt starvation.
o Correct a typo in sbus_alloc_resource() which caused (pass-through)
  allocations to only work down to the grandchildren of the bus, which
  wasn't a real problem so far as we don't support any devices which are
  great-grandchildren or greater of a U2S bridge, yet.
o In fhc(4) use bus_{read,write}_4() instead of bus_space_{read,write}_4()
  in order to get rid of sc_bh and sc_bt in the fhc_softc. Also get rid
  of some other unneeded members in fhc_softc.

Reviewed by:	marcel (earlier version)
Approved by:	re (kensmith)
2007-09-06 19:16:30 +00:00
marius
02fe3685d7 Style(9) fix - use #define<tab> consistently.
Approved by:	re (kensmith)
2007-09-06 14:56:09 +00:00
marius
5a1a2bd9cf - Divorce the IOTSBs, which so far where handled via a global list
instead of per IOMMU, so we no longer need to program all of them
  identically in systems having multiple IOMMUs. This continues the
  rototilling of the nexus(4) done about 5 months ago, which amongst
  others changed nexus(4) and the drivers for host-to-foo bridges
  to provide bus_get_dma_tag methods, allowing to handle DMA tags in
  a hierarchical way and to link them with devices.
  This still doesn't move the silicon bug workarounds for Sabre (and
  in the uncommitted schizo(4) for Tomatillo) bridges into special
  bus_dma_tag_create() and bus_dmamap_sync() methods though, as w/o
  fully newbus'ified bus_dma_tag_create() and bus_dma_tag_destroy()
  this still requires too much hackery, i.e. per-child parent DMA
  tags in the parent driver.
- Let the host-to-foo drivers supply the maximum physical address
  of the IOMMU accompanying the bridges. Previously iommu(4) hard-
  coded an upper limit of 16GB, which actually only applies to the
  IOMMUs of the Hummingbird and Sabre bridges. The Psycho variants
  as well as the U2S in fact can can translate to up to 2TB, i.e.
  translate to 41-bit physical addresses. According to the recently
  available Tomatillo documentation these bridges even translate to
  43-bit physical addresses and hints at the Schizo bridges doing
  43 bits as well.
  This fixes the issue the FreeBSD 6.0 todo list item "Max RAM on
  sparc64" was refering to and pretty much obsoletes the lack of
  support for bounce buffers on sparc64.

Thanks to Nathan Whitehorn for pointing me at the Tomatillo manual.

Approved by:	re (kensmith)
2007-08-05 11:56:44 +00:00
dwmalone
e8276674f3 If clock_ct_to_ts fails to convert time time from the real time clock,
print a one line error message. Add some comments on not being able to
trust the day of week field (I'll act on these comments in a follow up
commit).

Approved by:	re
MFC after:	3 weeks
2007-07-23 09:42:32 +00:00
jeff
e6da269e0c - Remove the global definition of sched_lock in mutex.h to break
new code and third party modules which try to depend on it.
 - Initialize sched_lock in sched_4bsd.c.
 - Declare sched_lock in sparc64 pmap.c and assert that we're compiling
   with SCHED_4BSD to prevent accidental crashes from running ULE.  This
   is the sole remaining file outside of the scheduler that uses the
   global sched_lock.

Approved by:	re
2007-07-18 20:46:06 +00:00
marius
d1ccb2d7c7 - Move ofw_pci_alloc_busno() to the ofw_pci KOBJ interface,
allowing the driver for the host-PCI-bridge to indicate that
  reenumeration of the PCI busses isn't supported by returning
  -1 instead of a valid PCI bus number. This is needed in order
  support both Tomatillo, which don't support reenumeration and
  thus are apparently intended to be used for independently
  numbered PCI domains only, and Psycho bridges, whose busses
  need to be reenumerated on at least some E450, without the
  #ifndef currently used for sun4v in order to support multiple
  independently PCI domains. The actual allocation/incrementation
  of the PCI bus numbers is now done in psycho(4), though it
  no longer establish a mapping between bus numbers and device
  nodes like ofw_pci_alloc_busno() did as that functionality
  wasn't used (but can easily brought back if really needed).
  The now no longer used sys/sparc64/pci/ofw_pci.c is also
  removed from sys/conf/files.sun4v as ofw_pci_alloc_busno()
  wasn't used there in the first place.
- In ofw_pci_default_{adjust_busrange,intr_pending}() sanity
  check that the device has a parent before passing it on.
- Make psycho_softcs static to sys/sparc64/pci/psycho.c as
  it's not used outside of that module.
- In sys/sparc64/pci/ofw_pcib_subr.c remove the superfluous
  inclusion of opt_global.h and correct the debug output for
  adjusting the subordinate bus number.
2007-06-18 21:49:42 +00:00
marius
5af2c4982c For sun4u also add PCI busses with a device unit number of -1
instead of using the PCI bus number, like it's already done for
sun4v in order to deal properly with independently numbered PCI
domains which can't be reenumerated (in the case of sun4u f.e.
Tomatillo bridges). For machines where we need to reenumerate
all PCI busses this change obviously introduces the theoretical
cosmetic problem that the device number of the PCI bus no longer
equals to its PCI bus number. In practice this doesn't happen
as both are assigned linearly and in parallel.
2007-06-18 21:46:07 +00:00
marius
c8a8a74641 Remove unused softc. 2007-06-17 16:44:08 +00:00
marius
a9f02e3bc0 - Don't register the over-temperature and power-fail interrupt
handlers as filter/"fast" handlers so shutdown_nice() can
  acquire the process lock.
- Use bus_{read,write}_8() instead of bus_space_{read,write}_8()
  in order to get rid of sc_bushandle and sc_bustag in the softc.
- Remove the banal and outdated comment above sbus_filter_stub().
2007-06-16 23:49:41 +00:00
marius
c27e004e36 - Use the newly introduced pcib_mtx spin lock to lock psycho_ce(),
allowing it to be a filter/"fast" handler. Locking the interrupt
  handlers with a spin lock is mainly a requirement in schizo(4)
  but as we ought to register the spin lock anyway it should not
  hurt to take advantage of it in psycho(4).
- Pass both a driver_filter_t and a driver_intr_t argument to
  psycho_set_intr(), allowing to get rid of the FAST interrupt
  flag hack.
- Don't register the over-temperature interrupt handler as filter/
  "fast" handler so shutdown_nice() can acquire the process lock.
- Use bus_{read,write}_8() instead of bus_space_{read,write}_8()
  in order to get rid of sc_bushandle and sc_bustag in the softc.
- Correct the debug output for adjusting the subordinate bus number.
- Remove the banal and outdated above psycho_filter_stub().
- Fix some white space nits.
2007-06-16 23:46:41 +00:00
marius
ee65d962e2 - Add support for sending IPIs with USIII and greater sun4u CPUs.
These CPUs use an enhanced layout of the interrupt vector dispatch
  and dispatch status registers in order to allow sending IPIs to
  multiple targets simultaneously. Thus support for these CPUs was
  put in a newly added cheetah_ipi_selected(). This is intended to
  be pointed to by cpu_ipi_selected, which now is a function pointer,
  in order to avoid cpu_impl checks once booted. Alternatively it
  can point to spitfire_ipi_selected(), which was renamed from
  cpu_ipi_selected(). Consequently cpu_ipi_send() was also renamed
  to spitfire_ipi_send() (there's no need for a cheetah equivalent
  of this so far). Initialization of the cpu_ipi_selected pointer
  and other requirements is done in mp_init(), which was renamed
  from mp_tramp_alloc(), as cpu_mp_start() isn't called on UP
  systems while cpu_ipi_selected() is. As a side-effect this allows
  to make mp_tramp static to sys/sparc64/sparc64/mp_machdep.c.
  For the sake of avoiding #ifdef SMP and for keeping the history in
  place cheetah_ipi_selected() and spitfire_ipi_{selected,send}()
  where not put into/moved to sys/sparc64/sparc64/{cheetah,spitfire}.c
- Add some CTASSERTs and KASSERTs ensuring that MAXCPU doesn't
  exceed the data types we use to store the CPU bit fields or the
  number of USIII and greater CPUs supported by the current
  cheetah_ipi_selected() implementation (which for JBus-CPUs is
  only 4; that should be fine though as according to OpenSolaris
  there are no sun4u machines with more than 4 JBus-CPUs).
- In cpu_mp_start() don't enumerate and start more than MAXCPU CPUs
  as we can't handle more than that.
- In cpu_mp_start() check for upa-portid vs. portid depending on
  cpu_impl for consistency with nexus(4).
- In spitfire_ipi_selected() add KASSERTs ensuring that a CPU isn't
  told to IPI itself as sun4u CPUs just can't do that.
- In spitfire_ipi_send() do a MEMBAR #Sync after writing the
  interrupt vector data as we want to make sure the payload was
  actually written before we trigger the dispatch.
- In spitfire_ipi_send() also verify IDR_BUSY when checking whether
  the dispatch was successful as it has to be cleared for this to
  be the case.
- Remove some redundant variables.
2007-06-16 23:26:00 +00:00
marius
e1c912de86 - Flesh out the support for the EBus variant which actually is the
RTC function of a National Semiconductor PC87317/PC97317. This
  consists of using the century register the same way Solaris does
  for compatibility reasons. Once there is a MD power(4) we'd also
  want to interface the APC (Advanced Power Control) functionality
  of the same chip function with it.
- Use a macro for the device description and take advantage of
  ISA_PNP_PROBE() setting the device description.
- Use the generated typedefs for the prototypes of the device
  interface functions.
2007-06-16 23:17:23 +00:00
marius
2f5e33cdc9 Remove the code for displaying the OFW hostid during boot for the
reasons outlined in the comment removed along with it, because the
OFW hostid has no real meaning for FreeBSD and mainly so the OFW
hostid is not confused with the FreeBSD hostid.
2007-06-16 23:07:53 +00:00
delphij
c990e91fd1 Enable SCTP by default for GENERIC kernels in order to give it
more exposure.  The current state of SCTP implementation is
considered to be ready for 32-bit platforms, but still need some
work/testing on 64-bit platforms.

Approved by:	re (kensmith)
Discussed with:	rrs
2007-06-14 17:14:27 +00:00
thompsa
000ee7da5f Add wlan_scan_ap and wlan_scan_sta to platforms that include wlan. 2007-06-11 08:26:40 +00:00
marcel
a3226c8185 Use default options for default partitioning schemes, rather than
making the relevant files standard. This avoids duplication and
makes it easier to override/disable unwanted schemes. Since ARM
doesn't have a DEFAULTS configuration file, leave the source
files for the BSD and MBR partitioning schemes in files.arm for
now.
2007-06-11 00:38:06 +00:00
marcel
75588c5a15 Add kdb_cpu_sync_icache(), intended to synchronize instruction
caches with data caches after writing to memory. This typically
is required to make breakpoints work on ia64 and powerpc. For
those architectures the function is implemented.
2007-06-09 21:55:17 +00:00
rwatson
4f2da72e8b Enable AUDIT by default in the GENERIC kernel, allowing security event
auditing to be turned on without a kernel recompile, just an rc.conf
option.

Approved by:	re (kensmith)
Obtained from:	TrustedBSD Project
2007-06-08 20:29:07 +00:00
piso
0533835c9e Teach the bridge wrapper how to handle the filter+ithread case.
Reviewed by: marius
2007-06-06 22:19:23 +00:00
jeff
be3241715a - Change comments and asserts to reflect the removal of the global
scheduler lock.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:57:32 +00:00
jeff
20e0f793e8 Commit 10/14 of sched_lock decomposition.
- Use sched_throw() rather than replicating the same cpu_throw() code for
   each architecture.  This also allows the scheduler to use any locking it
   may want to.
 - Use the thread_lock() rather than sched_lock when preempting.
 - The scheduler lock is not required to synchronize release_aps.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-04 23:56:08 +00:00
attilio
e333d0ff0e Rework the PCPU_* (MD) interface:
- Rename PCPU_LAZY_INC into PCPU_INC
- Add the PCPU_ADD interface which just does an add on the pcpu member
  given a specific value.

Note that for most architectures PCPU_INC and PCPU_ADD are not safe.
This is a point that needs some discussions/work in the next days.

Reviewed by: alc, bde
Approved by: jeff (mentor)
2007-06-04 21:38:48 +00:00
alc
4d78df42fa Add the machine-specific definitions for configuring the new physical
memory allocator.

Approved by:	re
2007-06-04 02:32:07 +00:00
alc
67d75e2c91 Prepare for the new physical memory allocator: Change the way that the
physical page's color is obtained.

Approved by:	re
2007-06-03 19:39:38 +00:00
attilio
7dd8ed88a9 Revert VMCNT_* operations introduction.
Probabilly, a general approach is not the better solution here, so we should
solve the sched_lock protection problems separately.

Requested by: alc
Approved by: jeff (mentor)
2007-05-31 22:52:15 +00:00
piso
42dfc78150 In some particular cases (like in pccard and pccbb), the real device
handler is wrapped in a couple of functions - a filter wrapper and an
ithread wrapper. In this case (and just in this case), the filter
wrapper could ask the system to schedule the ithread and mask the
interrupt source if the wrapped handler is composed of just an ithread
handler: modify the "old" interrupt code to make it support
this situation, while the "new" interrupt code is already ok.

Discussed with: jhb
2007-05-31 19:25:35 +00:00
yongari
f53195d29a Honor maxsegsz of less than a page size in a DMA tag. Previously it
used to return PAGE_SIZE without respect to restrictions of a DMA tag.
This affected all of the busdma load functions that use
_bus_dmamap_loader_buffer() as their back-end.

Reviewed by:	scottl
2007-05-29 06:30:26 +00:00
simokawa
8adde71ecb Enable fwip and dcons in GENERIC. They seem fairly stable.
Note on dcons:
To enable dcons in kernel, put the following lines in /boot/loader.conf.
You may also want to enable dcons in /etc/ttys.

boot_multicons="YES"
#Force dcons to be the high-level console if a firewire bus presents.
#hw.firewire.dcons_crom.force_console=1

FireWire/dcons support in loader will come shortly.
(i386/amd64 only)
2007-05-28 14:38:43 +00:00
kan
4c2d706212 Allow FreeBSD's native ELF image activators to execute shared libraries the
same way it was enabled for Linux binares in linuxulator.

This allows binaries built with -pie. Many ports auto-detect -fPIE support
in GCC 4.2 and build binaries FreeBSD was unable to run.
2007-05-22 02:22:58 +00:00
jeff
953418f0d5 - rename VMCNT_DEC to VMCNT_SUB to reflect the count argument.
Suggested by:	julian@
Contributed by:	attilio@
2007-05-20 22:33:42 +00:00
marius
6aa8341d75 - Staticize cpu_ipi_send() and cpu_mp_unleash() as these aren't
referenced outside of mp_machdep.c
- Replace a magic 14 with the newly added IDC_ITID_SHIFT macro.
- Remove the global mp_boot_mid variable as it's not really necessary
  and just replacing it with PCPU_GET(mid) doesn't have any impact on
  performance once booted.
- Replace PCPU_GET(cpuid) with the curcpu shortcut.
- Replace hardcoded function names in panic strings etc with __func__
  so they don't need to be updated when renaming the function.
- Use register_t instead of u_long for variables used to hold the
  return value of intr_disable() so we don't need to apply any
  knowledge about the actual width of that value here.
- Improve the wording of some comments.
- Fix several style(9) bugs.
2007-05-20 14:49:01 +00:00
marius
743faad470 - Also identify USIIIi+, USIV and USIV+ CPUs.
- Use __FBSDID in identcpu.c.
- Remove #ifndef SUN4V around global cpu_impl variable; it doesn't
  hurt on sun4v for now and once setPQL2() is gone sun4v can stop
  sharing identcpu.c with sparc64, making the reminder of this file
  also sparc64-only again. [1]

Submitted by:	kmacy [1]
2007-05-20 13:47:36 +00:00
marius
42e374c608 Delete the unused/not really used sparc64 (as in sun4u) cache.h,
iommureg.h (which already began to bitrot) and iommuvar.h from the
sun4v source and adjust some of the source which is shared between
sparc64 and sun4v as appropriate.
2007-05-20 13:06:45 +00:00
kan
ad6731806a Include machine/pcb.hto turn extern struct pcb stoppcbs[]; construct
into the valid C.
2007-05-19 05:01:43 +00:00
jeff
e1996cb960 - define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating
vmcnts.  This can be used to abstract away pcpu details but also changes
   to use atomics for all counters now.  This means sched lock is no longer
   responsible for protecting counts in the switch routines.

Contributed by:		Attilio Rao <attilio@FreeBSD.org>
2007-05-18 07:10:50 +00:00
marius
6d578f19e4 - Add bits for userland profiling. For sun4u this is compile-tested only.
- Replace magic 14 with PIL_TICK.
2007-05-11 23:43:55 +00:00
alc
b34f6f7ab1 Define every architecture as either VM_PHYSSEG_DENSE or
VM_PHYSSEG_SPARSE depending on whether the physical address space is
densely or sparsely populated with memory.  The effect of this
definition is to determine which of two implementations of
vm_page_array and PHYS_TO_VM_PAGE() is used.  The legacy
implementation is obtained by defining VM_PHYSSEG_DENSE, and a new
implementation that trades off time for space is obtained by defining
VM_PHYSSEG_SPARSE.  For now, all architectures except for ia64 and
sparc64 define VM_PHYSSEG_DENSE.  Defining VM_PHYSSEG_SPARSE on ia64
allows the entirety of my Itanium 2's memory to be used.  Previously,
only the first 1 GB could be used.  Defining VM_PHYSSEG_SPARSE on
sparc64 allows USIIIi-based systems to boot without crashing.

This change is a combination of Nathan Whitehorn's patch and my own
work in perforce.

Discussed with: kmacy, marius, Nathan Whitehorn
PR:		112194
2007-05-05 19:50:28 +00:00
marius
ea30d35434 Use the VIS-based Spitfire version of the page copying and zeroing
functions with CPUs they apply to only, otherwise default to the
plain C functions. This is modeled in a way so that f.e. a Cheetah
version of these functions can be inserted easily.
2007-05-01 16:19:28 +00:00
marius
61e9800ad7 Make the rman(9) workaround actually work. The main problem was that
the UPA_IMR2 resource is also shared with/a subset of the Schizo PCI
bus B CSR bank. I'm not entirely sure how this previously managed to
escape testing...
2007-05-01 15:02:18 +00:00
sepotvin
a1e73b1eaf Add support for specifying a minimal size for vm.kmem_size in the loader via
vm.kmem_size_min. Useful when using ZFS to make sure that vm.kmem size will
be at least 256mb (for example) without forcing a particular value via vm.kmem_size.

Approved by: njl (mentor)
Reviewed by: alc
2007-04-21 01:14:48 +00:00
pjd
f4e110ebf2 Remove trailing '.' for consistency! 2007-04-10 21:40:13 +00:00
pjd
b159725895 Add UFS_GJOURNAL options to the GENERIC kernel.
Approved by:	re (kensmith)
2007-04-10 16:49:41 +00:00
alc
b03ddb707b Push down the implementation of PCPU_LAZY_INC() into the machine-dependent
header file.  Reimplement PCPU_LAZY_INC() on amd64 and i386 making it
atomic with respect to interrupts.

Reviewed by: bde, jhb
2007-03-11 05:54:29 +00:00
mohans
a332cb00d5 Over NFS, an open() call could result in multiple over-the-wire
GETATTRs being generated - one from lookup()/namei() and the other
from nfs_open() (for cto consistency). This change eliminates the
GETATTR in nfs_open() if an otw GETATTR was done from the namei()
path. Instead of extending the vop interface, we timestamp each attr
load, and use this to detect whether a GETATTR was done from namei()
for this syscall. Introduces a thread-local variable that counts the
syscalls made by the thread and uses <pid, tid, thread syscalls> as
the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on
thread state that could be used as the timestamp with minimal overhead.
2007-03-09 04:02:38 +00:00
marius
3ee9e586b3 Rototill the sparc64 nexus(4) (actually this brings in the code the
sun4v nexus(4) in turn is based on):
o Change nexus(4) to manage the resources of its children so the
  respective device drivers don't need to figure them out of OFW
  themselves.
o Change nexus(4) to provide the ofw_bus KOBJ interface instead of
  using IVARs for supplying the OFW node and the subset of standard
  properties of its children. Together with the previous change this
  also allows to fully take advantage of newbus in that drivers like
  fhc(4), which attach on multiple parent busses, no longer require
  different bus front-ends as obtaining the OFW node and properties
  as well as resource allocation works the same for all supported
  busses. As such this change also is part 4/4 of allowing creator(4)
  to work in USIII-based machines as it allows this driver to attach
  on both nexus(4) and upa(4). On the other hand removing these IVARs
  breaks API compatibility with the powerpc nexus(4) but which isn't
  that bad as a) sparc64 currently doesn't share any device driver
  hanging off of nexus(4) with powerpc and b) they were no longer
  compatible regarding OFW-related extensions at the pci(4) level
  since quite some time.
o Provide bus_get_dma_tag methods in nexus(4) and its children in
  order to handle DMA tags in a hierarchical way and get rid of the
  sparc64_root_dma_tag kludge. Together with the previous two items
  this changes also allows to completely get rid of the nexus(4)
  IVAR interface. It also includes:
  - pushing the constraints previously specified by the nexus_dmatag
    down into the DMA tags of psycho(4) and sbus(4) as it's their
    IOMMUs which induce these restrictions (and nothing at the
    nexus(4) or anything that would warrant specifying them there),
  - fixing some obviously wrong constraints of the psycho(4) and
    sbus(4) DMA tags, which happened to not actually be used with
    the sparc64_root_dma_tag kludge in place and therefore didn't
    cause problems so far,
  - replacing magic constants for constraints with macros as far
    as it is obvious as to where they come from.
  This doesn't include taking advantage of the newbus way to get
  the parent DMA tags implemented by this change in order to divorce
  the IOTSBs of the PCI and SBus IOMMUs or for implementing the
  workaround for the DMA sync bug in Sabre (and Tomatillo) bridges,
  yet, though.
o Get rid of the notion that nexus(4) (mostly) reflects an UPA bus
  by replacing ofw_upa.h and with ofw_nexus.h (which was repo-copied
  from ofw_upa.h) and renaming its content, which actually applies to
  all of Fireplane/Safari, JBus and UPA (in the host bus case), as
  appropriate.
o Just use M_DEVBUF instead of a separate M_NEXUS malloc type for
  allocating the device info for the children of nexus(4). This is
  done in order to not need to export M_NEXUS when deriving drivers
  for subordinate busses from the nexus(4) class.
o Use the DEFINE_CLASS_0() macro to declare the nexus(4) driver so
  we can derive subclasses from it.
o Const'ify the nexus_excl_name and nexus_excl_type arrays as well
  as add 'associations' and 'rsc', which are pseudo-devices without
  resources and therefore of no real interest for nexus(4), to the
  former.
o Let the nexus(4) device memory rman manage the entire 64-bit address
  space instead of just the UPA_MEMSTART to UPA_MEMEND subregion as
  Fireplane/Safari- and JBus-based machines use multiple ranges,
  which can't be as easily divided as in the case of UPA (limiting
  the address space only served for sanity checking anyway).
o Use M_WAITOK instead of M_NOWAIT when allocating the device info
  for children of nexus(4) in order to give one less opportunity
  for adding devices to nexus(4) to fail.
o While adapting the drivers affected by the above nexus(4) changes,
  change them to take advantage of rman_get_rid() instead of caching
  the RIDs assigned to allocated resources, now that the RIDs of
  resources are correctly set.
o In iommu(4) and nexus(4) replace hard-coded functions names, which
  actually became outdated in several places, in panic strings and
  status massages with __func__. [1]
o Use driver_filter_t in prototypes where appropriate.
o Add my copyright to creator(4), fhc(4), nexus(4), psycho(4) and
  sbus(4) as I changed considerable amounts of these drivers as well
  as added a bunch of new features, workarounds for silicon bugs etc.
o Fix some white space nits.

Due to lack of access to Exx00 hardware, these changes, i.e. central(4)
and fhc(4), couldn't be runtime tested on such a machine. Exx00 are
currently reported to panic before trying to attach nexus(4) anyway
though.

PR:		76052 [1]
Approved by:	re (kensmith)
2007-03-07 21:13:51 +00:00
piso
378d044a37 Wrap at 80 bus_setup_intr() in upa_setup_intr(). 2007-03-06 12:19:37 +00:00
marius
1950dd41df Use uma_set_align(). 2007-02-25 10:52:47 +00:00
piso
6a2ffa86e5 o break newbus api: add a new argument of type driver_filter_t to
bus_setup_intr()

o add an int return code to all fast handlers

o retire INTR_FAST/IH_FAST

For more info: http://docs.freebsd.org/cgi/getmsg.cgi?fetch=465712+0+current/freebsd-current

Reviewed by: many
Approved by: re@
2007-02-23 12:19:07 +00:00
brooks
beaea8e48e Include GEOM_LABEL in GENERIC. It's very useful and not well publicized
enough.

Approved by:	pjd
2007-02-09 19:03:18 +00:00
marcel
0245423ad8 Evolve the ctlreq interface added to geom_gpt into a generic
partitioning class that supports multiple schemes. Current
schemes supported are APM (Apple Partition Map) and GPT.
Change all GEOM_APPLE anf GEOM_GPT options into GEOM_PART_APM
and GEOM_PART_GPT (resp).

The ctlreq interface supports verbs to create and destroy
partitioning schemes on a disk; to add, delete and modify
partitions; and to commit or undo changes made.
2007-02-07 18:55:31 +00:00
marius
c8d049b911 Quiet GCC4 warnings regarding the width of printf()-arguments not
matching the format. While at it limit the format to unsigned int as
we're only interested in the 11 least significant bits anyway.
2007-01-20 17:14:12 +00:00
marius
46318caabd - Use bus_get_dma_tag() to obtain the parent DMA tag so dma(4) will
work when we start requiring this.
- Don't specify an alignment when creating our own parent DMA tag;
  the supported DMA engines require no alignment constraint (f.e. the
  LANCE child does though) and it's no inherited by the child DMA
  tags anyway (which probably is a bug though).
- Fix whitespace nits.
2007-01-20 14:06:01 +00:00
marius
6cc2be150f Convert the remainder of the low hanging fruits regarding including
headers in .S directly rather than getting to their macros through
genassym.c/assym.s so there are less headers genassym.c has to be
kept in sync with.
While at it fix some stytle(9) bugs (indentation, prototype format,
sort headers, etc) and remove trailing whitespace.
2007-01-19 11:15:34 +00:00
marius
2d9010d810 - Rename UPA_BUS_SPACE to NEXUS_BUS_SPACE; besides an UPA bus, nexus(4)
may also reflect a Fireplane/Safari or JBus bus (or a virtual bus which
  in turn reflects a JBus bus or something like that...).
- In the both the sparc64 and sun4v bus_machdep.c use __FBSDID.
- Spell SBus the official way in comments.
- Replace hardcoded function names (all of which were actually outdated)
  in panic and status strings with __func__.
- Fix whitespace nits.
2007-01-18 18:32:26 +00:00
marius
52099a4877 Remove the compat shims for the ISA old-stlye in{b,w,l}()/out{b,w,l}()
and friends along with all hacks required to implement them. None of
the drivers currently built (as part of GENERIC, LINT or modules) on
sparc64 or sun4v and none of those we might want to use there in
future uses them, AFAICT there actually never was a driver hooked up
to the sparc64 or sun4v build that correctly used these functions
(and it looks like that due to a bug read{b,w,l}()/write{b,w,l}() and
the other functions working on a memory handle never actually worked on
sun4v). All they ever were good for on sparc64 and sun4v was erroneously
dragging in dependencies on isa(4) in drivers like f.e. dpt(4), si(4)
and syscons(4) in source files that supposedly were bus-neutral and
hiding issues with drivers like f.e. ng_bt3c(4) that used these
functions with busses other than isa(4) and therefore couldn't work on
these platforms.
2007-01-18 13:52:44 +00:00
marius
2d74edbd80 Resurrect upa(4), now used for the subordinate/slave UPA bridge and
bus hanging off from the Fireplane/Safari bus in some USIII machines.
This is part 3/4 of allowing creator(4) to work in these machines.
The little info needed on how to configure the bridge and to work
around the incorrect values contained in the `interrupts' properties
of its children were obtained form OpenSolaris.
2007-01-16 22:08:27 +00:00
marius
7f03dfc1a6 - Merge sys/sparc64/creator/creator_upa.c into sys/dev/fb/creator.c.
The separate bus front-end was inherited from the OpenBSD creator(4),
  which at that time had a mainbus(4) (for USI/II machines, which use
  an UPA interconnection bus as the nexus) and an upa(4) (for USIII
  machines, which use a subordinate/slave UPA bus hanging off from the
  Fireplane/Safari interconnection bus) front-end. With FreeBSD and
  newbus there is/will be no need to have two separate bus front-ends
  for these busses, so we can easily coallapse the shared front-end
  and the back-end into a single source file (note that the FreeBSD
  creator_upa.c was misnomer anyway; based on what it actually attached
  to that should have been creator_nexus.c), actually OpenBSD meanwhile
  also has moved to a shared front-end and a single source file. Due
  to the low-level console support creator.c also wasn't free from bus
  related things before.
  While at it, also split sys/sparc64/creator/creator.h into a
  sys/dev/fb/creatorreg.h that only contains register macros and move
  the structures to the top of sys/dev/fb/creator.c as suggested by
  style(9) so creator(4) is no longer scattered over two directories.
- Use OF_decode_addr()/sparc64_fake_bustag() to obtain the bus tags and
  handles for the low-level console support instead of hardcoding
  support for AFB/FFB hanging off from nexus(4) only. This is part 2/4
  of allowing creator(4) to work in USIII machines (which have a UPA
  bus hanging off from the Fireplane/Safari bus reflected by the nexus),
  which already makes it work as the low-level console there.
- Allocate resources in the bus attach routine regardless of whether
  creator(4) is used as for the low-level console and thus the required
  bus tags and handles have been already obtained or not so the resources
  are marked as taken in the respective RMAN.
- For both obtaining the bus tags and handles for the low-level console
  support as well as allocating the corresponding resources in the
  regular bus attach routine don't bother to get all for the maximum of
  24 register banks but only (for) the two tag/handle pairs required for
  providing the video interface for syscons(4) support. If we can't
  allocate the rest of them just limit the memory range accessible via
  creator_fb_mmap() accordingly.
- Sanity check the memory range spanned by the first and last resources
  and the resources in between as far as possible, as the XFree86/Xorg
  sunffb(4) expects to be able to access the whole region, even though
  the backing resources are actually non-continuous. Limit and check
  the memory range accessible via creator_fb_mmap() accordingly.
- Reduce the size of buffers for OFW properties to what they actually
  need to hold.
- Rename some tables to creator_<foo> for consistency.
- Also for the sizes in the creator_fb_mmap() mapping table entries use
  macros for consistency, add macros for the remaining register banks
  for completeness.
2007-01-16 21:08:22 +00:00
marius
026ed31f96 Teach OF_decode_addr() about the bus space used for devices on the
nexus (which might or might not reflect an UPA interconnection bus;
accordingly UPA_BUS_SPACE should be renamed to NEXUS_BUS_SPACE at a
later point) and subordinate/slave UPA busses. This is part 1/4 of
allowing creator(4) to work in USIII machines (which have a UPA bus
hanging off from the Fireplane/Safari bus reflected by the nexus).
2007-01-16 20:42:21 +00:00
marius
6d5dd8a49a Check the return value of bus_setup_intr() when setting up the
over-temperature and power-fail interrupts.

Suggested by:	Coverity Prevent (CID 683)
MFC after:	1 week
2007-01-15 22:37:59 +00:00
imp
9109b1ceb8 Remove 3rd clause, renumber, ok per email 2007-01-12 07:26:21 +00:00
marius
6d67972dc5 o Changes to psycho_attach(): [1]
- Clear the PCI AFSR and status error bits as previous errors still
    might be indicated.
  - Set up the PCI control and diagnostic registers according to the
    capabilities, workarounds, etc of/for specific revisions of the
    supported bridges. This includes no longer setting Hummingbird-/
    Sabre-specific bits in the PCI control register but preserving
    what the firmware has initialized them to like OpenSolaris does.
    Previously we were setting these bits according to the example in
    the Sabre documentation, which I doubt is appropriate for all
    Sabre based designs and especially not for Hummingbirds. This
    also includes not enabling bus parking unless the firmware tells
    us to.
  - Set the PCI latency timer register as this isn't always done by
    the firmware.
o Remove a redundant argument from psycho_set_intr() and in this
  function check the return value of bus_setup_intr(). [2]
o Let psycho_setup_intr() return ENOMEM instead of 0 when it can't
  allocate memory for the interrupt wrapper stub and EINVAL instead
  of 0 if it can't find the interrupt vector in the interrupt map.
o Add a workaround for a bug of the Sabre-APB-combination where it
  doesn't drain DMA write data for devices behind additional PCI-PCI
  bridges underneath the APB PCI-PCI bridge. This workaround (do
  things necessary in order to achieve a manual drain when coherency
  is required) is currently implemented in psycho_setup_intr() and
  psycho_intr_stub() (for easy MFC'ing) and therefore is only applied
  for interrupt handlers. This should be moved to psycho(4)-specific
  bus_dma_tag_create() and bus_dmamap_sync() methods, respectively,
  once this driver is converted to make use of BUS_GET_DMA_TAG(), so
  the workaround is also applied for polling(4) callbacks. [3]
o Fix some minor style issues.

Info from:	OpenSolaris [1]
Info from:	Linux, OpenBSD, OpenSolaris [3]
Suggested by:	Coverity Prevent (CID 682) [2]
MFC after:	1 month
2007-01-08 01:26:47 +00:00
marius
384c6b86d0 In ofw_pcibus_attach() skip dupe PCI devices reported by the
firmware (mainly 'pmu' and its 'lomp' dupe found in a couple of
later USII{e,i}-based machines) by checking whether a device with
the same triple of bus number, slot and function already has been
added. This is the simple yet effective approach introduced in
OpenBSD some time ago, but which has the flaw that it assumes
that the device and its dupe(s) found in the OFW device tree are
equal or at least the one encountered first is in some way the
more important one (this is the case with 'pmu' and 'lomp'; the
'pmu' node has couple of properties and children while the 'lomp'
one misses most of these). If there's ever a device/dupe pair
where we don't encounter the more important node first, we'll
probably need to introduce a quirk list in order to add the
desired device but prevent its dupe(s) from being added.

MFC after:	1 week
2007-01-08 01:08:24 +00:00
kmacy
0e8bee8e99 add new large page sizes for use by shared loader 2006-12-18 07:28:59 +00:00
kmacy
063528d699 GC unused fields in pcpu 2006-12-17 02:04:19 +00:00
kmacy
6a6429aa4c Do explicit bounds checking as a function of the actual size of the
reloc_target_bitmask array as opposed to the (known) index of the last value.
This change fixes CID 691.
2006-12-10 04:18:03 +00:00
julian
396ed947f6 Threading cleanup.. part 2 of several.
Make part of John Birrell's KSE patch permanent..
Specifically, remove:
Any reference of the ksegrp structure. This feature was
never fully utilised and made things overly complicated.
All code in the scheduler that tried to make threaded programs
fair to unthreaded programs.  Libpthread processes will already
do this to some extent and libthr processes already disable it.

Also:
Since this makes such a big change to the scheduler(s), take the opportunity
to rename some structures and elements that had to be moved anyhow.
This makes the code a lot more readable.

The ULE scheduler compiles again but I have no idea if it works.

The 4bsd scheduler still reqires a little cleaning and some functions that now do
ALMOST nothing will go away, but I thought I'd do that as a separate commit.

Tested by David Xu, and Dan Eischen using libthr and libpthread.
2006-12-06 06:34:57 +00:00
jb
da35e3e55f Turn console printf buffering into a kernel option and only on
by default for sun4v where it is absolutely required.

This change moves the buffer from struct pcpu to the stack to avoid
using the critical section which created a LOR in a couple of cases
due to interaction with the tty code and kqueue. The LOR can't be
fixed with the critical section and the pcpu buffer can't be used
without the critical section.

Putting the buffer on the stack was my initial solution, but it was
pointed out that the stress on the stack might cause problems
depending on the call path. We don't have a way of creating tests
for those possible cases, so it's best to leave this as an option
for the time being. In time we may get enough data to enable this
option more generally.
2006-11-30 04:17:05 +00:00
kmacy
1eddb5f0f6 - Explicitly name the fields in pcb that we use to store trap state for later
retrieval, rather than using pad
- save the fault address in sfar for use by the alignment fixup handler
- mask off the trap number, so the context id doesn't confuse the UT_MAX
  comparison

This change fixes alignment fixup handling which is needed for traceroute
to work in spite of its copious unaligned accesses
2006-11-29 05:18:19 +00:00
kmacy
c41b4d1505 remove unused reference to tsb pa 2006-11-24 18:36:04 +00:00
kmacy
5ce0966852 Add mechanism to track TSB misses in tsb miss handler
Remove unused debug code
2006-11-22 00:18:22 +00:00
kmacy
7a3167fc07 remove 13 (largely) redundant files and switch to the sparc64/sparc64 version
Reviewed by: jb (mentor rwatson)
2006-11-18 07:10:52 +00:00
alc
6093953d36 Make pmap_enter() responsible for setting PG_WRITEABLE instead
of its caller.  (As a beneficial side-effect, a high-contention
acquisition of the page queues lock in vm_fault() is eliminated.)
2006-11-12 21:48:34 +00:00
jb
0ba5da19a5 Remove the KDTRACE option again because of the complaints about having
it as a default.

For the record, the KDTRACE option caused _no_ additional source files
to be compiled in; certainly no CDDL source files. All it did was to
allow existing BSD licensed kernel files to include one or more CDDL
header files.

By removing this from DEFAULTS, the onus is on a kernel builder to add
the option to the kernel config, possibly by including GENERIC and
customising from there. It means that DTrace won't be a feature
available in FreeBSD by default, which is the way I intended it to be.

Without this option, you can't load the dtrace module (which contains
the dtrace device and the DTrace framework). This is equivalent to
requiring an option in a kernel config before you can load the linux
emulation module, for example.

I think it is a mistake to have DTrace ported to FreeBSD, but not
to have it available to everyone, all the time. The only exception
to this is the companies which distribute systems with FreeBSD embedded.
Those companies will customise their systems anyway. The KDTRACE
option was intended for them, and only them.
2006-11-04 23:50:12 +00:00
jb
4a565ec80c Backout the previous change. It was not intended to be part of the
commit and, while something like that is probably required for sparc64,
it hadn't been tested.
2006-11-04 05:27:21 +00:00
jb
f7bc0a87d6 Build in kernel support for loading DTrace modules by default. This
adds the hooks that DTrace modules register with, and adds a few functions
which have the dtrace_ prefix to allow the DTrace FBT (function boundary
trace) provider to avoid tracing because they are called from the DTtrace
probe context.

Unlike other forms of tracing and debug, DTrace support in the kernel
incurs negligible run-time cost.

I think the only reason why anyone wouldn't want to have kernel support
enabled for DTrace would be due to the license (CDDL) under which DTrace
is released.
2006-11-04 04:58:10 +00:00
kmacy
744ac9e5e7 make pcb pad area accessible from asm
Approved by: scottl (standing in for rwatson as mentor)
2006-11-03 23:33:40 +00:00
marius
abfeec800f - In sunkbd_probe_keyboard() don't bother to determine the keyboard layout
as we have no use for that info. Instead let this function return the
  keyboard ID and verify at its invocation in sunkbd_configure() that we're
  talking to a Sun type 4/5/6 keyboard, i.e. a keyboard supported by this
  driver.
- Add an option SUNKBD_EMULATE_ATKBD whose code is based on the respective
  code in ukbd(4) and like UKBD_EMULATE_ATSCANCODE causes this driver to
  emit AT keyboard/KB_101 compatible scan codes in K_RAW mode as assumed by
  kbdmux(4). Unlike UKBD_EMULATE_ATSCANCODE, SUNKBD_EMULATE_ATKBD also
  triggers the use of AT keyboard maps and thus allows to use the map files
  in share/syscons/keymaps with this driver at the cost of an additional
  translation (in ukbd(4) this just is the way of operation).
- Implement an option SUNKBD_DFLT_KEYMAP, which like the equivalent options
  of the other keyboard drivers allows to specify the default in-kernel
  keyboard map. For obvious reasons this made to only work when also using
  SUNKBD_EMULATE_ATKBD.
- Implement sunkbd_check(), sunkbd_check_char() and sunkbd_clear_state(),
  which are also required for interoperability with kbdmux(4).
- Implement K_CODE mode and FreeBSD keypad compose.
- As a minor hack define KBD_DFLT_KEYMAP also in the !SUNKBD_EMULATE_ATKBD
  case so we can obtain fkey_tab from <dev/kbd/kbdtables.h> rather than
  having to duplicate it and #ifdef some more code.
- Don't use the TX-buffer for writing the two command bytes for setting the
  keyboard LEDs as this consequently requires a hardware FIFO that is at
  least two bytes in depth, which the NMOS-variant of the Zilog SCCs doesn't
  have. Thus use an inlined version of uart_putc() to consecutively write
  the command bytes (a cleaner approach would be to do this via the soft
  interrupt handler but that variant wouldn't work while in ddb(4)). [1]
- Fix some minor style(9) bugs.

PR:		90316 [1]
Reviewed by:	marcel [1]
2006-11-02 00:01:15 +00:00
marius
85b7c67667 Remove the atkbd(4), atkbdc(4) and psm(4) hints. In theory they can be
used on sparc64 but that would be totally wrong in practice.
2006-11-01 18:17:53 +00:00
jb
d2bd807356 Add a cnputs() function to write a string to the console with
a lock to prevent interspersed strings written from different CPUs
at the same time.

To avoid putting a buffer on the stack or having to malloc one,
space is incorporated in the per-cpu structure. The buffer
size if 128 bytes; chosen because it's the next power of 2 size
up from 80 characters.

String writes to the console are buffered up the end of the line
or until the buffer fills. Then the buffer is flushed to all
console devices.

Existing low level console output via cnputc() is unaffected by
this change. ithread calls to log() are also unaffected to avoid
blocking those threads.

A minor change to the behaviour in a panic situation is that
console output will still be buffered, but won't be written to
a tty as before. This should prevent interspersed panic output
as a number of CPUs panic before we end up single threaded
running ddb.

Reviewed by:	scottl, jhb
MFC after:	2 weeks
2006-11-01 04:54:51 +00:00
marius
02d450964f In the replacement text of the __bswapN_const() macros encapsulate the
argument in parentheses so these macros are safe to use and invocations
with an expression as the argument like __bswap32_const(42 << 23 | 13)
work as expected. Additionally, mask all the individually shifted bytes
as appropriate so the bytes which exceed the width of the respective
__bswapN_const() macro in invocations like __bswap16_const(0xdead600d)
are ignored like it's the case with the corresponding __bswapN_var()
function.

MFC after:	3 days
2006-10-30 21:50:11 +00:00
jb
d20db01f97 Remove the KSE option now that it's in DEFAULTS on these arches/machines.
The 'nooption' kernel config entry has to be used to turn KSE off now.
This isn't my preferred way of dealing with this, but I'll defer to
scottl's experience with the io/mem kernel option change and the grief
experienced over that.

Submitted by:	scottl@
2006-10-26 22:11:35 +00:00
jb
3a250c3e16 Add 'options KSE' to the kernel config DEFAULTS on all arches/machines
except sun4v.

This change makes the transition from a default to an option more
transparent and is an attempt to head off all the compliants that are
likely from people who don't read UPDATING, based on experience with
the io/mem change.

Submitted by:	scottl@
2006-10-26 22:05:25 +00:00
jb
f82c799735 Make KSE a kernel option, turned on by default in all GENERIC
kernel configs except sun4v (which doesn't process signals properly
with KSE).

Reviewed by:	davidxu@
2006-10-26 21:42:22 +00:00
ru
dcc4e06e70 Move "device splash" back to MI NOTES and "files", it's MI. 2006-10-23 13:23:14 +00:00
ru
73b959b47b Revision 1.25 had the ATKBD_DFLT_KEYMAP option turned on and then off:
: # Options for atkbd:
: options        ATKBD_DFLT_KEYMAP       # specify the built-in keymap
: makeoptions    ATKBD_DFLT_KEYMAP=jp.106
[...]
: nooption       ATKBD_DFLT_KEYMAP
: nomakeoption   ATKBD_DFLT_KEYMAP

(Previously the option was inherited from MI NOTES.)  So my tool in
rev. 1.26 reduced this to removing all "ATKBD_DFLT_KEYMAP" lines,
leaving the option effectively disabled as it was before, but since
it's actually supported on sparc64, turn it on now.
2006-10-23 10:05:36 +00:00
ru
5924f811d0 Mechanically kill redundant nodevice/nooption/nomakeoption, i.e.,
those that do not exist in MI NOTES or switched on/off in the MD
NOTES.
2006-10-23 09:45:22 +00:00
des
a3f4000fda Move more MD devices and options out of MI NOTES. 2006-10-20 09:52:27 +00:00
davidxu
bb5a3880aa o Add keyword volatile for user mutex owner field.
o Fix type consistent problem by using type long for old
  umtx and wait channel.
o Rename casuptr to casuword.
2006-10-17 02:24:47 +00:00
kmacy
f64ee1a32f The T2000 has multiple PCI domains requiring bus allocation to be done differently.
This pulls in changes by jmg from perforce and makes them sun4v only for now.

Approved by: scottl (acting as backup for mentor rwatson)
2006-10-12 04:44:01 +00:00
bde
f79090f0c4 The powerpc and sparc64 MD `reboot' commands should never have existed
since they just duplicated the MI `reset' command.  Instead of removing
them, make `reboot' an MI alias for `reboot' since this gives a better
way of killing the `r' alias for `reset'.  Remove the `registers' command
that was used to kill the alias.

Turn the powerpc and sparc64 MD `halt' command into an MI command.

A copy of sparc64/db_interface.c grew in sun4v just after I found the
extra reboot commands.  It has not been changed, and is now not
identical.  Duplicated commands come out duplicated in ddb's online
help, but cause large problems when used (e.g., on i386's with 2 halt's
and an hwatch, typing h doesn' give the expected message about an
ambiguous command, but hangs like the halt command or a looping parseri
would).
2006-10-10 07:26:54 +00:00
kmacy
6051e81fe9 unbreak buildkernel for sparc64 - fallout from sun4v
Approved by: rwatson (mentor)
Reviewed by: jmg
2006-10-09 06:08:24 +00:00
kmacy
98d1f51687 unbreak sparc64 loader build
re-add accidentally deleted asi value
remove sun4v only header include

Approved by: rwatson (mentor)
Reviewed by: jmg
2006-10-09 05:59:04 +00:00
kmacy
65e20bda09 kernel clean up to make the sun4v kernel build
Reviewed by: jmg
Approved by: rwatson (mentor)
2006-10-09 04:45:19 +00:00
simon
440a3866d6 - Remove SCHED_ULE from GENERIC to better avoid foot-shooting by
unsuspecting users.
- Add a comment in NOTES about experimental status of SCHED_ULE.
- Make warning about experimental status in sched_ule(4) a bit
  stronger.

Suggested and reviewed by:	dougb
Discussed on:			developers
MFC after:			3 days
2006-10-05 20:31:58 +00:00
jb
bf543444cf PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from:
MFC after:
Security:
Move the relocation definitions to the common elf header so that DTrace
can use them on one architecture targeted to a different one.

Add the additional ELF types defines in Sun's "Linker and Libraries"
manual.
2006-10-04 21:37:10 +00:00
phk
50c81b8a9a First part of a little cleanup in the calendar/timezone/RTC handling.
Move relevant variables to <sys/clock.h> and fix #includes as necessary.

Use libkern's much more time- & spamce-efficient BCD routines.
2006-10-02 12:59:59 +00:00
ru
d4abbeb0fe Added COMPAT_FREEBSD6 option. 2006-09-26 12:36:34 +00:00
alc
398a198e9a The fix in revision 1.152 converted in the wrong direction.
Fix a typo in a comment.

Submitted by: Michael Plass
2006-09-22 07:16:36 +00:00
alc
af1e0e1577 The sparc64/sparc64/pmap.c implementations of pmap_remove(),
pmap_protect(), and pmap_copy() have optimizations for regions
larger than PMAP_TSB_THRESH (which works out to 16MB).  This
caused a panic in tsb_foreach for kernel mappings, since
pm->pm_tsb is NULL in that case.  This fix teaches tsb_foreach
to use the kernel's tsb in that case.

Submitted by: Michael Plass
MFC after: 3 days
2006-09-22 07:02:15 +00:00
kan
c9b2659ee8 Use __builtin_va_start instead of __builtin_stdarg_start. GCC4 obsoletes
the former and  __builtin_va_start was present in all GCC version 3.1 and
later.
2006-09-21 01:37:02 +00:00
marius
e2d308d090 Do as the USII CPU manual suggests and leave interrupts enabled
for a bit before retrying to resend an IPI in order to avoid
deadlocks if the other CPU is also trying to send one.
OpenSolaris uses a delay of 1 microsecond here but waiting 2
microseconds with interrupts enabled like Linux does shouldn't
hurt but is a bit safer.

MFC after:	1 day
2006-09-03 21:20:21 +00:00
davidxu
87b5aa08ee Implement casuword32, compare and set user integer, thank Marcel Moolenarr
who wrote the IA64 version of casuword32.
2006-08-28 02:28:15 +00:00
alc
9c31c61c94 Eliminate the unnecessary acquisition and release of the page queues lock
from pmap_pinit().
2006-08-06 19:36:07 +00:00
alc
a152234cf9 Complete the transition from pmap_page_protect() to pmap_remove_write().
Originally, I had adopted sparc64's name, pmap_clear_write(), for the
function that is now pmap_remove_write().  However, this function is more
like pmap_remove_all() than like pmap_clear_modify() or
pmap_clear_reference(), hence, the name change.

The higher-level rationale behind this change is described in
src/sys/amd64/amd64/pmap.c revision 1.567.  The short version is that I'm
trying to clean up and fix our support for execute access.

Reviewed by: marcel@ (ia64)
2006-08-01 19:06:06 +00:00
marcel
7067faff16 Remove sio(4) and related options from MI files to amd64, i386
and pc98 MD files. Remove nodevice and nooption lines specific
to sio(4) from ia64, powerpc and sparc64 NOTES. There were no
such lines for arm yet.
sio(4) is usable on less than half the platforms, not counting
a future mips platform. Its presence in MI files is therefore
increasingly becoming a burden.
2006-07-29 18:38:54 +00:00
jhb
3a707d012d Retire SYF_ARGMASK and remove both SYF_MPSAFE and SYF_ARGMASK. sy_narg is
now back to just being an argument count.
2006-07-28 20:22:58 +00:00
jhb
c62c38439f Now that all system calls are MPSAFE, retire the SYF_MPSAFE flag used to
mark system calls as being MPSAFE:
- Stop conditionally acquiring Giant around system call invocations.
- Remove all of the 'M' prefixes from the master system call files.
- Remove support for the 'M' prefix from the script that generates the
  syscall-related files from the master system call files.
- Don't explicitly set SYF_MPSAFE when registering nfssvc.
2006-07-28 19:05:28 +00:00
jhb
12302c47d0 Unify the checking for lock misbehavior in the various syscall()
implementations and adjust some of the checks while I'm here:
- Add a new check to make sure we don't return from a syscall in a critical
  section.
- Add a new explicit check before userret() to make sure we don't return
  with any locks held.  The advantage here is that we can include the
  syscall number and name in syscall() whereas that info is not available
  in userret().
- Drop the mtx_assert()'s of sched_lock and Giant.  They are replaced by
  the more general checks just added.

MFC after:	2 weeks
2006-07-27 22:32:30 +00:00
yongari
9b54b752db Add stge(4) to the list of drivers supported by GENERIC kernel. 2006-07-25 01:06:32 +00:00
alc
004ef88e09 Add pmap_clear_write() to the interface between the virtual memory
system's machine-dependent and machine-independent layers.  Once
pmap_clear_write() is implemented on all of our supported
architectures, I intend to replace all calls to pmap_page_protect() by
calls to pmap_clear_write().  Why?  Both the use and implementation of
pmap_page_protect() in our virtual memory system has subtle errors,
specifically, the management of execute permission is broken on some
architectures.  The "prot" argument to pmap_page_protect() should
behave differently from the "prot" argument to other pmap functions.
Instead of meaning, "give the specified access rights to all of the
physical page's mappings," it means "don't take away the specified
access rights from all of the physical page's mappings, but do take
away the ones that aren't specified."  However, owing to our i386
legacy, i.e., no support for no-execute rights, all but one invocation
of pmap_page_protect() specifies VM_PROT_READ only, when the intent
is, in fact, to remove only write permission.  Consequently, a
faithful implementation of pmap_page_protect(), e.g., ia64, would
remove execute permission as well as write permission.  On the other
hand, some architectures that support execute permission have
basically ignored whether or not VM_PROT_EXECUTE is passed to
pmap_page_protect(), e.g., amd64 and sparc64.  This change represents
the first step in replacing pmap_page_protect() by the less subtle
pmap_clear_write() that is already implemented on amd64, i386, and
sparc64.

Discussed with: grehan@ and marcel@
2006-07-20 17:48:41 +00:00
jhb
a72b0bcd7f Simplify the pager support in DDB. Allowing different db commands to
install custom pager functions didn't actually happen in practice (they
all just used the simple pager and passed in a local quit pointer).  So,
just hardcode the simple pager as the only pager and make it set a global
db_pager_quit flag that db commands can check when the user hits 'q' (or a
suitable variant) at the pager prompt.  Also, now that it's easy to do so,
enable paging by default for all ddb commands.  Any command that wishes to
honor the quit flag can do so by checking db_pager_quit.  Note that the
pager can also be effectively disabled by setting $lines to 0.

Other fixes:
- 'show idt' on i386 and pc98 now actually checks the quit flag and
  terminates early.
- 'show intr' now actually checks the quit flag and terminates early.
2006-07-12 21:22:44 +00:00
mjacob
672bdf0dbc Make the firmware assist driver resident in
preparation for isp using it.
2006-07-09 16:40:31 +00:00
babkin
f0555f2de9 Backed out the change by request from rwatson.
PR:		kern/14584
2006-06-26 22:03:22 +00:00
babkin
3d8be823b0 The common UID/GID space implementation. It has been discussed on -arch
in 1999, and there are changes to the sysctl names compared to PR,
according to that discussion. The description is in sys/conf/NOTES.
Lines in the GENERIC files are added in commented-out form.
I'll attach the test script I've used to PR.

PR:		kern/14584
Submitted by:	babkin
2006-06-25 18:37:44 +00:00
netchild
11681ee0b5 Remove COMPAT_43 from GENERIC (and other kernel configs). For amd64 there's
an explicit comment that it's needed for the linuxolator. This is not the
case anymore. For all other architectures there was only a "KEEP THIS".
I'm (and other people too) running a COMPAT_43-less kernel since it's not
necessary anymore for the linuxolator. Roman is running such a kernel for a
for longer time. No problems so far. And I doubt other (newer than ia32
or alpha) architectures really depend on it.

This may result in a small performance increase for some workloads.

If the removal of COMPAT_43 results in a not working program, please
recompile it and all dependencies and try again before reporting a
problem.

The only place where COMPAT_43 is needed (as in: does not compile without
it) is in the (outdated/not usable since too old) svr4 code.

Note: this does not remove the COMPAT_43TTY option.

Nagging by:	rdivacky
2006-06-15 19:58:53 +00:00
ups
b3a7439a45 Remove mpte optimization from pmap_enter_quick().
There is a race with the current locking scheme and removing
it should have no measurable performance impact.
This fixes page faults leading to panics in pmap_enter_quick_locked()
on amd64/i386.

Reviewed by: alc,jhb,peter,ps
2006-06-15 01:01:06 +00:00
marius
35917647ed - Complete breaking out the definition of bus_space_{tag,handle}_t by
moving the typedef of bus_space_tag_t from sys/sparc64/include/bus.h
  to sys/sparc64/include/_bus.h. This brings sparc64 in sync with the
  other platforms and fixes the compilation of drivers which include
  <sys/rman.h> before <machine/bus.h> after sys/sys/rman.h rev. 1.34.
- Remove the definition of bus_type_t from sys/sparc64/include/_bus.h
  as it's unused since sys/sparc64/include/bus.h rev. 1.6 and
  sys/sparc64/sparc64/bus_machdep.c rev. 1.3.
- Remove some pointless comments.
2006-06-13 19:18:09 +00:00
marius
a4cadb999e Correct transposed digits in device names which were added in the
previous revision.
2006-06-13 18:40:39 +00:00
imp
038d1db25e Add the ability to subset the devices that UART pulls in. This allows
the arm to compile without all the extras that don't appear, at least
not in the flavors of ARM I deal with.  This helps us save about 100k.

If I've botched the available devices on a platform, please let me
know and I'll correct ASAP.
2006-06-12 04:21:50 +00:00
marius
840beae239 - Merge sys/sparc64/pci/psycho.c rev. 1.8:
Map the device memory belonging to resources of type SYS_RES_MEMORY into
  KVA upon activation so that rman_get_virtual() works as expected.
- In sbus_alloc_resource() checking whether toffs is 0 as an indication
  that no applicable child range was found isn't appropriate as it's
  perfectly valid for the requested SYS_RES_MEMORY resource to start at
  the beginning of a child range. So check for the RMAN still being NULL
  instead.
- As a minor runtime speed optimization break out of the loop where we
  search for the applicable child range in sbus_alloc_resource() as soon
  as it's found.
- Let sbus_setup_intr() return ENOMEM rather than 0 if it can't allocate
  memory for the interrupt clearing info.
- Actually do what the comment in sbus_setup_intr() says and disable the
  respective interrupt while fiddling with it.
- Remove some superfluous INTVEC() around inr, which already only contains
  the interrupt vector, in sbus_setup_intr().
- While here, fix a style(9) bug in sbus_setup_intr() (don't use function
  calls in initializers).

The first two changes are required for a CG6 driver.

MFC after:	2 weeks
2006-06-08 21:02:25 +00:00
sam
9c0742c63f add ath & co.
MFC after:	1 month
2006-06-07 18:10:28 +00:00
alc
ff4adb11fe Introduce the function pmap_enter_object(). It maps a sequence of resident
pages from the same object.  Use it in vm_map_pmap_enter() to reduce the
locking overhead of premapping objects.

Reviewed by: tegge@
2006-06-05 20:35:27 +00:00
marius
ff15bbbd3e - Declare the PnP map const.
- Add devices found in V210 to the PnP map.
- Don't leak memory if we didn't find a match for a node in the PnP map.

MFC after:	2 weeks
2006-06-05 17:48:54 +00:00
alc
4676c61f83 MFalpha/amd64/arm/ia64
Retire pmap_track_modified().  We no longer need it because we do not
 create managed mappings within the clean submap.  To prevent regressions,
 add assertions blocking the creation of managed mappings within the clean
 submap.
2006-05-29 06:12:01 +00:00
phk
ef310efff8 Since DELAY() was moved, most <machine/clock.h> #includes have been
unnecessary.
2006-05-16 14:37:58 +00:00
phk
5d8c57a08b Clean out sysctl machdep.* related defines.
The cmos clock related stuff should really be in MI code.
2006-05-11 17:29:25 +00:00
yongari
9b5f432030 Uncomment sk(4) as it's now working. 2006-04-27 06:03:17 +00:00
delphij
da32f1fb9a Move AHC_REG_PRETTY_PRINT and AHD_REG_PRETTY_PRINT below
their corresponding devices.
2006-04-24 08:44:34 +00:00
imp
cdc20c723d Set the rid for any resource obtained from rman_reserve_resource.
Reviewed by: wollman, jmg	(as were the other commits fixing this problem)
2006-04-20 04:20:41 +00:00
marcel
b3f3f05092 Remove sab(4). 2006-04-19 19:39:35 +00:00
marius
3b2002ac40 - Since critical sections no longer raise the processor interrupt level to
above what's used for fast interrupts, only interrupts with the level of
  the interrupt which led to calling intr_fast() (which is used with both
  fast and ithread interrupts) are blocked while in that function. Thus
  intr_fast() can be preempted by a fast interrupt (which are of a higher
  level than ithread interrupts) while servicing an ithread interrupt. This
  can lead to a stale pointer to the head of the active interrupt requests
  list when back in the ithread interrupt invocation of intr_fast(), in turn
  resulting in corruption of the interrupt request lists and consequently
  in a panic. Solve this be turning off interrupts in intr_fast() before
  reading the pointer to the head of the active list rather than after. [1]
- Add a KASSERT in intr_fast() which asserts that ir_func is non-zero before
  calling it. [1]
- Increment interrupt stats after calling the handlers rather than before.
  This reduces the delay until direct and fast handlers are serviced, in my
  testings by 30% on average for the direct tick interrupt handler, in turn
  resulting in less clock drift.

PR:		94778 [1]
Submitted by:	Andrew Belashov [1]
MFC after:	2 weeks
2006-04-17 21:03:24 +00:00
marius
8ea6d154a2 For USIII CPUs the type of the trap caused by peeking/poking non-existent
PCI devices apparently was changed from a special deferred trap with TPC
pointing to the membar #Sync following the failing load/store instruction
to a precise trap with TPC pointing to the failing load/store instruction.
Thus remove the check the check whether TPC points to a membar #Sync in
case of a data access trap as it's off-by-one for USIII CPUs and it should
be sufficient to check whether the trap happend while in fasword*() to
properly detect traps caused by peeking/poking. This also corresponds to
what other OSs do. Note that also only the USIIi manual suggests to check
the TPC for such traps while the USII one doesn't (in the public USIII
manual device peeking/poking isn't mentioned at all).
2006-04-04 21:00:44 +00:00
marcel
8278e2d5fb Eliminate HAVE_STOPPEDPCBS. On ia64 the PCPU holds a pointer to the
PCB in which the context of stopped CPUs is stored. To access this
PCB from KDB, we introduce a new define, called KDB_STOPPEDPCB. The
definition, when present, lives in <machine/kdb.h> and abstracts
where MD code saves the context. Define KDB_STOPPEDPCB on i386,
amd64, alpha and sparc64 in accordance to previous code.
2006-04-03 22:51:47 +00:00
marius
6a5bc8bdd6 - s,tramoline,trampoline, in a comment.
- Use FBSDID in trap.c
- Make the global trap_sig[] static as it's not used outside of trap.c.
- In sendsig() remove an unused variable.
- In trap() sync with the other archs; for fast data access MMU miss and
  data access protection traps set ksi_addr to the SFAR reg which contains
  the faulting address and otherwise to the TPC reg. Generally the TCP reg
  contains the address of the instruction that caused the exception, except
  for fast instruction access traps (and some others; more refinement may
  be needed here) it also contains the faulting address.
  Previously sendsig() always set si_addr to the SFAR reg which is wrong
  for most traps.
- In sendsig() add support for FreeBSD old-style signals.

These changes are inspired by kmacy's sun4v changes and allow libsigsegv
to build on FreeBSD/sparc64, but it doesn't pass all checks and tests it
actually should, yet.

MFC after:	5 days
2006-04-03 21:27:01 +00:00
peter
0f363b7d24 Remove the unused sva and eva arguments from pmap_remove_pages(). 2006-04-03 21:16:10 +00:00
marcel
9259a3c772 Add scc(4). 2006-03-30 18:40:25 +00:00
marius
ea86aa3726 - We only lock the local per-CPU page in the local dTLB, so accessing the
foreign per-CPU pages in cpu_ipi_send() in order to get the module IDs
  of the other CPUs can cause a page fault. If this happens when doing a
  TLB shootdown while dealing with another page fault this causes a panic
  due to the recursive page fault. As I don't spot other code that assumes
  or requires that accessing foreign per-CPU pages must not page fault
  solve this by adding a statically allocated (and therefore locked in the
  kernel pages) array which establishes a FreeBSD CPU ID -> module ID
  relation and use that in cpu_ipi_selected() (instead of statically
  allocating the per-CPU pages which would just waste memory on say a dual
  CPU machine as sun4u theoretically supports up to 128 CPUs or wasting
  dTLB slots for the foreign per-CPU pages). [1]
- Fix a potential race in cpu_ipi_send(); as we don't serialize the access
  to cpu_ipi_selected() between MI and MD use (only MI-MI and MD-MD) we
  might catch the NACK bit caused by sending another IPI. Solve this by
  checking the NACK bit in the contents of the interrupt dispatch status
  reg read while interrupts were still turned off instead of reading that
  reg anew after interrupts were turned on again. This is also what the
  CPU docs suggest to do.
- Add a workaround for the SpitFire erratum #54 bug (affecting interrupt
  dispatch). While public info regarding what this CPU bug actually causes
  is not available testing shows that with the workaround in place it's
  less likely to get a "couldn't send ipi" panic, it doesn't solve these
  panics entirely though. [2]

Reported by:		kris [1]
Some clue from:		kmacy [1]
Info from:		Linux, OpenSolaris [2]
Additional testing by:	kris
MFC after:		3 days
2006-03-29 00:14:08 +00:00
marius
98a829e8ea Add convenience macros for the bits in ASI_ESTATE_ERROR_EN_REG (used
for ECC handling) and the additional uses of the ASIs 0x77 and 0x7f
as well as their bits (used for a CPU bug workaround).

MFC after:	3 days
2006-03-29 00:08:48 +00:00
marius
8aaa7c780c - Add a comment describing why tick_init() is called before cninit().
- Fix a typo in another comment.
2006-03-28 20:28:31 +00:00
marius
d6a3e171f0 - Move the check for too high HZ values from tick_init() to tick_start()
as we have to call tick_init() before cninit() in order to provide the
  low-level console drivers with a working DELAY() which in turn means we
  cannot use panic() in tick_init().
- s,to high, too high, in the panic string

Inspired by:	kmacy's sun4v changes
MFC after:	3 days
2006-03-28 20:25:46 +00:00
marius
e3b3837a25 Add convenience macros for the full register set and use them to replace
magic constants in clkbrd.c

Info from:	OpenSolaris
2006-03-28 19:46:48 +00:00
marius
e2e06058c4 Sync with the other archs and declare the memory location referenced by
the address argument of the bus_space_write_multi_*() familiy as const.

Prodded by:	damien
2006-03-28 19:19:37 +00:00
brueffer
6dd5abcf92 Fix a c/p error.
Obtained from:	The TrustedBSD Project
Approved by:	rwatson (mentor)
2006-02-28 21:25:00 +00:00
marius
8fac408a72 - Don't bother traversing trap frames in stack_save(). This fixes panics
when option DEBUG_LOCKS is used. Trap frames are determined by checking
  whether the caller was one of the tl0_*() or tl1_*() asm functions via
  a newly added pair of dummy symbols in exception.S which mark the begin
  and end of these functions. The tl_trap_* pair marks those in the special
  .trap section and the tl_text_* in the regular .text section. Because
  of their performance penalty db_search_symbol()/db_symbol_values() and
  linker_ddb_search_symbol()/linker_ddb_symbol_values() aren't used here
  for determining the caller, with db_search_symbol()/db_symbol_values()
  additionally not being reentrant.
- For consistency, change db_backtrace() to also use the new markers for
  determining the tl0_*() and tl1_*() asm functions instead of bcmp()'ing
  the symbol name.
- Use FBSDID in db_trace.c.

PR:			93226
Based on a patch by:	Antoine Brodin <antoine.brodin@laposte.net>
Ok'ed by:		jhb
2006-02-19 11:54:46 +00:00
rwatson
e63de5c5ca Add system call auditing support for sparc64.
Submitted by:	brueffer
Obtained from:	TrustedBSD Project
2006-02-18 16:36:56 +00:00
marius
ef4c871cca For E250 and E450 enable the watchdog part of the MK48Txx as it just
works there.

MFC after:	3 days
2006-02-15 16:56:38 +00:00
jhb
a3a3e58c13 Fix the hw.realmem sysctl. The global realmem variable is a count of
pages, not a count of bytes.  The sysctl handler for hw.realmem already
uses ctob() to convert realmem from pages to bytes.  Thus, on archs that
were storing a byte count in the realmem variable, hw.realmem was inflated.

Reported by:	Valerio daelli valerio dot daelli at gmail dot com (alpha)
MFC after:	3 days
2006-02-14 14:50:11 +00:00
phk
79081baaf0 CPU time accounting speedup (step 2)
Keep accounting time (in per-cpu) cputicks and the statistics counts
in the thread and summarize into struct proc when at context switch.

Don't reach across CPUs in calcru().

Add code to calibrate the top speed of cpu_tickrate() for variable
cpu_tick hardware (like TSC on power managed machines).

Don't enforce monotonicity (at least for now) in calcru.  While the
calibrated cpu_tickrate ramps up it may not be true.

Use 27MHz counter on i386/Geode.

Use TSC on amd64 & i386 if present.

Use tick counter on sparc64
2006-02-11 09:33:07 +00:00
phk
74f8e63a10 Simplify system time accounting for profiling.
Rename struct thread's td_sticks to td_pticks, we will need the
other name for more appropriately named use shortly.  Reduce it
from uint64_t to u_int.

Clear td_pticks whenever we enter the kernel instead of recording
its value as reference for userret().  Use the absolute value of
td->pticks in userret() and eliminate third argument.
2006-02-08 08:09:17 +00:00
phk
bb2f62f536 Modify the way we account for CPU time spent (step 1)
Keep track of time spent by the cpu in various contexts in units of
"cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t
only when somebody wants to inspect the numbers.

For now "cputicks" are still derived from the current timecounter
and therefore things should by definition remain sensible also on
SMP machines.  (The main reason for this first milestone commit is
to verify that hypothesis.)

On slower machines, the avoided multiplications to normalize timestams
at every context switch, comes out as a 5-7% better score on the
unixbench/context1 microbenchmark.  On more modern hardware no change
in performance is seen.
2006-02-07 21:22:02 +00:00
marius
40c49f280e Hook up le(4) to the build. For now it's only added to the sparc64 GENERIC
in order to support the on-board LANCE in Ultra 1 and to the MI NOTES as
it should work just fine with the AMD PCnet family of chips on all archs
but is not yet meant to replace lnc(4). If a kernel includes all of le(4),
lnc(4) and pcn(4) precedence is given to lnc(4)/pcn(4) for now.
2006-01-31 22:34:13 +00:00
marius
1434ed8e8b o lsi64854_enet_intr():
- Like lsi64854_scsi_intr() return -1 in case there was a DMA error so
    the caller can distinguish it from a normal interrupt and leave the
    reset of the DMA engine to the caller so we don't kill any state there.
  - Move the static 'dodrain' flag to struct lsi64854_softc as there can
    be more than one LSI64854 used for a LANCE in a system and reset it
    again once draining the E-cache is done so we don't keep draining the
    cache with every interrupt.
  - Remove calling sc->sc_intrchain(), we will call lsi64854_enet_intr()
    via sc->intr() in the interrupt handler of the LANCE driver and not
    use it in chained mode.

o lsi64854_pp_intr():
  - Like lsi64854_scsi_intr() return -1 in case there was a DMA error so
    the caller can distinguish it from a normal interrupt.

o Remove the no longer used sc_intrchain* from struct lsi64854_softc.

o Make lsi64854_reset(), lsi64854_setup*() and lsi64854_*_intr() static
  to lsi64854.c as we do and will only call them via the respective
  function pointers in struct lsi64854_softc.

o While here fix style(9) bugs (variable definition inside a nested scope).
2006-01-31 12:50:02 +00:00
marius
7c6b3706d9 Revert the part of rev. 1.3 which enabled the chaining of the DMA engine
interrupt handler for the LANCE devices and remove dma_setup_intr(). We
just can't completely ignore the DMA engine in a LANCE driver anyway and
calling the DMA engine interrupt handler in the LANCE driver directly
allows to cover it by the LANCE driver lock.
2006-01-30 21:43:14 +00:00
marius
679181b556 - Register the generic implementations for the device shutdown, suspend
and resume methods so these events propagate through the device driver
  hierarchy.
- In dma(4) enable the chaining of the DMA engine interrupt handler for
  the LANCE devices via a dma_setup_intr(). This was commented out before
  as I was unsure whether I'd use it but this is probably cleaner than
  fiddling with the DMA engine interrupt in the LANCE driver directly.
- In ebus_setup_dinfo() free 'intrs' instead of 'reg' twice in case
  setting up a child fails due to routing one of its interrupts fails. [1]

Found by:	Coverity Prevent [1]
MFC after:	3 days
2006-01-26 21:14:32 +00:00
jhb
c89e14c0d6 Make the ACPI and OpenFirmware PCI bus drivers subclasses of the generic
PCI bus driver.
2006-01-20 22:01:34 +00:00
kris
14241840f7 Correct typos (s/OFERFLOW/OVERFLOW/).
Reviewed by:	jhb
2006-01-16 01:35:25 +00:00