Commit Graph

11157 Commits

Author SHA1 Message Date
Poul-Henning Kamp
29cc138cdf Use correct bitmask for identifying chip family. 2008-03-17 00:36:16 +00:00
Pawel Jakub Dawidek
6eb4157ffc Implement atomic_fetchadd_long() for all architectures and document it.
Reviewed by:	attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)
2008-03-16 21:20:50 +00:00
Roman Divacky
d8653dd986 Regen. 2008-03-16 16:29:37 +00:00
Roman Divacky
5dfb688191 Implement sched_setaffinity and get_setaffinity using
real cpu affinity setting primitives.

Reviewed by:	jeff
Approved by:	kib (mentor)
2008-03-16 16:27:44 +00:00
Robert Watson
237fdd787b In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation.  This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after:	1 month
Discussed with:	imp, rink
2008-03-16 10:58:09 +00:00
John Baldwin
eaf86d1678 Add preliminary support for binding interrupts to CPUs:
- Add a new intr_event method ie_assign_cpu() that is invoked when the MI
  code wishes to bind an interrupt source to an individual CPU.  The MD
  code may reject the binding with an error.  If an assign_cpu function
  is not provided, then the kernel assumes the platform does not support
  binding interrupts to CPUs and fails all requests to do so.
- Bind ithreads to CPUs on their next execution loop once an interrupt
  event is bound to a CPU.  Only shared ithreads are bound.  We currently
  leave private ithreads for drivers using filters + ithreads in the
  INTR_FILTER case unbound.
- A new intr_event_bind() routine is used to bind an interrupt event to
  a CPU.
- Implement binding on amd64 and i386 by way of the existing pic_assign_cpu
  PIC method.
- For x86, provide a 'intr_bind(IRQ, cpu)' wrapper routine that looks up
  an interrupt source and binds its interrupt event to the specified CPU.
  MI code can currently (ab)use this by doing:

	intr_bind(rman_get_start(irq_res), cpu);

  however, I plan to add a truly MI interface (probably a bus_bind_intr(9))
  where the implementation in the x86 nexus(4) driver would end up calling
  intr_bind() internally.

Requested by:	kmacy, gallatin, jeff
Tested on:	{amd64, i386} x {regular, INTR_FILTER}
2008-03-14 19:41:48 +00:00
John Baldwin
c9107e85d9 Fix a silly bogon which prevented all the CPUs that are tagged as interrupt
receivers from being given interrupts if any CPUs in the system were not
tagged as interrupt receivers that I introduced when switching the x86
interrupt code to track CPUs via FreeBSD CPU IDs rather than local APIC
IDs.  In practice this only affects systems with Hyperthreading (though
disabling HTT in the BIOS would workaround the issue) as that is the only
case currently where one can have CPUs that aren't tagged as interrupt
receivers.  On a Dell SC1425 test box with 2 x Xeon w/ HTT (so 4 logical
CPUs of which 2 were interrupt receivers) the result was that all
device interrupts were sent to CPU 0.

MFC after:	1 week
Pointy hat to:	jhb
2008-03-14 03:44:42 +00:00
John Baldwin
5217af301c Rework how the nexus(4) device works on x86 to better handle the idea of
different "platforms" on x86 machines.  The existing code already handles
having two platforms: ACPI and legacy.  However, the existing approach was
rather hardcoded and difficult to extend.  These changes take the approach
that each x86 hardware platform should provide its own nexus(4) driver (it
can inherit most of its behavior from the default legacy nexus(4) driver)
which is responsible for probing for the platform and performing
appropriate platform-specific setup during attach (such as adding a
platform-specific bus device).  This does mean changing the x86 platform
busses to no longer use an identify routine for probing, but to move that
logic into their matching nexus(4) driver instead.
- Make the default nexus(4) driver in nexus.c on i386 and amd64 handle the
  legacy platform.  It's probe routine now returns BUS_PROBE_GENERIC so it
  can be overriden.
- Expose a nexus_init_resources() routine which initializes the various
  resource managers so that subclassed nexus(4) drivers can invoke it from
  their attach routine.
- The legacy nexus(4) driver explicitly adds a legacy0 device in its
  attach routine.
- The ACPI driver no longer contains an new-bus identify method.  Instead
  it exposes a public function (acpi_identify()) which is a probe routine
  that the MD nexus(4) drivers can use to probe for ACPI.  All of the
  probe logic in acpi_probe() is now moved into acpi_identify() and
  acpi_probe() is just a stub.
- On i386 and amd64, an ACPI-specific nexus(4) driver checks for ACPI via
  acpi_identify() and claims the nexus0 device if the probe succeeds.  It
  then explicitly adds an acpi0 device in its attach routine.
- The legacy(4) driver no longer knows anything about the acpi0 device.
- On ia64 if acpi_identify() fails you basically end up with no devices.
  This matches the previous behavior where the old acpi_identify() would
  fail to add an acpi0 device again leaving you with no devices.

Discussed with:	imp
Silence on:	arch@
2008-03-13 20:39:04 +00:00
John Baldwin
d0234f752f Use the SMAP data from the loader if it is provided instead of using
virtual 86 mode to query the BIOS directly.  This is needed for certain
HP machines whose BIOS only provide an SMAP when invoked from real mode.
On such machines the loader will be able to query the SMAP successfully
due to the recent BTX changes, but the kernel will not.

One thing I'm not sure of is if we can skip the INT 12h probe altogether
if we have the SMAP from the loader as it seems that we do the INT 12h
probe to setup enough state so we can use vm86 to call the BIOS.

MFC after:	1 week
2008-03-13 18:56:53 +00:00
Konstantin Belousov
22eca0bf45 Since version 4.3, gcc changed its behaviour concerning the i386/amd64
ABI and the direction flag, that is it now assumes that the direction
flag is cleared at the entry of a function and it doesn't clear once
more if needed. This new behaviour conforms to the i386/amd64 ABI.

Modify the signal handler frame setup code to clear the DF {e,r}flags
bit on the amd64/i386 for the signal handlers.

jhb@ noted that it might break old apps if they assumed DF == 1 would be
preserved in the signal handlers, but that such apps should be rare and
that older versions of gcc would not generate such apps.

Submitted by:	Aurelien Jarno <aurelien aurel32 net>
PR:	121422
Reviewed by:	jhb
MFC after:	2 weeks
2008-03-13 10:54:38 +00:00
Konstantin Belousov
ea39de9f93 Add missed parentheses 2008-03-13 09:52:48 +00:00
John Baldwin
391664b110 The variable MTRR registers actually have variable-sized PhysBase and
PhysMask fields based on the number of physical address bits supported
by the current CPU.  The old code assumed 36 bits on i386 and 40 bits on
amd64.  In truth, all Intel CPUs up until recently used 36 bits (a newer
Intel CPU uses 38 bits) and all the Opteron CPUs used 40 bits.

In at least one case (the new Intel CPU) having the size of the mask field
wrong resulted in writing questionable values into the MTRR registers on
the application processors (BSP as well if you modify the MTRRs via
memcontrol or running X, etc.).  The result of the questionable physmask
was that all of memory was apparently treated as uncached rather than
write-back resulting in a very significant performance hit.

Fix this by constructing a run-time mask for the PhysBase and PhysMask
fields based on the number of physical address bits supported by the CPU.
All 64-bit capable CPUs provide a count of PA bits supported via the
0x80000008 extended CPUID feature, so use that if it is available.  If that
feature is not available, then assume 36 PA bits.

While I'm here, expand the (now-unused) macros for the PhysBase and
PhysMask fields to the current largest possible value (52 PA bits).

MFC after:	1 week
PR:		i386/120516
Reported by:	Nokia
2008-03-12 22:09:19 +00:00
John Baldwin
4cbd0e8984 MFamd64: Break up the probe logic in the mem_drvinit routines so it's
a bit easier to parse.
2008-03-12 21:44:46 +00:00
Jeff Roberson
6617724c5f Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to
FreeBSD, the M:N approach taken by the kse library was never developed
to its full potential.  Backwards compatibility will be provided via
libmap.conf for dynamically linked binaries and static binaries will
be broken.
2008-03-12 10:12:01 +00:00
John Baldwin
1b085fde87 Style(9) these files. No changes in the compiled code. (Verified by
diff'ing objdump -d output).
2008-03-11 21:41:36 +00:00
John Baldwin
336d8e5536 Add constants for the various fields in MTRR registers.
MFC after:	1 week
Verified by:	md5(1)
2008-03-11 20:10:37 +00:00
John Baldwin
463e0f91cb Probe CPUs after the PCI hierarchy on i386, amd64, and ia64. This allows
the cpufreq drivers to reliably use properties of PCI devices for quirks,
etc.
- For the legacy drivers, add CPU devices via an identify routine in the
  CPU driver itself rather than in the legacy driver's attach routine.
- Add CPU devices after Host-PCI bridges in the acpi bus driver.
- Change the ichss(4) driver to use pci_find_bsf() to locate the ICH and
  check its device ID rather than having a bogus PCI attachment that only
  checked for the ID in probe and always failed.  As a side effect, you
  can now kldload ichss after boot.
- Fix the ichss(4) driver to use the correct device_t for the ICH (and not
  for ichss0) when doing PCI config space operations to enable SpeedStep.

MFC after:	2 weeks
Reviewed by:	njl, Andriy Gapon  avg of icyb.net.ua
2008-03-10 22:18:07 +00:00
John Baldwin
c3cefed5eb - Don't execute cpuid to fetch the features. We already have the features
present in cpu_feature2.  Also, use CPUID2_EST rather than a magic
  number.
- Don't free the ACPI settings list in detach if we are going to fail the
  request.  Otherwise an attempt to kldunload est would free the array
  but the driver would keep trying to use it.

MFC after:	1 week
2008-03-10 22:00:35 +00:00
Jeff Roberson
32c9d3a767 - Rather than repeating the same preemption code everywhere call the scheduler
specific sched_preempt() routine.
2008-03-10 01:32:48 +00:00
Rink Springer
2e7328e7cc Import uslcom(4) from OpenBSD - this is a driver for Silicon Laboratories
CP2101/CP2102 based USB serial adapters.

Reviewed by:		imp, emaste
Obtained from:		OpenBSD
MFC after:		2 weeks
2008-03-05 14:13:30 +00:00
Bruce Evans
f3d2db418f Change float_t and double_t to long double on i386. All floating point
expressions on i386 are evaluated in the range of the long double type,
so this is wrong in a different but hopefully less worse way than
before.  Since expressions are evaluated in long double registers,
there is no runtime cost to using long double instead of double to
declare intermediate values (except in cases where this avoids compiler
bugs), and by careful use of float_t or double_t it is possible to
avoid some of the compiler bugs in this area, provided these types are
declared as long double.

I was going to change float.h to be less broken and more usable in
combination with the change here (in particular, it is more necessary
to know the effective number of bits in a double_t when double_t !=
double, since DBL_MANT_DIG no longer logically gives this, and
LDBL_MANT_DIG doesn't give it either with FreeBSD-i386's default
rounding precision.  However, this was too hard for now.  In particular,
LDBL_MANT_DIG is used a lot in libm, so it cannot be changed.  One
thing that is completely broken now is LDBL_MAX.  This may have sort
of worked when it was changed from DBL_MAX in 2002 (adding 0 to it at
runtime gave +Inf, but you could at least compare with it), but starting
with gcc-3.3.1 in 2003, it is always +Inf due to evaluating it at
compile time in the default rounding precision.
2008-03-05 11:21:14 +00:00
Bruce Evans
021dfaf077 Oops, back out previous commit since it was to the wrong file. 2008-03-05 11:17:20 +00:00
Bruce Evans
69c0326e8c Change float_t and double_t to long double on i386. All floating point
expressions on i386 are evaluated in the range of the long double type,
so this is wrong in a different but hopefully less worse way than
before.  Since expressions are evaluated in long double registers,
there is no runtime cost to using long double instead of double to
declare intermediate values (except in cases where this avoids compiler
bugs), and by careful use of float_t or double_t it is possible to
avoid some of the compiler bugs in this area, provided these types are
declared as long double.

I was going to change float.h to be less broken and more usable in
combination with the change here (in particular, it is more necessary
to know the effective number of bits in a double_t when double_t !=
double, since DBL_MANT_DIG no longer logically gives this, and
LDBL_MANT_DIG doesn't give it either with FreeBSD-i386's default
rounding precision.  However, this was too hard for now.  In particular,
LDBL_MANT_DIG is used a lot in libm, so it cannot be changed.  One
thing that is completely broken now is LDBL_MAX.  This may have sort
of worked when it was changed from DBL_MAX in 2002 (adding 0 to it at
runtime gave +Inf, but you could at least compare with it), but starting
with gcc-3.3.1 in 2003, it is always +Inf due to evaluating it at
compile time in the default rounding precision.
2008-03-05 11:11:53 +00:00
Jeff Roberson
81aa71755b - Remove the old smp cpu topology specification with a new, more flexible
tree structure that encodes the level of cache sharing and other
   properties.
 - Provide several convenience functions for creating one and two level
   cpu trees as well as a default flat topology.  The system now always
   has some topology.
 - On i386 and amd64 create a seperate level in the hierarchy for HTT
   and multi-core cpus.  This will allow the scheduler to intelligently
   load balance non-uniform cores.  Presently we don't detect what level
   of the cache hierarchy is shared at each level in the topology.
 - Add a mechanism for testing common topologies that have more information
   than the MD code is able to provide via the kern.smp.topology tunable.
   This should be considered a debugging tool only and not a stable api.

Sponsored by:	Nokia
2008-03-02 07:58:42 +00:00
Justin T. Gibbs
b601964112 In est_acpi_info(), initialize count before passing its pointer to
CPUFREQ_DRV_SETTINGS().  The value of count on input is used to
prefent overflow of the settings buffer passed into CPUFREQ_DRV_SETTINGS().

This corrects the "est: CPU supports Enhanced Speedstep, but is not recognized."
error on my system.

MFC after: 1 week
2008-03-01 21:58:34 +00:00
John Baldwin
905829bfa9 With the recent change to enable CPU brands from the VIA chips, the
code to add padlock features to the CPU model on VIA CPUs was no longer
effective.  Change the code to instead output a separate printf during
dmesg for VIA Padlock features similar to other cpuid feature bitmasks.

MFC after:	1 week
2008-02-29 19:18:09 +00:00
Rui Paulo
2487d8f877 Validate the id16 values gathered from ACPI (previously a TODO item).
Style changes by me and njl.

Approved by:  	 njl (mentor)
Reviewed by:	 njl (mentor)
Submitted by: 	 Takeharu KATO <takeharu1219 at ybb.ne.jp>
PR:	  	 119350
MFC after:	 1 week
2008-02-28 19:10:42 +00:00
John Baldwin
4a78f78435 - Check for the extended CPUID registers on VIA CPUs so we can get the
brand string.
- Fix a nit in the previous commit.  "Eden" is a product name, not a core
  name.  The new ID is still for an "Esther" core.
2008-02-28 17:59:54 +00:00
John Baldwin
23e30a506b Support the VIA C7 Eden CPU and treat it just like a C7 Esther. We may
want to adjust this code to just assume that all CPUs >= Esther should
be checked for the extended cpuid flags register.

MFC after:	3 days
PR:		i386/119491
2008-02-25 22:42:33 +00:00
Scott Long
7bbd40c57e Teach the dump and minidump code to respect the maxioszie attribute of
the disk; the hard-coded assumption of 64K doesn't work in all cases.
2008-02-15 06:26:25 +00:00
Scott Long
54f8dbc48f If busdma is being used to realign dynamic buffers and the alignment is set to
PAGE_SIZE or less, the bounce page counting logic was flawed and wouldn't
reserve any pages.  Adjust to be correct.  Review of other architectures is
forthcoming.

Submitted by: Joseph Golio
2008-02-12 16:24:30 +00:00
Jung-uk Kim
865df544c6 Fix Linux mmap with MAP_GROWSDOWN flag.
Reported by:	Andriy Gapon (avg at icyb dot net dot ua)
Tested by:	Andriy Gapon (avg at icyb dot net dot ua)
Pointyhat:	me
MFC after:	3 days
2008-02-11 19:35:03 +00:00
Poul-Henning Kamp
31d48c5406 Add support for PC Engines ALIX boards.
Style cleanup.

Hide some messages behind bootverbose.
2008-02-10 19:14:42 +00:00
Scott Long
593c873471 Remove the rr232x driver. It has been superceded by the hptrr driver. 2008-02-03 07:07:30 +00:00
John Baldwin
7157eae462 For no good reason I had assumed that ACPI table headers would be page
aligned (or at least not cross a page boundary).  However, it turns out
that on at least one machine one table header does cross a page boundary.
This caused problems with the MADT early probe as it uses the crash dump
map to load ACPI tables by loading the RSDT/XSDT into pages 1 ... N and
loading the header of each ACPI table header into page 0 looking for the
MADT.  However, if a table header crossed a page boundary, then page 1
would get trashed resulting in a panic.  Fix this by reserving the first
2 pages for ACPI table headers (headers are less than a page in size,
so 2 pages will be sufficient) and use pages 2 .. N for the RSDT and XSDT.

Note: amd64 should probably be simplified to just use pmap_mapbios()
for all these tables which will use the direct map and not need the
crash dump hack.

MFC after:	5 days
Tested on:	i386
Reported by:	Pete French  petefrench of ticketswitch.com
2008-01-31 16:51:43 +00:00
Alexander Motin
2a57ca33c7 Move GET_STACK_USAGE from MI header to i386/amd64 MD ones.
Somebody who can, please feel free to implement it for other archs
or copy this one if it suits.
2008-01-31 08:24:27 +00:00
Ruslan Ermilov
007b1b7bae Add a wrapper function that bound checks writes to the dump device. 2008-01-28 19:04:07 +00:00
John Baldwin
c05655bfda Use cpu_spinwait() (i.e., "pause") when spinning on rdtsc during DELAY().
MFC after:	1 week
2008-01-17 18:59:38 +00:00
Alan Cox
6634dbbde4 Retire PMAP_DIAGNOSTIC. Any useful diagnostics that were conditionally
compiled under PMAP_DIAGNOSTIC are now KASSERT()s.  (Note: The kernel
option DIAGNOSTIC still disables inlining of certain pmap functions.)

Eliminate dead code from pmap_enter().  This code implemented an assertion.
On i386, an equivalent check is already implemented.  However, on amd64,
a small change is required to implement an equivalent check.

Eliminate \n from a nearby panic string.

Use KASSERT() to reimplement pmap_copy()'s two assertions.
2008-01-17 18:25:52 +00:00
Peter Wemm
2577760fca Update the KVA_PAGES comments for the effect that PAE has on it. It
becomes a unit size of 2MB instead of 4MB and must be a multiple of 8 to
get a valid KERNBASE.
2008-01-14 22:53:01 +00:00
Peter Wemm
a658a1e0a5 Add a CTASSERT that KERNBASE is valid. This is usually messed up by an
invalid KVA_PAGES, so add a pointer to there.
2008-01-14 22:51:43 +00:00
Attilio Rao
22db15c06f VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
2008-01-13 14:44:15 +00:00
Bruce Evans
0209f729a1 MFamd64 (everything possible up to 1.19; mainly the amd64 implementations
of fpget*() and fpset*()).

The i386 fpget*() were efficient but a bit obfuscated (using macros
and a case statement to demultiplex them through a single inline).
The demultiplexing mainly gave smaller source code.

The i386 fpset*() were obfuscated in the same way and were very
inefficient due to the case statement not having enough cases or
complexity so all cases used the FP environment.

This also fixes a harmless bug in rev.1.12.  fpsetmask() extracted the
old value from the bit-field twice, but the doubled shift was harmless
since the shift count is 0.

All fp*() interfaces are now inline functions on i386.  They used to
be macros that call (a different set of) inline functions.  This is a
small ABI change which shouldn't cause problems since cases where
inlining fails (mainly -O0) only give (working) static functions.
2008-01-11 18:59:35 +00:00
Bruce Evans
f107f876a6 Separate fpresetsticky() from the other fpset functions so that the
others can be replaced cleanly by the amd64 versions.   There is no
current amd64 version to merge, but there is an old one which is
similar.

Fix the following bugs in fpresetsticky():
- garbage args clobbered non-sticky bits in the status register
- the return value was usually garbage since it was masked with the
  arg instead of with the field selector.

Optimize fpresetsticky() to avoid using the environment as in
feclearexcept() (use only fnclex() if possible) and also to avoid
using fnclex() for null changes.  The second of these optimizations
might not be so good since its branch might cost more than it saves.
2008-01-11 18:27:01 +00:00
Bruce Evans
98a80542e7 MFamd64 1.15-1.18 (cosmetic changes, mainly to comments). The inline
functions haven't been cleaned up here because the amd64 cleanups
don't apply directly and the functions here will be merged or rewritten
later.
2008-01-11 17:54:20 +00:00
Attilio Rao
cb05b60a89 vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>
2008-01-10 01:10:58 +00:00
Alan Cox
fa093ee242 Convert a PMAP_DIAGNOSTIC to a KASSERT. 2008-01-08 08:30:30 +00:00
John Baldwin
5965c4b71c Add COMPAT_FREEBSD7 and enable it in configs that have COMPAT_FREEBSD6. 2008-01-07 21:40:11 +00:00
Alan Cox
5cccf58676 Shrink the size of struct vm_page on amd64 and i386 by eliminating
pv_list_count from struct md_page.  Ever since Peter rewrote the pv
entry allocator for amd64 and i386 pv_list_count has been correctly
maintained but otherwise unused.
2008-01-06 18:51:04 +00:00
Alan Cox
eb2a051720 Add an access type parameter to pmap_enter(). It will be used to implement
superpage promotion.

Correct a style error in kmem_malloc(): pmap_enter()'s last parameter is
a Boolean.
2008-01-03 07:34:34 +00:00