Commit Graph

11289 Commits

Author SHA1 Message Date
phk
c763b22a79 Back in the good old days, PC's had random pieces of rock for
frequency generation and what frequency the generated was anyones
guess.

In general the 32.768kHz RTC clock x-tal was the best, because that
was a regular wrist-watch Xtal, whereas the X-tal generating the
ISA bus frequency was much lower quality, often costing as much as
several cents a piece, so it made good sense to check the ISA bus
frequency against the RTC clock.

The other relevant property of those machines, is that they
typically had no more than 16MB RAM.

These days, CPU chips croak if their clocks are not tightly within
specs and all necessary frequencies are derived from the master
crystal by means if PLL's.

Considering that it takes on average 1.5 second to calibrate the
frequency of the i8254 counter, that more likely than not, we will
not actually use the result of the calibration, and as the final
clincher, we seldom use the i8254 for anything besides BEL in
syscons anyway, it has become time to drop the calibration code.

If you need to tell the system what frequency your i8254 runs,
you can do so from the loader using hw.i8254.freq or using the
sysctl kern.timecounter.tc.i8254.frequency.
2008-03-26 22:12:00 +00:00
phk
fa71439e44 The "free-lance" timer in the i8254 is only used for the speaker
these days, so de-generalize the acquire_timer/release_timer api
to just deal with speakers.

The new (optional) MD functions are:
	timer_spkr_acquire()
	timer_spkr_release()
and
	timer_spkr_setfreq()

the last of which configures the timer to generate a tone of a given
frequency, in Hz instead of 1/1193182th of seconds.

Drop entirely timer2 on pc98, it is not used anywhere at all.

Move sysbeep() to kern/tty_cons.c and use the timer_spkr*() if
they exist, and do nothing otherwise.

Remove prototypes and empty acquire-/release-timer() and sysbeep()
functions from the non-beeping archs.

This eliminate the need for the speaker driver to know about
i8254frequency at all.  In theory this makes the speaker driver MI,
contingent on the timer_spkr_*() functions existing but the driver
does not know this yet and still attaches to the ISA bus.

Syscons is more tricky, in one function, sc_tone(), it knows the hz
and things are just fine.

In the other function, sc_bell() it seems to get the period from
the KDMKTONE ioctl in terms if 1/1193182th second, so we hardcode
the 1193182 and leave it at that.  It's probably not important.

Change a few other sysbeep() uses which obviously knew that the
argument was in terms of i8254 frequency, and leave alone those
that look like people thought sysbeep() took frequency in hertz.

This eliminates the knowledge of i8254_freq from all but the actual
clock.c code and the prof_machdep.c on amd64 and i386, where I think
it would be smart to ask for help from the timecounters anyway [TBD].
2008-03-26 20:09:21 +00:00
dfr
79d2dfdaa6 Add the new kernel-mode NFS Lock Manager. To use it instead of the
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.

Highlights include:

* Thread-safe kernel RPC client - many threads can use the same RPC
  client handle safely with replies being de-multiplexed at the socket
  upcall (typically driven directly by the NIC interrupt) and handed
  off to whichever thread matches the reply. For UDP sockets, many RPC
  clients can share the same socket. This allows the use of a single
  privileged UDP port number to talk to an arbitrary number of remote
  hosts.

* Single-threaded kernel RPC server. Adding support for multi-threaded
  server would be relatively straightforward and would follow
  approximately the Solaris KPI. A single thread should be sufficient
  for the NLM since it should rarely block in normal operation.

* Kernel mode NLM server supporting cancel requests and granted
  callbacks. I've tested the NLM server reasonably extensively - it
  passes both my own tests and the NFS Connectathon locking tests
  running on Solaris, Mac OS X and Ubuntu Linux.

* Userland NLM client supported. While the NLM server doesn't have
  support for the local NFS client's locking needs, it does have to
  field async replies and granted callbacks from remote NLMs that the
  local client has contacted. We relay these replies to the userland
  rpc.lockd over a local domain RPC socket.

* Robust deadlock detection for the local lock manager. In particular
  it will detect deadlocks caused by a lock request that covers more
  than one blocking request. As required by the NLM protocol, all
  deadlock detection happens synchronously - a user is guaranteed that
  if a lock request isn't rejected immediately, the lock will
  eventually be granted. The old system allowed for a 'deferred
  deadlock' condition where a blocked lock request could wake up and
  find that some other deadlock-causing lock owner had beaten them to
  the lock.

* Since both local and remote locks are managed by the same kernel
  locking code, local and remote processes can safely use file locks
  for mutual exclusion. Local processes have no fairness advantage
  compared to remote processes when contending to lock a region that
  has just been unlocked - the local lock manager enforces a strict
  first-come first-served model for both local and remote lockers.

Sponsored by:	Isilon Systems
PR:		95247 107555 115524 116679
MFC after:	2 weeks
2008-03-26 15:23:12 +00:00
phk
632e5d39f7 Rename timer0_max_count to i8254_max_count.
Rename timer0_real_max_count to i8254_real_max_count and make it static.
Rename timer_freq to i8254_freq and make it a loader tunable.
2008-03-26 15:03:24 +00:00
phk
44bfb30efd The RTC related pscnt and psdiv variables have no business being public. 2008-03-26 13:25:27 +00:00
brueffer
b64d211df2 Fix some "in in" typos in comments.
PR:		121490
Submitted by:	Anatoly Borodin <anatoly.borodin@gmail.com>
Approved by:	rwatson (mentor), jkoshy
MFC after:	3 days
2008-03-26 07:32:08 +00:00
alc
0e2f1e0b38 Enable the automatic creation of superpage reservations. 2008-03-26 03:12:00 +00:00
jkim
3e99f5d364 Belatedly add BPF_JITTER in NOTES for supported architectures. 2008-03-24 22:23:22 +00:00
kib
53a15ee1ea Prevent the overflow in the calculation of the next page directory.
The overflow causes the wraparound with consequent corruption of the
(almost) whole address space mapping.

As Alan noted, pmap_copy() does not require the wrap-around checks
because it cannot be applied to the kernel's pmap. The checks there are
included for consistency.

Reported and tested by:	kris (i386/pmap.c:pmap_remove() part)
Reviewed by:	alc
MFC after:	1 week
2008-03-23 07:07:27 +00:00
jhb
fbea3b6403 Explicitly use spinlock_enter/exit rather than locking the icu_lock spin
lock in the 8259A drivers as these drivers are only used on UP systems.
This slightly reduces the penalty of an SMP kernel (such as GENERIC) on
a UP x86 machine.
2008-03-20 21:53:27 +00:00
jhb
6cf6d7b22b Implement a BUS_BIND_INTR() method in the bus interface to bind an IRQ
resource to a CPU.  The default method is to pass the request up to the
parent similar to BUS_CONFIG_INTR() so that all busses don't have to
explicitly implement bus_bind_intr.  A bus_bind_intr(9) wrapper routine
similar to bus_setup/teardown_intr() is added for device drivers to use.
Unbinding an interrupt is done by binding it to NOCPU.  The IRQ resource
must be allocated, but it can happen in any order with respect to
bus_setup_intr().  Currently it is only supported on amd64 and i386 via
nexus(4) methods that simply call the intr_bind() routine.

Tested by:	gallatin
2008-03-20 21:24:32 +00:00
jhb
c04bb048f6 Simplify the interrupt code a bit:
- Always include the ie_disable and ie_eoi methods in 'struct intr_event'
  and collapse down to one intr_event_create() routine.  The disable and
  eoi hooks simply aren't used currently in the !INTR_FILTER case.
- Expand 'disab' to 'disable' in a few places.
- Use function casts for arm and i386:intr_eoi_src() instead of wrapper
  routines since to trim one extra indirection.

Compiled on:	{arm,amd64,i386,ia64,ppc,sparc64} x {FILTER, !FILTER}
Tested on:	{amd64,i386} x {FILTER, !FILTER}
2008-03-17 22:42:01 +00:00
phk
341d5f653b A cautionary XXX comment about seemingly bogus errata checks. 2008-03-17 09:05:15 +00:00
phk
234bd357a6 Increase time we wait for things to settle to 1 millisecond,
10 microseconds is too short.

Always set the cpu to the highest frequency so that we get through
boot and don't handicap cpus where powerd(8) is not used.
2008-03-17 09:01:43 +00:00
phk
7f4e520791 Revert last commit and stop committing before morning tea. 2008-03-17 09:00:59 +00:00
phk
0ef978b056 Increase time we wait for things to settle to 1 millisecond,
10 microseconds is too short.

Always set the cpu to the highest frequency so that we get through
boot and don't handicap cpus where powerd(8) is not used.
2008-03-17 08:38:38 +00:00
phk
573f4ecf0d Use correct bitmask for identifying chip family. 2008-03-17 00:36:16 +00:00
pjd
ea49d310bf Implement atomic_fetchadd_long() for all architectures and document it.
Reviewed by:	attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)
2008-03-16 21:20:50 +00:00
rdivacky
64c7931e65 Regen. 2008-03-16 16:29:37 +00:00
rdivacky
b13a84dcb7 Implement sched_setaffinity and get_setaffinity using
real cpu affinity setting primitives.

Reviewed by:	jeff
Approved by:	kib (mentor)
2008-03-16 16:27:44 +00:00
rwatson
877d7c65ba In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation.  This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after:	1 month
Discussed with:	imp, rink
2008-03-16 10:58:09 +00:00
jhb
9c113163fb Add preliminary support for binding interrupts to CPUs:
- Add a new intr_event method ie_assign_cpu() that is invoked when the MI
  code wishes to bind an interrupt source to an individual CPU.  The MD
  code may reject the binding with an error.  If an assign_cpu function
  is not provided, then the kernel assumes the platform does not support
  binding interrupts to CPUs and fails all requests to do so.
- Bind ithreads to CPUs on their next execution loop once an interrupt
  event is bound to a CPU.  Only shared ithreads are bound.  We currently
  leave private ithreads for drivers using filters + ithreads in the
  INTR_FILTER case unbound.
- A new intr_event_bind() routine is used to bind an interrupt event to
  a CPU.
- Implement binding on amd64 and i386 by way of the existing pic_assign_cpu
  PIC method.
- For x86, provide a 'intr_bind(IRQ, cpu)' wrapper routine that looks up
  an interrupt source and binds its interrupt event to the specified CPU.
  MI code can currently (ab)use this by doing:

	intr_bind(rman_get_start(irq_res), cpu);

  however, I plan to add a truly MI interface (probably a bus_bind_intr(9))
  where the implementation in the x86 nexus(4) driver would end up calling
  intr_bind() internally.

Requested by:	kmacy, gallatin, jeff
Tested on:	{amd64, i386} x {regular, INTR_FILTER}
2008-03-14 19:41:48 +00:00
jhb
8293fe75e4 Fix a silly bogon which prevented all the CPUs that are tagged as interrupt
receivers from being given interrupts if any CPUs in the system were not
tagged as interrupt receivers that I introduced when switching the x86
interrupt code to track CPUs via FreeBSD CPU IDs rather than local APIC
IDs.  In practice this only affects systems with Hyperthreading (though
disabling HTT in the BIOS would workaround the issue) as that is the only
case currently where one can have CPUs that aren't tagged as interrupt
receivers.  On a Dell SC1425 test box with 2 x Xeon w/ HTT (so 4 logical
CPUs of which 2 were interrupt receivers) the result was that all
device interrupts were sent to CPU 0.

MFC after:	1 week
Pointy hat to:	jhb
2008-03-14 03:44:42 +00:00
jhb
64ab71ccbd Rework how the nexus(4) device works on x86 to better handle the idea of
different "platforms" on x86 machines.  The existing code already handles
having two platforms: ACPI and legacy.  However, the existing approach was
rather hardcoded and difficult to extend.  These changes take the approach
that each x86 hardware platform should provide its own nexus(4) driver (it
can inherit most of its behavior from the default legacy nexus(4) driver)
which is responsible for probing for the platform and performing
appropriate platform-specific setup during attach (such as adding a
platform-specific bus device).  This does mean changing the x86 platform
busses to no longer use an identify routine for probing, but to move that
logic into their matching nexus(4) driver instead.
- Make the default nexus(4) driver in nexus.c on i386 and amd64 handle the
  legacy platform.  It's probe routine now returns BUS_PROBE_GENERIC so it
  can be overriden.
- Expose a nexus_init_resources() routine which initializes the various
  resource managers so that subclassed nexus(4) drivers can invoke it from
  their attach routine.
- The legacy nexus(4) driver explicitly adds a legacy0 device in its
  attach routine.
- The ACPI driver no longer contains an new-bus identify method.  Instead
  it exposes a public function (acpi_identify()) which is a probe routine
  that the MD nexus(4) drivers can use to probe for ACPI.  All of the
  probe logic in acpi_probe() is now moved into acpi_identify() and
  acpi_probe() is just a stub.
- On i386 and amd64, an ACPI-specific nexus(4) driver checks for ACPI via
  acpi_identify() and claims the nexus0 device if the probe succeeds.  It
  then explicitly adds an acpi0 device in its attach routine.
- The legacy(4) driver no longer knows anything about the acpi0 device.
- On ia64 if acpi_identify() fails you basically end up with no devices.
  This matches the previous behavior where the old acpi_identify() would
  fail to add an acpi0 device again leaving you with no devices.

Discussed with:	imp
Silence on:	arch@
2008-03-13 20:39:04 +00:00
jhb
990c3cc04e Use the SMAP data from the loader if it is provided instead of using
virtual 86 mode to query the BIOS directly.  This is needed for certain
HP machines whose BIOS only provide an SMAP when invoked from real mode.
On such machines the loader will be able to query the SMAP successfully
due to the recent BTX changes, but the kernel will not.

One thing I'm not sure of is if we can skip the INT 12h probe altogether
if we have the SMAP from the loader as it seems that we do the INT 12h
probe to setup enough state so we can use vm86 to call the BIOS.

MFC after:	1 week
2008-03-13 18:56:53 +00:00
kib
be9c86776f Since version 4.3, gcc changed its behaviour concerning the i386/amd64
ABI and the direction flag, that is it now assumes that the direction
flag is cleared at the entry of a function and it doesn't clear once
more if needed. This new behaviour conforms to the i386/amd64 ABI.

Modify the signal handler frame setup code to clear the DF {e,r}flags
bit on the amd64/i386 for the signal handlers.

jhb@ noted that it might break old apps if they assumed DF == 1 would be
preserved in the signal handlers, but that such apps should be rare and
that older versions of gcc would not generate such apps.

Submitted by:	Aurelien Jarno <aurelien aurel32 net>
PR:	121422
Reviewed by:	jhb
MFC after:	2 weeks
2008-03-13 10:54:38 +00:00
kib
efc24456b8 Add missed parentheses 2008-03-13 09:52:48 +00:00
jhb
4aef4283ee The variable MTRR registers actually have variable-sized PhysBase and
PhysMask fields based on the number of physical address bits supported
by the current CPU.  The old code assumed 36 bits on i386 and 40 bits on
amd64.  In truth, all Intel CPUs up until recently used 36 bits (a newer
Intel CPU uses 38 bits) and all the Opteron CPUs used 40 bits.

In at least one case (the new Intel CPU) having the size of the mask field
wrong resulted in writing questionable values into the MTRR registers on
the application processors (BSP as well if you modify the MTRRs via
memcontrol or running X, etc.).  The result of the questionable physmask
was that all of memory was apparently treated as uncached rather than
write-back resulting in a very significant performance hit.

Fix this by constructing a run-time mask for the PhysBase and PhysMask
fields based on the number of physical address bits supported by the CPU.
All 64-bit capable CPUs provide a count of PA bits supported via the
0x80000008 extended CPUID feature, so use that if it is available.  If that
feature is not available, then assume 36 PA bits.

While I'm here, expand the (now-unused) macros for the PhysBase and
PhysMask fields to the current largest possible value (52 PA bits).

MFC after:	1 week
PR:		i386/120516
Reported by:	Nokia
2008-03-12 22:09:19 +00:00
jhb
93c9f42d65 MFamd64: Break up the probe logic in the mem_drvinit routines so it's
a bit easier to parse.
2008-03-12 21:44:46 +00:00
jeff
acb93d599c Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to
FreeBSD, the M:N approach taken by the kse library was never developed
to its full potential.  Backwards compatibility will be provided via
libmap.conf for dynamically linked binaries and static binaries will
be broken.
2008-03-12 10:12:01 +00:00
jhb
384b4a0305 Style(9) these files. No changes in the compiled code. (Verified by
diff'ing objdump -d output).
2008-03-11 21:41:36 +00:00
jhb
6ab428b54b Add constants for the various fields in MTRR registers.
MFC after:	1 week
Verified by:	md5(1)
2008-03-11 20:10:37 +00:00
jhb
18300325cb Probe CPUs after the PCI hierarchy on i386, amd64, and ia64. This allows
the cpufreq drivers to reliably use properties of PCI devices for quirks,
etc.
- For the legacy drivers, add CPU devices via an identify routine in the
  CPU driver itself rather than in the legacy driver's attach routine.
- Add CPU devices after Host-PCI bridges in the acpi bus driver.
- Change the ichss(4) driver to use pci_find_bsf() to locate the ICH and
  check its device ID rather than having a bogus PCI attachment that only
  checked for the ID in probe and always failed.  As a side effect, you
  can now kldload ichss after boot.
- Fix the ichss(4) driver to use the correct device_t for the ICH (and not
  for ichss0) when doing PCI config space operations to enable SpeedStep.

MFC after:	2 weeks
Reviewed by:	njl, Andriy Gapon  avg of icyb.net.ua
2008-03-10 22:18:07 +00:00
jhb
e0c2a8245e - Don't execute cpuid to fetch the features. We already have the features
present in cpu_feature2.  Also, use CPUID2_EST rather than a magic
  number.
- Don't free the ACPI settings list in detach if we are going to fail the
  request.  Otherwise an attempt to kldunload est would free the array
  but the driver would keep trying to use it.

MFC after:	1 week
2008-03-10 22:00:35 +00:00
jeff
d952a04ecf - Rather than repeating the same preemption code everywhere call the scheduler
specific sched_preempt() routine.
2008-03-10 01:32:48 +00:00
rink
699140d247 Import uslcom(4) from OpenBSD - this is a driver for Silicon Laboratories
CP2101/CP2102 based USB serial adapters.

Reviewed by:		imp, emaste
Obtained from:		OpenBSD
MFC after:		2 weeks
2008-03-05 14:13:30 +00:00
bde
20db6615fa Change float_t and double_t to long double on i386. All floating point
expressions on i386 are evaluated in the range of the long double type,
so this is wrong in a different but hopefully less worse way than
before.  Since expressions are evaluated in long double registers,
there is no runtime cost to using long double instead of double to
declare intermediate values (except in cases where this avoids compiler
bugs), and by careful use of float_t or double_t it is possible to
avoid some of the compiler bugs in this area, provided these types are
declared as long double.

I was going to change float.h to be less broken and more usable in
combination with the change here (in particular, it is more necessary
to know the effective number of bits in a double_t when double_t !=
double, since DBL_MANT_DIG no longer logically gives this, and
LDBL_MANT_DIG doesn't give it either with FreeBSD-i386's default
rounding precision.  However, this was too hard for now.  In particular,
LDBL_MANT_DIG is used a lot in libm, so it cannot be changed.  One
thing that is completely broken now is LDBL_MAX.  This may have sort
of worked when it was changed from DBL_MAX in 2002 (adding 0 to it at
runtime gave +Inf, but you could at least compare with it), but starting
with gcc-3.3.1 in 2003, it is always +Inf due to evaluating it at
compile time in the default rounding precision.
2008-03-05 11:21:14 +00:00
bde
b5ae836fd2 Oops, back out previous commit since it was to the wrong file. 2008-03-05 11:17:20 +00:00
bde
bc7b82cc51 Change float_t and double_t to long double on i386. All floating point
expressions on i386 are evaluated in the range of the long double type,
so this is wrong in a different but hopefully less worse way than
before.  Since expressions are evaluated in long double registers,
there is no runtime cost to using long double instead of double to
declare intermediate values (except in cases where this avoids compiler
bugs), and by careful use of float_t or double_t it is possible to
avoid some of the compiler bugs in this area, provided these types are
declared as long double.

I was going to change float.h to be less broken and more usable in
combination with the change here (in particular, it is more necessary
to know the effective number of bits in a double_t when double_t !=
double, since DBL_MANT_DIG no longer logically gives this, and
LDBL_MANT_DIG doesn't give it either with FreeBSD-i386's default
rounding precision.  However, this was too hard for now.  In particular,
LDBL_MANT_DIG is used a lot in libm, so it cannot be changed.  One
thing that is completely broken now is LDBL_MAX.  This may have sort
of worked when it was changed from DBL_MAX in 2002 (adding 0 to it at
runtime gave +Inf, but you could at least compare with it), but starting
with gcc-3.3.1 in 2003, it is always +Inf due to evaluating it at
compile time in the default rounding precision.
2008-03-05 11:11:53 +00:00
jeff
ad2a31513f - Remove the old smp cpu topology specification with a new, more flexible
tree structure that encodes the level of cache sharing and other
   properties.
 - Provide several convenience functions for creating one and two level
   cpu trees as well as a default flat topology.  The system now always
   has some topology.
 - On i386 and amd64 create a seperate level in the hierarchy for HTT
   and multi-core cpus.  This will allow the scheduler to intelligently
   load balance non-uniform cores.  Presently we don't detect what level
   of the cache hierarchy is shared at each level in the topology.
 - Add a mechanism for testing common topologies that have more information
   than the MD code is able to provide via the kern.smp.topology tunable.
   This should be considered a debugging tool only and not a stable api.

Sponsored by:	Nokia
2008-03-02 07:58:42 +00:00
gibbs
2bbc151215 In est_acpi_info(), initialize count before passing its pointer to
CPUFREQ_DRV_SETTINGS().  The value of count on input is used to
prefent overflow of the settings buffer passed into CPUFREQ_DRV_SETTINGS().

This corrects the "est: CPU supports Enhanced Speedstep, but is not recognized."
error on my system.

MFC after: 1 week
2008-03-01 21:58:34 +00:00
jhb
af9b407d79 With the recent change to enable CPU brands from the VIA chips, the
code to add padlock features to the CPU model on VIA CPUs was no longer
effective.  Change the code to instead output a separate printf during
dmesg for VIA Padlock features similar to other cpuid feature bitmasks.

MFC after:	1 week
2008-02-29 19:18:09 +00:00
rpaulo
c1cd98f421 Validate the id16 values gathered from ACPI (previously a TODO item).
Style changes by me and njl.

Approved by:  	 njl (mentor)
Reviewed by:	 njl (mentor)
Submitted by: 	 Takeharu KATO <takeharu1219 at ybb.ne.jp>
PR:	  	 119350
MFC after:	 1 week
2008-02-28 19:10:42 +00:00
jhb
dd53c63465 - Check for the extended CPUID registers on VIA CPUs so we can get the
brand string.
- Fix a nit in the previous commit.  "Eden" is a product name, not a core
  name.  The new ID is still for an "Esther" core.
2008-02-28 17:59:54 +00:00
jhb
b54152091e Support the VIA C7 Eden CPU and treat it just like a C7 Esther. We may
want to adjust this code to just assume that all CPUs >= Esther should
be checked for the extended cpuid flags register.

MFC after:	3 days
PR:		i386/119491
2008-02-25 22:42:33 +00:00
scottl
7163c9c1fc Teach the dump and minidump code to respect the maxioszie attribute of
the disk; the hard-coded assumption of 64K doesn't work in all cases.
2008-02-15 06:26:25 +00:00
scottl
db8258708b If busdma is being used to realign dynamic buffers and the alignment is set to
PAGE_SIZE or less, the bounce page counting logic was flawed and wouldn't
reserve any pages.  Adjust to be correct.  Review of other architectures is
forthcoming.

Submitted by: Joseph Golio
2008-02-12 16:24:30 +00:00
jkim
3bffed0bec Fix Linux mmap with MAP_GROWSDOWN flag.
Reported by:	Andriy Gapon (avg at icyb dot net dot ua)
Tested by:	Andriy Gapon (avg at icyb dot net dot ua)
Pointyhat:	me
MFC after:	3 days
2008-02-11 19:35:03 +00:00
phk
22b65bed61 Add support for PC Engines ALIX boards.
Style cleanup.

Hide some messages behind bootverbose.
2008-02-10 19:14:42 +00:00
scottl
249efc9b24 Remove the rr232x driver. It has been superceded by the hptrr driver. 2008-02-03 07:07:30 +00:00
jhb
9c76956524 For no good reason I had assumed that ACPI table headers would be page
aligned (or at least not cross a page boundary).  However, it turns out
that on at least one machine one table header does cross a page boundary.
This caused problems with the MADT early probe as it uses the crash dump
map to load ACPI tables by loading the RSDT/XSDT into pages 1 ... N and
loading the header of each ACPI table header into page 0 looking for the
MADT.  However, if a table header crossed a page boundary, then page 1
would get trashed resulting in a panic.  Fix this by reserving the first
2 pages for ACPI table headers (headers are less than a page in size,
so 2 pages will be sufficient) and use pages 2 .. N for the RSDT and XSDT.

Note: amd64 should probably be simplified to just use pmap_mapbios()
for all these tables which will use the direct map and not need the
crash dump hack.

MFC after:	5 days
Tested on:	i386
Reported by:	Pete French  petefrench of ticketswitch.com
2008-01-31 16:51:43 +00:00
mav
739abe292f Move GET_STACK_USAGE from MI header to i386/amd64 MD ones.
Somebody who can, please feel free to implement it for other archs
or copy this one if it suits.
2008-01-31 08:24:27 +00:00
ru
910410640b Add a wrapper function that bound checks writes to the dump device. 2008-01-28 19:04:07 +00:00
jhb
a114208c34 Use cpu_spinwait() (i.e., "pause") when spinning on rdtsc during DELAY().
MFC after:	1 week
2008-01-17 18:59:38 +00:00
alc
7f5a9c7a36 Retire PMAP_DIAGNOSTIC. Any useful diagnostics that were conditionally
compiled under PMAP_DIAGNOSTIC are now KASSERT()s.  (Note: The kernel
option DIAGNOSTIC still disables inlining of certain pmap functions.)

Eliminate dead code from pmap_enter().  This code implemented an assertion.
On i386, an equivalent check is already implemented.  However, on amd64,
a small change is required to implement an equivalent check.

Eliminate \n from a nearby panic string.

Use KASSERT() to reimplement pmap_copy()'s two assertions.
2008-01-17 18:25:52 +00:00
peter
ed5c5e33bf Update the KVA_PAGES comments for the effect that PAE has on it. It
becomes a unit size of 2MB instead of 4MB and must be a multiple of 8 to
get a valid KERNBASE.
2008-01-14 22:53:01 +00:00
peter
8ead68ecb0 Add a CTASSERT that KERNBASE is valid. This is usually messed up by an
invalid KVA_PAGES, so add a pointer to there.
2008-01-14 22:51:43 +00:00
attilio
71b7824213 VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
2008-01-13 14:44:15 +00:00
bde
7371ad79e8 MFamd64 (everything possible up to 1.19; mainly the amd64 implementations
of fpget*() and fpset*()).

The i386 fpget*() were efficient but a bit obfuscated (using macros
and a case statement to demultiplex them through a single inline).
The demultiplexing mainly gave smaller source code.

The i386 fpset*() were obfuscated in the same way and were very
inefficient due to the case statement not having enough cases or
complexity so all cases used the FP environment.

This also fixes a harmless bug in rev.1.12.  fpsetmask() extracted the
old value from the bit-field twice, but the doubled shift was harmless
since the shift count is 0.

All fp*() interfaces are now inline functions on i386.  They used to
be macros that call (a different set of) inline functions.  This is a
small ABI change which shouldn't cause problems since cases where
inlining fails (mainly -O0) only give (working) static functions.
2008-01-11 18:59:35 +00:00
bde
b1a379ee65 Separate fpresetsticky() from the other fpset functions so that the
others can be replaced cleanly by the amd64 versions.   There is no
current amd64 version to merge, but there is an old one which is
similar.

Fix the following bugs in fpresetsticky():
- garbage args clobbered non-sticky bits in the status register
- the return value was usually garbage since it was masked with the
  arg instead of with the field selector.

Optimize fpresetsticky() to avoid using the environment as in
feclearexcept() (use only fnclex() if possible) and also to avoid
using fnclex() for null changes.  The second of these optimizations
might not be so good since its branch might cost more than it saves.
2008-01-11 18:27:01 +00:00
bde
466cc1c021 MFamd64 1.15-1.18 (cosmetic changes, mainly to comments). The inline
functions haven't been cleaned up here because the amd64 cleanups
don't apply directly and the functions here will be merged or rewritten
later.
2008-01-11 17:54:20 +00:00
attilio
18d0a0dd51 vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>
2008-01-10 01:10:58 +00:00
alc
7915a2d351 Convert a PMAP_DIAGNOSTIC to a KASSERT. 2008-01-08 08:30:30 +00:00
jhb
c7e0e41f73 Add COMPAT_FREEBSD7 and enable it in configs that have COMPAT_FREEBSD6. 2008-01-07 21:40:11 +00:00
alc
db37482a35 Shrink the size of struct vm_page on amd64 and i386 by eliminating
pv_list_count from struct md_page.  Ever since Peter rewrote the pv
entry allocator for amd64 and i386 pv_list_count has been correctly
maintained but otherwise unused.
2008-01-06 18:51:04 +00:00
alc
545d26e30b Add an access type parameter to pmap_enter(). It will be used to implement
superpage promotion.

Correct a style error in kmem_malloc(): pmap_enter()'s last parameter is
a Boolean.
2008-01-03 07:34:34 +00:00
alc
394df92ff3 Provide a legitimate pindex to vm_page_alloc() in pmap_growkernel()
instead of writing apologetic comments.  As it turns out, I need every
kernel page table page to have a legitimate pindex to support superpage
promotion on kernel memory.

Correct a nearby style error: Pointers should be compared to NULL.
2008-01-02 08:54:39 +00:00
jhb
ad97e37cb4 Include a "pae" feature if an i386 kernel is built with PAE support.
Obtained from:	Yahoo!
2007-12-31 21:12:45 +00:00
wkoszek
bb029b61d9 Replace explicit calls to video methods with their respective variants
implemented with macros. This patch improves code readability. Reasoning
behind vidd_* is a sort of "video discipline".

List of macros is supposed to be complete--all methods of video_switch
ought to have their respective macros from now on.

Functionally, this code should be no-op. My intention is to leave current
behaviour of touched code as is.

No objections:	rwatson
Silence on:	freebsd-current@
Approved by:	cognet
2007-12-29 23:26:59 +00:00
rpaulo
c09011a387 Add asmc(4).
Requested by:	njl (mentor)
2007-12-28 22:50:04 +00:00
alc
37cdbd87f5 Add configuration knobs for the superpage reservation system. Initially,
the reservation will only be enabled on amd64.
2007-12-27 16:45:39 +00:00
wkoszek
6bd00b2689 "vt" doesn't refer to any existing device anymore. Remove it.
Reviewed by:	cognet@ (mentor)
Approved by:	cognet@ (mentor)
2007-12-25 22:41:29 +00:00
rwatson
bdee30611d Add a new 'why' argument to kdb_enter(), and a set of constants to use
for that argument.  This will allow DDB to detect the broad category of
reason why the debugger has been entered, which it can use for the
purposes of deciding which DDB script to run.

Assign approximate why values to all current consumers of the
kdb_enter() interface.
2007-12-25 17:52:02 +00:00
jhb
ca17618420 More properly handle links who only have 1 valid IRQ in their bitmask. The
old code special cased them too early which caused a few differences for
these sort of links relative to other PCI links:

- They were always re-routed via the BIOS call instead of assuming that
  they were already routed if the BIOS had programmed the IRQ into a
  matching device during POST.
- If the BIOS did route that link to a different IRQ that was marked as
  invalid, we trusted the $PIR table rather than the BIOS IRQ.

This change moves the special casing for "unique IRQ" links to only take
that into account when picking an IRQ for an unrouted link so that these
links will now not be routed if the BIOS appears to have routed it already
(some BIOSen have problems with that) and so that if the BIOS uses a
different IRQ than the $PIR, we trust the BIOS routing instead (this is
what we do for all other links as well).

Reported by:	Bruce Walter  walter of fortean com
MFC after:	1 week
2007-12-21 16:53:27 +00:00
rpaulo
b3d67d43ff Fix previous commit. The code ended up in the wrong function.
Approved by:	     njl (mentor)
2007-12-16 20:37:27 +00:00
scottl
442c0b4cf6 Add the 'hptrr' driver for supporting the following Highpoint RocketRAID
cards:

     o   RocketRAID 172x series
     o   RocketRAID 174x series
     o   RocketRAID 2210
     o   RocketRAID 222x series
     o   RocketRAID 2240
     o   RocketRAID 230x series
     o   RocketRAID 231x series
     o   RocketRAID 232x series
     o   RocketRAID 2340
     o   RocketRAID 2522

Many thanks to Highpoint for their continued support of FreeBSD.

Submitted by: Highpoint
2007-12-15 00:56:17 +00:00
rpaulo
59ef90c02d Disallow the legacy USB circuit to generate an SMI# via an ICH
register (MacBooks only).
This allows MacBooks to boot in SMP mode without any trick and solves
the timer problems with HZ=1000.

MFC after:	   1 week

Reviewed by:	   njl (mentor), jhb
Approved by:	   njl (mentor), jhb
2007-12-12 20:24:06 +00:00
alc
fed3c18cd6 Eliminate compilation warnings due to the use of non-static inlines
through the introduction and use of the __gnu89_inline attribute.

Submitted by: bde (i386)
MFC after: 3 days
2007-12-09 21:00:36 +00:00
jkoshy
72c27d71d8 Kernel and hwpmc(4) support for callchain capture.
Sponsored by:	FreeBSD Foundation and Google Inc.
2007-12-07 08:20:17 +00:00
njl
2a12030949 Hold Giant over the entire execution of the suspend path instead of
dropping it after each call into newbus.  This doesn't fix any known
problems but seems more correct.

Submitted by:	Marko Zec <zec / icir.org>
2007-12-06 01:39:23 +00:00
kib
3e8ae081b2 Fix the ABI change of the signal delivered on the access to the page
with insufficient protection mode.

For the i386 and amd64, create the tunable, machdep.prot_fault_translation,
with the following behaviour:
	0 = autodetect the signal to be delivered on KERN_PROTECTION_FAILURE
	    from vm_fault based on the ELF OSABI note:
		no note or __FreeBSD_version < 700004 - SIGBUS/BUS_PAGE_FAULT
		note, and __FreeBSD_version >= 700004 - SIGSEGV/SEGV_ACCERR
	1 = always SIGBUS/BUS_PAGE_FAULT
	2 = always SIGSEGV/SEGV_ACCERR

This would do mostly automatic correction of ABI breakage, with the exception
of the untaged binaries for 7-CURRENT/RELENG_7 before the note is fixed. For
them, sysctl would allow to run the binary with manual settings.

Discussed with:	portmgr (kris)
PR:		kern/118304
MFC after:	3 days
2007-12-04 12:33:03 +00:00
alc
8cda75e035 Correct an error under COUNT_IPIS within pmap_lazyfix_action(): Increment
the counter that the pointer refers to, not the pointer.

MFC after: 3 days
2007-12-04 09:06:08 +00:00
rwatson
47c0478314 Remove duplicate $FreeBSD$ tag. 2007-12-02 21:07:49 +00:00
rwatson
99285f7544 Break out stack(9) from ddb(4):
- Introduce per-architecture stack_machdep.c to hold stack_save(9).
- Introduce per-architecture machine/stack.h to capture any common
  definitions required between db_trace.c and stack_machdep.c.
- Add new kernel option "options STACK"; we will build in stack(9) if it is
  defined, or also if "options DDB" is defined to provide compatibility
  with existing users of stack(9).

Add new stack_save_td(9) function, which allows the capture of a stacktrace
of another thread rather than the current thread, which the existing
stack_save(9) was limited to.  It requires that the thread be neither
swapped out nor running, which is the responsibility of the consumer to
enforce.

Update stack(9) man page.

Build tested:	amd64, arm, i386, ia64, powerpc, sparc64, sun4v
Runtime tested:	amd64 (rwatson), arm (cognet), i386 (rwatson)
2007-12-02 20:40:35 +00:00
phk
993b36f0ab Remove XRPU driver, after asking all the users. 2007-12-01 20:07:45 +00:00
alc
7e64e9843c Improve get_pv_entry()'s handling of low-memory conditions. After page
allocation fails and pv entries are reclaimed, there may be an unused pv
entry in a pv chunk that survived the reclamation.  However, previously,
after reclamation, get_pv_entry() did not look for an unused pv entry in
a surviving pv chunk; it simply retried the page allocation.  Now, it
does look for an unused pv entry before retrying the page allocation.

Note: This only applies to RELENG_7.  Earlier branches use a different
pv entry allocator.

MFC after: 6 weeks
2007-11-30 07:14:42 +00:00
bde
7231573802 Don't use plain "ret" instructions at targets of jump instructions,
since the branch caches on at least Athlon XP through Athlon 64 CPU's
don't understand such instructions and guarantee a cache miss taking
at least 10 cycles.  Use the documented workaround "ret $0" instead
("nop; ret" also works, but "ret $0" is probably faster on old CPUs).

Normal code (even asm code) doesn't branch to "ret", since there is
usually some cleanup to do, but the __mcount, .mcount and .mexitcount
entry points were optimized too well to have the minimum number of
instructions (3 instructions each if profiling is not enabled) and
they did this.  I didn't see a significant number of cache misses for
.mexitcount, but for the shared "ret" for __mcount and .mcount I
observed cache misses costing 26 cycles each.  For a send(2) syscall
that makes about 70 function calls, the cost of these cache misses
alone increased the syscall time from about 4000 cycles to about 7000
cycles.  4000 is for a profiling (GUPROF) kernel with profiling disabled;
after this fix, configuring profiling only costs about 600 cycles in the
4000, which is consistent with almost perfect branch prediction in the
mcounting calls.
2007-11-29 02:01:21 +00:00
bde
35b85a2fdb Remove entry points for -finstrument functions since they are currently
unused except to obfuscate disassemblies.  -mprofiler-epilogue is
currently with gcc-4 (it does too little), but -finstrument-functions
is broken in a different way (it does too much).

amd64 version: meger whitespace fixes from i386 version.
2007-11-29 01:15:03 +00:00
jhb
4a39f29f1b MFamd64: 1.109 of pci_cfgreg.c which changes pci_cfgdisable() into a nop
for type #1 similar to what other OS's do.

MFC after:	3 days
2007-11-28 22:22:05 +00:00
jhb
2a71fa9467 Adjust the code to probe for the PCI config mechanism to use.
- On amd64, just assume type #1 is always used.  PCI 2.0 mandated
  deprecated type #2 and required type #1 for all future bridges which
  was well before amd64 existed.
- For i386, ignore whatever value was in 0xcf8 before testing for type #1
  and instead rely on the other tests to determine if type #1 works.  Some
  newer machines leave garbage in 0xcf8 during boot and as a result the
  kernel doesn't find PCI at all (which greatly confuses ACPI which expects
  PCI to exist when PCI busses are in the namespace).

MFC after:	3 days
Discussed with:	scottl
2007-11-28 22:20:08 +00:00
attilio
2562874cb6 Make ADAPTIVE_GIANT as the default in the kernel and remove the option.
Currently, Giant is not too much contented so that it is ok to treact it
like any other mutexes.

Please don't forget to update your own custom config kernel files.

Approved by:	cognet, marcel (maintainers of arches where option is
		not enabled at the moment)
2007-11-28 05:50:45 +00:00
jhb
7fe785218b Remove the 'needbounce' variable from the _bus_dmamap_load_buffer()
routine.  It is not needed as the existing tests for segment coalescing
already handle bounced addresses and it prevents legal segment coalescing
in certain edge cases.

MFC after:	1 week
Reviewed by:	scottl
2007-11-27 17:28:12 +00:00
kib
20098981e1 Implement read_default_ldt in linux_modify_ldt(). It copies out zeroed
descriptor, like real Linux does.

Tested by: Yuriy Tsibizov <yuriy.tsibizov at gmail com>
Submitted by:	rdivacky
MFC after:	1 week
2007-11-26 11:06:19 +00:00
jkoshy
7ea038e33e MFP4: Add assembly language symbols used by hwpmc(4)'s callchain capture. 2007-11-23 03:03:30 +00:00
scottl
b607c8d8ad Extend critical section coverage in the low-level interrupt handlers to
include the ithread scheduling step.  Without this, a preemption might
occur in between the interrupt getting masked and the ithread getting
scheduled.  Since the interrupt handler runs in the context of curthread,
the scheudler might see it as having a such a low priority on a busy system
that it doesn't get to run for a _long_ time, leaving the interrupt stranded
in a disabled state.  The only way that the preemption can happen is by
a fast/filter handler triggering a schduling event earlier in the handler,
so this problem can only happen for cases where an interrupt is being
shared by both a fast/filter handler and an ithread handler.  Unfortunately,
it seems to be common for this sharing to happen with network and USB
devices, for example.  This fixes many of the mysterious TCP session
timeouts and NIC watchdogs that were being reported.  Many thanks to Sam
Lefler for getting to the bottom of this problem.

Reviewed by: jhb, jeff, silby
2007-11-21 04:03:51 +00:00
alc
d1ab859bdc Prevent the leakage of wired pages in the following circumstances:
First, a file is mmap(2)ed and then mlock(2)ed.  Later, it is truncated.
Under "normal" circumstances, i.e., when the file is not mlock(2)ed, the
pages beyond the EOF are unmapped and freed.  However, when the file is
mlock(2)ed, the pages beyond the EOF are unmapped but not freed because
they have a non-zero wire count.  This can be a mistake.  Specifically,
it is a mistake if the sole reason why the pages are wired is because of
wired, managed mappings.  Previously, unmapping the pages destroys these
wired, managed mappings, but does not reduce the pages' wire count.
Consequently, when the file is unmapped, the pages are not unwired
because the wired mapping has been destroyed.  Moreover, when the vm
object is finally destroyed, the pages are leaked because they are still
wired.  The fix is to reduce the pages' wired count by the number of
wired, managed mappings destroyed.  To do this, I introduce a new pmap
function pmap_page_wired_mappings() that returns the number of managed
mappings to the given physical page that are wired, and I use this
function in vm_object_page_remove().

Reviewed by: tegge
MFC after: 6 weeks
2007-11-17 22:52:29 +00:00
marcel
1e7c4f0a3f o Rename cpu_thread_setup() to cpu_thread_alloc() to better
communicate that it relates to (is called by) thread_alloc()
o  Add cpu_thread_free() which is called from thread_free()
   to counter-act cpu_thread_alloc().

i386:	Have cpu_thread_free() call cpu_thread_clean() to
	preserve behaviour.
ia64:	Have cpu_thread_free() call mtx_destroy() for the
	mutex initialized in cpu_thread_alloc().

PR: ia64/118024
2007-11-14 20:21:54 +00:00
julian
7ee6259be7 A bunch more files that should probably print out a thread name
instead of a process name.
2007-11-14 06:51:33 +00:00
julian
b2732e0c22 generally we are interested in what thread did something as
opposed to what process. Since threads by default have teh name of the
process unless over-written with more useful information, just print the
thread name instead.
2007-11-14 06:21:24 +00:00
julian
760b9605ef Apply the same sort of locking done in
sys/dev/acpica/acpi.c rev 1.196 a while ago:

Grab Giant around calls to DEVICE_SUSPEND/RESUME in
acpi_SetSleepState().
If we are resuming non-MPSAFE drivers, they need Giant held for them.
This may fix some obscure suspend/resume problems.  It has fixed keyrate
setting problems that were triggered by cardbus (MPSAFE) changing the
ordering for syscons resume (non-MPSAFE).  Also, add some asserts that
Giant is held in our suspend/resume and shutdown methods.

Submitted by: Marko Zec
2007-11-14 05:43:55 +00:00
peter
7ed74e55f5 Drastically simplify the i386 pcpu backend by merging parts of the
amd64 mechanism over.  Instead of page table hackery that isn't
actually needed, just use 'struct pcpu __pcpu[MAXCPU]' for backing like
all the other platforms do.  Get rid of 'struct privatespace' and a
while mess of #ifdef SMP garbage that set it up.  As a bonus, this
returns the 4MB of KVA that we stole to implement it the old way.
This also allows you to read the pcpu data for each cpu when reading a
minidump.

Background information:  Originally, pcpu stuff was implemented as having
per-cpu page tables and magic to make different data structures appear
at the same actual address.  In order to share page tables, we switched
to using the GDT and %fs/%gs to access it.  But we still did the evil
magic to set it up for the old way.  The "idle stacks" are not used
for the idle process anymore and are just used for a few functions during
bootup, then ignored.  (excercise for reader: free these afterwards).
2007-11-13 23:00:24 +00:00
benjsc
f9553858df Link wpi(4) into the build.
This includes:
    o mtree (for legal/intel_wpi)
    o manpage for i386/amd64 archs
    o module for i386/amd64 archs
    o NOTES for i386/amd64 archs

Approved by: mlaier (comentor)
2007-11-08 22:09:37 +00:00
alc
3d1044f0aa Add comments explaining why all stores updating a non-kernel page table
must be globally performed before calling any of the TLB invalidation
functions.

With one exception, on amd64, this requirement was already met.  Fix this
one case.  Also, as a clarification, change an existing atomic op into a
release.  (Suggested by: jhb)

Reported and reviewed by: ups
MFC after: 3 days
2007-11-05 18:13:34 +00:00
kib
9ae733819b Fix for the panic("vm_thread_new: kstack allocation failed") and
silent NULL pointer dereference in the i386 and sparc64 pmap_pinit()
when the kmem_alloc_nofault() failed to allocate address space. Both
functions now return error instead of panicing or dereferencing NULL.

As consequence, vmspace_exec() and vmspace_unshare() returns the errno
int. struct vmspace arg was added to vm_forkproc() to avoid dealing
with failed allocation when most of the fork1() job is already done.

The kernel stack for the thread is now set up in the thread_alloc(),
that itself may return NULL. Also, allocation of the first process
thread is performed in the fork1() to properly deal with stack
allocation failure. proc_linkup() is separated into proc_linkup()
called from fork1(), and proc_linkup0(), that is used to set up the
kernel process (was known as swapper).

In collaboration with:	Peter Holm
Reviewed by:	jhb
2007-11-05 11:36:16 +00:00
sam
46608d3cac fix build: when usb was enabled wireless drivers were brought in so
remove the nodevice lines that elided wlan support
2007-11-03 19:26:49 +00:00
thompsa
113fde39bb Remove zyd as wireless is not supported on PAE. 2007-11-03 07:11:07 +00:00
alc
1fd60b45c3 Eliminate spurious "Approaching the limit on PV entries, ..."
warnings.  Specifically, whenever vm_page_alloc(9) returned NULL to
get_pv_entry(), we issued a warning regardless of the number of pv
entries in use.  (Note: The older pv entry allocator in RELENG_6 does
not have this problem.)

Reported by:	Jeremy Chadwick

Eliminate the direct call to pagedaemon_wakeup() by get_pv_entry().
This was a holdover from earlier times when the page daemon was
responsible for the reclamation of pv entries.

MFC after: 5 days
2007-11-03 05:15:26 +00:00
peter
e93fa6ca81 Move nvram out of DEFAULTS. There really isn't a lot of justification
for consuming the memory.  The module works just fine in the unlikely
case that this is needed.  It can still be compiled into a custom kernel.
2007-10-29 22:19:08 +00:00
jhb
ae8e7ec2a3 - Add constants for the different memory types in the SMAP table.
- Use the SMAP types and constants from <machine/pc/bios.h> in the boot
  code rather than duplicating it.
2007-10-28 21:23:49 +00:00
peter
9c4d4d9a16 Split /dev/nvram driver out of isa/clock.c for i386 and amd64. I have not
refactored it to be a generic device.
Instead of being part of the standard kernel, there is now a 'nvram' device
for i386/amd64.  It is in DEFAULTS like io and mem, and can be turned off
with 'nodevice nvram'.  This matches the previous behavior when it was
first committed.
2007-10-26 03:23:54 +00:00
imp
49831563eb Add usb serial devices by default. I'm tired of telling people how to
do this that should know better :-).
2007-10-26 02:20:29 +00:00
jhb
81c7dc737f Update copyright attribution.
MFC after:	3 days
2007-10-24 21:16:22 +00:00
rwatson
60570a92bf Merge first in a series of TrustedBSD MAC Framework KPI changes
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:

  mac_<object>_<method/action>
  mac_<object>_check_<method/action>

The previous naming scheme was inconsistent and mostly
reversed from the new scheme.  Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier.  Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods.  Also simplify, slightly,
some entry point names.

All MAC policy modules will need to be recompiled, and modules
not updates as part of this commit will need to be modified to
conform to the new KPI.

Sponsored by:	SPARTA (original patches against Mac OS X)
Obtained from:	TrustedBSD Project, Apple Computer
2007-10-24 19:04:04 +00:00
jhb
67997e41d5 Slightly cleanup the 'bootdev' concept on x86 by changing the various
macros to treat the 'slice' field as a real part of the bootdev instead
of as hack that spans two other fields (adaptor (sic) and controller)
that are not used in any modern FreeBSD boot code.

MFC after:	1 week
2007-10-24 04:03:25 +00:00
jhb
2b44eb1afe Stop disabling USB in the PAE kernel config. The USB code has been
using bus_dma(9) for quite a while now and has been used on 64-bit archs
as well.

MFC after:	1 month
2007-10-24 03:53:10 +00:00
julian
51d643caa6 Rename the kthread_xxx (e.g. kthread_create()) calls
to kproc_xxx as they actually make whole processes.
Thos makes way for us to add REAL kthread_create() and friends
that actually make theads. it turns out that most of these
calls actually end up being moved back to the thread version
when it's added. but we need to make this cosmetic change first.

I'd LOVE to do this rename in 7.0  so that we can eventually MFC the
new kthread_xxx() calls.
2007-10-20 23:23:23 +00:00
bz
830ad96079 Fold multiple asm statements into one so that the compiler at a certain
optimization level (-march=pentium-mmx for example) does not insert
intermediate ops which would trash the carry.

Change both sys/i386/i386/in_cksum.c[1] and sys/i386/include/in_cksum.h.

To my best understanding the same problem was addressed in rev. 1.16
of src/sys/i386/include/in_cksum.h for just a single function 3y ago.

Reviewed by:  jhb
Submitted by: Zhouyi ZHOU <zhouzhouyi FreeBSD.org> (intial version of [1])
MFC after:    5 days
PR:           115678, 69257
2007-10-20 22:18:42 +00:00
kensmith
7e252facf4 Switch over to ULE as the default scheduler for amd64 and i386
architectures.
2007-10-19 12:30:33 +00:00
netchild
21c6e78ea7 Backout sensors framework.
Requested by:	phk
Discussed on:	cvs-all
2007-10-15 20:00:24 +00:00
netchild
8423df3d94 Import it(4) and lm(4), supporting most popular Super I/O Hardware Monitors.
Submitted by:	Constantine A. Murenin <cnst@FreeBSD.org>
Sponsored by:	Google Summer of Code 2007 (GSoC2007/cnst-sensors)
Mentored by:	syrinx
Tested by:	many
OKed by:	kensmith
Obtained from:	OpenBSD (parts)
2007-10-14 10:55:50 +00:00
marius
d60b8a3096 Make the PCI code aware of PCI domains (aka PCI segments) so we can
support machines having multiple independently numbered PCI domains
and don't support reenumeration without ambiguity amongst the
devices as seen by the OS and represented by PCI location strings.
This includes introducing a function pci_find_dbsf(9) which works
like pci_find_bsf(9) but additionally takes a domain number argument
and limiting pci_find_bsf(9) to only search devices in domain 0 (the
only domain in single-domain systems). Bge(4) and ofw_pcibus(4) are
changed to use pci_find_dbsf(9) instead of pci_find_bsf(9) in order
to no longer report false positives when searching for siblings and
dupe devices in the same domain respectively.
Along with this change the sole host-PCI bridge driver converted to
actually make use of PCI domain support is uninorth(4), the others
continue to use domain 0 only for now and need to be converted as
appropriate later on.
Note that this means that the format of the location strings as used
by pciconf(8) has been changed and that consumers of <sys/pciio.h>
potentially need to be recompiled.

Suggested by:	jhb
Reviewed by:	grehan, jhb, marcel
Approved by:	re (kensmith), jhb (PCI maintainer hat)
2007-09-30 11:05:18 +00:00
brueffer
26461bf019 Use the correct expanded name for SCTP.
PR:		116496
Submitted by:	koitsu
Reviewed by:	rrs
Approved by:	re (kensmith)
2007-09-26 20:05:07 +00:00
alc
d1bce06c64 Change the management of cached pages (PQ_CACHE) in two fundamental
ways:

(1) Cached pages are no longer kept in the object's resident page
splay tree and memq.  Instead, they are kept in a separate per-object
splay tree of cached pages.  However, access to this new per-object
splay tree is synchronized by the _free_ page queues lock, not to be
confused with the heavily contended page queues lock.  Consequently, a
cached page can be reclaimed by vm_page_alloc(9) without acquiring the
object's lock or the page queues lock.

This solves a problem independently reported by tegge@ and Isilon.
Specifically, they observed the page daemon consuming a great deal of
CPU time because of pages bouncing back and forth between the cache
queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE).  The source of
this problem turned out to be a deadlock avoidance strategy employed
when selecting a cached page to reclaim in vm_page_select_cache().
However, the root cause was really that reclaiming a cached page
required the acquisition of an object lock while the page queues lock
was already held.  Thus, this change addresses the problem at its
root, by eliminating the need to acquire the object's lock.

Moreover, keeping cached pages in the object's primary splay tree and
memq was, in effect, optimizing for the uncommon case.  Cached pages
are reclaimed far, far more often than they are reactivated.  Instead,
this change makes reclamation cheaper, especially in terms of
synchronization overhead, and reactivation more expensive, because
reactivated pages will have to be reentered into the object's primary
splay tree and memq.

(2) Cached pages are now stored alongside free pages in the physical
memory allocator's buddy queues, increasing the likelihood that large
allocations of contiguous physical memory (i.e., superpages) will
succeed.

Finally, as a result of this change long-standing restrictions on when
and where a cached page can be reclaimed and returned by
vm_page_alloc(9) are eliminated.  Specifically, calls to
vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and
return a formerly cached page.  Consequently, a call to malloc(9)
specifying M_NOWAIT is less likely to fail.

Discussed with: many over the course of the summer, including jeff@,
   Justin Husted @ Isilon, peter@, tegge@
Tested by: an earlier version by kris@
Approved by: re (kensmith)
2007-09-25 06:25:06 +00:00
attilio
e25b203061 Fix some entries in the locks static table of witness.
In particular:
- smp_tlb_mtx is no longer used, so it is axed.
- smp rendezvous lock isn't really a leaf spin-mutex. Its bad placement in
  the table, however, has been the source of a false positive LOR reporting
  with the dt_lock.  However, smp rendezvous lock would have had sched_lock
  there for older lock, so it wasn't still a leaf lock.
- allpmaps is only used in ia32 architecture, so it is inserted in the
  appropriate stub.

Addictionally:
- kse_zombie_lock is no longer present, so its definition is axed out.
- zombie_lock doesn't need to have an exported symbol, so just let's it be
  declared as static.

Tested by: kris
Approved by: jeff (mentor)
Approved by: re
2007-09-20 20:38:43 +00:00
kib
038cf0387b Fill in cr2 in the signal context from ksi->ksi_addr.
Together with the sys/i386/i386/trap.c rev. 1.306 it fixes the PR.

Submitted by:	rdivacky
Suggested by:	jhb
Sponsored by:	Google Summer of Code 2007
PR:		kern/77710
Approved by:	re (kensmith)
2007-09-20 13:46:26 +00:00
dwmalone
11cf0c8f4a regen.
Approved by: re (kensmith)
2007-09-18 19:51:49 +00:00
dwmalone
37c880369b The kernel version of Linux statfs64 is actually supposed to take
3 arguments, but we had forgotten the second argument. Also make the
Linux statfs64 struct depend on the architecture because it has an
extra 4 bytes padding on amd64 compared to i386.

The three argument fix is from David Taylor, the struct statfs64
stuff is my fault. With this patch I can install i386 Linux matlab
on an amd64 machine.

Submitted by: David Taylor <davidt_at_yadt.co.uk>
Approved by: re (kensmith)
2007-09-18 19:50:33 +00:00
phk
8ffb12a0b6 Recognize the Soekris NET5501 and configure the error led.
Add watchdog(4) support by using the MFGPT0 in the Geode LX CX5536.
(Supported range: 2^30 .. 2^44 ns = 1s ... 5h)

Approved by:	re (bmah)
2007-09-18 09:19:44 +00:00
peter
4e0e5f8fee Fix an undefined symbol that as/ld neglected to flag as a problem. It
was used in assembler code in such a way that no unresolved relocation
records were generated, so ld didn't flag the problem.   You can see
this with an 'nm' of the kernel.  There will be 'U MAXCPU' on SMP systems.

The impact of this is that the intrcount/intrnames arrays do not have
the intended amount of space reserved.  This could lead to interesting
problems due to the arrays being present in the middle of kernel code.
An overflow would be rather interesting as executable code would be used
as per-cpu incrementing interrupt counters.

This fixes it for now by exporting MAXCPU to the assembler.  A better fix
might be to define these data structures in C - they're only referenced
in the kernel from C code these days anyway.

Approved by:  re (kensmith)
2007-09-17 21:55:28 +00:00
jeff
3fc0f8b973 - Move all of the PS_ flags into either p_flag or td_flags.
- p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or
   previously the sched_lock.  These bugs have existed for some time.
 - Allow swapout to try each thread in a process individually and then
   swapin the whole process if any of these fail.  This allows us to move
   most scheduler related swap flags into td_flags.
 - Keep ki_sflag for backwards compat but change all in source tools to
   use the new and more correct location of P_INMEM.

Reported by:	pho
Reviewed by:	attilio, kib
Approved by:	re (kensmith)
2007-09-17 05:31:39 +00:00
attilio
ae7f786cc5 This is a follow-up, cleaning-up commit about recent changes involving
topology foo functions.
Working at the patch for topology problems in ia32/amd64 evicted some
problems regarding functions ordering in the SI_SUB_CPU family of
SYSINIT'ed subsystems.
In order to avoid problems with new modified to involved functions, a
correct ordering is not semantically specified for SI_SUB_CPU functions
(for a larger view of the issue please visit:
http://lists.freebsd.org/pipermail/freebsd-current/2007-July/075409.html )

Discussed with: peter
Tested by: kris, Rui Paulo <rpaulo@FreeBSD.org>
Approved by: jeff
Approved by: re
2007-09-11 22:54:09 +00:00
nyan
d507f1509c Fix a kernel panic due to a NULL pointer access on pc98.
When any PnP device exists, isa_release_resource() is called with no
activated resource.  So a bushandle is not allocated yet.

Approved by:	re (kensmith)
2007-09-01 12:18:28 +00:00
kib
5b26984cf1 Regenerate.
Approved by:	re (kensmith)
2007-08-28 12:36:23 +00:00
kib
39e24dc75d Implement fake linux sched_getaffinity() syscall to enable java to work
with Linux 2.6 emulation. This shall be reimplemented once FreeBSD gets
native scheduler affinity syscalls.

Submitted by:	rdivacky
Reviewed by:	jkim
Sponsored by:	Google Summer of Code 2007
Approved by:	re (kensmith)
2007-08-28 12:26:35 +00:00
jkoshy
8e094e5065 Assign sizes to assembly language support functions.
Approved by:	re (kensmith)
2007-08-22 05:06:14 +00:00
jkoshy
106a0e34d4 Define an END() macro for use in i386 and amd64 assembly code, akin
to the one available on the ia64, sparc64, and sun4v architectures.

Approved by:	re (kensmith)
2007-08-22 04:26:07 +00:00
alc
cbe3361efb In general, when we map a page into the kernel's address space, we no
longer create a pv entry for that mapping.  (The two exceptions are
mappings into the kernel's exec and pipe submaps.)  Consequently, there is
no reason for get_pv_entry() to dig deep into the free page queues, i.e.,
use VM_ALLOC_SYSTEM, by default.  This revision changes get_pv_entry() to
use VM_ALLOC_NORMAL by default, i.e., before calling pmap_collect() to
reclaim pv entries.

Approved by:	re (kensmith)
2007-08-21 04:59:34 +00:00
des
fcec3dfa48 Add a driver for the on-die digital thermal sensor found on Intel Core
and newer CPUs (including Core 2 and Core / Core 2 based Xeons).  The
driver attaches to each cpu device and creates a sysctl node in that
device's sysctl context (dev.cpu.N.temperature).  When invoked, the
handler binds to the appropriate CPU to ensure a correct reading.

Submitted by:	Rui Paulo <rpaulo@fnop.net>
Sponsored by:	Google Summer of Code 2007
Tested by:	des, marcus, Constantine A. Murenin, Ian FREISLICH
Approved by:	re (kensmith)
MFC after:	3 weeks
2007-08-15 19:26:03 +00:00
njl
6fbfdc2928 Add "show sysregs" command to ddb. On i386, this gives gdt, idt, ldt,
cr0-4, etc.  Support should be added for other platforms that have a
different set of registers for system use.

Loosely based on: OpenBSD
Approved by:	re
2007-08-09 20:14:35 +00:00
peter
c3c2fa9c59 Move mp_topology() from apic_init(i386) and apic_setup_local(amd64) to
cpu_start_mp().  This is after we have read the cpuid registers to
calculate the hyperthreading_cpus value for the sysctl that enables or
disables hyperthread cores.  Change mp_topology() to use that information
rather than trying to do it itself.

This solves the problem of ULE being incorrectly told that dual core
Athlon64 X2 or Operton cpus are hyperthreading cores.  At the very least,
we now have a single piece of code to identify hyperthreading.

Obtained from:  jhb
Approved by:  re (kensmith)
2007-08-02 21:17:58 +00:00
dwmalone
16ad7cce99 It seems that some i386 mothermoards either do not implement the
day of week field correctly, or they remember bad values that are
written into the day of week field. For this reason, ignore the day
of week field when reading the clock on i386 rather than bailing if
it is set incorrectly.

Problems were seen on a number of platforms, including VMWare, qemu,
EPIA ME6000, Epox-3PTA and ABIT-SL30T.

This is a slightly different fix to that proposed by Ted in his PR,
but the same basic idea.

PR:		111117
Submitted by:	Ted Faber <faber@lunabase.org>
Approved by:	re (rwatson)
MFC after:	3 weeks
2007-07-27 09:34:42 +00:00
jhb
2bcc29a2ea If the trap number stored in the trapframe is corrupted into a negative
value, then we would use a negative index into the trap_msg[] array
resulting in a nested page fault.  Make the 'type' variable holding the
trap number unsigned to avoid this.

MFC after:	2 weeks
Approved by:	re (rwatson)
2007-07-26 15:32:55 +00:00
dwmalone
e8276674f3 If clock_ct_to_ts fails to convert time time from the real time clock,
print a one line error message. Add some comments on not being able to
trust the day of week field (I'll act on these comments in a follow up
commit).

Approved by:	re
MFC after:	3 weeks
2007-07-23 09:42:32 +00:00
attilio
d6dfb4f4cb i386_set_ioperm, i386_get_ldt and i386_set_ldt are now MPSAFE
(Giant/sched_lock free) so remove unuseful Giant cruft.

Approved by: jeff
Approved by: re
Sponsorized by: NGX Italy (http://www.ngx.it)
2007-07-20 08:35:18 +00:00
jeff
3bab343460 - Add support for blocking and releasing threads to i386 cpu_switch(). This
is required for per-cpu scheduler lock support.

Obtained from:	attilio
Tested by:	current@ many users
Approved by:	re
2007-07-17 22:34:14 +00:00
mjacob
62984f5f3f Remove the internal use of __packed and put it on the structures
themselves.

Reviewed by:	nate, peter, warner, robert
Approved by:	re (ken)
2007-07-11 22:34:34 +00:00
attilio
62972d9b6e NULL_LDT_BASE is used in !SMP kernels too and set_user_ldt() is not
properly called. Address these two issues.

Reported by: Tinderbox
Tested by: le
Approved by: jeff (mentor)
Approved by: re
2007-07-08 18:17:42 +00:00
njl
ee9b742ebb Now that we have a function that can be called from a cdevsw close()
entry point, use it.

Approved by:	re
2007-07-07 17:54:33 +00:00
attilio
cb5c34b947 Actual code shows several problems in ia32 LDT handling:
- When a LDT entry changes, the old one is freed while it is still
  referenced by gdt and ldtr.  This can lead to disruptive behaviours in
  particular on SMP machines.
- When a LDT entry changes, it is assumed that the only one entity sharing
  the same LDT are threads in the same proc.  It doesn't take in account
  edge cases where two processes share the same VM (rfork'ed ones, for
  example).

This patch addresses these two problems and addictionally it fixes the
usage of refcount switching back it to the old manually-grown refcount
(since in this case would be faster).

Diagnosed by: tegge
Tested by: pho (a former version)
Reviewed by: kib
Approved by: jeff (mentor)
Approved by: re
2007-07-07 16:59:01 +00:00
bz
6da9611026 I4B header files were repo-copied from sys/i386/include/ to
sys/i4b/include/ so they will be available to all architectures
once I4B compiles on those.

Approved by:	re (kensmith)
2007-07-06 07:23:39 +00:00