Commit Graph

11381 Commits

Author SHA1 Message Date
Ed Schouten
bc093719ca Integrate the new MPSAFE TTY layer to the FreeBSD operating system.
The last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:

- Improved driver model:

  The old TTY layer has a driver model that is not abstract enough to
  make it friendly to use. A good example is the output path, where the
  device drivers directly access the output buffers. This means that an
  in-kernel PPP implementation must always convert network buffers into
  TTY buffers.

  If a PPP implementation would be built on top of the new TTY layer
  (still needs a hooks layer, though), it would allow the PPP
  implementation to directly hand the data to the TTY driver.

- Improved hotplugging:

  With the old TTY layer, it isn't entirely safe to destroy TTY's from
  the system. This implementation has a two-step destructing design,
  where the driver first abandons the TTY. After all threads have left
  the TTY, the TTY layer calls a routine in the driver, which can be
  used to free resources (unit numbers, etc).

  The pts(4) driver also implements this feature, which means
  posix_openpt() will now return PTY's that are created on the fly.

- Improved performance:

  One of the major improvements is the per-TTY mutex, which is expected
  to improve scalability when compared to the old Giant locking.
  Another change is the unbuffered copying to userspace, which is both
  used on TTY device nodes and PTY masters.

Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.

Obtained from:		//depot/projects/mpsafetty/...
Approved by:		philip (ex-mentor)
Discussed:		on the lists, at BSDCan, at the DevSummit
Sponsored by:		Snow B.V., the Netherlands
dcons(4) fixed by:	kan
2008-08-20 08:31:58 +00:00
Kip Macy
c2dfb0d05b don't use cpu_idle_acpi under xen
MFC after:	1 month
2008-08-20 03:28:32 +00:00
John Baldwin
70d12a18f2 Export 'struct pcpu' to userland w/o requiring _KERNEL. A few ports
already define _KERNEL to get to this and I'm about to add hooks to
libkvm to access per-CPU data.

MFC after:	1 week
2008-08-19 19:53:52 +00:00
Kip Macy
ecded8075f protect queue_log not queue
MFC after:	1 month
2008-08-19 02:39:34 +00:00
Kip Macy
6786023a87 Fix compilation without INVARIANTS
MFC after:	1 month
2008-08-19 02:36:56 +00:00
Kip Macy
d1e363dd51 remove redundant PT_SET_MA declaration
MFC after:	1 month
2008-08-19 02:27:31 +00:00
Kip Macy
7e9608c858 PT_UPDATES_FLUSH() is used in common code so it needs to be defined
even in the !defined(XEN) case

MFC after:	1 month
2008-08-18 21:35:09 +00:00
Jung-uk Kim
520ba9d94a MFamd64: Correctly check unsignedness of all registers used
for load instructions with direct or indirect offsets.
2008-08-18 21:17:47 +00:00
Jung-uk Kim
3bfea8682f - Make these files compilable on user land.
- Update copyrights and fix style(9).
2008-08-18 18:59:33 +00:00
Kip Macy
1c8e9487bf Ensure that machine / physical addresses are treated as vm_paddr_t
MFC after:	1 month
2008-08-17 23:39:22 +00:00
Kip Macy
fc715e2309 remove code in XEN version of init386 causing initialization failure
MFC after:	1 month
2008-08-17 23:38:14 +00:00
Kip Macy
f0a565d1c5 translate machine addresses to physical addresses in new code in pmap_init
MFC after:	1 month
2008-08-17 23:36:52 +00:00
Kip Macy
886b1e498b bypass call to trap when handling hypervisor_upcall
MFC after:	1 month
2008-08-17 23:35:36 +00:00
Kip Macy
e9c9d2fcc7 clean up initvalues to work correctly on PAE
MFC after:	1 month
2008-08-17 23:34:44 +00:00
Bjoern A. Zeeb
603724d3ab Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from:	//depot/projects/vimage-commit2/...
Reviewed by:	brooks, des, ed, mav, julian,
		jamie, kris, rwatson, zec, ...
		(various people I forgot, different versions)
		md5 (with a bit of help)
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
X-MFC after:	never
V_Commit_Message_Reviewed_By:	more people than the patch
2008-08-17 23:27:27 +00:00
Kip Macy
2139b228e3 Call in to xen for privileged aspects of context switching
MFC after:	1 month
2008-08-16 21:38:46 +00:00
Kip Macy
8382474434 disable PREEMPTION pending bug fixes to i386/xen/pmap.c
MFC after:	1 month
2008-08-15 21:47:11 +00:00
Kip Macy
24b7d5cd1a Call in to xen for fpu handling when XEN is set
MFC after:	1 month
2008-08-15 21:43:38 +00:00
Kip Macy
10dc76a3f6 Integrate configuration bits for compling xen.
MFC after:	1 month
2008-08-15 20:58:57 +00:00
Kip Macy
93ee134a24 Integrate support for xen in to i386 common code.
MFC after:	1 month
2008-08-15 20:51:31 +00:00
Kip Macy
f0c468df71 Compile fixes for xen build.
MFC after:	1 month.
2008-08-15 04:00:44 +00:00
Jung-uk Kim
8c4d5bbc6f Use int32_t/int16_t instead of int/short as sys/net/bpf_filter.c does. 2008-08-13 19:52:00 +00:00
Jung-uk Kim
f40611e24f - Remove unnecessary jump instruction(s) when offset(s) is/are zero(s).
- Constantly use conditional jumps for unsigned integers.
2008-08-13 19:25:09 +00:00
Attilio Rao
ab46d66ac3 In the case of POWERFAIL_NMI, remove the Giant acquisitions because they
can lead to a deadlock if the thread owning the Giant lock is interrupted
by the NMI.
Instead, tollerate a small race on the x86 architecture.
2008-08-13 18:29:29 +00:00
John Baldwin
bc136b187d Attach the cpufreq child devices with specific orders to enforce relative
priority of some of the drivers that manage the same state (e.g. ichss0
vs est0).  Specifically, powernow, est, and p4tcc are added at order 10,
ichss at order 20, and smist at order 30.  Previously, some laptops were
seeing both ichss0 and est0 attaching and stomping on each other.

XXX: This isn't quite ideal, but works with the existing hacks, I think
what we really want instead is a single "speedstep0" device for CPUs
that the ichss, est, and smist drivers probe (but with differing
priorities).

MFC after:	1 week
2008-08-13 16:09:40 +00:00
Jung-uk Kim
17693f561c MFamd64: Remove unused macros. 2008-08-12 21:45:38 +00:00
Jung-uk Kim
095130bf72 Update copyrights and fix style(9). 2008-08-12 21:31:31 +00:00
Jung-uk Kim
ed67c5d584 Reduce number of stack usages with unused %edi. 2008-08-12 20:12:59 +00:00
Kip Macy
fbcad32779 Import i386 xen sub-arch files.
MFC after:	2 weeks
2008-08-12 19:48:18 +00:00
Kip Macy
41c24a46d4 Import xen sub-arch includes.
MFC after:	2 weeks
2008-08-12 19:41:11 +00:00
John Baldwin
e80531c27f Decode some more "exotic" instructions including: fxsave, fxrstor, ldmxcsr,
stmxcsr, clflush, lfence, mfence, sfence, syscall, sysret, sysenter,
sysexit, pause, monitor, mwait, and swapgs (amd64 only).

MFC after:	1 week
2008-08-11 20:19:42 +00:00
John Baldwin
24f1b6531c MFamd64: Decode "cmov*" instructions.
MFC after:	1 week
2008-08-11 20:10:52 +00:00
Philip Paeps
a51aa5d1f6 Add glxsb(4) driver for the Security Block in AMD Geode LX processors (as
found in Soekris hardware, for instance).  The hardware supports acceleration
of AES-128-CBC accessible through crypto(4) and supplies entropy to random(4).

TODO:

    o Implement rndtest(4) support
    o Performance enhancements

Submitted by:	Patrick Lamaizière <patfbsd -at- davenulle.org>
Reviewed by:	jhb, sam
MFC after:	1 week
2008-08-09 14:52:31 +00:00
Stanislav Sedov
e085f869d5 - Add cpuctl(4) pseudo-device driver to provide access to some low-level
features of CPUs like reading/writing machine-specific registers,
  retrieving cpuid data, and updating microcode.
- Add cpucontrol(8) utility, that provides userland access to
  the features of cpuctl(4).
- Add subsequent manpages.

The cpuctl(4) device operates as follows. The pseudo-device node cpuctlX
is created for each cpu present in the systems. The pseudo-device minor
number corresponds to the cpu number in the system. The cpuctl(4) pseudo-
device allows a number of ioctl to be preformed, namely RDMSR/WRMSR/CPUID
and UPDATE. The first pair alows the caller to read/write machine-specific
registers from the correspondent CPU. cpuid data could be retrieved using
the CPUID call, and microcode updates are applied via UPDATE.

The permissions are inforced based on the pseudo-device file permissions.
RDMSR/CPUID will be allowed when the caller has read access to the device
node, while WRMSR/UPDATE will be granted only when the node is opened
for writing. There're also a number of priv(9) checks.

The cpucontrol(8) utility is intened to provide userland access to
the cpuctl(4) device features. The utility also allows one to apply
cpu microcode updates.

Currently only Intel and AMD cpus are supported and were tested.

Approved by:	kib
Reviewed by:	rpaulo, cokane, Peter Jeremy
MFC after:	1 month
2008-08-08 16:26:53 +00:00
Alan Cox
494c177e81 Make pmap_kenter_attr() static. 2008-08-04 08:04:09 +00:00
Ed Schouten
200d80cd74 Disconnect drivers that haven't been ported to MPSAFE TTY yet.
As clearly mentioned on the mailing lists, there is a list of drivers
that have not been ported to the MPSAFE TTY layer yet. Remove them from
the kernel configuration files. This means people can now still use
these drivers if they explicitly put them in their kernel configuration
file, which is good.

People should keep in mind that after August 10, these drivers will not
work anymore. Even though owners of the hardware are capable of getting
these drivers working again, I will see if I can at least get them to a
compilable state (if time permits).
2008-08-03 10:32:17 +00:00
John Baldwin
d428508ca6 Adjust comment. This stack is only used for booting now and not as an
idle stack.
2008-08-01 20:10:47 +00:00
Jack F Vogel
20976c5bc7 Add igb driver to the default kernel 2008-07-30 22:30:49 +00:00
Alan Cox
e79980e1f7 Correct an off-by-one error in the previous change to pmap_change_attr().
Change the nearby comment to mention the recursive map.
2008-07-28 05:41:35 +00:00
Alan Cox
cc1ec88f72 Don't allow pmap_change_attr() to be applied to the recursive mapping. 2008-07-28 04:13:49 +00:00
Alan Cox
35db2ce0dc Style fixes to several function definitions. 2008-07-27 18:18:50 +00:00
Luoqi Chen
e8f00dec4b Unbreak cc -pg support on i386. In gcc 4.2, %ecx is used as the arg pointer
when stack realignment is turned on (it is ALWAYS on for main), however
in a profiling build %ecx would be clobbered by mcount(), this would lead
to a segmentation fault when the code tries to reference any argument.
This fix changes mcount() to preserve %ecx.

PR:		bin/119709
Reviewed by:	bde
MFC after:	1 week
2008-07-23 11:37:20 +00:00
Alan Cox
59a23cacd4 Correct an error in pmap_change_attr()'s initial loop that verifies that the
given range of addresses are mapped.  Previously, the loop was testing the
same address every time.

Submitted by:	Magesh Dhasayyan
2008-07-18 22:05:51 +00:00
Alan Cox
53d13c6030 Simplify pmap_extract()'s control flow, making it more like the related
functions pmap_extract_and_hold() and pmap_kextract().
2008-07-18 20:07:50 +00:00
Alan Cox
36e6513df5 Update bus_dmamem_alloc()'s first call to malloc() such that M_WAITOK is
specified when appropriate.

Reviewed by:	scottl
2008-07-15 03:34:49 +00:00
Ed Schouten
f4d811f0b2 Make uart(4) the default serial port driver on i386 and amd64.
The uart(4) driver has the advantage of supporting a wider variety of
hardware on a greater amount of platforms. This driver has already been
the standard on platforms such as ia64, powerpc and sparc64.

I've decided not to change anything on pc98. I'd rather let people from
the pc98 team look at this.

Approved by:	philip (mentor), marcel
2008-07-13 07:20:14 +00:00
Xin LI
dbd47f1592 Add HWPMC_HOOKS to GENERIC kernels, this makes hwpmc.ko work out
of the box.
2008-07-07 22:55:11 +00:00
Alan Cox
cc82a18b88 In FreeBSD 7.0 and beyond, pmap_growkernel() should pass VM_ALLOC_INTERRUPT
to vm_page_alloc() instead of VM_ALLOC_SYSTEM.  VM_ALLOC_SYSTEM was the
logical choice before FreeBSD 7.0 because VM_ALLOC_INTERRUPT could not
reclaim a cached page.  Simply put, there was no ordering between
VM_ALLOC_INTERRUPT and VM_ALLOC_SYSTEM as to which "dug deeper" into the
cache and free queues.  Now, there is; VM_ALLOC_INTERRUPT dominates
VM_ALLOC_SYSTEM.

While I'm here, teach pmap_growkernel() to request a prezeroed page.

MFC after:	1 week
2008-07-07 17:25:09 +00:00
Robert Watson
4f7d1876d5 Introduce a new lock, hostname_mtx, and use it to synchronize access
to global hostname and domainname variables.  Where necessary, copy
to or from a stack-local buffer before performing copyin() or
copyout().  A few uses, such as in cd9660 and daemon_saver, remain
under-synchronized and will require further updates.

Correct a bug in which a failed copyin() of domainname would leave
domainname potentially corrupted.

MFC after:	3 weeks
2008-07-05 13:10:10 +00:00
John Baldwin
e9a31041c0 Remove the sbni(4) driver. No one responded to calls to test it on
current@ and stable@.
2008-07-04 21:06:57 +00:00
John Baldwin
2c6298572e Remove the oltr(4) driver. No one responded to calls for testing on
current@ and stable@ for the locking patches.  The driver can always be
revived if someone tests it.

This driver also sleeps in its if_init routine, so it likely doesn't really
work at all anyway in modern releases.
2008-07-04 18:58:53 +00:00
John Baldwin
94f923b69d Remove the arl(4) driver. It is reported to not work on 6.x or later
even though the driver hasn't changed since 4.x (last known working
release).
2008-07-04 18:15:36 +00:00
Alan Cox
0cbeb44158 Eliminate an unused declaration. (In fact, the declaration is bogus
because the variable is defined static to pmap.c on i386.)

Found by: CScout
2008-07-04 17:36:12 +00:00
Ed Schouten
9d7a57e916 Remove the unused M_MEMDEV from the kernel.
The M_MEMDEV memory allocation pool does not seem to be used. We can
live without it.

Approved by:	philip (mentor)
2008-06-25 07:52:10 +00:00
Ed Schouten
721351876c Remove the unused major/minor numbers from iodev and memdev.
Now that st_rdev is being automatically generated by the kernel, there
is no need to define static major/minor numbers for the iodev and
memdev. We still need the minor numbers for the memdev, however, to
distinguish between /dev/mem and /dev/kmem.

Approved by:	philip (mentor)
2008-06-25 07:45:31 +00:00
Jung-uk Kim
1427b09672 Emit opcodes closer to GNU as(1) generated codes and micro-optimize. 2008-06-24 20:12:44 +00:00
Jung-uk Kim
6a9748abc8 Rehash and clean up BPF JIT compiler macros to match AT&T notations. 2008-06-23 23:10:11 +00:00
Xin LI
4d52a57549 Add et(4), a port of DragonFly's Agere ET1310 10/100/Gigabit
Ethernet device driver, written by sephe@

Obtained from:	DragonFly
Sponsored by:	iXsystems
MFC after:	2 weeks
2008-06-20 19:28:33 +00:00
Wojciech A. Koszek
53a609f064 Remove obselete PECOFF image activator support.
PRs assigned at the time of removal:    kern/80742

Discussed on:   freebsd-current (silence), IRC
Tested by:      make universe
Approved by:    cognet (mentor)
2008-06-14 12:51:44 +00:00
Ed Schouten
29d4cb241b Don't enforce unique device minor number policy anymore.
Except for the case where we use the cloner library (clone_create() and
friends), there is no reason to enforce a unique device minor number
policy. There are various drivers in the source tree that allocate unr
pools and such to provide minor numbers, without using them themselves.

Because we still need to support unique device minor numbers for the
cloner library, introduce a new flag called D_NEEDMINOR. All cdevsw's
that are used in combination with the cloner library should be marked
with this flag to make the cloning work.

This means drivers can now freely use si_drv0 to store their own flags
and state, making it effectively the same as si_drv1 and si_drv2. We
still keep the minor() and dev2unit() routines around to make drivers
happy.

The NTFS code also used the minor number in its hash table. We should
not do this anymore. If the si_drv0 field would be changed, it would no
longer end up in the same list.

Approved by:	philip (mentor)
2008-06-11 18:55:19 +00:00
John Baldwin
984c25c10b After probing the available frequency settings, restore the CPU to run at
whatever frequency it started at instead of always picking the highest
frequency.  The first version of this driver attempted to do this, but it
set the speed to the first frequency in the list rather than the value it
had saved.

MFC after:	1 week
Discussed with:	rpaulo, phk
2008-05-30 22:01:09 +00:00
Pyun YongHyeon
20f99a5be4 Add jme(4) to the list of drivers supported by GENERIC kernel. 2008-05-27 02:22:32 +00:00
Bjoern A. Zeeb
2e598474fa Remove ISDN4BSD (I4B) from HEAD as it is not MPSAFE and
parts relied on the now removed NET_NEEDS_GIANT.
Most of I4B has been disconnected from the build
since July 2007 in HEAD/RELENG_7.

This is what was removed:
- configuration in /etc/isdn
- examples
- man pages
- kernel configuration
- sys/i4b (drivers, layers, include files)
- user space tools
- i4b support from ppp
- further documentation

Discussed with: rwatson, re
2008-05-26 10:40:09 +00:00
Attilio Rao
0e72a03405 style fix for newly introduced macro. 2008-05-25 14:50:47 +00:00
Bjoern A. Zeeb
b319692931 Restore buildable state. Style ignored.
Leave IDTVEC(ill) where it was unless we compile with KDTRACE_HOOKS[1].
Hide the with DTRACE case case under #ifdef KDTRACE_HOOKS.

Suggested by:	attilio [1]
Reviewed by:	attilio
2008-05-24 19:29:02 +00:00
John Birrell
f1bd3c150c Add a cyclic hook for DTrace. 2008-05-24 06:27:54 +00:00
John Birrell
15653bada1 Add the DTrace hooks for exception handling (Function boundary trace
-fbt- provider), cyclic clock and syscalls.
2008-05-24 06:27:02 +00:00
Alan Cox
d1fdd63483 The VM system no longer uses setPQL2(). Remove it and its helpers. 2008-05-23 04:03:54 +00:00
David E. O'Brien
99f233296d Use the "options " spelling (vs. "options<TAB>") so that commented lines
line up nicely.
2008-05-21 03:36:53 +00:00
Pyun YongHyeon
83a17b90eb Add age(4) to the list of drivers supported by GENERIC kernel. 2008-05-19 02:30:27 +00:00
John Birrell
fdd5d90980 Remove the unknown device that is breaking the tinderbox build. 2008-05-18 11:08:26 +00:00
Alan Cox
1ec1304bdb Retire pmap_addr_hint(). It is no longer used. 2008-05-18 04:16:57 +00:00
Remko Lodder
6e535f6e5b Resort the if_ti driver to match the PCI Network cards instead of placing
it under the mii devices list.

PR:		kern/123147
Submitted by:	gavin
Approved by:	imp (mentor, implicit)
MFC after:	3 days
2008-05-17 23:50:00 +00:00
Attilio Rao
13d4b2b0bc Removed unused assembly offsets for structures digging. 2008-05-16 13:23:47 +00:00
Roman Divacky
7c0cc5f941 Regen.
Approved by:	kib (mentor)
2008-05-13 20:02:26 +00:00
Roman Divacky
4732e446fb Implement robust futexes. Most of the code is modelled after
what Linux does. This is because robust futexes are mostly
userspace thing which we cannot alter. Two syscalls maintain
pointer to userspace list and when process exits a routine
walks this list waking up processes sleeping on futexes
from that list.

Reviewed by:	kib (mentor)
MFC after:	1 month
2008-05-13 20:01:27 +00:00
Alan Cox
ef4d480ced Correct an error in pmap_align_superpage(). Specifically, correctly
handle the case where the mapping is greater than a superpage in size
but the alignment of the physical pages spans a superpage boundary.
2008-05-11 20:33:47 +00:00
Alan Cox
d3249b142b Introduce pmap_align_superpage(). It increases the starting virtual
address of the given mapping if a different alignment might result in more
superpage mappings.
2008-05-09 16:48:07 +00:00
Sam Leffler
6c26723b19 enable IEEE80211_DEBUG and IEEE80211_AMPDU_AGE by default 2008-05-03 17:05:38 +00:00
Rui Paulo
029b1a164a Remove unused variable saved_id16.
Pointy hat to:	me
Pointed out by:	jhb
MFC after:	1 week
2008-05-02 10:16:41 +00:00
Sam Leffler
3971d07be7 Intel 4965 wireless driver (derived from openbsd driver of the same name) 2008-04-29 21:36:17 +00:00
Alan Cox
26b77ff3b1 Always use PG_PS_FRAME to extract the physical address of a 2/4MB page
from a PDE.
2008-04-25 16:00:39 +00:00
Jeff Roberson
6c47aaae12 - Add an integer argument to idle to indicate how likely we are to wake
from idle over the next tick.
 - Add a new MD routine, cpu_wake_idle() to wakeup idle threads who are
   suspended in cpu specific states.  This function can fail and cause the
   scheduler to fall back to another mechanism (ipi).
 - Implement support for mwait in cpu_idle() on i386/amd64 machines that
   support it.  mwait is a higher performance way to synchronize cpus
   as compared to hlt & ipis.
 - Allow selecting the idle routine by name via sysctl machdep.idle.  This
   replaces machdep.cpu_idle_hlt.  Only idle routines supported by the
   current machine are permitted.

Sponsored by:	Nokia
2008-04-25 05:18:50 +00:00
Roman Divacky
a6d043e30d Implement linux_truncate64() syscall.
Tested by:	Aline de Freitas <aline@riseup.net>
Approved by:	kib (mentor)
2008-04-23 15:56:33 +00:00
Poul-Henning Kamp
9b4a8ab7ba Now that all platforms use genclock, shuffle things around slightly
for better structure.

Much of this is related to <sys/clock.h>, which should really have
been called <sys/calendar.h>, but unless and until we need the name,
the repocopy can wait.

In general the kernel does not know about minutes, hours, days,
timezones, daylight savings time, leap-years and such.  All that
is theoretically a matter for userland only.

Parts of kernel code does however care: badly designed filesystems
store timestamps in local time and RTC chips almost universally
track time in a YY-MM-DD HH:MM:SS format, and sometimes in local
timezone instead of UTC.  For this we have <sys/clock.h>

<sys/time.h> on the other hand, deals with time_t, timeval, timespec
and so on.  These know only seconds and fractions thereof.

Move inittodr() and resettodr() prototypes to <sys/time.h>.
Retain the names as it is one of the few surviving PDP/VAX references.

Move startrtclock() to <machine/clock.h> on relevant platforms, it
is a MD call between machdep.c/clock.c.  Remove references to it
elsewhere.

Remove a lot of unnecessary <sys/clock.h> includes.

Move the machdep.disable_rtc_set sysctl to subr_rtc.c where it belongs.
XXX: should be kern.disable_rtc_set really, it's not MD.
2008-04-22 19:38:30 +00:00
Sam Leffler
b032f27c36 Multi-bss (aka vap) support for 802.11 devices.
Note this includes changes to all drivers and moves some device firmware
loading to use firmware(9) and a separate module (e.g. ral).  Also there
no longer are separate wlan_scan* modules; this functionality is now
bundled into the wlan module.

Supported by:	Hobnob and Marvell
Reviewed by:	many
Obtained from:	Atheros (some bits)
2008-04-20 20:35:46 +00:00
Sam Leffler
f446360711 move awi to the Attic; it will not make the jump to the new world order
Reviewed by:	imp
2008-04-20 19:20:39 +00:00
Jeff Roberson
66247efa5a - Add inlines for the monitor and mwait instructions.
Sponsored by:	Nokia
2008-04-18 05:47:56 +00:00
Jung-uk Kim
01c3b1b200 Regenerate. 2008-04-16 19:27:36 +00:00
Jung-uk Kim
26833f3f9a Add stubs for syscalls introduced in Linux 2.6.17 kernel.
Some GNU libc version started using them before 2.6.17 was officially out.

MFC after:	3 days
2008-04-16 19:25:39 +00:00
Poul-Henning Kamp
36bff1ebfb Convert amd64 and i386 to share the atrtc device driver. 2008-04-14 08:00:00 +00:00
Poul-Henning Kamp
2946435299 Move i386 to generic RTC handling code.
Make clock_if.m and subr_rtc.c standard on i386

Add hints for "atrtc" driver, for non-PnP, non-ACPI systems.
NB: Make sure to install GENERIC.hints into /boot/device.hints in these!

Nuke MD inittodr(), resettodr() functions.

Don't attach to PHP0B00 in the "attimer" dummy driver any more, and remove
comments that no longer apply for that reason.

Add new "atrtc" device driver, which handles IBM PC AT Real Time
Clock compatible devices using subr_rtc and clock_if.

This driver is not entirely clean: other code still fondles the
hardware to get a statclock interrupt on non-ACPI timer systems.

Wrap some overly long lines.

After it has settled in -current, this will be ported to amd64.

Technically this is MFC'able, but I fail to see a good reason.
2008-04-12 20:46:06 +00:00
Jeff Roberson
9b33b154b5 - Add the interrupt vector number to intr_event_create so MI code can
lookup hard interrupt events by number.  Ignore the irq# for soft intrs.
 - Add support to cpuset for binding hardware interrupts.  This has the
   side effect of binding any ithread associated with the hard interrupt.
   As per restrictions imposed by MD code we can only bind interrupts to
   a single cpu presently.  Interrupts can be 'unbound' by binding them
   to all cpus.

Reviewed by:	jhb
Sponsored by:	Nokia
2008-04-11 03:26:41 +00:00
Takanori Watanabe
76f3d08d26 Don't break identity mapping set up for ACPI resume path.
With this change, BSP processor context seems to be recovered.
2008-04-10 18:38:31 +00:00
Alan Cox
f4d2c7f13e Correct pmap_copy()'s method for extracting the physical address of a
2/4MB page from a PDE.  Specifically, change it to use PG_PS_FRAME,
not PG_FRAME, to extract the physical address of a 2/4MB page from a
PDE.

Change the last argument passed to pmap_pv_insert_pde() from a
vm_page_t representing the first 4KB page of a 2/4MB page to the
vm_paddr_t of the 2/4MB page.  This avoids an otherwise unnecessary
conversion from a vm_paddr_t to a vm_page_t in pmap_copy().
2008-04-10 16:04:50 +00:00
Konstantin Belousov
50ad4fc65c Regenerate 2008-04-08 09:51:19 +00:00
Konstantin Belousov
48b05c3f82 Implement the linux syscalls
openat, mkdirat, mknodat, fchownat, futimesat, fstatat, unlinkat,
    renameat, linkat, symlinkat, readlinkat, fchmodat, faccessat.

Submitted by:	rdivacky
Sponsored by:	Google Summer of Code 2007
Tested by:	pho
2008-04-08 09:45:49 +00:00
Alan Cox
109d493230 Update pmap_page_wired_mappings() so that it counts 2/4MB page mappings. 2008-04-07 07:38:02 +00:00
John Baldwin
1ee1b68792 Add a MI intr_event_handle() routine for the non-INTR_FILTER case. This
allows all the INTR_FILTER #ifdef's to be removed from the MD interrupt
code.
- Rename the intr_event 'eoi', 'disable', and 'enable' hooks to
  'post_filter', 'pre_ithread', and 'post_ithread' to be less x86-centric.
  Also, add a comment describe what the MI code expects them to do.
- On amd64, i386, and powerpc this is effectively a NOP.
- On arm, don't bother masking the interrupt unless the ithread is
  scheduled in the non-INTR_FILTER case to match what INTR_FILTER did.
  Also, don't bother unmasking the interrupt in the post_filter case if
  we never masked it.  The INTR_FILTER case had been doing this by having
  arm_unmask_irq for the post_filter (formerly 'eoi') hook.
- On ia64, stray interrupts are now masked for the non-INTR_FILTER case.
  They were already masked in the INTR_FILTER case.
- On sparc64, use the a NULL pre_ithread hook and use intr_enable_eoi() for
  both the 'post_filter' and 'post_ithread' hooks to match what the
  non-INTR_FILTER code did.
- On sun4v, retire the ithread wrapper hack by using an appropriate
  'post_ithread' hook instead (it's what 'post_ithread'/'enable' was
  designed to do even in 5.x).

Glanced at by:	piso
Reviewed by:	marius
Requested by:	marius [1], [5]
Tested on:	amd64, i386, arm, sparc64
2008-04-05 19:58:30 +00:00
Alan Cox
7630c26507 Reintroduce UMA_SLAB_KMAP; however, change its spelling to
UMA_SLAB_KERNEL for consistency with its sibling UMA_SLAB_KMEM.
(UMA_SLAB_KMAP met its original demise in revision 1.30 of
vm/uma_core.c.)  UMA_SLAB_KERNEL is now required by the jumbo frame
allocators.  Without it, UMA cannot correctly return pages from the
jumbo frame zones to the VM system because it resets the pages' object
field to NULL instead of the kernel object.  In more detail, the jumbo
frame zones are created with the option UMA_ZONE_REFCNT.  This causes
UMA to overwrite the pages' object field with the address of the slab.
However, when UMA wants to release these pages, it doesn't know how to
restore the object field, so it sets it to NULL.  This change teaches
UMA how to reset the object field to the kernel object.

Crashes reported by: kris
Fix tested by: kris
Fix discussed with: jeff
MFC after: 6 weeks
2008-04-04 18:41:12 +00:00
Konstantin Belousov
57b4252e45 Add the support for the AT_FDCWD and fd-relative name lookups to the
namei(9).

Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:01:21 +00:00
Alan Cox
4ae6e47432 Eliminate an #if 0/#endif that was unintentionally introduced
by the previous revision.
2008-03-29 04:29:50 +00:00
Ed Maste
064fb2d184 If we're returning successfully from bus_dmamem_alloc, don't record a KTR
of error = ENOMEM.
2008-03-28 15:28:20 +00:00
Brooks Davis
96a6e6e6ca Use ; instead of : to end a line.
Submitted by:	Niclas Zeising <niclas dot zeising at gmail dot com>
2008-03-28 08:19:03 +00:00
Paul Saab
6e7534b8c8 Add support to mincore for detecting whether a page is part of a
"super" page or not.

Reviewed by:	alc, ups
2008-03-28 04:29:27 +00:00
Doug Rabson
fa9d9930ca Add kernel module support for nfslockd and krpc. Use the module system
to detect (or load) kernel NLM support in rpc.lockd. Remove the '-k'
option to rpc.lockd and make kernel NLM the default. A user can still
force the use of the old user NLM by building a kernel without NFSLOCKD
and/or removing the nfslockd.ko module.
2008-03-27 11:54:20 +00:00
John Birrell
e483943791 When building a kernel module, define MAXCPU the same as SMP so
that modules work with and without SMP.
2008-03-27 05:03:26 +00:00
Alan Cox
97dbe5e48e MFamd64 with few changes:
1. Add support for automatic promotion of 4KB page mappings to 2MB page
   mappings.  Automatic promotion can be enabled by setting the tunable
   "vm.pmap.pg_ps_enabled" to a non-zero value.  By default, automatic
   promotion is disabled.  Tested by: kris

2. To date, we have assumed that the TLB will only set the PG_M bit in a
   PTE if that PTE has the PG_RW bit set.  However, this assumption does
   not hold on recent processors from Intel.  For example, consider a PTE
   that has the PG_RW bit set but the PG_M bit clear.  Suppose this PTE
   is cached in the TLB and later the PG_RW bit is cleared in the PTE,
   but the corresponding TLB entry is not (yet) invalidated.
   Historically, upon a write access using this (stale) TLB entry, the
   TLB would observe that the PG_RW bit had been cleared and initiate a
   page fault, aborting the setting of the PG_M bit in the PTE.  Now,
   however, P4- and Core2-family processors will set the PG_M bit before
   observing that the PG_RW bit is clear and initiating a page fault.  In
   other words, the write does not occur but the PG_M bit is still set.

   The real impact of this difference is not that great.  Specifically,
   we should no longer assert that any PTE with the PG_M bit set must
   also have the PG_RW bit set, and we should ignore the state of the
   PG_M bit unless the PG_RW bit is set.
2008-03-27 04:34:17 +00:00
Poul-Henning Kamp
dad3b6c6fd Back in the good old days, PC's had random pieces of rock for
frequency generation and what frequency the generated was anyones
guess.

In general the 32.768kHz RTC clock x-tal was the best, because that
was a regular wrist-watch Xtal, whereas the X-tal generating the
ISA bus frequency was much lower quality, often costing as much as
several cents a piece, so it made good sense to check the ISA bus
frequency against the RTC clock.

The other relevant property of those machines, is that they
typically had no more than 16MB RAM.

These days, CPU chips croak if their clocks are not tightly within
specs and all necessary frequencies are derived from the master
crystal by means if PLL's.

Considering that it takes on average 1.5 second to calibrate the
frequency of the i8254 counter, that more likely than not, we will
not actually use the result of the calibration, and as the final
clincher, we seldom use the i8254 for anything besides BEL in
syscons anyway, it has become time to drop the calibration code.

If you need to tell the system what frequency your i8254 runs,
you can do so from the loader using hw.i8254.freq or using the
sysctl kern.timecounter.tc.i8254.frequency.
2008-03-26 22:12:00 +00:00
Poul-Henning Kamp
e465985885 The "free-lance" timer in the i8254 is only used for the speaker
these days, so de-generalize the acquire_timer/release_timer api
to just deal with speakers.

The new (optional) MD functions are:
	timer_spkr_acquire()
	timer_spkr_release()
and
	timer_spkr_setfreq()

the last of which configures the timer to generate a tone of a given
frequency, in Hz instead of 1/1193182th of seconds.

Drop entirely timer2 on pc98, it is not used anywhere at all.

Move sysbeep() to kern/tty_cons.c and use the timer_spkr*() if
they exist, and do nothing otherwise.

Remove prototypes and empty acquire-/release-timer() and sysbeep()
functions from the non-beeping archs.

This eliminate the need for the speaker driver to know about
i8254frequency at all.  In theory this makes the speaker driver MI,
contingent on the timer_spkr_*() functions existing but the driver
does not know this yet and still attaches to the ISA bus.

Syscons is more tricky, in one function, sc_tone(), it knows the hz
and things are just fine.

In the other function, sc_bell() it seems to get the period from
the KDMKTONE ioctl in terms if 1/1193182th second, so we hardcode
the 1193182 and leave it at that.  It's probably not important.

Change a few other sysbeep() uses which obviously knew that the
argument was in terms of i8254 frequency, and leave alone those
that look like people thought sysbeep() took frequency in hertz.

This eliminates the knowledge of i8254_freq from all but the actual
clock.c code and the prof_machdep.c on amd64 and i386, where I think
it would be smart to ask for help from the timecounters anyway [TBD].
2008-03-26 20:09:21 +00:00
Doug Rabson
dfdcada31e Add the new kernel-mode NFS Lock Manager. To use it instead of the
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.

Highlights include:

* Thread-safe kernel RPC client - many threads can use the same RPC
  client handle safely with replies being de-multiplexed at the socket
  upcall (typically driven directly by the NIC interrupt) and handed
  off to whichever thread matches the reply. For UDP sockets, many RPC
  clients can share the same socket. This allows the use of a single
  privileged UDP port number to talk to an arbitrary number of remote
  hosts.

* Single-threaded kernel RPC server. Adding support for multi-threaded
  server would be relatively straightforward and would follow
  approximately the Solaris KPI. A single thread should be sufficient
  for the NLM since it should rarely block in normal operation.

* Kernel mode NLM server supporting cancel requests and granted
  callbacks. I've tested the NLM server reasonably extensively - it
  passes both my own tests and the NFS Connectathon locking tests
  running on Solaris, Mac OS X and Ubuntu Linux.

* Userland NLM client supported. While the NLM server doesn't have
  support for the local NFS client's locking needs, it does have to
  field async replies and granted callbacks from remote NLMs that the
  local client has contacted. We relay these replies to the userland
  rpc.lockd over a local domain RPC socket.

* Robust deadlock detection for the local lock manager. In particular
  it will detect deadlocks caused by a lock request that covers more
  than one blocking request. As required by the NLM protocol, all
  deadlock detection happens synchronously - a user is guaranteed that
  if a lock request isn't rejected immediately, the lock will
  eventually be granted. The old system allowed for a 'deferred
  deadlock' condition where a blocked lock request could wake up and
  find that some other deadlock-causing lock owner had beaten them to
  the lock.

* Since both local and remote locks are managed by the same kernel
  locking code, local and remote processes can safely use file locks
  for mutual exclusion. Local processes have no fairness advantage
  compared to remote processes when contending to lock a region that
  has just been unlocked - the local lock manager enforces a strict
  first-come first-served model for both local and remote lockers.

Sponsored by:	Isilon Systems
PR:		95247 107555 115524 116679
MFC after:	2 weeks
2008-03-26 15:23:12 +00:00
Poul-Henning Kamp
ebfbcd612a Rename timer0_max_count to i8254_max_count.
Rename timer0_real_max_count to i8254_real_max_count and make it static.
Rename timer_freq to i8254_freq and make it a loader tunable.
2008-03-26 15:03:24 +00:00
Poul-Henning Kamp
f168bfa529 The RTC related pscnt and psdiv variables have no business being public. 2008-03-26 13:25:27 +00:00
Christian Brueffer
662cac9f23 Fix some "in in" typos in comments.
PR:		121490
Submitted by:	Anatoly Borodin <anatoly.borodin@gmail.com>
Approved by:	rwatson (mentor), jkoshy
MFC after:	3 days
2008-03-26 07:32:08 +00:00
Alan Cox
fdcd29b52b Enable the automatic creation of superpage reservations. 2008-03-26 03:12:00 +00:00
Jung-uk Kim
cb7d38abf2 Belatedly add BPF_JITTER in NOTES for supported architectures. 2008-03-24 22:23:22 +00:00
Konstantin Belousov
3f7905d29c Prevent the overflow in the calculation of the next page directory.
The overflow causes the wraparound with consequent corruption of the
(almost) whole address space mapping.

As Alan noted, pmap_copy() does not require the wrap-around checks
because it cannot be applied to the kernel's pmap. The checks there are
included for consistency.

Reported and tested by:	kris (i386/pmap.c:pmap_remove() part)
Reviewed by:	alc
MFC after:	1 week
2008-03-23 07:07:27 +00:00
John Baldwin
eb2b0540e5 Explicitly use spinlock_enter/exit rather than locking the icu_lock spin
lock in the 8259A drivers as these drivers are only used on UP systems.
This slightly reduces the penalty of an SMP kernel (such as GENERIC) on
a UP x86 machine.
2008-03-20 21:53:27 +00:00
John Baldwin
dcc8106854 Implement a BUS_BIND_INTR() method in the bus interface to bind an IRQ
resource to a CPU.  The default method is to pass the request up to the
parent similar to BUS_CONFIG_INTR() so that all busses don't have to
explicitly implement bus_bind_intr.  A bus_bind_intr(9) wrapper routine
similar to bus_setup/teardown_intr() is added for device drivers to use.
Unbinding an interrupt is done by binding it to NOCPU.  The IRQ resource
must be allocated, but it can happen in any order with respect to
bus_setup_intr().  Currently it is only supported on amd64 and i386 via
nexus(4) methods that simply call the intr_bind() routine.

Tested by:	gallatin
2008-03-20 21:24:32 +00:00
John Baldwin
6d2d1c044f Simplify the interrupt code a bit:
- Always include the ie_disable and ie_eoi methods in 'struct intr_event'
  and collapse down to one intr_event_create() routine.  The disable and
  eoi hooks simply aren't used currently in the !INTR_FILTER case.
- Expand 'disab' to 'disable' in a few places.
- Use function casts for arm and i386:intr_eoi_src() instead of wrapper
  routines since to trim one extra indirection.

Compiled on:	{arm,amd64,i386,ia64,ppc,sparc64} x {FILTER, !FILTER}
Tested on:	{amd64,i386} x {FILTER, !FILTER}
2008-03-17 22:42:01 +00:00
Poul-Henning Kamp
272870cf7b A cautionary XXX comment about seemingly bogus errata checks. 2008-03-17 09:05:15 +00:00
Poul-Henning Kamp
462302db47 Increase time we wait for things to settle to 1 millisecond,
10 microseconds is too short.

Always set the cpu to the highest frequency so that we get through
boot and don't handicap cpus where powerd(8) is not used.
2008-03-17 09:01:43 +00:00
Poul-Henning Kamp
68b84e73e3 Revert last commit and stop committing before morning tea. 2008-03-17 09:00:59 +00:00
Poul-Henning Kamp
5d306f44cc Increase time we wait for things to settle to 1 millisecond,
10 microseconds is too short.

Always set the cpu to the highest frequency so that we get through
boot and don't handicap cpus where powerd(8) is not used.
2008-03-17 08:38:38 +00:00
Poul-Henning Kamp
29cc138cdf Use correct bitmask for identifying chip family. 2008-03-17 00:36:16 +00:00
Pawel Jakub Dawidek
6eb4157ffc Implement atomic_fetchadd_long() for all architectures and document it.
Reviewed by:	attilio, jhb, jeff, kris (as a part of the uidinfo_waitfree.patch)
2008-03-16 21:20:50 +00:00
Roman Divacky
d8653dd986 Regen. 2008-03-16 16:29:37 +00:00
Roman Divacky
5dfb688191 Implement sched_setaffinity and get_setaffinity using
real cpu affinity setting primitives.

Reviewed by:	jeff
Approved by:	kib (mentor)
2008-03-16 16:27:44 +00:00
Robert Watson
237fdd787b In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation.  This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after:	1 month
Discussed with:	imp, rink
2008-03-16 10:58:09 +00:00
John Baldwin
eaf86d1678 Add preliminary support for binding interrupts to CPUs:
- Add a new intr_event method ie_assign_cpu() that is invoked when the MI
  code wishes to bind an interrupt source to an individual CPU.  The MD
  code may reject the binding with an error.  If an assign_cpu function
  is not provided, then the kernel assumes the platform does not support
  binding interrupts to CPUs and fails all requests to do so.
- Bind ithreads to CPUs on their next execution loop once an interrupt
  event is bound to a CPU.  Only shared ithreads are bound.  We currently
  leave private ithreads for drivers using filters + ithreads in the
  INTR_FILTER case unbound.
- A new intr_event_bind() routine is used to bind an interrupt event to
  a CPU.
- Implement binding on amd64 and i386 by way of the existing pic_assign_cpu
  PIC method.
- For x86, provide a 'intr_bind(IRQ, cpu)' wrapper routine that looks up
  an interrupt source and binds its interrupt event to the specified CPU.
  MI code can currently (ab)use this by doing:

	intr_bind(rman_get_start(irq_res), cpu);

  however, I plan to add a truly MI interface (probably a bus_bind_intr(9))
  where the implementation in the x86 nexus(4) driver would end up calling
  intr_bind() internally.

Requested by:	kmacy, gallatin, jeff
Tested on:	{amd64, i386} x {regular, INTR_FILTER}
2008-03-14 19:41:48 +00:00
John Baldwin
c9107e85d9 Fix a silly bogon which prevented all the CPUs that are tagged as interrupt
receivers from being given interrupts if any CPUs in the system were not
tagged as interrupt receivers that I introduced when switching the x86
interrupt code to track CPUs via FreeBSD CPU IDs rather than local APIC
IDs.  In practice this only affects systems with Hyperthreading (though
disabling HTT in the BIOS would workaround the issue) as that is the only
case currently where one can have CPUs that aren't tagged as interrupt
receivers.  On a Dell SC1425 test box with 2 x Xeon w/ HTT (so 4 logical
CPUs of which 2 were interrupt receivers) the result was that all
device interrupts were sent to CPU 0.

MFC after:	1 week
Pointy hat to:	jhb
2008-03-14 03:44:42 +00:00
John Baldwin
5217af301c Rework how the nexus(4) device works on x86 to better handle the idea of
different "platforms" on x86 machines.  The existing code already handles
having two platforms: ACPI and legacy.  However, the existing approach was
rather hardcoded and difficult to extend.  These changes take the approach
that each x86 hardware platform should provide its own nexus(4) driver (it
can inherit most of its behavior from the default legacy nexus(4) driver)
which is responsible for probing for the platform and performing
appropriate platform-specific setup during attach (such as adding a
platform-specific bus device).  This does mean changing the x86 platform
busses to no longer use an identify routine for probing, but to move that
logic into their matching nexus(4) driver instead.
- Make the default nexus(4) driver in nexus.c on i386 and amd64 handle the
  legacy platform.  It's probe routine now returns BUS_PROBE_GENERIC so it
  can be overriden.
- Expose a nexus_init_resources() routine which initializes the various
  resource managers so that subclassed nexus(4) drivers can invoke it from
  their attach routine.
- The legacy nexus(4) driver explicitly adds a legacy0 device in its
  attach routine.
- The ACPI driver no longer contains an new-bus identify method.  Instead
  it exposes a public function (acpi_identify()) which is a probe routine
  that the MD nexus(4) drivers can use to probe for ACPI.  All of the
  probe logic in acpi_probe() is now moved into acpi_identify() and
  acpi_probe() is just a stub.
- On i386 and amd64, an ACPI-specific nexus(4) driver checks for ACPI via
  acpi_identify() and claims the nexus0 device if the probe succeeds.  It
  then explicitly adds an acpi0 device in its attach routine.
- The legacy(4) driver no longer knows anything about the acpi0 device.
- On ia64 if acpi_identify() fails you basically end up with no devices.
  This matches the previous behavior where the old acpi_identify() would
  fail to add an acpi0 device again leaving you with no devices.

Discussed with:	imp
Silence on:	arch@
2008-03-13 20:39:04 +00:00
John Baldwin
d0234f752f Use the SMAP data from the loader if it is provided instead of using
virtual 86 mode to query the BIOS directly.  This is needed for certain
HP machines whose BIOS only provide an SMAP when invoked from real mode.
On such machines the loader will be able to query the SMAP successfully
due to the recent BTX changes, but the kernel will not.

One thing I'm not sure of is if we can skip the INT 12h probe altogether
if we have the SMAP from the loader as it seems that we do the INT 12h
probe to setup enough state so we can use vm86 to call the BIOS.

MFC after:	1 week
2008-03-13 18:56:53 +00:00
Konstantin Belousov
22eca0bf45 Since version 4.3, gcc changed its behaviour concerning the i386/amd64
ABI and the direction flag, that is it now assumes that the direction
flag is cleared at the entry of a function and it doesn't clear once
more if needed. This new behaviour conforms to the i386/amd64 ABI.

Modify the signal handler frame setup code to clear the DF {e,r}flags
bit on the amd64/i386 for the signal handlers.

jhb@ noted that it might break old apps if they assumed DF == 1 would be
preserved in the signal handlers, but that such apps should be rare and
that older versions of gcc would not generate such apps.

Submitted by:	Aurelien Jarno <aurelien aurel32 net>
PR:	121422
Reviewed by:	jhb
MFC after:	2 weeks
2008-03-13 10:54:38 +00:00
Konstantin Belousov
ea39de9f93 Add missed parentheses 2008-03-13 09:52:48 +00:00
John Baldwin
391664b110 The variable MTRR registers actually have variable-sized PhysBase and
PhysMask fields based on the number of physical address bits supported
by the current CPU.  The old code assumed 36 bits on i386 and 40 bits on
amd64.  In truth, all Intel CPUs up until recently used 36 bits (a newer
Intel CPU uses 38 bits) and all the Opteron CPUs used 40 bits.

In at least one case (the new Intel CPU) having the size of the mask field
wrong resulted in writing questionable values into the MTRR registers on
the application processors (BSP as well if you modify the MTRRs via
memcontrol or running X, etc.).  The result of the questionable physmask
was that all of memory was apparently treated as uncached rather than
write-back resulting in a very significant performance hit.

Fix this by constructing a run-time mask for the PhysBase and PhysMask
fields based on the number of physical address bits supported by the CPU.
All 64-bit capable CPUs provide a count of PA bits supported via the
0x80000008 extended CPUID feature, so use that if it is available.  If that
feature is not available, then assume 36 PA bits.

While I'm here, expand the (now-unused) macros for the PhysBase and
PhysMask fields to the current largest possible value (52 PA bits).

MFC after:	1 week
PR:		i386/120516
Reported by:	Nokia
2008-03-12 22:09:19 +00:00
John Baldwin
4cbd0e8984 MFamd64: Break up the probe logic in the mem_drvinit routines so it's
a bit easier to parse.
2008-03-12 21:44:46 +00:00
Jeff Roberson
6617724c5f Remove kernel support for M:N threading.
While the KSE project was quite successful in bringing threading to
FreeBSD, the M:N approach taken by the kse library was never developed
to its full potential.  Backwards compatibility will be provided via
libmap.conf for dynamically linked binaries and static binaries will
be broken.
2008-03-12 10:12:01 +00:00
John Baldwin
1b085fde87 Style(9) these files. No changes in the compiled code. (Verified by
diff'ing objdump -d output).
2008-03-11 21:41:36 +00:00
John Baldwin
336d8e5536 Add constants for the various fields in MTRR registers.
MFC after:	1 week
Verified by:	md5(1)
2008-03-11 20:10:37 +00:00
John Baldwin
463e0f91cb Probe CPUs after the PCI hierarchy on i386, amd64, and ia64. This allows
the cpufreq drivers to reliably use properties of PCI devices for quirks,
etc.
- For the legacy drivers, add CPU devices via an identify routine in the
  CPU driver itself rather than in the legacy driver's attach routine.
- Add CPU devices after Host-PCI bridges in the acpi bus driver.
- Change the ichss(4) driver to use pci_find_bsf() to locate the ICH and
  check its device ID rather than having a bogus PCI attachment that only
  checked for the ID in probe and always failed.  As a side effect, you
  can now kldload ichss after boot.
- Fix the ichss(4) driver to use the correct device_t for the ICH (and not
  for ichss0) when doing PCI config space operations to enable SpeedStep.

MFC after:	2 weeks
Reviewed by:	njl, Andriy Gapon  avg of icyb.net.ua
2008-03-10 22:18:07 +00:00
John Baldwin
c3cefed5eb - Don't execute cpuid to fetch the features. We already have the features
present in cpu_feature2.  Also, use CPUID2_EST rather than a magic
  number.
- Don't free the ACPI settings list in detach if we are going to fail the
  request.  Otherwise an attempt to kldunload est would free the array
  but the driver would keep trying to use it.

MFC after:	1 week
2008-03-10 22:00:35 +00:00
Jeff Roberson
32c9d3a767 - Rather than repeating the same preemption code everywhere call the scheduler
specific sched_preempt() routine.
2008-03-10 01:32:48 +00:00
Rink Springer
2e7328e7cc Import uslcom(4) from OpenBSD - this is a driver for Silicon Laboratories
CP2101/CP2102 based USB serial adapters.

Reviewed by:		imp, emaste
Obtained from:		OpenBSD
MFC after:		2 weeks
2008-03-05 14:13:30 +00:00
Bruce Evans
f3d2db418f Change float_t and double_t to long double on i386. All floating point
expressions on i386 are evaluated in the range of the long double type,
so this is wrong in a different but hopefully less worse way than
before.  Since expressions are evaluated in long double registers,
there is no runtime cost to using long double instead of double to
declare intermediate values (except in cases where this avoids compiler
bugs), and by careful use of float_t or double_t it is possible to
avoid some of the compiler bugs in this area, provided these types are
declared as long double.

I was going to change float.h to be less broken and more usable in
combination with the change here (in particular, it is more necessary
to know the effective number of bits in a double_t when double_t !=
double, since DBL_MANT_DIG no longer logically gives this, and
LDBL_MANT_DIG doesn't give it either with FreeBSD-i386's default
rounding precision.  However, this was too hard for now.  In particular,
LDBL_MANT_DIG is used a lot in libm, so it cannot be changed.  One
thing that is completely broken now is LDBL_MAX.  This may have sort
of worked when it was changed from DBL_MAX in 2002 (adding 0 to it at
runtime gave +Inf, but you could at least compare with it), but starting
with gcc-3.3.1 in 2003, it is always +Inf due to evaluating it at
compile time in the default rounding precision.
2008-03-05 11:21:14 +00:00
Bruce Evans
021dfaf077 Oops, back out previous commit since it was to the wrong file. 2008-03-05 11:17:20 +00:00
Bruce Evans
69c0326e8c Change float_t and double_t to long double on i386. All floating point
expressions on i386 are evaluated in the range of the long double type,
so this is wrong in a different but hopefully less worse way than
before.  Since expressions are evaluated in long double registers,
there is no runtime cost to using long double instead of double to
declare intermediate values (except in cases where this avoids compiler
bugs), and by careful use of float_t or double_t it is possible to
avoid some of the compiler bugs in this area, provided these types are
declared as long double.

I was going to change float.h to be less broken and more usable in
combination with the change here (in particular, it is more necessary
to know the effective number of bits in a double_t when double_t !=
double, since DBL_MANT_DIG no longer logically gives this, and
LDBL_MANT_DIG doesn't give it either with FreeBSD-i386's default
rounding precision.  However, this was too hard for now.  In particular,
LDBL_MANT_DIG is used a lot in libm, so it cannot be changed.  One
thing that is completely broken now is LDBL_MAX.  This may have sort
of worked when it was changed from DBL_MAX in 2002 (adding 0 to it at
runtime gave +Inf, but you could at least compare with it), but starting
with gcc-3.3.1 in 2003, it is always +Inf due to evaluating it at
compile time in the default rounding precision.
2008-03-05 11:11:53 +00:00
Jeff Roberson
81aa71755b - Remove the old smp cpu topology specification with a new, more flexible
tree structure that encodes the level of cache sharing and other
   properties.
 - Provide several convenience functions for creating one and two level
   cpu trees as well as a default flat topology.  The system now always
   has some topology.
 - On i386 and amd64 create a seperate level in the hierarchy for HTT
   and multi-core cpus.  This will allow the scheduler to intelligently
   load balance non-uniform cores.  Presently we don't detect what level
   of the cache hierarchy is shared at each level in the topology.
 - Add a mechanism for testing common topologies that have more information
   than the MD code is able to provide via the kern.smp.topology tunable.
   This should be considered a debugging tool only and not a stable api.

Sponsored by:	Nokia
2008-03-02 07:58:42 +00:00
Justin T. Gibbs
b601964112 In est_acpi_info(), initialize count before passing its pointer to
CPUFREQ_DRV_SETTINGS().  The value of count on input is used to
prefent overflow of the settings buffer passed into CPUFREQ_DRV_SETTINGS().

This corrects the "est: CPU supports Enhanced Speedstep, but is not recognized."
error on my system.

MFC after: 1 week
2008-03-01 21:58:34 +00:00
John Baldwin
905829bfa9 With the recent change to enable CPU brands from the VIA chips, the
code to add padlock features to the CPU model on VIA CPUs was no longer
effective.  Change the code to instead output a separate printf during
dmesg for VIA Padlock features similar to other cpuid feature bitmasks.

MFC after:	1 week
2008-02-29 19:18:09 +00:00